WSP
WSP (Windows Search Protocol) support in samba
- Since samba-4.20, samba ships a command line client, 'wspsearch', for searching using the WSP protocol. The client does not work against a samba server, as samba does not currently implement the server side of the WSP protocol.
WSP server support
- The WSP protocol is not supported upstream in samba yet. However, there are a couple of upstream merge requests currently open:
- 1. Support rawpipe services (servers using named pipes but not using the dcerpc protocol). I wanted the wsp server to be managed in the same way as the dcerpc servers are, but I didn't manage to do that. As a result it isn't currently possible, for example, to have a pool of wsp servers; instead just a single wsp server instance is launched. See here
- 2. Allow mapping between the authenticated samba user and an elastic/opensearch basic user. This allows samba (using spotlight or WSP) to authenticate over http with a basic elastic/opensearch user or an apikey (elasticsearch only). See here
- 3. A merge request with the WSP standalone server code (including the merge requests above). See here
what works with windows clients ?
searching by kind (from dropdown in search view)
searching by phrase (entered into search bar)
Matches for search phrase are also highlighted if the associated content summary contains the search phrase
folder search results
Support for folder results has recently been added. Although it has always been possible to search both the fscrawler main and folder indexes, until recently it wasn't possible to distinguish whether a match was a folder or a file result. This is now possible with later versions of fscrawler. Setting the 'elasticsearch:wsp use fscrawler folders' smb.conf parameter to true includes folder results in any searches performed.
can I try it out ?
Yes. The simplest way to try it out is to use docker compose to run a test or demo setup with practically zero configuration or interaction. The compose setup is also ideal for testing, experimentation or just getting used to the various pieces that need to be configured and set up. Please see https://gitlab.com/npower/wsp-container Note: This project is not intended for production use but for testing and experimentation.
Alternatively if you are willing you can build from a git branch. Note: This branch is based off a recent samba-4.22.x branch with all of the merge requests above combined together.
git clone git://git.samba.org/npower/samba.git samba-wsp
cd samba-wsp
git checkout -b current_wsp_422_wip origin/current_wsp_422_wip
./configure.developer # (and install all the dependencies)
make
sudo make install
Or, finally, if using openSUSE you can install a WSP enabled samba 4.22.x version from one of the following repositories
E.g. for Tumbleweed
zypper addrepo https://download.opensuse.org/repositories/home:/npower:/samba_WSP/openSUSE_Tumbleweed/home:npower:samba_WSP.repo wsp
zypper in --allow-vendor-change samba-wsp
WSP running and testing using elasticsearch
install elasticsearch
using elasticsearch-8.15.2 (latest version at time of writing)
rpm -ivh elasticsearch-8.15.2-x86_64.rpm
take note of the generated built-in superuser 'elastic' (output as part of the rpm install)
if desired change the generated superuser password
/usr/share/elasticsearch/bin/elasticsearch-reset-password -iu elastic
start it
systemctl daemon-reload
systemctl start elasticsearch.service
check if it is running
systemctl status elasticsearch.service
check communication
curl -k -uelastic:elastic https://127.0.0.1:9200
should respond with
{
"name" : "localhost.localdomain",
"cluster_name" : "elasticsearch",
"cluster_uuid" : "n-AXwOQeTOSddb_p3UXsUQ",
"version" : {
"number" : "8.15.2",
"build_flavor" : "default",
"build_type" : "rpm",
"build_hash" : "98adf7bf6bb69b66ab95b761c9e5aadb0bb059a3",
"build_date" : "2024-09-19T10:06:03.564235954Z",
"build_snapshot" : false,
"lucene_version" : "9.11.1",
"minimum_wire_compatibility_version" : "7.17.0",
"minimum_index_compatibility_version" : "7.0.0"
},
"tagline" : "You Know, for Search"
}
configure elasticsearch
Note: This is a developer setup, not suitable for production, please refer to the elasticsearch documentation for specific information about securing elasticsearch
- disable ssl
- for testing it is convenient to be able to easily see the communication between samba and elasticsearch unencrypted, of course ssl can be re-enabled after a working setup has been established.
in /etc/elasticsearch/elasticsearch.yml:
xpack.security.http.ssl:
-  enabled: true
+  enabled: false
WSP running and testing using opensearch
install opensearch
using opensearch-2.15.0 (latest version at time of writing)
OPENSEARCH_INITIAL_ADMIN_PASSWORD=1234?Changeme rpm -ivh opensearch-2.15.0-linux-x64.rpm
start it
systemctl daemon-reload
systemctl start opensearch.service
check if it is running
systemctl status opensearch.service
check communication
curl -k -uadmin:1234?Changeme https://127.0.0.1:9200
should respond with
{
"name" : "localhost.localdomain",
"cluster_name" : "opensearch",
"cluster_uuid" : "6fJA5WMmSiK2wc4rHdkVvw",
"version" : {
"number" : "7.10.2",
"build_type" : "rpm",
"build_hash" : "61dbcd0795c9bfe9b81e5762175414bc38bbcadf",
"build_date" : "2024-06-20T03:27:31.591886152Z",
"build_snapshot" : false,
"lucene_version" : "9.10.0",
"minimum_wire_compatibility_version" : "7.10.0",
"minimum_index_compatibility_version" : "7.0.0"
},
"tagline" : "The OpenSearch Project: https://opensearch.org/"
}
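As a complement to the curl checks above, the following Python sketch classifies the JSON body served at the root URL as coming from elasticsearch or opensearch and extracts the reported version. The function name and the tagline heuristic are illustrative assumptions; they are not part of samba or of either search server. Note that the opensearch sample reports 7.10.2 rather than 2.15.0, which appears to be the effect of the compatibility.override_main_response_version setting mentioned in the opensearch configuration step.

```python
import json


def describe_search_backend(root_response: str) -> tuple[str, str]:
    """Return (backend, reported_version) for the JSON body served at '/'.

    Heuristic (an assumption of this sketch): opensearch mentions
    "OpenSearch" in its tagline, anything else is treated as elasticsearch.
    """
    info = json.loads(root_response)
    tagline = info.get("tagline", "")
    backend = "opensearch" if "OpenSearch" in tagline else "elasticsearch"
    return backend, info["version"]["number"]


# Trimmed copies of the two sample responses shown in this document.
ELASTIC_SAMPLE = (
    '{"cluster_name": "elasticsearch",'
    ' "version": {"number": "8.15.2"},'
    ' "tagline": "You Know, for Search"}'
)
OPENSEARCH_SAMPLE = (
    '{"cluster_name": "opensearch",'
    ' "version": {"number": "7.10.2"},'
    ' "tagline": "The OpenSearch Project: https://opensearch.org/"}'
)
```

Feeding the body returned by curl into describe_search_backend gives a quick confirmation of which backend is answering on port 9200 before samba is pointed at it.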
configure opensearch
Note: This is a developer setup, not suitable for production, please refer to the opensearch documentation for specific information about securing opensearch
- disable ssl
- for testing it is convenient to be able to easily see the communication between samba and opensearch unencrypted, of course ssl can be re-enabled after a working setup has been established.
in /etc/opensearch/opensearch.yml:
-plugins.security.ssl.http.enabled: true
+plugins.security.ssl.http.enabled: false
- allow fscrawler to talk to opensearch
- Add the following line to /etc/opensearch/opensearch.yml (it allows fscrawler to communicate with opensearch)
compatibility.override_main_response_version: true
Use fscrawler to index files for elasticsearch or opensearch
using latest fscrawler snapshot version 2.10 (at time of writing)
- identify (or create) some locations on the filesystem (which are accessible from samba shares) that have content you would like to index
- install fscrawler
unzip fscrawler-distribution-2.10-20240702.144319-374.zip
- create a user to use to communicate with elasticsearch or opensearch to populate the index
- Here we will use the 'elastic' user that comes already set up with elasticsearch. Note: the elasticsearch 'elastic' user is a superuser. You might want to consider creating a specific elasticsearch user for fscrawler that has appropriate roles assigned, e.g. with 'just enough' privileges to access the index(es) you want to create/modify. The same applies to opensearch.
For elasticsearch see creating users, creating roles and creating API keys; similarly for opensearch see here and the associated documentation.
- use fscrawler to create an index
./fscrawler-distribution-2.10-SNAPSHOT/bin/fscrawler --setup index_name
- edit the config file ~/.fscrawler/index_name/_settings.yaml created in the last step
Example of a minimal config (without ssl). For details on defaults please see the comments in the _settings.yaml created in the above step.
---
name: "index_name"
fs:
  url: "/path/to/index/files"
  continue_on_error: true
  attributes_support: true
  raw_metadata: true
elasticsearch:
  nodes:
  - url: "http://localhost:9200"
  ssl_verification: false
  username: "elastic"
  password: "somesecretpassword"
- run fscrawler again
./fscrawler-distribution-2.10-SNAPSHOT/bin/fscrawler index_name --loop 1
Configure WSP for samba
use the following global configuration
wsp backend = elasticsearch
optionally map one, more or all authenticated samba user(s) to one or more search user(s) with the following 'username map' configuration
elasticsearch:username_map = /etc/samba/map_searchusers
username map format
The optional username map file is processed in the same way as the normal username map file (G). The file is parsed line by line; each line should contain a single user on the left (the user to map to), followed by '=', followed by a list of usernames to map from, e.g.
opensearch_user = samba user
'*' can be used in place of a 'samba user' to match all currently unmatched samba users. If a user remains unmapped, either because nothing in the username map file matches or because 'elasticsearch:username_map' is not defined, the server communicates with the search backend using anonymous authentication.
Example
admin = *
will map all previously unmapped (in username map file) users to admin
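The mapping rules described above can be sketched in Python as follows. This is only an illustration of the file format as described here, not samba's parser: the function names are hypothetical, the first matching line is assumed to win, and features of the normal username map file (such as '!' prefixes, '@group' entries or case-insensitive matching) are deliberately not modelled.

```python
from typing import Optional


def parse_username_map(text: str) -> list[tuple[str, list[str]]]:
    """Parse 'search_user = samba_user [samba_user ...]' lines in file order."""
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines, comments and lines without a mapping.
        if not line or line[0] in "#;" or "=" not in line:
            continue
        search_user, _, samba_users = line.partition("=")
        entries.append((search_user.strip(), samba_users.split()))
    return entries


def map_search_user(entries: list[tuple[str, list[str]]],
                    samba_user: str) -> Optional[str]:
    """Return the search user for samba_user, or None for anonymous access.

    Assumption: the first line that names the user (or contains '*') wins,
    so a trailing 'admin = *' only catches users not matched earlier.
    """
    for search_user, samba_users in entries:
        if samba_user in samba_users or "*" in samba_users:
            return search_user
    return None


EXAMPLE_MAP = """\
# explicit mappings first, catch-all last
elastic = alice bob
admin = *
"""
```

With EXAMPLE_MAP, 'alice' and 'bob' are mapped to 'elastic' and every other samba user falls through to 'admin'; without the last line, other users would get None, i.e. anonymous authentication.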
Note: passwords associated with the search users are managed by the net command.
Storing a password to be used for normal http_auth authentication for a 'search' (e.g. elasticsearch or opensearch) user
net setsearchuser -Uelastic%1234 -W ""
Storing an apikey to be used with http_apikey authentication where 'id' and 'api_key' are obtained from the results of an apikey creation api call. For more info about creating api keys please see creating api keys
{
"id" : "23mvj5IBhjGmVofgCu1p",
"name" : "searchserver",
"api_key" : "ZXfRonkITEyf_dBHk_4o_g",
"encoded" : "MjNtdmo1SUJoakdtVm9mZ0N1MXA6WlhmUm9ua0lURXlmX2RCSGtfNG9fZw=="
}
net setsearchuser -U23mvj5IBhjGmVofgCu1p%ZXfRonkITEyf_dBHk_4o_g -W "" apikey
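To make the relationship between the stored credentials and what is sent over http more concrete, here is a small Python sketch that builds the two kinds of Authorization header. It shows the header formats expected by the elasticsearch http API (Basic and ApiKey); how samba builds them internally is not shown here, and the function names are hypothetical. The 'encoded' field in the apikey response above is simply the base64 encoding of 'id:api_key'.

```python
import base64


def _b64(text: str) -> str:
    """Base64-encode a UTF-8 string and return the result as text."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def basic_auth_header(user: str, password: str) -> str:
    """Authorization header value for http basic authentication."""
    return "Basic " + _b64(f"{user}:{password}")


def apikey_auth_header(key_id: str, api_key: str) -> str:
    """Authorization header value for elasticsearch apikey authentication."""
    return "ApiKey " + _b64(f"{key_id}:{api_key}")


# Values taken from the example apikey creation response above.
EXAMPLE_ID = "23mvj5IBhjGmVofgCu1p"
EXAMPLE_API_KEY = "ZXfRonkITEyf_dBHk_4o_g"
```

Calling apikey_auth_header(EXAMPLE_ID, EXAMPLE_API_KEY) yields 'ApiKey ' followed by the same string as the 'encoded' field, which is a handy way to check that the id and api_key passed to 'net setsearchuser' were copied correctly.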
Deleting a username/password (or apikey)
net setsearchuser -Uelastic -W "" delete
Listing search users/id(s) stored
net listsearchusers
Retrieving password/key associated with user/id
net getsearchuser -Uelastic -W ""
use the following share configuration
wsp = true
elasticsearch:index = index_name
elasticsearch:max results = 200
elasticsearch:wsp_acl_filtering = true
elasticsearch:wsp use fscrawler folders = true
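Because enabling 'elasticsearch:wsp_acl_filtering' without 'elasticsearch:max results' causes queries to fail (see the section on securing indexed content), a quick sanity check of the share configuration can be useful. The Python sketch below is a hypothetical lint helper, not a samba tool: it treats smb.conf as a simple INI file, which ignores includes, registry configuration and samba's own parameter name normalisation.

```python
import configparser


def check_wsp_share(smb_conf_text: str, share: str) -> list[str]:
    """Return a list of problems found in the WSP settings of one share."""
    # Only '=' is a delimiter, so 'elasticsearch:index' stays a single key.
    parser = configparser.ConfigParser(
        delimiters=("=",), interpolation=None, strict=False
    )
    parser.read_string(smb_conf_text)
    opts = parser[share]

    problems = []
    acl = opts.get("elasticsearch:wsp_acl_filtering", "false").strip().lower()
    if acl in ("true", "yes", "1") and "elasticsearch:max results" not in opts:
        problems.append(
            "elasticsearch:wsp_acl_filtering is enabled but "
            "elasticsearch:max results is not set"
        )
    return problems


EXAMPLE_SHARE = """\
[share]
wsp = true
elasticsearch:index = index_name
elasticsearch:max results = 200
elasticsearch:wsp_acl_filtering = true
"""
```

For EXAMPLE_SHARE the helper returns an empty list; removing the 'max results' line makes it report one problem.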
start samba
systemctl start smb.service
start wspd server
If you have installed from the suse repo above you should be able to simply enable socket activation (which will autostart the wspd server whenever a client connects if it isn't already running)
systemctl enable wspd.socket
systemctl start wspd.socket
If you have built from source and haven't configured with '--with-systemd-install-services' then you can start the WSP service manually (wspd is located in the directory configured by --libexecdir) e.g.
/usr/lib64/samba/wspd --timeout=0
Note: specifying the timeout as zero prevents the wsp server from exiting after a specific idle time
wsp smb.conf configuration parameters
| name | value | description |
|---|---|---|
| elasticsearch:delegator (S) [*] | elastic | Elasticsearch user used to submit search requests on behalf of other users |
| elasticsearch:user (S) [*] | esuser | Elasticsearch user used to authenticate to Elasticsearch |
| elasticsearch:username_map (S) [*] | es_users_map | Path to a file containing a mapping of samba client usernames to elasticsearch usernames (Note: elasticsearch:username_map and elasticsearch:user are mutually exclusive, with elasticsearch:user preferred if both settings are present) |
| elasticsearch:address (S) | needle.haystack.samba.org | Name of the Elasticsearch server to use |
| elasticsearch:index (S) | index | Name of the elasticsearch index to search |
| elasticsearch:max results (S) | 100 | Number of results to return |
| elasticsearch:port (S) | 8000 | TCP port of the Elasticsearch server to use |
| elasticsearch:use tls (S) | no | A boolean value which specifies whether to use HTTPS when talking to the Elasticsearch server |

| name | value | description |
|---|---|---|
| elasticsearch:wsp use fscrawler folders (S) [*] | yes | A boolean specifying whether to additionally use fscrawler's folder index to include folder results in searches |
| elasticsearch:wsp_acl_filtering (S) [*] | false | A boolean specifying whether results should be filtered based on whether the smbclient user has access to the result |
| elasticsearch:wsp_mappings (G) [*] | mymappings.json | A json file specifying the mapping of elasticsearch attributes to WSP properties (and associated conversions) |
| wsp backend (G) [*] | elasticsearch | String specifying the backend to use (elasticsearch is currently the only supported backend) |
[*] These parameters are not yet in the 'official' samba sources
use wspsearch cli or windows client to search for content (e.g. pictures)
wspsearch -U$user%$password //$host/$share --kind picture
Securing indexed content
It is important to understand that the samba WSP server only translates the WSP protocol messages that make up a query and/or the traversal and retrieval of results into requests that an elastic/opensearch server can understand. Those requests are made against elastic/opensearch using either the anonymous user, a basic user or an api key (elasticsearch only).
samba WSP communicates with elastic/opensearch over encrypted (or plain text) http connections, using either anonymous access or basic authentication.
elastic/opensearch provide authentication and access control via their own security features, so it is the permissions of the authenticated elastic/opensearch user (or, more precisely, the role associated with that user) that determine what information can be retrieved, and not (at least not directly) the authenticated samba user. (Note: when using basic authentication (or api keys) there is a mapping defined between the authenticated samba user and the authenticated elastic/opensearch user.)
Populating the index is a separate process. fscrawler is a tool for scanning files (of different types, e.g. videos, photographs, documents etc.) and populating an elastic/opensearch index with the metadata that it has extracted from those files. In order to access the files, fscrawler of course needs to run as a user that has permission to read them. The fscrawler configuration can define the elastic/opensearch credentials used to communicate with the elastic/opensearch instance, or those credentials can be passed on the command line.
So the users involved in searching and retrieving results are
- the local user on the host machine (where the files to be indexed are located) that the fscrawler process runs as in order to read the files and produce the metadata used to populate the index
- the elastic/opensearch user used by fscrawler to populate the elastic/opensearch index
- the elastic/opensearch user used by the samba WSP server to query the elastic/opensearch index
- the authenticated samba user performing the search (who may or may not have permission to access the files returned by the search)
There is therefore a likely disconnect between the user used to populate the index, the user used to search that index and the authenticated samba user.
The samba WSP server is, however, able to acl check the files in the results and filter out those that the authenticated samba user cannot read. This can be enabled with the following setting
elasticsearch:wsp_acl_filtering [S]
However, enabling acl filtering brings with it some penalties
- query results are now cached
- you MUST specify a limit on the number of results returned (cached) with elasticsearch:max results [S], or the query will fail
- depending on the data stored in the indexes, searches could take much longer than expected. For example, a search might yield a large number of results, but if the results that satisfy the acl check are mostly at the end of the result set, then a very large number of results (or maybe even all of them) will need to be tested before the 'max results' to be cached is reached
- because results are now cached, the memory footprint will be impacted, and many concurrent searches could affect the available memory of the host system
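The cost described in the list above can be illustrated with a short Python sketch. It is a simplified model under stated assumptions, not the samba implementation: results are plain path strings, the can_read callback stands in for the real acl check, and the function name is hypothetical.

```python
from typing import Callable, Iterable


def filter_by_acl(results: Iterable[str],
                  can_read: Callable[[str], bool],
                  max_results: int) -> tuple[list[str], int]:
    """Cache up to max_results readable hits; also return how many were checked."""
    if max_results <= 0:
        # Mirrors the rule that a limit must be set when acl filtering is on.
        raise ValueError("max_results must be a positive limit")
    cached: list[str] = []
    checked = 0
    for path in results:
        checked += 1
        if can_read(path):
            cached.append(path)
            if len(cached) >= max_results:
                break  # the cache is full, stop testing further results
    return cached, checked


# Worst case: the only readable hits are at the very end of the result set.
WORST_CASE = [f"/share/private/{i}" for i in range(998)] + [
    "/share/public/a",
    "/share/public/b",
]


def is_public(path: str) -> bool:
    """Stand-in acl check: only paths under /public/ are readable."""
    return "/public/" in path
```

With WORST_CASE and a limit of 2, all 1000 results have to be checked before the cache is filled, whereas the same hits in reverse order need only 2 checks. The cached list is also what accounts for the additional memory use per search.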
If at all possible it is better NOT to enable acl filtering. That way the search results don't need to be additionally filtered; however, the flip side is
- filenames and paths not readable to the authenticated samba user MAY be returned as results of a search (possible information leak)
- similarly information such as keywords or content from a document result may be exposed to a user that otherwise might not have permission to access that information
The user used to authenticate against elastic/opensearch is specified by the mapping between the authenticated samba user and an elastic/opensearch basic user (see elasticsearch:username_map [G])
Using elastic/opensearch to limit results
elastic/opensearch users have roles associated with them, and those roles can be configured to limit or allow access to specific indexes. So it is possible to limit access to the information stored in an index based on the role associated with the authenticated elastic/opensearch user. The elastic/opensearch user used to communicate with elastic/opensearch can be mapped from the authenticated samba user (see above).
Further granularity can be added by using index names that are user specific. E.g. the index name that samba is configured to use when sending a query to elasticsearch can contain variable substitutions, so the index name could, for example, be based on a variation of the username of the authenticated samba user (this is suitable for an index dedicated to a specific share that is a personal data store for that user).
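As a rough illustration of a user specific index name, the Python sketch below expands a template containing a username variable. The use of '%U' (the smb.conf variable for the session username) is an assumption made for this example, as is the helper name; which substitutions are actually honoured for 'elasticsearch:index' is not stated here.

```python
def expand_index_name(template: str, samba_user: str) -> str:
    """Replace every '%U' in the template with the samba username.

    Note: elasticsearch index names must be lowercase, so the usernames
    (or the resulting names) need to respect that when indexes are created.
    """
    return template.replace("%U", samba_user)


# Hypothetical template for a per-user 'home' share index.
EXAMPLE_TEMPLATE = "home_%U"
```

With EXAMPLE_TEMPLATE, a search by the samba user 'alice' would be sent to the index 'home_alice', so fscrawler would need to populate an index of that name from alice's files.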
The role associated with the authenticated elastic/opensearch user can also be modified to use field level security, which allows sensitive fields to be included in or excluded from the results of a query
elasticsearch:
opensearch:
Additionally, to align the users that have access to a share (and its files), you could use a combination of 'force user', 'valid groups' or 'valid users' to limit access to the shares to be searched with elastic/opensearch. This could be used, for example, to ensure there is a match between the files 'crawled' using a user with a certain group and the group used to access the share (and its files).
