SMB3 kernel status
This page describes the plan, design and work in progress of the efforts to implement SMB3 and later dialects in the kernel cifs/smb3 client (cifs.ko).
The minimum kernel version for Linux kernel SMB3 support is 3.12 (or a backport of cifs.ko module version 2.02 or later), but kernel version 3.18 or later (or the equivalent, i.e. cifs module version 2.04 or later) is recommended for best SMB3 support.
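The version thresholds above can be checked mechanically. The following is a minimal sketch, not part of cifs.ko or any existing tool: the function name and the three labels it returns are assumptions made for illustration, and it looks only at the kernel release string, so it cannot detect a backported cifs.ko module.

```python
import re

MINIMUM = (3, 12)      # minimum kernel version named above
RECOMMENDED = (3, 18)  # recommended kernel version named above


def smb3_support_level(release: str) -> str:
    """Classify a kernel release string such as '3.18.0-25-generic'.

    Hypothetical helper: the labels are illustrative, not kernel terminology.
    """
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    version = (int(match.group(1)), int(match.group(2)))
    if version < MINIMUM:
        return "unsupported"
    if version < RECOMMENDED:
        return "minimum"
    return "recommended"
```

On a running system the input could come from `platform.release()` in the Python standard library.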
- SMB 2.0 (SMB2.02 dialect) was introduced with Windows Vista/2008 and includes a useful data integrity feature ("durable file handles"). Implementation in the kernel client is complete.
- SMB 2.1 was introduced with Windows 7/Windows 2008R2.
- Basic support for SMB 2.1 was added in kernel version 3.7.
- Features done:
- multi credit/large MTU
- Features TODO:
- resilient file handles
- branch cache
- SMB 3 (previously known as SMB2.2 dialect) was introduced with Windows 8 and Windows Server 2012. SMB3 support in the kernel was much improved in kernel version 3.12. The SMB3 dialect defines the following features:
- Basic support for SMB3 is included, as are security improvements (improved, faster, more secure packet signing and secure negotiate protection against downgrade attacks).
- In addition the client can now do network interface discovery (a new FSCTL)
- Still need to do:
- share level encryption
- directory leases
- persistent file handles
- multi channel
- SMB Direct (SMB3 over RDMA)
- witness notification protocol (a new RPC service)
- Support for a misc. set of loosely related storage features for virtualization (new fsctls, T10 block copy offload, TRIM etc.)
- remote shadow copy support
- branch cache v2
- SMB3.02 was introduced in Windows 8.1 (Windows 'Blue') and Windows Server 2012 R2. Among the new protocol features are those particularly useful for virtualization (Hyper-V):
- SMB3.02 dialect is not yet negotiated by Samba servers
- SMB3.02 dialect can be requested by the Linux cifs client ("vers=3.02" on mount) but the new optional features, unique to SMB3.02, are not requested.
- Unbuffered I/O flags (i.e. a 'no cache' flag which may be sent on read or write)
- New RDMA remote invalidate flag
- MS-RSVD (a set of remoteable FSCTLs that improve "SCSI over SMB3")
- Asymmetric Shares (extensions to the Witness protocol that allow moving users of one share to a different server, e.g. for load balancing or maintenance; previously the witness protocol could only do this on a per-server rather than per-share basis).
- SMB3.1.1 was introduced in Windows 10. Among the new features defined:
- Improvements to security negotiation ("negotiate contexts") and dynamically selectable Cipher and Hash Algorithms.
- New FSCTL for server side copying of file ranges (DUPLICATE_EXTENTS)
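The "dynamically selectable" cipher and hash algorithms mentioned above amount to one side offering a list of algorithm identifiers and the other picking one it also supports. The following is a minimal sketch of that selection step only. The identifier values are the ones I understand MS-SMB2 to assign for SMB 3.1.1 negotiate contexts, the function name is hypothetical, and the "first offered match wins" policy is an assumption; a real server may choose in an implementation-specific way.

```python
from typing import Optional, Sequence

# Algorithm identifiers as I understand them from MS-SMB2 (SMB 3.1.1).
AES_128_CCM = 0x0001  # encryption cipher
AES_128_GCM = 0x0002  # encryption cipher
SHA_512 = 0x0001      # preauthentication integrity hash


def select_algorithm(offered: Sequence[int],
                     supported: Sequence[int]) -> Optional[int]:
    """Return the first offered algorithm that is also supported.

    Hypothetical policy for illustration; returns None when there is
    no algorithm in common.
    """
    for algorithm in offered:
        if algorithm in supported:
            return algorithm
    return None
```

For example, a client offering `[AES_128_GCM, AES_128_CCM]` to a peer that only supports `AES_128_CCM` would end up with `AES_128_CCM`.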
Prerequisite / accompanying work
implement durable open and durable reconnect with reopening files
Durable handle cross-node
There is currently no mechanism to reconnect to a server other than the one that was mounted (e.g. we do not fail over to an alternate DFS referral if two servers export the same DFS path, and we do not support witness protocol failover, so we cannot yet reconnect to a different server).
Multi Credit / Large MTU
Resilient File Handles
- Encryption and improved packet signing
- Secure negotiate (complete)
- Directory leases are a mechanism for caching metadata read operations/directory listings of child objects of a directory (File leases are a mechanism for caching the data operations.)
- The client maintains separate caches for each user context, but still uses just one lease to invalidate the cache. This is needed because access based enumeration may cause different directory listings depending on the user context.
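The per-user caching described above can be sketched as a small data structure: one cache object per directory, one listing per user context, and a single lease whose break discards every user's listing. This is an illustrative model only; the class and method names are hypothetical and do not correspond to anything in cifs.ko.

```python
class DirectoryLeaseCache:
    """Cached listings of one directory, kept per user, under one lease.

    Hypothetical sketch for illustration, not the cifs.ko design.
    """

    def __init__(self):
        self.lease_held = False
        self._listings = {}  # user -> list of entry names

    def grant_lease(self):
        self.lease_held = True

    def store(self, user, entries):
        # Cache only while the lease is held; without it the server
        # would not tell us when the directory changes.
        if self.lease_held:
            self._listings[user] = list(entries)

    def lookup(self, user):
        # A listing cached for one user is never served to another,
        # because access based enumeration may differ per user.
        if not self.lease_held:
            return None
        return self._listings.get(user)

    def lease_break(self):
        # A single lease break invalidates every user's cached view.
        self.lease_held = False
        self._listings.clear()
```

Keying the listings by user keeps a restricted user from seeing entries cached on behalf of a more privileged one, while the single lease keeps invalidation simple.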
Persistent File Handles
Persistent file handles are like durable file handles with strong guarantees. They are requested with the durable v2 create request blob with the persistent flag set to true. The server only grants persistent handles on shares that are marked CA (continuously available).
There is no finished design yet for the implementation of persistent handles. The foundations have been laid with the introduction of durable handles. The challenge is to implement the additional guarantees.
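To make the "durable v2 create request blob with the persistent flag" concrete, here is a minimal sketch that packs the data portion of such a request. The field layout (Timeout, Flags, Reserved, CreateGuid) and the flag value are as I understand them from MS-SMB2 and should be checked against the specification; the function name is hypothetical, and the surrounding create context header is deliberately left out.

```python
import struct
import uuid

# Flag value as I understand it from MS-SMB2 (persistent handle request).
SMB2_DHANDLE_FLAG_PERSISTENT = 0x00000002


def build_durable_v2_request(create_guid: uuid.UUID,
                             timeout_ms: int = 0,
                             persistent: bool = False) -> bytes:
    """Pack the 32-byte data portion of a durable handle v2 request.

    Assumed little-endian layout: Timeout (4 bytes), Flags (4 bytes),
    Reserved (8 bytes), CreateGuid (16 bytes).
    Hypothetical helper for illustration only.
    """
    flags = SMB2_DHANDLE_FLAG_PERSISTENT if persistent else 0
    fixed = struct.pack("<IIQ", timeout_ms, flags, 0)
    # GUIDs travel in the mixed-endian Windows layout, which is what
    # uuid.UUID.bytes_le produces.
    return fixed + create_guid.bytes_le
```

A client would generate a fresh `uuid.uuid4()` per open and keep it, since the same CreateGuid identifies the open when reconnecting.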
Witness Notification Protocol
- we need a tool to display the witness registrations
- we need a tool to move client to a different node
SMB Direct (aka SMB 3.0 over RDMA)
SMB-Direct backend for smb_transport abstraction
RDMA Read/Write support in the client
Remote Shadow Copy (FSRVP)
Not an SMB 3.0 specific feature per se.
- Need to add:
- add rpcclient support for FSRVP commands
- implement user interface (/proc or /sys or ioctl) and tools for this.
Branch Cache v2
Branch Cache is a wide area network caching protocol implemented in Windows 7 and later. It allows the server to return hashes of the data to the client, and then the client can use these hashes to request copies of the actual data from nearby systems, optimizing network bandwidth. Although Branch Cache is not SMB3 specific (e.g. HTTP etc) it is useful in conjunction with SMB2.1 and SMB3 file serving to improve WAN performance and better optimize bandwidth usage. See MS-PCCRC, MS-PCCRD, MS-PCCRR.
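The hash-then-fetch idea behind Branch Cache can be sketched in a few lines. This is a conceptual model only, under loud assumptions: it uses SHA-256 over tiny fixed-size segments and a plain dictionary standing in for nearby peers, which is not the segmentation, hashing, or discovery scheme specified in MS-PCCRC, MS-PCCRD, or MS-PCCRR. All names are hypothetical.

```python
import hashlib

SEGMENT_SIZE = 4  # deliberately tiny, for illustration only


def content_hashes(data: bytes, segment_size: int = SEGMENT_SIZE) -> list:
    """Server side: return one hash per fixed-size segment of the data."""
    return [hashlib.sha256(data[i:i + segment_size]).hexdigest()
            for i in range(0, len(data), segment_size)]


def fetch(hashes, peer_cache: dict, fetch_from_server) -> bytes:
    """Client side: reassemble the data, preferring nearby peers.

    peer_cache maps a hash to segment bytes held by nearby systems.
    fetch_from_server(index) is the slow WAN fallback. Every peer
    segment is verified against its hash before being trusted.
    """
    parts = []
    for index, digest in enumerate(hashes):
        segment = peer_cache.get(digest)
        if segment is None or hashlib.sha256(segment).hexdigest() != digest:
            segment = fetch_from_server(index)
        parts.append(segment)
    return b"".join(parts)
```

Only segments that no nearby peer can supply (or that fail verification) cross the WAN, which is where the bandwidth saving comes from.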
See http://www.snia.org/sites/default/files2/SDC2013/presentations/SMB3/DavidKruse_SMB3_Update.pdf
SMB3.02 is very similar to SMB3 but adds some optional features. Note that the Linux CIFS client can negotiate the SMB3.02 dialect (with these optional features disabled) by specifying vers=3.02 on mount. The Samba server cannot currently negotiate SMB3.02, as it does not support the new READ/WRITE flags (and the RDMA and Witness protocol improvements in SMB3.02 are not possible until the prerequisite optional SMB3.0 features that they are based on are added).
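As a small illustration of requesting the dialect at mount time, the sketch below assembles a `mount -t cifs` command line with `vers=3.02`. The helper name, the server and share names, and the mount point are placeholders made up for this example; only the `vers=3.02` option comes from the text above.

```python
def build_cifs_mount_command(unc, mountpoint, vers="3.02", extra_options=()):
    """Return the argv list for a cifs mount requesting a given dialect.

    Hypothetical helper: it only builds the command and does not run it.
    """
    options = [f"vers={vers}", *extra_options]
    return ["mount", "-t", "cifs", unc, mountpoint, "-o", ",".join(options)]


# Placeholder server, share and mount point, for illustration only.
command = build_cifs_mount_command("//server/share", "/mnt/share")
```

The resulting list could be passed to `subprocess.run(command, check=True)` by a process with sufficient privileges to mount.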
Currently cifs.ko can negotiate the SMB3.02 dialect (vers=3.02) but does not request the optional features listed below, so a vers=3.02 mount acts much like a vers=3.0 mount.
SMB Direct Remote Invalidation. Improves performance.
New ReadWrite Flags
SMB2_READFLAG_UNBUFFERED and SMB2_WRITEFLAG_UNBUFFERED allow the client to indicate whether an individual I/O request (read or write) should be cached by the server. There are no interfaces in the Linux kernel VFS for per-I/O flags yet, so support for this on the wire would require a private ioctl on the client.
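The per-request nature of these flags can be sketched as a function that decides, for each individual read or write, which flag value goes on the wire. The flag names follow the text above; the numeric values are my recollection of MS-SMB2 and should be treated as assumptions to verify, and the helper itself is hypothetical.

```python
# Values assumed from MS-SMB2; verify against the specification.
SMB2_READFLAG_UNBUFFERED = 0x01
SMB2_WRITEFLAG_UNBUFFERED = 0x02


def io_flags(operation: str, unbuffered: bool) -> int:
    """Return the Flags value for one read or write request.

    Hypothetical helper: the choice is made per request, so buffered
    and unbuffered I/O can be mixed on the same open file.
    """
    if operation not in ("read", "write"):
        raise ValueError(f"unsupported operation: {operation!r}")
    if not unbuffered:
        return 0
    if operation == "read":
        return SMB2_READFLAG_UNBUFFERED
    return SMB2_WRITEFLAG_UNBUFFERED
```

Because the Linux VFS has no per-I/O flag today, the `unbuffered` argument here stands in for whatever a private ioctl or similar interface would carry.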
The Witness protocol can now signal Windows clients to 'move' from one share to another. This allows more flexible migration: a volume can be taken offline without taking the whole server down, and applications continue to run even as the storage they use is moved. Previous versions of the witness protocol allowed users of one server to be moved to another server, but this allows more granular movement, since those using a particular share can now be redirected on the fly to another share.
Cluster-Wide Durable Handles
Work in progress