SMB3-Linux

From SambaWiki

There are various requirements for full POSIX compatibility, as well as others which, although not strictly POSIX (such as support for symlinks and the fallocate system call), are common in Linux and various Unix variants and useful to applications. The goal is to implement emulation strategies and extensions to the SMB3 protocol which are as small as reasonably possible but cover the most important of these missing features, so that the network file system appears nearly identical to a local file system to users and the applications they run, without creating unacceptable performance or configuration problems.

Requirements

In this document, "POSIX CC" stands for POSIX Create Context: a chunk of data that can optionally be included in a Create request or response.

The general requirements for SMB3 POSIX extensions include the following:


POSIX mode bits

The primitive 07777 bits used to control who can access a file (RWX bits for user, group and other, plus the sticky, setuid and setgid bits).

status

Can now be set on creation via the POSIX CC.

Emulatable via "cifsacl" (a cifs.ko mount option), which derives them from the server's "RichACL" (NTFS/SMB3/NFSv4). Using an approach similar to the "NFSv4 mode ACE" may be helpful as well. The prototype is not complete: SMB3_SetACL and SMB3_GetACL worker functions for Linux's cifs.ko have been prototyped but not reviewed yet.

mkdir setuid/setgid: In Linux, mkdir() strips setuid and setgid bits (not a bug).

mkdir user read/execute: Samba returns access denied on mkdir of a directory whose mode lacks read and execute for the owner, regardless of whether the directory was successfully created. It needs u=rx to succeed. This needs to be worked around in cifs.ko (TODO: try mkdir + setinfo?)
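
The "mkdir + setinfo" idea in the TODO can be illustrated with a local analogy: create the directory with owner read/execute set, then apply the requested mode in a second step. This is only a sketch using local syscalls, not SMB3 requests, and the helper name mkdir_then_chmod is hypothetical.

```python
import os
import stat
import tempfile


def mkdir_then_chmod(path, mode):
    """Create `path` with u=rwx first, then apply the requested mode.

    Local analogy of "mkdir + setinfo": the create step always carries
    owner read/execute, and the real mode is applied afterwards.
    """
    os.mkdir(path, 0o700)   # step 1: create with a mode that includes u=rx
    os.chmod(path, mode)    # step 2: apply the mode the caller asked for
    return stat.S_IMODE(os.stat(path).st_mode)


def demo():
    """Create a directory whose final mode lacks owner read (0300)."""
    with tempfile.TemporaryDirectory() as parent:
        target = os.path.join(parent, "d")
        result = mkdir_then_chmod(target, 0o300)
        os.chmod(target, 0o700)  # restore access so cleanup can remove it
        return result
```

Whether a two-step create is acceptable in cifs.ko depends on the window between the two requests, which this local sketch does not model.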

POSIX file ownership

UID and GID owners. Windows typically only has one or the other, and expresses them as global "SIDs", which are longer identifiers than locally defined UIDs.

status

See POSIX mode bits status.

Symbolic links

Windows now has the concept of reparse points. Can this be reused in SMB somehow?

status

Use the "mfsymlinks" approach used by Apple among others. Implemented in cifs.ko. Will be in kernel 3.18 and later. Should be backportable to earlier kernels.
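
As a rough illustration of the "mfsymlinks" idea, the symlink is stored as a small regular file whose contents encode the target. The layout below (an "XSym" marker, a 4-digit length, an MD5 hex digest, the target, then space padding to 1067 bytes) is our understanding of what cifs.ko writes and should be checked against the kernel source before relying on it. The function names are hypothetical.

```python
import hashlib

# Assumed constants: 5 + 5 + 33 + 1024 bytes (verify against cifs.ko).
MF_SYMLINK_FILE_SIZE = 1067
MF_SYMLINK_LINK_MAXLEN = 1024


def format_mf_symlink(target: str) -> bytes:
    """Build the contents of the regular file that stands in for a symlink."""
    link = target.encode("utf-8")
    if len(link) > MF_SYMLINK_LINK_MAXLEN:
        raise ValueError("symlink target too long")
    md5_hex = hashlib.md5(link).hexdigest().encode("ascii")
    body = b"XSym\n%04d\n%s\n%s\n" % (len(link), md5_hex, link)
    # Pad with spaces to the fixed file size (and never exceed it).
    return body.ljust(MF_SYMLINK_FILE_SIZE, b" ")[:MF_SYMLINK_FILE_SIZE]


def parse_mf_symlink(data: bytes) -> str:
    """Recover the target, checking the marker, size and checksum."""
    if len(data) != MF_SYMLINK_FILE_SIZE or not data.startswith(b"XSym\n"):
        raise ValueError("not an mfsymlink file")
    length = int(data[5:9])
    md5_hex = data[10:42]
    link = data[43:43 + length]
    if hashlib.md5(link).hexdigest().encode("ascii") != md5_hex:
        raise ValueError("checksum mismatch")
    return link.decode("utf-8")
```

The fixed size and checksum let a client tell these files apart from ordinary files without any protocol extension.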

Case sensitivity

status

Files opened with the POSIX Create Context get POSIX semantics, including case sensitivity.

No reserved path characters

Mapping 7 reserved characters (not allowed in SMB3/CIFS/NTFS/Windows but allowed in POSIX). They include: * ? < > : | \

status

There are 2 ways to do this:

  • Send the path unmodified with a POSIX CC
  • Map the reserved characters to an unreserved but "invalid" unicode range. 2 mappings already exist:
    • Microsoft's "SFU" (SUA) mapping
    • Apple's "SFM" mapping.

The SFU mapping is available in CIFS (and in SMB3 as of 3.18) with the "mapchars" mount option, but we plan to use the Apple ("SFM") mapping by default in kernel 3.18 and later. Samba requires the "vfs_fruit" module to implement the Apple mapping of the seven reserved characters.
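
The second option (remapping) can be sketched as a reversible substitution applied to each path component. The offset of 0xF000 used here is an assumption modelled on the SFU-style mapping; the SFM mapping uses a different table, and neither table is reproduced from the specification here. The function names are hypothetical.

```python
RESERVED = '*?<>:|\\'          # the seven reserved characters listed above
PRIVATE_USE_OFFSET = 0xF000    # assumption: remap into the U+F0xx range


def map_reserved(name: str) -> str:
    """Replace each reserved character with its remapped code point."""
    return "".join(
        chr(ord(c) + PRIVATE_USE_OFFSET) if c in RESERVED else c
        for c in name
    )


def unmap_reserved(name: str) -> str:
    """Inverse of map_reserved: restore the original POSIX name."""
    remapped = {chr(ord(c) + PRIVATE_USE_OFFSET): c for c in RESERVED}
    return "".join(remapped.get(c, c) for c in name)
```

For example, map_reserved("a:b") yields "a\uf03ab", which contains no character the server would reject, and unmap_reserved restores "a:b" on the way back.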

mkfifo and mknod

status

These are emulated using the same approach that Microsoft SFU and others did. Uses the "sfu" mount option (available in 3.18 kernel or later).

POSIX unlink and rename behavior

  • unlink: deleting an open file, removing it from the namespace, is allowed in POSIX but not in Windows
  • rename: renaming a directory that has open files is perfectly legal in POSIX but not in Windows (even recursively)

status

Emulatable over SMB3 for most cases (using "delete on close" and an approach like the NFS "silly rename"). The 3.18 kernel will handle these better, but the POSIX Create Context is still likely to be required for a few use cases.

POSIX byte range locks

POSIX "advisory" byte range locks (SMB3 allows Windows style "mandatory" byte range locks). POSIX locks are also merged when they overlap, and all locks are released on file close, making them both confusing to use (locally on Linux file systems, and even more so over network file systems) and more difficult to emulate. Although many dislike the POSIX byte range lock behavior, implementing it in SMB3 would help some applications.
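
The merging behavior is one reason emulation is awkward: the server-side lock list no longer matches the sequence of requests the application made. A minimal sketch for a single lock owner, using half-open (start, end) ranges, follows. Treating adjacent ranges as mergeable is an assumption of this sketch, and add_lock is a hypothetical name.

```python
def add_lock(locks, start, length):
    """Return sorted, non-overlapping (start, end) ranges after adding one.

    `locks` must already be non-overlapping; ranges are half-open, so
    (0, 10) covers bytes 0..9.
    """
    new_start, new_end = start, start + length
    merged = []
    for s, e in sorted(locks):
        if e < new_start or s > new_end:
            merged.append((s, e))   # disjoint: keep as-is
        else:
            # Overlapping or touching: absorb into the new range.
            new_start, new_end = min(s, new_start), max(e, new_end)
    merged.append((new_start, new_end))
    return sorted(merged)
```

With Windows style locks the two requests (0, 10) and (5, 15) remain separate entries, whereas here they collapse into a single (0, 15) range.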

status

POSIX CC will enable POSIX flavor of locks on the handle.

Emulated via mandatory locks today, and can also be "local only" (with a cifs.ko mount option "nobrl").

More information returned in stat() syscall

  • Slight differences in "stat" system call (and the mode/ownership information noted above)
  • Additional information returned on the "statfs" system call:
    • f_files; /* total file nodes in file system */
    • f_ffree; /* free file nodes in fs */
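
For reference, these two fields are what a local Linux file system reports for inode counts. A minimal sketch using Python's os.statvfs follows. It makes a local call, not an SMB3 request, and inode_usage is a hypothetical helper name.

```python
import os


def inode_usage(path="/"):
    """Return the two statfs fields discussed above for `path`."""
    st = os.statvfs(path)
    return {
        "f_files": st.f_files,   # total file nodes in file system
        "f_ffree": st.f_ffree,   # free file nodes in file system
    }
```

Some file systems report zero for both values, so callers should not assume they are meaningful everywhere.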

status

  • stat: Use POSIX information level to get additional stat fields in QUERY INFO and FIND requests.
  • statfs: fields still missing

POSIX ACL support

Linux implements an ACL model for local file systems which is less complex than the more common "RichACLs" (i.e. NFSv4 or NTFS/SMB/SMB3 ACLs) and easier to understand.

status

Could be mapped to SMB3/NTFS RichACLs which are a superset of POSIX ACLs. Also could be handled via "POSIX Create Context".

fallocate() parameters

Many fallocate options are available; most, but not all, are mappable to various existing SMB3 ioctls.

TODO: examples

status

Partially implemented already, as are a few other new Linux syscalls which are not broadly implemented. More research is needed.

Code & tests

Sample smb.conf for Samba (see pike README):

[global]
server max protocol = SMB3_11
unix extensions = yes

[share]
create mask = 07777
directory mask = 07777
mangled names = no
path = /tmp/share
read only = no
guest ok = yes

Linux kernel mount options:

mount -t smb3 //<address>/<share> /mnt -o username=<user>,password=<pass>,vers=3.1.1,posix,mfsymlinks,nomapposix,noperm

POSIX extension wire protocol status

As of 21-06-2018, from JRA's master-smb2 branch. Note that all integers are little-endian.

Negotiate Context

SMB2_POSIX_EXTENSIONS 0x100

Actual length/fields not decided yet; use the context data length field.

Create Context

For client requests

New create context. If a file is opened with this context, the handle gets the POSIX_SEMANTICS flag set.

  • Context tag: SMB2_CREATE_TAG_POSIX "\x93\xAD\x25\x50\x9C\xB4\x11\xE7\xB4\x23\x83\xDE\x96\x8B\xCD\x7C"
  • Context payload size: 4 bytes

The payload is the Unix permission mode to be used for the new file/dir. The bits used are as follows (note that the values are in octal):

#define UNIX_X_OTH    0000001
#define UNIX_W_OTH    0000002
#define UNIX_R_OTH    0000004
#define UNIX_X_GRP    0000010
#define UNIX_W_GRP    0000020
#define UNIX_R_GRP    0000040
#define UNIX_X_USR    0000100
#define UNIX_W_USR    0000200
#define UNIX_R_USR    0000400
#define UNIX_STICKY   0001000
#define UNIX_SET_GID  0002000
#define UNIX_SET_UID  0004000
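
Since the wire bits above have the same octal values as the Linux permission bits, the request payload can be sketched as the low 12 bits of the mode packed as a little-endian u32. This sketch covers only the 4-byte payload, not the surrounding create context header. The function name is hypothetical.

```python
import stat
import struct

# Context tag from the list above, as raw bytes.
SMB2_CREATE_TAG_POSIX = bytes.fromhex("93AD25509CB411E7B42383DE968BCD7C")


def posix_cc_request_payload(mode: int) -> bytes:
    """Pack the 4-byte request payload: permission bits, little-endian.

    stat.S_IMODE() drops the file-type bits and keeps the 07777 bits,
    which match the UNIX_* values defined above.
    """
    return struct.pack("<I", stat.S_IMODE(mode))
```

For example, a regular file mode of 0100644 packs to the bytes a4 01 00 00.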

For server responses

The server can respond to a CREATE request with this POSIX context too (same context tag).

  • Context payload size: 12 + 2*28 = 68 bytes.
u32  SMB_STRUCT_STAT->st_ex_nlink // number of hardlinks
u32  FILE_FLAG_REPARSE            // "reparse_tag"
u32  unix_perms_to_wire(SMB_STRUCT_STAT->st_ex_mode & ~S_IFMT)
sid  sid_owner
sid  sid_group


A sid is encoded as follows:

u8   sid_rev_num
u8   num_auths (range 0-5)
buf  id_auth (6 bytes)
[u32 sub_auth] * num_auths
u8   padding to make it 28 bytes
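
The response layout above can be sketched with Python's struct module. The padding bytes are assumed to be zero (the layout only says "padding"), id_auth is kept as raw bytes, and the function and key names are hypothetical.

```python
import struct

SID_WIRE_SIZE = 28          # fixed size of one encoded sid
POSIX_CC_RSP_SIZE = 68      # 12 + 2 * 28


def encode_sid(rev, id_auth, sub_auths):
    """Encode one sid into its fixed 28-byte form (zero padding assumed)."""
    if len(id_auth) != 6 or len(sub_auths) > 5:
        raise ValueError("bad sid")
    raw = struct.pack("<BB6s", rev, len(sub_auths), id_auth)
    raw += struct.pack("<%dI" % len(sub_auths), *sub_auths)
    return raw.ljust(SID_WIRE_SIZE, b"\x00")


def decode_sid(buf):
    """Decode a 28-byte sid into (rev, id_auth, [sub_auths])."""
    rev, num_auths, id_auth = struct.unpack_from("<BB6s", buf, 0)
    if num_auths > 5:
        raise ValueError("num_auths out of range")
    sub_auths = struct.unpack_from("<%dI" % num_auths, buf, 8)
    return rev, id_auth, list(sub_auths)


def decode_posix_cc_response(buf):
    """Split the 68-byte response payload into its five fields."""
    if len(buf) != POSIX_CC_RSP_SIZE:
        raise ValueError("expected 68 bytes")
    nlink, reparse_tag, perms = struct.unpack_from("<III", buf, 0)
    return {
        "nlink": nlink,
        "reparse_tag": reparse_tag,
        "perms": perms,
        "owner": decode_sid(buf[12:40]),
        "group": decode_sid(buf[40:68]),
    }
```

With five sub-authorities a sid fills all 28 bytes (1 + 1 + 6 + 5 * 4), so padding only appears for shorter sids.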

Info level

New info level requestable via GETINFO or FIND. The payload contains a POSIX Create Context response at the end.

  • Level value: SMB2_FIND_POSIX_INFORMATION 0x64
  • Payload length: 136 bytes.
    • 68 bytes of fixed fields + POSIXCreateContextResponse (68 bytes, see above)
u64 put_long_date_timespec(SMB_STRUCT_STAT->st_ex_btime) // birth
u64 put_long_date_timespec(SMB_STRUCT_STAT->st_ex_atime) // access
u64 put_long_date_timespec(SMB_STRUCT_STAT->st_ex_mtime) // last write
u64 put_long_date_timespec(SMB_STRUCT_STAT->st_ex_ctime) // change
u64 # bytes used on disk
u64 file size
u32 dos attributes
u64 inode
u32 SMB_STRUCT_STAT->st_ex_dev // device ID
u32 zero
POSIXCreateContextResponse (size=68 bytes)
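
A sketch of decoding this 136-byte payload follows. The dictionary keys are labels chosen for this example, not names from the protocol. The time conversion assumes the timestamps are NT times (100 ns units since 1601-01-01), which the layout above does not state explicitly.

```python
import struct

# Ten fixed fields, little-endian, no padding: 68 bytes in total.
POSIX_INFO_FIXED = struct.Struct("<4Q2QIQII")
POSIX_INFO_SIZE = 136

FIELDS = (
    "btime", "atime", "mtime", "ctime",
    "bytes_on_disk", "file_size", "dos_attributes",
    "inode", "device_id", "zero",
)


def decode_posix_info(buf):
    """Decode the fixed fields; the create context response stays raw."""
    if len(buf) < POSIX_INFO_SIZE:
        raise ValueError("expected at least 136 bytes")
    info = dict(zip(FIELDS, POSIX_INFO_FIXED.unpack_from(buf, 0)))
    info["posix_cc_response"] = bytes(buf[68:136])
    return info


def nt_time_to_unix(nt_time):
    """Convert an assumed NT time to seconds since the Unix epoch."""
    return (nt_time - 116444736000000000) / 10_000_000
```

The fixed part is exactly 68 bytes (4 * 8 + 2 * 8 + 4 + 8 + 4 + 4), which together with the 68-byte create context response gives the stated 136.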

For FIND (directory listing) there is some extra data at the start (offset to the next directory entry) and the file name at the end:

u32   next_offset
u32   ignored
POSIXInformation
u32   file_name_byte_count
utf16 file_name (NOT UTF8!)
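
Walking a FIND response buffer can be sketched as follows. The sketch assumes that next_offset is relative to the start of the current entry and that zero marks the last entry, and the builder pads entries to 8 bytes purely for the example. Neither detail is stated above. Both function names are hypothetical.

```python
import struct

POSIX_INFO_SIZE = 136  # POSIXInformation length given above


def build_find_entry(name, info=bytes(POSIX_INFO_SIZE), last=False):
    """Build one directory entry (for experimenting with the parser)."""
    raw_name = name.encode("utf-16-le")
    entry = (struct.pack("<II", 0, 0) + info
             + struct.pack("<I", len(raw_name)) + raw_name)
    entry += bytes(-len(entry) % 8)          # assumed 8-byte alignment
    if not last:
        entry = struct.pack("<I", len(entry)) + entry[4:]
    return entry


def iter_find_entries(buf):
    """Yield (file_name, raw POSIXInformation bytes) for each entry."""
    offset = 0
    while True:
        next_offset, _ignored = struct.unpack_from("<II", buf, offset)
        info_pos = offset + 8
        name_pos = info_pos + POSIX_INFO_SIZE
        (name_len,) = struct.unpack_from("<I", buf, name_pos)
        raw_name = buf[name_pos + 4:name_pos + 4 + name_len]
        yield raw_name.decode("utf-16-le"), buf[info_pos:name_pos]
        if next_offset == 0:
            return
        offset += next_offset
```

Note that file_name_byte_count is a byte count, so a five-character name occupies ten bytes in UTF-16.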

POSIX extensions codepaths in samba

SMB2_OP_QUERY_DIRECTORY:
  smbd_smb2_request_process_query_directory
  smbd_smb2_query_directory_send
  smbd_dirptr_lanman2_entry
  smbd_marshall_dir_entry
    store_smb2_posix_info <--- sends next_offset + info + posix cc rsp + filename (length + utf16)
      smb2_posix_cc_info
SMB2_OP_GETINFO:
  smbd_smb2_getinfo_send
  smbd_do_qfilepathinfo
    store_smb2_posix_info <--- sends info + posix cc rsp
      smb2_posix_cc_info
SMB2_OP_CREATE:
  smbd_smb2_create_send
  smbd_smb2_create_after_exec
    smb2_posix_cc_info <--- sends POSIX create context resp