Writing a Samba VFS Module: Difference between revisions

From SambaWiki
Line 119: Line 119:


Any VFS function not implemented in any VFS module in the stack is handled in vfs_default.c.
Any VFS function not implemented in any VFS module in the stack is handled in vfs_default.c.

= Two Types of File Systems =


= Building Your VFS Module =
= Building Your VFS Module =

Revision as of 20:17, 28 February 2015

Introduction

Since there have been significant changes between Samba 3.x (and earlier) and Samba 4.0 and above, I thought I would start a new document on this topic rather than trying to overload the earlier document with more complex versions specific differences. A lot of content was copied from the previous version of the document with the permission of its author.

The remaining sections deal with:

  1. The Samba VFS Layer contains a discussion of the VFS Layer.
  2. Building your VFS module contains a discussion of how to get your module build.
  3. Etc.

The Samba VFS Layer

The Samba VFS provides a mechanism to allow people to extend the functionality of Samba in useful ways. Some examples are:

  1. Convert NTFS ACLs to NFSv4 ACLs for storing in a file system that supports them. The GPFS VFS module does this and the same could be done for Linux when RichACL support is complete.
  2. Support features that a vendor has implemented in their file system that Linux file systems do not support. # The OneFS VFS module from Isilon interfaces with their in-kernel distributed file system which provides more complete NTFS functionality, including four file times, etc.
  3. Implement features like Alternate Data Streams.
  4. Implement full NT ACL support by storing them in XATTRs and correctly handling the semantics (see source3/modules/vfs_acl_xattr.c and source3/modules/vfs_acl_common.c.)
  5. Support user-space file systems, perhaps accessible via a shared memory interface or via a user-space library (eg, Ceph's libceph.) Modules that do this are vfs_ceph.c and vfs_glusterfs.c
  6. A Samba VFS is a shared library (eg, acl_xattr.so), or module, that implements some or all of the functions that the Samba VFS interface makes available to provides the desired functionality. In addition VFS modules can be stacked (if they have been written for that), and there is a default VFS (source3/modules/vfs_default.c) that provides the default Samba functionality for those functions that are not implemented higher in the stack or that earlier modules also call.

The following diagrams help illustrate some of the concepts in more detail. Samba-vfs-diag-1.gif

The things to note here are:

  1. There are a number of layers to Samba.
  2. Protocol processing code in Samba will usually call one or more VFS Functions.
  3. Your specific Samba configuration can use a number of VFS modules that do not have to overlap. That is, they can each implement different sets of VFS functions (of which, more below). However, they can also be stacked.
  4. There is a default VFS module (which is statically linked into Samba) that provides implementations of all VFS functions and functions as a backstop. That is, it will be called in the event that no other module implements a particular function or will be called last if the functions in your module pass control down the stack.
  5. The default VFS module, vfs_defaults.c (source3/modules/vfs_defaults.c) calls back into Samba, usually via the sys_xxx routines, but sometimes it calls other modules.

If you want to find out what a particular VFS function does in general you should check the code in vfs_defaults.c. If you want to find out what an existing VFS module check its code in source3/modules.

The above figure also illustrates the flow of control through Samba and the VFS modules. The steps are similar to the following:

  1. An SMB request comes into Samba (steps 1 or 11), which results in Samba calling VFS routines. The call is via a macro in the source code that looks like SMB_VFS_XXX, eg, SMB_VFS_STAT to retrieve file metadata.
  2. The VFS layer calls the entry point in the first VFS module in the stack that implements the requested function. In the figure above, Req 1 results in a call to an entry point (step 2) in vfs_mod_1.so while Req 2 results in a call to an entry point (step 12) in vfs_mod_2.so.
  3. If the called function needs the functionality provided by other modules in the stack, it calls VFS_SMB_NEXT_XXX, which in the illustration ends up in the default VFS module, vfs_default.c. That is, the VFS function called in vfs_mod_1.so in step 2 above then results in a call to the NEXT function (step 3) and ends up in vfs_default.c
  4. The entry points in the default VFS module typically call functions in the system layer, eg, sys_stat (step 4).
  5. The system module calls into the kernel via a system call, eg, the stat system call (step 5).
  6. The system call returns to the system module (step 6), which
  7. Returns to the function in vfs_default.c that called the system layer (step 7), which
  8. Returns up the stack to the VFS module (step 8), which
  9. Returns to the main Samba code (step 9), which
  10. Formats and sends an SMB response (step 10).

Also, Req 2 is processed slightly differently. In this case, the entry point in vfs_mod_2.so that is called decides that it can handle everything itself, so it returns to the main Samba code (step 13) which then formats and sends an SMB response (step 14).

It should be noted that the Samba VFS interface now (Samba 4.2) contains some 200 different functions and that a VFS module does not have to implement them all (with an exception noted below.) If a module does not implement a particular VFS function, the required function within vfs_default.c will be called. However, it should be pointed out that if your module implements a particular request in its entirety, then it does not need to invoke functions below it in the stack. Further, functions below it in the stack are not automatically invoked, rather, the module writer must explicitly invoke modules below it in the stack by calling the NEXT module.

These points can be illustrated with code examples from existing VFS modules.

The following disconnect function was taken from vfs_ceph.c (source3/modules/vfs_ceph.c). It must be the last module in the stack for reasons discussed later, and thus does not call SMB_VFS_NEXT_DISCONNECT.

static void cephwrap_disconnect(struct vfs_handle_struct *handle)
{
       if (!cmount) {
               DEBUG(0, ("[CEPH] Error, ceph not mounted\n"));
               return;
       }

       /* Should we unmount/shutdown? Only if the last disconnect? */
       if (--cmount_cnt) {
               DEBUG(10, ("[CEPH] Not shuting down CEPH because still more connections\n"));
               return;
       }

       ceph_shutdown(cmount);

       cmount = NULL;  /* Make it safe */
}

This can be compared with the disconnect function taken from vfs_full_audit.c (source3/modules/vfs_full_audit.c).

static void smb_full_audit_disconnect(vfs_handle_struct *handle)
{
       SMB_VFS_NEXT_DISCONNECT(handle);

       do_log(SMB_VFS_OP_DISCONNECT, True, handle,
              "%s", lp_servicename(talloc_tos(), SNUM(handle->conn)));

       /* The bitmaps will be disconnected when the private
          data is deleted. */
}

There are a couple of things to note here:

  1. cephwrap_disconnect does not call SMB_VFS_NEXT_DISCONNECT while smb_full_audit_disconnect does. This suggests that the vfs_ceph module expects to be the lowest module in the stack while the vfs_full_audit module will play well with other modules below it.
  2. cephwrap_disconnect counts the number of disconnects (and connects, most likely) and only performs its real function if the current call is the last disconnect to the module. This is because the connect and disconnect functions are called once for each share that uses the module and there should be only one handle to the user-space daemon.

NOTE If you use DFS referrals, the Samba DFS Referral code will call the connect and disconnect functions of the VFS layer for the share that the referrals are on, so you should be prepared for this if you have any setup or cleanup actions you need to perform that should only be performed on the first connect and the last disconnect.

The Samba VFS functions can be separated into the following classes:

  1. Disk, or file system operations, like mounting and unmounting functions (actually called connect and disconnect), quota and free space handling routines, a statvfs function, and so forth.
  2. Directory operations, like opendir, readdir, mkdir, etc.
  3. File operations. This is the largest class of VFS functions, and includes functions for opening and closing files, reading and writing files, obtaining metadata information, and all the other operations you can perform on a file.
  4. NT ACL operations, like setting and getting an NT ACL on a file or directory. These functions actually deal in security descriptors, which can contain ACLs.
  5. POSIX ACL operations, for setting POSIX acls on files.
  6. Extended Attribute operations, for setting and retrieving XATTRs on files.
  7. AIO operations, for handling asynchronous operations.
  8. Offline operations, for handling offline operations.
  9. Durable handle operations, for handlinh operations on durable handles.

You tell Samba about any VFS modules you want used for a share in the smb.conf file. You do this with the vfs objects parameter for those shares you want to use VFS modules for.

For example:

[global]

...

[share1]
     path = /some/path
     vfs objects = acl_xattr my_vfs_obj
     ....

In this example we have specified that the share share1 uses two VFS objects in the order they are listed:

  1. A VFS object called acl_xattr. Any VFS functions this object implements will be called first. If they call a NEXT function, that function in the next module in the stack will be called. See below for more details on the NEXT function.
  2. A VFS object called my_vfs_obj. Functions in the my_vfs_obj VFS module will be called if they are not implemented in the acl_xattr module, or if the acl_xattr module explicitly calls the NEXT function and there is one in the my_vfs_obj VFS module.

Any VFS function not implemented in any VFS module in the stack is handled in vfs_default.c.

Two Types of File Systems

Building Your VFS Module

To be continued.