Writing a Samba VFS in Samba 3.6 and earlier

Revision as of 15:00, 14 May 2014 by Rsharpe (Include Files)

Introduction

This document is intended to help people who want to write Samba VFS modules. It is a Wiki-based version of an earlier document written by Richard Sharpe that can be found at Writing a Samba VFS Module.

The rest of this document is organized into a number of sections that:

  1. Provide an outline of the Samba VFS and discuss the interactions between the main Samba code, the VFS layer, VFS modules and the underlying OS.
  2. Discuss two different types of file systems that module writers might want to write a VFS module for.
  3. Provide more detail on actually writing a Samba VFS and some of the functions and macros Samba makes available to help you.
  4. List differences in the VFS for different versions of Samba.
  5. Introduce some existing VFS modules, especially in the context of the two file system types outlined above.
  6. Give details on the steps module writers will have to take to add their code and build their module.
  7. Provide some information on adding additional VFS routines over and above those already provided.

Authors

This document was converted from the original and extended by Richard Sharpe.

Others who have contributed are:

The Samba VFS

The Samba VFS provides a mechanism to allow people to extend the functionality of Samba in useful ways. Some examples are:

  • Convert NTFS ACLs to NFSv4 ACLs for storing in a file system that supports them. The GPFS VFS module does this and the same could be done for Linux when RichACL support is complete.
  • Support features that a vendor has implemented in their file system that Linux file systems do not support. The OneFS VFS module from Isilon interfaces with their in-kernel distributed file system which provides more complete NTFS functionality, including four file times, etc.
  • Implement features like Alternate Data Streams.
  • Implement full NT ACL support by storing them in XATTRs and correctly handling the semantics (see source3/modules/vfs_acl_xattr.c and source3/modules/vfs_acl_common.c.)
  • Support user-space file systems, perhaps accessible via a shared memory interface or via a user-space library (eg, Ceph's libceph.)

A Samba VFS is a shared library (eg, acl_xattr.so), or module, that implements some or all of the functions that the Samba VFS interface makes available in order to provide the desired functionality. In addition, VFS modules can be stacked (if they have been written for that), and there is a default VFS (source3/modules/vfs_default.c) that provides the default Samba functionality for those functions that are not implemented higher in the stack or that earlier modules also call.

NOTE! Samba also makes it possible to use VFS modules statically on those systems that do not support shared libraries. Brief comments about this are included at the end of this document.

The following diagrams help illustrate some of the concepts in more detail.

Samba-vfs-diag-1.gif

The things to note here are:

  1. There are a number of layers to Samba.
  2. Protocol processing code in Samba will usually call one or more VFS Functions.
  3. Your specific Samba configuration can use a number of VFS modules that do not have to overlap. That is, they can each implement different sets of VFS functions (of which, more below). However, they can also be stacked.
  4. There is a default VFS module (which is statically linked into Samba) that provides implementations of all VFS functions and functions as a backstop. That is, it will be called in the event that no other module implements a particular function or will be called last if the functions in your module pass control down the stack.
  5. The default VFS module, vfs_default.c (source3/modules/vfs_default.c), calls back into Samba, usually via the sys_xxx routines, but sometimes it calls other modules.

If you want to find out what a particular VFS function does you should check the code in vfs_default.c.

The above figure also illustrates the flow of control through Samba and the VFS modules. The steps are similar to the following:

  1. An SMB request comes into Samba (steps 1 or 11), which results in Samba calling VFS routines. The call is via a macro in the source code that looks like SMB_VFS_XXX, eg, SMB_VFS_STAT to retrieve file metadata.
  2. The VFS layer calls the entry point in the first VFS module in the stack that implements the requested function. In the figure above, Req 1 results in a call to an entry point (step 2) in vfs_mod_1.so while Req 2 results in a call to an entry point (step 12) in vfs_mod_2.so.
  3. If the called function needs the functionality provided by other modules in the stack, it calls SMB_VFS_NEXT_XXX, which in the illustration ends up in the default VFS module, vfs_default.c. The VFS function called in vfs_mod_1.so in step 2 above then results in a call to the NEXT function (step 3) and ends up in vfs_default.c.
  4. The entry points in the default VFS module typically call functions in the system layer, eg, sys_stat (step 4).
  5. The system module calls into the kernel via a system call, eg, the stat system call (step 5).
  6. The system call returns to the system module (step 6), which
  7. Returns to the function in vfs_default.c that called the system layer (step 7), which
  8. Returns up the stack to the VFS module (step 8), which
  9. Returns to the main Samba code (step 9), which
  10. Formats and sends an SMB response (step 10).

Also, Req 2 is processed slightly differently. In this case, the entry point in vfs_mod_2.so that is called decides that it can handle everything itself, so it returns to the main Samba code (step 13) which then formats and sends an SMB response (step 14).

It should be noted that the Samba VFS interface contains some 120 different functions and that a VFS does not have to implement them all (with an exception noted below.) If a module does not implement a particular VFS function, the required function within vfs_default.c will be called. However, it should be pointed out that if your module implements a particular request in its entirety, then it does not need to invoke functions below it in the stack. Further, functions below it in the stack are not automatically invoked, rather, the module writer must explicitly invoke modules below it in the stack by calling the NEXT module.
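
The dispatch rules described in this paragraph can be modelled in a few lines of plain C. The following is only a toy sketch: every type and function name in it (toy_module, toy_call_stat, toy_next_stat and so on) is hypothetical and none of it is Samba's actual implementation. It shows a module that does not implement an operation being skipped, a module that explicitly passes control down, and a module that answers on its own so that nothing below it runs.

```c
#include <stddef.h>

/* Toy model only: these types are hypothetical, not Samba's. */
struct toy_module;

/* One optional operation; NULL means "not implemented here". */
typedef int (*toy_stat_fn)(const struct toy_module *self, const char *path);

struct toy_module {
    const char *name;
    toy_stat_fn stat_fn;            /* NULL if this module skips stat */
    const struct toy_module *next;  /* module below in the stack */
};

/* Call the first module at or below 'start' that implements stat. */
static int toy_call_stat(const struct toy_module *start, const char *path)
{
    const struct toy_module *m;

    for (m = start; m != NULL; m = m->next) {
        if (m->stat_fn != NULL) {
            return m->stat_fn(m, path);
        }
    }
    return -1; /* not reached when a default module ends the stack */
}

/* Stand-in for a NEXT call: continue below the current module. */
static int toy_next_stat(const struct toy_module *self, const char *path)
{
    return toy_call_stat(self->next, path);
}

/* Default module: implements the operation and never calls NEXT. */
static int default_stat(const struct toy_module *self, const char *path)
{
    (void)self;
    (void)path;
    return 100;
}

/* Pass-through module: explicitly calls NEXT, then adds its own work. */
static int audit_stat(const struct toy_module *self, const char *path)
{
    return toy_next_stat(self, path) + 1;
}

/* Self-contained module: answers by itself; lower modules never run. */
static int complete_stat(const struct toy_module *self, const char *path)
{
    (void)self;
    (void)path;
    return 7;
}

static const struct toy_module toy_default = { "default", default_stat, NULL };
```

Stacking a module whose stat_fn is NULL on top of toy_default makes toy_call_stat fall through to default_stat, while stacking one that uses complete_stat means default_stat is never reached.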

These points can be illustrated with code examples from existing VFS modules.

The following disconnect function was taken from vfs_ceph.c (source3/modules/vfs_ceph.c). It must be the last module in the stack for reasons discussed later, and thus does not call SMB_VFS_NEXT_DISCONNECT.

static void cephwrap_disconnect(struct vfs_handle_struct *handle)
{
       if (!cmount) {
               DEBUG(0, ("[CEPH] Error, ceph not mounted\n"));
               return;
       }

       /* Should we unmount/shutdown? Only if the last disconnect? */
       if (--cmount_cnt) {
               DEBUG(10, ("[CEPH] Not shuting down CEPH because still more connections\n"));
               return;
       }

       ceph_shutdown(cmount);

       cmount = NULL;  /* Make it safe */
}

This can be compared with the disconnect function taken from vfs_full_audit.c (source3/modules/vfs_full_audit.c).

static void smb_full_audit_disconnect(vfs_handle_struct *handle)
{
       SMB_VFS_NEXT_DISCONNECT(handle);

       do_log(SMB_VFS_OP_DISCONNECT, True, handle,
              "%s", lp_servicename(talloc_tos(), SNUM(handle->conn)));

       /* The bitmaps will be disconnected when the private
          data is deleted. */
}

There are a couple of things to note here:

  1. cephwrap_disconnect does not call SMB_VFS_NEXT_DISCONNECT while smb_full_audit_disconnect does. This suggests that the vfs_ceph module expects to be the lowest module in the stack while the vfs_full_audit module will play well with other modules below it.
  2. cephwrap_disconnect counts the number of disconnects (and connects, most likely) and only performs its real function if the current call is the last disconnect to the module. This is because the connect and disconnect functions are called once for each share that uses the module.

The Samba VFS functions can be separated into the following classes:

  1. Disk, or file system operations, like mounting and unmounting functions (actually called connect and disconnect), quota and free space handling routines, a statvfs function, and so forth.
  2. Directory operations, like opendir, readdir, mkdir, etc.
  3. File operations. This is the largest class of VFS functions, and includes functions for opening and closing files, reading and writing files, obtaining metadata information, and all the other operations you can perform on a file.
  4. NT ACL operations, like setting and getting an NT ACL on a file or directory. These functions actually deal in security descriptors, which can contain ACLs.
  5. POSIX ACL operations, for setting POSIX ACLs on files.
  6. Extended Attribute operations, for setting and retrieving XATTRs on files.
  7. AIO operations, for handling asynchronous operations.
  8. Offline operations, for handling offline operations.

You tell Samba about any VFS modules you want used for a share in the smb.conf file. You do this with the vfs objects parameter for those shares you want to use VFS modules for.

For example:

[global]

...

[share1]
     path = /some/path
     vfs objects = acl_xattr my_vfs_obj
     ....

In this example we have specified that the share share1 uses two VFS objects in the order they are listed:

  1. A VFS object called acl_xattr. Any VFS functions this object implements will be called first. If they call a NEXT function, that function in the next module in the stack will be called. See below for more details on the NEXT function.
  2. A VFS object called my_vfs_obj. Functions in the my_vfs_obj VFS module will be called if they are not implemented in the acl_xattr module, or if the acl_xattr module explicitly calls the NEXT function and there is one in the my_vfs_obj VFS module.

Any VFS function not implemented in any VFS module in the stack is handled in vfs_default.c.

Two Types of File Systems

From the point of view of a Samba VFS writer there are two types of file systems:

  1. A file system that is accessed via system calls and for which the system provides file descriptors, and
  2. A file system that is accessed from user space, typically via a user-space library. An example is Ceph when accessed via libceph. It should be noted that a FUSE file system is not a user-space file system from the point of view of Samba, because the kernel understands all the FDs relating to a FUSE file system.

The reason for distinguishing between these two types of file system is the following. Many Samba VFS routines deal with file descriptors (FDs). Any VFS for a user-space file system provides file descriptors that the kernel does not understand (it possibly supplies an index into a table of objects that are managed by the VFS.) For that reason, a VFS module for a user-space file system must implement all VFS routines and cannot forward any requests to the default VFS module, because the default VFS module will eventually result in calling a system call with a file descriptor that the kernel knows nothing about, or knows about but it is not the intended file descriptor, and you could end up closing some random file with unintended results.
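
To make the danger concrete, here is a toy sketch of the kind of descriptor table a user-space file system module might keep. All names (toy_open_file, toy_table, toy_open, toy_close) are hypothetical and the table is deliberately tiny. The values it hands out look like file descriptors but are only indices into the module's own table.

```c
#define TOY_MAX_OPEN 4

/* Hypothetical per-open state kept by a user-space file system module. */
struct toy_open_file {
    int in_use;
    long offset;
};

static struct toy_open_file toy_table[TOY_MAX_OPEN];

/* Return a table index as the "fd". The kernel has no object behind it,
 * or worse, has an unrelated descriptor with the same number. */
static int toy_open(void)
{
    int i;

    for (i = 0; i < TOY_MAX_OPEN; i++) {
        if (!toy_table[i].in_use) {
            toy_table[i].in_use = 1;
            toy_table[i].offset = 0;
            return i;
        }
    }
    return -1; /* table full */
}

/* Must be handled by the module itself: passing 'fd' to close(2) would
 * act on whatever real descriptor happens to share that number. */
static int toy_close(int fd)
{
    if (fd < 0 || fd >= TOY_MAX_OPEN || !toy_table[fd].in_use) {
        return -1;
    }
    toy_table[fd].in_use = 0;
    return 0;
}
```

In this sketch the first handle returned is 0, which in a real process is normally standard input. That is exactly the kind of collision that makes forwarding such a value to the default module unsafe.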

Samba-vfs-diag-1.gif

The figure above was discussed earlier.

Samba-vfs-diag-2.gif

The above figure illustrates a VFS module for accessing a file system in user space. Such a file system might be accessed via NFS requests directly to an NFS server (on the same computer, or a different computer) or via a shared memory segment, etc. The essential point is that such a module must implement all VFS functions and not let any fall through to vfs_default.c.

Writing a VFS Module

Before writing your own Samba VFS module have a look at the existing modules to see if any combination (stack) of existing modules supplies the functionality you need, or if any existing module supplies some of the functionality you need. For example, if you are thinking of storing Security Descriptors (AKA NT ACLs) in XATTR-like objects in your file system, there is already a module for doing that called acl_xattr. As long as you provide it with a way to store XATTRs, and do a few other things, it should work, and it already does all the hard work for you. The source code for all the VFS modules is in source3/modules.

When you write a VFS module you supply three things in one or more files:

  1. A module initialization routine that tells Samba what VFS routines are handled by this module. This routine is called something like vfs_my_module_init, and its signature is specified below.
  2. A VFS function pointers structure (struct vfs_fn_pointers) holding pointers to the VFS routines implemented in this module. By using C99 designated initializers, you only initialize this structure with pointers to the functions you actually implement.
  3. The actual VFS functions you implement along with any supporting functions, etc. However, first you have to give it a name and place the code in a file. If you are building your module within the Samba source tree it will need to be placed in the directory source3/modules, and the main file (the one that contains your module's initialization routine as mentioned below) must be called vfs_<module_name>.c

For example, vfs_my_module.c. The remainder of this document will use this name in the examples.

The rest of this section deals with:

  1. The Life Cycle of a VFS Module
  2. VFS Module Initialization
  3. VFS Function Pointer Structure
  4. Include Files
  5. VFS Functions
  6. Memory Management and talloc
  7. Providing Context between Calls
  8. Module Specific Parameters
  9. Extending the Samba files_struct structure
  10. AIO Handling in a VFS Module
  11. Conforming to the VFS Interfaces
  12. Be prepared for "stat" opens
  13. Things to watch out for

The Life Cycle of a VFS Module

When a client issues a TREE_CONNECT request (either because of a NET USE command or mapping a network drive) Samba calls SMB_VFS_CONNECT, which results in the connect_fn in your VFS module (if defined) being called.

The connect_fn has the following signature (the name of the function can be anything you like):

static int my_module_connect(vfs_handle_struct *handle, 
                             const char *service, 
                             const char *user)

This call gives you the opportunity to create and save context information for calls to other functions. If your module is not designed to be the last in the stack then your connect_fn should give other modules a chance to capture connection information as well, using:

      int ret = SMB_VFS_NEXT_CONNECT(handle, service, user); 

Of course, you should check the return code and clean up if an error occurs in a lower module.
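
The following is a minimal sketch of that pattern. So that it compiles outside the Samba tree, it begins with stand-in definitions of vfs_handle_struct and SMB_VFS_NEXT_CONNECT. These stand-ins are hypothetical and far simpler than the real definitions, which a real module gets from Samba's headers. The context structure my_module_ctx and the use of calloc are also assumptions made for illustration; a real module would normally allocate with talloc.

```c
#include <stdlib.h>

/* ---- Stand-ins so the sketch compiles outside the Samba tree. ----
 * In a real module these come from Samba's headers; the definitions
 * below are hypothetical and much simpler than the real ones. */
typedef struct vfs_handle_struct {
    void *data;          /* per-connection module context */
    int lower_result;    /* what the pretend lower module returns */
} vfs_handle_struct;

#define SMB_VFS_NEXT_CONNECT(handle, service, user) \
    ((void)(service), (void)(user), (handle)->lower_result)
/* ---- End of stand-ins. ---- */

/* Hypothetical per-connection context for this module. */
struct my_module_ctx {
    int requests_seen;
};

static int my_module_connect(vfs_handle_struct *handle,
                             const char *service,
                             const char *user)
{
    struct my_module_ctx *ctx;
    int ret;

    ctx = calloc(1, sizeof(*ctx));
    if (ctx == NULL) {
        return -1;
    }

    /* Give modules lower in the stack their chance to connect. */
    ret = SMB_VFS_NEXT_CONNECT(handle, service, user);
    if (ret < 0) {
        free(ctx);       /* undo our own setup when a lower module fails */
        return ret;
    }

    handle->data = ctx;  /* keep the context for later calls */
    return 0;
}
```

The point of the sketch is the ordering: anything the module set up before calling NEXT has to be released again if the lower module reports an error, and the error is passed back up unchanged.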

When the client disconnects from the share that your VFS module is connected to, Samba will call your disconnect function:

static void my_module_disconnect(vfs_handle_struct *handle) 
{ 
/* Perform whatever actions are needed here */
}

In general you do not need to clean up memory allocated with talloc in your connection module if that memory was allocated using the connection structure (handle->conn) as a context, as it will all be cleaned up when the connection structure is freed with TALLOC_FREE.

Of course, if your module has no need to capture connection and disconnection events, you do not need to define these routines.

Between these two calls, Samba will call the functions you have defined as necessary passing them the same vfs_handle_struct on each call, along with other parameters as required by each function.

Please note, if your VFS module should only perform connect-time actions on the first connection, and disconnect-time actions on the last disconnect, you should carefully manage that in your module. If you configure multiple shares to use your VFS module you should be aware that your connect and disconnect functions will be called once for each such share that the user connects to.

Similarly, since Samba uses a fork model, where each client gets a separate smbd, your connect function will be called in each smbd for each share that uses the module that clients connect to.
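
One way to manage first-connect and last-disconnect actions is a simple counter, sketched below in plain C. The names (backend_mounted, share_count, shutdown_calls, toy_connect, toy_disconnect) are hypothetical and the "backend" is just a flag. Because each smbd is a separate process, a counter like this is per client process, not global to the server.

```c
/* Toy model: hypothetical names standing in for a shared backend. */
static int backend_mounted;   /* 1 while the shared backend is up */
static int share_count;       /* shares currently using the backend */
static int shutdown_calls;    /* how often the backend was torn down */

/* Called once per share that uses the module. */
static int toy_connect(void)
{
    if (share_count == 0) {
        backend_mounted = 1;  /* first share: bring the backend up */
    }
    share_count++;
    return 0;
}

/* Called once per share when the client disconnects from it. */
static void toy_disconnect(void)
{
    if (share_count == 0) {
        return;               /* nothing connected; ignore */
    }
    if (--share_count > 0) {
        return;               /* other shares still need the backend */
    }
    backend_mounted = 0;      /* last share gone: tear down once */
    shutdown_calls++;
}
```

With two shares connected, the first toy_disconnect leaves the backend up and only the second one tears it down.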

VFS Module Initialization

Your module must contain an entry point called vfs_my_module_init, which the build system will actually convert to samba_init_module or init_samba_module if you are building your module as a shared library. (Actually, the name depends on the version as explained below, and if your module is static, you must use one of samba_init_module or init_samba_module. However, see the note below for a solution to this complexity.) The initialization routine has one simple task to perform: register itself along with the set of functions it implements. The following is an example:

NTSTATUS vfs_my_module_init(void) 
{ 
        return smb_register_vfs(SMB_VFS_INTERFACE_VERSION, "my_module", 
                                &vfs_my_module_fns); 
}

The things to note are:

  1. As mentioned above, this function must be called vfs_<module_name>_init. It returns an NTSTATUS and does not take any parameters.
  2. It returns the result of calling smb_register_vfs with three arguments as shown.
  3. You can name the variable that contains the functions you implement anything you want; however, the practice has been to name it as shown.
  4. If registration fails, none of the routines in your module will be called, but there are likely to be bigger problems in that case.

This code can be cut from an existing module and pasted into yours with the appropriate changes made.

NOTE! If your module has undefined symbols, then Samba will not even call your module's init function, and attempts to connect to the share will fail.

NOTE Also! If you are building your module outside the Samba source tree (and not changing configure.in, as described below) you can call this function samba_init_module in the master branch or init_samba_module in earlier versions (3.5.x and 3.6.x). However, there is now a way of avoiding these naming problems for modules that are built outside the Samba source tree. See bug #8822 at http://bugzilla.samba.org.

VFS Function Pointer Structure

Your module must declare and initialize a struct vfs_fn_pointers structure. The following is an example.

static struct vfs_fn_pointers vfs_my_module_fns = { 
       .getxattr = my_module_getxattr, 
       .fgetxattr = my_module_fgetxattr, 
       .setxattr = my_module_setxattr, 
       .fsetxattr = my_module_fsetxattr, 
       .listxattr = my_module_listxattr, 
};

The variable must be declared static so that it does not cause conflicts with any symbol exported by Samba or any other module. In addition, you only need to initialize pointers to those VFS functions you are implementing (using the C99 designated initializer syntax). You would generally declare this variable before you declare the init function discussed above.
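
The reason partial initialization is safe is a property of the C language: members that are not named in a designated initializer are implicitly set to zero, which for pointers means NULL. The following self-contained sketch demonstrates this with a hypothetical three-entry table (toy_fn_pointers and the toy_ functions are illustrative names, not part of Samba).

```c
#include <stddef.h>

/* Hypothetical miniature of a function-pointer table. */
struct toy_fn_pointers {
    int (*open_fn)(const char *path);
    int (*close_fn)(int fd);
    int (*stat_fn)(const char *path);
};

static int toy_stat(const char *path)
{
    return path != NULL ? 0 : -1;
}

/* Only stat_fn is named; open_fn and close_fn are left as NULL. */
static struct toy_fn_pointers toy_fns = {
    .stat_fn = toy_stat,
};

/* Count how many operations a table actually implements. */
static int toy_count_implemented(const struct toy_fn_pointers *fns)
{
    return (fns->open_fn != NULL) + (fns->close_fn != NULL) +
           (fns->stat_fn != NULL);
}
```

A caller can therefore tell implemented operations from missing ones simply by testing each pointer against NULL, as toy_count_implemented does.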

Include Files

Your module will need to include some header files. You will need includes.h, but you might also need to include a few more:

  • system/filesys.h if you need access to many of the file system calls, like fcntl, etc. See lib/replace/system/filesys.h to determine what system include files this file pulls in.
  • smbd/smbd.h if you need access to definitions for NT ACLs etc.

These should all be included before your code.

VFS Functions

Memory Management and talloc

Providing Context between Calls

Module Specific Parameters

Extending the Samba files_struct structure

AIO Handling in a VFS Module

Conforming to the VFS Interfaces

Be prepared for "stat" opens

Things to watch out for

TBD

Samba Version VFS Differences

TBD

Some Existing VFS Modules

TBD

Building, Installing and Debugging your VFS Module

TBD

Adding New VFS Routines