Writing a Samba VFS Module: Difference between revisions

From SambaWiki

Revision as of 07:27, 16 December 2016

Introduction

Since there have been significant changes between Samba 3.x (and earlier) and Samba 4.0 and above, I thought I would start a new document on this topic rather than trying to overload the earlier document with more complex version-specific differences. A lot of content was copied from the previous version of the document with the permission of its author.

The remaining sections deal with:

  1. The Samba VFS Layer contains a discussion of the VFS Layer.
  2. Two Types of File Systems contains a discussion of the two types of file systems that you might interact with.
  3. Writing a VFS Module
  4. Building your VFS module contains a discussion of how to get your module built.
  5. Etc.

If you are looking for a quick example you can find a complete VFS Module in the section on Building your VFS Module.

The Samba VFS Layer

The Samba VFS provides a mechanism to allow people to extend the functionality of Samba in useful ways. Some examples are:

  1. Convert NTFS ACLs to NFSv4 ACLs for storing in a file system that supports them. The GPFS VFS module does this and the same could be done for Linux when RichACL support is complete.
  2. Support features that a vendor has implemented in their file system that Linux file systems do not support.
  3. Implement features like Alternate Data Streams.
  4. Implement full NT ACL support by storing them in XATTRs and correctly handling the semantics (see source3/modules/vfs_acl_xattr.c and source3/modules/vfs_acl_common.c.)
  5. Support user-space file systems, perhaps accessible via a shared memory interface or via a user-space library (eg, Ceph's libceph.) Modules that do this are vfs_ceph.c and vfs_glusterfs.c

A Samba VFS is a shared library (eg, acl_xattr.so), or module, that implements some or all of the functions that the Samba VFS interface makes available in order to provide the desired functionality. In addition, VFS modules can be stacked (if they have been written for that), and there is a default VFS (source3/modules/vfs_default.c) that provides the default Samba functionality for those functions that are not implemented higher in the stack or that earlier modules also call.

The following diagrams help illustrate some of the concepts in more detail. Samba-vfs-diag-1.gif

The things to note here are:

  1. There are a number of layers to Samba.
  2. Protocol processing code in Samba will usually call one or more VFS Functions.
  3. Your specific Samba configuration can use a number of VFS modules that do not have to overlap. That is, they can each implement different sets of VFS functions (of which, more below). However, they can also be stacked.
  4. There is a default VFS module (which is statically linked into Samba) that provides implementations of all VFS functions and acts as a backstop. That is, it will be called in the event that no other module implements a particular function or will be called last if the functions in your module pass control down the stack.
  5. The default VFS module, vfs_default.c (source3/modules/vfs_default.c) calls back into Samba, usually via the sys_xxx routines, but sometimes it calls other modules.

If you want to find out what a particular VFS function does in general you should check the code in vfs_default.c. If you want to find out what an existing VFS module does, check its code in source3/modules.

The above figure also illustrates the flow of control through Samba and the VFS modules. The steps are similar to the following:

  1. An SMB request comes into Samba (steps 1 or 11), which results in Samba calling VFS routines. The call is via a macro in the source code that looks like SMB_VFS_XXX, eg, SMB_VFS_STAT to retrieve file metadata.
  2. The VFS layer calls the entry point in the first VFS module in the stack that implements the requested function. In the figure above, Req 1 results in a call to an entry point (step 2) in vfs_mod_1.so while Req 2 results in a call to an entry point (step 12) in vfs_mod_2.so.
  3. If the called function needs the functionality provided by other modules in the stack, it calls SMB_VFS_NEXT_XXX, which in the illustration ends up in the default VFS module, vfs_default.c. That is, the VFS function called in vfs_mod_1.so in step 2 above then results in a call to the NEXT function (step 3) and ends up in vfs_default.c.
  4. The entry points in the default VFS module typically call functions in the system layer, eg, sys_stat (step 4).
  5. The system module calls into the kernel via a system call, eg, the stat system call (step 5).
  6. The system call returns to the system module (step 6), which
  7. Returns to the function in vfs_default.c that called the system layer (step 7), which
  8. Returns up the stack to the VFS module (step 8), which
  9. Returns to the main Samba code (step 9), which
  10. Formats and sends an SMB response (step 10).

Also, Req 2 is processed slightly differently. In this case, the entry point in vfs_mod_2.so that is called decides that it can handle everything itself, so it returns to the main Samba code (step 13) which then formats and sends an SMB response (step 14).

It should be noted that the Samba VFS interface now (Samba 4.2) contains some 200 different functions and that a VFS module does not have to implement them all (with an exception noted below.) If a module does not implement a particular VFS function, the required function within vfs_default.c will be called. However, it should be pointed out that if your module implements a particular request in its entirety, then it does not need to invoke functions below it in the stack. Further, functions below it in the stack are not automatically invoked, rather, the module writer must explicitly invoke modules below it in the stack by calling the NEXT module.

These points can be illustrated with code examples from existing VFS modules.

The following disconnect function was taken from vfs_ceph.c (source3/modules/vfs_ceph.c). The vfs_ceph module must be the last module in the stack for reasons discussed later, and thus its disconnect function does not call SMB_VFS_NEXT_DISCONNECT.

static void cephwrap_disconnect(struct vfs_handle_struct *handle)
{
       if (!cmount) {
               DEBUG(0, ("[CEPH] Error, ceph not mounted\n"));
               return;
       }

       /* Should we unmount/shutdown? Only if the last disconnect? */
       if (--cmount_cnt) {
               DEBUG(10, ("[CEPH] Not shuting down CEPH because still more connections\n"));
               return;
       }

       ceph_shutdown(cmount);

       cmount = NULL;  /* Make it safe */
}

This can be compared with the disconnect function taken from vfs_full_audit.c (source3/modules/vfs_full_audit.c).

static void smb_full_audit_disconnect(vfs_handle_struct *handle)
{
       SMB_VFS_NEXT_DISCONNECT(handle);

       do_log(SMB_VFS_OP_DISCONNECT, True, handle,
              "%s", lp_servicename(talloc_tos(), SNUM(handle->conn)));

       /* The bitmaps will be disconnected when the private
          data is deleted. */
}

There are a couple of things to note here:

  1. cephwrap_disconnect does not call SMB_VFS_NEXT_DISCONNECT while smb_full_audit_disconnect does. This suggests that the vfs_ceph module expects to be the lowest module in the stack while the vfs_full_audit module will play well with other modules below it.
  2. cephwrap_disconnect counts the number of disconnects (and connects, most likely) and only performs its real function if the current call is the last disconnect to the module. This is because the connect and disconnect functions are called once for each share that uses the module and there should be only one handle to the user-space daemon.

NOTE If you use DFS referrals, the Samba DFS Referral code will call the connect and disconnect functions of the VFS layer for the share that the referrals are on, so you should be prepared for this if you have any setup or cleanup actions you need to perform that should only be performed on the first connect and the last disconnect.

The Samba VFS functions can be separated into the following classes:

  1. Disk, or file system operations, like mounting and unmounting functions (actually called connect and disconnect), quota and free space handling routines, a statvfs function, and so forth.
  2. Directory operations, like opendir, readdir, mkdir, etc.
  3. File operations. This is the largest class of VFS functions, and includes functions for opening and closing files, reading and writing files, obtaining metadata information, and all the other operations you can perform on a file.
  4. NT ACL operations, like setting and getting an NT ACL on a file or directory. These functions actually deal in security descriptors, which can contain ACLs.
  5. POSIX ACL operations, for setting POSIX acls on files.
  6. Extended Attribute operations, for setting and retrieving XATTRs on files.
  7. AIO operations, for handling asynchronous operations.
  8. Offline operations, for handling offline operations.
  9. Durable handle operations, for handling operations on durable handles.

You tell Samba about any VFS modules you want used for a share in the smb.conf file. You do this with the vfs objects parameter for those shares you want to use VFS modules for.

For example:

[global]

...

[share1]
     path = /some/path
     vfs objects = acl_xattr my_vfs_obj
     ....

In this example we have specified that the share share1 uses two VFS objects in the order they are listed:

  1. A VFS object called acl_xattr. Any VFS functions this object implements will be called first. If they call a NEXT function, that function in the next module in the stack will be called. See below for more details on the NEXT function.
  2. A VFS object called my_vfs_obj. Functions in the my_vfs_obj VFS module will be called if they are not implemented in the acl_xattr module, or if the acl_xattr module explicitly calls the NEXT function and there is one in the my_vfs_obj VFS module.

Any VFS function not implemented in any VFS module in the stack is handled in vfs_default.c.
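The ordering rules above can be modelled in a few lines of plain C. This is only a sketch of the idea and not Samba code: struct toy_module, toy_call_stat and toy_next_stat are names invented for this illustration, and the real dispatch lives behind the SMB_VFS_XXX and SMB_VFS_NEXT_XXX macros. The model shows a module that implements stat and passes control down, a module that does not implement stat at all, and a default module acting as the backstop.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model only: these types and names are NOT Samba's. */
struct toy_module;
typedef int (*toy_stat_fn)(const struct toy_module *self, const char *path);

struct toy_module {
    const char *name;
    toy_stat_fn stat_fn;            /* NULL: module does not implement stat */
    const struct toy_module *next;  /* module below this one in the stack */
};

static char toy_trace[128];         /* records which modules ran, in order */

static void toy_note(const char *name)
{
    strncat(toy_trace, name, sizeof(toy_trace) - strlen(toy_trace) - 1);
    strncat(toy_trace, ";", sizeof(toy_trace) - strlen(toy_trace) - 1);
}

/* Start at 'first' and call the first module that implements stat. */
static int toy_call_stat(const struct toy_module *first, const char *path)
{
    const struct toy_module *m = first;

    while (m != NULL && m->stat_fn == NULL) {
        m = m->next;
    }
    return (m != NULL) ? m->stat_fn(m, path) : -1;
}

/* Model of a NEXT call: continue the search below 'self'. */
static int toy_next_stat(const struct toy_module *self, const char *path)
{
    assert(self != NULL);
    return toy_call_stat(self->next, path);
}

/* Backstop: implements the function and never calls NEXT. */
static int default_stat(const struct toy_module *self, const char *path)
{
    (void)path;
    toy_note(self->name);
    return 0;
}

/* A stackable module: does its own work, then passes control down. */
static int audit_stat(const struct toy_module *self, const char *path)
{
    toy_note(self->name);
    return toy_next_stat(self, path);
}

/* Models "vfs objects = audit passthrough" on top of the default module. */
static const char *toy_stack_demo(void)
{
    static const struct toy_module def = { "default", default_stat, NULL };
    static const struct toy_module passthrough = { "passthrough", NULL, &def };
    static const struct toy_module audit = { "audit", audit_stat, &passthrough };

    toy_trace[0] = '\0';
    if (toy_call_stat(&audit, "some/file") != 0) {
        return NULL;
    }
    return toy_trace;
}
```

Calling toy_stack_demo() from a main of your own yields the trace "audit;default;": the passthrough module is skipped because it has no stat entry, and the default module only runs because audit_stat explicitly asked for the next one.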

Two Types of File Systems

From the point of view of a Samba VFS writer there are two types of file systems:

  1. A file system that is accessed via system calls and for which the system provides file descriptors, and
  2. A file system that is accessed from user space, typically via a user-space library. An example is Ceph when accessed via libceph. It should be noted that a FUSE file system is not a user-space file system from the point of view of Samba, because the kernel understands all the FDs relating to a FUSE file system.

The reason for distinguishing between these two types of file system is the following. Many Samba VFS routines deal with file descriptors (FDs). Any VFS for a user-space file system provides file descriptors that the kernel does not understand (it possibly supplies an index into a table of objects that are managed by the VFS.) For that reason, a VFS module for a user-space file system must implement all VFS routines and cannot forward any requests to the default VFS module, because the default VFS module will eventually result in calling a system call with a file descriptor that the kernel knows nothing about, or knows about but it is not the intended file descriptor, and you could end up closing some random file with unintended results.

(Actually, the above claim is not strictly true. For example, you do not really need to implement the CREATEFILE method when you are dealing with a user-space file system because Samba calls the other functions to implement CREATEFILE and unless you are doing something really crazy you probably should let Samba provide all that functionality.)
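To make the file descriptor problem concrete, here is a minimal sketch of the kind of table a user-space module might keep. Everything in it is hypothetical (struct toy_file, toy_open, toy_lookup, toy_close and the object_id field are not part of Samba or of any real library). The point is that the "fd" handed back is only an index into the module's own table, so it means nothing to the kernel and must never reach a system call.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define TOY_MAX_OPEN 8

/* Hypothetical per-file state for a user-space file system. */
struct toy_file {
    int in_use;
    long object_id;     /* however the user-space library names a file */
};

static struct toy_file toy_table[TOY_MAX_OPEN];

/* Returns a module-private "fd": an index into toy_table, not a kernel fd. */
static int toy_open(long object_id)
{
    for (int i = 0; i < TOY_MAX_OPEN; i++) {
        if (!toy_table[i].in_use) {
            toy_table[i].in_use = 1;
            toy_table[i].object_id = object_id;
            return i;
        }
    }
    errno = EMFILE;
    return -1;
}

/* Every routine that receives the "fd" must resolve it through the table. */
static struct toy_file *toy_lookup(int fd)
{
    if (fd < 0 || fd >= TOY_MAX_OPEN || !toy_table[fd].in_use) {
        errno = EBADF;
        return NULL;
    }
    return &toy_table[fd];
}

static int toy_close(int fd)
{
    struct toy_file *f = toy_lookup(fd);

    if (f == NULL) {
        return -1;
    }
    f->in_use = 0;
    return 0;
}

/* Usage sketch: returns 0 when the table behaves as described. */
static int toy_fd_demo(void)
{
    int fd = toy_open(4242L);

    assert(fd >= 0);
    if (toy_lookup(fd)->object_id != 4242L || toy_close(fd) != 0) {
        return -1;
    }
    /* A stale "fd" is rejected here instead of reaching the kernel. */
    return (toy_lookup(fd) == NULL && errno == EBADF) ? 0 : -1;
}
```

Note that the first index handed out is 0, which in a real process is the kernel's standard input. That is exactly the "knows about but it is not the intended file descriptor" case described above.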

Samba-vfs-diag-1.gif

The above figure has already been discussed further above.

Samba-vfs-diag-2.gif

The above figure illustrates a VFS module for accessing a file system in user space. Such a file system might be accessed via NFS requests directly to an NFS server (on the same computer, or a different computer) or via a shared memory segment, etc. The essential point is that such a module must implement all VFS functions and not let any fall through to vfs_default.c.

The salient points are:

  1. The final VFS module must implement all VFS routines (not strictly correct because some, like CreateFile, can go into Samba and they will call other VFS modules for accessing files etc.)
  2. The module vfs_my_module.so forwards requests to the user-space file system in step 4. The response returns in step 5.

Writing a VFS Module

Before writing your own Samba VFS module have a look at the existing modules to see if any combination (stack) of existing modules supplies the functionality you need, or if any existing module supplies some of the functionality you need. For example, if you are thinking of storing Security Descriptors (AKA NT ACLS) in XATTR-like objects in your file system, there is already a module for doing that called acl_xattr. As long as you provide it with a way to store XATTRs, and do a few other things, it should work and already does all the hard work for you. The source code for all the VFS modules is in source3/modules.

When you write a VFS module you supply three things in one or more files:

  1. A module initialization routine that tells Samba what VFS routines are handled by this module. This routine is called something like vfs_my_module_init, and its signature is specified below.
  2. A VFS function pointers structure (struct vfs_fn_pointers) holding pointers to the VFS routines implemented in this module. By using C99 designated initializers, you only initialize this structure with pointers to the functions you actually implement.
  3. The actual VFS functions you implement along with any supporting functions, etc. However, first you have to give it a name and place the code in a file. If you are building your module within the Samba source tree it will need to be placed in the directory source3/modules, and the main file (the one that contains your module's initialization routine as mentioned below) should be called vfs_<module_name>.c. For example, vfs_my_module.c.

The rest of this section deals with:

  1. The Life Cycle of a VFS Module
  2. VFS Module Initialization
  3. VFS Function Pointer Structure
  4. Include Files
  5. VFS Functions
  6. Memory Management and talloc
  7. Providing Context between Calls
  8. Module Specific Parameters
  9. Extending the Samba files_struct structure
  10. AIO Handling in a VFS Module
  11. Conforming to the VFS Interfaces
  12. Be prepared for "stat" opens
  13. Things to watch out for

The Life Cycle of a VFS Module

When a client issues a TREE_CONNECT request (either because of a NET USE command or mapping a network drive) Samba calls SMB_VFS_CONNECT, which results in the connect_fn in your VFS module (if defined) being called.

The connect_fn has the following signature (the name of the function can be anything you like):

static int my_module_connect(vfs_handle_struct *handle, 
                             const char *service, 
                             const char *user)

This call gives you the opportunity to create and save context information for calls to other functions. If your module is not designed to be the last in the stack then your connect_fn should give other modules a chance to capture connection information as well, using:

      int ret = SMB_VFS_NEXT_CONNECT(handle, service, user); 

Of course, you should check the return code and clean up if an error occurs in a lower module.

When the client disconnects from the share that your VFS module is connected to, Samba will call your disconnect function:

static void my_module_disconnect(vfs_handle_struct *handle) 
{ 
/* Perform whatever actions are needed here */
}

In general you do not need to clean up memory allocated with talloc in your connection module if that memory was allocated using the connection structure (handle->conn) as a context, as it will all be cleaned up when the connection structure is freed with TALLOC_FREE.

Of course, if your module has no need to capture connection and disconnection events, you do not need to define these routines.

Between these two calls, Samba will call the functions you have defined as necessary passing them the same vfs_handle_struct on each call, along with other parameters as required by each function.

Please note, if your VFS module should only perform connect-time actions on the first connection, and disconnect-time actions on the last disconnect, you should carefully manage that in your module. If you configure multiple shares to use your VFS module you should be aware that your connect and disconnect functions will be called once for each such share that the user connects to.

Similarly, since Samba uses a fork model, where each client gets a separate smbd, your connect function will be called in each smbd for each share that uses the module that clients connect to.
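One way to manage first-connect and last-disconnect actions is a per-process counter, much like the cmount_cnt variable in the vfs_ceph example earlier. The following is a stand-alone sketch of that idea under the assumption that one backend session per smbd process is what you want. The names (toy_connect, toy_disconnect, backend_mounted and so on) are invented for the illustration and the "backend" is just a flag.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for an expensive backend session; not a real library. */
static int backend_mounted;     /* 1 while the (pretend) backend is up */
static int backend_mounts;      /* how many times it was really mounted */
static int share_count;         /* shares currently connected in this process */

static int toy_connect(void)
{
    if (share_count == 0) {
        /* First connect in this process: do the one-time setup. */
        backend_mounted = 1;
        backend_mounts++;
    }
    share_count++;
    return 0;
}

static void toy_disconnect(void)
{
    if (share_count == 0) {
        return;                 /* unbalanced disconnect: nothing to do */
    }
    if (--share_count > 0) {
        return;                 /* other shares still use the backend */
    }
    /* Last disconnect: tear the backend down. */
    backend_mounted = 0;
}

/* Two shares connect and disconnect; returns 0 if the backend was
 * mounted once and only shut down after the second disconnect. */
static int toy_lifecycle_demo(void)
{
    int before = backend_mounts;

    toy_connect();              /* share A */
    toy_connect();              /* share B, same process */
    assert(backend_mounts == before + 1);

    toy_disconnect();           /* share A goes away */
    if (!backend_mounted) {
        return -1;              /* must still be up for share B */
    }
    toy_disconnect();           /* share B goes away */
    return backend_mounted ? -1 : 0;
}
```

Because the counter is an ordinary static variable, it is private to each smbd process, which matches the fork model described above.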

VFS Module Initialization

Your module must contain an entry point called vfs_my_module_init, which the build system will actually convert to samba_init_module if you are building your module as a shared library.

The initialization routine has one simple task to perform: Register the module along with the set of functions it implements. Samba calls this function upon the first use of the module. The following is an example:

NTSTATUS vfs_my_module_init(void) 
{ 
        return smb_register_vfs(SMB_VFS_INTERFACE_VERSION, "my_module", 
                                &vfs_my_module_fns); 
}

The things to note are:

  1. As mentioned above, this function must be called vfs_<module_name>_init, it returns an NTSTATUS and does not take any parameters.
  2. It returns the result of calling smb_register_vfs with three arguments as shown.
  3. You can name the variable that contains the functions you implement anything you want, however the practice has been to name it as shown.
  4. If registration fails, none of the routines in your module will be called, but there are likely to be bigger problems, in that case.

This code can be cut from an existing module and pasted into yours with the appropriate changes made.

NOTE! If your module has undefined symbols (not defined in Samba or another shared library), then Samba will not even call your module's init function, and attempts to connect to the share will fail.

VFS Function Pointer Structure

Your module must declare and initialize a struct vfs_fn_pointers structure. The following is an example.

static struct vfs_fn_pointers vfs_my_module_fns = { 
       .getxattr_fn = my_module_getxattr, 
       .fgetxattr_fn = my_module_fgetxattr, 
       .setxattr_fn = my_module_setxattr, 
       .fsetxattr_fn = my_module_fsetxattr, 
       .listxattr_fn = my_module_listxattr, 
};

The variable must be declared static so that it does not cause conflicts with any symbol exported by Samba or any other module. In addition, you only need to initialize pointers to just those VFS functions you are implementing (using the C99 initialization syntax.) You would generally declare this variable before you declare the init function discussed above.
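The reason partial initialization is safe is a C99 guarantee: members that are not named in a designated initializer are initialized as if they had static storage, which for pointers means NULL. The sketch below demonstrates this with a small stand-in structure. struct toy_fn_pointers and its members are invented for the example and are far smaller than the real struct vfs_fn_pointers.

```c
#include <assert.h>
#include <stddef.h>

/* Toy function table; the real struct vfs_fn_pointers has many more members. */
struct toy_fn_pointers {
    int (*connect_fn)(const char *service);
    void (*disconnect_fn)(void);
    long (*getxattr_fn)(const char *path, const char *name);
    long (*listxattr_fn)(const char *path);
};

static long toy_getxattr(const char *path, const char *name)
{
    (void)path;
    (void)name;
    return 0;
}

/* Only one member is named; C99 sets all the others to NULL. */
static struct toy_fn_pointers toy_fns = {
    .getxattr_fn = toy_getxattr,
};

/* Count the members a module actually filled in. */
static int toy_count_implemented(const struct toy_fn_pointers *fns)
{
    int n = 0;

    assert(fns != NULL);
    n += (fns->connect_fn != NULL);
    n += (fns->disconnect_fn != NULL);
    n += (fns->getxattr_fn != NULL);
    n += (fns->listxattr_fn != NULL);
    return n;
}
```

Here toy_count_implemented(&toy_fns) is 1, and a dispatcher can treat every NULL entry as "not implemented by this module".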

Include Files

Your module will need to include some header files. You will need includes.h, but you might also need to include a few more:

  • system/filesys.h if you need access to many of the file system calls, like fcntl, etc. See lib/replace/system/filesys.h to determine what system include files this file pulls in.
  • smbd/smbd.h if you need access to definitions for NT ACLs etc.

VFS Functions

These are the meat of your Samba VFS module and I can only provide generic information here. Functions in Samba modules return several different types:

  1. int return values, in which case a value less than zero means an error has occurred, and the error value is available in errno, or
  2. NTSTATUS return value. Here, if the underlying functions you are calling communicate errors through errno then you have to convert them to NTSTATUS values using map_nt_error_from_unix, or
  3. Pointers to things like SMB_STRUCT_DIR where you return NULL to indicate an error and set errno to a UNIX error.

If your functions are adding functionality to that already provided by Samba or existing modules in the stack (after your module) you will generally make calls to SMB_VFS_NEXT_XXX, where XXX is the name of the function you are providing (eg, UNLINK if you are providing UNLINK functionality, in which case you will call SMB_VFS_NEXT_UNLINK).

You can also call any other VFS function that is relevant, eg SMB_VFS_STAT, but you will have to ensure that you pass the correct parameters, eg:

       ret = SMB_VFS_STAT(handle->conn, smb_fname_cpath);

This brings us to the parameters that your functions will have to deal with. The first parameter passed to each Samba VFS function is a pointer to vfs_handle_struct, which contains information you might need, like the connection structure (share, etc) that the request relates to, and so forth. Another parameter you might receive is a pointer to a files_struct or a struct smb_filename. Others that you might also receive include character strings for paths, integer values, etc. You should peruse existing Samba VFS functions to see some of the values you might receive.

In addition, you should be aware that Samba has an extended STAT structure, SMB_STRUCT_STAT. In some versions of Samba (3.6.0 and above, I think) you can use init_stat_ex_from_stat to convert a normal Unix struct stat variable into an SMB_STRUCT_STAT for return to Samba. However, if the underlying module you are extending has its own extended stat structure that is not compatible with SMB_STRUCT_STAT you will have to supply a routine to convert your stat struct to an SMB_STRUCT_STAT.

Memory Management and talloc

You should have an understanding of talloc if you work on Samba VFS modules. You can find more information on the whole talloc library in <source-dir>/lib/talloc/talloc_guide.txt. Because talloc is a hierarchical allocation system that allows you to free all allocations within a single context with one call to talloc_free, it makes memory management much easier. To do this, talloc makes use of talloc contexts to keep track of allocations.

You should consider the following rules of thumb:

  1. Always use talloc routines rather than malloc, calloc, etc.
  2. Some VFS routines are called with a talloc context as one of their arguments. You should use the supplied talloc context for all allocations in such routines unless you have a good reason to use a different context.
  3. If the memory you are allocating needs to survive until the client disconnects from the share, then use the connection structure, handle->conn, as your talloc context.
  4. If the memory you are allocating needs to survive for the duration of an open file, then use the files struct as your talloc context.
  5. If the memory you are allocating will be used in a separate thread and needs to survive beyond any of the other contexts mentioned here, create a new talloc context with talloc_new(NULL). Of course, you are responsible then to call talloc_free on the context at some time in order to clean up the memory.
  6. If the memory you are allocating should be de-allocated somewhere above you or when the current SMB request (that provoked the VFS call) completes, then use talloc_tos().

You should note that talloc_tos() will give you the current top of stack of the stack of talloc contexts, and the memory you have allocated using talloc_tos() as a context will be freed as soon as the current talloc stack frame goes away. You can always, of course, explicitly free memory you know is no longer needed with talloc_free.
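If hierarchical allocation is new to you, the following toy allocator may help. It is emphatically not the talloc API (toy_alloc, toy_free and struct toy_ctx are made up for this sketch, and real talloc does far more), but it shows the one property the rules of thumb rely on: freeing a context frees everything that was allocated underneath it.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy model of hierarchical allocation; this is NOT the talloc API. */
struct toy_ctx {
    struct toy_ctx *parent;
    struct toy_ctx *first_child;
    struct toy_ctx *next_sibling;
};

static int toy_live;    /* number of allocations not yet freed */

/* Allocate a node with 'size' extra bytes as a child of 'parent' (or a root). */
static struct toy_ctx *toy_alloc(struct toy_ctx *parent, size_t size)
{
    struct toy_ctx *c = calloc(1, sizeof(*c) + size);

    if (c == NULL) {
        return NULL;
    }
    c->parent = parent;
    if (parent != NULL) {
        c->next_sibling = parent->first_child;
        parent->first_child = c;
    }
    toy_live++;
    return c;
}

/* Free 'c' and, recursively, everything allocated underneath it. */
static void toy_free(struct toy_ctx *c)
{
    if (c == NULL) {
        return;
    }
    /* Detach from the parent's child list first. */
    if (c->parent != NULL) {
        struct toy_ctx **link = &c->parent->first_child;

        while (*link != c) {
            link = &(*link)->next_sibling;
        }
        *link = c->next_sibling;
    }
    /* Each recursive call unlinks the child, so first_child advances. */
    while (c->first_child != NULL) {
        toy_free(c->first_child);
    }
    toy_live--;
    free(c);
}

/* Returns the number of allocations still live after freeing the root. */
static int toy_alloc_demo(void)
{
    struct toy_ctx *conn = toy_alloc(NULL, 0);      /* plays handle->conn */
    struct toy_ctx *config = toy_alloc(conn, 64);   /* lives as long as conn */
    struct toy_ctx *name = toy_alloc(config, 16);   /* lives as long as config */

    assert(conn != NULL && config != NULL && name != NULL);
    toy_free(conn);     /* one call releases all three */
    return toy_live;
}
```

In this model toy_alloc_demo() leaves no live allocations behind, which is the behavior you get for free when you hang your module's data off handle->conn or the files struct.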

Providing Context between Calls

As mentioned above, the first parameter to all VFS functions is the vfs_handle_struct, which is unique for each share and module, so you can store context information in the structure pointed to by the handle. You can save information in the 'handle' in the following way:

        config = talloc_zero(handle->conn, struct my_module_config_data); 
        if (!config) { 
               SMB_VFS_NEXT_DISCONNECT(handle); 
               DEBUG(0, ("talloc_zero() failed\n"));
               return -1; 
        } 
        /* Store the config pointer allocated above in the handle. */
        SMB_VFS_HANDLE_SET_DATA(handle, config, 
                                NULL, struct my_module_config_data, 
                                return -1);

SMB_VFS_HANDLE_SET_DATA is a macro, and its arguments are:

  1. handle, the VFS handle.
  2. A pointer to some data that you want to associate with the handle.
  3. A pointer to a function to free the data you are saving. It is set to NULL above, which means that this VFS module will explicitly free the data (in a disconnect function.)
  4. The data type of the structure that param 2 points to.
  5. A command to be executed if handle is NULL.

You can use this handle data to keep track of information relating to the file system backing the share, or to maintain parameters related to this instance of the share, or both. It is a pointer to a structure you declare. You can subsequently retrieve the handle data in your VFS functions using the following macro:

        SMB_VFS_HANDLE_GET_DATA(handle, config, 
                                struct my_module_config_data, return next); 

You should also be aware of the macros SMB_VFS_HANDLE_FREE_DATA and SMB_VFS_HANDLE_TEST_DATA. Check the include file source3/include/vfs.h.
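Conceptually, the handle carries a data pointer together with an optional function for freeing it. The stand-alone model below sketches that mechanism so you can see the life cycle in one place. It is an approximation written for this page: struct toy_handle, toy_set_data and friends are hypothetical, the allocation uses calloc rather than talloc to stay self-contained, and you should read source3/include/vfs.h for what the real macros do.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy handle: models only the private-data slot, not vfs_handle_struct. */
struct toy_handle {
    void *data;
    void (*free_data)(void **data);
};

struct toy_config {
    int some_bool_param;
};

static int toy_frees;   /* counts calls to the free function, for the demo */

static void toy_config_free(void **data)
{
    free(*data);
    *data = NULL;
    toy_frees++;
}

/* Attach 'data' to the handle, releasing any previous data first. */
static void toy_set_data(struct toy_handle *h, void *data,
                         void (*free_fn)(void **))
{
    if (h->free_data != NULL) {
        h->free_data(&h->data);
    }
    h->data = data;
    h->free_data = free_fn;
}

/* What a connect function would do: build the config and store it. */
static int toy_connect(struct toy_handle *h)
{
    struct toy_config *config = calloc(1, sizeof(*config));

    if (config == NULL) {
        return -1;
    }
    config->some_bool_param = 1;
    toy_set_data(h, config, toy_config_free);
    return 0;
}

/* What any later VFS function would do: fetch the config back. */
static int toy_read_param(const struct toy_handle *h)
{
    const struct toy_config *config = h->data;

    assert(config != NULL);
    return config->some_bool_param;
}

/* What a disconnect function would do: drop the data again. */
static void toy_disconnect(struct toy_handle *h)
{
    toy_set_data(h, NULL, NULL);    /* runs the stored free function */
}
```

The design point is that every VFS function receives the same handle, so anything stored at connect time is reachable later without global variables.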

Module Specific Parameters

You might also want to retrieve module-specific parameters from the smb.conf file in your connect function. This can be done using:

        config->some_bool_param = lp_parm_bool(SNUM(handle->conn),
                                               "my_module", "someboolparam", true); 

These parameters should be entered in the smb.conf file in the format:

[global]
...
    my_module:someboolparam = yes 
...

Such parameters can also appear in share sections.

There are also other parameter retrieving functions you should be aware of, like:

  • lp_parm_const_string, which returns a pointer to a const string,
  • lp_parm_talloc_string, which returns a pointer to a new string created with a call to a talloc routine,
  • etc.

You can find examples of these in other VFS modules and you can find all such functions in source3/param/loadparm.c.
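The "module:option" naming and the role of the default argument can be sketched without Samba at all. The code below is a simplified stand-in, not loadparm.c: toy_parm_string and toy_parm_bool are invented names, the parameter table is hard-coded, the cachedir option is hypothetical, and only a few lower-case boolean spellings are recognised, which is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for parsed smb.conf lines of the form "module:option = value". */
struct toy_param {
    const char *key;    /* "module:option" */
    const char *value;
};

static const struct toy_param toy_share_params[] = {
    { "my_module:someboolparam", "no" },
    { "my_module:cachedir", "/var/cache/my_module" },   /* hypothetical option */
};

/* Return the raw string for module:option, or NULL if it is not set. */
static const char *toy_parm_string(const char *module, const char *option)
{
    size_t mlen = strlen(module);
    size_t n = sizeof(toy_share_params) / sizeof(toy_share_params[0]);

    for (size_t i = 0; i < n; i++) {
        const char *key = toy_share_params[i].key;

        if (strncmp(key, module, mlen) == 0 && key[mlen] == ':' &&
            strcmp(key + mlen + 1, option) == 0) {
            return toy_share_params[i].value;
        }
    }
    return NULL;
}

/* Boolean lookup with a caller-supplied default, in the spirit of lp_parm_bool. */
static bool toy_parm_bool(const char *module, const char *option, bool def)
{
    const char *v;

    assert(module != NULL && option != NULL);
    v = toy_parm_string(module, option);
    if (v == NULL) {
        return def;             /* option not present: use the default */
    }
    if (strcmp(v, "yes") == 0 || strcmp(v, "true") == 0 || strcmp(v, "1") == 0) {
        return true;
    }
    if (strcmp(v, "no") == 0 || strcmp(v, "false") == 0 || strcmp(v, "0") == 0) {
        return false;
    }
    return def;                 /* unrecognised value: fall back to the default */
}
```

With the table above, toy_parm_bool("my_module", "someboolparam", true) is false because the option is set to "no", while a lookup of an option that is not set simply returns whatever default the caller passed in.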

Extending the Samba files_struct structure

In addition to the above functions, you can extend Samba's files_struct with an extension of your own. Each module in the stack can add their own extension, but only one extension can be added per file per module. You add the extension with:

        p_var = (struct my_struct *) VFS_ADD_FSP_EXTENSION(handle, 
                                                           fsp,  struct my_struct, 
                                                           NULL); 

after which you can update the fields in the structure that you now have a pointer to. You can fetch an extension with:

        p_var = (struct my_struct *)VFS_FETCH_FSP_EXTENSION(handle, fsp);

There are also VFS_REMOVE_FSP_EXTENSION and VFS_MEMCTX_FSP_EXTENSION, which can be found in source3/include/vfs.h, although they reduce to functions in source3/smbd/vfs.c.

You should use talloc when you allocate space for your extension and the best talloc context to use at this point is the fsp itself because it means that your extension will be released when the fsp is released.

A good choice of talloc functions would be talloc_zero.

AIO Handling in a VFS Module

Samba supports the use of AIO and provides a number of VFS functions to allow VFS module writers to also support AIO. With AIO, as with a number of other areas you need to be aware of Samba's TEVENT facility. A good starting point for studying how a module can implement AIO is the default VFS module.

NOTE! These need updating.

These functions are:

Function Description
SMB_VFS_PREAD_SEND This is used to initiate an AIO read request. If all went well initiating the request, return 0, otherwise return -1 after setting errno to something appropriate.
SMB_VFS_PWRITE_SEND This is used to initiate an AIO write request. If all went well initiating the request, return 0, otherwise return -1 after setting errno to something appropriate.
SMB_VFS_AIO_RETURN This is used to retrieve the returned status from a successfully initiated AIO operation. That is, whether it ultimately succeeded or failed.
SMB_VFS_AIO_CANCEL This is used to cancel an already initiated AIO operation. Return AIO_CANCELED, AIO_NOTCANCELED, AIO_ALLDONE or -1 as appropriate, and set errno where appropriate.
SMB_VFS_AIO_ERROR This is used to retrieve the status of AIO operations that were successfully initiated. Return EINPROGRESS, ECANCELED or another error to indicate an error, or 0 to indicate that the operation has successfully completed.
SMB_VFS_AIO_FSYNC Samba 3 does not currently use this VFS routine.
SMB_VFS_AIO_SUSPEND This is used to clean up initiated AIO operations when a client drops a connection. Consult the Samba code for more details, specifically source3/smbd/aio.c.
SMB_VFS_AIO_FORCE This is used to tell Samba whether or not the read or write operation Samba is about to initiate via AIO should be performed via AIO. That is, your module gets to veto the initiation of AIO requests on a request by request basis if it wants to. Return FALSE if you are happy to allow the operation to be an AIO operation, otherwise return TRUE if you don't want that operation being sent via AIO.

The default behavior is to call the standard system AIO routines, aio_read/aio_read64, aio_write/aio_write64 and aio_return/aio_return64.

The main thing to be aware of here is that if you support AIO in your VFS module, and you do not simply pass the requests on to the normal kernel AIO routines (either via sys_aio_xxx routines or directly via system calls), then you must simulate the normal AIO completion behavior. That is, you must signal RT_SIGNAL_AIO somewhere in your module (perhaps in the async threads) when the operations ultimately complete.

Conforming to the VFS Interfaces

Many of the VFS routines provide a POSIX interface. This means that they must return a value of -1 if an error has occurred and must set errno to a POSIX error value. Otherwise they should return 0 or greater if no error has occurred.

An example is SMB_VFS_GETXATTR, which is used to retrieve an XATTR on a file. If the passed in buffer is too small to contain the XATTR on the file, the routine should return -1 as its result and set errno to ERANGE.
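The ERANGE convention can be sketched as follows. This is a stand-alone toy, not a Samba getxattr_fn: the single hard-coded attribute and the function name toy_getxattr are made up for the example. The treatment of a zero size as a length query mirrors what getxattr(2) does on Linux.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/types.h>

/* Toy attribute store with one attribute; not a real xattr backend. */
static const char toy_attr_name[] = "user.comment";
static const char toy_attr_value[] = "hello";   /* 5 bytes, NUL not stored */

static ssize_t toy_getxattr(const char *name, void *buf, size_t size)
{
    size_t len = sizeof(toy_attr_value) - 1;

    if (strcmp(name, toy_attr_name) != 0) {
        errno = ENODATA;        /* no such attribute */
        return -1;
    }
    if (size == 0) {
        return (ssize_t)len;    /* size query, as getxattr(2) does on Linux */
    }
    if (size < len) {
        errno = ERANGE;         /* caller's buffer is too small */
        return -1;
    }
    memcpy(buf, toy_attr_value, len);
    return (ssize_t)len;
}

/* Caller side: test the return value first, then look at errno. */
static int toy_getxattr_demo(void)
{
    char small[2];
    char big[16];
    ssize_t n;

    errno = 0;
    if (toy_getxattr(toy_attr_name, small, sizeof(small)) != -1 ||
        errno != ERANGE) {
        return -1;
    }
    n = toy_getxattr(toy_attr_name, big, sizeof(big));
    assert(n <= (ssize_t)sizeof(big));
    return (n == 5 && memcmp(big, "hello", 5) == 0) ? 0 : -1;
}
```

The important habit is on the failure paths: errno is set immediately before returning -1, so nothing in between can overwrite it.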

Other routines return an NTSTATUS result, and you have to test them with the correct macros. Failing to conform to the correct interface semantics can cause bad results. Generally, the compiler will catch problems, except failing to set errno as discussed above.

Be prepared for "stat" opens

There are cases where Samba will call VFS modules with an FSP that refers to a file that has not been opened. In these cases fsp->fh->fd will contain the value -1. This can happen when Windows opens the file with an access mode that only requests READ ATTRIBUTES, for example. Be prepared for such cases and do not assume you will always have a valid value in fsp->fh->fd.

However, in VFS modules that must read or write files, the file you must access will already be open.
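A defensive check for this case might look like the sketch below. It uses a cut-down stand-in for the FSP (struct toy_fsp is hypothetical; in Samba the descriptor lives at fsp->fh->fd as described above) and shows two typical decisions: fall back to the path when only metadata is needed, and fail cleanly when an operation really needs a descriptor. The choice of EBADF is an assumption of the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Toy stand-in for the part of files_struct that holds the descriptor. */
struct toy_fsp {
    int fd;             /* -1 for a "stat" open: no descriptor was opened */
    const char *path;   /* name the handle refers to */
};

enum toy_source { TOY_BY_FD, TOY_BY_PATH };

/* Decide how a metadata lookup should reach the file. */
static enum toy_source toy_stat_source(const struct toy_fsp *fsp)
{
    assert(fsp != NULL);
    return (fsp->fd == -1) ? TOY_BY_PATH : TOY_BY_FD;
}

/* Data I/O needs a real descriptor; refuse cleanly if there is none. */
static int toy_require_fd(const struct toy_fsp *fsp)
{
    if (fsp->fd == -1) {
        errno = EBADF;
        return -1;
    }
    return fsp->fd;
}
```

A function that only reads attributes would branch on toy_stat_source(), while a read or write path would call toy_require_fd() and propagate the error.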

The Meaning of Path Names

What's in path names passed down to VFS functions? Are they absolute? Are they relative? Relative to what?

The short answer is: "Paths are relative to the root of the share, except when they are not". A slightly longer answer is "Paths are either absolute or relative to the current working directory". Both answers could be confusing to module writers, so the following elaborates on that.

At the SMB protocol level, path names are relative to the root of the share. When processing SMB requests, smbd changes the working directory (chdir()) to the root of the share, and then typically forwards those paths as-is (translated to UTF-8). Thus, a VFS module would normally see the same path requested by the SMB protocol, which is a path relative to the root of the share. However, there are exceptions:

  1. When changing the working directory, smbd also calls the VFS chdir_fn function and passes it an absolute path: the path of the share. Thus, if you implement chdir_fn, expect to see absolute paths.
  2. Some VFS modules modify the path and forward the request to the layer below them. Some of them can modify the path into an absolute path, depending on configuration (for example, vfs_shadow_copy2, if configured with an absolute snapsharepath).
  3. Finally, smbd itself could use relative paths which are not relative to the root of the share. This happens when smbd changes into a directory inside the share to perform some "sensitive stuff". One example is changing the ownership of the file into that of the parent directory (inherit owner = yes). However, the recurring pattern with this kind of chdir() is that smbd first makes a relative chdir (to get into the path inside the share), and then returns to the original path using an absolute path.

So as a module writer, how do I cope with all this?

Handling Absolute Paths

Absolute paths usually have meaning only in the context of real file systems. A module that exposes a kernel file system (such as the default VFS module or the btrfs module) would have no problem handling an absolute path. On the other hand, getting an absolute path can be confusing to a module which provides a user-space file system.

As noted above, smbd would only use absolute paths in order to change directory into the share's root directory. A user-space module can simply ignore this kind of chdir or use it to flag the current state as "in share root dir".

On the other hand, some VFS modules could modify the path into an absolute path. The simplest way for a user-space file system VFS module to deal with that is to say that it cannot be stacked with other modules which convert the path to an absolute path.

Handling Relative Paths

As explained above, a relative path would mostly be relative to the share's root, and always relative to the current working directory.

One way to cope with relative paths therefore is to keep track of the current working directory. The default VFS module, for example, does this: the kernel keeps track of the current directory for it.

The following algorithm could be used by VFS modules to track the "current relative working directory" by implementing the chdir_fn VFS function:

  1. Initial state is "at share's root" - current relative directory is an empty string.
  2. A chdir with an absolute path can be assumed to be "back into the share's root" - clear the current relative directory (this rule could be broken by another VFS module but it's easily verifiable by comparing with handle->conn->connectpath).
  3. A chdir with a relative path appends the path to the current relative directory.
  4. For any other file operation, to reconstruct the path relative to the share's root, append the supplied path to the current relative directory.
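The four steps above can be sketched as plain C, independent of Samba. All names here (demo_cwd_tracker, demo_cwd_chdir and so on) are hypothetical; in a real module the state would live in the module's per-connection data, the update would happen inside chdir_fn, and connectpath would come from handle->conn->connectpath. The sketch assumes paths are not normalized, so "." and ".." components are appended as-is.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>

#ifndef PATH_MAX
#define PATH_MAX 4096
#endif

/* Hypothetical per-connection state; "" means "at the share's root". */
struct demo_cwd_tracker {
        char rel_dir[PATH_MAX];
};

/* Step 1: start at the share's root. */
static void demo_cwd_init(struct demo_cwd_tracker *t)
{
        assert(t != NULL);
        t->rel_dir[0] = '\0';
}

/* Join dir and path with one '/' into out; -1 if it does not fit. */
static int demo_join(const char *dir, const char *path,
                     char *out, size_t out_size)
{
        int n;

        if (dir[0] == '\0') {
                n = snprintf(out, out_size, "%s", path);
        } else {
                n = snprintf(out, out_size, "%s/%s", dir, path);
        }
        return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}

/*
 * Steps 2 and 3: record a chdir. Returns 0 on success, or -1 for an
 * absolute path other than the share's root or an over-long result.
 */
static int demo_cwd_chdir(struct demo_cwd_tracker *t, const char *path,
                          const char *connectpath)
{
        char joined[PATH_MAX];

        if (path[0] == '/') {
                /* Step 2: verify the assumption, then reset. */
                if (strcmp(path, connectpath) != 0) {
                        return -1;
                }
                t->rel_dir[0] = '\0';
                return 0;
        }
        /* Step 3: a relative chdir extends the tracked directory. */
        if (demo_join(t->rel_dir, path, joined, sizeof(joined)) != 0) {
                return -1;
        }
        memcpy(t->rel_dir, joined, strlen(joined) + 1);
        return 0;
}

/* Step 4: rebuild a path relative to the share's root. */
static int demo_cwd_resolve(const struct demo_cwd_tracker *t,
                            const char *path, char *out, size_t out_size)
{
        return demo_join(t->rel_dir, path, out, out_size);
}
```

For example, after relative chdirs into dir1 and then dir2, resolving a.txt yields dir1/dir2/a.txt, and an absolute chdir to the share's root clears the tracked directory again. Rejecting any other absolute path is one possible policy for the case where another stacked module breaks the assumption in step 2.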

A different way to cope with relative paths is to ignore the cases where the current directory is not the share's root, either because your module doesn't care about the exact location of the file (for example, it performs some character conversion on path names), or because you don't intend to support the cases where the current directory is not the share's root.

Building, Installing and Debugging your VFS Module

Building Your VFS Module

Unfortunately, there is currently no way to build your VFS module outside the Samba source tree. This means that you will have to add your source code to the Samba source tree to build it.

At a minimum you will have to create a new section in source3/modules/wscript_build describing your module.

Basic VFS Module Building

Here is a small VFS module I added for other reasons along with its module source code and the wscript addition. First the source code for the module vfs_fake_compression:

/*
   Unix SMB/CIFS implementation.

   Copyright (C) David Disseldorp 2011-2013
   Copyright (C) Richard Sharpe 2014

   Provide a simple VFS module that implements what HyperV needs in 
   compression support, called fake compression.

   This program is free software; you can redistribute it and/or modify
   it under the terms of the GNU General Public License as published by
   the Free Software Foundation; either version 3 of the License, or
   (at your option) any later version.

   This program is distributed in the hope that it will be useful,
   but WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
   GNU General Public License for more details.

   You should have received a copy of the GNU General Public License
   along with this program.  If not, see <http://www.gnu.org/licenses/>.
*/

#include "includes.h"
#include "librpc/gen_ndr/ioctl.h"

static uint32_t fc_fs_capabilities(struct vfs_handle_struct *handle,
                                  enum timestamp_set_resolution *_ts_res)
{
       uint32_t fs_capabilities;
       enum timestamp_set_resolution ts_res;

       /* inherit default capabilities, expose compression support */
       fs_capabilities = SMB_VFS_NEXT_FS_CAPABILITIES(handle, &ts_res);
       fs_capabilities |= FILE_FILE_COMPRESSION;
       *_ts_res = ts_res;

       return fs_capabilities;
}

static NTSTATUS fc_get_compression(struct vfs_handle_struct *handle,
                                  TALLOC_CTX *mem_ctx,
                                  struct files_struct *fsp,
                                  struct smb_filename *smb_fname,
                                  uint16_t *_compression_fmt)
{
       *_compression_fmt = COMPRESSION_FORMAT_NONE;
       return NT_STATUS_OK;
}

static NTSTATUS fc_set_compression(struct vfs_handle_struct *handle,
                                  TALLOC_CTX *mem_ctx,
                                  struct files_struct *fsp,
                                  uint16_t compression_fmt)
{
       NTSTATUS status;

       if ((fsp == NULL) || (fsp->fh->fd == -1)) {
               status = NT_STATUS_INVALID_PARAMETER;
               goto err_out;
       }

       status = NT_STATUS_OK;
err_out:
       return status;
}

static struct vfs_fn_pointers fake_compression_fns = {
       .fs_capabilities_fn = fc_fs_capabilities,
       .get_compression_fn = fc_get_compression,
       .set_compression_fn = fc_set_compression,
};

NTSTATUS vfs_fake_compression_init(void);
NTSTATUS vfs_fake_compression_init(void)
{
       return smb_register_vfs(SMB_VFS_INTERFACE_VERSION,
                               "fake_compression", &fake_compression_fns);
}

Then the addition to source3/modules/wscript_build:

bld.SAMBA3_MODULE('vfs_fake_compression',
                 subsystem='vfs',
                 source='vfs_fake_compression.c',
                 deps='',
                 init_function='',
                 internal_module=bld.SAMBA3_IS_STATIC_MODULE('vfs_fake_compression'),
                 enabled=bld.SAMBA3_IS_ENABLED_MODULE('vfs_fake_compression'))

deps can be used to add additional dependencies; init_function must be set to an empty string.

And finally, add your new module to the list of modules built as shared objects by default, which is the default_shared_modules.extend call in source3/wscript:

--- a/source3/wscript
+++ b/source3/wscript
@@ -1576,6 +1576,7 @@ main() {
                                      vfs_worm
                                      vfs_crossrename vfs_linux_xfs_sgid
                                      vfs_time_audit idmap_autorid idmap_tdb2
+                                     vfs_fake_compression
                                      idmap_ad
                                      idmap_script
                                      idmap_rid idmap_hash idmap_rfc2307))

Note! I have shown this as a patch so you can see the context, but you get the idea, I am sure.

Advanced Building Tricks

Note! This information might not be quite correct as yet.

You might want to only build your module if the option --enable-yourmodule is used when configure is run.

You can achieve this with the changes below. Start by adding the following option to source3/wscript, just after the similar section for --enable-vxfs:

opt.add_option('--enable-yourmodule',
                  help=("enable support for YourModule (default=no)"),
                  action="store_true", dest='enable_yourmodule', default=False)

Then add another section to source3/wscript around the check for Options.options.enable_vxfs like the following:

    if Options.options.enable_yourmodule:
        conf.DEFINE('HAVE_YOURMODULE', '1')

Then add another section to source3/wscript around the place where HAVE_VXFS is being checked for with an if conf.CONFIG_SET('HAVE_VXFS') test:

    if conf.CONFIG_SET('HAVE_YOURMODULE'):
        default_shared_modules.extend(TO_LIST('vfs_yourmodule'))

Finally, add a section to source3/modules/wscript_build after the vfs_vxfs section like the following:

bld.SAMBA3_MODULE('vfs_yourmodule',
                 subsystem='vfs',
                 source='vfs_yourmodule.c',
                 init_function='',
                 internal_module=bld.SAMBA3_IS_STATIC_MODULE('vfs_yourmodule'),
                 enabled=bld.SAMBA3_IS_ENABLED_MODULE('vfs_yourmodule'))

If you need to link with a library, then the following should help.

First, in source3/wscript add the following somewhere:

conf.CHECK_LIB('some-lib', shlib=False)

Of course, if the library is shared, change False above to True.

Then in source3/modules/wscript_build add something like the following:

bld.SAMBA3_MODULE('vfs_yourmodule',
                 subsystem='vfs',
                 source='vfs_yourmodule.c',
                 init_function='',
                 deps='some-lib',
                 internal_module=bld.SAMBA3_IS_STATIC_MODULE('vfs_yourmodule'),
                 enabled=bld.SAMBA3_IS_ENABLED_MODULE('vfs_yourmodule'))

Installing Your Module

Debugging

To be continued.

Possibly Helpful Hints

Remember that SMB_VFS_REALPATH is called very early by Samba code that is dealing with file/path names. It is called from check_reduced_name, which is called from check_name. It is your first opportunity to deal with names. However, if you have enabled the wide links or follow symlinks options, SMB_VFS_REALPATH might not be called.