Difference between revisions of "SoC/Ideas"

(Add remote (RPC) support for Samba's registry configuration backend)
(link to applying page)
 
(187 intermediate revisions by 22 users not shown)
Line 1: Line 1:
=Google Summer of Code: Suggested Project ideas=
+
= Applying to Samba =  
  
==Samba 3==
+
See our main [[SoC|Google Summer of Code @ Samba]] page for '''important details''', [[SoC/Applying|plus these extended notes on applying]] to Samba.
  
===User-space ACL implementation===
+
= Google Summer of Code: Suggested Project ideas =
  
Samba4 contains an implementation of NT-style access control lists in user space. [http://git.samba.org/?p=vl/samba.git/.git;a=shortlog;h=ea_acl] contains a rudimentary start at porting this code to Samba3. This project should complete the implementation started by Volker Lendecke, together with a comprehensive set of tests verifying that individual operations in Samba3 perform the correct access checks.
+
The following are Samba project ideas for Summer of Code.
 +
Of course you are free to come up with Samba related ideas not listed here.
 +
Please discuss your planned project by either sending an email to [https://lists.samba.org/mailman/listinfo/samba-technical samba-technical@lists.samba.org] or joining us on irc://irc.freenode.net/#samba-technical.
  
Possible mentor: Volker Lendecke
+
==Samba==
  
===SACL VFS Module ===
+
Some additional possible GSoC topics can be found in Bugzilla in the form of bugs which are marked as "Feature request": [https://bugzilla.samba.org/buglist.cgi?query_format=advanced&short_desc=Feature%20request&bug_status=NEW&bug_status=ASSIGNED&bug_status=REOPENED&short_desc_type=allwordssubstr&product=Samba%204.0 here]. Questions regarding complexity and requirements should be directed to the technical mailing list.
  
Add support for file system SACLs (i.e. file system auditing) to Samba in a VFS module by utilizing Extended Attributes. This project should ensure that users are able to successfully view and modify auditing settings on files and directories using Windows Explorer.
+
<!-- Commented out possibly stale proposals
  
Possible mentors:
+
===Browsing support in Samba 4===
 +
Samba 4 still needs support for mailslots in general and in particular for the BROWSE mailslot. Should come with tests. Documentation of the BROWSER protocol is available here:
 +
http://msdn.microsoft.com/en-us/library/cc201609(PROT.10).aspx
 +
http://ubiqx.org/cifs/Browsing.html
 +
 
 +
*Difficulty: Hard
 +
*Language(s): C
 +
*Possible mentors: Stefan Metzmacher
 +
 
 +
===Implement login / logout related counter update===
 +
For the moment the attributes related to login and logout are not
 +
updated by Samba4.
 +
The goal of this project is to understand in which cases Windows updates
 +
the counters (i.e. most probably during interactive logon, but maybe also
 +
with some netlogon calls?) and to implement the counter and timestamp
 +
updates in Samba code so that this information is available.
 +
This project of course includes the development of unit tests.
 +
 
 +
*Difficulty: Easy
 +
*Language(s): C
 +
*Possible mentors: Andrew Bartlett
  
===Consolidate internal LDAP SASL support ===
+
===Improve regedit===
 +
 +
Last year someone started to write an ncurses-based registry editor. The editor could be improved, for example by giving it a better look and feel and by adding DCE/RPC winreg support so that it can connect to a remote registry.
  
Samba has two separate copies of LDAP support routines. One is used to access Active Directory servers when operating as a member server and the other is used for implementing the LDAP passdb backend feature. Applicants should be comfortable with LDAP directory services and SASL mechanisms such as GSS-SPNEGO.
+
See https://git.samba.org/?p=asn/samba.git;a=shortlog;h=refs/heads/regedit
  
Possible mentors:
+
* Difficulty: Medium
 +
* Language(s): C
 +
* Possible mentors: Andreas Schneider, Michael Adam
  
===Backport EndPointMapper and support for ncacn_ip_tcp===
 
  
Current Samba 3.0 releases only support MS-RPC over SMB named pipes. The SAMBA_4_0 code base supports RPC directly over TCP, which could be backported to some degree to the SAMBA_3_0 tree. A prerequisite task may be to backport the events framework first.
+
===Utilize libsmbclient server-side copy support in file managers===
  
Possible mentors: [[JelmerVernooij]]
+
With libsmbclient now supporting server-side copy requests via [https://git.samba.org/?p=samba.git;a=commit;h=f73bcf4934be89f83e86459bc695b7d28348565c cli_splice()], file managers making use of libsmbclient can be changed to utilize server-side copy support for greatly improved remote copy performance. Potential file manager targets include [https://bugzilla.gnome.org/show_bug.cgi?id=771022 GNOME Files/Nautilus (gvfs_smb)], Dolphin (kio_smb) and Kodi's File Manager.
  
===Alternative configuration backends===
+
*Difficulty: Easy, Medium
 +
*Language(s): C, C++
 +
*Possible mentors: David Disseldorp
  
'''NOTE''': Samba 3.2.0 has a registry configuration backend as an alternative to smb.conf, which makes this project obsolete at least to some extent - ''Michael''
 
  
The current smb.conf does not allow individual key/value pairs to be read and modified flexibly from within smbd. This project would explore using a registry-like backend that would allow more flexibility in managing Samba's configuration settings outside of a simple text editor. The LibElektra project provides a common configuration library with storage plugins. One possibility would be to implement an LDB backend along with a simple command line interpreter/editor.
+
===Windows Search Protocol WSP client library and torture tests===
  
The interaction between this and the registry shares in Samba3 should be carefully considered.
+
The Windows Search Protocol WSP is used to implement remote full filesystem indexing (indexed search) between Windows machines. We would like to support this functionality in Samba, interfacing with existing indexing tools on Unix systems (such as GNOME Tracker).
  
Possible mentors:
+
This is a DCE/RPC protocol. See http://msdn.microsoft.com/en-us/library/cc251767.aspx .
  
===Full SAM implementation ===
+
The student should write a (un)marshalling library to push and pull PDUs and an asynchronous client library on top of the Samba raw smb client library.
Provide a new database passdb backend that provides both the Unix and Win32 attributes without the use of external commands such as the "add user script".  This could be based around LDB (perhaps with the Samba4/AD layout), TDB, or some other database.  
 
  
Possible mentors:
+
The student should write sub-tests for smbtorture which should demonstrate how the protocol works against a Windows server.
 +
The student doesn't have to implement the Samba server code.
 +
Noel Power from SUSE has done some basic server implementation and should be able to give guidance.
  
===SNMP support===
+
*Difficulty: Medium, Hard
Explore supporting the LANMAN SNMP MIB included with Windows server operating systems in smbd, nmbd, and winbindd.
+
*Language(s): C, (Python)
 +
*Possible Mentors: Noel Power
  
Possible mentors:
+
-->
 +
=== Print System Asynchronous Remote Protocol Wireshark Dissectors ===
  
===Solve the overabundance of configuration parameters===
+
The Print System Asynchronous Remote Protocol ([https://msdn.microsoft.com/en-us/library/cc238080.aspx MS-PAR]) is a replacement for the synchronous Print System Remote Protocol (MS-RPRN). MS-PAR inherits many message and buffer formats from the old protocol, but allows for asynchronous submission and notification of print jobs. Further details of the protocol can be found in Günther and Andreas' [https://sambaxp.org/archive-data-samba/SambaXP2013-DATA/thu/track2/Guenther_Deschner_Andreas_Schneider-Printing_Samba_4.pdf SambaXP presentation].
(warning: long gloves required?)
 
  
Samba 3.0 includes many config options which are necessary only in extreme circumstances. Only a small percentage of the parameters are required in most installations. The challenge is to remove the more advanced settings from immediate misuse while still providing access to them when absolutely necessary.
+
The student should write Wireshark dissectors for MS-PAR.
 +
The student should improve existing smbtorture tests, which demonstrate how the protocol works against a Windows server.
  
Possible mentors:
+
*Difficulty: Medium
 +
*Language(s): C
 +
*Possible Mentors: Andreas Schneider (supported by Günther Deschner)
  
===Convert more of the Samba 4 IDL to be used in Samba 3===
+
===Ceph RADOS key-value store as an alternative to TDB===
3.2 will start using more autogenerated DCE/RPC pull/push functions. At the moment, winreg, wkssvc, dssetup, lsarpc, netlogon and samr have been converted. The following interfaces still have to be converted (and would ideally also have the related tests from Samba 4 pass against them):
 
  
* srvsvc
+
Ceph offers a highly scalable and fault-tolerant storage system. Samba is already capable of sharing data located on the [https://ceph.com/ceph-storage/file-system/ Ceph Filesystem], however scale-out sharing (the same data exposed by multiple Samba nodes) currently requires the use of [https://ctdb.samba.org/ CTDB] for consistent and coherent state across Samba cluster nodes. In such a setup CTDB provides a clustered database with persistent key-value data storage and locking. Database usage is abstracted out via a generic ''dbwrap'' interface.<br>
* svcctl (already started)
 
* eventlog (only 1 call left)
 
* ntsvcs (already started)
 
* spoolss (samba4's idl should be fixed first.)
 
  
Possible mentors: [[JelmerVernooij]]
+
Ceph's librados library provides an API for the storage and retrieval of arbitrary key-value data via the ''omap'' functions. A watch/notify protocol is also provided as a mechanism for synchronising client state (locking). Key-value data stored in the RADOS back-end inherits the same redundancy features as regular objects, making it a potentially good candidate as a replacement for CTDB in scale-out Samba clusters.
  
===Design and Implement a New Printer API for Better CUPS Integration===
+
This task involves the implementation and testing of a new ''dbwrap'' back-end that uses librados for the storage, retrieval and locking of Samba key-value state. Ideally, the candidate would also allow time for benchmarking.
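
To make the shape of such a back-end more concrete, here is a minimal Python sketch of a key-value store with per-record locking. It is an illustration only: the names KVBackend, InMemoryBackend, locked and increment are hypothetical and do not mirror the real ''dbwrap'' or librados APIs, and a plain dictionary stands in for the RADOS ''omap'' so that the example runs without a Ceph cluster.

```python
import threading
from abc import ABC, abstractmethod
from collections import defaultdict
from contextlib import contextmanager


class KVBackend(ABC):
    """Hypothetical interface that a key-value back-end would provide."""

    @abstractmethod
    def fetch(self, key):
        """Return the value stored under key, or None if it is absent."""

    @abstractmethod
    def store(self, key, value):
        """Store value under key, replacing any previous value."""

    @abstractmethod
    def delete(self, key):
        """Remove key if it exists."""

    @abstractmethod
    def locked(self, key):
        """Return a context manager holding an exclusive lock on key."""


class InMemoryBackend(KVBackend):
    """Stand-in for a real store: a dict plus one mutex per record."""

    def __init__(self):
        self._data = {}
        self._locks = defaultdict(threading.Lock)
        self._guard = threading.Lock()  # protects the lock table itself

    def fetch(self, key):
        return self._data.get(key)

    def store(self, key, value):
        self._data[key] = value

    def delete(self, key):
        self._data.pop(key, None)

    @contextmanager
    def locked(self, key):
        with self._guard:
            lock = self._locks[key]
        with lock:
            yield


def increment(db, key):
    """Read-modify-write of a counter record under the record lock."""
    with db.locked(key):
        current = int((db.fetch(key) or b"0").decode())
        db.store(key, str(current + 1).encode())
        return current + 1
```

In a real back-end the dictionary operations would be replaced by calls into the storage library and the mutex by a cluster-wide locking mechanism. The point of the sketch is only that callers see the same fetch/store/lock shape regardless of where the data lives.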
  
The current internal printing API used to interact with Unix printing systems (e.g. LPD or CUPS) mirrors the existing "print command" and other parameters from smb.conf.  These were originally designed to work with LPD and SysV spoolers.  The CUPS library has a much richer interface for interacting with applications such as Samba.  The goal of this project would be to redesign the current print system support to expose these more advanced features in CUPS to Samba administrators and to provide a better user experience when deploying Samba as a print server in mixed Windows/Unix networks.
+
*Difficulty: Medium
 +
*Language(s): C
 +
*Possible Mentors: David Disseldorp
  
Possible mentors:
+
=== Samba binary size reduction ===
  
===Finish the move to the new NDR based winbind protocol===
+
Samba has grown to quite a bloated beast. This task will focus on some areas where the bloat can easily be reduced, e.g.
 +
* Removal of unused autogenerated librpc code
 +
** RPC client and server code when only serialization functions are used
 +
** optional struct print routines (pidl noprint?)
 +
** Some knowledge of perl would help here (for pidl)
 +
* Add new build options to compile Samba without certain functionality
 +
** Undesired DCE/RPC services
 +
** Printing support
 +
** Legacy SMB/CIFS support (stretch goal)
  
Samba 3.2 has a libwbclient library which abstracts the access to winbindd. However, there are
+
==== The NDR parsing Gordian knot ====
some callers left which directly access the winbindd socket (e.g. wbinfo).
 
  
The student should add the missing functions to libwbclient and convert wbinfo
+
What we need is for someone to take the ideas in the page, and the
(and maybe also nss_winbind.so and pam_winbind.so) to only use libwbclient.so.
+
concepts from the patch and make them a practical solution for Samba.
  
The aim is to replace the struct based winbind protocol in samba3 with a new NDR based one.
+
A particular spot where we over-link is via the ndr-table subsystem,
The infrastructure for the autodetection and request handling between the new and old protocol
+
which links to most of our (large) generated parsers for every RPC
is already finished. Also the winbind internal communication uses almost only the new protocol.
+
protocol we know.
It's available in this branch http://gitweb.samba.org/?p=metze/samba/wb-ndr.git;a=shortlog;h=v3-2-wb-ndr.
 
  
The student should also start converting the winbindd code to support the new NDR based
+
Some bits of the code that link to ndr-table only want some metadata,
protocol also on the client socket. However the old struct based protocol should
+
but they end up linking to each NDR_* subsystem because the metadata is
still be available, but the old protocol requests should be implemented
+
at the bottom of each parser (the ndr_*.c files).
just as wrappers to the new protocol, so that the winbind internal logic
 
only uses the struct based interface.
 
  
The student should also extend the winbind torture tests in Samba4
+
The primary task may well end up being in Perl, or Python, or C; the
to test the struct based and NDR based interfaces automatically in the
+
challenge here will not be deep programming but lateral thinking about
build-farm (make test).
+
how to break up the dependency chains.
  
The student doesn't need to convert libwbclient to use the new protocol.
+
Making this more challenging or interesting (depending on your
 +
perspective), the RPC server is being rewritten, but for now the nexus
 +
in source3/rpc_server/rpc_ncacn_np.c and the calls to
 +
ndr_table_by_uuid() remain.
  
Possible mentors: Stefan Metzmacher
+
What applicants may wish to do is see if that can be re-written in
 +
such a way that does not require linking in the whole NDR
 +
parser, just to obtain the information actually used.
  
===Develop a .reg file registry dump/restore utility for Samba3===
+
====The broader issue====
  
The goal of this project is to develop a library for reading and writing .reg files
+
The challenge is that Samba over-links because some of the dependencies
as described here: [http://support.microsoft.com/?scid=kb%3Ben-us%3B310516&x=16&y=16].
+
are not fine grained enough.
A subcommand of the ''net'' utility would be used to save and restore parts or all
 
of Samba3's registry.
 
  
A simple project would develop this upon the existing Samba3 registry code.
+
The fundamental challenge is likely to be a lot of cases where:
This task can also be achieved as part of a more ambitious project described in the
 
next chapter.
 
  
Possible mentors: [[Obnox|Michael Adam]] and [[JelmerVernooij]]
+
* A depends on B
 +
* C depends on D
  
===Reconcile Samba3 and Samba4 registry code===
+
But B and C are in the same file, but B does not actually depend on C.
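
The A/B/C/D situation above can be sketched in a few lines of Python. This is an illustration only: the symbol names and object file names (a.o, bc.o, d.o) are hypothetical and are not taken from the Samba build. The sketch compares what A really needs with what gets pulled in when whole files are the unit of linking.

```python
from collections import defaultdict

# Hypothetical symbols and object files, mirroring the A/B/C/D example.
SYMBOL_DEPS = {"A": {"B"}, "B": set(), "C": {"D"}, "D": set()}
# B and C live in the same file, so they are always linked together.
SYMBOL_FILE = {"A": "a.o", "B": "bc.o", "C": "bc.o", "D": "d.o"}


def needed_symbols(root):
    """Symbols actually reachable from root (the fine-grained view)."""
    seen, stack = set(), [root]
    while stack:
        sym = stack.pop()
        if sym not in seen:
            seen.add(sym)
            stack.extend(SYMBOL_DEPS[sym])
    return seen


def linked_files(root):
    """Files pulled in when a whole file is linked for any one symbol."""
    by_file = defaultdict(set)
    for sym, path in SYMBOL_FILE.items():
        by_file[path].add(sym)
    seen, stack = set(), [SYMBOL_FILE[root]]
    while stack:
        path = stack.pop()
        if path in seen:
            continue
        seen.add(path)
        # Every symbol in the file drags in its own dependencies.
        for sym in by_file[path]:
            for dep in SYMBOL_DEPS[sym]:
                stack.append(SYMBOL_FILE[dep])
    return seen


def overlinked_files(root):
    """Files that are linked although no needed symbol lives in them."""
    ideal = {SYMBOL_FILE[sym] for sym in needed_symbols(root)}
    return linked_files(root) - ideal
```

Here A only needs B, yet linking A also pulls in d.o, because C shares a file with B and C depends on D. Splitting bc.o into separate files for B and C would remove that edge without touching any code in A.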
  
Samba4's registry code is a well structured mostly isolated library with replaceable
+
There are likely many other parts of Samba which link to large things
backends mounted onto hives, similar to Windows registry structure. Samba3 registry code
+
that are not needed.  For example, perhaps we should allow a build
has many legacy parts that are currently being hidden more and more under a winreg-api-like
+
without printing support, or parsing of the printing protocols?
interface. Special actions like dynamic overlays are tied to various registry subkeys
 
by "hooks".
 
  
Reconciling the two registry versions would mostly mean porting Samba4 registry code to Samba3,
+
The challenge is doing this in a way that is clean, and mostly done
things have to be adapted for Samba3 registry hooks functionality to be preserved.
+
in the build system, not by #ifdef in the code.
  
A .reg file utility could be obtained more or less as a side result of this project since there
+
Skills in determining binary dependencies, as well as the build system
is already some support for this in Samba4.
+
dependencies (to compare with) would be very helpful in this project.  
  
Possible Mentors: [[JelmerVernooij]] and [[Obnox|Michael Adam]]
 
  
===Add remote (RPC) support for Samba configuration===
+
*Difficulty: medium
 +
*Language(s): Some knowledge of C and python would be helpful
 +
*Possible Mentors: Andrew Bartlett
 +
*References: [https://lists.samba.org/archive/samba-technical/2019-November/134539.html patches by Andrew Bartlett in 2020]
  
Since recently, Samba has a registry based configuration backend: Configuration
+
=== Samba selftest efficiency improvement ===
data is stored inside the registry key HKEY_LOCAL_MACHINE\Software\Samba\smbconf.
 
Access to this configuration is available through a module that makes use of
 
the "reg_api" interface which is similar to the winreg API. This project would
 
at first develop a common api for registry access local through reg_api and
 
remote through the winreg rpc client code. This new module could be abstracted
 
from the code of the utility "net rpc registry". With this new module, the registry
 
configuration code could be enhanced to allow for remote configuration.
 
  
Possible Mentors: [[Obnox|Michael Adam]]
+
Samba's selftest and GitLab CI pipeline have grown into quite a bloated beast. Even where we save wall-clock time by using parallel virtual machines, the CPU time spent costs the Samba Team money and uses electricity more broadly.
  
==Samba 4==
+
This task will focus on some areas where the expense, bloat and duplication can easily be reduced, e.g.
 +
* De-duplication of duplicate test runs
 +
* Caching of compilation output between stages
 +
* Use of pipeline stages to avoid starting 20 potentially expensive jobs if a smoke-test build does not pass
  
===FRS: File Replication Service===
+
The ideal candidate will do some of their own investigation to show they understand how savings could be made, rather than just sticking to this list.
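
As a starting point for the de-duplication item, the following Python sketch finds tests that run in more than one CI job. It is a hedged illustration: the job and test names are made up, and how the per-job test lists would be extracted from Samba's selftest output is left open.

```python
from collections import defaultdict


def duplicated_tests(jobs):
    """Map each test that runs in more than one job to those jobs.

    jobs is a dict of job name -> iterable of test names (hypothetical
    input; it would have to be derived from real pipeline logs).
    """
    seen = defaultdict(set)
    for job, tests in jobs.items():
        for test in tests:
            seen[test].add(job)
    return {test: sorted(names) for test, names in seen.items() if len(names) > 1}


if __name__ == "__main__":
    # Made-up example data, for illustration only.
    example = {
        "job-one": ["test.alpha", "test.beta"],
        "job-two": ["test.beta", "test.gamma"],
        "job-three": ["test.delta"],
    }
    for test, names in duplicated_tests(example).items():
        print(test, "runs in", ", ".join(names))
```

A report like this only shows where duplication exists. Deciding which copy of a test can safely be dropped still requires understanding why each job runs it.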
The protocol used for the File Replication Service in Active Directory is not currently understood.
 
  
This project would be to document this protocol, and implement a working client as part of our torture suite. An ambitious student could try and implement a server as well, but getting the client done would be enough of a challenge for a SOC project. This project would suit someone who is interested in delving into the intricacies of protocol analysis.
+
*Difficulty: medium
 +
*Language(s): Some knowledge of C and python would be helpful, an understanding of GitLab pipelines will help a lot
 +
*Possible Mentors: Andrew Bartlett
  
Possible mentors:
+
===Samba AD DC as the ideal POSIX Directory===
  
===Compression in the ndr layer===
+
Samba is a great Active Directory Domain Controller, but it is not an ideal directory server for a large, passionate and important user base: sites with Samba SMB servers, but also general purpose Linux servers.  A smaller subset of these sites also have Linux desktops.  These sites may also have Windows servers, but they, like the Windows desktops, are not the focus.
The DRSUAPI DsGetNCChanges() call uses compression for replicating large chunks of objects.
 
  
We already have the MSZIP decompression working in samba4 and know the student should
+
These sites often used Samba + OpenLDAP, and are finding the move to Samba's AD DC a bit difficult, because schema extension is hard, some things are not done automatically (like uidNumber allocation), and in general the focus has been on matching Windows rather than listening to the needs of this part of our user base.
implement the MSZIP compression code used for the server code. The idea is to base
 
the work on zlib and contribute the results back to the zlib authors.
 
  
There's also another compression algorithm "XPRESS" used in windows 2003.
+
Specific research should be done into what FreeIPA does well in targeting this user segment, and what customisations advanced users of OpenLDAP apply.  
The algorithm is described in the MS-DRSR document of the Microsoft WSPP
 
documentation (See http://msdn2.microsoft.com/en-us/library/cc203213.aspx).
 
The student should implement windows 2003 compression/decompression.
 
  
The student should also write a testsuite, so that the compression is tested
+
This project would be to propose a number of specific improvements, and to add both tests and an implementation of these improvements to Samba.
automatically in the build-farm (make test).
 
  
The algorithm is also used between Outlook and Exchange, so this would also
+
*Difficulty: Hard
help the OpenChange team.
+
*Language(s): C, Python
 +
*Possible Mentors: Andrew Bartlett
  
Possible mentors: Stefan Metzmacher
+
===GitLab CI of Samba for non-linux platforms (FreeBSD in particular)===
  
===Windows Search Protocol WSP client library and torture tests===
+
Samba uses GitLab CI to improve the quality of our patches.  Efforts are currently underway to extend the docker container used from just Ubuntu 14.04 to later versions and other distributions.
 +
 
 +
However, we occasionally have issues ensuring Samba still builds and operates on FreeBSD and other non-linux platforms.
 +
 
 +
The idea would be to have a docker image and .gitlab-ci.yml code to support it that runs FreeBSD and then builds and runs Samba's testsuite inside that FreeBSD nested VM, while still outputting the results to the normal gitlab-ci.
 +
 
 +
This differs from just running GitLab CI runners on FreeBSD, as we need to auto-scale, destroy the host and guest at the end of the test, and run on Linux docker (such as the free GitLab.com CI runners).
 +
 
 +
As a stretch goal, it would be useful to be able to run some tests against a specific Linux kernel and a raw ext4 filesystem (rather than unionfs) via qemu, rather than depending on the docker host configuration.
  
The Windows Search Protocol WSP is used to implement remote full filesystem
+
*Difficulty: Medium
indexing (indexed search) between windows machines. We would like to
+
*Language(s): Python, shell, YAML
support this functionality in Samba, interfacing with existing
+
*Possible Mentors: Andrew Bartlett
indexing tools on Unix systems (such as beagle).
 
  
This is a new protocol based on SMB named pipes
+
===Integrate Samba AD DC deployment and management with Cockpit===
\pipe\ci_skads or \pipe\MSFTEWDS.
 
See http://msdn2.microsoft.com/en-us/library/cc216195.aspx.
 
  
The student should write a (un)marshalling library
+
A prototype at https://github.com/abbra/cockpit-app-samba-ad shows how we can integrate Samba AD deployment with Cockpit Linux management console. A goal of this task is to move forward with this prototype to produce a fully working Samba AD management tool for common operations supported by 'samba-tool' command line utility.
to push and pull PDUs and an async client library
 
on top of the samba4 raw smb client library.
 
  
The student should write sub tests for smbtorture
+
The prototype is only a demo of what could be done. For more comprehensive work, the following materials need to be consulted:
which should demonstrate how the protocol works against
 
a windows server.
 
  
The student doesn't have to implement the samba4 server code.  
+
* cockpit-design, which has provided well-researched UX and UI designs for Cockpit apps in multiple areas over the past few years, https://github.com/cockpit-project/cockpit-design/
 +
* SuSE YaST work around https://github.com/yast?q=samba (see https://sambaxp.org/fileadmin/user_upload/sambaxp2019-slides/mulder_sambaxp2019_samba_active_adrectory_tools_windows_admin.pdf for some details)
 +
* Cockpit starter kit, https://github.com/cockpit-project/starter-kit
 +
* 389-ds Cockpit plugin, https://pagure.io/389-ds-base/blob/master/f/src/cockpit/389-console
 +
* Cockpit virtual machines interfaces (both cockpit-docker and cockpit-podman), see more https://github.com/cockpit-project
  
Possible mentors:
+
The project would need to investigate:
  
===Samba4 Domain Member support===
+
* a possible UX and UI design
Samba4 needs various bits of work, to become a useful domain member:
+
* define base set of use cases that can be mapped to distinct Samba AD as covered in https://wiki.samba.org/index.php/Setting_up_Samba_as_an_Active_Directory_Domain_Controller, https://wiki.samba.org/index.php/Setting_up_Samba_as_a_Domain_Member, https://wiki.samba.org/index.php/Joining_a_Samba_DC_to_an_Existing_Active_Directory, and other domain controller related tasks as described in https://wiki.samba.org/index.php/User_Documentation
Students may wish to take on some or all of the tasks below
+
* build actual Cockpit app that implements a clearly defined subset of those use cases.
* Implement Kerberos handling in Samba4
 
* Research and improve other aspects of domain member support
 
  
Possible mentors:
+
Since this is a huge area, a successful project proposal would present the set of use cases it proposes to focus on, the scope of the work, and how the applicant expects to work on the deliverables, both technology- and process-wise.
  
===LDB ACLs===
+
It would be nice to see how this project would evolve after the GSoC session would end, since it is clearly a longer term work that would need to be passed over and shared with more contributors.
Currently, Samba4 uses a module known as 'kludge_acls' to perform some basic access control on Samba4's database.  This is highly inflexible, and needs to be replaced with real NT ACLs on the elements.
 
  
Possible mentors:
+
*Difficulty: Medium
 +
*Language(s): Python, JavaScript (React), CSS, HTML
 +
*Possible Mentors: Alexander Bokovoy
  
===CIFS POSIX extensions in Samba4===
 
Samba4 does not implement the CIFS POSIX extensions at this stage. 
 
  
A testsuite needs to be written, to prove consistent behavior, and the Samba3 compatible server implemented.
+
<!-- Commented out possibly stale proposals
  
Possible mentors:
+
===Make libsmbclient thread-safe for Gnome VFS===
  
===Implement a Berkeley DB backend for LDB===
+
libsmbclient is currently not thread-safe, even when different threads use different libsmbclient contexts. This has a big impact on use by the Gnome VFS code. The easiest fix is to add a "Big Lock" around all elements of libsmbclient that are not currently thread-safe. This is not fine-grained threading support, but would add mutexes to the creation and any manipulation of contexts. Locks also need to be added around any calls into the parameter reading/writing subsystems, and many other places within libsmbclient. Success would be a clean helgrind report on test code using multiple libsmbclient contexts to access common files from a single server/share.
(less important now we have transactions in tdb?)
 
  
LDB currently supports both TDB and LDAP backend storage
+
*Difficulty: Medium, Hard
mechanisms. Another alternative backend possibility is the Berkeley DB
+
*Language(s): C
database.
+
*Possible mentors: Jeremy Allison
 +
-->
  
Possible mentor:
+
==Linux Kernel SMB Client Improvements==
  
===Dbench and Nbench workload generator based on Samba 4===
+
The Linux Kernel has a module called cifs.ko which is independent from Samba (it doesn't share code) that allows users to mount remote shares. It supports multiple dialects of SMB (1, 2, 3). The protocol dialects are now officially documented by Microsoft (See [https://msdn.microsoft.com/en-us/library/cc246231.aspx MS-SMB], [https://msdn.microsoft.com/en-us/library/cc246482.aspx MS-SMB2]) so students shouldn't have to worry about reverse engineering to understand them. The [https://www.wireshark.org/ Wireshark] open source network sniffer and dissector is a very good learning tool as well.
  
Samba 4 has two NTVFS backends (cifs and nbench) which make it possible to create a CIFS proxy which:
 
  
# intercepts CIFS traffic and forwards that to a remote server, and
+
Interested students should contact Steve French or the [mailto:linux-cifs@vger.kernel.org linux-cifs mailing list] to discuss possible improvements to the Linux Kernel CIFS VFS client.
# writes down a workload scenario file for the Nbench and Dbench tools
 
  
For every connection one log file is written. These logs can
+
=== Add machine-readable debug & stats /proc file ===
later be replayed by nbench/dbench against any file system/server to
+
* We currently output debug and statistic information under /proc/fs/cifs/ (DebugData, Stats, ...). We need to stop outputting free-format text that breaks every parser out there each time we add things to it. Clean up the cifsdebug.c file (it is kind of messy). Possibly generate a hierarchy of /proc files (e.g. a dir per tcp connection, subdirs for session, files for tcons) instead of dumping everything in one file.
reproduce the same workload. Implementation of the following things
+
* Make a nice visualizer/dashboard thing to get an overview. This could be console or GUI or...
will allow to create a specialized workload generator to test various
+
* Language: C for the kernel stuff, Userspace can be C, C++, Python.
usage scenarios based on real world applications:
+
* Difficulty: Low
  
# Add a functionality to post-process generated logs to create a combined dbench/nbench scenario representing multiple-client access pattern.
+
=== Add additional ftrace (trace-cmd) tracepoints and better GUI ===
# Add CIFS Posix Extensions support to both Nbench NTVFS backend and dbench/nbench tools so that Linux applications running against Linux CIFS file system could be profiled.
+
* Add more dynamic tracepoints to cifs.ko for commonly needed user scenarios, and add a GUI (and/or CLI) tool to make it easier to enable/disable cifs.ko dynamic trace points (see /sys/kernel/debug/tracing/events/cifs/ for the pseudo-files that are currently configured manually for tracing or via trace-cmd). Make a nice native/console/web UI for it.
# Package resulting solution as simply-installable and configurable application similar to Samba4WINS package
+
* Language: C (any kernel changes) and userspace C, C++ or Python
 +
* Difficulty: Low
  
Possible mentors:
+
=== Add performance analysis cli tools ===
 +
* Add more perf tools for the SMB3 client (similar to iostat or nfsstat) that leverage (and possibly extend) what is captured in /proc/fs/cifs/Stats but make it easier to analyze the performance of a cifs mount
 +
* Language: C, C++ or Python
 +
* Difficulty: Low
  
===Re-implement smbclient in python===
+
=== Write the One-True-Tool to unify probe/setup/configuration cifs.ko properly ===
 +
* There are too many knobs in different places at the moment: request-keys, idmap, cifscreds, /proc stuff. The goal of this project would be to write one CLI tool that would wrap everything under a common interface. It could handle getting/setting ACLs as well.
 +
* This would be a userspace project.
 +
* The implementation shouldn't be too hard, but the student will need to learn about the existing ways of configuring cifs.ko and their use cases, which can take a while.
 +
* Language: C, C++, Python
 +
* Difficulty: Medium
  
Samba4 now has an embedded python interpreter, which is used for
+
=== Add support for DAC (Claims Based ACLs) to the smb3 kernel module and tools ===
both web configuration and for command line tools. The interpreter has
+
* Similar to what was done to extend the Apache access control model to allow richer ACL semantics,
access to the extensive internal C library of Samba. We would like
+
* SMB3's access control model was extended (at least in Windows).  See e.g. [https://docs.microsoft.com/en-us/windows/security/identity-protection/access-control/dynamic-access-control Dynamic Access Control].
smbclient to be rewritten in python, making it much more easily extendable
+
* Add support to the SMB3 kernel module and user space tools (or create new ones) to allow viewing and managing claims based ACLs (DAC) from the client.
by administrators.
+
* Difficulty: Medium
 +
* Language: C (kernel), C/C++/Python (user space tools)
  
Possible mentors: [[JelmerVernooij]]
+
=== Add support for ODX (T10) Copy Offload to the smb3 kernel module ===
 +
* Windows, and various NAS servers support ODX copy offload (e.g. [https://docs.netapp.com/ontap-9/index.jsp?topic=%2Fcom.netapp.doc.cdot-famg-cifs%2FGUID-1323806A-F37B-46AF-B123-E40FCD362B33.html NetApp]),
 +
* to allow much faster server side copy.
 +
* Add support to the kernel client for this and integration with existing or new copy offload tools to make it easy to use. 
 +
* For more information see [https://msdn.microsoft.com/en-us/library/cc246482.aspx MS-SMB2] and [https://www.slideshare.net/gordonross/smb3-offload-data-transfer-odx SMB3 offload data transfer] and [https://docs.microsoft.com/en-us/windows-hardware/drivers/ifs/offloaded-data-transfers Offloaded data transfers] and references at bottom of [[Server-Side_Copy]].
 +
* Language: C
 +
* Difficulty: Medium/High
  
===GQ replacement using GTK and LDB===
+
===VFS change notification support in cifs.ko ===
 +
* The kernel provides a file/dir notification API (inotify, dnotify). The SMB protocol also provides a way to get notified of file changes. This project would be about implementing the inotify API for cifs.ko by making use of the SMB notification mechanism.
 +
* Difficulty: Hard
 +
* Language: C
  
GQ is a widely used LDAP query tool. Many LDAP administrators would
+
===Failover/Continuous Availability and HA improvements (Witness protocol)===
benefit if a similar tool were constructed, in particular with similar
+
* Benefits: Improved reliability, data integrity - may also allow planned migrations (moving data from one server to another).
schema knowledge. Using LDB as a backend could allow easy use of
+
* Challenges: Complexity, requires additional RPC infrastructure in client. There is a Samba user space prototype of the Witness protocol that could be reused (since we only need the client part of the RPC calls).
Samba-supported SASL mechanisms for easier authentication. Likewise,
+
* Language: C
an LDB editor in SWAT would be very useful. Building it with
+
* Difficulty: High
interactive functionality would make it a very powerful way to manage
 
Samba4's LDB databases.
 
  
Possible mentors: [[JelmerVernooij]]
+
<!-- Commenting out stale proposals
  
===Browsing support in Samba 4===
+
===File Copy Offload: T10 operations, and improved tools for using CopyChunk===
Samba 4 still needs support for mailslots in general and in particular for the BROWSE mailslot. Should come with tests.
+
* Benefits: Improved performance.  Copy offload is useful for quickly replicating large files, and for backup and for virtualization. The good news is that one copy offload mechanism (CopyChunk) already works.  Windows 2012 introduced a [https://msdn.microsoft.com/en-us/library/windows/desktop/hh848056(v=vs.85).aspx second mechanism] (also see pages 33 to 42 of [http://www.snia.org/sites/default/files/SNIA_SMB3_final.pdf SNIA SMB3]).  This may be even more useful if TRIM/DISCARD support is also added.  It is also very timely given the recent addition of support for the copy_range API to the Linux kernel VFS.
 +
* Challenges: Ensuring semantics match what is being used in the new copy_range Linux kernel interface, and if not either emulate the alternate semantics, enhance copy_range or provide additional private ioctls to handle the SMB3 copy offload semantics (CopyChunk vs. ODX)
 +
* Language: C
 +
* Difficulty: Low / moderate
 +
* Possible Mentors: Steve French
  
Possible mentors: [[JelmerVernooij]]
+
===Multiadapter support===
 +
* Benefits: Big performance advantage for some common cases (e.g. RSS capable adapters, and also two adapter scenarios) and prepares for RDMA in the future which will help cifs.ko in even more workloads.
 +
* Challenges: Testing may require more physical hardware (two, dual adapter machines to demonstrate performance improvements).
 +
* Language: C
 +
* Difficulty: Moderate
 +
* Possible Mentors: Steve French
  
===Extension of the GTK+ frontends===
+
===Directory oplocks===
There are a couple of GTK+ frontends for Samba4 (see [[SambaGtk]]). These are very limited at the moment but you could work on expanding them and further integrating them with GNOME. In C or Python
+
* Benefits: Will reduce network load a lot in some workloads, and improve performance as well. Works with recent Windows servers (Windows 2012 and later e.g.).
 +
* Challenges: Samba does not support it yet (although this might help drive changes to the Server and Linux VFS eventually, if we have client support).
 +
* Language: C
 +
* Difficulty: Moderate
 +
* Possible Mentors: Steve French
  
Possible mentors: [[JelmerVernooij]]
+
=== Support for SELinux ===
 +
* MAC security label support is important for virtualization and useful for improved security in some workloads.  Support for setting/getting these labels over the wire was investigated in the NFS version 4 workgroup.  Adding support to the CIFS Unix Extensions (Linux kernel client and Samba server) should be possible, especially if this is just a new class of extended attribute.  The goal would be to support this feature of SELinux to allow KVM and other applications to take advantage of security labels.  Some of the background requirements are loosely related to the (nfs equivalent of) what is mentioned in: [http://tools.ietf.org/html/draft-quigley-nfsv4-sec-label-01 NFSv4]
 +
* Language: C
 +
* Difficulty: Hard
 +
* Possible Mentors: Steve French
  
==Miscellaneous==
+
===Create GUI or command-line tools for displaying /proc/fs/cifs statistics and mount/session status===
 +
* Might also involve some cleanup of the in-kernel stats / status output.
 +
* A mostly complete [http://oss.sgi.com/archives/pcp/2013-08/msg00090.html cifs.ko Performance Co-Pilot (PCP) monitoring agent] was implemented in 2013.
 +
* Language: some C (for kernel code), something else for GUI?
 +
* Difficulty: Easy
 +
* Possible Mentors: Steve French
  
===Linux Kernel CIFS client improvements===
+
===Create a common uid mapping mechanism for Linux nfs and cifs vfs clients===
Interested students should contact Steve French and discuss possible improvements to the Linux Kernel CIFS VFS client. Here are some ideas to get you started:
+
* or maybe just figure out a way to hook cifs up to rpc.idmapd
 +
* add a way for the client to remap the uids returned by the server to uids which would be valid on the client (or to a default if such uid does not exist).
 +
* This is helpful especially when the server supports the CIFS Unix Extensions and has different uids and gids mapping than the client
 +
* Difficulty: Hard
 +
* Possible Mentors: Steve French
  
* improved async/vectored i/o support (improves performance)
+
===Support for retrieving snapshots, encrypted files, or compressed files from Windows===
* CIFS->Samba DFS extensions
+
* Difficulty: Medium
* prototype SMB2 client
+
* Possible Mentors: Steve French
* more generic uid mapping facility (when server supports Unix extensions but different uid space)
 
* finish up of POSIX->NT ACL mapping
 
* integration of cifs client with Dave Howell's fscache (for offline caching)
 
* cifs->Samba automated test facility (build verification)
 
  
Possible mentors: Steve French
+
===cifs->Samba automated test facility===
 +
* Do build verification similar to what we can now do with the Samba server and tools in the Samba build farm.  Mounts from the Linux SMB3, SMB2 and CIFS kernel clients could be tested with posix file i/o tests which might include modified versions of the "connectathon" and xfstest test suites and others.  The goal is to quickly identify problems with newly integrated patches by running automatically against a variety of cifs/smb2/smb3 mounts (and mount options) to ensure that regressions aren't introduced.
 +
* xfstests support for CIFS was added as part of [[SoC/2014]].
 +
* Difficulty: Medium
 +
* Possible Mentors: Steve French
  
===Static and dynamic code analysis===
+
===Other Random Ideas===
We regularly use tools such as the IBM Checker, and Valgrind to work over our codebase.
+
* Ideas aren't limited to these, feel free to propose something else:
 +
** Improve integration between cifs.ko and userspace Samba tools and libraries.  Allow userspace Samba libraries to use an existing CIFS mount if it exists by passing requests (via an ioctl or other user->kernel IPC) to cifs.ko.  This could improve performance but also more naturally allow use of the same credentials for a user across file and management operations (e.g. listing shares via smbclient and mounting that share).
 +
** Create a GUI for creating and managing Linux cifs mounts, and more easily configuring the many complex cifs mount options, statistics (/proc/fs/cifs)
 +
** Support for alternate transport protocols (other than TCP sockets).  Adding support for SCTP to cifs/smb2 kernel clients and Samba server or perhaps more interesting add support for Linux's "virtio" transport to the cifs/smb2 kernel clients and Samba server (to allow optimized mounts and zero-copy transfer of data from virtualized guests to hosts on the same box)
 +
** Support for features (such as directory delegations) which NFS version 4.1 has but which current CIFS even with the most current CIFS->Samba protocol extensions (CIFS Unix Extensions) do not have -- will probably need server support too.
 +
** Add additional library support or modify Samba client libraries so they can use existing kernel cifs functions (such as sending SMBs on negotiated sessions when the kernel client already has a session to the server).  With the addition of library to access cifs's pipe (in kernel), Samba client libraries or other dce/rpc code could use cifs kernel sessions for management of and over cifs mounts.
 +
** Add libraries and utilities to manage acls (cifs kernel client has an extended attribute for setting/getting "raw" cifs acls but userspace posix acl tools obviously can't be used to manage cifs specific acl features).
 +
*Difficulty: Low
 +
*Language(s): C
 +
*Possible mentors: Steve French
  
These produce many warnings, and in particular the IBM Checker has found many 'unfixed' issues in the code base.  Many are false positives, but many are also very serious issues.  Students will need to work with the team and the upstream developers to resolve as many of these as possible. To see the current IBM Checker output look at the build farm output for the host 'snab' at http://build.samba.org/
+
==Build Farm==
  
Possible mentors:
+
The [http://build.samba.org/ Build Farm] is a set of machines with different configurations that regularly rebuild the latest snapshots of Samba and other projects on different platforms, to catch portability issues. It has a web interface and sends out emails.
  
===Windows GUI Testing===
+
===Improve Build Farm look and feel===
With GUI automation tools, test the behaviour of windows applications against [[Samba3]] and [[Samba4]]. This needs to be integrated into the existing Windows testing code.
+
Samba's [http://build.samba.org build farm] still hasn't adopted the new Samba graphical style, and its look and feel is not very good.
 +
With this submission we propose to address this with the following objectives:
  
Last year we had a very successful summer of code project which added automated windows testing to Samba. This tested only command line tools. We would like to expand this to include testing of GUI applications.
+
*Main ideas:
 +
** Adopt the new samba style
 +
** Improve reporting (i.e. show which builds fail and which do not, daily emails, ...)
 +
** Make test errors quickly accessible; in this [http://build.samba.org/build.cgi/build/d72e624c4a62a62e8d34b0c54efc2a97c0493aa9 example], the user has to scroll for a long time before reaching the errors
 +
** Add the capacity to manage flaky tests and reduce email alerts (i.e. require 2 consecutive builds with the same flaky test to trigger a real error)
 +
** Improve page loading speed (ajax ?)
 +
*Difficulty: Easy to Medium
 +
*Language(s): HTML, CSS, Python
 +
*Possible mentors: Matthieu Patou
 +
-->
  
Possible mentors:
+
== Wireshark ==
 +
Wireshark has two SMB dissectors: "smb" for SMB1, "smb2" for SMB2 and above. It also has a DCE/RPC ([https://en.wikipedia.org/wiki/Microsoft_RPC MSRPC]) dissector that is generated from Samba IDL files.

Latest revision as of 23:39, 4 June 2020

Applying to Samba

See our main Google Summer of Code @ Samba page for important details, plus these extended notes on applying to Samba.

Google Summer of Code: Suggested Project ideas

The following are Samba project ideas for Summer of Code. Of course you are free to come up with Samba related ideas not listed here. Please discuss your planned project by either sending an email to samba-technical@lists.samba.org or joining us on irc://irc.freenode.net/#samba-technical.

Samba

Some additional possible GSoC topics can be found in Bugzilla in the form of bugs which are marked as "Feature request": here. Questions regarding complexity and requirements should be directed to the technical mailing list.

Print System Asynchronous Remote Protocol Wireshark Dissectors

The Print System Asynchronous Remote Protocol (MS-PAR) is a replacement for the synchronous Print System Remote Protocol (MS-RPRN). MS-PAR inherits many message and buffer formats from the old protocol, but allows for asynchronous submission and notification of print jobs. Further details of the protocol can be found in Günther and Andreas' SambaXP presentation.

The student should write Wireshark dissectors for MS-PAR. The student should improve existing smbtorture tests, which demonstrate how the protocol works against a Windows server.

  • Difficulty: Medium
  • Language(s): C
  • Possible Mentors: Andreas Schneider (supported by Günther Deschner)

Ceph RADOS key-value store as an alternative to TDB

Ceph offers a highly scalable and fault-tolerant storage system. Samba is already capable of sharing data located on the Ceph Filesystem; however, scale-out sharing (the same data exposed by multiple Samba nodes) currently requires the use of CTDB for consistent and coherent state across Samba cluster nodes. In such a setup CTDB provides a clustered database with persistent key-value data storage and locking. Database usage is abstracted out via a generic dbwrap interface.

Ceph's librados library provides an API for the storage and retrieval of arbitrary key-value data via the omap functions. A watch/notify protocol is also provided as a mechanism for synchronising client state (locking). Key-value data stored in the RADOS back-end inherits the same redundancy features as regular objects, making it a potentially good candidate as a replacement for CTDB in scale-out Samba clusters.

This task involves the implementation and testing of a new dbwrap back-end that uses librados for the storage, retrieval and locking of Samba key-value state. Ideally, the candidate would also allow time for benchmarking.

  • Difficulty: Medium
  • Language(s): C
  • Possible Mentors: David Disseldorp

Samba binary size reduction

Samba has grown into quite a bloated beast. This task will focus on some areas where the bloat can easily be reduced, e.g.

  • Removal of unused autogenerated librpc code
    • RPC client and server code when only serialization functions are used
    • optional struct print routines (pidl noprint?)
    • Some knowledge of perl would help here (for pidl)
  • Add new build options to compile Samba without certain functionality
    • Undesired DCE/RPC services
    • Printing support
    • Legacy SMB/CIFS support (stretch goal)

The NDR parsing Gordian knot

What we need is for someone to take the ideas in the page, and the concepts from the patch and make them a practical solution for Samba.

A particular spot where we over-link is via the ndr-table subsystem, which links to most of our (large) generated parsers for every RPC protocol we know.

Some bits of the code that link to ndr-table only want some metadata, but they end up linking to each NDR_* subsystem because the metadata is at the bottom of each parser (the ndr_*.c files).

The primary task may well end up being in Perl, or Python, or C; the challenge here will not be deep programming but lateral thinking about how to break up the dependency chains.

Making this more challenging or interesting (depending on your perspective), the RPC server is being rewritten, but for now the nexus in source3/rpc_server/rpc_ncacn_np.c and the calls to ndr_table_by_uuid() remain.

What applicants may wish to do is see if that can be re-written in such a way that does not require linking in the whole NDR parser, just to obtain the information actually used.

The broader issue

The challenge is that Samba over-links because some of the dependencies are not fine grained enough.

The fundamental challenge is likely to be a lot of cases where:

  • A depends on B
  • C depends on D

B and C are in the same file, even though B does not actually depend on C, so anything that links A ends up pulling in D as well.
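
The A/B/C/D case above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the symbol names and the file names (a.c, bc.c, d.c) are hypothetical, and real analysis would work from the build system and the object files rather than a hand-written table. The sketch compares what A needs at symbol granularity with what gets pulled in when dependencies are tracked per file.

```python
from collections import defaultdict

# Hypothetical symbols and files, named after the A/B/C/D example above.
SYMBOL_DEPS = {"A": {"B"}, "B": set(), "C": {"D"}, "D": set()}
SYMBOL_FILE = {"A": "a.c", "B": "bc.c", "C": "bc.c", "D": "d.c"}


def closure(start, edges):
    """Return every node reachable from start, including start itself."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        stack.extend(edges.get(node, ()))
    return seen


def file_level_edges(symbol_deps, symbol_file):
    """Collapse symbol dependencies into per-file dependencies."""
    edges = defaultdict(set)
    for sym, deps in symbol_deps.items():
        for dep in deps:
            if symbol_file[sym] != symbol_file[dep]:
                edges[symbol_file[sym]].add(symbol_file[dep])
    return edges


# Files A really needs, following symbol-level dependencies only.
needed = {SYMBOL_FILE[s] for s in closure("A", SYMBOL_DEPS)}
# Files pulled in when dependencies are tracked per file.
linked = closure("a.c", file_level_edges(SYMBOL_DEPS, SYMBOL_FILE))

print("needed:     ", sorted(needed))
print("linked:     ", sorted(linked))
print("over-linked:", sorted(linked - needed))
```

The difference between the two sets is the over-linked part; splitting the file that holds B and C would remove it without any #ifdef in the code.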

There are likely many other parts of Samba which link to large things that are not needed. For example, perhaps we should allow a build without printing support, or parsing of the printing protocols?

The challenge is doing this in a way that is clean, and mostly done in the build system, not by #ifdef in the code.

Skills in determining binary dependencies, as well as the build system dependencies (to compare with) would be very helpful in this project.


Samba selftest efficiency improvement

Samba's selftest and GitLab CI pipeline have grown into quite a bloated beast. Even where we save wall-clock time by using parallel virtual machines, the CPU time spent costs the Samba Team money and, more broadly, uses electricity.

This task will focus on some areas where the expense, bloat and duplication can easily be reduced, e.g.

  • De-duplication of duplicate test runs
  • Caching of compilation output between stages
  • Use of pipeline stages to avoid starting 20 potentially expensive jobs if a smoke-test build does not pass

The ideal candidate will do some of their own investigation to show they understand how savings could be made, rather than just sticking to this list.

  • Difficulty: medium
  • Language(s): Some knowledge of C and Python would be helpful; an understanding of GitLab pipelines will help a lot
  • Possible Mentors: Andrew Bartlett

Samba AD DC as the ideal POSIX Directory

Samba is a great Active Directory Domain Controller, but it is not an ideal directory server for a large, passionate and important user base: sites with Samba SMB servers, but also general purpose Linux servers. A smaller subset of these sites also have Linux desktops. These sites may also have Windows servers, but they, like the Windows desktops, are not the focus.

These sites often used Samba + OpenLDAP, and are finding the move to Samba's AD DC a bit difficult, because schema extension is hard, some things are not done automatically (like uidNumber allocation), and in general the focus has been on matching Windows rather than listening to the needs of this part of our user base.

Specific research should be done into what FreeIPA does well in targeting this user segment, and what customisations advanced users of OpenLDAP apply.

This project would be to propose a number of specific improvements, and to add both tests and an implementation of these improvements to Samba.

  • Difficulty: Hard
  • Language(s): C, Python
  • Possible Mentors: Andrew Bartlett

GitLab CI of Samba for non-linux platforms (FreeBSD in particular)

Samba uses GitLab CI to improve the quality of our patches. Efforts are currently underway to extend the docker container used from just Ubuntu 14.04 to later versions and other distributions.

However, we occasionally have issues ensuring Samba still builds and operates on FreeBSD and other non-linux platforms.

The idea would be to have a docker image and .gitlab-ci.yml code to support it that runs FreeBSD and then builds and runs Samba's testsuite inside that FreeBSD nested VM, while still outputting the results to the normal gitlab-ci.

This differs from just running GitLab CI runners on FreeBSD, as we need to auto-scale, destroy the host and guest at the end of the test, and run on Linux docker (such as the free GitLab.com CI runners).

A stretch goal would be the ability to run some tests against a specific Linux kernel and a raw ext4 filesystem (rather than unionfs) via qemu, rather than depending on the docker host configuration.

  • Difficulty: Medium
  • Language(s): Python, shell, YAML
  • Possible Mentors: Andrew Bartlett

Integrate Samba AD DC deployment and management with Cockpit

A prototype at https://github.com/abbra/cockpit-app-samba-ad shows how we can integrate Samba AD deployment with the Cockpit Linux management console. The goal of this task is to move forward with this prototype to produce a fully working Samba AD management tool for common operations supported by the 'samba-tool' command line utility.

The prototype is only a demo of what could be done. For comprehensive work, the following materials need to be consulted:

The project would need to investigate:

Since this is a huge area, a successful project proposal would present the set of use cases it intends to focus on, the scope of the work, and how the applicant expects to work on the deliverables, both technology-wise and process-wise.

It would be nice to see how this project would evolve after the GSoC session ends, since it is clearly longer-term work that would need to be handed over and shared with more contributors.

  • Difficulty: Medium
  • Language(s): Python, JavaScript (React), CSS, HTML
  • Possible Mentors: Alexander Bokovoy


Linux Kernel SMB Client Improvements

The Linux Kernel has a module called cifs.ko which is independent from Samba (it doesn't share code) that allows users to mount remote shares. It supports multiple dialects of SMB (1, 2, 3). The protocol dialects are now officially documented by Microsoft (see MS-SMB, MS-SMB2) so students shouldn't have to worry about reverse engineering to understand them. The Wireshark open source network sniffer and dissector is a very good learning tool as well.


Interested students should contact Steve French or the linux-cifs mailing list to discuss possible improvements to the Linux Kernel CIFS VFS client.

Add machine-readable debug & stats /proc file

  • We currently output debug and statistics information under /proc/fs/cifs/ (DebugData, Stats, ...). We need to stop outputting free-format text that breaks all parsers out there every time we add things to it. Clean up the cifsdebug.c file (it is kind of messy). Possibly generate a hierarchy of /proc files (e.g. a dir per tcp connection, subdirs for session, files for tcons) instead of dumping everything in one file.
  • Make a nice visualizer/dashboard thing to get an overview. This could be console or GUI or...
  • Language: C for the kernel stuff, Userspace can be C, C++, Python.
  • Difficulty: Low
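
As a rough illustration of why a fixed, machine-readable layout helps, the Python sketch below parses a hypothetical one-pair-per-line "key=value" format. The format and the field names are assumptions made up for this example; they are not the current /proc/fs/cifs/Stats output, and the real design is part of the project. The point is that a parser written for the first sample keeps working when a later kernel adds a field.

```python
# Hypothetical machine-readable layout: one "key=value" pair per line.
# This is NOT the current /proc/fs/cifs/Stats format, only an illustration.
SAMPLE_V1 = "sessions=1\nshares=2\nreads=120\nwrites=45\n"
SAMPLE_V2 = SAMPLE_V1 + "reconnects=3\n"  # a later kernel adds a field


def parse_stats(text):
    """Parse key=value lines into a dict of ints, skipping malformed lines."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        value = value.strip()
        if sep and value.isdigit():
            stats[key.strip()] = int(value)
    return stats


old = parse_stats(SAMPLE_V1)
new = parse_stats(SAMPLE_V2)

# Existing consumers still find the fields they know about.
print(old["reads"], new["reads"], new.get("reconnects"))
```

A per-connection directory hierarchy, as suggested above, would give the same property at the file level: new files can appear without changing the content of existing ones.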

Add additional ftrace (trace-cmd) tracepoints and better GUI

  • Add more dynamic tracepoints to cifs.ko for commonly needed user scenarios, and add a GUI (and/or CLI) tool to make it easier to enable/disable cifs.ko dynamic tracepoints (see /sys/kernel/debug/tracing/events/cifs/ for the pseudo-files that are currently configured manually or via trace-cmd). Make a nice native/console/web UI for it.
  • Language: C (any kernel changes) and userspace C, C++ or Python
  • Difficulty: Low
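
A CLI helper for this could start from something like the following Python sketch. It assumes the usual tracefs layout, in which each event has its own subdirectory containing an "enable" pseudo-file that accepts "1" or "0"; the function names are hypothetical, and the directory is a parameter because tracefs may be mounted somewhere other than the path mentioned above.

```python
from pathlib import Path

# Directory named in the project description; tracefs may be mounted elsewhere.
CIFS_EVENTS = Path("/sys/kernel/debug/tracing/events/cifs")


def list_events(events_dir=CIFS_EVENTS):
    """Return the names of events that expose an 'enable' pseudo-file."""
    return sorted(p.parent.name for p in Path(events_dir).glob("*/enable"))


def set_event(name, enabled, events_dir=CIFS_EVENTS):
    """Write '1' or '0' to <events_dir>/<name>/enable."""
    (Path(events_dir) / name / "enable").write_text("1" if enabled else "0")


def is_enabled(name, events_dir=CIFS_EVENTS):
    """Report whether the event's 'enable' pseudo-file currently reads '1'."""
    return (Path(events_dir) / name / "enable").read_text().strip() == "1"
```

On a real system these calls need root, and the event names come from list_events() rather than being hard-coded. A GUI or web front end could sit on top of the same three functions.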

Add performance analysis cli tools

  • Add more perf tools for the SMB3 client (similar to iostat or nfsstat) that leverage (and possibly extend) what is captured in /proc/fs/cifs/Stats but make it easier to analyze the performance of a cifs mount.
  • Language: C, C++ or Python
  • Difficulty: Low
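
The core of an iostat-style tool is turning two snapshots of cumulative counters into per-second rates. The Python sketch below shows only that step, under stated assumptions: the counter names and values are hypothetical, and how the snapshots are read from /proc/fs/cifs/Stats is left out because it depends on that file's format.

```python
def rates(before, after, interval_s):
    """Per-second rate of each counter present in both snapshots."""
    if interval_s <= 0:
        raise ValueError("interval must be positive")
    common = before.keys() & after.keys()
    return {k: (after[k] - before[k]) / interval_s for k in common}


# Hypothetical counter names and values, sampled 5 seconds apart.
t0 = {"reads": 1000, "writes": 400}
t1 = {"reads": 1600, "writes": 450}

for name, rate in sorted(rates(t0, t1, 5.0).items()):
    print(f"{name:<8}{rate:>8.1f}/s")
```

A real tool would repeat this in a loop, as iostat and nfsstat do, and would also need to handle counters that reset after a remount or reconnect.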

Write the One-True-Tool to unify probe/setup/configuration cifs.ko properly

  • There are too many knobs in different places at the moment: request-keys, idmap, cifscreds, /proc stuff. The goal of this project would be to write one CLI tool that would wrap everything under a common interface. It could handle getting/setting ACLs as well.
  • This would be a userspace project.
  • The implementation shouldn't be too hard, but the student will need to learn about the existing configuration methods and use cases, which can take a while.
  • Language: C, C++, Python
  • Difficulty: Medium

Add support for DAC (Claims Based ACLs) to the smb3 kernel module and tools

  • Similar to what was done to extend the Apache access control model to allow richer ACL semantics, SMB3's access control model was extended (at least in Windows). See e.g. Dynamic Access Control.
  • Add support to the SMB3 kernel module and user space tools (or create new ones) to allow viewing and managing claims based ACLs (DAC) from the client.
  • Difficulty: Medium
  • Language: C (kernel), C/C++/Python (user space tools)

Add support for ODX (T10) Copy Offload to the smb3 kernel module

  • Windows and various NAS servers (e.g. NetApp) support ODX copy offload, which allows much faster server-side copy.
  • Add support to the kernel client for this and integration with existing or new copy offload tools to make it easy to use.
  • For more information see MS-SMB2 and SMB3 offload data transfer and Offloaded data transfers and references at bottom of Server-Side_Copy.
  • Language: C
  • Difficulty: Medium/High

VFS change notification support in cifs.ko

  • The kernel provides a file/dir notification API (inotify, dnotify). The SMB protocol also provides a way to get notified of file changes. This project would be about implementing the inotify API for cifs.ko by making use of the SMB notification mechanism.
  • Difficulty: Hard
  • Language: C

Failover/Continuous Availability and HA improvements (Witness protocol)

  • Benefits: Improved reliability, data integrity - may also allow planned migrations (moving data from one server to another).
  • Challenges: Complexity, requires additional RPC infrastructure in client. There is a Samba user space prototype of the Witness protocol that could be reused (since we only need the client part of the RPC calls).
  • Language: C
  • Difficulty: High


Wireshark

Wireshark has two SMB dissectors: "smb" for SMB1, "smb2" for SMB2 and above. It also has a DCE/RPC (MSRPC) dissector that is generated from Samba IDL files.