SoC/Ideas: Difference between revisions

From SambaWiki
m (reduce focus on dbwrap: another option is to implement a ctdb protocol server which uses Ceph omap at the backend)
(23 intermediate revisions by 6 users not shown)
Line 1: Line 1:
= Google Summer of Code: Suggested Project ideas =
= Google Summer of Code: Suggested Project ideas =


The following are the Samba project ideas for Summer of Code.
The following are Samba project ideas for Summer of Code.
Of course you are free to come up with ideas not listed here.
Of course you are free to come up with Samba related ideas not listed here.
Please discuss the your planned project by either joining us on irc://irc.freenode.net/#samba-technical or
Please discuss your planned project by either sending an email to [https://lists.samba.org/mailman/listinfo/samba-technical samba-technical@lists.samba.org] or joining us on irc://irc.freenode.net/#samba-technical.
by sending email to samba-technical@lists.samba.org


==Samba==
==Samba==
Line 80: Line 79:
*Difficulty: Medium
*Difficulty: Medium
*Language(s): C
*Language(s): C
*Possible Mentors: Andreas Schneider
*Possible Mentors: Andreas Schneider (supported by Günther Deschner)


===Ceph RADOS key-value store as an alternative to TDB===

===dbwrap back-end for Ceph RADOS key-value storage===


Ceph offers a highly scalable and fault-tolerant storage system. Samba is already capable of sharing data located on the [https://ceph.com/ceph-storage/file-system/ Ceph Filesystem], however scale-out sharing (the same data exposed by multiple Samba nodes) currently requires the use of [https://ctdb.samba.org/ CTDB] for consistent and coherent state across Samba cluster nodes. In such a setup CTDB provides a clustered database with persistent key-value data storage and locking. Database usage is abstracted out via a generic ''dbwrap'' interface.<br>
Ceph offers a highly scalable and fault-tolerant storage system. Samba is already capable of sharing data located on the [https://ceph.com/ceph-storage/file-system/ Ceph Filesystem], however scale-out sharing (the same data exposed by multiple Samba nodes) currently requires the use of [https://ctdb.samba.org/ CTDB] for consistent and coherent state across Samba cluster nodes. In such a setup CTDB provides a clustered database with persistent key-value data storage and locking. Database usage is abstracted out via a generic ''dbwrap'' interface.<br>
Line 94: Line 92:
*Language(s): C
*Language(s): C
*Possible Mentors: David Disseldorp
*Possible Mentors: David Disseldorp

<!-- Commented out possibly stale proposals
=== Samba binary size reduction ===

Samba has grown to quite a bloated beast. This task will focus on some areas where the bloat can easily be reduced, e.g.
* Removal of unused autogenerated librpc code
** RPC client and server code when only serialization functions are used
** optional struct print routines (pidl noprint?)
** Some knowledge of perl would help here (for pidl)
* Add new build options to compile Samba without certain functionality
** Undesired DCE/RPC services
** Printing support
** Legacy SMB/CIFS support (stretch goal)

*Difficulty: easy
*Language(s): Some knowledge of C and python would be helpful
*Possible Mentors: David Disseldorp (supported by Andrew Bartlett)

=== Fuzz Samba using [http://lcamtuf.coredump.cx/afl/ American Fuzzy Lop] ===

Try to find bugs in the server or client components of Samba using [http://lcamtuf.coredump.cx/afl/ AFL]. This is not the usual AFL scenario where you can fuzz a file format parser. It needs to be a bit smart about it, as the way the SMB protocol works, you need to do a series of steps before reaching arbitrary SMB commands (protocol negotiation, session setup, tree connect, ...). Similarly some checks like packet signing should be worked around to reach the best results from AFL.

Samba code will need to be modified in hackish ways to make this work. I don't expect to be able to merge it back but if some of the modification required could be cleaned up and integrated in would be a bonus.

* Difficulty: medium
* Language(s): Some knowledge of C
* Possible Mentors: Aurélien Aptel


===Samba AD DC as the ideal POSIX Directory===
===Samba AD DC as the ideal POSIX Directory===
Line 109: Line 132:
*Languages(s): C, Python
*Languages(s): C, Python
*Possible Mentors: Andrew Bartlett
*Possible Mentors: Andrew Bartlett

===GitLab CI of Samba for non-linux platforms (FreeBSD in particular)===

Samba uses GitLab CI to improve the quality of our patches. Efforts are currently underway to extend the docker container used from just Ubuntu 14.04 to later versions and other distributions.

However, we occasionally have issues ensuring Samba still builds and operates on FreeBSD and other non-linux platforms.

The idea would be to have a docker image and .gitlab-ci.yml code to support it that runs FreeBSD and then builds and runs Samba's testsuite inside that FreeBSD nested VM, while still outputting the results to the normal gitlab-ci.

This differs from just running GitLab CI runners on FreeBSD, as we need to auto-scale, destroy the host and guest at the end of the test, and run on Linux docker (such as the free GitLab.com CI runners).

As a stretch goal, it would be useful to be able to run some tests against a specific linux kernel and a raw ext4 filesystem (rather than unionfs) via qemu, rather than depending on the docker host configuration.

*Difficulty: Medium
*Language(s): Python, shell, YAML
*Possible Mentors: Andrew Bartlett

===Integrate Samba AD DC deployment and management with Cockpit===

A prototype at https://github.com/abbra/cockpit-app-samba-ad shows how we can integrate Samba AD deployment with Cockpit Linux management console. A goal of this task is to move forward with this prototype to produce a fully working Samba AD management tool for common operations supported by 'samba-tool' command line utility.

*Difficulty: Medium
*Language(s): Python, JavaScript (React), CSS, HTML
*Possible Mentors: Alexander Bokovoy


<!-- Commented out possibly stale proposals


===Make libsmbclient thread-safe for Gnome VFS===
===Make libsmbclient thread-safe for Gnome VFS===
Line 118: Line 168:
*Possible mentors: Jeremy Allison
*Possible mentors: Jeremy Allison
-->
-->



==Linux Kernel SMB Client Improvements==
==Linux Kernel SMB Client Improvements==

The Linux Kernel has a module called cifs.ko which is independent from Samba (it doesn't share code) that allows users to mount remote shares. It supports multiple dialects of SMB (1, 2, 3). The protocol dialects are now officially documented by Microsoft (See [https://msdn.microsoft.com/en-us/library/cc246231.aspx MS-SMB], [https://msdn.microsoft.com/en-us/library/cc246482.aspx MS-SMB2]) so students shouldn't have to worry about reverse engineering to understand them. The [https://www.wireshark.org/ Wireshark] open source network sniffer&dissector is a very good learning tool as well.


Interested students should contact Steve French or the [mailto:linux-cifs@vger.kernel.org linux-cifs mailing list] to discuss possible improvements to the Linux Kernel CIFS VFS client.
Interested students should contact Steve French or the [mailto:linux-cifs@vger.kernel.org linux-cifs mailing list] to discuss possible improvements to the Linux Kernel CIFS VFS client.


=== Add machine-readable debug & stats /proc file ===
=== Add machine-readable debug & stats /proc file ===
* Stop outputing free format text that breaks all parsers out there everytime we add things to it. Clean up the cifsdebug.c file (its kind of messy). Possibly generate a hierarchy of files (e.g. a dir per tcp connection, subdirs for session, files for tcons). Make a nice native/console/web UI for it.
* We currently output debug and statistic information under /proc/fs/cifs/ (DebugData, Stats, ...). We need to stop outputting free-format text that breaks all parsers out there every time we add things to it. Clean up the cifsdebug.c file (it is kind of messy). Possibly generate a hierarchy of /proc files (e.g. a dir per tcp connection, subdirs for session, files for tcons) instead of dumping everything in one file.
* Make a nice visualizer/dashboard thing to get an overview. This could be console or GUI or...
* Language: C
* Language: C for the kernel stuff, Userspace can be C, C++, Python.
* Difficulty: Low
* Difficulty: Low

=== Add additional ftrace (trace-cmd) tracepoints and better GUI ===
=== Add additional ftrace (trace-cmd) tracepoints and better GUI ===
* Add more dynamic tracepoints to cifs.ko for commonly needed user scenarios, and add GUI (and/or CLI) tool to make it easier to enable/disable cifs.ko dynamic trace points (See /sys/kernel/debug/tracing/events/cifs/ for the pseudo-files that are currently configured manually for tracing or via trace-cmd) Make a nice native/console/web UI for it.
* Add more dynamic tracepoints to cifs.ko for commonly needed user scenarios, and add GUI (and/or CLI) tool to make it easier to enable/disable cifs.ko dynamic trace points (See /sys/kernel/debug/tracing/events/cifs/ for the pseudo-files that are currently configured manually for tracing or via trace-cmd) Make a nice native/console/web UI for it.
* Language: C (any kernel changes) and userspace C, C++ or Python
* Language: C (any kernel changes) and userspace C, C++ or Python
* Difficulty: Low
* Difficulty: Low

=== Add performance analysis cli tools ===
=== Add performance analysis cli tools ===
* Add more perf tools for SMB3 client (similar to iostat or nfsstat) that leverage (and possibly extend what is captured) in /proc/fs/cifs/Stats but make it easier to analyze performance of a cifs mount
* Add more perf tools for SMB3 client (similar to iostat or nfsstat) that leverage (and possibly extend what is captured) in /proc/fs/cifs/Stats but make it easier to analyze performance of a cifs mount
Line 137: Line 193:


=== Write the One-True-Tool to unify probe/setup/configuration cifs.ko properly ===
=== Write the One-True-Tool to unify probe/setup/configuration cifs.ko properly ===
* Too many knobs in different places at the moment: request-keys, idmap, cifscreds, /proc stuff
* There are too many knobs in different places at the moment: request-keys, idmap, cifscreds, /proc stuff. The goal of this project would be to write one CLI tool that would wrap everything under a common interface. It could handle getting/setting ACLs as well.
* This would be a userspace project.
* Would handle ACL stuff as well (nice gui to get/set)
* The implementation shouldn't be too hard, but the student will need to learn about the existing configuration methods and use cases, which can take a long time.
* Language: C, C++, Python
* Difficulty: Medium


=== Add support for DAC (Claims Based ACLs) to the smb3 kernel module and tools ===
* Similar to what was done to extend the Apache access control model to allow richer ACL semantics,
* SMB3's access control model was extended (at least in Windows). See e.g. [https://docs.microsoft.com/en-us/windows/security/identity-protection/access-control/dynamic-access-control Dynamic Access Control].
* Add support to the SMB3 kernel module and user space tools (or create new ones) to allow viewing and managing claims based ACLs (DAC) from the client.
* Difficulty: Medium
* Language: C (kernel), C/C++/Python (user space tools)


=== Improve smbcmp, the capture diff tool ===
=== Add support for ODX (T10) Copy Offload to the smb3 kernel module ===
* Windows, and various NAS servers support ODX copy offload (e.g. [https://docs.netapp.com/ontap-9/index.jsp?topic=%2Fcom.netapp.doc.cdot-famg-cifs%2FGUID-1323806A-F37B-46AF-B123-E40FCD362B33.html NetApp]),
* Use or combine current tshark output with the XML output to do better diffs
* to allow much faster server side copy.
* Better UI?
* Add support to the kernel client for this and integration with existing or new copy offload tools to make it easy to use.
* Language: Python (rewrite in something else is OK)
* For more information see [https://msdn.microsoft.com/en-us/library/cc246482.aspx MS-SMB2] and [https://www.slideshare.net/gordonross/smb3-offload-data-transfer-odx SMB3 offload data transfer] and [https://docs.microsoft.com/en-us/windows-hardware/drivers/ifs/offloaded-data-transfers Offloaded data transfers] and references at bottom of [[Server-Side_Copy]].
* Language: C
* Difficulty: Medium/High


===VFS change notification support in cifs.ko ===
* The kernel provides a file/dir notification API (inotify, dnotify). The SMB protocol also provides a way to get notified of file changes. This project would be about implementing the inotify API for cifs.ko by making use of the SMB notification mechanism.
* Difficulty: Hard
* Language: C

===Failover/Continuous Availability and HA improvements (Witness protocol)===
* Benefits: Improved reliability, data integrity - may also allow planned migrations (moving data from one server to another).
* Challenges: Complexity, requires additional RPC infrastructure in client. There is a Samba user space prototype of the Witness protocol that could be reused (since we only need the client part of the RPC calls).
* Language: C
* Difficulty: High


<!-- Commenting out stale proposals
<!-- Commenting out stale proposals


===File Copy Offload: T10 operations, and improved tools for using CopyChunk===
===File Copy Offload: T10 operations, and improved tools for using CopyChunk===
* Benefits: Improved performance. Copy offload is useful for quickly replicating large files, and for backup and for virtualization. Good news is that one copy offload mechanism (CopyChunk) already works. Windows 2012 introduced a second mechanism (https://msdn.microsoft.com/en-us/library/windows/desktop/hh848056(v=vs.85).aspx and also see pages 33 to 42 of http://www.snia.org/sites/default/files/SNIA_SMB3_final.pdf). May be even more useful if TRIM/DISCARD support also added. This is also very timely given the recent support in the linux kernel vfs being added for the copy_range API.
* Benefits: Improved performance. Copy offload is useful for quickly replicating large files, and for backup and for virtualization. Good news is that one copy offload mechanism (CopyChunk) already works. Windows 2012 introduced a [https://msdn.microsoft.com/en-us/library/windows/desktop/hh848056(v=vs.85).aspx second mechanism] and also see pages 33 to 42 of [http://www.snia.org/sites/default/files/SNIA_SMB3_final.pdf SNIA SMB3]). May be even more useful if TRIM/DISCARD support also added. This is also very timely given the recent support in the linux kernel vfs being added for the copy_range API.
* Challenges: Ensuring semantics match what is being used in the new copy_range Linux kernel interface, and if not either emulate the alternate semantics, enhance copy_range or provide additional private ioctls to handle the SMB3 copy offload semantics (CopyChunk vs. ODX)
* Challenges: Ensuring semantics match what is being used in the new copy_range Linux kernel interface, and if not either emulate the alternate semantics, enhance copy_range or provide additional private ioctls to handle the SMB3 copy offload semantics (CopyChunk vs. ODX)
* Language: C
* Language: C
Line 168: Line 246:
* Language: C
* Language: C
* Difficulty: Moderate
* Difficulty: Moderate
* Possible Mentors: Steve French

===Failover/Continuous Availability and HA improvements (Witness protocol)===
* Benefits: Improved reliability, data integrity - may also allow planned migrations (moving data from one server to another). This is very timely given the recent addition of resilient and persistent handle support to the Linux smb3 kernel client.
* Challenges: Complexity, requires additional RPC infrastructure in client.
* Language: C
* Difficulty: High
* Possible Mentors: Steve French
* Possible Mentors: Steve French


=== Support for SELinux ===
=== Support for SELinux ===
* Mac Security Label support is important for virtualization and useful for improved security some workloads. Support for setting/getting these labels over the wire was investigated in the NFS version 4 workgroup. Adding support to the CIFS Unix Extensions (Linux kernel client and Samba server) should be possible, especially if this is just a new class of extended attribute. The goal would be to support this feature of SELinux to allow KVM and other applications to take advantage of security labels. Some of the background requirements are loosely related to the (nfs equivalent of) what is mentioned in: http://tools.ietf.org/html/draft-quigley-nfsv4-sec-label-01
* Mac Security Label support is important for virtualization and useful for improved security some workloads. Support for setting/getting these labels over the wire was investigated in the NFS version 4 workgroup. Adding support to the CIFS Unix Extensions (Linux kernel client and Samba server) should be possible, especially if this is just a new class of extended attribute. The goal would be to support this feature of SELinux to allow KVM and other applications to take advantage of security labels. Some of the background requirements are loosely related to the (nfs equivalent of) what is mentioned in: [http://tools.ietf.org/html/draft-quigley-nfsv4-sec-label-01 NFSv4]
* Language: C
* Language: C
* Difficulty: Hard
* Difficulty: Hard
Line 194: Line 265:
* add a way for the client to remap the uids returned by the server to uids which would be valid on the client (or to a default if such uid does not exist).
* add a way for the client to remap the uids returned by the server to uids which would be valid on the client (or to a default if such uid does not exist).
* This is helpful especially when the server supports the CIFS Unix Extensions and has different uids and gids mapping than the client
* This is helpful especially when the server supports the CIFS Unix Extensions and has different uids and gids mapping than the client
* Difficulty: Hard
* Possible Mentors: Steve French

===VFS change notification support===
* add VFS support for calling into the filesystem when setting up notifications
* add code to cifs/smb2 to set up and deal with notifications from the server in response to inotify/dnotify calls
* Difficulty: Hard
* Difficulty: Hard
* Possible Mentors: Steve French
* Possible Mentors: Steve French
Line 243: Line 308:
*Possible mentors: Matthieu Patou
*Possible mentors: Matthieu Patou
-->
-->

== Wireshark ==
Wireshark has two SMB dissectors: "smb" for SMB1, "smb2" for SMB2 and above. It also has a DCE/RPC ([https://en.wikipedia.org/wiki/Microsoft_RPC MSRPC]) dissector that is generated from Samba IDL files.

Revision as of 14:27, 28 January 2020

Google Summer of Code: Suggested Project ideas

The following are Samba project ideas for Summer of Code. Of course you are free to come up with Samba related ideas not listed here. Please discuss your planned project by either sending an email to samba-technical@lists.samba.org or joining us on irc://irc.freenode.net/#samba-technical.

Samba

Some additional possible GSoC topics can be found in Bugzilla in the form of bugs which are marked as "Feature request": here. Questions regarding complexity and requirements should be directed to the technical mailing list.

Print System Asynchronous Remote Protocol Wireshark Dissectors

The Print System Asynchronous Remote Protocol (MS-PAR) is a replacement for the synchronous Print System Remote Protocol (MS-RPRN). MS-PAR inherits many message and buffer formats from the old protocol, but allows for asynchronous submission and notification of print jobs. Further details of the protocol can be found in Günther and Andreas' SambaXP presentation.

The student should write Wireshark dissectors for MS-PAR. The student should improve existing smbtorture tests, which demonstrate how the protocol works against a Windows server.

  • Difficulty: Medium
  • Language(s): C
  • Possible Mentors: Andreas Schneider (supported by Günther Deschner)

Ceph RADOS key-value store as an alternative to TDB

Ceph offers a highly scalable and fault-tolerant storage system. Samba is already capable of sharing data located on the Ceph Filesystem, however scale-out sharing (the same data exposed by multiple Samba nodes) currently requires the use of CTDB for consistent and coherent state across Samba cluster nodes. In such a setup CTDB provides a clustered database with persistent key-value data storage and locking. Database usage is abstracted out via a generic dbwrap interface.

Ceph's librados library provides an API for the storage and retrieval of arbitrary key-value data via the omap functions. A watch/notify protocol is also provided as a mechanism for synchronising client state (locking). Key-value data stored in the RADOS back-end inherits the same redundancy features as regular objects, making it a potentially good candidate as a replacement for CTDB in scale-out Samba clusters.

This task involves the implementation and testing of a new dbwrap back-end that uses librados for the storage, retrieval and locking of Samba key-value state. Ideally, the candidate would also allow time for benchmarking.

  • Difficulty: Medium
  • Language(s): C
  • Possible Mentors: David Disseldorp
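
To make the shape of the task more concrete, the following is a toy, in-memory model of the kind of operations a key-value back-end has to offer: fetch, store, delete, and a per-record lock held while a record is modified. It is only a sketch. It is not the dbwrap API and it does not call librados; the class and method names (`ToyKVBackend`, `fetch_locked`, `fetch`) are hypothetical. A real back-end would replace the dictionary with omap calls and the process-local lock with a cluster-wide mechanism.

```python
import threading
from contextlib import contextmanager


class ToyKVBackend:
    """Hypothetical stand-in for a clustered key-value back-end."""

    def __init__(self):
        self._data = {}                  # key (bytes) -> value (bytes)
        self._locks = {}                 # key (bytes) -> threading.Lock
        self._guard = threading.Lock()   # protects the _locks table itself

    def _lock_for(self, key):
        # Create the per-record lock on first use.
        with self._guard:
            return self._locks.setdefault(key, threading.Lock())

    @contextmanager
    def fetch_locked(self, key):
        """Lock one record, yield it for modification, write it back on exit.

        If the caller raises inside the 'with' body, nothing is written back.
        """
        with self._lock_for(key):
            record = {"key": key, "value": self._data.get(key)}
            yield record
            if record["value"] is None:
                self._data.pop(key, None)    # None means "delete the record"
            else:
                self._data[key] = record["value"]

    def fetch(self, key):
        """Read a record without taking its lock; returns None if absent."""
        return self._data.get(key)


db = ToyKVBackend()
with db.fetch_locked(b"example-key") as rec:
    rec["value"] = b"example-value"
```

The read-modify-write under a record lock is the part that needs the most care when the lock has to be coordinated between cluster nodes instead of threads.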

Samba binary size reduction

Samba has grown into quite a bloated beast. This task will focus on some areas where the bloat can easily be reduced, e.g.

  • Removal of unused autogenerated librpc code
    • RPC client and server code when only serialization functions are used
    • optional struct print routines (pidl noprint?)
    • Some knowledge of perl would help here (for pidl)
  • Add new build options to compile Samba without certain functionality
    • Undesired DCE/RPC services
    • Printing support
    • Legacy SMB/CIFS support (stretch goal)
  • Difficulty: easy
  • Language(s): Some knowledge of C and python would be helpful
  • Possible Mentors: David Disseldorp (supported by Andrew Bartlett)

Fuzz Samba using American Fuzzy Lop

Try to find bugs in the server or client components of Samba using AFL. This is not the usual AFL scenario where you can fuzz a file format parser. The fuzzing setup needs to be a bit smart: because of the way the SMB protocol works, a series of steps (protocol negotiation, session setup, tree connect, ...) must be completed before arbitrary SMB commands can be reached. Similarly, some checks like packet signing should be worked around to get the best results from AFL.

Samba code will need to be modified in hackish ways to make this work. I don't expect to be able to merge it back, but it would be a bonus if some of the required modifications could be cleaned up and integrated.

  • Difficulty: medium
  • Language(s): Some knowledge of C
  • Possible Mentors: Aurélien Aptel

Samba AD DC as the ideal POSIX Directory

Samba is a great Active Directory Domain Controller, but it is not an ideal directory server for a large, passionate and important user base: sites with Samba SMB servers and general purpose Linux servers. A smaller subset of these sites also have Linux desktops. These sites may also have Windows servers, but those, like the Windows desktops, are not the focus.

These sites often used Samba + OpenLDAP, and are finding the move to Samba's AD DC a bit difficult, because schema extension is hard, some things are not done automatically (like uidNumber allocation), and in general the focus has been on matching Windows rather than listening to the needs of this part of our user base.

Specific research should be done into what FreeIPA does well in targeting this user segment, and what customisations advanced users of OpenLDAP apply.

This project would be to propose a number of specific improvements, and to add both tests and an implementation of these improvements to Samba.

  • Difficulty: Hard
  • Language(s): C, Python
  • Possible Mentors: Andrew Bartlett

GitLab CI of Samba for non-linux platforms (FreeBSD in particular)

Samba uses GitLab CI to improve the quality of our patches. Efforts are currently underway to extend the docker container used from just Ubuntu 14.04 to later versions and other distributions.

However, we occasionally have issues ensuring Samba still builds and operates on FreeBSD and other non-linux platforms.

The idea would be to have a docker image and .gitlab-ci.yml code to support it that runs FreeBSD and then builds and runs Samba's testsuite inside that FreeBSD nested VM, while still outputting the results to the normal gitlab-ci.

This differs from just running GitLab CI runners on FreeBSD, as we need to auto-scale, destroy the host and guest at the end of the test, and run on Linux docker (such as the free GitLab.com CI runners).

As a stretch goal, it would be useful to be able to run some tests against a specific linux kernel and a raw ext4 filesystem (rather than unionfs) via qemu, rather than depending on the docker host configuration.

  • Difficulty: Medium
  • Language(s): Python, shell, YAML
  • Possible Mentors: Andrew Bartlett

Integrate Samba AD DC deployment and management with Cockpit

A prototype at https://github.com/abbra/cockpit-app-samba-ad shows how we can integrate Samba AD deployment with the Cockpit Linux management console. The goal of this task is to move forward with this prototype to produce a fully working Samba AD management tool for common operations supported by the 'samba-tool' command line utility.

  • Difficulty: Medium
  • Language(s): Python, JavaScript (React), CSS, HTML
  • Possible Mentors: Alexander Bokovoy


Linux Kernel SMB Client Improvements

The Linux Kernel has a module called cifs.ko, independent from Samba (it doesn't share code), that allows users to mount remote shares. It supports multiple dialects of SMB (1, 2, 3). The protocol dialects are now officially documented by Microsoft (see MS-SMB, MS-SMB2), so students shouldn't have to worry about reverse engineering to understand them. The Wireshark open source network sniffer and dissector is a very good learning tool as well.


Interested students should contact Steve French or the linux-cifs mailing list to discuss possible improvements to the Linux Kernel CIFS VFS client.

Add machine-readable debug & stats /proc file

  • We currently output debug and statistic information under /proc/fs/cifs/ (DebugData, Stats, ...). We need to stop outputting free-format text that breaks all parsers out there every time we add things to it. Clean up the cifsdebug.c file (it is kind of messy). Possibly generate a hierarchy of /proc files (e.g. a dir per tcp connection, subdirs for session, files for tcons) instead of dumping everything in one file.
  • Make a nice visualizer/dashboard thing to get an overview. This could be console or GUI or...
  • Language: C for the kernel stuff, Userspace can be C, C++, Python.
  • Difficulty: Low
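
As an illustration of why a fixed layout helps userspace, here is a minimal sketch of a parser for a hypothetical "one `name value` pair per line" layout. This is not the current format of /proc/fs/cifs/Stats; the counter names and the layout are assumptions made up for the example. The point is that new counters can be added without breaking the parser.

```python
import json


def parse_pairs(text):
    """Parse 'name value' lines into a dict; unknown names are kept, not rejected."""
    stats = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue                      # skip blank lines and comments
        name, _, value = line.partition(" ")
        value = value.strip()
        # Counters become ints; anything else stays a string.
        stats[name] = int(value) if value.isdigit() else value
    return stats


# Hypothetical sample, not real cifs.ko output.
SAMPLE = """\
# hypothetical layout, not the current /proc/fs/cifs/Stats format
sessions 1
shares 2
reads 1024
"""

print(json.dumps(parse_pairs(SAMPLE), sort_keys=True))
```

A dashboard or CLI can then consume the dictionary (or the JSON) directly instead of matching free-format text.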

Add additional ftrace (trace-cmd) tracepoints and better GUI

  • Add more dynamic tracepoints to cifs.ko for commonly needed user scenarios, and add a GUI (and/or CLI) tool to make it easier to enable/disable cifs.ko dynamic tracepoints (see /sys/kernel/debug/tracing/events/cifs/ for the pseudo-files that are currently configured manually for tracing or via trace-cmd). Make a nice native/console/web UI for it.
  • Language: C (any kernel changes) and userspace C, C++ or Python
  • Difficulty: Low
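
The userspace half of this idea can be sketched in a few lines. The example below assumes the usual tracefs layout, in which each event is a directory containing an `enable` pseudo-file that accepts `1` or `0`; the helper names (`list_events`, `set_event`, `is_enabled`) are hypothetical, and writing to the real files requires root. The `root` parameter exists so the functions can be pointed at a different mount point.

```python
from pathlib import Path

# Location named in the text above; tracefs may be mounted elsewhere.
CIFS_EVENTS = Path("/sys/kernel/debug/tracing/events/cifs")


def list_events(root=CIFS_EVENTS):
    """Return the names of event directories that expose an 'enable' file."""
    root = Path(root)
    return sorted(p.name for p in root.iterdir()
                  if p.is_dir() and (p / "enable").exists())


def set_event(name, enabled, root=CIFS_EVENTS):
    """Write '1' or '0' to <root>/<name>/enable."""
    (Path(root) / name / "enable").write_text("1" if enabled else "0")


def is_enabled(name, root=CIFS_EVENTS):
    """Report whether <root>/<name>/enable currently reads '1'."""
    return (Path(root) / name / "enable").read_text().strip() == "1"
```

A CLI or GUI would mostly be a front end over these three operations, plus reading the resulting trace.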

Add performance analysis cli tools

  • Add more perf tools for the SMB3 client (similar to iostat or nfsstat) that leverage (and possibly extend) what is captured in /proc/fs/cifs/Stats, but make it easier to analyze the performance of a cifs mount
  • Language: C, C++ or Python
  • Difficulty: Low
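
Tools in the style of iostat typically sample cumulative counters twice and report the difference per second. A minimal sketch of that calculation follows; the counter names are placeholders, and how the snapshots are obtained from /proc/fs/cifs/Stats is left out on purpose.

```python
def rates(before, after, seconds):
    """Per-second rate for every counter present in both snapshots."""
    if seconds <= 0:
        raise ValueError("interval must be positive")
    return {name: (after[name] - before[name]) / seconds
            for name in before.keys() & after.keys()}


# Two hypothetical snapshots taken 2 seconds apart.
first = {"reads": 100, "writes": 10}
second = {"reads": 300, "writes": 10}
print(rates(first, second, 2.0))
```

A real tool would also have to decide what to do when a counter goes backwards, for example after a remount or a statistics reset.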

Write the One-True-Tool to unify probe/setup/configuration cifs.ko properly

  • There are too many knobs in different places at the moment: request-keys, idmap, cifscreds, /proc stuff. The goal of this project would be to write one CLI tool that would wrap everything under a common interface. It could handle getting/setting ACLs as well.
  • This would be a userspace project.
  • The implementation shouldn't be too hard, but the student will need to learn about the existing configuration methods and use cases, which can take a long time.
  • Language: C, C++, Python
  • Difficulty: Medium

Add support for DAC (Claims Based ACLs) to the smb3 kernel module and tools

  • Similar to what was done to extend the Apache access control model to allow richer ACL semantics, SMB3's access control model was extended (at least in Windows). See e.g. Dynamic Access Control.
  • Add support to the SMB3 kernel module and user space tools (or create new ones) to allow viewing and managing claims based ACLs (DAC) from the client.
  • Difficulty: Medium
  • Language: C (kernel), C/C++/Python (user space tools)

Add support for ODX (T10) Copy Offload to the smb3 kernel module

  • Windows and various NAS servers (e.g. NetApp) support ODX copy offload, to allow much faster server side copy.
  • Add support to the kernel client for this and integration with existing or new copy offload tools to make it easy to use.
  • For more information see MS-SMB2, SMB3 offload data transfer, Offloaded data transfers, and the references at the bottom of Server-Side_Copy.
  • Language: C
  • Difficulty: Medium/High

VFS change notification support in cifs.ko

  • The kernel provides a file/dir notification API (inotify, dnotify). The SMB protocol also provides a way to get notified of file changes. This project would be about implementing the inotify API for cifs.ko by making use of the SMB notification mechanism.
  • Difficulty: Hard
  • Language: C
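
One core piece of this project is deciding which SMB change notifications to request for a given inotify watch. The sketch below translates an inotify event mask into an SMB2 CHANGE_NOTIFY completion filter. The constant values are taken from `<sys/inotify.h>` and MS-SMB2; the mapping between them is only one plausible choice made for this example, not something cifs.ko defines, and the kernel implementation would be in C.

```python
# inotify event bits, values from <sys/inotify.h>
IN_ACCESS = 0x001
IN_MODIFY = 0x002
IN_ATTRIB = 0x004
IN_MOVED_FROM = 0x040
IN_MOVED_TO = 0x080
IN_CREATE = 0x100
IN_DELETE = 0x200

# SMB2 CHANGE_NOTIFY CompletionFilter bits, values from MS-SMB2
FILE_NOTIFY_CHANGE_FILE_NAME = 0x001
FILE_NOTIFY_CHANGE_DIR_NAME = 0x002
FILE_NOTIFY_CHANGE_ATTRIBUTES = 0x004
FILE_NOTIFY_CHANGE_SIZE = 0x008
FILE_NOTIFY_CHANGE_LAST_WRITE = 0x010
FILE_NOTIFY_CHANGE_LAST_ACCESS = 0x020
FILE_NOTIFY_CHANGE_SECURITY = 0x100

# Assumed mapping for illustration: (inotify bits, SMB2 filter bits to request).
_MAPPING = [
    (IN_CREATE | IN_DELETE | IN_MOVED_FROM | IN_MOVED_TO,
     FILE_NOTIFY_CHANGE_FILE_NAME | FILE_NOTIFY_CHANGE_DIR_NAME),
    (IN_MODIFY, FILE_NOTIFY_CHANGE_SIZE | FILE_NOTIFY_CHANGE_LAST_WRITE),
    (IN_ATTRIB, FILE_NOTIFY_CHANGE_ATTRIBUTES | FILE_NOTIFY_CHANGE_SECURITY),
    (IN_ACCESS, FILE_NOTIFY_CHANGE_LAST_ACCESS),
]


def completion_filter(inotify_mask):
    """Return the SMB2 completion filter covering the requested inotify events."""
    result = 0
    for in_bits, smb_bits in _MAPPING:
        if inotify_mask & in_bits:
            result |= smb_bits
    return result
```

Events such as open and close are not covered by the filter bits listed here and are ignored by this sketch; handling them is one of the design questions the project would need to answer.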

Failover/Continuous Availability and HA improvements (Witness protocol)

  • Benefits: Improved reliability, data integrity - may also allow planned migrations (moving data from one server to another).
  • Challenges: Complexity, requires additional RPC infrastructure in client. There is a Samba user space prototype of the Witness protocol that could be reused (since we only need the client part of the RPC calls).
  • Language: C
  • Difficulty: High


Wireshark

Wireshark has two SMB dissectors: "smb" for SMB1, "smb2" for SMB2 and above. It also has a DCE/RPC (MSRPC) dissector that is generated from Samba IDL files.