Difference between revisions of "SoC/Ideas"

(add Ceph RADOS dbwrap back-end task)
m (Remove myself as possible mentor for Windows Search Protocol task)

Revision as of 14:44, 19 February 2015

Google Summer of Code: Suggested Project ideas

The following are the Samba project ideas for Summer of Code. Of course you are free to come up with ideas not listed here. Please discuss your planned project by either joining us on irc://irc.freenode.net/#samba-technical or by sending email to samba-technical@lists.samba.org.

Most of our projects will require C programming skills, but the Samba section has a couple of Python projects.

Samba

Some additional possible GSoC topics can be found in Bugzilla in the form of bugs which are marked as "Feature request": here. Questions regarding complexity and requirements should be directed to the technical mailing list.


Add (lib)smbclient server-side copy support

Using the Copy-Chunk server-side copy FSCTL, an SMB client can request that a server copy a specific range of bytes from one file to another, without needing to transfer the data across the network. Copy-Chunk is now supported by Samba's SMB2 file-server. The goal of this task is to implement smbclient support for server-side copy operations using FSCTL_SRV_REQUEST_RESUME_KEY and FSCTL_SRV_COPYCHUNK SMB2 requests.

  • Difficulty: Easy, Medium
  • Language(s): C
  • Possible mentors: David Disseldorp
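As a rough illustration of the client-side bookkeeping this task involves, the sketch below splits a byte range into the per-chunk records that an FSCTL_SRV_COPYCHUNK request carries. The function name `plan_copychunks` and the 1 MiB `MAX_CHUNK_LEN` limit are assumptions made for this example, not existing smbclient code; a real client would also obtain the resume key first and respect the limits the server reports.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed per-chunk limit for this sketch; a server may advertise another. */
#define MAX_CHUNK_LEN (1024u * 1024u)

/* One chunk record: source offset, target offset, length, reserved. */
struct copychunk {
	uint64_t source_offset;
	uint64_t target_offset;
	uint32_t length;
	uint32_t reserved;
};

/*
 * Hypothetical helper: split the range [src_off, src_off + len) into
 * chunks of at most MAX_CHUNK_LEN bytes. Returns the number of chunks
 * written to out[]; if that equals max_out, the caller may need to plan
 * the remainder of the range in a further request.
 */
size_t plan_copychunks(uint64_t src_off, uint64_t dst_off, uint64_t len,
		       struct copychunk *out, size_t max_out)
{
	size_t n = 0;

	assert(out != NULL);

	while (len > 0 && n < max_out) {
		uint32_t this_len = (len > MAX_CHUNK_LEN)
			? MAX_CHUNK_LEN : (uint32_t)len;

		out[n].source_offset = src_off;
		out[n].target_offset = dst_off;
		out[n].length = this_len;
		out[n].reserved = 0;
		n++;

		src_off += this_len;
		dst_off += this_len;
		len -= this_len;
	}
	return n;
}
```

The planner is kept separate from any network code so that it can be unit-tested without a server.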

Utilize libsmbclient server-side copy support in file managers

Following the completion of libsmbclient server-side copy support, file managers making use of libsmbclient can be changed to utilize server-side copy support for greatly improved remote copy performance. Potential file manager targets include GNOME Files/Nautilus (gvfs_smb), Dolphin (kio_smb) and XBMC's File Manager.

  • Difficulty: Easy, Medium
  • Language(s): C, C++
  • Possible mentors: David Disseldorp

Improve libcli/dns

Samba comes with its own asynchronous DNS parser framework developed for the internal DNS server. Basic calls have been implemented for a client-side library as well, but a more fleshed-out implementation is needed. The goal of this project is to implement more high-level calls handling DNS requests, such as UDP/TCP switchover and client-side GSS-TSIG cryptography. A test suite exercising all the functions is required and can be used to cross-check and complement the existing DNS server tests already shipped by Samba. This test suite should use cmocka.

See Samba's gitweb for the current code.

  • Difficulty: Medium
  • Language(s): C
  • Possible mentors: Kai Blin
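To make the UDP/TCP switchover concrete: a DNS client normally retries a query over TCP when the UDP reply has the truncation (TC) bit set in its header. The check can be sketched as below. The function name `dns_reply_needs_tcp` is invented for this example and is not part of libcli/dns.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DNS_HEADER_LEN 12      /* fixed-size DNS message header */
#define DNS_FLAG_TC    0x0200  /* truncation bit in the 16-bit flags field */

/*
 * Hypothetical helper: return true if a raw DNS reply received over UDP
 * has the TC bit set, meaning the query should be repeated over TCP.
 * Replies shorter than a full header are treated as not truncated; a
 * real client would report them as malformed instead.
 */
bool dns_reply_needs_tcp(const uint8_t *pkt, size_t len)
{
	uint16_t flags;

	assert(pkt != NULL);

	if (len < DNS_HEADER_LEN) {
		return false;
	}
	/* Bytes 0-1 hold the ID; bytes 2-3 hold the flags, big-endian. */
	flags = (uint16_t)((pkt[2] << 8) | pkt[3]);
	return (flags & DNS_FLAG_TC) != 0;
}
```

A higher-level request call could use this check to decide, transparently to the caller, whether to resend the same query over a TCP connection.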

Windows Search Protocol (WSP) client library and torture tests

The Windows Search Protocol (WSP) is used to implement remote full filesystem indexing (indexed search) between Windows machines. We would like to support this functionality in Samba, interfacing with existing indexing tools on Unix systems (such as GNOME Tracker).

This is a DCE/RPC protocol. See http://msdn.microsoft.com/en-us/library/cc251767.aspx .

The student should write a (un)marshalling library to push and pull PDUs and an asynchronous client library on top of the Samba raw smb client library.

The student should write sub-tests for smbtorture which should demonstrate how the protocol works against a Windows server. The student doesn't have to implement the Samba server code.

  • Difficulty: Medium, Hard
  • Language(s): C, (Python)
  • Possible Mentors:

Print System Asynchronous Remote Protocol client library and torture tests

The Print System Asynchronous Remote Protocol (MS-PAR) is a replacement for the synchronous Print System Remote Protocol (MS-RPRN). MS-PAR inherits many message and buffer formats from the old protocol, but allows for asynchronous submission and notification of print jobs. Further details of the protocol can be found in Günther and Andreas' SambaXP presentation.

The student should write a (un)marshalling library to push and pull MS-PAR PDUs, and an asynchronous client library on top of the Samba raw smb client library.

The student should write sub-tests for smbtorture which should demonstrate how the protocol works against a Windows server. The student doesn't have to implement the Samba server code.

  • Difficulty: Medium, Hard
  • Language(s): C
  • Possible Mentors: Andreas Schneider, David Disseldorp

dbwrap back-end for Ceph RADOS key-value storage

Ceph offers a highly scalable and fault-tolerant storage system. Samba is already capable of sharing data located on the Ceph Filesystem, however scale-out sharing (the same data exposed by multiple Samba nodes) currently requires the use of CTDB for consistent and coherent state across Samba cluster nodes. In such a setup CTDB provides a clustered database with persistent key-value data storage and locking. Database usage is abstracted out via a generic dbwrap interface.

Ceph's librados library provides an API for the storage and retrieval of arbitrary key-value data via the omap functions. A watch/notify protocol is also provided as a mechanism for synchronising client state (locking). Key-value data stored in the RADOS back-end inherits the same redundancy features as regular objects, making it a potentially good candidate as a replacement for CTDB in scale-out Samba clusters.

This task involves the implementation and testing of a new dbwrap back-end that uses librados for the storage, retrieval and locking of Samba key-value state. Ideally, the candidate would also allow time for benchmarking, and an investigation of scalability bottlenecks.

  • Difficulty: Medium, Hard
  • Language(s): C
  • Possible Mentors: David Disseldorp
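The idea of hiding the database behind a generic interface can be illustrated with a small table of function pointers. Everything in this sketch is hypothetical: `struct kv_ops`, `mem_backend` and the in-memory store are invented for illustration and are neither Samba's dbwrap API nor librados calls. A RADOS back-end would supply its own store and fetch functions behind the same table, and would also need locking, which is omitted here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical back-end interface; NOT Samba's actual dbwrap API. */
struct kv_ops {
	int (*store)(void *priv, const char *key, const char *val);
	const char *(*fetch)(void *priv, const char *key);
};

#define MEM_SLOTS  8
#define MEM_STRLEN 32

/* Toy in-memory back-end state standing in for a real database. */
struct mem_db {
	char keys[MEM_SLOTS][MEM_STRLEN];
	char vals[MEM_SLOTS][MEM_STRLEN];
	size_t used;
};

/* Insert or overwrite a record; returns 0 on success, -1 on failure. */
static int mem_store(void *priv, const char *key, const char *val)
{
	struct mem_db *db = priv;
	size_t i;

	assert(db != NULL && key != NULL && val != NULL);

	if (strlen(key) >= MEM_STRLEN || strlen(val) >= MEM_STRLEN) {
		return -1;
	}
	for (i = 0; i < db->used; i++) {
		if (strcmp(db->keys[i], key) == 0) {
			strcpy(db->vals[i], val);
			return 0;
		}
	}
	if (db->used == MEM_SLOTS) {
		return -1;
	}
	strcpy(db->keys[db->used], key);
	strcpy(db->vals[db->used], val);
	db->used++;
	return 0;
}

/* Look up a record; returns NULL if the key is not present. */
static const char *mem_fetch(void *priv, const char *key)
{
	struct mem_db *db = priv;
	size_t i;

	assert(db != NULL && key != NULL);

	for (i = 0; i < db->used; i++) {
		if (strcmp(db->keys[i], key) == 0) {
			return db->vals[i];
		}
	}
	return NULL;
}

/* Callers only ever see this table, never the back-end internals. */
const struct kv_ops mem_backend = { mem_store, mem_fetch };
```

Because callers go through the table only, swapping the storage layer does not require changes to the code that uses it.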


Linux Kernel CIFS/SMB2/SMB3 client improvements

Interested students should contact Steve French (or Jeff Layton) and discuss possible improvements to the Linux Kernel CIFS VFS client. Here are some ideas to get you started:

SMB3 protocol improvements

  • The SMB2 protocol (the follow-on to CIFS) and now the SMB3 protocol, new in Windows 8, Windows Server 2012 and Samba 4, add many useful performance enhancements and new features. SMB2.1 support, and even minimal SMB3 support, has been in the Linux kernel client since the 3.8 kernel, but many useful optional features remain to be implemented. A sample list of possible features to code includes:
  • Support for using multiple network interfaces at once under the same mount (SMB3 multichannel)
  • Improved directory and metadata caching ("directory oplocks")
  • Improved failover in clustering environments ("persistent file handles" and more generally SMB3 "continuous availability" support)
  • File copy offload (there are multiple server-side copy mechanisms possible with SMB3: T10 copy offload, SMB2/SMB3 "copy chunk", and an older cifs mechanism, "SMB Copy", that could be implemented, compared and optimized). This is especially timely given the improvements in Samba/btrfs integration, which better optimize SMB2/SMB3 "copy chunk" handling on the server. This could improve file copy performance by orders of magnitude.
  • High-availability improvements on server and/or client (the SMB3 "Witness protocol")
  • Language: C
  • Difficulty: Varies, Medium to Hard, but most of the protocol features are at least well documented in WSPP and have implementations in current Windows clients and servers already
  • Possible Mentors: Steve French

Support for SELinux

  • MAC (Mandatory Access Control) security label support is important for virtualization and useful for improved security in some workloads. Support for setting/getting these labels over the wire was investigated in the NFS version 4 workgroup. Adding support to the CIFS Unix Extensions (Linux kernel client and Samba server) should be possible, especially if this is just a new class of extended attribute. The goal would be to support this feature of SELinux to allow KVM and other applications to take advantage of security labels. Some of the background requirements are loosely related to the (NFS equivalent of) what is mentioned in: http://tools.ietf.org/html/draft-quigley-nfsv4-sec-label-01
  • Language: C
  • Difficulty: Hard
  • Possible Mentors: Steve French

Create GUI or command-line tools for displaying /proc/fs/cifs statistics and mount/session status

  • Might also involve some cleanup of the in-kernel stats / status output.
  • A mostly complete cifs.ko Performance Co-Pilot (PCP) monitoring agent was implemented in 2013.
  • Language: some C (for kernel code), something else for GUI?
  • Difficulty: Easy
  • Possible Mentors: Steve French

Create a common uid mapping mechanism for Linux nfs and cifs vfs clients

  • or maybe just figure out a way to hook cifs up to rpc.idmapd
  • add a way for the client to remap the uids returned by the server to uids which would be valid on the client (or to a default if such uid does not exist).
  • This is especially helpful when the server supports the CIFS Unix Extensions and has a different uid and gid mapping than the client
  • Difficulty: Hard
  • Possible Mentors: Steve French
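The remapping rule described above (translate a server uid to a client uid, falling back to a default when no mapping exists) can be sketched in a few lines. The table layout and the name `remap_uid` are assumptions for illustration only; a kernel implementation would likely use a different data structure and would take its mappings from an upcall or mount option.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mapping entry: uid as sent by the server -> local uid. */
struct uid_map_entry {
	uint32_t server_uid;
	uint32_t client_uid;
};

/*
 * Hypothetical helper: translate a server-side uid into a uid that is
 * valid on the client. If the table has no entry for server_uid, the
 * caller-supplied default_uid is returned instead.
 */
uint32_t remap_uid(const struct uid_map_entry *map, size_t n,
		   uint32_t server_uid, uint32_t default_uid)
{
	size_t i;

	assert(map != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (map[i].server_uid == server_uid) {
			return map[i].client_uid;
		}
	}
	return default_uid;
}
```

A linear scan is adequate for a handful of entries; a larger table would call for a sorted array or hash lookup.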

VFS change notification support

  • add VFS support for calling into the filesystem when setting up notifications
  • add code to cifs/smb2 to set up and deal with notifications from the server in response to inotify/dnotify calls
  • Difficulty: Hard
  • Possible Mentors: Steve French

Support for retrieving snapshots, encrypted files, or compressed files from Windows

  • Difficulty: Medium
  • Possible Mentors: Steve French

cifs->Samba automated test facility

  • Do build verification similar to what we can now do with the Samba server and tools in the Samba build farm. Mounts from the Linux SMB2 and CIFS kernel clients could be tested with POSIX file I/O tests, which might include modified versions of the "connectathon" and xfstests test suites and others. The goal is to quickly identify problems with newly integrated patches.
  • xfstests support for CIFS was added as part of SoC/2014.
  • Difficulty: Hard
  • Possible Mentors: Steve French

Other Random Ideas

  • Ideas aren't limited to these, feel free to propose something else:
    • Create a GUI for creating and managing Linux cifs mounts, and more easily configuring the many complex cifs mount options, statistics (/proc/fs/cifs)
    • Support for alternate transport protocols (other than TCP sockets). Adding support for SCTP to the cifs/smb2 kernel clients and Samba server or, perhaps more interestingly, adding support for Linux's "virtio" transport to the cifs/smb2 kernel clients and Samba server (to allow optimized mounts and zero-copy transfer of data from virtualized guests to hosts on the same box)
    • Support for features (such as directory delegations) which NFS version 4.1 has but which CIFS lacks, even with the most current CIFS->Samba protocol extensions (CIFS Unix Extensions). This will probably need server support too.
    • Add additional library support or modify Samba client libraries so they can use existing kernel cifs functions (such as sending SMBs on negotiated sessions when the kernel client already has a session to the server). With the addition of a library to access the cifs pipe (in kernel), Samba client libraries or other DCE/RPC code could use cifs kernel sessions for management of and over cifs mounts.
    • Add libraries and utilities to manage ACLs (the cifs kernel client has an extended attribute for setting/getting "raw" cifs ACLs, but userspace POSIX ACL tools obviously can't be used to manage cifs-specific ACL features).
  • Difficulty: Varies
  • Language(s): C
  • Possible mentors: Steve French