LinuxCIFS troubleshooting: Difference between revisions

Revision as of 22:57, 26 March 2018

Asking for Help

The best place to ask for help with Linux CIFS is on the linux-cifs mailing list. When asking for help, it's best to provide some basic info:

The kernel version you're using (the output of uname -r)
The mount.cifs version you're using (mount.cifs -V)
A clear, concise description of the problem
A description of the CIFS server with which you're having trouble (Windows version if it's Windows, Samba version if it's Samba, name of the appliance if it's something else)
If you're able to mount the share, the contents of /proc/fs/cifs/DebugData
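
The items above that come from commands can be gathered in one go. The following is a minimal sketch, not an official tool: cifs_report is a hypothetical helper name, and its optional argument exists only so that the function can be pointed at a saved copy of DebugData instead of the live file.

```shell
#!/bin/sh
# Sketch: print the basic details usually requested on the list.
# cifs_report is a hypothetical name; $1 (optional) overrides the
# DebugData path, which defaults to the real /proc file.
cifs_report() {
    debugdata="${1:-/proc/fs/cifs/DebugData}"

    echo "== kernel version (uname -r) =="
    uname -r

    echo "== mount.cifs version (mount.cifs -V) =="
    if command -v mount.cifs >/dev/null 2>&1; then
        mount.cifs -V
    else
        echo "mount.cifs not found in PATH"
    fi

    echo "== $debugdata =="
    if [ -r "$debugdata" ]; then
        cat "$debugdata"
    else
        echo "not readable (is the cifs module loaded?)"
    fi
}
```

Running cifs_report > cifs-report.txt produces a file that can be pasted into a mail, alongside the problem description and the server details, which still have to be written by hand.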

Enabling Debugging

The CIFS code contains a number of debugging statements that can be enabled. If you ask for help on the list, one of the developers may ask you for this info. You can also turn it on on your own, but it's not generally helpful unless you're willing to dig into the code.

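To enable debugging, echo a non-zero value into /proc/fs/cifs/cifsFYI. The example below also turns on the cifs module's dynamic debug statements through the dynamic_debug control file, which requires debugfs to be mounted at /sys/kernel/debug. For example: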

# modprobe cifs
# echo 'module cifs +p' > /sys/kernel/debug/dynamic_debug/control
# echo 'file fs/cifs/* +p' > /sys/kernel/debug/dynamic_debug/control
# echo 7 > /proc/fs/cifs/cifsFYI

Additional detail on debugging the CIFS/SMB3 Linux kernel client can be found at https://wiki.samba.org/index.php/LinuxCIFS_troubleshooting

To disable it:

# echo 0 > /proc/fs/cifs/cifsFYI

These messages end up in the kernel ring buffer. You can view them using dmesg.

# dmesg

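syslog will generally also pick up much of it, but if the rate of messages is rather large, syslog tends to drop some of them. Getting the info straight out of the ring buffer is generally preferred since it avoids those drops. Keep in mind that the ring buffer has a fixed size, so the oldest messages are overwritten once it fills up.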

This debugging output can be rather chatty, however, and can have a significant impact on performance. It's often best to use it with easily reproducible problems. That is:

turn on debugging
reproduce the issue
turn off debugging
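
The three steps above can be wrapped around a reproducer so that debugging is not left on by accident. This is a minimal sketch, assuming it runs as root and that only cifsFYI is toggled (any dynamic_debug settings are left alone). The names with_cifs_debug and CIFSFYI are hypothetical; CIFSFYI can be pointed at a scratch file for a dry run.

```shell
#!/bin/sh
# Sketch: keep cifsFYI enabled only while a reproducer command runs.
# with_cifs_debug and CIFSFYI are hypothetical names, not part of cifs.
CIFSFYI="${CIFSFYI:-/proc/fs/cifs/cifsFYI}"

with_cifs_debug() {
    echo 7 > "$CIFSFYI" || return 1   # turn on debugging
    "$@"                              # reproduce the issue
    status=$?
    echo 0 > "$CIFSFYI"               # turn off debugging
    return "$status"                  # pass the reproducer's exit code on
}
```

For example, with_cifs_debug ls /mnt/share followed by dmesg > /tmp/cifs-debug.log, where /mnt/share and the log file name stand in for whatever mount point and output file apply to your setup.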

Debugging info can contain sensitive data like IP addresses and filenames. Take care when sending this information.
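
As a rough first pass before sharing a log, IPv4 addresses can be masked with sed. This is only a sketch under narrow assumptions: scrub_ipv4 is a hypothetical helper name, it does nothing about filenames, hostnames or IPv6 addresses, and it will also mask anything else that looks like four dot-separated numbers. The result still needs to be read through by hand.

```shell
#!/bin/sh
# Sketch: mask dotted-quad IPv4 addresses on stdin.
# scrub_ipv4 is a hypothetical name. Filenames, hostnames and IPv6
# addresses are NOT touched by this filter.
scrub_ipv4() {
    sed -E 's/([0-9]{1,3}\.){3}[0-9]{1,3}/x.x.x.x/g'
}
```

For example, dmesg | scrub_ipv4 > /tmp/cifs-debug-scrubbed.log, where the output file name is just a placeholder.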

Wire Captures

It's sometimes helpful to capture wire traffic between the client and server. The easiest way to do this is with Wireshark, a graphical network analysis tool. In many cases, however, it's not easy or even possible to run Wireshark directly on one of the hosts. It's then often easier to capture the network traffic to a file in binary format and feed that file into an analyzer afterwards. That also makes it possible to send the capture to someone else who can analyze it.

Here's an example of doing this:

# tcpdump -i eth0 -s0 -w /tmp/cifs-traffic.pcap host cifs_server.example.com and port 445

...of course, tcpdump has a lot of options, so this is just an example. In particular, you'll want to modify the capture filter depending on which machine you're running the capture on. An excellent overview presentation on using Wireshark to trace SMB workloads can be found at https://www.snia.org/sites/default/orig/sdc_archives/2008_presentations/monday/RonnieSahlberg_UsingWireshark.pdf

The captured traffic in the above example will be in /tmp/cifs-traffic.pcap. Before sending captures around, it's a good idea to compress them, as they squash down fairly well.
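
Compressing a capture can be as simple as the following sketch. compress_capture is a hypothetical helper name; it writes a .gz file next to the capture and leaves the original in place.

```shell
#!/bin/sh
# Sketch: gzip a capture file, keeping the original.
# compress_capture is a hypothetical name; $1 is the capture file.
compress_capture() {
    gzip -9 -c "$1" > "$1.gz"
}
```

For the tcpdump example above, that would be compress_capture /tmp/cifs-traffic.pcap, which produces /tmp/cifs-traffic.pcap.gz.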

In general, the SMB protocol can be fairly chatty so it's best to use this in a similar manner to the debugging above:

start the capture
reproduce the problem
stop the capture
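
The steps above can be scripted around a reproducer. This is a minimal sketch, assuming it runs as root, that tcpdump is installed, and that the filter from the earlier example suits your setup. capture_during, CAP_IFACE and CAP_FILE are hypothetical names, and the one-second pauses are arbitrary.

```shell
#!/bin/sh
# Sketch: run tcpdump only while a reproducer command runs.
# capture_during, CAP_IFACE and CAP_FILE are hypothetical names.
CAP_IFACE="${CAP_IFACE:-eth0}"
CAP_FILE="${CAP_FILE:-/tmp/cifs-traffic.pcap}"

capture_during() {
    server="$1"; shift
    # start the capture in the background
    tcpdump -i "$CAP_IFACE" -s0 -w "$CAP_FILE" host "$server" and port 445 &
    cap_pid=$!
    sleep 1                             # give tcpdump a moment to start
    "$@"                                # reproduce the problem
    status=$?
    sleep 1                             # let trailing packets arrive
    kill "$cap_pid"                     # stop the capture (SIGTERM)
    wait "$cap_pid" 2>/dev/null || true # reap tcpdump, ignore its exit code
    return "$status"
}
```

For example, capture_during cifs_server.example.com ls /mnt/share, where /mnt/share stands in for the mount point that shows the problem.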

Wire captures can also contain sensitive data like addresses, password hashes, filenames and file data. Be careful to whom you send them. In general, don't send them to mailing lists unless you know that the data isn't sensitive.

Oopses

Occasionally the kernel will oops or panic. When it does, it's helpful to capture the entire message, including the kernel messages leading up to the oops. There's a lot of info in an oops message, but the main thing that helps debugging is determining where the machine crashed. Here's one way to do this:

Save off the oops message. The main thing that you see in there is a dump of the registers on the CPU that hit the problem. For instance, an oops on a 32-bit x86 machine might look something like this:

BUG: unable to handle kernel NULL pointer dereference at 00000414
IP: [<c110d057>] cifs_writepages+0x35/0x60a

...the "IP:" line refers to the instruction pointer. That tells us what instruction the CPU was executing at the time of the oops. Because of architecture and compiler differences, though, the function name and offset can't be turned into a line of code by eye. Here's how to do that with gdb:

Open the kernel module with gdb:

$ gdb cifs.ko

...eventually it should come to a (gdb) prompt. If you're running a vendor kernel, then you may need debuginfo packages for this to work. Once you get a gdb prompt, run:

(gdb) list *(cifs_writepages+0x35)

...obviously, you should replace the expression in the parentheses with whatever your oops message says (the function name and the offset before the slash). Pasting the list output can help developers help you.
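
The gdb expression can also be derived from the saved oops text. This is a minimal sketch that only understands the "IP: [<address>] function+0xoffset/0xsize" layout shown in the example above; other kernels format this line differently. oops_to_gdb_cmd is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: turn an "IP: [<addr>] func+0xoff/0xsize" line on stdin into
# the matching gdb "list" command. oops_to_gdb_cmd is a hypothetical
# name; only the line layout from the example above is handled.
oops_to_gdb_cmd() {
    sed -n 's#^IP: \[<[0-9a-f]*>\] \([A-Za-z0-9_.]*+0x[0-9a-f]*\)/0x[0-9a-f]*.*$#list *(\1)#p'
}
```

With the oops saved in a file (oops.txt here is a placeholder name), the lookup can then be run non-interactively using gdb's -batch and -ex options: gdb -batch -ex "$(oops_to_gdb_cmd < oops.txt)" cifs.ko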
