LinuxCIFS troubleshooting

From SambaWiki
Jump to: navigation, search

Contents

Asking for Help

The best place to ask for help with Linux CIFS is on the linux-cifs mailing list. When asking for help, it's best to provide some basic info:

  • The kernel version you're using (the output of uname -r)
  • The mount.cifs version you're using (mount.cifs -V)
  • A clear, concise description of the problem
  • A description of the CIFS server with which you're having trouble (Windows version if it's windows, samba version if it's samba, name of the appliance if it's something else)
  • if you're able to mount the host, get the contents of /proc/fs/cifs/DebugData

Enabling Debugging

The CIFS code contains a number of debugging statements that can be enabled. If you ask for help on the list, one of the developers may ask you for this info. You can also turn it on on your own, but it's not generally helpful unless you're willing to dig into the code.

To enable debugging, echo a non-zero value into /proc/fs/cifs/cifsFYI. For example:

# modprobe cifs
# echo 7 > /proc/fs/cifs/cifsFYI

To disable it:

# echo 0 > /proc/fs/cifs/cifsFYI

These messages end up in the kernel ring buffer. You can view them using dmesg.

# dmesg

syslog will generally also pick up much of it, but if the rate of messages is rather large, syslog tends to drop some of them. Getting the info straight out of the ring buffer is generally preferred since that's lossless.

This debugging however can be rather chatty and have a significant impact on performance. It's often best to use this with easily reproducible problems. That is:

  • turn on debugging
  • reproduce the issue
  • turn off debugging

Debugging info can contain sensitive data like IP addresses and filenames. Take care when sending this information.

Wire Captures

It's sometimes helpful to capture wire traffic between the client and server. The easiest way to do this is with wireshark which is a graphical network analysis tool. In many cases however, it's not easy or possible to run wireshark directly on one of the hosts. In that case, it's often easier to capture the network traffic in binary format to a file and then feed it into an analyzer to look over it. That also makes it possible to send it to someone who can do some analysis on it.

Here's an example of doing this:

# tcpdump -i eth0 -s0 -w /tmp/cifs-traffic.pcap host cifs_server.example.com and port 445

...of course, tcpdump has a lot of options, so these are just an example. In particular you'll want to modify the capture filter depending on what machine you're running the capture on, etc...

The captured traffic in this above example will be in /mnt/cifs-traffic.pcap. Before sending these around, it's a good idea to compress them as they squash down fairly well.

In general, the SMB protocol can be fairly chatty so it's best to use this in a similar manner to the debugging above:

  • start the capture
  • reproduce the problem
  • stop the capture

Wire captures can also contain sensitive data like addresses, password hashes, filenames and data. Be careful to whom you send it. In general, don't send this to mailing lists unless you know that the data isn't sensitive.

Oopses

Occasionally the kernel will panic. When it does, it's helpful to capture the entire message including the kernel messages leading up to the oops. There's a lot of info in an oops message but the main thing that helps debugging is determining where the machine panicked. Here's one way to do this:

Save off the oops message. The main thing that you see in there is a dump of the registers on the CPU that panicked. For instance, an oops on a 32-bit ix86 machine might look something like this:

BUG: unable to handle kernel NULL pointer dereference at 00000414
IP: [<c110d057>] cifs_writepages+0x35/0x60a

...the "IP:" line refers to the instruction pointer. That tells us what instruction the CPU was executing at the time that it panicked. The problem is though that due to architecture and compiler differences, etc, we can't directly turn that into a line of code. Here's how to do that:

Open the kernel module with gdb:

$ gdb cifs.ko

...eventually it should come to a (gdb) prompt. If you're running a vendor kernel, then you may need debuginfo packages for this to work. Once you get a gdb prompt, run:

(gdb) list *(cifs_writepages+0x35)

...obviously, you should replace the stuff in the parenthesis with whatever your oops message says. Pasting the list output can help developers help you.

Personal tools
Namespaces

Variants
Actions
Navigation
Tools