Sunday, January 18, 2009

Wireless Troubleshooting

Hardware Verification

The first critical step is to ensure that your wireless device is recognized by your system. There are a variety of methods to verify that your system did this successfully. Here are some methods:

*
The “dmesg” command can quite often contain detailed messages indicating that the wirelss devices was properly detected.
*
If the card is an ISA card, you are usually out of luck.
*
If the card is a PCI card (miniPCI/miniPCI Express/PCI Express), you need to use the command “lspci” to display the card identification strings.
*
If the hardware is a USB dongle, you need to use the command “lsusb” to display the dongle identification strings. In some case, “lsusb” doesn't work (for example if usbfs is not mounted), and you can get the identification strings from the kernel log using “dmesg” (or in /var/log/messages).
*
If the card is a Cardbus card (32 bits Pcmcia), and if you are using kernel 2.6.X or kernel 2.4.X with the kernel Pcmcia subsystem, you need to use the command “lspci” to display the card identification strings. If the card is a Cardbus card (32 bits Pcmcia), and if you are using an older kernel with the standalone Pcmcia subsystem, you need to use the command “cardctl ident” display the card identification strings. Try both and see what comes out.
*
If the card is a true Pcmcia card (16 bits), and if you are using kernel 2.6.14 or later, you need to use the command “pccardctl ident” to display the card identification strings. If the card is a true Pcmcia card (16 bits), and if you are using an older kernel, you need to use the command “cardctl ident” display the card identification strings. Note that cardmgr will also write some identification strings in the message logs (/var/log/daemon.log) that may be different from the real card identification strings.

Needless to say, if your wireless device is not detected by your system, you will have to investigate and correct the problem.
Modprobe

Start by running “modprobe ”.
View iwconfig output

Run the “iwconfig” command and look for wireless devices. Based on the driver, look for an appropriately named interface such as ath0, rausb0, etc. The presence indicates that at least the driver is loaded. The absence likely means it did not. This at least gives you a starting point on the problem solving.

A common problem is that your system has both ieee80211 and mac80211 versions of the drivers. Having wmaster0 typically indicates you are using the new mac80211 drivers. Having wifi0 or eth0 typically means you are using the older (legacy) ieee80211 drivers. Having both wmaster0 and wifi0/eth0 (as well as weird interface names like wlan0_rename) might indicate a udev problem. Based on what which ones you really want, you may have to blacklist or move one or more drivers.
View dmesg output

Run the “dmesg” command and look for errors relating to your wireless device. At a minimum there should be some messages relating to your device loading and the module initializing it. If there are no messages or errors, you will have to investigate and correct the problem.

See the next entry of a problem commonly seen: “unknown symbol”.
"unknown symbol" error

When loading the driver kernel module you get a “unknown symbol” error message for one more field names. Sometimes you will see this in the dmesg output as well. This is caused by module you are loading not being matching the kernel version you are running.

First, determine which kernel you are running with “uname -r”. Then use your package manager to determine if you have kernels, kernel headers or kernel development packages that are older.

If you use the RPM package manager then “rpm -qa | grep kernel”. So if you get something like:

kernel-headers-2.6.24.4-64.fc8
kernel-2.6.24.4-64.fc8
kernel-devel-2.6.24.4-64.fc8
kernel-headers-2.6.24.1-15.fc8
kernel-2.6.24.1-15.fc8
kernel-devel-2.6.24.1-15.fc8

In the example above, there are kernel headers and a kernel development package that match the kernel we are running. If you are missing them, the use yum or equivalent on your distribution to install them such as:

yum -y install kernel-headers
yum -y install kernel-devel

Lets assume that “uname -r” returned “2.6.24.4-64.fc8” then all the 2.6.24.1-15 ones are old and need to be removed. So you remove all the old ones:

rpm -e 2.6.24.4-64.fc8
rpm -e kernel-2.6.24.1-15.fc8
rpm -e kernel-devel-2.6.24.1-15.fc8

Also change to ”/lib/modules” and do a directory listing and remove any directory referring to old kernel versions.

Once you are finished, you can do ”“rpm -qa | grep kernel” and confirm everything looks good. At this point, recompile your wireless drivers and reboot the system.
View lsmod output

Run the “lsmod” command can be used to see the loaded modules. Confirm that the kernel module for your wireless device is actually loaded. If it is not loaded, you will have to investigate and correct the problem.

Sometimes other modules conflict with the one you are trying to run. See blacklisting below. Additionally, conflicting modules can be moved out of the module tree. If you do this, run “depmod -ae” afterwards.
View modinfo output

Run “modinfo ”. This will confirm the module is actually in the modules tree. As well, confirm it is the correct version. Do a “ls -l ” and confirm the date matches when you compiled it. It is not uncommon that you are not running the correct module version.
Blacklisting

A common problem on newer kernels is that the new mac80211 version of the driver gets loaded instead of the older legacy driver, or vice versa. If that is the case, then you need to blacklist the wrong modules by editing /etc/modprobe.d/blacklist. First, determine the broken module names and add them to the blacklist file as “blacklist ”.

Specifically for madwifi-ng, do a locate or find for ath5k.ko. If ath5k.ko exists then add “blacklist ath5k” to /etc/modprobe.d/blacklist and reboot. Same for the other way around: if you want to load ath5k, but madwifi-ng gets loaded instead, add “blacklist ath_pci” to /etc/modprobe.d/blacklist.
Reload Driver

Although it is not very “scientific”, sometimes simply unloading then reloading the driver will get it working. This is done with the rmmod and modprobe commands.

For b43 and b43legacy, it might also be necessary to reload the underlying SSB module. Similarly, rt2x00 and p54 might need reloading of the common modules (p54common, rt2x00lib, rt2x00usb, rt2x00pci). Sometimes (especially with mac80211 drivers), reloading the stack (for example, modules “cfg80211” and “mac80211”) might do the trick.

No comments: