autoreconf

I love GNU autoconf. I remember the days of downloading a .zip or a .tar.Z of source and having to manually edit a config.h, full of symbols I didn’t understand. GNU autoconf came along and now we just ./configure && make && make install.

Building Kismet from source, I ran into trouble with libusb, which is not installed on my Linux laptop. I hadn’t installed libusb because I hadn’t needed it yet. Building Kismet on another machine (with libusb installed) worked fine.

./configure
...
checking for libusb... no
configure: error: Package requirements (libusb-1.0) were not met:

Package 'libusb-1.0', required by 'virtual:world', not found

I was curious about Kismet’s modularity. OK, probably just need to check the configure flags. Usually a properly modular program would allow me to disable USB.

% ./configure --help...
 --disable-usb Disable libUSB support
...

However, “./configure --disable-usb” still gave me the libusb error. Puzzling. I finally noticed that ./configure was printing a warning right after starting.

% ./configure --disable-usb
configure: WARNING: unrecognized options: --disable-usb
...

Debugging the generated configure script is a pain. Time to dust off my ancient autoconf knowledge. The configure script is generated from configure.ac. I opened that up and searched for disable-usb:

AC_ARG_ENABLE(libusb,
  AS_HELP_STRING([--disable-usb], [Disable libUSB support]),

The help string is there. But why doesn’t it work? Is it something simple?

 AC_ARG_ENABLE(libusb,
- AS_HELP_STRING([--disable-usb], [Disable libUSB support]),
+ AS_HELP_STRING([--disable-libusb], [Disable libUSB support]),
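
As far as the autoconf manual describes it, the option name comes from the first argument of AC_ARG_ENABLE, not from the help string. AC_ARG_ENABLE(libusb, ...) makes configure accept --enable-libusb and --disable-libusb, while the AS_HELP_STRING text is only what --help prints. The sketch below is a simplified, hypothetical model of that behaviour in plain shell; check_flag and DECLARED are names made up for illustration and are not part of autoconf.

```shell
#!/bin/sh
# Simplified, hypothetical model of configure's --enable/--disable handling.
# DECLARED stands in for the feature names given as the first argument of
# AC_ARG_ENABLE in configure.ac. The help strings play no part here.
DECLARED="libusb"

check_flag() {
    flag=$1
    case $flag in
        --disable-*) feature=${flag#--disable-}; value=no ;;
        --enable-*)  feature=${flag#--enable-};  value=yes ;;
        *) echo "not a feature flag: $flag"; return 0 ;;
    esac
    for known in $DECLARED; do
        if [ "$known" = "$feature" ]; then
            # A declared feature ends up in an enable_<feature> variable.
            echo "enable_${feature}=${value}"
            return 0
        fi
    done
    # An undeclared feature only produces a warning; configure carries on.
    echo "WARNING: unrecognized options: $flag"
}

check_flag --disable-usb      # the flag advertised by the stale help text
check_flag --disable-libusb   # the flag the macro actually declares
```

If that model is right, --disable-libusb would have been accepted even before the help string was corrected; the edit to configure.ac just makes --help tell the truth.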

It might be this simple. Now how do I rebuild the configure script from the configure.ac? It’s been a long time but I remember a magic ‘autoreconf’.

% sudo dnf install autoconf automake
% autoreconf

The autoreconf failed with a complaint about AC_PYTHON_MODULE macro. (I’ve lost the actual message to the mists of scrollback.) Autoconf is built around m4. A quick Google search for AC_PYTHON_MODULE leads to an m4 macro library: https://www.gnu.org/software/autoconf-archive/

Download the tarball, ./configure && make && make install, and then try autoreconf again. Works! git diff shows the Kismet configure script updated. Run configure again with --disable-libusb and no complaints.

 

Debugging a Linux Hard Lockup

I’m building the linux-wireless-testing tree as described in https://wireless.wiki.kernel.org/en/developers/documentation/git-guide because I have multiple MediaTek 7612 USB cards I’d like to get working. There’s exciting work going on in wireless to bring MediaTek wifi support into the mainline kernel.

I built the kernel on a Shuttle and booted it. I plugged in the Alfa AWUS036ACM dongle and started wpa_supplicant and my scripts. The desktop went unresponsive: hard lockup.

Power cycle. I’m on Fedora 28, which uses systemd and therefore journald. Running journalctl -b -1 (the log from the previous boot) didn’t show any kernel panic.

Debugging a hard lockup triggered by a kernel module seems easier than debugging one during kernel boot. The netconsole module https://www.kernel.org/doc/Documentation/networking/netconsole.txt will let me see the kernel log messages leading up to the lockup.

I also need to investigate the watchdogs https://www.kernel.org/doc/Documentation/lockup-watchdogs.txt

Netconsole.

% sudo modprobe netconsole @/enp3s0,5566@172.19.10.254/

[69677.819810] netconsole: unknown parameter '@/enp3s0,5566@172' ignored
[69677.819903] console [netcon0] enabled
[69677.819904] netconsole: network logging started

OK, what did I screw up? Is there a way I can tickle the kernel into logging a message to test my config? Not from userspace, but a Google search found the following useful post. (Oh, StackOverflow (and related), is there nothing you can’t do?)

https://serverfault.com/questions/140354/how-to-add-message-that-will-be-read-with-dmesg/140358#140358

Load/unload the module. Still not seeing any messages. Oh, right. I always forget this step. https://elinux.org/Debugging_by_printing#Netconsole_resources

echo 8 > /proc/sys/kernel/printk

Skipping over the “unknown parameter” problem for now and just setting the parameters manually. Here is what I have:

...trillian:coconut% cd /sys/kernel/config/netconsole/
...trillian:netconsole% ls
arthurdent/
...trillian:netconsole% cd arthurdent/
...trillian:arthurdent% ls
dev_name enabled extended local_ip local_mac local_port remote_ip remote_mac remote_port
...trillian:arthurdent% cat remote_mac
70:88:6b:81:ac:64
...trillian:arthurdent% cat remote_ip
172.16.17.181
...trillian:arthurdent% cat remote_port
5566
...trillian:arthurdent% cat local_ip
172.16.17.92
...trillian:arthurdent% cat local_port
6665
...trillian:arthurdent% cat dev_name
enp0s31f6
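
For reference, a target like this is created through configfs. The sketch below follows the steps in the kernel's netconsole documentation (mkdir the target, write each attribute, then enable it) using the values shown above. setup_netconsole_target is a hypothetical helper name, and the base directory is a parameter so the function can be tried against a scratch directory first; against the real /sys/kernel/config/netconsole it needs root and a loaded netconsole module.

```shell
#!/bin/sh
# Sketch: populate a netconsole configfs target.
# setup_netconsole_target is a hypothetical helper; the values are the ones
# from the listing above.
setup_netconsole_target() {
    base=$1      # e.g. /sys/kernel/config/netconsole (needs root)
    name=$2      # target directory name, e.g. arthurdent
    target="$base/$name"

    # On real configfs, mkdir makes the kernel create the attribute files.
    mkdir -p "$target"

    # Attributes can only be changed while the target is disabled, which
    # is the state of a freshly created target.
    echo enp0s31f6         > "$target/dev_name"
    echo 172.16.17.92      > "$target/local_ip"
    echo 6665              > "$target/local_port"
    echo 172.16.17.181     > "$target/remote_ip"
    echo 5566              > "$target/remote_port"
    echo 70:88:6b:81:ac:64 > "$target/remote_mac"

    # Enable last, once everything else is set.
    echo 1 > "$target/enabled"
}

# Real use (as root):
#   setup_netconsole_target /sys/kernel/config/netconsole arthurdent
# Dry run against a scratch directory:
scratch=$(mktemp -d)
setup_netconsole_target "$scratch" arthurdent
cat "$scratch/arthurdent/remote_ip"
```

The same values could presumably be passed at load time as well. The documented module parameter format is netconsole=[+][src-port]@[src-ip]/[<dev>],[tgt-port]@<tgt-ip>/[tgt-macaddr], so the earlier “unknown parameter” message was probably caused by leaving off the netconsole= prefix.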

So now run netcat on arthurdent, my other machine.

nc -l -u 5566

I can test my netcat listener from trillian by sending a UDP packet:

ls | nc -u 172.16.17.181 5566

And now time to turn the crank and watch the chaos unfold. Start up my wpa_supplicant and wpa_cli script and boom: crash.

[ 546.841947] wlp0s20f0u1u2: authenticate with c4:b9:cd:dc:48:40
[ 547.140202] wlp0s20f0u1u2: send auth to c4:b9:cd:dc:48:40 (try 1/3)
[ 547.140226] BUG: unable to handle kernel NULL pointer dereference at 0000000000000011
[ 547.140228] PGD 800000083e025067 P4D 800000083e025067 PUD 83e072067 PMD 0
[ 547.140233] Oops: 0000 [#1] SMP PTI
[ 547.140236] CPU: 2 PID: 2503 Comm: wpa_supplicant Not tainted 4.19.0-rc2-wt #1
[ 547.140238] Hardware name: Shuttle Inc. SZ170/FZ170, BIOS 2.09 08/01/2017
[ 547.140258] RIP: 0010:ieee80211_wake_txqs+0x1e3/0x3d0 [mac80211]
[ 547.140261] Code: 4c 89 fe 4c 89 ef 48 8b 92 b0 02 00 00 e8 45 d6 2e f4 48 8b 3c 24 e8 0c bc 00 f4 48 83 c5 08 48 3b 6c 24 08 74 a0 4c 8b 7d 00 <41> 0f b6 57 11 3b 54 24 18 75 e6 4d 8d a7 28 ff ff ff f0 49 0f ba
[ 547.140263] RSP: 0018:ffff95038ea83ed0 EFLAGS: 00010293
[ 547.140265] RAX: ffff95038419e978 RBX: ffff95038419e000 RCX: ffff950384b18760
[ 547.140267] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff950384b18828
[ 547.140269] RBP: ffff95038419e970 R08: 00039e862fcdb16e R09: ffff95038a9b4230
[ 547.140271] R10: 0000000000000420 R11: ffffb8efc46c78d0 R12: ffff9503798251d0
[ 547.140273] R13: ffff950384b18760 R14: 0000000000000000 R15: 0000000000000000
[ 547.140275] FS: 00007f42ed60cdc0(0000) GS:ffff95038ea80000(0000) knlGS:0000000000000000
[ 547.140277] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 547.140279] CR2: 0000000000000011 CR3: 0000000835b30001 CR4: 00000000003606e0
[ 547.140281] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 547.140283] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 547.140284] Call Trace:

(Crash truncated here.) I posted this to the linux-wireless mailing list and am now getting some help.
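
Even with the call trace cut off, the RIP line already says where the fault happened. As a small sketch (parse_rip is a hypothetical helper, and the sed pattern only assumes the symbol+offset/size [module] layout seen in this log), the pieces can be pulled apart like this:

```shell
#!/bin/sh
# Sketch: split an oops RIP line into function, offset, size and module.
# parse_rip is a hypothetical helper; it only handles lines shaped like
# "RIP: 0010:symbol+0xOFF/0xSIZE [module]".
parse_rip() {
    printf '%s\n' "$1" | sed -n \
        's|.*RIP: [0-9a-f]*:\([A-Za-z0-9_.]*\)+\(0x[0-9a-f]*\)/\(0x[0-9a-f]*\) \[\(.*\)]$|function=\1 offset=\2 size=\3 module=\4|p'
}

parse_rip '[ 547.140258] RIP: 0010:ieee80211_wake_txqs+0x1e3/0x3d0 [mac80211]'
# prints: function=ieee80211_wake_txqs offset=0x1e3 size=0x3d0 module=mac80211
```

The offset is the position inside the function and the second number is the function's total size. CR2 holds the faulting address, 0x11, which looks like a read of a field at offset 0x11 through a NULL pointer. With a mac80211.ko built with debug info, the kernel tree's scripts/faddr2line should be able to turn ieee80211_wake_txqs+0x1e3/0x3d0 into a source line.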