
Command Line JSON

I just stumbled across a wonderful tool: command line JSON validation/pretty-print.

https://docs.python.org/3/library/json.html#module-json.tool

I often work with JSON with our routers. I use curl to read from our API endpoints which return JSON. Getting back large blobs of JSON is useful but hard to read all jumbled together.

% curl --basic --user admin:${CP_PASSWORD} http://172.16.22.1/api/status/wlan/radio/0

Command line output of router wifi survey api output.

Now I can instead pipe the output to python3 -m json.tool, and the JSON comes back cleanly formatted and human-readable.

C ‘enum’ is Annoying.

Writing a blog post is hard. “I’ll do it later,” I keep thinking. Maybe it’s like flossing–“I’ll do it later.” Next thing I know I’m spraying blood onto the ceiling while the hygienist tut-tuts about my bad habits. I floss (write) now and my future self will thank me.

I’m tinkering with nl80211, the successor to WEXT. WEXT is the Linux Wireless Extensions, a set of standardized ioctls that let userspace talk to the kernel-level wireless drivers. WEXT is amazing but limited, so the smart people got together and created a much more flexible system built on Netlink.

As I’m tinkering with nl80211, I’ve discovered a frustratingly extensive use of C enum. Using C enum in a network protocol is frustrating because with a large enum (say, more than 10 elements), converting an integer in a debug message back to the actual enum symbol is well nigh impossible.

Case in point, the nl80211.h enum nl80211_attrs is ~400 lines and about 260 elements. In my little nl80211 baby-steps code, I fetch NL80211_CMD_GET_INTERFACE and get back an array filled with attribute + value.

        int i;
        for (i=0 ; i<NL80211_ATTR_MAX ; i++ ) {
                if (tb_msg[i]) {
                        printf("%d=%p type=%d len=%d\n", i, (void *)tb_msg[i], nla_type(tb_msg[i]), nla_len(tb_msg[i]));
                        hex_dump("msg", (unsigned char *)tb_msg[i], nla_len(tb_msg[i]));
                }
        }

Again this is baby steps code. I have no idea what I’m actually doing. I’m poking the box.

46=0x14db108 type=46 len=4
msg 0x00001080 08 00 2e 00 ....
206=0x14db110 type=206 len=1
msg 0x00001090 05 . 
217=0x14db120 type=217 len=4 
msg 0x000010a0 08 00 d9 00 ....
256=0x14db118 type=256 len=4
msg 0x000010b0 08 00 00 01 ....
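
A side note on reading these dumps: since hex_dump() is handed the tb_msg[i] pointer itself, the bytes shown look like the 4-byte netlink attribute header rather than the payload. The sketch below decodes such a header by hand. The struct and function names are made up for illustration (they are not libnl API), and it assumes a little-endian host like the x86 box used here, since netlink fields are in host byte order.

```c
#include <stdint.h>

/* Hypothetical mirror of the netlink attribute header: a 16-bit total
 * length (header + payload) followed by a 16-bit attribute type. */
#define DEMO_NLA_HDRLEN 4

struct demo_nlattr_hdr {
    uint16_t len;   /* total length, including the 4-byte header */
    uint16_t type;  /* attribute type, e.g. a value from enum nl80211_attrs */
};

/* Decode the first four bytes of an attribute.
 * Assumption: the bytes came from a little-endian host. */
static struct demo_nlattr_hdr decode_nlattr_hdr(const unsigned char *bytes)
{
    struct demo_nlattr_hdr h;

    h.len  = (uint16_t)(bytes[0] | (bytes[1] << 8));
    h.type = (uint16_t)(bytes[2] | (bytes[3] << 8));
    return h;
}
```

Feeding it the first dump, 08 00 2e 00, gives a total length of 8 (4 bytes of header plus the 4-byte payload reported by nla_len()) and a type of 0x2e, which is the 46 in the printf line.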

Now I have a list of attributes coming back from the call. What the foo is 217? 256? A visit to gdb will give me answers! The last element in the enum is NL80211_ATTR_PORT_AUTHORIZED, so what is its value? Printing the bare symbol just echoes the symbol name. Printing as hex (p/x) or as decimal (p/d) shows me the numerical value.

(gdb) p NL80211_ATTR_PORT_AUTHORIZED
$1 = NL80211_ATTR_PORT_AUTHORIZED
(gdb) p/x NL80211_ATTR_PORT_AUTHORIZED 
$2 = 0x103
(gdb) p/d NL80211_ATTR_PORT_AUTHORIZED 
$3 = 259

To find the symbol name for my integer, I can do the reverse in gdb. The typecast converts an integer to the enum type.

(gdb) p (enum nl80211_attrs)46
$6 = NL80211_ATTR_GENERATION
(gdb) p (enum nl80211_attrs)206
$7 = NL80211_ATTR_MAX_CSA_COUNTERS
(gdb) p (enum nl80211_attrs)217
$8 = NL80211_ATTR_EXT_FEATURES
(gdb) p (enum nl80211_attrs)256
$9 = NL80211_ATTR_SCHED_SCAN_MAX_REQS

Having to dig into gdb for every enum in nl80211.h will be tiring. The problems grow in other header files that are nests of #ifdefs.

I personally prefer using #define for symbols like this. The explicit link between a symbol and its value, right there in the source, is helpful.
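
For enums I control myself, one way to avoid the gdb round trip is to generate a name table from the same list that defines the enum (the "X-macro" trick). This is only a sketch with a made-up four-entry enum. The DEMO_ names are hypothetical and are not the real nl80211.h contents, and the trick does not apply directly to a header I can’t edit.

```c
/* Hypothetical attribute list, NOT the real nl80211.h contents. */
#define DEMO_ATTRS(X)     \
    X(DEMO_ATTR_UNSPEC)   \
    X(DEMO_ATTR_WIPHY)    \
    X(DEMO_ATTR_IFINDEX)  \
    X(DEMO_ATTR_IFNAME)

/* Expand the list once as enum members (values 0, 1, 2, ...)... */
#define DEMO_AS_ENUM(name) name,
enum demo_attr { DEMO_ATTRS(DEMO_AS_ENUM) DEMO_ATTR_COUNT };

/* ...and once as strings, so the two can never drift apart. */
#define DEMO_AS_STRING(name) #name,
static const char *const demo_attr_names[] = { DEMO_ATTRS(DEMO_AS_STRING) };

/* Return the symbol name for a value, or "unknown" if out of range. */
static const char *demo_attr_name(int value)
{
    if (value < 0 || value >= DEMO_ATTR_COUNT)
        return "unknown";
    return demo_attr_names[value];
}
```

A debug printf can then print demo_attr_name(i) next to the raw integer instead of leaving me to count enum lines.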

 

Continuing to Debug an OpenSSL Build

Continuing to tinker with the OpenSSL problems I’ve been having. I have a chunk of software that requires the older version of OpenSSL to be installed in order to build. But I want the newest OpenSSL dev package on my machine. We’re not going to get along.

I discovered I could selectively static link some of the libraries into the program. This was a completely mind-blowing feature to me. I started using the GCC linker way, way back and was quite disconcerted by the -lfoo syntax, which would find libfoo.a for linking.

In trying to force this software to build against a local copy of the older OpenSSL, I found problems with the final program trying to find the OpenSSL dynamic libraries. LD_LIBRARY_PATH would allow me to aim the software at my local lib but I was hoping there would be something simpler (that didn’t require setting that env var every time). Could I static link the executable? It’s been a while since I tried to static link a Linux app.

A quick Google search found an even better answer:

https://stackoverflow.com/questions/6578484/telling-gcc-directly-to-link-a-library-statically

In the Makefile, I simply had to use -l:filename to link directly to a specific static library. With the -L flag, I could aim at a specific library location.

LDFLAGS += -lffi -lutil -lz -lm -lpthread -l:libssl.a -l:libcrypto.a -lrt -ldl

I’ve been using the GNU compilers for so many years and had no idea the linker could do this.

 

 

Debugging a Linux Hard Lockup

I’m building the linux-wireless-testing tree described in https://wireless.wiki.kernel.org/en/developers/documentation/git-guide. I have multiple MediaTek 7612 USB cards I’d like to get working. There’s exciting work going on in wireless to bring the MediaTek wifi into the mainline kernel.

I built the kernel on a Shuttle, loaded my new kernel. Plugged in the Alfa AWUS036ACM dongle, loaded up wpa_supplicant and my scripts. Desktop was unresponsive: hard lock up.

Power cycle. I’m on Fedora 28, which uses systemd and therefore journald. Running journalctl -b -1 didn’t show any kernel panic.

Debugging a hard lock up of a kernel module seems easier than a kernel boot. The netconsole module https://www.kernel.org/doc/Documentation/networking/netconsole.txt will let me see the kernel log messages leading up to the lock-up.

I also need to investigate the watchdogs https://www.kernel.org/doc/Documentation/lockup-watchdogs.txt

Netconsole.

% sudo modprobe netconsole @/enp3s0,5566@172.19.10.254/

[69677.819810] netconsole: unknown parameter '@/enp3s0,5566@172' ignored
[69677.819903] console [netcon0] enabled
[69677.819904] netconsole: network logging started

OK, what did I screw up? Is there a way I can tickle the kernel into logging a message to test my config? Not from userspace, but a Google search found the following useful post. (Oh, StackOverflow (and related), is there nothing you can’t do?)

https://serverfault.com/questions/140354/how-to-add-message-that-will-be-read-with-dmesg/140358#140358

Load/unload the module. Still not seeing any messages. Oh, right. I always forget this step. https://elinux.org/Debugging_by_printing#Netconsole_resources

echo 8 > /proc/sys/kernel/printk

Skipping over the “unknown parameter” problem for now and just setting the parameters manually. Here is what I have:

...trillian:coconut% cd /sys/kernel/config/netconsole/
...trillian:netconsole% ls
arthurdent/
...trillian:netconsole% cd arthurdent/
...trillian:arthurdent% ls
dev_name enabled extended local_ip local_mac local_port remote_ip remote_mac remote_port
...trillian:arthurdent% cat remote_mac
70:88:6b:81:ac:64
...trillian:arthurdent% cat remote_ip
172.16.17.181
...trillian:arthurdent% cat remote_port
5566
...trillian:arthurdent% cat local_ip
172.16.17.92
...trillian:arthurdent% cat local_port
6665
...trillian:arthurdent% cat dev_name
enp0s31f6

So now run netcat on arthurdent, my other machine.

nc -l -u 5566

I can test my netcat listener from trillian by sending a UDP packet:

ls | nc -u 172.16.17.181 5566
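
If nc isn’t installed on the sending machine, the same one-packet test is easy to do from a few lines of C. This is a minimal sketch, not anything netconsole itself uses. The function name send_udp_text is made up, and it only handles IPv4 dotted-quad addresses.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Send one UDP datagram containing msg to ip:port.
 * Returns the number of bytes sent, or -1 on error.
 * (Hypothetical helper for testing a UDP listener.) */
static ssize_t send_udp_text(const char *ip, unsigned short port, const char *msg)
{
    struct sockaddr_in dst;
    ssize_t sent;
    int fd;

    memset(&dst, 0, sizeof(dst));
    dst.sin_family = AF_INET;
    dst.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &dst.sin_addr) != 1)
        return -1;              /* not a valid IPv4 address string */

    fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;

    /* UDP is connectionless, so a single sendto() is the whole exchange. */
    sent = sendto(fd, msg, strlen(msg), 0,
                  (struct sockaddr *)&dst, sizeof(dst));
    close(fd);
    return sent;
}
```

Calling send_udp_text("172.16.17.181", 5566, "hello from trillian\n") should make the text appear in the nc -l -u 5566 window on arthurdent, assuming nothing in between filters the port.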

And now time to turn the crank and watch the chaos unfold. Start up my wpa_supplicant and wpa_cli script and boom crash.

[ 546.841947] wlp0s20f0u1u2: authenticate with c4:b9:cd:dc:48:40
[ 547.140202] wlp0s20f0u1u2: send auth to c4:b9:cd:dc:48:40 (try 1/3)
[ 547.140226] BUG: unable to handle kernel NULL pointer dereference at 0000000000000011
[ 547.140228] PGD 800000083e025067 P4D 800000083e025067 PUD 83e072067 PMD 0
[ 547.140233] Oops: 0000 [#1] SMP PTI
[ 547.140236] CPU: 2 PID: 2503 Comm: wpa_supplicant Not tainted 4.19.0-rc2-wt #1
[ 547.140238] Hardware name: Shuttle Inc. SZ170/FZ170, BIOS 2.09 08/01/2017
[ 547.140258] RIP: 0010:ieee80211_wake_txqs+0x1e3/0x3d0 [mac80211]
[ 547.140261] Code: 4c 89 fe 4c 89 ef 48 8b 92 b0 02 00 00 e8 45 d6 2e f4 48 8b 3c 24 e8 0c bc 00 f4 48 83 c5 08 48 3b 6c 24 08 74 a0 4c 8b 7d 00 <41> 0f b6 57 11 3b 54 24 18 75 e6 4d 8d a7 28 ff ff ff f0 49 0f ba
[ 547.140263] RSP: 0018:ffff95038ea83ed0 EFLAGS: 00010293
[ 547.140265] RAX: ffff95038419e978 RBX: ffff95038419e000 RCX: ffff950384b18760
[ 547.140267] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff950384b18828
[ 547.140269] RBP: ffff95038419e970 R08: 00039e862fcdb16e R09: ffff95038a9b4230
[ 547.140271] R10: 0000000000000420 R11: ffffb8efc46c78d0 R12: ffff9503798251d0
[ 547.140273] R13: ffff950384b18760 R14: 0000000000000000 R15: 0000000000000000
[ 547.140275] FS: 00007f42ed60cdc0(0000) GS:ffff95038ea80000(0000) knlGS:0000000000000000
[ 547.140277] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 547.140279] CR2: 0000000000000011 CR3: 0000000835b30001 CR4: 00000000003606e0
[ 547.140281] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 547.140283] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 547.140284] Call Trace:

(Crash truncated here.) I posted to the linux-wireless mailing list and am now getting some help.