Cross Compiling Rust (Notes)

I’ve been fiddling with cross compiling Rust to our ARM-64 platforms. I’m hitting a wall because Rust elfs require the system loader and C libs. I’m writing these notes to document what I’ve found so far. Would like to come back and discover how to build a static ARM-64 executable. Started here: https://github.com/japaric/rust-cross

Step 1. Install Rust. https://www.rust-lang.org/tools/install

Step 2. Install a C cross compiler. Rust uses the C linker.

% sudo apt install gcc-aarch64-linux-gnu 
or 
% sudo dnf install gcc-aarch64-linux-gnu

Step 3. Add the cross compiler targets to Rust.

rustup target add aarch64-unknown-linux-gnu
rustup target add aarch64-unknown-linux-musl

Step 4. Add to the cargo config

…marvin:hello-rust% cat ~/.cargo/config
[target.aarch64-unknown-linux-gnu]
linker = "aarch64-linux-gnu-gcc"
[target.aarch64-unknown-linux-musl]
linker = "aarch64-linux-gcc"

Step 5. Create a simple new project.

cargo new --bin hello-rust
cd hello-rust

Step 6. Build

cross build --target aarch64-unknown-linux-gnu

Examining the final executable, I find it requires the glibc loader and libraries. We’re using uClibc for historical reasons, so a glibc-linked binary is no use on our firmware.

% file ./target/aarch64-unknown-linux-gnu/debug/hello-rust
./target/aarch64-unknown-linux-gnu/debug/hello-rust: ELF 64-bit LSB shared object, ARM aarch64, version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux-aarch64.so.1, BuildID[sha1]=f62e92025124bc8018570c700b6e5faeecfdd671, for GNU/Linux 3.7.0, with debug_info, not stripped
% readelf -a ./target/aarch64-unknown-linux-gnu/debug/hello-rust
Program Headers:
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
PHDR 0x0000000000000040 0x0000000000000040 0x0000000000000040
0x0000000000000230 0x0000000000000230 R 0x8
INTERP 0x0000000000000270 0x0000000000000270 0x0000000000000270
0x000000000000001b 0x000000000000001b R 0x1
[Requesting program interpreter: /lib/ld-linux-aarch64.so.1]
[snip]
Dynamic section at offset 0x3aac8 contains 30 entries:
Tag Type Name/Value
0x0000000000000001 (NEEDED) Shared library: [libdl.so.2]
0x0000000000000001 (NEEDED) Shared library: [libpthread.so.0]
0x0000000000000001 (NEEDED) Shared library: [libgcc_s.so.1]
0x0000000000000001 (NEEDED) Shared library: [libc.so.6]
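
The INTERP entry that readelf prints can also be checked from a script. What follows is a minimal sketch, assuming a little-endian ELF64 image like the aarch64 binaries above; elf_interpreter is a made-up helper name, not part of any existing tool. It walks the program headers looking for PT_INTERP and returns the loader path, or None when there is no interpreter entry.

```python
import struct

PT_INTERP = 3  # program header type that holds the interpreter (loader) path


def elf_interpreter(data: bytes):
    """Return the PT_INTERP path of a little-endian ELF64 image, or None."""
    # e_ident: magic, then EI_CLASS (2 = 64-bit) and EI_DATA (1 = little-endian)
    if len(data) < 64 or data[:4] != b"\x7fELF" or data[4] != 2 or data[5] != 1:
        raise ValueError("not a little-endian ELF64 image")
    (e_phoff,) = struct.unpack_from("<Q", data, 0x20)
    e_phentsize, e_phnum = struct.unpack_from("<HH", data, 0x36)
    for index in range(e_phnum):
        base = e_phoff + index * e_phentsize
        (p_type,) = struct.unpack_from("<I", data, base)
        (p_offset,) = struct.unpack_from("<Q", data, base + 8)
        (p_filesz,) = struct.unpack_from("<Q", data, base + 32)
        if p_type == PT_INTERP:
            # The path is stored as a NUL-terminated string inside the file.
            return data[p_offset:p_offset + p_filesz].rstrip(b"\0").decode()
    return None
```

Feeding it the bytes of the glibc build should give back the same loader path that readelf reports; a fully static build should give None.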

I tried the Rust musl cross compiler but hit a wall with some compile errors around, I think, hard/soft floats. The musl cross compiler I found here: https://toolchains.bootlin.com/releases_aarch64.html

There are notes explaining that musl binaries can be statically linked: https://github.com/japaric/rust-cross#how-do-i-compile-a-fully-statically-linked-rust-binaries

I should google those errors. Found:

https://github.com/rust-lang/rust/issues/46651

Looks like it might be fixed but not in a released Rust? Need to investigate further.

Update: Installed the nightly build and voila! static link musl elf.

% rustup toolchain install nightly
% rustup default nightly-x86_64-unknown-linux-gnu
% cross build --target aarch64-unknown-linux-musl
% file ./target/aarch64-unknown-linux-musl/debug/hello-rust
./target/aarch64-unknown-linux-musl/debug/hello-rust: ELF 64-bit LSB executable, ARM aarch64, version 1 (SYSV), statically linked, with debug_info, not stripped

Copied the resulting elf to my firmware. IT WORKS! It’s big, but it works.

/var/tmp # ls -l hello-rust
 -rwxr--r--    1 root     root       3878608 Oct 26 12:55 hello-rust*
 /var/tmp # ./hello-rust
 Hello, world!

Message sequence number mismatch in libnl.

I ran into a problem with libnl and sequence numbers while working on my wifi scanner application. My app would subscribe to the NL80211_CMD_NEW_SCAN_RESULTS and NL80211_CMD_SCHED_SCAN_RESULTS events. On receiving those events, I would send a NL80211_CMD_GET_SCAN.

Sometimes, I would start getting a -NLE_SEQ_MISMATCH on receiving the scan survey data. Once that occurred, I would receive no more scan data.

The libnl sequence numbers are described briefly here: https://www.infradead.org/~tgr/libnl/doc/core.html#core_seq_num They’re a simple way to associate a request with a response. They’re not bulletproof and do not claim to be.

The sequence numbers are tracked per socket in the private ‘struct nl_sock’. Both counters, s_seq_next and s_seq_expect, are initialized to time(0) in __alloc_socket().

struct nl_sock
{
...
     unsigned int s_seq_next;
     unsigned int s_seq_expect;
...
};

The nlmsghdr contains the sequence number. Note the same sequence number can be used multiple times. When there is more data than can fit in a single nl_msg, the data is broken across multiple nl_msg, indicated by flag NLM_F_MULTI (“Multipart message, terminated by NLMSG_DONE”).

struct nlmsghdr {
        __u32           nlmsg_len;      /* Length of message including header */
        __u16           nlmsg_type;     /* Message content */
        __u16           nlmsg_flags;    /* Additional flags */
        __u32           nlmsg_seq;      /* Sequence number */
        __u32           nlmsg_pid;      /* Sending process port ID */
};

The nlmsghdr->nlmsg_seq is assigned in nl_complete_msg() which is called before the nl_msg is sent to the nl_sock. The socket ‘next’ is incremented at this time.

        if (nlh->nlmsg_seq == NL_AUTO_SEQ) { 
                nlh->nlmsg_seq = sk->s_seq_next++;
                NL_DBG(3, "nl_complete_msg(%p): Increased next " \
                           "sequence number to %d\n",
                           sk, sk->s_seq_next);
        }

The sequence number is checked in recvmsgs(), which is the core of libnl’s nl_msg receive handling. The recvmsgs() is responsible for calling several callbacks and for checking sequence numbers.


                if (hdr->nlmsg_type == NLMSG_DONE ||
                    hdr->nlmsg_type == NLMSG_ERROR ||
                    hdr->nlmsg_type == NLMSG_NOOP ||
                    hdr->nlmsg_type == NLMSG_OVERRUN) {
                        /* We can't check for !NLM_F_MULTI since some netlink
                         * users in the kernel are broken. */
                        sk->s_seq_expect++;
                        NL_DBG(3, "recvmsgs(%p): Increased expected " \
                               "sequence number to %d\n",
                               sk, sk->s_seq_expect);
                }

When the NLMSG_DONE is received, the expected sequence number is increased. If that DONE or ERROR isn’t received, the expected sequence number is never incremented.

The sequence number seq_next is advanced when an outgoing nl_msg is completed for sending. The sequence number seq_expect is advanced when an incoming nl_msg is DONE or ERROR (or NOOP or OVERRUN, which I haven’t encountered yet).

The sequence number checking occurs also in recvmsgs(). In my code, I’m not using the NL_CB_SEQ_CHECK callback and leaving auto-ack mode enabled, so the sequence number checking in recvmsgs() is enforced. (As I’m still learning libnl, I was using the same pattern as the iw library.) Note this check happens before the DONE+ERROR check which increments the seq_expect.

                /* Sequence number checking. The check may be done by
                 * the user, otherwise a very simple check is applied
                 * enforcing strict ordering */
                if (cb->cb_set[NL_CB_SEQ_CHECK]) {
                        NL_CB_CALL(cb, NL_CB_SEQ_CHECK, msg);

                /* Only do sequence checking if auto-ack mode is enabled */
                } else if (!(sk->s_flags & NL_NO_AUTO_ACK)) {
                        NL_DBG(3, "recvmsgs(%p) : nlmsg_seq=%d s_seq_expect=%d\n", 
                                        sk, hdr->nlmsg_seq, sk->s_seq_expect);
                        if (hdr->nlmsg_seq != sk->s_seq_expect) {
                                if (cb->cb_set[NL_CB_INVALID])
                                        NL_CB_CALL(cb, NL_CB_INVALID, msg);
                                else {
                                        err = -NLE_SEQ_MISMATCH;
                                        nl_msg_dump(msg, stdout);
                                        goto out;
                                }
                        }
                }

A simple transaction could look like:

sk->seq_next  sk->seq_expect  hdr->seq  notes
72968         72968                     new nl_msg; assigned 72968; seq_next++
72969         72968           72968     (MULTI)
72969         72968           72968     (MULTI)
72969         72968           72968     (MULTI+DONE); seq match! seq_expect++
72969         72969

Moving on to the problem I encountered. The trigger of the problem is a 2nd CMD_NEW_SCAN_RESULTS or CMD_SCHED_SCAN_RESULTS received while already reading a CMD_GET_SCAN. The kernel doesn’t like interleaving the get-scan-results apparently so refuses with an -EBUSY error. The request increments the sk->seq_next but the error response nl_msg->seq doesn’t match the sk->seq_expect (which is tracking the previous request) and so the nl_msg is dropped before hitting the DONE check that would increment sk->seq_expect.

Once the sequence numbers get into this state, there is no exit. The nl_socket is perpetually at the wrong sequence number. The only solution is to close/re-open the socket on receiving a -NLE_SEQ_MISMATCH.
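
The stuck state is easier to see with the two counters modelled on their own. This is a toy sketch in Python, not libnl code: ToySocket, send() and receive() are made-up names, and the message ordering in run_scenario() is just one plausible interleaving of the EBUSY case described above. The model keeps the same order as recvmsgs(): compare first, increment afterwards.

```python
from dataclasses import dataclass

# Message types that bump seq_expect in recvmsgs()
TERMINAL = {"DONE", "ERROR", "NOOP", "OVERRUN"}


@dataclass
class ToySocket:
    """Toy model of the two per-socket counters (not libnl's API)."""
    seq_next: int
    seq_expect: int

    def send(self) -> int:
        # Like nl_complete_msg(): stamp the message, then advance seq_next.
        seq = self.seq_next
        self.seq_next += 1
        return seq

    def receive(self, seq: int, kind: str) -> str:
        # Like recvmsgs(): the sequence check runs before the increment.
        if seq != self.seq_expect:
            return "SEQ_MISMATCH"
        if kind in TERMINAL:
            self.seq_expect += 1
        return "OK"


def run_scenario() -> list:
    sk = ToySocket(seq_next=72968, seq_expect=72968)
    log = []
    first = sk.send()                          # GET_SCAN #1
    log.append(sk.receive(first, "MULTI"))     # first chunk of scan results
    second = sk.send()                         # GET_SCAN #2 while #1 is still running
    log.append(sk.receive(second, "ERROR"))    # EBUSY reply, checked against #1's number
    log.append(sk.receive(first, "DONE"))      # #1 finishes; seq_expect moves up by one
    third = sk.send()                          # GET_SCAN #3
    log.append(sk.receive(third, "MULTI"))     # seq_expect still points at rejected #2
    log.append((sk.seq_next, sk.seq_expect))
    return log
```

In this model the rejected request consumes a seq_next value that seq_expect never catches up to, so every later reply is one ahead of what the socket expects.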

A better solution would be to avoid getting into this state in the first place. Perhaps not send a new CMD_GET_SCAN while a previous fetch is already running. I’m still tinkering with solutions.
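
One way that idea could look, sketched in Python rather than in the C of the real application: a small gate that allows a single CMD_GET_SCAN in flight and remembers at most one deferred request. ScanFetchGate and its method names are hypothetical, and this only shows the bookkeeping; it has not been tried against a real nl80211 socket.

```python
class ScanFetchGate:
    """Hypothetical guard: one GET_SCAN in flight, at most one deferred."""

    def __init__(self):
        self.in_flight = False
        self.pending = False

    def on_scan_event(self) -> bool:
        """Call on NEW_SCAN_RESULTS / SCHED_SCAN_RESULTS.

        Returns True when the caller should send a GET_SCAN now.
        """
        if self.in_flight:
            self.pending = True   # remember it, but do not interleave requests
            return False
        self.in_flight = True
        return True

    def on_dump_finished(self) -> bool:
        """Call when the current dump ends (DONE or ERROR).

        Returns True when a deferred GET_SCAN should be sent next.
        """
        if self.pending:
            self.pending = False  # the follow-up request is now the one in flight
            return True
        self.in_flight = False
        return False
```

Several events arriving during one dump collapse into a single follow-up fetch, which seems acceptable since each GET_SCAN returns the full current scan table anyway.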

IEEE 802.11 Standards, Alphabetically.

IEEE802.11 Standards.

https://en.wikipedia.org/wiki/IEEE_802.11

One Letter, Alphabetical Order.

https://en.wikipedia.org/wiki/IEEE_802.11a-1999 IEEE Std 802.11a-1999: High-speed Physical Layer in the 5 GHz Band (Amendment 1)

https://en.wikipedia.org/wiki/IEEE_802.11b-1999 IEEE Std 802.11b-1999: Higher-Speed Physical Layer Extension in the 2.4 GHz Band (Amendment 2)

IEEE Std 802.11b-1999/Corrigendum 1-2001: Higher-speed Physical Layer (PHY) extension in the 2.4 GHz band (Corrigendum 1 to Amendment 2)

https://en.wikipedia.org/wiki/IEEE_802.11d-2001 IEEE Std 802.11d-2001: Specification for operation in additional regulatory domains (Amendment 3)

https://en.wikipedia.org/wiki/IEEE_802.11e-2005 IEEE Std 802.11e-2005: Medium Access Control (MAC) Quality of Service Enhancements (Amendment 8)

https://en.wikipedia.org/wiki/IEEE_802.11g-2003 IEEE Std 802.11g-2003: Further Higher Data Rate Extension in the 2.4 GHz Band (Amendment 4)

https://en.wikipedia.org/wiki/IEEE_802.11h-2003 IEEE Std 802.11h-2003: Spectrum and Transmit Power Management Extensions in the 5 GHz band in Europe (Amendment 5)

https://en.wikipedia.org/wiki/IEEE_802.11i-2004 IEEE Std 802.11i-2004: Medium Access Control (MAC) Security Enhancements (Amendment 6) TKIP.

https://en.wikipedia.org/wiki/IEEE_802.11j-2004 IEEE Std 802.11j-2004: 4.9 GHz–5 GHz Operation in Japan (Amendment 7)

https://en.wikipedia.org/wiki/IEEE_802.11k-2008 IEEE Std 802.11k-2008: Radio Resource Measurement of Wireless LANs (Amendment 1) RRM Radio Resource Management.

https://en.wikipedia.org/wiki/IEEE_802.11n-2009 IEEE Std 802.11n-2009: Enhancements for Higher Throughput (Amendment 5). AKA WiFi-4

https://en.wikipedia.org/wiki/IEEE_802.11p IEEE Std 802.11p-2010: Wireless Access in Vehicular Environments (Amendment 6)

https://en.wikipedia.org/wiki/IEEE_802.11r-2008 IEEE Std 802.11r-2008: Fast Basic Service Set (BSS) Transition (Amendment 2). Fast Roaming

https://en.wikipedia.org/wiki/IEEE_802.11s IEEE Std 802.11s-2011: Mesh Networking (Amendment 10)

https://en.wikipedia.org/wiki/IEEE_802.11u IEEE Std 802.11u-2011: Interworking with External Networks (Amendment 9). QoS.

https://en.wikipedia.org/wiki/IEEE_802.11v IEEE Std 802.11v-2011: Wireless Network Management (Amendment 8). WNM.

https://en.wikipedia.org/wiki/IEEE_802.11w-2009 IEEE Std 802.11w-2009: Protected Management Frames (Amendment 4). PMF.

https://en.wikipedia.org/wiki/IEEE_802.11y-2008 IEEE Std 802.11y-2008: 3650–3700 MHz Operation in USA (Amendment 3) Extra spectrum. Never caught on, no radios support it.

https://en.wikipedia.org/wiki/IEEE_802.11#802.11z IEEE Std 802.11z-2010: Extensions to Direct-Link Setup (DLS) (Amendment 7)

Two Letter ‘a’, Alphabetical Order. 

IEEE Std 802.11aa-2012: MAC Enhancements for Robust Audio Video Streaming (Amendment 2)

https://en.wikipedia.org/wiki/IEEE_802.11ac IEEE Std 802.11ac-2013: Enhancements for Very High Throughput for Operation in Bands below 6 GHz (Amendment 4) 

https://en.wikipedia.org/wiki/IEEE_802.11ad  IEEE Std 802.11ad-2012: Enhancements for Very High Throughput in the 60 GHz Band (Amendment 3)

IEEE Std 802.11ae-2012: Prioritization of Management Frames (Amendment 1)

https://en.wikipedia.org/wiki/IEEE_802.11af   IEEE Std 802.11af-2013: Television White Spaces (TVWS) Operation (Amendment 5)

https://en.wikipedia.org/wiki/IEEE_802.11ah  HaLow  900 MHz

https://en.wikipedia.org/wiki/IEEE_802.11ai  FILS Fast Initial Link Setup

https://en.wikipedia.org/wiki/IEEE_802.11#802.11aj China Millimeter Wave (CMMW) (45GHz)

https://en.wikipedia.org/wiki/IEEE_802.11#802.11aq pre-association discovery of services (extends 802.11u)

https://en.wikipedia.org/wiki/IEEE_802.11ax  HE (High Efficiency)  aka WiFi6

https://en.wikipedia.org/wiki/IEEE_802.11ay Second WiGig standard (60 GHz) (cf. 802.11ad)

Two Letter ‘b’, Alphabetical Order.

https://en.wikipedia.org/wiki/IEEE_802.11#802.11ba  Wake-up Radio (WUR) 

https://en.wikipedia.org/wiki/IEEE_802.11be Extremely High Throughput (EHT) aka WiFi7

Command Line JSON

I just stumbled across a wonderful tool: command line JSON validation/pretty-print.

https://docs.python.org/3/library/json.html#module-json.tool

I often work with JSON with our routers. I use curl to read from our API endpoints which return JSON. Getting back large blobs of JSON is useful but hard to read all jumbled together.

% curl --basic --user admin:${CP_PASSWORD} http://172.16.22.1/api/status/wlan/radio/0

[Screenshot: command line output of the router wifi survey API.]

Now instead I can pipe to python3 -m json.tool and the JSON will be cleanly formatted and humanly readable.
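
The same formatting is available from inside a script. A minimal sketch of roughly what json.tool does with its default settings: parse the text, then dump it back out with a four-space indent. The pretty() helper and the sample payload are made up for illustration; real router output will look different.

```python
import json


def pretty(raw: str) -> str:
    """Parse a JSON string and re-serialize it with a four-space indent."""
    return json.dumps(json.loads(raw), indent=4)


# Made-up payload standing in for an API response
sample = '{"radio": 0, "channels": [1, 6, 11]}'
print(pretty(sample))
```

A malformed blob makes json.loads() raise json.JSONDecodeError, which doubles as the validation step.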

C++ filesystem vs experimental::filesystem

I’m tinkering with C++ again.  I’m trying to use as much standard C++17 as possible because it’s a good reason to learn the C++17 standards.

Boost has a filesystem library. The library evolved into part of the C++17 standard library. GCC 7 has the filesystem module in the std::experimental namespace. GCC 8 moved the filesystem into the top level std:: namespace.

I’m trying to write a program that works in both recent Fedora and Ubuntu. Fedora29 uses GCC8, Ubuntu18.04 uses GCC7. Dang it. The experimental filesystem module still exists in GCC8 (/usr/include/c++/8/experimental/filesystem) but I’d like to use the latest and greatest available, if possible (/usr/include/c++/8/filesystem).

I can rename the namespace via “namespace fs = ”, which is really useful. It’s a bit like Python’s “import module as newname”, but I need to know which header and which namespace to rename. My first thought was duplicating the __cplusplus macro test used in the filesystem headers in the two aforementioned paths in Fedora.

#if __cplusplus >= 201703L
#include <filesystem> // gcc8 (Fedora29+)
namespace fs = std::filesystem;
#elif __cplusplus >= 201103L
#include <experimental/filesystem> // gcc7 (Ubuntu 18.04)
namespace fs = std::experimental::filesystem;
#endif

No such luck. The __cplusplus macro is set by the -std=c++NN command line argument, not by the compiler version. My build under Ubuntu also uses -std=c++17, so GCC 7 would hit the __cplusplus >= 201703L branch and go looking for a <filesystem> header it doesn’t have.

I don’t want to specifically test for GCC version in an #ifdef. I’d also like to eventually build under OSX (Clang+XCode) and Windows (VC++).

I’m using CMake. I think I’m going to have to do some test compiles with CMake to determine which header to use.

 

Firefox vs Chrome

Microsoft recently announced they were moving their Edge browser to the Chromium renderer. That makes me sad, not least for the risk of monoculture, but mostly because Google does not have my best interests at heart.

Google is an advertising company. They will not allow anything into Chromium that will have an adverse effect on advertising. To wit, anything that discourages human interaction will likely never make it into Chromium.

Eye catching is the name of the game in advertising. Any way to drag my attention to a particular point is the goal. We humans, having evolved as a potential prey animal, have a finely tuned visual system that will alert us to movement. “Oh, crap! Leopard in the leaves! Run!”

Case in point: animated gifs. Blinking, flashing, obnoxious animated gifs have been the advertisement standard since Netscape banner ads debuted. Then came Flash. Now HTML5 has the <video> tag. My brain feels under assault.

Firefox has had the ability to disable animated gifs for years and years. There is still no built-in way to disable animated gifs in Chrome. There are plug-ins that claim to halt/block animated gifs, but I’ve never found one that actually works. The only solution for Chrome has been to globally block images and cherry pick sites to allow images (a whitelist). Firefox +1, Chrome -1.

Flashblock and other plugins would cure the scourge of Flash videos. Those plugins also made most of the web loading times tolerable. Chrome has “ask permission to run” for their built-in Flash player. Firefox +1, Chrome +1.

And now Flash is dying (buh bye) and HTML5 <video> is taking over. Video support is baked into the browser. Now the web is all about <video> advertising. Firefox to the rescue again. https://support.mozilla.org/en-US/questions/1238033  Seems to work so far. Firefox +1, Chrome -1.

Total score so far: Firefox 3, Chrome 1.  Long live Firefox.

Firefox about:config

image.animation_mode;none  <– disable animated gifs (has worked for years)

media.autoplay.default;1  <– stop auto play HTML video (works so far)

 

Fun with Python Regex

I’m a computer language nerd. I like programming languages. I’m in no way an expert. I do enjoy digging into new languages and even digging into the low level jiggery-pokery of languages I use day-to-day. Like C. But US$200 for the C standard language spec? Are you kidding me? No. Digging around on the committee website I found a draft: http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf

My first pass at the C enum parser is incredibly simple. I don’t want to write a full language parser because (a) that’s a lot of work and (b) it’s already been done before. I want something that would take a couple days tops to create and let me get back to fiddling with nl80211.

I’m using Python for the parser because I know Python pretty well. I’ve tinkered with the C++ std::regex library as well but not for this project.

C comments can be either block comments surrounded by /*  */  or line comments starting with //.  For my toy parser, I’m handling comments very naively.

I’m mostly interested in the nl80211.h so I’m making some assumptions about the code format as I’m puttering with regex.

A C enum BNF-ish is (from the above pdf):

I’m amusing myself with WordPress’ text formatting. I get distracted way too easily. (I dug further into the doc to find the definition of enumeration-constant.)

enum-specifier:

enum identifier(opt) { enumerator-list }
enum identifier(opt) { enumerator-list , }
enum identifier

enumerator-list:

enumerator
enumerator-list , enumerator

enumerator:

enumeration-constant
enumeration-constant = constant-expression

enumeration-constant:

identifier

Focus, Dave. Focus. HTML is just another language and it’s easy to get distracted. The break between LHS and RHS above is driving me nuts but I will move on. Focus!

A C identifier can be described by the Python regex “[a-zA-Z_][a-zA-Z_0-9]*”  Python regex whitespace is “\s” Required whitespace would be “\s+”. My first naive regex that searched for a starting enum: “enum\s+([a-zA-Z_][a-zA-Z_0-9]*)\s+{”  I’m assuming the enum+identifier+openbrace is on the same line.

The enumerator-list is another regex but more complicated because of the optional expression. I started with: “([a-zA-Z_][a-zA-Z_0-9]*)\s+,”  for a simple match. The constant-expression match would be “([a-zA-Z_][a-zA-Z_0-9]*)\s*=\s*([a-zA-Z_0-9]*)?,” and the copy paste started getting on my nerves.

I started fiddling with a Python printf-y .format() and stumbled across a brain blast. Python f-strings are amazing when used with regex. Instead of trying to build a .format() or a %s block, I can assign each piece of my regex to a var. And I have a very readable regex. I can build up my regex piece by piece (for good or for ill).

import re

# C-style variable name
identifier = r"[a-zA-Z_][a-zA-Z_0-9]*"

# using f-strings to save myself some confusion
# (raw strings keep Python from treating \s, \{ and \+ as string escapes)
open_brace = r"\{"
close_brace = r"\}"
whitespace = r"\s+"
number = r"-?[0-9]+"
operator = r"(?P<operator>\+|-|<<)" # XXX subset of actual C operators

# I'm very sure this is not the proper use of the term 'atom'
# atom := number | identifier
atom = f"(?:{identifier}|{number})"

# expression := atom
#            := atom operator atom
expression = f"({atom}({whitespace}{operator}{whitespace}{atom})?)"

# enum member regex
symbol_matcher = re.compile(f"(?P<identifier>{identifier})({whitespace}={whitespace}(?P<expression>{expression}))?")

# start of an enum declaration (XXX assumes the open brace is on the same line
# as the 'enum' keyword)
enum_matcher = re.compile(f"enum{whitespace}({identifier}){whitespace}{open_brace}")

 

The f-string uses variables from Python’s context. So f"{identifier}{whitespace}{operator}{whitespace}" will expand to “[a-zA-Z_][a-zA-Z_0-9]*\s+(?P<operator>\+|-|<<)\s+”. The f-string is much easier to read. The (?P<name>...) syntax is a Python regex feature that stores the text matched by that group under the key “name”.

s = "NL80211_NAN_FUNC_ATTR_MAX = NUM_NL80211_NAN_FUNC_ATTR - 1"
robj = symbol_matcher.search(s)
print(robj)
print(robj.groups())
print(robj.groupdict())

The code snippet gives me the following. Definitely need a lot of testing.

<_sre.SRE_Match object; span=(0, 57), match='NL80211_NAN_FUNC_ATTR_MAX = NUM_NL80211_NAN_FUNC_>
('NL80211_NAN_FUNC_ATTR_MAX', ' = NUM_NL80211_NAN_FUNC_ATTR - 1', 'NUM_NL80211_NAN_FUNC_ATTR - 1', 'NUM_NL80211_NAN_FUNC_ATTR - 1', ' - 1', '-')
{'identifier': 'NL80211_NAN_FUNC_ATTR_MAX', 'expression': 'NUM_NL80211_NAN_FUNC_ATTR - 1', 'operator': '-'}
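
To show how the two matchers might fit together, here is a minimal sketch that walks a header line by line and collects (identifier, expression) pairs for the first enum it finds. The regexes are repeated so the block runs on its own; parse_enum and block_comment are made-up names. It assumes one enumerator per line and comments that open and close on the same line, so it will miss expressions that wrap onto the next line or that have no whitespace around the operator (like 1<<0).

```python
import re

identifier = r"[a-zA-Z_][a-zA-Z_0-9]*"
open_brace = r"\{"
whitespace = r"\s+"
number = r"-?[0-9]+"
operator = r"(?P<operator>\+|-|<<)"
atom = f"(?:{identifier}|{number})"
expression = f"({atom}({whitespace}{operator}{whitespace}{atom})?)"
symbol_matcher = re.compile(
    f"(?P<identifier>{identifier})"
    f"({whitespace}={whitespace}(?P<expression>{expression}))?")
enum_matcher = re.compile(f"enum{whitespace}({identifier}){whitespace}{open_brace}")

# Naive comment handling: only /* ... */ that opens and closes on one line.
block_comment = re.compile(r"/\*.*?\*/")


def parse_enum(text):
    """Return (enum name, [(identifier, expression or None), ...]) for the first enum."""
    name = None
    members = []
    for line in text.splitlines():
        line = block_comment.sub("", line).strip()
        if name is None:
            # Still looking for the 'enum name {' line.
            found = enum_matcher.search(line)
            if found:
                name = found.group(1)
            continue
        if line.startswith("}"):
            break
        found = symbol_matcher.search(line)
        if found:
            members.append((found.group("identifier"), found.group("expression")))
    return name, members
```

Each pair keeps the expression as text (or None when there is no right-hand side); turning those into numbers is a separate step.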

Parsing C ‘enum’ with Python Regex.

In my previous post I mentioned I find C ‘enum’ a bit annoying. An enum value captured in a network trace or a packet hexdump is difficult to track backwards to a symbolic value. The nl80211.h header file uses enum extensively. As I continue to learn netlink and nl80211, I’d like a quick way to convert those enum values into a human-readable name.

Why not parse the header file and decode the enum? Python is my go-to for small string tasks.  (I mentioned I was parsing C enum in Python to a friend and he said, “That’s a very Dave move.” I’ll take it as a compliment, I suppose.)

C enum is straightforward: start counting at zero and auto increment. If there is an RHS expression, then that enum takes on that value and the auto increment continues from there.

enum nl80211_commands {
     /* don't change the order or add anything between, this is ABI! */
     NL80211_CMD_UNSPEC,

     NL80211_CMD_GET_WIPHY, /* can dump */
     NL80211_CMD_SET_WIPHY,
     NL80211_CMD_NEW_WIPHY,
     NL80211_CMD_DEL_WIPHY,
[snip]

(At some point I wonder if I should spring for the extra WordPress plugin for code formatting. Maybe.)

In the nl80211_commands above, NL80211_CMD_UNSPEC == 0, then NL80211_CMD_GET_WIPHY == 1. Simple counter. But danger lurks.

The RHS can be an expression. The expression can be a simple value.

enum nl80211_user_reg_hint_type {
        NL80211_USER_REG_HINT_USER      = 0,
        NL80211_USER_REG_HINT_CELL_BASE = 1,
        NL80211_USER_REG_HINT_INDOOR    = 2,
};

Or a complicated expression.

enum nl80211_tdls_peer_capability {
        NL80211_TDLS_PEER_HT = 1<<0,
        NL80211_TDLS_PEER_VHT = 1<<1,
        NL80211_TDLS_PEER_WMM = 1<<2,
};

The expression can reference previous values in the same enum as the next example. (Emphasis added.)

enum nl80211_sched_scan_plan {
        __NL80211_SCHED_SCAN_PLAN_INVALID,
        NL80211_SCHED_SCAN_PLAN_INTERVAL,
        NL80211_SCHED_SCAN_PLAN_ITERATIONS,

        /* keep last */
        __NL80211_SCHED_SCAN_PLAN_AFTER_LAST,
        NL80211_SCHED_SCAN_PLAN_MAX =
                __NL80211_SCHED_SCAN_PLAN_AFTER_LAST - 1
};

Here’s my favorite example, showing the enum auto increment counter being reset. The example below adds a new symbol identical to an existing symbol’s value and the auto increment continues on its merry way.

enum nl80211_commands {
[snip]
     NL80211_CMD_GET_BEACON,
     NL80211_CMD_SET_BEACON,
     NL80211_CMD_START_AP,
     NL80211_CMD_NEW_BEACON = NL80211_CMD_START_AP,
     NL80211_CMD_STOP_AP,
     NL80211_CMD_DEL_BEACON = NL80211_CMD_STOP_AP,
[snip]

I think it would be interesting to create a regex that can parse the enum. There are of course simpler ways to do this: I could just continue to use gdb. Most of the enums are small so not a big deal to manually count them. The large enum I could copy to a new file and manually count. But I like tinkering with regexes. And I’ve had this problem of decoding large enum for as long as I’ve used C (a long time). And it seems like a fun little project.
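
Once the names and right-hand sides are pulled out, the counting rules from the examples above are small enough to sketch. This is a minimal Python sketch, assuming the input is already a list of (name, expression or None) pairs and that an expression is either a single atom or atom-operator-atom with +, - or <<. The function names are made up, and real C constant expressions (parentheses, negative literals, casts) are not handled.

```python
import re

# One atom, optionally followed by an operator and a second atom
EXPR = re.compile(r"^\s*(\w+)\s*(?:(<<|\+|-)\s*(\w+))?\s*$")


def atom_value(token, known):
    """A number is parsed directly; anything else must be an earlier enumerator."""
    if token[0].isdigit():
        return int(token, 0)
    return known[token]


def evaluate(expr, known):
    found = EXPR.match(expr)
    if found is None:
        raise ValueError(f"unsupported expression: {expr!r}")
    left, op, right = found.groups()
    value = atom_value(left, known)
    if op == "+":
        value += atom_value(right, known)
    elif op == "-":
        value -= atom_value(right, known)
    elif op == "<<":
        value <<= atom_value(right, known)
    return value


def assign_values(members):
    """Apply the C rules: start at zero, auto increment, an RHS resets the counter."""
    values = {}
    counter = 0
    for name, expr in members:
        if expr is not None:
            counter = evaluate(expr, values)
        values[name] = counter
        counter += 1
    return values
```

Inverting the result into a value-to-name table needs some care, because aliases such as NL80211_CMD_NEW_BEACON and NL80211_CMD_START_AP share one value.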

I’ve had co-workers do woodwork to relax. Several co-workers are mountain bikers (Boise is fantastic for mountain biking.) Video games are always a good way to relax. I like to tinker with small code projects.

C ‘enum’ is Annoying.

Writing a blog post is hard. “I’ll do it later,” I keep thinking. Maybe it’s like flossing–“I’ll do it later.” Next thing I know I’m spraying blood onto the ceiling while the hygienist tut-tuts about my bad habits. I floss (write) now and my future self will thank me.

I’m tinkering with nl80211, the successor to WEXT. WEXT is the Linux Wireless Extensions, a set of standardized ioctl for communicating userspace to the kernel level wireless drivers. WEXT is amazing but limited so the smart people got together and created a much more flexible system around NetLink.

As I’m tinkering with nl80211, I discover a frustratingly extensive use of C enum. Using C enum in a network protocol is frustrating because with a large enum (say, greater than 10 elements), converting from an integer in a debug message back to the actual enum is well nigh impossible.

Case in point, the nl80211.h enum nl80211_attrs is ~400 lines and about 260 elements. In my little nl80211 baby-steps code, I fetch NL80211_CMD_GET_INTERFACE and get back an array filled with attribute + value.

        int i;
        for (i=0 ; i<NL80211_ATTR_MAX ; i++ ) {
                if (tb_msg[i]) {
                        printf("%d=%p type=%d len=%d\n", i, (void *)tb_msg[i], nla_type(tb_msg[i]), nla_len(tb_msg[i]));
                        hex_dump("msg", (unsigned char *)tb_msg[i], nla_len(tb_msg[i]));
                }
        }

Again this is baby steps code. I have no idea what I’m actually doing. I’m poking the box.

46=0x14db108 type=46 len=4
msg 0x00001080 08 00 2e 00 ....
206=0x14db110 type=206 len=1
msg 0x00001090 05 . 
217=0x14db120 type=217 len=4 
msg 0x000010a0 08 00 d9 00 ....
256=0x14db118 type=256 len=4
msg 0x000010b0 08 00 00 01 ....

Now I have a list of attributes coming back from the call. What the foo is 217? 256? A visit to gdb will give me answers!  The last element in the enum is NL80211_ATTR_PORT_AUTHORIZED so what is its value? A plain print of the symbol just gives me the symbol back. Printing as hex (p/x) or as decimal (p/d) shows me the numerical value.

(gdb) p NL80211_ATTR_PORT_AUTHORIZED
$1 = NL80211_ATTR_PORT_AUTHORIZED
(gdb) p/x NL80211_ATTR_PORT_AUTHORIZED 
$2 = 0x103
(gdb) p/d NL80211_ATTR_PORT_AUTHORIZED 
$3 = 259

To find the symbol value of my integer, I can do the reverse in gdb. The typecast will convert an integer to the enum type.

(gdb) p (enum nl80211_attrs)46
$6 = NL80211_ATTR_GENERATION
(gdb) p (enum nl80211_attrs)206
$7 = NL80211_ATTR_MAX_CSA_COUNTERS
(gdb) p (enum nl80211_attrs)217
$8 = NL80211_ATTR_EXT_FEATURES
(gdb) p (enum nl80211_attrs)256
$9 = NL80211_ATTR_SCHED_SCAN_MAX_REQS

Having to dig into gdb for every enum in nl80211.h will be tiring. The problems grow in other header files that are nests of #ifdefs.
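
With a number-to-name table in hand, the debug output could be annotated by a small filter instead of a gdb session. A minimal Python sketch: the table below holds only the five values that appear in the gdb session above, and annotate() is a made-up helper; a complete table would have to come from parsing nl80211.h.

```python
import re

# Only the values confirmed in the gdb session above; not the full enum.
ATTR_NAMES = {
    46: "NL80211_ATTR_GENERATION",
    206: "NL80211_ATTR_MAX_CSA_COUNTERS",
    217: "NL80211_ATTR_EXT_FEATURES",
    256: "NL80211_ATTR_SCHED_SCAN_MAX_REQS",
    259: "NL80211_ATTR_PORT_AUTHORIZED",
}

# Matches lines such as "217=0x14db120 type=217 len=4"
attr_line = re.compile(r"(\d+)=0x[0-9a-f]+ type=\d+")


def annotate(line):
    """Append the attribute name to an 'N=0x... type=N len=N' debug line."""
    found = attr_line.match(line)
    if found is None:
        return line  # hexdump lines and anything else pass through untouched
    name = ATTR_NAMES.get(int(found.group(1)), "unknown")
    return f"{line.rstrip()}  ({name})"
```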

I personally prefer using #define for symbols like this. The explicit link of a symbol to a value in source form is helpful.