Category Archives: C/C++

C++ filesystem vs experimental::filesystem

I’m tinkering with C++ again.  I’m trying to use as much standard C++17 as possible because it’s a good reason to learn the C++17 standards.

Boost has a filesystem library. The library evolved into part of the C++17 standard library. GCC 7 has the filesystem module in the std::experimental namespace. GCC 8 moved the filesystem into the top level std:: namespace.

I’m trying to write a program that works in both recent Fedora and Ubuntu. Fedora 29 uses GCC 8, Ubuntu 18.04 uses GCC 7. Dang it. The experimental filesystem module still exists in GCC 8 (/usr/include/c++/8/experimental/filesystem) but I’d like to use the latest and greatest available, if possible (/usr/include/c++/8/filesystem).

I can rename the namespace via “namespace fs = ”, which is really useful. It’s a bit like Python’s “import module as newname”, but I need to know which header to include and which namespace to rename. My first thought was duplicating the __cplusplus macro check used in the filesystem headers at the two aforementioned paths in Fedora.

#if __cplusplus >= 201703L
#include <filesystem> // gcc8 (Fedora29+)
namespace fs = std::filesystem;
#elif __cplusplus >= 201103L
#include <experimental/filesystem> // gcc7 (Ubuntu 18.04)
namespace fs = std::experimental::filesystem;
#endif

No such luck. The __cplusplus macro is set based on the -std=c++NN command line arg. My build under Ubuntu also uses -std=c++17, so it would hit the __cplusplus >= 201703L branch and go looking for a <filesystem> header that GCC 7 doesn’t ship.

I don’t want to specifically test for GCC version in an #ifdef. I’d also like to eventually build under OSX (Clang+XCode) and Windows (VC++).

I’m using CMake. I think I’m going to have to do some test compiles with CMake to determine which header to use.
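A preprocessor-only alternative to the CMake test compile might be C++17’s __has_include, which asks whether a header exists rather than which -std= level is in effect. This is a minimal sketch, not what I ended up doing: it assumes both compilers are run with -std=c++17, and file_name_of is a hypothetical helper that exists only to show the fs alias working.

```cpp
// Sketch: choose the filesystem header by what the toolchain actually ships.
#ifdef __has_include
#  if __has_include(<filesystem>)
#    include <filesystem>               // e.g. GCC 8 and later
namespace fs = std::filesystem;
#  elif __has_include(<experimental/filesystem>)
#    include <experimental/filesystem>  // e.g. GCC 7
namespace fs = std::experimental::filesystem;
#  else
#    error "neither <filesystem> nor <experimental/filesystem> was found"
#  endif
#else
#  error "__has_include is unavailable; detect the header in the build system"
#endif

#include <string>

// Hypothetical helper (not from the post): returns the last path component,
// using whichever namespace `fs` ended up pointing at.
inline std::string file_name_of(const std::string& path) {
    return fs::path(path).filename().string();
}
```

With GCC 7 and GCC 8 the filesystem code lives in a separate library, so the link line also needs -lstdc++fs. The header check does not cover that part, which is one reason a CMake test compile may still be worth doing.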

 

Parsing C ‘enum’ with Python Regex.

In my previous post I mentioned I find C ‘enum’ a bit annoying. An enum value captured in a network trace or a packet hexdump is difficult to track backwards to a symbolic value. The nl80211.h header file uses enum extensively. As I continue to learn netlink and nl80211, I’d like a quick way to convert those enums into a human-readable name.

Why not parse the header file and decode the enum? Python is my go-to for small string tasks.  (I mentioned I was parsing C enum in Python to a friend and he said, “That’s a very Dave move.” I’ll take it as a compliment, I suppose.)

C enum is straightforward: start counting at zero and auto increment. If there is an RHS expression, then that enum takes on that value and the auto increment continues from there.

enum nl80211_commands {
     /* don't change the order or add anything between, this is ABI! */
     NL80211_CMD_UNSPEC,

     NL80211_CMD_GET_WIPHY, /* can dump */
     NL80211_CMD_SET_WIPHY,
     NL80211_CMD_NEW_WIPHY,
     NL80211_CMD_DEL_WIPHY,
[snip]

(At some point I wonder if I should spring for the extra WordPress plugin for code formatting. Maybe.)

In the nl80211_commands above, NL80211_CMD_UNSPEC == 0, then NL80211_CMD_GET_WIPHY == 1. Simple counter. But danger lurks.

The RHS can be an expression. The expression can be a simple value.

enum nl80211_user_reg_hint_type {
        NL80211_USER_REG_HINT_USER      = 0,
        NL80211_USER_REG_HINT_CELL_BASE = 1,
        NL80211_USER_REG_HINT_INDOOR    = 2,
};

Or a complicated expression.

enum nl80211_tdls_peer_capability {
        NL80211_TDLS_PEER_HT = 1<<0,
        NL80211_TDLS_PEER_VHT = 1<<1,
        NL80211_TDLS_PEER_WMM = 1<<2,
};

The expression can reference previous values in the same enum, as in the next example, where NL80211_SCHED_SCAN_PLAN_MAX is computed from __NL80211_SCHED_SCAN_PLAN_AFTER_LAST.

enum nl80211_sched_scan_plan {
        __NL80211_SCHED_SCAN_PLAN_INVALID,
        NL80211_SCHED_SCAN_PLAN_INTERVAL,
        NL80211_SCHED_SCAN_PLAN_ITERATIONS,

        /* keep last */
        __NL80211_SCHED_SCAN_PLAN_AFTER_LAST,
        NL80211_SCHED_SCAN_PLAN_MAX =
                __NL80211_SCHED_SCAN_PLAN_AFTER_LAST - 1
};

Here’s my favorite example, showing the enum auto increment counter being reset. The example below adds a new symbol with the same value as an existing symbol, and the auto increment continues on its merry way from there.

enum nl80211_commands {
[snip]
     NL80211_CMD_GET_BEACON,
     NL80211_CMD_SET_BEACON,
     NL80211_CMD_START_AP,
     NL80211_CMD_NEW_BEACON = NL80211_CMD_START_AP,
     NL80211_CMD_STOP_AP,
     NL80211_CMD_DEL_BEACON = NL80211_CMD_STOP_AP,
[snip]
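The compiler can double-check these counting rules. Below is a minimal sketch using a hypothetical cut-down enum: the DEMO_ names and their values are made up for illustration and are not the real nl80211.h numbers. Every static_assert is checked at compile time, so a wrong guess about the counter fails the build.

```cpp
// Hypothetical cut-down enum with the same shapes as the nl80211.h examples.
enum demo_commands {
    DEMO_CMD_UNSPEC,                          // 0: counting starts at zero
    DEMO_CMD_GET_BEACON,                      // 1: auto increment
    DEMO_CMD_START_AP,                        // 2
    DEMO_CMD_NEW_BEACON = DEMO_CMD_START_AP,  // 2: alias, counter is reset to 2
    DEMO_CMD_STOP_AP,                         // 3: increment resumes after the alias
    DEMO_CMD_DEL_BEACON = DEMO_CMD_STOP_AP,   // 3: another alias
    DEMO_CMD_FLAG = 1 << 4,                   // 16: RHS can be a constant expression
    DEMO_CMD_AFTER_FLAG,                      // 17: increment resumes from 16

    /* keep last */
    DEMO_CMD_AFTER_LAST,                      // 18
    DEMO_CMD_MAX = DEMO_CMD_AFTER_LAST - 1    // 17: RHS refers to an earlier symbol
};

static_assert(DEMO_CMD_UNSPEC == 0, "first enumerator is zero");
static_assert(DEMO_CMD_NEW_BEACON == DEMO_CMD_START_AP, "alias shares the value");
static_assert(DEMO_CMD_STOP_AP == 3, "counter continues from the alias");
static_assert(DEMO_CMD_AFTER_FLAG == 17, "counter continues from an expression");
static_assert(DEMO_CMD_MAX == 17, "MAX is AFTER_LAST minus one");
```

Any parser for these headers has to reproduce the same bookkeeping: keep a running counter, evaluate the RHS when there is one, and restart the counter from that result.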

I think it would be interesting to create a regex that can parse the enum. There are of course simpler ways to do this: I could just continue to use gdb. Most of the enums are small, so it’s not a big deal to count them by hand. The large enums I could copy to a new file and count by hand. But I like tinkering with regexes. And I’ve had this problem of decoding large enums for as long as I’ve used C (a long time). And it seems like a fun little project.

I’ve had co-workers do woodwork to relax. Several co-workers are mountain bikers (Boise is fantastic for mountain biking.) Video games are always a good way to relax. I like to tinker with small code projects.

C ‘enum’ is Annoying.

Writing a blog post is hard. “I’ll do it later,” I keep thinking. Maybe it’s like flossing–“I’ll do it later.” Next thing I know I’m spraying blood onto the ceiling while the hygienist tut-tuts about my bad habits. I floss (write) now and my future self will thank me.

I’m tinkering with nl80211, the successor to WEXT. WEXT is the Linux Wireless Extensions, a set of standardized ioctls for communication between userspace and the kernel-level wireless drivers. WEXT is amazing but limited, so the smart people got together and created a much more flexible system around Netlink.

As I’m tinkering with nl80211, I discover a frustratingly extensive use of C enum. Using C enum in a network protocol is frustrating because with a large enum (say, greater than 10 elements), converting from an integer in a debug message back to the actual enum symbol is well nigh impossible.

Case in point, the nl80211.h enum nl80211_attrs is ~400 lines and about 260 elements. In my little nl80211 baby-steps code, I fetch NL80211_CMD_GET_INTERFACE and get back an array filled with attribute + value.

        int i;
        for (i=0 ; i<NL80211_ATTR_MAX ; i++ ) {
                if (tb_msg[i]) {
                        printf("%d=%p type=%d len=%d\n", i, (void *)tb_msg[i], nla_type(tb_msg[i]), nla_len(tb_msg[i]));
                        hex_dump("msg", (unsigned char *)tb_msg[i], nla_len(tb_msg[i]));
                }
        }

Again this is baby steps code. I have no idea what I’m actually doing. I’m poking the box.

46=0x14db108 type=46 len=4
msg 0x00001080 08 00 2e 00 ....
206=0x14db110 type=206 len=1
msg 0x00001090 05 . 
217=0x14db120 type=217 len=4 
msg 0x000010a0 08 00 d9 00 ....
256=0x14db118 type=256 len=4
msg 0x000010b0 08 00 00 01 ....

Now I have a list of attributes coming back from the call. What the foo is 217? 256? A visit to gdb will give me answers!  The last element in the enum is NL80211_ATTR_PORT_AUTHORIZED, so what is its value? Printing the bare symbol just gives me the symbol back. Printing as hex (p/x) and as decimal (p/d) shows me the numerical value.

(gdb) p NL80211_ATTR_PORT_AUTHORIZED
$1 = NL80211_ATTR_PORT_AUTHORIZED
(gdb) p/x NL80211_ATTR_PORT_AUTHORIZED 
$2 = 0x103
(gdb) p/d NL80211_ATTR_PORT_AUTHORIZED 
$3 = 259

To find the symbol for my integer, I can do the reverse in gdb. The typecast converts an integer to the enum type, and gdb prints the matching symbol.

(gdb) p (enum nl80211_attrs)46
$6 = NL80211_ATTR_GENERATION
(gdb) p (enum nl80211_attrs)206
$7 = NL80211_ATTR_MAX_CSA_COUNTERS
(gdb) p (enum nl80211_attrs)217
$8 = NL80211_ATTR_EXT_FEATURES
(gdb) p (enum nl80211_attrs)256
$9 = NL80211_ATTR_SCHED_SCAN_MAX_REQS

Having to dig into gdb for every enum in nl80211.h will be tiring. The problems grow in other header files that are nests of #ifdefs.
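One way to skip the gdb round trip might be a small lookup function compiled into the debug code. This is only a sketch: the enum below is a stand-in for <linux/nl80211.h> that holds just the four attributes and values from the gdb session above, and attr_name and ATTR_CASE are hypothetical names. Against the real header, the stand-in enum would be dropped and the header included instead.

```cpp
#include <string_view>

// Stand-in for <linux/nl80211.h>: only the four attributes decoded above,
// with the values gdb reported for them.
enum demo_nl80211_attrs {
    NL80211_ATTR_GENERATION          = 46,
    NL80211_ATTR_MAX_CSA_COUNTERS    = 206,
    NL80211_ATTR_EXT_FEATURES        = 217,
    NL80211_ATTR_SCHED_SCAN_MAX_REQS = 256,
};

// Stringify the symbol itself, so the name and the value cannot drift apart.
#define ATTR_CASE(sym) case sym: return #sym

// Hypothetical helper: map an attribute number back to its symbol name.
constexpr std::string_view attr_name(int value) {
    switch (value) {
        ATTR_CASE(NL80211_ATTR_GENERATION);
        ATTR_CASE(NL80211_ATTR_MAX_CSA_COUNTERS);
        ATTR_CASE(NL80211_ATTR_EXT_FEATURES);
        ATTR_CASE(NL80211_ATTR_SCHED_SCAN_MAX_REQS);
        default: return "(unknown attribute)";
    }
}
```

The dump loop could then print attr_name(i) next to the number. The catch is that someone still has to write one ATTR_CASE line per symbol, which for roughly 260 attributes is its own kind of tiring.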

I personally prefer using #define for symbols like this. The explicit link of a symbol to a value in source form is helpful.