Getting silly with C, part (void*)2
They won't be able to find bugs in your code if they can't figure out how it works.
In the previous installment of our introductory series about the C language, we outlined the basic syntax of the switch() statement (demo link):
#include <stdio.h>

int main() {
  int i = 1;
  switch (i)
    if (0) case 0: puts("i = 0");
    else if (0) case 1 ... 10: puts("i = 1 ... 10");
    else if (0) case 11: puts("i = 11");
    else if (0) default: puts("i = something else");
  return 0;
}
Today, let’s continue with function declarations. You will be delighted to discover that the following code compiles cleanly with gcc -Wall — and that it calls puts() exactly once (link):
int typedef puts(char* puts; char* puts; char* puts);

int main() {
  puts(puts);
  puts("Welcome to my humble abode!");
}
The explanation is simple. The first line is not actually a function declaration: instead, because of the buried (and out-of-order) typedef keyword, it defines a type. More specifically, it defines a function type — a useless but permitted construct in C that shouldn’t be confused with the more practical function pointer type. In the end, we create a type named puts that can be used to declare a function according to the template of int <fn>(…).
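To make the distinction concrete, here is a small sketch in standard C. The names binary_op, binary_op_ptr, add, mul, and apply are made up for illustration and do not appear in the examples above.

```c
/* A function type: describes "int f(int, int)", not a pointer to it. */
typedef int binary_op(int, int);

/* A function type can declare functions (prototypes)... */
binary_op add, mul;

/* ...but it cannot be used to define them; definitions spell out the signature. */
int add(int a, int b) { return a + b; }
int mul(int a, int b) { return a * b; }

/* The more practical cousin: a pointer to the function type. */
typedef binary_op *binary_op_ptr;

int apply(binary_op_ptr op, int a, int b) {
    return op(a, b);
}
```

Calling apply(add, 2, 3) goes through the pointer type, while the "binary_op add, mul;" line is the only place where the bare function type earns its keep.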
The second mystery might be what’s going on with the parameters — char* puts; char* puts; char* puts. This syntax is an obscure GNU extension called a forward parameter declaration. The intended use is this:
int myfunction(int len; char data[len], int len) { ... }
In essence, it’s a way to tell the compiler that a parameter of a certain type and name is forthcoming, so that we can reference it ahead of time. You can apparently have as many semicolon-delimited forward declarations as you’d like, but in the end, we’re just creating a type for int <fn>(char*). At the typedef stage, parameter names are ignored and have no global effects, so repeating puts in there is just for show.
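A fleshed-out version of that intended use might look like the sketch below. It relies on the GNU extension, so expect it to build with GCC but not necessarily with other compilers; sum_prefix is a made-up name.

```c
/* GNU C only: the "int len;" before the semicolon announces a parameter
   that is declared for real later in the list, so data[len] can refer
   to it even though data comes first. */
int sum_prefix(int len; int data[len], int len) {
    int total = 0;
    for (int k = 0; k < len; k++)
        total += data[k];
    return total;
}
```

Callers pass the arguments in the order of the real parameter list, array first and length second, as in sum_prefix(values, 4).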
Past this point, puts is a type, and we can’t redefine the symbol in the global scope; that said, the language permits symbols to be shadowed within nested blocks (link):
char* foo = "bar";

int main() {
  int foo = 123; /* No error */
  return foo;
}
…less intuitively, the same also applies to types (link):
typedef float foo;

int main() {
  int foo = 123; /* No error */
  return foo;
}
The final piece of the puzzle is the observation that parentheses can be added in variable definitions with no ill effects (link):
int main() {
  int (foo) = 123;
  return foo;
}
This brings us back to what the puts(puts) line in main() actually does: it is not a function call at all! Instead, it’s equivalent to “puts puts”: a declaration of a function named puts that follows the int <fn>(char*) type template. Critically, this newly declared puts symbol shadows the global type for the rest of the block, so the next reference to puts is a real function call.
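The same parse can be reproduced with another libc function. The sketch below is a variation built on a few assumptions: it uses toupper() rather than puts(), it deliberately leaves out <ctype.h> so that the global typedef does not collide with the real prototype, and it introduces a made-up wrapper named shout. It is meant for GCC; other compilers may be less forgiving.

```c
/* At file scope, "toupper" names a function type int(int), not a function. */
typedef int toupper(int);

int shout(int c) {
    /* Parsed as "toupper toupper;": a block-scope declaration of a function
       named toupper with type int(int). It shadows the typedef from here on. */
    toupper(toupper);

    /* Now an ordinary call, resolved by the linker to the libc function. */
    return toupper(c);
}
```

Inside shout(), the first statement looks like a call and the second one actually is a call, which is the whole joke in two lines.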
All right, all right — too easy! Let’s move on to control flow (link):
#include <stdio.h>

typedef int _();

int main() {
  puts("Welcome to my humble program.");
  _ main asm("_");
}

int z() {
  puts("ANYTHING IS POSSIBLE AT ZOMBO.COM");
  return 0;
  _ z asm("main");
}
If you run this code, the only output will be an ad for zombo.com. Why? For one, we have another function typedef in there — but if we get rid of it for clarity, the code still doesn’t make much sense (link):
...
int main() {
  puts("Welcome to my humble program.");
  int main() asm("_");
}
...
The other trick is the asm("…") syntax. It’s not actually an assembly block; when the keyword appears in a function or variable declaration, it specifies an underlying “assembler name” for the C symbol. You’d normally use it like so (link):
int foobar(char*) asm("puts");

int main() {
  foobar("Hello world");
  return 0;
}
In our earlier example, we attach the renaming directive to a local declaration of main() that shadows the global symbol — but the result of the renaming is global! In effect, we renamed main() to _() and z() to main(). Clang complains, but GCC doesn’t mind at all.
Let’s follow that trail for a bit longer. Check out the following code (link):
int main() {
  i("This is fine.");
  return 0;
}

[[gnu::unused]] void elsewhere() {
  typedef int i();
  for (i i, i asm("puts"), i;;);
}
The code prints “This is fine.” Recent versions of GCC allow the declaration in the first clause of a for() loop to be a function declaration, so we can bury the renaming there!
I don’t quite know why this for() syntax is now allowed — perhaps it’s some toxic-waste spillover from the C++ world — but it lets us write loops like this (link):
#include <stdio.h>

int main() {
  int i = 0;
  for (void _(){} i++ < 3; ) _(), puts("this is fine.");
  return 0;
}
A two-expression for()?! Sort of! Three expressions are still required by the compiler, but in C, there is an implicit semicolon after a function definition (void _() { }). In other words, for (<fn-def> <sth>; <sth>) is correct, while for (<fn-def>; <sth>; <sth>) is not.
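The function definition tucked into the loop header is a GNU C nested function. Outside of a for() clause, the same extension looks like the sketch below; sum_of_squares and square are made-up names, and the code assumes GCC, since nested functions are not part of standard C.

```c
int sum_of_squares(int n) {
    /* Nested function definition (GNU C). Like any function definition,
       it ends at the closing brace and takes no trailing semicolon. */
    int square(int x) { return x * x; }

    int total = 0;
    for (int k = 1; k <= n; k++)
        total += square(k);
    return total;
}
```

The missing semicolon after the closing brace of square() is the same one that goes missing in the for() header.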
While I have you here, did you know that the C language has a BASIC compatibility mode that enables line numbering? It’s true: use if (BASIC) to see if your compiler supports the feature (link):
#include <stdio.h>

typedef int BASIC[];

int main() {
  if ((BASIC) {
    [20] puts("cruel"),
    [30] puts("world"),
    [10] puts("hello"),
  });
}
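If you would like a hint, the two ingredients are a compound literal and array designators, shown here in their standard C99 spelling with an explicit = sign (GCC also accepts an obsolete form that omits it). The names table, third_entry, and entry_count are made up for this sketch.

```c
#include <stddef.h>

typedef int table[];

/* (table){ ... } is a compound literal: an unnamed array built in place.
   Designators such as [2] = 30 pick the slot, so the order in which the
   entries are written does not decide where they land. */
int third_entry(void) {
    return ((table){ [2] = 30, [0] = 10, [1] = 20 })[2];
}

/* The highest designator determines the array length: indexes 0 to 30. */
size_t entry_count(void) {
    return sizeof((table){ [30] = 1 }) / sizeof(int);
}
```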
I’m sure you can figure this one out. And with this, we conclude today’s introductory lesson in C. Until next time, fellow programmers!