lcamtuf’s thing

Getting silly with C, part (void*)2

They won't be able to find bugs in your code if they can't figure out how it works.

Jan 10, 2025

In the previous installment of our introductory series about the C language, we outlined the basic syntax of the switch() statement (demo link):

#include <stdio.h>

int main() {
 
  int i = 1;

  switch (i)

         if (0) case 0:        puts("i = 0");
    else if (0) case 1 ... 10: puts("i = 1 ... 10");
    else if (0) case 11:       puts("i = 11");
    else if (0) default:       puts("i = something else");

  return 0;

}
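For anyone who skipped that installment, here is a minimal sketch of why the layout works: case and default are plain labels, so switch jumps straight into the body of an if (0) without ever evaluating the test, and the else that follows is skipped as usual. The names bucket() and check_bucket() are made up for illustration, and the case range is a GNU extension, so GCC or Clang is assumed.

```c
#include <assert.h>

/* Hypothetical helper: maps i to a small code using the same layout. */
int bucket(int i) {
  int r = -1;   /* overwritten by whichever label the switch jumps to */

  switch (i)
         if (0) case 0:        r = 0;
    else if (0) case 1 ... 10: r = 1;   /* GNU case range */
    else if (0) default:       r = 2;

  return r;
}

/* Self-check, meant to be called from main(). */
void check_bucket(void) {
  assert(bucket(0) == 0);
  assert(bucket(7) == 1);
  assert(bucket(42) == 2);
}
```

Each label sits inside a then-branch, so once the labeled statement runs, the rest of the chain is just a sequence of else branches that never execute.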

Today, let’s continue with function declarations. You will be delighted to discover that the following code compiles cleanly with gcc -Wall — and that it calls puts() exactly once (link):

int typedef puts(char* puts; char* puts; char* puts);

int main() {
  puts(puts);
  puts("Welcome to my humble abode!");
}

The explanation is simple. The first line is not actually a function declaration: instead, because of the buried (and out-of-order) typedef keyword, it defines a type. More specifically, it defines a function type — a fairly useless but permitted construct in C that shouldn’t be confused with the more practical function pointer type. In the end, we create a type named puts that can be used to declare a function according to the template of int <fn>(…).
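To make the distinction concrete, here is a small sketch that puts a function type next to a function pointer type. All the names (binop, binop_ptr, add, mul, apply) are invented for this example and do not come from the code above.

```c
#include <assert.h>

typedef int binop(int, int);          /* function type */
typedef int (*binop_ptr)(int, int);   /* function pointer type */

int add(int a, int b) { return a + b; }
int mul(int a, int b) { return a * b; }

/* "binop *" spells the same type as binop_ptr. */
int apply(binop *f, int a, int b) { return f(a, b); }

/* Self-check, meant to be called from main(). */
void check_binop(void) {
  binop_ptr p = mul;                  /* function name decays to a pointer */
  assert(apply(add, 2, 3) == 5);
  assert(apply(p, 2, 3) == 6);
}
```

The function type on its own cannot be stored or passed by value; about the only things you can do with it are declare functions and form pointers to it.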

The second mystery might be what’s going on with the parameters — char* puts; char* puts; char* puts. This syntax is an obscure GNU extension called a forward parameter declaration. The intended use is this:

int myfunction(int len; char data[len], int len) {
  ...
}

In essence, it’s a way to tell the compiler that a parameter of a certain type and name is forthcoming, so that we can reference it ahead of time. You can apparently have as many semicolon-delimited forward declarations as you’d like, but in the end, we’re just creating a type for int <fn>(char*). At the typedef stage, parameter names are ignored and have no global effects, so repeating puts in there is just for show.
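A runnable version of the intended use might look like the sketch below. It assumes GCC, since the forward parameter declaration is a GNU extension that other compilers may reject; sum() and check_sum() are hypothetical names.

```c
#include <assert.h>

/* GNU C only: "int len;" forward-declares the parameter so that
   data[len] can refer to it before the real declaration appears. */
int sum(int len; int data[len], int len) {
  int total = 0;
  for (int i = 0; i < len; i++)
    total += data[i];
  return total;
}

/* Self-check, meant to be called from main(). */
void check_sum(void) {
  int v[] = { 1, 2, 3, 4 };
  assert(sum(v, 4) == 10);   /* the forward declaration is not an argument */
}
```

Note that callers still pass two arguments, in the order of the real parameters: the array first, the length second.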

Past this point, puts is a type, and we can’t redefine the symbol in the global scope; that said, the language permits symbols to be shadowed within nested blocks (link):

char* foo = "bar";

int main() {
  int foo = 123; /* No error */
  return foo;
}

…less intuitively, the same also applies to types (link):

typedef float foo;

int main() {
  int foo = 123; /* No error */
  return foo;
}
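One way to see that the type really is hidden inside the block is to ask sizeof about the name in both scopes. This is a sketch with invented names (sample, outer_size, inner_size), not something from the linked demos.

```c
#include <assert.h>
#include <stddef.h>

typedef double sample;      /* hypothetical file-scope type */

size_t outer_size(void) {
  return sizeof(sample);    /* sample is still the type here */
}

size_t inner_size(void) {
  char sample = 'x';        /* shadows the typedef in this block */
  return sizeof(sample);    /* now measures the char variable */
}

/* Self-check, meant to be called from main(). */
void check_sizes(void) {
  assert(outer_size() == sizeof(double));
  assert(inner_size() == 1);
}
```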

The final piece of the puzzle is the observation that parentheses can be added in variable declarations with no ill effects (link):

int main() {
  int (foo) = 123;
  return foo;
}

This brings us back to what the puts(puts) line in main() actually does: it is not a function call at all! Instead, it’s equivalent to “puts puts” — a declaration of a function named puts that follows the int <fn>(char*) type template. Critically, this newly-instantiated puts symbol clobbers the global type, so the next reference to puts is a real function call.
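The same parse can be reproduced without involving libc. The sketch below swaps the function type for a plain int typedef, which is my substitution rather than the original trick; counter and next_after_41() are made-up names.

```c
#include <assert.h>

typedef int counter;    /* hypothetical type name */

int next_after_41(void) {
  counter(counter);     /* same as "counter counter;": declares an int */
  counter = 41;         /* counter is now a variable, not a type */
  return counter + 1;
}

/* Self-check, meant to be called from main(). */
void check_counter(void) {
  assert(next_after_41() == 42);
}
```

Because the first token is a type name, the line can only be a declaration, and the parenthesized name that follows becomes the declared identifier.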

All right, all right — too easy! Let’s move on to control flow (link):

#include <stdio.h>

typedef int _();

int main() {
  puts("Welcome to my humble program.");
  _ main asm("_");
}

int z() {
  puts("ANYTHING IS POSSIBLE AT ZOMBO.COM");
  return 0;
  _ z asm("main");
}

If you run this code, the only output will be an ad for zombo.com. Why? For one, we have another function typedef in there — but if we get rid of it for clarity, the code still doesn’t make much sense (link):

...

int main() {
  puts("Welcome to my humble program.");
  int main() asm("_");
}

...

The other trick is the asm("…") syntax. It’s not actually an assembly block; when the keyword appears in a function or variable declaration, it specifies an underlying “assembler name” for the C symbol. You’d normally use it like so (link):

int foobar(char*) asm("puts");

int main() {
  foobar("Hello world");
  return 0;
}

In our earlier example, we attach the renaming directive to a local declaration of main() that shadows the global symbol — but the result of the renaming is global! In effect, we renamed main() to _() and z() to main(). Clang complains, but GCC doesn’t mind at all.

Let’s follow that trail for a bit longer. Check out the following code (link):

int main() {
  i("This is fine.");
  return 0;
}

[[gnu::unused]] void elsewhere() {
  typedef int i();
  for (i i, i asm("puts"), i;;);
}

The code prints “This is fine.” Recent versions of GCC allow the first clause of a for() loop to contain a function declaration, so we can bury the renaming there!

I don’t quite know why this for() syntax is now allowed, but it gets even more wacky if we trade a function declaration for a fully-fledged definition (link):

#include <stdio.h>

int main() { 
  int i = 0;
  for (void _(){} i++ < 3; ) _(), puts("this is fine.");
  return 0;
}

A two-expression for()?! Sort of! Three independent elements are still expected, but a function definition (void _() { }) is special in C in that it doesn’t end with a semicolon. In other words, for (<fn-def> <expr>; <expr>) is “correct”, while for (<fn-def>; <expr>; <expr>) would not be.

While I have you here, did you know that the C language has a BASIC compatibility mode that enables line numbering? It’s true: use if (BASIC) to see if your compiler supports the feature (link):

#include <stdio.h>

typedef int BASIC[];

int main() {

  if ((BASIC) {
    [20] puts("cruel"),
    [30] puts("world"),
    [10] puts("hello"),
  });

}

I’m sure you can figure this one out. And with this, we conclude today’s introductory lesson in C. Until next time, fellow programmers!




Discussion about this post

lcamtuf
Jan 11 (edited)

If you're curious about my remark that the function typedef is useless, it's because you can't use it for definitions, just for declarations. You can do this:

typedef int functype(char* x);

functype myfunc;

...and it's essentially equivalent to:

int myfunc(char* x);

...but then you still need to define myfunc() "manually" by restating the parameters and the return value. This won't work:

typedef int functype(char* x);

functype myfunc {

puts("Hello world");

}
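For contrast, a version that does compile might look like the sketch below: the typedef supplies the prototypes, and each definition restates the signature by hand. The names handler, on_inc and on_dec are hypothetical.

```c
#include <assert.h>

typedef int handler(int);   /* hypothetical function type */

handler on_inc, on_dec;     /* two prototypes from one typedef */

/* Definitions still restate the return type and the parameters. */
int on_inc(int x) { return x + 1; }
int on_dec(int x) { return x - 1; }

/* Self-check, meant to be called from main(). */
void check_handlers(void) {
  handler *table[] = { on_inc, on_dec };
  assert(table[0](10) == 11);
  assert(table[1](10) == 9);
}
```

If a definition disagrees with the typedef'd prototype, the compiler reports conflicting types, which is about the only practical benefit on offer.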

Manuel Capel
Jan 10

Totally degen, loved it :) How come you are so familiar with all those arcane obscure specifications of the C standard? Had you been a business lawyer, you would be filthy rich right now, combining arcane loop-holes ad absurdum to get anything.
