
























  25. I know someone will fight me on that, but it’s gotta be at least in the top three, right?↩︎

  26. Well, technically there are more than two, but hey, let’s pretend there are two—ignorance is bliss, right?↩︎



  29. Technically, it contains preprocessor directives and function prototypes (more on that later) for common input and output needs.↩︎


  31. If you don’t give it an output filename, it will export to a file called a.out by default—this filename has its roots deep in Unix history.↩︎


  33. A “byte” is typically an 8-bit binary number. Think of it as an integer that can only hold the values from 0 to 255, inclusive. Technically, C allows bytes to be any number of bits and if you want to unambiguously refer to an 8-bit number, you should use the term octet. But programmers are going to assume you mean 8 bits when you say “byte” unless you specify otherwise.↩︎

  34. I’m seriously oversimplifying how modern memory works, here. But the mental model works, so please forgive me.↩︎

  35. I’m lying here a little. Technically 3.14159 is of type double, but we’re not there yet and I want you to associate float with “Floating Point”, and C will happily coerce that type into a float. In short, don’t worry about it until later.↩︎

  36. Read this as “pointer to a char” or “char pointer”. “Char” for character. Though I can’t find a study, it seems anecdotally most people pronounce this as “char”, a minority say “car”, and a handful say “care”. We’ll talk more about pointers later.↩︎

  37. Colloquially, we say they have “random” values, but they aren’t truly—or even pseudo-truly—random numbers.↩︎

  38. This isn’t strictly 100% true. When we get to learning about static storage duration, you’ll find that some variables are initialized to zero automatically. But the safe thing to do is always initialize them.↩︎

  39. The _t is short for type.↩︎

  40. Except for with variable length arrays—but that’s a story for another time.↩︎


  42. This was considered such a hazard that the designers of the Go Programming Language made break the default; you have to explicitly use Go’s fallthrough statement if you want to fall into the next case.↩︎

  43. Never say “never”.↩︎

  44. Typically. I’m sure there are exceptions out there in the dark corridors of computing history.↩︎

  45. A byte is a number made up of no more than 8 binary digits, or bits for short. This means in decimal digits just like grandma used to use, it can hold an unsigned number between 0 and 255, inclusive.↩︎

  46. The order that bytes come in is referred to as the endianness of the number. The usual suspects are big-endian (with the most significant byte first) and little-endian (with the most-significant byte last), or, uncommonly now, mixed-endian (with the most-significant bytes somewhere else).↩︎

  47. That is, base 16 with digits 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, and F.↩︎

  48. That’s not all! It’s used in /*comments*/ and multiplication and in function prototypes with variable length arrays! It’s all the same *, but the context gives it different meaning.↩︎



  51. The pointer type variables are a, d, f, and i, because those are the ones with * in front of them.↩︎

  52. These days, anyway.↩︎

  53. Again, not really, but variable-length arrays—of which I’m not really a fan—are a story for another time.↩︎

  54. Since arrays are just pointers to the first element of the array under the hood, there’s no additional information recording the length.↩︎

  55. Because when you pass an array to a function, you’re actually just passing a pointer to the first element of that array, not the “entire” array.↩︎

  56. In the good old MS-DOS days before memory protection was a thing, I was writing some particularly abusive C code that deliberately engaged in all kinds of undefined behavior. But I knew what I was doing, and things were working pretty well. Until I made a misstep that caused a lockup and, as I found upon reboot, nuked all my BIOS settings. That was fun. (Shout-out to @man for those fun times.)↩︎

  57. There are a lot of things that cause undefined behavior, not just out-of-bounds array accesses. This is what makes the C language so exciting.↩︎


  59. This is technically incorrect, as a pointer to an array and a pointer to the first element of an array have different types. But we can burn that bridge when we get to it.↩︎

  60. C11 §¶1 requires it be greater than zero. But you might see code out there with arrays declared of zero length at the end of structs and GCC is particularly lenient about it unless you compile with -pedantic. This zero-length array was a hackish mechanism for making variable-length structures. Unfortunately, it’s technically undefined behavior to access such an array even though it basically worked everywhere. C99 codified a well-defined replacement for it called flexible array members, which we’ll chat about later.↩︎

  61. This is also equivalent: void print_2D_array(int (*a)[3]), but that’s more than I want to get into right now.↩︎

  62. Though it is true that C doesn’t track the length of strings.↩︎

  63. If you’re using the basic character set or an 8-bit character set, you’re used to one character being one byte. This isn’t true in all character encodings, though.↩︎

  64. This is different than the NULL pointer, and I’ll abbreviate it NUL when talking about the character versus NULL for the pointer.↩︎

  65. Later we’ll learn a neater way to do it with pointer arithmetic.↩︎

  66. There’s a safer function called strncpy() that you should probably use instead, but we’ll get to that later.↩︎

  67. Although in C individual items in memory like ints are referred to as “objects”, they’re not objects in an object-oriented programming sense.↩︎

  68. The Saturn was a popular brand of economy car in the United States until it was put out of business by the 2008 crash, sadly so to us fans.↩︎

  69. A pointer is likely 8 bytes on a 64-bit system.↩︎

  70. A deep copy follows the pointers in the struct and copies the data they point to, as well. A shallow copy just copies the pointers, but not the things they point to. C doesn’t come with any built-in deep copy functionality.↩︎




  74. We used to have three different newlines in broad effect: Carriage Return (CR, used on old Macs), Linefeed (LF, used on Unix systems), and Carriage Return/Linefeed (CRLF, used on Windows systems). Thankfully the introduction of OS X, being Unix-based, reduced this number to two.↩︎

  75. If the buffer’s not big enough to read in an entire line, it’ll just stop reading mid-line, and the next call to fgets() will continue reading the rest of the line.↩︎

  76. Normally the second program would read all the bytes at once, and then print them out in a loop. That would be more efficient. But we’re going for demo value, here.↩︎



  79. And this is why I used individual bytes in my fwrite() and fread() examples, above, shrewdly.↩︎


  81. We’ll talk more about these later.↩︎

  82. Recall that the sizeof operator tells you the size in bytes of an object in memory.↩︎

  83. Or string, which is really an array of chars. Somewhat peculiarly, you can also have a pointer that references one past the end of the array without a problem and still do math on it. You just can’t dereference it when it’s out there.↩︎



  86. Because remember that array notation is just a dereference and some pointer math, and you can’t dereference a void*!↩︎

  87. You can also cast the void* to another type, but we haven’t gotten to casts yet.↩︎

  88. Or until the program exits, in which case all the memory allocated by it is freed. Asterisk: some systems allow you to allocate memory that persists after a program exits, but it’s system dependent, out of scope for this guide, and you’ll certainly never do it on accident.↩︎



  91. “Bit” is short for binary digit. Binary is just another way of representing numbers. Instead of digits 0-9 like we’re used to, it’s digits 0-1.↩︎


  93. The industry term for a sequence of exactly, indisputably 8 bits is an octet.↩︎

  94. In general, if you have an \(n\) bit two’s complement number, the signed range is \(-2^{n-1}\) to \(2^{n-1}-1\). And the unsigned range is \(0\) to \(2^n-1\).↩︎




  98. Depends on whether char defaults to signed char or unsigned char.↩︎


  100. My char is signed.↩︎


  102. This program runs as its comments indicate on a system with FLT_DIG of 6 that uses IEEE-754 base-2 floating point numbers. Otherwise, you might get different output.↩︎

  103. It’s really surprising to me that C doesn’t have this in the spec yet. In the C99 Rationale document, they write, “A proposal to add binary constants was rejected due to lack of precedent and insufficient utility.” Which seems kind of silly in light of some of the other features they kitchen-sinked in there! I’ll bet one of the next releases has it.↩︎


  105. They’re the same except snprintf() allows you to specify a maximum number of bytes to output, preventing the overrunning of the end of your string. So it’s safer.↩︎


  107. We have to pass a pointer to badchar to strtoul() or it won’t be able to modify it in any way we can see, analogous to why you have to pass a pointer to an int to a function if you want that function to be able to change the value of that int.↩︎

  108. Each character has a value associated with it for any given character encoding scheme.↩︎

  109. In practice, what’s probably happening on your implementation is that the high-order bits are just being dropped from the result, so a 16-bit number 0x1234 being converted to an 8-bit number ends up as 0x0034, or just 0x34.↩︎

  110. Again, in practice, what will likely happen on your system is that the bit pattern for the original will be truncated and then just used to represent the signed number, two’s complement. For example, my system takes an unsigned char of 192 and converts it to signed char -64. In two’s complement, the bit pattern for both these numbers is binary 11000000.↩︎

  111. Not really—it’s just discarded regularly.↩︎

  112. Functions with a variable number of arguments.↩︎

  113. This is rarely done because the compiler will complain and having a prototype is the Right Thing to do. I think this still works for historic reasons, before prototypes were a thing.↩︎





  118. Historically, MS-DOS and Windows programs would do this differently than Unix. In Unix, the shell would expand the wildcard into all matching files before your program saw it, whereas the Microsoft variants would pass the wildcard expression into the program to deal with. In any case, there are arguments that get passed into the program.↩︎

  119. Since they’re just regular parameter names, you don’t actually have to call them argc and argv. But it’s so very idiomatic to use those names that, if you get creative, other C programmers will look at you with a suspicious eye, indeed!↩︎

  120. ps, Process Status, is a Unix command to see what processes are running at the moment.↩︎



  123. In Windows cmd.exe, type echo %errorlevel%. In PowerShell, type $LastExitCode.↩︎

  124. If you need a numeric value, convert the string with something like atoi() or strtol().↩︎

  125. In Windows CMD.EXE, use set FROTZ=value. In PowerShell, use $Env:FROTZ=value.↩︎


  127. You can’t always just wrap the code in /* */ comments because those won’t nest.↩︎

  128. This isn’t really a macro—it’s technically an identifier. But it’s the only predefined identifier and it feels very macro-like, so I’m including it here. Like a rebel.↩︎

  129. A hosted implementation basically means you’re running the full C standard, probably on an operating system of some kind. Which you probably are. If you’re running on bare metal in some kind of embedded system, you’re probably on a standalone implementation.↩︎

  130. OK, I know that was a cop-out answer. Basically there’s an optional extension compilers can implement wherein they agree to limit certain types of undefined behavior so that the C code is more amenable to static code analysis. It is unlikely you’ll need to use this.↩︎

  131. Breakin’ the law… breakin’ the law…↩︎


  133. Technically we say that it has an incomplete type.↩︎

  134. Though some compilers have options to force this to occur—search for __attribute__((packed)) to see how to do this with GCC.↩︎

  135. super isn’t a keyword, incidentally. I’m just stealing some OOP terminology.↩︎

  136. Assuming 8-bit chars, i.e. CHAR_BIT == 8.↩︎


  138. I just made up that number, but it’s probably not far off.↩︎

  139. There’s some devil in the details with values that are stored in registers only, but we can safely ignore that for our purposes here. Also the C spec makes no stance on these “register” things beyond the register keyword, the description for which doesn’t mention registers.↩︎

  140. You’re very likely to get different numbers on yours.↩︎

  141. There is absolutely nothing in the spec that says this will always work this way, but it happens to work this way on my system.↩︎

  142. Even if E is NULL, it turns out, weirdly.↩︎


  144. Your C compiler is not required to add padding bytes, and the values of any padding bytes that are added are indeterminate.↩︎

  145. This will vary depending on the architecture, but my system is little endian, which means the least-significant byte of the number is stored first. Big endian systems will have the 12 first and the 78 last. But the spec doesn’t dictate anything about this representation.↩︎

  146. It’s an optional feature, so it might not be there—but it probably is.↩︎

  147. I’m printing out the 16-bit values reversed since I’m on a little-endian machine and it makes it easier to read here.↩︎

  148. Assuming they point to the same array object.↩︎

  149. The Go Programming Language drew its type declaration syntax inspiration from the opposite of what C does.↩︎

  150. Not that other languages don’t do this—they do. It is interesting how many modern languages use the same operators for bitwise that C does.↩︎


  152. That is, us lowly developers aren’t supposed to know what’s in there or what it means. The spec doesn’t dictate what it is in detail.↩︎

  153. Honestly, it would be possible to remove that limitation from the language, but the idea is that the macros va_start(), va_arg(), and va_end() should be able to be written in C. And to make that happen, we need some way to initialize a pointer to the location of the first parameter. And to do that, we need the name of the first parameter. It would require a language extension to make this possible, and so far the committee hasn’t found a rationale for doing so.↩︎

  154. “This planet has—or rather had—a problem, which was this: most of the people living on it were unhappy for pretty much of the time. Many solutions were suggested for this problem, but most of these were largely concerned with the movement of small green pieces of paper, which was odd because on the whole it wasn’t the small green pieces of paper that were unhappy.” —The Hitchhiker’s Guide to the Galaxy, Douglas Adams↩︎

  155. Remember that char is just a byte-sized integer.↩︎

  156. Except for isdigit() and isxdigit().↩︎

  157. For example, we could store the code point in a big-endian 32-bit integer. Straightforward! We just invented an encoding! Actually not; that’s what UTF-32BE encoding is. Oh well—back to the grind!↩︎

  158. Ish. Technically, it’s variable width—there’s a way to represent code points higher than \(2^{16}\) by putting two UTF-16 characters together.↩︎

  159. There’s a special character called the Byte Order Mark (BOM), code point 0xFEFF, that can optionally precede the data stream and indicate the endianness. It is not required, however.↩︎

  160. Again, this is only true in UTF-16 for characters that fit in two bytes.↩︎



  163. Presumably the compiler makes the best effort to translate the code point to whatever the output encoding is, but I can’t find any guarantees in the spec.↩︎

  164. With a format specifier like "%.12s", for example.↩︎

  165. wcscoll() is the same as wcsxfrm() followed by wcscmp().↩︎

  166. Ish—things get funky with multi-char16_t UTF-16 encodings.↩︎




  170. Apparently it doesn’t do Unix-style signals at all deep down, and they’re simulated for console apps.↩︎

  171. Confusingly, sig_atomic_t predates the lock-free atomics and is not the same thing.↩︎

  172. If sig_atomic_t is signed, the range will be at least -127 to 127. If unsigned, at least 0 to 255.↩︎

  173. This is due to how VLAs are typically allocated on the stack, whereas static variables live in storage that persists for the entire run of the program. And the whole idea with VLAs is they’ll be automatically deallocated when the stack frame is popped at the end of the function.↩︎


  175. I’d like to point out that using goto in all these cases is avoidable. You can use variables and loops instead. It’s just that some people think goto produces the best code in those circumstances.↩︎


  177. Which isn’t quite the same, since it’s an array, not a pointer to an int.↩︎

  178. A variable used here is an expression.↩︎

  179. Both “stack pointer” and “program counter” are related to the underlying architecture and C implementation, and are not part of the spec.↩︎

  180. The rationale here is that the program might store a value temporarily in a CPU register while it’s doing work on it. In that timeframe, the register holds the correct value, and the value on the stack might be out of date. Then later the register values would get overwritten and the changes to the variable lost.↩︎

  181. That is, remain allocated until the program ends with no way to free it.↩︎

  182. This works because in C, all pointers to structs are the same size regardless of which struct they point to. So the compiler doesn’t need to know the size of the struct node at this point; it just needs to know the size of a pointer.↩︎


  184. This was a harder one to research, and I’ll take any more information anyone can give me. I could be defined as _Complex_I or _Imaginary_I, if the latter exists. _Imaginary_I will handle signed zeros, but _Complex_I might not. This has implications with branch cuts and other complex-numbery-mathy things. Maybe. Can you tell I’m really getting out of my element here? In any case, the CMPLX() macros behave as if I were defined as _Imaginary_I, with signed zeros, even if _Imaginary_I doesn’t exist on the system.↩︎

  185. The simplicity of this statement doesn’t do justice to the incredible amount of work that goes into simply understanding how floating point actually functions.↩︎

  186. This is the only one that doesn’t begin with an extra leading c, strangely.↩︎

  187. Some architectures have different sized data that the CPU and RAM can operate with at a faster rate than others. In those cases, if you need the fastest 8-bit number, it might give you a 16- or 32-bit type instead because that’s just faster. So with this, you won’t know how big the type is, but it will be at least as big as you say.↩︎

  188. Namely, the system has 8, 16, 32, or 64 bit integers with no padding that use two’s complement representation, in which case the intN_t variant for that particular number of bits must be defined.↩︎

  189. On Earth, anyway. Who knows what crazy systems they use out there?↩︎

  190. OK, don’t murder me! GMT is technically a time zone while UTC is a global time system. Also some countries might adjust GMT for daylight saving time, whereas UTC is never adjusted for daylight saving time.↩︎

  191. Admittedly, there are more than two.↩︎



  194. You will on POSIX, where time_t is definitely an integer. Unfortunately the entire world isn’t POSIX, so there we are.↩︎


  196. I’m more a fan of shared-nothing, myself, and my skills with classic multithreading constructs are rusty, to say the least.↩︎

  197. Yes, pthreads with a “p”. It’s short for POSIX threads, a library that C11 borrowed liberally from for its threads implementation.↩︎

  198. Per §7.1.4¶5.↩︎

  199. Unless you thrd_detach(). More on this later.↩︎

  200. Though I don’t think they have to be. It’s just that the threads don’t seem to get rescheduled until some system call happens, like might happen with a printf()… which is why I have the printf() in there.↩︎

  201. Short for “mutual exclusion”, AKA a “lock” on a section of code that only one thread is permitted to execute.↩︎

  202. That is, your process will go to sleep.↩︎

  203. You might have expected it to be “time from now”, but you’d just like to think that, wouldn’t you!↩︎

  204. And that’s why they’re called condition variables!↩︎

  205. I’m not saying it’s aliens… but it’s aliens. OK, really more likely another thread might have been woken up and gotten to the work first.↩︎

  206. Survival of the fittest! Right? I admit it’s actually nothing like that.↩︎

  207. The __STDC_VERSION__ macro didn’t exist in early C89, so if you’re worried about that, check it with #ifdef.↩︎

  208. The reason for this is when optimized, my compiler has put the value of x in a register to make the while loop fast. But the register has no way of knowing that the variable was updated in another thread, so it never sees the 3490. This isn’t really related to the all-or-nothing part of atomicity, but is more related to the synchronization aspects in the next section.↩︎

  209. Until I say otherwise, I’m speaking generally about sequentially consistent operations. More on what that means soon.↩︎

  210. Sanest from a programmer perspective.↩︎

  211. Apparently C++23 is adding this as a macro.↩︎

  212. The spec notes that they might differ in size, representation, and alignment.↩︎

  213. I just pulled that example out of nowhere. Maybe it doesn’t matter on Intel/AMD, but it could matter somewhere, dangit!↩︎

  214. C++ elaborates that if the signal is the result of a call to raise(), it is sequenced after the raise().↩︎


  216. Because consume is all about the operations that are dependent on the value of the acquired atomic variable, and there is no atomic variable with a fence.↩︎














  230. You can do this with -O on the command line.↩︎



