Beej's Bit Bucket

 ⚡ Tech and Programming Fun

m4 Macros and CSS


I'm certainly not the first person to consider this, but it's a simple enough technique for managing CSS that it bears repeating.

The problem: your CSS has a bunch of constants hard-coded in it all over the place, because CSS (as of this writing) doesn't support the concept of variables.

Perhaps the original theme color was #1122ff, and so you have CSS like this, where the string "1122ff" appears about 8 million times in your various CSS files:

#somediv {
    background-color: #1122ff;
}

.othertext {
    color: #1122ff;
}

But then you find out that the new theme color is going to be #2211ff, and can you please change it? Sure, there's search-and-replace, but there might be false positives, and it's not easy to verify that every replacement was correct.

But there is a better way!


There are some pretty full-fledged tools built specifically for CSS. Most notably we have Less and Sass, but there are others. These are great tools with some wonderful features.

If I may, I have written a free, simpler, and better replacement tool in Python called goatee. As such, I don't use m4 for this kind of stuff.

See the "silver lining" part of the next section for why you might want to use m4, anyway. At least this page serves as a simple introduction to m4!

The Basics

Using the concept of "macros", we can substitute one string of characters for another.

Macro processors have been around for quite some time. C has one built into it, in fact. Most assemblers have them. And there's a rather infamous macro processor known as "m4".

Many people have been exposed to m4 through the configuration of sendmail and through GNU's autoconf tools, and, to be frank, it's rare that you hear good things about it. I chalk this up to the fact that m4 can be made over-the-top-super-zany-9000 and when normal people look at it, they flip out.

That's OK.

Also, m4 doesn't work with multibyte character encodings, but in many circles these are rare in CSS.

There is a silver lining, and it's something like this:

  1. m4 can be used in a basic sense in a way that's still human-readable.
  2. m4 is lightweight and fast.
  3. m4 is easy to run.
  4. m4 is easy to install, with very few dependencies.
  5. m4 is part of POSIX, and is widely available—and might already be installed on your system.
  6. GNU m4 will support multibyte characters someday. Probably.

Let's write an m4 script, call it "foobar.css.m4", that takes that color constant out and puts it in a macro. (If you're going to copy this, take care to note that the open quote is a backtick (`) and the close quote is an apostrophe ('); more on that later):

define(`THEMECOLOR', `#2211ff')dnl
#somediv {
    background-color: THEMECOLOR;
}

.othertext {
    color: THEMECOLOR;
}

And then we run m4 from the command line and redirect the output to a file (or don't and just see it on standard-out):

$ m4 foobar.css.m4 > foobar.css

Then this is the result:

#somediv {
    background-color: #2211ff;
}

.othertext {
    color: #2211ff;
}

Ta-daa! So now if you want to change the color, all you have to do is change the define() at the top of the file and run it through m4 again.

Let's check out that m4 define:

define(`THEMECOLOR', `#2211ff')dnl

There are two things that are weirdly unexpected here.

First of all, note the quotes around the elements. The open quote is a backtick (`) and the closing quote is an apostrophe ('). Quoting tells m4 not to expand any macros it finds between the quotes. So here, if a macro called "THEMECOLOR" already happened to be defined, m4 would not expand it before define() got to see the name. (m4 only matches whole words, so a macro called "THE" would never fire inside "THEMECOLOR", quoted or not.)

(If you don't like the quotes, you can change them to other characters using m4's changequote() built-in.)
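
As a rough sketch of what that looks like, here is the theme color example with the quote characters switched to << and >>. These delimiters are an arbitrary choice for illustration (GNU m4 accepts multi-character quote strings like these); pick anything that doesn't otherwise appear in your CSS. Square brackets, the choice autoconf makes, would collide with CSS attribute selectors.

```m4
changequote(`<<', `>>')dnl
dnl From here on, << and >> are the quote characters.
define(<<THEMECOLOR>>, <<#2211ff>>)dnl
.othertext {
    color: THEMECOLOR;
}
```

Run through m4, this should print the .othertext rule with "color: #2211ff;" in it, same as with the default quotes.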

The second weird thing to notice is that dangling "dnl" at the end of the line. This tells m4 to "delete through newline", or, in other words, ignore everything from here through the end of the line, newline included. If you don't have this there, the newline at the end of the define() line will be emitted into the final output, leading to an extra blank line in the file at that point. For CSS, that's harmless, so the dnl isn't strictly necessary. And maybe you're running the output through a CSS compressor anyway, if this is all part of an automated build. But if you don't like the extra newlines, add the dnl. It's very common in m4 code.
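
One related wrinkle worth knowing about: # is m4's default comment character, and m4 copies everything from a # to the end of the line straight to the output without expanding any macros in it. That is harmless when an ID selector sits on a line by itself, but a rule written on one line would come out with its macros untouched. A minimal sketch of one workaround is to call the changecom built-in with no arguments, which switches m4 comments off. The #header selector here is just an illustration:

```m4
changecom`'dnl
define(`THEMECOLOR', `#2211ff')dnl
#header { color: THEMECOLOR; }
```

With comments disabled, this should come out as "#header { color: #2211ff; }". Without the changecom line, the whole rule would be treated as a comment and THEMECOLOR would be left as-is. The empty pair of quotes after changecom is only there to keep it from running into the dnl.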

One final note: if you want your macro directly beside other text, you'll have to put parentheses after the name to separate it from the text that follows. The parentheses explicitly mark the end of the macro name. (Strictly speaking, m4 treats empty parentheses as a single empty argument, but for a macro that doesn't look at its arguments, the result is the same as the normal case where you use no parens.)

To demonstrate, let's say we want to show the string "3490", and we have it defined in two parts, "34" and "90":

define(`THIRTYFOUR', `34')dnl
define(`NINETY', `90')dnl
THIRTYFOURNINETY

When we run this, it outputs:

THIRTYFOURNINETY

What happened to the macro expansion? Well, obviously we didn't define a macro called "THIRTYFOURNINETY", so m4 didn't substitute anything. But if we add the parens after the first macro's name:

define(`THIRTYFOUR', `34')dnl
define(`NINETY', `90')dnl
THIRTYFOUR()NINETY

We get our expected result:

3490

So that's the basics.
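
One small addition to those basics: parentheses only help when the extra text comes after the macro name. Another common m4 idiom, sketched here with the same two macros, is an empty pair of quotes, which expands to nothing but still splits the surrounding text into separate words. The SIZE_ prefix is made up for the example.

```m4
define(`THIRTYFOUR', `34')dnl
define(`NINETY', `90')dnl
THIRTYFOUR`'NINETY
SIZE_`'THIRTYFOUR
```

m4 should print "3490" and then "SIZE_34". Without the quotes, SIZE_THIRTYFOUR would be read as one unknown word and passed through unchanged.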

Including External Files

Maybe you have a few other CSS files and you want to include them into the main final CSS file during the m4 run. This is easily accomplished with the include() built-in:

define(`THEMECOLOR', `#2211ff')dnl

This will bring in another file, just as if you'd entered it there.

(There's also sinclude() which will silently do nothing if the file doesn't exist.)
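
Here is a small sketch of how sinclude() might be put to use. The file name theme.m4 is hypothetical, as is the idea of keeping the color definition in a separate, optional file. If that file is missing or doesn't define the color, the ifdef() built-in supplies a default:

```m4
dnl theme.m4 is a hypothetical file that may define THEMECOLOR.
dnl sinclude() skips it silently if it does not exist.
sinclude(`theme.m4')dnl
dnl Fall back to a default if THEMECOLOR is still undefined.
ifdef(`THEMECOLOR', `', `define(`THEMECOLOR', `#2211ff')')dnl
.othertext {
    color: THEMECOLOR;
}
```

Run in a directory with no theme.m4, this should print the rule with "color: #2211ff;". A plain include() call is written the same way, with the file name in quotes, but m4 reports an error if the file can't be found.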

Doing Math

One of the niftier things you can do with m4 is integer math with the eval() macro.

Now I've been struggling to think of an example where this would be required, since virtually everything can be done with relative and absolute positioning in CSS, with proper nesting of the elements in the HTML. But nevertheless, here is an example using two absolutely-positioned divs, where we want thing2 to be 40 pixels to the right of thing1:

define(`XPOS', `300')dnl
#thing1 {
    left: XPOS()px; /* m4 expands this to "300px" */
}

#thing2 {
    left: eval(XPOS + 40)px; /* m4 expands this to "340px" */
}
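
For what it's worth, one place integer math might earn its keep is a fixed-width column layout, where each column's width follows from a few constants. This is only a sketch: the macro names PAGEWIDTH, GUTTER, COLS, and COLWIDTH and the .column class are invented for the example.

```m4
define(`PAGEWIDTH', `960')dnl
define(`GUTTER', `30')dnl
define(`COLS', `3')dnl
dnl Width of one column: the page minus the gutters, split evenly.
define(`COLWIDTH', `eval((PAGEWIDTH - GUTTER * (COLS - 1)) / COLS)')dnl
.column {
    width: COLWIDTH()px;
    margin-right: GUTTER()px;
}
```

With these numbers, m4 should emit "width: 300px;" and "margin-right: 30px;". Keep in mind that eval() does integer division, so values that don't divide evenly are truncated.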

m4 and HTML

You might be considering m4 for HTML, too. Why not? You can include common HTML sidebars and header information, substitute various strings, build parts of the site conditionally, and so on.

My only caveats are:

  1. Don't get carried away and make some kind of crazy m4 megastructure that no one else can ever maintain.
  2. m4 doesn't do multibyte character encodings, such as UTF-8 (past the first 128 characters, anyway!).

Consider command-line PHP for prebuilding static HTML, instead. Or goatee. :-)

More Info

Probably the most notable things I haven't talked about here are macro expansion with arguments, and string manipulation. But that stuff's pretty straightforward if you have a handle on the stuff I've already mentioned, above.
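
As a taste of the first of those, here is a hedged sketch of a macro that takes an argument. Inside a definition, $1 stands for the first argument passed where the macro is used. The ROUNDED macro and the .button rule are made up for illustration:

```m4
define(`ROUNDED', `-webkit-border-radius: $1;
    border-radius: $1;')dnl
.button {
    ROUNDED(`4px')
}
```

This should expand to a .button rule containing "-webkit-border-radius: 4px;" and "border-radius: 4px;" on separate lines, so the radius only has to be written once per rule.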

To that end, here are some more sites with m4 information:


Historic Comments

 Steven Johnson 2010-12-18 18:08:06

Doesn't support multibyte character encodings? Sorry, it's 2010, not 1982; I now consider non-unicode-aware tools to be massive fail.

 beej 2010-12-19 01:18:18

@Steven Johnson I agree with the sentiment in general.

But in this case, I make a bit of an exception. The CSS file is likely to be ASCII. And if it's not, it's likely to be UTF-8, but only use the first 7 bits.

And not that this is an excuse, but C's "unique" wide character handling has to be part of the issue.

 Cody 2010-12-30 09:42:04

Awesome post, Beej. Very cool you have a blog after all these years, back when you were using your uni's server for hosting of everything.

Very useful info here; never thought of it this way (though to be fair I use postfix mostly so rarely if ever use m4).

And, to Steven: a few things to maybe consider.

1) m4 isn't from 2010; it's from long before that. No need to fix non broken things (see next point)
2) First thing I thought of when I read multibyte characters is the C type : wchar_t ; in which case, do know that to support it it'd not be nearly as portable. I"ll quote from the TLDP howto on unicode just to confirm :

"A `wchar_t' may or may not be encoded in Unicode; this is platform and sometimes also locale dependent. A multibyte sequence `char *' may or may not be encoded in UTF-8; this is platform and sometimes also locale dependent. "

So there's several things there that say why it may not be included (perhaps this is what beej was referring to even, with how C deals with it). I could be wrong, but I would have thought that's what he means by multibyte chars.

But that aside, is it really a big deal ? Look up iconv command (though, gnu); I had a problem when converting a huge amount of data files to xml (libxml2 had a bloody fit!). One for loop in a bash shell fixed it with iconv.

Besides, the program is originally from 1979 or even earlier.

Unix/linux is and always has been full of small utilities that do one thing and one thing very well, and m4 is a macro processing utility. Since wchar's are not as portable, it also makes some sense to not support it (but again, iconv or similar).

Just something to keep in mind, as really, it's not a failure. If you use that logic, and expect all to be fixed, well we'd already be on ipv6, we'd have less issues with tcp/udp/ip (vulnerabilities et al), and Windows would be gone!

 kostas vitoroulis 2011-04-17 15:15:23

Thanks for the tip! That is a really smart one!

As for Steve's comment regarding the utility being old: Yes there is general truth to the statement but why re-invent the wheel when the specific task gets done perfectly well, simply, quickly and in a very lean fashion?

 beej 2012-01-26 05:44:49

Eventually, however, I wrote a simple tool to handle everything I needed, and was cleaner and more flexible than m4 for this usage. :-)

