2010-12-18
m4 Macros and CSS
I'm certainly not the first person to consider this, but it's a simple enough technique for managing CSS that it bears repeating.
The problem: your CSS has a bunch of constants hard-coded in it all over the place, because CSS doesn't support the concept of variables.
Perhaps the original theme color was #1122ff, and so you have CSS like this, where the string 1122ff appears about 8 million times in your various CSS files:
#somediv {
background-color: #1122ff;
}
.othertext {
color: #1122ff;
}
But then you find out that the new theme color is going to be #2211ff, and can you please change it. Sure, there's search-and-replace, but there might be false positives, and it's not easy to verify that every replacement went through correctly.
But there is a better way!
Warning
There are some pretty full-fledged tools built specifically for CSS. Most notably we have Less and Sass, but there are others. These are great tools with some wonderful features.
See the "silver lining" part of the next section for why you might want to use m4 anyway. At least this page serves as a simple introduction to m4!
The Basics
Using the concept of "macros", we can substitute one string of characters for another.
Macro processors have been around for quite some time. C has one built into it, in fact. Most assemblers have them. And there's a rather infamous macro processor known as "m4".
Many people have been exposed to m4 through sendmail configuration or GNU's autoconf tools and, to be frank, it's rare you hear good things about it. I chalk this up to the fact that m4 can be made over-the-top-super-zany-9000 and when normal people look at it, they flip out.
That's OK.
Also, m4 doesn't work with multibyte character encodings, but in many circles these are rare in CSS.
There is a silver lining, and it's something like this:
- m4 can be used in a basic sense in a way that's still human-readable.
- m4 is lightweight and fast.
- m4 is easy to run.
- m4 is easy to install, with very few dependencies.
- m4 is part of POSIX, is widely available, and might already be installed on your system.
- GNU m4 will support multibyte characters someday. Probably.
Let's write an m4 script, call it "foobar.css.m4", that takes that color constant out and puts it in a macro. (If you're going to copy this, take care to note the open and close quotes are backticks (`) and apostrophes ('); more on that later):
define(`THEMECOLOR', `#2211ff')dnl
#somediv {
background-color: THEMECOLOR;
}
.othertext {
color: THEMECOLOR;
}
And then we run m4 from the command line and redirect the output to a file (or don't, and just see it on standard output):
$ m4 foobar.css.m4 > foobar.css
Then this is the result:
#somediv {
background-color: #2211ff;
}
.othertext {
color: #2211ff;
}
Ta-daa! So now if you want to change the color, all you have to do is change the define() at the top of the file and run it through m4 again.
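If you'd rather not edit the file at all, the value can also be handed to m4 when it runs. Here's a minimal sketch, assuming the define() line has been left out of the file and using the -D option from the POSIX m4 command line. The file name theme.css.m4 is made up for illustration.

```shell
# Write a CSS template that uses THEMECOLOR but does not define it.
# (theme.css.m4 is a hypothetical file name.)
cat > theme.css.m4 <<'EOF'
.othertext {
color: THEMECOLOR;
}
EOF

# Supply the macro on the command line instead of in the file.
# The quotes are a precaution, since # can start a shell comment.
m4 -D 'THEMECOLOR=#2211ff' theme.css.m4 > theme.css
cat theme.css
```

Note that a define() inside the file would replace a value given with -D, since the file is read after the command-line definitions are made.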
Let's check out that m4 define:
define(`THEMECOLOR', `#2211ff')dnl
There are two things that are weirdly unexpected here.
First of all, note the quotes around the elements. The open quote is a backtick (`) and the closing quote is an apostrophe ('). This tells m4 not to expand any macros it finds between the quotes. So here, just in case THEMECOLOR is already defined as a macro, m4 won't expand it before define() gets to see the name, because `THEMECOLOR' is quoted. (m4 only ever matches whole words, so a macro called "THE" wouldn't touch THEMECOLOR either way.)
(If you don't like the quotes, you can change them to other characters using m4's changequote() built-in.)
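As a rough sketch of what that looks like, here is the same stylesheet with the quotes switched to doubled angle brackets. The choice of << and >> is arbitrary on my part, and the file name quotes.css.m4 is hypothetical.

```shell
# Same example as before, but with << and >> as the quote strings.
# (quotes.css.m4 is a hypothetical file name.)
cat > quotes.css.m4 <<'EOF'
changequote(<<, >>)dnl
define(<<THEMECOLOR>>, <<#2211ff>>)dnl
.othertext {
color: THEMECOLOR;
}
EOF

m4 quotes.css.m4
```

Whatever you pick, it should be something that never shows up in your CSS. Square brackets, for instance, would collide with attribute selectors.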
The second weird thing to notice is that dangling "dnl" at the end of the line. This tells m4 to "delete through newline", or, in other words, ignore everything from here to the start of the next line. If you don't have this there, the newline at the end of the define() line will be emitted into the final output, leading to an extra blank line in the file at that point. For CSS, it's harmless, so the dnl isn't strictly necessary. And maybe you're running it through a CSS compressor anyway, if this is all part of an automated build. But if you don't like the extra newlines, add the dnl. It's really very common in m4 code.
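To see the difference for yourself, here's a small sketch that runs the same two lines with and without dnl and counts the output lines. The file names are made up.

```shell
# Two tiny inputs that differ only in the trailing dnl.
# (Both file names are hypothetical.)
cat > without_dnl.m4 <<'EOF'
define(`THEMECOLOR', `#2211ff')
color: THEMECOLOR;
EOF

cat > with_dnl.m4 <<'EOF'
define(`THEMECOLOR', `#2211ff')dnl
color: THEMECOLOR;
EOF

# Count output lines: the first file yields 2 (a blank line, then the CSS),
# the second yields 1.
m4 without_dnl.m4 | wc -l
m4 with_dnl.m4 | wc -l
```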
One final note: if you want your macro directly beside other text, you'll have to put parentheses after the name to separate it from that text. The parentheses tell m4 that you're invoking the macro explicitly. (Strictly speaking, m4 then passes a single empty argument, but for a macro that ignores its arguments the result is the same as the normal case where you use no parens.)
To demonstrate, let's say we want to show the string "3490", and we have it defined in two parts, "34" and "90":
define(`THIRTYFOUR', `34')dnl
define(`NINETY', `90')dnl
THIRTYFOURNINETY
When we run this, it outputs:
THIRTYFOURNINETY
What happened to the macro expansion? Well, obviously we didn't declare a macro called THIRTYFOURNINETY, so m4 didn't substitute anything.
But if we use the parens to explicitly say the macros have no arguments:
define(`THIRTYFOUR', `34')dnl
define(`NINETY', `90')dnl
THIRTYFOUR()NINETY
We get our expected result:
3490
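Another way to split the two names, one you'll see a lot in other people's m4, is an empty pair of quotes between them. A minimal sketch (glue.m4 is a hypothetical file name):

```shell
# An empty quoted string (backtick immediately followed by apostrophe)
# ends the first macro name without adding anything to the output.
cat > glue.m4 <<'EOF'
define(`THIRTYFOUR', `34')dnl
define(`NINETY', `90')dnl
THIRTYFOUR`'NINETY
EOF

m4 glue.m4   # prints 3490
```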
So that's the basics.
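One catch that matters for CSS in particular: by default, m4 treats # as the start of a comment that runs to the end of the line, and it copies comments to the output without expanding anything inside them. The earlier examples get away with it because each # selector sits on a line with no macros after it. The sketch below, written with GNU m4 in mind, shows a one-line rule that does not get expanded, and then the same rule after turning comments off with the changecom built-in. The file name and selectors are made up.

```shell
# (comments.css.m4, #first, and #second are hypothetical names.)
cat > comments.css.m4 <<'EOF'
define(`THEMECOLOR', `#2211ff')dnl
#first { color: THEMECOLOR; }
changecom`'dnl
#second { color: THEMECOLOR; }
EOF

# The #first line is passed through as an m4 comment, so THEMECOLOR stays
# unexpanded there; after changecom, the #second line is expanded.
m4 comments.css.m4
```

If your rules tend to live on one line, calling changecom near the top of the file is probably the simplest fix.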
Including External Files
Maybe you have a few other CSS files and you want to include them into the main final CSS file during the m4 run. This is easily accomplished with the include() built-in:
define(`THEMECOLOR', `#2211ff')dnl
include(`common.css.m4')dnl
This will bring in another file, just as if you'd entered it there. (There's also sinclude() which will silently do nothing if the file doesn't exist.)
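Here's a small end-to-end sketch of both built-ins. It assumes common.css.m4 lives in the directory where m4 is run. The names main.css.m4 and missing.css.m4 are hypothetical, and the latter is deliberately never created.

```shell
# The shared file relies on THEMECOLOR being defined by whoever includes it.
cat > common.css.m4 <<'EOF'
.othertext {
color: THEMECOLOR;
}
EOF

# The main file defines the color, pulls in the shared rules, and then
# tries to sinclude a file that does not exist (which is quietly skipped).
cat > main.css.m4 <<'EOF'
define(`THEMECOLOR', `#2211ff')dnl
include(`common.css.m4')dnl
sinclude(`missing.css.m4')dnl
#somediv {
background-color: THEMECOLOR;
}
EOF

m4 main.css.m4 > main.css
cat main.css
```

Macros defined before the include() are visible inside the included file, which is what lets one define() at the top drive every file.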
Doing Math
One of the niftier things you can do with m4 is integer math with the eval() macro.
Now I've been struggling to think of an example where this would be required, since virtually everything can be done with relative and absolute positioning in CSS, with proper nesting of the elements in the HTML. But nevertheless, here is an example using two absolutely-positioned divs, where we want thing2 to be 40 pixels to the right of thing1:
define(`XPOS', `300')dnl
#thing1 {
left: XPOS()px; /* m4 expands this to "300px" */
}
#thing2 {
left: eval(XPOS + 40)px; /* m4 expands this to "340px" */
}
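Here's one more sketch along the same lines, deriving a column width from a few layout constants. All of the macro names and numbers are invented for the example. Keep in mind that eval() does integer math only, so the division is truncated.

```shell
# (grid.css.m4 and the PAGEWIDTH, COLUMNS, and GUTTER macros are hypothetical.)
cat > grid.css.m4 <<'EOF'
define(`PAGEWIDTH', `960')dnl
define(`COLUMNS', `3')dnl
define(`GUTTER', `20')dnl
.column {
width: eval((PAGEWIDTH - GUTTER * (COLUMNS - 1)) / COLUMNS)px;
}
EOF

# (960 - 20 * 2) / 3 is 306.67 in real arithmetic; eval() truncates to 306.
m4 grid.css.m4
```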
m4 and HTML
You might be considering m4 for HTML, too. Why not? You can include common HTML sidebars and header information, substitute various strings, build parts of the site conditionally, and so on.
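As a rough sketch of what that could look like, the page below includes a shared header, substitutes a site name, and emits one extra paragraph only when a DRAFT macro is defined on the command line. The file names, the SITENAME and DRAFT macros, and the markup are all made up for illustration.

```shell
# A shared header fragment that expects SITENAME to be defined.
cat > header.html.m4 <<'EOF'
<div id="header">SITENAME</div>
EOF

# The page: include the header, and conditionally add a draft notice.
cat > page.html.m4 <<'EOF'
define(`SITENAME', `My Site')dnl
<html>
<body>
include(`header.html.m4')dnl
<p>Welcome to SITENAME.</p>
ifdef(`DRAFT', `<p>Draft build, not for publication.</p>
')dnl
</body>
</html>
EOF

m4 page.html.m4 > page.html               # normal build, no draft notice
m4 -D DRAFT page.html.m4 > draft.html     # draft build, notice included
cat draft.html
```

Watch out for stray backticks in the HTML text, since m4 will treat one as the start of a quoted string.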
My only caveats are:
- Don't get carried away and make some kind of crazy m4 megastructure that no one else can ever maintain.
- m4 doesn't do multibyte character encodings, such as UTF-8 (beyond the 7-bit ASCII range, anyway!)
Consider command-line PHP for prebuilding static HTML, instead.
More Info
Probably the most notable things I haven't talked about here are macro expansion with arguments, and string manipulation. But that stuff's pretty straightforward if you have a handle on the stuff I've already mentioned, above.
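For a quick taste of macros with arguments, here is a minimal sketch. Inside a macro body, $1 stands for the first argument passed in. The ROUNDED macro, the .box selector, and the file name are all hypothetical.

```shell
# (rounded.css.m4 is a hypothetical file name.)
cat > rounded.css.m4 <<'EOF'
define(`ROUNDED', `-moz-border-radius: $1;
border-radius: $1;')dnl
.box {
ROUNDED(`5px')
}
EOF

# Each $1 in the macro body is replaced with 5px.
m4 rounded.css.m4
```

Further arguments would be available as $2, $3, and so on.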
To that end, here are some more sites with m4 information:
- Static Site Creation: even more information about the overall build process
Doesn't support multibyte character encodings? Sorry, it's 2010, not 1982; I now consider non-unicode-aware tools to be massive fail.
@Steven Johnson I agree with the sentiment in general.
But in this case, I make a bit of an exception. The CSS file is likely to be ASCII. And if it's not, it's likely to be UTF-8, but only use the first 7 bits.
And not that this is an excuse, but C's "unique" wide character handling has to be part of the issue.
Awesome post, Beej. Very cool you have a blog after all these years, back when you were using your uni's server for hosting of everything.
Very useful info here; never thought of it this way (though to be fair I use postfix mostly so rarely if ever use m4).
And, to Steven: a few things to maybe consider.
1) m4 isn't from 2010; it's from long before that. No need to fix non broken things (see next point)
2) First thing I thought of when I read multibyte characters is the C type wchar_t; in which case, do know that supporting it would make m4 not nearly as portable. I'll quote from the TLDP howto on Unicode just to confirm:
"A `wchar_t' may or may not be encoded in Unicode; this is platform and sometimes also locale dependent. A multibyte sequence `char *' may or may not be encoded in UTF-8; this is platform and sometimes also locale dependent. "
So there's several things there that say why it may not be included (perhaps this is what beej was referring to even, with how C deals with it). I could be wrong, but I would have thought that's what he means by multibyte chars.
But that aside, is it really a big deal ? Look up iconv command (though, gnu); I had a problem when converting a huge amount of data files to xml (libxml2 had a bloody fit!). One for loop in a bash shell fixed it with iconv.
Besides, the program is originally from 1979 or even earlier.
Unix/linux is and always has been full of small utilities that do one thing and one thing very well, and m4 is a macro processing utility. Since wchar's are not as portable, it also makes some sense to not support it (but again, iconv or similar).
Just something to keep in mind, as really, it's not a failure. If you use that logic, and expect all to be fixed, well we'd already be on ipv6, we'd have less issues with tcp/udp/ip (vulnerabilities et al), and Windows would be gone!
Thanks for the tip! That is a really smart one!
As for Steve's comment regarding the utility being old: Yes there is general truth to the statement but why re-invent the wheel when the specific task gets done perfectly well, simply, quickly and in a very lean fashion?
Eventually, however, I wrote a simple tool to handle everything I needed, and it was cleaner and more flexible than m4 for this usage. :-)
https://github.com/beejjorgensen/goatee