I don't know what to think of this post. I've definitely learned a whole hell of a lot from my compiler projects, but I don't see the point of turning compilers into a religion. I wouldn't go around telling people that they have to learn compilers to be a "real programmer", but it does seem like a pretty comprehensive litmus test: if you've written a compiler from scratch, it proves you really do understand something about algorithms, data structures, and systems software.
Steve implies "Learn compilers if you want to be a Real Programmer(TM)!", but I'd say: Write a compiler if you want a project that will teach you a lot and will be a lot of fun with interesting challenges along the way.
It's just that writing a compiler requires so much up-front knowledge of how computers work, and teaches you so much more, that it's hard to compare to any other project. Few others show you exactly what is happening in a computer and what every line of your code actually does to the machine you are working on or targeting.
Sure, you can learn just as much general knowledge from other projects, but I'm going to agree with Steve that it's pretty enlightening. I'll also agree that it is a bottomless pit for your time and effort.
After reading this post when it came out, I started learning about compilers. I bought the dragon book from AbeBooks.com (which cost just $3.50 there; Amazon = $85) and I am reading it now.
Engineering a Compiler by Cooper and Torczon is probably better these days than the dragon book. The dragon book is of primarily historical interest imo. (Unless you really do need to write an LL(k) parser generator...)
I would say writing code matters more than studying the books at length.
This is curious. I haven't studied compilers at all, but I always hear the Dragon book is the default. Can anyone second these recommendations (Cooper/Torczon, the Ghuloum paper)?
No, no book is good for that. You just need to pick a project and start coding. Interpreters are a fun place to begin. I'd suggest getting a parser generator and reimplementing Lua or something. You'll want to look up stuff in the Cooper & Torczon book as you do this (like what an abstract syntax tree is), but I doubt it's useful to read it through or study it. Reading the code of existing compilers is also very helpful. (A word to the wise: Don't start with the source code of GCC or another industrial-strength compiler. :))
(This is my philosophy toward learning anything programming related, so maybe I'm just weird.)
This is a very long article, and it's founded on a false premise.
Compilers is not an important topic because there's nothing universal you can take out of it. Machine code as we know it is a pretty arbitrary construction. Register machines themselves are pretty arbitrary---they're just one way to make use of transistor circuitry. The conversion from C to machine code is not really worth remembering. It's true that this system forms the basis for all our computing, but like the nanoscale physics of semiconductors, it's not required knowledge for programmers. It's just a bag of rules...it's not even a bag of tricks.
If you work at Intel optimizing compilers, then maybe you need to know this stuff. If you never hunch down and program inline assembly, why do you care? You're too far from the bytecode to leverage compiler knowledge against performance, and in fact you'd probably be wasting your time to do so. Its workings operate below the required level of abstraction. Understand what's going on, but leave the details to someone else. I don't need to know how Firefox implements an HTML parser to render my webpage. I just need to know that it does so in a consistent way.
Compiler implementations are pretty arbitrary, and oftentimes they're filled with a lot of goofy stuff. We shouldn't still be carrying cruft like Please-Excuse-My-Dear-Aunt-Sally around in our languages---and this is half of what a compiler does. If there's anything beautiful to take out of a compilers class, it's that a compiler can compile itself. There---I've spoiled the ending, skip the course and take a theory class instead.
You're focusing on all the wrong parts: the details. It's not like Steve is saying that you'll be greatly enlightened by memorizing how the IA-64 calling convention works. Steve is referring to the skills that you'd learn from solving any large and "algorithmic" problem, with the bonus that compilers will also teach you a bit about the low-level workings of your machine. You'll learn how to represent and manipulate complex data, you'll learn what a calling convention is, what a system call is.
Compilers is not an important topic because there's nothing universal you can take out of it. Machine code as we know it is a pretty arbitrary construction.
Compilers have nothing to do with machine code, other than that's sometimes the output target. There are plenty of other (often more interesting) targets.
Eval in Lisp is half a page. That's not a lot of material. If you further discount machine language, the only thing left for a compiler/interpreter to do is handle eccentricities in language syntax. This is rote and arbitrary application of data structures. Rarely do I find it edifying to "undo" complexity that someone else has added, at whim, to computation.
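For a sense of scale, here is a minimal sketch of such an eval, written in Python over s-expressions that have already been parsed into nested lists. It only handles numbers, symbols, quote, if, lambda, and application, and the names (evaluate, GLOBAL_ENV) are made up for illustration; it is not McCarthy's original definition or any particular Lisp's implementation.

```python
import operator

# Hypothetical global environment with a few primitive procedures.
GLOBAL_ENV = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "<": operator.lt,
}

def evaluate(expr, env=GLOBAL_ENV):
    """Evaluate an s-expression given as nested Python lists."""
    if isinstance(expr, str):              # symbol: look it up
        return env[expr]
    if not isinstance(expr, list):         # number: self-evaluating
        return expr
    head = expr[0]
    if head == "quote":                    # (quote x) -> x, unevaluated
        return expr[1]
    if head == "if":                       # (if test then else)
        _, test, then_branch, else_branch = expr
        return evaluate(then_branch if evaluate(test, env) else else_branch, env)
    if head == "lambda":                   # (lambda (params...) body)
        _, params, body = expr
        # The closure captures env; a call extends it with the arguments.
        return lambda *args: evaluate(body, {**env, **dict(zip(params, args))})
    # Application: evaluate the operator and operands, then call.
    func = evaluate(head, env)
    args = [evaluate(arg, env) for arg in expr[1:]]
    return func(*args)

# ((lambda (x) (if (< x 10) (* x x) x)) 7)
program = [["lambda", ["x"], ["if", ["<", "x", 10], ["*", "x", "x"], "x"]], 7]
print(evaluate(program))
```

It leaves out define, macros, and error handling, and it assumes the reader (the parser) exists elsewhere.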
Although Steve Yegge brings up a lot of interesting points, he's bad at explaining them.
Learning about compilers and programming languages is important because of the principles inherent in building them, which average programmers never completely get.
In my opinion, these principles are:
1) Code is data, and data can be code.
2) The only sensible way to deal with unmanageably-large amounts of code is by creating a language (whether this be an API, a protocol, a DSL, or a Turing-complete programming language) to communicate with.
3) Many classes of problems can be thought of as a transformation of a data-structure in one language to an "equivalent" data-structure in another language, which at its core, is all a compiler does.
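As a rough illustration of principle 3 (a sketch of my own, not something from Steve's post): a toy compiler that turns an expression tree, written as nested tuples, into instructions for a made-up stack machine. The function names (compile_expr, run) and the two-instruction set are hypothetical.

```python
# Toy illustration: translate an expression tree (nested tuples) into
# instructions for a made-up stack machine, then run them.
# All names and the instruction set here are hypothetical.

OPS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
}

def compile_expr(node):
    """Return a flat list of stack-machine instructions for an expression tree."""
    if isinstance(node, (int, float)):
        return [("PUSH", node)]
    op, left, right = node
    # Post-order walk: code for both operands first, then the operator.
    return compile_expr(left) + compile_expr(right) + [("APPLY", op)]

def run(program):
    """Execute the instruction list on a simple operand stack."""
    stack = []
    for instr, arg in program:
        if instr == "PUSH":
            stack.append(arg)
        else:  # "APPLY"
            right = stack.pop()
            left = stack.pop()
            stack.append(OPS[arg](left, right))
    return stack.pop()

tree = ("+", 1, ("*", 2, 3))      # 1 + 2 * 3
program = compile_expr(tree)
print(program)
print(run(program))
```

The tree and the instruction list are "equivalent" in the sense that running the list gives the same value you would get by evaluating the tree directly; the compiler is just the function between the two representations.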
He admits that situations 1-4 are pretty much solved once you have a parse tree. I think situations 5 and 7 are too. My perspective is that the compiler's work is only just getting started once you have a parse tree. I was hoping for a hint as to what about compilers was so important; if it's just parsing then no news here.
Compilers is one of the few classes I really regret not taking in college. Fortunately, I discovered the unprotected URL to my university's online lectures site (no, you can't have it) and might just have to watch this semester's class...
I started watching them, and the professor recommends this book (and the new edition of the Dragon Book):
I've skimmed, not read, the article, but it's good enough that I'll be causing some tree-death to read it slowly on the train home tonight. I might have to rebalance my algorithms v. compilers heuristic.