After taking a semester to study compilers and their value in undergraduate education, I found that they are a unique topic in Computer Science. The only problem is, most people either don't know or don't care what they are! Well, I hope to (a) introduce you to compilers and (b) explain why I think they're so cool.
What Are Compilers?
Glad you asked. A compiler is a program that a programmer uses all the time. When a person writes code in a language like Java or C++, they save it into a file, and then tell a compiler to compile their code into machine instructions. When the compiler is done, you've got an executable program!
So the compiler sits between a programmer and the computer's actual instructions. Programmers use compilers quite frequently because they're always testing out their code. Here are the basic requirements of a compiler, with a few comments.
- The compiler must be able to read the syntax of the source programming language to understand the semantics
A source language here is what's referred to as a "high-level" language (C++, Java, Perl, etc). Programming languages are a lot more structured than natural languages. There are few, if any, ambiguities and the syntax always reflects the semantics. This requirement is fulfilled by a "front end", software that converts the programmer's written code into a data structure called an "abstract syntax tree".
- The compiler must know a lot about the target language/architecture
For example, a C++ compiler for the PowerPC architecture (architecture for the Macintosh, XBox 360, and various embedded devices) must know a lot about how that system operates. Every architecture has its own quirks and speedups, so the compiler writers need to know how to use that system most efficiently. This requirement is fulfilled by a "back end", so that every source language has a front end and every target language/architecture has a back end and they hook up easily in between.

Slow AND large is bad.

- The compiler must be able to make the code run optimally
When people say "optimally", they usually mean fast. Most of the time we want our programs to run as fast as possible, but this usually means you use up more memory. Sometimes you want your code to take up little memory, so the compiler should be able to do that, too. Time and space are always a tradeoff in Computer Science. The compiler ought to be configurable to perform optimizations for speed, space, or a little of both. Turning on optimizations should never break your code! (ahem, GCC -O3...)
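The three requirements above can be sketched as a toy pipeline in Python. This is only a minimal illustration under stated assumptions, not how a production compiler is built: the tiny expression language (integers, variable names, `+`, `-`, `*`, parentheses), the tuple-based syntax tree, the stack-machine instruction names (`PUSH`, `LOAD`, `ADD`, `SUB`, `MUL`), and every function name here are hypothetical and chosen just for this example. The front end builds an abstract syntax tree, an optional optimization pass folds constants, and the back end emits instructions.

```python
import operator
import re

# --- Front end: source text -> abstract syntax tree (AST) ---
# AST nodes are plain tuples: ("num", 3), ("var", "x"), or (op, left, right).
TOKEN = re.compile(r"\s*(\d+|[A-Za-z_]\w*|[-+*()])")

def tokenize(source):
    tokens, pos = [], 0
    source = source.rstrip()
    while pos < len(source):
        match = TOKEN.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens

def parse(source):
    tokens = tokenize(source)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = peek()
        pos += 1
        return token

    def expression():  # expression := term (("+" | "-") term)*
        node = term()
        while peek() in ("+", "-"):
            op = advance()
            node = (op, node, term())
        return node

    def term():  # term := factor ("*" factor)*
        node = factor()
        while peek() == "*":
            op = advance()
            node = (op, node, factor())
        return node

    def factor():  # factor := NUMBER | NAME | "(" expression ")"
        token = advance()
        if token is not None and token.isdigit():
            return ("num", int(token))
        if token is not None and token.isidentifier():
            return ("var", token)
        if token == "(":
            node = expression()
            if advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError(f"unexpected token {token!r}")

    tree = expression()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree

# --- Optimization: constant folding (compute constant subtrees at compile time) ---
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def fold_constants(node):
    if node[0] in ("num", "var"):
        return node
    op, left, right = node
    left, right = fold_constants(left), fold_constants(right)
    if left[0] == "num" and right[0] == "num":
        return ("num", OPS[op](left[1], right[1]))
    return (op, left, right)

# --- Back end: AST -> instructions for a made-up stack machine ---
MNEMONICS = {"+": "ADD", "-": "SUB", "*": "MUL"}

def generate(node):
    if node[0] == "num":
        return [f"PUSH {node[1]}"]
    if node[0] == "var":
        return [f"LOAD {node[1]}"]
    op, left, right = node
    return generate(left) + generate(right) + [MNEMONICS[op]]

def compile_expression(source, optimize=True):
    tree = parse(source)
    if optimize:
        tree = fold_constants(tree)
    return generate(tree)

print(compile_expression("x + 2 * 3", optimize=False))
print(compile_expression("x + 2 * 3"))
```

With optimization off, `x + 2 * 3` becomes five instructions (`LOAD x`, `PUSH 2`, `PUSH 3`, `MUL`, `ADD`); with constant folding on, it shrinks to three (`LOAD x`, `PUSH 6`, `ADD`) and still computes the same value. Because the front end and back end only share the tree, either one could in principle be swapped out without touching the other.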
Levels of Abstraction
With compilers, and with all of Computer Science for that matter, it's all about "levels of abstraction". So when I call C++ or Java a "high-level language", I mean that they are high in terms of abstraction, not high in terms of difficulty. Actually, a higher level of abstraction is supposed to make programming easier. The more abstractly you can specify your program, the less specific you have to be (duh!)
I mention abstraction here
because compilers take one level of abstraction and fill in the details; that is, they understand a
program and then write it. Compilers play the same role between a programmer's code and the computer
that programmers play between their managers and their code. Suppose a manager comes up one day and says to
his software engineer:
Hi Bob, how's it going. Yeah, I'm gonna need you to write a word processor this week. And if you could hand in that TPS report on Saturday, that'd be great...
Okay, so the manager needs a word processor. That's it. No details, only some idea what that means, but that's what's needed. So the software engineer sits down and interprets what the manager meant by that sentence and writes high-level code for a word processor. The software engineer just took an abstract statement and made it more concrete, thus taking the project down one more level of abstraction like a compiler does.
The place where my analogy breaks down is the human/machine component of it. We can't be that vague with a computer; computers need incredibly specific details to perform even the easiest tasks. High-level languages allow programmers to still be specific, but not verbose. So that's why we have humans take the vague user requirements and make them concrete in a high-level language, and then we just have a program translate that into a low-level language that the computer understands.
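As a small illustration of "specific, but not verbose", Python's standard `dis` module can list the lower-level steps that CPython's built-in compiler produces for a one-line function. Two caveats: these are bytecode instructions for the Python virtual machine, not the machine instructions a C++ compiler would emit, and the exact instruction names differ between Python versions. The `area` function is just a hypothetical example.

```python
import dis

def area(width, height):
    # One high-level line: specific, but not verbose.
    return width * height

# Ask CPython which lower-level instructions it compiled that line into.
instructions = [ins.opname for ins in dis.get_instructions(area)]
print(instructions)
```

The single `return width * height` line expands into several separate steps (loading each value, multiplying, returning), which is the kind of detail the compiler fills in so the programmer doesn't have to.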
Why People (Usually) Don't Care
Whenever I describe my independent study on compilers to people, I get two reactions depending on the audience.
- If the audience is geeky and knows compilers
Wow, you ARE crazy!
- If the audience has no idea what a compiler is
That's nice.
The geeks are turned off
by the horror stories of late-night programming marathons, so they generally turn a deaf ear to the topic -
figuring they know nothing about it. Additionally, there are relatively few "real world" compilers
out there, so there's a very small chance that someone would actually build one of these things.
Studying compiler theory, then, is usually confined to academic settings.
For the non-geeks, compilers are just tools for programmers, so they don't see the importance. And since non-programmers don't program, they have no appreciation for how difficult the compiling process is!
Why People SHOULD Care
Compilers have an enormous amount of power in how they translate these programs. Sometimes companies will write their own compilers for languages like C++ to perform specific optimizations for their systems! So even though only a few people in the world actually develop compilers, they're used ubiquitously - making them a crucial component.
Compilers are also quite important in an academic setting because they involve a huge number of algorithms which need to work together seamlessly. One could teach an entire compiler course on the theory of language recognition without ever talking about back ends and optimizations. Normally, software projects have one big algorithm implemented, but compilers have many distinct algorithms going on.

Testing is absolutely crucial.

Lastly, compilers are great in an academic setting because they are great programming projects. My independent study was taking Calvin's old compilers course and reorganizing the curriculum so that students can get all of the components in their compiler working. Putting together large modules to see the importance of unit testing, functional testing, and acceptance testing was a huge step for me. Not to mention I got to implement my own programming language!
Conclusion
There's a whole lot more to say on compilers, but I'll leave that for later articles. I'm just going to say here that compilers are really fascinating pieces of software that often go underappreciated and misunderstood. I may not study these things in depth later on in my CS career, but I certainly do appreciate what they are and what they can do.
Page last updated: February 04, 2008