How it works
Usually the translation is from source code (generally in a high-level language) to target code (generally low-level object code or machine language) that may be directly executed by a computer or a virtual machine. However, a compiler from a low-level language to a high-level one is also possible; this is normally known as a decompiler if it reconstructs a high-level program that could have generated the low-level code. Compilers also exist that translate from one high-level language to another, or sometimes to an intermediate language that still needs further processing; these are sometimes known as cascaders.
Typical compilers output so-called object files, which basically contain machine code augmented by information about the names and locations of entry points and external calls (to functions not contained in the object file). A set of object files, which need not all have come from a single compiler provided that the compilers used share a common output format, may then be linked together to create the final executable, which can be run directly by a user.
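The relationship between object files and the linker can be sketched with a toy model. This is only an illustration under simplifying assumptions: the `ObjectFile` class, its field names, and the instruction words such as `CALL` are hypothetical, and real object formats carry far more information.

```python
# Toy model of object files and linking (illustrative only; the class,
# field names and "instructions" are hypothetical, not a real format).
from dataclasses import dataclass, field


@dataclass
class ObjectFile:
    name: str
    code: list                                           # placeholder "machine code" words
    entry_points: dict = field(default_factory=dict)     # symbol -> offset in code
    external_calls: dict = field(default_factory=dict)   # offset in code -> symbol


def link(objects):
    """Concatenate the code, then patch each external call with a final address."""
    image, base, symbols = [], {}, {}
    # First sweep: lay out each object and record where its entry points land.
    for obj in objects:
        base[obj.name] = len(image)
        for symbol, offset in obj.entry_points.items():
            if symbol in symbols:
                raise ValueError(f"duplicate symbol: {symbol}")
            symbols[symbol] = base[obj.name] + offset
        image.extend(obj.code)
    # Second sweep: resolve calls to functions defined in other objects.
    for obj in objects:
        for offset, symbol in obj.external_calls.items():
            if symbol not in symbols:
                raise ValueError(f"undefined symbol: {symbol}")
            image[base[obj.name] + offset] = symbols[symbol]
    return image, symbols


main_o = ObjectFile("main.o", ["CALL", None, "HALT"],
                    entry_points={"main": 0}, external_calls={1: "square"})
math_o = ObjectFile("math.o", ["MUL", "RET"], entry_points={"square": 0})

image, symbols = link([main_o, math_o])
```

Here `main.o` leaves a hole where the address of `square` belongs, and the linker fills it in once it knows where `math.o` was placed. The two objects could have come from different compilers, as long as both describe their entry points and external calls in the same way.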
Types of compilers
A compiler may produce code intended to run on the same platform as the compiler itself. This is sometimes called a native-code compiler. Alternatively, it might produce code designed to run on a different platform; this is known as a "cross compiler". Cross compilers are very useful when bringing up a new hardware platform for the first time (see bootstrapping).
A "source to source compiler" is a type of compiler that takes a high-level language as its input and outputs a high-level language. For example, an automatic parallelizing compiler will frequently take a high-level language program as input, transform the code, and annotate it with parallel code annotations (e.g. OpenMP) or language constructs (e.g. FORTRAN DOALL statements).
Many modern compilers share a common 'two stage' design. The first stage, the 'compiler front end', translates the source language into an intermediate representation. The second stage, the 'compiler back end', works with the intermediate representation to produce code in the output language.
While compiler design is a complex task, this approach mitigates the complexity: replacing the front end retargets the compiler to a different source language, and replacing the back end retargets it to a different output language. This way, modern compilers are often portable and allow multiple dialects of a language to be compiled.
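A minimal sketch of the two-stage idea follows, assuming a tiny arithmetic-expression language. The nested-tuple intermediate representation, the function names, and the stack-machine mnemonics (`PUSH`, `LOAD`, `ADD`, ...) are all hypothetical choices made for illustration, not a description of any particular compiler.

```python
import re


def front_end(source):
    """Parse an infix expression such as 'a + b * 2' into a nested-tuple IR."""
    tokens = re.findall(r"\d+|[A-Za-z_]\w*|[-+*/()]", source)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        pos += 1
        return tokens[pos - 1]

    def factor():
        tok = advance()
        if tok == "(":
            node = expr()
            advance()  # consume the closing ")"
            return node
        return ("num", int(tok)) if tok.isdigit() else ("var", tok)

    def term():
        node = factor()
        while peek() in ("*", "/"):
            op = advance()
            node = (op, node, factor())
        return node

    def expr():
        node = term()
        while peek() in ("+", "-"):
            op = advance()
            node = (op, node, term())
        return node

    return expr()


def stack_back_end(ir):
    """Back end 1: emit instructions for a hypothetical stack machine."""
    kind = ir[0]
    if kind == "num":
        return [f"PUSH {ir[1]}"]
    if kind == "var":
        return [f"LOAD {ir[1]}"]
    names = {"+": "ADD", "-": "SUB", "*": "MUL", "/": "DIV"}
    return stack_back_end(ir[1]) + stack_back_end(ir[2]) + [names[kind]]


def prefix_back_end(ir):
    """Back end 2: emit a prefix-notation expression from the same IR."""
    kind = ir[0]
    if kind in ("num", "var"):
        return str(ir[1])
    return f"({kind} {prefix_back_end(ir[1])} {prefix_back_end(ir[2])})"


ir = front_end("a + b * 2")
stack_code = stack_back_end(ir)
prefix_code = prefix_back_end(ir)
```

Both back ends consume the same intermediate representation, so supporting a new output language means writing only a new back end, and supporting a new source language means writing only a new front end. The parser does no error handling and assumes well-formed input.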
Certain languages are capable of being compiled in a single pass, owing to the design of the language: rules placed on the declaration of variables and other objects, and the requirement that executable procedures be declared before they are referenced or used.
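Why declaration before use makes a single pass possible can be shown with a small sketch. The statement format (`("proc", name)` and `("call", name)`) and the emitted `LABEL`/`CALL` strings are hypothetical; the point is only that code is emitted immediately, so every name must already be known when it is used.

```python
def compile_single_pass(statements):
    """Translate statements in one left-to-right sweep, emitting code immediately."""
    declared = {}   # procedure name -> address of its label in the output
    output = []
    for kind, name in statements:
        if kind == "proc":
            declared[name] = len(output)
            output.append(f"LABEL {name}")
        elif kind == "call":
            # With no second pass available, the target must already be declared.
            if name not in declared:
                raise NameError(f"{name} used before declaration")
            output.append(f"CALL {declared[name]}")
        else:
            raise ValueError(f"unknown statement kind: {kind}")
    return output


ok = compile_single_pass([("proc", "f"), ("call", "f")])
```

Reversing the two statements makes the call a forward reference, and this sketch rejects it with a `NameError`. A language that permits forward references needs either a later pass or some form of back-patching to fill in addresses that are not yet known.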
The compiler front end consists of multiple phases itself, each informed by formal language theory:
While there are applications where only the compiler front end is necessary, such as static language verification tools, a real compiler hands the intermediate representation generated by the front end to the back end, which produces a functionally equivalent program in the output language. This is done in multiple steps:
Many people divide higher-level programming languages into two categories: compiled languages and interpreted languages. In fact, most of these languages can be implemented through either compilation or interpretation; the categorisation merely reflects which method is most commonly used to implement the language. (Some interpreted languages, however, cannot easily be implemented through compilation, especially those which allow self-modifying code.)
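To illustrate that the same language can be implemented either way, the sketch below handles one small expression language twice: once with a tree-walking interpreter and once by compiling the tree into a Python closure. The tuple-based tree format and the function names are assumptions made for this example only.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def interpret(node, env):
    """Interpretation: walk the tree again on every evaluation."""
    kind = node[0]
    if kind == "num":
        return node[1]
    if kind == "var":
        return env[node[1]]
    return OPS[kind](interpret(node[1], env), interpret(node[2], env))


def compile_expr(node):
    """Compilation: walk the tree once and return a function to run later."""
    kind = node[0]
    if kind == "num":
        value = node[1]
        return lambda env: value
    if kind == "var":
        name = node[1]
        return lambda env: env[name]
    op, left, right = OPS[kind], compile_expr(node[1]), compile_expr(node[2])
    return lambda env: op(left(env), right(env))


# x + 3 * y
tree = ("+", ("var", "x"), ("*", ("num", 3), ("var", "y")))
run = compile_expr(tree)

interpreted = interpret(tree, {"x": 1, "y": 2})
compiled = run({"x": 1, "y": 2})
```

Both paths give the same result for the same input. The difference lies in when the tree is analysed: the interpreter inspects it during every evaluation, while the compiled form does that work once up front. The target here is a Python closure rather than machine code, which keeps the example short.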
Compilers: Principles, Techniques and Tools by Alfred V. Aho, Ravi Sethi, and Jeffrey D. Ullman (ISBN 0201100886) is considered to be the standard authority on compiler basics, and makes a good primer for the techniques mentioned above. (It is often called the Dragon Book because of the picture on its cover showing a Knight of Programming fighting the Dragon of Compiler Design.)
Understanding and Writing Compilers: A Do It Yourself Guide (ISBN 0333217322) by Richard Bornat is an unusually helpful book, being one of the few that adequately explains the recursive generation of machine instructions from a parse-tree. Having learnt his subject in the early days of mainframes and minicomputers, the author has many useful insights that more recent books often fail to convey.
During the 1990s a large number of free compilers and compiler development tools were developed for all kinds of languages, both as part of the GNU project and through other open-source initiatives. Some of them are considered to be of high quality, and their freely available source code makes a nice read for anyone interested in modern compiler concepts.
Compiler design
In the past, compilers were divided into many passes to save space. When each pass finished, the compiler could free the space needed during that pass.
Compiler front end
Compiler back end
Compiled vs. Interpreted languages
Further reading
See also
External links