After you type your instructions in an editor by using a programming language such as C++ or Java, guess what? The computer doesn’t have the slightest idea what you just created. Computers understand only machine language, so you need to use another special program to convert your source code (the instructions that you write in C++ or Java) into machine language.
You can use either of the following two types of programs to convert source code into machine language:
Compilers
A compiler takes your source code, converts the entire thing into machine language, and then stores these equivalent machine-language instructions in a separate file, often known as an executable file. The process is like having a translator study an entire novel written in Spanish and then translate and rewrite it into Arabic.
Remember Whenever a compiler converts source code into machine language, it’s compiling a program.
After you compile a program, you can just give away copies of the executable (machine-language) version of your program without giving away your source code. Most commercial programs (such as Microsoft PowerPoint and Quicken) are compiled, but you never see the source code.
After you use a compiler to convert source code into machine language, you never need to use the compiler again (unless you make changes to your source code).
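To make the translate-once idea concrete, here is a minimal Python sketch. It is only an analogy: the three-instruction language (PUSH, ADD, PRINT) and the function names compile_program and run are invented for this example, and the translated output is a list of Python tuples rather than real machine language.

```python
# Toy illustration only: a real compiler emits machine language for a
# processor, not a list of Python tuples.

def compile_program(source):
    """Translate the ENTIRE source text up front into a list of instructions."""
    instructions = []
    for line_number, line in enumerate(source.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue  # skip blank lines
        op, _, arg = line.partition(" ")
        if op not in ("PUSH", "ADD", "PRINT"):
            # The whole program is rejected before anything runs.
            raise SyntaxError(f"line {line_number}: unknown instruction {op!r}")
        instructions.append((op, int(arg) if op == "PUSH" else None))
    return instructions


def run(instructions):
    """Execute already-translated instructions; the source text is not needed."""
    stack, output = [], []
    for op, arg in instructions:
        if op == "PUSH":
            stack.append(arg)
        elif op == "ADD":
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif op == "PRINT":
            output.append(stack[-1])
    return output


# Translate once, then run the translated version as often as you like.
program = compile_program("PUSH 2\nPUSH 3\nADD\nPRINT")
print(run(program))  # [5]
print(run(program))  # [5]
```

Notice that run never looks at the source text. That is the point of the analogy: once translation is done, only the translated version is needed.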
Remember A compiler creates machine language for a specific microprocessor and operating system. If you write a program in BASIC and want to run it on a Macintosh and a Windows computer, you need to compile your program twice: once for the Macintosh and once for Windows. However, some compilers, such as REALbasic, let you compile for multiple platforms in one step. So if you wrote a program in REALbasic and compiled it for Windows and Macintosh, you’d wind up with a compiled executable file for Windows and a second compiled executable file for Macintosh.
Technical Stuff Not all compilers are equal. Given identical C++ source code, one C++ compiler may create a program that runs quickly, whereas a second C++ compiler may create a smaller file that runs much more slowly, yet both programs may look and work exactly alike.
A compiler is nothing more than a program. So if you want to create a compiler, guess what? To write any type of program, you need a compiler (or an assembler, which is just a special compiler for converting assembly language into machine language). So here’s the dilemma: How did computer scientists create the first compiler?
To create the first compiler, computer scientists used a technique called bootstrapping, derived from the phrase “pulling yourself up by the bootstraps.” First, they wrote the bare bones of the compiler in machine language, which the computer can understand without any translation whatsoever. Then they used this bare-bones compiler to create an assembler so they could write more of the compiler in assembly language. Finally, when enough of the compiler had been built from a small base of machine-language code and a larger base of assembly-language code, they used the compiler itself to write additional instructions in a higher-level programming language (such as C) to build the rest of the compiler.
Nearly every language compiler is written partially or entirely in another programming language. Microsoft wrote the original Visual Basic compiler in assembly language but wrote later versions in C++. Shoptalk Systems wrote its Liberty BASIC compiler in a language called Smalltalk. So if you want to create a compiler for a brand-new programming language, you have to start creating that compiler by using an existing programming language.
Warning! Your program is at the mercy of the compiler you use. Many Macintosh programs were created by using a compiler called CodeWarrior, but when Apple switched from PowerPC to Intel processors, guess what? CodeWarrior wouldn’t compile C++ source code for the new Intel Macs. So everyone (including Microsoft and Adobe) who had written C++ programs with CodeWarrior had to rewrite their C++ programs and compile them with Apple’s C++ compiler instead. Switching compilers and rewriting programs to run under a different compiler is rarely easy, so choosing the right compiler is just as important as choosing the right programming language.
Interpreters
A second, but less popular, way to convert source code into machine language is to use an interpreter. An interpreter converts each line of your source code into machine language, one line at a time. The process is like giving a speech in English and having someone translate your sentences, one at a time, into another language (such as French).
Whereas a compiler stores machine code in a separate file, an interpreter converts source code into machine language but keeps the machine code only in the computer’s memory. When the program stops running or you turn off the computer, you lose the machine-language version of your program. Each time you want to run the program, you must feed the source code into the interpreter again.
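The line-at-a-time approach can be sketched in a few lines of Python. As before, this is only an analogy under stated assumptions: the tiny PUSH/ADD/PRINT language and the function name interpret are invented for this example, and the sketch carries out each line directly instead of producing real machine language.

```python
# Toy illustration only: each line is read, translated, and carried out
# before the next line is even looked at. Nothing is saved for later.

def interpret(source):
    """Run the source text one line at a time and return what it printed."""
    stack, output = [], []
    for line_number, line in enumerate(source.splitlines(), start=1):
        op, _, arg = line.strip().partition(" ")
        if op == "PUSH":
            stack.append(int(arg))
        elif op == "ADD":
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif op == "PRINT":
            output.append(stack[-1])
        elif op:
            # A bad line is discovered only when the interpreter reaches it.
            raise SyntaxError(f"line {line_number}: unknown instruction {op!r}")
    return output


source = "PUSH 2\nPUSH 3\nADD\nPRINT"

# Every run starts again from the source text.
print(interpret(source))  # [5]
print(interpret(source))  # [5]
```

Because interpret keeps nothing between calls, it needs the source text every time, which is why anyone who runs an interpreted program needs both the interpreter and the source code.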
If anyone wants to run your program, that person needs both an interpreter and the source code for your program. Because your source code enables everyone to see how you wrote your program (and gives others a chance to copy or modify your program without your permission), few commercial programs use an interpreter.
Interpreters are often used for Web page programming languages, such as JavaScript. Because people with many different types of computers may visit your Web site, you can’t compile a JavaScript program into machine language for just one type of computer. Instead, each visitor’s browser uses an interpreter to run the JavaScript program.
Technical Stuff In the old days, when computers were slow and lacked sufficient memory and storage space, interpreters were popular because they gave you instant feedback. The moment you typed an instruction into the computer, the interpreter told you whether that instruction would work and even showed you the results. With an interpreter, you could write and test your program at the same time. Now, computers are so fast that most programmers use compilers rather than interpreters, although a handful of languages like LISP still rely on interpreters.
P-code: A combination compiler and interpreter
Getting a program to run on different types of computers is a big pain in the neck. Even though Macintosh and Windows programs both use pull-down menus and dialog boxes, programmers often need to write one set of commands to create Macintosh pull-down menus and a second set of commands to create the identical menus in Windows.
Because one program almost never runs on multiple computers without extensive modification, programmers combined the features of a compiler with an interpreter to create something called p-code.
Instead of compiling source code directly into machine language, you compile the source code into a special intermediate file format (called p-code or byte code).
To run a program compiled into p-code, you need a special p-code or byte code interpreter, often called a run-time file. To run a p-code file on a Macintosh, your computer needs a Macintosh p-code run-time file. To run that same p-code file on Windows, your computer needs a Windows p-code run-time file. Instead of creating compilers for multiple computers, it’s often easier just to create different run-time files for multiple computers.
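Python happens to work this way too, so it offers a quick way to watch the two steps happen. The following is a minimal sketch that uses Python in place of a Java toolchain. The one-line program and the variable names are made up for illustration, and marshal is used here only to show that byte code can be turned into storable bytes, not as a recommended file format.

```python
import marshal

# Step 1: compile source code into byte code (Python's form of p-code).
source = "result = 6 * 7"
byte_code = compile(source, "<example>", "exec")

# The byte code can be turned into plain bytes, which could be saved to a
# file and handed out instead of the source code.
stored = marshal.dumps(byte_code)

# Step 2: the run-time (here, the Python interpreter itself) loads the
# byte code and runs it. The source text is not consulted again.
restored = marshal.loads(stored)
namespace = {}
exec(restored, namespace)
print(namespace["result"])  # 42
```

The stored bytes play the role of the p-code file, and the Python interpreter plays the role of the run-time file. One caveat: Python’s byte code format can change between Python versions, so this sketch assumes the same version does both steps.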
Java is the most popular programming language that uses p-code. After you compile a Java program into p-code, you can copy that p-code to a Macintosh, a Windows computer, or a Linux computer. As long as that computer uses a Java p-code interpreter, you can run the Java program on that computer without modification.
Best of all, programs that you compile into p-code can run without the original source code, which means that you can protect your source code and still give your program away to others.
Remember Naturally, p-code has its own disadvantages. Programs stored as p-code tend to run much slower than programs that are compiled directly into machine language. Although p-code programs can run without a copy of the original source code that you used to create them, you can also decompile p-code programs.
Decompiling a p-code program can reveal something very close to the original source code that the programmer used to create the program. So if you write a program in Java and compile it into p-code, a rival can decompile your p-code program and see a close reconstruction of your Java source code. Your rival then ends up with a nearly identical copy of your source code, which gives that rival the opportunity to steal your program.
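One reason p-code is so easy to decompile is that it usually keeps the names the programmer chose. The following minimal sketch shows this with Python byte code rather than Java p-code. The function total_price and its variables are invented for the example, and the sketch only inspects the names stored in the byte code; it is not a decompiler.

```python
import types

# A made-up function, compiled to byte code without ever being run.
source = (
    "def total_price(quantity, unit_cost):\n"
    "    subtotal = quantity * unit_cost\n"
    "    return subtotal\n"
)
module_code = compile(source, "<example>", "exec")

# The function's own byte code is stored as a constant inside the module's.
function_code = next(
    const for const in module_code.co_consts
    if isinstance(const, types.CodeType)
)

# The programmer's names are still there in the compiled form.
print(function_code.co_name)      # total_price
print(function_code.co_varnames)  # ('quantity', 'unit_cost', 'subtotal')
```

With the function name, the parameter names, and the local variable names all preserved, a decompiler has most of what it needs to rebuild readable source code.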
Technical Stuff You can actually decompile any program, including programs that you compile into machine language. But unlike with decompiling p-code programs, decompiling a machine-language version of a program never gets you the original high-level language source code that the programmer used to write the program. If you compile a program into machine language, the original source code can be written in C++, COBOL, FORTRAN, BASIC, Ada, LISP, Pascal, or any other programming language in the world. Because the decompiler has no idea what language the original source code was written in, it can decompile a machine-language version of a program only into equivalent assembly language. After you decompile a program into assembly-language source code, you can rewrite or modify that source code. Decompiling effectively allows you to steal the ideas of others, but it’s often used to dissect computer viruses and worms to understand how they work (and how to defend against them).
So what do I use?
If you want to write programs to sell, use a compiler, which protects your original source code. If you want to write a program to run on your Web page, you can use either an interpreter or p-code. If you want to write a program that can run on different types of computers, p-code may prove your simplest choice. As a safer but more cumbersome alternative, you can also use multiple compilers and modify your program to run on each different computer.
Remember The language that you choose can determine whether you can use a compiler, an interpreter, or p-code. You often convert Java programs into p-code, for example, although you can also compile them directly into machine language. On the other hand, you usually compile C++ and rarely interpret or convert C++ programs into p-code.