By the end of this lab you should:
Although it would technically work just fine to write all of the code for a program in a single file, there are many reasons not to do so. The two most important reasons to distribute a program among multiple files are code readability and code reuse. Breaking your program into multiple files makes it much easier to find the piece of code that you're interested in. This is the readability issue. Also, if you want to reuse a class from one program in another program, it is much easier to simply copy the files that contain the code for that class into your new project than to search through a huge amount of code, copying and pasting out the pieces that you think are relevant. And even then you can't really be sure that you got everything that was necessary. In this lab we'll learn how to break C++ programs into multiple files and how to compile the programs once we've done so.
The way in which programs are broken into multiple files is the following:
The g++ compiler works in 4 different stages: the preprocessor, the compiler, the assembler, and the linker. The preprocessor gets your code ready to be compiled. This includes pulling in the contents of all of the files that you #include. The compiler translates the code that the preprocessor gives it into assembly code, and the assembler turns that assembly code into "object code", which contains the machine instructions that the CPU inside of your computer understands. The linker takes all of the files of object code that have been generated by the compiler and assembler, and ties them together into one coherent program. When you're using multiple files, there are some special things that you need to do in order to get them to compile together. These are special instructions to the preprocessor and linker stages.
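As a rough illustration of these stages, the sketch below annotates a tiny source file with what each stage does to it. The names SQUARE, area_of_square, and twice_area are made up for this example. The g++ options mentioned in the comments stop the process early: -E after the preprocessor, -S after the compiler, and -c after the assembler.

```cpp
// Preprocessor: lines starting with '#' are handled before compilation.
// Running "g++ -E file.cpp" prints the result of this stage only.
#define SQUARE(x) ((x) * (x))   // hypothetical macro, expanded textually

// A declaration is enough for the compiler to accept calls to this function.
// The definition may live in another file; the linker matches them up later.
int area_of_square(int side);

// Compiler: translates the preprocessed C++ into assembly code ("g++ -S").
// Assembler: turns that assembly into an object file ("g++ -c").
int twice_area(int side)
{
    return 2 * area_of_square(side);
}

// Linker: connects the call above to this definition. In a multi-file
// program this function would typically sit in a different .cpp file.
int area_of_square(int side)
{
    return SQUARE(side);
}
```

A main function that calls twice_area would be needed to turn this into a complete program.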
Since all of the code is no longer in the same file, we need a way of telling the compiler's preprocessor where to find the class declarations. We do this with a "#include" statement at the top of any file that uses a particular class. For example, let's say that we have a class named Employee whose declaration is in the file employee.h and whose implementation is in the file employee.cpp. To use the Employee class in our main function, we need to put the following include statement at the top of the file that contains main:
#include "employee.h"
Notice that for files that we have written, we leave the
.h on the name of the file, and we put quotes around the file
name rather than angle brackets.
We also need to put the same include statement at the top of any other file
which uses the Employee class. This includes the implementation file
employee.cpp.
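A minimal sketch of how the three files might look is given below. The member name_, the function name(), and the helper greeting() are assumptions invented for illustration, since the contents of the Employee class are not listed here. The files are shown as one listing so that it compiles as a single file; the commented-out #include lines mark where each real file would include employee.h.

```cpp
// ---- employee.h : class declaration (hypothetical members) ----
#include <string>

class Employee
{
public:
    explicit Employee(const std::string& name);
    std::string name() const;
private:
    std::string name_;
};

// ---- employee.cpp : class implementation ----
// #include "employee.h"   // needed when this part lives in its own file

Employee::Employee(const std::string& name) : name_(name) {}

std::string Employee::name() const
{
    return name_;
}

// ---- main.cpp : code that uses the class ----
// #include "employee.h"   // needed when this part lives in its own file

std::string greeting(const Employee& e)   // hypothetical helper
{
    return "Hello, " + e.name();
}
```

In a real project main.cpp would also define the main function, which could create an Employee and print the result of greeting.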
One problem that arises when you are using multiple files is that you end up #including the .h file for a certain class in many places. If the same .h file reaches a single .cpp file more than once (for example, directly and again through another .h file), the compiler reports an error because it thinks that you are trying to declare the same class multiple times. To prevent this, we need a way of making sure that the compiler only reads the class declaration once, instead of every time that the .h file for the class is #included.
To do this we use the preprocessor directives #ifndef, #define, and #endif. The preprocessor uses variables (usually called macros), just like the make program (and many other programs) do. Again, the idea is just the same as variables in C++, but the syntax is slightly different. If we want to declare a variable to the preprocessor (let's call the variable __EMPLOYEE_H__) we would write the following in our .h file:
#define __EMPLOYEE_H__
If we do this at the top of our .h file, the first time the file is read by the preprocessor the variable __EMPLOYEE_H__ will be defined. To make sure that the declaration is only read once, we'll use a construct that will seem very much like the "if" statement from C++. What we'll use is the #ifndef ... #endif preprocessor directives. These are read as "if not defined ... end if". We'll wrap everything in our .h file in these directives. Here's an example .h file foo.h:
#ifndef __FOO_H__
#define __FOO_H__

class foo
{
public:
    foo();
private:
    .
    .
    .
};

#endif
The first time the file foo.h is included, the variable __FOO_H__ is not yet defined, so the preprocessor defines it and the class declaration is included. Every other time the file foo.h is included, the variable __FOO_H__ is already defined, so everything between #ifndef __FOO_H__ and #endif is ignored. The variable name __FOO_H__ follows a typical naming convention: the file name in all capital letters, with the period replaced by an underscore and two underscore characters added at the beginning and end. Be aware that the C++ standard reserves names containing two consecutive underscores for the compiler and its libraries, so some projects use a name such as FOO_H instead.
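To see the effect, the sketch below pastes the guarded contents of a header twice into one file, which is roughly what the compiler sees when the same header is reached through two different #include paths. The guard name FOO_H_DEMO and the members value() and value_ are placeholders invented for this example.

```cpp
// First inclusion: FOO_H_DEMO is not yet defined, so the class is declared.
#ifndef FOO_H_DEMO
#define FOO_H_DEMO
class foo
{
public:
    foo() : value_(42) {}
    int value() const { return value_; }   // hypothetical member
private:
    int value_;
};
#endif

// Second inclusion: FOO_H_DEMO is already defined, so the preprocessor
// skips everything up to #endif and the class is not declared again.
#ifndef FOO_H_DEMO
#define FOO_H_DEMO
class foo
{
public:
    foo() : value_(42) {}
    int value() const { return value_; }
private:
    int value_;
};
#endif
```

If the #ifndef, #define, and #endif lines are removed from both copies, the file no longer compiles, because the class foo is then defined twice.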
An object file is a file containing code that has gone through all of the stages of the compiler except for the linker. These files end with a .o extension, and it is customary to create an object file for each class. To create an object file we need to give the compiler the source file that we want turned into an object file, and tell it not to go through the linker stage yet. To do this we specify the option "-c" on the command line to g++. This tells g++ not to try linking the file. Suppose we have a class Foo in the files foo.h and foo.cpp. To compile these files to an object file we would use the command:
g++ -Wall -W -Werror -pedantic -ansi -o foo.o -c foo.cpp
Notice that we don't specify the file foo.h. This is because it is included by the preprocessor directive #include "foo.h" which is at the top of the file foo.cpp. However, when we write the rule in our makefile we'll put foo.h in the dependencies list so that the object file is recompiled if the declaration of the class changes.
Now to compile the entire program we simply add the .o file(s) to the list of files that are needed to build the full program. Assuming we wrote the main function in the file main.cpp and we are using the class Foo, the command to g++ would look like the following:
g++ -Wall -W -Werror -pedantic -ansi -o a.out main.cpp foo.o
Now, to tie everything together, let's put all of this in a makefile:
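# Makefile for multiple files
CXX=g++
CXXFLAGS=-Wall -W -Werror -pedantic -ansi
OBJECTS=foo.o

# Rebuild the program when main.cpp, foo.h, or any object file changes.
# The command line below must begin with a tab character.
all: main.cpp foo.h $(OBJECTS)
	$(CXX) $(CXXFLAGS) -o a.out main.cpp $(OBJECTS)

# Recompile foo.o when its source file or its header changes.
foo.o: foo.cpp foo.h
	$(CXX) $(CXXFLAGS) -o foo.o -c foo.cpp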
As we've discussed before, the make utility is a very complex and powerful program. In this course we will just scratch the surface of what make can do. An interesting thing about make is that in many situations, if you don't specify a command for a rule it will try and figure out what you want to happen for that rule. This is called an "Implicit Rule."
If you specify a target whose name ends in .o but do not specify a command, the make utility will assume that you want to create an object file. If make can find a file whose name matches the name of the target, but has a .cpp extension, it will create a command to g++ that will create an object file using that .cpp file.
You are encouraged to give this a try, but don't use implicit rules in the makefiles that you turn in with your assignments unless you are absolutely sure that the required flags (i.e. -Wall -W -Werror -pedantic -ansi) are being included by the implicit rules.