Lab2: Simple Call Graph Analysis

In this lab, you will write an LLVM analysis to find possible targets of indirect calls. You need a Linux environment for all the labs.

0. Preparation

We are going to use LLVM as the analysis infrastructure. Follow the instructions here to add the LLVM package repository, then install version 10.0.1 or later, for example:

$ sudo apt update
$ sudo apt install clang-11 llvm-11-dev

Next, download the analysis framework and make sure you can build it correctly:

$ tar Jxf callgraph.tar.xz
$ cd callgraph
$ mkdir build
$ cd build
$ cmake ../
$ make

We are going to use the Linux kernel as the analysis target. Download the 5.12 kernel and decompress it, then build it with the default configuration, using the provided wrapper (clang-emit-bc.sh) as the compiler so that LLVM IR bitcode is generated alongside the normal build.

$ wget https://cdn.kernel.org/pub/linux/kernel/v5.x/linux-5.12.tar.xz
$ tar Jxf linux-5.12.tar.xz
$ cd linux-5.12
$ export CLANG=clang-11
$ make ARCH=x86_64 CC=PATH_TO_CALLGRAPH/clang-emit-bc.sh defconfig
$ make ARCH=x86_64 CC=PATH_TO_CALLGRAPH/clang-emit-bc.sh -j$(nproc)

If the build finishes successfully, the LLVM IR bitcode files can be found next to the object files, with the .llbc suffix. Next, we generate a list of these files to use as the analysis targets, and then run the analysis framework on them.

# under linux-5.12
$ find $PWD -name "*.llbc" > PATH_TO_CALLGRAPH/bclist
# switch to the analysis framework
$ cd PATH_TO_CALLGRAPH
$ ./build/KAMain @bclist

You should see the bitcode files being loaded and processed.

1. Objectives

The objective of this lab assignment is to learn how to write a simple call graph analysis based on function type compatibility matching. For each indirect call site, you iterate through all address-taken functions and check whether their types are compatible with the call: the return types are compatible, the number of arguments is the same, and each pair of corresponding arguments is type compatible. If so, you add the function as a potential call target of the indirect call site.
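The matching loop described above can be sketched in plain C++ without any LLVM headers. This is only an illustration under stated assumptions: Signature, Candidate, typesCompatible, and findTargets are hypothetical names, types are spelled as strings, and string equality stands in for the real compatibility check. In the actual analysis the signatures would come from the call instruction and from each candidate function's type.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a function signature (not an LLVM class).
struct Signature {
    std::string ret;                  // return type, spelled as text
    std::vector<std::string> params;  // parameter types, in order
};

// Hypothetical record for one address-taken function.
struct Candidate {
    std::string name;
    Signature sig;
};

// Placeholder compatibility test: plain string equality.
// The real lab replaces this with a structural type comparison.
bool typesCompatible(const std::string &a, const std::string &b) {
    return a == b;
}

// Returns the names of all candidates whose signature matches the call site.
std::vector<std::string> findTargets(const Signature &callSite,
                                     const std::vector<Candidate> &addressTaken) {
    std::vector<std::string> targets;
    for (const Candidate &f : addressTaken) {
        // 1. Return types must be compatible.
        if (!typesCompatible(callSite.ret, f.sig.ret))
            continue;
        // 2. The number of arguments must be the same.
        if (callSite.params.size() != f.sig.params.size())
            continue;
        // 3. Each pair of corresponding arguments must be compatible.
        bool allMatch = true;
        for (std::size_t i = 0; i < callSite.params.size(); ++i) {
            if (!typesCompatible(callSite.params[i], f.sig.params[i])) {
                allMatch = false;
                break;
            }
        }
        if (allMatch)
            targets.push_back(f.name);
    }
    return targets;
}
```

The three checks are ordered from cheapest to most expensive, so most non-matching candidates are rejected before the per-argument loop runs.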

2. Workflow

CallGraph.cc already provides a general skeleton for the analysis. What you need to do is complete the isCompatibleType function. Note that because we load each bitcode file under a different LLVMContext, you CANNOT directly compare type pointers to determine whether two types are the same: the same type loaded from two files is represented by two distinct objects, so the comparison has to be based on the structure of the types.
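To illustrate why pointer comparison fails and what a structural comparison looks like, here is a minimal sketch on a toy type model. It does not use LLVM and is not the expected solution: Kind, ToyType, and structurallyCompatible are hypothetical names, and the choice to compare struct types by name only is an assumption made to keep the recursion from looping on self-referential structs. In the real isCompatibleType, the corresponding information would be obtained from llvm::Type queries (for example the type ID and the integer bit width).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical miniature of an IR type; not an LLVM class.
enum class Kind { Void, Integer, Pointer, Struct, Function };

struct ToyType {
    Kind kind;
    unsigned bits;                       // used by Integer only
    std::string name;                    // used by Struct only
    std::vector<const ToyType *> elems;  // Pointer: pointee; Function: return type, then params

    ToyType(Kind k, unsigned b = 0, std::string n = "",
            std::vector<const ToyType *> e = {})
        : kind(k), bits(b), name(std::move(n)), elems(std::move(e)) {}
};

// Compares two types by structure rather than by address, so that two
// separately created copies of the same type are still considered compatible.
bool structurallyCompatible(const ToyType *a, const ToyType *b) {
    if (a == b)
        return true;  // same object: trivially compatible
    if (a->kind != b->kind)
        return false;
    switch (a->kind) {
    case Kind::Void:
        return true;
    case Kind::Integer:
        return a->bits == b->bits;
    case Kind::Struct:
        // Assumption: named structs are matched by name only, which also
        // avoids infinite recursion on self-referential structs.
        return a->name == b->name;
    case Kind::Pointer:
    case Kind::Function:
        if (a->elems.size() != b->elems.size())
            return false;
        for (std::size_t i = 0; i < a->elems.size(); ++i) {
            if (!structurallyCompatible(a->elems[i], b->elems[i]))
                return false;
        }
        return true;
    }
    return false;
}
```

Building the same function type twice, as would happen when two bitcode files are loaded into separate contexts, yields distinct objects whose addresses differ but which structurallyCompatible still accepts. How strict to be about pointers and structs is a design decision: a looser rule finds more targets but adds false positives to the call graph.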

3. Submission

Please submit your report through iLearn, preferably in PDF format. In the report, please list your complete source code with sufficient explanation and output messages.