CS 201 Project

Handed Out: Apr. 25, 2018
Due: May 25, 2018
Grade Weight: 30% of total course grade
This assignment can be done individually, or in a group of two.


Project Description

The goal of this project is to understand and implement basic program analysis and instrumentation techniques. In this project, you are required to implement the following profiling methods:
  1. Edge Profiling: Edge profiling collects the execution frequency of each control-flow edge executed during a specific program run.
  2. Path Profiling: You are to implement a subset of Ball and Larus's efficient path profiling algorithm. Given the IR for a piece of code, the algorithm instruments the IR so that executing the instrumented code collects the frequencies of the Ball-Larus paths in the original code. Given a function, Ball-Larus paths are the acyclic control flow paths that start at the function ENTRY node or a LOOP ENTRY node and terminate at the function EXIT node or a loop BACKEDGE. For a more precise definition of a Ball-Larus path, and for the instrumentation algorithm, please refer to the Ball-Larus path profiling paper. You need to collect the frequencies of only those Ball-Larus paths that begin at an innermost loop's ENTRY and terminate at that loop's BACKEDGE. You are not required to implement the optimizations described in the paper. For programs with nested loops, first detect all the loops, then identify the innermost loops, and finally instrument the code to collect frequencies only of paths within the innermost loops.
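
To make the path numbering idea concrete, here is a minimal sketch of the basic (unoptimized) Ball-Larus edge value assignment on an acyclic graph, written in plain C++ without LLVM. The graph representation, the type names and the function name assignEdgeValues are assumptions made for this illustration only; they are not part of the project template, and the caller is assumed to supply the blocks in reverse topological order.

```cpp
#include <cassert>
#include <map>
#include <utility>
#include <vector>

// Hypothetical CFG representation (not part of the project template):
// blocks are numbered 0..n-1 and succ[v] lists the successors of block v
// inside one loop body, with the back edge removed so the graph is acyclic.
using Successors = std::vector<std::vector<int>>;

struct EdgeNumbering {
    std::vector<long> numPaths;                 // numPaths[v]: paths from v to the end of the body
    std::map<std::pair<int, int>, long> value;  // value[(v, w)]: increment placed on edge v -> w
};

// Visits blocks in reverse topological order (every successor before its
// predecessors), which the caller is assumed to supply.
EdgeNumbering assignEdgeValues(const Successors& succ,
                               const std::vector<int>& reverseTopoOrder) {
    assert(reverseTopoOrder.size() == succ.size());
    EdgeNumbering result;
    result.numPaths.assign(succ.size(), 0);
    for (int v : reverseTopoOrder) {
        if (succ[v].empty()) {          // last block of the body: exactly one (empty) path
            result.numPaths[v] = 1;
            continue;
        }
        for (int w : succ[v]) {
            // The edge value is the number of paths already claimed by
            // earlier successors, so path sums stay unique and compact.
            result.value[std::make_pair(v, w)] = result.numPaths[v];
            result.numPaths[v] += result.numPaths[w];
        }
    }
    return result;
}
```

Summing the edge values along any path from the first block to the last then yields a distinct number between 0 and numPaths - 1, which is what lets a single counter array index the paths.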

You will carry out this project as follows:
  1. Parse the input file, construct the control flow graph, and identify all the loops. Please implement a loop construction algorithm (e.g., the algorithm in the Control Flow Analysis slides); otherwise your score might be penalized accordingly.
  2. Instrument the code to find the execution frequency of control flow edges, and paths according to the algorithm given in the above paper (only for Ball-Larus paths at innermost loops).
  3. Run the instrumented code and output the profile.
More details about the languages, tools, and output requirements can be found in the sections below.
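
As a rough illustration of the loop identification in step 1, the sketch below finds natural loops from back edges using dominator sets and then keeps only the innermost ones. It is plain C++ without LLVM, and it is one common textbook approach rather than necessarily the algorithm from the course slides. The Graph alias and all function names are assumptions for this example, and every block is assumed to be reachable from the entry block.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <vector>

using Graph = std::vector<std::vector<int>>;  // g[v] = neighbours of block v

// Reverse every edge so that pred[v] lists the predecessors of v.
Graph predecessors(const Graph& succ) {
    Graph pred(succ.size());
    for (int v = 0; v < static_cast<int>(succ.size()); ++v)
        for (int w : succ[v]) pred[w].push_back(v);
    return pred;
}

// Iterative data-flow computation of dominator sets.
// Assumes every block is reachable from `entry`.
std::vector<std::set<int>> dominators(const Graph& succ, int entry) {
    const int n = static_cast<int>(succ.size());
    assert(entry >= 0 && entry < n);
    const Graph pred = predecessors(succ);
    std::set<int> all;
    for (int v = 0; v < n; ++v) all.insert(v);
    std::vector<std::set<int>> dom(n, all);
    dom[entry] = std::set<int>{entry};
    for (bool changed = true; changed;) {
        changed = false;
        for (int v = 0; v < n; ++v) {
            if (v == entry) continue;
            std::set<int> cur = all;
            for (int p : pred[v]) {  // intersect over all predecessors
                std::set<int> kept;
                for (int d : cur)
                    if (dom[p].count(d)) kept.insert(d);
                cur.swap(kept);
            }
            cur.insert(v);
            if (cur != dom[v]) { dom[v] = cur; changed = true; }
        }
    }
    return dom;
}

// Blocks of the natural loop of back edge tail -> head: head plus every
// block that can reach tail without passing through head.
std::set<int> naturalLoop(const Graph& pred, int tail, int head) {
    std::set<int> body;
    body.insert(head);
    std::vector<int> work;
    if (body.insert(tail).second) work.push_back(tail);
    while (!work.empty()) {
        int v = work.back();
        work.pop_back();
        for (int p : pred[v])
            if (body.insert(p).second) work.push_back(p);
    }
    return body;
}

// Returns the bodies of all loops that contain no other loop's header.
std::vector<std::set<int>> innermostLoops(const Graph& succ, int entry) {
    const std::vector<std::set<int>> dom = dominators(succ, entry);
    const Graph pred = predecessors(succ);
    std::map<int, std::set<int>> loops;  // header -> merged loop body
    for (int v = 0; v < static_cast<int>(succ.size()); ++v)
        for (int w : succ[v])
            if (dom[v].count(w)) {  // w dominates v, so v -> w is a back edge
                std::set<int> body = naturalLoop(pred, v, w);
                loops[w].insert(body.begin(), body.end());
            }
    std::vector<std::set<int>> result;
    for (const auto& outer : loops) {
        bool innermost = true;
        for (const auto& other : loops)
            if (other.first != outer.first && outer.second.count(other.first))
                innermost = false;
        if (innermost) result.push_back(outer.second);
    }
    return result;
}
```

Loops that share a header are merged into one body here, which is a simplification chosen for brevity.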

The LLVM Intermediate Representation (IR)

LLVM provides a framework for writing compiler passes that operate upon (e.g., optimize or instrument) code in the LLVM Intermediate Representation (IR). A detailed definition of the LLVM IR can be found in the LLVM Language Reference Manual. Below is an example of a simple C program, and the corresponding LLVM IR.

An unconventional hello world in C

#include <stdio.h>

int main() {
  int x = 5;
  printf("Hello World; x = %d\n", x);

  return 0;
}

The Corresponding LLVM IR

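The LLVM IR for the above source code can be obtained by running the following command (with -S, clang emits textual IR even though the output file is named hello.bc):
$ clang -emit-llvm hello.c -S -o hello.bc
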
; ModuleID = 'support/hello.c'
target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-apple-macosx10.10.0"

@.str = private unnamed_addr constant [21 x i8] c"Hello World; x = %d\0A\00", align 1

; Function Attrs: nounwind ssp uwtable
define i32 @main() #0 {
  %1 = alloca i32, align 4
  %x = alloca i32, align 4
  store i32 0, i32* %1
  store i32 5, i32* %x, align 4
  %2 = load i32* %x, align 4
  %3 = call i32 (i8*, ...)* @printf(i8* getelementptr inbounds ([21 x i8]* @.str, i32 0, i32 0), i32 %2)
  ret i32 0
}

declare i32 @printf(i8*, ...) #1

attributes #0 = { nounwind ssp uwtable "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "stack-protector-buffer-size"="8" "unsafe-fp-math"="false" "use-soft-float"="false" }
attributes #1 = { "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "stack-protector-buffer-size"="8" "unsafe-fp-math"="false" "use-soft-float"="false" }

!llvm.ident = !{!0}

!0 = metadata !{metadata !"Apple LLVM version 6.0 (clang-600.0.57) (based on LLVM 3.5svn)"}

Getting Started with LLVM

The pre-requisites for working with LLVM are as follows:
  1. Familiarity with C++. Additionally, knowledge of certain C++11 features such as auto helps in understanding the examples in the LLVM documentation.
  2. On-demand familiarity with the LLVM Language Reference
  3. Upfront familiarity with the LLVM Programmer's Manual
  4. Cursory understanding of the Visitor Design Pattern
  5. Familiarity with compiler terminology like BasicBlocks etc.

Working Example of Program Instrumentation using LLVM

We now describe the use of LLVM for program inspection and transformation by writing an LLVM pass. The LLVM document Writing an LLVM Pass outlines the basics of writing an LLVM transform; a simplified example is provided here.

Programs in LLVM IR are organized as a four-level hierarchy: Module, Function, BasicBlock and Instruction. That is, each LLVM program is composed of Modules, one for each translation unit. A Module is a collection of Functions, a Function is a collection of BasicBlocks, and a BasicBlock is a sequence of LLVM IR Instructions in SSA form.

This example counts the cumulative number of times BasicBlocks were encountered during program execution in a newly introduced global integer named bbCounter, which is incremented at the start of every BasicBlock.

The high-level algorithm to do this is outlined as:
Create a GlobalVariable bbCounter;
foreach Module M:
  foreach Function F in M:
    foreach BasicBlock B in F:
      Add code to increment bbCounter;

At the end of main(), print bbCounter;
This algorithm also outlines the basic structure of most LLVM transformations.

Code Walkthrough
  1. The GlobalVariable is created in the Module that contains the Functions whose BasicBlocks we are interested in. This is done in the bool doInitialization(Module &M) function, which is invoked once when the Module is first encountered. This is shown in lines 33-45.
  2. Line 36 creates an integer GlobalVariable named bbCounter with InternalLinkage in the Context of the current Module.
  3. To iterate over all functions in the code, the LLVM Transform needs to be a FunctionPass. This is achieved by publicly inheriting from the FunctionPass class as seen on line 24.
  4. LLVM uses the Visitor Design Pattern: the runOnFunction(Function &F) method (lines 54-66) is invoked on each function, in some order. In this example, this function simply invokes runOnBasicBlock(BasicBlock &BB) for each BasicBlock in the Function, as seen on line 62. Lines 59-61 add the final printf that prints bbCounter to the BasicBlock of main() whose terminator is a ReturnInst, which is treated as the 'exit' BasicBlock (this is an approximate hack that works sufficiently for our purpose).
  5. Next, the function runOnBasicBlock(BasicBlock &BB) does the important task of creating and adding the LLVM IR instructions: CreateLoad() loads bbCounter into a temporary variable loadAddr, CreateAdd() adds 1 to loadAddr and yields the temporary addAddr, and CreateStore() stores addAddr back into bbCounter. Note that the temporary variables are used in accordance with SSA form, which requires that each variable be assigned to exactly once. Observe the use of the IRBuilder class. This class provides the APIs to create various types of instructions and add them to specific locations of a BasicBlock, ensuring that the inserted code is still SSA compliant.
  6. Finally, doFinalization(Module &M) can be used to perform any clean up before the visitor pattern finishes visiting the current Module.
  7. Observe that all the functions return boolean; the functions should return true if they modify the Module/Function/BasicBlock/Instruction and false otherwise.
  8. The addFinalPrintf(...) method creates an invocation to printf via the CreateCall2() method.
  9. You can use M.dump() or BB.dump() to debug your instrumentation: it prints the corresponding entity to stderr for you to inspect visually.

  1. #include "llvm/Pass.h"
  2. #include "llvm/IR/Module.h"
  3. #include "llvm/IR/Function.h"
  4. #include "llvm/IR/BasicBlock.h"
  5. #include "llvm/IR/IRBuilder.h"
  6. #include "llvm/Support/raw_ostream.h"
  7. #include "llvm/IR/Type.h"
  8.
  9. using namespace llvm;
  10.
  11. namespace {
  12.   // https://github.com/thomaslee/llvm-demo/blob/master/main.cc
  13.   static Function* printf_prototype(LLVMContext& ctx, Module *mod) {
  14.     std::vector<Type*> printf_arg_types;
  15.     printf_arg_types.push_back(Type::getInt8PtrTy(ctx));
  16.
  17.     FunctionType* printf_type = FunctionType::get(Type::getInt32Ty(ctx), printf_arg_types, true);
  18.     Function *func = mod->getFunction("printf");
  19.     if(!func)
  20.       func = Function::Create(printf_type, Function::ExternalLinkage, Twine("printf"), mod);
  21.     func->setCallingConv(CallingConv::C);
  22.     return func;
  23.   }
  24.   struct BasicBlocksDemo : public FunctionPass {
  25.     static char ID;
  26.     LLVMContext *Context;
  27.     BasicBlocksDemo() : FunctionPass(ID) {}
  28.     GlobalVariable *bbCounter = NULL;
  29.     GlobalVariable *BasicBlockPrintfFormatStr = NULL;
  30.     Function *printf_func = NULL;
  31.
  32.     //----------------------------------
  33.     bool doInitialization(Module &M) {
  34.       errs() << "\n---------Starting BasicBlockDemo---------\n";
  35.       Context = &M.getContext();
  36.       bbCounter = new GlobalVariable(M, Type::getInt32Ty(*Context), false, GlobalValue::InternalLinkage, ConstantInt::get(Type::getInt32Ty(*Context), 0), "bbCounter");
  37.       const char *finalPrintString = "BB Count: %d\n";
  38.       Constant *format_const = ConstantDataArray::getString(*Context, finalPrintString);
  39.       BasicBlockPrintfFormatStr = new GlobalVariable(M, llvm::ArrayType::get(llvm::IntegerType::get(*Context, 8), strlen(finalPrintString)+1), true, llvm::GlobalValue::PrivateLinkage, format_const, "BasicBlockPrintfFormatStr");
  40.       printf_func = printf_prototype(*Context, &M);
  41.
  42.       errs() << "Module: " << M.getName() << "\n";
  43.
  44.       return true;
  45.     }
  46.
  47.     //----------------------------------
  48.     bool doFinalization(Module &M) {
  49.       errs() << "-------Finished BasicBlocksDemo----------\n";
  50.
  51.       return false;
  52.     }
  53.     //----------------------------------
  54.     bool runOnFunction(Function &F) override {
  55.       errs() << "Function: " << F.getName() << '\n';
  56.
  57.       for(auto &BB: F) {
  58.         // Add the footer to Main's BB containing the return 0; statement BEFORE calling runOnBasicBlock
  59.         if(F.getName().equals("main") && isa<ReturnInst>(BB.getTerminator())) { // major hack?
  60.           addFinalPrintf(BB, Context, bbCounter, BasicBlockPrintfFormatStr, printf_func);
  61.         }
  62.         runOnBasicBlock(BB);
  63.       }
  64.
  65.       return true; // since runOnBasicBlock has modified the program
  66.     }
  67.
  68.     //----------------------------------
  69.     bool runOnBasicBlock(BasicBlock &BB) {
  70.       errs() << "BasicBlock: " << BB.getName() << '\n';
  71.       IRBuilder<> IRB(BB.getFirstInsertionPt()); // Will insert the generated instructions BEFORE the first BB instruction
  72.
  73.       Value *loadAddr = IRB.CreateLoad(bbCounter);
  74.       Value *addAddr = IRB.CreateAdd(ConstantInt::get(Type::getInt32Ty(*Context), 1), loadAddr);
  75.       IRB.CreateStore(addAddr, bbCounter);
  76.
  77.       for(auto &I: BB)
  78.         errs() << I << "\n";
  79.
  80.       return true;
  81.     }
  82.
  83.     //----------------------------------
  84.     // Rest of this code is needed to: printf("%d\n", bbCounter); to the end of main, just BEFORE the return statement
  85.     // For this, prepare the SCCGraph, and append to last BB?
  86.     void addFinalPrintf(BasicBlock& BB, LLVMContext *Context, GlobalVariable *bbCounter, GlobalVariable *var, Function *printf_func) {
  87.       IRBuilder<> builder(BB.getTerminator()); // Insert BEFORE the final statement
  88.       std::vector<Constant*> indices;
  89.       Constant *zero = Constant::getNullValue(IntegerType::getInt32Ty(*Context));
  90.       indices.push_back(zero);
  91.       indices.push_back(zero);
  92.       Constant *var_ref = ConstantExpr::getGetElementPtr(var, indices);
  93.
  94.       Value *bbc = builder.CreateLoad(bbCounter);
  95.       CallInst *call = builder.CreateCall2(printf_func, var_ref, bbc);
  96.       call->setTailCall(false);
  97.     }
  98.   };
  99. }
  100.
  101. char BasicBlocksDemo::ID = 0;
  102. static RegisterPass<BasicBlocksDemo> X("bbdemo", "BasicBlocksDemo Pass", false, false);

Path Profiling using LLVM: Setup and Program Template

The primary mechanism for using LLVM for program inspection and transformation is to write an LLVM pass. Optimizations or instrumentations with LLVM require an LLVM Transform in the form of an LLVM Plugin. The following instructions will help you set up your computer for writing your own LLVM plugin. You will complete the project by filling in the details of the appropriate functions in a plugin template as described below.

These instructions have been tested and verified on Debian/Ubuntu Linux and Mac OS X operating systems.
  1. Install clang on your computer. On Ubuntu, sudo apt-get install clang will install the most recent version.
  2. Please try to install Clang version 3.4 as it has been tested and verified.
  3. Create a Directory named Workspace in your home directory: mkdir -p ~/Workspace && cd ~/Workspace
  4. Download the LLVM source code from http://llvm.org/releases/3.6.0/llvm-3.6.0.src.tar.xz to ~/Workspace/ [LLVM is NOT installed on the CS servers].
  5. Extract the sources: cd ~/Workspace && tar xvf llvm-3.6.0.src.tar.xz && mv llvm-3.6.0.src llvm
  6. Build LLVM from source: cd ~/Workspace/llvm && ./configure && make. If your computer has spare CPUs, you can run make -j 4 for faster builds.
  7. Download the skeleton LLVM plugin from here and extract CS201PathProfiling to ~/Workspace/llvm/lib/Transforms/
  8. Fill in the details of the doInitialization(Module &M), runOnFunction(Function &F) and doFinalization(Module &M) methods.
  9. Build the plugin and test it with LLVM's opt tool using the buildAndTest.sh script, here on the sample 'test' input in the support folder: cd ~/Workspace/llvm/lib/Transforms/CS201PathProfiling && ./buildAndTest.sh test

Output Requirements

Instrumented programs should output the number of times each edge and each Ball-Larus path (for innermost loops) was executed during the run of the program. Since there are many reasonable ways this can be done, we are going to leave the specifics up to you. There are, however, some broad requirements.

First, when you report that a path has a specific execution count, it must be clear which path in the CFG you are talking about. This way, we will be able to identify whether or not you have identified the correct paths. Second, the values assigned to each of the edges in the Ball-Larus paths by the algorithm given in the paper must be clear. If these can't be determined from the output of your pass and the instrumented program, your score will be penalized accordingly.
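
One possible way to make a reported path unambiguous is to translate its path number back into the sequence of basic blocks. The following plain C++ sketch does this by repeatedly taking the outgoing edge with the largest value that does not exceed the remaining path number. The function name describePath, the type aliases and the output format are assumptions for this illustration; any format that satisfies the requirements above is acceptable.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical representation: succ[v] lists the successors of block v in the
// acyclic loop body; value holds the Ball-Larus value of each edge (v, w).
using Successors = std::vector<std::vector<int>>;
using EdgeValues = std::map<std::pair<int, int>, long>;

// Rebuilds the block sequence of the path with number pathId starting at
// block `start`, formatted as "b1 -> b3 -> ...".
std::string describePath(const Successors& succ, const EdgeValues& value,
                         const std::vector<std::string>& names,
                         int start, long pathId) {
    assert(pathId >= 0);
    std::string text = names[start];
    int v = start;
    while (!succ[v].empty()) {
        int next = -1;
        long nextValue = -1;
        for (int w : succ[v]) {
            long e = value.at(std::make_pair(v, w));
            // Keep the edge with the largest value not exceeding the remainder.
            if (e <= pathId && e > nextValue) { next = w; nextValue = e; }
        }
        assert(next != -1);  // holds for any number produced by the numbering
        pathId -= nextValue;
        text += " -> " + names[next];
        v = next;
    }
    return text;
}
```

Printing this description next to each path count, together with the edge values themselves, would cover both requirements stated above.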


Here is an example of a C program that we might want to instrument:

Example C Program

void function_1(unsigned x) {
ENTRY:
  if (!(x > 0)) goto EXIT;
  if (x % 4 == 0) {
    --x;
  }
  else {
    --x;
  }
  goto ENTRY;
EXIT:
  return;
}

int main() {
  function_1(100);
  return 0;
}

You will develop an LLVM pass as follows. You first find all the innermost loops, then assign values to the edges therein using the Ball-Larus algorithm, and insert instrumentation that increments a counter whenever an edge is traversed, so as to capture edge frequencies. Below is the output of this pass. It prints out each basic block, assigning it a unique identifier. It also prints out the innermost loops that were discovered, as well as the values assigned to the edges in those innermost loops by the Ball-Larus algorithm. For the sake of readability, the instrumentation to maintain the path counter and print out the path values has been elided.
The output of the profiled program contains the edge and path frequencies.

Example Instrumentation Output

Function: function_1
BasicBlock: b0

  %1 = alloca i32, align 4
  store i32 %x, i32* %1, align 4
  br label %b1

BasicBlock: b1

  %3 = load i32, i32* %1, align 4
  %4 = icmp ugt i32 %3, 0
  br i1 %4, label %b3, label %b2

BasicBlock: b2

  br label %b7

BasicBlock: b3

  %7 = load i32, i32* %1, align 4
  %8 = urem i32 %7, 4
  %9 = icmp eq i32 %8, 0
  br i1 %9, label %b4, label %b5

BasicBlock: b4

  %11 = load i32, i32* %1, align 4
  %12 = add i32 %11, -1
  store i32 %12, i32* %1, align 4
  br label %b6

BasicBlock: b5

  %14 = load i32, i32* %1, align 4
  %15 = add i32 %14, -1
  store i32 %15, i32* %1, align 4
  br label %b6

BasicBlock: b6

  br label %b1

BasicBlock: b7

  ret void

Innermost Loops: {b1,b3,b4,b5,b6}
Edge values: {(b1,b3,0),(b3,b4,0),(b3,b5,1),(b4,b6,0),(b5,b6,0)}

Function: main
BasicBlock: b0

  %1 = alloca i32, align 4
  store i32 0, i32* %1
  call void @function_1(i32 100)
  ret i32 0

Innermost Loops: {} 
Edge values: {}

Finally, here is the output of the instrumented program. Notice that each path count is labeled with the ID number of the basic block at the head of the path, as well as the path number as assigned by the Ball-Larus algorithm.

Example Output of Profiled Program

EDGE PROFILING:
b0 -> b1: 1
b1 -> b2: 1
b1 -> b3: 100
b2 -> b7: 1
b3 -> b4: 25
b3 -> b5: 75
b4 -> b6: 25
b5 -> b6: 75
b6 -> b1: 100
PATH PROFILING:
Path_b1_0: 25
Path_b1_1: 75
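
The counts above can be cross-checked with a small simulation. The sketch below is plain C++, not LLVM instrumentation: it hand-translates the control flow of function_1 into code that bumps one counter per CFG edge and maintains a Ball-Larus path register using the edge values listed earlier. The Profile struct and the profileFunction1 name are assumptions for this illustration, and blocks b0 to b7 are represented by the integers 0 to 7.

```cpp
#include <cassert>
#include <map>
#include <utility>

struct Profile {
    std::map<std::pair<int, int>, long> edgeCount;  // (from, to) -> times traversed
    std::map<long, long> pathCount;                 // path number at b1 -> times executed
};

// Simulates what the instrumented function_1 would record for input x.
Profile profileFunction1(unsigned x) {
    Profile p;
    int cur = 0;  // current block, starting at b0
    long r = 0;   // Ball-Larus path register
    auto take = [&](int to, long edgeValue) {
        ++p.edgeCount[std::make_pair(cur, to)];  // edge profiling counter
        r += edgeValue;                          // path profiling increment
        cur = to;
    };
    take(1, 0);  // b0 -> b1
    for (;;) {
        r = 0;                 // reset the register at the loop entry b1
        if (!(x > 0)) break;   // leave the loop through b2
        take(3, 0);            // b1 -> b3
        if (x % 4 == 0) take(4, 0); else take(5, 1);  // b3 -> b4 or b3 -> b5
        --x;
        take(6, 0);            // b4 -> b6 or b5 -> b6
        ++p.pathCount[r];      // back edge reached: record the finished path
        take(1, 0);            // b6 -> b1
    }
    take(2, 0);  // b1 -> b2
    take(7, 0);  // b2 -> b7
    assert(cur == 7);
    return p;
}
```

Calling profileFunction1(100) records 25 executions of path 0 and 75 of path 1, since x is a multiple of 4 in 25 of the 100 iterations.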

Example C Programs for Testing

Additional C programs that you can use to test your code can be found in the support directory of the project folder.

Submission

The submission should be a tar file named 'username1_username2.tar' (where 'username' is your CSE account username) emailed to Arash Alavi aalav003@ucr.edu. Also include the Name and Student-ID of each member of your submission group in your submission email. The tar file must contain:
  1. The finished CS201PathProfiling/, similar to the one that you downloaded and used to complete this assignment.
  2. The folder extracted from your submission should work as described above in Path Profiling using LLVM: Setup and Program Template, using the buildAndTest.sh script provided as part of the template.
  3. A README listing the Name, Email and Student ID of each member of the group. Your README should also contain instructions for compiling your pass and running it on an arbitrary C program.

Reporting problems

Please read this document in its entirety before emailing questions or problems to aalav003@ucr.edu.