CS 201 Project
Handed Out: Apr. 25, 2018
Due: May 25, 2018
Grade Weight: 30% of total course grade
This assignment can be done individually, or in a group of two.
Project Description
The goal of this project is to understand and implement basic program analysis and instrumentation techniques. In this project, you are required to implement the following profiling methods:
- Edge Profiling: Edge profiling collects the execution frequency of each control-flow edge executed during a specific program run.
- Path Profiling: You are to implement a subset of Ball-Larus's efficient path profiling algorithm.
Given the IR for a piece of code, Ball-Larus developed an algorithm for instrumenting the IR in such a way that executing the instrumented IR code collects the frequencies of Ball-Larus paths in the original IR code.
Given a function, Ball-Larus paths are a set of acyclic control flow paths that start at the function ENTRY node or a LOOP ENTRY node and terminate at the function EXIT node or a loop BACKEDGE.
For a more precise definition of a Ball-Larus path, and the instrumentation algorithm, please refer to the following paper:
You need to collect the frequencies of only those Ball-Larus paths
that begin at an innermost loop's ENTRY and terminate at the loop's
BACKEDGE. You are not required to implement the optimizations
described in the paper. For programs with nested loops, you need to
first detect all the loops, next identify the innermost loops, and then
instrument to collect frequencies only of paths within the innermost loops.
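To make the edge-value idea concrete, here is a minimal sketch of the path-numbering step, written in plain C++ without LLVM. It assumes the body of an innermost loop has already been reduced to an acyclic graph by removing its back edge(s). The names Dag, EdgeValues, Numbering, numberPaths and assignEdgeValues are hypothetical and are not part of the project template. Each edge receives a value such that the sum of the values along any path from the loop entry is a unique number in the range 0 to NumPaths-1.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Hypothetical representation: nodes are 0..n-1 and succ[v] lists the
// successors of v in the acyclic loop body (back edges already removed).
using Dag = std::vector<std::vector<int>>;
using EdgeValues = std::map<std::pair<int, int>, int>;

struct Numbering {
    EdgeValues val;  // (src, dst) -> value added to the path register
    int pathCount;   // number of distinct paths from the entry node
};

// Returns the number of paths from v to a leaf and assigns values to the
// edges reachable from v. Successors are processed before v itself, which
// is the reverse topological order the numbering needs.
// Assumes the graph is acyclic; a cycle would recurse forever.
static int numberPaths(const Dag &succ, int v,
                       std::vector<int> &numPaths, EdgeValues &val) {
    if (numPaths[v] != 0) return numPaths[v];     // already computed
    if (succ[v].empty()) return numPaths[v] = 1;  // leaf: exactly one path
    int total = 0;
    for (int w : succ[v]) {
        int pathsFromW = numberPaths(succ, w, numPaths, val);
        val[{v, w}] = total;  // paths through earlier successors come first
        total += pathsFromW;
    }
    return numPaths[v] = total;
}

Numbering assignEdgeValues(const Dag &succ, int entry) {
    assert(entry >= 0 && static_cast<std::size_t>(entry) < succ.size());
    std::vector<int> numPaths(succ.size(), 0);
    Numbering result;
    result.pathCount = numberPaths(succ, entry, numPaths, result.val);
    return result;
}
```

Calling assignEdgeValues(g, 0) on a graph with edges 0->1, 1->2, 1->3, 2->4 and 3->4 reports two paths and gives every edge the value 0 except 1->3, which gets 1. This sketch covers only the basic numbering; the optimizations in the paper are deliberately left out.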
You will carry out this project as follows:
- Parse the input file, construct the control flow graph, and identify all the loops (please implement a loop construction algorithm, e.g., the algorithm in the Control Flow Analysis slides; otherwise your score may be penalized accordingly).
- Instrument the code to find the execution frequency of control flow edges, and paths according to the algorithm given in the
above paper (only for Ball-Larus paths at innermost loops).
- Run the instrumented code and output the profile.
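The loop-identification step can be sketched in plain C++ as follows. This is one common formulation (dominator sets, back edges, natural loops) and may differ in detail from the algorithm in the course slides. It assumes block 0 is the function entry and that every block is reachable from it. The names Cfg, predecessors, computeDominators, findNaturalLoops and innermostHeaders are hypothetical.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <vector>

// succ[v] = successors of block v; block 0 is assumed to be the entry.
using Cfg = std::vector<std::vector<int>>;

static std::vector<std::vector<int>> predecessors(const Cfg &succ) {
    std::vector<std::vector<int>> pred(succ.size());
    for (int v = 0; v < static_cast<int>(succ.size()); ++v)
        for (int w : succ[v]) pred[w].push_back(v);
    return pred;
}

// dom[v] = set of blocks that dominate v, computed by iterating
// dom(v) = {v} + intersection of dom(p) over all predecessors p.
std::vector<std::set<int>> computeDominators(const Cfg &succ) {
    const int n = static_cast<int>(succ.size());
    const auto pred = predecessors(succ);
    std::set<int> all;
    for (int v = 0; v < n; ++v) all.insert(v);
    std::vector<std::set<int>> dom(n, all);
    dom[0] = {0};
    for (bool changed = true; changed;) {
        changed = false;
        for (int v = 1; v < n; ++v) {
            std::set<int> meet = all;
            for (int p : pred[v]) {
                std::set<int> tmp;
                for (int d : meet)
                    if (dom[p].count(d)) tmp.insert(d);
                meet.swap(tmp);
            }
            meet.insert(v);
            if (meet != dom[v]) { dom[v] = meet; changed = true; }
        }
    }
    return dom;
}

// Maps each loop header to the blocks of its natural loop (header included).
std::map<int, std::set<int>> findNaturalLoops(const Cfg &succ) {
    assert(!succ.empty());
    const auto pred = predecessors(succ);
    const auto dom = computeDominators(succ);
    std::map<int, std::set<int>> loops;
    for (int t = 0; t < static_cast<int>(succ.size()); ++t) {
        for (int h : succ[t]) {
            if (!dom[t].count(h)) continue;  // t -> h is a back edge only if h dominates t
            std::set<int> &body = loops[h];
            body.insert(h);
            std::vector<int> work{t};        // walk backwards from the tail until the header
            while (!work.empty()) {
                int v = work.back();
                work.pop_back();
                if (!body.insert(v).second) continue;  // already in the loop
                for (int p : pred[v]) work.push_back(p);
            }
        }
    }
    return loops;
}

// A loop is treated as innermost if its body contains no other loop's header.
std::vector<int> innermostHeaders(const std::map<int, std::set<int>> &loops) {
    std::vector<int> result;
    for (const auto &candidate : loops) {
        bool hasInner = false;
        for (const auto &other : loops)
            if (other.first != candidate.first && candidate.second.count(other.first))
                hasInner = true;
        if (!hasInner) result.push_back(candidate.first);
    }
    return result;
}
```

The set-based dominator computation is chosen for readability rather than speed, which is adequate for the small functions used in testing. In an actual pass, the same logic would run over the BasicBlocks of a Function instead of integer IDs.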
More details about the languages, tools, and output requirements can be found in the following:
The LLVM Intermediate Representation (IR)
LLVM provides a framework for writing
compiler passes that operate (e.g., to optimize or instrument) upon LLVM
Intermediate Representation language. A detailed definition of the LLVM
IR can be found in the LLVM Language Reference Manual. Below is an example of a simple C program, and the corresponding LLVM IR.
An unconventional hello world in C
#include <stdio.h>
int main() {
    int x = 5;
    printf("Hello World; x = %d\n", x);
    return 0;
}
The Corresponding LLVM IR
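The LLVM IR for the above source code can be obtained by running: $ clang -emit-llvm hello.c -S -o hello.bc (the -S flag makes clang emit human-readable IR text rather than binary bitcode, whatever the extension of the output file)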
; ModuleID = 'support/hello.c'
target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-apple-macosx10.10.0"
@.str = private unnamed_addr constant [21 x i8] c"Hello World; x = %d\0A\00", align 1
; Function Attrs: nounwind ssp uwtable
define i32 @main() #0 {
%1 = alloca i32, align 4
%x = alloca i32, align 4
store i32 0, i32* %1
store i32 5, i32* %x, align 4
%2 = load i32* %x, align 4
%3 = call i32 (i8*, ...)* @printf(i8* getelementptr inbounds ([21 x i8]* @.str, i32 0, i32 0), i32 %2)
ret i32 0
}
declare i32 @printf(i8*, ...) #1
attributes #0 = { nounwind ssp uwtable "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "stack-protector-buffer-size"="8" "unsafe-fp-math"="false" "use-soft-float"="false" }
attributes #1 = { "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "stack-protector-buffer-size"="8" "unsafe-fp-math"="false" "use-soft-float"="false" }
!llvm.ident = !{!0}
!0 = metadata !{metadata !"Apple LLVM version 6.0 (clang-600.0.57) (based on LLVM 3.5svn)"}
Getting Started with LLVM
The pre-requisites for working with LLVM are as follows:
- Familiarity with C++. Additionally, knowledge of certain C++11 features like auto helps in understanding the examples in the LLVM documentation.
- Template programming with C++
You should be comfortable reading programs that use templates. You
should also be comfortable using templates. It is not necessary for you
to know how to write template functions yourself.
- Object Oriented Programming in C++
Again, you should be comfortable reading and using object oriented
code, but you do not necessarily need to know how to write object
oriented code yourself.
- Inheritance in C++
The LLVM libraries are designed with an object oriented philosophy.
They make extensive use of inheritance and polymorphism. You should be
comfortable reading and using code that employs these techniques.
- C++ Standard Template Library
LLVM makes extensive use of the C++ Standard Template Library (STL)
as well. In addition, the design philosophy of LLVM closely mirrors that
of the STL in many respects. You should be comfortable with the design
paradigms of the STL, such as iterators, streams, etc.
- On-demand familiarity with the LLVM Language Reference
- Upfront familiarity with the LLVM Programmer's Manual
- Cursory understanding of the Visitor Design Pattern
- Familiarity with compiler terminology like BasicBlocks etc.
Working Example of Program Instrumentation using LLVM
We now describe the use of LLVM for program inspection and transformation by writing an LLVM pass. The LLVM document "Writing an LLVM Pass" outlines the basics of writing an LLVM transform; a simplified example is provided here.
Programs in LLVM IR code are organized as a four-level hierarchy: Module, Function, BasicBlock and Instruction. That is, each LLVM program is composed of Modules, one for each translation unit. A Module is a collection of Functions; a Function is a collection of BasicBlocks; and a BasicBlock is a sequence of LLVM IR Instructions in SSA form.
This example counts the cumulative number of times BasicBlocks were encountered during program execution in a newly introduced global integer named bbCounter, which is incremented at the start of every BasicBlock.
The high-level algorithm to do this is outlined as:
Create a GlobalVariable bbCounter;
foreach Module M:
    foreach Function F in M:
        foreach BasicBlock B in F:
            Add code to increment bbCounter;
At the end of main(), print bbCounter;
This algorithm also outlines the basic structure of most LLVM transformations.
Code Walkthrough
- To create a GlobalVariable, we can do so in the Module that contains the Functions whose BasicBlocks we are interested in. This can be done in the bool doInitialization(Module &M) function, which is invoked once when the Module is first encountered.
- The statement in doInitialization that assigns bbCounter creates an integer GlobalVariable named bbCounter with InternalLinkage in the Context of the current Module.
- To iterate over all functions in the code, the LLVM Transform needs to be a FunctionPass. This is achieved by publicly inheriting from the FunctionPass class, as seen in the declaration of struct BasicBlocksDemo.
- LLVM uses the Visitor Design Pattern: the runOnFunction(Function &F) method is invoked on each function, in some order. In this example, this function simply invokes runOnBasicBlock(BasicBlock &BB) for each BasicBlock in the Function. Before doing so, it calls addFinalPrintf(...) to add the final printf, which prints bbCounter, to the end of the 'exit' BasicBlock of the main() function. The exit block is identified by checking that its last instruction is a ReturnInst (this is an approximate hack that works sufficiently for our purpose).
- Next, the function runOnBasicBlock(BasicBlock &BB) does the important task of creating and adding the LLVM IR instructions that increment the counter: CreateLoad() loads bbCounter into a temporary variable loadAddr, CreateAdd() adds 1 to loadAddr and produces a new temporary addAddr, and CreateStore() stores addAddr back into bbCounter. Note that the temporary variables are used in accordance with SSA form, which requires that each variable be assigned to exactly once. Observe the use of the IRBuilder class. This class provides the APIs to create various types of instructions and add them to specific locations of a BasicBlock, ensuring that the inserted code is still SSA compliant.
- Finally, doFinalization(Module &M) can be used to perform any clean up before the visitor pattern finishes visiting the current Module.
- Observe that the pass methods (doInitialization, runOnFunction, runOnBasicBlock and doFinalization) return a boolean; they should return true if they modify the Module/Function/BasicBlock/Instruction and false otherwise.
- The addFinalPrintf(...) method creates an invocation to printf via the CreateCall2() method.
- You can use M.dump() or BB.dump() to debug your instrumentation: it prints the corresponding entity to stderr for you to inspect visually.
#include "llvm/Pass.h"
#include "llvm/IR/Module.h"
#include "llvm/IR/Function.h"
#include "llvm/IR/BasicBlock.h"
#include "llvm/IR/IRBuilder.h"
#include "llvm/Support/raw_ostream.h"
#include "llvm/IR/Type.h"
#include <cstring>
#include <vector>

using namespace llvm;

namespace {
  // https://github.com/thomaslee/llvm-demo/blob/master/main.cc
  // Returns the declaration of printf in the module, creating it if needed.
  static Function* printf_prototype(LLVMContext& ctx, Module *mod) {
    std::vector<Type*> printf_arg_types;
    printf_arg_types.push_back(Type::getInt8PtrTy(ctx));

    FunctionType* printf_type = FunctionType::get(Type::getInt32Ty(ctx), printf_arg_types, true);
    Function *func = mod->getFunction("printf");
    if(!func)
      func = Function::Create(printf_type, Function::ExternalLinkage, Twine("printf"), mod);
    func->setCallingConv(CallingConv::C);
    return func;
  }

  struct BasicBlocksDemo : public FunctionPass {
    static char ID;
    LLVMContext *Context;
    BasicBlocksDemo() : FunctionPass(ID) {}
    GlobalVariable *bbCounter = NULL;
    GlobalVariable *BasicBlockPrintfFormatStr = NULL;
    Function *printf_func = NULL;

    //----------------------------------
    bool doInitialization(Module &M) {
      errs() << "\n---------Starting BasicBlockDemo---------\n";
      Context = &M.getContext();
      // Global 32-bit counter, initialized to 0
      bbCounter = new GlobalVariable(M, Type::getInt32Ty(*Context), false, GlobalValue::InternalLinkage, ConstantInt::get(Type::getInt32Ty(*Context), 0), "bbCounter");
      // Global constant holding the printf format string
      const char *finalPrintString = "BB Count: %d\n";
      Constant *format_const = ConstantDataArray::getString(*Context, finalPrintString);
      BasicBlockPrintfFormatStr = new GlobalVariable(M, llvm::ArrayType::get(llvm::IntegerType::get(*Context, 8), strlen(finalPrintString)+1), true, llvm::GlobalValue::PrivateLinkage, format_const, "BasicBlockPrintfFormatStr");
      printf_func = printf_prototype(*Context, &M);

      errs() << "Module: " << M.getName() << "\n";

      return true;
    }

    //----------------------------------
    bool doFinalization(Module &M) {
      errs() << "-------Finished BasicBlocksDemo----------\n";

      return false;
    }

    //----------------------------------
    bool runOnFunction(Function &F) override {
      errs() << "Function: " << F.getName() << '\n';

      for(auto &BB: F) {
        // Add the footer to Main's BB containing the return 0; statement BEFORE calling runOnBasicBlock
        if(F.getName().equals("main") && isa<ReturnInst>(BB.getTerminator())) { // major hack?
          addFinalPrintf(BB, Context, bbCounter, BasicBlockPrintfFormatStr, printf_func);
        }
        runOnBasicBlock(BB);
      }

      return true; // since runOnBasicBlock has modified the program
    }

    //----------------------------------
    bool runOnBasicBlock(BasicBlock &BB) {
      errs() << "BasicBlock: " << BB.getName() << '\n';
      IRBuilder<> IRB(BB.getFirstInsertionPt()); // Will insert the generated instructions BEFORE the first BB instruction

      // bbCounter = bbCounter + 1, expressed as load / add / store
      Value *loadAddr = IRB.CreateLoad(bbCounter);
      Value *addAddr = IRB.CreateAdd(ConstantInt::get(Type::getInt32Ty(*Context), 1), loadAddr);
      IRB.CreateStore(addAddr, bbCounter);

      for(auto &I: BB)
        errs() << I << "\n";

      return true;
    }

    //----------------------------------
    // Rest of this code is needed to add printf("BB Count: %d\n", bbCounter); to the end of main, just BEFORE the return statement
    // For this, prepare the SCCGraph, and append to last BB?
    void addFinalPrintf(BasicBlock& BB, LLVMContext *Context, GlobalVariable *bbCounter, GlobalVariable *var, Function *printf_func) {
      IRBuilder<> builder(BB.getTerminator()); // Insert BEFORE the final statement
      // Two zero indices turn the [N x i8]* global into an i8* to its first character
      std::vector<Constant*> indices;
      Constant *zero = Constant::getNullValue(IntegerType::getInt32Ty(*Context));
      indices.push_back(zero);
      indices.push_back(zero);
      Constant *var_ref = ConstantExpr::getGetElementPtr(var, indices);

      Value *bbc = builder.CreateLoad(bbCounter);
      CallInst *call = builder.CreateCall2(printf_func, var_ref, bbc);
      call->setTailCall(false);
    }
  };
}

char BasicBlocksDemo::ID = 0;
static RegisterPass<BasicBlocksDemo> X("bbdemo", "BasicBlocksDemo Pass", false, false);
Path Profiling using LLVM: Setup and Program Template
The primary mechanism to use LLVM for program inspection and
transformation is to write an LLVM pass. Optimizations or
instrumentations with LLVM require an LLVM Transform
in the form of an LLVM Plugin. The following instructions will help you
set up your computer for writing your own LLVM plugin. You will complete
the project by filling in the details of the appropriate functions in a
plugin template as described below.
These instructions have been tested and verified on Debian/Ubuntu Linux and Mac OS X operating systems.
- Install clang on your computer. On Ubuntu, sudo apt-get install clang will install the most recent version.
Please try to install Clang version 3.4 as it has been tested and verified.
- Create a Directory named Workspace in your home directory: mkdir -p ~/Workspace && cd ~/Workspace
- Download the LLVM source code from http://llvm.org/releases/3.6.0/llvm-3.6.0.src.tar.xz to ~/Workspace/ [LLVM is NOT installed on the CS servers].
- Extract the sources:
cd ~/Workspace && tar xvf llvm-3.6.0.src.tar.xz && mv llvm-3.6.0.src llvm
- Build LLVM from source:
cd ~/Workspace/llvm && ./configure && make
If your computer has spare CPUs, you can run make -j 4 for faster builds.
- Download the skeleton LLVM plugin from here and extract CS201PathProfiling to
~/Workspace/llvm/lib/Transforms/
- Fill in the details for the
doInitialization(Module &M), runOnFunction(Function &F) and doFinalization(Module &M)
methods
- Build the plugin and test it with LLVM's opt tool by running the buildAndTest.sh script on the sample 'test' input in the support folder: cd ~/Workspace/llvm/lib/Transforms/CS201PathProfiling && ./buildAndTest.sh test
Output Requirements
Instrumented programs should output the number of times each edge and each Ball-Larus path (for innermost loops) was executed during the run of the program. Since there are many
reasonable ways this can be done, we are going to leave the specifics up
to you. There are, however, some broad requirements.
First, when you report that a path has a specific execution count, it
must be clear which path in the CFG you are talking about. This way, we
will be able to identify whether or not you have identified the correct
paths. Second, the values assigned to each of the edges in the
Ball-Larus paths by the algorithm given in the paper must be clear. If
these can't be determined from the output of your pass and the
instrumented program, your score will be penalized accordingly.
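One way to make it clear which CFG path a path number refers to is to decode the number back into a block sequence using the edge values. The following is a minimal sketch in plain C++, not a required part of the submission. It assumes the edge values were assigned by the basic Ball-Larus numbering and that block IDs are non-negative integers. The names EdgeValueMap and decodePath are hypothetical.

```cpp
#include <cassert>
#include <map>
#include <utility>
#include <vector>

// (src, dst) -> Ball-Larus value of that edge inside one innermost loop,
// with the back edge left out. Block IDs are assumed to be non-negative.
using EdgeValueMap = std::map<std::pair<int, int>, int>;

// Reconstructs the sequence of blocks for path number pathId, starting at
// the loop header. At each block, the edge taken is the outgoing edge with
// the largest value that does not exceed the remaining path number.
std::vector<int> decodePath(const EdgeValueMap &val, int header, int pathId) {
    assert(pathId >= 0);
    std::vector<int> blocks{header};
    int v = header;
    int remaining = pathId;
    while (true) {
        int next = -1;
        int best = -1;
        // Keys are ordered by (src, dst), so v's outgoing edges are contiguous.
        for (auto it = val.lower_bound(std::make_pair(v, 0));
             it != val.end() && it->first.first == v; ++it) {
            if (it->second <= remaining && it->second > best) {
                best = it->second;
                next = it->first.second;
            }
        }
        if (next == -1) break;  // no outgoing edge: the path ends at the back edge
        remaining -= best;
        v = next;
        blocks.push_back(v);
    }
    assert(remaining == 0 && "pathId is not a valid path number for this loop");
    return blocks;
}
```

With the edge values (b1,b3,0), (b3,b4,0), (b3,b5,1), (b4,b6,0), (b5,b6,0), path 0 decodes to b1, b3, b4, b6 and path 1 decodes to b1, b3, b5, b6. Printing such a sequence next to each path count is one possible way to satisfy the first requirement.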
Here is an example of a C program that we might want to instrument:
Example C Program
void function_1(unsigned x) {
ENTRY:
    if (!(x > 0)) goto EXIT;
    if (x % 4 == 0) {
        --x;
    }
    else {
        --x;
    }
    goto ENTRY;
EXIT:
    return;
}

int main() {
    function_1(100);
    return 0;
}
You will develop an optimization pass as follows. You first find all the innermost loops, then assign values to the edges therein using the Ball-Larus algorithm, and insert instrumentation that increments a counter each time an edge is traversed, capturing edge frequencies. Below is the output of this optimization pass. It prints out each basic block, assigning it a unique identifier.
It also prints out the innermost loops that were discovered, as well as the values assigned to the edges in those innermost loops by the Ball-Larus algorithm. For the sake of readability, the instrumentation to maintain the path counter and print out the path values has been elided.
The output of the profiled program contains edge and path frequencies.
Example Instrumentation Output
Function: function_1
BasicBlock: b0
%1 = alloca i32, align 4
store i32 %x, i32* %1, align 4
br label %b1
BasicBlock: b1
%3 = load i32, i32* %1, align 4
%4 = icmp ugt i32 %3, 0
br i1 %4, label %b3, label %b2
BasicBlock: b2
br label %b7
BasicBlock: b3
%7 = load i32, i32* %1, align 4
%8 = urem i32 %7, 4
%9 = icmp eq i32 %8, 0
br i1 %9, label %b4, label %b5
BasicBlock: b4
%11 = load i32, i32* %1, align 4
%12 = add i32 %11, -1
store i32 %12, i32* %1, align 4
br label %b6
BasicBlock: b5
%14 = load i32, i32* %1, align 4
%15 = add i32 %14, -1
store i32 %15, i32* %1, align 4
br label %b6
BasicBlock: b6
br label %b1
BasicBlock: b7
ret void
Innermost Loops: {b1,b3,b4,b5,b6}
Edge values: {(b1,b3,0),(b3,b4,0),(b3,b5,1),(b4,b6,0),(b5,b6,0)}
Function: main
BasicBlock: b0
%1 = alloca i32, align 4
store i32 0, i32* %1
call void @function_1(i32 100)
ret i32 0
Innermost Loops: {}
Edge values: {}
Finally, here is the output of the instrumented program. Notice that
each path count is labeled with the ID number of the basic block at the
head of the path, as well as the path number as assigned by the
Ball-Larus algorithm.
Example Output of Profiled Program
EDGE PROFILING:
b0 -> b1: 1
b1 -> b2: 1
b1 -> b3: 100
b2 -> b7: 1
b3 -> b4: 25
b3 -> b5: 75
b4 -> b6: 25
b5 -> b6: 75
b6 -> b1: 100
PATH PROFILING:
Path_b1_0: 25
Path_b1_1: 75
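The counts above can be checked without LLVM by writing out by hand what the instrumented function_1 does at run time. The sketch below is a plain C++ stand-in, not the output of an actual pass. It assumes the block IDs b0 to b7 and the edge values from the example instrumentation output. The names Profile, runFunction1 and take are hypothetical. Every edge bumps its own counter and adds its Ball-Larus value to a path register r; at the back edge b6 -> b1 the finished path is counted and r is reset.

```cpp
#include <cassert>
#include <map>
#include <utility>

struct Profile {
    std::map<std::pair<int, int>, long> edgeCount;  // (src, dst) -> executions
    std::map<int, long> pathCount;                  // path number at header b1 -> executions
};

// Hand-instrumented version of function_1 from the example C program.
Profile runFunction1(unsigned x) {
    Profile p;
    int r = 0;  // path register, only meaningful inside the loop
    // Counts one traversal of src -> dst and adds the edge value to r.
    auto take = [&](int src, int dst, int val) {
        ++p.edgeCount[{src, dst}];
        r += val;
    };

    take(0, 1, 0);                       // b0 -> b1: enter the loop header
    while (true) {
        if (!(x > 0)) {                  // b1: loop test fails
            take(1, 2, 0);               // b1 -> b2: leave the loop
            break;
        }
        take(1, 3, 0);                   // b1 -> b3
        if (x % 4 == 0) {
            take(3, 4, 0);               // b3 -> b4
            --x;
            take(4, 6, 0);               // b4 -> b6
        } else {
            take(3, 5, 1);               // b3 -> b5, the only edge with value 1
            --x;
            take(5, 6, 0);               // b5 -> b6
        }
        take(6, 1, 0);                   // b6 -> b1: back edge, path is complete
        ++p.pathCount[r];
        r = 0;                           // start numbering the next iteration's path
    }
    take(2, 7, 0);                       // b2 -> b7
    assert(x == 0);                      // the loop only exits once x reaches 0
    return p;
}
```

Calling runFunction1(100) yields the same edge and path counts as the example output. In the real pass these counter updates are emitted as LLVM IR instructions on the corresponding edges rather than written in C++.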
Example C Programs for Testing
Additional C programs that you can use to test your code can be found in the support directory of the project folder.
Submission
The submission should be a tar file named 'username1_username2.tar'
(where 'username' is your CSE account username) emailed to
Arash Alavi aalav003@ucr.edu. Also include the Name and Student-ID of each member of your submission group in
your submission email. The tar file must contain:
- The finished CS201PathProfiling/, similar to the one that you downloaded and used to complete this assignment.
- The folder extracted from your submission should work as described above in Path Profiling using LLVM: Setup and Program Template, using the buildAndTest.sh script provided as part of the template.
- A README listing the Name, Email and Student ID of each member of
the group. Your README should also contain instructions for compiling
your pass and running it on an arbitrary C program.
Reporting problems
Please read this document in its entirety before emailing questions to:
aalav003@ucr.edu.