Effectiveness of Different Test Case Prioritization Methods Based on Coverage Criteria
CS 206 Project, Fall 2012
Handed Out: October 30, 2012
Due Date: November 29, 2012
Grade Weight: 15% of total course grade
Note: This project can be completed in groups of two.
Overview
In this project, you will compare the effectiveness of different kinds of test case prioritization
methods in exposing faults via multiple coverage criteria. You are provided with a set of benchmark
programs, a set of test cases for each benchmark, and a set of faults for each benchmark.
- You will collect the coverage information for each test case.
- For each benchmark program, you will use test case
prioritization methods to create test suites according to the different
coverage criteria.
- For each created test suite, you will evaluate its fault-exposing
potential by identifying the set of available faults that are exposed by
the test suite.
- Finally, you will report on your experimental results and observations.
Download Required Files
First, download the required files for this project here: benchmarks.tar.gz (download size: approximately 1.8 MB)
Next, uncompress the downloaded file by executing the command: "tar -xvzf benchmarks.tar.gz".
Benchmark Programs
You will be provided with a set of seven relatively small benchmark programs written in C. Each program is associated with a set of faults and a set of test cases:
Program Name  | # of Available Faults | # of Available Test Cases
tcas          | 41                    | 1590
totinfo       | 23                    | 1026
schedule      | 9                     | 2634
schedule2     | 9                     | 2679
printtokens   | 7                     | 4072
printtokens2  | 9                     | 4057
replace       | 31                    | 5542
General Directory Structure for Benchmarks
All benchmark programs are contained within a folder named "benchmarks". Within this folder, there is a subdirectory for each of the seven benchmark programs.
Within the subdirectory for a benchmark program, you will find the following folders and files:
- The benchmark program
The benchmark program is specified as a .c file, and possibly one or more associated .h files.
Example: Program tcas is located in the file "benchmarks/tcas/tcas.c" (there are no associated .h files for this program).
- The set of faults
The faults are specified as a set of "faulty versions" associated with each benchmark program (one fault per faulty version). Each faulty version is contained within a folder named with the letter "v" followed by the number of the faulty version. The program code for each faulty version is identical to that of the (original) benchmark program, except for a slight modification to the code that represents the fault.
Example: Program tcas is associated with 41 different faulty versions, contained in the folders "benchmarks/tcas/v1" through "benchmarks/tcas/v41".
- The set of test cases
The test cases are specified in the file "universe.txt" (one test case per line).
Example: The test cases for program tcas are contained in the file "benchmarks/tcas/universe.txt".
- Test case input file directories
Any other folders that may be present contain input files that
are used by the associated test cases. You don't need to do anything
with these folders except to make sure they are present in the current
working directory when running the benchmark program.
Compiling and Running Each Benchmark Program
All benchmark programs take some input values (either command-line parameters, the name of an input file, or both) and produce output that is written to the standard output stream (by default, the screen). The following table shows how to compile each program, how to execute the compiled program using the first test case specified in the associated "universe.txt" test case file, and the names of any input file directories required by the associated test cases. Note that the test cases assume that any necessary input file directories are contained in the current working directory when running the benchmark program.
Program Name | How to Compile | Example Command to Execute Program | Input File Directories
tcas | gcc -g -o tcas tcas.c | tcas 1258 1 0 897 174 7253 1 629 500 0 0 1 | (none)
totinfo | gcc -g -o totinfo totinfo.c -lm | totinfo < universe/jkADl.mat | universe
schedule | gcc -g -o schedule schedule.c | schedule 5 1 1 < input/dat027 | input
schedule2 | gcc -g -o schedule2 schedule2.c | schedule2 5 1 1 < input/dat027 | input
printtokens | gcc -g -o printtokens printtokens.c | printtokens < inputs/newtst122.tst | inputs
printtokens2 | gcc -g -o printtokens2 printtokens2.c | printtokens2 < inputs/newtst122.tst | inputs
replace | gcc -g -o replace replace.c -lm | replace '@|' 'E)m' < input/ruin.1373 | input, moni, temp-test
Coverage Criteria
We will consider two different coverage criteria in this project.
- Statement coverage. Statement coverage requires that every statement in the program has been executed at least once.
- Branch coverage. Branch coverage requires that, for each conditional branch in the program (e.g., those arising from if statements and loops), the branch condition has evaluated to true at least once and to false at least once during testing. A small example illustrating the difference between the two criteria appears after this list.
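To see how the two criteria differ, consider this small C function (a hypothetical illustration, not taken from the benchmarks):

    int abs_value(int a) {
        int r = a;
        if (a < 0)      /* conditional branch */
            r = -a;
        return r;
    }

A single test case with a = -5 executes every statement, so it achieves statement coverage. However, it never makes the condition a < 0 false, so branch coverage additionally requires a test case such as a = 3.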
Obtaining Coverage Information
For each test case associated with each benchmark program, you must collect the coverage information for each of the two coverage criteria listed above. The coverage information can be collected using gcov, a coverage tool distributed with GCC; refer to the gcov documentation (for example, the gcov section of the GCC manual or the gcov man page) for usage details.
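As an illustration, a typical gcov workflow for program tcas looks like the following (the instrumentation flags are standard GCC options; adapt the paths as needed):
- Compile with coverage instrumentation: gcc -g -fprofile-arcs -ftest-coverage -o tcas tcas.c
- Run the program on a single test case, e.g.: tcas 1258 1 0 897 174 7253 1 629 500 0 0 1
- Generate the report: "gcov tcas.c" for statement (line) coverage, or "gcov -b tcas.c" to also report branch coverage. The annotated results are written to "tcas.c.gcov".
- Execution counts accumulate in the .gcda data file across runs, so remove it (e.g., "rm tcas.gcda") between test cases to obtain per-test-case coverage.

If it helps as a starting point, here is a minimal C sketch that extracts the executed line numbers from an annotated .gcov file; the file name is a hypothetical example, and the branch lines produced by "gcov -b" would need separate handling:

    /* Print the line numbers executed in one run, as recorded in a
     * .gcov file. Each source line appears as "count: lineno: code",
     * where count is "-" for non-executable lines and "#####" for
     * lines that were never executed. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    int main(int argc, char *argv[]) {
        char line[1024];
        FILE *f;

        if (argc != 2) { fprintf(stderr, "usage: %s file.gcov\n", argv[0]); return 1; }
        f = fopen(argv[1], "r");               /* e.g., tcas.c.gcov */
        if (!f) { perror("fopen"); return 1; }
        while (fgets(line, sizeof line, f)) {
            char *colon = strchr(line, ':');
            if (!colon) continue;              /* skips "branch ..." lines */
            *colon = '\0';
            /* strtol yields 0 for "-" and "#####", i.e., not executed. */
            if (strtol(line, NULL, 10) > 0)
                printf("%ld\n", strtol(colon + 1, NULL, 10));
        }
        fclose(f);
        return 0;
    }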
Test Case Prioritization
Test case prioritization techniques schedule test cases in an execution order according to some criterion. Here are the prioritization methods you will use:
- Random prioritization. Randomly order the test cases in a test suite.
- Total Coverage prioritization. Prioritize test cases according to the total number of statements/branches they cover, by sorting them in descending order of their total statement/branch coverage.
- Additional Coverage prioritization. Iteratively perform the following two steps until all statements/branches are covered by at least one selected test case: (i) select a test case that yields the greatest additional statement/branch coverage; and (ii) adjust the coverage information of the remaining test cases to indicate their coverage of statements/branches not yet covered. (A sketch of both coverage-based methods appears after this list.)
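As a concrete starting point, here is a minimal C sketch of the two coverage-based methods. It assumes the coverage information has already been collected into a hypothetical 0/1 matrix cov[t][e] (test case t covers entity e); the names, array bounds, and the step that builds the matrix from gcov output are all assumptions for illustration:

    /* Sketch only: prioritization over a hypothetical coverage matrix.
     * cov[t][e] == 1 means test case t covers entity e, where an entity
     * is a statement or a branch, depending on the criterion. */
    #include <string.h>

    #define MAX_TESTS 8192      /* >= 5542 test cases (replace) */
    #define MAX_ENTITIES 4096   /* generous bound for these programs */

    static unsigned char cov[MAX_TESTS][MAX_ENTITIES];
    static int num_tests, num_entities;

    /* Total prioritization: sort the test indices in descending order
     * of this value (e.g., with qsort) to obtain the execution order. */
    static int total_covered(int t) {
        int e, n = 0;
        for (e = 0; e < num_entities; e++)
            n += cov[t][e];
        return n;
    }

    /* Additional prioritization: greedily pick the test covering the
     * most not-yet-covered entities until no test adds new coverage.
     * The picked tests form a coverage-adequate suite; returns its size. */
    static int additional_prioritize(int order[]) {
        static unsigned char covered[MAX_ENTITIES], used[MAX_TESTS];
        int picked = 0;

        memset(covered, 0, sizeof covered);
        memset(used, 0, sizeof used);
        for (;;) {
            int best = -1, best_gain = 0, t, e;
            for (t = 0; t < num_tests; t++) {
                int gain = 0;
                if (used[t]) continue;
                for (e = 0; e < num_entities; e++)
                    if (cov[t][e] && !covered[e]) gain++;
                if (gain > best_gain) { best_gain = gain; best = t; }
            }
            if (best < 0) break;   /* every coverable entity is covered */
            used[best] = 1;
            order[picked++] = best;
            for (e = 0; e < num_entities; e++)
                if (cov[best][e]) covered[e] = 1;
        }
        return picked;
    }

    /* For the random and total orders, an adequate suite is the
     * shortest prefix of the order covering everything the full
     * test pool covers. Returns the prefix length. */
    static int adequate_prefix(const int order[]) {
        static unsigned char covered[MAX_ENTITIES];
        int remaining = 0, i, e;

        memset(covered, 0, sizeof covered);
        for (e = 0; e < num_entities; e++)   /* count coverable entities */
            for (i = 0; i < num_tests; i++)
                if (cov[i][e]) { remaining++; break; }
        for (i = 0; i < num_tests && remaining > 0; i++)
            for (e = 0; e < num_entities; e++)
                if (cov[order[i]][e] && !covered[e]) {
                    covered[e] = 1;
                    remaining--;
                }
        return i;
    }

Random prioritization is then just a shuffle of the test indices, truncated with adequate_prefix in the same way as the total order.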
If you want to read more about the above methods, refer to the following papers.
References:
- S. Elbaum, A. G. Malishevsky, and G. Rothermel, Test Case Prioritization: A Family of Empirical Studies. IEEE Transactions on Software Engineering, 2002.
- B. Jiang, Z. Zhang, W. K. Chan, and T. H. Tse, Adaptive Random Test Case Prioritization. IEEE/ACM International Conference on Automated Software Engineering, 2009 (ASE'09).
- G. Rothermel, R. H. Untch, C. Chu, and M. J. Harrold, Test Case Prioritization: An Empirical Study. IEEE International Conference on Software Maintenance, 1999 (ICSM'99).
Detailed Requirements
The following tasks will need to be performed to complete this project.
- Create the following test suites
- For each benchmark program, create six test suites using the combinations
of two coverage criteria (statements and branches) and three prioritization methods
(random, total, and additional).
- Each test suite should be "adequate" with respect to the given coverage
criterion. That is, the test suite should cover all of the entities that are covered
by the entire set of available test cases. The test suite you generate will be
significantly smaller than the set of all available test cases. For example, even
though there may be over 2000 test cases available for a benchmark program, a test
suite may only need 10 or 15 test cases in order to become adequate for some particular
coverage criterion.
- Evaluate the fault-exposing potential of each test suite
- Compile each benchmark program, and compile each of the faulty versions associated with each benchmark program.
- For each created test suite for each benchmark program, identify the set of faults that are exposed by the test suite. A fault is considered "exposed" by a test suite if at least one test case in the suite causes the faulty version to produce output different from that of the (original) benchmark program. For example, even though there may be over 20 total faults available for a benchmark program, it may turn out that a particular test suite exposes only 2 or 3 of them. (A small sketch of this output comparison appears after the list of tasks.)
- Report your experimental results and observations
- Create a data table listing your experimental results. The
table should show, for each created test suite associated with each
benchmark program, the size of the test suite (number of test cases),
and the number of faults exposed by the test suite.
- Write about any interesting observations you can make from
your experimental results. How small are your test suites as compared to
the original number of available test cases? How do the suite sizes
change according to different coverage criteria? How many faults are
exposed by your test suites as compared to the total number of available
faults? Which coverage criteria seem to be the least/most effective at
being able to expose faults? What other conclusions can you draw from
your observations?
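For the fault-exposure step above, the output comparison can be as simple as running the original and the faulty binary on the same test case and comparing their outputs. Below is a minimal C sketch; the binary names, output file names, and argument handling are hypothetical and would need to be adapted to each benchmark and to the format of its universe.txt lines:

    /* Sketch only: does one test case expose the fault in a faulty
     * version? Assumes the original and faulty binaries were built as
     * ./tcas.orig and ./tcas.v1 (hypothetical names), and that
     * test_args is one line taken from universe.txt. */
    #include <stdio.h>
    #include <stdlib.h>

    static int exposes(const char *test_args) {
        char cmd[4096];

        /* Run the original program and capture its output. */
        snprintf(cmd, sizeof cmd, "./tcas.orig %s > out.orig 2>&1", test_args);
        system(cmd);

        /* Run the faulty version on the same test case. */
        snprintf(cmd, sizeof cmd, "./tcas.v1 %s > out.fault 2>&1", test_args);
        system(cmd);

        /* cmp -s exits nonzero exactly when the files differ,
         * i.e., when this test case exposes the fault. */
        return system("cmp -s out.orig out.fault") != 0;
    }

A fault counts as exposed by a suite as soon as exposes() returns nonzero for any test case in the suite; remember to run from a directory that contains the benchmark's input file directories.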
Implementation and Submission
Programming Languages to Use
We would prefer that you use either C/C++ or Java to implement your solution to this project. However, if you choose to use another language (e.g., a scripting language), that is OK too. Whatever language you choose, please also submit, along with your programs, a README file describing how to compile and run them.
Items to Submit
Please submit the following items for this project to ltan003@cs.ucr.edu (put all your files into one directory and compress it into a zip/tgz/tar.gz file; please do not include executables in your submission):
- The report you have prepared regarding your experimental results and observations, according to the above detailed requirements.
- The test suites you have created for each benchmark program.
- The programs you wrote to obtain your experimental results, along with a README file describing how to compile and run each program.