Dynamic Program Analysis
for Secure and Reliable Computing
Rajiv Gupta & Neelam Gupta
Funded By
-
National Science Foundation, CNS, Computer Systems Research Program,
CNS-0751961/0719791, 9/2007-8/2009.
-
National Science Foundation, CNS, Computer Research Infrastructure Program,
CNS-0751949/0708199, 9/2007-8/2008.
-
National Science Foundation, CNS, Computer Systems Research Program,
CCF-0810906/0614707, 9/2006-8/2009.
-
National Science Foundation, CCF, Computer Processes and Artifacts Cluster,
CCF-0753470/0541382, 1/2006-1/2009.
-
Microsoft Research, Phoenix/SSCLI program, Redmond, Washington
4/2006-3/2007.
-
IBM, Eclipse Innovation Award,
1/2005-12/2005.
-
Microsoft Research, Redmond, Washington
9/2003-8/2006.
-
IBM, Eclipse Innovation Award,
1/2003-12/2003.
Ph.D. Students
-
Dennis Jeffrey
-
Vijayanand Nagarajan
-
Chen Tian
-
Sriraman Tallam -- graduated 2007
-
Xiangyu Zhang -- graduated 2006
-
Youtao Zhang -- graduated 2002
-
Clara Jaramillo -- graduated 2000
-
Ras Bodik -- graduated 1999
-
Evelyn Duesterwald -- graduated 1996
Description
The goal of this project is to develop scalable analysis techniques that make use of dynamic
information from program runs as well as some limited static information. The analysis developed is
being exploited to improve the Security and Reliability of software. We have developed
techniques for collecting and storing large amounts of profile data and mining
this data for information useful in automated fault location and automated testing
of software. More specifically our contributions include the following:
-
Scalable Collection of Dynamic Information is being achieved through integrated use of
checkpointing and logging as well as fine-grained tracing of software. We have developed a two
step approach in which during initial program execution checkpointing and logging is used.
If the program exhibits incorrect behavior, the log is analyzed to identify portion of
execution that needs to traced using the replay phase to collect information useful in
debugging. We have developed the Whole Execution Trace representation that integrates
many different kinds of profiles that have been used for characterizing program behavior into
one single representation. This includes address, value, control flow, as well as data and control
dependence profiles. Collection, storage, and analysis of complete profiles is a challenging task.
We has developed a compact and rapidly traversible representation of whole execution profiles.
Whole execution traces produced by execution of a couple of billion instructions require a gigabyte
of storage. We have further scaled this approach so it can be applied to long running programs
by proposing Execution Reduction, a technique that combines checkpointing/logging with
fine-grained tracing.
-
Dynamic Slicing & Matching are powerful analysis that are useful for many applications.
The large amounts of information collected to enable computation of dynamic program slices can
easily cause the machine to run out of memory. In our recent work we have experimentally evaluated
various precise and imprecise dynamic slicing algorithms to show the drawbacks of imprecise slicing.
We have shown that whole execution trace representation enables rapid dynamic program slicing
of realistic program runs without running out of memory. Dynamic matching of whole execution
traces of two program versions is being carried out to enable a wide range of applications including
those of piracy detection, some debugging scenarios, and software maintenance. Unlike some static
approaches for matching program versions, our approach does not require access to source code of the
two program versions. Rich program execution histories enable us to rapidly produce highly accurate
matches. Our work differs from static differencing techniques as it enables matching of code
sequences that dynamically behave the same even though they statically appear to be different.
-
Fault Location & Correction.
After developing practical implementations of dynamic slicing we have employed it for fault
location. Three different types of dynamic slices are being used for fault location: backward
slices of erroneous values; forward slices of minimal fault inducing inputs; and
bidirectional slices of critical predicates. We have also developed techniques for
pruning dynamic slices based upon positive evidence indicating that some of the statements
in the dynamic slices may be expected to be correct because they are responsible for producing
correct results. Location of faults is also important for security because faults can
often be used to launch attacks. We have also developed a new approach to locate and
correct an erroneous statement in a faulty function. Our approach combines ideas from
software testing and weakest preconditions used in correctness proof methods to locate a likely
erroneous statement. Our recent work is aimed at developing techniques for automatically
identifying corrupted program state and determining corrections to the corrupted state.
Our results show that this approach performs fault location with a very high degree of precision.
-
Software Testing is an important activity undertaken to improve the reliability of software.
We have developed a greedy heuristic algorithm that iteratively exploits the implications
among the test cases and implications among the requirements to generate a minimized suite
that covers the same set of requirements as that of the unminimized suite. To address the
problem of fault loss due to removing test cases from test suite, we have also developed a
technique to selectively retain coverage redundant test cases that are likely to expose new faults.
We have developed a new structural coverage criterion called invariant-coverage criterion for
generating inputs to detect likely program invariants. We have developed numerical iterative
techniques for automatically generating test data for a given path in a program.
-
Multithreading and Parallelism pose additional challenges for monitoring program executions.
We have developed scalable dynamic analysis technique for efficiently tracing long-running
multithreaded programs running on a single processor core. Monitoring of parallel programs being
run on multiple processors for applications, such as data race detection and dynamic information
flow tracking, require atomic updates of data being maintained during monitoring. We have developed
techniques for guaranteeing atomicity at a low execution time cost. The availability of additional
cores is also being exploited to reduce the monitoring overhead of parallel applications.
Software
Publications
Multithreading & Parallelism
-
S. Tallam, C. Tian, and R. Gupta
Dynamic Slicing of Multithreaded Programs for Race Detection,
International Conference on Software Maintenance (ICSM),
Beijing, China, September 2008.
-
C. Tian, V. Nagarajan, R. Gupta, and S. Tallam
Dynamic Recognition of Synchronization Operations for Improved Data Race Detection,
International Symposium on Software Testing and Analysis (ISSTA),
Seattle, July 2008.
-
V. Nagarajan and R. Gupta
Support for Symmetric Shadow Memory in Multiprocessors,
Workshop on Parallel and Distributed Systems: Testing, Analysis, and Debugging (PADTAD),
9 pages, Seattle, July 2008.
-
S. Tallam, C. Tian, R. Gupta, and X. Zhang
Avoiding Program Failures Through Safe Execution Perturbations,
IEEE Computer Software and Applications Conference (COMPSAC),
Turku, Finland, August 2008.
-
C. Tian, V. Nagarajan, and R. Gupta
Synchronization Aware Conflict Resolution for Runtime Monitoring Using Transactional Memory,
Workshop on Software Tools for Multicore Systems (STMCS),
colocated with CGO, pages 1-6, Boston, April 2008.
-
R. Gupta, N. Gupta, X. Zhang, D. Jeffrey, V. Nagarajan, S. Tallam, and C. Tian
Scalable Dynamic Information Flow Tracking and its Applications,
NSF Next Generation Software Workshop (NSFNGS),
colocated with IPDPS, pages 1-5, Florida, April 2008.
-
V. Nagarajan, H-S. Kim, Y. Wu, and R. Gupta
Dynamic Information Flow Tracking on Multicores,
Workshop on Interaction between Compilers and Computer Architectures (INTERACT),
Co-located with HPCA, Salt Lake City, Feb. 2008.
-
S. Tallam, C. Tian, X. Zhang, and R. Gupta
Enabling Tracing of Long-Running Multithreaded Programs via Dynamic Execution Reduction,
International Symposium on Software Testing and Analysis (ISSTA),
pages 207-218, London, July 2007.
Scalable Profiling
-
V. Nagarajan, D. Jeffrey, R. Gupta, and N. Gupta
ONTRAC: A System for Efficient ONline TRACing for Debugging,
International Conference on Software Maintenance (ICSM),
pages 445-454, Paris, September 2007.
-
X. Zhang, N. Gupta, and R. Gupta,
``Whole Execution Traces and Their Use in Debugging,''
The Compiler Design Handbook: Optimizations and Machine Code Generation,
Second Edition, Chapter 4, CRC Press, pages 4-1--4-17, Dec. 2007.
-
S. Tallam and R. Gupta,
Unified Control Flow and Dependence Traces,
ACM Transactions on Architecture and Code Optimization (TACO),
Vol. 4, No. 3, 31 pages, September 2007.
-
Y. Lin, Y. Zhang, and R. Gupta,
The Design and Evaluation of Path Matching Schemes on Compressed Control Flow Traces,
Journal of Systems and Software (JSS),
Vol. 80, No. 3, pages 396-409, 2007.
-
X. Zhang and R. Gupta,
Whole Execution Traces and their Applications,
ACM Transactions on Architecture and Code Optimization (TACO),
Vol. 2, No. 3, pages 301-334, Sept. 2005.
-
S. Tallam, R. Gupta, and X. Zhang,
Extended Whole Program Paths,
International Conference on Parallel Architectures and Compilation Techniques (PACT),
pages 17-26, St. Loius, Missouri, September 2005.
-
X. Zhang and R. Gupta,
Whole Execution Traces,
IEEE/ACM 37th International Symposium on Microarchitecture (MICRO),
pages 105-116, Portland, Oregan, December 2004.
-
S. Tallam, X. Zhang, and R. Gupta,
Extending Path Profiling across Loop Backedges and Procedure Boundaries,
Second Annual IEEE/ACM International Symposium on Code Generation and Optimization (CGO),
pages 251-262, San Jose, CA, March 2004.
-
Y. Zhang and R. Gupta,
Timestamped Whole Program Path Representation and its Applications,
ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI),
pages 180-190, Snowbird, Utah, June 2001.
Dynamic Slicing and Matching
-
V. Nagarajan, R. Gupta, X. Zhang, M. Madou, B. De Sutter, and K. De Bosschere
Matching Control Flow of Program Versions,
International Conference on Software Maintenance (ICSM),
pages 84-93, Paris, September 2007.
-
X. Zhang, S. Tallam, and R. Gupta
Dynamic Slicing Long Running Programs through Execution Fast Forwarding,
14th ACM SIGSOFT Symposium on Foundations of Software Engineering (FSE),
pages 81-91, Portland, Oregon, November 2006.
-
X. Zhang, R. Gupta, and Y. Zhang
Cost and Precision Tradeoffs of Dynamic Slicing Algorithms,
ACM Transactions on Programming Languages and Systems (TOPLAS),
Vol. 27, No. 4, pages 631-661, July 2005.
-
X. Zhang and R. Gupta,
Matching Execution Histories of Program Versions,
Joint 10th European Software Engineering Conference and
13th ACM SIGSOFT Symposium on the Foundations of Software Engineering (ESEC-FSE),
pages 197-206, Lisbon, Portugal, September 2005.
-
X. Zhang and R. Gupta,
Cost Effective Dynamic Program Slicing,
ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI),
pages 94-106, Washington D.C., June 2004.
-
X. Zhang, R. Gupta, and Y. Zhang
Effective Forward Computation of Dynamic Slices Using Reduced Ordered Binary Decision Diagrams,
IEEE/ACM International Conference on Software Engineering (ICSE),
pages 502-511, Edinburgh, UK, May 2004.
-
Recipient of ICSE 2003 Distinguished Paper Award.
X. Zhang, R. Gupta, and Y. Zhang
Precise Dynamic Slicing Algorithms,
IEEE/ACM International Conference on Software Engineering (ICSE),
pages 319-329, Portland, Oregon, May 2003.
-
Nominated for Best Paper Award.
N. Gupta and P. Rao,
``Program Execution Based Module Cohesion Measurement,''
16th IEEE International Conference on Automated Software Engineering (ASE),
pages 144-153, San Diego, USA, November 2001.
-
R. Gupta and M.L. Soffa,
Hybrid Slicing: An Approach for Refining
Static Slices using Dynamic Information,
ACM SIGSOFT 3rd Symposium on the Foundations of Software
Engineering (FSE),
pages 29-40, Washington, DC, October 1995.
Extended version in ACM Transactions on Software Engineering
and Methodology (TOSEM),
Vol. 6, No. 4, pages 370-397, October 1997
(with J. Howard).
-
E. Duesterwald, R. Gupta, and M.L. Soffa,
Distributed Slicing and Partial Re-execution for Distributed Programs,
5th Workshop on Languages and Compilers for Parallel Computing (LCPC),
LNCS 757 Springer Verlag,
pages 497-511, Yale University, New Haven, Connecticut, August 1992.
Fault Location and Correction
-
D. Jeffrey, N. Gupta, and R. Gupta
Identifying the Root Causes of Memory Bugs Using Corrupted Memory Location Suppression,
International Conference on Software Maintenance (ICSM),
Beijing, China, September 2008.
-
D. Jeffrey, N. Gupta, and R. Gupta
Fault Localization using Value Replacement,
International Symposium on Software Testing and Analysis (ISSTA),
Seattle, July 2008.
-
N. Gupta and R. Gupta
ExPert: Dynamic Analysis based Fault Location via Execution Perturbations,
NSF Next Generation Software Workshop (NSFNGS),
colocated with IPDPS, pages 1-6, March 2007.
-
X. Zhang, S. Tallam, N. Gupta, and R. Gupta
Towards Locating Execution Omission Errors,
ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI),
pages 415-424, San Diego, June 2007.
-
X. Zhang, N. Gupta, and R. Gupta,
Locating Faulty Code By Multiple Points Slicing,
Software - Practice & Experience (SP&E) Journal,
Vol. 37, Issue 9, pages 935-961, July 2007.
-
X. Zhang, N. Gupta, and R. Gupta
A Study of Effectiveness of Dynamic Slicing in Locating Real Faults,
Empirical Software Engineering (ESE) Journal,
Vol. 12, No. 2, pages 143-160, April 2007.
-
X. Zhang, N. Gupta, and R. Gupta
Pruning Dynamic Slices With Confidence,
ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI),
pages 169-180, Ottawa, Canada, June 2006.
-
X. Zhang, N. Gupta, and R. Gupta
Locating Faults Through Automated Predicate Switching,
IEEE/ACM International Conference on Software Engineering (ICSE),
pages 272-281, Shanghai, China, May 2006.
-
N. Gupta, H. He, X. Zhang, and R. Gupta,
Locating Faulty Code Using Failure-Inducing Chops,
20th IEEE/ACM International Conference on Automated Software Engineering (ASE),
pages 263-272, Long Beach, California, Nov. 2005.
-
X. Zhang, H. He, N. Gupta, and R. Gupta
Experimental Evaluation of Using Dynamic Slices for Fault Location,
SIGSOFT-SIGPLAN Sixth International Symposium on Automated and Analysis-Driven Debugging (AADEBUG),
pages 33-42, Monterey, California, September 2005.
-
H. He and N. Gupta,
``Automated Debugging using Path-Based Weakest Preconditions,''
Fundamental Approaches to Software Engineering (FASE),
ETAPS Joint Conference 2004, Barcelona, Spain, March 29-31, 2004.
-
X. Zhang and R. Gupta,
Hiding Program Slices for Software Security,
First Annual IEEE/ACM International Symposium on Code Generation and Optimization (CGO),
pages 325-336, San Francisco, CA, March 2003.
-
C. Jaramillo, R. Gupta, and M.L. Soffa,
Comparison Checking: An Approach to Avoid Debugging of Optimized Code,
Joint 7th European Software Engineering Conference and
7th ACM SIGSOFT Symposium on the Foundations of Software Engineering (ESEC-FSE),
LNCS 1687, Springer Verlag, pages 268-284, Toulouse, France, Sept. 1999.
Software Testing
-
D. Jeffrey and N. Gupta,
``Experiments with Test Case Prioritization Using Relevant Slices,"
Journal on Systems and Software (JSS), for a Special Issue on Model-Based
Software Testing,
vol. 81, no. 2, pages 196-221, Feb. 2008.
-
D. Jeffrey and N. Gupta,
``Improving Fault Detection Capability by
Selectively Retaining Test Cases During Test Suite Reduction,''
IEEE Transactions on Software Engineering,
pages 108-123, vol. 33, no. 2, Feb. 2007.
-
Recipient of Best Paper Award.
D. Jeffrey and N. Gupta,
``Prioritizing Test Cases Using Relevant Slices,''
30th Annual International Computer Software and Applications Conference (COMPSAC),
pages 411-418, Chicago, USA, September 18-21, 2006.
-
S. Tallam and N. Gupta,
``A Concept Analysis Inspired Greedy Algorithm for Test Suite Minimization,''
ACM SIGPLAN-SIGSOFT Workshop on Program Analysis for Software Tools and Engineering (PASTE),
Lisbon, Portugal, September 5-6, 2005.
-
D. Jeffrey and N. Gupta,
``Test Suite Reduction with Selective Redundancy,''
21st IEEE International Conference on Software Maintenance (ICSM),
pages 549-558, Budapest, Hungary, September 25-30, 2005.
-
N. Gupta, Y-J. Cho, and M.Z. Hossain,
``Experiments with UNA for Solving Linear Constraints in Real Variables,''
ACM Symposium on Applied Computing (SAC),
Nicosia, Cyprus, March 14-17, 2004.
-
N. Gupta and Z.V. Heidepriem,
``A New Structural Coverage Criteria for Dynamic Detection of Program Invariants,''
18th IEEE International Conference on Automated Software Engineering (ASE),
Montreal, Quebec, Canada, October 6-10, 2003.
-
N. Gupta,
``Generating Test Data for Dynamically Discovering Likely Program Invariants,''
ICSE 2003 Workshop on Dynamic Analysis (WODA),
Portland, Oregon, USA, May 3-10, 2003.
-
S. Visvanathan and N. Gupta,
``Generating Test Data for Functions with Pointer Inputs,''
17th IEEE International Conference on Automated Software Engineering (ASE),
pages 149-160, Edinburgh, UK, September 2002.
-
N. Gupta and R. Gupta,
``Data Flow Testing,''
The Compiler Design Handbook: Optimizations and Machine Code Generation,
First Edition, Chapter 7, pages 247-267, CRC Press, September 2002.
-
N. Gupta
``A Hierarchy of Structural Coverage Criteria for Testing Multithreaded Programs,''
International Conference on Parallel and Distributed Processing
Techniques and Applications (PDPTA),
pages 1641-1646, Las Vegas, Nevada, USA, June 2001.
-
N. Gupta, A.P. Mathur, and M.L. Soffa,
``Generating Test Data for Branch Coverage,''
15th IEEE International Conference on Automated Software Engineering (ASE),
Grenoble, France, September 2000.
-
N. Gupta, A.P. Mathur, and M.L. Soffa,
``UNA Based Iterative Test Data Generation and its Evaluation,''
14th IEEE International Conference on Automated Software Engineering (ASE),
pages 224-232, Cocoa Beach, Florida, USA, October 1999.
-
N. Gupta, A.P. Mathur, and M.L. Soffa,
``Automated Test Data Generation Using an Iterative Relaxation Method,''
ACM SIGSOFT Sixth International Symposium on Foundations of Software Engineering (FSE),
pages 231-244, Orlando, Florida, USA, November 1998
-
M.J. Harrold, R. Gupta, and M.L. Soffa,
``A Methodology for Controlling the Size of a Test Suite,''
ACM Transactions on Software Engineering and Methodology (TOSEM),
Vol. 2, No. 3, pages 270-285, July 1993.
-
R. Gupta, M.J. Harrold, and M.L. Soffa,
An Approach to Regression Testing using Slicing,
IEEE-CS International Conference on Software Maintenance (ICSM),
pages 299-308, Orlando, Florida, November 1992.
Profile-Guided and Demand-Driven Analysis
-
R. Bodik, R. Gupta, and M.L. Soffa,
Retrospective -- Complete Removal of Redundant Expressions,
20 Years of the ACM/SIGPLAN Conference on Programming Language Design and
Implementation
(PLDI) (1979-1999): A Selection, ACM SIGPLAN Notices,
Vol. 39, No. 4, pages 596-597, April 2004.
-
R. Bodik, R. Gupta, and V. Sarkar,
ABCD: Eliminating Array Bounds Checks on Demand,
ACM SIGPLAN Conference on Programming Language Design and
Implementation (PLDI),
pages 321-333, Vancouver B.C., Canada, June 2000.
-
Selected for 20 Years of PLDI.
R. Bodik, R. Gupta and M.L. Soffa,
Complete Removal of Redundant Expressions,
ACM SIGPLAN Conference on Programming Language Design and
Implementation (PLDI),
pages 1-14, Montreal, Canada, June 1998.
-
R. Gupta, D. Berson, and J.Z. Fang,
Resource-Sensitive Profile-Directed Data Flow Analysis for
Code Optimization,
IEEE/ACM 30th International Symposium on Microarchitecture (MICRO),
pages 558-568, Research Triangle Park, North Carolina, December 1997.
-
E. Duesterwald, R. Gupta, and M.L. Soffa,
Demand-Driven Computation of Interprocedural Data Flow,
ACM SIGPLAN-SIGACT 22nd Symposium on
Principles of Programming Languages (POPL),
pages 37-48,
San Francisco, California, January 1995.
Also published in ACM Transactions on Programming
Languages and Systems (TOPLAS),
Vol. 19, No. 6,
pages 992-1030, November 1997.
-
R. Bodik, R. Gupta, and M.L. Soffa,
Refining Data Flow Information using Infeasible Paths,
Joint 6th European Software Engineering Conference and
5th ACM SIGSOFT Symposium on the Foundations of Software Engineering,
LNCS 1301, Springer Verlag, pages 361-377,
Zurich, Switzerland, September 1997.
-
R. Bodik and R. Gupta,
Partial Dead Code Elimination using Slicing Transformations,
ACM SIGPLAN Conference on Programming Language Design and
Implementation (PLDI),
pages 159-170, Las Vegas, Nevada, June 1997.
-
R. Gupta,
A Fresh Look at Optimizing Array Bound Checks,
ACM SIGPLAN Conference on Programming Language Design and
Implementation (PLDI),
pages 272-282, White Plains, NY, June 1990.
Also published in
ACM Letters on Programming Languages and Systems (LOPLAS),
Vol.2, Nos.1-4, pages 135-150, March-December 1994.
-
R. Gupta,
``Generalized Dominators and Post-Dominators,''
ACM SIGPLAN-SIGACT 19th Symposium on Principles of Programming Languages (POPL),
pages 246-257, Albuquerque, New Mexico, January 1992.