Designing code analyses for large software systems (DECA)

3 CP, 2 SWS

When? Wednesdays, 9:50-11:30
Where? Room Berlin/B 0.03 on the ground floor of the new Fraunhofer SIT building (Rheinstraße 75)

NOTE: Starting Wed, Oct. 29, the lecture moves to a new time and place:
Building S1/03, room 221, every Wednesday, 11:40-13:20

Our apologies for any inconvenience caused.

Exam: March 10, 2014, 16:00, S101/A1 (Audimax). This will be a 90-minute closed-book exam. Dictionaries are allowed, but no other aids. The exam will be in English, but answers in German will be permitted.

The topic of this lecture is the automated (static) code analysis of large software systems, particularly with respect to security properties. We will cover important scientific problems in the area (some partially solved, some still open) and discuss different conceptual frameworks that can be used to design and implement automated code analyses. We will pay particular attention to flow- and context-sensitive analyses, as well as pointer analyses.

NOTE: This lecture is the successor to the lecture ACA, which was held in winter 2013/14. There is one significant difference between DECA and ACA: In ACA, the biweekly exercises were of a practical nature, and students also learned how to implement program analyses. In DECA, the exercise sheets are more theoretical and can be answered on paper. The practical exercises now take place in the lab ICA instead. If you choose to attend DECA, we highly recommend choosing ICA as well!

Asking questions

If you have questions regarding the lecture, please use the Forum kindly provided by the Fachschaft. We will be monitoring the forum regularly.

SVN: Slides and other material, submission of coursework

All course material will be distributed through SVN. This is the URL for the SVN:

https://repository.st.informatik.tu-darmstadt.de/sse/deca/2014/

Exercise sheets and bonus system

There is a bonus system that allows you to improve your final grade by up to 1.0 points, depending on how well you succeed in the exercises.

  • There will be 4 exercise sheets, and all count towards the bonus.
  • Every exercise on every sheet states the number of points that can be earned for that exercise. Depending on how completely an exercise is solved, students may be awarded all of its points or a fraction of them.
  • Every exercise sheet will give a maximum of 15 points.
  • The final bonus will be computed by normalizing the number of points awarded against the total number of points that can be awarded, i.e., 60 points (4 sheets with up to 15 points each), and rounding up to the next grading level. The maximum bonus is 1.0.
  • The final grade for this course will be computed by adding your bonus to the grade for the final exam.
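
As a rough illustration of the bonus rule above, here is a minimal Python sketch. It uses the stated total of 60 points. The grading levels are NOT given on this page; the steps 0.0, 0.3, 0.7 and 1.0 below are an assumption based on common German grade steps, and the function name `bonus` is hypothetical. The rule actually applied by the instructors may differ.

```python
# Hypothetical sketch of the bonus rule; not the instructors' official formula.

MAX_POINTS = 60  # total points over all sheets, as stated above

# ASSUMPTION: bonus levels in tenths of a grade (0.0, 0.3, 0.7, 1.0).
# The course page does not list the grading levels explicitly.
BONUS_LEVELS_TENTHS = [0, 3, 7, 10]


def bonus(points_awarded):
    """Normalize the awarded points and round up to the next assumed level."""
    if not 0 <= points_awarded <= MAX_POINTS:
        raise ValueError("points must be between 0 and MAX_POINTS")
    for tenths in BONUS_LEVELS_TENTHS:
        # Compare points/MAX_POINTS <= tenths/10 without dividing,
        # which avoids floating-point rounding at the level boundaries.
        if points_awarded * 10 <= tenths * MAX_POINTS:
            return tenths / 10
    return BONUS_LEVELS_TENTHS[-1] / 10


if __name__ == "__main__":
    for pts in (0, 18, 25, 43, 60):
        print(pts, "points ->", bonus(pts))
```

Under these assumed levels, 25 of 60 points (about 0.42 after normalization) would be rounded up to a bonus of 0.7.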

Exercises are usually due two days before the lecture at which the next sheet is handed out, i.e., on Monday night. This allows us to discuss the results of the exercises in the next lecture. The due date will also be printed on each sheet. Hand in your results using SVN. Submissions by email will not be accepted.

Recommended reading

For further details, we recommend the following two books:

  • Data Flow Analysis: Theory and Practice (Khedker et al.)
  • Principles of Program Analysis (Nielson et al.)

Week 1: Oct. 15th – Instructor: Eric Bodden
Kick-Off and Jimple IR

Kick-off:

  • Formation of groups
  • Description of course outline

New material:

  • The Jimple Intermediate Representation: From source code or bytecode to control-flow graphs.

The slides for the first lecture are available here.

Additional reading material:

Week 2: Oct. 22nd – Instructor: Eric Bodden
Intra-Procedural Static Analysis

The slides are available here.

The first exercise sheet is available here.
NOTE: The sheet was updated on Oct. 22, 17:46, to fix a small mistake: one line of assembler was missing in Exercise 1.

Additional reading material:

Week 3: Oct. 29th – Instructor: Eric Bodden
Call-graph construction

Additional reading material:

The slides can be accessed through SVN.

Week 4: Nov. 5th – Instructor: Eric Bodden
Points-to Analysis

  • Flow-sensitive vs. flow-insensitive
  • May- vs. must-analysis
  • Weak and strong updates
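
To make the distinction between weak and strong updates concrete, here is a minimal sketch of how a flow-sensitive points-to analysis might handle a field store `x.f = y`. The heap representation and the function name `store` are hypothetical and not taken from the lecture material. The sketch also assumes that every abstract object stands for exactly one concrete object; without that assumption (e.g., for an allocation site inside a loop), a strong update would be unsound.

```python
# Hypothetical sketch: abstract heap maps (abstract object, field) -> points-to set.

def store(heap, pts_x, field, pts_y):
    """Abstract effect of `x.f = y`, given the points-to sets of x and y."""
    new_heap = dict(heap)
    # ASSUMPTION: each abstract object represents a single concrete object,
    # so a singleton points-to set for x identifies the written object exactly.
    strong = len(pts_x) == 1
    for obj in pts_x:
        key = (obj, field)
        if strong:
            # Strong update: the old contents are definitely overwritten.
            new_heap[key] = set(pts_y)
        else:
            # Weak update: x may refer to any of several objects, so the old
            # contents may survive and the new targets are merged in.
            new_heap[key] = set(heap.get(key, set())) | set(pts_y)
    return new_heap


if __name__ == "__main__":
    heap = {("o1", "f"): {"a"}}
    print(store(heap, {"o1"}, "f", {"b"}))        # strong update on o1.f
    print(store(heap, {"o1", "o2"}, "f", {"b"}))  # weak update on o1.f and o2.f
```

The weak update is the may-analysis fallback: it keeps all previously possible targets and therefore loses precision, but it stays sound when the written object is not known exactly.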

Additional reading material:

Week 5: Nov. 12th – Instructor: Mauro Baluda
Inter-procedural analysis

  • General design space of inter-procedural analysis
  • Flow-sensitivity or not?
  • Context-sensitivity or not?
  • Practical considerations
  • Discussion of first exercise sheet, presentation of sheet 2

Week 6: Nov. 19th – Instructor: Karim Ali
Call-strings approach

  • Context-sensitive analysis using the call-strings approach
  • Required treatment of recursion
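
One common way to keep call strings finite in the presence of recursion is to truncate them to the k most recent call sites (k-limiting). The sketch below illustrates only this idea; the function name `push_call` and the call-site labels are hypothetical, and the lecture may treat recursion differently.

```python
# Hypothetical sketch of k-limited call strings; not taken from the lecture slides.

def push_call(context, call_site, k):
    """Append a call site to the call string, keeping only the last k entries."""
    if k <= 0:
        # k = 0 degenerates to a context-insensitive analysis.
        return ()
    extended = context + (call_site,)
    return extended[-k:]


if __name__ == "__main__":
    ctx = ()
    # A recursive call at site "r" would grow the call string without bound;
    # with k = 2 the context stops changing after two calls.
    for _ in range(5):
        ctx = push_call(ctx, "r", 2)
        print(ctx)
```

Because the truncated contexts form a finite set, a fixed-point computation over them terminates, at the cost of merging all calling contexts that share the same last k call sites.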

Additional reading material:

Week 7: Nov. 26th – Instructor: Eric Bodden
Functional approach

  • Concept of summary functions
  • Non-feasibility of constructing general procedure summaries

Additional reading material:

Week 8: Dec. 3rd – Instructor: Eric Bodden
IFDS

  • Constructing summaries for distributive data-flow problems
  • Iterated discussion of solution to sheet 1, presentation of sheet 3

Additional reading material:

Week 9: Dec. 10th – Instructor: Eric Bodden
IDE

  • Added expressiveness and performance through IDE
  • Discussion of exercise sheet 2

Additional reading material:

Week 10: Dec. 17th – Instructor: Eric Bodden
SPLlift

  • Software product lines
  • SPLlift as an example IDE instance
  • Limitations of IDE

Additional reading material:

Week 11: Jan. 14th – Instructor: Eric Bodden
Tabulation of non-distributive analyses using VASCO

  • Recap on functional approach and call-strings approach
  • Taking the best out of both worlds, using VASCO
  • Tradeoffs between VASCO and IFDS

Additional reading material:

Week 12: Jan. 21st – Instructor: Eric Bodden
FlowTwist: Vulnerability analysis for the Java runtime library

  • The problem behind CVE-2012-4681 and CVE-2013-0422
  • How to detect this problem during static analysis
  • Scalability challenges and remedies

Additional reading material:

Week 13: Jan. 28th – Instructor: Steven Arzt
FlowDroid: Effective taint analysis for Android applications

  • Challenges in analyzing Android applications
  • On-demand alias analysis
  • Maintaining context sensitivity in on-demand analyses

Additional reading material:

Week 14: Feb. 4th – Instructor: Eric Bodden
Taming reflection in static analysis

  • Problem of reflection and dynamic loading
  • How to make static analyses aware of reflective calls and dynamically loaded code
  • Empirical assessment

Additional reading material:

Week 15: Feb. 11th – Instructor: Eric Bodden
Recap

We will recap the most important aspects of this course.

March 10, 2014, 16:00, S101/A1 (Audimax)
Exam