Implementing code analyses for large software systems (ICA)

6 CP, 4 SWS

When? Thursdays, 9:50-11:30 (we may often end earlier)
Where? Room Berlin/B 0.03 on the ground floor of the new Fraunhofer SIT building (Rheinstraße 75)

In this lab, students will learn to implement static code analyses for large software systems using the well-known Soot framework for the static analysis of Java and Android applications. Over the course of the semester, students will implement concrete program analyses that address real problems taken from the area of IT security. To this end, the lecturers will request new features every one to two weeks and discuss them in a joint meeting. The meeting also serves to clarify any questions students may have about the framework or their implementation. Students will learn how to (and will be required to) test their implementations regularly using provided test cases.

Important Note: We HIGHLY recommend attending the lecture Designing code analyses for large-scale software systems (DECA) at the same time, as this lab assumes knowledge of concepts taught in that lecture.

Structure of this course

This course runs concurrently with the lecture DECA. In that lecture, we explain important concepts of static program analysis for large-scale programs. In this lab, you will be able to try out those concepts on your own. The main goal of this lab is to develop a static security analysis over the course of the entire term. Students will work in teams of two or three and will refine and extend their analysis implementation from one week to the next until they reach a fully working implementation at the end. To help students reach this goal, we will give tips during the weekly meeting, in which students can also ask questions about problems with their current implementation. The final grade is based on the quality of the implementation that the student group provides at the end of the course.

Asking questions

If you have questions regarding the lecture, please use the Forum kindly provided by the Fachschaft. We will be monitoring the forum regularly.

SVN repository

Exercises will be handed out and handed in through this Subversion repository:

First exercise sheet

The first exercise sheet is available here. Subsequent sheets will be available directly from the SVN repository.

Introduction slides

You can download the introduction slides here.

Grading scheme

Students will be graded, in part, individually. Grades will be awarded based on the following weights:

  • 50% for the completion of the exercise sheets (one grade per group)
  • 10% for the quality of the final presentation (one grade per group)
  • 20% for the functionality of the delivered code: does it work as asked? (one grade per group)
  • 10% for the comprehensiveness of the test cases: is the code well tested? (one grade per group)
  • 10% for the individual report: every student will write a two-page report, which will be graded individually

If we suspect greatly unequal contributions by individual team members, we reserve the right to question individual team members and to weight the grades according to actual contributions.

Details on the presentation: The presentation should be given by one or at most two students of the group. However, all students should contribute to preparing the slides. Questions can and will be asked of all group members. The presentations should be easy to understand for the other students participating in the course. The duration of the presentation is 20 minutes, plus 10 minutes for questions. The contents are to be discussed with your advisor. 50% of the presentation grade (i.e. 5% in total) will be based on grades given by the other student groups, and the other 50% will be assigned by the lecturer.

Details on the code: The code should solve the problem stated. What this means exactly will be specified by your advisor. All non-trivial parts should be properly commented. Brief high-level documentation should ease navigation through the code for people unfamiliar with it.

Details on test cases: The code should be well tested through automated test cases. Discuss a testing strategy with your advisor. Make sure to include not only the most obvious test cases but also tricky ones.

Details on individual report: The report must not only detail the general idea and high-level solution strategy, but also explain in detail how you personally contributed to the project. The report should be two pages in ACM Alternate style (Option 2). There must be one report per team member. Reports must be written individually; copying text among team members is not allowed.

Reports and the code must be handed in through SVN by March 31st, 23:59 local time.

Final presentations: The presentations will take place on March 31st, 9:00-13:00 at Fraunhofer SIT in Room Berlin/B 0.03 with the following schedule: UPDATED

Time   Group#  Project#  Project Title
9:00   2       1         ASM backend for SOOT
9:30   6       2         Static Analysis of invokedynamic
10:00  14      4         Dead code elimination
10:30  15-minute break
10:45  1       6         A Precise Sink model for Data Flow Analyses
11:15  13      7         Inter-Procedural Might Throw Analysis
11:45  12      3         Validation of Jimple bytecode before Dalvik conversion
12:15  4       5         Improved handling of reflective calls