An OpenBSD Super-Lint
- Supervisor Dr. Joseph Kiniry
- Subject Area Software Engineering
- Pre-requisites Very good knowledge of C
- Co-requisites (things you must learn along the way) Good knowledge of UNIX (generally) and OpenBSD (specifically), static program checking, and OS-level systems programming
- Subject Coverage Programming Languages, Compilers, Static Analysis
- Project Type Design and Implementation
- Hardware Any computer running OpenBSD
Description
OpenBSD is widely regarded as the world's most secure operating system. It represents the combined effort of hundreds of dedicated individuals, most of whom work in their spare time for free, over nearly a decade.
OpenBSD developers try very hard to write code that is well-designed, clean, maintainable, but above all, secure. A number of techniques are used for such, but most of them are relatively ad hoc when compared to the kind of program development that that KindSoftware research group regularly participates in.
One way to further improve the quality and security of OpenBSD code is to use various types of static analysis to reason about the code to (perhaps automatically) ensure that it conforms to some form of (perhaps lightweight, or entirely implicit) specification.
A number of tools exist to provide this kind of functionality. The most rudimentary of them is called "lint" and has been around for nearly twenty years. Lint statically checks C code for standard programming errors, but is not very "smart".
Other, much more advanced, static analysis tools have been developed over the years. One of the most mature of these tools was originally called LCLint. It has seen a number of different implementations, the most recent of which is called Splint.
The main problem with static analysis of arbitrary C code is that (a) C's semantics are under-defined, and (b) aliasing and pointer manipulation make reasoning impossible in many situations. This means that tools like Splint are necessarily incomplete, as they cannot reason about arbitrary C code and thus many problems in the code cannot be detected, and unsound, as some of the problems that these tools identify are not in fact errors.
But OpenBSD C code is not arbitrary. The OpenBSD developers have strict coding conventions that significantly restricts the use of C programming constructs. These conventions can thus significantly simplify the semantic domain, thereby improving the chances of developing a sound and complete automated reasoning framework.
This project focuses on (a) developing local expertise in these tools, (b) formalising the conventions used by OpenBSD developers, and (c) synthesising these two results.
Mandatory Goals
- Gain familiarity with OpenBSD.
- Gain familiarity with lightweight static checkers for C like the lint tool.
- Evaluate several research static checkers for systems-level C code.
- Analyse and formalise some of the core OpenBSD development methodologies that simplify the domain semantics for automated reasoning.
- Design, implement, and test a framework for static checking systems-level C code in OpenBSD using a combination of existing FLOSS tools and new techniques and incorporating the results of the previous step.
- Release the framework and tools to the world under an appropriate FLOSS license, hopefully so that it will be incorporated into, and used by, OpenBSD.
Discretionary Goals
- Analyse a single critical OpenBSD component with the new tool.
- Support framework and tool indefinitely and help kick-start community for such.
- Coauthor, submit, and publish a paper on this work.
Exceptional Goals
- Analyse a several critical OpenBSD components with the new tool.
Sources of Information and Preparatory Reading
- OpenBSD
- The Splint FAQ
- Static Analysis Tools
- Look into CodeSurfer/C and CodeSurfer/x86