Data Structures in OpenBSD

Note: This project is being pursued by MSc student Adam Britt in 2011-2012.

  • Supervisor Dr. Joseph Kiniry
  • Subject Area Software Engineering
  • Pre-requisite Good knowledge of C and operating systems
  • Co-requisite (things you must learn along the way) Basic knowledge of compilers, standardized APIs (e.g., POSIX APIs, the OpenBSD kernel API, the standard C library, etc.)
  • Subject Coverage Software Engineering, Formal Specifications, Documentation, Operating Systems
  • Project Type Reviewing existing documentation (i.e., standards documents, man pages, and code comments) and writing formal specification of (portions of) critical, key C APIs
  • Hardware/Software: Any machine running OpenBSD

Description

OpenBSD is an operating system known for its strong security practices and uncompromising principles. It also has a long history, being forked in 1995 from netbsd, which itself is descended from Berkeley Software Distribution (BSD). BSD was initially started in 1977.

While this software has shown that it stands the test of time, there are many changes and improvements in the seemingly unrelated fields of data structure design and hardware, since the original kernel was written. While the kernel is still in active development, the data structures used by the kernel have not been a primary point of focus.

Memory was a hard constraint in the 1980's that heavily influenced how much space a data structure could use, often trading efficiency for space. This constraint is almost non-existent today allowing us to consider data structures that existed 30 years ago but that used too much memory.

Furthermore, there have been many improvements or new data structures that might make sense to use in the kernel. Most notably is the development of pure data structures.

The state of data structures can be summed up by an OpenBSD committer: "There are lots of very simple structures, like lists, still being used. The hash table implementation is also weak sauce."

The project goal is to analyse the existing data structures used by the kernel to see if improvements can be made either in terms of security, efficiency, or both. This is an area where even moderate increases in efficiency can have a large impact on overall system performance because data structures are so fundamental to many parts of the kernel.

Mandatory

  1. Gain familiarity with OpenBSD.
  2. Gain familiarity with tools like GrammaTech's CodeSurfer.
  3. Analyze the OpenBSD kernel to find what data structures are being used and where.
  4. A study of the current relevant literature in both data structures and (OpenBSD) kernel development
  5. Evaluate the OpenBSD kernel to find where data structure changes could have the largest impact.
  6. Design and implement appropriate data structures where a change would have the largest impact.