Type systems are usually formulated as collections of rules for
checking the ``shape consistency'' of programs. This kind of
checking exposes not only trivial mental slips, but also deeper
conceptual errors, which frequently manifest as type errors.
At a larger scale, module systems formalize the shapes of the
interfaces between components of a large software project and
automatically verify their consistency. By making interfaces
explicit, they also help document the structure of the system and
guide the design process. In conventional programming languages such
as Ada, the module system provides straightforward mechanisms for
packaging components and abstracting away from internal
representations of data. Newer languages like Standard ML go further,
supporting the explicit construction and type-checking of more general
procedures for constructing large-scale program components from
others. On the other hand, object-oriented languages such as C++ use
objects and classes for both small- and medium-scale
program structuring, combining aspects of type and module systems in a
single mechanism. Newer object-oriented languages such as Java also
provide rudimentary support for larger-scale packaging of groups of
classes.
Research Themes
My own research in type and module systems began with an exploration
of the connections between object-oriented abstraction and the data
abstraction supported by conventional module systems. Much of the
technical work (by myself and others) involved developing an
appropriate mathematical framework for modeling both objects and basic
abstract data types. My current interest is in
enriching this framework so that other aspects of module systems, such
as the control of name spaces and shared substructures, can also be
expressed. An important parallel effort is the development of
practical type inference techniques for the rich type systems used in
these models. The long-term goal is the design of new type and module
systems for practical programming languages and environments, in which
objects and classes can coexist with a variety of other mechanisms for
structuring very large-scale programs.
Another focus in my recent work has been the use of static type systems in concurrent programming based on message passing, in particular concurrent object-oriented programming. My previous research in this area (in collaboration with Sangiorgi, Kobayashi, and Turner, among others) has shown how type systems with features such as subtyping and linearity information can be ``transplanted'' from sequential languages, yielding useful tools for static detection of common errors. More surprisingly, these type systems also give rise to new techniques for reasoning about concurrent programs. My current interests concern the role of parametric polymorphism in modular reasoning about concurrent systems. By capturing formally the intuition that a generic function or server process must behave uniformly in the type of the data it manipulates, we should be able to reduce significantly the number of states that must be considered when reasoning about a concurrent system. The most important application area for such improvements is in model checking, where the size of state spaces is a crucial concern. In the longer term, I am interested in type systems capable of ensuring various forms of noninterference between concurrent activities. The ultimate aim is to assemble these fragments into a rich type discipline and an associated ``logical toolbox'' of more general manual and semi-automatic reasoning techniques.
An underlying theme in my work is the interweaving of theoretical modeling and pragmatic experimentation. I have used a variety of tiny programming calculi for this purpose---in particular, Church's lambda-calculus forms a framework for type systems of sequential languages, while Milner's pi-calculus plays a similar role in my work on concurrent languages. The value of these calculi as research tools lies in the fact that they are simultaneously media in which to experiment with useful programming idioms and styles, intermediate languages for compilers, and tractable mathematical objects whose properties can be investigated rigorously.
An important experimental vehicle for my research has been the language Pict, conceived in 1993 (with Turner) as a testbed for statically typed concurrent programming using objects. Pict combines a clean operational semantics based directly on the pi-calculus with an ambitious type system extending my earlier theoretical work on type systems for objects and classes. The pi-calculus also serves as the intermediate language for an optimizing compiler. Current work on Pict centers on its use as the core of a family of experimental languages ("nomadic pict") supporting various forms of code mobility---transmission of computational agents between the nodes on a distributed system or across the internet. Basing these experiments on a language with both a firm theoretical grounding and a full-scale implementation will enable a systematic and pragmatically grounded exploration of some of the huge space of ``languages for programming the world wide web,'' ranging from the simple client-server architectures typical of present-day web languages like Java to much richer programming models incorporating migration of running agents, support for fault tolerance, and flexible security policies. A number of interesting typing issues arise in this context---for example, the problem of checking static consistency of interfaces in a long-lived, distributed, and persistent environment.
My recent move to Penn has led to a number of opportunities for collaborative projects in novel areas for me. In one recent line of work, Peter Buneman and I have defined a novel type analysis for ``semistructured database query languages'' based on an unusual notion of untagged union types. In another, Insup Lee, Trevor Jim, and I are working on refining and generalizing a file synchronization system that has been a side project of my own for the past two years. We have the system running now on both Windows and Unix (it is the first such tool capable of handling both platforms), and are considering how to extend it to the domain of synchronizing replicated information on the Web.
Another recent project, dubbed TinkerType, concerns modular presentation of collections of type systems. The goal is to construct a ``map'' of a large variety of existing type systems, showing their family relations and the ways in which some can be derived by incremental modification -- ``inheritance'' -- from others. Over the past few months, we have developed a formal language for describing fragments of type systems and built a tool for assembling type systems (both TeX descriptions of inference rules and running typecheckers) from their constituent fragments. The first product of this project will be a graduate-level textbook on type systems. In the longer term, our ambition is to construct not just type systems but also proofs of standard properties in an incremental manner.