Journal Articles
[1]
|
Stephen Mell, Steve Zdancewic, and Osbert Bastani.
Optimal Program Synthesis via Abstract Interpretation.
Proceedings of the ACM on Programming Languages, 8(POPL), 2024.
[ PDF ]
We consider the problem of synthesizing programs with numerical constants that optimize a quantitative objective, such as accuracy, over a set of input-output examples. We propose a general framework for optimal synthesis of such programs in a given DSL, with provable optimality guarantees. Our framework enumerates programs in a general search graph, where nodes represent subsets of concrete programs. To improve scalability, it uses A* search in conjunction with a search heuristic based on abstract interpretation; intuitively, this heuristic establishes upper bounds on the value of subtrees in the search graph, enabling the synthesizer to identify and prune subtrees that are provably suboptimal. In addition, we propose a natural strategy for constructing abstract transformers for monotonic semantics, which is a common property for components in neurosymbolic DSLs. Finally, we implement our approach in the context of two existing DSLs for data classification, demonstrating that our algorithm is more scalable than existing optimal synthesizers.
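To give a flavor of the pruning idea, here is a minimal sketch (in Haskell, under stated assumptions: it is not the paper's algorithm or DSL, and the node representation and scoring are hypothetical) of a best-first search that discards any partial program whose upper bound cannot beat the best complete program found so far:

    import Data.List (sortOn)
    import Data.Ord (Down (..))

    -- A search node is a complete program with its exact objective value, or a
    -- partial program carrying an upper bound on the value of any completion
    -- (in the paper, such bounds come from abstract interpretation).
    data Node p
      = Complete p Double
      | Partial p Double [Node p]

    -- Best-first search with bound-based pruning: expand the node with the
    -- highest bound; drop partial nodes whose bound cannot beat the incumbent.
    search :: [Node p] -> Maybe (p, Double)
    search = go Nothing
      where
        go best frontier = case sortOn (Down . bound) frontier of
          [] -> best
          Complete p s : rest
            | maybe True ((< s) . snd) best -> go (Just (p, s)) rest
            | otherwise                     -> go best rest
          Partial _ b children : rest
            | maybe False ((>= b) . snd) best -> go best rest   -- provably suboptimal: prune
            | otherwise                       -> go best (children ++ rest)

        bound (Complete _ s)  = s
        bound (Partial _ b _) = b

The pruning is safe only if the bound never underestimates the best completion, which is exactly the role the abstract transformers play in the paper.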
|
[2]
|
Nicolas Chappe, Paul He, Ludovic Henrio, Yannick Zakowski, and Steve Zdancewic.
Choice Trees: Representing Nondeterministic, Recursive, and
Impure Programs in Coq.
Proceedings of the ACM on Programming Languages, 7(POPL), 2023.
[ PDF ]
This paper introduces Choice Trees (CTrees), a monad for modeling
nondeterministic, recursive, and impure programs in Coq.
Inspired by Xia et al.'s ITrees, this novel data structure
embeds computations into coinductive trees with three kinds of
nodes: external events and two variants of nondeterministic
branching. This apparent redundancy allows us to provide a
shallow embedding of denotational models with internal choice
in the style of CCS, while recovering an inductive LTS view of
the computation. CTrees inherit a vast collection of
bisimulation and refinement tools, with respect to which we
establish a rich equational theory. We connect CTrees to the
ITrees infrastructure by showing how a monad morphism
embedding the former into the latter makes it possible to use CTrees to
implement nondeterministic effects. We demonstrate the
utility of CTrees by using them to model concurrency semantics
in two case studies: CCS and cooperative multithreading.
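As a rough illustration of the data structure (a hedged Haskell sketch, not the library's Coq definition; the constructor names are approximate), the three kinds of nodes look like this:

    {-# LANGUAGE GADTs #-}

    -- Sketch of the choice-tree shape: pure returns, external events with a
    -- continuation, and two flavors of nondeterministic branching.  In Coq the
    -- type is coinductive; here Haskell's laziness stands in for that.
    data CTree e r where
      Ret :: r -> CTree e r                        -- computation is finished
      Vis :: e a -> (a -> CTree e r) -> CTree e r  -- external event, await an answer
      BrD :: [CTree e r] -> CTree e r              -- internal ("delayed") choice
      BrS :: [CTree e r] -> CTree e r              -- visible ("stepping") choice

    -- CCS-style choice between processes p and q:
    --   internal choice:  BrD [p, q]      external choice:  BrS [p, q]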
|
[3]
|
Nick Rioux, Xuejing Huang, Bruno C. d. S. Oliveira, and Steve Zdancewic.
A Bowtie for a Beast: Overloading, Eta Expansion, and
Extensible Data Types in F⋈.
Proceedings of the ACM on Programming Languages, 7(POPL), 2023.
[ PDF ]
The typed merge operator offers the promise of a compositional style of statically-typed programming in which solutions to the expression problem arise naturally. This approach, dubbed compositional programming, has recently been demonstrated by Zhang et al.
Unfortunately, the merge operator is an unwieldy beast. Merging values from overlapping types may be ambiguous, so disjointness relations have been introduced to rule out undesired nondeterminism and obtain a well-behaved semantics. Past type systems using a disjoint merge operator rely on intersection types, but extending such systems to include union types or overloaded functions is problematic: naively adding either reintroduces ambiguity. In a nutshell: the elimination forms of unions and overloaded functions require values to be distinguishable by case analysis, but the merge operator can create exotic values that violate that requirement.
This paper presents F⋈, a core language that demonstrates how unions, intersections, and overloading can all coexist with a tame merge operator. The key is an underlying design principle that states that any two inhabited types can support either the deterministic merging of their values, or the ability to distinguish their values, but never both. To realize this invariant, we decompose previously studied notions of disjointness into two new, dual relations that permit the operation that best suits each pair of types. This duality respects the polarization of the type structure, yielding an expressive language that we prove to be both type safe and deterministic.
|
[4]
|
Mohsen Lesani, Li-Yao Xia, Anders Kaseorg, Christian J. Bell, Adam Chlipala,
Benjamin C. Pierce, and Steve Zdancewic.
C4: Verified Transactional Objects.
Proceedings of the ACM on Programming Languages, 6(OOPSLA), 2022.
[ PDF ]
Transactional objects combine the performance of classical concurrent objects with the high-level programmability of transactional memory. However, verifying the correctness of transactional objects is tricky, requiring
reasoning simultaneously about classical concurrent objects, which guarantee the atomicity of individual
methods---the property known as linearizability---and about software-transactional-memory libraries, which
guarantee the atomicity of user-defined sequences of method calls---or serializability.
We present a formal-verification framework called C4, built up from the familiar notion of linearizability
and its compositional properties, that allows proof of both kinds of libraries, along with composition of
theorems from both styles to prove correctness of applications or further libraries. We apply the framework
in a significant case study, verifying a transactional set object built out of both classical and transactional
components following the technique of transactional predication; the proof is modular, reasoning separately
about the transactional and nontransactional parts of the implementation. Central to our approach is the use
of syntactic transformers on interaction trees---i.e., transactional libraries that transform client code to enforce
particular synchronization disciplines. Our framework and case studies are mechanized in Coq.
|
[5]
|
Irene Yoon, Yannick Zakowski, and Steve Zdancewic.
Formal Reasoning About Layered Monadic Interpreters.
Proceedings of the ACM on Programming Languages, 6(ICFP), 2022.
[ PDF ]
Monadic computations built by interpreting, or “handling,” operations of a
free monad are a compelling formalism for modeling language semantics and
defining the behaviors of effectful systems, especially in dependent type
theories such as Coq's. The resulting layered semantics offer the promise of
modular reasoning principles based on the equational theory of the underlying
monads. However, there are a number of obstacles to using such layered
interpreters in practice. With more layers comes more boilerplate and glue
code needed to define the monads and interpreters involved. That overhead is
compounded by the need to define (and justify) the relational reasoning
principles that characterize the equivalences at each layer.
This paper addresses these problems by significantly extending the
capabilities of the Coq interaction trees (ITrees) library, which
supports layered monadic interpreters. We characterize a rich class of
“interpretable monads”---obtained by applying monad transformers to
ITrees---and show how to generically lift interpreters through them. We
also introduce a corresponding framework for relational reasoning about
“equivalence of monads up to a relation R.” This collection of
typeclasses, instances, new reasoning principles, and tactics greatly
generalizes the existing theory of the ITrees library, eliminating large
amounts of unwieldy boilerplate code and dramatically simplifying proofs.
|
[6]
|
Lucas Silver and Steve Zdancewic.
Dijkstra Monads Forever: Termination-Sensitive Specifications
for Interaction Trees.
Proceedings of the ACM on Programming Languages, 5(POPL),
January 2021.
[ PDF ]
This paper extends the Dijkstra monad framework, designed for writing specifications over effectful programs using monadic effects, to handle termination-sensitive specifications over interactive programs. We achieve this by introducing base specification monads for non-terminating programs with uninterpreted events. We model such programs using interaction trees, a coinductive datatype for representing programs with algebraic effects in Coq, which we further develop by adding trace semantics. We show that this approach subsumes typical, simple proof principles. The framework is implemented as an extension of the Interaction Trees Coq library.
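For readers unfamiliar with the terminology, the classic Dijkstra monad of pure predicate transformers can be sketched in a few lines of Haskell (standard background only; the paper's contribution is a termination-sensitive version of such specification monads for interaction trees, and divSpec below is an invented example):

    -- The weakest-precondition specification monad: a specification maps a
    -- postcondition on results to the precondition that guarantees it.
    newtype W a = W { wp :: (a -> Bool) -> Bool }

    instance Functor W where
      fmap f (W w) = W (\post -> w (post . f))

    instance Applicative W where
      pure x = W (\post -> post x)
      W wf <*> W wx = W (\post -> wf (\f -> wx (post . f)))

    instance Monad W where
      W w >>= k = W (\post -> w (\x -> wp (k x) post))

    -- Invented example: safe division demands a nonzero divisor.
    divSpec :: Int -> Int -> W Int
    divSpec _ 0 = W (const False)      -- no postcondition is achievable
    divSpec x y = pure (x `div` y)

    -- wp (divSpec 10 2) (> 0)  ==  True
    -- wp (divSpec 10 0) (> 0)  ==  False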
|
[7]
|
Yannick Zakowski, Calvin Beck, Irene Yoon, Ilya Zaichuk, Vadim Zaliva, and
Steve Zdancewic.
Modular, Compositional, and Executable Formal Semantics for LLVM
IR.
Proceedings of the ACM on Programming Languages, 5(ICFP), 2021.
[ PDF ]
This paper presents a novel formal semantics, mechanized in Coq, for a large, sequential subset of the LLVM IR. In contrast to previous approaches, which use relationally-specified operational semantics, this new semantics is based on monadic interpretation of interaction trees, a structure that provides a more compositional approach to defining language semantics while retaining the ability to extract an executable interpreter. Our semantics handles many of the LLVM IR’s non-trivial language features and is constructed modularly in terms of event handlers, including those that deal with nondeterminism in the specification. We show how this semantics admits compositional reasoning principles derived from the interaction trees equational theory of weak bisimulation, which we extend here to better deal with nondeterminism, and we use them to prove that the extracted reference interpreter faithfully refines the semantic model. We validate the correctness of the semantics by evaluating it on unit tests and LLVM IR programs generated by HELIX.
|
[8]
|
Paul He, Eddy Westbrook, Brent Carmer, Chris Phifer, Valentin Robert, Karl
Smeltzer, Andrei Ştefănescu, Aaron Tomb, Adam Wick, Matthew
Yacavone, and Steve Zdancewic.
A Type System for Extracting Functional Specifications from
Memory-Safe Imperative Programs.
Proceedings of the ACM on Programming Languages, 5(OOPSLA), 2021.
[ PDF ]
Verifying imperative programs is hard. A key difficulty is that the specification of what an imperative program does is often intertwined with details about pointers and imperative state. Although there are a number of powerful separation logics that allow the details of imperative state to be captured and managed, these details are complicated and reasoning about them requires significant time and expertise. In this paper, we take a different approach: a memory-safe type system that, as part of type-checking, extracts functional specifications from imperative programs. This disentangles imperative state, which is handled by the type system, from functional specifications, which can be verified without reference to pointers. A key difficulty is that sometimes memory safety depends crucially on the functional specification of a program; e.g., an array index is only memory-safe if the index is in bounds. To handle this case, our specification extraction inserts dynamic checks into the specification. Verification then requires the additional proof that none of these checks fail. However, these checks are in a purely functional language, and so this proof also requires no reasoning about pointers.
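As a tiny illustration of the inserted-check idea (a hypothetical Haskell sketch, not the paper's type system or specification language), an imperative array read might extract to a pure function whose specification carries the bounds check explicitly:

    -- Hypothetical extracted specification of an array read: the imperative
    -- code's memory-safety obligation becomes an explicit, purely functional check.
    readSpec :: [Int] -> Int -> Maybe Int
    readSpec xs i
      | 0 <= i && i < length xs = Just (xs !! i)   -- in bounds: the value read
      | otherwise               = Nothing          -- the inserted dynamic check fails

    -- Verification then amounts to proving, entirely in the pure world, that
    -- every call site supplies an index for which readSpec returns Just.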
|
[9]
|
Li-yao Xia, Yannick Zakowski, Paul He, Chung-Kil Hur, Gregory Malecha,
Benjamin C. Pierce, and Steve Zdancewic.
Interaction Trees.
Proceedings of the ACM on Programming Languages, 4(POPL),
January 2020.
[ PDF ]
Interaction trees (ITrees) are a general-purpose data structure for representing the behaviors of recursive programs that interact with their environments. A coinductive variant of “free monads,” ITrees are built out of uninterpreted events and their continuations. They support compositional construction of interpreters from event handlers, which give meaning to events by defining their semantics as monadic actions. ITrees are expressive enough to represent impure and potentially nonterminating, mutually recursive computations, while admitting a rich equational theory of equivalence up to weak bisimulation. In contrast to other approaches such as relationally specified operational semantics, ITrees are executable via code extraction, making them suitable for debugging, testing, and implementing software artifacts that are amenable to formal verification.
We have implemented ITrees and their associated theory as a Coq library, mechanizing classic domain- and category-theoretic results about program semantics, iteration, monadic structures, and equational reasoning. Although the internals of the library rely heavily on coinductive proofs, the interface hides these details so that clients can use and reason about ITrees without explicit use of Coq’s coinduction tactics.
To showcase the utility of our theory, we prove the termination-sensitive correctness of a compiler from a simple imperative source language to an assembly-like target whose meanings are given in an ITree-based denotational semantics. Unlike previous results using operational techniques, our bisimulation proof follows straightforwardly by structural induction and elementary rewriting via an equational theory of combinators for control-flow graphs.
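To convey the shape of the structure, here is a minimal Haskell sketch (the real library is a coinductive Coq development; the event type CellE and its handler below are invented for illustration). An interaction tree is a free monad plus silent steps, and a handler interprets its events into an ordinary monad:

    {-# LANGUAGE GADTs, RankNTypes #-}

    import Control.Monad.State (State, get, put)

    -- Sketch of the interaction-tree shape: a coinductive "free monad" of
    -- uninterpreted events (laziness stands in for coinduction here).
    data ITree e r where
      Ret :: r -> ITree e r                        -- finished, with a result
      Tau :: ITree e r -> ITree e r                -- one silent step
      Vis :: e a -> (a -> ITree e r) -> ITree e r  -- emit an event, await its answer

    instance Functor (ITree e) where
      fmap f (Ret r)   = Ret (f r)
      fmap f (Tau t)   = Tau (fmap f t)
      fmap f (Vis e k) = Vis e (fmap f . k)

    instance Applicative (ITree e) where
      pure = Ret
      tf <*> tx = tf >>= \f -> fmap f tx

    instance Monad (ITree e) where
      Ret r   >>= k = k r
      Tau t   >>= k = Tau (t >>= k)
      Vis e g >>= k = Vis e (\a -> g a >>= k)

    -- An invented event signature for one mutable cell, plus a handler giving
    -- those events meaning as State actions.
    data CellE a where
      GetC :: CellE Int
      PutC :: Int -> CellE ()

    handleCell :: CellE a -> State Int a
    handleCell GetC     = get
    handleCell (PutC n) = put n

    -- Interpret a tree by running each event through a handler; silent steps
    -- are skipped, so an infinite Tau chain diverges, as any executable
    -- interpreter must allow.
    interp :: Monad m => (forall a. e a -> m a) -> ITree e r -> m r
    interp _ (Ret r)   = pure r
    interp h (Tau t)   = interp h t
    interp h (Vis e k) = h e >>= interp h . k

    -- e.g.  runState (interp handleCell (Vis (PutC 3) (\_ -> Vis GetC Ret))) 0  ==  (3, 3)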
|
[10]
|
Nick Rioux and Steve Zdancewic.
Computation Focusing.
Proceedings of the ACM on Programming Languages, 4(ICFP), 2020.
[ PDF ]
Focusing is a technique from proof theory that exploits type
information to prune inessential nondeterminism from proof
search procedures. Viewed through the lens of the Curry-Howard
correspondence, a focused typing derivation yields terms in
normal form. This paper explores how to exploit focusing for
reasoning about contextual equivalences and full
abstraction. We present a focused polymorphic
call-by-push-value calculus and prove a computational
completeness result: for every well-typed term, there exists a
focused term that is βη-equivalent to it. This completeness
result yields a powerful way to refine the context lemmas for
establishing contextual equivalences, cutting down the set
that must be considered to just focused contexts. The paper
demonstrates the application of focusing to establish program
equivalences, including free theorems. It also uses focusing
to prove full abstraction of a translation of the pure, total
call-by-push-value language into a language with divergence
and simple effect types, yielding a novel solution to a
simple-to-state, but hitherto difficult to solve problem.
|
[11]
|
Anders Miltner, Solomon Maina, Kathleen Fisher, Benjamin C. Pierce, David
Walker, and Steve Zdancewic.
Synthesizing Symmetric Lenses.
Proceedings of the ACM on Programming Languages, 3(ICFP), 2019.
[ PDF ]
Lenses are programs that can be run both "front to back" and "back to front," allowing updates to either their source or their target data to be transferred in both directions. Since their introduction by Foster et al., lenses have been extensively studied, extended, and applied. Recent work has also demonstrated how techniques from type-directed program synthesis can be used to efficiently synthesize a simple class of lenses—so-called bijective lenses over string data—given a pair of types (regular expressions) and a small number of examples.
We extend this synthesis algorithm to a much broader class of lenses, called simple symmetric lenses, including all bijective lenses, all of the popular category of "asymmetric" lenses, and a rich subset of the more powerful “symmetric lenses” proposed by Hofmann et al. Intuitively, simple symmetric lenses allow some information to be present on one side but not the other and vice versa. They are of independent theoretical interest, being the largest class of symmetric lenses that do not rely on persistent internal state.
Synthesizing simple symmetric lenses is substantially more challenging than synthesizing bijective lenses: Since some of the information on each side can be “disconnected” from the other side, there will, in general, be many lenses that agree with a given example. To guide the search process, we use stochastic regular expressions and ideas from information theory to estimate the amount of information propagated by a candidate lens, generally preferring lenses that propagate more information, as well as user annotations marking parts of the source and target data structures as either irrelevant or essential.
We describe an implementation of simple symmetric lenses and our synthesis procedure as extensions to the Boomerang language. We evaluate its performance on 48 benchmark examples drawn from Flash Fill, Augeas, the bidirectional programming literature, and electronic file format synchronization tasks. Our implementation can synthesize each of these lenses in under 30 seconds.
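For readers new to lenses, the basic asymmetric get/put shape and its round-trip laws can be sketched as follows (a standard textbook definition, not Boomerang code; simple symmetric lenses generalize it by letting each side carry data the other does not determine):

    -- The classic asymmetric lens: a view function and an update function.
    data Lens s v = Lens
      { get :: s -> v        -- "front to back": extract the view
      , put :: v -> s -> s   -- "back to front": push an edited view into the source
      }

    -- Round-trip laws a well-behaved lens must satisfy:
    --   get l (put l v s) == v    (PutGet)
    --   put l (get l s) s == s    (GetPut)

    -- Example: a lens between a (name, age) pair and just the name.
    nameL :: Lens (String, Int) String
    nameL = Lens { get = fst, put = \n (_, a) -> (n, a) }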
|
[12]
|
Anders Miltner, Kathleen Fisher, Benjamin C. Pierce, David Walker, and Steve
Zdancewic.
Synthesizing Bijective Lenses.
Proceedings of the ACM on Programming Languages, 2(POPL),
January 2018.
[ PDF ]
Bidirectional transformations between different data representations
occur frequently in modern software systems. They appear as serializers and
deserializers, as parsers and pretty printers, as database views and view
updaters, and as a multitude of different kinds of ad hoc data converters.
Manually building bidirectional transformations---by writing two separate
functions that are intended to be inverses---is tedious and error prone.
A better approach is to use a domain-specific language in
which both directions can be written as a single expression. However,
these domain-specific languages can be difficult to program in, requiring
programmers to manage fiddly details while working in a complex type system.
We present an alternative approach.
Instead of coding transformations manually, we synthesize them from
declarative format descriptions and examples.
Specifically,
we present Optician, a tool for type-directed synthesis of bijective
string transformers. The inputs to Optician are a pair of ordinary
regular expressions representing two data formats
and a few concrete examples for disambiguation. The output is a well-typed
program in Boomerang (a bidirectional language based on the
theory of lenses).
The main technical challenge involves navigating the vast program search
space efficiently. In particular, and unlike most prior work on type-directed
synthesis, our system operates in the context of a language with a rich equivalence
relation on types (the theory of regular expressions). Consequently, program
synthesis requires search in two dimensions: First, our synthesis algorithm must
find a pair of “syntactically compatible types,” and second, using the structure
of those types, it must find a type- and example-compliant term. Our key insight is
that it is possible to
reduce the size of this search space without losing any computational power
by defining a new language of lenses designed
specifically for synthesis. The new language is free from arbitrary function
composition and operates only over types and terms in a new disjunctive normal form.
We prove (1) our new language
is just as powerful as a more natural, compositional, and declarative
language and (2) our synthesis algorithm is sound and complete with respect to the
new language. We also demonstrate empirically
that our new language changes the synthesis problem from
one that admits intractable solutions to one that admits
highly efficient solutions, able to synthesize intricate lenses
between complex file formats in seconds.
We evaluate Optician on a benchmark suite of 39 examples
that includes both microbenchmarks and realistic examples derived from
other data management systems including Flash Fill, a tool
for synthesizing string transformations in spreadsheets, and Augeas, a tool
for bidirectional processing of Linux system configuration files.
|
[13]
|
Solomon Maina, Anders Miltner, Kathleen Fisher, Benjamin C. Pierce, David
Walker, and Steve Zdancewic.
Synthesizing Quotient Lenses.
Proceedings of the ACM on Programming Languages, 2(ICFP), 2018.
[ PDF ]
Quotient lenses are bidirectional transformations whose correctness
laws are “loosened” by specified equivalence relations, allowing
inessential details in concrete data formats to be suppressed.
For example, a programmer could use a quotient lens to define
a transformation that ignores the order of fields in XML data, so
that two XML files with the same fields but in different orders would be
considered the same, allowing a single, simple program to handle them both.
Building on a recently published algorithm for synthesizing plain bijective
lenses from high-level specifications, we show how to synthesize bijective
quotient lenses in three steps. First, we introduce quotient regular
expressions (QREs), annotated regular expressions that conveniently mark
inessential aspects of string data formats; each QRE specifies,
simultaneously, a regular language and an equivalence relation on it.
Second, we introduce QRE lenses, i.e., lenses mapping between QREs.
Our key technical result is a proof that every QRE lens can be transformed
into a functionally equivalent lens that canonizes source and target data just
at the “edges” and that uses a bijective lens to map between the respective
canonical elements; no internal canonization occurs in a lens in this normal
form. Third, we leverage this normalization theorem to synthesize QRE
lenses from a pair of QREs and example input-output pairs, reusing earlier work
on synthesizing plain bijective lenses. We have implemented QREs and QRE lens
synthesis as an extension to the bidirectional programming language Boomerang.
We evaluate the effectiveness of our approach by synthesizing QRE lenses
between various real-world data formats in the Optician benchmark suite.
|
[14]
|
Andrew W. Appel, Lennart Beringer, Adam Chlipala, Benjamin C. Pierce, Zhong
Shao, Stephanie Weirich, and Steve Zdancewic.
Position paper: The Science of Deep Specification.
Philosophical Transactions of the Royal Society of London A:
Mathematical, Physical and Engineering Sciences, 375(2104), 2017.
[ DOI
http ]
We introduce our efforts within the project "The science of deep specification" to work out the key formal underpinnings of industrial-scale formal specifications of software and hardware components, anticipating a world where large verified systems are routinely built out of smaller verified components that are also used by many other projects. We identify an important class of specification that has already been used in a few experiments that connect strong component-correctness theorems across the work of different teams. To help popularize the unique advantages of that style, we dub it deep specification, and we say that it encompasses specifications that are rich, two-sided, formal and live (terms that we define in the article). Our core team is developing a proof-of-concept system (based on the Coq proof assistant) whose specification and verification work is divided across largely decoupled subteams at our four institutions, encompassing hardware microarchitecture, compilers, operating systems and applications, along with cross-cutting principles and tools for effective specification. We also aim to catalyse interest in the approach, not just by basic researchers but also by users in industry. This article is part of the themed issue "Verified trustworthy software systems".
|
[15]
|
B. Valiron and S. Zdancewic.
Modeling Simply-Typed Lambda Calculi in the Category of Finite
Vector Spaces.
Scientific Annals of Computer Science, 24(2):325--368, 2014.
[ DOI
PDF ]
In this paper we use finite vector spaces (finite
dimension, over finite fields) as a non-standard
computational model of linear logic. We first define
a simple, finite PCF-like lambda-calculus with
booleans, and then we discuss two finite models, one
based on finite sets and the other on finite vector
spaces. The first model is shown to be fully
complete with respect to the operational semantics
of the language, while the second model is not. We
then develop an algebraic extension of the finite
lambda calculus and study two operational semantics:
a call-by-name and a call-by-value. These
operational semantics are matched with their
corresponding natural denotational semantics based
on finite vector spaces. The relationship between
the various semantics is analyzed, and several
examples based on Church numerals are presented.
|
[16]
|
Peng Li and Steve Zdancewic.
Arrows for Secure Information Flow.
Theoretical Computer Science, 411(19):1974--1994, 2010.
[ PDF ]
This paper presents an embedded security sublanguage for enforcing
information-flow policies in the standard Haskell programming language. The
sublanguage provides useful information-flow control mechanisms including
dynamic security lattices, run-time code privileges and declassification, all
without modifying the base language. This design avoids the redundant work of
producing new languages, lowers the threshold for adopting security-typed
languages, and also provides great flexibility and modularity for using
security-policy frameworks.
The embedded security sublanguage is designed using a standard combinator
interface called arrows. Computations constructed in the sublanguage have
static and explicit control-flow components, making it possible to implement
information-flow control using static-analysis techniques at run time, while
providing strong security guarantees. This paper presents a formal proof that
our embedded sublanguage provides noninterference, a concrete Haskell
implementation, and an example application demonstrating the proposed
techniques.
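The combinator-interface style can be illustrated with a toy, dynamically tainting arrow (a hedged sketch only: the paper's sublanguage instead analyzes the arrow structure statically and supports richer lattices and declassification; Label, FlowA, secret, and publicSink are all invented here):

    import Prelude hiding (id, (.))
    import Control.Category (Category (..))
    import Control.Arrow (Arrow (..))

    -- A toy "tainting" arrow: values carry a security label, and labels only
    -- move up (join) as data flows through a computation.
    data Label = Low | High deriving (Eq, Ord, Show)

    newtype FlowA b c = FlowA { runFlow :: (b, Label) -> (c, Label) }

    instance Category FlowA where
      id = FlowA (\x -> x)
      FlowA g . FlowA f = FlowA (\x -> g (f x))

    instance Arrow FlowA where
      arr f = FlowA (\(b, l) -> (f b, l))
      first (FlowA f) = FlowA (\((b, d), l) ->
        let (c, l') = f (b, l) in ((c, d), max l l'))

    -- A source whose output is always secret, and a sink that accepts only
    -- public data.  Here a bad flow fails dynamically; the point of the
    -- paper's design is to rule such flows out before the computation runs.
    secret :: b -> FlowA () b
    secret x = FlowA (\(_, l) -> (x, max l High))

    publicSink :: Show b => FlowA b String
    publicSink = FlowA (\(b, l) ->
      if l <= Low then (show b, l) else (error "information-flow violation", l))

    -- runFlow (secret 42 >>> publicSink) ((), Low)  -- would raise the violation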
|
[17]
|
Stephen Tse and Steve Zdancewic.
Run-time principals in information-flow type systems.
Transactions on Programming Languages and Systems, 30(1):6,
2008.
[ PDF ]
Information-flow type systems are a promising approach for enforcing
strong end-to-end confidentiality and integrity policies. Such
policies, however, are usually specified in terms of static
information---data is labeled _high_ or _low_ security at
compile time. In practice, the confidentiality of data may depend
on information available only while the system is running.
This paper studies language support for _run-time principals_,
a mechanism for specifying security policies that depend on which
principals interact with the system. We establish the basic
property of noninterference for programs written in such a language,
and use run-time principals for specifying run-time authority in
downgrading mechanisms such as declassification.
In addition to allowing more expressive security policies, run-time
principals enable the integration of language-based security
mechanisms with other existing approaches such as Java stack
inspection and public key infrastructures. We sketch an
implementation of run-time principals via public keys such that
principal delegation is verified by certificate chains.
|
[18]
|
Andrew C. Myers, Andrei Sabelfeld, and Steve Zdancewic.
Enforcing Robust Declassification and Qualified Robustness.
Journal of Computer Security, 14(2):157--196, 2006.
[ PDF ]
Noninterference requires that there
is no information flow from sensitive to public data in a given
system. However, many systems
release sensitive information as part of their
intended function and therefore violate noninterference.
To control information flow while permitting
information release, some systems have a downgrading or
declassification mechanism, but this creates the danger
of unintentional information release.
This paper shows that a robustness property
can be used to characterize programs in which declassification mechanisms
cannot be controlled by attackers to release more information than
intended. It describes a simple way to provably enforce this robustness
property through a type-based compile-time program analysis. The paper also
presents a generalization of robustness that
supports upgrading (endorsing) data integrity.
|
[19]
|
Jay Ligatti, David Walker, and Steve Zdancewic.
A Type-theoretic Interpretation of Pointcuts and Advice.
Science of Computer Programming: Special Issue on Foundations of
Aspect-Oriented Programming, pages 240--266, 2006.
[ PDF ]
This paper defines the semantics of MinAML, an idealized
aspect-oriented programming language, by giving a type-directed
translation from a user-friendly external language to a compact,
well-defined core language. We argue that our framework is an
effective way to give semantics to aspect-oriented programming
languages in general because the translation eliminates shallow
syntactic differences between related constructs and permits
definition of a simple and elegant core language.
The core language extends the simply-typed lambda calculus with two
central new abstractions: explicitly labeled program points and
first-class advice. The labels serve both to trigger advice and to
mark continuations that the advice may return to. These constructs
are defined orthogonally to the other features of the language and
we show that our abstractions can be used in both functional and
object-oriented contexts. We prove Preservation and Progress lemmas
for our core language and show that the translation from MinAML source
into core is type-preserving. We also consider several extensions to
our basic framework including a general mechanism for analyzing the
current call stack.
|
[20]
|
Steve Zdancewic, Lantian Zheng, Nathaniel Nystrom, and Andrew C. Myers.
Secure Program Partitioning.
Transactions on Computer Systems, 20(3):283--328, 2002.
[ PDF
PS ]
This paper presents secure program partitioning, a language-based
technique for protecting confidential data during computation in
distributed systems containing mutually untrusted hosts. Confidentiality
and integrity policies can be expressed by annotating programs with
security types that constrain information flow; these programs can then
be partitioned automatically to run securely on heterogeneously trusted
hosts. The resulting communicating subprograms collectively implement
the original program, yet the system as a whole satisfies the security
requirements of participating principals without requiring a universally
trusted host machine. The experience in applying this methodology and
the performance of the resulting distributed code suggest that this is a
promising way to obtain secure distributed computation.
This article is an expanded version of the published paper “Untrusted
Hosts and Confidentiality: Secure Program Partitioning”.
The main difference between the two is Appendix A, which
contains a correctness proof for the control-transfer protocols described
in Section 5.
|
[21]
|
Steve Zdancewic and Andrew C. Myers.
Secure Information Flow via Linear Continuations.
Higher-Order and Symbolic Computation, 15(2/3):209--234, 2002.
[ PDF
PS ]
Security-typed languages enforce secrecy or integrity policies by
type-checking. This paper investigates continuation-passing style (CPS) as a
means of proving that such languages enforce noninterference and as a first
step towards understanding their compilation. We present a low-level, secure
calculus with higher-order, imperative features and linear
continuations.
Linear continuations impose a stack discipline on the control flow of
programs. This additional structure in the type system lets us establish a
strong information-flow security property called noninterference. We prove
that our CPS target language enjoys the noninterference property and we show
how to translate secure high-level programs to this low-level
language. This noninterference proof is the first of its kind for a
language with higher-order functions and state.
|
[22]
|
Dan Grossman, Greg Morrisett, and Steve Zdancewic.
Syntactic Type Abstraction.
Transactions on Programming Languages and Systems,
22(6):1037--1080, November 2000.
[ PDF
PS ]
Software developers often structure programs in such a
way that different pieces of code constitute distinct
principals. Types help define the protocol by which these principals
interact. In particular, abstract types allow a principal to make
strong assumptions about how well-typed clients use the facilities
that it provides. We show how the notions of principals and type
abstraction can be formalized within a language. Different principals
can know the implementation of different abstract types. We use
additional syntax to track the flow of values with abstract types
during the evaluation of a program and demonstrate how this framework
supports syntactic proofs (in the style of subject reduction) for
type-abstraction properties. Such properties have traditionally
required semantic arguments; using syntax avoids the need to build a
model for the language. We present various typed lambda calculi with
principals, including versions that have mutable state and recursive types.
|
|
Conference and Selective Workshop Papers
[1]
|
Lucas Silver, Paul He, Ethan Cecchetti, Andrew K. Hirsch, and Steve Zdancewic.
Semantics for Noninterference with Interaction Trees.
In Proceedings of the 37th Annual European Conference on
Object-Oriented Programming (ECOOP 2023), 2023.
[ PDF ]
Noninterference is the strong information-security property that a program does not leak secrets through publicly-visible behavior. In the presence of effects such as nontermination, state, and exceptions, reasoning about noninterference quickly becomes subtle. We advocate using interaction trees (ITrees) to provide compositional mechanized proofs of noninterference for multi-language, effectful, nonterminating programs, while retaining executability of the semantics. We develop important foundations for security analysis with ITrees: two indistinguishability relations, leading to two standard notions of noninterference with adversaries of different strength, along with metatheory libraries for reasoning about each. We demonstrate the utility of our results using a simple imperative language with embedded assembly, along with a compiler into that assembly language.
|
[2]
|
Stephen Mell, Favyen Bastani, Steve Zdancewic, and Osbert Bastani.
Synthesizing Trajectory Queries from Examples.
In Computer Aided Verification - 35th International Conference,
CAV, 2023.
[ PDF ]
Data scientists often need to write programs to process predictions of
machine learning models, such as object detections and trajectories in video
data. However, writing such queries can be challenging due to the fuzzy
nature of real-world data; in particular, they often include real-valued
parameters that must be tuned by hand. We propose a novel framework called
Quivr that synthesizes trajectory queries matching a given set of examples.
To efficiently synthesize parameters, we introduce a novel technique for
pruning the parameter space and a novel quantitative semantics that makes
this more efficient. We evaluate Quivr on a benchmark of 17 tasks, including
several from prior work, and show both that it can synthesize accurate
queries for each task and that our optimizations substantially reduce
synthesis time.
|
[3]
|
Lawrence Dunn, Val Tannen, and Steve Zdancewic.
Syntax Monads for the Working Formal Metatheorist.
In Proceedings of the 6th International Conference on Applied
Category Theory (ACT), 2023.
[ PDF ]
Formally verifying the properties of formal systems using a proof assistant requires justifying numerous minor lemmas about capture-avoiding substitution. Despite work on category-theoretic accounts
of syntax and variable binding, raw, first-order representations of syntax, the kind considered by
many practitioners and compiler frontends, have received relatively little attention. Therefore, applications miss out on the benefits of category theory, such as the deeply attractive promise of reusing
formalized infrastructural lemmas between implementations of different systems. Our Coq framework Tealeaves provides libraries of reusable infrastructure for first-order representations of variable
binding, such as de Bruijn indices and locally nameless. In this paper we give a string-diagrammatic
account of decorated traversable monads (DTMs), the key abstraction implemented by Tealeaves.
We define DTMs as monoids of structured endofunctors before proving a representation theorem à
la Kleisli.
|
[4]
|
Lawrence Dunn, Val Tannen, and Steve Zdancewic.
Tealeaves: Structured Monads for Generic First-Order Abstract
Syntax Infrastructure.
In 14th International Conference on Interactive Theorem Proving
(ITP), 2023.
[ PDF ]
Verifying the metatheory of a formal system in Coq involves a lot of tedious "infrastructural" reasoning about variable binders. We present Tealeaves, a generic framework for first-order representations of variable binding that can be used to develop this sort of infrastructure once and for all. Given a particular strategy for representing binders concretely, such as locally nameless or de Bruijn indices, Tealeaves allows developers to implement modules of generic infrastructure called backends that end users can simply instantiate to their own syntax. Our framework rests on a novel abstraction of first-order abstract syntax called a decorated traversable monad (DTM) whose equational theory provides reasoning principles that replace tedious induction on terms. To evaluate Tealeaves, we have implemented a multisorted locally nameless backend providing generic versions of the lemmas generated by LNgen. We discuss case studies where we instantiate this generic infrastructure to simply-typed and polymorphic lambda calculi, comparing our approach to other utilities.
|
[5]
|
George Tolkachev, Stephen Mell, Steve Zdancewic, and Osbert Bastani.
Counterfactual Explanations for Natural Language Interfaces.
In 60th Annual Meeting of the Association for Computational
Linguistics (ACL), 2022.
(short paper).
[ PDF ]
A key challenge facing natural language interfaces is enabling users to understand the capabilities of the underlying system. We propose a novel approach for generating explanations of a natural language interface based on semantic parsing. We focus on counterfactual explanations, which are post-hoc explanations that describe to the user how they could have minimally modified their utterance to achieve their desired goal. In particular, the user provides an utterance along with a demonstration of their desired goal; then, our algorithm synthesizes a paraphrase of their utterance that is guaranteed to achieve their goal. In two user studies, we demonstrate that our approach substantially improves user performance, and that it generates explanations that more closely match the user's intent compared to two ablations.
|
[6]
|
Stephen Mell, Osbert Bastani, and Steve Zdancewic.
Ideograph: A Language for Expressing and Manipulating Structured
Data.
In Proceedings Twelfth International Workshop on Computing with
Terms and Graphs (TERMGRAPH 2022), pages 65--84. Electronic Proceedings in
Theoretical Computer Science, 2022.
[ PDF ]
We introduce Ideograph, a language for expressing and manipulating structured data. Its types describe kinds of structures, such as natural numbers, lists, multisets, binary trees, syntax trees with variable binding, directed multigraphs, and relational databases. Fully normalized terms of a type correspond exactly to members of the structure, analogous to a Church-encoding. Moreover, definable operations over these structures are guaranteed to respect the structures' equivalences. In this paper, we give the syntax and semantics of the non-polymorphic subset of Ideograph, and we demonstrate how it can represent and manipulate several interesting structures.
|
[7]
|
Hengchu Zhang, Wolf Honoré, Nicolas Koh, Yao Li, Yishuai Li, Li-Yao Xia,
Lennart Beringer, William Mansky, Benjamin Pierce, and Steve Zdancewic.
Verifying an HTTP Key-Value Server with Interaction Trees and
VST.
In Liron Cohen and Cezary Kaliszyk, editors, 12th International
Conference on Interactive Theorem Proving (ITP 2021), volume 193 of
Leibniz International Proceedings in Informatics (LIPIcs), pages
32:1--32:19, Dagstuhl, Germany, 2021. Schloss Dagstuhl -- Leibniz-Zentrum
für Informatik.
[ DOI
PDF
http ]
We present a networked key-value server, implemented in C and formally verified in Coq. The server
interacts with clients using a subset of the HTTP/1.1 protocol and is specified and verified using
interaction trees and the Verified Software Toolchain. The codebase includes a reusable and fully
verified C string library that provides 17 standard POSIX string functions and 17 general purpose
non-POSIX string functions. For the KVServer socket system calls, we establish a refinement relation
between specifications at user-space level and at CertiKOS kernel-space level.
|
[8]
|
Yishuai Li, Benjamin Pierce, and Steve Zdancewic.
Model-Based Testing of Networked Applications.
In The ACM SIGSOFT International Symposium on Software Testing
and Analysis (ISSTA), 2021.
We present a principled automatic testing framework for application-layer protocols. The key innovation is a domain-specific embedded language for writing nondeterministic models of the behavior of networked servers. These models are defined within the Coq interactive theorem prover, supporting a smooth transition from testing to formal verification.
Given a server model, we show how to automatically derive a tester that probes the server for unexpected behaviors. We address the uncertainties caused by both the server's internal choices and the network delaying messages nondeterministically. The derived tester accepts server implementations whose possible behaviors are a subset of those allowed by the nondeterministic model.
We demonstrate the effectiveness of this framework by using it to specify and test a fragment of the HTTP/1.1 protocol, showing that the automatically derived tester can capture RFC violations in buggy server implementations, including the latest versions of Apache and Nginx.
|
[9]
|
Yannick Zakowski, Paul He, Chung-Kil Hur, and Steve Zdancewic.
An Equational Theory for Weak Bisimulation via Generalized
Parameterized Coinduction.
In Proceedings of the 9th ACM SIGPLAN International Conference
on Certified Programs and Proofs (CPP), January 2020.
[ PDF ]
Coinductive reasoning about infinitary structures such as streams is widely applicable. However, practical frameworks for developing coinductive proofs and finding reasoning principles that help structure such proofs remain a challenge, especially in the context of machine-checked formalization.
This paper gives a novel presentation of an equational theory for reasoning about structures up to weak bisimula- tion. The theory is both compositional, making it suitable for defining general-purpose lemmas, and also incremental, meaning that the bisimulation can be created interactively. To prove the theory’s soundness, this paper also introduces generalized parameterized coinduction, which addresses ex- pressivity problems of earlier works and provides a practical framework for coinductive reasoning. The paper presents the resulting equational theory for streams, but the technique applies to other structures too.
All of the results in this paper have been proved in Coq, and the generalized parameterized coinduction framework is available as a Coq library.
|
[10]
|
Nicolas Koh, Yao Li, Yishuai Li, Li-Yao Xia, Lennart Beringer, Wolf Honoré,
William Mansky, Benjamin C. Pierce, and Steve Zdancewic.
From C to Interaction Trees: Specifying, Verifying, and Testing
a Networked Server.
In Proceedings of the 8th ACM SIGPLAN International Conference
on Certified Programs and Proofs (CPP), January 2019.
[ PDF ]
We present the first formal verification of a networked server implemented
in C. Interaction trees, a general structure for representing
reactive computations, are used to tie together disparate verification and
testing tools (Coq, VST, and QuickChick) and to axiomatize the behavior of
the operating system on which the server runs (CertiKOS). The main theorem
connects a specification of acceptable server behaviors, written in a
straightforward “one client at a time” style, with the
CompCert semantics of the C program. The variability
introduced by low-level buffering of messages and interleaving of multiple
TCP connections is captured using network refinement, a variant of
observational refinement.
|
[11]
|
Marcella Hastings, Brett Hemenway, Daniel Noble, and Steve Zdancewic.
SoK: General Purpose Compilers for Secure Multi-Party
Computation.
In IEEE 2019 Symposium on Security and Privacy (Oakland),
2019.
[ PDF ]
Secure multi-party computation (MPC) allows a group of mutually distrustful parties to compute a joint function on their inputs
without revealing any information beyond the result of the computation.
This type of computation is extremely powerful and has wide-ranging applications in academia, industry, and government.
Protocols for secure computation have existed for decades, but only recently have general-purpose compilers for executing MPC on arbitrary functions been developed.
These projects rapidly improved the state of the art, and began to make MPC accessible to non-expert users.
However, the field is changing so rapidly that it is difficult even for experts to keep track of the varied capabilities of modern frameworks.
In this work, we survey general-purpose compilers for secure multi-party computation.
These tools provide high-level abstractions to describe arbitrary functions and execute secure computation protocols.
We consider eleven systems: EMP-toolkit, Obliv-C, ObliVM, TinyGarble, SCALE-MAMBA (formerly SPDZ), Wysteria, Sharemind, PICCO, ABY,
Frigate, and CBMC-GC.
We evaluate these systems on a range of criteria, including language expressibility, capabilities of the cryptographic back-end, and accessibility to developers.
We advocate for improved documentation of MPC frameworks and standardization within the community, and we make recommendations for future directions in compiler development.
Installing and running these systems can be challenging, and for each system, we also provide a complete virtual environment (Docker container) with all the necessary dependencies
to run the compiler and our example programs.
|
[12]
|
Jennifer Paykin and Steve Zdancewic.
A HoTT Quantum Equational Theory.
In The 16th International Conference on Quantum Physics and
Logic (QPL), 2019.
extended version available on arXiv.
[ http ]
This paper presents an equational theory for the QRAM model of quantum computation, formulated as an embedded language inside of homotopy type theory. The embedded language approach is highly expressive, and reflects the style of state-of-the art quantum languages like Quipper and QWIRE. The embedding takes advantage of features of homotopy type theory to encode unitary transformations as higher inductive paths, simplifying the presentation of an equational theory. We prove that this equational theory is sound and complete with respect to established models of quantum computation.
|
[13]
|
Christine Rizkallah, Dmitri Garbuzov, and Steve Zdancewic.
A Formal Equational Theory for Call-By-Push-Value.
In 9th International Conference on Interactive Theorem Proving
(ITP), 2018.
[ PDF ]
Establishing that two programs are contextually equivalent is hard, yet
essential for reasoning about
semantics preserving program
transformations such as compiler optimizations.
We adapt Lassen's normal form bisimulations technique to establish the soundness of equational
theories for both an untyped
call-by-value λ-calculus and a variant of Levy's call-by-push-value
language. We demonstrate that our equational theory significantly simplifies the verification of optimizations.
|
[14]
|
Robert Rand, Jennifer Paykin, Dong-Ho Lee, and Steve Zdancewic.
REQUIRE: Reasoning about Reversible Quantum Circuits.
In The 15th International Conference on Quantum Physics and
Logic (QPL), 2018.
[ PDF ]
|
[15]
|
Jennifer Paykin, Robert Rand, and Steve Zdancewic.
QWire: A Core Language for Quantum Circuits.
In Proc. of the ACM Symposium on Principles of Programming
Languages (POPL), 2017.
[ PDF ]
This paper introduces QWire (“choir”), a language for defining quantum
circuits and an interface for manipulating them inside of an arbitrary
classical host language. QWire is minimal---it contains only a
few primitives---and sound with respect to the physical properties entailed by
quantum mechanics. At the same time, QWire is expressive and highly modular
due to its relationship with the host language, mirroring the QRAM model
of computation that places a quantum computer (controlled by circuits)
alongside a classical computer (controlled by the host language).
We present QWire along with its type system and operational semantics, which
we prove is safe and strongly normalizing whenever the host language is. We
give circuits a denotational semantics in terms of density matrices. Throughout, we
investigate examples that demonstrate the expressive power of QWire, including
extensions to the host language that (1) expose a general analysis framework
for circuits, and (2) provide dependent types.
|
[16]
|
William Mansky, Yuanfeng Peng, Steve Zdancewic, and Joseph Devietti.
Verifying Dynamic Race Detection.
In The 6th ACM SIGPLAN Conference on Certified Programs and
Proofs (CPP 2017), 2017.
[ PDF ]
Writing race-free concurrent code is notoriously difficult, and data races can result in bugs that are difficult to isolate and reproduce. Dynamic race detection can catch data races that cannot (easily) be detected statically. One approach to dynamic race detection is to instrument the potentially racy code with operations that store and compare metadata, where the metadata implements some known race detection algorithm (e.g. vector-clock race detection). In this paper, we describe the process of formally verifying several algorithms for dynamic race detection. We then turn to implementations, laying out an instrumentation pass for race detection in a simple language and presenting a mechanized formal proof of its correctness: all races in a program will be caught by the instrumentation, and all races detected by the instrumentation are possible in the original program.
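For context, the core happens-before test that vector-clock race detectors implement looks like this (standard background in Haskell, not the paper's instrumented semantics or its mechanized proof; the function names are ours):

    import qualified Data.Map.Strict as M

    -- Vector clocks: one logical timestamp per thread.
    type Thread = Int
    type VC     = M.Map Thread Int

    -- a `happensBefore` b  iff every component of a is <= the matching one in b.
    happensBefore :: VC -> VC -> Bool
    happensBefore a b = and [ t <= M.findWithDefault 0 th b | (th, t) <- M.toList a ]

    -- Two accesses (at least one a write) race when neither is ordered before the other.
    concurrentWith :: VC -> VC -> Bool
    concurrentWith a b = not (happensBefore a b) && not (happensBefore b a)

    -- Advance a thread's own clock, e.g. at each instrumented operation.
    tick :: Thread -> VC -> VC
    tick th = M.insertWith (+) th 1

    -- Joining clocks models synchronization, e.g. acquiring a lock last
    -- released by another thread.
    join :: VC -> VC -> VC
    join = M.unionWith max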
|
[17]
|
Robert Rand, Jennifer Paykin, and Steve Zdancewic.
Qwire Practice: Formal Verification of
Quantum Circuits in Coq.
In The 14th International Conference on Quantum Physics and
Logic (QPL), 2017.
[ PDF ]
We describe an embedding of the QWIRE quantum circuit language in the Coq proof assistant. This
allows programmers to write quantum circuits using high-level abstractions and to prove properties of
those circuits using Coq’s theorem proving features. The implementation uses higher-order abstract
syntax to represent variable binding and provides a type-checking algorithm for linear wire types,
ensuring that quantum circuits are well-formed. We formalize a denotational semantics that interprets
QWIRE circuits as superoperators on density matrices, and prove the correctness of some simple
quantum programs.
|
[18]
|
Jennifer Paykin and Steve Zdancewic.
The Linearity Monad.
In Proceedings of the 10th ACM SIGPLAN International Haskell
Symposium, 2017.
[ PDF ]
We introduce a technique for programming with domain-specific
linear languages using the monad that arises from the theory of
linear/non-linear logic. In this work we interpret the linear/nonlinear
model as a simple, effectful linear language embedded inside
an existing non-linear host language. We implement a modular
framework for defining these linear EDSLs in Haskell, allowing
both shallow and deep embeddings. To demonstrate the effectiveness
of the framework and the linearity monad, we implement
languages for file handles, mutable arrays, session types, and quantum
computing.
|
[19]
|
Jonathan Frankle, Peter-Michael Osera, David Walker, and Steve Zdancewic.
Example-Directed Synthesis: A Type-Theoretic Interpretation.
In Proc. of the ACM Symposium on Principles of Programming
Languages (POPL), 2016.
[ PDF ]
Input-output examples have emerged as a practical and user-friendly
specification mechanism for program synthesis in many environments.
While example-driven tools have demonstrated tangible impact that has
inspired adoption in industry, their underlying semantics are less well-understood:
what are “examples” and how do they
relate to other kinds of specifications? This paper
demonstrates that examples can, in general, be interpreted
as refinement types. Seen in this light, program
synthesis is the task of finding an inhabitant of
such a type. This insight provides an immediate
semantic interpretation for examples. Moreover,
it enables us to exploit decades of research in type theory as
well as its correspondence with intuitionistic logic rather
than designing ad hoc theoretical frameworks for synthesis from scratch.
We put this observation into practice by formalizing synthesis
as proof search in a sequent calculus with
intersection and union refinements that we prove
to be sound with respect to a conventional type system.
In addition, we show how to handle negative examples,
which arise from user feedback or counterexample-guided loops.
This theory serves as the basis for a prototype
implementation that extends our core language to
support ML-style algebraic data types and structurally
inductive functions. Users can also specify
synthesis goals using polymorphic refinements and
import monomorphic libraries.
The prototype serves as a vehicle
for empirically evaluating a number of different
strategies for resolving the nondeterminism of the sequent
calculus---bottom-up theorem-proving,
term enumeration with refinement type checking, and
combinations of both---the results of which classify, explain, and
validate the design choices of existing synthesis systems.
It also provides a platform for measuring the practical
value of a specification language that combines
“examples” with the more general expressiveness of refinements.
|
[20]
|
Jennifer Paykin and Steve Zdancewic.
Linear λμ is CP (more or less).
In A List of Successes to Change the World (Wadlerfest), 2016.
[ PDF ]
In this paper we compare Wadler's CP calculus for classical
linear processes to a linear version of Parigot's λμ calculus
for classical logic. We conclude that linear λμ is “more or less” CP, in that it equationally corresponds to a polarized
version of CP. The comparison is made by extending a technique
from Melliès and Tabareau's tensor logic that correlates negation
with polarization. The polarized CP, which is written CP+-
and pronounced “CP more or less,” is an interesting bridge in
the landscape of Curry-Howard interpretations of logic.
|
[21]
|
Peter-Michael Osera and Steve Zdancewic.
Type-and-Example-Directed Program Synthesis.
In Proc. 2015 ACM SIGPLAN Conference on Programming Language
Design and Implementation (PLDI), 2015.
[ PDF ]
This paper presents an algorithm for synthesizing recursive
functions that process algebraic datatypes. It is founded on
proof-theoretic techniques that exploit both type information and
input--output examples to prune the search space. The algorithm uses
refinement trees, a data structure that succinctly
represents constraints on the shape of generated code. We evaluate
the algorithm by using a prototype implementation to synthesize more
than 40 benchmarks and several non-trivial larger examples. Our
results demonstrate that the approach meets or outperforms the
state-of-the-art for this domain, in terms of synthesis time or
attainable size of the generated programs.
|
[22]
|
Jeehoon Kang, Chung-Kil Hur, William Mansky, Dmitri Garbuzov, Steve Zdancewic,
and Viktor Vafeiadis.
A Formal C Memory Model Supporting Integer-Pointer Casts.
In Proc. 2015 ACM SIGPLAN Conference on Programming Language
Design and Implementation (PLDI), 2015.
[ PDF ]
The ISO C standard does not specify the semantics of many valid programs that
use non-portable idioms such as integer-pointer casts. Recent efforts at
formal definitions and verified implementation of the C language inherit this
feature. By adopting high-level abstract memory models, they validate common
optimizations. On the other hand, this prevents reasoning about much low-level
code relying on the behavior of common implementations, where formal
verification has many applications.
We present the first formal memory model that allows many common optimizations
and fully supports operations on the representation of pointers. All
arithmetic operations are well-defined for pointers that have been cast to
integers. Crucially, our model is also simple to understand and program with. All our results are fully formalized in Coq.
|
[23]
|
William Mansky, Dmitri Garbuzov, and Steve Zdancewic.
An Axiomatic Specification for Sequential Memory Models.
In Computer Aided Verification - 27th International Conference,
CAV 2015, 2015.
[ PDF ]
Formalizations of concurrent memory models often represent memory behavior in terms of sequences of operations, where operations are either reads, writes, or synchronizations. More concrete models of (sequential) memory behavior may include allocation and free operations, but also include details of memory layout or data representation. We present an abstract specification for sequential memory models with allocation and free operations, in the form of a set of axioms that provide enough information to reason about memory without overly constraining the behavior of implementations. We characterize a set of “well-behaved” programs that behave uniformly on all instances of the specification. We show that the specification is both feasible---the CompCert memory model implements it---and usable---we can use the axioms to prove the correctness of an optimization that changes the memory behavior of programs in an LLVM-like language.
|
[24]
|
Santosh Nagarakatte, Milo M. K. Martin, and Steve Zdancewic.
Everything You Want to Know About Pointer-Based Checking.
In 1st Summit on Advances in Programming Languages, SNAPL
2015, May 3-6, 2015, Asilomar, California, USA, pages 190--208, 2015.
[ DOI
PDF
http ]
Lack of memory safety in C/C++ has resulted in numerous security vulnerabilities and serious bugs in large software systems. This paper highlights the challenges in enforcing memory safety for C/C++ programs and progress made as part of the SoftBoundCETS project. We have been exploring memory safety enforcement at various levels - in hardware, in the compiler, and as a hardware-compiler hybrid - in this project. Our research has identified that maintaining metadata with pointers in a disjoint metadata space and performing bounds and use-after-free checking can provide comprehensive memory safety. We describe the rationale behind the design decisions and its ramifications on various dimensions, our experience with the various variants that we explored in this project, and the lessons learned in the process. We also describe and analyze the forthcoming Intel Memory Protection Extensions (MPX) that provides hardware acceleration for disjoint metadata and pointer checking in mainstream hardware, which is expected to be available later this year.
|
[25]
|
Robert Rand and Steve Zdancewic.
VPHL: A Verified Partial-Correctness Logic for Probabilistic
Programs.
In Mathematical Foundations of Program Semantics (MFPS), 2015.
[ PDF ]
We introduce a Hoare-style logic for probabilistic
programs, called VPHL, that has been formally
verified in the Coq proof assistant. VPHL features
propositional, rather than additive, assertions and
a simple set of rules for reasoning about these
assertions using the standard axioms of probability
theory. VPHL's assertions are partial
correctness assertions, meaning that their
conclusions are dependent upon (deterministic)
program termination. The underlying simple
probabilistic imperative language, PRIMP, includes
a probabilistic toss operator, probabilistic guards
and potentially-non-terminating while loops.
|
[26]
|
Santosh Nagarakatte, Milo M. K. Martin, and Steve Zdancewic.
WatchdogLite: Hardware-Accelerated Compiler-Based Pointer
Checking.
In Proceedings of Annual IEEE/ACM International Symposium on
Code Generation and Optimization, CGO '14, pages 175:175--175:184. ACM,
2014.
[ DOI
PDF
http ]
Lack of memory safety in C is the root cause of a multitude of serious bugs and security vulnerabilities. Numerous software-only and hardware-based schemes have been proposed to enforce memory safety. Among these approaches, pointer-based checking, which maintains per-pointer metadata in a disjoint metadata space, has been recognized as providing comprehensive memory safety. Software approaches for pointer-based checking have high performance overheads. In contrast, hardware approaches introduce a myriad of hardware structures and widgets to mitigate those performance overheads.
This paper proposes WatchdogLite, an ISA extension that provides hardware acceleration for a compiler implementation of pointer-based checking. This division of labor between the compiler and the hardware allows for hardware acceleration while using only preexisting architectural registers. By leveraging the compiler to identify pointers, perform check elimination, and insert the new instructions, this approach attains performance similar to prior hardware-intensive approaches without adding any hardware structures for tracking metadata.
|
[27]
|
Aloïs Brunel, Marco Gaboardi, Damiano Mazza, and Steve Zdancewic.
A Core Quantitative Coeffect Calculus.
In Proc. of the 23rd European Symposium on Programming (ESOP),
volume 8410, pages 351--370, 2014.
[ PDF ]
Linear logic is well known for its resource-awareness, which has
inspired the design of several resource management mechanisms in
programming language design. Its resource-awareness arises
from the distinction between linear, single-use data and
non-linear, reusable data. The latter is marked by the so-called
exponential modality, which, from the categorical viewpoint, is a
(monoidal) comonad.
Monadic notions of computation are well-established
mechanisms used to express effects in pure functional
languages. Less well-established is the notion of comonadic
computation. However, recent works have shown the usefulness of
comonads to structure context dependent computations.
In this work, we present a language lrPCF inspired by a
generalized interpretation of the exponential modality. In lrPCF the
exponential modality carries a label---an element of a
semiring R---that provides
additional information on how a program uses its context.
This additional structure is used
to express comonadic type analysis.
|
[28]
|
Benoît Valiron and Steve Zdancewic.
Finite vector spaces as model of simply-typed lambda-calculi.
In Proceedings of the 11th International Colloquium on
Theoretical Aspects of Computing (ICTAC 14), 2014.
[ PDF ]
In this paper we use finite vector spaces (finite dimension, over
finite fields) as a non-standard computational model of linear
logic. We first define a simple, finite PCF-like lambda-calculus with
booleans, and then we discuss two finite models, one based on finite
sets and the other on finite vector spaces. The first model is shown
to be fully complete with respect to the operational semantics of the
language. The second model is not complete, but we develop an
algebraic extension of the finite lambda calculus that recovers
completeness. The relationship between the two semantics is described,
and several examples based on Church numerals are presented.
|
[29]
|
Santosh Nagarakatte, Milo M K Martin, and Steve Zdancewic.
Hardware-enforced Comprehensive Memory Safety.
IEEE MICRO's "Top Picks of Architecture Conferences of 2012"
Issue (Micro Top Picks'2013), May/June 2013.
[ PDF ]
The lack of memory safety in languages such as C and C++ is a root
source of exploitable security vulnerabilities. This
article presents Watchdog, a hardware approach that
eliminates such vulnerabilities by enforcing
comprehensive memory safety. Inspired by prior
software-only mechanisms, Watchdog maintains bounds
and identifier metadata with pointers, propagates
them on pointer operations, and checks them on
pointer dereferences. Checking this bounds and
identifier metadata provides both precise,
byte-granularity buffer-overflow protection and
protection from use-after-free errors, even in the
presence of reallocations. Watchdog stores pointer
metadata in a disjoint shadow space to provide
comprehensive protection and ensure compatibility
with existing code. To streamline implementation and
reduce runtime overhead, Watchdog uses
micro-operations to implement metadata access and
checking, eliminates metadata copies via a register
renaming scheme, and uses a dedicated identifier
cache to reduce checking overhead.
|
[30]
|
Jianzhou Zhao, Santosh Nagarakatte, Milo M. K. Martin, and Steve Zdancewic.
Formal Verification of SSA-Based Optimizations for LLVM.
In Proc. 2013 ACM SIGPLAN Conference on Programming Languages
Design and Implementation (PLDI), 2013.
[ PDF ]
Modern compilers, such as LLVM and GCC, use a static single assignment
(SSA) intermediate representation (IR) to simplify and enable many advanced
optimizations. However, formally verifying the correctness of SSA-based
optimizations is challenging because SSA properties depend on a function's
entire control-flow graph.
This paper addresses this challenge by developing a proof technique
for proving SSA-based program invariants and compiler optimizations.
We use this technique in the Coq proof assistant to create
mechanized correctness proofs of several “micro” transformations
that form the building blocks for larger SSA optimizations. To demonstrate
the utility of this approach, we formally verify a variant of LLVM's
mem2reg transformation in Vellvm, a Coq-based formal
semantics of the LLVM IR. The extracted implementation generates code with performance
comparable to that of LLVM's unverified implementation.
|
[31]
|
Christian DeLozier, Richard Eisenberg, Santosh Nagarakatte, Peter-Michael
Osera, Milo M. K. Martin, and Steve Zdancewic.
Ironclad C++: A Library-Augmented Type-Safe Subset of C++.
In Proceedings of the 28th Annual ACM SIGPLAN Conference on
Object-Oriented Programming, Systems, Languages, and Applications, (OOPSLA),
2013.
[ PDF ]
The C++ programming language remains widely used, despite inheriting
many unsafe features from C---features that often lead to failures of
type or memory safety that manifest as buffer overflows,
use-after-free vulnerabilities, or abstraction violations. Malicious
attackers can exploit such violations to compromise application and
system security.
This paper introduces Ironclad C++, an approach to bringing the
benefits of type and memory safety to C++. Ironclad C++ is, in
essence, a library-augmented, type-safe subset of C++. All Ironclad
C++ programs are valid C++ programs that can be compiled using
standard, off-the-shelf C++ compilers. However, not all valid C++
programs are valid Ironclad C++ programs: a syntactic source-code
validator statically prevents the use of unsafe C++ features. To
enforce safety properties that are difficult to check statically,
Ironclad C++ applies dynamic checks via templated “smart pointer”
classes.
Using a semi-automatic refactoring tool, we have ported nearly 50K
lines of code to Ironclad C++. These benchmarks incur a performance
overhead of 12% on average, compared to the original unsafe C++
code.
|
[32]
|
Santosh Nagarakatte, Milo M. K. Martin, and Steve Zdancewic.
Watchdog: Hardware for Safe and Secure Manual Memory Management
and Full Memory Safety.
In Proceedings of the 39th International Symposium on Computer
Architecture (ISCA), June 2012.
[ PDF ]
Languages such as C and C++ use unsafe manual memory
management, allowing simple bugs (i.e., accesses to an object after
deallocation) to become the root cause of exploitable security
vulnerabilities. This paper proposes Watchdog, a hardware-based
approach for ensuring safe and secure manual memory management.
Inspired by prior software-only proposals, Watchdog generates a
unique identifier for each memory allocation, associates these
identifiers with pointers, and checks to ensure that the
identifier is still valid on every memory access. This use of
identifiers and checks enables Watchdog to detect errors even in
the presence of reallocations.
Watchdog stores these pointer identifiers in a disjoint shadow
space to provide comprehensive protection and ensure compatibility
with existing code.
To streamline the implementation
and reduce runtime overhead, Watchdog (1) uses micro-ops to
access metadata and perform checks, (2) eliminates metadata
copies among registers via modified register renaming, and (3)
uses a dedicated metadata cache to reduce checking overhead.
Furthermore, this paper extends Watchdog's mechanisms to detect
bounds errors, thereby providing full hardware-enforced memory
safety at low overheads.
|
[33]
|
Jianzhou Zhao, Santosh Nagarakatte, Milo M. K. Martin, and Steve Zdancewic.
Formalizing the LLVM Intermediate Representation for Verified
Program Transformations.
In Proc. of the ACM Symposium on Principles of Programming
Languages (POPL), 2012.
[ PDF ]
This paper presents Vellvm (verified LLVM), a framework
for reasoning about programs expressed in LLVM's intermediate
representation and transformations that operate on it. Vellvm
provides a mechanized formal semantics of LLVM's intermediate
representation, its type system, and properties of its SSA form.
The framework is built using the Coq interactive theorem prover.
It includes multiple operational semantics and proves relations among
them to facilitate different reasoning styles and proof
techniques.
To validate Vellvm's design, we extract an interpreter from the Coq
formal semantics that can execute programs from the LLVM test suite and
thus be compared against LLVM reference implementations.
To demonstrate Vellvm's practicality, we formalize and verify a
previously proposed transformation that hardens C programs against
spatial memory safety violations. Vellvm's tools allow us to extract
a new, verified implementation of the transformation pass that plugs
into the real LLVM infrastructure; its performance is competitive with
the non-verified, ad-hoc original.
|
[34]
|
Jianzhou Zhao and Steve Zdancewic.
Mechanized Verification of Computing Dominators for Formalizing
Compilers.
In The Second International Conference on Certified Programs and
Proofs (CPP), Lecture Notes in Computer Science, pages 27--42, 2012.
[ PDF ]
One prerequisite to the formal verification of modern compilers is to formalize
computing dominators, which enable SSA forms, advanced optimizations, and
analysis. This paper provides an abstract specification of dominance
analysis that is sufficient for formalizing modern compilers; it describes a
certified implementation and instance of the specification that is simple to
design and reason about, and also reasonably efficient. The paper also presents
applications of dominance analysis: an SSA-form type checker, verifying
SSA-based optimizations, and constructing dominator trees. This development is a
part of the Vellvm project. All proofs and implementation have been carried
out in Coq.
|
[35]
|
Stephanie Weirich, Dimitrios Vytiniotis, Simon Peyton Jones, and Steve
Zdancewic.
Generative Type Abstraction and Type-level Computation.
In Proc. of the ACM Symposium on Principles of Programming
Languages (POPL), 2011.
[ PDF ]
Modular languages support generative type abstraction,
ensuring that an abstract type is distinct from its
representation, except inside the implementation where the
two are synonymous. We show that this well-established
feature is in tension with the non-parametric features
of newer type systems, such as indexed type families and GADTs.
In this paper we solve the problem by using kinds to
distinguish between parametric and non-parametric contexts.
The result is directly applicable to Haskell, which is rapidly
developing support for type-level computation, but the same
issues should arise whenever generativity and non-parametric features
are combined.
|
[36]
|
Karl Mazurak and Steve Zdancewic.
Lolliproc: to Concurrency from Classical Linear Logic via
Curry-Howard and Control.
In Proc. of the 15th ACM SIGPLAN International Conference on
Functional Programming (ICFP), 2010.
[ PDF ]
While many type systems
based on the intuitionistic fragment of linear logic have been
proposed, applications in programming languages of the full power of
linear logic---including double-negation elimination---have remained
elusive. Meanwhile, linearity has been used in many type systems for
concurrent programs---e.g., session types---which suggests
applicability to the problems of concurrent programming, but the ways
in which linearity has interacted with concurrency primitives in
lambda calculi have remained somewhat ad-hoc. In this paper we
connect classical linear logic and concurrent functional programming
in the language Lolliproc, which provides simple primitives for
concurrency that have a direct logical interpretation and that combine
to provide the functionality of session types. Lolliproc features a
simple process calculus “under the hood” but hides the machinery of
processes from programmers. We illustrate Lolliproc by example and
prove soundness, strong normalization, and confluence results, which,
among other things, guarantee freedom from deadlocks and race
conditions.
|
[37]
|
Santosh Nagarakatte, Jianzhou Zhao, Milo M. K. Martin, and Steve Zdancewic.
CETS: Compiler-Enforced Temporal Safety for C.
In Proceedings of the ACM International Symposium on Memory
Management (ISMM), 2010.
[ PDF ]
Temporal memory safety errors, such as dangling pointer dereferences and
double frees, are a prevalent source of software bugs in unmanaged
languages such as C. Existing schemes that attempt to retrofit temporal
safety for such languages have high runtime overheads and/or are
incomplete, thereby limiting their effectiveness as debugging aids.
This paper presents CETS, a compile-time transformation for detecting
all violations of temporal safety in C programs. Inspired by existing
approaches, CETS maintains a unique identifier with each object,
associates this metadata with the pointers in a disjoint metadata space
to retain memory layout compatibility, and checks that the object is
still allocated on pointer dereferences. A formal proof shows that this
is sufficient to provide temporal safety even in the presence of
arbitrary casts if the program contains no spatial safety violations.
Our CETS prototype employs both temporal check removal optimizations
and traditional compiler optimizations to achieve a runtime overhead of
just 48% on average. When combined with a
spatial-checking system, the average overall overhead is 116%
for complete memory safety.
|
[38]
|
Jianzhou Zhao, Qi Zhang, and Steve Zdancewic.
Relational Parametricity for Polymorphic Linear Lambda
Calculus.
In Proceedings of the Eighth ASIAN Symposium on Programming
Languages and Systems (APLAS), 2010.
[ PDF ]
This paper presents a novel syntactic logical relation for a polymorphic
linear lambda-calculus that treats all types as linear and introduces the
constructor ! to account for intuitionistic terms, and F°---an
extension of System F that uses kinds to distinguish linear from intuitionistic
types. We define a logical relation for open values under both open linear and
intuitionistic contexts, then extend it for open terms with evaluation and
open relation substitutions. Relations that instantiate type quantifiers are
for open terms and types. We demonstrate the applicability of this logical
relation through its soundness with respect to contextual equivalence, along
with free theorems for linearity that are difficult to achieve by closed
logical relations. When interpreting types on only closed terms, the model
defaults to a closed logical relation that is both sound and complete with
respect to contextual equivalence and is sufficient to reason about
isomorphisms of type encodings. All of our results have been mechanically
verified in Coq.
|
[39]
|
Santosh Nagarakatte, Jianzhou Zhao, Milo M. K. Martin, and Steve Zdancewic.
SoftBound: Highly Compatible and Complete Spatial Memory
Safety for C.
In Proc. 2009 ACM SIGPLAN Conference on Programming Languages
Design and Implementation (PLDI), 2009.
[ PDF ]
The serious bugs and security vulnerabilities facilitated by C/C++'s
lack of bounds checking are well known, yet C and C++ remain in
widespread use. Unfortunately, C's arbitrary pointer arithmetic,
conflation of pointers and arrays, and programmer-visible memory layout
make retrofitting C/C++ with spatial safety guarantees extremely
challenging. Existing approaches suffer from incompleteness, have high
runtime overhead, or require non-trivial changes to the C source code.
Thus far, these deficiencies have prevented widespread adoption of such
techniques.
This paper proposes SoftBound, a compile-time transformation for
enforcing spatial safety of C. Inspired by HardBound, a
previously proposed hardware-assisted approach, SoftBound similarly
records base and bound information for every pointer as disjoint
metadata. This decoupling enables SoftBound to provide
spatial safety without requiring changes to C source code. Unlike
HardBound, SoftBound is a software-only approach and performs
metadata manipulation only when loading or storing pointer values. A
formal proof shows that this is sufficient to provide spatial safety
even in the presence of arbitrary casts. SoftBound's full checking
mode provides complete spatial violation detection with 67%
runtime overhead on average. To further reduce overheads, SoftBound
has a store-only checking mode that successfully detects all the
security vulnerabilities in a test suite at the cost of only
22% runtime overhead on average.
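For intuition only, here is a tiny executable model of the disjoint base-and-bound metadata idea described above, written in Haskell for uniformity with the other illustrative sketches on this page rather than as a compiler transformation; the names Metadata, setBounds, offset, and checkedDeref are invented for the example and are not SoftBound's own interface.

module TinyBounds where

import qualified Data.Map.Strict as M

type Addr = Int

-- Disjoint metadata: each pointer value maps to the (base, bound) of the
-- object it is allowed to access; the program's memory layout is untouched.
type Metadata = M.Map Addr (Addr, Addr)

data BoundsError = NoMetadata Addr | OutOfBounds Addr deriving Show

-- Record base and bound when a pointer is created (e.g., by an allocation).
setBounds :: Addr -> Addr -> Addr -> Metadata -> Metadata
setBounds p base bound = M.insert p (base, bound)

-- Pointer arithmetic propagates the source pointer's metadata unchanged.
offset :: Addr -> Int -> Metadata -> Metadata
offset p n md = maybe md (\bb -> M.insert (p + n) bb md) (M.lookup p md)

-- A checked dereference: succeeds only if p lies within [base, bound).
checkedDeref :: Metadata -> (Addr -> v) -> Addr -> Either BoundsError v
checkedDeref md mem p =
  case M.lookup p md of
    Nothing -> Left (NoMetadata p)
    Just (base, bound)
      | p >= base && p < bound -> Right (mem p)
      | otherwise              -> Left (OutOfBounds p)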
|
[40]
|
J. Nathan Foster, Benjamin C. Pierce, and Steve Zdancewic.
Updatable Security Views.
In Proc. of 22nd IEEE Computer Security Foundations Symposium
(CSF), 2009.
[ PDF ]
Security views are a flexible and effective mechanism for controlling
access to confidential information. Rather than allowing untrusted
users to access source data directly, they are instead provided with a
restricted view, from which all confidential information has been
removed. The program that generates the view effectively embodies a
confidentiality policy for the underlying source data. However, this
approach has a significant drawback: it prevents users from updating
the data in the view.
To address the “view update problem” in general, a number of
bidirectional languages have been proposed. Programs in these
languages---often called lenses---can be run in two directions: read
from left to right, they map sources to views; from right to left,
they map updated views back to updated sources. However, existing
bidirectional languages do not deal adequately with security. In
particular, they do not provide a way to ensure the integrity of
source data as it is manipulated by untrusted users of the view.
We propose a novel framework of secure lenses that addresses these
shortcomings. We enrich the types of basic lenses with equivalence
relations capturing notions of confidentiality and integrity, and
formulate the essential security conditions as non-interference
properties. We then instantiate this framework in the domain of string
transformations, developing syntax for bidirectional string
combinators with security-annotated regular expressions as their
types.
|
[41]
|
Aaron Bohannon, Benjamin C. Pierce, Vilhelm Sjöberg, Stephanie Weirich, and
Steve Zdancewic.
Reactive Noninterference.
In ACM Computer and Communications Security Conference (CCS),
2009.
[ PDF ]
Many programs operate reactively---patiently waiting for user input, running
for a while producing output, and eventually returning to a state where they
are ready to accept another input (or occasionally diverging). When a
reactive program communicates with multiple parties, we would like to be
sure that it can be given secret information by one without leaking it to
others.
Motivated by web browsers and client-side web applications,
we explore definitions of noninterference for reactive programs and
identify two of special interest---one corresponding to
termination-insensitive noninterference for a simple sequential language,
the other to termination-sensitive noninterference. We focus on the former
and develop a proof technique for showing that program behaviors are secure
according to this definition. To demonstrate the viability of the approach,
we define a simple reactive language with an information-flow type system
and apply our proof technique to show that well-typed programs are secure.
|
[42]
|
Limin Jia, Jeffrey A. Vaughan, Karl Mazurak, Jianzhou Zhao, Luke Zarko, Joseph
Schorr, and Steve Zdancewic.
AURA: A Programming Language for Authorization and Audit.
In Proc. of the 13th ACM SIGPLAN International Conference on
Functional Programming (ICFP), Victoria, British Columbia, Canada, September
2008.
[ PDF ]
This paper presents AURA, a programming language for access control that
treats ordinary programming constructs (e.g., integers and recursive functions)
and authorization logic constructs (e.g., principals and access control
policies) in a uniform way.
AURA is based on polymorphic DCC and uses dependent types to
permit assertions that refer directly to AURA values while keeping
computation out of the assertion level to ensure tractability.
The main technical results of this paper include fully mechanically
verified proofs of the decidability and soundness
for AURA's type system, and a prototype typechecker and interpreter.
|
[43]
|
Joe Devietti, Colin Blundell, Milo M.K. Martin, and Steve Zdancewic.
HardBound: Architectural Support for Spatial Safety of the C
Programming Language.
In International Conference on Architectural Support for
Programming Languages and Operating Systems (ASPLOS), March 2008.
[ PDF ]
The C programming language is at least as well known for its absence of
spatial memory safety guarantees (i.e., lack of bounds checking) as it is
for its high performance. C's unchecked pointer arithmetic and
array indexing allow simple programming mistakes to lead to erroneous
executions, silent data corruption, and security vulnerabilities. Many
prior proposals have tackled enforcing spatial safety in C programs by
checking pointer and array accesses. However, existing software-only
proposals have significant drawbacks that may prevent wide
adoption, including: unacceptably high runtime overheads, lack of
completeness, incompatible pointer representations, or need for
non-trivial changes to existing C source code and compiler
infrastructure.
Inspired by the promise of these software-only approaches, this paper
proposes a hardware bounded pointer architectural primitive
that supports cooperative hardware/software enforcement of spatial
memory safety for C programs. This bounded pointer is a new hardware
primitive datatype for pointers that leaves the standard C pointer
representation intact, but augments it with bounds information
maintained separately and invisibly by the hardware. The bounds are
initialized by the software, and they are then propagated and enforced
transparently by the hardware, which automatically checks a pointer's
bounds before it is dereferenced. One mode of use requires
instrumenting only malloc, which enables enforcement of
per-allocation spatial safety for heap-allocated objects for existing
binaries. When combined with simple intra-procedural compiler
instrumentation, hardware bounded pointers enable a low-overhead
approach for enforcing complete spatial memory safety in unmodified C
programs.
|
[44]
|
Jeffrey A. Vaughan, Limin Jia, Karl Mazurak, and Steve Zdancewic.
Evidence-based Audit.
In Proc. of 21st IEEE Computer Security Foundations Symposium
(CSF), pages 177--191. IEEE Computer Society Press, 2008.
[ PDF ]
Authorization logics provide a principled and flexible approach to
specifying access control policies. One of their compelling benefits
is that a proof in the logic is evidence that an access-control
decision has been made in accordance with policy. Using such proofs
for auditing reduces the trusted computing base and makes it possible to
detect flaws in complex authorization policies. Moreover, the proof
structure is itself useful, because proof
normalization can yield information about the relevance of policy
statements. Untrusted, but well-typed, applications that access
resources through an appropriate interface must obey the access
control policy and create proofs useful for audit.
This paper presents AURA, an authorization logic based on a
dependently-typed variant of DCC and proves the metatheoretic
properties of subject-reduction and normalization. It shows the
utility of proof-based auditing in a number of examples and discusses
several pragmatic issues that must be addressed in this context.
|
[45]
|
Peng Li and Steve Zdancewic.
Combining Events And Threads For Scalable Network Services.
In Proc. 2007 ACM SIGPLAN Conference on Programming Languages
Design and Implementation (PLDI), pages 189--199, 2007.
[ PS ]
This paper proposes to combine two seemingly opposed programming
models for building massively concurrent network services: the
event-driven model and the multithreaded model. The result is a
hybrid design that offers the best of both worlds---the ease of use
and expressiveness of threads and the flexibility and performance of
events.
This paper shows how the hybrid model can be implemented entirely at
the application level using concurrency monads in Haskell,
which provides type-safe abstractions for both events and threads.
This approach simplifies the development of massively concurrent
software in a way that scales to real-world network services. The
Haskell implementation supports exceptions, symmetric multiprocessing,
software transactional memory, asynchronous I/O mechanisms and
application-level network protocol stacks. Experimental results
demonstrate that this monad-based approach has good performance: the
threads are extremely lightweight (scaling to ten million threads),
and the I/O performance compares favorably to that of Linux NPTL.
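To make the thread/event duality concrete, the following is a minimal, self-contained sketch of a continuation-passing concurrency monad with a round-robin event loop; it is my own toy example, not the paper's library, and the names Thread, Trace, yield, say, and schedule are illustrative only (the real system adds exceptions, STM, and asynchronous I/O as described above).

module TinyThreads where

import Data.Sequence (Seq, ViewL (..), fromList, viewl, (|>))

-- A thread is a computation in continuation-passing style whose final
-- answer is a trace of scheduler-visible events.
newtype Thread a = Thread { runThread :: (a -> Trace) -> Trace }

data Trace
  = Done                -- the thread has finished
  | Yield Trace         -- the thread yields control to the scheduler
  | Output String Trace -- an effect the scheduler performs on the thread's behalf

instance Functor Thread where
  fmap f (Thread m) = Thread (\k -> m (k . f))

instance Applicative Thread where
  pure x = Thread ($ x)
  Thread mf <*> Thread mx = Thread (\k -> mf (\f -> mx (k . f)))

instance Monad Thread where
  Thread m >>= f = Thread (\k -> m (\a -> runThread (f a) k))

yield :: Thread ()
yield = Thread (\k -> Yield (k ()))

say :: String -> Thread ()
say s = Thread (\k -> Output s (k ()))

-- The event-loop view: a round-robin scheduler over the threads' traces.
schedule :: Seq Trace -> IO ()
schedule q = case viewl q of
  EmptyL    -> return ()
  t :< rest -> case t of
    Done        -> schedule rest
    Yield t'    -> schedule (rest |> t')
    Output s t' -> putStrLn s >> schedule (rest |> t')

run :: [Thread ()] -> IO ()
run ts = schedule (fromList [runThread t (const Done) | t <- ts])

-- Example: run [say "a" >> yield >> say "c", say "b"] prints a, b, c.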
|
[46]
|
Jeffrey A. Vaughan and Steve Zdancewic.
A Cryptographic Decentralized Label Model.
In IEEE 2007 Symposium on Security and Privacy (Oakland),
pages 192--206, 2007.
[ PDF
PS ]
Information-flow security policies are an appealing way of
specifying confidentiality and integrity policies in information
systems. Most previous work on language-based security has assumed
that programs run in a closed, managed environment and that they use
potentially unsafe constructs, such as declassification, to
interface to external communication channels, perhaps after
encrypting data to preserve its confidentiality. This situation is
unsatisfactory for systems that need to communicate over untrusted
channels or use untrusted persistent storage, since the connection
between the cryptographic mechanisms used in the untrusted
environment and the abstract security labels used in the trusted
language environment is ad hoc and unclear.
This paper addresses this problem in three ways: First, it presents
a simple, security-typed language with a novel mechanism called
packages that provides an abstract means for creating
opaque objects and associating them with security labels; well-typed
programs in this language enforce noninterference. Second, it shows
how to implement these packages using public-key cryptography. This
implementation strategy uses a variant of Myers and Liskov's
decentralized label model, which supports a rich label
structure in which mutually distrusting data owners can specify
independent confidentiality and integrity requirements. Third, it
demonstrates that this implementation of packages is sound with
respect to Dolev-Yao style attackers---such an attacker cannot
determine the contents of a package without possessing the
appropriate keys, as determined by the security label on the
package.
|
[47]
|
Nikhil Swamy, Michael Hicks, Stephen Tse, and Steve Zdancewic.
Managing Policy Updates in Security-Typed Languages.
In Proc. of 19th IEEE Computer Security Foundations Workshop
(CSFW), pages 202--216. IEEE Computer Society Press, 2006.
[ PDF ]
This paper presents RX, a new security-typed
programming language with features intended to make the management of
information-flow policies more practical. Security labels in RX,
in contrast to prior approaches, are defined in terms of owned
roles, as found in the RT role-based trust-management framework.
Role-based security policies allow flexible delegation, and our
language RX provides constructs through which programs can robustly
update policies and react to policy updates dynamically. Our dynamic
semantics use statically verified transactions to eliminate
illegal information flows across updates, which we call
transitive flows. Because policy updates can be observed
through dynamic queries, policy updates can potentially reveal
sensitive information. As such, RX considers policy statements
themselves to be potentially confidential information and subject to
information-flow metapolicies.
|
[48]
|
Peng Li and Steve Zdancewic.
Encoding Information Flow in Haskell.
In Proc. of 19th IEEE Computer Security Foundations Workshop
(CSFW), pages 16--27. IEEE Computer Society Press, 2006.
[ PDF ]
This paper presents an embedded security sublanguage for enforcing
information-flow policies in the standard Haskell programming
language. The sublanguage provides useful information-flow control
mechanisms including dynamic security lattices, run-time code
privileges and declassification, without modifying the base
language. This design avoids the redundant work of producing new
languages, lowers the threshold for adopting security-typed
languages, and also provides great flexibility and modularity for
using security-policy frameworks.
The embedded security sublanguage is designed using a standard
combinator interface called arrows. Computations constructed
in the sublanguage have static and explicit control-flow components,
making it possible to implement information-flow control using
static-analysis techniques at run time, while providing strong
security guarantees. This paper presents a concrete Haskell
implementation and an example application demonstrating the proposed
techniques.
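As a rough illustration of what an information-flow abstraction embedded in unmodified Haskell can look like, here is a minimal phantom-label sketch; this is my own toy example, not the paper's arrows-based interface, which additionally supports dynamic lattices, run-time privileges, and declassification policies. The module name TinyFlow and the two-point Low/High lattice are assumptions made for the example.

{-# LANGUAGE DataKinds, KindSignatures #-}
module TinyFlow (Labeled, label, publish, declassify) where

-- A two-point security lattice, promoted to the type level.
data Level = Low | High

-- A value tagged with a phantom security level. The constructor is not
-- exported, so client code cannot strip the tag by pattern matching.
newtype Labeled (l :: Level) a = Labeled a

label :: a -> Labeled l a
label = Labeled

-- Pure computation over labeled data keeps the label.
instance Functor (Labeled l) where
  fmap f (Labeled x) = Labeled (f x)

-- Only Low-labeled data may reach a public sink.
publish :: Show a => Labeled 'Low a -> IO ()
publish (Labeled x) = print x

-- An explicit escape hatch; a real design would guard this with a
-- declassification policy or a run-time privilege check.
declassify :: Labeled 'High a -> Labeled 'Low a
declassify (Labeled x) = Labeled x

-- Client code: with secret :: Labeled 'High String, the call
-- publish (fmap length secret) is rejected by the type checker, while
-- publish (declassify secret) compiles.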
|
[49]
|
Rajeev Alur, Pavol Černý, and Steve Zdancewic.
Preserving Secrecy under Refinement.
In Proc. of 33rd International Colloquium on Automata, Languages
and Programming (ICALP), pages 107--118, 2006.
[ PDF ]
We propose a general framework of secrecy and
preservation of secrecy for labeled transition systems. Our
definition of secrecy is parameterized by the distinguishing power of
the observer, the properties to be kept secret, and the executions of
interest, and captures a multitude of definitions in the literature.
We define a notion of secrecy preserving refinement between
systems by strengthening the classical trace-based refinement so that
the implementation leaks a secret only when the specification also
leaks it. We show that secrecy is in general not definable in
mu-calculus, and thus not expressible in specification logics
supported by standard model-checkers. However, we develop a
simulation-based proof technique for establishing secrecy preserving
refinement. This result shows how existing refinement checkers can be
used to show correctness of an implementation with respect to a
specification.
|
[50]
|
Peng Li and Steve Zdancewic.
Downgrading Policies and Relaxed Noninterference.
In Proc. 32nd ACM Symp. on Principles of Programming Languages
(POPL), pages 158--170, January 2005.
[ PDF ]
In traditional information-flow type systems, the
security policy is often formalized as noninterference properties.
However, noninterference alone is too strong to express security
properties useful in practice. If we allow downgrading in such
systems, it is challenging to formalize the security policy as an
extensional property of the system.
This paper presents a generalized framework of downgrading
policies. Such policies can be specified in a simple and tractable
language and can be statically enforced by mechanisms such as type
systems. The security guarantee is then formalized as a concise
extensional property using program equivalences. This
relaxed noninterference generalizes traditional pure
noninterference and precisely characterizes the information released
due to downgrading.
|
[51]
|
Peng Li and Steve Zdancewic.
Practical Information-flow Control in Web-based Information
Systems.
In Proc. of 18th IEEE Computer Security Foundations Workshop
(CSFW), pages 2--15, 2005.
[ PDF ]
This paper presents a practical application of
language-based information-flow control, namely, a domain-specific
web scripting language designed for interfacing with databases. The
primary goal is to provide strong enforcement of confidentiality and
integrity policies: confidential data can be released only in
permitted ways and trustworthy data must result from expected
computations or conform to expected patterns. Such security
policies are specified in the database layer and statically enforced
for the rest of the system in an end-to-end fashion.
In contrast with existing web-scripting languages, which provide
only ad hoc mechanisms for information security, the scripting
language described here uses principles based on the well-studied
techniques in information-flow type systems. However, because web
scripts often need to downgrade confidential data and manipulate
untrusted user input, they require practical and convenient ways of
downgrading secure data. To achieve this goal, the language allows
safe downgrading according to downgrading policies specified
by the programmer. This novel, pattern-based approach provides a
practical instance of recent work on delimited release and
relaxed noninterference and extends that work by accounting
for integrity policies.
|
[52]
|
Stephen Tse and Steve Zdancewic.
Designing a Security-typed Language with Certificate-based
Declassification.
In Proc. of the 14th European Symposium on Programming (ESOP),
volume 3444, pages 279--294, 2005.
[ PDF ]
This paper presents a calculus that supports information-flow
security policies and certificate-based declassification. The
decentralized label model and its downgrading mechanisms are
concisely expressed in the polymorphic lambda calculus with
subtyping (System F-Sub). We prove a conditioned version of the
noninterference theorem such that authorization for declassification
is justified by digital certificates from public-key
infrastructures.
|
[53]
|
Brian E. Aydemir, Aaron Bohannon, Matthew Fairbairn, J. Nathan Foster,
Benjamin C. Pierce, Peter Sewell, Dimitrios Vytiniotis, Geoffrey Washburn,
Stephanie Weirich, and Steve Zdancewic.
Mechanized Metatheory for the Masses: The POPLMark Challenge.
In International Conference on Theorem Proving in Higher Order
Logics (TPHOLs), pages 50--65, 2005.
[ PDF ]
How close are we to a world where every paper on
programming languages is accompanied by an electronic appendix with
machine-checked proofs?
We propose an initial set of benchmarks for measuring progress in
this area. Based on the metatheory of System F-Sub, a typed
lambda-calculus with second-order polymorphism, subtyping, and
records, these benchmarks embody many aspects of programming
languages that are challenging to formalize: variable binding at
both the term and type levels, syntactic forms with variable numbers
of components (including binders), and proofs demanding complex
induction principles. We hope that these benchmarks will help
clarify the current state of the art, provide a basis for comparing
competing technologies, and motivate further research.
|
[54]
|
Peng Li and Steve Zdancewic.
Advanced Control Flow in Java Card Programming.
In Proceedings of the 2004 ACM SIGPLAN/SIGBED Conference on
Languages, Compilers, and Tools for Embedded Systems (LCTES), pages
165--174, June 2004.
[ PDF ]
Java Card technology simplifies the development of smart card
applications by providing a high-level programming language similar to
Java. However, the master-slave programming model used in the current
Java Card platform creates control flow difficulties when writing
complex card programs, making it inconvenient, tedious, and
error-prone to implement Java Card applications. This paper examines
these drawbacks of the master-slave model and proposes a concurrent
thread model for developing future Java Card programs, which is much
closer to conventional Java network programming. This paper also
presents a code translation algorithm and a corresponding tool that
makes it possible to write card programs in the concurrent thread
model without losing compatibility with the existing Java Card API.
|
[55]
|
Stephen Tse and Steve Zdancewic.
Run-time Principals in Information-flow Type Systems.
In IEEE 2004 Symposium on Security and Privacy (Oakland), pages
179--193. IEEE Computer Society Press, May 2004.
[ PDF
PS ]
Information-flow type systems are a promising approach for enforcing
strong end-to-end confidentiality and integrity policies. Such
policies, however, are usually specified in terms of static
information---data is labeled high or low security at
compile time. In practice, the confidentiality of data may depend
on information available only while the system is running.
This paper studies language support for run-time principals, a
mechanism for specifying information-flow security policies that
depend on which principals interact with the system. We establish
the basic property of noninterference for programs written in such
language, and use run-time principals for specifying run-time
authority in downgrading mechanisms such as declassification.
In addition to allowing more expressive security policies, run-time
principals enable the integration of language-based security
mechanisms with other existing approaches such as Java stack
inspection and public key infrastructures. We sketch an
implementation of run-time principals via public keys such that
principal delegation is verified by certificate chains.
|
[56]
|
Stephen Tse and Steve Zdancewic.
Translating Dependency into Parametricity.
In Proc. of the 9th ACM SIGPLAN International Conference on
Functional Programming (ICFP), 2004.
[ PDF
PS ]
Abadi et al. introduced the dependency core calculus (DCC) as a unifying framework to study many important program analyses such as binding time, information flow, slicing, and function call tracking. DCC uses a lattice of monads and a nonstandard typing rule for their associated bind operations to describe the dependency of computations in a program. Abadi et al. proved a noninterference theorem that establishes the correctness of DCC's type system and thus the correctness of the type systems for the analyses above.
In this paper, we study the relationship between DCC and the Girard--Reynolds polymorphic lambda calculus (System F). We encode the recursion-free fragment of DCC into F via a type-directed translation. Our main theoretical result is that, following from the correctness of the translation, the parametricity theorem for F implies the noninterference theorem for DCC. In addition, the translation provides insights into DCC's type system and suggests implementation strategies of dependency calculi in polymorphic languages.
|
[57]
|
Andrew C. Myers, Andrei Sabelfeld, and Steve Zdancewic.
Enforcing Robust Declassification.
In Proc. of 17th IEEE Computer Security Foundations Workshop
(CSFW), pages 172--186, 2004.
[ PDF ]
Noninterference requires that there is no information
flow from sensitive to public data in a given system. However, many
systems perform intentional release of sensitive information as part
of their correct functioning and therefore violate noninterference.
To control information flow while permitting intentional information
release, some systems have a downgrading or declassification
mechanism. A major danger of such a mechanism is that it may cause
unintentional information release. This paper shows that a
robustness property can be used to characterize programs in which
declassification mechanisms cannot be exploited by attackers to
release more information than intended. It describes a simple way
to provably enforce this robustness property through a type-based
compile-time program analysis. The paper also presents a
generalization of robustness that supports upgrading (endorsing)
data integrity.
|
[58]
|
David Walker, Steve Zdancewic, and Jay Ligatti.
A Theory of Aspects.
In Proc. of the 8th ACM SIGPLAN International Conference on
Functional Programming (ICFP), pages 127--139, Uppsala, Sweden, August 2003.
[ PDF
PS ]
This paper defines the semantics of MinAML, an idealized
aspect-oriented programming language, by giving
a type-directed translation from its user-friendly external language
to its compact, well-defined core language.
We argue that our framework is an effective way
to give semantics to aspect-oriented programming languages in
general because the translation eliminates shallow syntactic differences
between related constructs and permits definition of a clean,
easy-to-understand, and easy-to-reason-about core language.
The core language extends the simply-typed lambda
calculus with two central new abstractions: explicitly labeled program
points and first-class advice. The labels serve both to trigger
advice and to mark continuations that the advice may return to. These
constructs are defined orthogonally to the other features of the
language and we show that our abstractions can be used in both
functional and object-oriented contexts. The labels are well-scoped
and the language as a whole is well-typed. Consequently, programmers
can use lexical scoping in the standard way to prevent aspects from
interfering with local program invariants.
|
[59]
|
Steve Zdancewic and Andrew C. Myers.
Observational Determinism for Concurrent Program Security.
In Proc. of 16th IEEE Computer Security Foundations Workshop
(CSFW), pages 29--45, Asilomar, CA, July 2003.
[ PDF
PS ]
Noninterference is a property of sequential programs that is useful
for expressing security policies for data confidentiality and
integrity. However, extending noninterference to concurrent programs has
proved problematic. In this paper we present a relatively expressive
secure concurrent language. This language, based on existing
concurrent calculi, provides first-class channels, higher-order
functions, and an unbounded number of threads. Well-typed programs
obey a generalization of noninterference that ensures
immunity to internal
timing attacks and to attacks that exploit information about the thread
scheduler. Elimination of these refinement attacks is possible because
the enforced security property extends noninterference with
observational determinism. Although the security property is strong,
it also avoids some of the restrictiveness imposed by
previous security-typed concurrent languages.
|
[60]
|
Lantian Zheng, Stephen Chong, Steve Zdancewic, and Andrew C. Myers.
Building Secure Distributed Systems Using Replication and
Partitioning.
In IEEE 2003 Symposium on Security and Privacy (Oakland),
pages 236--250. IEEE Computer Society Press, 2003.
[ PDF
PS ]
A challenging unsolved security problem is how to
specify and enforce system-wide security policies; this problem is
even more acute in distributed systems with mutual distrust. This
paper describes a way to enforce policies for data confidentiality and
integrity in such an environment. Programs annotated with security
specifications are statically checked and then transformed by the
compiler to run securely on a distributed system with untrusted
hosts. The code and data of the computation are partitioned across the
available hosts in accordance with the security specification. The
key contribution is automatic replication of code and data to increase
assurance of integrity---without harming confidentiality, and without
placing undue trust in any host. The compiler automatically generates
secure run-time protocols for communication among the replicated code
partitions. Results are given from a prototype implementation applied
to various distributed programs.
|
[61]
|
Steve Zdancewic, Lantian Zheng, Nathaniel Nystrom, and Andrew C. Myers.
Untrusted Hosts and Confidentiality: Secure Program
Partitioning.
In Proc. 18th ACM Symp. on Operating System Principles
(SOSP), volume 35(5) of Operating Systems Review, pages 1--14, Banff,
Canada, October 2001.
[ PDF
PS ]
This paper presents secure program partitioning, a
language-based technique for protecting confidential data during
computation in distributed systems containing mutually untrusted
hosts. Confidentiality and integrity policies can be expressed by
annotating programs with security types that constrain information
flow; these programs can then be partitioned automatically to run
securely on heterogeneously trusted hosts. The resulting
communicating subprograms collectively implement the original
program, yet the system as a whole satisfies the security
requirements of participating principals without requiring a
universally trusted host machine. The experience in applying this
methodology and the performance of the resulting distributed code
suggest that this is a promising way to obtain secure distributed
computation.
|
[62]
|
Steve Zdancewic and Andrew C. Myers.
Robust Declassification.
In Proc. of 14th IEEE Computer Security Foundations Workshop
(CSFW), pages 15--23, Cape Breton, Canada, June 2001.
[ PDF
PS ]
Security properties based on information flow, such as
noninterference, provide strong guarantees that confidentiality is
maintained. However, programs often need to leak some amount of
confidential information in order to serve their intended purpose,
and thus violate noninterference. Real systems that control
information flow often include mechanisms for downgrading or
declassifying information; however, declassification can easily
result in the unexpected release of confidential information.
This paper introduces a formal model of information flow in systems
that include intentional information leaks and shows how to
characterize what information leaks. Further, we define a notion of
robustness for systems that include information leaks introduced by
declassification. Robust systems have the property that an attacker is
unable to exploit declassification channels to obtain more
confidential information than was intended to be released. We show
that all systems satisfying a noninterference-like property are
robust; for other systems, robustness involves a nontrivial
interaction between confidentiality and integrity properties. We
expect this model to provide new tools for the characterization of
information flow properties in the presence of intentional information
leaks.
|
[63]
|
Steve Zdancewic and Andrew C. Myers.
Secure Information Flow and CPS.
In Proc. of the 10th European Symposium on Programming (ESOP),
volume 2028 of Lecture Notes in Computer Science, pages 46--61, April
2001.
[ PDF
PS ]
Security-typed languages enforce secrecy or integrity
policies by type-checking. This paper investigates
continuation-passing style as a means of proving that such languages
enforce non-interference and as a first step towards understanding
their compilation. We present a low-level, secure calculus with
higher-order, imperative features. Our type system makes novel use
of ordered linear continuations.
|
[64]
|
Steve Zdancewic, Dan Grossman, and Greg Morrisett.
Principals in Programming Languages: A Syntactic Proof
Technique.
In Proc. of the 4th ACM SIGPLAN International Conference on
Functional Programming (ICFP), pages 197--207, Paris, France, September
1999.
[ PDF
PS ]
Programs are often structured around the idea that
different pieces of code comprise distinct principals, each with a
view of its environment. Typical examples include the modules of a
large program, a host and its clients, or a collection of
interactive agents.
In this paper, we formalize this notion of principal in the
programming language itself. The result is a language in which
intuitive statements such as, “the client must call open to obtain a
file handle”, can be phrased and proven formally.
We add principals to variants of the simply-typed lambda-calculus and
show how we can track the code corresponding to each principal
throughout evaluation. This multiagent calculus yields syntactic
proofs of some type abstraction properties that traditionally require
semantic arguments.
|
|
Technical Reports, Work in Progress, and Unpublished Papers
[1]
|
Jennifer Paykin and Steve Zdancewic.
A HoTT Quantum Equational Theory (Extended Version).
available on arXiv, 2019.
[ http ]
This paper presents an equational theory for the QRAM model of quantum computation, formulated as an embedded language inside of homotopy type theory. The embedded language approach is highly expressive, and reflects the style of state-of-the art quantum languages like Quipper and QWIRE. The embedding takes advantage of features of homotopy type theory to encode unitary transformations as higher inductive paths, simplifying the presentation of an equational theory. We prove that this equational theory is sound and complete with respect to established models of quantum computation.
|
[2]
|
Dmitri Garbuzov, William Mansky, Christine Rizkallah, and Steve Zdancewic.
Structural Operational Semantics for Control Flow Graph
Machines, 2018.
[ arXiv
PDF ]
Compilers use control flow graph (CFG) representations of low-level programs because they are suited to program
analysis and optimizations. However, formalizing the behavior and metatheory of CFG programs is non-trivial: CFG
programs don't compose well, their semantics depends on auxiliary state, and, as a consequence, they do not enjoy a simple equational theory that can be used for reasoning about the correctness of program transformations. Lambda-calculus-based intermediate representations, in contrast, have well-understood operational semantics and metatheory, including rich equational theories, all of which makes them amenable to formal verification.
This paper establishes a tight equivalence between (a variant of) Levy's call-by-push-value (CBPV) calculus and
a control flow graph machine whose instructions are in static single assignment (SSA) form. The correspondence is
made precise via a series of abstract machines that align the transitions of the structural operational semantics of the CBPV language with the computation steps of the SSA form.
The target machine, which is derived from the CBPV language, accurately captures the execution model of control
flow graphs, including direct jumps, mutually recursive code blocks, and multi-argument function calls, and the closure-free subset is similar to the SSA intermediate representations found in modern compilers such as LLVM and GCC. The definitions of all the language/abstract machine semantics and the theorems relating them are fully verified in Coq.
|
[3]
|
Jennifer Paykin and Steve Zdancewic.
A Linear/Producer/Consumer model of Classical Linear Logic.
Technical report, University of Pennsylvania, 2014.
[ PDF ]
This paper defines a new proof- and category-theoretic framework for
classical linear logic that separates reasoning into one linear regime
and two persistent regimes corresponding to ! and ?. The resulting
linear/producer/consumer (LPC) logic puts the three classes of
propositions on the same semantic footing, following Benton's
linear/non-linear formulation of intuitionistic linear logic.
Semantically, LPC corresponds to a system of three categories
connected by adjunctions that reflect the linear/producer/consumer
structure. The paper's metatheoretic results include admissibility
theorems for the cut and duality rules, and a translation of the LPC
logic into the category theory. The work also presents several
concrete instances of the LPC model, including one based on finite
vector spaces.
|
[4]
|
Christian DeLozier, Richard Eisenberg, Santosh Nagarakatte, Peter-Michael
Osera, Milo M.K. Martin, and Steve Zdancewic.
Ironclad C++: A Library-Augmented Type-Safe Subset of C++.
Technical Report MS-CIS-13-05, University of Pennsylvania, March
2013.
[ PDF ]
|
[5]
|
Jianzhou Zhao, Qi Zhang, and Steve Zdancewic.
Relational Parametricity for Polymorphic Linear Lambda Calculus
(Extended TR).
2010.
[ PDF ]
This is the technical report for the corresponding APLAS 2010 paper.
|
[6]
|
Jeffrey C. Vaughan, Limin Jia, Karl Mazurak, and Steve Zdancewic.
Evidence-based Audit, Technical Appendix.
Technical Report MS-CIS-08-09, University of Pennsylvania, 2008.
[ PDF ]
|
[7]
|
Limin Jia, Jeffrey A. Vaughan, Karl Mazurak, Jianzhou Zhao, Luke Zarko, Joseph
Schorr, and Steve Zdancewic.
AURA: Preliminary Technical Results.
Technical Report MS-CIS-08-10, University of Pennsylvania, 2008.
[ PDF ]
|
[8]
|
Stephen Tse and Steve Zdancewic.
Concise Concrete Syntax.
Technical Report MS-CIS-08-11, University of Pennsylvania, 2008.
[ PDF ]
|
[9]
|
Brian Aydemir, Stephanie Weirich, and Steve Zdancewic.
Abstracting Syntax.
(15 pages), 2008.
|
[10]
|
Stephen Tse and Steve Zdancewic.
Translating Dependency into Parametricity.
(33 pages) Accepted to Journal of Functional Programming,
pending revisions, 2006.
|
[11]
|
Stephen Tse and Steve Zdancewic.
Translating Dependency into Parametricity.
Technical Report MS-CIS-04-01, University of Pennsylvania, 2004.
[ PDF ]
The dependency core calculus (DCC) was introduced by Abadi
et al. as a unifying formal framework in which to study a variety of
important program analyses including binding-time, information-flow,
slicing, and function call tracking. The novel feature of DCC is a
lattice of monads and a nonstandard typing rule for their associated
bind operations. Intuitively, the lattice structure describes
which computations in a program may depend on each other. Abadi et
al. prove a noninterference result that establishes the
correctness of DCC's type system, and they use that result to show
that type systems for the above-mentioned analyses are correct.
In this paper, we study the relationship between DCC and the
Girard--Reynolds polymorphic lambda calculus (System F). In
particular, we show how to encode the recursion-free fragment of DCC
into F via a type-directed translation. The main theorem we present
uses this translation to derive the noninterference result for DCC
from the standard parametricity theorem of System F. In addition to
providing insight into DCC's type system, the hope is that the
translation presented here may yield implementation strategies for
non-standard type systems (e.g. for information flow security) in
languages that have parametric polymorphism.
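The flavor of the main theorem can be illustrated with a small Haskell sketch (this conveys only the parametricity intuition and is not the paper's type-directed translation): when a secret is supplied at an abstract type, a well-typed observer cannot make its public output depend on it.

    {-# LANGUAGE RankNTypes #-}

    -- Illustrative only; *not* the paper's encoding of DCC into System F.
    module ParametricityNI where

    -- A "low" observer receives the secret only at an abstract type s.
    -- Parametricity forces its Bool result to be independent of the secret.
    type LowObserver = forall s. s -> Bool

    -- Noninterference in this toy setting: the observable output is the
    -- same no matter which secret is supplied.
    nonInterference :: LowObserver -> a -> a -> Bool
    nonInterference obs secret1 secret2 = obs secret1 == obs secret2

    main :: IO ()
    main = print (nonInterference (\_ -> True) (42 :: Int) (0 :: Int))
    -- prints True; by parametricity it is True for every total observer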
|
[12]
|
Stephen Tse and Steve Zdancewic.
Designing a Security-typed Language with Certificate-based
Declassification.
Technical Report MS-CIS-04-16, University of Pennsylvania, 2004.
[ PDF ]
This paper presents the design of a programming language that
supports information-flow security policies and certificate-based
declassification.
The language uses monadic information-flow annotations in the style
of Abadi et al.'s dependency core calculus, and has an effects
system and fixpoints. The type system conflates security concepts
such as labels, principals, and privileges with abstract types,
allowing a uniform treatment of lattice structures throughout the
language. Myers' and Liskov's decentralized label model is encoded
using type constructors that describe confidentiality and integrity
policies, and label refinements and principal groups follow
naturally from intersection and union types. Singleton types,
combined with bounded universal and existential quantifications,
connect the type system with public-key infrastructures whose
digital certificates provide authorization for privileged operations
such as declassification. These features allow specification of
security policies in terms of dynamic entities such as run-time user
identities and file access permissions.
Besides showing that the language is sound, we present a security
theorem that generalizes standard noninterference to account for
information flows introduced by declassification. Although this
result gives only a coarse approximation to the information
potentially leaked, it captures our intuitions about
certificate-based declassification.
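The overall shape can be sketched in Haskell as follows (an illustration of labels-as-types with certificate-gated declassification; this is not the paper's calculus, which uses singleton types and bounded quantification connected to a PKI):

    {-# LANGUAGE DataKinds, KindSignatures #-}

    -- Illustrative sketch only, not the paper's language.  Values carry a
    -- confidentiality label in their type, and declassification requires
    -- presenting a certificate at run time.
    module CertDeclassify where

    data Label = Low | High

    -- A value labeled with its confidentiality level.  In a real design the
    -- constructor is hidden, so labels can only be changed via this API.
    newtype Labeled (l :: Label) a = MkLabeled a

    -- A certificate asserting that some principal authorizes the release.
    -- Here it is just a record; the paper ties it to digital certificates.
    data Certificate = Certificate { issuer :: String, subject :: String }

    label :: a -> Labeled l a
    label = MkLabeled

    -- Computation within a label: the label of the result is unchanged.
    withinLabel :: (a -> b) -> Labeled l a -> Labeled l b
    withinLabel f (MkLabeled x) = MkLabeled (f x)

    -- Declassification consumes a certificate.  A real system would verify
    -- the certificate's signature; here we only check it is non-empty.
    declassify :: Certificate -> Labeled 'High a -> Maybe (Labeled 'Low a)
    declassify cert (MkLabeled x)
      | not (null (issuer cert)) = Just (MkLabeled x)
      | otherwise                = Nothing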
|
[13]
|
Stephen Tse and Steve Zdancewic.
Run-time Principals in Information-flow Type Systems.
Technical Report MS-CIS-03-39, University of Pennsylvania, 2003.
The conference version appears in IEEE Security and Privacy 2004.
[ PDF | PS ]
Information-flow type systems are a promising approach for enforcing
strong end-to-end confidentiality and integrity policies. Such
policies, however, are usually specified in terms of static
information: data is labeled high or low security at
compile time. In practice, the confidentiality of data may depend
on information available only while the system is running.
This paper studies language support for run-time principals, a
mechanism for specifying information-flow security policies that
depend on which principals interact with the system. We establish
the basic property of noninterference for programs written in such a
language, and use run-time principals for specifying run-time
authority in downgrading mechanisms such as declassification.
In addition to allowing more expressive security policies, run-time
principals enable the integration of language-based security
mechanisms with other existing approaches such as Java stack
inspection and public key infrastructures. We sketch an
implementation of run-time principals via public keys such that
principal delegation is verified by certificate chains.
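For instance, the run-time delegation check might look like the following Haskell sketch (illustrative only; the paper formalizes this in a security-typed calculus where delegation is verified by certificate chains over public keys):

    -- Illustrative only: principals as run-time values, with "acts-for"
    -- decided by walking delegation certificates at run time.
    module RuntimePrincipals where

    newtype Principal = Principal String
      deriving (Eq, Show)

    -- A delegation certificate: the delegator grants authority to the delegatee.
    data Delegation = Delegation
      { delegator :: Principal
      , delegatee :: Principal
      } deriving Show

    -- actsFor certs p q: p may act with q's authority if p is q, or q has
    -- (transitively) delegated to p.  The visited list guards against cycles.
    actsFor :: [Delegation] -> Principal -> Principal -> Bool
    actsFor certs = go []
      where
        go seen p q
          | p == q        = True
          | q `elem` seen = False
          | otherwise     =
              any (\d -> delegator d == q && go (q : seen) p (delegatee d)) certs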
|
[14]
|
Stephan A. Zdancewic.
Programming Languages for Information Security.
PhD thesis, Cornell University, August 2002.
[ PDF | PS ]
Our society's widespread dependence on networked information systems
for everything from personal finance to military communications makes
it essential to improve the security of software. Standard security
mechanisms such as access control and encryption are essential
components for protecting information, but they do not provide
end-to-end guarantees. Programming-languages research has
demonstrated that security concerns can be addressed by using both
program analysis and program rewriting as powerful and flexible
enforcement mechanisms.
This thesis investigates security-typed
programming languages, which use static typing to enforce
information-flow security policies. These languages allow the
programmer to specify confidentiality and integrity constraints on the
data used in a program; the compiler verifies that the
program satisfies the constraints.
Previous theoretical research on security-typed languages has focused on
simple models of computation and unrealistically idealized security
policies. The existing practical security-typed languages have not
been proved to guarantee security. This thesis addresses these
limitations in several ways.
First, it establishes noninterference, a basic information-flow
policy, for languages richer than those previously considered. The
languages studied here include recursive, higher-order functions,
structured state, and concurrency. These results narrow the gap
between the theory and the practice of security-typed languages.
Next, this thesis considers more practical security policies.
Noninterference is often too restrictive for real-world programming.
To compensate, a restricted form of declassification is introduced,
allowing programmers to specify a richer set of information-flow
policies. Previous work on information-flow security also assumed
that all computation occurs on equally trusted machines. To overcome
this unrealistic premise, additional security constraints for systems
distributed among heterogeneously trusted hosts are considered.
Finally, this thesis describes Jif/split, a prototype implementation
of secure program partitioning, in which a program can automatically
be partitioned to run securely on heterogeneously trusted hosts. The
resulting communicating subprograms collectively implement the
original program, yet the system as a whole satisfies the security
requirements without needing a universally trusted machine. The
theoretical results developed earlier in the thesis justify
Jif/split's run-time enforcement mechanisms.
|
[15]
|
Steve Zdancewic, Lantian Zheng, Nathaniel Nystrom, and Andrew C. Myers.
Secure Program Partitioning.
Technical Report 2001-1846, Computer Science Dept., Cornell
University, 2001.
[ PDF | PS ]
This paper presents secure program partitioning, a
language-based technique for protecting confidential data during
computation in distributed systems containing mutually untrusted
hosts. Confidentiality and integrity policies can be expressed by
annotating programs with security types that constrain information
flow; these programs can then be partitioned automatically to run
securely on heterogeneously trusted hosts. The resulting
communicating subprograms collectively implement the original
program, yet the system as a whole satisfies the security
requirements of participating principals without requiring a
universally trusted host machine. The experience in applying this
methodology and the performance of the resulting distributed code
suggest that this is a promising way to obtain secure distributed
computation.
This Technical Report is an expanded version of the published paper
“Untrusted Hosts and Confidentiality: Secure Program Partitioning”.
The main difference between the two is Appendix A, which contains a
correctness proof for the control-transfer protocols described in
Section 5.
|
[16]
|
Steve Zdancewic and Andrew C. Myers.
Confidentiality and Integrity with Untrusted Hosts.
Technical Report 2000-1810, Computer Science Dept., Cornell
University, 2000.
[ PDF | PS ]
Several security-typed languages have recently been proposed to enforce
security properties such as confidentiality or integrity by type checking.
We propose a new security-typed language, Spl@, that
addresses two important limitations of previous approaches.
First, existing languages assume that the underlying execution platform
is trusted; this assumption does not scale to distributed computation in
which a variety of differently trusted hosts are available to execute
programs. Our new approach, secure program partitioning, translates
programs written assuming complete trust in a single executing host into
programs that execute using a collection of variously trusted hosts to
perform computation. As the trust configuration of a distributed system
evolves, this translation can be performed as necessary for security.
Second, many common program transformations do not work in existing
security-typed languages; although they produce equivalent
programs, these programs are rejected because of apparent information
flows. Spl@ uses a novel mechanism
based on ordered linear continuations to permit a richer class of program
transformations, including secure program partitioning.
|
[17]
|
Steve Zdancewic and Dan Grossman.
Principals in Programming Languages: Technical Results.
Technical Report TR99-1752, Computer Science Dept., Cornell
University, June 1999.
[ PDF | PS ]
This is the companion technical report for “Principals
in Programming Languages: A Syntactic Proof Technique.” See that
document for a more readable version of these results.
In this paper, we describe two variants of the simply typed
lambda-calculus extended with a notion of principal. The results are
languages in which intuitive statements like “the client must call
open to obtain a file handle” can be phrased and proven formally.
The first language is a two-agent calculus with references and
recursive types, while the second language explores the possibility of
multiple agents with varying amounts of type information. We use these
calculi to give syntactic proofs of some type abstraction results that
traditionally require semantic arguments.
|