The Lisp interface to Jack will allow general programming of Jack internals and should simplify all forms of Jack development.
A clear and modular specification of Prolog using the pi-calculus is presented in this paper. Prolog goals are represented as pi-calculus processes, and Prolog predicate definitions are translated into pi-calculus agent definitions. Prolog's depth-first left-to-right control strategy, as well as the cut control operator, is modeled by synchronized communication among processes, which is similar in spirit to continuation-passing-style implementations of Prolog. Prolog terms are represented by persistent processes, while logical variables are modeled by complex processes with channels that, at various times, can be written, read, and reset. Unification both with and without backtracking is specified by pi-calculus agent definitions. A smooth merging of the specification for control and the specification for unification gives a full specification for much of Prolog. Some related and further work is also discussed.
We present a cache-based approach to handling the difficult problem of performing visually acceptable texture resampling/filtering while ray-tracing. While many good methods have been proposed to handle the error introduced by the ray-tracing algorithm when sampling in screen space, handling this error in texture space has been less adequately addressed. Our solution is to introduce the Convolution Mask Approximation Module (CMAM). The CMAM locally approximates the convolution region in texture space as a set of overlapping texture triangles by using a texture sample caching system and ray tagging. Since the caching mechanism is hidden within the CMAM, the ray-tracing algorithm itself is unchanged while achieving an adequate level of texture filtering (area sampling as opposed to point sampling/interpolation in texture space). The CMAM is easily adapted to incorporate prefiltering methods such as MIP mapping and summed-area tables as well as direct convolution methods such as elliptical weighted average filtering.
The Abiteboul and Beeri algebra for complex objects can express a query whose meaning is transitive closure, but the algorithm naturally associated with this query needs exponential space. We show that any other query in the algebra which expresses transitive closure also needs exponential space. This proves that in general the powerset is an intractable operator for implementing fixpoint queries.
We show that a form of divide-and-conquer recursion on sets, together with the relational algebra, expresses exactly the queries over ordered relational databases which are NC-computable. At a finer level, we relate k nested uses of recursion exactly to AC^k, k ≥ 1. We also give corresponding results for complex objects.
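The abstract leaves the algebra's operators implicit; as a rough, hypothetical illustration of why such queries sit in NC, the classic repeated-squaring computation of transitive closure converges in logarithmically many composition rounds, each round itself a data-parallel join (function names are ours, not the paper's):

```python
def compose(r, s):
    """Relational composition: {(a, c) | (a, b) in r and (b, c) in s}."""
    return {(a, c) for (a, b) in r for (b2, c) in s if b == b2}

def transitive_closure(r):
    """Transitive closure by repeated squaring: O(log n) composition
    rounds suffice, since path lengths double each round."""
    t = set(r)
    while True:
        t_next = t | compose(t, t)
        if t_next == t:
            return t
        t = t_next
```

Each round is a bounded-depth, highly parallel operation, which is the intuition behind relating nested recursion depth to the AC^k hierarchy.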
When a human observer moves, the eye continually fixates on targets in the world. Although fixation is a common process in human vision, its role has not yet been established for computational purposes. The main contribution of this paper is to formalize the retinal flow for a fixating observer. A further contribution -- a potentially more practical one -- is to explore the role of the periphery in predicting collision. Utilizing fixation is expected to turn out to be especially fruitful in light of recent advances in computer vision in constructing active head/eye systems.
In this work we make the following assumptions: (i) the observer moves with respect to the world and fixates on a target; (ii) the world is rigid, with no independently moving elements; and (iii) the possible rotation axes of the eye lie on a plane (comparable to Listing's Plane). Assumptions (ii) and (iii) make the problem of determining retinal flow tractable.
We first define retinal flow for a 2D universe and then extend it to the full 3D case. Retinal flow will be decomposed into longitudinal and latitudinal flow; the behavior of longitudinal flow along the retinal periphery will be further analyzed for interesting properties. Finally, the results of a simulated experiment on retinal flow at the periphery will be presented.
A novel edge operator is introduced based on steerable asymmetric linear filters consisting of radial wedge segments. An intensity profile is computed by averaging intensity values along a radial wedge segment as it ``sweeps'' about a small circular neighborhood. The ``steerability'' of the filters allows for interpolation of a continuous profile function from n discretely sampled positions of the radial wedge segments. Edge strength is then calculated as a simple difference of conditional means of the resulting intensity profile. This paper introduces the basic paradigm of using asymmetric filters for low-level image processing tasks and shows how this approach is utilized to design a novel edge operator (the Radial InTensity Edge, or RITE).
Quality of Service (QoS) guarantees for 'delay sensitive' networked applications must be end-to-end. This paper presents an experimental study of this class of applications where the endpoints are computer workstations. The experiments show that operating system effects dominate any jitter in the network.
Our conclusion is that QoS provision by the workstation operating system is as important for maintaining end-to-end guarantees as is network QoS. In local-area settings, operating system influences may be more challenging for end-to-end QoS than network influences. The important influence variables are the degree of multiprocessing, the employed transport protocol (e.g., UDP or TCP), and the priorities assigned to processes.
In this paper we review Subsumption Architecture and Discrete Event Systems. These approaches present diverse methodologies for dealing with control of interactions, and they often take diametrically opposite directions in addressing specific issues. Subsumption architecture expects limited knowledge of the environment, no explicit representation, limited reasoning capabilities, and no centralized control. At the other extreme lies Discrete Event Systems, which requires, at least in manufacturing and communication, a well-structured environment and explicit representations and models, and which has limited reasoning capabilities and centralized control. Both offer benefits and limitations which should be evaluated and traded off when attempting to build a system. However, combining aspects from these two approaches will not address and resolve all issues. We conclude that while both approaches are powerful, there is more to intelligence than just behavior and control, and we discuss the limitations and benefits entailed by both.
We map structured connectionist models of knowledge representation and reasoning onto existing general-purpose massively parallel architectures with the objective of developing and implementing practical, real-time knowledge base systems. Shruti, a connectionist knowledge representation and reasoning system which attempts to model reflexive reasoning, will serve as our representative connectionist model. Efficient simulation systems for Shruti are developed on the Connection Machine CM-2---a SIMD architecture---and on the Connection Machine CM-5---a MIMD architecture. The resulting simulators are evaluated and tested using large, random knowledge bases with up to half a million rules and facts.
Though SIMD simulations on the CM-2 are reasonably fast---requiring a few seconds to tens of seconds for answering simple queries---experiments indicate that MIMD simulations are vastly superior to SIMD simulations and offer hundred- to thousand-fold speedups.
This work provides new insights into the simulation of structured connectionist networks on massively parallel machines and is a step toward developing large yet efficient knowledge representation and reasoning systems.
A probabilistic approach to the brain image matching problem is proposed in which no assumptions are made about the nature of the intensity relationship between the two brain images. Instead the correspondence between the two intensities is represented by a conditional probability, which is iteratively determined as part of the matching problem. This paper presents the theory and describes its finite element implementation. The results of preliminary experiments indicate that there remain several aspects of the algorithm that require further investigation and refinement.
In this technical report, we investigate modeling, control, and coordination of mobile manipulators. A mobile manipulator in this study consists of a robotic manipulator and a mobile platform, with the manipulator mounted atop the mobile platform. A mobile manipulator combines the dextrous manipulation capability offered by fixed-base manipulators and the mobility offered by mobile platforms. While mobile manipulators offer tremendous potential for flexible material handling and other tasks, at the same time they bring about a number of challenging issues rather than simply increasing the structural complexity. First, combining a manipulator and a platform creates redundancy. Second, a wheeled mobile platform is subject to nonholonomic constraints. Third, there exists dynamic interaction between the manipulator and the mobile platform. Fourth, manipulators and mobile platforms have different bandwidths: mobile platforms typically have slower dynamic response than manipulators. The objective of the thesis is to develop control algorithms that effectively coordinate manipulation and mobility of mobile manipulators.
We begin by deriving the motion equations of mobile manipulators. The derivation presented here makes use of the existing motion equations of manipulators and mobile platforms, and simply introduces the velocity- and acceleration-dependent terms that account for the dynamic interaction between manipulators and mobile platforms. Since nonholonomic constraints play a critical role in the control of mobile manipulators, we then study the control properties of nonholonomic dynamic systems, including feedback linearization and internal dynamics. Based on the newly proposed concept of a preferred operating region, we develop a set of coordination algorithms for mobile manipulators. While the manipulator performs manipulation tasks, the mobile platform is controlled to always bring the configuration of the manipulator into a preferred operating region. The control algorithms for two types of tasks -- dragging motion and following motion -- are discussed in detail. The effects of dynamic interaction are also investigated.
To verify the efficacy of the coordination algorithms, we conduct numerical simulations with representative task trajectories. Additionally, the control algorithms for the dragging motion and following motion have been implemented on an experimental mobile manipulator. The results from the simulation and experiment are presented to support the proposed control algorithms.
Many networked multimedia applications are delay-sensitive, and hence require services with guarantees of resource availability and timeliness. For networks such as those based on Asynchronous Transfer Mode (ATM), these services are specified through Quality of Service (QoS) parameters. Delivering end-to-end QoS implies complex resource management at the end-points (e.g., computer workstation hosts), as well as in the underlying network.
In this paper, we describe a model for an end-point entity, which we have designed and implemented, called the QoS Broker. The broker orchestrates resources at the end-points, cooperating with resource management in the underlying ATM network. The broker, as an intermediary, hides implementation details from applications and resource managers. We motivate the concept and particulars of our design, including services such as translation, admission, and negotiation, which the broker uses to properly configure the system to application needs. We treat the QoS negotiation as a 'deal' between the user (``buyer'') and the network (``seller'') for the setup of a customized connection.
The key concept is that the broker is an active intermediary which isolates cooperating entities from operational details of other entities.
An electronic dictionary system (EDS) is developed using object-oriented database techniques based on ObjectStore. The EDS is basically composed of two parts: the Database Building Program (DBP) and the Database Querying Program (DQP). The DBP reads in a dictionary encoded in SGML tags, and builds a database composed of a collection of trees which hold dictionary entries and several lists which contain values of various lexical categories. With the text exchangeability introduced by SGML, the DBP is able to accommodate dictionaries of different structures after modifying its configuration file. The DQP adopts SQL-like syntax and handles queries by exploiting the category lists through an optimization procedure. Initial tests show that, compared with a relational database, the DQP enjoys much faster speed and offers much higher flexibility in setting both lexical criteria and context requirements.
This paper describes an experimental system which provides heuristic solutions to vehicle routing problems. The system basically contains two components: an initial routes generator and a route improver. The route generator consists of a modified version of the sweep algorithm and a new TSP algorithm, called the shrink algorithm. It is observed that the shrink algorithm is much more computationally efficient than Lin's exchange method, while giving outputs comparable to those from the latter. The route improver in turn consists of two sub-components. The first is a small yet efficient expert system, which identifies and remedies specific problems in the initial routes, with the experience of human routers incorporated in the improvement process. The second sub-component is a pair-wise rerouter, which runs the saving algorithm on neighboring-route pairs, trying to reduce the total costs. The system has been tried on several typical routing problems, and yields optimal or near-optimal solutions.
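The paper's modified sweep algorithm and its expert-system improver are not reproduced in the abstract; as a hedged sketch of the underlying sweep idea only (all names are ours), the classic heuristic orders customers by polar angle about the depot and cuts the sweep into a new route whenever vehicle capacity would be exceeded:

```python
import math

def sweep_routes(depot, customers, capacity):
    """Sweep heuristic: order customers by polar angle about the depot,
    then start a new route whenever adding the next customer would
    exceed vehicle capacity.  customers: list of (x, y, demand)."""
    dx, dy = depot
    ordered = sorted(customers,
                     key=lambda c: math.atan2(c[1] - dy, c[0] - dx))
    routes, current, load = [], [], 0
    for c in ordered:
        if current and load + c[2] > capacity:
            routes.append(current)
            current, load = [], 0
        current.append(c)
        load += c[2]
    if current:
        routes.append(current)
    return routes
```

Each resulting cluster would then be sequenced by a TSP routine such as the paper's shrink algorithm.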
This paper presents an architecture for agents that search for and manipulate objects. It is demonstrated in the SODAJACK system, a system that animates a human working at a soda fountain. The system is constructed as a set of three interacting planners. Two of these planners are special-purpose modules which contribute context-specific plans for the tasks of searching for and manipulating objects. The search planner is used to convert knowledge acquisition goals into goals of searching locations. An object-specific reasoner is used to build object-sensitive plans for manipulating specific objects. Finally, an incremental hierarchical planner is used to integrate these two special-purpose planners into a complete system which interleaves planning and acting.
We present a high-level parallel calculus for nested sequences, NSC, offered as a possible theoretical ``core'' of an entire class of collection-oriented parallel languages. NSC is based on while-loops as opposed to general recursion. A formal, machine independent definition of the parallel time complexity and the work complexity of programs in NSC is given. Our main results are: (1) We give a translation method for a particular form of recursion, called map-recursion, into NSC, that preserves the time complexity and adds an arbitrarily small overhead to the work complexity, and (2) We give a compilation method for NSC into a very simple vector parallel machine, which preserves the time complexity and again adds an arbitrarily small overhead to the work complexity.
Recovering 3D depth information from two or more 2D intensity images is a long-standing problem in the computer vision community. This paper presents a multi-baseline, coarse-to-fine stereo algorithm which utilizes any number of images (more than 2) and multiple image scales to recover 3D depth information. Several ``view-selection strategies'' are introduced for combining information across the multi-baseline and across scale space. The control strategies allow us to exploit, maximally, the benefits of large and small baselines and mask sizes while minimizing errors. Results of recovering 3D depth information from a human head are presented. The resulting depth maps are of good accuracy, with a depth resolution of approximately 5 mm.
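The paper's view-selection strategies are not spelled out in the abstract; the following is only a minimal 1D-scanline sketch of the general multi-baseline idea (in the spirit of sum-of-SSD matching), assuming disparity scales linearly with baseline and ignoring windowing and the coarse-to-fine control:

```python
import numpy as np

def sssd_disparity(ref, others, baselines, max_d):
    """Sum-of-SSD over multiple baselines, 1D scanline version: for each
    candidate disparity d (defined w.r.t. a unit baseline), shift each
    non-reference image by d times its baseline, accumulate squared
    differences against the reference, and keep the minimizing d."""
    cost = np.empty((max_d + 1, len(ref)))
    for d in range(max_d + 1):
        total = np.zeros(len(ref))
        for img, b in zip(others, baselines):
            # disparity scales linearly with baseline
            total += (ref - np.roll(img, -d * b)) ** 2
        cost[d] = total
    return cost.argmin(axis=0)
```

Summing the match cost across baselines is what disambiguates matches that a single image pair would leave ambiguous.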
The psychological mechanism by which even musically untutored people can comprehend novel melodies resembles that by which they comprehend sentences of their native language. The paper identifies a syntax, a semantics, and a domain or ``model''. These elements are examined in application to the task of harmonic comprehension and analysis of unaccompanied melody, and a computational theory is argued for.
Over the past fifteen years, there has been intensive study of formal systems that can model concurrency and communication. Two such systems are the Calculus of Communicating Systems and the Algebra of Communicating Processes. The objective of this paper has two aspects: (1) to study the characteristics and features of these two systems, and (2) to investigate two interesting formal proofs concerning the decidability of bisimulation equivalence in these systems. An examination of the processes that generate context-free languages as a trace set shows that their bisimulation equivalence is decidable, in contrast to the undecidability of their trace-set equivalence; bisimulation equivalence also remains decidable for processes with a limited amount of concurrency.
One source of partial information in databases is the need to combine information from several databases. Even if each database is complete for some ``world'', the combined databases will not be, and answers to queries against such combined databases can only be approximated. In this paper we describe various situations in which a precise answer cannot be obtained for a query asked against multiple databases. Based on an analysis of these situations, we propose a classification of constructs that can be used to model approximations.
One of the main goals is to show that most of these models of approximations possess universality properties. The main motivation for doing this is applying the data-oriented approach, which turns universality properties into syntax, to obtain languages for approximations. We show that the languages arising from the universality properties have a number of limitations. In an attempt to overcome those limitations, we explain how all the languages can be embedded into a language for conjunctive and disjunctive sets from [21], and demonstrate its usefulness in querying independent databases.
A fundamental issue in the automated analysis of concurrent systems is the efficient generation of the reachable state space. Since it is not possible to explore all the reachable states of a system if the number of states is very large or infinite, we need to develop techniques for minimizing the state space. This paper presents our approach of clustering subsets of states into equivalence classes. We assume that concurrent systems are specified as communicating state machines with arbitrary data spaces. We describe a procedure for constructing a minimal reachability state graph from communicating state machines. As an illustration of our approach, we analyze a producer-consumer program written in Ada.
This paper describes an implemented system that generates spoken dialogue, including speech, intonation, and gesture, using two copies of an identical program that differ only in knowledge of the world and which must cooperate to accomplish a goal. The output of the dialogue generation is used to drive a three-dimensional interactive animated model -- two graphic figures on a computer screen who speak and gesture according to the rules of the system. The system is based upon a formal, predictive and explanatory theory of the gesture-speech relationship. A felicitous outcome is a working system to realize autonomous animated conversational agents for virtual reality and other purposes, and a tool for investigating the relationship between speech and gesture.
The correctness of real-time distributed systems depends not only on the function they compute but also on their timing characteristics. Furthermore, those characteristics are strongly influenced by the delays due to synchronization and resource availability. Process algebras have been used successfully to define and prove correctness of distributed systems. More recently, there has been a lot of activity to extend their application to real-time systems. The problem with most current approaches is that they ignore resource constraints and assume either total parallelism (unlimited resources) or total interleaving (a single resource). The Algebra of Communicating Shared Resources (ACSR) is a process algebra designed for the formal specification and manipulation of distributed systems with resource and real-time constraints. A dense time domain provides a more natural way of specifying systems compared to the usual discrete time. Priorities provide a measure of urgency for each action and can be used to ensure that deadlines are met. In ACSR, processes are specified using resource-bound, timed actions and instantaneous synchronization events. Processes can be combined using traditional operators such as nondeterministic choice and parallel execution. Specialized operators allow the specification of real-time behavior and constraints. The semantics of ACSR is defined as a labeled transition system. Equivalence between processes is based on the notion of strong bisimulation. A sound and complete set of algebraic laws can be used to transform any ACSR process into a normal form. In practice, several specifications may satisfy the same requirements with various degrees of desirability. Some may use more resources; some may be faster. In fact, there are many ways to rank processes. We describe a method for defining order relations between execution traces and for further expanding the relation to general processes.
Monotonicity is an important property of operators as it ensures that ordering is preserved by contexts. We study the conditions that must be satisfied by the trace ordering to ensure monotonicity at the process level, both in the prioritized and unprioritized cases. While most operations are monotonic for a large variety of trace relations, few retain this property in a prioritized setting.
Modeling and displaying complex objects in Computer Graphics often requires using multiple levels of detail for display practicality and efficiency. We develop a computational scheme for automatically constructing a hierarchical representation of successively more detailed 3D object surfaces, given a 3D object represented as a connected network of surface points. In our scheme, successive levels of object representation provide progressively detailed descriptions of the object. Thus, each object is represented in a hierarchical format with gradually improving detail.
Dynamic models with global and local deformations were adopted in constructing the hierarchical representation for their reliable recovery procedure and complexity control process. The models can capture the prominent shape of the object with a few global parameters, while their local deformations can represent the detail of the shape. We develop a locally adaptive subdivision technique for finite elements to increase representational efficiency and accuracy in recovery of the dynamic model. With locally adaptive finite elements, we were able to obtain a better fit to the 3D data with significantly fewer elements than with non-subdivided elements. We construct a smooth hierarchy of object details of a complex 3D input object by applying progressively refined deformations to the dynamic models: first by applying global deformations only, then global and local deformations, and finally global and local deformations with various levels of locally adaptive subdivision.
Based on the constructed detail hierarchy, a rendering system is able to display objects at various detail levels, depending on the viewing transformation or user-specified importance. A display mechanism employs a metric based on the screen area of the object, which is used in selecting the appropriate level of detail. We develop an area metric threshold (AMT) predicate in order to determine the threshold values of this metric for each level of description in the hierarchy. We show that our object detail hierarchy affords substantially more efficient rendering than full-detail data sets.
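The AMT predicate itself is not given in the abstract; as a toy sketch of threshold-based level selection (names and units are assumptions), a renderer might pick the most detailed level whose screen-area threshold the object's projection still exceeds:

```python
def select_detail_level(screen_area, thresholds):
    """Pick the most detailed level whose area threshold the object's
    projected screen area still meets.  thresholds[i] is the minimum
    screen area (in pixels) required to display level i; thresholds
    are listed in increasing order of detail."""
    level = 0
    for i, t in enumerate(thresholds):
        if screen_area >= t:
            level = i
    return level
```

Distant or unimportant objects thus project to small screen areas and are drawn from coarse levels of the hierarchy.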
We describe an implemented system which automatically generates and animates conversations between multiple human-like agents with appropriate and synchronized speech, intonation, facial expressions, and hand gestures. Conversations are created by a dialogue planner that produces the text as well as the intonation of the utterances. The speaker/listener relationship, the text, and the intonation in turn drive facial expressions, lip motions, eye gaze, head motion, and arm gesture generators. Coordinated arm, wrist, and hand motions are invoked to create semantically meaningful gestures. Throughout, we will use examples from an actual synthesized, fully animated conversation.
It is an unfortunate reality that in ``logic programming'' as exemplified by Prolog few interesting problems can be solved using the purely logical fragment of the language. The programmer must instead resort to the use of extra-logical features about which sound reasoning is difficult or impossible. A great deal of work in the theory of logic programming languages in the last decade has centered on the question of whether, by basing languages on richer logical systems than Horn clauses, we might obtain a system in which the logical core of the system covers a greater percentage of the problems programmers face.
When logic programming is based on the proof theory of intuitionistic logic, it is natural to allow implication in goals and in the bodies of clauses. Attempting to prove a goal D ⊃ G causes the system to add the assumption D to the current program and then attempt to prove G. This sort of addition to the language has been studied in depth by Miller, Nadathur, Gabbay, and others and has been shown to yield many useful features at the language level, including notions of hypothetical reasoning in databases, and a logic-based module system for logic programming.
There are many problems, though, which cannot be given an attractive treatment in systems based on classical or intuitionistic logic because, in those settings, no control is possible over whether, and how often, a formula is used during a proof; such controls are known as the relevant and affine constraints. This limitation can be addressed by using a fragment of Girard's linear logic to refine the intuitionistic notion of context.
In this thesis I summarize the theory and design of a logic programming language based on linear logic as first proposed by Hodas and Miller in 1991. Several example applications are shown to justify the language design. It is also shown how this system can be seen as specifying a new family of logic grammars for natural language processing which yield perspicuous and natural grammars for handling filler-gap dependencies.
Several aspects of the implementation of the language are discussed. In particular, it is shown how issues that do not arise in implementing traditional systems can prove computationally crippling to a naive implementation of this logic. Therefore a formal operational semantics is proposed which circumvents these problems by delaying, as much as possible, otherwise non-deterministic choices in the proof search.
Finally, a further refinement of the system, in which the programmer can specify the relevant and affine constraints independently, thereby obtaining greater control, is discussed.
Error-free multimedia data processing and communication includes providing guaranteed services such as those of the conventional telephone. A set of problems must be solved and handled in the control-management level of the host and underlying network architectures. We discuss in this paper 'resource management' at the host and network level, and their cooperation to achieve global guaranteed transmission and presentation services, that is, end-to-end guarantees. The emphasis is on 'network resources' (e.g., bandwidth, buffer space) and 'host resources' (e.g., CPU processing time) which need to be controlled in order to satisfy the Quality of Service (QoS) requirements set by the users of the multimedia networked system.
The control of the specified resources involves three actions: (1) properly allocate resources (end-to-end) during the multimedia call establishment, so that traffic can flow according to the QoS specification; (2) control resource allocation during the multimedia transmission; (3) adapt to changes when degradation of system components occurs. These actions imply the necessity of: (a) new services, such as admission services, at the hosts and intermediate network nodes; (b) new protocols for establishing connections which satisfy QoS requirements along the path from sender to receiver(s), such as a resource reservation protocol; (c) new control algorithms for delay, rate, and error control; (d) new resource monitoring protocols for reporting system changes, such as a resource administration protocol; (e) new adaptive schemes for dynamic resource allocation to respond to system changes; and (f) new architectures at the hosts and switches to accommodate the resource management entities.
This article gives an overview of services, mechanisms and protocols for resource management as outlined above.
A task that robotic manipulators most frequently perform is motion between specified points in the working space. It is therefore important that these motions be efficient. The presence of obstacles and other requirements of the task often require that the path be specified in advance. Robot actuators cannot generate unlimited forces/torques, so it is reasonable to ask how to traverse the prespecified path in minimum time without violating the limits on the actuator torques.
It can be shown that the motion which requires the least time to traverse a path requires at least one actuator to operate on its boundary (maximum or minimum). Furthermore, if the path is parameterized, the equations describing the robot dynamics can be rewritten as functions of the path parameter and its first and second derivatives. In general, the actuator bounds are transformed into bounds on the acceleration along the path; these bounds are functions of the velocity and position. It is possible to demonstrate that the optimal motion will almost always be bang-bang in acceleration. The task of finding the optimal torques thus reduces to finding the instants at which the acceleration switches between the boundaries.
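For intuition, the bang-bang structure is easiest to see on a double integrator: accelerate at the maximum, then brake at the maximum, with a single switching instant. This toy rest-to-rest case is our simplification (the thesis treats full, state-dependent robot dynamics along a parameterized path), and it has a closed-form switch time:

```python
import math

def bang_bang_switch(distance, a_max, d_max):
    """Minimum-time rest-to-rest motion over a fixed distance for a
    double integrator: accelerate at a_max for ts seconds, then
    decelerate at d_max.  Matching distance and final velocity gives
    ts = sqrt(2*D*d_max / (a_max*(a_max + d_max))).
    Returns (switch_time, total_time)."""
    ts = math.sqrt(2 * distance * d_max / (a_max * (a_max + d_max)))
    v_peak = a_max * ts
    total = ts + v_peak / d_max      # braking takes v_peak / d_max
    return ts, total
```

In the general problem the acceleration bounds vary with position and velocity along the path, so the switching instants must be found numerically rather than in closed form.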
An algorithm for finding the time-optimal motion along prespecified paths that exploits this idea will be presented. It will be shown that so-called singular arcs exist on which the algorithm will fail, and a modification of the algorithm for such situations will be presented. Also, some properties of the solutions of the more general problem, when the path is not known, will be discussed. Lie-algebraic techniques will be shown to be a convenient tool for the study of such problems.
In this work, a visually realistic and dynamically sound animation of human locomotion is obtained using both kinematic and dynamic techniques. Even though the macro-physical world can be predicted quite accurately by the laws of physics, we show that dynamic techniques alone cannot predict the behavior of a self-actuated system. We present a technique in which kinematics and dynamics are coupled together to form the cerebrum-cerebellum animation mechanism. Kinematic techniques are used to initiate and generate goal-oriented intentional motion, and dynamic techniques are applied at each frame to adjust the kinematic motion to achieve dynamic soundness without affecting the goal achievement. In the locomotion case, dynamic soundness is judged by the dynamic balance of the overall system and by motion comfort, which is a comparative measure of the current joint torques against human strength. The dynamic control module DYNCONTROL performs balance and comfort control based on inverse dynamics and strength data. For the inverse dynamics computation, the Newton-Euler method is applied to a 97-degree-of-freedom human body model to compute the joint forces and torques in real time. The effect of attached loads or external forces is simulated quite accurately. The kinematic locomotion generation module KLOG is responsible for various locomotion primitives and for rhythmic and non-rhythmic variation of such primitives, and covers curved-path walking and running; lateral, backward, and turnaround steps; the transitions between walking and running for motion continuity; etc. Our technique has been applied to areas such as reactive incremental path planning and virtual reality.
The subject of Natural Language Processing can be considered in both broad and narrow senses. In the broad sense, it covers processing issues at all levels of natural language understanding, including speech recognition, syntactic and semantic analysis of sentences, reference to the discourse context (including anaphora, inference of referents, and more extended relations of discourse coherence and narrative structure), conversational inference and implicature, and discourse planning and generation. In the narrower sense, it covers the syntactic and semantic processing of sentences to deliver semantic objects suitable for referring, inferring, and the like. Of course, the results of inference and reference may under some circumstances play a part in processing in the narrow sense. But the processes that are characteristic of these other modules are not the primary concern.
This thesis is concerned with efficient recognition of three-dimensional (3-D) objects using parametric part description. The parametric shape models used are superquadrics, as recovered from depth data. The primary contribution of our research lies in a principled solution to the difficult problems of object part classification and model indexing. The novelty of our approach is in the use of a formal statistical approach for superquadric part classification and a formal evidential framework for model indexing. In addition, the method used for model indexing is amenable to the use of massive parallelism using a connectionist implementation of evidential semantic networks.
A major concern in practical vision systems is how to retrieve the best matched models without exploring all possible object matches. Our approach is to cluster together similar model parts to create prototypical part classes which we term protoparts. Each superquadric part recovered from the input is paired with the best matching protopart using precompiled class statistics. The features used by the classifier are the statistically most significant subset of parameters, computed using principal component analysis. The selected protoparts are used to index into the model database and the retrieved models are ranked by combining the evidence from the protoparts.
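The classification step above can be pictured with a much simplified sketch. Everything below is hypothetical and not the authors' code: a plain variance-based feature ranking stands in for the principal component analysis, and classification pairs a recovered part with the nearest precompiled class mean.

```python
# Illustrative sketch (not the actual system): nearest-mean protopart
# classification over a reduced feature subset. A variance-based ranking
# stands in here for principal component analysis.

def select_features(samples, k):
    """Rank feature indices by variance across samples; keep the top k."""
    n = len(samples)
    dims = len(samples[0])
    variances = []
    for d in range(dims):
        vals = [s[d] for s in samples]
        mean = sum(vals) / n
        variances.append(sum((v - mean) ** 2 for v in vals) / n)
    return sorted(range(dims), key=lambda d: -variances[d])[:k]

def class_means(labelled, features):
    """Precompile per-class mean vectors over the selected features."""
    sums, counts = {}, {}
    for label, vec in labelled:
        proj = [vec[d] for d in features]
        if label not in sums:
            sums[label] = [0.0] * len(features)
            counts[label] = 0
        sums[label] = [a + b for a, b in zip(sums[label], proj)]
        counts[label] += 1
    return {lab: [v / counts[lab] for v in s] for lab, s in sums.items()}

def classify(vec, means, features):
    """Pair a recovered part with the best matching protopart class."""
    proj = [vec[d] for d in features]
    def dist(m):
        return sum((a - b) ** 2 for a, b in zip(proj, m))
    return min(means, key=lambda lab: dist(means[lab]))
```

In the real system the features are superquadric parameters recovered from depth data; here any numeric vectors serve to show the control flow.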
We have implemented a working vision system based on the principles described above. Our work builds on a technique for segmenting dense range data from simple 3-D objects into their constituent parts in terms of surface and volumetric primitives. Our contributions include a bootstrapping technique for dealing with small training databases, a protopart classifier, and an evidential model indexing algorithm. To date, our vision system has been tested on a small database of bottles. Results of recognition experiments with familiar, occluded, and novel objects indicate the promise of our approach of using protoparts as model indexing keys.
We describe a functional database language OR-SML for handling disjunctive information in database queries, and its implementation on top of Standard ML [21]. The core language has the power of the nested relational algebra, and it is augmented with or-sets, which are used to deal with disjunctive information. Sets, or-sets, and tuples can be freely combined to create objects, which gives the language great flexibility. We give examples of queries which require disjunctive information (such as querying incomplete or independent databases) and show how to use the language to answer these queries. Since the system runs on top of Standard ML and all database objects are values in the latter, the system benefits from combining a sophisticated query language with the full power of a programming language. OR-SML includes a number of primitives that deal with bags and aggregate functions. It is also configurable by user-defined base types. The language has been implemented as a library of modules in Standard ML. This allows the user to build just the database language as an independent system, or to interface it to other systems built in Standard ML. We give an example of connecting OR-SML with an already existing interactive theorem prover.
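The flavor of disjunctive querying can be sketched in a hypothetical Python encoding (this is not OR-SML's syntax): an or-set lists mutually exclusive alternatives for a value, and "possible" and "certain" answers to a predicate diverge over it.

```python
# Hypothetical mini-model of or-sets (not the OR-SML implementation):
# an OrSet lists mutually exclusive alternatives for a value.

class OrSet:
    def __init__(self, *alts):
        self.alts = list(alts)

def possible(orset, pred):
    """True if some alternative satisfies pred."""
    return any(pred(a) for a in orset.alts)

def certain(orset, pred):
    """True if every alternative satisfies pred."""
    return all(pred(a) for a in orset.alts)

# Incomplete database entry: the salary is one of these two values.
salary = OrSet(45000, 50000)
possible(salary, lambda s: s > 48000)   # True
certain(salary, lambda s: s > 48000)    # False
```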
The effective communication of information is an important concern in the design of an expert consultation system. Several researchers have chosen to adopt a critiquing mode, in which the system evaluates and reacts to a solution proposed by the user rather than presenting its own solution. In this proposal, I present an architecture for a critiquing system that functions in real time, during the process of developing and executing a management plan in time-critical situations. The architecture is able to take account of and reason about multiple, interacting goals and to identify critical errors in the proposed management plan. This architecture is being implemented as part of the TraumAID system for the management of patients with severe injuries.
As computers become ubiquitous, they are increasingly used in safety-critical environments. Since many safety-critical applications are real-time systems, automated analysis techniques for real-time properties are desirable. The most widely used automated analysis techniques are based on state space exploration, and these techniques suffer from the state space explosion problem. In particular, a real-time system may have an unbounded number of states due to infinitely many possible time values. This paper presents our approach for generating a finite and efficient representation of the reachable states, called a timed reachability graph, for a real-time system specified using a timed automaton, a timed extension of the well-known finite automaton. Our approach for coping with the state explosion problem is to extract timing information from states and to represent it as relative time relations between transitions. We also present an algorithm for computing, from a timed reachability graph, the minimum and maximum time bounds between executions of two actions, in order to determine timing properties.
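The bound computation can be illustrated on a toy example. The sketch below is not the paper's algorithm: it assumes a hypothetical acyclic reachability graph whose edges carry relative time intervals [lo, hi], and accumulates min/max delays over all paths between two states.

```python
# Illustrative sketch: edges of a (hypothetical) reachability graph carry
# relative time intervals (lo, hi) between transitions; compute the
# min/max accumulated delay over all paths. Assumes an acyclic graph.

def time_bounds(graph, src, dst):
    """Return (min, max) accumulated delay over all src->dst paths,
    or None if dst is unreachable from src."""
    if src == dst:
        return (0, 0)
    best = None
    for nxt, (lo, hi) in graph.get(src, []):
        sub = time_bounds(graph, nxt, dst)
        if sub is None:
            continue
        cand = (lo + sub[0], hi + sub[1])
        if best is None:
            best = cand
        else:
            best = (min(best[0], cand[0]), max(best[1], cand[1]))
    return best

graph = {
    "s0": [("s1", (1, 3)), ("s2", (2, 2))],
    "s1": [("s3", (0, 4))],
    "s2": [("s3", (1, 5))],
}
time_bounds(graph, "s0", "s3")   # (1, 7)
```

The real algorithm must also cope with cycles and with intervals induced by clock constraints; this sketch only shows how relative timing between transitions replaces explicit time values in states.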
This paper presents a theory and a computational implementation for generating prosodically appropriate synthetic speech in response to database queries. Proper distinctions of contrast and emphasis are expressed in an intonation contour that is synthesized by rule under the control of a grammar, a discourse model, and a knowledge base. The theory is based on Combinatory Categorial Grammar, a formalism which easily integrates the notions of syntactic constituency, semantics, prosodic phrasing and information structure. Results from our current implementation demonstrate the system's ability to generate a variety of intonational possibilities for a given sentence depending on the discourse context.
Most work on partial information in databases asks which operations of standard languages, like relational algebra, can still be performed correctly in the presence of nulls. In this paper a different point of view is advocated. We believe that the semantics of partiality must be clearly understood, and that it should give us new design principles for languages for databases with partial information.
There are different sources of partial information, such as missing information and the conflicts that occur when different databases are merged. In this paper, we develop a common semantic framework for them which can be applied in a context more general than the flat relational model. This ordered semantics, which is based on ideas used in the semantics of programming languages, cleanly integrates all kinds of partial information and serves as a tool to establish connections between them.
Analyzing properties of semantic domains of types suitable for representing partial information, we come up with operations that are naturally associated with those types, and we organize programming syntax around these operations. We show how the languages that we obtain can be used to ask typical queries about incomplete information in relational databases, and how they can express some previously proposed languages. Finally, we discuss a few related topics such as mixing traditional constraints with partial information and extending semantics and languages to accommodate bags and recursive types.
In this thesis, we investigate modeling, control, and coordination of mobile manipulators. A mobile manipulator in this study consists of a robotic manipulator and a mobile platform, with the manipulator being mounted atop the mobile platform. A mobile manipulator combines the dextrous manipulation capability offered by fixed-base manipulators and the mobility offered by mobile platforms. While mobile manipulators offer a tremendous potential for flexible material handling and other tasks, at the same time they bring about a number of challenging issues rather than simply increasing the structural complexity. First, combining a manipulator and a platform creates redundancy. Second, a wheeled mobile platform is subject to nonholonomic constraints. Third, there exists dynamic interaction between the manipulator and the mobile platform. Fourth, manipulators and mobile platforms have different bandwidths: mobile platforms typically have slower dynamic response than manipulators. The objective of the thesis is to develop control algorithms that effectively coordinate manipulation and mobility of mobile manipulators.
We begin by deriving the motion equations of mobile manipulators. The derivation presented here makes use of the existing motion equations of manipulators and mobile platforms, and introduces the velocity- and acceleration-dependent terms that account for the dynamic interaction between manipulators and mobile platforms. Since nonholonomic constraints play a critical role in control of mobile manipulators, we then study the control properties of nonholonomic dynamic systems, including feedback linearization and internal dynamics. Based on the newly proposed concept of preferred operating region, we develop a set of coordination algorithms for mobile manipulators. While the manipulator performs manipulation tasks, the mobile platform is controlled to always bring the configuration of the manipulator into a preferred operating region. The control algorithms for two types of tasks -- dragging motion and following motion -- are discussed in detail. The effects of the dynamic interaction are also investigated.
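The preferred-operating-region idea can be caricatured in one dimension. The sketch below is only an illustration of the concept, not the thesis's control law: the platform is commanded to move only when the manipulator's extension leaves a preferred range, so the slower platform absorbs gross motion while the manipulator tracks the target.

```python
# Hypothetical 1-D illustration of preferred-operating-region
# coordination (not the thesis's algorithm): the manipulator's
# "extension" is target - platform; if it leaves the preferred region,
# the platform steps toward the target to restore it.

def coordinate(platform, target, region=(0.2, 0.8), gain=0.5):
    """One control step: return the new platform position."""
    lo, hi = region
    ext = target - platform
    if lo <= ext <= hi:
        return platform          # manipulator alone can track the target
    center = (lo + hi) / 2.0
    return platform + gain * (ext - center)

# Dragging motion: the target drifts away, pulling the platform along.
p = 0.0
for target in [0.5, 1.5, 3.0, 3.0]:
    p = coordinate(p, target)
```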
To verify the efficacy of the coordination algorithms, we conduct numerical simulations with representative task trajectories. Additionally, the control algorithms for the dragging motion and following motion have been implemented on an experimental mobile manipulator. The results from the simulation and experiment are presented to support the proposed control algorithms.
Our original paper (Grosz, Joshi, and Weinstein, 1983) on centering claimed that certain entities mentioned in an utterance were more central than others and that this property imposed constraints on a speaker's use of different types of referring expression. Centering was proposed as a model that accounted for this phenomenon. We argued that the compatibility of centering properties of an utterance with choice of referring expression affected the coherence of discourse. Subsequently, we expanded the ideas presented therein. We defined various centering constructs and proposed two centering rules in terms of these constructs. A draft manuscript describing this elaborated centering framework and presenting some initial theoretical claims has been in wide circulation since 1986. This draft (Grosz, Joshi, and Weinstein 1986; hereafter, GJW86) has led to a number of papers by others on this topic and has been extensively cited, but it has never been published.
We have been urged to publish the more detailed description of the centering framework and theory proposed in GJW86 so that an official version would be archivally available. The task of completing and revising this draft became more daunting as time passed and more and more papers appeared on centering. Many of these papers proposed extensions to or revisions of the theory and attempted to answer questions posed in GJW86. It has become ever clearer that it would be useful to have a ``definitive'' statement of the original motivations for centering, the basic definitions underlying the centering framework, and the original theoretical claims. This paper attempts to meet that need. To accomplish this goal, we have chosen to remove descriptions of many open research questions posed in GJW86, as well as solutions that were only partially developed. We have also greatly shortened the discussion of criteria for and constraints on a possible semantic theory as a foundation for this work.
Information stored in databases is usually incomplete. Typical sources of partiality are missing information, conflicts that occur when databases are merged, and asking queries against several databases simultaneously. The field of partial information in databases has not received the attention that it deserves. Most work on partial information in databases asks which operations of standard languages, like relational algebra, can still be performed correctly in the presence of simple forms of partial information. We believe that the problem should be looked at from another point of view: the semantics of partiality must be clearly understood and it should give us new design principles for languages for databases with partial information.
The main goals of this thesis are to develop new analytical tools for studying partial information and its semantics, and to use the semantics of partiality as the basis for design of query languages. Unlike typical research in artificial intelligence, we concentrate on general purpose solutions that are effectively implementable in the context of database query languages and provide a flexible basis for future modeling challenges.
We present a common semantic framework for various kinds of partial information which can be applied in a context more general than the flat relational model. This semantics is based on the idea of ordering objects in terms of being more informative. Such ordered semantics cleanly integrates all kinds of partial information and serves as a tool to establish connections between them. By analyzing mathematical properties of partial data, it is possible to find operations naturally associated with it. Such operations, arising from the characterization of semantic domains of types as free algebras, can be turned into programming language constructs.
We discuss languages for databases with partial information that arise from the semantics. A language for sets and or-sets is introduced and a normalization theorem is proved, which allows the semantics to be incorporated into the language and two levels of querying -- structural and conceptual -- to be distinguished. This language has been implemented on top of Standard ML and shown to be useful in problems of querying independent and incomplete databases.
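The structural/conceptual distinction behind normalization can be sketched as follows. This is a hypothetical Python encoding, not the implemented language: normalization pushes or-sets outward, rewriting a structural object (a set of records whose fields list alternatives) into its conceptual reading, an or-set of fully determined "possible" databases.

```python
# Sketch of the normalization idea (hypothetical encoding): records are
# dicts whose values are lists of mutually exclusive alternatives.
from itertools import product

def normalize(records):
    """Return the or-set (list) of all fully determined databases
    obtainable by choosing one alternative per field per record."""
    per_record = []
    for rec in records:
        keys = list(rec)
        combos = [dict(zip(keys, choice))
                  for choice in product(*(rec[k] for k in keys))]
        per_record.append(combos)
    return [list(world) for world in product(*per_record)]

db = [{"name": ["Ann"], "dept": ["sales", "hr"]},   # dept unknown
      {"name": ["Bob"], "dept": ["it"]}]
worlds = normalize(db)   # two possible databases, one per choice of dept
```

Structural queries operate on the left-hand representation; conceptual queries quantify over the normalized worlds on the right.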
This document describes a modified Jack system for off-line motion production and on-line (real-time) motion playback to an external IRIS-Performer-based host rendering system. This work was done in partial fulfillment of Contract #N61339-94-C-0005 for the US Marine Corps through NAWCTSD (Naval Air Warfare Center, Training Systems Division).
The work described herein was contributed by several of the members of the Center for Human Modeling and Simulation: John Granieri (Design/Engineering/Integration), Rama Bindiganavale (animator, posture transitions), Hanns-Oskar Poor (animator, posture transitions), Hyeongseok Ko (walking and running motion), Michael Hollick (locomotion playback control), Bond-Jay Ting (body sculpting), Francisco Azoula (body sculpting, anthropometry), Pei-Hwa Ho (body normalization), Jonathan Crabtree (Performer, TIPS file format), Xinmin Zhao (slaving), Zhongyang Feng (DIS logfile player), Welton Becket and Barry Reich (terrain reasoning and reactive agent control).
Remote control of robotic manipulation has applications in undersea, shallow space, and low-bandwidth communication environments. Communication delays on the order of several seconds can occur during space operations that use relay stations, undersea operations using acoustic communication channels, or when significant computational delay exists. This system is also applicable to time-varying time delays as experienced when the Internet is used as the medium of communication. The use of supervisory control to address the time delay problem requires a remote manipulator to exhibit some degree of autonomy. This autonomy is necessary to minimize the contribution of communication delay time to task completion time and to relieve the operator of the responsibility for contact control. This dissertation examines the use of behavior-based or subsumption architecture control to provide the required autonomy for the remote manipulator. Behavior-based controllers demonstrate desirable features, including reliability and robust operation in unstructured environments, and, when used in conjunction with operator direction, avoid the need for higher-level representations, which do not fit well into the architecture. We develop a model for communications between the operator and the behavior-based controller and define the human-machine interface for operator direction. We demonstrate the supervisory control system on a GRASP Laboratory mockup of a slicing task performed during a satellite repair operation. In this task a robot cuts securing tape along the seams of the panels of a thermal protection blanket. We tested the validity of the supervisory control system by performing controlled experiments using untrained operators. Quantitative and qualitative results demonstrate that the supervisory control concept is a practical and viable solution to the time delay problem.
Recent paradigms in the fields of robotics and machine perception have emphasized the importance of selective attention mechanisms for perceiving and interacting with the environment. In the case of a system involved in operations requiring a physical interaction with the surrounding environment, a major role is played by the capability of attentively responding to tactile events. By performing somatosensory saccades, the nature of the cutaneous stimulation can be assessed, and new motor actions can be planned. However, the study of touch-driven attention has been almost entirely neglected by robotics researchers. In this paper the development of visuo-cutaneous coordination for the production of somatosensory saccades is investigated, and a general architecture for integrating different kinds of attentive mechanisms is proposed. The system autonomously discovers the sensorimotor transformation which links tactile events to visual saccades, on the basis of multisensory consistencies and basic, built-in motor reflexes. Results obtained both with simulations and with robotic experiments are analyzed.
Although there have been many studies of the kinematics and dynamics of single-arm motions, little has been done in this area with regard to motions involving multiple limbs, especially for the case of dynamic interaction between two arms. The primary goal of this research has been to examine the manipulation of an object requiring the constant use of both arms. Specifically, two extensions of the minimum jerk model [Fla85], which has been successful in past studies of the single-arm case, were developed to account for the case of two arms manipulating a planar object while dynamically coupled. In order to quantitatively study two-arm manipulation by humans, a planar, three-degree-of-freedom manipulandum capable of measuring the positions of and forces at the hands was designed and built. The end-effector of the manipulandum consisted of two handles mounted at opposite ends of an aluminum bar. Volunteer subjects participated in experiments in which they were asked to move between a series of illuminated targets which indicated the desired position and orientation of the bar. Although the proposed models did successfully predict several aspects of the trajectories, they were unable to account for others. However, it may be possible to modify the models to account for the observed behavior by relaxing some of the simplifying assumptions made to facilitate their development. Furthermore, the analysis of the empirical data has also suggested several alternative approaches to modeling two-arm motion.
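For reference, the single-arm minimum jerk model [Fla85] predicts a fifth-order polynomial trajectory with zero velocity and acceleration at both endpoints; a minimal sketch of that baseline (the two-arm extensions studied here build on it):

```python
# Minimum jerk trajectory for a point-to-point movement (the standard
# single-arm form; the thesis extends this to two coupled arms).

def min_jerk(x0, xf, t, T):
    """Position at time t of a minimum-jerk movement from x0 to xf
    lasting T; velocity and acceleration vanish at both endpoints."""
    tau = t / T
    s = 10 * tau**3 - 15 * tau**4 + 6 * tau**5
    return x0 + (xf - x0) * s
```

The midpoint of the movement falls exactly halfway between the endpoints, and the bell-shaped velocity profile peaks there.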
Recent developments in connectionist architectures for symbolic computation have made it possible to investigate parsing in a connectionist network while still taking advantage of the large body of work on parsing in symbolic frameworks. This dissertation investigates syntactic parsing in the temporal synchrony variable binding model of symbolic computation in a connectionist network. This computational architecture solves the basic problem with previous connectionist architectures, while keeping their advantages. However, the architecture does have some limitations, which impose computational constraints on parsing in this architecture. This dissertation argues that, despite these constraints, the architecture is computationally adequate for syntactic parsing, and that these constraints make significant linguistic predictions. To make these arguments, the nature of the architecture's limitations is first characterized as a set of constraints on symbolic computation. This allows the feasibility and implications of parsing in the architecture to be investigated at the same level of abstraction as virtually all other investigations of syntactic parsing. Then a specific parsing model is developed and implemented in the architecture. The extensive use of partial descriptions of phrase structure trees is crucial to the ability of this model to recover the syntactic structure of sentences within the constraints. Finally, this parsing model is tested on those phenomena which are of particular concern given the constraints, and on an approximately unbiased sample of sentences to check for unforeseen difficulties. The results show that this connectionist architecture is powerful enough for syntactic parsing. They also show that some linguistic phenomena are predicted by the limitations of this architecture.
In particular, explanations are given for many cases of unacceptable center embedding, and for several significant constraints on long distance dependencies. These results give evidence for the cognitive significance of this computational architecture and parsing model. This work also shows how the advantages of both connectionist and symbolic techniques can be unified in natural language processing applications. By analyzing how low level biological and computational considerations influence higher level processing, this work has furthered our understanding of the nature of language and how it can be efficiently and effectively processed.
The CCG account of the unbounded constructions -- in particular, relativisation and coordination -- generalises the notion of surface structure in a way that disrupts traditional notions of dominance and command. This has led researchers in other frameworks to suggest that the theory is fundamentally incompatible with a coherent theory of binding and control -- the bounded constructions. The present paper offers a theory of binding in CCG which preserves the original account of the unbounded dependencies, and which renders it immediately compatible with other theories, TAG in particular. The theory requires the abandonment of one assumption that has been traditional (though not essential) in other categorial approaches. The significance of this move is discussed.
We investigate queries in the presence of external functions with arbitrary inputs and outputs (atomic values, sets, nested sets etc). We propose a new notion of domain independence for queries with external functions which, in contrast to previous work, can also be applied to query languages with fixpoints or other kinds of iterators. Next, we define two new notions of computable queries with external functions, and prove that they are equivalent, under the assumption that the external functions are total. Thus, our definition of computable queries with external functions is robust. Finally, based on the equivalence result, we give examples of complete query languages with external functions. A byproduct of the equivalence result is the fact that Relational Machines are complete for complex objects: it was known that they are not complete over flat relations.
This paper develops a new class of physics-based deformable models which can deform both globally and locally. Their global parameters are functions, allowing the definition of new parameterized primitives and parameterized global deformations. These new global parameter functions improve the accuracy of shape description through the use of a few intuitive parameters such as functional bending and twisting. Using a physics-based approach, we convert these geometric models into deformable models that deform due to forces exerted by the data points so as to conform to the given dataset. We present an experiment involving the extraction of the shape and motion of the Left Ventricle (LV) of a heart from MRI-SPAMM data based on a few global parameter functions.
We develop a new method for analyzing the motion of the left ventricle (LV) of a heart from tagged MRI data. Our technique is based on the development of a new class of physics-based deformable models whose parameters are functions, allowing the definition of new parameterized primitives and parameterized deformations. These parameter functions improve the accuracy of shape description through the use of a few intuitive parameters such as functional twisting. Furthermore, these parameters require no complex post-processing in order to be used by a physician. Using a physics-based approach, we convert these geometric models into deformable models that deform due to forces exerted by the data points and conform to the given dataset. We present experiments involving the extraction of the shape and motion of the LV from MRI-SPAMM data based on a few parameter functions. Furthermore, by plotting the variations over time of the extracted model parameters from normal and abnormal heart data, we are able to characterize their differences quantitatively.
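A parameter function can be pictured with a hypothetical illustration in the spirit of these papers (not their actual model): a twist about the z-axis whose angle is a function of height rather than a single constant, so one intuitive function captures "functional twisting" along the model's axis.

```python
# Illustrative parameterized global deformation (hypothetical example):
# a twist about the z-axis whose angle varies with height via angle_fn.
from math import cos, sin

def twist(points, angle_fn):
    """Rotate each (x, y, z) about the z-axis by angle_fn(z)."""
    out = []
    for x, y, z in points:
        a = angle_fn(z)
        out.append((x * cos(a) - y * sin(a),
                    x * sin(a) + y * cos(a),
                    z))
    return out

# Linear twist: points higher up the axis are rotated further. In the
# fitting framework, angle_fn itself would be estimated from the forces
# exerted by the data points.
twisted = twist([(1.0, 0.0, 0.0), (1.0, 0.0, 1.0)],
                lambda z: z * 3.141592653589793 / 2)
```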
The theory of cut-free sequent proofs has been used to motivate and justify the design of a number of logic programming languages. Two such languages, λProlog and its linear logic refinement, Lolli, provide for various forms of abstraction (modules, abstract data types, higher-order programming) but lack primitives for concurrency. The logic programming language LO (Linear Objects) provides for concurrency but lacks abstraction mechanisms. In this paper we present Forum, a logic programming presentation of all of linear logic that modularly extends the languages λProlog, Lolli, and LO. Forum, therefore, allows specifications to incorporate both abstractions and concurrency. As a meta-language, Forum greatly extends the expressiveness of these other logic programming languages. To illustrate its expressive strength, we specify in Forum a sequent calculus proof system and the operational semantics of a functional programming language that incorporates such nonfunctional features as counters and references.
This thesis explores issues related to using a restricted mathematical formalism as the formal basis for the representation of syntactic competence and the modeling of performance. The specific contribution of this thesis is to examine a language with considerably freer word order than English, namely German, and to investigate the formal requirements that this syntactic freedom imposes. Free word order (or free constituent order) languages can be seen as a test case for linguistic theories, since presumably stricter word order can be subsumed by an apparatus that accounts for freer word order.
The formal systems investigated in this thesis are based on the tree adjoining grammar (TAG) formalism of Joshi et al. (1975). TAG is an appealing formalism for the representation of natural language syntax because its elementary structures are phrase structure trees, which allows the linguist to localize linguistic dependencies such as agreement, subcategorization, and filler-gap relations, and to develop a theory of grammar based on the lexicon.
The main results of the thesis are an argument that simple TAGs are formally inadequate, and the definition of an extension of TAG that is adequate. Every aspect of the definition of this extension to TAG, called V-TAG, is specifically motivated by linguistic facts, not by formal considerations. A formal investigation of V-TAG reveals that (when lexicalized) it has restricted generative capacity, that it is polynomially parsable, and that it forms an abstract family of languages. This means that it has desirable formal properties for representing natural language syntax. Both a formal automaton and a parser for V-TAG are presented. As a consequence of the new system, a reformulation of the linguistic theory that has been proposed for TAG suggests itself. Instead of including a transformational step in the theory of grammar, all derivations are performed within mathematically defined formalisms, thus limiting the degrees of freedom in the linguistic theory and making the theory more appealing from a computational point of view. This has several interesting linguistic consequences; for instance, functional categories are expressed by feature content (not node labels), and head movement is replaced by the adjunction of heads. The thesis sketches a fragment of a grammar of German which covers phenomena such as scrambling, extraposition, topicalization, and the V2 effect.
Finally, the formal automaton for V-TAG is used as a model of human syntactic processing. It is shown that this model makes several interesting predictions related to free word order in German.
In recent years, it has become common in the linguistics and philosophy literature to assume that events and processes are ontologically distinct entities, on a par with objects and substances. At the same time, the idea that time-based (episodic) knowledge should be represented as a collection of interrelated eventualities has gained increasing acceptance in the computational linguistics and artificial intelligence literature.
Contrary to what one might expect, a search through the prior literature in linguistics and philosophy reveals no account in which these sortal distinctions play a direct role in adequately explaining the problem of aspectual composition and the closely related imperfective paradox. In the computational linguistics and artificial intelligence literature, moreover, relatively little attention has been paid to either problem.
In the first part of the dissertation, I investigate the hypothesis that the parallel ontological distinctions introduced above may be directly employed in an explanatory formal account of the problem of aspectual composition and the imperfective paradox. In so doing, I develop a synthesis of proposals by Hinrichs (1985), Krifka (1989; 1992) and Jackendoff (1991) which makes correct predictions in many cases not considered by these authors. In particular, the account is the first to adequately explain the syntactic and semantic behavior of non-individuating accomplishment expressions, such as Jack poured some amount of wort into the carboy, which are too vague to individuate a single event but nevertheless behave like other Vendlerian accomplishments.
In the second part of the dissertation, I explore the potential computational applications of the linguistic account, by way of two case studies. In the first one, I follow Moens (1987) in showing how a calculus of eventualities can facilitate the implementation of a simple statement verifier which allows for a much greater range of natural language queries than is usually the case with temporal databases. In the second, more preliminary study, I examine the relevance of the model-theoretic analysis to discourse interpretation, within the context of devising a program which produces simple microworld animations using short narrative descriptions as input specifications.
This dissertation investigates a new approach to query languages inspired by structural recursion and by the categorical notion of monad.
A language based on these principles has been designed and studied. It is found to have the strength of several widely known relational languages but without their weaknesses. This language and its various extensions are shown to exhibit a conservative extension property, which indicates that the depth of nesting of collections in intermediate data has no effect on their expressive power. These languages also exhibit the finite-cofiniteness property on many classes of queries. These two properties provide easy answers to several hitherto unresolved conjectures on query languages that are more realistic than the flat relational algebra.
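The core construct behind such monad-based query languages is structural recursion via the monad's extension (flat-map) operator. The sketch below is our own illustration over Python sets, with hypothetical data; it is not CPL's actual syntax.

```python
# A minimal sketch of the set monad's extension operator, the building
# block of structural-recursion query languages (illustration only;
# not CPL syntax, and the data is hypothetical).

def ext(f, xs):
    """Apply f (element -> set) to every element and union the results:
    the monad 'bind', from which selections, projections, and joins
    can all be expressed."""
    out = set()
    for x in xs:
        out |= f(x)
    return out

# Nested data: (department, set-of-employees) pairs.
depts = {("sales", frozenset({"ann", "bob"})),
         ("dev", frozenset({"eve"}))}

# Flatten one level of nesting. The conservative extension property
# says such intermediate nesting adds no expressive power.
employees = ext(lambda d: set(d[1]), depts)
print(sorted(employees))  # -> ['ann', 'bob', 'eve']
```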
A useful rewrite system has been derived from the equational theory of monads. It forms the core of a source-to-source optimizer capable of performing filter promotion, code motion, and loop fusion. Scanning and printing routines are treated as part of the optimization process. An operational semantics that blends eager and lazy evaluation is suggested in conjunction with these input-output routines. This strategy leads to a reduction in space consumption and a faster response time while preserving good total-time performance. Additional optimization rules have been systematically introduced to cache and index small relations, to map monad operations to several classical join operators, to cache large intermediate relations, and to push monad operations to external servers.
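One representative rule of such a rewrite system is the fusion law that follows from the monad's associativity. The snippet below (our illustration, with made-up functions) checks that the fused, single-pass form computes the same result as the unfused form:

```python
# Loop fusion from the monad associativity law (illustration only):
#   ext(f, ext(g, xs)) == ext(lambda x: ext(f, g(x)), xs)
# The right-hand side avoids materializing the intermediate collection.

def ext(f, xs):
    out = set()
    for x in xs:
        out |= f(x)
    return out

g = lambda n: {n, n + 10}   # hypothetical per-element expansion
f = lambda n: {n * 2}       # hypothetical per-element transform

xs = {1, 2, 3}
unfused = ext(f, ext(g, xs))             # two passes, one intermediate set
fused = ext(lambda x: ext(f, g(x)), xs)  # one pass, no intermediate set
assert unfused == fused == {2, 4, 6, 22, 24, 26}
```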
A query system, Kleisli, and a high-level query language for it, CPL, have been built on top of the functional language ML. Many of my theoretical and practical contributions have been physically realized in Kleisli and CPL. In addition, I have explored the idea of an open system in my implementation. Dynamic extension of the system with new primitives, cost functions, optimization rules, scanners, and writers is fully supported. As a consequence, my system can be easily connected to external data sources. In particular, it has been successfully applied to integrate several genetic data sources, including relational databases, structured files, and data generated by special application programs.
The increasing bandwidth-delay product of high-speed wide-area networks is well known to make conventional dynamic traffic control schemes ``sluggish''. Still, most existing schemes employ dynamic control, among which TCP and the ATM Forum's rate-based flow control are prominent examples. So far, little has been investigated as to how the existing schemes will scale as bandwidth increases to gigabit speed and beyond. Our investigation in this paper is the first to show that dynamic control has a severe scalability problem as bandwidth increases, and to propose an entirely new approach to traffic control that overcomes this problem. The essence of our approach is to exercise control in the bandwidth domain rather than the time domain, in order to avoid time delay in control. This requires more bandwidth than the timed counterpart, but achieves much faster control. Furthermore, the bandwidth requirement is not excessively large, because bandwidth is traded for smaller control delay; accordingly, we call our approach Bandwidth-Latency Tradeoff (BLT). While the control in existing schemes is bound to delay, BLT is bound to bandwidth. As a consequence, BLT scales with bandwidth increase, rather than increasingly deteriorating as conventional schemes do. Surprisingly, our approach begins to pay off much earlier than expected, even at a point where the bandwidth-delay product is not so large. For instance, in a roughly AURORA-sized network, BLT far outperforms TCP on a shared 150 Mbps link, where the bandwidth-delay product is around 60 KB. At the other extreme, where the bandwidth-delay product is large, BLT outperforms TCP by as much as twenty times in terms of network power in a gigabit nationwide network. More importantly, BLT is designed to continue to scale with bandwidth increase, and the performance gap is expected to widen further.
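As a back-of-the-envelope check of the figures quoted above, the bandwidth-delay product is simply bandwidth times round-trip delay. The helper below is ours, and the ~3.2 ms delay is inferred from the abstract's 150 Mbps / 60 KB example rather than stated in it:

```python
def bdp_bytes(bandwidth_bps, rtt_s):
    """Bandwidth-delay product: bits in flight on the path, in bytes."""
    return bandwidth_bps * rtt_s / 8

# 150 Mbps with a ~3.2 ms round trip gives the ~60 KB quoted for the
# AURORA-sized example (the 3.2 ms figure is our inference).
print(bdp_bytes(150e6, 3.2e-3))  # -> 60000.0
```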
The human face is an important and complex communication channel, and a very familiar and sensitive object of human perception. The facial animation field has grown greatly in the past few years as fast computer graphics workstations have made the modeling and real-time animation of hundreds of thousands of polygons affordable and almost commonplace. Many applications have been developed, such as teleconferencing, surgery, information assistance systems, games, and entertainment. To solve these different problems, different approaches to both animation control and modeling have been developed.
We present a novel, robust, integrated approach to the segmentation, shape, and motion estimation of articulated objects. Initially, we assume the object consists of a single part, and we fit a deformable model to the given data using our physics-based framework. As the object attains new postures, we decide, based on certain criteria, if and when to replace the initial model with two models. These criteria are based on the model's state and the given data. We then fit the models to the data using a novel algorithm for assigning forces from the data to the two models, which allows partial overlap between them and determination of the joint location. This approach is applied iteratively until all the object's moving parts are identified. Furthermore, we define new global deformations, and we demonstrate our technique in a series of experiments in which Kalman filtering is employed to account for noise and occlusion.
We present an integrated approach to the segmentation and shape estimation of human body outlines. Initially, we assume that the human body consists of a single part, and we fit a deformable model to the given data using our physics-based shape and motion estimation framework. As an actor attains different postures, new protrusions emerge on the outline. We model these changes in shape using a new representation scheme consisting of a parametric composition of deformable models. This representation allows us to identify the underlying human parts that gradually become visible, by monitoring the evolution of the shape and motion parameters of the composed models. Based on these parameters, the joint locations are identified. Our algorithm is applied iteratively over subsequent frames until all moving parts are identified. We demonstrate our technique in a series of experiments with very encouraging results.
It was observed by Curry that when (untyped) $\lambda$-terms can be assigned types, for example, simple types, these terms have nice properties (for example, they are strongly normalizing). Coppo, Dezani, and Veneri introduced type systems using conjunctive types, and showed that several important classes of (untyped) terms can be characterized according to the shape of the types that can be assigned to these terms. For example, the strongly normalizable terms, the normalizable terms, and the terms having head-normal forms can be characterized in the systems $\mathcal{D}$ and $\mathcal{D}\Omega$. Our proofs use a new and more modular version of the reducibility method. As an application of our metatheorems, we show how the characterizations obtained by Coppo, Dezani, Veneri, and Pottinger can be easily rederived. We also characterize the terms that have weak head-normal forms, which appears to be new. We conclude by stating a number of challenging open problems regarding possible generalizations of the realizability method.
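For concreteness, the characteristic rules of conjunctive (intersection) types, rendered here in our own notation rather than necessarily the paper's, allow a single term to inhabit both conjuncts:

```latex
\frac{\Gamma \vdash M : \sigma \qquad \Gamma \vdash M : \tau}
     {\Gamma \vdash M : \sigma \wedge \tau}\;(\wedge\text{-intro})
\qquad
\frac{\Gamma \vdash M : \sigma \wedge \tau}
     {\Gamma \vdash M : \sigma}\;(\wedge\text{-elim})
```

It is the introduction rule that gives these systems their characterizing power: a self-application such as $\lambda x.\, x\,x$, untypable with simple types, receives the type $(\sigma \wedge (\sigma \to \tau)) \to \tau$ once $x$ may be used at both conjuncts.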
The main purpose of this paper is to take apart the reducibility method in order to understand how its pieces fit together, and in particular, to recast the conditions on candidates of reducibility as sheaf conditions. There has been a feeling among experts on this subject that it should be possible to present the reducibility method using more semantic means, and that a deeper understanding would then be gained. This paper gives mathematical substance to this feeling by presenting a generalization of the reducibility method based on a semantic notion of realizability which uses the notion of a cover algebra (as in abstract sheaf theory). A key technical ingredient is the introduction of a new class of semantic structures equipped with preorders, called pre-applicative structures. These structures need not be extensional. In this framework, a general realizability theorem can be shown. Kleene's recursive realizability and a variant of Kreisel's modified realizability both fit into this framework. We are then able to prove a meta-theorem which shows that if a property of realizers satisfies some simple conditions, then it holds for the semantic interpretations of all terms. Applying this theorem to the special case of the term model yields a general theorem for proving properties of typed $\lambda$-terms, in particular, strong normalization and confluence. This approach clarifies the reducibility method by showing that the closure conditions on candidates of reducibility can be viewed as sheaf conditions. The above approach is applied to the simply-typed $\lambda$-calculus (with types $\to$, $\times$, $+$, and $\bot$), and to the second-order (polymorphic) $\lambda$-calculus (with types $\to$ and $\forall_2$), for which it yields a new theorem.
Determining the structure of the world and the motion of the observer from image changes has been a central problem in computer vision for over fifteen years. Since the early work on Structure from Motion (SFM) by Longuet-Higgins [4] and Prazdny [6], several techniques have been developed to compute the motion of the camera, the shape of moving objects, or distances to points in the world. However, the image changes are non-linearly related to camera motion and distances to points in the world. Thus, solving the problem typically requires non-linear optimization techniques that can be unstable or computationally inefficient. Linear algorithms are preferable since they are computationally advantageous, and since linear estimation is much better understood than non-linear estimation. Our paper describes an unbiased, completely linear algorithm for Structure from Motion. This work is similar to that of Jepson & Heeger [3], except that we employ spherical projection. The use of a spherical imaging geometry allows a simpler and more intuitive derivation of the algorithm, and produces an unbiased estimator. Experimental results are provided that demonstrate the performance of the algorithm.
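The equation underlying such algorithms is the motion field under spherical projection; the rendering below is our sketch of the standard textbook form, not necessarily the paper's notation. For a unit viewing direction $\mathbf{p}$ toward a scene point at distance $R$, with camera translation $\mathbf{T}$ and rotation $\boldsymbol{\Omega}$, the image velocity is

```latex
\dot{\mathbf{p}}
  = -\,\boldsymbol{\Omega} \times \mathbf{p}
    \;-\; \frac{1}{R}\,\bigl(\mathbf{T} - (\mathbf{T}\cdot\mathbf{p})\,\mathbf{p}\bigr)
```

The rotational term is independent of depth, while the translational term enters scaled by inverse depth $1/R$; this clean separation, free of the perspective-division nonlinearity of planar projection, is what makes linear estimation of the motion parameters possible.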
In computer vision for object recognition or navigation, shadows are a frequent occurrence. However, shadows are difficult to handle because they cannot be infallibly recognized until a scene's geometry and lighting are known. We present a number of cues which together strongly suggest the identification of a shadow and which can be examined without a high computational cost. The techniques developed are: a color model for shadows, and a color image segmentation method that recovers single material surfaces as single image regions regardless of whether the surface is partially in shadow; a method to recover the penumbra and umbra of shadows; and a method for determining whether some object could be obstructing a light source. These cues either depend on, or their reliability improves with, the examination of some well-understood shadows in a scene. Our observer is equipped with an extendable probe for casting its own shadows. These actively obtained shadows allow the observer to experimentally determine the number, location, and rough extent of the light sources in the scene. The system has been tested in a variety of indoor and outdoor environments.
We describe the ZAROFF system, a plan-based controller for the player who is ``it'' in a game of hide and seek. The system features visually realistic human figure animation, including realistic human locomotion. We discuss the planner's interaction with a changing environment to which it has only limited perceptual access.
This paper addresses various issues related to representing the Korean language using Tree Adjoining Grammars. Topics covered include a Korean grammar using TAGs, machine translation between Korean and English using Synchronous Tree Adjoining Grammars (STAGs), handling scrambling using Multi-Component TAGs (MC-TAGs), and recovering empty arguments. The data for the parsing is from US military telecommunication messages.