| Component | Details |
|---|---|
| Morphological Analyzer and Morph Database | Consists of approximately 317,000 inflected items derived from over 90,000 stems. Entries are indexed on the inflected form and return the root form, POS, and inflectional information. |
| POS Tagger and Lex Prob Database | Wall Street Journal-trained trigram tagger [kwc88] extended to output N-best POS sequences [soong90]. Decreases the time to parse a sentence by an average of 93%. |
| Syntactic Database | More than 30,000 entries. Each entry consists of: the uninflected form of the word, its POS, the list of trees or tree families associated with the word, and a list of feature equations that capture lexical idiosyncrasies. |
| Tree Database | 1094 trees, divided into 52 tree families and 218 individual trees. Tree families represent subcategorization frames; the trees in a tree family would be related to each other transformationally in a movement-based approach. |
| X-Interface | Menu-based facility for creating and modifying tree files. User-controlled parser parameters: parser's start category, enable/disable/retry on failure for POS tagger. Storage/retrieval facilities for elementary and parsed trees. Graphical displays of tree and feature data structures. Hand combination of trees by adjunction or substitution for grammar development. Ability to manually assign POS tag and/or Supertag before parsing. |
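
The two lexical databases in the table can be read as a two-stage lookup: the morphological database maps an inflected form to its root, POS, and inflectional information, and the syntactic database maps the uninflected form and POS to trees or tree families plus feature equations. The following is a minimal Python sketch of that reading, not the system's actual storage format or API. All class names, field names, tree labels (`INTRANSITIVE_FAMILY`, `NOUN_TREE`, and so on), the sample feature equation, and the toy entries are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class MorphAnalysis:
    """One analysis of an inflected form: root, POS, inflectional information."""
    root: str
    pos: str
    inflection: Tuple[str, ...]


@dataclass(frozen=True)
class SyntacticEntry:
    """One syntactic database entry for an uninflected form and POS."""
    root: str
    pos: str
    trees: Tuple[str, ...]                   # names of trees or tree families
    feature_equations: Tuple[str, ...] = ()  # lexical idiosyncrasies


# Toy morphological database, indexed on the inflected form (invented data).
MORPH_DB: Dict[str, List[MorphAnalysis]] = {
    "walks": [
        MorphAnalysis("walk", "V", ("3sg", "PRES")),
        MorphAnalysis("walk", "N", ("3pl",)),
    ],
    "walked": [MorphAnalysis("walk", "V", ("PAST",))],
}

# Toy syntactic database, indexed on (uninflected form, POS) (invented data).
SYN_DB: Dict[Tuple[str, str], List[SyntacticEntry]] = {
    ("walk", "V"): [
        SyntacticEntry("walk", "V", ("INTRANSITIVE_FAMILY", "TRANSITIVE_FAMILY")),
    ],
    ("walk", "N"): [
        # The feature equation string is a placeholder, not real grammar notation.
        SyntacticEntry("walk", "N", ("NOUN_TREE",), ("<countable> = +",)),
    ],
}


def lookup(inflected: str) -> List[Tuple[MorphAnalysis, SyntacticEntry]]:
    """Pair every morphological analysis of a word with its syntactic entries."""
    results: List[Tuple[MorphAnalysis, SyntacticEntry]] = []
    for analysis in MORPH_DB.get(inflected, []):
        for entry in SYN_DB.get((analysis.root, analysis.pos), []):
            results.append((analysis, entry))
    return results


if __name__ == "__main__":
    for analysis, entry in lookup("walks"):
        print(analysis.pos, analysis.inflection, entry.trees)
```

In this sketch an ambiguous form such as "walks" yields one result per (POS, syntactic entry) pair, which is the kind of ambiguity a POS tagger placed before the parser would be expected to prune.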