Component | Details |
Morphological Analyzer and Morph Database | Consists of approximately 317,000 inflected items derived from over 90,000 stems. Entries are indexed on the inflected form and return the root form, POS, and inflectional information. |
POS Tagger and Lex Prob Database | Wall Street Journal-trained trigram tagger ([#!kwc88!#]) extended to output N-best POS sequences ([#!soong90!#]). Decreases the time to parse a sentence by an average of 93%. |
Syntactic Database | More than 30,000 entries. Each entry consists of: the uninflected form of the word, its POS, the list of trees or tree families associated with the word, and a list of feature equations that capture lexical idiosyncrasies. |
Tree Database | 1004 trees, divided into 53 tree families and 221 individual trees. Tree families represent subcategorization frames; the trees in a tree family would be related to each other transformationally in a movement-based approach. |
X-Interface | Menu-based facility for creating and modifying tree files. User-controlled parser parameters: the parser's start category, and enable/disable/retry-on-failure settings for the POS tagger. Storage/retrieval facilities for elementary and parsed trees. Graphical displays of tree and feature data structures. Hand combination of trees by adjunction or substitution for grammar development. Ability to manually assign a POS tag and/or Supertag before parsing. |
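
The lookup chain described in the table (inflected form to root, POS, and inflectional information, then to the trees and feature equations listed for that root and POS) can be sketched roughly as follows. This is only an illustrative sketch, not the system's actual storage format or code: the class names, field names, the `lookup` helper, and the toy entries with their placeholder tree names are all assumptions invented for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class MorphAnalysis:
    """One analysis returned for an inflected form (hypothetical layout)."""
    root: str        # uninflected (root) form
    pos: str         # part of speech
    inflection: str  # inflectional information


@dataclass
class SyntacticEntry:
    """One syntactic database entry (hypothetical layout)."""
    root: str
    pos: str
    # Names of trees or tree families associated with the word.
    trees: List[str] = field(default_factory=list)
    # Feature equations capturing lexical idiosyncrasies.
    feature_equations: List[str] = field(default_factory=list)


# Toy data, invented for illustration only. The morphological database is
# indexed on the inflected form; one form may have several analyses.
MORPH_DB: Dict[str, List[MorphAnalysis]] = {
    "walks": [
        MorphAnalysis("walk", "V", "3sg PRES"),
        MorphAnalysis("walk", "N", "PL"),
    ],
}

# The syntactic database is assumed here to be keyed on (root, POS).
# Tree names are placeholders, not real tree or family names.
SYN_DB: Dict[Tuple[str, str], SyntacticEntry] = {
    ("walk", "V"): SyntacticEntry("walk", "V", trees=["FAMILY_INTRANSITIVE"]),
    ("walk", "N"): SyntacticEntry("walk", "N", trees=["TREE_NOUN"]),
}


def lookup(inflected: str) -> List[Tuple[MorphAnalysis, SyntacticEntry]]:
    """Pair each morphological analysis with its syntactic entry, if any."""
    results = []
    for analysis in MORPH_DB.get(inflected, []):
        entry = SYN_DB.get((analysis.root, analysis.pos))
        if entry is not None:
            results.append((analysis, entry))
    return results


if __name__ == "__main__":
    for analysis, entry in lookup("walks"):
        print(analysis.root, analysis.pos, analysis.inflection, entry.trees)
```

In a pipeline of this shape, a POS tagger would presumably narrow the set of analyses before tree selection, which is one plausible reading of why the table credits the tagger with reducing parse time.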