Component | Details |
Morphological Analyzer and Morph Database | Consists of approximately 317,000 inflected items derived from over 90,000 stems. Entries are indexed on the inflected form and return the root form, POS, and inflectional information. |
POS Tagger and Lex Prob Database | Wall Street Journal-trained trigram tagger ([#!kwc88!#]) extended to output N-best POS sequences ([#!soong90!#]). Decreases the time to parse a sentence by an average of 93%. |
Syntactic Database | More than 30,000 entries. Each entry consists of: the uninflected form of the word, its POS, the list of trees or tree-families associated with the word, and a list of feature equations that capture lexical idiosyncrasies. |
Tree Database | 1094 trees, divided into 52 tree families and 218 individual trees. Tree families represent subcategorization frames; the trees in a tree family would be related to each other transformationally in a movement-based approach. |
X-Interface | Menu-based facility for creating and modifying tree files. User-controlled parser parameters: parser's start category, enable/disable/retry on failure for POS tagger. Storage/retrieval facilities for elementary and parsed trees. Graphical displays of tree and feature data structures. Hand combination of trees by adjunction or substitution for grammar development. Ability to manually assign POS tag and/or Supertag before parsing. |
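
The table describes a two-stage lookup: the morphological database maps an inflected form to its root, POS, and inflectional information, and the syntactic database maps the uninflected form and POS to trees or tree families plus feature equations. The sketch below illustrates that flow under stated assumptions. All names (`MorphAnalysis`, `SyntacticEntry`, `MORPH_DB`, `SYN_DB`, `lookup`), the toy entries, the tree-family labels, and the feature-equation strings are hypothetical and are not taken from the actual databases or their file formats.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class MorphAnalysis:
    """One analysis returned by the (hypothetical) morphological database."""
    root: str        # uninflected form
    pos: str         # part of speech
    inflection: str  # inflectional information


@dataclass
class SyntacticEntry:
    """One entry of the (hypothetical) syntactic database."""
    root: str
    pos: str
    trees: List[str]  # names of trees or tree families
    feature_equations: List[str] = field(default_factory=list)


# Toy data, invented for illustration only.
# The morphological database is indexed on the inflected form.
MORPH_DB: Dict[str, List[MorphAnalysis]] = {
    "walks": [
        MorphAnalysis("walk", "V", "3sg PRES"),
        MorphAnalysis("walk", "N", "PL"),
    ],
    "walked": [MorphAnalysis("walk", "V", "PAST")],
}

# The syntactic database is keyed here on (uninflected form, POS).
SYN_DB: Dict[Tuple[str, str], SyntacticEntry] = {
    ("walk", "V"): SyntacticEntry(
        "walk", "V",
        trees=["FAMILY_INTRANSITIVE", "FAMILY_TRANSITIVE"],  # placeholder names
        feature_equations=["hypothetical-equation-1"],
    ),
    ("walk", "N"): SyntacticEntry("walk", "N", trees=["TREE_NOUN"]),
}


def lookup(word: str) -> List[Tuple[MorphAnalysis, SyntacticEntry]]:
    """Return every (morphological analysis, syntactic entry) pair for a word.

    Analyses whose (root, POS) pair has no syntactic entry are skipped.
    """
    pairs = []
    for analysis in MORPH_DB.get(word, []):
        entry = SYN_DB.get((analysis.root, analysis.pos))
        if entry is not None:
            pairs.append((analysis, entry))
    return pairs


if __name__ == "__main__":
    for analysis, entry in lookup("walks"):
        print(analysis.root, analysis.pos, analysis.inflection, entry.trees)
```

Keying the second dictionary on the (root, POS) pair is a design choice of this sketch; it keeps an ambiguous form such as "walks" from mixing verb and noun trees. A POS tagger placed before the lookup would reduce the number of analyses passed on to the parser.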