Machine Learning and Natural Language

Fall 2005

Experimental Assignment III: Textual Entailment

(Due 12/12/05)

General

The assignment will be done in teams. There are 6 participating teams (there is no Preparation and Evaluation Team this time).
Your final report on the assignment is due on Monday, Dec. 12. At that time, each team will turn in its results along with a short report describing what you did, what difficulties you encountered, and what you concluded.
A meeting devoted to presenting the results will be held on the evening of 12/12, in place of a final exam in the class.
Feel free to send me e-mail or come to ask questions.

Textual Entailment

Textual Entailment is the task of determining, for example, that the sentence "WalMart defended itself in court today against claims that its female employees were kept out of jobs in management because they are women" entails that "WalMart was sued for sexual discrimination". Determining whether the meaning of a given text snippet entails that of another, or whether the two have the same meaning, is a fundamental problem in natural language understanding. It requires the ability to abstract over the inherent syntactic and semantic variability in natural language. This challenge is at the heart of many high-level natural language processing tasks, including Question Answering, Information Retrieval and Extraction, Machine Translation, and others that attempt to reason about and capture the meaning of linguistic expressions.

This task was defined only recently, but it is already a very active area of research.

The Assignment

You will be given two collections of pairs (t, h), where t is a text and h is a hypothesis that t may or may not entail. The first collection is the development set; you can use it to develop your strategy: look at the data, study it in different ways, train classifiers on it if you'd like, etc.

A week before the assignment is due I will give you a second set of pairs, the test set. Your goal is to achieve good results on this set, but you will evaluate and report your results on both sets.

Both collections are annotated with a task and with the entailment classification (True/False); needless to say, the annotation of the test set will be used only for evaluating your results.

Data

The development data is available here.
In addition to the raw sentence pairs (in an XML file), the data has been processed by a semantic role labeling program and is available in a column format.
The test data is available here.
In addition to the raw sentence pairs (in an XML file), the test data has been processed by a semantic role labeling program and is available in a column format.
An evaluation script is available here.
Please note that the data was processed with an earlier version of the semantic role labeler; you can process the data yourself with this tool or with any other tools.
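
For teams that want to read the raw pairs directly, a minimal loading sketch follows. The element and attribute names it uses (`pair`, `id`, `task`, `value`, `t`, `h`) are assumptions about the file layout, which this page does not specify; check them against the actual development file and adjust. The `SAMPLE` string is an invented toy record, not real data.

```python
import xml.etree.ElementTree as ET
from typing import List, NamedTuple


class Pair(NamedTuple):
    pair_id: str
    task: str
    label: bool       # True if the text is annotated as entailing the hypothesis
    text: str
    hypothesis: str


def parse_pairs(xml_string: str) -> List[Pair]:
    """Parse (t, h) pairs from an XML string.

    Assumed (hypothetical) layout: <pair id=... task=... value=...>
    elements, each containing one <t> and one <h> child.
    """
    root = ET.fromstring(xml_string)
    pairs = []
    for node in root.iter("pair"):
        pairs.append(Pair(
            pair_id=node.get("id", ""),
            task=node.get("task", ""),
            label=node.get("value", "").strip().upper() == "TRUE",
            text=(node.findtext("t") or "").strip(),
            hypothesis=(node.findtext("h") or "").strip(),
        ))
    return pairs


# Invented toy record, used only to exercise the parser.
SAMPLE = """<corpus>
  <pair id="1" value="TRUE" task="QA">
    <t>WalMart defended itself in court today.</t>
    <h>WalMart was in court.</h>
  </pair>
</corpus>"""

pairs = parse_pairs(SAMPLE)
```

To read from disk instead, `ET.parse(path).getroot()` can replace the `ET.fromstring` call.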

At a minimum, you need to:

Design three different versions of your entailment system, starting with a baseline system, and moving to two more sophisticated approaches.
You must use some external resources (the web, WordNet, corpora, etc.) and some preprocessing tools.
Experiment with your systems, and compare them both globally and on each of the tasks separately.
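
One possible starting point for the baseline, and for the global and per-task comparison, is sketched below. It is only an illustration, not the required design: it predicts entailment when enough of the hypothesis words also appear in the text. The tuple layout, the function names, the 0.75 threshold, and the toy examples are all assumptions made for this sketch.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical example layout: (task, text, hypothesis, gold_label).
Example = Tuple[str, str, str, bool]


def tokens(sentence: str) -> List[str]:
    """Lowercase the sentence and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def overlap_score(text: str, hypothesis: str) -> float:
    """Fraction of distinct hypothesis words that also occur in the text."""
    h_words = set(tokens(hypothesis))
    if not h_words:
        return 0.0
    t_words = set(tokens(text))
    return len(h_words & t_words) / len(h_words)


def predict(text: str, hypothesis: str, threshold: float = 0.75) -> bool:
    """Predict entailment when the overlap reaches the (arbitrary) threshold."""
    return overlap_score(text, hypothesis) >= threshold


def accuracy_by_task(examples: Iterable[Example],
                     threshold: float = 0.75) -> Dict[str, float]:
    """Accuracy for each task, plus the global accuracy under the key 'ALL'."""
    correct: Dict[str, int] = defaultdict(int)
    total: Dict[str, int] = defaultdict(int)
    for task, text, hypothesis, gold in examples:
        hit = predict(text, hypothesis, threshold) == gold
        for key in (task, "ALL"):
            total[key] += 1
            correct[key] += int(hit)
    return {key: correct[key] / total[key] for key in total}


# Invented toy examples, used only to exercise the functions.
TOY: List[Example] = [
    ("QA", "The cat sat on the mat.", "The cat sat.", True),
    ("QA", "The cat sat on the mat.", "The dog barked.", False),
    ("IR", "Rome is the capital of Italy.", "Paris is the capital of Italy.", False),
]

results = accuracy_by_task(TOY)
```

On the toy data, the third pair is misclassified: almost every hypothesis word appears in the text even though the text does not entail the hypothesis. Errors of this kind are what the two more sophisticated versions would need to address, for instance with the external resources and preprocessing tools mentioned above.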

Report

1. Describe what you did: the specifics of your resources, algorithms, and experiments.
2. Conclude with some suggestions for improvements, future work, etc.

Grading

Your grade depends on:
1. The quality of your report.
2. The quality of your results.
3. Your originality in going beyond the minimal requirements.

Due date

Monday, Dec. 12.

Dan Roth
