PENN CIS 625, SPRING 2012: COMPUTATIONAL LEARNING THEORY
Prof. Michael Kearns
mkearns@cis.upenn.edu
Dr. Jake Abernethy
jaber@seas.upenn.edu
Mondays
12-3 PM
307 Levine Hall
(NOTE ROOM CHANGE)
IMPORTANT NOTE: Since the first Monday of the term (Jan 16) is MLK Day, the first class meeting will be on Monday, Jan 23.
URL for this page:
www.cis.upenn.edu/~mkearns/teaching/COLT
A previous incarnation of this course:
www.cis.upenn.edu/~mkearns/teaching/COLT/colt08.html
COURSE DESCRIPTION
This course is an introduction to Computational Learning Theory, a field which attempts to provide algorithmic, complexity-theoretic and probabilistic foundations to modern machine learning and related topics.
The first part of the course will closely follow portions of An Introduction to Computational Learning Theory, by M. Kearns and U. Vazirani (MIT Press). We will cover perhaps 6 or 7 of the chapters in K&V over (approximately) the first half of the course, often supplementing with additional readings and materials. Copies of K&V will be available at the Penn bookstore. The second portion of the course will focus on a number of models and topics in learning theory not covered in K&V.
The course will give a broad overview of the kinds of problems and techniques typically studied in Computational Learning Theory, and provide a basic arsenal of powerful mathematical tools for analyzing machine learning problems.
Topics likely to be covered include:
COURSE FORMAT, REQUIREMENTS, AND PREREQUISITES
Much of the course will be in fairly traditional "chalk talk" lecture format, but with ample opportunity for discussion, participation, and critique. The course will meet once a week on Mondays from 12 to 3. Lunch will be served.
While there are no specific formal prerequisites, background or courses in algorithms, complexity theory, discrete math, combinatorics, probability theory and statistics will prove helpful, as will "mathematical maturity" in general. We will be examining detailed proofs throughout the course. If you have questions about the desired background, please ask me. Auditors and occasional participants are welcome.
The course requirements for registered students are active in-class participation, a couple of problem sets, possibly co-leading a class discussion, and a final project. The final projects can range from actual research work, to a literature survey, to solving some additional problems.
DETAILED MEETING/TOPIC SCHEDULE
Mon Jan 23
READING: K&V Chapter 1.
Mon Jan 30
READING: K&V Chapters 2 and 3.
PROBLEM SET #1, DUE AS HARDCOPY IN CLASS FEB 13: Solve the following
problems from K&V: 1.1, 1.3, 2.3, 2.4,
3.1, 3.5.
You are free to collaborate in groups,
but please turn in separate write-ups and clearly acknowledge your collaborations.
Please do not attempt to look up solutions in the literature.
Mon Feb 6
Mon Feb 13
READING: K&V Chapter 4 and the following papers.
Mon Feb 20
READING: K&V Chapter 5; supplementary reading:
Mon Feb 27
Mon Mar 5
Mon Mar 12
READING: K&V Chapters 6,7,8.
PROBLEM SET #2, DUE AS HARDCOPY IN CLASS MAR 26: Solve the following
problems from K&V: 5.1, 5.3, 5.4, 7.3, 8.3.
You are free to collaborate on the homework
assignments, but please turn in separate write-ups and acknowledge your collaborations.
Please do not attempt to look up solutions in the literature.
INFORMATION ON COURSE PROJECTS.
You are welcome to send email to Prof Kearns proposing what your final project will
be in order to get feedback and advice.
The deadline for final projects will be
May 8,
via email to Prof Kearns.
Mon Mar 19
READING: While the notation and details may differ in class, the following online notes
are relevant.
Mon Mar 26
READING:
Mon Apr 2
READING:
UPDATE ON COURSE PROJECTS.
It is permitted for you to work in small teams on a joint project, but we'll
expect a proportionally more ambitious project in such cases.
Mon Apr 9
READING:
Papers from last week's discussion:
For this lecture,
students should try to read through page 9 of Elad Hazan's review
article on online convex optimization:
For students that want to know more, and for possible project ideas, here
are some links:
Mon Apr 16
READING: TBD.
Mon Apr 23
READING:
The exact timing and set of topics below will depend on our progress and will be updated
as we proceed. "[MK]" indicates lectures by Prof. Kearns; "[JA]" by Dr. Abernethy.
[MK] In the first meeting, we will go over course mechanics and present a course overview, then
immediately begin investigating the Probably Approximately Correct (PAC) model of learning.
Detailed topics covered: Learning rectangles in the real plane; definition of the PAC model;
PAC-learnability of rectangles in d dimensions; PAC-learnability of conjunctions of boolean variables.
[MK] Hardness of PAC-learning 3-term DNF; PAC-learnability of 3-term DNF by 3CNF;
relaxing the PAC definition; consistency and PAC learning; Occam's Razor;
VC dimension definitions and examples; a polynomial bound on the growth of \Pi_C(m) (Sauer's Lemma).
[JA] Alternate proof of Sauer's Lemma; lower bounds on sample complexity via the VC dimension;
beyond VC dimension --- Rademacher complexity, scale-sensitive dimension and friends.
[MK] Completion of VC upper bound; introduction to boosting; confidence boosting;
a simple majority vote single-stage algorithm for accuracy amplification.
[MK] Boosting continued: Adaboost and analysis; introduction to learning with classification noise;
a statistics-only algorithm for learning conjunctions; introduction to the statistical query learning model.
[MK] Classification noise and statistical query learning continued:
noise-tolerance of all SQ algorithms; almost all PAC algorithms are in SQ;
separating PAC and SQ; the SQ dimension and relationship to VC dimension.
Spring break; no class meeting.
[MK] Learning and cryptography; representation-independence hardness results;
learning DFAs from examples and membership queries.
For those of you registered for credit in the course,
your final assignment will be a course project, for which you have three choices:
[JA] Adversarial online learning; halving, weighted majority, and perceptron algorithms;
game, minimax via weighted majority, and boosting.
[MK] PAC analysis of reinforcement learning; efficient exploration and exploitation.
[JA] Continued: No-regret learning, online convex optimziation, and related topics.
at least part of class on
April 16,
be devoted to discussion of your course projects --- Jake and I will have brief
individual meetings with each person/team about your proposed project. In preparation
for this, you should all send both Jake and I an email sketching your proposed project
by
FRIDAY APRIL 13.
The purpose of the email is not that you
have your project entirely figured out, but to get you thinking about what you want to
do. The more specific, the better, but it's also fine to say you want to think about papers
or problems in some particular area that interests you. That way in the meetings Jake
and I can suggest directions, or papers to look at, etc.
Please make the subject line of your email "CIS 625 Project" so we can easily sort.
[JA] More fun with no-reget, online learning, convex optimization, etc.
[JA] Jake finishes up his lectures on online convex optimization.
[MK] Some drawbacks of no-regret learning; some topics in ML and finance.