Probabilistic Artificial Intelligence
Michael Kearns
AT&T Labs
Annual Workshop on Computational Complexity
Bellairs Research Institute of McGill University
Holetown, St. James, Barbados
February 22 - 26, 1999
COURSE DESCRIPTION:
In the last decade or so, many of the central problems of "classical"
artificial intelligence - such as knowledge representation, inference,
learning, and planning - have been reformulated in frameworks with
statistical or probabilistic underpinnings. The benefits of this trend
include the adoption of a common set of mathematical tools across the
various AI subdisciplines, increased attention to central algorithmic
issues, and an emphasis on approximation algorithms for some notoriously
hard AI problems.
In this lecture series, I will survey the probabilistic frameworks and the
central computational problems posed in several well-developed areas
of AI. I will describe some of the algorithms proposed for these
problems, summarize what is formally known about them (and also what is
suspected but not proven), and try to give a flavor of the mathematical
techniques involved. The lectures will be self-contained, with an emphasis
on the interesting open problems.
Below is a (very) approximate outline of what I hope to cover; undoubtedly,
it is overly ambitious. I am also happy to let the interests of the
participants influence the directions we pursue.
The outline includes many related papers for your perusal; as the meeting
approaches, I will post more material, and will let you know as I do so.
I have only one text recommendation for the course, but I do recommend
it strongly. It covers many of the basics, and it is a very
accessible introduction to an influential topic in AI:
Reinforcement Learning, Richard S. Sutton and Andrew G. Barto.
MIT Press, 1998.
See you in Barbados!
COURSE OUTLINE
Basics of Markov Decision Processes and Reinforcement Learning
MDPs: Definitions and AI Motivation
Planning in MDPs: Value Functions, Value Iteration,
Policy Evaluation, TD(lambda), Policy Iteration,
Linear Programming Approaches
Learning in MDPs: Q-Learning, Model-Based Methods
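To make the planning material above concrete, here is a minimal sketch of
value iteration on a tabular MDP; the toy transition matrix, rewards, and
discount factor are invented purely for illustration.

import numpy as np

def value_iteration(P, R, gamma=0.9, tol=1e-8):
    # P[a, s, s2] = Pr(s2 | s, a); R[s, a] = expected immediate reward.
    V = np.zeros(P.shape[1])
    while True:
        Q = R.T + gamma * (P @ V)            # Bellman backup: Q[a, s]
        V_new = Q.max(axis=0)                # greedy one-step lookahead
        if np.max(np.abs(V_new - V)) < tol:  # stop near the fixed point
            return V_new, Q.argmax(axis=0)   # optimal values, greedy policy
        V = V_new

# Toy two-state, two-action MDP (all numbers illustrative).
P = np.array([[[0.9, 0.1], [0.1, 0.9]],
              [[0.5, 0.5], [0.5, 0.5]]])
R = np.array([[1.0, 0.0],
              [0.0, 2.0]])
V, pi = value_iteration(P, R)

Policy iteration and the linear programming formulation compute the same
fixed point of the Bellman equations; value iteration is simply the easiest
to state.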
Related Papers:
On the Complexity of Solving Markov Decision Processes.
M. Littman, T. Dean, L. Kaelbling.
The Theory of Uniform Convergence
Finite Classes: The Chernoff Bound and the Union Bound
Infinite Classes: The VC Dimension
Refinements: Statistical Mechanics Approach
Generalizations of the VC Dimension
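For the finite-class case, the whole argument fits in two displayed
inequalities. Writing $e(h)$ for the true error of hypothesis $h$,
$\hat{e}(h)$ for its empirical error on $m$ random examples, and $H$ for a
finite class, the Chernoff bound plus the union bound give
\[
\Pr\!\left[\exists\, h \in H :\ |\hat{e}(h) - e(h)| > \epsilon \right]
\;\le\; 2|H|\, e^{-2m\epsilon^2},
\]
so setting the right-hand side to $\delta$ and solving, with probability at
least $1-\delta$ every $h \in H$ satisfies
\[
|\hat{e}(h) - e(h)| \;\le\; \sqrt{\frac{\ln(2|H|/\delta)}{2m}}.
\]
Roughly speaking, the VC dimension plays the role of $\ln|H|$ for infinite
classes, and the statistical mechanics approach refines these worst-case
rates.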
Related Papers:
Rigorous Learning Curve Bounds from Statistical Mechanics.
D. Haussler, M. Kearns, H.S. Seung, N. Tishby.
Decision Theoretic Generalizations of the PAC Model for Neural Net and Other Learning Applications.
D. Haussler.
Improved Theoretical Results for Learning in MDPs
Convergence Rates for Q-Learning and Model-Based Methods
The Exploration-Exploitation Trade-Off: The E^3 Algorithm
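As a reference point for the convergence-rate results above, here is a
minimal sketch of tabular Q-learning with epsilon-greedy exploration and a
decaying learning rate. The environment interface (reset/step returning
state, reward, done) is an assumption of the sketch, not part of any of the
papers below.

import numpy as np

def q_learning(env, n_states, n_actions, episodes=1000, gamma=0.9, eps=0.1):
    Q = np.zeros((n_states, n_actions))
    visits = np.zeros((n_states, n_actions))
    for _ in range(episodes):
        s, done = env.reset(), False
        while not done:
            # Epsilon-greedy action choice: explore vs. exploit.
            a = np.random.randint(n_actions) if np.random.rand() < eps \
                else int(Q[s].argmax())
            s2, r, done = env.step(a)
            visits[s, a] += 1
            alpha = 1.0 / visits[s, a]               # decaying learning rate
            target = r if done else r + gamma * Q[s2].max()
            Q[s, a] += alpha * (target - Q[s, a])    # temporal-difference update
            s = s2
    return Q

Epsilon-greedy exploration is only a heuristic; the point of the E^3
algorithm is to make the exploration-exploitation trade-off explicit and
provably efficient.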
Related Papers:
Finite-Sample Rates of Convergence for Q-Learning and Indirect Methods.
M. Kearns and S. Singh.
Near-Optimal Reinforcement Learning in Polynomial Time.
M. Kearns and S. Singh.
Getting (More) Realistic: Handling Large State Spaces
Function Approximation of Value Functions
On-Line Planning via Sparse Sampling
The Need for Structured Representations
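The key idea of sparse sampling fits in a few lines: estimate Q-values at
the current state by drawing C samples per action from a generative model
and recursing to depth H, so the cost depends on C and H but not on the
number of states. The generative-model interface model(s, a) ->
(next_state, reward) below is an assumption of this sketch.

def sparse_q(model, s, actions, gamma, C, H):
    # Estimated Q-values at state s via depth-H sparse lookahead.
    if H == 0:
        return {a: 0.0 for a in actions}
    Q = {}
    for a in actions:
        total = 0.0
        for _ in range(C):                # C sampled next states per action
            s2, r = model(s, a)
            total += r + gamma * max(
                sparse_q(model, s2, actions, gamma, C, H - 1).values())
        Q[a] = total / C
    return Q

def sparse_sample_action(model, s, actions, gamma=0.9, C=5, H=3):
    Q = sparse_q(model, s, actions, gamma, C, H)
    return max(Q, key=Q.get)              # near-optimal action at s

The running time per decision is on the order of (|actions| * C)^H:
exponential in the horizon, but independent of the size of the state space,
which is what makes this an on-line planning method.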
Related Papers:
A Sparse Sampling Algorithm for Near-Optimal Planning in Large Markov Decision Processes.
M. Kearns, Y. Mansour, A. Ng.
Temporal Difference Learning and TD-Gammon.
G. Tesauro.
Reinforcement Learning for Dynamic Channel Allocation in Cellular Telephone Systems.
S. Singh and D. Bertsekas.
Probabilistic Reasoning and Bayesian Networks
Bayesian Networks: Definitions and AI Motivation
Inference in Bayesian Networks
Subtleties: Explaining Away and D-Separation
The Polytree Algorithm for Exact Inference
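Explaining away is easy to see numerically. In the toy network below, Rain
and Sprinkler are independent causes of WetGrass; conditioning on WetGrass
makes them dependent, and then observing Sprinkler=true lowers the posterior
probability of Rain. All CPT numbers are invented for illustration, and the
code does exact inference by brute-force summation over the joint.

from itertools import product

P_rain = {True: 0.2, False: 0.8}          # prior on Rain
P_sprink = {True: 0.3, False: 0.7}        # prior on Sprinkler
P_wet_given = {(True, True): 0.99, (True, False): 0.9,
               (False, True): 0.8, (False, False): 0.05}

def posterior_rain(sprinkler=None):
    # P(Rain=true | WetGrass=true [, Sprinkler]) by summing the joint.
    num = den = 0.0
    for r, s in product([True, False], repeat=2):
        if sprinkler is not None and s != sprinkler:
            continue                      # inconsistent with evidence
        joint = P_rain[r] * P_sprink[s] * P_wet_given[(r, s)]
        den += joint
        if r:
            num += joint
    return num / den

print(posterior_rain())                   # ~0.46: wet grass suggests rain
print(posterior_rain(sprinkler=True))     # ~0.24: sprinkler explains it away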
Related Papers:
An Introduction to Graphical Models.
M. Jordan.
A Tutorial on Learning with Bayesian Networks.
D. Heckerman.
Approximate Inference in Bayes Nets
Sampling Methods for Approximate Inference
Variational Methods for Approximate Inference
Applications: QMR-DT
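Among the sampling methods, likelihood weighting is perhaps the simplest to
state: sample the non-evidence variables from their priors, and weight each
sample by the probability it assigns to the evidence. Here is a minimal
sketch for the same toy Rain/Sprinkler/WetGrass network as above (all
numbers again invented):

import random

def likelihood_weighting(n=100_000):
    num = den = 0.0
    for _ in range(n):
        r = random.random() < 0.2         # Rain ~ its prior
        s = random.random() < 0.3         # Sprinkler ~ its prior
        w = {(True, True): 0.99, (True, False): 0.9,
             (False, True): 0.8, (False, False): 0.05}[(r, s)]
        den += w                          # weight = P(WetGrass=true | r, s)
        if r:
            num += w
    return num / den

print(likelihood_weighting())             # approaches the exact ~0.46 above

Variational methods take the opposite route: instead of sampling, they
optimize over a tractable family of distributions to bound or approximate
the true posterior, which is the approach taken to QMR-DT in the
Jaakkola-Jordan paper below.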
Related Papers:
An Introduction to Variational Methods for Graphical Models.
M. Jordan, Z. Ghahramani, T. Jaakkola, L. Saul.
Variational Methods and the QMR-DT Database.
T. Jaakkola and M. Jordan.
Large Deviation Methods for Approximate Probabilistic Inference, with Rates of Convergence.
M. Kearns and L. Saul.
Probabilistic Inference Using Markov Chain Monte Carlo Methods.
R. Neal.
Combining MDPs and Bayes Nets
Dynamic Bayes Nets: Definitions and Motivation
DBN-MDPs
Tracking and Planning in DBN-MDPs
Learning in DBN-MDPs: E^3 Generalization
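The representational point here is compactness. In a DBN-MDP over n state
variables $s = (s_1, \ldots, s_n)$, the transition model factors as
\[
P(s' \mid s, a) \;=\; \prod_{i=1}^{n} P\big(s'_i \,\big|\, \mathrm{Pa}_i(s),\, a\big),
\]
where each $\mathrm{Pa}_i(s)$ is a small set of parent variables, so each
action needs only $O(n\, 2^{|\mathrm{Pa}_i|})$ parameters rather than the
$2^n \times 2^n$ table of the flat MDP (for binary variables). Tracking and
planning remain hard because the distribution over $s'$ does not stay
factored over time, which is the difficulty the Boyen-Koller paper below
addresses.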
Related Papers:
Tractable Inference for Complex Stochastic Processes.
X. Boyen and D. Koller.
Stochastic Simulation Algorithms for Dynamic Probabilistic Networks.
K. Kanazawa, D. Koller, S. Russell.
Efficient Reinforcement Learning in Factored MDPs.
M. Kearns and D. Koller.
More Realism: Partial Observability and Macro-Actions
POMDPs: Definitions and Motivation
Belief State Planning in POMDPs
Approximate Planning and Learning in POMDPs
"Macro" Actions or Options in MDPs
Related Papers:
Planning and Acting in Partially Observable Stochastic Domains.
L. Kaelbling, M. Littman, A. Cassandra.
Approximate Planning in Large POMDPs via Reusable Trajectories.
M. Kearns, Y. Mansour, A. Ng.
Between MDPs and Semi-MDPs: Learning, Planning and Representing Knowledge.
R. Sutton, D. Precup, S. Singh.
Michael Kearns, AT&T Labs - Research
180 Park Avenue, Room A235
Florham Park, NJ 07932
Tel: (973) 360-8322
mkearns@research.att.com
Last Edited: February 26, 2001