Probabilistic Artificial Intelligence
Michael Kearns, AT&T Labs
Annual Workshop on Computational Complexity
Bellairs Research Institute of McGill University
Holetown, St. James, Barbados
February 22 - 26, 1999


COURSE DESCRIPTION: In the last decade or so, many of the central problems of "classical" artificial intelligence - such as knowledge representation, inference, learning and planning - have been reformulated in frameworks with statistical or probabilistic underpinnings. The benefits of this trend include the adoption of a common set of mathematical tools for the various AI subdisciplines, increased attention on central algorithmic issues, and an emphasis on approximation algorithms for some notoriously hard AI problems.

In this lecture series, I will survey the probabilistic frameworks and the central computational problems posed in several well-developed areas of AI. I will describe some of the algorithms proposed for these problems, overview what is formally known about them (and also what is suspected but not proven), and try to give a flavor of the mathematical techniques involved. The lectures will be self-contained, with an emphasis on the interesting open problems.

Below is a (very) approximate outline of what I hope to cover; undoubtedly, it is overly ambitious. I am also happy to let the interests of the participants influence the directions we pursue. The outline includes many related papers for your perusal; as we approach the dates of the meeting, I will be posting more material, and will let you know as I do so.

I have only one text recommondation for the course, but I do recommend it strongly. It covers many of the basics, and it is a very accessible introduction to an influential topic in AI:

  • Reinforcement Learning , Richard S. Sutton and Andrew G. Barto. MIT Press, 1998.

    See you in Barbados!


    COURSE OUTLINE


    Basics of Markov Decision Processes and Reinforcement Learning

  • MDPs: Definitions and AI Motivation
  • Planning in MDPs: Value Functions, Value Iteration, Policy Evaluation, TD(lambda), Policy Iteration, Linear Programming Approaches
  • Learning in MDPs: Q-Learning, Model-Based Methods

    Related Papers:

  • On the Complexity of Solving Markov Decision Processes. M. Littman, T. Dean, L. Kaelbling.
  • Postscript Compressed Postscript PDF

  • The Theory of Uniform Convergence

  • Finite Classes: The Chernoff Bound and the Union Bound
  • Infinite Classes: The VC Dimension
  • Refinements: Statistical Mechanics Approach
  • Generalizations of the VC Dimension

    Related Papers:

  • Rigorous Learning Curve Bounds from Statistical Mechanics. D. Haussler, M. Kearns, H.S. Seung, N. Tishby.
  • Postscript Compressed Postscript PDF
  • Decision Theoretic Generalizations of the PAC Model for Neural Net and Other Learning Applications. D. Haussler.
  • Postscript Compressed Postscript PDF

  • Improved Theoretical Results for Learning in MDPs

  • Convergence Rates for Q-Learning and Model-Based Methods
  • The Exploration-Exploitation Trade-Off: The E^3 Algorithm

    Related Papers:

  • Finite-Sample Rates of Convergence for Q-Learning and Indirect Methods. M. Kearns and S. Singh.
  • Postscript Compressed Postscript PDF
  • Near-Optimal Reinforcement Learning in Polynomial Time. M. Kearns and S. Singh.
  • Postscript Compressed Postscript PDF


    Getting (More) Realistic: Handling Large State Spaces

  • Function Approximation of Value Functions
  • On-Line Planning via Sparse Sampling
  • The Need for Structured Representations

    Related Papers:

  • A Sparse Sampling Algorithm for Near-Optimal Planning in Large Markov Decision Processes. M. Kearns, Y. Mansour, A. Ng.
  • Postscript Compressed Postscript PDF
  • Temporal Difference Learning and TD-Gammon , Gerald Tesauro
  • Reinforcement Learning for Dynamic Channel Allocation in Cellular Telephone Systems. S. Singh and D. Bertsekas.
  • Postscript Compressed Postscript PDF


    Probabilistic Reasoning and Bayesian Networks

  • Bayesian Networks: Definitions and AI Motivation
  • Inference in Bayesian Networks
  • Subtleties: Explaining Away
  • D-Separation
  • The Polytree Algorithm for Exact Inference

    Related Papers:

  • An Introduction to Graphical Models. M. Jordan.
  • Postscript Compressed Postscript PDF
  • A Tutorial on Learning with Bayesian Networks. D. Heckerman.
  • Postscript Compressed Postscript PDF


    Approximate Inference in Bayes Nets

  • Sampling Methods for Approximate Inference
  • Variational Methods for Approximate Inference
  • Applications: QMR-DT

    Related Papers:

  • An Introduction to Variational Methods for Graphical Models. M. Jordan, Z. Ghahramani, T. Jaakkola, L. Saul.
  • Postscript Compressed Postscript PDF
  • Variational Methods and the QMR-DT Database. T. Jaakkola and M. Jordan.
  • Postscript Compressed Postscript PDF
  • Large Deviation Methods for Approximate Probabilistic Inference, with Rates of Convergence. M. Kearns and L. Saul.
  • Postscript Compressed Postscript PDF
  • Probabilistic Inference Using Markov Chain Monte Carlo Methods. R. Neal.
  • Postscript Compressed Postscript PDF


    Combining MDPs and Bayes Nets

  • Dynamic Bayes Nets: Definitions and Motivation
  • DBN-MDPs
  • Tracking and Planning in DBN-MDPs
  • Learning in DBN-MDPs: E^3 Generalization

    Related Papers:

  • Tractable Inference for Complex Stochastic Processes. X. Boyen and D. Koller.
  • Postscript Compressed Postscript PDF
  • Stochastic Simulation Algorithms for Dynamic Probabilistic Networks. K. Kanazawa, D. Koller, S. Russell.
  • Postscript Compressed Postscript PDF
  • Efficient Reinforcement Learning in Factored MDPs. M. Kearns and D. Koller.
  • Postscript Compressed Postscript PDF


    More Realism: Partial Observability and Macro-Actions

  • POMDPs: Definitions and Motivation
  • Belief State Planning in POMDPs
  • Approximate Planning and Learning in POMDPs
  • "Macro" Actions or Options in MDPs

    Related Papers:

  • Planning and Acting in Partially Observable Stochastic Domains. L. Kaelbling, M. Littman, T. Cassandra.
  • Postscript Compressed Postscript PDF
  • Approximate Planning in Large POMDPs via Reusable Trajectories. M. Kearns, Y. Mansour, A. Ng.
  • Postscript Compressed Postscript PDF
  • Between MDPs and Semi-MDPs: Learning, Planning and Representing Knowledge. R. Sutton, D. Precup, S. Singh.
  • Postscript Compressed Postscript PDF


    Michael Kearns, AT&T Labs - Research	Tel: (973) 360-8322 
    180 Park Avenue, Room A235
    Florham Park, NJ 07932                          mkearns@research.att.com
    

    Last Edited: February 26, 2001