A new cool eprint has appeared:
J. Veness, et al. A Monte Carlo AIXI Approximation. eprint: 0909.0801
The question that defines the context for this article is: How should probabilities be assigned? One way, with much to recommend itself, is take them to be algorithmic probabilities or universal priors. Suppose one has observed the first N values of a discrete time series, maybe a byte stream, and wishes to predict or make a bet about the next value. Is there a general probability measure appropriate for all cases that fit this abstract setting and, if so, which one? Algorithmic probability, or the universal prior, provides a partial answer: under some general assumptions about computability, the universal prior asymptotically reproduces the best predictions in the limit of infinite N. Despite the fact that no guarantees can be made for any finite N, this is a very appealing property that has been given much attention in algorithmic information theory and mathematical statistics. The universal prior is the probability that a universal Turing machine, given random data, will output a given string. This is related to the Kolmogorov complexity of the time series, so that time series that can be reproduced by short computer programs are assigned higher probability than those that require long computer programs. Predicting the next value is now a simple application of Bayes’ theorem. Problem of induction, solved! (Caveat: solved in the limit of infinitely long observations of time series, and the universal prior cannot, in general, be computed in a finite number of operations.) The next natural step is to develop a universal Artificial Intelligence, by throwing a utility function that depends on what actions are chosen. Plugging in the universal prior and utility function into decision theory solves this problem too. Universal Artificial Intelligence, done!
Marcus Hutter, who denotes the universal prior by and calls the resulting Artificial Intelligence or AIXI, has written lots of interesting articles on the mathematical issues. Because AIXI is uncomputable, only approximations to it can be implemented and run on an actual computer. One way to approximate it is to set a cut-off on the length of computer programs that are allowed to contribute to the (approximate) universal prior, and another cut-off on how many time steps into the future AIXI’s foresight extends to. Monte Carlo techniques can further reduce the computation of a large number of terms to just a statistical sample of the terms. The new eprint presents such a practical implementation and tests it on a few simple games. The results are quite impressive. AIXI-MC, as their universal Artificial Intelligence is called, is able to navigate a simple mazes, play tic-tac-toe, and even play PacMan. As the games progress, AIXI-MC learns from experience and improves its performance, approaching the optimal performance. AIXI-MC’s weakness, compared to a human player, is that on the order of 1,000-100,000 game cycles seem to be required to learn how to play these games effectively, but, on the other hand, AIXI-MC has not had the benefit of years of upbringing and environmental feedback that any human PacMan player starts with.