Solomonoff induction is a mathematically rigorous, idealized form of induction: predicting what will happen in the future on the basis of past observations. It is part of algorithmic information theory. The scheme is theoretically optimal in the sense that, given enough data, it assigns probabilities to future events as accurately as the data allows. Its one drawback is that it is incomputable: running it exactly would require a computer with infinite processing power. Nevertheless, every successful inductive scheme or machine, including animals and humans, can be viewed as an approximation of Solomonoff induction.
On this view, any verbal advice for doing induction better works, to the extent that it works at all, by nudging the listener's inductive strategy closer to the theory. The idea that induction can be mathematically formalized in this way is quite profound; generations of logicians and philosophers had argued that it could not be done. The theory grew out of work by Ray Solomonoff, Andrey Kolmogorov, and Gregory Chaitin in the 1960s, whose underlying motivation was to formalize probability theory and induction with axioms, in the same way that algebra and geometry had been formalized. At its core is an inductive rule called Bayes' theorem, which gives a precise mathematical recipe for updating beliefs based on incoming data.
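In symbols, writing H for a hypothesis and D for newly observed data, Bayes' theorem reads

    P(H | D) = P(D | H) · P(H) / P(D),

so the updated (posterior) belief in H is the prior belief P(H) reweighted by how well H predicted the data that actually arrived.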
One weakness of Bayes' theorem is that it requires a prior probability for the event in question. For example, the probability of an asteroid striking Earth in the next ten years can be estimated from historical data about asteroid impacts. But when the sample of prior events is small, as with the number of times a neutrino has been detected in a neutrino trap, it becomes very difficult to estimate the likelihood of the event happening again from past experience alone.
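To see how strongly sparse data leaves the answer in the hands of the prior, here is a minimal sketch using a simple beta-binomial model; the model, numbers, and function names are illustrative and not part of the theory described above.

```python
# Minimal sketch: how the prior dominates when data is scarce.
# Beta-binomial model; all numbers are illustrative.

def posterior_mean(prior_a: float, prior_b: float, hits: int, trials: int) -> float:
    """Posterior mean event probability under a Beta(prior_a, prior_b) prior
    after observing `hits` occurrences in `trials` opportunities."""
    return (prior_a + hits) / (prior_a + prior_b + trials)

# With plenty of data, two very different priors give almost the same answer.
print(posterior_mean(1, 1, 50, 10_000))   # ~0.0051 (uniform prior)
print(posterior_mean(1, 99, 50, 10_000))  # ~0.0050 (prior skewed toward "rare")

# With only three observations, the answer is mostly the prior talking.
print(posterior_mean(1, 1, 1, 3))         # ~0.40
print(posterior_mean(1, 99, 1, 3))        # ~0.019
```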
This is where Solomonoff induction comes in. Using an objective measure of complexity called Kolmogorov complexity, the theory can make an educated guess about the probability of some future event occurring. The Kolmogorov complexity of a string of bits is the length of the shortest program that outputs that string; the closely related Minimum Description Length (MDL) principle applies the same idea to choosing between models. Although Kolmogorov complexity is defined for bit strings, it extends to events and objects by encoding them as strings.
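To make the "shortest description" idea concrete, here is a small sketch that uses off-the-shelf compression as a crude, computable stand-in for Kolmogorov complexity; true Kolmogorov complexity is uncomputable, and the strings and compressor below are illustrative only.

```python
# Minimal sketch: compressed size as a crude, computable proxy for
# Kolmogorov complexity (which is itself uncomputable). Illustrative only.
import random
import zlib

def description_length(bits: str) -> int:
    """Bytes needed for a compressed encoding of the string."""
    return len(zlib.compress(bits.encode()))

regular = "01" * 500  # highly patterned: a very short program could print it
random.seed(0)
noisy = "".join(random.choice("01") for _ in range(1000))  # no obvious pattern

print(description_length(regular))  # small: the repetition compresses away
print(description_length(noisy))    # much larger: little structure to exploit
```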
Solomonoff induction integrates Kolmogorov complexity into Bayesian reasoning, giving us justified priors for events that may never have happened before. The prior probability of an arbitrary event is judged by its overall complexity and specificity: simpler, less specific descriptions receive more weight. For example, the probability of two random raindrops in a storm hitting the same square meter is fairly low, but it is much higher than the probability of ten or a hundred random raindrops hitting that same square meter.
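The following sketch shows the flavor of such a complexity-weighted prior on a toy hypothesis class. Real Solomonoff induction mixes over all programs for a universal Turing machine and is incomputable; the hypotheses, their description lengths, and the example data here are invented purely for illustration.

```python
# Minimal sketch of a complexity-weighted ("universal") prior over a toy
# hypothesis class. Hypotheses, description lengths, and data are invented;
# real Solomonoff induction mixes over all programs and is incomputable.

hypotheses = {
    # name: (description length in bits, rule predicting the next bit)
    "always 0":  (3, lambda history: 0),
    "always 1":  (3, lambda history: 1),
    "alternate": (5, lambda history: 1 - history[-1] if history else 0),
}

def prior(length_bits: int) -> float:
    # Shorter descriptions get exponentially more weight: 2 ** -length.
    return 2.0 ** -length_bits

def posterior(data):
    """Renormalized prior weights of the hypotheses consistent with `data`."""
    weights = {
        name: prior(length)
        for name, (length, predict) in hypotheses.items()
        if all(predict(data[:i]) == bit for i, bit in enumerate(data))
    }
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}

# After seeing a single 0, two hypotheses remain; the simpler one gets 80%.
print(posterior([0]))           # {'always 0': 0.8, 'alternate': 0.2}
# A longer alternating sequence rules out everything but "alternate".
print(posterior([0, 1, 0, 1]))  # {'alternate': 1.0}
```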
Some scientists have studied the theory in the context of neuroanatomy, showing how optimal induction can serve as an organizing principle in the evolution of animals whose survival depends on accurate prediction. If true Artificial Intelligence is ever created, these principles are a likely inspiration for its construction.