Sure to be influential, this book lays the foundations for the use of algebraic geometry in statistical learning theory. Many widely used statistical models and learning machines in information science have a singular parameter space: mixture models, neural networks, hidden Markov models (HMMs), Bayesian networks, and stochastic context-free grammars are major examples. Algebraic geometry and singularity theory provide the tools needed to study such non-smooth models. Four main formulas are established: 1. the log likelihood function can be given a common standard form using resolution of singularities, even for more complex models; 2. the asymptotic behaviour of the marginal likelihood, or 'the evidence', is derived using zeta function theory; 3. new methods are derived to estimate the generalization errors of Bayes and Gibbs estimation from training errors; 4. the generalization errors of maximum likelihood and maximum a posteriori methods are clarified by empirical process theory on algebraic varieties.