    "Pattern Recognition and Machine Learning" by Christopher M. Bishop

    Posted By: exLib
    "Pattern Recognition and Machine Learning" by Christopher M. Bishop

    "Pattern Recognition and Machine Learning" by Christopher M. Bishop
    Information Science and Statistics
    Springer Science+Business Media | 2006 | ISBN: 0387310738, 9780387310732 | 761 pages | PDF | 5 MB

    This textbook reflects recent developments in the field while providing a comprehensive introduction to pattern recognition and machine learning. No previous knowledge of pattern recognition or machine learning concepts is assumed. Familiarity with multivariate calculus and basic linear algebra is required, and some experience with probabilities would be helpful though not essential, as the book includes a self-contained introduction to basic probability theory.

    The book is suitable for courses on machine learning, statistics, computer science, signal processing, computer vision, data mining, and bioinformatics.
    Extensive support is provided for course instructors, including more than 400 exercises, graded according to difficulty. Example solutions for a subset of the exercises are available from the book web site, while solutions for the remainder can be obtained by instructors from the publisher.

    Contents
    Preface
    Mathematical notation
    Contents
    1 Introduction
    1.1 Example: Polynomial Curve Fitting
    1.2 Probability Theory
    1.2.1 Probability densities
    1.2.2 Expectations and covariances
    1.2.3 Bayesian probabilities
    1.2.4 The Gaussian distribution
    1.2.5 Curve fitting re-visited
    1.2.6 Bayesian curve fitting
    1.3 Model Selection
    1.4 The Curse of Dimensionality
    1.5 Decision Theory
    1.5.1 Minimizing the misclassification rate
    1.5.2 Minimizing the expected loss
    1.5.3 The reject option
    1.5.4 Inference and decision
    1.5.5 Loss functions for regression
    1.6 Information Theory
    1.6.1 Relative entropy and mutual information
    Exercises
    2 Probability Distributions
    2.1 Binary Variables
    2.1.1 The beta distribution
    2.2 Multinomial Variables
    2.2.1 The Dirichlet distribution
    2.3 The Gaussian Distribution
    2.3.1 Conditional Gaussian distributions
    2.3.2 Marginal Gaussian distributions
    2.3.3 Bayes’ theorem for Gaussian variables
    2.3.4 Maximum likelihood for the Gaussian
    2.3.5 Sequential estimation
    2.3.6 Bayesian inference for the Gaussian
    2.3.7 Student’s t-distribution
    2.3.8 Periodic variables
    2.3.9 Mixtures of Gaussians
    2.4 The Exponential Family
    2.4.1 Maximum likelihood and sufficient statistics
    2.4.2 Conjugate priors
    2.4.3 Noninformative priors
    2.5 Nonparametric Methods
    2.5.1 Kernel density estimators
    2.5.2 Nearest-neighbour methods
    Exercises
    3 Linear Models for Regression
    3.1 Linear Basis Function Models
    3.1.1 Maximum likelihood and least squares
    3.1.2 Geometry of least squares
    3.1.3 Sequential learning
    3.1.4 Regularized least squares
    3.1.5 Multiple outputs
    3.2 The Bias-Variance Decomposition
    3.3 Bayesian Linear Regression
    3.3.1 Parameter distribution
    3.3.2 Predictive distribution
    3.3.3 Equivalent kernel
    3.4 Bayesian Model Comparison
    3.5 The Evidence Approximation
    3.5.1 Evaluation of the evidence function
    3.5.2 Maximizing the evidence function
    3.5.3 Effective number of parameters
    3.6 Limitations of Fixed Basis Functions
    Exercises
    4 Linear Models for Classification
    4.1 Discriminant Functions
    4.1.1 Two classes
    4.1.2 Multiple classes
    4.1.3 Least squares for classification
    4.1.4 Fisher’s linear discriminant
    4.1.5 Relation to least squares
    4.1.6 Fisher’s discriminant for multiple classes
    4.1.7 The perceptron algorithm
    4.2 Probabilistic Generative Models
    4.2.1 Continuous inputs
    4.2.2 Maximum likelihood solution
    4.2.3 Discrete features
    4.2.4 Exponential family
    4.3 Probabilistic Discriminative Models
    4.3.1 Fixed basis functions
    4.3.2 Logistic regression
    4.3.3 Iterative reweighted least squares
    4.3.4 Multiclass logistic regression
    4.3.5 Probit regression
    4.3.6 Canonical link functions
    4.4 The Laplace Approximation
    4.4.1 Model comparison and BIC
    4.5 Bayesian Logistic Regression
    4.5.1 Laplace approximation
    4.5.2 Predictive distribution
    Exercises
    5 Neural Networks
    5.1 Feed-forward Network Functions
    5.1.1 Weight-space symmetries
    5.2 Network Training
    5.2.1 Parameter optimization
    5.2.2 Local quadratic approximation
    5.2.3 Use of gradient information
    5.2.4 Gradient descent optimization
    5.3 Error Backpropagation
    5.3.1 Evaluation of error-function derivatives
    5.3.2 A simple example
    5.3.3 Efficiency of backpropagation
    5.3.4 The Jacobian matrix
    5.4 The Hessian Matrix
    5.4.1 Diagonal approximation
    5.4.2 Outer product approximation
    5.4.3 Inverse Hessian
    5.4.4 Finite differences
    5.4.5 Exact evaluation of the Hessian
    5.4.6 Fast multiplication by the Hessian
    5.5 Regularization in Neural Networks
    5.5.1 Consistent Gaussian priors
    5.5.2 Early stopping
    5.5.3 Invariances
    5.5.4 Tangent propagation
    5.5.5 Training with transformed data
    5.5.6 Convolutional networks
    5.5.7 Soft weight sharing
    5.6 Mixture Density Networks
    5.7 Bayesian Neural Networks
    5.7.1 Posterior parameter distribution
    5.7.2 Hyperparameter optimization
    5.7.3 Bayesian neural networks for classification
    Exercises
    6 Kernel Methods
    6.1 Dual Representations
    6.2 Constructing Kernels
    6.3 Radial Basis Function Networks
    6.3.1 Nadaraya-Watson model
    6.4 Gaussian Processes
    6.4.1 Linear regression revisited
    6.4.2 Gaussian processes for regression
    6.4.3 Learning the hyperparameters
    6.4.4 Automatic relevance determination
    6.4.5 Gaussian processes for classification
    6.4.6 Laplace approximation
    6.4.7 Connection to neural networks
    Exercises
    7 Sparse Kernel Machines
    7.1 Maximum Margin Classifiers
    7.1.1 Overlapping class distributions
    7.1.2 Relation to logistic regression
    7.1.3 Multiclass SVMs
    7.1.4 SVMs for regression
    7.1.5 Computational learning theory
    7.2 Relevance Vector Machines
    7.2.1 RVM for regression
    7.2.2 Analysis of sparsity
    7.2.3 RVM for classification
    Exercises
    8 Graphical Models
    8.1 Bayesian Networks
    8.1.1 Example: Polynomial regression
    8.1.2 Generative models
    8.1.3 Discrete variables
    8.1.4 Linear-Gaussian models
    8.2 Conditional Independence
    8.2.1 Three example graphs
    8.2.2 D-separation
    8.3 Markov Random Fields
    8.3.1 Conditional independence properties
    8.3.2 Factorization properties
    8.3.3 Illustration: Image de-noising
    8.3.4 Relation to directed graphs
    8.4 Inference in Graphical Models
    8.4.1 Inference on a chain
    8.4.2 Trees
    8.4.3 Factor graphs
    8.4.4 The sum-product algorithm
    8.4.5 The max-sum algorithm
    8.4.6 Exact inference in general graphs
    8.4.7 Loopy belief propagation
    8.4.8 Learning the graph structure
    Exercises
    9 Mixture Models and EM
    9.1 K-means Clustering
    9.1.1 Image segmentation and compression
    9.2 Mixtures of Gaussians
    9.2.1 Maximum likelihood
    9.2.2 EM for Gaussian mixtures
    9.3 An Alternative View of EM
    9.3.1 Gaussian mixtures revisited
    9.3.2 Relation to K-means
    9.3.3 Mixtures of Bernoulli distributions
    9.3.4 EM for Bayesian linear regression
    9.4 The EM Algorithm in General
    Exercises
    10 Approximate Inference
    10.1 Variational Inference
    10.1.1 Factorized distributions
    10.1.2 Properties of factorized approximations
    10.1.3 Example: The univariate Gaussian
    10.1.4 Model comparison
    10.2 Illustration: Variational Mixture of Gaussians
    10.2.1 Variational distribution
    10.2.2 Variational lower bound
    10.2.3 Predictive density
    10.2.4 Determining the number of components
    10.2.5 Induced factorizations
    10.3 Variational Linear Regression
    10.3.1 Variational distribution
    10.3.2 Predictive distribution
    10.3.3 Lower bound
    10.4 Exponential Family Distributions
    10.4.1 Variational message passing
    10.5 Local Variational Methods
    10.6 Variational Logistic Regression
    10.6.1 Variational posterior distribution
    10.6.2 Optimizing the variational parameters
    10.6.3 Inference of hyperparameters
    10.7 Expectation Propagation
    10.7.1 Example: The clutter problem
    10.7.2 Expectation propagation on graphs
    Exercises
    11 Sampling Methods
    11.1 Basic Sampling Algorithms
    11.1.1 Standard distributions
    11.1.2 Rejection sampling
    11.1.3 Adaptive rejection sampling
    11.1.4 Importance sampling
    11.1.5 Sampling-importance-resampling
    11.1.6 Sampling and the EM algorithm
    11.2 Markov Chain Monte Carlo
    11.2.1 Markov chains
    11.2.2 The Metropolis-Hastings algorithm
    11.3 Gibbs Sampling
    11.4 Slice Sampling
    11.5 The Hybrid Monte Carlo Algorithm
    11.5.1 Dynamical systems
    11.5.2 Hybrid Monte Carlo
    11.6 Estimating the Partition Function
    Exercises
    12 Continuous Latent Variables
    12.1 Principal Component Analysis
    12.1.1 Maximum variance formulation
    12.1.2 Minimum-error formulation
    12.1.3 Applications of PCA
    12.1.4 PCA for high-dimensional data
    12.2 Probabilistic PCA
    12.2.1 Maximum likelihood PCA
    12.2.2 EM algorithm for PCA
    12.2.3 Bayesian PCA
    12.2.4 Factor analysis
    12.3 Kernel PCA
    12.4 Nonlinear Latent Variable Models
    12.4.1 Independent component analysis
    12.4.2 Autoassociative neural networks
    12.4.3 Modelling nonlinear manifolds
    Exercises
    13 Sequential Data
    13.1 Markov Models
    13.2 Hidden Markov Models
    13.2.1 Maximum likelihood for the HMM
    13.2.2 The forward-backward algorithm
    13.2.3 The sum-product algorithm for the HMM
    13.2.4 Scaling factors
    13.2.5 The Viterbi algorithm
    13.2.6 Extensions of the hidden Markov model
    13.3 Linear Dynamical Systems
    13.3.1 Inference in LDS
    13.3.2 Learning in LDS
    13.3.3 Extensions of LDS
    13.3.4 Particle filters
    Exercises
    14 Combining Models
    14.1 Bayesian Model Averaging
    14.2 Committees
    14.3 Boosting
    14.3.1 Minimizing exponential error
    14.3.2 Error functions for boosting
    14.4 Tree-based Models
    14.5 Conditional Mixture Models
    14.5.1 Mixtures of linear regression models
    14.5.2 Mixtures of logistic models
    14.5.3 Mixtures of experts
    Exercises
    Appendix A Data Sets
    Appendix B Probability Distributions
    Appendix C Properties of Matrices
    Appendix D Calculus of Variations
    Appendix E Lagrange Multipliers
    References
    Index
    with TOC BookMarkLinks

    "Pattern Recognition and Machine Learning" by Christopher M. Bishop

    About:
    Christopher Bishop is Assistant Director at Microsoft Research Cambridge, and also holds a Chair in Computer Science at the University of Edinburgh. He is a Fellow of Darwin College, Cambridge, and was recently elected Fellow of the Royal Academy of Engineering.