FD-5: Multivariate Analysis of Imaging Data
Presented by: Peter Bajorski
Rochester Institute of Technology, USA
In this course, the participants will learn useful tools for the analysis of data on many variables, such as measurements on many spectral bands or several responses observed in an experiment. They will identify the benefits of incorporating information from several variables rather than analyzing each variable separately. Through understanding the principles behind the analytical tools, the participants will be able to decide when these tools should or should not be used in practice. Many practical and useful examples of analyses of imaging data are included. The instructor will emphasize an intuitive and geometric understanding of the introduced concepts. The topics covered include principal component analysis (PCA), canonical correlation analysis, discrimination and classification (supervised learning), Fisher discriminants, and independent component analysis (ICA).
At the end of this course, the participants will be able to:
- Explain variability in their data using principal component analysis (PCA)
- Judge whether PCA should be applied to the covariance matrix or to the correlation matrix
- Interpret principal components based on loadings and scores
- Evaluate the simplification of the data resulting from PCA by means of residual analysis (a short code sketch after this list illustrates these PCA steps)
- Use canonical correlations to investigate correlations between two sets of variables
- Construct canonical correlation regression models and evaluate their performance using cross-validation
- Construct discrimination procedures for classification into several populations based on multivariate data
- Create plots based on Fisher discriminants
- Use cross-validation to evaluate classification procedures
- Explain the concept of independent component analysis (ICA) and its relationship to PCA
- Calculate nonnegative principal components
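To make the PCA-related outcomes above concrete, here is a minimal sketch in Python, assuming scikit-learn and synthetic placeholder data (neither is part of the course materials). It contrasts PCA applied to the covariance matrix (centred data) with PCA applied to the correlation matrix (standardised data), and shows how loadings, scores, explained variance, and residuals can be extracted.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# placeholder data: 200 "pixels" observed in 6 "bands" with very different scales
X = rng.normal(size=(200, 6)) * np.array([1.0, 2.0, 5.0, 0.5, 1.0, 10.0])

# PCA on the covariance matrix: the data are only centred
pca_cov = PCA(n_components=3).fit(X)

# PCA on the correlation matrix: each variable is standardised first
pca_corr = PCA(n_components=3).fit(StandardScaler().fit_transform(X))

scores = pca_cov.transform(X)        # PC scores: coordinates of the observations
loadings = pca_cov.components_.T     # loadings: one column per principal component
print("proportion of variance explained:", pca_cov.explained_variance_ratio_)

# residual analysis: reconstruct from the retained components and inspect what is left
residuals = X - pca_cov.inverse_transform(scores)
print("RMS residual per band:", np.sqrt((residuals**2).mean(axis=0)))
```

Whether the covariance or the correlation matrix is the right starting point depends on whether the variables share a common scale; this choice is one of the issues discussed in the course.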
List of topics:
- Introduction.
  - Definition of multivariate descriptive statistics. The geometry of the multivariate sample.
  - Measures of multivariate variability (generalized variance and total variability) and their interpretations, including how to deal with small samples with a large number of variables.
  - Examples and many graphs for intuitive explanations.
- Principal Component Analysis (PCA).
  - Definition of principal components.
  - Interpretation of loadings through impact plots.
  - Stopping rules for PCA, including some special rules for 100+ dimensional data.
  - Interpretation of PC scores. Residual analysis.
  - Statistical inference in PCA: the i.i.d. (classic) case and non-i.i.d. cases.
- Canonical Correlation Analysis and Regression.
  - Definition of canonical variables and the intuition behind them.
  - Canonical correlation regression and its cross-validation (see the first code sketch after this list).
  - Examples of practical applications.
- Classification.
  - Linear and quadratic classification rules.
  - Calculating misclassification costs and probabilities.
  - Cross-validation of classification rules (see the second code sketch after this list).
  - Fisher discriminants for graphical representation.
  - Spatial smoothing for classification.
  - Classification with an option of not classifying some observations.
- Independent Component Analysis (ICA).
  - Meaning of ICA and how to perform it (see the third code sketch after this list).
- Nonnegative PCA.
  - Why do we need nonnegative PCA?
  - How to calculate nonnegative principal components?
- PCA, ICA, and Nonnegative PCA as latent variable models, and how to put it all together.
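The first sketch below illustrates canonical correlation analysis and a simple canonical correlation regression evaluated by cross-validation. It is a rough illustration only, assuming scikit-learn's CCA estimator and simulated data; the course's own examples and implementation may differ.

```python
import numpy as np
from sklearn.cross_decomposition import CCA
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(1)
X = rng.normal(size=(150, 5))                                      # first set of variables
Y = X @ rng.normal(size=(5, 3)) + 0.5 * rng.normal(size=(150, 3))  # second, related set

# canonical variables: paired linear combinations of X and Y with maximal correlation
cca = CCA(n_components=2).fit(X, Y)
U, V = cca.transform(X, Y)
for k in range(2):
    r = np.corrcoef(U[:, k], V[:, k])[0, 1]
    print(f"canonical correlation {k + 1}: {r:.3f}")

# canonical correlation regression: predict Y from X, judged by 5-fold cross-validation
Y_hat = cross_val_predict(CCA(n_components=2), X, Y, cv=5)
print("cross-validated RMSE:", np.sqrt(((Y - Y_hat) ** 2).mean()))
```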
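The second sketch illustrates linear and quadratic classification rules, their evaluation by cross-validation, and Fisher discriminant scores for plotting, again using scikit-learn and simulated data as stand-ins for real imaging data.

```python
import numpy as np
from sklearn.discriminant_analysis import (LinearDiscriminantAnalysis,
                                           QuadraticDiscriminantAnalysis)
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(2)
# three simulated classes ("populations") observed in 4 variables
means = np.array([[0, 0, 0, 0], [2, 1, 0, -1], [-1, 2, 1, 0]], dtype=float)
X = np.vstack([rng.normal(loc=m, size=(80, 4)) for m in means])
y = np.repeat([0, 1, 2], 80)

# linear and quadratic classification rules, compared by cross-validated accuracy
print("LDA accuracy:", cross_val_score(LinearDiscriminantAnalysis(), X, y, cv=5).mean())
print("QDA accuracy:", cross_val_score(QuadraticDiscriminantAnalysis(), X, y, cv=5).mean())

# Fisher discriminants: at most (number of classes - 1) directions, useful for plotting
Z = LinearDiscriminantAnalysis().fit(X, y).transform(X)
print("Fisher discriminant scores:", Z.shape)  # scatter-plot Z[:, 0] vs Z[:, 1] by class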
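The third sketch contrasts PCA with independent component analysis on a simple simulated mixing example; FastICA is used here only as one readily available ICA algorithm, not necessarily the one covered in the course.

```python
import numpy as np
from sklearn.decomposition import FastICA, PCA

rng = np.random.default_rng(3)
t = np.linspace(0, 8, 500)
S = np.c_[np.sin(3 * t), np.sign(np.cos(5 * t))]    # two independent, non-Gaussian sources
A = np.array([[1.0, 0.5], [0.5, 1.0], [0.2, 0.8]])  # mixing matrix
X = S @ A.T + 0.05 * rng.normal(size=(500, 3))      # three observed mixed channels

# PCA finds uncorrelated directions of maximal variance;
# ICA additionally tries to make the recovered components statistically independent
pcs = PCA(n_components=2).fit_transform(X)
ics = FastICA(n_components=2, random_state=0).fit_transform(X)

for k in range(2):
    best = max(abs(np.corrcoef(ics[:, k], S[:, j])[0, 1]) for j in range(2))
    print(f"IC {k + 1}: |correlation| with closest true source = {best:.2f}")
```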
This course is intended for participants who want to gain better insight into their multivariate data. Participants are expected to have a basic knowledge of vector and matrix algebra as well as some basic univariate statistics.
Peter Bajorski is an Associate Professor of Statistics at Rochester Institute of Technology, Rochester, NY, USA. Previously, he held positions at Cornell University, the University of British Columbia, and Simon Fraser University. He received the B.S./M.S. degrees in mathematics and the Ph.D. degree in mathematical statistics. He teaches graduate courses in statistics, including a course on Multivariate Statistics for Imaging Science. He also designs and teaches short courses in industry, with longer-term follow-up and consulting. He has published over 50 research papers in statistics and in hyperspectral imaging. Dr. Bajorski is a past president of the Rochester Chapter of the American Statistical Association. He is also a senior member of SPIE and a senior member of IEEE. His book, Statistics for Imaging, Optics, and Photonics, was published in the prestigious Wiley Series in Probability and Statistics in 2011.