Introduction to PCA
Principal Component Analysis (PCA) is a linear dimensionality reduction method that represents high-dimensional data with a low-dimensional code.
If we can find a linear manifold in a high-dimensional space, we can project the data onto that manifold and represent each point simply by where it lies on the manifold. Given a data matrix X, PCA is usually performed in the following steps:
- Center the data by subtracting the column means from X.
- Calculate the eigenvalues and eigenvectors of XᵀX.
- Select the top k eigenvalues; the eigenvectors corresponding to these eigenvalues are the principal components.
- Project the data onto the subspace spanned by these k components.
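The steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not an optimized implementation; the data matrix X and the choice k = 2 are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))      # 100 samples, 5 features (illustrative data)
X = X - X.mean(axis=0)             # center the data first

# Step 1: eigenvalues and eigenvectors of XᵀX
eigvals, eigvecs = np.linalg.eigh(X.T @ X)

# Step 2: keep the eigenvectors belonging to the top-k eigenvalues
k = 2
order = np.argsort(eigvals)[::-1][:k]
components = eigvecs[:, order]     # shape (5, k): the principal components

# Step 3: project the data onto the k-dimensional subspace
Z = X @ components                 # low-dimensional code, shape (100, k)
print(Z.shape)
```

In practice the eigendecomposition is often replaced by an SVD of X itself, which is numerically more stable, but the projection it yields is the same.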
A typical explanation of PCA:
What PCA does is take m-dimensional data and find the k orthogonal directions (also called principal components) in which the data have the most variance. These k principal components span a lower-dimensional subspace, so we can represent an m-dimensional datapoint by its projections onto the k principal directions. This discards all information about where the datapoint lies along the remaining orthogonal directions. However, since PCA picks the k directions that explain the largest variance, the left-out directions account for little variance, so we haven't lost much information.
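The claim that the left-out directions "don't cover a lot of variance" can be checked numerically: the eigenvalues of XᵀX are proportional to the variance along each principal direction, so their cumulative sum shows how much variance the top k directions keep. A small sketch on synthetic data (the mixing matrix below is an arbitrary choice that makes one direction dominate):

```python
import numpy as np

rng = np.random.default_rng(1)
# Synthetic correlated data: most variance lies along one direction
X = rng.normal(size=(200, 3)) @ np.array([[3.0, 0.2, 0.1],
                                          [0.2, 1.0, 0.1],
                                          [0.1, 0.1, 0.3]])
X = X - X.mean(axis=0)

eigvals = np.linalg.eigvalsh(X.T @ X)[::-1]   # sorted descending

explained = eigvals.cumsum() / eigvals.sum()
print(explained)   # fraction of total variance kept by the top 1, 2, 3 directions
```

For data like this, the first entry of `explained` is already close to 1, which is exactly the situation in which dropping the remaining directions loses little information.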
We see that the term "variance" shows up a lot in the typical explanation of PCA. Intuitively this makes sense: we want to choose directions along which the data spread out, which are the directions of larger "variance." However, resources that establish this point mathematically are limited.
Purpose of This Blog Post
The purpose of this blog post is as follows:
- Provide a concise mathematical proof of why PCA finds the directions in which the data have the largest variance.
- Give readers a better understanding of the math behind PCA, which helps (Read more...)
This is a Security Bloggers Network syndicated blog post authored by Xuan Zhao. Read the original post at: Cylance Blog