Laplacian Eigenmaps: Preserving Local Structure
Laplacian Eigenmaps is a powerful non-linear dimensionality reduction technique designed to preserve the local structure of data while mapping it to a lower-dimensional space. It is particularly effective for data that lies on a non-linear manifold, making it well suited to visualization and exploratory analysis.
Overview of Laplacian Eigenmaps
Laplacian Eigenmaps operates by projecting data points onto a lower-dimensional space, aiming to preserve local neighborhood relationships. It leverages graph theory and spectral methods to achieve this goal. The process involves constructing a graph representation of the data, computing the graph Laplacian, and then finding a low-dimensional embedding that minimizes a specific cost function.
Mathematical Formulation
Step 1: Construct the Similarity Graph
1. Define the Graph:
Given a dataset \( \mathbf{X} = \{ \mathbf{x}_1, \mathbf{x}_2, \dots, \mathbf{x}_n \} \), where each \( \mathbf{x}_i \in \mathbb{R}^d \) is a data point in a \( d \)-dimensional space, construct a graph \( G \) whose nodes correspond to the data points. Edges between nodes represent the similarity between data points.
2. Similarity Measure:
Use a Gaussian kernel to measure similarity between pairs of points. The weight \( W_{ij} \) of the edge between nodes \( i \) and \( j \) is defined as:
\[ W_{ij} = \exp \left(-\frac{\|\mathbf{x}_i - \mathbf{x}_j\|^2}{2 \sigma^2} \right) \]
where \( \sigma \) is a parameter controlling the width of the Gaussian kernel. Which nodes count as neighbors is typically decided by an \( \varepsilon \)-neighborhood or \( k \)-nearest-neighbor rule; for nodes \( i \) and \( j \) that are not connected, \( W_{ij} = 0 \).
3. Construct the Weight Matrix:
The weight matrix \( W \) is an \( n \times n \) matrix whose entry \( W_{ij} \) is the similarity between data points \( \mathbf{x}_i \) and \( \mathbf{x}_j \). The matrix is symmetric, i.e., \( W_{ij} = W_{ji} \) (a \( k \)-nearest-neighbor graph is symmetrized to ensure this).
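The construction above can be sketched in a few lines of NumPy. This is a minimal illustration rather than a tuned implementation: the name gaussian_weights is ours, and the sketch builds a fully connected graph, whereas in practice \( W \) is usually sparsified with a \( k \)-nearest-neighbor rule.

```python
import numpy as np

def gaussian_weights(X, sigma=1.0):
    """Gaussian-kernel weight matrix W for the rows of X (zero diagonal)."""
    # Pairwise squared Euclidean distances via broadcasting: (n, n) matrix.
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-sq_dists / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)  # no self-loops
    return W
```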
Step 2: Construct the Graph Laplacian
1. Degree Matrix:
The degree matrix \( D \) is a diagonal matrix where each diagonal entry \( D_{ii} \) represents the sum of weights connected to node \( i \):
\[ D_{ii} = \sum_{j} W_{ij} \]
2. Laplacian Matrix:
Compute the unnormalized graph Laplacian \( L \) as:
\[ L = D - W \]
Alternatively, compute the normalized graph Laplacian \( L_{norm} \):
\[ L_{norm} = I - D^{-1/2} W D^{-1/2} \]
where \( I \) is the identity matrix.
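These definitions translate directly into NumPy. The following sketch (the helper name graph_laplacians is ours, and positive degrees are assumed) computes both variants:

```python
import numpy as np

def graph_laplacians(W):
    """Degree matrix, unnormalized Laplacian, and normalized Laplacian of W."""
    degrees = W.sum(axis=1)              # D_ii = sum_j W_ij
    D = np.diag(degrees)
    L = D - W                            # unnormalized Laplacian: L = D - W
    d_inv_sqrt = 1.0 / np.sqrt(degrees)  # assumes no isolated nodes
    # L_norm = I - D^{-1/2} W D^{-1/2}
    L_norm = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    return D, L, L_norm
```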
Step 3: Compute Eigenvalues and Eigenvectors
1. Eigenproblem:
Solve the generalized eigenvalue problem for the graph Laplacian \( L \):
\[ L \mathbf{y}_i = \lambda_i D \mathbf{y}_i \]
Here, \( \lambda_i \) are the eigenvalues, sorted in ascending order \( 0 = \lambda_1 \le \lambda_2 \le \cdots \le \lambda_n \), and \( \mathbf{y}_i \) are the corresponding eigenvectors.
2. Select Smallest Non-Zero Eigenvalues:
The smallest eigenvalue is always \( \lambda_1 = 0 \), with a constant eigenvector that carries no information about the data and is discarded. Choose the \( k \) smallest non-zero eigenvalues \( \lambda_2, \dots, \lambda_{k+1} \) and their corresponding eigenvectors. These eigenvectors form the columns of the low-dimensional embedding matrix \( \mathbf{Y} \).
3. Low-Dimensional Embedding:
The low-dimensional coordinates of the data points are given by:
\[ \mathbf{Y} = \begin{bmatrix} \mathbf{y}_2 & \mathbf{y}_3 & \cdots & \mathbf{y}_{k+1} \end{bmatrix} \]
Each row of \( \mathbf{Y} \) represents the coordinates of a data point in the \( k \)-dimensional space.
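In code, the whole eigen-step reduces to one call to scipy.linalg.eigh, which solves the symmetric generalized eigenproblem and returns eigenvalues in ascending order. The sketch below (again with illustrative names, assuming a weight matrix like the one built earlier) skips the trivial constant eigenvector and keeps the next \( k \) columns:

```python
import numpy as np
from scipy.linalg import eigh

def laplacian_eigenmap(W, k=2):
    """k-dimensional Laplacian Eigenmaps embedding from a weight matrix W."""
    D = np.diag(W.sum(axis=1))   # degree matrix (assumes no isolated nodes)
    L = D - W                    # unnormalized graph Laplacian
    # Generalized eigenproblem L y = lambda D y; eigenvalues ascend.
    eigvals, eigvecs = eigh(L, D)
    # Column 0 is the constant eigenvector for eigenvalue 0; discard it and
    # use the next k eigenvectors as the embedding coordinates.
    return eigvecs[:, 1:k + 1]
```

Calling laplacian_eigenmap(gaussian_weights(X), k=2) would then map each row of X to a point in 2D.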
Example: Applying Laplacian Eigenmaps
Step 1: Construct the Similarity Graph
Consider a dataset of 5 points in 2D:
\[ \mathbf{X} = \begin{bmatrix} 1 & 2 \\ 2 & 3 \\ 3 & 3 \\ 4 & 2 \\ 5 & 1 \end{bmatrix} \]
1. Compute the pairwise similarities using a Gaussian kernel with \( \sigma = 1 \) (here every pair of points is treated as connected):
\[ W_{ij} = \exp \left(-\frac{\|\mathbf{x}_i - \mathbf{x}_j\|^2}{2 \sigma^2} \right) \]
The resulting weight matrix \( W \) is:
\[ W = \begin{bmatrix} 0 & 0.3679 & 0.0821 & 0.0111 & 0.0002 \\ 0.3679 & 0 & 0.6065 & 0.0821 & 0.0015 \\ 0.0821 & 0.6065 & 0 & 0.3679 & 0.0183 \\ 0.0111 & 0.0821 & 0.3679 & 0 & 0.3679 \\ 0.0002 & 0.0015 & 0.0183 & 0.3679 & 0 \end{bmatrix} \]
2. Construct the degree matrix \( D \):
\[ D = \begin{bmatrix} 0.4613 & 0 & 0 & 0 & 0 \\ 0 & 1.0580 & 0 & 0 & 0 \\ 0 & 0 & 1.0748 & 0 & 0 \\ 0 & 0 & 0 & 0.8290 & 0 \\ 0 & 0 & 0 & 0 & 0.3879 \end{bmatrix} \]
3. Compute the Laplacian matrix \( L \):
\[ L = D - W \]
This results in:
\[ L = \begin{bmatrix} 0.4613 & -0.3679 & -0.0821 & -0.0111 & -0.0002 \\ -0.3679 & 1.0580 & -0.6065 & -0.0821 & -0.0015 \\ -0.0821 & -0.6065 & 1.0748 & -0.3679 & -0.0183 \\ -0.0111 & -0.0821 & -0.3679 & 0.8290 & -0.3679 \\ -0.0002 & -0.0015 & -0.0183 & -0.3679 & 0.3879 \end{bmatrix} \]
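These matrices are easy to check numerically. The standalone snippet below (variable names are ours) reproduces \( W \), \( D \), and \( L \) for the five points:

```python
import numpy as np

X = np.array([[1, 2], [2, 3], [3, 3], [4, 2], [5, 1]], dtype=float)

# Gaussian weights with sigma = 1, i.e. exp(-||xi - xj||^2 / 2), zero diagonal.
sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
W = np.exp(-sq_dists / 2.0)
np.fill_diagonal(W, 0.0)

D = np.diag(W.sum(axis=1))
L = D - W
print(np.round(W, 4))  # matches the weight matrix above
print(np.round(L, 4))  # matches the Laplacian above
```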
Step 2: Compute Eigenvalues and Eigenvectors
1. Solve the generalized eigenvalue problem for \( L \):
Compute the eigenvalues and eigenvectors satisfying \( L \mathbf{y} = \lambda D \mathbf{y} \).
2. The smallest eigenvalue is always \( \lambda_1 = 0 \), and its eigenvector is constant, for example:
\[ \mathbf{y}_1 = \begin{bmatrix} 0.5 \\ 0.5 \\ 0.5 \\ 0.5 \\ 0.5 \end{bmatrix} \]
This trivial solution carries no information and is discarded. Suppose the eigenvector of the smallest non-zero eigenvalue \( \lambda_2 \) is (illustrative values):
\[ \mathbf{y}_2 = \begin{bmatrix} 0.7 \\ 0.1 \\ -0.2 \\ 0.3 \\ -0.3 \end{bmatrix} \]
3. Keeping \( \mathbf{y}_2 \) gives a one-dimensional embedding:
\[ \mathbf{Y} = \begin{bmatrix} 0.7 \\ 0.1 \\ -0.2 \\ 0.3 \\ -0.3 \end{bmatrix} \]
Here, each entry of \( \mathbf{Y} \) is the coordinate of the corresponding data point in the one-dimensional embedding space.
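The eigen-step can be run end to end on the same data. This standalone sketch solves the generalized eigenproblem with scipy.linalg.eigh; the eigenvector values it prints are the actual ones for this dataset, so they will differ from the illustrative numbers above:

```python
import numpy as np
from scipy.linalg import eigh

X = np.array([[1, 2], [2, 3], [3, 3], [4, 2], [5, 1]], dtype=float)
sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
W = np.exp(-sq_dists / 2.0)   # Gaussian weights, sigma = 1
np.fill_diagonal(W, 0.0)
D = np.diag(W.sum(axis=1))
L = D - W

# Solve L y = lambda D y; eigenvalues come back in ascending order.
eigvals, eigvecs = eigh(L, D)
Y = eigvecs[:, 1]   # drop the constant eigenvector, keep a 1D embedding
print(np.round(eigvals, 4))
print(np.round(Y, 4))
```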
Summary
Laplacian Eigenmaps projects high-dimensional data into a lower-dimensional space while preserving local neighborhood relationships.
Steps: Construct a similarity graph, compute the graph Laplacian, solve the generalized eigenvalue problem, and use the eigenvectors of the smallest non-zero eigenvalues as the low-dimensional embedding.
Applications: Effective for visualizing data on non-linear manifolds and understanding its intrinsic geometry.
By using Laplacian Eigenmaps, you can effectively reduce the dimensionality of complex datasets while preserving their underlying local structures.
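For real datasets there is no need to hand-roll the algorithm: scikit-learn's SpectralEmbedding class implements Laplacian Eigenmaps. A brief usage sketch, choosing \( \gamma = 1/(2\sigma^2) = 0.5 \) so the RBF affinity matches the \( \sigma = 1 \) kernel used above:

```python
import numpy as np
from sklearn.manifold import SpectralEmbedding

X = np.array([[1, 2], [2, 3], [3, 3], [4, 2], [5, 1]], dtype=float)

# affinity='rbf' uses exp(-gamma * ||xi - xj||^2); gamma = 0.5 matches sigma = 1.
embedding = SpectralEmbedding(n_components=1, affinity='rbf', gamma=0.5)
Y = embedding.fit_transform(X)  # one embedding coordinate per data point
print(Y.ravel())
```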