Laplacian Eigenmaps: Preserving Local Structure

Laplacian Eigenmaps is a non-linear dimensionality reduction technique designed to preserve the local structure of data while mapping it to a lower-dimensional space. It is particularly effective for data that lies on or near a non-linear manifold, supporting visualization and analysis that respect the data's intrinsic geometry.

Overview of Laplacian Eigenmaps

Laplacian Eigenmaps operates by projecting data points onto a lower-dimensional space, aiming to preserve local neighborhood relationships. It leverages graph theory and spectral methods to achieve this goal. The process involves constructing a graph representation of the data, computing the graph Laplacian, and then finding a low-dimensional embedding that minimizes a specific cost function.
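Before walking through the mathematics, here is a minimal end-to-end sketch using scikit-learn, whose `SpectralEmbedding` class implements Laplacian Eigenmaps; the toy dataset and parameter values are arbitrary placeholders:

```python
import numpy as np
from sklearn.manifold import SpectralEmbedding

# Toy data: a noisy 1-D curve embedded in 3-D (an arbitrary example manifold).
rng = np.random.default_rng(0)
t = np.linspace(0, 3 * np.pi, 200)
X = np.column_stack([np.cos(t), np.sin(t), t]) + 0.05 * rng.normal(size=(200, 3))

# Laplacian Eigenmaps: spectral embedding of a k-nearest-neighbor graph.
emb = SpectralEmbedding(n_components=2, affinity="nearest_neighbors", n_neighbors=10)
Y = emb.fit_transform(X)  # shape (200, 2): one low-dimensional row per input point
```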

Mathematical Formulation

Step 1: Construct the Similarity Graph

1. Define the Graph:

Given a dataset \( \mathbf{X} = \{ \mathbf{x}_1, \mathbf{x}_2, \dots, \mathbf{x}_n \} \), where each \( \mathbf{x}_i \in \mathbb{R}^d \) is a data point in \( d \)-dimensional space, construct a graph \( G \) whose nodes correspond to the data points. Connect two nodes with an edge when the points are neighbors, typically determined by a \( k \)-nearest-neighbor rule or an \( \epsilon \)-ball around each point.

2. Similarity Measure:

Use a Gaussian kernel to measure similarity between pairs of points. The weight \( W_{ij} \) of the edge between nodes \( i \) and \( j \) is defined as:

\[ W_{ij} = \exp \left(-\frac{\|\mathbf{x}_i - \mathbf{x}_j\|^2}{2 \sigma^2} \right) \]

where \( \sigma \) is a parameter controlling the width of the Gaussian kernel. For nodes \( i \) and \( j \) that are not directly connected, \( W_{ij} = 0 \).

3. Construct the Weight Matrix:

The weight matrix \( W \) is an \( n \times n \) matrix where each entry \( W_{ij} \) denotes the similarity between data points \( \mathbf{x}_i \) and \( \mathbf{x}_j \). The matrix is typically symmetric, i.e., \( W_{ij} = W_{ji} \).
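A minimal NumPy sketch of this construction; the function name and the optional k-nearest-neighbor sparsification are illustrative choices, not part of the formulation above:

```python
import numpy as np

def gaussian_weight_matrix(X, sigma=1.0, k=None):
    """Pairwise Gaussian similarities; optionally keep only k-nearest-neighbor edges."""
    # Squared Euclidean distance between every pair of rows of X.
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-sq_dists / (2.0 * sigma**2))
    np.fill_diagonal(W, 0.0)  # no self-loops
    if k is not None:
        # Keep each point's k strongest edges, then symmetrize W.
        keep = np.argsort(W, axis=1)[:, -k:]
        mask = np.zeros_like(W, dtype=bool)
        mask[np.arange(len(W))[:, None], keep] = True
        W = np.where(mask | mask.T, W, 0.0)
    return W
```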

Step 2: Construct the Graph Laplacian

1. Degree Matrix:

The degree matrix \( D \) is a diagonal matrix whose entry \( D_{ii} \) is the sum of the edge weights incident to node \( i \):

\[ D_{ii} = \sum_{j} W_{ij} \]

2. Laplacian Matrix:

Compute the unnormalized graph Laplacian \( L \) as:

\[ L = D - W \]

Alternatively, compute the normalized graph Laplacian \( L_{norm} \):

\[ L_{norm} = I - D^{-1/2} W D^{-1/2} \]

where \( I \) is the identity matrix.
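Both Laplacians follow directly from \( W \); a short sketch, assuming a weight matrix like the one built above and that no node is isolated (every degree is positive):

```python
import numpy as np

def graph_laplacians(W):
    """Return the unnormalized and symmetrically normalized graph Laplacians."""
    d = W.sum(axis=1)                       # degrees: row sums of W
    D = np.diag(d)
    L = D - W                               # unnormalized Laplacian
    d_inv_sqrt = np.diag(1.0 / np.sqrt(d))  # requires d > 0 (no isolated nodes)
    L_norm = np.eye(len(W)) - d_inv_sqrt @ W @ d_inv_sqrt
    return L, L_norm, D
```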

Step 3: Compute Eigenvalues and Eigenvectors

1. Eigenproblem:

Solve the generalized eigenvalue problem for the graph Laplacian \( L \):

\[ L \mathbf{y}_i = \lambda_i D \mathbf{y}_i \]

Here, \( \lambda_i \) are the eigenvalues, and \( \mathbf{y}_i \) are the corresponding eigenvectors.

2. Select Smallest Eigenvalues:

The smallest eigenvalue is always \( \lambda_0 = 0 \), with a constant eigenvector that carries no information; discard it. Choose the eigenvectors corresponding to the \( k \) smallest non-zero eigenvalues. These eigenvectors form the columns of the low-dimensional embedding matrix \( \mathbf{Y} \).

3. Low-Dimensional Embedding:

The low-dimensional coordinates of the data points are given by:

\[ \mathbf{Y} = \begin{bmatrix} \mathbf{y}_1 & \mathbf{y}_2 & \cdots & \mathbf{y}_k \end{bmatrix} \]

Each row of \( \mathbf{Y} \) represents the coordinates of a data point in the \( k \)-dimensional space.
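A sketch of this step with SciPy, whose `eigh` routine solves the generalized symmetric eigenproblem directly; the helper name is illustrative:

```python
import numpy as np
from scipy.linalg import eigh

def laplacian_eigenmap(L, D, k=2):
    """Embed via the k smallest non-trivial generalized eigenvectors of (L, D)."""
    eigvals, eigvecs = eigh(L, D)  # solves L y = lambda D y, eigenvalues ascending
    # Column 0 has eigenvalue ~0 with a constant eigenvector; it is discarded.
    return eigvecs[:, 1:k + 1]     # shape (n, k): row i is point i's embedding
```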

Example: Applying Laplacian Eigenmaps

Step 1: Construct the Similarity Graph

Consider a dataset of 5 points in 2D:

\[ \mathbf{X} = \begin{bmatrix} 1 & 2 \\ 2 & 3 \\ 3 & 3 \\ 4 & 2 \\ 5 & 1 \end{bmatrix} \]

1. Compute the pairwise similarities using a Gaussian kernel with \( \sigma = 1 \):

\[ W_{ij} = \exp \left(-\frac{\|\mathbf{x}_i - \mathbf{x}_j\|^2}{2 \sigma^2} \right) \]

The resulting weight matrix \( W \) (entries rounded to four decimal places) is:

\[ W = \begin{bmatrix} 0 & 0.3679 & 0.0821 & 0.0111 & 0.0002 \\ 0.3679 & 0 & 0.6065 & 0.0821 & 0.0015 \\ 0.0821 & 0.6065 & 0 & 0.3679 & 0.0183 \\ 0.0111 & 0.0821 & 0.3679 & 0 & 0.3679 \\ 0.0002 & 0.0015 & 0.0183 & 0.3679 & 0 \end{bmatrix} \]

2. Construct the degree matrix \( D \):

\[ D = \begin{bmatrix} 0.4613 & 0 & 0 & 0 & 0 \\ 0 & 1.0580 & 0 & 0 & 0 \\ 0 & 0 & 1.0748 & 0 & 0 \\ 0 & 0 & 0 & 0.8290 & 0 \\ 0 & 0 & 0 & 0 & 0.3879 \end{bmatrix} \]

3. Compute the Laplacian matrix \( L \):

\[ L = D - W \]

This results in:

\[ L = \begin{bmatrix} 0.4613 & -0.3679 & -0.0821 & -0.0111 & -0.0002 \\ -0.3679 & 1.0580 & -0.6065 & -0.0821 & -0.0015 \\ -0.0821 & -0.6065 & 1.0748 & -0.3679 & -0.0183 \\ -0.0111 & -0.0821 & -0.3679 & 0.8290 & -0.3679 \\ -0.0002 & -0.0015 & -0.0183 & -0.3679 & 0.3879 \end{bmatrix} \]
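These matrices can be reproduced numerically; a short check (the values above are rounded to four decimal places):

```python
import numpy as np

X = np.array([[1, 2], [2, 3], [3, 3], [4, 2], [5, 1]], dtype=float)
sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
W = np.exp(-sq_dists / 2.0)  # Gaussian kernel with sigma = 1
np.fill_diagonal(W, 0.0)
D = np.diag(W.sum(axis=1))
L = D - W
print(np.round(W, 4), np.round(np.diag(D), 4), np.round(L, 4), sep="\n\n")
```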

Step 2: Compute Eigenvalues and Eigenvectors

1. Solve the generalized eigenvalue problem:

Using the \( L \) and \( D \) computed above, solve \( L \mathbf{y} = \lambda D \mathbf{y} \) for the eigenvalues and eigenvectors.

2. The smallest eigenvalue is \( \lambda_0 = 0 \), and its eigenvector is constant across all nodes, so it is discarded. Suppose the eigenvectors for the next two smallest eigenvalues are (illustrative values):

\[ \mathbf{y}_1 = \begin{bmatrix} 0.7 \\ 0.3 \\ 0.0 \\ -0.3 \\ -0.7 \end{bmatrix}, \quad \mathbf{y}_2 = \begin{bmatrix} 0.5 \\ -0.3 \\ -0.4 \\ -0.3 \\ 0.5 \end{bmatrix} \]

3. The low-dimensional embedding \( \mathbf{Y} \) (in 2D) is:

\[ \mathbf{Y} = \begin{bmatrix} 0.7 & 0.5 \\ 0.3 & -0.3 \\ 0.0 & -0.4 \\ -0.3 & -0.3 \\ -0.7 & 0.5 \end{bmatrix} \]

Here, each row of \( \mathbf{Y} \) gives the coordinates of the corresponding data point in the 2D embedding space; note how the first coordinate orders the points along the underlying curve.
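The eigenvectors above are illustrative; to compute the actual embedding for this example, solve the generalized eigenproblem numerically. A self-contained sketch using SciPy:

```python
import numpy as np
from scipy.linalg import eigh

X = np.array([[1, 2], [2, 3], [3, 3], [4, 2], [5, 1]], dtype=float)
W = np.exp(-np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1) / 2.0)
np.fill_diagonal(W, 0.0)
D = np.diag(W.sum(axis=1))
L = D - W

eigvals, eigvecs = eigh(L, D)  # L y = lambda D y, eigenvalues in ascending order
Y = eigvecs[:, 1:3]            # drop the trivial constant eigenvector (lambda = 0)
print(np.round(Y, 4))          # one 2-D coordinate pair per original data point
```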

Summary

Laplacian Eigenmaps projects high-dimensional data into a lower-dimensional space while preserving local neighborhood relationships.

Steps: Construct a similarity graph, compute the graph Laplacian, solve the (generalized) eigenvalue problem, and use the eigenvectors of the smallest non-zero eigenvalues as the low-dimensional embedding.

Applications: Effective for visualizing data on non-linear manifolds and understanding its intrinsic geometry.

By using Laplacian Eigenmaps, you can reduce the dimensionality of complex datasets while preserving their underlying local structure.
