Special Topics in Linear Algebra
Special Topics in Linear Algebra was a course both in fundamental linear algebraic theory as well as in the applications of linear algebra for solving a wide range of problems
â€‹
Select course topics can be found below:
Theory:

Field/Vector Space Characterization

RankNullity Theorem

Isomorphism

Change of Base Matrices

CauchySchwarz / Triangle Inequality

Spectral Theorem

Power Vectors / Jordan Canonical Form

Matrix Exponentials

Kronecker Products
Applications:

GramSchmidt Orthogonalization

Polynomial Regression

Spectral Graph Theory for Clustering

CourantFischer Theorem

Graph Matching (Sylvester Equation)

Rayleigh Quotient (estimating eigenvalues)

Gershgorin Circle Theorem

Matrix Compression

Cauchy Interlacing Theorem
â€‹
Another requirement of the course was to write a technical paper on a topic related to a course. I chose to write a paper on random projections. The paper covered the theory behind random projections as well as applications, including the application of random projections to kernelized kmeans as a method to offset the increased timecomplexity cost of kernelizing kmeans. This is a method that can be applied to any kernel method, and seems to have surprisingly few published papers dedicated to it and (to my knowledge) scarce application in realworld scenarios.
The paper was codeintensive, and saw me writing over 4,500 lines of code. The code is very flexible and, among other things, allows for:

The sampling of points of any number of Nspheres in a space with any dimension, the Nspheres being centered at any specified point within the space and with any radius.

Similar sampling but with with multivariate Gaussians.

The automatic processing of kmeans data into useful partitions, with the option of storing accuracy data plots.

The storing of metadata when doing a gridsearch for the purposes of easy runtime and accuracy comparison based on grid parameters (e.g. type of random projection, kernelized boolean, kernel variance, projection dimension, etc.).

Useful file/data handling to automatically generate PNGs, FIG, and MP4 (in the case of 3D data) files, automatically sorted and named according to the data set used, the random projection method used, and various methodspecific parameters (e.g. Nsphere dimension, kernel variance variance etc, kernel type, etc.)
â€‹
More information on each section of the paper can be found below, as well as a PDF of the paper at the bottom of the page for inwindow viewing.
Theory
An overview of the theoretical foundations of random projections, various constructive realizations of random projections, and examples & experimental verification are found in this section
After introductory remarks, my final paper presents the mathematical theory behind random projections. In particular, I introduce the JohnsonLindenstrauss (JL) Lemma, which provides an upper bound on dimension needed to project data down within some margin of errorâ€‹. I go over the formal statement (seen below), an interpretation of the statement, and an exploration of what makes this result so important.
Next, noting that the JL Lemma is not constructive, I introduce the general process of dimensionality reduction with random projections, and present three methods of random projection.
â€‹â€‹
I go on to explore these projection methods, and how their accuracy depends on the degree to which the dimensionality of the data is reduced and on allowed error
To conclude this part of my paper, I evaluate random projections as a general dimensionality reduction technique, and compare it to another popular method  Principal Component Analysis (PCA). I Include an overview of PCA as well as a review of the basic math required to understand the method (spectral theorem/PSD covariance matrices).
â€‹
â€‹
In particular, I note that Gaussian random projection is robust even for nonconvex data
â€‹â€‹
â€‹
â€‹
PCA basis for 1D subspace of R3
Spectrum of covariance matrix of the data
Spectrum demonstrates ineffectiveness of PCA on nonconvex data
Accuracy heatmap of Gaussian random projection demonstrates effectiveness on nonconvex data
Applications
In this section I examine potential applications for random projections,
in particular for expediting kmeans clustering
The next part of my paper deals with potential applications for random projections. I first provide a recap of the KMeans unsupervised clustering method. This method seeks to find a partition of the data by determining the optimal locations of "centroids", which define the center of closedballs in the space. All points that fall within a given closedball are given the same label. The centroid locations are optimal in the sense that they minimize the sum of squared error over all points, where error is usually defined as the Euclidean distance between a point and its associated centroid.
â€‹
Example data can be seen here, along with the partition that KMeans would find.
Unlabeled Data Clustered Data
Raw data sampled from two underlying convex distributions
Data clustered using ordinary kmeans
I go on to highlight how ordinary KMeans fails on nonconvex data, as seen here:
Unlabeled Data Clustered Data
Raw data sampled from two underlying nonconvex distributions
Clustering via ordinary kmeans demonstrates its inability to capture the underlying structure of the data
And how kernelization can resolve this failure by projecting the data to a higher dimension in which they are linearly separable, shown here:
Separated Data with Linear Separator Clustered Data
The same nonconvex data projected into R3, along with convex boundaries (2spheres) and linear separator (gray)
The same nonconvex data clustered using ordinary kmeans after kernelization
Finally, I go on to show how random projections can be used to offset the runtime cost that stems from kernelization, providing experimental evidence. Due to the different constructive realizations of random projections (including sparse ones), the fact that the JL lemma provides only an upper bound for projection dimension, and the fact that random projection works on any data set (not just those sampled from manifolds), bigO analysis is not included.
Standard Kernel KMeans (dashed)
vs.
Projected Kernel KMeans (solid) runtimes
Demonstration of how random projections can expedite kernelized kmeans
Results of Kernel KMeans on three 2Spheres
Clustering using Gaussian random projections for kernelized kmeans very effective on nonconvex, partially intersecting data