Research Interests

My research uses tools from harmonic, functional, and numerical analysis to develop new methods for understanding and accomplishing data analysis tasks. I am interested in both the theory and the algorithmic aspects of finding low-dimensional structure within high-dimensional data. Conversely, I am also interested in dimensionality reduction methods which embed high-dimensional data in low dimensions faithfully and interpretably. These interests have led me to work with collaborators across many disciplines on problems like Subspace Clustering, Graph Sketching, and Low-rank Matrix Approximations. I am also interested in signal processing, including the study of function spaces that model classes of signals, and have worked extensively on the interaction between high-dimensional Sampling Theory and Radial Basis Function approximation schemes. This page gives high-level overviews of some specific ongoing projects; the theme tying them together is the aim of eventually being able to learn, in a data-driven way, unions of manifolds appearing within large-scale data. For a full list of my publications, see my publication page.

For an explanation of my research which is accessible to a general audience, see the recent UofA spotlight.

Subspace Clustering

The Subspace Clustering Problem is an unsupervised Machine Learning problem in which data in a high-dimensional Euclidean space is assumed to come from a union of low-dimensional affine subspaces, and the goal is to partition the data so that each cluster corresponds to the data from a single subspace. Subsequently, one can find a basis for each subspace. Many applications produce data which naturally fits this framework, including motion segmentation, particle picking in cryo-electron microscopy, and object recognition under different illumination patterns.
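The setup can be illustrated with a toy sketch (this is only an illustration of the problem, not the CUR-based algorithm described below; the data, threshold, and angle-based affinity are choices made for the example): points drawn from two lines through the origin in R^3 are clustered by thresholding the cosine of the angle between them and taking connected components.

```python
# Toy illustration of the subspace clustering setup: points from two
# 1-dimensional subspaces (lines through the origin) in R^3, clustered
# by the angle between them.
import math

def unit(v):
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

# Samples from two lines: span{(1,0,0)} and span{(0,1,1)}.
points = [[2, 0, 0], [-1, 0, 0], [3, 0, 0],
          [0, 1, 1], [0, -2, -2], [0, 0.5, 0.5]]

def affinity(p, q):
    # |cos(angle)| equals 1 exactly when p and q lie on the same line.
    return abs(sum(a * b for a, b in zip(unit(p), unit(q))))

def cluster(pts, tau=0.9):
    # Connected components of the thresholded affinity graph.
    labels = [-1] * len(pts)
    cur = 0
    for i in range(len(pts)):
        if labels[i] == -1:
            stack = [i]
            labels[i] = cur
            while stack:
                j = stack.pop()
                for k in range(len(pts)):
                    if labels[k] == -1 and affinity(pts[j], pts[k]) > tau:
                        labels[k] = cur
                        stack.append(k)
            cur += 1
    return labels

print(cluster(points))  # → [0, 0, 0, 1, 1, 1]
```

Real subspace clustering methods must cope with noise, intersecting subspaces, and higher subspace dimensions, which is where the spectral and CUR-based machinery comes in.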

With Akram Aldroubi (Math, Vanderbilt University), Ali Sekmen (Computer Science, Tennessee State University), and Bugra Koku (Mechanical Engineering, Middle East Technical University), I proved that CUR decompositions lead to a solution of the Subspace Clustering problem, and we gave an algorithm that is robust to noise and achieved the best performance to date on the Hopkins155 motion segmentation dataset.

CUR Decompositions

CUR decompositions are relatively little-known matrix factorizations obtained by choosing column and row submatrices of a given matrix. Longxiu Huang (Math, UCLA) and I have characterized CUR decompositions and carried out a thorough perturbation analysis for variants of this factorization. Additionally, we have given sampling guarantees which show when randomly sampled column and row submatrices yield a CUR decomposition of a low-rank matrix with high probability.
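A minimal sketch of the idea, under the simplest assumptions (an exactly rank-2 matrix and hand-picked rows and columns whose intersection W is invertible; the specific matrix is made up for the example): in that case A = C W^{-1} R holds exactly.

```python
# Minimal CUR decomposition sketch: for a rank-2 matrix A, choosing 2 rows
# and 2 columns whose intersection W is invertible gives A = C W^{-1} R.
def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

# A rank-2 matrix built from two outer products.
u1, v1 = [1, 2, 3, 4], [1, 0, 1, 0]
u2, v2 = [0, 1, 0, 1], [0, 1, 1, 2]
A = [[u1[i] * v1[j] + u2[i] * v2[j] for j in range(4)] for i in range(4)]

rows, cols = [0, 1], [0, 1]                        # selected indices
C = [[A[i][j] for j in cols] for i in range(4)]    # column submatrix
R = [[A[i][j] for j in range(4)] for i in rows]    # row submatrix
W = [[A[i][j] for j in cols] for i in rows]        # their intersection

# Invert the 2x2 intersection W by the cofactor formula.
det = W[0][0] * W[1][1] - W[0][1] * W[1][0]
W_inv = [[W[1][1] / det, -W[0][1] / det],
         [-W[1][0] / det, W[0][0] / det]]

A_rec = matmul(matmul(C, W_inv), R)
print(A_rec == A)  # → True
```

In general one replaces W^{-1} by a pseudoinverse, and the perturbation and sampling questions above concern what happens when A is only approximately low-rank and the rows and columns are drawn at random.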

Graph Sketches

Graph sketches are subgraphs of a large graph which approximately preserve some property of the initial graph, such as pairwise distances between vertices. In this case, the sketch is called a graph spanner. With Stephen Kobourov’s group at the University of Arizona along with Greg Bodwin, I have written a survey of graph spanners.
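The classical greedy construction of a t-spanner (due to Althöfer et al.) gives a feel for the object: scan edges by increasing weight and keep an edge only if the spanner built so far does not already connect its endpoints within t times that edge's weight. A small sketch (the example graph is made up):

```python
# Greedy t-spanner sketch: keep an edge (u, v, w) only if the current
# spanner distance between u and v exceeds t * w.
import heapq

def dijkstra(adj, src, dst):
    # Shortest-path distance from src to dst in the current spanner.
    dist = {src: 0.0}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            return d
        if d > dist.get(u, float("inf")):
            continue
        for v, w in adj.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return float("inf")

def greedy_spanner(edges, t):
    adj, spanner = {}, []
    for u, v, w in sorted(edges, key=lambda e: e[2]):
        if dijkstra(adj, u, v) > t * w:
            spanner.append((u, v, w))
            adj.setdefault(u, []).append((v, w))
            adj.setdefault(v, []).append((u, w))
    return spanner

# A 4-cycle plus a heavier diagonal: for t = 2 the diagonal is dropped,
# since the path around the cycle already meets the stretch bound.
edges = [(0, 1, 1), (1, 2, 1), (2, 3, 1), (3, 0, 1), (0, 2, 1.5)]
print(greedy_spanner(edges, 2))  # → [(0, 1, 1), (1, 2, 1), (2, 3, 1), (3, 0, 1)]
```

Every pairwise distance in the output subgraph is at most t times the distance in the original graph, which is the defining property of a spanner.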

We have also introduced the subject of multi-level graph spanners, and considered several algorithmic solutions to the problem. Many Manifold Learning algorithms approximate geodesic distances via a sparse neighborhood graph; we hope to apply spanner techniques in this arena in the near future.

Quasi Shift-invariant Spaces

Sampling theory asks the question: given a class of functions, if we have access to a set of discrete measurements of a given function f, is it possible to reconstruct f from these measurements? If so, how can this reconstruction be done in a computationally efficient manner?

Modern sampling theory has attempted to go as far beyond Paley–Wiener (or bandlimited) signals as possible, and my research has been on interpolation of smooth functions (Sobolev or Triebel–Lizorkin classes) by approximants in quasi shift-invariant spaces, which are relatively unstructured generalizations of the space of bandlimited signals.
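A small illustration of reconstruction in a shift-invariant space (a deliberately simple case, not the Sobolev/Triebel–Lizorkin setting of the research above): take the linear B-spline, the "hat" function, as generator. Then the series f(x) = Σ_k c_k φ(x − k) with c_k = f(k) interpolates the integer samples exactly; quasi shift-invariant spaces relax this by allowing nonuniform shift nodes x_k.

```python
# Reconstruction in the shift-invariant space generated by the linear
# B-spline: the interpolant reproduces the samples at the integers and
# is piecewise linear in between.
import math

def hat(x):
    # Linear B-spline: equals 1 at 0, supported on (-1, 1).
    return max(0.0, 1.0 - abs(x))

def reconstruct(samples, x):
    # samples[k] = f(k) for k = 0..n-1; evaluate sum_k c_k * hat(x - k).
    return sum(c * hat(x - k) for k, c in enumerate(samples))

def f(x):
    return math.sin(0.5 * x)

samples = [f(k) for k in range(8)]

print(abs(reconstruct(samples, 3) - f(3)) < 1e-12)   # exact at the nodes
print(reconstruct(samples, 2.5))                     # average of f(2), f(3)
```

Bandlimited reconstruction is the same picture with the sinc function in place of the hat; the research above studies how far such schemes extend when the generator and the sample locations are far less structured.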