Overview

My research interests lie at the intersection of numerical linear algebra, algebraic graph theory, and data science. In particular, my work has made notable contributions to the areas of matrix polynomials, the numerical solution of polynomial equations, the rankability of data. More recently, I have worked on a combinatorial optimization approach to measuring the rankability of data, and I have developed a novel approach to characterizing digraphs via the numerical range.

Research Statement

Spectral graph theory has a long and interesting history which intersects the seminal works on the algebraic connectivity, i.e., the second smallest eigenvalue of the graph Laplacian. Furthermore, the eigenvalues of the graph Laplacian have been used to characterize graphs and have applications to data mining, image processing, mixing of Markov chains, chromatic numbers, and much more. However, there has been far less success in the study of the spectra of directed graphs, which is mainly due to the asymmetry of the associated matrices.

Most of the results on the spectra of digraphs are specific to the adjacency matrix or the (symmetric) Laplacian matrix for strongly connected digraphs, see Brualdi and the references therein. A few notable exceptions, which apply to the (asymmetric) Laplacian matrix for digraphs, include the results of Bauer and our recent work on characterizing acyclic tournaments in CamLanSmi2020. Even more recently, we have developed a novel method which uses the "restricted numerical range" of the Laplacian matrix to characterize digraphs, see CamRobWie2020

Characterization of Acyclic Tournaments

Recently, the rankability problem was proposed, which seeks to quantify a dataset's inherent ability to produce a meaningful ranking of its items. For instance, in the context of a round-robin tournament, a perfectly rankable season could be modeled as an acyclic tournament digraph. In CamLanSmi2020, we give a spectral-degree characterization of such digraphs and use that characterization to formulate a measure of rankability.

Below is our rankability measure for each year in the Big East NCAA College Football conference from 1995 to 2012. In addition, we provide a ranking fit measure, which is equal to the percentage of games correctly predicted by a ranking obtained from the linear ordering problem (LOP).

Big East Rankability
There is a strong correlation between rankability and ranking fit, with a spearman correlation of .9004. For digraphs with an edge between every pair of vertices, this correlation is expected since the LOP produces a unique ranking with no violations if and only if the digraph is an acyclic tournament. Below we include the digraphs associated with the years 2001 and 2007, which were some of the highest and lowest rankability years, respectively. Note that the edge (i,j) exists if team i beat team j; also, the vertices are labeled according the teams' ranking as given by the LOP.
Big East Years 2001 and 2007
For 2001, there is only one violation to the ranking, i.e., the edge (5,3). Moreover, the vertices 3, 4, and 5 are the only vertices involved in a cycle. By contrast, in 2007, there are many violations to the ranking, e.g., the edges (5,1), (6,2), (7,1), and (7,2); also, all vertices are involved in a cycle. Interestingly, there is only one cycle in the 2001 digraph and there are 205 cycles in the 2007 digraph.

Our rankability measure is effective for data that can be modeled as a digraph with an edge between every pair of vertices. However, there are many datasets that cannot be modeled this way, e.g., the network of all college football teams is very sparse since many teams never play each other. This observation motivates the following problems.

  • Can we develop a rankability measure that is based on properties of the LOP, e.g., the optimal value and the set of optimal solutions, rather than the acyclic tournament characterization?
  • What other graph properties (if any) will help us determine when the LOP has a unique optimal solution, not necessarily without violations?
In applications, the LOP often has multiple optimal solutions. While methods for finding all optimal solutions exist, e.g., SCIP's count method, the set of all optimal solutions is often so big that it is not feasible to analyze. In such cases, it may be better to consider two extremal optimal solutions, i.e., solutions that are furthest apart as measured by the Kendall tau distance. This observation motivates the following problems.
  • Can we develop a discrete optimization model that will solve for two extremal solutions of an LOP?
  • What other factors should we consider to determine which of the optimal solutions of an LOP is "best"? Theoretically, we might consider the optimal solution that minimizes the sum of the Kendall tau distances to the other optimal solutions. In applications, there may be other factors to consider which will favor one optimal solution over the others.

The Restricted Numerical Range

In CamRobWie2020, we develop the restricted numerical range, which is a convex set in the complex plan that is produced via the Laplacian matrix of a digraph. We show that the minimum real part of the restricted numerical range is equal to the algebraic connectivity of the digraph as defined by Wu. In fact, the algebraic connectivity is an eigenvalue of of the symmetric part of the restricted Laplacian. This observation motivates the following problem.

  • Can we use an eigenvector of the symmetric part of the restricted Laplacian associated with the algebraic connectivity to perform a clustering of the digraph? In Wu's paper, bounds related to the maximum directed cut, isoperimetric number, and the algebraic connectivity are proved, which suggests that the corresponding eigenvector may be used to bi-partition the vertices of the digraph.

Moreover, we use the restricted numerical range to characterize digraphs. For instance, a digraph is a cycle on n vertices if and only if its restricted numerical range is a complex polygon with vertices equal to 1 minus the roots of unity, not including zero. In fact, these vertices (including zero) make up the eigenvalues of the Laplacian, which also characterize the cycle. What makes the restricted numerical range so powerful is that it is capable of characterizing digraphs that are not characterized by their Laplacian spectrum.

Consider the k-imploding star defined as the directed join of the empty digraph on (n-k) vertices and the completely connected digraph on k vertices. Examples of the 1-Imploding and 2-Imploding stars are shown below.

Imploding Stars
In CamRobWie2020, we proved that the k-imploding stars are characterized by their singleton {k} restricted numerical range. It is worth noting that the k-imploding star is not characterized by its Laplacian spectrum for values of k between 1 and (n-2) (inclusive). This observation motivates the following problem.
  • Are there digraphs that are characterized by their Laplacian spectrum, but are not characterized by their restricted numerical range?
We suspect the answer to this question is no. Also, note that the k-imploding stars are the only digraphs with a singleton restricted numerical range. Moreover, this result has been extended to show that digraphs that are the directed join of two bidirectional digraphs are characterized by a real restricted numerical range.

Beyond the digraphs with a real restricted numerical range, we have characterized cycles and regular tournaments, both of which have a restricted numerical range that is a complex polygon. This observation motivates the following problem.

  • What other digraphs have a restricted numerical range that is a complex polygon?
Finally, we note that all digraphs with polygonal restricted numerical range, including the degenerate points and lines, must have a non-negative algebraic connectivity. One non-appealing aspect of the algebraic connectivity of digraphs as defined by Wu is that it is possible to have negative algebraic connectivity. This observation motivates the following problem.
  • What digraphs have non-negative algebraic connectivity, i.e., what digraphs have a restricted numerical range that does not intersect the left half of the complex plane?