Randomized Algorithms for Matrix Computations and Applications to Data Mining

Petros Drineas, RPI

The introduction of randomization in the design and analysis of algorithms for matrix computations (such as matrix multiplication, least-squares regression, the Singular Value Decomposition (SVD), etc.) over the last decade provided a new paradigm and a complementary perspective to traditional numerical linear algebra approaches. These novel approaches were motivated by technological developments in many areas of scientific research that permit the automatic generation of large data sets, which are often modeled as matrices. In this talk we will focus on the Singular Value Decomposition (SVD) of matrices and the related Principal Components Analysis, which have found numerous applications in extracting structure from datasets in diverse domains, ranging from the social sciences and the internet to biology and medicine. The extracted structure is expressed in terms of singular vectors, which are linear combinations of all the input data and lack an intuitive physical interpretation. We shall discuss matrix decompositions which express the structure in a matrix as linear combination of actual columns (or rows) of the matrix. Such decompositions are easier to interpret in applications, since the selected columns and rows are subsets of the data. Our (randomized) algorithms run in cubic time and come with strong accuracy guarantees. Finally, we will demonstrate how these decompositions may be applied in order to identify ancestry-informative DNA markers that may be used to assign individuals to populations of origin.

Speaker Biography

Prof. Drineas is an assistant professor in the Computer Science Department of Rensselaer Polytechnic Institute, which he joined in 2003. Prof. Drineas earned a doctorate in computer science from Yale University in 2003 and a bachelor in computer engineering from the University of Patras, Greece, in 1997. His research interests lie in the area of the design and analysis of algorithms, and in particular the design and analysis of randomized algorithms for linear algebraic problems. Prof. Drineas is the recipient of an NSF CAREER award.