An Improved Approximation Algorithm for the Column Subset Selection Problem (2009)
Christos Boutsidis, Michael W. Mahoney, Petros Drineas
We consider the problem of selecting the “best ” subset of exactly k columns from an m × n matrix A. In particular, we present and analyze a novel two-stage algorithm that runs in O(min{mn 2, m...
Unsupervised Feature Selection for Principal Components Analysis [Extended Abstract] (2009)
Christos Boutsidis, Michael W. Mahoney, Petros Drineas
Principal Components Analysis (PCA) is the predominant linear dimensionality reduction technique, and has been widely applied on datasets in all scientific domains. We consider, both theoretically...
An Improved Approximation Algorithm for the Column Subset Selection Problem (2009)
Christos Boutsidis, Michael W. Mahoney, Petros Drineas
Abstract We consider the problem of selecting the "best " sub-set of exactly k columns from an m * n matrix A. Inparticular, we present and analyze a novel two-stage algorithm that runs in...
Petros Drineas, Michael W. Mahoney, S. Muthukrishnan
Abstract. Many data analysis applications deal with large matrices and involve approximating the matrix using a small number of “components. ” Typically, these components are linear combinations...
Michael W. Mahoney, Mauro Maggioni, Petros Drineas
Abstract. Motivated by numerous applications in which the data may be modeled by a variable subscripted by three or more indices, we develop a tensor-based extension of the matrix CUR decomposition....
Deterministic and randomized column selection algorithms for matrices (2008)
Christos Boutsidis, Petros Drineas
Abstract. Given a matrix A ∈ R m×n (m ≥ n) and an integer k (k ≪ n) we discuss deterministic and randomized algorithms for selecting the k “most linearly independent ” columns of A. After...
Random Projections for the Nonnegative Least-Squares Problem (2008)
Boutsidis, Christos, Drineas, Petros
Constrained least-squares regression problems, such as the Nonnegative Least Squares (NNLS) problem, where the variables are restricted to take only nonnegative values, often arise in applications....
An Improved Approximation Algorithm for the Column Subset Selection Problem (2008)
Boutsidis, Christos, Mahoney, Michael W., Drineas, Petros
We consider the problem of selecting the "best" subset of \emph{exactly $k$ columns} from an $m \times n$ matrix $A$. In particular, we present and analyze a novel two-stage algorithm that runs in...
An Experimental Evaluation of a Monte-Carlo Algorithm for Singular Value Decomposition (2008)
Petros Drineas, Eleni Drinea, Patrick S. Huggins
Abstract. We demonstrate that an algorithm proposed by Drineas et. al. in [7] to approximate the singular vectors/values ofa matrix A, is not only oftheoretical interest but also a fast, viable...
Methods Intra- and interpopulation genotype reconstruction from tagging SNPs (2008)
Peristera Paschou, Michael W. Mahoney, Asif Javed, Judith R. Kidd, Andrew J. Pakstis, Sheng Gu, ...
The optimal method to be used for tSNP selection, the applicability of a reference LD map to unassayed populations, and the scalability of these methods to genome-wide analysis, all remain subjects...
in Information Retrieval, Data Management, and Data (2008)
The tutorial will cover randomized sampling algorithms that extract structure from very large data sets modeled as matrices or tensors. Both provable algorithmic results and recent work on applying...
Abstract Sampling Algorithms and Coresets for ℓp Regression (2008)
Anirban Dasgupta, Petros Drineas, Boulos Harb, Ravi Kumar, Michael W. Mahoney
The ℓp regression problem takes as input a matrix A ∈ R n×d, a vector b ∈ R n, and a number p ∈ [1, ∞), and it returns as output a number Z and a vector xopt ∈ R d such that Z = min...
FAST UNIVERSALIZATION OF INVESTMENT STRATEGIES ∗ (2008)
Karhan Akcoglu, Petros Drineas, Ming-yang Kao
Abstract. A universalization of a parameterized investment strategy is an online algorithm whose average daily performance approaches that of the strategy operating with the optimal parameters...
Independent Test Sequence Compaction through Integer Programming (2008)
Petros Drineas Computer, Petros Drineas
We discuss the compaction of independent test sequences for sequential circuits. Our first contribution is the formulation of this problem as an integer program, which we then solve through a...
An Experimental Evaluation of a Monte-Carlo (2008)
Algorithm For Singular, Petros Drineas, Eleni Drinea, Patrick S. Huggins
We demonstrate that an algorithm proposed by Drineas et.
Faster Least Squares Approximation (2007)
Drineas, Petros, Mahoney, Michael W., Muthukrishnan, S., Sarlos, Tamas
Least squares approximation is a technique to find an approximate solution to a system of linear equations that has no exact solution. Methods dating back to Gauss and Legendre find a solution in...
PCA-Correlated SNPs for Structure Identification in Worldwide Human Populations (2007)
Peristera Paschou, Elad Ziv, Esteban G. Burchard, Shweta Choudhry, William Rodriguez-Cintron, Michael W. Mahoney, ...
Existing methods to ascertain small sets of markers for the identification of human population structure require prior knowledge of individual ancestry. Based on Principal Components Analysis (PCA),...
Relative-Error CUR Matrix Decompositions (2007)
Drineas, Petros, Mahoney, Michael W., Muthukrishnan, S.
Many data analysis applications deal with large matrices and involve approximating the matrix using a small number of ``components.'' Typically, these components are linear combinations of the rows...
Sampling Algorithms and Coresets for Lp Regression (2007)
Dasgupta, Anirban, Drineas, Petros, Harb, Boulos, Kumar, Ravi, Mahoney, Michael W.
The Lp regression problem takes as input a matrix $A \in \Real^{n \times d}$, a vector $b \in \Real^n$, and a number $p \in [1,\infty)$, and it returns as output a number ${\cal Z}$ and a vector...
Pca-correlated snps for structure identification in worldwide human populations (2007)
Peristera Paschou, Elad Ziv, Esteban G. Burchard, Shweta Choudhry, William Rodriguez-cintron, Michael W. Mahoney, ...
Existing methods to ascertain small sets of markers for the identification of human population structure require prior knowledge of individual ancestry. Based on Principal Components Analysis (PCA),...
Peristera Paschou, Michael W. Mahoney, Asif Javed, Judith R. Kidd, Andrew J. Pakstis, Sheng Gu, ...
data
Pca-correlated snps for structure identification in worldwide human populations (2007)
Peristera Paschou, Elad Ziv, Esteban G. Burchard, Shweta Choudhry, William Rodriguez-cintron, Michael W. Mahoney, ...
Existing methods to ascertain small sets of markers for the identification of human population structure require prior knowledge of individual ancestry. Based on Principal Components Analysis (PCA),...
Pca-correlated snps for structure identification in worldwide human populations (2007)
Peristera Paschou, Elad Ziv, Esteban G. Burchard, Shweta Choudhry, William Rodriguez-cintron, Michael W. Mahoney, ...
Existing methods to ascertain small sets of markers for the identification of human population structure require prior knowledge of individual ancestry. Based on Principal Components Analysis (PCA),...
Intra- and interpopulation genotype reconstruction from tagging SNPs (2007)
Paschou, Peristera, Mahoney, Michael W., Javed, Asif, Kidd, Judith R., Pakstis, Andrew J., Gu, Sheng, ...
The optimal method to be used for tSNP selection, the applicability of a reference LD map to unassayed populations, and the scalability of these methods to genome-wide analysis, all remain subjects...
Sampling algorithms for `2 regression and applications (2006)
Petros Drineas, Michael W. Mahoney
S. Muthukrishnan z
Subspace sampling and relative-error matrix approximation: Column-based methods (2006)
Petros Drineas, Michael W. Mahoney, S. Muthukrishnan
Abstract. Given an m×n matrix A and an integer k less than the rank of A, the “best ” rank k approximation to A that minimizes the error with respect to the Frobenius norm is Ak, which is...
Subspace sampling and relative-error matrix approximation: Column-based methods (2006)
Petros Drineas, Michael W. Mahoney, S. Muthukrishnan
Abstract. Much recent work in the theoretical computer science, linear algebra, and machine learning has considered matrix decompositions of the following form: given an m × n matrix A, decompose it...
Intra- and interpopulation genotype reconstruction from tagging SNPs (2006)
Paschou, Peristera, Mahoney, Michael W., Javed, Asif, Kidd, Judith R., Pakstis, Andrew J., Gu, Sheng, ...
The optimal method to be used for tSNP selection, the applicability of a reference LD map to unassayed populations, and the scalability of these methods to genome-wide analysis, all remain subjects...
Petros Drineas, Malik Magdon-ismail, Gopal P, Reino Virrankoski, Andreas Savvides
This paper initiates the principled study of distance reconstruction for distance-based node localization. We address an important issue in node localization by showing that the highly incomplete set...
On the Nystrom Method for Approximating a Gram Matrix for (2005)
Petros Drineas, Michael W. Mahoney
A problem for many kernel-based methods is that the amount of computation required to find the solution scales as O(n ), where n is the number of training examples. We develop and analyze an...
Approximating a Gram Matrix for Improved (2005)
Petros Drineas, Michael W. Mahoney
Abstract. A problem for many kernel-based methods is that the amount of computation required to find the solution scales as O(n ), where n is the number of training examples. We develop and analyze...
Approximating a Gram matrix for improved kernel-based learning (2005)
Petros Drineas, Michael W. Mahoney
Abstract. A problem for many kernel-based methods is that the amount of computation required to find the solution scales as O(n 3), where n is the number of training examples. We develop and analyze...
Petros Drineas, Michael W. Mahoney
CW k C T, where C is a matrix consisting of a small number c of columns of G and Wk is the best rank-k approximation to W, the matrix formed by the intersection between those c columns of G and the...
On the Nyström method for approximating a Gram matrix for improved kernel-based learning (2005)
Petros Drineas, Michael W. Mahoney, Nello Cristianini
A problem for many kernel-based methods is that the amount of computation required to find the solution scales as O(n 3), where n is the number of training examples. We develop and analyze an...
A randomized algorithm for a tensor-based generalization of the Singular Value Decomposition (2005)
Petros Drineas, Michael W. Mahoney
~A
On the Nyström method for approximating a Gram matrix for improved kernel-based learning (2005)
Petros Drineas, Michael W. Mahoney, Nello Cristianini
A problem for many kernel-based methods is that the amount of computation required to find the solution scales as O(n 3), where n is the number of training examples. We develop and analyze an...
Fast Monte Carlo Algorithms for Matrices II: Computing a Low-Rank Approximation to a Matrix (2004)
Petros Drineas, Ravi Kannan, Michael W. Mahoney
matrix A. It is often of interest to nd a low-rank approximation to A, i.e., an approximation D to the matrix A of rank not greater than a speci ed rank k, where k is much smaller than m and n....
Petros Drineas, Ravi Kannan, Michael W. Mahoney, Let A
In many applications, the data consist of (or may be naturally formulated as) an m n matrix A which may be stored on disk but which is too large to be read into RAM or to practically perform...
Sampling Sub-problems of Heterogeneous Max-Cut Problems and Approximation Algorithms (2004)
Petros Drineas, Ravi Kannan, Michael W. Mahoney
Recent work in the analysis of randomized approximation algorithms for NP -hard optimization problems has involved approximating the solution to a problem by the solution of a related sub-problem of...
Petros Drineas, Ravi Kannan, W. Mahoney
Abstract. In many applications, the data consist of (or may be naturally formulated as) an m × n matrix A which may be stored on disk but which is too large to be read into random access memory...
Fast Monte Carlo algorithms for matrices I: Approximating matrix multiplication (2004)
Petros Drineas, Ravi Kannan, W. Mahoney
Abstract. Motivated by applications in which the data may be formulated as a matrix, we consider algorithms for several common linear algebra problems. These algorithms make more efficient use of...
Sampling sub-problems of heterogeneous Max-Cut problems and approximation algorithms (2004)
Petros Drineas, Ravi Kannan, Michaelw. Mahoney
Abstract. Recent work in the analysis of randomized approximation algorithms for NP-hard optimization problems has involved approximating the solution to a problem by the solution of a related...
Petros Drineas, Ravi Kannan, W. Mahoney
Abstract. In many applications, the data consist of (or may be naturally formulated as) an m × n matrix A which may be stored on disk but which is too large to be read into random access memory...
Petros Drineas, Ravi Kannan, W. Mahoney
Abstract. In many applications, the data consist of (or may be naturally formulated as) an m × n matrix A. It is often of interest to find a low-rank approximation to A, i.e., an approximation D to...
Fast Monte Carlo algorithms for matrices I: Approximating matrix multiplication (2004)
Petros Drineas, Ravi Kannan, W. Mahoney
Abstract. Motivated by applications in which the data may be formulated as a matrix, we consider algorithms for several common linear algebra problems. These algorithms make more efficient use of...
Petros Drineas, Ravi Kannan, W. Mahoney
Abstract. In many applications, the data consist of (or may be naturally formulated as) an m × n matrix A. It is often of interest to find a low-rank approximation to A, i.e., an approximation D to...
Petros Drineas, Mukkai S. Krishnamoorthy, Michael D. Sofka, Bülent Yener
Abstract. This work describes a methodology that can be used to identify structure and communication patterns within an organization based on e-mail data. The first step of the method is the...
Randomized algorithms for matrix operations / (2003)
Thesis (Ph. D.)--Yale University, 2003.
Petros Drineas, Yiorgos Makris
We discuss a non-intrusive methodology for concurrent error detection in FSMs. The proposed method is based on compaction and monitoring of the state/output bits of an FSM via parity trees. While...
Pass-Ecient Algorithms for Approximating Large Matrices (2003)
Petros Drine As, Petros Drineas
Introduction We are interested in developing and analyzing fast Monte Carlo algorithms for performing useful computations on large matrices. We consider new methods for common problems such as matrix...
Pass efficient algorithms for approximating large matrices (2003)
1 Summary In many applications, an m \Theta n matrix A is stored on disk and is too large to be read into RAM. Our main result is a succinct easily computed approximation A0 to A which is also an m...
Fast Universalization of Investment Strategies with Provably Good Relative Returns (2002)
Akcoglu, Karhan, Drineas, Petros, Kao, Ming-Yang
A universalization of a parameterized investment strategy is an online algorithm whose average daily performance approaches that of the strategy operating with the optimal parameters determined...
Pass Efficient Algorithm for Approximating Large Matrices (2002)
In many applications, an m × n matrix A is stored on disk and is too large to be read into RAM...
Fast Monte-Carlo Algorithms for Approximate Matrix Multiplication (2001)
Given an m n matrix A and an n p matrix B, we present 2 simple and intuitive algorithms to compute an approximation P to the product A B, with provable bounds for the norm of the \error matrix"...
A Randomized Singular Value Decomposition (2001)
Algorithm For Image, Eleni Drinea, Petros Drineas, Patrick Huggins
The main contribution of this paper is to demonstrate that a new randomized SVD algorithm, proposed by Drineas et. al. in [4], is not only of theoretical interest but also a viable and fast...
Concurrent Fault Detection in Random Combinational Logic (1985)
Petros Drineas, Yiorgos Makris
We discuss a non-intrusive methodology for concurrent fault detection in random combinational logic. The proposed method is similar to duplication, wherein a replica of the circuit acts as a...
Intra- and interpopulation genotype reconstruction from tagging SNPs
Paschou, Peristera, Mahoney, Michael W., Javed, Asif, Kidd, Judith R., Pakstis, Andrew J., Gu, Sheng, ...
The optimal method to be used for tSNP selection, the applicability of a reference LD map to unassayed populations, and the scalability of these methods to genome-wide analysis, all remain subjects...
PCA-Correlated SNPs for Structure Identification in Worldwide Human Populations
Paschou, Peristera, Ziv, Elad, Burchard, Esteban G, Choudhry, Shweta, Rodriguez-Cintron, William, Mahoney, Michael W, ...
Existing methods to ascertain small sets of markers for the identification of human population structure require prior knowledge of individual ancestry. Based on Principal Components Analysis (PCA),...
Tracing Sub-Structure in the European American Population with PCA-Informative Markers
Paschou, Peristera, Drineas, Petros, Lewis, Jamey, Nievergelt, Caroline M., Nickerson, Deborah A., Smith, Joshua D., ...
Genetic structure in the European American population reflects waves of migration and recent gene flow among different populations. This complex structure can introduce bias in genetic association...
Energy Minimization via Graph Cuts: Settling What is Possible
Daniel Freedman And, Daniel Freedman, Petros Drineas
The recent explosion of interest in graph cut methods in computer vision naturally spawns the question: what energy functions can be minimized via graph cuts? This question was first attacked by two...
CUR matrix decompositions for improved data analysis
Mahoney, Michael W., Drineas, Petros
Principal components analysis and, more generally, the Singular Value Decomposition are fundamental data analysis tools that express a data matrix in terms of a sequence of orthogonal or uncorrelated...