Algorithmic and Statistical Perspectives on Large-Scale Data Analysis

10/08/2010
by Michael W. Mahoney, et al.

In recent years, ideas from statistics and scientific computing have begun to interact in increasingly sophisticated and fruitful ways with ideas from computer science and the theory of algorithms, aiding the development of improved worst-case algorithms that are useful for large-scale scientific and Internet data analysis problems. In this chapter, I will describe two recent examples that drew on ideas from both areas: one concerns selecting good columns or features from a (DNA Single Nucleotide Polymorphism) data matrix, and the other concerns selecting good clusters or communities from a data graph (representing a social or information network). These examples may serve as a model for exploiting complementary algorithmic and statistical perspectives in order to solve applied large-scale data analysis problems.

research
03/04/2012

Approximate Computation and Implicit Regularization for Very Large-scale Data Analysis

Database theory and database practice are typically the domain of comput...
research
01/30/2013

Statistical mechanics of complex neural systems and high dimensional data

Recent experimental advances in neuroscience have opened new vistas into...
research
06/23/2013

A Statistical Perspective on Algorithmic Leveraging

One popular method for dealing with large-scale data sets is sampling. F...
research
11/30/2020

What are the most important statistical ideas of the past 50 years?

We argue that the most important statistical ideas of the past half cent...
research
01/02/2023

Science Platforms for Heliophysics Data Analysis

We recommend that NASA maintain and fund science platforms that enable i...
research
05/25/2015

Statistical and Algorithmic Perspectives on Randomized Sketching for Ordinary Least-Squares -- ICML

We consider statistical and algorithmic aspects of solving large-scale l...
research
03/13/2019

GNA: new framework for statistical data analysis

We report on the status of GNA — a new framework for fitting large-scale...
