The Sum-of-Squares Algorithmic Paradigm in Statistics
(STATS 319: "Literature of Statistics", Winter 2021)
- Instructor: Tselil Schramm
- Time: Mondays 13:00–14:20 Pacific, Winter 2021
- Office hours: by appointment
This seminar course will introduce and explore Sum-of-Squares (SoS) algorithms in the context of statistics. In recent years, the powerful SoS "proofs-to-algorithms" methodology has led to numerous breakthroughs in algorithmic statistics, yielding polynomial-time algorithms with provable guarantees for previously unreachable problems including learning latent variable models (via tensor decomposition), clustering, problems in robust statistics, and more.
The course will begin with an introductory lecture designed to bring the uninitiated up to speed. We will then cover several fundamental topics including tensor decomposition for latent variable models, robust clustering, and fast algorithms based on SoS. After this, the remaining topics will be guided by student choice. See below for a list of possible papers.
The course will follow a seminar format; with the exception of the first lecture, each lecture will be delivered by one or two students.
For each lecture, I will designate a student scribe to produce high-quality notes.
Prerequisites: The prerequisites for this course are "mathematical maturity" and a solid foundation in probability.
You should be comfortable reading material at the level of e.g. this blog post, or e.g. this paper.
Lecture notes will be posted within a week or two of each lecture date.
- (01-11) Lecture 0: Introduction to SoS, robust mean estimation. [notes]
(01-18) NO CLASS: Martin Luther King Jr. Day.
- (01-25) Lecture 1: Tensor decomposition & learning latent variable models. [notes]
- (02-01) Lecture 2: Clustering and learning mixtures. [notes]
- (02-08) Lecture 3: Robust linear regression. [slides]
(02-15) NO CLASS: Presidents' Day.
- (02-22) Lecture 4: Speeding up Sum-of-Squares: Gradient Methods for Heavy-Tailed Mean Estimation. [notes]
- (03-01) Lecture 5: Speeding up Sum-of-Squares: Spectral Algorithms from SoS Analyses. [notes]
- (03-08) Lecture 6: TBD, student choice.
- (03-15) Lecture 7: TBD, student choice.
Potential works by topic
Tensor decomposition & latent variable models:
Tensor decomposition provides a powerful primitive for learning latent variable models, and sum-of-squares algorithms have revolutionized algorithmic guarantees for tensor decomposition.
These algorithms were some of the first to use the "proofs-to-algorithms" paradigm, meaning that simple proofs of identifiability immediately yield SoS algorithms.
Fast Algorithms: Sum-of-squares algorithms use large semidefinite programs as a subroutine, with running times which are at best super-quadratic in the input size. For this reason, realizing the guarantees of these algorithms in practice remains an exciting challenge. There are a number of approaches for taking an SoS algorithm and extracting a faster algorithm, and we will explore a number of these. Relevant resources include:
- Fast spectral algorithms from sum-of-squares proofs: tensor decomposition and planted sparse vectors by Hopkins, Schramm, Shi and Steurer.
- Fast and robust tensor decomposition with applications to dictionary learning by Schramm and Steurer.
- Fast Mean Estimation with Sub-Gaussian Rates by Cherapanamjeri, Flammarion, and Bartlett.
- Quantum Entropy Scoring for Fast Robust Mean Estimation and Improved Outlier Detection by Dong, Hopkins, and Li.
List Decodable Mean Estimation in Nearly Linear Time by Cherapanamjeri, Mohanty, and Yau, and List-Decodable Mean Estimation in Nearly-PCA Time by Diakonikolas, Kane, Kongsgaard, Li, and Tian.
Clustering, Robust Estimation, and Robust Regression: the SoS paradigm has yielded the first-ever polynomial time algorithms with provable guarantees for a number of algorithmic tasks in statistics; this has been especially true for problems where robustness plays an important role.
Here are several works on this topic, though this is a very active area of research and this list is by no means exhaustive.
- Mixture Models, Robustness, and Sum of Squares Proofs by Hopkins and Li,
Outlier-robust moment-estimation via sum-of-squares by Kothari and Steurer, and Better Agnostic Clustering Via Relaxed Tensor Norms by Kothari and Steinhardt. See also this series of blog posts by Sam Hopkins.
- Efficient Algorithms for Outlier-Robust Regression, by Klivans, Kothari, and Meka.
- Mean Estimation with Sub-Gaussian Rates in Polynomial Time, by Hopkins. The first polynomial-time algorithm for heavy-tailed mean estimation.
- Algorithms for Heavy-Tailed Statistics: Regression, Covariance Estimation, and Beyond, by Cherapanamjeri, Hopkins, Kathuria, Raghavendra, and Tripuraneni.
Lower Bounds: Since SoS algorithms are, in some sense, the most powerful polynomial-time algorithms known, proving lower bounds against sum-of-squares algorithms provides convincing evidence that solving certain algorithmic problems is beyond our reach.
Below are a few representative papers on the topic.
- A Nearly Tight Sum-of-Squares Lower Bound for the Planted Clique Problem by Barak, Hopkins, Kelner, Kothari, Moitra, and Potechin. See also the prior works of
Meka-Potechin-Wigderson, which are significantly weaker but also less technically involved.
- Sum of squares lower bounds for refuting any CSP by Kothari, Mori, O'Donnell, and Witmer.
- The power of sum-of-squares for detecting hidden structures by Hopkins, Kothari, Potechin, Raghavendra, Schramm, and Steurer.
- Online and Distribution-Free Robustness: Regression and Contextual Bandits with Huber Contamination by Chen, Koehler, Moitra, and Yau.
- Sparse PCA: Algorithms, Adversarial Perturbations and Certificates by d'Orsi, Kothari, Novikov, and Steurer.
- Mean-field approximation, convex hierarchies, and the optimality of correlation rounding: a unified perspective by Jain, Koehler, and Risteski.
If you feel I have left something important out of this list, please do not hesitate to email me.