The Sum-of-Squares Algorithmic Paradigm in Statistics

(STATS 314a: "Advanced Statistical Theory", Spring 2022)

The best way to contact me is by email. Please be sure to include "STATS 314" in the subject line.

Prerequisites: The most important prerequisite is mathematical maturity. I expect a solid background in probability (the material in STATS 116 is technically sufficient, but ideally students will have built upon this knowledge in other courses). Coursework in algorithms (such as CS161) or machine learning (such as STATS/CS 229 or STATS 214) is helpful as well---I will not assume this knowledge, but I will cover the necessary background at a fast pace.


This course will introduce and explore Sum-of-Squares (SoS) algorithms in the context of statistics. In recent years, the powerful SoS "proofs-to-algorithms" paradigm has led to numerous breakthroughs in algorithmic statistics, yielding polynomial-time algorithms with provable guarantees for previously unreachable problems including learning latent variable models (via tensor decomposition), clustering, numerous problems in robust statistics, and more.

The course will begin with introductory lectures designed to bring the uninitiated up to speed: we'll introduce the SoS "proofs-to-algorithms" paradigm, as well as the underlying tools from convex optimization (semidefinite programming). We will then cover a number of applications, including clustering, robust mean estimation, robust regression, mean-field approximations of Ising models, and tensor decompositions for learning latent variable models. Beyond these applications, an important question is how to make these algorithms not only polynomial-time but practical, and we will see the current best efforts on this front. Finally, we'll also explore the ways in which SoS has informed the study of information-computation tradeoffs, i.e., the interaction of computational and statistical resources.
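To give a concrete taste of the "proofs-to-algorithms" idea before the course begins: a certificate that a polynomial is nonnegative can be phrased as a positive semidefinite Gram matrix, which is exactly the kind of object a semidefinite program searches for. Below is a minimal numpy sketch for an illustrative polynomial (the polynomial and Gram matrix are my own toy choices, not taken from the lectures):

```python
import numpy as np

# p(x) = x^2 + 2x + 2, written as v(x)^T Q v(x) with v(x) = (1, x)
# and a Gram matrix Q.
Q = np.array([[2.0, 1.0],
              [1.0, 1.0]])

# Q being positive semidefinite certifies p(x) >= 0 for all x.
assert np.all(np.linalg.eigvalsh(Q) >= 0)

# A Cholesky factorization Q = L L^T makes the certificate an
# explicit sum of squares: p(x) = sum_j (L^T v(x))_j^2.
L = np.linalg.cholesky(Q)

def p(x):
    return x**2 + 2 * x + 2

def p_as_sos(x):
    v = np.array([1.0, x])
    return np.sum((L.T @ v) ** 2)

# Sanity check: the sum-of-squares form agrees with p everywhere.
for x in np.linspace(-3, 3, 7):
    assert np.isclose(p(x), p_as_sos(x))
print("SoS certificate verified")
```

Here the matrix Q is written down by hand; in the course we will see how semidefinite programming finds such certificates automatically, and how "pseudo-distribution" duals of these programs yield algorithms.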


We will cover the following topics, roughly partitioned into units as follows:

  1. Introduction to the sum-of-squares paradigm, semidefinite programming, and information-computation tradeoffs, via illustrative examples.
  2. Clustering. Clustering mixtures of Gaussians and other distributions with SoS certificates of hypercontractivity.
  3. Global correlation rounding & Ising models. A general technique for rounding sum-of-squares relaxations, with application to mean-field approximations in Ising models.
  4. Robust statistics. Brief reprise of robust mean estimation. Robust linear regression, and list-decodable linear regression. Heavy-tailed mean estimation.
  5. Tensor decomposition. Tensor decomposition is a useful primitive in learning latent variable models, such as dictionary learning and topic models. SoS algorithms provide the best guarantees known for tensor decomposition in a number of settings.
  6. Practical implementations of SoS. SoS provides strong guarantees in polynomial time, but because these algorithms rely on solving large semidefinite programs, running them "off the shelf" is impractical. We'll see some cases in which researchers have developed practical versions of SoS algorithms.
  7. Lower Bounds. For many problems in algorithmic statistics, no polynomial-time algorithms are known, even when the problems are known to be information-theoretically solvable. Lower bounds against sum-of-squares algorithms give evidence that these problems may be computationally hard. The study of SoS lower bounds has informed the study of information-computation tradeoffs, leading to the "low-degree conjecture;" we will survey these developments.
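As a one-dimensional warm-up for the robust statistics unit: the empirical mean can be moved arbitrarily far by a small fraction of corrupted samples, while the median cannot. The snippet below (a toy demo with made-up contamination values, not an SoS algorithm) illustrates the failure mode that the robust estimators in unit 4 are designed to avoid in high dimensions:

```python
import numpy as np

rng = np.random.default_rng(1)
n, eps = 1000, 0.1          # sample size and fraction of corrupted points

# Clean samples from N(0, 1); the true mean is 0.
x = rng.standard_normal(n)

# An adversary replaces an eps-fraction of the sample with a large value.
x[: int(eps * n)] = 1000.0

print(abs(np.mean(x)))      # the empirical mean is dragged far from 0
print(abs(np.median(x)))    # the median stays close to the true mean
```

In one dimension the median suffices, but naive coordinate-wise medians lose dimension-dependent factors in high dimensions; closing that gap efficiently is where the SoS machinery enters.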

Below is a preliminary schedule. It will be updated as we progress through the course.

Date Topic Notes
3/28 Introduction: robust mean estimation. [notes] online
3/30 Semidefinite programming and the SoS algorithm [notes] online
4/04 Matrix and tensor completion Section 2.2 of this survey covers the lecture content. Handwritten notes available on Canvas, until typed lecture notes are posted.
4/06 Community detection in block models & information-computation tradeoffs [notes] Emmanuel Abbe's survey is a great reference, with a thorough (though five-year-old) bibliography.
4/11 Clustering mixtures of Gaussians [notes]
- Homework 0 available on Canvas, due Monday April 18 at 5:00pm Pacific.
4/13 Clustering 2, certifiable hypercontractivity (same notes as previous lecture)
4/18 Global correlation rounding [notes] Guest lecture by Frederic Koehler
4/20 Mean-field approximations in Ising Models Guest lecture by Frederic Koehler, see this paper
4/25 Robust linear regression [notes]
4/27 List-decodable linear regression The lecture is based on the paper of Karmalkar, Klivans, and Kothari. The concurrent paper of Raghavendra and Yau obtains similar results.
5/02 List-decodable regression cont. See the survey article of Monique Laurent for details on how to certify the non-negativity of univariate polynomials over an interval (section 3.6).
5/04 Tensor decomposition for learning latent variable models (quasi-polynomial time dictionary learning) See this paper on quasi-polynomial time SoS algorithms for dictionary learning via tensor decomposition, and a representative paper (one of several) on using tensor decompositions to learn latent variable models.
5/09 Efficient SoS algorithms for decomposing random overcomplete tensors Until the new lecture notes are up, see these notes from last year's STATS 319. The relevant references are this paper by Ge and Ma and this paper by Ma, Shi, and Steurer. Also see Tropp's matrix concentration survey.
5/11 Making SoS practical 1: spectral algorithms for overcomplete tensor decomposition The lecture was based on this recent paper of Ding, d'Orsi, Liu, Tiegel, and Steurer, which builds on this paper and this paper.
5/16 Making SoS practical 2: fast SDP solvers See this paper about solving constraint satisfaction problems and this paper about robust mean estimation as two examples.
5/18 SoS & lower bounds 1: 3-XOR Lecture notes from the course by Barak and Steurer.
5/23 SoS & lower bounds 2: planted clique and pseudocalibration Notes on the lower bound for planted clique from the course by Barak and Steurer. See also this survey (section 3) on pseudocalibration, and this survey and Sam Hopkins' thesis for a conjecture about the computational complexity of hypothesis testing.
5/25 Student Presentations
5/30 NO CLASS: Memorial Day
6/01 Student Presentations
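For students who want a hands-on preview of the tensor decomposition lectures (5/04 and 5/09): the classical precursor to the SoS algorithms is Jennrich's simultaneous-diagonalization algorithm, which recovers the components of a third-order tensor when they are linearly independent. A minimal numpy sketch (my own illustrative implementation, not code from the course; this is not the SoS algorithm, which handles the harder overcomplete regime):

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4

# Ground-truth components: the columns of A, linearly independent w.h.p.
A = rng.standard_normal((d, d))

# The third-order tensor T = sum_i a_i (x) a_i (x) a_i.
T = np.einsum("ji,ki,li->jkl", A, A, A)

# Jennrich's algorithm: contract T along the third mode with two random
# vectors g and h, giving M_g = A diag(A^T g) A^T and likewise M_h.
g, h = rng.standard_normal(d), rng.standard_normal(d)
M_g = np.einsum("jkl,l->jk", T, g)
M_h = np.einsum("jkl,l->jk", T, h)

# M_g M_h^{-1} = A diag((a_i . g)/(a_i . h)) A^{-1}, so its eigenvectors
# are the components a_i, up to scaling.
vals, vecs = np.linalg.eig(M_g @ np.linalg.inv(M_h))
vecs = np.real(vecs)

# Each true component should be parallel to some recovered eigenvector.
cosines = np.abs(
    (A / np.linalg.norm(A, axis=0)).T @ (vecs / np.linalg.norm(vecs, axis=0))
)
print(cosines.max(axis=1))
```

The random contractions give distinct eigenvalue ratios with probability 1, which is what makes the eigendecomposition identify the components; the SoS algorithms covered in lecture extend recovery to overcomplete and noisy settings where this argument breaks down.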
Materials & Resources

Additional recommended readings and videos will be posted on a per-lecture basis, in the "notes" field of the schedule above.