Random projections & Johnson

§1 The Johnson–Lindenstrauss lemma
A random linear projection from ℝᵈ down to ℝᵏ (with k ~ log N / ε²) preserves all pairwise distances among N points to within (1 ± ε), regardless of d. The single most useful "I can shrink this and not lose anything" theorem in ML.
§2 Gaussian, ±1, and sparse projections
The Gaussian random projection from §7.1 is the elegant version. In production, you use ±1 or sparse projections instead — same distortion bound, no multiplies, integer-friendly. Achlioptas (2003) and Li (2006) proved these discrete forms work as well as the Gaussian.
§3 Fast JL via Hadamard (and the loop to TurboQuant)
Dense JL is O(dk) per vector. Ailon-Chazelle 2009 dropped it to O(d log d) by replacing the dense projection with a randomly-signed Hadamard transform plus a sparse subsample. The Hadamard transform — applied in O(n log n) with no multiplies — is the same structured rotation TurboQuant (Ch.25) uses to equalize variance.

Random projections & Johnson–Lindenstrauss