Skip to Content
Math401Math 401, Summer 2025: Freiwald research project notesMath 401, Paper 1, Side note 3: Levy's concentration theorem

Math 401 Paper 1, Side note 3: Levy’s concentration theorem

Our goal is to prove the generalized version of Levy’s concentration theorem used in Hayden’s work for η\eta-Lipschitz functions.

Let f:Sn1Rf:S^{n-1}\to \mathbb{R} be a η\eta-Lipschitz function. Let MfM_f denote the median of ff and f\langle f\rangle denote the mean of ff. (Note this can be generalized to many other manifolds.)

Select a random point xSn1x\in S^{n-1} with n>2n>2 according to the uniform measure (Haar measure). Then the probability of observing a value of ff much different from the reference value is exponentially small.

Pr[f(x)Mf>ϵ]exp(nϵ22η2)\operatorname{Pr}[|f(x)-M_f|>\epsilon]\leq \exp(-\frac{n\epsilon^2}{2\eta^2}) Pr[f(x)f>ϵ]2exp((n1)ϵ22η2)\operatorname{Pr}[|f(x)-\langle f\rangle|>\epsilon]\leq 2\exp(-\frac{(n-1)\epsilon^2}{2\eta^2})

This version of Levy’s concentration theorem can be found in Geometry of Quantum states  15.84 and 15.85.

Basic definitions

Lipschitz function

η\eta-Lipschitz function

Let (X,distX)(X,\operatorname{dist}_X) and (Y,distY)(Y,\operatorname{dist}_Y) be two metric spaces. A function f:XYf:X\to Y is said to be η\eta-Lipschitz if there exists a constant LRL\in \mathbb{R} such that

distY(f(x),f(y))LdistX(x,y)\operatorname{dist}_Y(f(x),f(y))\leq L\operatorname{dist}_X(x,y)

for all x,yXx,y\in X. And η=fLip=infLRL\eta=\|f\|_{\operatorname{Lip}}=\inf_{L\in \mathbb{R}}L.

That basically means that the function ff should not change the distance between any two pairs of points in XX by more than a factor of LL.

Levy’s concentration theorem in High-dimensional probability by Roman Vershynin

Levy’s concentration theorem (Vershynin’s version)

This theorem is exactly the 5.1.4 on the High-dimensional probability by Roman Vershynin.

Isoperimetric inequality on Rn\mathbb{R}^n

Among all subsets ARnA\subset \mathbb{R}^n with a given volume, the Euclidean ball has the minimal area.

That is, for any ϵ>0\epsilon>0, Euclidean balls minimize the volume of the ϵ\epsilon-neighborhood of AA.

Where the volume of the ϵ\epsilon-neighborhood of AA is defined as

Aϵ(A){xRn:yA,xy2ϵ}=A+ϵB2nA_\epsilon(A)\coloneqq \{x\in \mathbb{R}^n: \exists y\in A, \|x-y\|_2\leq \epsilon\}=A+\epsilon B_2^n

Here the 2\|\cdot\|_2 is the Euclidean norm. (The theorem holds for both geodesic metric on sphere and Euclidean metric on Rn\mathbb{R}^n.)

Isoperimetric inequality on the sphere

Let σn(A)\sigma_n(A) denotes the normalized area of AA on nn dimensional sphere SnS^n. That is σn(A)Area(A)Area(Sn)\sigma_n(A)\coloneqq\frac{\operatorname{Area}(A)}{\operatorname{Area}(S^n)}.

Let ϵ>0\epsilon>0. Then for any subset ASnA\subset S^n, given the area σn(A)\sigma_n(A), the spherical caps minimize the volume of the ϵ\epsilon-neighborhood of AA.

The above two inequalities is not proved in the Book High-dimensional probability. But you can find it in the Appendix C of Gromov’s book Metric Structures for Riemannian and Non-Riemannian Spaces.

To continue prove the theorem, we use sub-Gaussian concentration (Chapter 3 of High-dimensional probability by Roman Vershynin) of sphere nSn\sqrt{n}S^n.

This will leads to some constant C>0C>0 such that the following lemma holds:

The “Blow-up” lemma

Let AA be a subset of sphere nSn\sqrt{n}S^n, and σ\sigma denotes the normalized area of AA. Then if σ12\sigma\geq \frac{1}{2}, then for every t0t\geq 0,

σ(At)12exp(ct2)\sigma(A_t)\geq 1-2\exp(-ct^2)

where At={xSn:dist(x,A)t}A_t=\{x\in S^n: \operatorname{dist}(x,A)\leq t\} and cc is some positive constant.

Proof of the Levy’s concentration theorem

Proof:

Without loss of generality, we can assume that η=1\eta=1. Let MM denotes the median of f(X)f(X).

So Pr[f(X)M]12\operatorname{Pr}[|f(X)\leq M|]\geq \frac{1}{2}, and Pr[f(X)M]12\operatorname{Pr}[|f(X)\geq M|]\geq \frac{1}{2}.

Consider the sub-level set A{xnSn:f(x)M}A\coloneqq \{x\in \sqrt{n}S^n: |f(x)|\leq M\}.

Since Pr[XA]12\operatorname{Pr}[X\in A]\geq \frac{1}{2}, by the blow-up lemma, we have

Pr[XAt]12exp(ct2)\operatorname{Pr}[X\in A_t]\geq 1-2\exp(-ct^2)

And since

Pr[XAt]Pr[f(X)M+t]\operatorname{Pr}[X\in A_t]\leq \operatorname{Pr}[f(X)\leq M+t]

Combining the above two inequalities, we have

Pr[f(X)M+t]12exp(ct2)\operatorname{Pr}[f(X)\leq M+t]\geq 1-2\exp(-ct^2)

Levy’s concentration theorem in Metric Structures for Riemannian and Non-Riemannian Spaces by M. Gromov

Levy’s concentration theorem (Gromov’s version)

The Levy’s lemma can also be found in Metric Structures for Riemannian and Non-Riemannian Spaces by M. Gromov. 312.193\frac{1}{2}.19 The Levy concentration theory.

Theorem 312.193\frac{1}{2}.19 Levy concentration theorem:

An arbitrary 1-Lipschitz function f:SnRf:S^n\to \mathbb{R} concentrates near a single value a0Ra_0\in \mathbb{R} as strongly as the distance function does.

That is

μ{xSn:f(x)a0ϵ}<κn(ϵ)2exp((n1)ϵ22)\mu\{x\in S^n: |f(x)-a_0|\geq\epsilon\} < \kappa_n(\epsilon)\leq 2\exp(-\frac{(n-1)\epsilon^2}{2})

where

κn(ϵ)=ϵπ2cosn1(t)dt0π2cosn1(t)dt\kappa_n(\epsilon)=\frac{\int_\epsilon^{\frac{\pi}{2}}\cos^{n-1}(t)dt}{\int_0^{\frac{\pi}{2}}\cos^{n-1}(t)dt}

a0a_0 is the Levy mean of function ff, that is the level set of f1:RSnf^{-1}:\mathbb{R}\to S^n divides the sphere into equal halves, characterized by the following equality:

μ(f1(,a0])12 and μ(f1[a0,))12\mu(f^{-1}(-\infty,a_0])\geq \frac{1}{2} \text{ and } \mu(f^{-1}[a_0,\infty))\geq \frac{1}{2}

Hardcore computing may generates the bound but M. Gromov did not make the detailed explanation here.

Detailed proof by Takashi Shioya.

The central idea is to draw the connection between the given three topological spaces, S2n+1S^{2n+1}, CPnCP^n and R\mathbb{R}.

First, we need to introduce the following distribution and lemmas/theorems:

OBSERVATION

consider the orthogonal projection from Rn+1\mathbb{R}^{n+1}, the space where SnS^n is embedded, to Rk\mathbb{R}^k, we denote the restriction of the projection as πn,k:Sn(n)Rk\pi_{n,k}:S^n(\sqrt{n})\to \mathbb{R}^k. Note that πn,k\pi_{n,k} is a 1-Lipschitz function (projection will never increase the distance between two points).

We denote the normalized Riemannian volume measure on Sn(n)S^n(\sqrt{n}) as σn()\sigma^n(\cdot), and σn(Sn(n))=1\sigma^n(S^n(\sqrt{n}))=1.

Definition of Gaussian measure on Rk\mathbb{R}^k

We denote the Gaussian measure on Rk\mathbb{R}^k as γk\gamma^k.

dγk(x)12πkexp(12x2)dxd\gamma^k(x)\coloneqq\frac{1}{\sqrt{2\pi}^k}\exp(-\frac{1}{2}\|x\|^2)dx

xRkx\in \mathbb{R}^k, x2=i=1kxi2\|x\|^2=\sum_{i=1}^k x_i^2 is the Euclidean norm, and dxdx is the Lebesgue measure on Rk\mathbb{R}^k.

Basically, you can consider the Gaussian measure as the normalized Lebesgue measure on Rk\mathbb{R}^k with standard deviation 11.

Maxwell-Boltzmann distribution law

It is such a wonderful fact for me, that the projection of n+1n+1 dimensional sphere with radius n\sqrt{n} to Rk\mathbb{R}^k is a Gaussian distribution as nn\to \infty.

For any natural number kk,

d(πn,k)σn(x)dxdγk(x)dx\frac{d(\pi_{n,k})_*\sigma^n(x)}{dx}\to \frac{d\gamma^k(x)}{dx}

where (πn,k)σn(\pi_{n,k})_*\sigma^n is the push-forward measure of σn\sigma^n by πn,k\pi_{n,k}.

In other words,

(πn,k)σnγk weakly as n(\pi_{n,k})_*\sigma^n\to \gamma^k\text{ weakly as }n\to \infty

Proof

We denote the nn dimensional volume measure on Rk\mathbb{R}^k as volk\operatorname{vol}_k.

Observe that πn,k1(x),xRk\pi_{n,k}^{-1}(x),x\in \mathbb{R}^k is isometric to Snk(nx2)S^{n-k}(\sqrt{n-\|x\|^2}), that is, for any xRkx\in \mathbb{R}^k, πn,k1(x)\pi_{n,k}^{-1}(x) is a sphere with radius nx2\sqrt{n-\|x\|^2} (by the definition of πn,k\pi_{n,k}).

So,

d(πn,k)σn(x)dx=volnk(πn,k1(x))volk(Sn(n))=(nx2)nk2xn(nx2)nk2dx\begin{aligned} \frac{d(\pi_{n,k})_*\sigma^n(x)}{dx}&=\frac{\operatorname{vol}_{n-k}(\pi_{n,k}^{-1}(x))}{\operatorname{vol}_k(S^n(\sqrt{n}))}\\ &=\frac{(n-\|x\|^2)^{\frac{n-k}{2}}}{\int_{\|x\|\leq \sqrt{n}}(n-\|x\|^2)^{\frac{n-k}{2}}dx}\\ \end{aligned}

as nn\to \infty.

note that limn(1an)n=ea\lim_{n\to \infty}{(1-\frac{a}{n})^n}=e^{-a} for any a>0a>0.

(nx2)nk2=(n(1x2n))nk2nnk2exp(x22)(n-\|x\|^2)^{\frac{n-k}{2}}=\left(n(1-\frac{\|x\|^2}{n})\right)^{\frac{n-k}{2}}\to n^{\frac{n-k}{2}}\exp(-\frac{\|x\|^2}{2})

So

(nx2)nk2xn(nx2)nk2dx=ex22xRkex22dx=1(2π)k2ex22=dγk(x)dx\begin{aligned} \frac{(n-\|x\|^2)^{\frac{n-k}{2}}}{\int_{\|x\|\leq \sqrt{n}}(n-\|x\|^2)^{\frac{n-k}{2}}dx}&=\frac{e^{-\frac{\|x\|^2}{2}}}{\int_{x\in \mathbb{R}^k}e^{-\frac{\|x\|^2}{2}}dx}\\ &=\frac{1}{(2\pi)^{\frac{k}{2}}}e^{-\frac{\|x\|^2}{2}}\\ &=\frac{d\gamma^k(x)}{dx} \end{aligned}

Proof of the Levy’s concentration theorem via the Maxwell-Boltzmann distribution law

We use the Maxwell-Boltzmann distribution law and Levy’s isoperimetric inequality to prove the Levy’s concentration theorem.

The goal is the same as the Gromov’s version, first we bound the probability of the sub-level set of ff by the κn(ϵ)\kappa_n(\epsilon) function by Levy’s isoperimetric inequality. Then we claim that the κn(ϵ)\kappa_n(\epsilon) function is bounded by the Gaussian distribution.

Note, this section is not rigorous enough in sense of mathematics and the author should add sections about Levy family and observable diameter to make the proof more rigorous and understandable.

Proof

Let f:SnRf:S^n\to \mathbb{R} be a 1-Lipschitz function.

Consider the two sets of points on the sphere SnS^n with radius n\sqrt{n}:

Ω+={xSn:f(x)a0ϵ},Ω={xSn:f(x)a0+ϵ}\Omega_+=\{x\in S^n: f(x)\leq a_0-\epsilon\}, \Omega_-=\{x\in S^n: f(x)\geq a_0+\epsilon\}

Note that Ω+Ω\Omega_+\cup \Omega_- is the whole sphere Sn(n)S^n(\sqrt{n}).

By the Levy’s isoperimetric inequality, we have

volnk(πn,k1(ϵ))volnk(πn,k1(Ω+))+volnk(πn,k1(Ω))\operatorname{vol}_{n-k}(\pi_{n,k}^{-1}(\epsilon))\leq \operatorname{vol}_{n-k}(\pi_{n,k}^{-1}(\Omega_+))+\operatorname{vol}_{n-k}(\pi_{n,k}^{-1}(\Omega_-))

We define κn(ϵ)\kappa_n(\epsilon) as the following:

κn(ϵ)=volnk(πn,k1(ϵ))volk(Sn(n))=ϵπ2cosn1(t)dt0π2cosn1(t)dt\kappa_n(\epsilon)=\frac{\operatorname{vol}_{n-k}(\pi_{n,k}^{-1}(\epsilon))}{\operatorname{vol}_k(S^n(\sqrt{n}))}=\frac{\int_\epsilon^{\frac{\pi}{2}}\cos^{n-1}(t)dt}{\int_0^{\frac{\pi}{2}}\cos^{n-1}(t)dt}

By the Levy’s isoperimetric inequality, and the Maxwell-Boltzmann distribution law, we have

μ{xSn:f(x)a0ϵ}<κn(ϵ)2exp((n1)ϵ22)\mu\{x\in S^n: |f(x)-a_0|\geq\epsilon\} < \kappa_n(\epsilon)\leq 2\exp(-\frac{(n-1)\epsilon^2}{2})

Levy’s Isoperimetric inequality

This section is from the Appendix C+C_+ of Gromov’s book Metric Structures for Riemannian and Non-Riemannian Spaces.

Not very edible for undergraduates.

Crash course on Riemannian manifolds

This part might be extended to a separate note, let’s check how far we can go from this part.

References:

Riemannian manifolds

A Riemannian manifold is a smooth manifold equipped with a Riemannian metric, which is a smooth assignment of an inner product to each tangent space TpMT_pM of the manifold.

An example of Riemannian manifold is the sphere CPn\mathbb{C}P^n.

Riemannian metric

A Riemannian metric is a smooth assignment of an inner product to each tangent space TpMT_pM of the manifold.

An example of Riemannian metric is the Euclidean metric on Rn\mathbb{R}^n.

Notion of Connection

A connection is a way to define the directional derivative of a vector field along a curve on a Riemannian manifold.

For every pMp\in M, where MM denote the manifold, suppose M=RnM=\mathbb{R}^n, then let X=(f1,,fn)X=(f_1,\cdots,f_n) be a vector field on MM. The directional derivative of XX along the point pp is defined as

DVX=limh0X(p+h)X(p)hD_VX=\lim_{h\to 0}\frac{X(p+h)-X(p)}{h}

Nabla notation and Levi-Civita connection

Ricci curvature

References

Last updated on