Threshold phenomena for quantum marginals
Stanislaw Szarek
Case Western Reserve/Paris 6
Cambridge, October 15, 2013
Collaborators: G. Aubrun, D. Ye
Comm. Pure Appl. Math. (2014), arXiv:1106.2264v3
Phys. Rev. A. 85(R) (2012), arXiv:1112.4582v2
http://www.cwru.edu/artsci/math/szarek/
Abstract
Consider a quantum system consisting of N identical particles, and assume that it is in a random pure state (i.e., uniformly distributed over the sphere of the corresponding Hilbert space). Consider two subsystems A and B consisting of k particles each. Are A and B likely to share entanglement? Is the AB-marginal typically PPT?
For many natural properties there is a sharp “phase transition.” E.g., there is a threshold K ∼ N/5 such that
- if k > K, then A and B typically share entanglement
- if k < K, then A and B typically do not share entanglement.
The first statement was (essentially) shown in the talk by G. Aubrun. Here we present a general scheme for handling such questions and sketch the analysis specific to entanglement.
Talk summary
• setup, notation; random quantum states and ensembles of random matrices
• emergence of entanglement (the main result)
• a sketch of the proof using the tools of geometric functional analysis and random matrix theory
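Sets of states
The set of (mixed) states on $\mathcal{H}$ is denoted by $\mathcal{D} = \mathcal{D}(\mathcal{H})$.
Separable (unentangled) states on $\mathcal{H} = \mathcal{H}_1 \otimes \mathcal{H}_2$: $\mathcal{S} = \mathcal{S}(\mathcal{H})$.
When we talk about $\mathcal{S}(\mathbb{C}^n)$, we implicitly assume $n = d^2$ and $\mathbb{C}^n \sim \mathbb{C}^d \otimes \mathbb{C}^d$ (a fixed bipartition).
The sets $\mathcal{D}(\mathbb{C}^n)$ and $\mathcal{S}(\mathbb{C}^n)$ are convex bodies in the hyperplane $H_1 := \{\mathrm{tr}(\cdot) = 1\} \subset M_n^{sa}$, with $\frac{1}{n} I$ in the interior. $\frac{1}{n} I$ is the only point invariant under the isometries of either set.
Our approach will (in principle) work for any property in place of $d \times d$ separability, provided it has some minimal permanence properties.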
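Setup, notation; partial trace
The state of the entire system is described by
$$|\psi\rangle \in \mathcal{H} = A \otimes B \otimes E = (\mathbb{C}^D)^{\otimes k} \otimes (\mathbb{C}^D)^{\otimes k} \otimes (\mathbb{C}^D)^{\otimes (N-2k)} = \mathbb{C}^d \otimes \mathbb{C}^d \otimes \mathbb{C}^s$$
with $d = D^k$, $s = D^{N-2k}$.
If $N = 5k$, then $s = D^{N-2k} = D^{3k} = d^3$.
If the entire system is in the pure state $|\psi\rangle \in \mathcal{H}$, or $|\psi\rangle\langle\psi|$ in the density matrix formalism, then the AB-marginal is given by the partial trace $\mathrm{tr}_E |\psi\rangle\langle\psi| = \mathrm{tr}_{\mathbb{C}^s} |\psi\rangle\langle\psi|$.
If $|\psi\rangle \in \mathbb{C}^d \otimes \mathbb{C}^d \otimes \mathbb{C}^s \sim \mathbb{C}^n \otimes \mathbb{C}^s$ is identified with a matrix $A \in M_{n \times s}$, then the marginal is $\rho = AA^\dagger$.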
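The identification of the marginal with $AA^\dagger$ can be checked numerically on a small instance. The following is a minimal sketch using only the Python standard library; the function names, the index convention (system index first, environment index second) and the sizes n = 4, s = 8 are illustrative assumptions, not part of the talk.

```python
import random

def random_unit_vector(dim, rng):
    """Random unit vector in C^dim (normalized complex Gaussian vector)."""
    v = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(dim)]
    norm = sum(abs(z) ** 2 for z in v) ** 0.5
    return [z / norm for z in v]

def partial_trace_env(psi, n, s):
    """Trace out the C^s factor of |psi><psi|; psi is indexed as i*s + e."""
    rho = [[0j] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            rho[i][j] = sum(psi[i * s + e] * psi[j * s + e].conjugate()
                            for e in range(s))
    return rho

def gram(psi, n, s):
    """Reshape psi into an n x s matrix A and return A A^dagger."""
    A = [psi[i * s:(i + 1) * s] for i in range(n)]
    return [[sum(a * b.conjugate() for a, b in zip(A[i], A[j]))
             for j in range(n)] for i in range(n)]

rng = random.Random(0)
n, s = 4, 8            # n = d^2 with d = 2; s is the environment dimension
psi = random_unit_vector(n * s, rng)
rho_pt = partial_trace_env(psi, n, s)
rho_gram = gram(psi, n, s)
max_diff = max(abs(rho_pt[i][j] - rho_gram[i][j])
               for i in range(n) for j in range(n))
trace = sum(rho_pt[i][i] for i in range(n)).real
```

Both routes produce the same n × n matrix with unit trace; the reshaping into A is just a relabeling of the coordinates of $|\psi\rangle$.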
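Random marginals and “standard” ensembles of random matrices
• If $|\psi\rangle$ is random on $\mathbb{C}^n \otimes \mathbb{C}^s$, then $\mathrm{tr}_{\mathbb{C}^s} |\psi\rangle\langle\psi|$ is a random state on $\mathbb{C}^n$.
• If $|\psi\rangle$ is distributed uniformly on the sphere of $\mathbb{C}^n \otimes \mathbb{C}^s$, then $A$ is uniform on the Hilbert-Schmidt/Frobenius sphere in $M_{n \times s}$.
Either way, $\rho_{n,s} := \mathrm{tr}_{\mathbb{C}^s} |\psi\rangle\langle\psi| \sim AA^\dagger$ is a random state on $\mathbb{C}^n$ and $s$ is a parameter.
Another way to represent such random states on $\mathbb{C}^n$ is to consider
$$\rho_{n,s} = \frac{BB^\dagger}{\mathrm{tr}\, BB^\dagger},$$
where $B$ is the standard $n \times s$ complex Gaussian matrix; so $\rho_{n,s}$ is a normalized complex Wishart matrix.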
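The Wishart representation gives a direct sampling recipe. Below is a minimal standard-library sketch; the helper names and the choice n = 4, s = 64 are assumptions made for illustration, and the convention that each entry has real and imaginary parts of variance 1/2 is one common normalization (the overall scale cancels after dividing by the trace).

```python
import random

def complex_gaussian_matrix(n, s, rng):
    """n x s matrix with i.i.d. standard complex Gaussian entries."""
    scale = 0.5 ** 0.5  # real and imaginary parts each have variance 1/2
    return [[complex(rng.gauss(0, scale), rng.gauss(0, scale))
             for _ in range(s)] for _ in range(n)]

def normalized_wishart(n, s, rng):
    """Sample rho = B B^dagger / tr(B B^dagger) for Gaussian B."""
    B = complex_gaussian_matrix(n, s, rng)
    W = [[sum(B[i][e] * B[j][e].conjugate() for e in range(s))
          for j in range(n)] for i in range(n)]
    tr = sum(W[i][i].real for i in range(n))
    return [[W[i][j] / tr for j in range(n)] for i in range(n)]

rng = random.Random(1)
rho = normalized_wishart(4, 64, rng)
trace = sum(rho[i][i].real for i in range(4))
hermitian_defect = max(abs(rho[i][j] - rho[j][i].conjugate())
                       for i in range(4) for j in range(4))
```

The output is Hermitian with unit trace by construction; as s grows with n fixed, the sample should concentrate near I/n.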
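Entanglement emergence for $\rho_{n,s}$, $\mathbb{C}^n \sim \mathbb{C}^d \otimes \mathbb{C}^d$
Theorem. There is a sharp “entanglement emergence” threshold $s_{\mathrm{ent}} = s_{\mathrm{ent}}(d)$ for $\rho_{n,s}$ verifying $c d^3 \le s_{\mathrm{ent}} \le C d^3 \log^2 d$, where $C, c > 0$ are universal constants.
More precisely, if $\varepsilon > 0$, then
If $s \le (1-\varepsilon)\, s_{\mathrm{ent}}(d)$, then $\mathbf{P}(\rho_{n,s} \text{ is separable}) \le 2\exp(-c(\varepsilon)\, s_{\mathrm{ent}})$
If $s \ge (1+\varepsilon)\, s_{\mathrm{ent}}(d)$, then $\mathbf{P}(\rho_{n,s} \text{ is entangled}) \le 2\exp(-c(\varepsilon)\, s)$
Corollary. The statement in the abstract.
Earlier results in various directions/cases were in Kendon–Zyczkowski–Munro 2002, Hayden–Leung–Winter 2006, Aubrun-S. 2006, Ye 2010.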
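A heuristic way to read off the corollary is to compare $s$ with $d^3$ in the multipartite setup, with $d = D^k$ and $s = D^{N-2k}$. The sketch below treats $d^3$ as a stand-in for $s_{\mathrm{ent}}(d)$, which the theorem only locates up to a $\log^2 d$ factor; for fixed $D$ that factor can move the crossover in $k$ only by $O(\log N)$.

```latex
\[
\frac{s}{d^{3}} = \frac{D^{N-2k}}{D^{3k}} = D^{\,N-5k},
\]
\[
k < \frac{N}{5} \;\Longrightarrow\; s > d^{3}
\quad(\text{separable regime}),
\qquad
k > \frac{N}{5} \;\Longrightarrow\; s < d^{3}
\quad(\text{entangled regime}).
\]
```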
Why the sharp threshold?
Replacing $k$ by $k - 1$ results in replacing
$d = D^k$ with $d_1 = D^{k-1} = \frac{d}{D} \le \frac{d}{2}$ and
$s = D^{N-2k}$ with $s_1 = D^{N-2(k-1)} = D^2 s \ge 4s$,
and it takes only an increase in $s$ by a factor $\frac{1+\varepsilon}{1-\varepsilon}$ to switch from “generic entanglement” to “generic separability.”
Remark. $\pi_{d,s} := \mathbf{P}(\rho_{d^2,s} \text{ is separable})$ is a decreasing function of $d$.
Problem. Is $\pi_{d,s}$ an increasing function of $s$?
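To see how large a single step in $k$ is compared with the width of the transition window, one can track the ratio $s/d^3$, again using $d^3$ as a proxy for $s_{\mathrm{ent}}(d)$ (an assumption; the true threshold may carry a logarithmic factor).

```latex
\[
\frac{s_1}{d_1^{3}} = \frac{D^{2}s}{(d/D)^{3}} = D^{5}\,\frac{s}{d^{3}} \;\ge\; 32\,\frac{s}{d^{3}},
\]
```

so one step in $k$ changes the ratio by a factor of at least 32, far more than the factor $\frac{1+\varepsilon}{1-\varepsilon}$ needed to cross the window.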
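Rephrasing the entanglement property
$$\frac{I}{n} \in \mathcal{S} \subset \mathcal{D} \subset H_1 := \{\mathrm{tr}(\cdot) = 1\} \subset M_n^{sa}$$
Translate everything to the origin:
$$0 \in \mathcal{S}_0 := \mathcal{S} - \frac{I}{n} \subset \mathcal{D}_0 \ldots \subset H_0 := \{\mathrm{tr}(\cdot) = 0\} \subset M_n^{sa}$$
Then
$$\rho \text{ is unentangled} \iff \rho - \frac{I}{n} \in \mathcal{S}_0 \iff \left\|\rho - \frac{I}{n}\right\|_{\mathcal{S}_0} \le 1,$$
where $\|x\|_K := \min\{t \ge 0 : x \in tK\}$ is the gauge of a convex set $K$ (0 ∈ the interior of $K$), and of course
$$\rho \text{ is entangled} \iff \left\|\rho - \frac{I}{n}\right\|_{\mathcal{S}_0} > 1.$$
Need to show that, for appropriate values of $n$, $s$ and for $\rho = \rho_{n,s}$, the above occur with probability close to 1 or 0.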
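The gauge can be computed by bisection whenever membership in $tK$ can be tested. No simple membership test is available for $\mathcal{S}_0$, so the sketch below uses a toy substitute: the diagonal slice of $\mathcal{D}_0$, where a trace-zero diagonal matrix $\mathrm{diag}(x)$ lies in $t\mathcal{D}_0$ exactly when $x_i/t + 1/n \ge 0$ for all $i$. The function names and the test vector are illustrative assumptions.

```python
def in_scaled_D0_diag(x, t):
    """Is diag(x) in t*D_0?  Assumes sum(x) == 0; checks x/t + I/n >= 0."""
    n = len(x)
    if t == 0:
        return all(abs(v) < 1e-15 for v in x)
    return all(v / t + 1.0 / n >= 0 for v in x)

def gauge_bisect(x, member, hi=1.0, tol=1e-12):
    """min{t >= 0 : x in tK} for convex K containing 0 in its interior."""
    while not member(x, hi):   # grow until x lies in hi*K
        hi *= 2
    lo = 0.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if member(x, mid):
            hi = mid
        else:
            lo = mid
    return hi

x = [0.3, -0.1, -0.15, -0.05]                 # trace-zero diagonal, n = 4
g = gauge_bisect(x, in_scaled_D0_diag)
closed_form = len(x) * max(0.0, -min(x))      # n * max(0, -smallest entry)
```

Here diag(x) + I/4 has entries (0.55, 0.15, 0.1, 0.2), a valid state, and accordingly the gauge is at most 1; on this slice the bisection agrees with the closed form $n \cdot \max(0, -\min_i x_i)$.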
The strategy: concentration
Step 1: Show that, for the appropriate values of $n$, $s$ and $\varepsilon > 0$,
$$\mathbf{E}\left\|\rho_{n,s} - \frac{I}{n}\right\|_{\mathcal{S}_0} \le 1 - \varepsilon \quad\text{or}\quad \mathbf{E}\left\|\rho_{n,s} - \frac{I}{n}\right\|_{\mathcal{S}_0} \ge 1 + \varepsilon,$$
as needed.
Step 2: Show that $A \mapsto f(A) := \left\|AA^\dagger - \frac{I}{n}\right\|_{\mathcal{S}_0}$ is smooth enough, and so it concentrates around its median (and its mean).
Recall that $A$ varies over the Frobenius sphere in $M_{n \times s}$; the relevant metric will also be given by the Frobenius norm $\|\cdot\|_2$.
Step 2 is nontrivial, but relatively routine.
Step 1 is harder and requires a few new tricks.
Step 2: smoothness and concentration
$A \mapsto f(A) = \left\|AA^\dagger - \frac{I}{n}\right\|_{\mathcal{S}_0}$ is a composition of two operations
• $A \mapsto AA^\dagger - \frac{I}{n}$, $\|\cdot\|_2 \to \|\cdot\|_2$
• $X \mapsto \|X\|_{\mathcal{S}_0}$, with $\|\cdot\|_2$ in the domain
The latter is a Lipschitz function with constant $r^{-1}$, where $r$ is the (Euclidean/Frobenius) inradius of $\mathcal{S}_0$, which is known to be the same as the inradius of $\mathcal{D}$ (Gurvits-Barnum 2002), which is $1/\sqrt{n(n-1)} \sim 1/n$.
So the Lipschitz constant is $\sim n$, in fact $< n$.
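The step from the inradius to the Lipschitz constant is a standard fact about gauges; a short derivation, with $K = \mathcal{S}_0$ and $B_2$ denoting the Hilbert-Schmidt unit ball of the trace-zero hyperplane (notation introduced here for illustration), may be sketched as follows.

```latex
\[
\|X\|_{K} \le \|Y\|_{K} + \|X-Y\|_{K},
\qquad
\|Z\|_{K} \le r^{-1}\|Z\|_{2}
\quad(\text{since } rB_2 \subset K),
\]
\[
\bigl|\,\|X\|_{K}-\|Y\|_{K}\,\bigr|
\le \max\bigl\{\|X-Y\|_{K},\ \|Y-X\|_{K}\bigr\}
\le r^{-1}\|X-Y\|_{2},
\qquad
r^{-1} = \sqrt{n(n-1)} < n .
\]
```

The maximum over both orders is needed because $K$ is not centrally symmetric, so the gauge is not a norm.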
Smoothness of $A \mapsto AA^\dagger - \frac{I}{n}$, $\|\cdot\|_2 \to \|\cdot\|_2$
This operation is not Lipschitz, but its restriction to the Hilbert-Schmidt sphere is 2-Lipschitz, so $f$ is $2n$-Lipschitz.
We can now appeal to Lévy’s lemma:
Let $f : S^{m-1} \to \mathbb{R}$ be an $L$-Lipschitz function and let $\varepsilon > 0$. Then
$$\mathbf{P}(|f - M_f| > \varepsilon) \le C \exp\left(-\frac{m}{2L^2}\,\varepsilon^2\right),$$
where $M_f$ is the median (or mean) of $f$, $\mathbf{P}$ is the normalized uniform measure on the sphere and $C > 0$ is a universal constant.
Verification of the exponent:
$$m = 2ns,\quad L \sim 2n,\quad\text{so}\quad \frac{m}{2L^2} \sim \frac{2ns}{8n^2} = \frac{s}{4n}.$$
Not quite the $\Omega(s)$ that we wanted, but since ultimately we are interested in the range $s = \Omega(d^3) = \Omega(n^{3/2})$, this yields some concentration for $\varepsilon = o(1)$ and so is “marginally sufficient” (even if not optimal).
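The exponent bookkeeping above can be reproduced with a few lines of arithmetic. This is a minimal sketch; the function name and the sample value d = 10 are assumptions chosen for illustration.

```python
def levy_exponent(m, L):
    """Coefficient of eps^2 in the exponent of Levy's lemma: m / (2 L^2)."""
    return m / (2.0 * L ** 2)

d = 10
n, s = d ** 2, d ** 3          # the regime s ~ d^3 = n^(3/2)
m = 2 * n * s                  # real dimension of the space of n x s complex matrices
global_rate = levy_exponent(m, 2 * n)   # L = 2n gives s / (4n) = d / 4
```

With $s = d^3$ and $n = d^2$ the rate is $d/4$, which grows with $d$ but is much smaller than $s$.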
A better argument
Let $T := \left\{A : \|A\|_\infty = O\left(\frac{1}{\sqrt{n}}\right)\right\}$. Then
• $\mathbf{P}(T) \ge 1 - e^{-cs}$
• $A \mapsto AA^\dagger$ is “locally Lipschitz on $T$” with constant $O\left(\frac{1}{\sqrt{n}}\right)$.
The bottom line is that the local Lipschitz constant of $f$ on $T$ is $O\left(\frac{1}{\sqrt{n}} \times n\right) = O(\sqrt{n})$.
This follows from $AA^\dagger$ having a rescaled Wishart distribution, whose eigenvalue distribution approximates (for large $n$, $s$) the rescaled Marchenko-Pastur distribution. In particular, the singular values of $A$ are typically in the interval $\frac{1}{\sqrt{n}} \times \left[1 - \sqrt{\frac{n}{s}},\ 1 + \sqrt{\frac{n}{s}}\right]$, hence $\|A\|_\infty = O\left(\frac{1}{\sqrt{n}}\right)$. The probability estimate stated above is a consequence of the corresponding large deviation bound.
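One standard way to see the Lipschitz behaviour of $A \mapsto AA^\dagger$ (possibly not the exact route taken in the papers) uses the inequality $\|XY\|_2 \le \|X\|_\infty \|Y\|_2$:

```latex
\[
AA^\dagger - BB^\dagger = A\,(A-B)^\dagger + (A-B)\,B^\dagger ,
\]
\[
\|AA^\dagger - BB^\dagger\|_2
\le \bigl(\|A\|_\infty + \|B\|_\infty\bigr)\,\|A-B\|_2 .
\]
```

On the whole Hilbert-Schmidt sphere $\|A\|_\infty \le \|A\|_2 = 1$, which gives the constant 2; for $A, B \in T$ the same bound gives $O(1/\sqrt{n})$.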
Local Lévy’s lemma
Let $T \subset S^{m-1}$ be a subset of measure larger than 3/4. Let $f : S^{m-1} \to \mathbb{R}$ be a function such that the restriction of $f$ to $T$ is $L$-Lipschitz. Then, for every $\varepsilon > 0$,
$$\mathbf{P}(|f - M_f| > \varepsilon) \le \mathbf{P}(S^{m-1} \setminus T) + C \exp\left(-\frac{m}{2L^2}\,\varepsilon^2\right),$$
where $M_f$ is the median of $f$ and $C > 0$ a universal constant.
Recalculation of the exponent:
$$m = 2ns,\quad L = O(\sqrt{n}),\quad\text{so}\quad \frac{m}{2L^2} = \frac{ns}{L^2} = \Omega(s).$$
We had $\exp(-c(\varepsilon)s)$ in the Theorem, so this is about right.
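Step 1, estimating $\mathbf{E}\left\|\rho_{n,s} - \frac{I}{n}\right\|_{\mathcal{S}_0}$: the difficulty
The gauge $\|\cdot\|_{\mathcal{S}_0}$ is hard to work with directly (NP-hard to calculate; Gurvits 2003).
Yesterday’s argument was based on working with the “dual picture” and the somewhat easier dual quantity, $\|\cdot\|_{\mathcal{S}_0^\circ}$, where for $K \subset \mathbb{R}^m$
$$K^\circ := \{x \in \mathbb{R}^m : \langle x, y\rangle \le 1 \text{ for all } y \in K\},$$
and subsequently on estimating $w(\mathcal{S}_0) := \int_{S_{H_0}} \|u\|_{\mathcal{S}_0^\circ}\, du$.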
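Step 1, estimating $\mathbf{E}\left\|\rho_{n,s} - \frac{I}{n}\right\|_{\mathcal{S}_0}$: the argument
The argument is based on two facts/substeps, both non-trivial:
Substep (i) [Random matrices]: If $n, \frac{s}{n} \to \infty$, then
$$\mathbf{E}\left\|\rho_{n,s} - \frac{I}{n}\right\|_{\mathcal{S}_0} \sim \frac{1}{\sqrt{s}}\, w(\mathcal{S}_0^\circ) = \frac{1}{\sqrt{s}} \int_{S_{H_0}} \|u\|_{\mathcal{S}_0}\, du.$$
There are two-sided estimates with universal constants if $n \le s$. Note: $S_{H_0}$ is the Hilbert-Schmidt sphere in the space of trace 0 matrices. The equality is the definition of the mean width $w(\cdot)$.
Substep (ii) [Geometry]: $c d^{3/2} \le w(\mathcal{S}_0^\circ) \le C d^{3/2} \log d$
Once these are shown, we set $s_{\mathrm{ent}} := w(\mathcal{S}_0^\circ)^2$ and everything falls nicely into place. For example, if $s \ge (1+\varepsilon)\, s_{\mathrm{ent}}$, then
$$\mathbf{E}\left\|\rho_{n,s} - \frac{I}{n}\right\|_{\mathcal{S}_0} \sim \frac{1}{\sqrt{s}}\, w(\mathcal{S}_0^\circ) = \sqrt{\frac{s_{\mathrm{ent}}}{s}} \le \frac{1}{\sqrt{1+\varepsilon}} \sim 1 - \frac{\varepsilon}{2}.$$
Thus the mean (and the median) of $\left\|\rho_{n,s} - \frac{I}{n}\right\|_{\mathcal{S}_0}$ are $\frac{\varepsilon}{4}$-separated from 1 and so, by concentration, $\mathbf{P}\left(\left\|\rho_{n,s} - \frac{I}{n}\right\|_{\mathcal{S}_0} \le 1\right) \approx 1$.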
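The separation claim can be sanity-checked numerically under the heuristic that the mean gauge equals $\sqrt{s_{\mathrm{ent}}/s}$ exactly (an idealization of Substep (i), which holds only asymptotically). The function name and the placeholder value of `s_ent` below are assumptions for illustration, not computed thresholds.

```python
import math

def predicted_mean_gauge(s, s_ent):
    """Heuristic value sqrt(s_ent / s) of the mean gauge from Substep (i)."""
    return math.sqrt(s_ent / s)

s_ent = 1000.0                      # placeholder value, not a computed threshold
checks = []
for eps in (0.05, 0.1, 0.5, 1.0):
    # Smallest s allowed in the regime s >= (1 + eps) * s_ent.
    above = predicted_mean_gauge((1 + eps) * s_ent, s_ent)
    checks.append((eps, above, above <= 1 - eps / 4))
all_separated = all(ok for _, _, ok in checks)
```

For each sampled ε in (0, 1] the predicted mean stays below $1 - \varepsilon/4$, which is the margin the concentration step then exploits.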
Substep (ii): $c d^{3/2} \le w(\mathcal{S}_0^\circ) \le C d^{3/2} \log d$
Recall: $w(\mathcal{S}_0^\circ)$ is the average of $\|\cdot\|_{\mathcal{S}_0}$, which is hard to work with directly. However, for the dual quantity $w(\mathcal{S}_0) = w(\mathcal{S})$ we have
Fact (Aubrun-S.) The mean width $w(\mathcal{S}) = \Theta(n^{-3/4}) = \Theta(d^{-3/2})$.
One always has $w(K)\, w(K^\circ) \ge 1$, which yields $w(\mathcal{S}_0^\circ) \ge c d^{3/2}$.
If $K$ is centrally symmetric and is in the so-called $\ell$-position, the reverse inequality almost holds:
Fact (The $MM^*$-estimate; Pisier, Figiel–Tomczak-Jaegermann) $w(K)\, w(K^\circ) = O(\log \dim K)$, so $w(K^\circ) = O\left(w(K)^{-1} \log \dim K\right)$.
Note: $K$ being in the $\ell$-position means that $K$ is isotropic in some precise technical sense; this can always be achieved by rescaling.
If we had the $MM^*$-estimate for $\mathcal{S}_0$, it would follow that $w(\mathcal{S}_0^\circ) = O\left(d^{3/2} \log d\right)$, hence $s_{\mathrm{ent}} = w(\mathcal{S}_0^\circ)^2 = O(d^3 \log^2 d)$, exactly as needed.
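The inequality $w(K)\,w(K^\circ) \ge 1$ follows from Cauchy-Schwarz together with the pointwise bound $\|u\|_K \|u\|_{K^\circ} \ge \langle u, u\rangle = 1$ on the unit sphere. As an illustration on a toy body (not on $\mathcal{S}_0$), the sketch below estimates both mean widths by Monte Carlo for the cube $K = [-1,1]^m$, whose polar is the cross-polytope; the function names, m = 16 and the sample count are assumptions.

```python
import random

def random_unit_vector(m, rng):
    """Uniform random point on the unit sphere of R^m."""
    g = [rng.gauss(0, 1) for _ in range(m)]
    norm = sum(v * v for v in g) ** 0.5
    return [v / norm for v in g]

def mean_widths_cube_and_polar(m, samples, rng):
    """Monte Carlo estimates of w(K) and w(K polar) for K = [-1, 1]^m."""
    w_cube = w_polar = 0.0
    for _ in range(samples):
        u = random_unit_vector(m, rng)
        w_cube += sum(abs(v) for v in u)    # support function of K: l1 norm of u
        w_polar += max(abs(v) for v in u)   # support function of K polar: max norm of u
    return w_cube / samples, w_polar / samples

rng = random.Random(2)
w_K, w_Kpolar = mean_widths_cube_and_polar(m=16, samples=2000, rng=rng)
product = w_K * w_Kpolar
```

The product is at least 1 even for the empirical averages, since the pointwise bound and Cauchy-Schwarz apply to any finite sample of unit vectors.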
The workaround
One guarantee for $K \subset \mathbb{R}^m$ to be in the $\ell$-position is when the isometry group of $K$ acts irreducibly on $\mathbb{R}^m$.
This is not the case if $K = S_0$, but still there are sufficiently many isometries to imply the $MM^*$-estimate via simple representation theory.
Specifically, the action of the isometry group (local unitaries) splits into three irreducible factors, one of dimension $(d^2 - 1)^2$ and two of dimension $d^2 - 1$. It then follows from general theory that to bring $S_0$ to the $\ell$-position we only need to apply some dilations in the two smaller factors, and since their dimensions are relatively small, this does not affect in a major way the mean width of $S_0$ or its polar.
Oops... need central symmetry for the $MM^*$-estimate and $S_0$ is not symmetric... there is another workaround based on Santaló, inverse Santaló, Urysohn, Rogers-Shephard inequalities...
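As a quick consistency check on the dimension count for the three factors, one can write the decomposition out. This is a sketch under an assumption: the slide does not name the factors, and the identification below (with $M_d^{sa,0}$ denoting traceless Hermitian $d \times d$ matrices, a notation introduced here) is the usual splitting of traceless Hermitian matrices on $\mathbb{C}^d \otimes \mathbb{C}^d$ under conjugation by $U \otimes V$.

```latex
\begin{align*}
H_0 &= \big(M_d^{sa,0} \otimes M_d^{sa,0}\big) \;\oplus\; \big(M_d^{sa,0} \otimes I\big) \;\oplus\; \big(I \otimes M_d^{sa,0}\big),\\
\dim H_0 &= (d^2 - 1)^2 + 2(d^2 - 1) = (d^2 - 1)(d^2 + 1) = d^4 - 1 = n^2 - 1,\\
\frac{2(d^2 - 1)}{d^4 - 1} &= \frac{2}{d^2 + 1} \longrightarrow 0 \quad (d \to \infty).
\end{align*}
```

The last line quantifies "relatively small": the two smaller factors together account for a vanishing fraction of the total dimension.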
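Substep (i): $\mathbb{E}\,\|\rho_{n,s} - \tfrac{I}{n}\|_{S_0} \sim \frac{1}{\sqrt{s}} \int_{S_{H_0}} \|u\|_{S_0}\,du$
(true for any gauge $\|\cdot\|_K$)
The trick: linearization
We approximate $\rho_{n,s} - \tfrac{I}{n}$ by $\frac{1}{n\sqrt{s}}\,\mathrm{GUE}_0$, where $\mathrm{GUE}_0$ is the standard $H_0$-valued random Gaussian matrix, and use the relationship between the spherical and the Gaussian mean
$\mathbb{E}\,\|\cdot\|_K \sim \sqrt{m} \int_{S^{m-1}} \|u\|_K\,du$
($\mathbb{E}$ with respect to the standard Gaussian measure on $\mathbb{R}^m$)
The bottom line is then, with $m = \dim H_0 = n^2 - 1$,
$\mathbb{E}\,\|\rho_{n,s} - \tfrac{I}{n}\|_{S_0} \sim \frac{1}{n\sqrt{s}}\,\mathbb{E}\,\|\mathrm{GUE}_0\|_{S_0} \sim \frac{\sqrt{n^2 - 1}}{n\sqrt{s}} \int_{S_{H_0}} \|u\|_{S_0}\,du$
as needed.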
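The constant in the spherical-versus-Gaussian comparison can be checked numerically. The sketch below relies on a standard fact not stated on the slide: for a standard Gaussian vector $G$ in $\mathbb{R}^m$, the radius $|G|$ and the direction $G/|G|$ are independent, so $\mathbb{E}\|G\|_K = \mathbb{E}|G| \cdot \int_{S^{m-1}} \|u\|_K\,du$ with $\mathbb{E}|G| = \sqrt{2}\,\Gamma((m+1)/2)/\Gamma(m/2)$. The function names are hypothetical.

```python
import math

def gaussian_radius_mean(m):
    """E|G| for a standard Gaussian vector G in R^m:
    sqrt(2) * Gamma((m + 1) / 2) / Gamma(m / 2), computed via log-gamma."""
    return math.sqrt(2.0) * math.exp(math.lgamma((m + 1) / 2) - math.lgamma(m / 2))

def ratio_to_sqrt_m(m):
    """E|G| / sqrt(m); tends to 1 as m grows, which is the '~ sqrt(m)' above."""
    return gaussian_radius_mean(m) / math.sqrt(m)

if __name__ == "__main__":
    # m = n^2 - 1 is the dimension of H_0 for n x n matrices.
    for n in (2, 4, 16, 64):
        m = n * n - 1
        print(n, m, ratio_to_sqrt_m(m))
```

The ratio stays below 1 (by Cauchy-Schwarz, $\mathbb{E}|G| \le \sqrt{m}$) and approaches 1 quickly, so replacing $\mathbb{E}|G|$ by $\sqrt{n^2 - 1}$, and then by $n$, costs only a factor $1 + o(1)$.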
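Approximating $\rho_{n,s} - \tfrac{I}{n}$ by $\frac{1}{n\sqrt{s}}\,\mathrm{GUE}_0$, sanity check
Asymptotically, the spectrum of $\rho_{n,s} = AA^\dagger$ lives on the interval
$\frac{1}{n} \times \Big[\big(1 - \sqrt{n/s}\big)^2, \big(1 + \sqrt{n/s}\big)^2\Big] \approx \Big[\frac{1}{n} - \frac{2}{\sqrt{ns}},\ \frac{1}{n} + \frac{2}{\sqrt{ns}}\Big]$,
from Marchenko-Pastur, while, by Wigner, the spectrum of $\frac{1}{\sqrt{n}}\,\mathrm{GUE}_0$ lives on $[-2, 2]$, so at least the scaling is right: both $\rho_{n,s} - \tfrac{I}{n}$ and $\frac{1}{n\sqrt{s}}\,\mathrm{GUE}_0$ have spectrum on $\big[-\frac{2}{\sqrt{ns}},\ \frac{2}{\sqrt{ns}}\big]$.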
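The "≈" in the sanity check can be made quantitative with a few lines of arithmetic. The sketch below only evaluates the two asymptotic edge formulas quoted above (it does not sample random matrices); the function names are hypothetical.

```python
import math

def marchenko_pastur_edges(n, s):
    """Edges of (1/n) * [(1 - sqrt(n/s))^2, (1 + sqrt(n/s))^2], shifted by -1/n,
    i.e. the asymptotic spectral edges of rho_{n,s} - I/n."""
    r = math.sqrt(n / s)
    return ((1 - r) ** 2 - 1) / n, ((1 + r) ** 2 - 1) / n

def semicircle_edges(n, s):
    """Edges of [-2, 2] rescaled by 1/sqrt(ns), i.e. those of GUE_0 / (n sqrt(s))."""
    e = 2 / math.sqrt(n * s)
    return -e, e

def relative_gap(n, s):
    """Largest edge discrepancy, relative to the half-width 2/sqrt(ns)."""
    mp_lo, mp_hi = marchenko_pastur_edges(n, s)
    sc_lo, sc_hi = semicircle_edges(n, s)
    return max(abs(mp_lo - sc_lo), abs(mp_hi - sc_hi)) / sc_hi
```

Expanding the squares shows that each edge is off by exactly $1/s$, so the relative gap equals $\tfrac{1}{2}\sqrt{n/s}$; it is negligible precisely in the regime $s/n \to \infty$.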
Approximating $\rho_{n,s} - \tfrac{I}{n}$ by $\frac{1}{n\sqrt{s}}\,\mathrm{GUE}_0$, for real
Fact (Wigner, Geman?, Bai-Yin) Both $\sqrt{ns}\,\big(\rho_{n,s} - \tfrac{I}{n}\big)$ and $\frac{1}{\sqrt{n}}\,\mathrm{GUE}_0$ converge to the standard semicircular law as $n, \frac{s}{n} \to \infty$.
Usually cited: weak convergence of $n^{-1} \sum_{i=1}^{n} \delta_{\lambda_i(X)}$, but also extreme eigenvalues converge to the endpoints of the support of the limit spectrum.
Can be subsumed as convergence in the ∞-Wasserstein distance:
$d_\infty(\mu, \nu) := \inf \|X - Y\|_{L_\infty}$, with infimum over all couples $(X, Y)$ of random variables with laws $\mu$ and $\nu$, defined on a common probability space.
The needed equivalence $\mathbb{E}\,\|\rho_{n,s} - \tfrac{I}{n}\|_{S_0} \sim \mathbb{E}\,\|\frac{1}{n\sqrt{s}}\,\mathrm{GUE}_0\|_{S_0}$ will now follow from the general theory. However, since the gauge $\|\cdot\|_{S_0}$ is rather poorly continuous, some finesse is needed.
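To get a feel for the ∞-Wasserstein distance, here is a minimal sketch for the simplest case: two empirical measures on the real line with equally many atoms, each of weight $1/n$. It assumes the standard fact (not stated on the slide) that on the real line the monotone coupling, which pairs the sorted atoms, attains the infimum; the function name is hypothetical.

```python
def d_infinity(x, y):
    """Infinity-Wasserstein distance between the empirical measures of x and y.

    Assumes len(x) == len(y) and that every atom has weight 1/n, in which case
    pairing the sorted values is an optimal coupling on the real line."""
    if len(x) != len(y):
        raise ValueError("this sketch assumes equally many atoms")
    return max(abs(a - b) for a, b in zip(sorted(x), sorted(y)))

if __name__ == "__main__":
    bulk = [0.0] * 99
    # Moving a single atom (one 'eigenvalue' out of 100) far away barely matters
    # for weak convergence, but it is fully visible to d_infinity.
    print(d_infinity(bulk + [0.0], bulk + [10.0]))
```

This is why $d_\infty$ packages both statements of the Fact at once: it controls the bulk of the spectrum and the extreme eigenvalues simultaneously.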
∞-Wasserstein distance and gauges
Let $\mathbb{R}^{n,0} := \{(x_j) \in \mathbb{R}^n : \sum_j x_j = 0\}$; $\nu_x := \frac{1}{n} \sum_j \delta_{x_j}$ for $x \in \mathbb{R}^n$.
Proposition Let $\mu$ be a 0-centered non-point mass measure. Then $\forall \varepsilon > 0\ \exists \eta > 0$ such that if
• $x, y \in \mathbb{R}^{n,0}$ verify $d_\infty(\nu_x, \mu) \le \eta$ and $d_\infty(\nu_y, \mu) \le \eta$, and
• $\phi$ is a permutationally invariant convex function on $\mathbb{R}^{n,0}$,
then $\phi(x) \le \phi((1 + \varepsilon)y)$.
The link to our context is as follows. If $K$ is a convex body in $H_0 \subset M_n^{sa}$ (containing 0 in the interior), we set
$\phi_K(x) = \int_{U(n)} \|U\,\mathrm{Diag}(x)\,U^\dagger\|_K\,dU$
We now apply the Proposition with $\phi = \phi_{S_0}$, $\mu$ the semicircular distribution, $x, y$ the spectra of the two ensembles, and take the expected values. In addition to the convergence of the ensembles to a common limit spectral distribution, the calculation uses their invariance under conjugation by unitaries.
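The reasons why $\phi_K$ qualifies for the Proposition, and why taking expected values recovers the quantities of interest, can be spelled out as follows. This is a sketch of the routine verification, not taken from the slides; $P$ denotes a permutation matrix, $\lambda \in [0,1]$, and $X$ a random matrix in $H_0$ with spectrum $\mathrm{spec}(X)$ (notation introduced here).

```latex
\begin{align*}
\phi_K(Px) &= \int_{U(n)} \big\| U P\,\mathrm{Diag}(x)\,P^{\dagger} U^{\dagger} \big\|_K\,dU = \phi_K(x)
  && \text{(Haar measure is right-invariant)},\\
\phi_K\big(\lambda x + (1 - \lambda) y\big) &\le \lambda\,\phi_K(x) + (1 - \lambda)\,\phi_K(y)
  && \text{(an average of convex functions of } x\text{)},\\
\mathbb{E}\,\|X\|_K &= \mathbb{E}\,\phi_K\big(\mathrm{spec}(X)\big)
  && \text{(if the law of } X \text{ is invariant under } X \mapsto U X U^{\dagger}\text{)}.
\end{align*}
```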
The concept behind the Proposition: majorization
For $x, y \in \mathbb{R}^{n,0}$, we write $x \prec y$ if, for every $k \in \{1, \dots, n\}$,
$\sum_{i=1}^{k} x_i^{\downarrow} \le \sum_{i=1}^{k} y_i^{\downarrow}$,
where $x^{\downarrow}$ is the non-increasing rearrangement of $x$. This is equivalent to each of the following
(a) whenever $\phi$ is a permutationally invariant convex function on $\mathbb{R}^{n,0}$, then $\phi(x) \le \phi(y)$
(b) for every $t \in \mathbb{R}$, we have $\sum_{i=1}^{n} |x_i - t| \le \sum_{i=1}^{n} |y_i - t|$
Note that the assertion of the Proposition is in the spirit of (a), while $\frac{1}{n} \sum_{i=1}^{n} |x_i - t|$ from (b) can be rewritten as $\int \Phi\,d\nu_x$, where $\Phi(u) := |u - t|$, and so can be related to convergence of measures.
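The definition and criterion (b) are easy to test on small examples. The sketch below uses hypothetical function names and assumes, as on the slide, that both vectors have the same length and the same sum; under that assumption the difference of the two sides of (b) is piecewise linear in $t$ and vanishes for large $|t|$, so it suffices to test (b) at the coordinates of $x$ and $y$.

```python
from itertools import accumulate

def is_majorized_by(x, y, tol=1e-12):
    """True if x ≺ y: every partial sum of the non-increasing rearrangement of x
    is at most the corresponding partial sum for y."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    partial_x = accumulate(sorted(x, reverse=True))
    partial_y = accumulate(sorted(y, reverse=True))
    return all(a <= b + tol for a, b in zip(partial_x, partial_y))

def satisfies_b(x, y, tol=1e-12):
    """Criterion (b), tested only at the breakpoints t in x ∪ y
    (sufficient when x and y have equal length and equal sums)."""
    def total(v, t):
        return sum(abs(vi - t) for vi in v)
    return all(total(x, t) <= total(y, t) + tol for t in list(x) + list(y))

if __name__ == "__main__":
    x, y = [1, 0, -1], [2, 0, -2]   # both sum to 0, so both lie in R^{3,0}
    print(is_majorized_by(x, y), satisfies_b(x, y))
    print(is_majorized_by(y, x), satisfies_b(y, x))
```

On this example the two tests agree in both directions, consistent with the stated equivalence; a permutationally invariant convex function such as $\phi(v) = \max_i |v_i|$ gives $\phi(x) = 1 \le 2 = \phi(y)$, as (a) predicts.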
THANK YOU