
Deep learning and the AdS/CFT correspondence

Koji Hashimoto,1 Sotaro Sugishita,1 Akinori Tanaka,2,3,4 and Akio Tomiya5

1Department of Physics, Osaka University, Toyonaka, Osaka 560-0043, Japan
2Mathematical Science Team, RIKEN Center for Advanced Intelligence Project (AIP), 1-4-1 Nihonbashi, Chuo-ku, Tokyo 103-0027, Japan
3Department of Mathematics, Faculty of Science and Technology, Keio University, 3-14-1 Hiyoshi, Kouhoku-ku, Yokohama 223-8522, Japan
4Interdisciplinary Theoretical & Mathematical Sciences Program (iTHEMS), RIKEN, 2-1 Hirosawa, Wako, Saitama 351-0198, Japan
5Key Laboratory of Quark & Lepton Physics (MOE) and Institute of Particle Physics, Central China Normal University, Wuhan 430079, China

(Received 18 March 2018; published 27 August 2018)

We present a deep neural network representation of the AdS/CFT correspondence, and demonstrate the emergence of the bulk metric function via the learning process for given data sets of response in boundary quantum field theories. The emergent radial direction of the bulk is identified with the depth of the layers, and the network itself is interpreted as a bulk geometry. Our network provides a data-driven holographic modeling of strongly coupled systems. With a scalar φ⁴ theory with unknown mass and coupling, in an unknown curved spacetime with a black hole horizon, we demonstrate that our deep learning (DL) framework can determine the systems that fit given response data. First, we show that, from boundary data generated by the anti–de Sitter (AdS) Schwarzschild spacetime, our network can reproduce the metric. Second, we demonstrate that our network with experimental data as an input can determine the bulk metric, the mass and the quartic coupling of the holographic model. As an example we use the experimental data of the magnetic response of the strongly correlated material Sm0.6Sr0.4MnO3. This AdS/DL correspondence not only enables gravitational modeling of strongly correlated systems, but also sheds light on a hidden mechanism of the emerging space in both AdS and DL.

DOI: 10.1103/PhysRevD.98.046019

Published by the American Physical Society under the terms of the Creative Commons Attribution 4.0 International license. Further distribution of this work must maintain attribution to the author(s) and the published article's title, journal citation, and DOI. Funded by SCOAP3.

I. INTRODUCTION

The AdS/CFT correspondence [1–3], a renowned holographic relation between d-dimensional quantum field theories (QFTs) and (d+1)-dimensional gravity, has been frequently applied to strongly coupled QFTs, including QCD and condensed matter systems. For phenomenology, holographic modelings have been successful only for a restricted class of systems in which symmetries are manifest, mainly because the mechanism behind holography is still unknown. For a given quantum system, we do not know whether its gravity dual exists, nor how to construct a holographic model of it.

Suppose one is given experimental data of the linear/nonlinear response of a quantum system under some external field: can one model it holographically, i.e., can one solve the inverse problem? In this paper we employ deep learning (DL) [4–6], an active subject of computational science, to provide a data-driven holographic gravity modeling of strongly coupled quantum systems. In conventional holographic modeling, a chosen gravity metric calculates QFT observables, which are then compared with experimental data. In our novel DL method, experimental data determines a suitable bulk metric function [7], which can then be used to predict other observables.

Our strategy is simple: we provide a deep neural network representation of a scalar field equation in (d+1)-dimensional curved spacetime. The discretized holographic ("AdS radial") direction constitutes the deep layers; see Fig. 1. The weights of the neural network to be trained are identified with a metric component of the curved spacetime. The input response data is given at the AdS boundary, and the output binomial data encodes the black hole horizon condition. Therefore, successful machine learning results in a concrete metric of a holographic model of the system measured by the experiment [11]. We call this implementation of the holographic model into a deep neural network the AdS/DL correspondence.

We check that the holographic DL modeling works nicely with the popular anti–de Sitter (AdS) Schwarzschild metric, by showing that the metric is successfully learned and reproduced by the DL framework. Then we proceed to use experimental data of the magnetic response of Sm0.6Sr0.4MnO3, which is known to have strong quantum fluctuations, and demonstrate the emergence of a bulk metric via the AdS/DL correspondence.

Our study gives a first concrete implementation of the AdS/CFT correspondence in deep neural networks. We show the emergence of a smooth geometry from given experimental data, which opens the possibility of unraveling the mystery of emergent geometry in the AdS/CFT correspondence with the help of the active research in DL. A similarity between the AdS/CFT correspondence and DL was discussed recently [12,13], and it can also be discussed using tensor networks and the AdS/MERA correspondence [17,18].

Let us briefly review a standard deep neural network. It consists of layers (see Fig. 1), and between adjacent layers a linear transformation x_i → W_{ij} x_j and a nonlinear transformation known as an activation function, x_i → φ(x_i), are successively performed. The final layer summarizes all the components of the vector, so the output of the neural network is

y(x^(1)) = f_i φ( W^(N−1)_{ij} φ( W^(N−2)_{jk} ⋯ φ( W^(1)_{lm} x^(1)_m ) ) ).   (1)

In the learning process, the variables of the network (f_i, W^(n)_{ij}) for n = 1, 2, …, N − 1 are updated by a gradient descent method with a given loss function of the L1-norm error,

E ≡ Σ_data |y(x^(1)) − ȳ| + E_reg(W).   (2)

Here the sum is over the whole set of pairs {(x^(1), ȳ)} of the input data x^(1) and the output data ȳ. The regularization E_reg is introduced to require the expected properties of the weights [64].
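As an illustration of Eqs. (1) and (2), here is a minimal sketch of such a network in PyTorch, the library used later in this paper [70]; the class, names, and the choice of tanh as a stand-in activation are ours, not the authors':

    import torch
    import torch.nn as nn

    # Generic deep network of Eq. (1): alternating linear maps W and a
    # componentwise activation, with a final summarizing vector f_i.
    class FeedForward(nn.Module):
        def __init__(self, width=2, depth=9):
            super().__init__()
            self.layers = nn.ModuleList(
                [nn.Linear(width, width, bias=False) for _ in range(depth)])
            self.f = nn.Linear(width, 1, bias=False)   # the vector f_i

        def forward(self, x):
            for W in self.layers:
                x = torch.tanh(W(x))                   # x -> phi(W x)
            return self.f(x)

    # L1-norm loss of Eq. (2), with a simple weight regularization E_reg.
    def loss_fn(y_pred, y_data, model, c=1e-3):
        e_reg = c * sum(W.weight.pow(2).sum() for W in model.layers)
        return (y_pred - y_data).abs().sum() + e_reg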

II. NEURAL NETWORK OF SCALAR FIELD IN AdS

Let us embed the scalar field theory into a deep neural network. A scalar field theory in (d+1)-dimensional curved spacetime is written as

S = ∫ d^{d+1}x √(−det g) [ −(1/2)(∂_μ φ)² − (1/2) m² φ² − V(φ) ].   (3)

For simplicity we consider field configurations that depend only on η (the holographic direction). Here the generic metric is given by

ds² = −f(η) dt² + dη² + g(η)(dx₁² + ⋯ + dx_{d−1}²),   (4)

with the asymptotic AdS boundary condition f ≈ g ≈ exp[2η/L] (η ≈ ∞), where L is the AdS radius, and another boundary condition at the black hole horizon, f ≈ η², g ≈ constant (η ≈ 0). The classical equation of motion for φ(η) is

∂_η π + h(η) π − m² φ − δV[φ]/δφ = 0,   π ≡ ∂_η φ,   (5)

where we have defined π so that the equations become first order in derivatives. The metric dependence is combined into h(η) ≡ ∂_η log √(f(η) g(η)^{d−1}). Discretizing the radial η direction, the equations are rewritten as

φ(η + Δη) = φ(η) + Δη π(η),
π(η + Δη) = π(η) − Δη [ h(η) π(η) − m² φ(η) − δV(φ)/δφ(η) ].   (6)

We regard these equations as a propagation equation on a neural network, from the boundary η = ∞, where the input data (φ(∞), π(∞)) is given, to the black hole horizon η = 0 for the output data; see Fig. 2. The N layers of the deep neural network correspond to the discretized radial direction η, which is the emergent space in AdS: η^(n) ≡ (N − n + 1)Δη. The input data x^(1)_i of the neural network is the two-dimensional real vector (φ(∞), π(∞))^T. So the linear algebra part of the neural network (the solid lines in Fig. 1) is automatically provided by

W^(n) = \begin{pmatrix} 1 & Δη \\ Δη m² & 1 − Δη h(η^(n)) \end{pmatrix}.   (7)

The activation function at each layer reproducing Eq. (6) is

φ(x₁) = x₁,
φ(x₂) = x₂ + Δη δV(x₁)/δx₁.   (8)

FIG. 1. The AdS/CFT correspondence and DL. Top: A typical view of the AdS/CFT correspondence. The CFT at a finite temperature lives at a boundary of asymptotically AdS spacetime, with a black hole horizon at the other end. Bottom: A typical deep learning neural network.

The definitions (7) and (8) bring the scalar field system in the curved geometry (3) into the form of the neural network (1) [65].
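For concreteness, the discretized propagation (6) with the weights (7) and activation (8) can be sketched as a PyTorch module in which the metric values h(η^(n)) are the trainable parameters. This is a minimal illustration under the choices used later in the text (V(φ) = (λ/4)φ⁴, N = 10, Δη = 0.1); the class name and all coding details are ours, not the authors' (their setup is described in Appendix D):

    import torch
    import torch.nn as nn

    # Each of the N layers performs one step of Eq. (6). The trainable
    # parameters are the metric values h(eta^(n)); the quartic potential
    # V(phi) = (lam/4) phi^4 gives dV/dphi = lam * phi^3, cf. Eq. (8).
    class ScalarBulkNet(nn.Module):
        def __init__(self, n_layers=10, d_eta=0.1, m2=-1.0, lam=1.0):
            super().__init__()
            self.h = nn.Parameter(torch.randn(n_layers))  # h(eta^(n)), learned
            self.d_eta, self.m2, self.lam = d_eta, m2, lam

        def forward(self, phi, pi):
            for h_n in self.h:
                dV = self.lam * phi**3
                phi, pi = (phi + self.d_eta * pi,
                           pi - self.d_eta * (h_n * pi - self.m2 * phi - dV))
            return phi, pi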

III. RESPONSE AND INPUT/OUTPUT DATA

In the AdS/CFT correspondence, asymptotically AdS spacetime provides a boundary condition of the scalar field corresponding to the response data of the QFT. With the AdS radius L, asymptotically h(η) ≈ d/L. The external field value J (the coefficient of a non-normalizable mode of φ) and its response ⟨O⟩ (that of a normalizable mode) in the QFT are [66], in units of L = 1, related by the linear map

φ(η_ini) = [ J exp(−Δ₋ η_ini) + ⟨O⟩ exp(−Δ₊ η_ini) ] / (Δ₊ − Δ₋),
π(η_ini) = [ −J Δ₋ exp(−Δ₋ η_ini) − ⟨O⟩ Δ₊ exp(−Δ₊ η_ini) ] / (Δ₊ − Δ₋),   (9)

with Δ± ≡ (d/2) ± √(d²/4 + m²L²) (Δ₊ is the conformal dimension of the QFT operator O corresponding to the bulk scalar φ). The value η = η_ini ≈ ∞ is the regularized cutoff of the asymptotic AdS spacetime. We use Eq. (9) to convert the response data of the QFT into the input data of the neural network.
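As a small sketch, the linear map (9) can be coded as follows (the helper name is ours; J and O stand for the external field and the response ⟨O⟩):

    import numpy as np

    # Convert QFT response data (J, <O>) into the network input (phi, pi)
    # at the cutoff eta_ini, via Eq. (9), with d = 3 and L = 1 as in the text.
    def boundary_input(J, O, m2=-1.0, d=3, eta_ini=1.0):
        root = np.sqrt(d**2 / 4 + m2)          # sqrt(d^2/4 + m^2 L^2)
        dp, dm = d / 2 + root, d / 2 - root    # Delta_plus, Delta_minus
        phi = (J * np.exp(-dm * eta_ini) + O * np.exp(-dp * eta_ini)) / (dp - dm)
        pi = (-J * dm * np.exp(-dm * eta_ini)
              - O * dp * np.exp(-dp * eta_ini)) / (dp - dm)
        return phi, pi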

The input data at η = η_ini propagates in the neural network toward η = 0 (the horizon). If the input data is positive, the output at the final layer should satisfy the boundary condition of the black hole horizon (see, e.g., Ref. [67]),

0 = F ≡ [ (2/η) π − m² φ − δV(φ)/δφ ]_{η = η_fin}.   (10)

Here η = η_fin ≈ 0 is the horizon cutoff. Our final layer is defined by the map F, and the output data is ȳ = 0 for positive-answer response data (J, ⟨O⟩). In the limit η_fin → 0, the condition (10) is equivalent to π(η = 0) = 0.

With this definition of the network and of the training data, we can make the deep neural network learn the metric component function h(η), the parameter m and the interaction V[φ]. The training uses the loss function E given by Eq. (2) [68]. Experiments provide only positive-answer data {(J, ⟨O⟩); ȳ = 0}, while for the training we also need negative-answer data {(J, ⟨O⟩); ȳ = 1}. It is easy to generate false-response data (J, ⟨O⟩), and we assign the output ȳ = 1 to them. To make the final output of the neural network binary, we use the function tanh|F| (or a variant of it) for the final layer rather than F itself, because tanh|F| gives ≈ 1 for any negative-answer input.
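A minimal sketch of this final layer (ours; it assumes the quartic potential of the next section, so δV/δφ = λφ³):

    import torch

    # Horizon consistency F of Eq. (10) at the cutoff eta_fin, squashed by
    # tanh|F| so that data far from the horizon condition map to ~1.
    def final_layer(phi, pi, eta_fin=0.1, m2=-1.0, lam=1.0):
        F = (2 / eta_fin) * pi - m2 * phi - lam * phi**3
        return torch.tanh(F.abs())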

IV. LEARNING TEST: AdS SCHWARZSCHILD BLACK HOLE

To check whether this neural network can learn the bulk metric, we first demonstrate a learning test. We will see that, with data generated by a known AdS Schwarzschild metric, our neural network can learn and reproduce the metric [69]. We work here with d = 3 in units L = 1. The metric is

h(η) = 3 coth(3η),   (11)

and we discretize the η direction into N = 10 layers with η_ini = 1 and η_fin = 0.1. We fix for simplicity m² = −1 and V[φ] = (λ/4)φ⁴ with λ = 1. We then generate positive-answer data with the neural network carrying the discretized Eq. (11), by collecting randomly generated pairs (φ(η_ini), π(η_ini)) giving |F| < ε, where ε = 0.1 is a cutoff. The negative-answer data are generated similarly under the criterion |F| > ε. We collect 1000 positive and 1000 negative data points; see Fig. 3. Since we are interested in the smooth continuum limit of h(η), with the horizon boundary condition h(η) ≈ 1/η (η ≈ 0), we introduce the regularization

E_reg^(1) ≡ c_reg Σ_{n=1}^{N−1} (η^(n))⁴ ( h(η^(n+1)) − h(η^(n)) )² ∝ ∫ dη (h′(η) η²)²,

with c_reg = 10⁻³.
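In code, this regularization is essentially one line (a sketch, assuming tensors h and eta of length N holding the discretized metric and radial grid; names ours):

    import torch

    # Smoothness regularization E_reg^(1) ~ integral of (h'(eta) eta^2)^2.
    def e_reg1(h, eta, c_reg=1e-3):
        return c_reg * ((eta[:-1]**4) * (h[1:] - h[:-1])**2).sum()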

FIG. 2. The simplest deep neural network reproducing the homogeneous scalar field equation in a curved spacetime. The weights W are shown explicitly by solid lines, while the activation is not shown.

FIG. 3. The data generated by the discretized AdS Schwarzschild metric (11). Blue points are positive data (ȳ = 0) and green points are negative data (ȳ = 1).


We use the Python deep learning library PyTorch to implement our network [70]. The initial metric is randomly chosen. Setting the batch size to 10, we find that after 100 epochs of training our deep neural network has successfully learned h(η): the learned metric coincides with Eq. (11); see Fig. 4(b) [71]. A statistical analysis of 50 learned metrics [Fig. 4(c)] shows that the asymptotic AdS region is almost perfectly learned. The near-horizon region has a ≈30% systematic error, and a similar amount is expected for the following analysis with experimental data.

FIG. 4. Before the learning (a) and after the learning (b). (a-1) The (φ, π) plot at the first epoch. Blue and green dots are positive data; orange and green dots are data judged as "positive" by using the initial trial metric. (a-2) The orange line is the initial trial metric (randomly generated), while the blue line is the discretized AdS Schwarzschild metric (11). (b-1) The (φ, π) plot after training for 100 epochs. (b-2) The learned metric (orange line) almost coincides with the original AdS Schwarzschild metric, which means that our neural network successfully learned the bulk metric. (c) Statistical analysis of 50 learned metrics.
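Putting the pieces together, a minimal training-loop sketch (ours, not the authors' code [70]): it reuses the hypothetical ScalarBulkNet, final_layer and e_reg1 from the earlier sketches, and assumes tensors phi0, pi0, labels (the 2000 data points) and eta_grid (the radial grid) have been prepared.

    import torch

    net = ScalarBulkNet()
    opt = torch.optim.Adam(net.parameters())
    for epoch in range(100):
        for i in range(0, len(labels), 10):                # batch size 10
            phi, pi = net(phi0[i:i + 10], pi0[i:i + 10])   # propagate to horizon
            y = final_layer(phi, pi)
            loss = (y - labels[i:i + 10]).abs().sum() + e_reg1(net.h, eta_grid)
            opt.zero_grad()
            loss.backward()
            opt.step()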

V. EMERGENT METRIC FROM EXPERIMENTS

Since we have checked that the AdS Schwarzschild metric is successfully reproduced, we now apply the deep neural network to learn the bulk geometry for a given set of experimental data. We use experimental data of the magnetization curve (the magnetization M [μ_B/Mn] versus the external magnetic field H [Tesla]) for the three-dimensional material Sm0.6Sr0.4MnO3, which is known to have strong quantum fluctuations [72]; see Fig. 5. We employ the data set at the temperature 155 K, slightly above the critical temperature, since it exhibits a deviation from a linear M-H curve, suggesting a strong correlation. To form positive data we add random noise around the experimental data, and we also generate negative data positioned away from the positive data [73].

The same neural network is used, except that we add a new zeroth layer to relate the experimental data to (φ, π), motivated by Eq. (9):

φ(η_ini) = αH + βM,
π(η_ini) = −Δ₋ αH − Δ₊ βM.   (12)

We introduce the normalization parameters α and β to relate (H, M) to the bulk φ, and the asymptotic AdS radius d/h(∞) ≡ L is included in Δ± = (d/2)(1 ± √(1 + 4m²/h(∞)²)). In our numerical code we introduce a dimensionful parameter L_unit with which all the parameters are measured in units L_unit = 1. We add another regularization term, E_reg = E_reg^(1) + E_reg^(2) with E_reg^(2) ≡ c_reg^(2) ( h(η^(N)) − 1/η^(N) )², which forces h(η^(N)), the metric value near the horizon, to match the standard horizon behavior 1/η; see Appendix D for details. We chose N = 10 and c_reg^(2) = 10⁻⁴. In the machine-learning procedure, m and λ, and α and β, are trained, as well as the metric function h(η).

FIG. 5. Left: Experimental data of magnetization (M) versus magnetic field (H) for the material Sm0.6Sr0.4MnO3; the figure is taken from Ref. [72]. Right: Positive (blue) and negative (orange) data sets generated from the experimental data at the temperature 155 K, with random noise added.

We stop the training when the loss becomes smaller than 0.02, and collect 13 successful cases. The emergent metric function h(η) obtained by the machine-learning procedure is shown in Fig. 6. It approaches a constant at the boundary, meaning that it describes a proper asymptotically AdS spacetime. The obtained (dimensionless) parameters of the scalar field are m²L² = 5.6 ± 2.5 and λ/L = 0.61 ± 0.22 [74]. In this manner, a holographic model is determined numerically from the experimental data by the DL framework.

FIG. 6. Left: The result of the machine learning fitting the experimental data. Blue and green dots are positive experimental data; orange and green dots are data judged as "positive" by using the learned metric (center). The total loss after the training is 0.0096. Right: Statistical average of the 13 learned metrics that have a loss less than 0.02.
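A sketch of the zeroth layer (12) as a PyTorch module (ours; the initializations follow the distributions listed in Appendix D, and h_inf stands for the asymptotic value h(∞)):

    import torch
    import torch.nn as nn

    # Trainable alpha and beta, together with m^2 entering through Delta_pm,
    # map an experimental pair (H, M) to the bulk input (phi, pi), Eq. (12).
    class ZerothLayer(nn.Module):
        def __init__(self, d=3):
            super().__init__()
            self.alpha = nn.Parameter(torch.rand(1) * 2 - 1)  # uniform in [-1, 1]
            self.beta = nn.Parameter(torch.rand(1) * 2 - 1)
            self.m2 = nn.Parameter(torch.randn(1) + 2)        # m^2 ~ N(2, 1)
            self.d = d

        def forward(self, H, M, h_inf):
            root = torch.sqrt(1 + 4 * self.m2 / h_inf**2)     # cf. Delta_pm above
            d_minus = self.d / 2 * (1 - root)
            d_plus = self.d / 2 * (1 + root)
            phi = self.alpha * H + self.beta * M
            pi = -d_minus * self.alpha * H - d_plus * self.beta * M
            return phi, pi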

VI. SUMMARY AND OUTLOOK

We created a bridge between two major subjects concerning hidden dimensions: the AdS/CFT correspondence and DL. We initiated a data-driven holographic modeling of quantum systems by formulating the gravity dual on a deep neural network. We showed that, with an appropriate choice of the sparse network and of the input/output data, the AdS/DL correspondence is properly formulated, and standard machine learning works nicely for the automatic emergence of the bulk gravity for given response data of the boundary quantum systems.

Our aim was not to construct more accurate holographic models, but rather to solve an inverse problem. This will help model builders, since conventionally a good sense in choosing the metric has been necessary. Once a metric is inversely learned from data, it can be used to predict other observables. As any model building requires fitting experimental data, our method reduces the effort needed to find a better model. Since holographic modeling is currently used in many subjects, we believe our method has a wide range of applications in physics.

How can our study shed light on the mystery of the emergent spacetime in the AdS/CFT correspondence? A continuum limit of deep neural networks can accommodate arbitrarily nonlocal systems, as the network basically includes all-to-all interlayer connections. So the emergence of the new spatial dimension would need a reduction of the full DL parameter space. A criterion for finding a properly sparse neural network which can accommodate local bulk theories is missing, and the question is similar to that in the AdS/CFT correspondence, where the criteria for a QFT to have a gravity dual are still missing. At the same time, our work suggests that the bulk emergence could be a more generic phenomenon. For further exploration of the AdS/DL correspondence, we plan to formulate a "holographic autoencoder," motivated by the similarity between DL autoencoders and continuous MERA at finite temperature [75,76], and also by the thermofield formulation of the AdS/CFT correspondence [77,78]. The characterization of black hole horizons in DL may be a key to understanding the bulk emergence.

ACKNOWLEDGMENTS

We would like to thank H. Sakai for providing us with the experimental data. K. H. would like to thank S. Amari, T. Ohtsuki and N. Tanahashi for valuable discussions. The work of K. H. was supported in part by JSPS KAKENHI Grants No. JP15H03658, No. JP15K13483, and No. JP17H06462. S. S. is supported in part by the Grant-in-Aid for JSPS Research Fellows, Grant No. JP16J01004. The work of A. Tanaka was supported by the RIKEN Center for Advanced Intelligence Project. A. Tomiya was fully supported by Heng-Tong Ding. The work of A. Tomiya was supported in part by NSFC under Grant No. 11535012.

APPENDIX A: HAMILTONIAN SYSTEMS REALIZED BY A DEEP NEURAL NETWORK

Here we show that a restricted class of Hamiltonian systems can be realized by a deep neural network with a local activation function [79]. We consider a generic Hamiltonian H(p, q) and its Hamilton equations, and look for a deep neural network representation (1) of the time evolution generated by H(p, q). The time direction is discretized to form the layers. (For our AdS/CFT examples, the radial evolution corresponds to the time direction of the Hamiltonian considered here.)


Let us first try the following generic neural network and identify the time translation t → t + Δt with the interlayer propagation:

q(t + Δt) = φ₁( W₁₁ q(t) + W₁₂ p(t) ),
p(t + Δt) = φ₂( W₂₁ q(t) + W₂₂ p(t) ).   (A1)

This consists of successive actions of a linear transformation W and a local nonlinear transformation φ. The relevant part of the network is shown in the left panel of Fig. 7. The units x₁^(n) and x₂^(n) are directly identified with the canonical variables q(t) and p(t), with t = nΔt. We want to represent the Hamilton equations in the form (A1); it turns out that this is impossible except for free Hamiltonians.

In order for Eq. (A1) to be consistent at Δt = 0, we need to require

W₁₁ = 1 + O(Δt),  W₂₂ = 1 + O(Δt),  W₁₂ = O(Δt),  W₂₁ = O(Δt),  φ(x) = x + O(Δt).   (A2)

So we use the ansatz

W_{ij} = δ_{ij} + w_{ij} Δt,   φ_i(x) = x + g_i(x) Δt,   (A3)

where the w_{ij} (i, j = 1, 2) are constant parameters and the g_i(x) (i = 1, 2) are nonlinear functions. Substituting these into the original Eq. (A1) and taking the limit Δt → 0, we obtain

q̇ = w₁₁ q + w₁₂ p + g₁(q),   ṗ = w₂₁ q + w₂₂ p + g₂(p).   (A4)

In order for these equations to be Hamilton equations, we need to require the symplectic structure

(∂/∂q)( w₁₁ q + w₁₂ p + g₁(q) ) + (∂/∂p)( w₂₁ q + w₂₂ p + g₂(p) ) = 0.   (A5)

However, this equation does not allow any nonlinear activation function g_i(x). So we conclude that a simple identification of the units of the neural network with the canonical variables allows only linear Hamilton equations, and thus free Hamiltonians.

In order for a deep neural network representation to allow generic nonlinear Hamilton equations, we need to improve the identification of the units with the canonical variables, and also the identification of the layer propagation with the time translation. Let us instead try

x_i(t + Δt) = W̄_{ij} φ_j( W_{jk} x_k(t) ).   (A6)

The difference from Eq. (A1) is twofold: first, we let i, j, k = 0, 1, 2, 3 with x₁ = q and x₂ = p, meaning that we have additional units x₀ and x₃; second, we consider an extra multiplication by a linear W̄. So, in total, this consists of successive actions of a linear W transformation, a nonlinear local φ transformation and a linear W̄ transformation, and we interpret this set as a time translation Δt. Since we pile up these sets as many layers, the W̄ at t and the next W at t + Δt are combined into a single linear transformation W_{t+Δt} W̄_t, so the standard form (1) of the deep neural network is kept.

We arrange the following sparse weights and local activation functions:

W = \begin{pmatrix} 0 & 0 & v & 0 \\ 0 & 1 + w₁₁Δt & w₁₂Δt & 0 \\ 0 & w₂₁Δt & 1 + w₂₂Δt & 0 \\ 0 & u & 0 & 0 \end{pmatrix},   W̄ = \begin{pmatrix} 0 & 0 & 0 & 0 \\ λ₁ & 1 & 0 & 0 \\ 0 & 0 & 1 & λ₂ \\ 0 & 0 & 0 & 0 \end{pmatrix},

( φ₀(x₀), φ₁(x₁), φ₂(x₂), φ₃(x₃) ) = ( f(x₀)Δt, x₁, x₂, g(x₃)Δt ),   (A7)

FIG. 7. Left: A naive identification of the canonical variables q, p with the units, and of the time translation with the interlayer propagation. Right: An improved neural network whose continuum limit provides a nonlinear Hamiltonian system.

where u, v and the w_{ij} (i, j = 1, 2) are constant weights, and the φ_i(x_i) are local activation functions. The network is shown in the right panel of Fig. 7. Using this definition of the time translation, we arrive at

q̇ = w₁₁ q + w₁₂ p + λ₁ f(vp),   ṗ = w₂₁ q + w₂₂ p + λ₂ g(uq).   (A8)

Then the symplectic constraint requires w₁₁ + w₂₂ = 0, and the Hamiltonian is given by

H = w₁₁ pq + (1/2) w₁₂ p² − (1/2) w₂₁ q² + (λ₁/v) F(vp) − (λ₂/u) G(uq),   (A9)

where F′(x₀) = f(x₀) and G′(x₃) = g(x₃). This is the generic form of the nonlinear Hamiltonians which admit a deep neural network representation. Our scalar field equation in the curved geometry (5) is within this category. For example, choosing

w₁₁ = w₂₁ = 0,  w₁₂ = 1/m,  λ₁ = 0,  λ₂ = 1,  u = 1   (A10)

gives the familiar Hamiltonian for a nonrelativistic particle moving in a potential,

H = p²/(2m) − G(q).   (A11)

A more involved identification of the time translation with the layer propagation may be able to accommodate Hamiltonians which are not of the form (A9). We leave a generic discussion of this point for future investigations [80].
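To make the construction concrete, here is a small numerical sketch (ours, not from the paper) of the improved step (A6) with the sparse weights (A7): taking the choice (A10) with m = 1 and G(q) = −q²/2 (so g(q) = −q), the layer-by-layer propagation integrates the harmonic oscillator H = p²/2 + q²/2.

    import numpy as np

    # One 'time translation' per layer: x -> Wbar . phi(W . x), Eqs. (A6)-(A7).
    m, dt, n_steps = 1.0, 1e-3, 5000
    f = lambda x: 0.0       # lambda1 = 0, so f never contributes
    g = lambda x: -x        # g = G' for G(q) = -q^2/2

    W = np.array([[0.0, 0.0, 0.0, 0.0],     # v = 0 (row feeding f, unused)
                  [0.0, 1.0, dt / m, 0.0],  # 1 + w11*dt, w12*dt with w11 = 0
                  [0.0, 0.0, 1.0, 0.0],     # w21 = w22 = 0
                  [0.0, 1.0, 0.0, 0.0]])    # u = 1 (row feeding q into x3)

    x = np.array([0.0, 1.0, 0.0, 0.0])      # (x0, q, p, x3); start at q=1, p=0
    for _ in range(n_steps):
        y = W @ x
        y = np.array([f(y[0]) * dt, y[1], y[2], g(y[3]) * dt])  # activations
        x = np.array([0.0, y[1], y[2] + y[3], 0.0])             # Wbar, lambda2 = 1
    # (q, p) after t = 5 should stay close to (cos t, -sin t):
    print(x[1], x[2], np.cos(dt * n_steps), -np.sin(dt * n_steps))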

APPENDIX B: ERROR FUNCTION OF THE AdS SCALAR SYSTEM

For λ = 0, we can obtain an explicit expression for the error function (loss function) of the machine-learning procedure in our AdS scalar field system. The scalar field equation (5) can be formally solved in a path-ordered form,

\begin{pmatrix} π(η) \\ φ(η) \end{pmatrix} = P exp[ ∫_η^{η_ini} dη′ \begin{pmatrix} h(η′) & −m² \\ −1 & 0 \end{pmatrix} ] \begin{pmatrix} π(η_ini) \\ φ(η_ini) \end{pmatrix}.   (B1)

So, in the continuum limit of the discretized neural network, the output is given by

tanh |π(0)| = tanh | (1  0) P exp[ ∫_0^∞ dη \begin{pmatrix} h(η) & −m² \\ −1 & 0 \end{pmatrix} ] \begin{pmatrix} π(∞) \\ φ(∞) \end{pmatrix} |.   (B2)

Then the error function (2) becomes

E[h(η)] = Σ_{ {π(∞),φ(∞)} positive } ( tanh | (1  0) P exp[ ∫_0^∞ dη \begin{pmatrix} h(η) & −m² \\ −1 & 0 \end{pmatrix} ] \begin{pmatrix} π(∞) \\ φ(∞) \end{pmatrix} | )²
   + Σ_{ {π(∞),φ(∞)} negative } ( tanh | (1  0) P exp[ ∫_0^∞ dη \begin{pmatrix} h(η) & −m² \\ −1 & 0 \end{pmatrix} ] \begin{pmatrix} π(∞) \\ φ(∞) \end{pmatrix} | − 1 )².   (B3)

The learning process is equivalent to the following gradient flow equation with a fictitious time variable τ:

∂h(η; τ)/∂τ = −∂E[h(η; τ)]/∂h(η; τ).   (B4)

For the training in our numerical experiment using the experimental data, we chose the initial configuration of h(η) to be a constant (which corresponds to a pure AdS metric). For a constant h(η) = h, the error function can be evaluated explicitly with

π(0) = (1/(λ₊ − λ₋)) [ λ₊ ( π(η_ini) − λ₋ φ(η_ini) ) e^{−λ₊ η_ini} + λ₋ ( −π(η_ini) + λ₊ φ(η_ini) ) e^{−λ₋ η_ini} ],   (B5)

where λ± ≡ (1/2)( −h ± √(h² + 4m²) ) are the eigenvalues associated with the path-ordered matrix. Using this expression, we find that at the initial epoch of the training the function h(η) is updated by the addition of functions of the forms exp[(λ₊ − λ₋)η] and exp[−(λ₊ − λ₋)η]. This means that the update is effective in two regions: near the black hole horizon η ≈ 0 and near the AdS boundary η ≈ ∞.


Normally in deep learning the update is most effective near the output layer, because the back propagation can be suppressed by the derivative factors of the activation functions. However, the example above shows that the region near the input layer is also updated. The reason for this difference is that in the example above we assumed λ = 0 in order to evaluate the error function explicitly, which means that the activation function is trivial. In our numerical simulations, where λ ≠ 0, the back propagation is expected to be suppressed near the input layer.
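As a cross-check of Eq. (B5), a small numerical sketch (ours, with arbitrary test values) compares the closed form against the matrix exponential of (B1) for constant h:

    import numpy as np
    from scipy.linalg import expm

    h, m2, eta_ini = 3.0, -1.0, 1.0          # test values with h^2 + 4 m^2 > 0
    pi_ini, phi_ini = 0.3, 1.2

    lam_p = 0.5 * (-h + np.sqrt(h**2 + 4 * m2))
    lam_m = 0.5 * (-h - np.sqrt(h**2 + 4 * m2))
    pi0_closed = (lam_p * (pi_ini - lam_m * phi_ini) * np.exp(-lam_p * eta_ini)
                  + lam_m * (-pi_ini + lam_p * phi_ini) * np.exp(-lam_m * eta_ini)
                  ) / (lam_p - lam_m)

    M = np.array([[h, -m2], [-1.0, 0.0]])    # the matrix in the exponent of (B1)
    pi0_matrix = (expm(M * eta_ini) @ np.array([pi_ini, phi_ini]))[0]
    print(pi0_closed, pi0_matrix)            # the two should agree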

APPENDIX C: BLACK HOLE METRIC AND COORDINATE SYSTEMS

Here we summarize the properties of the bulk metric and of the coordinate frame which we prefer to use in the main text. The four-dimensional AdS Schwarzschild black hole metric is given by

ds² = −f(r) dt² + dr²/f(r) + (r²/L²) Σ_{i=1}^{2} dx_i²,   f(r) ≡ (r²/L²)( 1 − r₀³/r³ ),   (C1)

where L is the AdS radius and r = r₀ is the location of the black hole horizon; r = ∞ corresponds to the AdS boundary. To put this in the form of Eq. (4), we make the coordinate transformation

where L is the AdS radius, and r ¼ r0 is the location of theblack hole horizon. r ¼ ∞ corresponds to the AdS boun-dary. To put this in the form of Eq. (4), we make thecoordinate transformation

r ¼ r0

�cosh

2L

�2=3

: ðC2Þ

With this coordinate η, the metric is given by

ds² = −f(η) dt² + dη² + g(η) Σ_{i=1}^{2} dx_i²,
f(η) ≡ (r₀²/L²) [ cosh(3η/2L) ]^{−2/3} [ sinh(3η/2L) ]²,
g(η) ≡ (r₀²/L²) [ cosh(3η/2L) ]^{4/3}.   (C3)

The AdS boundary is located at η = ∞, while the black hole horizon resides at η = 0. The function h(η) appearing in the scalar field equation (5) is

h(η) ≡ ∂_η log √(f(η) g(η)^{d−1}) = (3/L) coth(3η/L).   (C4)

The r₀ dependence, and hence the temperature dependence, disappears because our scalar field equation (5) assumes time independence and x_i independence. This h(η) basically measures the invariant volume of the spacetime, and it is important in the sense that a certain tensor component of the vacuum Einstein equation coming from

S_E = ∫ d⁴x √(−det g) [ R + 6/L² ]   (C5)

results in the closed form

−9/L² + ∂_η h(η) + h(η)² = 0.   (C6)

It can be shown that the ansatz (C1) leads to a unique metric solution of the vacuum Einstein equations, and the solution is given by Eq. (C4) up to a constant shift of η. Generically, whatever the temperature and whatever the matter energy-momentum tensor, the metric function h(η) behaves as h(η) ≈ 1/η near the horizon η ≈ 0, and goes to a constant (set by the AdS radius, h → 3/L) at the AdS boundary η ≈ ∞.

One may try to impose some physical condition on h(η). In fact, the left-hand side of Eq. (C6) is, in general, a linear combination of energy-momentum tensor components, and generally we expect the energy-momentum tensor to be subject to various energy conditions, which may constrain the η evolution of h(η). Unfortunately, a suitable energy condition constraining h(η) turns out not to be available, to our knowledge. So nonmonotonic functions of η are allowed as learned metrics.
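A one-line symbolic check (ours, using SymPy) that Eq. (C4) indeed solves the closed-form equation (C6):

    import sympy as sp

    eta, L = sp.symbols('eta L', positive=True)
    h = 3 / L * sp.coth(3 * eta / L)                        # Eq. (C4)
    print(sp.simplify(-9 / L**2 + sp.diff(h, eta) + h**2))  # -> 0, i.e. Eq. (C6)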

APPENDIX D: DETAILS ABOUT OUR CODING FOR THE MACHINE-LEARNING PROCEDURE

1. Comments on the regularization

Before getting into the detailed presentation of the coding, let us make some comments on the effect of the regularization E_reg and on the statistical analysis of the learning trials.

First, we discuss the meaning of E_reg in Eq. (2). In the first numerical experiment, the reproduction of the AdS Schwarzschild black hole metric, we took

E_reg^(1) ≡ 3 × 10⁻³ Σ_{n=1}^{N−1} (η^(n))⁴ ( h(η^(n+1)) − h(η^(n)) )² ∝ ∫ dη (h′(η) η²)².   (D1)

This regularization term works as a selection of smooth metrics. We are interested in metrics for which we can take a continuum limit, so a smooth h(η) is better for our physical interpretation. Without E_reg, the learned metrics are far from the AdS Schwarzschild metric; see Fig. 8 for an example of a metric learned without E_reg. Note that the example in Fig. 8 achieves an accuracy of the same order as that of the metric learned with E_reg. So, in effect, this regularization term does not spoil the learning process, but picks out the smooth metrics among the learned metrics achieving the same accuracy.

FIG. 8. A learned metric with a high accuracy, obtained without the regularization E_reg. The setup is the same as the one used for the reproduction of the AdS Schwarzschild metric.

Second, we discuss how generic the learned metric shown in Fig. 4 is, for the case of the first numerical experiment. We have collected the results of 50 trials of the machine-learning procedure, and the statistical analysis is presented in Fig. 4(c). It shows that the metric in the asymptotic region is learned quite nicely, and we can conclude that the asymptotic AdS spacetime has been learned properly. On the other hand, near the black hole horizon the learned metric reproduces the qualitative behavior around the horizon, but quantitatively it deviates from the true metric. This could be due to the discretization of the spacetime.

Third, let us discuss the regularization for the second numerical experiment, the emergence of the metric from the condensed-matter material data. The regularization used is

E_reg = E_reg^(1) + E_reg^(2) = 3 × 10⁻³ Σ_{n=1}^{N−1} (η^(n))⁴ ( h(η^(n+1)) − h(η^(n)) )² + c_reg^(2) ( h(η^(N)) − 1/η^(N) )²,   (D2)

with c_reg^(2) = 10⁻⁴. The second term is introduced to fit the metric h(η) near the horizon to the value 1/η, because the 1/η behavior is expected for any regular horizon. In Fig. 9 we present statistical analyses of the obtained metrics for two other distinct choices of the regularization parameter, c_reg^(2) = 0 and c_reg^(2) = 0.1. For c_reg^(2) = 0 there is no regularization E_reg^(2), so the metric goes down to a negative value at the horizon. For c_reg^(2) = 0.1, which is a strong regularization, the metric is almost completely fixed to the value 1/η at η = η^(N). In all cases the learned metrics achieve a loss ≈ 0.02, so the system is successfully learned; the only difference is how we pick out "physically sensible" metrics among the many learned metrics. In Fig. 6 we chose c_reg^(2) = 10⁻⁴, which lies between the values used in Fig. 9, because then the deviation of the metric near the horizon is of the same order as that near the asymptotic region.

FIG. 9. Statistical results of the 13 obtained metrics. Left: c_reg^(2) = 0. Right: c_reg^(2) = 0.1.

2. Numerical experiment 1: Reconstructing an AdS Schwarzschild black hole

We have performed two independent numerical experiments: the first consisted of the reconstruction of the AdS Schwarzschild black hole metric, the second of the emergence of a metric from the experimental data of a condensed-matter material. Here we explain the details of the coding and of the setup for each numerical experiment.

In the first numerical experiment, we fix the mass m² of the scalar field and the coupling constant in the potential V(φ) = (λ/4)φ⁴ to

m² = −1,   λ = 1,   (D3)

and prepare data {(x^(1), ȳ)} to train the neural network. The training data is just a list of initial pairs x^(1) = (φ, π) and corresponding answer signals ȳ. We regard x^(1) = (φ, π) as field values at the AdS boundary, and define the answer signal so that it represents whether they are permissible or not when they propagate toward the black hole horizon. More explicitly, we iterate the following steps:

(1) Randomly choose φ ∈ [0, 1.5], π ∈ [−0.2, 0.2] and regard them as the input x^(1) = (φ, π)^T.
(2) Propagate it using the equation of motion (6) with the AdS Schwarzschild metric (11), from (φ(η_ini), π(η_ini)) = (φ, π) to (φ(η_fin), π(η_fin)).
(3) Calculate the consistency F, i.e., the right-hand side of Eq. (10), and define the answer signal: ȳ = 0 if F < 0.1 and ȳ = 1 if F > 0.1.

To train the network appropriately, it is better to prepare data containing roughly equal numbers of ȳ = 0 and ȳ = 1 samples. We use a naive strategy here: if the result of step (3) is ȳ = 0, we add the sample (x^(1), ȳ) to the positive-data category; if not, we add it to the negative-data category. Once the number of samples in one category saturates at 10³, we focus on collecting samples of the other category. After collecting both sets, we concatenate the positive data and the negative data and regard the result as the total training data (a sketch of this generation loop is given below):

Training data D = (10³ positive data) ⊕ (10³ negative data),

where positive data = {(x^(1), ȳ = 0)} and negative data = {(x^(1), ȳ = 1)}.
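A compact sketch of this generation loop (ours), reusing the hypothetical ScalarBulkNet from Sec. II with its metric parameters fixed to the discretized Eq. (11), e.g. net.h.data = 3 / torch.tanh(3 * eta_grid):

    import torch

    # Steps (1)-(3) above: sample boundary values, propagate with Eq. (6),
    # and label by the horizon consistency F of Eq. (10).
    def make_dataset(net, n_per_class=1000, eps=0.1, eta_fin=0.1):
        pos, neg = [], []
        while len(pos) < n_per_class or len(neg) < n_per_class:
            phi0 = torch.rand(()) * 1.5            # phi in [0, 1.5]
            pi0 = torch.rand(()) * 0.4 - 0.2       # pi in [-0.2, 0.2]
            with torch.no_grad():
                phi, pi = net(phi0, pi0)           # step (2)
            F = (2 / eta_fin) * pi - net.m2 * phi - net.lam * phi**3
            bucket, y = (pos, 0.0) if F.abs() < eps else (neg, 1.0)
            if len(bucket) < n_per_class:
                bucket.append((phi0.item(), pi0.item(), y))
        return pos + neg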

In addition, we prepare the neural network (1) with the restricted weights (7). The only trainable parameters are h(η^(n)), and the purpose of this experiment is to see whether the trained h(η^(n)) agree with the AdS Schwarzschild metric (11) implicitly encoded in the training data. To compare ȳ and the neural-network output y, we construct the following final layer. First, we calculate F ≡ π(η_fin) [which is the right-hand side of Eq. (10) in the limit η_fin → 0], and second, we define y ≡ t(F), where

t(F) = [ tanh(100(F − 0.1)) − tanh(100(F + 0.1)) + 2 ] / 2.   (D4)
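In code, the final-layer function (D4) is a one-liner (sketch, name ours):

    import torch

    # Sharp tanh pair approximating a binary window:
    # t(F) ~ 0 for |F| < 0.1 and t(F) ~ 1 otherwise.
    def t_of_F(F):
        return (torch.tanh(100 * (F - 0.1)) - torch.tanh(100 * (F + 0.1)) + 2) / 2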

We plot the shape of t(F) in Fig. 10. Before running the training iteration, we must choose initial values for h(η^(n)). We draw the initial h(η^(n)) ∼ N(1/η^(n), 1) (a Gaussian distribution), because any black hole horizon is characterized by the 1/η^(n) behavior at η^(n) ≈ 0 [81]. After setting the initial values of the trained parameters, we repeat the following training iteration:

(1) Randomly divide the training data into a direct sum: D = (mini data 1) ⊕ (mini data 2) ⊕ ⋯ ⊕ (mini data 200).
(2) Calculate the loss (2) and update h(η^(n)) using the Adam optimizer [82] for each mini data set.

When the target loss function (2) becomes less than 0.0002, we stop iterating steps (1) and (2).

3. Numerical experiment 2: Emergent metric from experimental data

As a next step, we perform the second numerical experiment. In this case, we use experimental data [72] composed of pairs of magnetic field strengths H and corresponding magnetic responses M of Sm0.6Sr0.4MnO3 at the temperature 155 K. To pad the data, we plot the experimental (H, M) pairs as a two-dimensional scatter plot, fit them by a polynomial in H up to 15th order (see Fig. 11), and call the fit f(H). Using this f(H), we prepare the training data {(X^(1), ȳ)} as follows (a sketch is given below):

(1) Randomly choose H ∈ [0, 6], M ∈ [0, 2] and regard them as the input X^(1) = (H, M)^T.
(2) Define the answer signal: ȳ = 0 if M ∈ [f(H) − noise, f(H) + noise] and ȳ = 1 otherwise, where the noise ∼ N(0, 0.1).
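A sketch of this padding with NumPy (ours; H_exp and M_exp stand for the digitized experimental arrays, which are not reproduced here):

    import numpy as np

    # 15th-order polynomial fit f(H) to the experimental (H, M) pairs,
    # then label random points by proximity to the fitted curve, step (2).
    f = np.poly1d(np.polyfit(H_exp, M_exp, 15))

    def sample(n):
        H = np.random.uniform(0, 6, n)
        M = np.random.uniform(0, 2, n)
        noise = np.abs(np.random.normal(0, 0.1, n))
        y = np.where(np.abs(M - f(H)) <= noise, 0.0, 1.0)
        return H, M, y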

FIG. 10. The final-layer function t(F) in Eq. (D4).

FIG. 11. Experimental data of magnetization (M) versus magnetic field (H) and its polynomial fit.


We prepare 10⁴ positive data and 10⁴ negative data, in the same manner as in the first numerical experiment; see Fig. 5 for the padded data. In the neural network, we insert the additional layer (12) as the zeroth layer. In addition to the values of h(η^(n)), we update α and β in Eq. (12) and m² and λ in Eqs. (6) and (7), with V(φ) = (λ/4)φ⁴. Note that m² appears in the definitions of Δ±, so Eq. (12) depends on m² implicitly. The training is performed in the same manner as in the first numerical experiment. We use a ten-layer neural network in our numerical experiments. When the target loss function (2) becomes smaller than 0.02, we stop the learning. The initial conditions for the network are taken as h(η^(n)) ∼ N(2, 1), m² ∼ N(2, 1), λ ∼ N(1, 1), and α, β drawn uniformly from [−1, 1].

APPENDIX E: COMMENTS ON THE CONFORMAL DIMENSIONS

Here we review the critical exponents for a magnetic system, described by a scalar field near the critical point. In D-dimensional space (D = d − 1), the correlation function of the scalar field at the critical temperature behaves as

G(x) ∼ |x|^{−(D−2+η)},   (E1)

where η is the anomalous dimension. Thus, the scaling dimension of the scalar is given by

Δ = (D − 2 + η)/2.   (E2)

The critical exponent δ is defined by

M ∼ H^{1/δ}   (E3)

at the critical temperature, i.e., δ characterizes how the magnetization M depends on the magnetic field H near H = 0. It is known (see, e.g., Ref. [83]) that the scaling hypothesis relates the critical exponents δ and η as

δ = (D + 2 − η)/(D − 2 + η).   (E4)

The critical exponent δ should be positive because the magnetization M should vanish when the magnetic field H is turned off. Thus, the scaling law (E4) implies that the anomalous dimension satisfies η < D + 2, and therefore, by (E2), the scaling dimension is bounded as Δ < D. In particular, setting D = 3, we should have Δ < 3.

However, in our numerical experiment using the magnetic response data of the material Sm0.6Sr0.4MnO3 at 155 K, the obtained data give the conformal dimension Δ₊ = 4.89 ± 0.32. This estimated value violates the bound Δ₊ < 3, so we have to be careful in interpreting it.

netic response data of the material Sm0.6Sr0.4MnO3 at155 K, from the obtained data we can calculate theconformal dimension, Δþ ¼ 4.89� 0.32. The estimatedvalue of the conformal dimension is larger than the boundΔþ < 3, and we have to be careful in the interpretation ofthe value here.Let us discuss several possible reasons for the violation

of the bound. In fact, we use a scalar model which does notproperly reflect the spin structure of the operator. For aholographic treatment of the magnetization, several meth-ods have been proposed; see Refs. [84–88]. Depending onthe model, the identification of the conformal dimensioncould be different.Another reason is that when we compute Δþ numeri-

cally, we set ηini ¼ 1 to reduce the computational cost. If wechose ηini to take a much larger value ηini=L ≫ 1, the extentof the violation would have been milder.We also speculate that the temperature 155 K we chose

for the analyses may not be close enough to the criticaltemperature [89]. In addition, because the order of thephase transition is not evident in the experimental data, thescaling law discussed above may not be applied. Of course,even if the temperature is near the critical temperature, thereis no persuasive reason that the material Sm0.6Sr0.4MnO3

can be described holographically by a classical bulk scalarfield. The simulation is just a demonstration of how our DLis used for the given experimental data, and we do not takethe violation of the bound as a serious problem in thispaper. It is more interesting to find a material such that thescaling dimension computed from our DL agrees with thecritical exponents estimated from the experimental data.The agreement suggests that such a material has a holo-graphic dual [90].

[1] J. M. Maldacena, The large-N limit of superconformal field theories and supergravity, Int. J. Theor. Phys. 38, 1113 (1999); Adv. Theor. Math. Phys. 2, 231 (1998).
[2] S. S. Gubser, I. R. Klebanov, and A. M. Polyakov, Gauge theory correlators from noncritical string theory, Phys. Lett. B 428, 105 (1998).
[3] E. Witten, Anti–de Sitter space and holography, Adv. Theor. Math. Phys. 2, 253 (1998).
[4] G. E. Hinton and R. R. Salakhutdinov, Reducing the dimensionality of data with neural networks, Science 313, 504 (2006).
[5] Y. Bengio and Y. LeCun, Scaling learning algorithms towards AI, in Large-Scale Kernel Machines, edited by L. Bottou, O. Chapelle, D. DeCoste, and J. Weston (MIT, Cambridge, MA, 2007).

[6] Y. LeCun, Y. Bengio, and G. Hinton, Deep learning, Nature (London) 521, 436 (2015).
[7] While conventional solvers of inverse problems work only in linear theories and perturbations, the deep learning method works in the full nonlinear regime with physical observables at the boundary. It should be mentioned that the boundary entanglement entropy can in principle reconstruct the bulk metric (see, e.g., Refs. [8,9]), though the entanglement entropy is not a physical observable. The remarkable reproduction of Einstein's equation itself from entanglement entropy [10] is, at present, a perturbative analysis.
[8] V. Balasubramanian, B. D. Chowdhury, B. Czech, J. de Boer, and M. P. Heller, Bulk curves from boundary data in holography, Phys. Rev. D 89, 086004 (2014).
[9] R. C. Myers, J. Rao, and S. Sugishita, Holographic holes in higher dimensions, J. High Energy Phys. 06 (2014) 044.
[10] T. Faulkner, F. M. Haehl, E. Hijano, O. Parrikar, C. Rabideau, and M. Van Raamsdonk, Nonlinear gravity from entanglement in conformal field theories, J. High Energy Phys. 08 (2017) 057.
[11] We assume that the system can be described holographically by a classical scalar field in asymptotically AdS space.
[12] Y. Z. You, Z. Yang, and X. L. Qi, Machine learning spatial geometry from entanglement features, Phys. Rev. B 97, 045153 (2018).
[13] See Refs. [14,15] for related essays. A continuum limit of the deep layers was studied in a different context [16].
[14] W. C. Gan and F. W. Shu, Holography as deep learning, Int. J. Mod. Phys. D 26, 1743020 (2017).
[15] J. W. Lee, Quantum fields as deep learning, arXiv:1708.07408.
[16] H. Abarbanel, P. Rozdeba, and S. Shirman, Machine learning, deepest learning: Statistical data assimilation problems, arXiv:1707.01415.
[17] B. Swingle, Entanglement renormalization and holography, Phys. Rev. D 86, 065007 (2012).
[18] An application of DL or machine learning to quantum many-body problems is a rapidly developing subject. See Ref. [19] for one of the initial papers, together with the recent papers [20–55]. For machine learning applied to the string landscape and compactification, see Refs. [56–63].

[19] G. Carleo and M. Troyer, Solving the quantum many-body problem with artificial neural networks, Science 355, 602 (2017).
[20] L. Wang, Discovering phase transitions with unsupervised learning, Phys. Rev. B 94, 195105 (2016).
[21] G. Torlai and R. G. Melko, Learning thermodynamics with Boltzmann machines, Phys. Rev. B 94, 165134 (2016).
[22] K. Ch'ng, J. Carrasquilla, R. G. Melko, and E. Khatami, Machine learning phases of strongly correlated fermions, Phys. Rev. X 7, 031038 (2017).
[23] D.-L. Deng, X. Li, and S. D. Sarma, Machine learning topological states, Phys. Rev. B 96, 195145 (2017).
[24] A. Tanaka and A. Tomiya, Detection of phase transition via convolutional neural network, J. Phys. Soc. Jpn. 86, 063001 (2017).
[25] T. Ohtsuki and T. Ohtsuki, Deep learning the quantum phase transitions in random two-dimensional electron systems, J. Phys. Soc. Jpn. 85, 123706 (2016).
[26] G. Torlai and R. G. Melko, A Neural Decoder for Topological Codes, Phys. Rev. Lett. 119, 030501 (2017).
[27] Y. Zhang and E.-A. Kim, Quantum Loop Topography for Machine Learning, Phys. Rev. Lett. 118, 216401 (2017).
[28] L.-G. Pang, K. Zhou, N. Su, H. Petersen, H. Stöcker, and X.-N. Wang, An equation-of-state-meter of QCD transition from deep learning, Nat. Commun. 9, 210 (2018).
[29] T. Ohtsuki and T. Ohtsuki, Deep learning the quantum phase transitions in random electron systems: Applications to three dimensions, J. Phys. Soc. Jpn. 86, 044708 (2017).
[30] J. Chen, S. Cheng, H. Xie, L. Wang, and T. Xiang, Equivalence of restricted Boltzmann machines and tensor network states, Phys. Rev. B 97, 085104 (2018).
[31] D.-L. Deng, X. Li, and S. D. Sarma, Quantum Entanglement in Neural Network States, Phys. Rev. X 7, 021021 (2017).
[32] X. Gao and L.-M. Duan, Efficient representation of quantum many-body states with deep neural networks, Nat. Commun. 8, 662 (2017).
[33] Y. Huang and J. E. Moore, Neural network representation of tensor network and chiral states, arXiv:1701.06246.
[34] K. Mills, M. Spanner, and I. Tamblyn, Deep learning and the Schrödinger equation, Phys. Rev. A 96, 042113 (2017).
[35] S. J. Wetzel, Unsupervised learning of phase transitions: From principal component analysis to variational autoencoders, Phys. Rev. E 96, 022140 (2017).
[36] W. Hu, R. R. P. Singh, and R. T. Scalettar, Discovering phases, phase transitions and crossovers through unsupervised machine learning: A critical examination, Phys. Rev. E 95, 062122 (2017).
[37] F. Schindler, N. Regnault, and T. Neupert, Probing many-body localization with neural networks, Phys. Rev. B 95, 245134 (2017).
[38] P. Ponte and R. G. Melko, Kernel methods for interpretable machine learning of order parameters, Phys. Rev. B 96, 205146 (2017).
[39] M. Koch-Janusz and Z. Ringel, Mutual information, neural networks and the renormalization group, Nat. Phys. 14, 578 (2018).
[40] Y. Zhang, R. G. Melko, and E.-A. Kim, Machine learning Z2 quantum spin liquids with quasi-particle statistics, Phys. Rev. B 96, 245119 (2017).
[41] H. Fujita, Y. O. Nakagawa, S. Sugiura, and M. Oshikawa, Construction of Hamiltonians by supervised learning of energy and entanglement spectra, Phys. Rev. B 97, 075114 (2018).
[42] S. J. Wetzel and M. Scherzer, Machine learning of explicit order parameters: From the Ising model to SU(2) lattice gauge theory, Phys. Rev. B 96, 184410 (2017).
[43] K. Mills and I. Tamblyn, Deep neural networks for direct, featureless learning through observation: The case of two-dimensional spin models, Phys. Rev. E 97, 032119 (2018).
[44] H. Saito, Solving the Bose-Hubbard model with machine learning, J. Phys. Soc. Jpn. 86, 093001 (2017).
[45] K. Ch'ng, N. Vazquez, and E. Khatami, Unsupervised machine learning account of magnetic transitions in the Hubbard model, Phys. Rev. E 97, 013306 (2018).


[46] N. C. Costa, W. Hu, Z. J. Bai, R. T. Scalettar, and R. R. P. Singh, Learning fermionic critical points, Phys. Rev. B 96, 195138 (2017).
[47] T. Mano and T. Ohtsuki, Phase diagrams of three-dimensional Anderson and quantum percolation models using deep three-dimensional convolutional neural network, J. Phys. Soc. Jpn. 86, 113704 (2017).
[48] H. Saito and M. Kato, Machine learning technique to find quantum many-body ground states of bosons on a lattice, J. Phys. Soc. Jpn. 87, 014001 (2018).
[49] I. Glasser, N. Pancotti, M. August, I. D. Rodriguez, and J. I. Cirac, Neural-Network Quantum States, String-Bond States, and Chiral Topological States, Phys. Rev. X 8, 011006 (2018).
[50] R. Kaubruegger, L. Pastori, and J. C. Budich, Chiral topological phases from artificial neural networks, Phys. Rev. B 97, 195136 (2018).
[51] Z. Liu, S. P. Rodrigues, and W. Cai, Simulating the Ising model with a deep convolutional generative adversarial network, arXiv:1710.04987.
[52] J. Venderley, V. Khemani, and E.-A. Kim, Machine Learning Out-of-Equilibrium Phases of Matter, Phys. Rev. Lett. 120, 257204 (2018).
[53] Z. Li, M. Luo, and X. Wan, Extracting critical exponent by finite-size scaling with convolutional neural networks, arXiv:1711.04252.
[54] E. van Nieuwenburg, E. Bairey, and G. Refael, Learning phase transitions from dynamics, arXiv:1712.00450.
[55] X. Liang, S. Liu, Y. Li, and Y.-S. Zhang, Generation of Bose-Einstein condensates' ground state through machine learning, arXiv:1712.10093.
[56] Y. H. He, Deep-learning the landscape, arXiv:1706.02714.
[57] D. Krefl and R. K. Seong, Machine learning of Calabi-Yau volumes, Phys. Rev. D 96, 066014 (2017).
[58] Y. H. He, Machine-learning the string landscape, Phys. Lett. B 774, 564 (2017).
[59] J. Liu, Artificial neural network in cosmic landscape, J. High Energy Phys. 12 (2017) 149.
[60] J. Carifio, J. Halverson, D. Krioukov, and B. D. Nelson, Machine learning in the string landscape, J. High Energy Phys. 09 (2017) 157.
[61] F. Ruehle, Evolving neural networks with genetic algorithms to study the string landscape, J. High Energy Phys. 08 (2017) 038.
[62] A. E. Faraggi, J. Rizos, and H. Sonmez, Classification of standard-like heterotic-string vacua, Nucl. Phys. B927, 1 (2018).
[63] J. Carifio, W. J. Cunningham, J. Halverson, D. Krioukov, C. Long, and B. D. Nelson, Vacuum selection from cosmology on networks of string geometries, arXiv:1711.06685.

[64] In Bayesian neural networks, regularizations are introduced as a prior.
[65] Note that φ(x₂) in Eq. (8) includes x₁, so it is not local, as opposed to the standard neural network (1) with local activation functions. See the Appendixes for an improved expression with local activation functions.
[66] I. R. Klebanov and E. Witten, AdS/CFT correspondence and symmetry breaking, Nucl. Phys. B556, 89 (1999).
[67] G. T. Horowitz, Introduction to holographic superconductors, Lect. Notes Phys. 828, 313 (2011).
[68] The explicit expression for the loss function is available for λ = 0; see the Appendixes.
[69] See the Appendixes for details about the coordinate system.
[70] See the Appendixes for details of the setup and coding, and for the effect of the regularization and statistics.
[71] At the first epoch the loss was 0.2349, while after the 100th epoch the loss was 0.0002. We terminated the learning when the loss no longer decreased.
[72] H. Sakai, Y. Taguchi, and Y. Tokura, Impact of bicritical fluctuation on magnetocaloric phenomena in perovskite manganites, J. Phys. Soc. Jpn. 78, 113708 (2009).
[73] Our experimental data does not have error bars, so we add the noise.
[74] For the numerically estimated conformal dimension and its implications, see the Appendixes.
[75] H. Matsueda, M. Ishihara, and Y. Hashizume, Tensor network and a black hole, Phys. Rev. D 87, 066002 (2013).
[76] A. Mollabashi, M. Nozaki, S. Ryu, and T. Takayanagi, Holographic geometry of cMERA for quantum quenches and finite temperature, J. High Energy Phys. 03 (2014) 098.
[77] J. M. Maldacena, Eternal black holes in anti–de Sitter, J. High Energy Phys. 04 (2003) 021.
[78] T. Hartman and J. Maldacena, Time evolution of entanglement entropy from black hole interiors, J. High Energy Phys. 05 (2013) 014.
[79] Here we regard the time evolution of the Hamiltonian as the propagation in the neural network. For other ways to identify Hamiltonian systems in machine learning, see Ref. [91].
[80] Deep learning is a regression method using many layers of a neural network. The standard regression methods in statistics are limited to a small finite number of fitting parameters, while deep learning deals with, in principle, an infinite number of parameters. In the main text we identify the holographic radial direction with the depth of the layers, while here the time direction plays that role. The discretization provides the neural network, and the "deep limit" corresponds to the continuum limit. We use deep learning, rather than a standard regression method, in order to be able to optimize a network with a large number of layers.
[81] Note that we do not teach the network the value of h(η) at the AdS boundary, i.e., 3 in our case.
[82] D. Kingma and J. Ba, Adam: A method for stochastic optimization, arXiv:1412.6980.
[83] P. Di Francesco, P. Mathieu, and D. Senechal, Conformal Field Theory (Springer, New York, 1997).
[84] N. Iqbal, H. Liu, M. Mezei, and Q. Si, Quantum phase transitions in holographic models of magnetism and superconductors, Phys. Rev. D 82, 045002 (2010).
[85] K. Hashimoto, N. Iizuka, and T. Kimura, Towards holographic spintronics, Phys. Rev. D 91, 086003 (2015).
[86] R. G. Cai and R. Q. Yang, Paramagnetism-ferromagnetism phase transition in a dyonic black hole, Phys. Rev. D 90, 081901 (2014).
[87] R. G. Cai, R. Q. Yang, Y. B. Wu, and C. Y. Zhang, Massive 2-form field and holographic ferromagnetic phase transition, J. High Energy Phys. 11 (2015) 021.


[88] N. Yokoi, M. Ishihara, K. Sato, and E. Saitoh, Holographic realization of ferromagnets, Phys. Rev. D 93, 026002 (2016).
[89] The relation between the black hole temperature and the actual temperature 155 K is still obscure, since they are related through the updated parameters α and β, with some scale. Any holographic model requires the introduction of a certain energy scale to break the conformal invariance, and that scale is used for measuring the physical data H and M. The present paper demonstrates the learning of metric functions, and a more detailed investigation of scales will be given in a future publication. For example, to obtain a prediction for data at a different temperature, one needs a different metric function with that Hawking temperature. To obtain that metric, one needs to solve the bulk gravity equation of motion with a different boundary condition, where the bulk gravity action itself is determined such that the presently learned metric satisfies its equation of motion.
[90] Requiring a precise holographic duality severely restricts the allowed class of boundary quantum field theories [92]. However, even for QCD, which does not have a gravity dual in the precise sense (as it is not at the large-Nc limit), various holographic QCD models have flourished and gained novel insight into QCD, including universal behaviors of holographic QCD models such as the viscosity, in view of comparison with heavy-ion collision experiments. In this sense, even though generic materials may not allow a precise holographic dual, we expect that we will be able to learn new aspects of materials through their possible holographic models.
[91] H. W. Lin, M. Tegmark, and D. Rolnick, Why does deep and cheap learning work so well?, J. Stat. Phys. 168, 1223 (2017).
[92] I. Heemskerk, J. Penedones, J. Polchinski, and J. Sully, Holography from conformal field theory, J. High Energy Phys. 10 (2009) 079.
