NNPDF STUDIES ON PDF UNCERTAINTIES STEFANO FORTE MILAN UNIVERSITY & INFN FOR THE COLLABORATION: R. D. BALL, L. DEL DEBBIO, S.F., A. GUFFANTI, J. I. LATORRE, J. ROJO, M. UBIALI PDF4LHC WORKSHOP DESY, OCTOBER 23, 2009
  • NNPDF

    STUDIES ON PDF UNCERTAINTIES

    STEFANO FORTE

    MILAN UNIVERSITY & INFN

    FOR THE COLLABORATION: R. D. BALL, L. DEL DEBBIO,

    S.F., A. GUFFANTI, J. I. LATORRE, J. ROJO, M. UBIALI

    PDF4LHC WORKSHOP DESY, OCTOBER 23, 2009

  • SOME QUESTIONS:

    - ARE EXPERIMENTAL UNCERTAINTIES SIZABLY UNDERESTIMATED?
      ARE THERE SIGNIFICANT DATA INCOMPATIBILITIES?

    - WHERE DOES THE UNCERTAINTY ON PDFS COME FROM?
      IS IT RELATED TO PARTON PARAMETRIZATION?

    - DOES THE TREATMENT OF CORRELATED UNCERTAINTIES HAVE AN IMPACT?

    WILL BE ADDRESSED USING THE NNPDF METHODOLOGY;
    ALL STUDIES BASED ON PUBLISHED NNPDF1.2 FIT

  • RELEVANT NNPDF FEATURES

    A REMINDER

    MONTE CARLO

    - PDFS ARE FITTED TO DATA REPLICAS

    - REPLICAS FLUCTUATE ABOUT CENTRAL DATA:

      $$F^{(\mathrm{art})(k)}_{i,p} = S^{(k)}_{p,N}\, F^{\mathrm{exp}}_{i,p} \left( 1 + r^{(k)}_{p}\,\sigma^{\mathrm{stat}}_{p} + \sum_{j=1}^{N_{\mathrm{sys}}} r^{(k)}_{p,j}\,\sigma^{\mathrm{sys}}_{p,j} \right)$$

    - SIZE OF FLUCTUATION ↔ DATA UNCERTAINTY:
      SAME AS FLUCTUATION OF CENTRAL DATA ABOUT “TRUE” VALUE

    [Figure: REPLICA STANDARD DEV. VS. UNCERTAINTIES. Plot titled "NNPDF1.2 - Errors"; axes: Monte Carlo replicas vs. experimental data (logarithmic scales); series: Nrep=10, Nrep=100, Nrep=1000]
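    The replica formula above can be sketched in a few lines of Python. This is a minimal illustration, not the NNPDF code: the function name, the argument names, and the choice of a single overall normalization uncertainty for S are assumptions, and each systematic j is assumed fully correlated across all points (one random number per systematic per replica).

```python
import numpy as np

def make_replicas(f_exp, sigma_stat, sigma_sys, sigma_norm, n_rep, seed=0):
    """Generate Monte Carlo replicas of a data set (illustrative sketch).

    f_exp      : (n_dat,) central experimental values
    sigma_stat : (n_dat,) relative statistical uncertainties
    sigma_sys  : (n_dat, n_sys) relative correlated systematic uncertainties
    sigma_norm : relative normalization uncertainty (assumed: one, overall)
    Returns an array of shape (n_rep, n_dat).
    """
    rng = np.random.default_rng(seed)
    n_dat, n_sys = sigma_sys.shape
    r_stat = rng.standard_normal((n_rep, n_dat))  # independent for each point
    r_sys = rng.standard_normal((n_rep, n_sys))   # shared by all points (assumption)
    r_norm = rng.standard_normal((n_rep, 1))      # one draw per replica
    norm = 1.0 + r_norm * sigma_norm              # plays the role of S^(k)
    shift = 1.0 + r_stat * sigma_stat + r_sys @ sigma_sys.T
    return norm * f_exp * shift

if __name__ == "__main__":
    # Hypothetical toy data: three points, one correlated systematic.
    f_exp = np.array([1.0, 2.0, 3.0])
    sigma_stat = np.full(3, 0.05)
    sigma_sys = np.full((3, 1), 0.03)
    for n_rep in (10, 100, 1000):
        reps = make_replicas(f_exp, sigma_stat, sigma_sys, 0.02, n_rep)
        print(n_rep, reps.std(axis=0) / f_exp)
```

    With more replicas, the replica standard deviation should approach the quoted data uncertainty, which is the comparison the figure makes for Nrep = 10, 100, 1000.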

  • RELEVANT NNPDF FEATURES II

    CROSS-VALIDATION

    - REPLICAS ARE FITTED TO A DATA SUBSET

    - A DIFFERENT SUBSET OF DATA IS USED FOR EACH REPLICA

    - THE BEST FIT IS NOT AT THE MINIMUM OF THE χ²

    [Figures: χ² fit to data, panels "OPTIMAL FITTING" and "OVERFITTING"]
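    A rough sketch of the two ingredients of cross-validation named above: a random training/validation split drawn separately for each replica, and a stopping point chosen from the validation χ² rather than the training χ². The function names, the 50% training fraction, and the "lowest validation χ²" rule are illustrative assumptions; the actual NNPDF stopping criterion is not specified on these slides.

```python
import numpy as np

def random_partition(n_dat, train_fraction, rng):
    """Return boolean masks (training, validation) for one replica."""
    n_train = int(round(train_fraction * n_dat))
    idx = rng.permutation(n_dat)
    train = np.zeros(n_dat, dtype=bool)
    train[idx[:n_train]] = True
    return train, ~train

def stopping_generation(chi2_val):
    """Index of the generation with the lowest validation chi2.

    Simplified stand-in for a stopping rule: the training chi2 usually
    keeps decreasing, so the fit is stopped where the validation chi2
    is smallest, before the fit starts learning statistical noise.
    """
    return int(np.argmin(chi2_val))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Illustrative data set size; each replica draws its own partition.
    for k in range(3):
        train, val = random_partition(3300, 0.5, rng)
        print(k, train.sum(), val.sum())
    # Hypothetical validation curve that turns up after generation 3.
    print(stopping_generation([3.0, 2.0, 1.6, 1.5, 1.55, 1.7]))
```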

  • IDEAS

    Thanks to J. Pumplin

    - FIT TO REPLICAS VS. FIT TO DATA PARTITIONS
      ⇔ FLUCTUATION OF DATA (TRUE) VS. FLUCTUATION OF REPLICAS (NOMINAL)

    - FIT TO PARTITIONS VS. FIT TO A SINGLE PARTITION
      ⇔ UNCERTAINTY DUE TO DATA VS. UNCERTAINTY DUE TO OTHER SOURCES

    - OPTIMAL FIT VS. OVERLEARNING FIT
      ⇔ UNDERLYING LAW VS. STATISTICAL NOISE

  • WHERE IS THE UNCERTAINTY COMING FROM?
    FIT TO REPLICAS VS. RANDOM SUBSET OF CENTRAL VALUES

                     REPLICAS       CENTRAL VALUES
    χ²               1.32           1.32
    ⟨χ²⟩_rep         2.79 ± 0.24    1.65 ± 0.20
    ⟨σ^dat⟩          0.039          0.035

    [Figures: GLUE, LIGHT QUARKS, STRANGE; curves: replicas, central values]

    - QUALITY OF FIT & PDFS UNCHANGED

    - REDUCTION OF ⟨χ²⟩_rep BY FACTOR ∼2 ⇒ FLUCTUATIONS ABOUT TRUE VALUE HALVED

    - UNCERTAINTY ON DATA ONLY REDUCED BY A FACTOR 1.1 ⇒ EXPT. UNCERTAINTIES UNDERESTIMATED
      OR UNDERLYING INCOMPRESSIBLE UNCERTAINTY

  • WHERE IS THE UNCERTAINTY COMING FROM?
    CENTRAL VALUES: VARYING PARTITION VS. FIXED PARTITION

                     REPLICAS       CENTRAL VALUE   FIXED PARTITION
    χ²               1.32           1.32            ∼1.3
    ⟨χ²⟩_rep         2.79 ± 0.24    1.65 ± 0.20     ∼1.6 ± 0.2
    ⟨σ^dat⟩          0.039          0.035           ∼0.03

    Fixed partition results obtained averaging over 5 different choices of partition (100 replicas each); more partitions needed for accurate results

    - QUALITY OF FIT UNCHANGED

    - ⟨χ²⟩_rep UNCHANGED ⇒ CENTRAL FIT UNCHANGED

    - UNCERTAINTY ON PREDICTION (I.E. ON PDFS) REDUCED

    FUNCTIONAL UNCERTAINTY

    - MORE THAN HALF OF UNCERTAINTY DUE TO “FUNCTIONAL FORM”: ⟨σ^dat⟩ ≈ 0.3 SMALLER FOR HERA DATA

    - REMAINING UNCERTAINTY ROUGHLY SCALES WITH DATA UNCERTAINTY: ⟨σ^dat⟩ ≈ 0.005 CENT.; ⟨σ^dat⟩ ≈ 0.009 REP.

    [Figures: GLUE xg(x, Q₀²); VALENCE xV_T(x, Q₀²); TRIPLET xT₃(x, Q₀²); STRANGE xs⁺(x, Q₀²); each plotted vs. x; curves: CTEQ6.6, MRST2001E, NNPDF1.2, Current fit]

  • ARE WE CONSTRAINED BY THE FUNCTIONAL FORM?
    REMOVE STOPPING: OVERLEARNING FIT

    PERFORM A FIT WITH A FIXED, VERY LARGE NUMBER OF GA GENERATIONS: 25000 gens. (AVERAGE 1000 gens. FOR STANDARD FIT)

                     STANDARD STOPPING                                FIXED LONG
                     REPLICAS       CENTRAL VALUE   FIXED PARTITION   REPLICAS       CENTRAL VALUE
    χ²               1.32           1.32            ∼1.3              1.18           1.19
    ⟨χ²⟩_rep         2.79 ± 0.24    1.65 ± 0.20     ∼1.6 ± 0.2        2.43 ± 0.13    1.29 ± 0.06
    ⟨χ²_tr⟩_rep      2.76           1.59            ∼1.6              2.40           1.27
    ⟨χ²_val⟩_rep     2.80           1.61            ∼1.6              2.47           1.30
    ⟨σ^dat⟩          0.039          0.035           ∼0.03             0.032          0.019

    χ² OF THE GLOBAL FIT DECREASES A LOT! IS IT REALLY OVERLEARNING? YES!

    - PERCENTAGE DIFFERENCE BETWEEN VALIDATION AND TRAINING ⟨χ²⟩_rep MORE THAN DOUBLED (FROM 1.5% TO 3%)
      (NOTE 1650 DATA POINTS EACH)

    - SOME PDFS HAVE FUNNY SHAPES

    - REDUCTION OF ⟨σ^dat⟩ BY FACTOR 1.7 > √2 WHEN GOING FROM REPLICAS TO CENTRAL VALUES

    - AMOUNT OF OVERLEARNING SMALL ⇔ ⟨χ²⟩_rep DOUBLES WHEN GOING FROM CENTRAL VALS. TO REPLICAS;
      SHOULD REMAIN UNCHANGED FOR EXTREME OVERLEARNING

    [Figures: GLUON xg(x, Q₀²); TRIPLET xT₃(x, Q₀²); each plotted vs. x; curves: CTEQ6.6, MRST2001E, NNPDF1.2, Current fit]
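    The ratios quoted on this slide can be checked directly against the table entries. The snippet below only redoes that arithmetic; the variable and function names are illustrative.

```python
import math

# Training and validation <chi2>_rep, replica fits, copied from the table.
table = {
    "standard": {"chi2_tr": 2.76, "chi2_val": 2.80},
    "fixed_long": {"chi2_tr": 2.40, "chi2_val": 2.47},
}

def val_train_gap_percent(row):
    """Percentage by which the validation chi2 exceeds the training chi2."""
    return 100.0 * (row["chi2_val"] - row["chi2_tr"]) / row["chi2_tr"]

gap_standard = val_train_gap_percent(table["standard"])  # roughly 1.4
gap_long = val_train_gap_percent(table["fixed_long"])    # roughly 2.9

# Fixed long fits: replicas vs. central values.
sigma_ratio = 0.032 / 0.019  # reduction of <sigma^dat>, roughly 1.7
chi2_ratio = 2.43 / 1.29     # increase of <chi2>_rep, roughly 1.9

if __name__ == "__main__":
    print(f"validation-training gap: {gap_standard:.1f}% -> {gap_long:.1f}%")
    print(f"sigma ratio {sigma_ratio:.2f} vs sqrt(2) = {math.sqrt(2):.2f}")
    print(f"chi2 ratio {chi2_ratio:.2f}")
```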

  • WHERE IS THE UNCERTAINTY COMING FROM?
    WHEN THE BEST FIT IS NOT AT THE MINIMUM

                     STANDARD STOPPING                                FIXED LONG
                     REPLICAS       CENTRAL VALUE   FIXED PARTITION   REPLICAS       CENTRAL VALUE
    χ²               1.32           1.32            1.35              1.18           1.19
    ⟨χ²⟩_rep         2.79 ± 0.24    1.65 ± 0.20     1.60 ± 0.19       2.43 ± 0.13    1.29 ± 0.06
    ⟨σ^dat⟩          0.39           0.35            0.28              0.32           0.19

    - FIT QUALITY:

        - “FUNCTIONAL” UNCERTAINTY SUPPRESSED IN OVERLEARNING FITS:
          ⇒ ⟨σ^dat⟩ ∼ 0.2 ⇒ “DATA” UNCERTAINTY

        - FLUCTUATION OF ⟨χ²⟩_rep FOR OVERLEARNING FIT STATISTICAL:
          $\sigma = \sqrt{2/N_{\mathrm{dat}}} \sim 0.05$

        - FLUCTUATION OF ⟨χ²⟩_rep IN STANDARD FIT MUCH LARGER:
          CONTROLLED BY DISTANCE FROM THE MINIMUM
          IF Δχ² = 1 DUE TO UNDERLYING PARM AT χ²_min, THEN ONE SIGMA VARIATION AROUND χ²_0 > χ²_min EQUALS $\sqrt{\chi^2_0 - \chi^2_{\min}}$

    - DATA INCONSISTENCY: FOR STANDARD FIT, VALUE OF χ² = 1.3 > 1
      ⇒ ERRORS UNDERESTIMATED BY 30%
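    The purely statistical fluctuation σ = √(2/N_dat) of the χ² per data point can be checked with a small simulation. This is an illustrative sketch: the function names are made up, and N_dat = 800 is a hypothetical value chosen only because it gives √(2/N_dat) = 0.05; the slide does not state which N_dat enters its estimate.

```python
import math
import numpy as np

def expected_std(n_dat):
    """Standard deviation of chi2 / n_dat for purely Gaussian noise."""
    return math.sqrt(2.0 / n_dat)

def simulated_std(n_dat, n_trials=20000, seed=0):
    """Estimate the same quantity by sampling chi2 with n_dat degrees of freedom."""
    rng = np.random.default_rng(seed)
    chi2_per_point = rng.chisquare(n_dat, size=n_trials) / n_dat
    return float(chi2_per_point.std())

if __name__ == "__main__":
    n_dat = 800  # hypothetical, see the note above
    print(expected_std(n_dat), simulated_std(n_dat))
```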

  • THE IMPACT OF CORRELATED UNCERTAINTIES
    REPEAT THE FIT NEGLECTING ALL CORRELATIONS (A. Donati)

    - DIAGONAL χ² OF DIAGONAL FIT MUCH LOWER,
      CORREL. χ² OF TWO FITS UNCHANGED

    - DIAGONAL FIT REWEIGHTS EXPERIMENTS
      ⇒ EXPTS WITH LARGER SYST. (FIXED TARGET) GET SMALLER WEIGHT

    - VALENCE & STRANGE PDFS AFFECTED AT THE 1/4 σ LEVEL

    [Figures: SINGLET, STRANGE]
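    To make the distinction between the "correlated" and "diagonal" χ² concrete, here is a minimal sketch. It assumes the usual covariance matrix built from uncorrelated statistical errors plus fully correlated systematics, and it takes "diagonal" to mean keeping only the diagonal of that matrix (statistical and systematic errors added in quadrature). Both definitions and all names are assumptions, not taken from the slides.

```python
import numpy as np

def covariance(sigma_stat, sigma_sys):
    """cov_ij = delta_ij * stat_i^2 + sum_k sys_ik * sys_jk (absolute errors)."""
    return np.diag(sigma_stat**2) + sigma_sys @ sigma_sys.T

def chi2_correlated(data, theory, cov):
    """chi2 using the full covariance matrix."""
    d = data - theory
    return float(d @ np.linalg.solve(cov, d))

def chi2_diagonal(data, theory, cov):
    """chi2 neglecting all correlations (diagonal of cov only)."""
    d = data - theory
    return float(np.sum(d**2 / np.diag(cov)))

if __name__ == "__main__":
    # Hypothetical toy numbers: three points, one shared systematic.
    theory = np.array([1.0, 2.0, 3.0])
    data = np.array([1.1, 1.9, 3.0])
    cov = covariance(np.full(3, 0.1), np.full((3, 1), 0.1))
    print(chi2_correlated(data, theory, cov), chi2_diagonal(data, theory, cov))
```

    In this toy case the residuals have no component along the shared systematic shift, so the diagonal χ² comes out lower than the correlated one: the diagonal treatment inflates every point's error with the systematic, which is also why experiments with larger systematics get a smaller weight.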

  • SUMMARY

    - A LARGE FRACTION OF THE UNCERTAINTY COMES FROM THE FREEDOM TO CHOOSE THE FUNCTIONAL FORM
      FLUCTUATIONS OF FIT QUALITY DOMINATED BY LACK OF KNOWLEDGE OF THE “TRUE” UNDERLYING FUNCTIONAL FORM

    - SOME DATA INCOMPATIBILITY (UNDERESTIMATION OF DATA UNCERTAINTY), BUT SMALL EFFECT
      ABOUT 30% ON AVERAGE, CONCENTRATED ON LIMITED NUMBER OF DATA POINTS

    - INCLUSION OF CORRELATED SYSTEMATICS HAS A SMALL BUT NON-NEGLIGIBLE EFFECT

