Document-level Text Quality: Models for Organization and ...

Post on 09-Dec-2021

Document-level Text Quality: Models for Organization and Reader Interest

Annie Louis
October 16, 2015

Université Catholique de Louvain

Joint work with Ani Nenkova

People spontaneously respond to differences in writing

“Finnegans Wake is long, dense, and linguistically knotty, yet hugely rewarding, if you're willing to learn how to read it...”

http://www.publishersweekly.com

“My Faith: Why I don't sing the 'Star Spangled Banner’”

“What a poorly written article. Strays off topic and hardly even addresses the point of the article.

The only brief mention of why they don’t play the national anthem is that they believe in church and state. This just was one long rant about his religion.”

http://www.cnn.com

http://www.vocabula.com

Text Quality Prediction

This article is well-written. Next one..

Can we teach computers to make similar judgements?

o How to formulate the task?

o Get suitable data with distinctions

o Find correlates in text

Why do we care?

• Information retrieval, article recommendation
– Not all articles are of the same quality
– Can filter by quality in addition to relevance

• Authoring support, educational assessment
– Automatic assessment is cheap, consistent and quick
– Spelling and grammar correction are commercially successful

• Text generation systems
– Systems can understand how to generate coherent text
– Can evaluate system output

This talk

• Defining text quality and creating a corpus of overall article ratings
– Large scale realistic sample of writing differences

• Two models
– A model for organization using syntax patterns
– A model for reader interest

• Document-level quality prediction
– In contrast to spelling and grammar
– Often not a binary, correct/incorrect distinction

>> Defining Text Quality

• Aspects of quality
• Who is the audience?

Aspects of quality

• We adopt a definition from the education field

Six Traits [Spandel 2004]

• Ideas and development: details and their presentation
• Organization (smooth transitions): flow between sentences
• Voice (personal touch), Word choice (vivid, lively), Sentence fluency (rhythm): interesting nature, beautiful writing
• Conventions (mechanics): spelling, grammar, layout

Audience for text quality – An expert

Scale: low competency → high competency

Experienced NLP researcher

Reader of machine-generated text

Adult reader of newspaper

• Increased focus on linguistic properties of the text

Relationship to readability

• Readability has a strong focus on comprehension

Grade level 1, Grade level 2, ..., Grade level 12

• Audience distinctions – child vs. adult, novice vs. expert, cognitive disability or not

>> A Corpus for Document-level Quality

Louis & Nenkova, Discourse and Dialogue, 2013

Science journalism: example snippet

Sarah Lewis is fluent in firefly.

On this night she walks through a farm field in eastern Massachusetts, watching the first fireflies of the evening rise into the air and begin to blink on and off.

Dr. Lewis, an evolutionary ecologist at Tufts University...

Category 1: VERY GOOD articles

• Seed set = 63 New York Times articles that appeared in the Best American Science Writing series

• We choose only the NYT articles
– We use the NYT Corpus to expand our category
– Normalize for differences in writing due to source

Topics in the seed set

Tag                            Appearance
Medicine and Health            22
Space                          14
Physics                        10
Biology and Biochemistry       8
Genetics and Heredity          8
Archaeology and Anthropology   7
Computers and the Internet     4

Expanding the VERY GOOD set

• Assume: ~40 authors of the seed set are excellent writers

• Other articles from the NYT written by the same authors
– which are research related
– during the same 10 year period
– on similar topics
– similar lengths

Category 2: TYPICAL writing in the NYT

• Other science articles around the same time, but not written by the popular authors

The general corpus:

Category    Total Articles
VERY GOOD   3,530
TYPICAL     20,242

A topic-paired corpus

• The general categories mix different topics – geography, biology, astronomy, linguistics…

• But an IR system compares articles on the same topic

• For each VERY GOOD article, get 10 most similar TYPICAL articles (based on the content)

• Enumerate all pairs of (VERY GOOD, TYPICAL)

• 35,300 pairs
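
A minimal sketch of the pairing step, assuming TF-IDF cosine similarity as the content-similarity measure (the slides only say "based on the content", so the measure is an assumption). The function names and the toy articles are hypothetical.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Map each tokenized document to a sparse TF-IDF dictionary."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: c * math.log((1 + n) / (1 + df[t])) for t, c in tf.items()})
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse vectors (0.0 if either is empty)."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm = math.sqrt(sum(w * w for w in u.values())) * math.sqrt(sum(w * w for w in v.values()))
    return dot / norm if norm else 0.0

def topic_pairs(very_good, typical, k=10):
    """Pair each VERY GOOD article with its k most similar TYPICAL articles."""
    vecs = tfidf_vectors(very_good + typical)
    vg_vecs, ty_vecs = vecs[:len(very_good)], vecs[len(very_good):]
    pairs = []
    for i, u in enumerate(vg_vecs):
        ranked = sorted(range(len(typical)), key=lambda j: cosine(u, ty_vecs[j]), reverse=True)
        pairs.extend((i, j) for j in ranked[:k])
    return pairs

# Toy, made-up articles (already tokenized); k=1 only because the toy set is tiny.
very_good = [["firefly", "flash", "ecology"], ["galaxy", "telescope", "star"]]
typical = [["firefly", "insect", "ecology"], ["star", "telescope", "orbit"], ["gene", "protein", "cell"]]
pairs = topic_pairs(very_good, typical, k=1)

# With the corpus sizes on the slide: 3,530 VERY GOOD articles x 10 neighbours each.
expected_pair_count = 3530 * 10
```

With k = 10 and 3,530 VERY GOOD articles, the enumeration gives the 35,300 pairs reported on the slide.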

Two quality prediction tasks

‘Any-topic’ – is this article VERY GOOD or TYPICAL?
2 categories: VERY GOOD (~3,500), TYPICAL (~3,500)

‘Same-topic’ – which article in the pair is the VERY GOOD one?
Topically similar pairs <VERY GOOD, TYPICAL>: ~35,000 pairs

Properties of the dataset

• Distinguishes average writing from very good

• Allows a focus on aspects such as beautiful writing
– Less likely to have spelling and grammar errors

• Large scale and realistic sample of writing differences
– Previous work often used machine generated text or artificially manipulated text

>> Predicting organization quality

Louis & Nenkova, EMNLP 2012

Some sequences of sentence types convey the overall purpose better

Solving X is useful for many applications. [Motivation]

We present a new approach to address X. [Introduce approach]

Results show that our method works well. [Results]

Intentional structure of an article

• Every text has a purpose that the author wishes to convey

• Influential early theories discuss it at length [Grosz & Sidner 1986]

• Particularly for academic writing, it is popular to see articles as a sequence of intentions [Swales 1990, Teufel 2000]

Narrative, Explanation, Critique

Oracle model of intentional structure

• Using manual annotations of intentions on ACL articles

[Figure: Markov chain on Introduction sections (corpus by Teufel, 2000). States: START, Background, Aim, Own, Contrast, Textual, Others' work, END. Transition probabilities shown on the arcs: 0.7, 0.1, 0.3, 0.8, 0.4, 0.1]
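
A minimal sketch of such a Markov chain over sentence types, assuming transition probabilities are estimated from counts with additive smoothing (the estimation details are not given on the slide). The training sequences are made up for illustration and do not come from the Teufel corpus.

```python
import math
from collections import Counter, defaultdict

def train_markov_chain(sequences, smoothing=0.1):
    """Estimate smoothed transition probabilities between sentence types."""
    states = {"START", "END"} | {s for seq in sequences for s in seq}
    counts = defaultdict(Counter)
    for seq in sequences:
        path = ["START", *seq, "END"]
        for prev, nxt in zip(path, path[1:]):
            counts[prev][nxt] += 1

    def prob(prev, nxt):
        total = sum(counts[prev].values()) + smoothing * len(states)
        return (counts[prev][nxt] + smoothing) / total

    return prob

def log_likelihood(prob, seq):
    """Log-probability of a whole sequence of sentence types, including START and END."""
    path = ["START", *seq, "END"]
    return sum(math.log(prob(a, b)) for a, b in zip(path, path[1:]))

# Hypothetical annotated introductions (labels follow the state names in the figure).
training = [
    ["Background", "Aim", "Own"],
    ["Background", "Contrast", "Aim", "Own"],
    ["Background", "Aim", "Own", "Textual"],
]
prob = train_markov_chain(training)

original = ["Background", "Aim", "Own"]
shuffled = ["Own", "Background", "Aim"]
ll_original = log_likelihood(prob, original)
ll_shuffled = log_likelihood(prob, shuffled)
```

On this toy data the familiar ordering receives a higher log-likelihood than the shuffled one, which is the property the oracle model relies on.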

Main idea of the work

• Annotating sentence types is hard. Pre-defining the set of sentence types is harder

• Assume: syntax ~ rough proxy for sentence type

Syntactic patterns in explanations

• An aqueduct is a water supply or navigable channel constructed to convey water. In modern engineering, the term is used for any system of pipes, canals, tunnels, and other structures used for this purpose.

• A cytokine receptor is a receptor that binds cytokines. In recent years, the cytokine receptors have come to demand more attention because their deficiency has now been directly linked to certain debilitating immunodeficiency states.

Definitions look like this

Descriptive articles look like this

Annotations on the examples: indefinite article; term to define; is/are NP; relative clause; more specific: topicalized PP

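Syntax-based HMM model*

[Figure: HMM running from START to END whose hidden states emit grammatical productions. Transition probabilities shown on the arcs: 0.5, 0.3, 0.2, 0.3, 0.2]

Example states and their productions:

“Definitions”: VP → VBZ NP; NP → DT ADJP; NP → NP PP; ….

“Citations”: NP → NNP CC NNP; NP → CD; NP → NP , NP

“Speculations”: VP → MD VP; VP → VB VP; VP → VB PP

* Uses grammatical productions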
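
A minimal sketch of how grammatical productions could be read off a parse tree, so that each sentence becomes a bag of productions for the HMM to emit. The nested-tuple tree encoding, the helper names, and the simplified example parse are assumptions made for illustration; in practice the trees would come from a parser.

```python
from collections import Counter

def T(label, *children):
    """Build a tree node as (label, [children]); POS tags have no children."""
    return (label, list(children))

def productions(node):
    """List the productions 'LHS -> RHS' of a tree, in top-down, left-to-right order."""
    label, children = node
    if not children:  # a POS tag: words are ignored, so nothing to emit
        return []
    rules = [f"{label} -> {' '.join(child[0] for child in children)}"]
    for child in children:
        rules.extend(productions(child))
    return rules

# Hypothetical, simplified parse of a definition such as "An aqueduct is a channel".
tree = T("S",
         T("NP", T("DT"), T("NN")),
         T("VP", T("VBZ"), T("NP", T("DT"), T("NN"))))

rules = productions(tree)
bag = Counter(rules)  # the sentence as a bag of productions
```

Here `rules` contains `VP -> VBZ NP`, one of the productions listed under “Definitions” above.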

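A second model: based on d-sequences

• More information about adjacent constituents

• A POS tag sequence loses all abstraction

• D-sequence – control abstraction using a parameter “depth” (d)

[Figure: parse tree. S → “ S , ” NP VP . ; inner S → NP VP; NP → DT; VP → VBZ NP; NP → JJ NN; NP → NNP NNP; VP → VBD]

POS tag sequence: [“ DT VBZ JJ NN , ” NNP NNP VBD .]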

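Step 1 – depth cutoff

[Figure: parse tree of the example sentence. ROOT → S; S → “ S , ” NP VP . ; inner S → NP VP; NP → DT; VP → VBZ NP; NP → JJ NN; NP → NNP NNP; VP → VBD]

• Choose a depth d

• Terminate tree at d

• Read off new leaves from left to right

d = 2: “ S , ” NP VP .

d = 3: “ NP VP , ” NNP NNP VBD .

Example sentence: “That’s good news,” Dr. Leak said.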
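
The depth cutoff can be sketched as a short recursive traversal. This is an illustrative reading of the step, assuming ROOT sits at depth 0 and POS tags are kept as leaves when they occur above the cutoff; the nested-tuple tree encoding and function names are hypothetical, and the quote marks are written with the Penn Treebank tags `` and ''.

```python
def T(label, *children):
    """Build a tree node as (label, [children]); POS tags have no children."""
    return (label, list(children))

# Parse tree of: “That’s good news,” Dr. Leak said.  (words omitted, POS tags as leaves)
TREE = T("ROOT",
         T("S",
           T("``"),
           T("S",
             T("NP", T("DT")),
             T("VP", T("VBZ"), T("NP", T("JJ"), T("NN")))),
           T(","),
           T("''"),
           T("NP", T("NNP"), T("NNP")),
           T("VP", T("VBD")),
           T(".")))

def d_sequence(node, d, depth=0):
    """Cut the tree at depth d and read the new leaves from left to right."""
    label, children = node
    if not children or depth == d:  # a POS tag, or a phrasal node at the cutoff
        return [label]
    sequence = []
    for child in children:
        sequence.extend(d_sequence(child, d, depth + 1))
    return sequence

seq_d2 = d_sequence(TREE, 2)
seq_d3 = d_sequence(TREE, 3)
seq_full = d_sequence(TREE, 10)  # a cutoff below the tree simply yields the POS tags
```

A small d gives an abstract sequence of phrasal nodes, while a large d falls back to the plain POS tag sequence.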

Step 2: Node augmentation

[Figure: parse tree of the example sentence, rooted at ROOT]

• For phrasal nodes in d-sequence, annotate with leftmost leaf in full tree

(X_Y below denotes phrasal node X annotated with its leftmost leaf Y)

d = 2: “ S_DT , ” NP_NNP VP_VBD .

d = 3: “ NP_DT VP_VBZ , ” NNP NNP VBD .
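
A minimal sketch of the augmentation step, under the same assumptions as the depth cutoff (ROOT at depth 0, POS tags as leaves). The `X_Y` output format, the tree encoding, and the function names are illustrative choices, not the authors' implementation.

```python
def T(label, *children):
    """Build a tree node as (label, [children]); POS tags have no children."""
    return (label, list(children))

# Parse tree of: “That’s good news,” Dr. Leak said.  (words omitted, POS tags as leaves)
TREE = T("ROOT",
         T("S",
           T("``"),
           T("S",
             T("NP", T("DT")),
             T("VP", T("VBZ"), T("NP", T("JJ"), T("NN")))),
           T(","),
           T("''"),
           T("NP", T("NNP"), T("NNP")),
           T("VP", T("VBD")),
           T(".")))

def leftmost_leaf(node):
    """Follow first children down to the leftmost POS tag under a node."""
    label, children = node
    while children:
        label, children = children[0]
    return label

def augmented_d_sequence(node, d, depth=0):
    """d-sequence in which each phrasal node carries its leftmost leaf in the full tree."""
    label, children = node
    if not children:  # POS tags are left unchanged
        return [label]
    if depth == d:    # phrasal node at the cutoff: annotate it
        return [f"{label}_{leftmost_leaf(node)}"]
    sequence = []
    for child in children:
        sequence.extend(augmented_d_sequence(child, d, depth + 1))
    return sequence

aug_d2 = augmented_d_sequence(TREE, 2)
aug_d3 = augmented_d_sequence(TREE, 3)
```

The annotation restores a little lexical-category detail to an otherwise abstract node, so two S nodes that begin differently are no longer identical symbols.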

Evaluation task on academic writing

• ACL anthology corpus – abstract, introduction, related work

• Approximate distinction for organization quality
– Original article → well-organized
– Random permutation of original → poorly-organized
– Create pairs <original, permutation>

• Task: identify the original version in the pair
– Baseline 50% accuracy
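
A minimal sketch of this evaluation protocol, assuming one random permutation per article and a generic scoring function standing in for the model likelihood. The toy scorer below (counting adjacent sentences that appear in ascending order) is purely a placeholder so the example runs; it is not one of the models in the talk.

```python
import random

def make_pairs(articles, seed=0):
    """Pair each article (a list of sentences) with a random permutation that differs from it."""
    rng = random.Random(seed)
    pairs = []
    for sentences in articles:
        if len(set(sentences)) < 2:
            continue  # no distinct permutation exists
        permuted = sentences[:]
        while permuted == sentences:
            rng.shuffle(permuted)
        pairs.append((sentences, permuted))
    return pairs

def pairwise_accuracy(pairs, score):
    """Fraction of pairs in which the original scores strictly higher than its permutation."""
    correct = sum(score(original) > score(permuted) for original, permuted in pairs)
    return correct / len(pairs)

def toy_score(sentences):
    """Placeholder scorer: number of adjacent sentences in ascending order."""
    return sum(a < b for a, b in zip(sentences, sentences[1:]))

# Toy articles: sentences are represented by their original positions.
articles = [[0, 1, 2, 3], [0, 1, 2], [0, 1, 2, 3, 4]]
pairs = make_pairs(articles)
accuracy = pairwise_accuracy(pairs, toy_score)
```

A scorer that cannot tell the two versions apart would be right about half the time, which is where the 50% baseline comes from.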

Summary of results on academic writing

• Correct = higher likelihood for original article – versus permuted article

• D-seq model

ACL conference   Accuracy
Abstract         62.9
Introduction     68.8
Related work     72.7

Baseline = 50%

Do sentence types distinguish VERY GOOD and TYPICAL science news?

• Create the HMM on VERY GOOD training articles

• Get likelihood and most likely state sequence for a new article
– Compute features based on these

• A classifier is trained to predict the VERY GOOD article

Results on our corpus

Any Topic: Given an article, is it “VERY GOOD” or “TYPICAL”?

System              Accuracy
Baseline (random)   50%
HMM-productions     61%

• 10 fold cross validation results
• SVM classifier

Same Topic: Given a pair of articles on the same topic, which one is “VERY GOOD”?

System              Accuracy
Baseline (random)   50%
HMM-productions     63%

>> Predicting reader interest

Louis & Nenkova, TACL 2013

Predicting interest: A new task

• A lot of work on identifying what is wrong with a text
– Spelling mistakes, grammar errors, incoherent writing

• It is not known how to characterize writing that is engaging, interesting and nice

Approach to feature development

• Focus on interpretable features
– Only 41 features
– Each feature is a composite one: indicates an aspect directly
– Linguistically interesting

• Confirm that features represent the intended aspect
– Tune by checking feature values on random snippets of text

1. Unusual words and phrases
Is the phrasing and language use unique?

• Word-based
– high perplexity under a phoneme n-gram model
– Eg: ‘undersheriff’, ‘powwow’, ‘chihuahua’, ‘qipao’

• Word pairs-based
– adjective-noun, noun-noun, adverb-verb, subject-verb pairs
– perplexity under a language model
– Eg: ‘plasticky woman’, ‘so-called superkids’
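
A rough sketch of the word-based idea. The slide uses a phoneme n-gram model; since no pronunciation dictionary is assumed here, this sketch swaps in a letter bigram model with add-one smoothing over lowercase ASCII words. The tiny training list, the function names, and the vocabulary size are illustrative assumptions.

```python
import math
from collections import Counter

VOCAB_SIZE = 27  # assumption: 26 lowercase letters plus the end-of-word marker "$"

def train_letter_bigrams(words):
    """Count letter bigrams, with '^' and '$' marking word start and end."""
    bigrams, contexts = Counter(), Counter()
    for word in words:
        chars = "^" + word.lower() + "$"
        for a, b in zip(chars, chars[1:]):
            bigrams[a, b] += 1
            contexts[a] += 1
    return bigrams, contexts

def perplexity(word, model, alpha=1.0):
    """Per-symbol perplexity of a word under the smoothed letter bigram model."""
    bigrams, contexts = model
    chars = "^" + word.lower() + "$"
    log_prob = 0.0
    for a, b in zip(chars, chars[1:]):
        log_prob += math.log((bigrams[a, b] + alpha) / (contexts[a] + alpha * VOCAB_SIZE))
    return math.exp(-log_prob / (len(chars) - 1))

# Made-up training list of ordinary words; a real model would use a large lexicon.
model = train_letter_bigrams(["the", "then", "there", "other", "mother", "heather", "three", "tree"])

common = perplexity("there", model)
unusual = perplexity("qipao", model)
```

Words whose letter sequences are rare under the model, such as ‘qipao’ here, come out with higher perplexity than words built from frequent sequences.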

2. Visual nature
Is there scene setting?

• Creating a large lexicon of visual terms
– Source: an image-tagged corpus
– Large source of potentially visual words, but noisy

• Create LDA-based topics on the tag set
– Use the manual MRC terms to filter out non-visual topics

Example topics:
grass, mountain, green, hill, blue, field, sand...
round, ball, circles, logo, dots, square, sphere...
silver, white, diamond, gold, necklace, chain...

Human interest and text structure

3. Use of people in the story
Does the story revolve around a person?
– animacy information from NEs, pronouns, n-gram patterns

4. Sub-genre
Is the article a narrative, interview or dialog?
– Eg: narrative score ~ past tense verbs, pronouns, proper names
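
A minimal sketch of a narrative score along these lines, assuming it is simply the fraction of tokens that are past tense verbs, pronouns, or proper names (the slide gives only the ingredients, not the formula). Tokens are assumed to arrive already tagged with Penn Treebank POS tags; the function name and example sentences are hypothetical.

```python
# Penn Treebank tags for past tense verbs, pronouns, and proper names.
NARRATIVE_TAGS = {"VBD", "PRP", "PRP$", "NNP", "NNPS"}

def narrative_score(tagged_tokens):
    """Fraction of (word, tag) tokens whose tag is in NARRATIVE_TAGS."""
    if not tagged_tokens:
        return 0.0
    hits = sum(tag in NARRATIVE_TAGS for _, tag in tagged_tokens)
    return hits / len(tagged_tokens)

# Hand-tagged toy sentences.
story = [("She", "PRP"), ("walked", "VBD"), ("through", "IN"), ("a", "DT"), ("field", "NN")]
exposition = [("Readability", "NN"), ("has", "VBZ"), ("a", "DT"), ("strong", "JJ"), ("focus", "NN")]

story_score = narrative_score(story)
exposition_score = narrative_score(exposition)
```

Under this assumed formula the story-like sentence scores 0.4 and the expository one 0.0.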

Sentiment and Research

5. Affect
Is there an emotional angle to the story?
– using sentiment word dictionaries

6. Research content
How much explicit research description is present?
– using a hand-built dictionary of research words

How the features vary in a random sample of very good and typical articles (t-test)

Higher values in VERY GOOD set

✓ Visual words in beginning and end of articles
✗ Total visual words
✗ Animacy counts
✓ Unusual words and phrases
✗ Narrative, interview or dialog format
✓ Sentiment words, negative polarity
✓ Research words

Accuracies on the two tasks

Any Topic: Given an article, is it “VERY GOOD” or “TYPICAL”?

System                         Accuracy
Baseline (random)              50%
Interesting-science features   75%

• 10 fold cross validation results
• SVM classifier

Same Topic: Given a pair of articles on the same topic, which one is “VERY GOOD”?

System                         Accuracy
Baseline (random)              50%
Interesting-science features   68%

Combining interest with other aspects

Feature set                                                                    any topic   same topic
Interesting science                                                            75.3        68.0
Previous methods for predicting other aspects
Readable (article length, language model, cohesion, syntax), 16 features       65.5        63.0
Well-written (entity grid [BL08], discourse relations [PN08]), 23 features     59.1        59.9
Interesting-fiction [ML09], 22 features                                        67.9        62.8
Combination of all features
All writing aspects                                                            76.7        74.7

Different aspects of writing have complementary strengths

Genre-specific measures are stronger than generic ones

Conclusions

• Text quality is an interesting and challenging task

• More success on the topic recently
– application to novels, tweets, essays

• Future work
– A lot to be done in terms of formalizing the tasks, collecting data, models and evaluation
– Transferring the knowledge to generating texts

Thank you!