GCS HLRS · has become a major research hub for simulation in Europe. The focus of our work is on...

Post on 07-Jul-2020

GCS HLRS

Welcome to the World of HLRS

• Michael Resch, University of Stuttgart, HLRS

High Performance Computing at the University of Stuttgart started about 50 years ago. In 1996 the HPC Center / Höchstleistungsrechenzentrum Stuttgart (HLRS) was founded as the first German national HPC centre. Since then HLRS has established itself as one of the leading centers worldwide with its focus on engineering science and industrial HPC. From the start HLRS has supported local industrial leaders like Daimler and Porsche. Through hww GmbH HLRS provides industry with access to systems, and through the Automotive Simulation Center Stuttgart (ASCS) it provides expertise in simulation. Since 2007 HLRS has been a member of the Gauss Centre for Supercomputing (GCS). Together with its fellow centres at Jülich and Garching it supports European researchers in the Partnership for Advanced Computing in Europe (PRACE). Through a long-term strategic hardware and software concept HLRS has laid the foundation for the successful usage of simulation science in research and industry.

The mission of HLRS today is to act as a centre of competence, support users and conduct research in the field of HPC. The focus of our users is on Computational Fluid Dynamics. Applications of this technology range from flow over the wings of an aircraft to the ocean circulation. In this booklet we present the highlights of applications of the years 2010 and 2011. Many of them were awarded the Golden Spike Award of the steering committee of HLRS. This award is given every year to young scientists who show outstanding quality of research work for applications on supercomputers.

Over the last years HLRS has expanded its research activities substantially and has become a major research hub for simulation in Europe. The focus of our work is on developing new methods and applications in the field of HPC and distributed computing. Grid and Cloud computing are only two of the cornerstones of our research. Visualization is a main field of our research as it enables the integration of simulation in workflows both in science and industry. The funding for these projects comes from the State of Baden-Württemberg, the Federal Ministry of Science (BMBF), the Deutsche Forschungsgemeinschaft (DFG), the European Commission and directly from industry.

A booklet like this can only give a rough overview of our activities. Our systems are up and running 7x24. Support for our users is given from Barcelona to Poznan and from Rome to Stockholm. On average 80 user groups across Europe have access to our systems. Each of these groups focuses on research projects in the field of simulation. At the same time more than 30 in-house research projects in the field of HPC are conducted by scientists from HLRS. Together these activities create a center of excellence that is a cornerstone of the University of Stuttgart and is renowned worldwide.

Contents

HLRS
» Growing Science at HLRS – Beyond Bare Metal
» Industry inside
» Computing Systems

Applications
» Laminar Flow Control in Vortex-deformed Swept-Wing Flows: Pinpoint Suction
» Fluid-Structure coupled Flow Simulations of Helicopter Rotors
» Direct Numerical Simulation of Active Separation Control Devices
» High Order large Scale Calculations
» Exotic State in Correlated Relativistic Electrons
» The Agulhas System as a Key Region of the global oceanic Circulation
» Modelling Convection over West Africa
» The Maturing of Giant Galaxies by Black Hole Activity

Projects
» CoolEmAll - Platform for Optimizing the Design, Operation and Cooling of modular configurable IT Infrastructures
» VISIONAIR
» Collaborative Research into Exascale Systemware, Tools and Applications (CRESTA)
» Debugging on the next Level: Temanejo
» Towards High Performance Semantic Web – Experience of the LarKC Project
» plugIT – Plug Your Business into IT
» GAMES (Green Active Management of Energy in IT Service centres)
» Towards EXascale ApplicaTions (TEXT)

Growing Science at HLRS – Beyond Bare Metal

• Michael Resch, University of Stuttgart, HLRS

The importance of a supercomputing center is typically measured by the speed of its supercomputer. As of today the key number to be achieved is 1 Petaflop/s and beyond. However, the system itself is by far not enough. It is only a tool to develop solutions. Hence, systems have to be set in a working, stable, and well organized environment. First of all, they need an excellent infrastructure to host them. Second, they need support by the hosting center that helps users to get performance from the system. Third, it takes a lot of system-related research to develop new methods and harvest the full potential of a supercomputer. Finally, in order to bridge the gap between pure research and real-world usage of the systems, a concept is required to integrate supercomputing into the real world of research, development and production.

HLRS has recently installed a Cray XE6 system. The peak performance is in the range of 1 PF/s and we see sustained performance on the order of 20-25% of the peak performance [1]. Beyond the pure hardware HLRS has taken further steps to grow its HPC ecosystem.

Infrastructure
Power supply and cooling are among the most pressing issues these days. Therefore HLRS has built new power and cooling facilities. They provide an additional 4 MW of power, which will bring the power supply of HLRS to a total of 5 MW. At the same time they will provide a very efficient water cooling infrastructure relying fully on free cooling up to 18 degrees Celsius outside temperature. An average PUE of 1.15 is expected. Energy of the Cray XE6 will be used in the new research building of HLRS currently under construction.

With an increase in the number of research projects and funding (see Fig. 1), the number of scientists has also grown (see Fig. 2) over the last years. In order to meet the requirements of HLRS, planning for a new research building started in 2007 and construction work started in June 2011. The building is an extension of the existing HLRS headquarters and will provide an additional 1,950 m² of office space, allowing HLRS to bring all employees together in a single environment. Furthermore the new building will integrate a five-sided Virtual Reality environment. On the one hand this will allow for more realistic visualization of simulation results. On the other hand the new CAVE will have a direct physical link to the supercomputers, allowing more interactive usage models such as simulation steering.

Figure 1: Project funding of HLRS over the last 10 years
Figure 2: Number of HLRS employees over the last 7 years (RA is student research assistants)
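The performance and power figures quoted above can be put into numbers with a few lines of Python. This is only a back-of-the-envelope sketch: it treats the peak as exactly 1.0 PFlop/s and assumes the full 5 MW supply is drawn at the expected PUE, neither of which the text states. The function names are illustrative.

```python
# Figures quoted in the text above. Treating "in the range of 1 PF/s" as
# exactly 1.0 PFlop/s is a rounding assumption made for this sketch.
PEAK_PFLOPS = 1.0
SUSTAINED_FRACTION = (0.20, 0.25)   # "20-25% of the peak performance"
TOTAL_POWER_MW = 5.0                # total power supply after the upgrade
EXPECTED_PUE = 1.15                 # expected average PUE


def sustained_range(peak, fractions):
    """Return (low, high) sustained performance for a peak and a fraction range."""
    low, high = fractions
    return peak * low, peak * high


def it_power(total_power, pue):
    """PUE = total facility power / IT power, hence IT power = total / PUE."""
    return total_power / pue


sustained_low, sustained_high = sustained_range(PEAK_PFLOPS, SUSTAINED_FRACTION)
it_mw = it_power(TOTAL_POWER_MW, EXPECTED_PUE)   # assumes the full supply is used
overhead_mw = TOTAL_POWER_MW - it_mw             # cooling and distribution share

print(f"sustained: {sustained_low:.2f} to {sustained_high:.2f} PFlop/s")
print(f"IT load at full supply: {it_mw:.2f} MW, overhead: {overhead_mw:.2f} MW")
```

Under these assumptions roughly 4.35 MW of the 5 MW would be available to the computing equipment itself, with the remainder going to cooling and power distribution.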

Research
Over the last years HLRS has increased the scope of its research tremendously. Four major levels of research can be identified, which HLRS merges into a coherent program of research for High Performance Computing.

State level: At the state level HLRS has agreed with the State Ministry of Science on a major initiative to develop scalable software for High Performance Computing. The initiative will make available up to 30 million euros for the purchase and development of highly scalable software.

Federal level: Since the start HLRS has been participating in the High Performance Computing software initiative of the German Federal Ministry of Science. HLRS is working on a variety of application-driven research projects that aim at transforming flops into solutions. At the same time HLRS is taking part in a German Cluster of Excellence funded by the German Research Foundation (DFG) called "SimTech" [2]. SimTech has a focus on basic methods in simulation technology and is the only such cluster of excellence in Germany. Within SimTech the Director of HLRS, Professor Michael Resch, serves as a principal investigator for HPC.

European level: At the European level HLRS has a long history of project research and collaboration. The most recent High Performance Computing project, CRESTA, aims at supporting the development of Exascale systems in Europe. However, project research at HLRS goes beyond Exascale. Many of our projects aim at making HPC more productive or bringing HPC closer to the industrial environment to which HLRS is connected.

Industrial level: The collaboration of HLRS with industry is manifold. HLRS has long collaborated with Microsoft in developing software for cluster systems. Recently HLRS started a Cray Center of Development. Over the next five years researchers from Cray and from HLRS will work on scalable applications and on tools to harvest the potential of future HPC systems.

Summary
Beyond the installation of hardware and its participation in organizational collaborations like GCS and infrastructure initiatives like PRACE [3], HLRS is heavily involved in research. Over the last years and in the years to come HLRS is substantially improving its infrastructure and has at the same time set up a framework for research that integrates various scopes and levels of funding to develop better solutions for its shareholders.

References
[1] Resch, M.: New HLRS System HERMIT, inSiDE, Vol. 9, No. 1, Spring 2011

[2] SimTech Excellence Cluster, www.simtech.uni-stuttgart.de

[3] PRACE, www.prace-ri.eu

Industry inside

High Performance Computing is widely claimed to have an important impact on the economy of a country. HPC is considered to be a key technology in order to stay competitive (see for example [1]). However, a look at the fastest systems in the world reveals that 92% of the top 100 systems are still not industrial systems. This discrepancy is striking. One would expect industry to use HPC as much as possible in order to improve its competitiveness. In this article we describe the model that HLRS has deployed successfully for many years to integrate HPC into industrial simulation processes on large-scale computing facilities. Furthermore, we present new approaches to extending the reach of HPC both in type of company (SME) and in type of usage.

Roadblocks
So, what are the roadblocks that obviously make it so difficult to use HPC in industry? First, there are cost issues. HPC is expensive. When looking at the top 100 systems in the world we have to consider investment costs in the range of Euro 5 million, and additional fixed operational costs in a similar range over an operational period of 3 years. Furthermore the increased power demand and the corresponding need for an infrastructure with a high level of power efficiency lead to substantially raised investment costs for the infrastructure. For a top 100 system one would estimate this to be on the order of another Euro 5 to 10 million at least every 5 years. All in all HPC requires a relatively high level of financial effort, potentially delivering benefit in the long term, but typically HPC does not guarantee an immediate return on investment. Additionally, costs for HPC investment are fixed costs. They can hardly be reduced in times of crisis.

The obvious alternative is outsourcing of HPC. However, public HPC systems operate in a well-defined environment, which typically is exactly the opposite of what industry requires. The focus of a public HPC center is on openness (we can safely ignore the special cases of classified systems as these, by nature, are not available to industry). The service provided is best effort, as typically the services are provided for free or based on a research grant. For industry openness is not a problem per se, but very often simulation data and software are sensitive. So, ways have to be found to guarantee data security and safety. Best effort was acceptable 10 to 15 years ago when simulation in industry was itself a research activity or an accompanying measure at best. Today, as simulation is one tool to develop new products, and experiments are often made mainly to support results of extensive simulations, simulation is a core part of what industry calls production. And as much as industry is worried about stopping physical production lines, it also starts to worry about its virtual production lines, as found in HPC simulation.

Public HPC for Industry
HPC at the University of Stuttgart has a long tradition. For at least 25 years the computing center has operated systems together with industry. Common investments were made as early as 1986. Hence, when the High Performance Computing Center Stuttgart (HLRS) was founded in 1996 as the first German national supercomputing center, an industrial strategy was an important part of the concept. In the last 15 years this concept has been extended. Geographically this was achieved by setting up very close relations with the Karlsruhe Institute of Technology (KIT) and its Steinbuch Center for Computing (SCC). Organizationally this is complemented by continuously working with our industrial partners to further develop the necessary tools and organizational frameworks. The aim is not only to provide HPC cycles to industry but also to shape the processes and methods required for industrial HPC adoption. In 2011 this concept was moved to a new level by introducing a couple of deep changes to the existing co-operation models. These changes are extremely important for HLRS but can also serve as a model for a general concept of industrial HPC usage worldwide.
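The cost figures given under Roadblocks can be combined into a rough yearly budget for an in-house top 100 system. The sketch below is an illustration only: it assumes operational costs equal to the investment ("a similar range"), spreads all costs linearly over their periods, and ignores interest and inflation. The function name `annual_cost` is made up for this example.

```python
def annual_cost(investment, operations, period_years, infrastructure, infra_cycle_years):
    """Yearly cost in million euros with simple linear spreading (an assumption)."""
    system_share = (investment + operations) / period_years
    infrastructure_share = infrastructure / infra_cycle_years
    return system_share + infrastructure_share


# Figures from the text: about 5 million investment, a similar amount for
# operations over 3 years, and another 5 to 10 million for infrastructure
# at least every 5 years.
low_estimate = annual_cost(5.0, 5.0, 3, 5.0, 5)
high_estimate = annual_cost(5.0, 5.0, 3, 10.0, 5)

print(f"estimated yearly cost: {low_estimate:.2f} to {high_estimate:.2f} million euros")
```

Under these simplifications the fixed yearly cost comes out at roughly 4 to 5 million euros, which illustrates why such an investment is hard to scale back in times of crisis.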

An industrial Solution
Industrial use of HPC systems comes in different flavors that require different approaches to meet their individual needs. There is no one-size-fits-all approach. In the following we look at three key concepts that have drawn some attention in the last years:
• Simulation production in industry
• Improvement of industrial in-house methods
• Pre-competitive research.

Simulation Production
Simulation as described above is already part of the every-day design and development process in many companies. As such simulation is part of process chains and has to be integrated into overall processes. HPC needs to be made available on a reliable and secure basis. No research effort is required.

HLRS set up a solution for this together with industrial partners as early as 1995.

The Höchstleistungsrechner für Wissenschaft und Wirtschaft Betriebsgesellschaft mbH (hww) was established together with Daimler and Porsche. Over time its ownership changed from Daimler to T-Systems, which currently holds 40% of hww. Porsche still holds 10%. Over time HLRS invited further universities to join hww, such that today 50% of the shares of hww are evenly split between two universities and the State of Baden-Württemberg.

With HPC becoming more and more part of the production process, the operational model of hww had to be changed too. Over the last 3 years the advisory board of hww has worked out a new business concept. According to this concept High Performance Computing is now provided as a commodity via hww. hww serves as a platform and can provide access to various HPC systems on demand in a cloud-like way. The success of moving to such a service-oriented approach can easily be seen from the usage of HPC systems at HLRS. Over the last years usage has increased from roughly 2 million core hours in 2007 to a new record of over 20 million core hours in 2011. Current usage numbers and discussions with industry indicate that the growth will continue over the coming two years.
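To put the usage growth into perspective, the two quoted data points can be turned into a growth factor and an average yearly growth rate. The sketch below uses the rounded values of 2 and 20 million core hours; the text says "roughly" and "over", so the result is only indicative. The steady compound growth between 2007 and 2011 is an assumption, since no intermediate years are given.

```python
def growth(start_value, end_value, years):
    """Return (overall factor, average yearly growth rate) assuming compound growth."""
    factor = end_value / start_value
    yearly_rate = factor ** (1.0 / years) - 1.0
    return factor, yearly_rate


# Rounded figures from the text: about 2 million core hours in 2007,
# about 20 million core hours in 2011.
factor, yearly_rate = growth(2e6, 20e6, 2011 - 2007)

print(f"overall growth factor: {factor:.1f}")
print(f"average yearly growth: {yearly_rate:.0%}")
```

A tenfold increase over four years corresponds to an average growth of roughly 78% per year under this assumption.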

The success of hww is based on a number of factors:
• Clear technical concept for operation, including security measures conforming to industrial standards and operational procedures agreed upon between industry and public providers.
• Clear financial concept, making sure that both public requirements and economic necessities are combined to find an optimum economic solution.
• Clear legal procedures that ensure that all legal regulations are met and that tax issues are resolved in such a way as to eliminate the pending risks of taxation for the public sector.

Even though sharing public resources with industry imposes new rules and additional work on the public side, it also comes with a number of benefits.
• Clarification of financial aspects of HPC. A comparison of public HPC services with private cloud offerings is easily possible and can be done anytime.
• Increased level of operational security, which is also beneficial for the public users.
• Stabilization of financial planning, as industrial usage gives political backing for HPC investment.

Improving in-house Methods
Improvement of in-house methods is often discussed in the debate about private use of public HPC resources. The argument goes that industry needs to have access to HPC in order to be able to co-operate with public research to improve its processes. In Germany there is a long tradition of industry-university co-operation that has created a framework that is unique worldwide. In the fields of engineering, chemistry, electronics and many other applied sciences, research departments of universities and research organizations typically work together with industry in research projects. Very often such collaborations were established 50 or more years ago and are handed over from director to director of an institute over time. As part of such well-established frameworks public and private research are well connected. Currently we estimate that about one third of all projects running on the systems of HLRS have some relation with a public industrial research project. The very positive thing about such projects is that the results achieved are made publicly available. Only if the research is done privately and results are kept confidential does industry have to go through the hww model as described above.

Another advantage of this model is that many master theses and PhD theses are done in collaboration between public research departments and industry. Through this process a lot of know-how is created that benefits both sides. Young researchers get a first look into industrial processes during their thesis. Industry gets to work with the most advanced methods and can easily recruit well trained staff through such co-operations.

Pre-competitive Research
When it comes to international competitiveness a national approach is important. But, on the other hand, competition is not only an international issue but also a national one. So, one has to find a way to target general research topics by getting the best researchers to focus their minds on them. That may define the competitiveness of a national economy in the decades to come. And at the same time one has to find a way to keep industrial companies close to such research in order to allow an easy and quick transition of research results into the industrial simulation process. HLRS and SCC have decided to set up "Solution Centers" in order to meet both requirements. Today the following are active:
• Automotive Simulation Center Stuttgart (ASCS)
• Energy Solution Center Karlsruhe (ENSOC).
Both centers bring together researchers and industry. Both centers have integrated hardware and software vendors to make sure that whatever new methods are worked out by researchers can immediately be integrated into existing commercial software and can quickly be optimized for existing hardware.

• Michael Resch, University of Stuttgart, HLRS

Beyond large Scale Industries – SME-to-HPC
Usage of HPC requires a high level of expertise. Furthermore it comes with some costs. Finally there is a trust issue concerning industrial data. That's why SMEs typically are not at the forefront of public HPC usage. For them all three issues are extremely difficult to handle. Expertise for HPC is typically not part of the core business of smaller companies. The available budget for HPC usage is tight and even small initial investments may be prohibitively high. Finally, knowledge and data are very often the key and only assets of a small company.

HLRS and SCC are therefore specifically targeting SMEs in the State of Baden-Württemberg. In Baden-Württemberg a lot of innovation is delivered by these small companies. Any support for them may create a substantial local return on investment. Support for SMEs relies on trust. Hence, the focus has to be on local initiatives. Trust furthermore requires a lot of personal involvement. Especially at the beginning it is important to have people at hand who listen and provide solutions if problems occur. HLRS and SCC therefore decided to set up a new company. Following the acronym of hww (providing pure cloud-like HPC compute services) the new company was named Höchstleistungsrechner und Verteiltes Rechnen Verbund (HVV). One of the key activities of HVV will be to identify SMEs in Baden-Württemberg that may benefit from HPC, to educate them on the benefits of HPC, to propose projects, and to support them in integrating HPC into their own production schemes. HVV will have to work closely with HLRS and SCC to be successful. However, we expect to see a tremendous impact on HPC usage in SMEs over the next 5 years if the concept works out.

Conclusion
Support for industry in HPC is a difficult task for any public center. However, experience shows that one can be very successful and competitive by focusing on the right issues. These were resolved successfully in the State of Baden-Württemberg and the findings can be useful beyond it. Going beyond pure cycle provisioning, speed of innovation is addressed by solution centers. Based on HVV we address the SME sphere. Both activities will require more effort in the coming years and are expected to deliver a high return on investment for society and the economy. Both approaches should be extended beyond the State of Baden-Württemberg.

References
[1] Joseph, Earl C. et al.: A Strategic Agenda for European Leadership in Supercomputing: HPC 2020 - IDC Final Report of the HPC Study for the DG Information Society of the European Commission, http://www.hpcuserforum.com/EU/downloads/SR03S10.15.2010.pdf, last accessed August 14, 2011

Computing Systems @ HLRS

View of the HLRS Cray XE6 "HERMIT"

First German National Center
Based on a long tradition in supercomputing at the University of Stuttgart, HLRS (Höchstleistungsrechenzentrum Stuttgart) was founded in 1996 as the first German federal Centre for High Performance Computing. HLRS serves researchers at universities and research laboratories in Europe and Germany and their external and industrial partners with high-end computing power for engineering and scientific applications.

Service for Industry
Service provisioning for industry is done together with T-Systems, T-Systems sfr, and Porsche in the public-private joint venture hww (Höchstleistungsrechner für Wissenschaft und Wirtschaft). Through this co-operation industry always has access to the most recent HPC technology.

Bundling Competencies
In order to bundle service resources in the state of Baden-Württemberg HLRS has teamed up with the Steinbuch Center for Computing of the Karlsruhe Institute of Technology. This collaboration has been implemented in the non-profit organization HVV.

World Class Research
As one of the largest research centers for HPC, HLRS takes a leading role in research. Participation in the German national initiative of excellence makes HLRS an outstanding place in the field.

Contact:
Höchstleistungsrechenzentrum Stuttgart (HLRS)
Universität Stuttgart
Prof. Dr.-Ing. Dr. h.c. Dr. h.c. Michael M. Resch
Nobelstraße 19
70569 Stuttgart
Germany
Phone +49-711-685-87269
resch@hlrs.de / www.hlrs.de

View of the HLRS BW-Grid IBM Cluster (Photo: HLRS)

Compute servers currently operated by HLRS
A detailed description can be found on HLRS's web pages: www.hlrs.de/systems

System: Cray XE6 "HERMIT" (Q4 2011)
Size: 3,552 dual-socket nodes with 113,664 AMD Interlagos cores
Peak performance (TFlop/s): 1,045
Purpose: Capability Computing
User community: European and German Research Organizations and Industry

System: NEC Hybrid Architecture
Size: 12 16-way SX-9 nodes with 8 TByte main memory + 5,600 Intel Nehalem cores with 9 TB memory and 64 NVIDIA Tesla S1070
Peak performance (TFlop/s): 146
Purpose: Capability Computing
User community: German Universities, Research Institutes and Industry, D-Grid

System: IBM BW-Grid
Size: 3,984 Intel Harpertown cores, 8 TByte memory
Peak performance (TFlop/s): 45.9
Purpose: Grid Computing
User community: D-Grid Community

System: Cray XT5m
Size: 896 AMD Shanghai cores, 1.8 TByte memory
Peak performance (TFlop/s): 9
Purpose: Technical Computing
User community: BW Users and Industry
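A few derived quantities help when comparing the systems listed above. The following sketch computes cores per node for the Cray XE6 and the peak performance per core for the homogeneous systems, using only the numbers from the list. The NEC hybrid system is left out because its peak is spread over vector nodes, scalar cores and accelerators, and the list does not break it down. The dictionary layout and function names are illustrative.

```python
# Values copied from the system list; peak performance is in TFlop/s.
SYSTEMS = {
    "Cray XE6 HERMIT": {"cores": 113_664, "nodes": 3_552, "peak_tflops": 1_045.0},
    "IBM BW-Grid": {"cores": 3_984, "nodes": None, "peak_tflops": 45.9},
    "Cray XT5m": {"cores": 896, "nodes": None, "peak_tflops": 9.0},
}


def gflops_per_core(entry):
    """Peak performance per core in GFlop/s (1 TFlop/s = 1000 GFlop/s)."""
    return entry["peak_tflops"] * 1000.0 / entry["cores"]


def cores_per_node(entry):
    """Cores per node, or None when the list gives no node count."""
    if entry["nodes"] is None:
        return None
    return entry["cores"] / entry["nodes"]


hermit = SYSTEMS["Cray XE6 HERMIT"]
hermit_cores_per_node = cores_per_node(hermit)
hermit_gflops_per_core = gflops_per_core(hermit)

for name, entry in SYSTEMS.items():
    print(f"{name}: {gflops_per_core(entry):.2f} GFlop/s per core")
print(f"HERMIT cores per node: {hermit_cores_per_node:.0f}")
```

For HERMIT this gives 32 cores per dual-socket node, i.e. 16 cores per socket, and a little over 9 GFlop/s of peak performance per core.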

User Applications

• Michael Resch, University of Stuttgart, HLRS

HLRS has a long tradition in supporting users on HPC systems. Integrated in one of the leading German technical universities, the University of Stuttgart, HLRS draws on the expertise of application experts in mechanical engineering, physics and chemistry as well as on the know-how of computer scientists and mathematicians. It is only natural that within the Gauss Centre for Supercomputing (GCS) HLRS takes a leading role for engineering applications and industrial usage of HPC. However, usage of HLRS systems goes beyond. In the following we present the highlights of applications of the last two years. Many of them were awarded the Golden Spike Award of the steering committee of HLRS. This award is given every year to young scientists who show outstanding quality of research work for applications on supercomputers.

The biggest share in usage of HLRS systems is for applications using Computational Fluid Dynamics. Applications of this technology range from flows in mechanical engineering to ocean circulation to astrophysical applications. In this booklet you will find a number of examples for the application of CFD in turbulence research. Turbulence is one of the big challenges in engineering. Turbulent flows increase transportation cost and make systems less manageable.

A very good example of the application of turbulence research is the investigation of turbulence over the wing of an aircraft. The reduction of turbulence results in a reduction of fuel consumption in air traffic. Researchers of the Institute for Aero- and Gasdynamics of the University of Stuttgart were able to develop a new wing shape that will help to reduce fuel costs for airplanes by about 15%. This will result not only in a reduction of fuel costs for airlines but also in a substantial reduction of CO2 emissions.

At the other end of the spectrum we find large scale simulations of ocean circulation, which are essential to better understand the interactions between ocean and atmosphere in climate research. Such large scale simulations rely heavily on the large memory of supercomputers and on the high level of performance. Large memory is essential to adequately resolve the features of a large region. High performance is required to be able to study long term phenomena that are crucial for the world climate.

Further applications presented here deal with phenomena in physics. Again we present two sides of the spectrum. On the one end we find the simulation of atoms. Such simulations are necessary to better understand the behavior of materials. With increasing memory size and performance next generation systems are able to simulate larger numbers of atoms and hence increase our understanding of complex material behavior. At the other end of the spectrum we find astrophysical simulations. The understanding of the processes that act in the universe can be useful to better grasp the genesis of the world as we know it today.

The examples presented here were chosen from our journal inSiDE, which is published twice a year. They aim to reflect the spectrum of applications developed and simulations conducted at HLRS. However, they can only give the reader a glimpse into the wealth of applications that are developed in over 80 of our user projects all year round.

Laminar Flow Control in Vortex-deformed Swept-Wing Flows: Pinpoint Suction

Golden Spike Award by the HLRS Steering Committee in 2010

Improving the fuel efficiency of aircraft has become an important task within the last decades. Not only do airlines benefit from saving increasingly expensive fuel, but the environmental aspect has also gained growing interest, and it will only be a matter of time until environmental laws limiting greenhouse gas emissions are approved. Current advertisements for newly designed aircraft show the demand for more efficient airplanes: “The 787 Dreamliner is using 20% less fuel than any other airplane of its size” (Boeing) or “The A380 provides the lowest fuel burn per seat – which allows airlines to substantially reduce CO2-emissions while achieving profitable, sustainable growth for decades to come” (Airbus).

To date, realized optimizations for new airplanes are limited to enhanced shaping, avoiding too rough surfaces, and engine improvement, but little potential is thought to be left in these fields except surface quality on aerodynamic surfaces. New concepts therefore have to be envisaged that consider the underlying fluid dynamics phenomena in detail. In-flight tests with a kind of shark-skin surface, aimed at turbulent boundary-layer drag reduction, have shown improvements of only a few percent. Laminar flow control (LFC) on the other hand provides a total drag reduction potential of e.g. 16% by realizing 40% laminar boundary-layer flow on wings and control surfaces of a current airliner [5]. Therefore it is the most promising candidate for expedient drag reduction.

Maintaining large regions of laminar boundary-layer flow has been proven for decades now in two-dimensional situations by applying boundary-layer suction, which efficiently delays laminar-turbulent transition. A typical airliner wing, however, is swept back (cf. figure 1) to allow for higher cruise speed at acceptable pressure drag. The evolving crossflow component inside the boundary layer causes a new, dominant instability mechanism, and a straightforward implementation of the two-dimensional suction setups is not possible. For a three-dimensional boundary layer it turns out that (typically steady) longitudinal crossflow vortices evolve due to the new primary instability of the flow. While still being laminar these vortices, if grown to large amplitudes, are highly unstable to ubiquitous unsteady background disturbances. Due to the extremely large growth rates of this so-called secondary instability, laminar-turbulent transition sets in rapidly, typically after only a few percent of the airfoil chord length. The crossflow vortices are generated by even minute surface non-uniformity.

Figure 1: Considered integration domain with inviscid flow streamline.

Figure 2: Vortex visualization (snapshots) in the boundary-layer flow on an aircraft wing of a reference case (left) and a case with pinpoint suction (right). To scale. The color scheme shows the wall-normal coordinate. Suction holes are marked by black circles at the wall. The main flow is from bottom left to upper right, and the main crossflow from right to left.

Boundary-layer suction diminishes the crossflow by sucking high-momentum fluid to the wall, and thus also attenuates crossflow instability. However, the typically applied discrete suction holes can, on the other hand, generate relatively large initial crossflow-vortex disturbances, jeopardizing the LFC.

Improved suction concepts for this kind of three-dimensional wing flow therefore had to be developed. Messing & Kloker [4] proposed an idea called distributed flow deformation (DFD), in particular formative suction. By designing a suitable slot-suction panel, useful vortices (with a much closer spanwise spacing than the turbulence-triggering ones) are continuously excited and maintained; these are known to be stable with respect to secondary instability and suppress the harmful vortices. Laminar-turbulent transition could be delayed significantly.

Figure 3: Vortex visualization (snapshots) of three setups to illustrate the pinpoint suction concept. To scale. Red: reference case without suction. Green: vortices generated by suction holes without oncoming vortices. Blue: (non-linear) superposition, i.e. applied pinpoint suction.

A new idea of directly influencing large-amplitude crossflow vortices and secondary instabilities, called pinpoint suction, is currently being developed [1, 2, 3]. The scenario considered contains the harmful, secondarily unstable crossflow vortices that develop naturally. Localized, strong suction through a few holes only at the updraft side of each vortex, i.e. the locally most unstable region with high-shear layers, directly reduces the secondary growth while also reducing the vortex strength.

If the exact position of the vortex is known this suction setup turns out to be very effective and the transition location can be shifted far downstream. Note that this is an inherently non-linear three-dimensional process compared to standard suction with its much lower suction velocity.

Figure 2 shows two snapshots of vortical structures. On the left side a reference case is shown where three steady crossflow vortices can be observed. A pulse-like disturbance is triggered upstream of the shown domain at extremely low amplitudes, eventually undergoing secondary instability. These disturbances grow rapidly and soon finger-like secondary structures can be detected that trigger laminar-turbulent transition. On the right side the development of the same vortices (containing the identical pulse) is shown, now being subject to pinpoint suction (cf. also figure 3). Nine closely spaced holes per vortex are placed at the updraft side of the respective vortex (black holes at the wall) and transition is prevented in the considered domain.

• Tillmann A. Friederich
• Markus J. Kloker

Institut für Aerodynamik und Gasdynamik, University of Stuttgart

References
[1] Bonfigli, G., Kloker, M.: Secondary instability of crossflow vortices: validation of secondary instability theory by DNS, 2007, Journal of Fluid Mechanics, 583, pp. 229-272

[2] Friederich, T., Kloker, M.: Direct Numerical Simulation of Swept-Wing Laminar Flow Control using Pinpoint Suction, in: High Performance Computing in Science and Engineering '10 (eds. Nagel, W.E., Kröner, D.B., Resch, M.M.), Transactions of the HLRS 2010, pp. 231-250, Springer

[3] Friederich, T., Kloker, M.: Numerical Simulation of Crossflow-Transition Control using Pinpoint Suction, 2011, in: New Results in Numerical and Experimental Fluid Mechanics VIII (eds. n.n.), NNFM, reviewed contributions to the 17. STAB/DGLR-Symposium, Nov. 2010, 8 pages, in press, Springer

[4] Messing, R., Kloker, M.: Investigation of suction for laminar flow control of three-dimensional boundary layers, 2010, Journal of Fluid Mechanics, 658, pp. 117-147

[5] Schrauf, G.: Status and perspectives of laminar flow, 2005, The Aeronautical Journal (RAeS), 109, no. 1102, pp. 639-644

Setups with identical overall suction, but through slits or a homogeneously permeable wall employing a smaller maximum suction velocity on a larger area, turn out to be far less effective.

All results were obtained using spatial direct numerical simulations with (incompressible and also compressible) in-house codes of the Institut für Aerodynamik und Gasdynamik at the University of Stuttgart. Our largest set-ups of up to 10^9 grid points require 0.4 TB RAM on NEC SX-8 and SX-9 vector computers operating at over 1.1 TFlop/s. Scenarios with more complex domains, e.g. containing suction channels, will require larger computational domains and hence more powerful supercomputers in the future.
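The quoted memory requirement gives a feeling for how much data a direct numerical simulation keeps per grid point. The short sketch below divides the two numbers from the text. It assumes decimal units (1 TB = 10^12 bytes) and 8-byte double-precision values; neither assumption is stated in the article, and the actual storage layout of the in-house codes is not described there.

```python
GRID_POINTS = 10**9          # "up to 10^9 grid points"
MEMORY_TB = 0.4              # "0.4 TB RAM"
BYTES_PER_TB = 10**12        # decimal terabyte (assumption)
BYTES_PER_DOUBLE = 8         # double precision (assumption)

# Total memory in bytes, rounded to an integer to avoid float noise.
total_bytes = round(MEMORY_TB * BYTES_PER_TB)

bytes_per_point = total_bytes // GRID_POINTS
doubles_per_point = bytes_per_point // BYTES_PER_DOUBLE

print(f"{bytes_per_point} bytes per grid point")
print(f"about {doubles_per_point} double-precision values per grid point")
```

Under these assumptions each grid point accounts for about 400 bytes, i.e. on the order of 50 double-precision values for flow variables, time-stepping stages and work arrays combined.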

IntroductionToday,helicoptertechnologystillposesseveralunsolvedproblemsinaero-dynamics.Theaccuratenumericalpredictionofcertainfluid-structureinteractionalflowphenomenaofflightdynamicrelevanceisoneexampleforsuchproblemareas.PrototypingofhelicopteraircraftisincrasinglydonebasedonComputationalFluidDynamics(CFD).Helicopteraeromechanicalstud-iescallforamorewholisticapproachinthesimulationofflowfieldsinthattheyrequiretheincorporationofelasticdeformationsofflexiblestructuressuchasthemainrotorbladesintothecom-putationaltoolchain.ThisresultsinaCFD-CSD(ComputationalStructural

Dynamics)coupledsimulation.Addi-tionally,aprocedurefortrimmingtherotortowardssomeprescribedflightdynamicstatehasprovenessential.Onlyinfulfillmentofthesepresuppositionscanreasonablecomparabilityofsimula-tionandexperimentbeguaranteed.

Moststate-of-the-artsimulationen-vironmentstodayuseaaso-calledstructuredgridapproach,i.e.thefluidvolumetobesimulatedissubdividedintocuboidalelementsforwhichtheequationsofmotionforthefluidcanbesolvedinannicelyorderedmanneralongthedirectionsofthree-dimensionalspace.Recently,CFDsolutionmethodsfollowingadifferentapproachtermed

Fluid-Structure coupled Flow Simulations of Helicopter Rotors

Golden Spike Award by the HLRS Steering Committee in 2010


Figure 1: Rotor wake/tail interactions of the tail shake phenomenon

Figure 2: Vertical forces on the rotor disk; left: unstructured, right: structured solver

unstructured have gained popularity. Here, the elements for the discretization of the volume to be simulated can be of quite arbitrary shape, giving up the nice structure of the system of equations in return for increased geometrical flexibility.

As mentioned above, the quantitative numerical investigation of interactional phenomena is one of the to date unsolved issues in helicopter aerodynamics. An example within this field is the so-called tail shake phenomenon [1], where in fast forward flight conditions interactions of the main rotor wake with the tail boom and fin structure excite a lateral bending close to the frequencies corresponding to the lower elastic modes of the fuselage structure. Figure 1 displays this scenario (figure taken from [1]). This phenomenon exhibits an


Figure 3: Vortex system (grey shades and colorful vertical planes) of the flow field in slow forward flight. Colors on the rotor blades show the pressure distribution (blue: low, red: high pressure)

undesirable random character of unsteady nature which can be felt by the flight crew as lateral “kicks” [1]. To date, CFD methods are incapable of predicting and explaining the character of tail shake before early prototype flight testing, and as a consequence many well-known helicopter types such as the Eurocopter EC135 [2] or the Boeing AH-64D™ Longbow Apache™ [3] showed this effect in early flight testing.

Simulation

The CFD simulation of such phenomena requires very detailed modelling of the structure of the flow field, and thus high geometrical detail is desired, especially in the hub area of the main rotor. Here, the conventional structured grid approach mentioned above suffers from the drawback of excessive manual time consumption and eventually becomes infeasible. Therefore, a new tool chain based on the unstructured grid approach has been developed. First results of the new simulation environment are compared to the already existing standard structured-grid based tool chain [4]. In both cases, a fluid-structure coupling procedure termed the weak coupling approach [5] has been employed. Here, periodic data is exchanged between CFD and CSD. In parallel to this, an algorithm for trimming the rotor ensures that specified rotor forces and moments are met. To this end, control inputs which are usually steered by the pilot, such as the collective and cyclic pitch angles, are adapted in order to eventually meet these loads, also termed trim objectives.
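The trim idea described above (adapt collective and cyclic pitch until prescribed rotor loads are met) can be illustrated with a small Newton-type iteration. This is only a sketch under stated assumptions: `rotor_loads` is a hypothetical algebraic stand-in for a full coupled CFD-CSD evaluation, its coefficients are invented for illustration, and the actual trim algorithm of the tool chain is not described in the text.

```python
def rotor_loads(controls):
    """Hypothetical stand-in for one coupled CFD-CSD evaluation.

    controls = [collective, lateral cyclic, longitudinal cyclic];
    returns [thrust, roll moment, pitch moment] in arbitrary units.
    """
    theta0, theta_c, theta_s = controls
    thrust = 4.0 * theta0 + 0.3 * theta_s + 0.05 * theta0 ** 2
    roll = 0.2 * theta0 + 1.5 * theta_c - 0.1 * theta_s
    pitch = -0.1 * theta0 + 0.2 * theta_c + 1.2 * theta_s
    return [thrust, roll, pitch]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def trim(targets, controls, tol=1e-10, max_iter=20, eps=1e-6):
    """Adapt the control inputs until the loads meet the trim objectives."""
    for _ in range(max_iter):
        loads = rotor_loads(controls)
        residual = [t - l for t, l in zip(targets, loads)]
        if max(abs(r) for r in residual) < tol:
            break
        # Finite-difference Jacobian d(loads)/d(controls)
        jac = [[0.0] * 3 for _ in range(3)]
        for j in range(3):
            bumped = controls[:]
            bumped[j] += eps
            bumped_loads = rotor_loads(bumped)
            for i in range(3):
                jac[i][j] = (bumped_loads[i] - loads[i]) / eps
        delta = solve(jac, residual)
        controls = [c + d for c, d in zip(controls, delta)]
    return controls


targets = [2.0, 0.0, 0.0]  # illustrative trim objectives
trimmed = trim(targets, [0.0, 0.0, 0.0])
trimmed_loads = rotor_loads(trimmed)
print(trimmed, trimmed_loads)
```

In a real weak-coupling loop each call to `rotor_loads` would correspond to an expensive periodic CFD solution, so the number of trim iterations matters far more than in this toy model.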

In the present study, helicopter rotor CFD-CSD calculations were performed making use of the standard structured as well as the new unstructured approach. The basis of the current investigation was a series of wind tunnel experiments on a generic helicopter configuration, which were performed in the open test section of the German-Dutch wind tunnel (DNW). The simulation contained a four-bladed rotor in slow forward flight conditions.

Figure 2 shows the comparison between the two respective tool chains for the vertical forces on the rotor disc plane, where the flight direction is from right to left. Both approaches yield a very similar distribution of the forces on the advancing (0..180°) as well as on the retreating side (180..360°). CFD methods also allow for a detailed analysis of the fluid volume. In Figure 3 a vortex visualization of the complete


• Felix Bensing • Manuel Keßler

Institut für Aerodynamik und Gasdynamik, University of Stuttgart

flow field past the rotor is displayed. The vortices shed at the tip and the inner root section of the blades as well as the large rolled-up vortices to the left and right of the entire rotor propagating backwards are clearly visible. The rotor is rotating in a clockwise sense, showing distinct suction areas of low pressure at the outer front parts of the blades (blue color on the blades). Two vertical slices through the vortex system indicate the vortex core locations (blue color).

rescalability of the computational setup to different numbers of computing units, the new unstructured approach allows for easy repartitioning and customization to almost arbitrary numbers of computing processes. Additionally, the new tool chain showed great potential in scalability to several hundreds of computing processes in comparably small setups. Performance of the code amounted to 11 GFlops, which translates into 12% of the node-wide peak performance, making use of the NEC Nehalem's Intel Xeon X5560 processor. This flexibility will allow for the computation of extremely challenging simulations in the field of helicopter aerodynamics, both in terms of problem size and geometrical complexity, once even larger computational resources become accessible.
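The quoted 12% can be cross-checked against a theoretical node peak. The hardware figures below are assumptions that do not appear in the text: two quad-core Xeon X5560 sockets per node, a 2.8 GHz clock and 4 double-precision floating-point operations per core and cycle.

```python
# Assumed hardware figures for one cluster node (not stated in the text).
SOCKETS_PER_NODE = 2
CORES_PER_SOCKET = 4
CLOCK_GHZ = 2.8
FLOPS_PER_CYCLE = 4  # double precision per core, assumed

node_peak_gflops = (SOCKETS_PER_NODE * CORES_PER_SOCKET
                    * CLOCK_GHZ * FLOPS_PER_CYCLE)
measured_gflops = 11.0  # figure quoted in the text
fraction_of_peak = measured_gflops / node_peak_gflops

print(f"node peak: {node_peak_gflops:.1f} GFlop/s")
print(f"fraction of peak: {100 * fraction_of_peak:.1f} %")
```

Under these assumptions the ratio comes out close to the 12% stated above, which suggests that the 11 GFlops figure refers to one full node rather than a single core.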

References

[1] de Waard, P.G., Trouvé, M. Tail shake vibration, 1999, NLR-TP-99505, NLR, The Netherlands

[2] Kampa, K., Enenkl, B., Polz, G., Roth, G. Aeromechanic Aspects in the Design of the EC135, 1999, Proceedings of the 23rd European Rotorcraft Forum, Dresden, Germany

[3] Hassan, A.A., Thompson, T., Duque, E.P.N., Melton, J. Resolution of Tail Buffet Phenomenon for AH-64D™ Longbow Apache™, 1997, Proceedings of the 53rd Annual Forum of the American Helicopter Society, Virginia, USA

[4] Kroll, N., Eisfeld, B., Bleecke, H.M. FLOWer, in: Notes on Numerical Fluid Mechanics, Vieweg, Braunschweig, 1999, Vol. 71, pp. 58-68

[5] Altmikus, A., Wagner, S., Beaumier, P., Servera, G. A Comparison: Weak versus Strong Modular Coupling for Trimmed Aeroelastic Rotor Simulations, American Helicopter Society, 58th Annual Forum, Montreal, Canada, 2004

Performance

The present calculations were performed on the NEC Nehalem cluster at the High Performance Computing Centre (HLRS) in Stuttgart. While the standard structured tool chain poses limitations on the


Aircraft design has always been a struggle for the most “efficient” form or method. The most efficient has been defined either by increased peak performance or, in more recent times, by reduced operational cost. In engineering terms, reduced operational cost equals reduced or constant drag with constant or increased lift. Today's commercial airliners commonly encounter so-called flow separation on the wing flaps during landing, which has led to the employment of rather complex and heavy flap mechanics to compensate for the reduced lift. Therefore, the suppression of such flow separation constitutes a vital step towards more efficient wing and flap design. Flow separation is caused by positive pressure gradients along the streamline, which act as an obstacle to the oncoming fluid. In order to enable the fluid to overcome larger adverse pressure gradients it has to be energized (see Figure 1). One method to do so is to move high-energy fluid out of the undisturbed free stream towards the surface. The necessary vortical motion is induced by a large longitudinal eddy created with so-called Vortex Generators (VG). Already in use are passive vortex generators, i.e. small sheets attached to the wing surface. These VGs are indeed well able to generate the aforementioned eddies, but they also have disadvantages, because passive VGs are optimized for a specific point of operation and induce parasitic drag at all other flight attitudes. In 1992, experimental work by Johnston et al. [1] showed the general ability to suppress flow separation using so-called Jet Vortex Generators (JVG). The major advantage over solid VGs stems from the fact that JVGs are actively controllable and thus avoid any additional drag outside of the point of operation. These promising findings led to intensive research on JVGs and culminated in exhaustive experimental parameter studies covering many aspects such as velocity ratio, radius and blowing angle.

Although the outcomes of these experiments yield a very good general idea of the mechanisms of active flow control devices, there are still a number of open questions, as no detailed picture of the formation of the vortex and its interaction with the boundary layer could be gained from experiments yet. Therefore, any design suggestions rely heavily on empirical data and are difficult to transpose to different configurations. During industrial design processes, parameter studies are thus often undertaken numerically by

Figure 1: Scheme of Jet Vortex Actuator

Direct Numerical Simulation of Active Separation Control Devices

Golden Spike Award by the HLRS Steering Committee in 2009


The computational domain consists of a rectangular box. Boundary conditions are applied to define an inflow, outflow, free-stream and wall region. The fundamental differential equations are converted into approximate difference equations using finite differences on a structured grid. The finite difference stencils are chosen to be of compact form, i.e. information on both the flow variable and its derivative is taken into account, resulting in a numerical scheme with spectral-like wave resolution in space. The evolution in time is simulated by an explicit Runge-Kutta time integration scheme which maintains the accuracy of the spatial resolution. The resulting linear system of equations can be solved very efficiently on vector CPU supercomputers like the NEC SX-9 at HLRS Stuttgart because of its strongly structured form. Furthermore, the domain can be split into boxes of equal dimensions which are then assigned to one computational

Figure 2: Instantaneous streamwise velocity contours and isosurfaces of vorticity visualized by the λ2 criterion (β = 150°)

means of Reynolds Averaged Navier-Stokes (RANS) methods. The crux of RANS lies in the fact that the model assumptions involved have to be adapted to every new configuration, and in the fact that the underlying equations are inherently steady-state and thus do not really allow for the simulation of transient processes. Within this context, the presented work covers simulations of a generic JVG configuration by means of highly accurate Direct Numerical Simulation (DNS) methods. The DNS approach is used because it does not rely on any model assumptions. Therefore, it is well suited to provide a reference solution for coarser or “more approximate” numerical schemes. Furthermore, DNS allows for the computation of the unsteady flow formation, especially at the beginning of the vortex generation, and for a detailed analysis of the fluid dynamics involved.

Navier-Stokes 3D

The numerical solver Navier-Stokes 3D (NS3D) has been developed at the Institut für Aero- und Gasdynamik (IAG) at Stuttgart University [2]. The program is based on the complete Navier-Stokes equations, i.e. no turbulence or small-scale models are used.


process each. Interprocess communication is realized by use of Message Passing Interface (MPI) routines. Within each MPI process the domain is furthermore parallelized employing NEC Microtasking shared-memory parallelization. All computations described here have been run on one node of the NEC SX-9 within 12 hrs computing time.
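The "spectral-like wave resolution" attributed to compact finite-difference stencils can be illustrated through the modified wavenumber of a scheme, i.e. the wavenumber a scheme effectively "sees" when differentiating a Fourier mode. The sketch below compares the standard second-order central difference with the classical fourth-order compact (Padé) first-derivative scheme. The Padé coefficients are a textbook choice made here for illustration; the text does not state which compact stencil NS3D actually uses.

```python
import math


def modified_wavenumber_central(kh):
    """Second-order central difference: k'h = sin(kh)."""
    return math.sin(kh)


def modified_wavenumber_compact(kh, alpha=0.25, a=1.5):
    """Compact scheme alpha*f'[i-1] + f'[i] + alpha*f'[i+1]
    = a*(f[i+1] - f[i-1]) / (2h); alpha = 1/4, a = 3/2 is the
    classical fourth-order Pade scheme."""
    return a * math.sin(kh) / (1.0 + 2.0 * alpha * math.cos(kh))


# An exact (spectral) derivative would give k'h = kh for every mode.
for kh in (0.25 * math.pi, 0.5 * math.pi, 0.75 * math.pi):
    central = modified_wavenumber_central(kh)
    compact = modified_wavenumber_compact(kh)
    print(f"kh = {kh:.3f}: central {central:.3f}, compact {compact:.3f}")
```

The closer the modified wavenumber stays to the exact value kh, the better short waves are resolved on a given grid; the compact scheme follows the exact line noticeably further than the explicit stencil of the same width.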

Jet Vortex Generator Simulations

Numerical simulations have been done for two Jet-in-Crossflow configurations which differ in the jet exit's orientation only. The configurations were chosen to closely match experiments described in [3] and are representative of separation control devices. The crossflow represents a laminar boundary layer on a flat plate with zero pressure gradient at a free-stream Mach number of Ma = 0.25, which corresponds to the landing speed of commercial airliners. The jet is described by a steady velocity distribution at the wall boundary of the computational domain and resembles the profile of a pipe flow. The jet-to-crossflow velocity ratio is set to R = 3. The jets are inclined by an angle of attack α = 30° and skewed to the downstream direction by angles β = 30° and β = 150°, respectively. One can imagine an oblique jet blown either against the oncoming main flow or in line with it. Altogether 18.4 million grid nodes are used and computations have been carried out for 31 flow-through times. Although this does not suffice to provide data for statistical analysis, it yields a good picture of the evolution of the perturbed flow field. Even though both the jet and the laminar boundary layer are initially in a steady state, the resulting flow regime becomes highly unsteady, with the jet exhibiting unstable modes leading to the formation of large transient vortex structures

Figure 3: Instantaneous downstream velocity and streamlines (β = 30°)


reaching far out of the boundary layer (Figure 2). The jet affects the boundary layer by mainly two mechanisms, namely the blockage of the oncoming fluid and the strong shear at the jet slopes. The blockage leads to flow structures similar to the wakes found behind solid obstacles, i.e. periodic eddy shedding close to the wall and a so-called horseshoe vortex which wraps around the jet. Furthermore, the oncoming flow deflects the jet into the main streamwise direction, which leads to additional flow features unique to jets in crossflow, such as the entrainment of near-wall fluid layers upwards into the jet wake. Both the induced perturbations of the crossflow behind the jet exit and the fluid entrainment evidently depend on the jet parameters. The shear on the slope of the jet, on the other hand, induces a rotational motion which is prolonged and fortified behind the jet exit and leads to the formation of a large longitudinal eddy above the wall. In order to gain insight into the fluid dynamics, snapshots of the flow field are recorded and evaluated. Figures 3 and 4 depict the flow at the last recorded time step. In the case of a skew angle of β = 30°, two longitudinal vortices establish themselves in the flow and wrap around each other due to the induced velocities in the transversal plane. The development of a high-speed streak close to the wall takes place with an inclination to the centerline of the flow. The two longitudinal vortices do not seem to be subject to instabilities at that point in time. They stretch, rather well defined, downstream and upwards along the plate. The oncoming free stream deflects the jet in both the spanwise and wall-normal directions. Thus a strong shear layer develops on the top side of the jet. The sweep of the jet over the boundary layer results in the near-wall fluid layers being entrained upwards into the trajectory of the jet. The entrainment region does not extend very far downstream, though, and the exchange of high- and low-speed layers inside the boundary layer is not very strong. A different flow can be observed when the jet is blown against the main flow. Firstly, the blockage is a lot stronger than in the previous case. Secondly, the wake exhibits instabilities which lead to a mixing of the fluid layers in all directions. Again the jet wake gets deflected, but additionally the entrainment region extends further downstream and the wake rolls into a longitudinal vortex.

Figure 4: Instantaneous downstream velocity and streamlines (β = 150°)

Boundary Layer Control

As already mentioned, Jet-in-Crossflow configurations are investigated as a means to suppress boundary layer separation. This is to be achieved by moving faster fluid layers closer to the wall and thereby increasing the wall friction. Figure 5 depicts a comparison of the time-averaged wall shear stress distribution for the simulated cases. Both configurations lead to a net increase of wall shear stress in a confined stripe behind the jet. This area overlaps the trajectory of the jet only marginally, though, in the case of a jet angle of β = 30°. The additional

Figure 5: Comparison of mean wall shear distribution in actuated flow (top: β = 150°, bottom: β = 30°)


• Björn Selent • Ulrich Rist

Institute of Aero- and Gasdynamics, University of Stuttgart

energy is fed to the boundary layer in an overly straightforward fashion by adding momentum. In order to exploit such a mechanism with a reasonable input-to-output ratio, a tangential jet seems more advisable. On the other hand, when the jet is directed against the main flow, the region of increased shear extends promisingly in both the downstream and spanwise directions. The reason is the different fluid dynamics at work. Firstly, the jet acts as a turbulator, which leads to fuller velocity profiles compared to the unperturbed flow; furthermore, the jet-crossflow interaction leads to the formation of a longitudinal vortex which transports fast fluid layers closer to the wall. Therefore, this configuration seems feasible for active flow control devices even in already fully turbulent shear flows as they are found on aircraft airfoils and flaps.

Conclusion

Simulations of two different Jet-in-Crossflow configurations have been carried out in order to investigate the effect on a boundary layer at Ma = 0.25. The jets have been skewed and inclined by values typically found in active flow control device setups. The simulations gave insight into the generation of a jet-vortex system and wake perturbations. In the case of a jet aligned with the mean flow, a stable flow established itself in which the boundary layer was only poorly energized by the jet. A skew of the jet against the main flow led to the generation of a longitudinal vortex, which in turn led to significantly increased wall friction. Subsequent simulations may include further variation of the pitch and skew angles of the jet as well as velocity ratio variation. Also of interest is a change of initial conditions towards a fully developed turbulent boundary layer.

Acknowledgements

We thank the High Performance Computing Center Stuttgart (HLRS) for provision of supercomputing time and technical support within the project “LAMTUR”.

References

[1] Johnston, J.P., Nishi, M. “Vortex Generator Jets – a Means for Flow Separation Control”, AIAA Journal, 28(6), pp. 989–994, 1990

[2] Babucke, A., Kloker, M., Rist, U. “Direct Numerical Simulation of a Serrated Nozzle End for Jet-noise Reduction”, in M. Resch (Eds.), High Performance Computing in Science and Engineering 2007, Springer, 2007

[3] Casper, M., Kähler, C.J., Radespiel, R. “Fundamentals of Boundary Layer Control with Vortex Generator Jet Arrays”, in 4th Flow Control Conference, AIAA Paper 2008-3995, 2008


Figure 1: Hybrid block-unstructured mesh for the sphere calculation

This article describes two calculations performed for the numerical simulation of a scram-jet intake at a high Mach number of M = 8 on the HLRB II at LRZ. This project is part of the DFG Graduiertenkolleg GRK 1095/2 (conception of a scram-jet demonstrator) and its partners at TU München, RWTH Aachen, Universität Stuttgart and the DLR. In order to pave the way for the simulation of the very complex intake flow field, preparatory calculations were performed to test certain code features and to provide insight into large parallel computations. These calculations are described in this article.

Numerical Code

Since all presented numerical problems contain strongly time-dependent, unsteady phenomena, we are using a special unstructured explicit discontinuous Galerkin code called HALO (Highly Adaptive Local Operator). This code is of high order of accuracy in both space and time and can handle unstructured hybrid grids with hanging nodes, consisting of tetrahedra, hexahedra, prisms and pyramids in 3D and even polygons in 2D. It is almost fully parallelized and requires only a minimum of inter-processor communication due to its explicit character. A main feature of this code is its time discretization method [1, 2, 3], which allows a high-order

High Order large Scale Calculations


Figure 2: Instantaneous λ2 isosurface of the sphere

time-consistent local time stepping mechanism, wherein each grid cell can advance with its own maximum possible time step. Various equation systems are implemented, such as the Euler or Navier-Stokes equations as well as the viscous magnetohydrodynamic (MHD) equations. All presented calculations employ the Navier-Stokes equations. To suppress oscillations at discontinuities such as shocks, artificial viscosity is added to smear the shock profile, resulting in stable computations.

Scale-up Efficiency

To test the scaling capabilities of the HALO code, we set up an example for which a perfect load balance was achievable. We used the so-called manufactured solution technique for the 3D compressible unsteady Navier-Stokes equations: inserting an analytical function into the Navier-Stokes equations leads to a right-hand side that is prescribed as a source term in the numerical code.
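The manufactured solution technique can be illustrated on a much simpler model problem than the Navier-Stokes equations. The sketch below uses the 1D advection-diffusion equation u_t + A u_x = NU u_xx + S as a stand-in; the chosen solution u = sin(x - t) and the coefficient values are assumptions made for illustration. Inserting u into the equation yields the source term S analytically, and a finite-difference check confirms that u then satisfies the equation with this source.

```python
import math

A = 2.0    # advection speed (illustrative value)
NU = 0.1   # diffusion coefficient (illustrative value)


def u_exact(x, t):
    """Manufactured solution chosen freely by the user."""
    return math.sin(x - t)


def source(x, t):
    """S = u_t + A*u_x - NU*u_xx, evaluated analytically for u = sin(x - t)."""
    return (A - 1.0) * math.cos(x - t) + NU * math.sin(x - t)


def pde_residual(x, t, h=1e-3):
    """Residual u_t + A*u_x - NU*u_xx - S, using central differences of u_exact."""
    u_t = (u_exact(x, t + h) - u_exact(x, t - h)) / (2 * h)
    u_x = (u_exact(x + h, t) - u_exact(x - h, t)) / (2 * h)
    u_xx = (u_exact(x + h, t) - 2 * u_exact(x, t) + u_exact(x - h, t)) / h ** 2
    return u_t + A * u_x - NU * u_xx - source(x, t)


for x, t in ((0.3, 0.0), (1.0, 0.5), (2.5, 1.2)):
    print(f"x = {x}, t = {t}: residual = {pde_residual(x, t):.2e}")
```

Because the exact solution is known everywhere, the numerical error of a solver run with this source term can be measured directly, and every processor performs the same amount of work, which is what makes the approach attractive for scaling tests.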

The problem was set up with periodic boundaries so that the boundary communication does not differ from the inter-processor communication. The size of the computational problem was increased in proportion to the number of processors used for the calculation. This way, we kept a constant load in computation as well as in communication.

Table 1 shows the good scale-up efficiency of the HALO code for up to 4,080 processors with a constant load per processor.

Nb. of procs      1    1,000    2,197    4,080
Efficiency (%)    -     99.1     97.8     98.8

Table 1: Scale-up efficiency of the HALO code

The efficiency on N processors is defined as the calculation time on one processor divided by the time needed for the calculation on N processors. Since the load per processor is kept constant, an ideally scaling code would reach 100%.
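The efficiency definition used for Table 1 can be written down in a few lines. The wall-clock timings in this sketch are hypothetical values chosen only to reproduce an efficiency of the same order as in the table; the text does not report the measured run times.

```python
def scale_up_efficiency(time_one_proc, time_n_procs):
    """Scale-up (weak scaling) efficiency in percent: the problem size grows
    with the processor count, so ideal scaling keeps the run time constant."""
    return 100.0 * time_one_proc / time_n_procs


# Hypothetical wall-clock times in seconds (not taken from the text).
t_one = 100.0
t_n = 100.9

efficiency = scale_up_efficiency(t_one, t_n)
print(f"scale-up efficiency: {efficiency:.1f} %")
```

Read the other way round, an efficiency of 98.8% on 4,080 processors means that the run takes only about 1.2% longer than the single-processor run with the same load per processor.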

Performed Calculations

In order to be able to perform a large calculation of the scram-jet with features like hp-adaption, shock capturing and VMS, preparatory calculations were set up to test certain features and prepare the code for efficient computation on a large number of processors. These calculations tackle physical problems and are described next.

3D Flow around a Sphere

The laminar time-periodic flow around a sphere was set up to test the code's ability to handle unstructured hybrid grids and a p-adaption mechanism in three space dimensions. We solved the unsteady compressible Navier-Stokes equations with a free-stream Mach number of Ma = 0.3 and a Reynolds number of Re = 300. The problem was discretized with a block-unstructured grid consisting of prisms for the boundary layer, and tetrahedra and hexahedra elsewhere. Figure 1 shows the different grid blocks and dimensions of the computational domain. P-adaption was arranged so that each grid cell was allowed to adapt its polynomial degree between 1 and 5 every 500 time steps.

Figure 3: Distribution of the local polynomial degree at end time of the sphere calculation

To demonstrate the calculation results, Figure 2 shows a 3D view of the instantaneous vortex measure λ2 for this test case. The color levels indicate the velocity magnitude. Here, one can easily see that the very large cells at the end of the wake cannot provide the necessary resolution and are therefore producing large inter-cell jumps of the solution.

Finally, Figure 3 shows the distribution of the local polynomial degree p at the end time t_end = 1,000.

3D Free-stream Injector

This calculation targets the shock capturing and p-adaption capabilities of the scheme, as we simulate a Ma = 1.4, Re = 30,000 injection.

The injection nozzle is designed according to specifications for gas injection devices used in the automotive industry. This problem also contains complex curved geometries that are challenging for high-order schemes. The objective of this calculation is the aeroacoustic simulation of the unsteady injection process, including the start-up of the process. Results can be validated against calculations performed with other codes, both at our institute and in industry. Preliminary 2D calculations already provided insight into the necessary grid resolutions and shock capturing strategies. In our case, the problem was calculated on a grid with over 16 million DG degrees of freedom


Figure 4: Density distribution and velocity streamlines of the 3D free-stream injector

• Christoph Altmann • Gregor Gassner • Marc Staudenmaier • Claus-Dieter Munz

Institute of Aero- and Gasdynamics, University of Stuttgart

(4 million hexahedral elements) on 500 to 1,000 processors. Figure 4 shows a 2D slice plane of the calculation (density distribution and velocity streamlines) together with an isosurface plot of the density that highlights the development of the flow.
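The relation between the element count and the number of DG degrees of freedom can be made explicit. The sketch assumes a full-order polynomial basis of degree p in three dimensions, which has (p+1)(p+2)(p+3)/6 basis functions per element and per solution variable; whether HALO counts its degrees of freedom exactly this way is not stated in the text.

```python
import math


def dofs_per_element(p, dim=3):
    """Number of basis functions of a full polynomial space of degree p
    in `dim` space dimensions: binomial(p + dim, dim)."""
    return math.comb(p + dim, dim)


elements = 4_000_000  # element count quoted in the text

for p in range(1, 6):
    total = elements * dofs_per_element(p)
    print(f"p = {p}: {dofs_per_element(p):2d} DOF per element, "
          f"{total / 1e6:.0f} million in total")
```

Under this assumption, 4 million elements at p = 1 give 16 million degrees of freedom, which is consistent with the figures quoted above as a lower bound, while cells adapted to higher degrees add correspondingly more.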

To illustrate the flow development, several streamlines have been added. Please note that the picture presents an early phase of the simulation. The geometry itself is rather complex, consisting of four kidney-shaped injection “nozzles” within the cylindrical injector, and is assembled from unstructured hexahedra, allowing hanging nodes and polygons at connection surfaces.

Code Performance

The HLRB II provides an easy job performance summary directly on the command line. This tool can provide a first insight into the performance of the calculations and help to determine the computational efficiency of the numerical code. We thereby discovered strong performance differences, depending on the test case. In particular, the scale-up tests as well as the sphere calculation were able to perform at up to 700 MFlop/s per processor. The latter performance was observed for the 4,080-processor scale-up calculation, resulting in a total of about 3 TFlop/s. It was found that certain settings of p-adaptivity and mesh structures do have a significant influence on the code performance.
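The aggregate performance follows directly from the per-processor rate and the processor count. The short check below uses only the two figures quoted in the text and assumes that every processor sustains the same rate.

```python
MFLOPS_PER_PROCESSOR = 700.0  # sustained rate quoted in the text
PROCESSORS = 4080

total_tflops = MFLOPS_PER_PROCESSOR * PROCESSORS / 1e6
print(f"aggregate performance: {total_tflops:.2f} TFlop/s")
```

The product is slightly below 3 TFlop/s, in line with the rounded total given above.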

Outlook

The presented calculations not only showed the potential of the HALO code but also gave insight into the physics they were addressing and helped to improve the code. The code should now be ready to target the full intake simulation.

References

[1] Lörcher, F., Gassner, G. and Munz, C.-D. A discontinuous Galerkin scheme based on a space-time expansion. I. Inviscid compressible flow in one space dimension. In: Journal of Scientific Computing, Vol. 23, No. 2, pp. 175-199, 2007 DOI=http://dx.doi.org/10.1007/s10915-007-9128-x

[2] Lörcher, F., Gassner, G. and Munz, C.-D. A discontinuous Galerkin scheme based on a space-time expansion. II. Viscous flow equations in multi dimensions. In: Journal of Scientific Computing, Vol. 34, No. 3, pp. 260-268, 2008 DOI=http://dx.doi.org/10.1007/s10915-007-9169-1

[3] Lörcher, F., Gassner, G. and Munz, C.-D. A Contribution to the Construction of Diffusion Fluxes for Finite Volume and Discontinuous Galerkin Schemes. In: Journal of Computational Physics, Vol. 224, No. 2, pp. 1049-1063, 2007 DOI=http://dx.doi.org/10.1016/j.jcp.2006.11.004


Figure 1: The honeycomb lattice with its unit cell, displayed by the dashed red line. The unit cell contains two atoms, each one belonging to a different sublattice. Therefore, the honeycomb lattice has a bipartite structure, where the atoms of a given sublattice are surrounded by sites of the other one.

In the recent past, condensed matter systems have displayed a number of exotic states, like unconventional superconductivity in high-temperature superconductors, supersolidity in 4He, and spin liquids in frustrated magnets, that emerge due to correlations in many-body systems [1]. Of particular interest are spin liquids, where quantum fluctuations preclude ordering from the liquid state to a solid, i.e. to a periodic ordering of the magnetic moments, even at zero absolute temperature. Generally it is expected that such strong quantum fluctuations arise due to competing interactions that frustrate the formation of an ordered state [1].

A system where quantum fluctuations may play a dominant role is graphene, which was recently obtained experimentally by exfoliation of graphite [2]. By means of such a micromechanical cleavage, single layers of carbon atoms with a honeycomb structure, schematically displayed in Figure 1, are produced that, remarkably, subsist as free-standing two-dimensional crystals. The honeycomb lattice is bipartite, i.e. it consists of two sublattices, where the nearest neighbours of the sites of one of them are sites belonging to the other sublattice. Hence, geometric frustration is absent in such a lattice, since e.g. antiferromagnetic order is possible by placing magnetic moments pointing in one direction on one sublattice and pointing in the opposite direction on the other sublattice. However, because the honeycomb lattice has the smallest coordination number in two dimensions, the effect of quantum fluctuations is the strongest.

A further remarkable feature of the honeycomb lattice appears when electrons are placed on it. Assuming for the moment non-interacting electrons, the corresponding band structure is as shown in Figure 2. There, it can be seen first of all that the electron (particle) states and the hole (antiparticle) states are symmetrically placed around the zero of energy, such that particle-hole (charge conjugation) symmetry is present. The zero of energy corresponds to the Fermi energy when the average density of electrons is unity per lattice site, i.e. for a half-filled band. Moreover, the low-energy states around particular points in the two-dimensional Brillouin zone display a relativistic dispersion and can be readily described by Dirac's equation [2]. We will therefore refer to those points as Dirac points in the following. The relativistic dispersion in the low-energy sector at half-filling leads to a vanishing density of states

Exotic State in Correlated Relativistic Electrons


Figure 2: Energy dispersion of the electronic states for non-interacting electrons on a honeycomb lattice. Close to the zero of energy the dispersion corresponds to the one characteristic of relativistic fermions, as displayed by the zoomed-in portion.

at the Fermi energy. Therefore, a finite interaction strength is necessary to promote spontaneous symmetry breaking, which enhances the role of fluctuations.
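The band structure of non-interacting electrons on the honeycomb lattice can be evaluated with the standard nearest-neighbour tight-binding expression E(k) = ±t |1 + exp(i k·a1) + exp(i k·a2)|. The sketch below is an illustration under our own conventions (nearest-neighbour distance set to 1, hopping t = 1, a particular choice of lattice vectors and Dirac point); none of these conventions are given in the text.

```python
import cmath
import math

T_HOP = 1.0  # hopping amplitude t, used as the energy unit (assumption)

# Lattice vectors for a nearest-neighbour distance of 1 (assumed convention).
A1 = (1.5, math.sqrt(3) / 2)
A2 = (1.5, -math.sqrt(3) / 2)

# One of the two inequivalent Dirac points in this convention.
K_POINT = (2 * math.pi / 3, 2 * math.pi / (3 * math.sqrt(3)))


def upper_band(kx, ky, t=T_HOP):
    """Upper band E_+(k); the lower band is -E_+(k) by particle-hole symmetry."""
    f = (1
         + cmath.exp(1j * (kx * A1[0] + ky * A1[1]))
         + cmath.exp(1j * (kx * A2[0] + ky * A2[1])))
    return t * abs(f)


print("E at Gamma:", upper_band(0.0, 0.0))
print("E at K:    ", upper_band(*K_POINT))

# Close to K the energy grows linearly with the distance q from K.
q = 1e-3
slope = upper_band(K_POINT[0] + q, K_POINT[1]) / q
print("slope near K:", slope)
```

The vanishing energy at K together with the linear growth around it is the relativistic (Dirac-like) dispersion referred to above; in these units the slope equals 3t/2.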

In order to study the effects of correlations in electrons on the honeycomb lattice in its most basic form, we consider the Hubbard model, where only an on-site interaction, termed U, is present. Such a model is a paradigm for strongly correlated electrons, as in the case of high-temperature superconductors [3], and gives an accurate description of ultra-cold fermionic atoms in optical lattices [4, 5]. For large values of the repulsive interaction U, and at half-filling, the ground state corresponds to a Mott insulator, i.e. an insulating state due to interactions, in contrast to the metallic state of the noninteracting system. Furthermore, in this limit antiferromagnetic correlations dominate due to Pauli's exclusion principle and the necessity of gaining kinetic energy. Therefore, on a bipartite lattice, they will lead to an antiferromagnetically ordered ground state. However, as the interaction strength diminishes, a competition between the tendency to order and quantum fluctuations will set in, so that a detailed analysis of correlations is needed to characterize the possible phases.
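For reference, the Hubbard model discussed above is usually written in the following textbook form. The text itself does not spell out the Hamiltonian, so the notation here is an assumption: t is the nearest-neighbour hopping amplitude, U the on-site repulsion, and the sum over ⟨i,j⟩ runs over nearest-neighbour bonds of the honeycomb lattice.

```latex
H = -t \sum_{\langle i,j \rangle,\sigma}
      \left( c^{\dagger}_{i\sigma} c^{\phantom{\dagger}}_{j\sigma} + \mathrm{h.c.} \right)
    + U \sum_{i} n_{i\uparrow} n_{i\downarrow},
\qquad
n_{i\sigma} = c^{\dagger}_{i\sigma} c^{\phantom{\dagger}}_{i\sigma}
```

The first term alone gives the non-interacting band structure with Dirac points, while the second term penalizes double occupancy and drives the Mott transition at large U/t.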

Needless to say, an unbiased study as delineated above is only possible by numerical means. Among the different methods, Quantum Monte Carlo (QMC) simulations are the most appropriate ones, since a careful extrapolation to the thermodynamic limit, in this case in two dimensions, is mandatory to determine whether a spontaneous symmetry breaking has taken place. We implemented a projective (temperature T = 0) determinantal QMC algorithm in the canonical ensemble that is free of the sign problem at half-filling [6]. This algorithm allows the calculation of the expectation value of any physical observable in the ground state by performing an imaginary-time evolution of a trial wave function that is required to be nonorthogonal to the ground state. The value Θ reached in the imaginary-time evolution corresponds to a projection parameter [6]. For a spin-singlet trial wave function,


Figure 3: Phase diagram for the Hubbard model on the honeycomb lattice at half-filling. The semimetal (SM) and the antiferromagnetic Mott insulator (AFMI) are separated by a gapped spin liquid (SL) phase in an intermediate coupling regime. Δsp(K) denotes the single-particle gap at one of the Dirac points (K), and Δs the spin gap. ms denotes the staggered magnetization, whose saturation value is 1/2.

we found Θ = 40/t to be sufficient to obtain converged ground-state quantities within statistical uncertainty. In the presented simulations, we used a finite imaginary time step Δτ = 0.05/t. We verified by extrapolating Δτ → 0 that this finite imaginary time step produces no artefacts. The phases described in the following were determined by a finite-size extrapolation to the thermodynamic limit with lattices of N = 2L^2 sites with periodic boundary conditions, and linear sizes L in terms of the unit cell containing two sites, with L ≤ 18. L was taken as a multiple of 3 in order to be able to include the Dirac points in our Brillouin zones, such that the low-energy physics is correctly represented. In terms of a simulation of a classical Ising model, in our case with long-range interactions, the largest system sizes correspond to a lattice with 518,400 sites.
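The size of the equivalent classical Ising lattice can be reproduced from the parameters given above. The sketch assumes one auxiliary Ising field per lattice site and imaginary-time slice, and a number of time slices equal to Θ/Δτ; this reading is our interpretation, chosen because it reproduces the quoted number.

```python
THETA = 40.0   # projection parameter in units of 1/t (from the text)
DTAU = 0.05    # imaginary time step in units of 1/t (from the text)
L_MAX = 18     # largest linear size (from the text)

# Linear sizes compatible with the Dirac points: multiples of 3 up to L_MAX.
linear_sizes = [L for L in range(3, L_MAX + 1, 3)]

spatial_sites = 2 * L_MAX ** 2       # N = 2 L^2 sites
time_slices = round(THETA / DTAU)    # assumed number of imaginary-time slices
ising_sites = spatial_sites * time_slices

print("linear sizes:", linear_sizes)
print("spatial sites:", spatial_sites)
print("time slices:", time_slices)
print("space-time (Ising) sites:", ising_sites)
```

With these assumptions the largest lattice has 648 spatial sites and 800 time slices, whose product matches the figure quoted in the text.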

A first insight into the possible phases of the system is obtained by considering the single-particle excitation gap Δsp(k) that we extracted from the imaginary-time displaced Green function (see Ref. [7] for details). Δsp(k) gives the minimal energy necessary to extract one fermion from the system, and corresponds to the gap that can be observed in photoemission experiments. As shown in Figure 3, Δsp(K) = 0 for U < Uc ≈ 3.6t, where t is the hopping amplitude in the Hubbard model. The vanishing gap corresponds to a metal that is commonly called a semimetal (SM), because the Fermi surface is in this case reduced to a point. Beyond Uc, the system enters an insulating phase due to interactions, and hence, as expected for large values of U, the system becomes a Mott insulator. The values of the gap are obtained via an extrapolation of the QMC data to the thermodynamic limit, with energies given in units of t [7].

As explained above, for large enough values of U, one expects long-range antiferromagnetic (AF) correlations. We therefore measured the AF spin structure factor S_AF, which reveals long-range AF order if lim_{N→∞} S_AF/N > 0. The results of a finite-size extrapolation are also presented in the phase diagram of Figure 3. AF order appears beyond U/t ≈ 4.3. Hence, contrary to the usual expectation for a bipartite lattice, AF long-range order sets in later than the insulating phase, leaving an extended window 3.6 ≲ U/t ≲ 4.3 within which the system is neither a semimetal nor an AF Mott insulator.


Figure 4: Spin gap in units of t as a function of U/t for various system sizes. The lowest curve corresponds to the extrapolation to the thermodynamic limit (TDL).

Further details on the nature of this intermediate region are obtained by examining the spin excitation gap, extracted from the long-time behaviour of the imaginary-time displaced spin-spin correlation function [7]. We consider first the spin gap Δs in the staggered sector at k = 0, which vanishes inside the AF phase due to the emergence of two Goldstone modes, as well as in the gapless metallic phase. Figure 4 shows finite-size estimates of Δs for different values of U/t, along with an extrapolation to the thermodynamic limit. A finite value of Δs persists within an intermediate parameter regime 3.5 ≲ U/t ≲ 4.3, while it vanishes both within the metallic and the AF phase. We also calculated the uniform spin gap Δu by extrapolating the spin gap observed at the smallest finite k-vector on each cluster to the thermodynamic limit. Δu is found to be even larger than Δs inside the intermediate region (e.g. Δu = 0.101(8) at U/t = 4), and vanishes in the metallic and the AF phase [7].

The observation of a finite spin gap rules out gapless phases such as triplet superconductivity as well as quantum spin Hall states. The remaining possibilities can be enumerated by considering the coupling to order parameters that lead to the opening of a mass gap in Dirac fermions, and hence account for the single-particle gap observed in the QMC data: (i) singlet superconductivity, (ii) a quantum Hall state (QHS), (iii) charge density wave (CDW) order, and (iv) a valence bond crystal (VBC). Our QMC results exclude all those states, as discussed below. Thereby, the intermediate phase is genuinely an exotic state of matter, since it cannot be understood at the single-particle level within a mean-field theory with a local order parameter. Furthermore, since no spontaneous symmetry breaking is observed while a spin gap is present, it corresponds to a spin liquid state.

In order to assess whether superconductivity arises in the vicinity of the Mott transition, we used the method of flux quantization, which probes the superfluid density and is hence independent of the specific symmetry of the pair wave function [7]. Let Φ be a magnetic flux traversing the centre of a torus on which the electronic system lies and E0(Φ/Φ0) the total ground-state energy, Φ0 being the flux quantum. A superconducting state of Cooper pairs is present if, in the thermodynamic limit, the macroscopic energy difference E0(Φ/Φ0) − E0(Φ/Φ0 = 1/2) is a function with period 1/2. In contrast, a metallic (insulating) phase is characterized by an (exponential) vanishing of E0(Φ/Φ0) − E0(Φ/Φ0 = 1/2) as a function of system size. The QMC data are consistent with the vanishing of this quantity in the thermodynamic limit. In addition, we measured singlet superconducting order parameters of (extended) s-, p-, and f-wave symmetry, which all turn out to vanish in the thermodynamic limit [7]. Hence, both flux quantization as well


Figure 5: Real space plot of the spin dimer-dimer correlations. Right side: the dimer-dimer correlation function in the spin channel for an L = 6 system at U/t = 4. Left side: the same correlation for the isolated Hubbard hexagon, also at U/t = 4. The reference bonds are dressed with stripes. Numbers in parentheses indicate the standard error of the last digit.

as a direct measurement of pairing correlations in various symmetry sectors lead to no sign of superconductivity.

Both the CDW and QHS trigger a breaking of the sublattice symmetry and thereby open a mass gap at the mean-field level. A detailed analysis of the charge-charge correlation functions rules out a CDW. Furthermore, we have found no signature for the presence of (spin) currents in the ground state. This rules out the breaking of sublattice and time-reversal symmetries as required for the QHS [7].

To examine the occurrence of a VBC, we probe for dimer-dimer correlations between a dimer formed by nearest-neighbour sites <ij> and a distant bond formed by sites <kl> [7]. We have found no VBC, neither in the charge nor in the spin sector. The left side of Figure 5 shows the results of this measurement in the spin sector, i.e. the correlation between singlet dimers at U/t = 4.0. The striped bond is the one with respect to which correlations were determined. They are found to be short-ranged, and consistent with the dominance of a resonating valence bond (RVB) state within the hexagons of the honeycomb lattice. This can be seen by comparing the singlet correlations with those of an isolated hexagon (right side of Figure 5), the classical example of the resonance phenomenon in conjugated π-electrons [8]. Accordingly, we find no long-ranged order from the dimer-dimer structure factors in Fourier space. Our results thus reveal a genuinely exotic state of matter, where no spontaneous symmetry-breaking is observed, while a spin gap is present. It corresponds to a spin liquid RVB state in the intermediate coupling regime in the vicinity of the Mott transition.

The presence of a spin liquid in the Hubbard model on the honeycomb lattice close to an antiferromagnetic Mott insulator was hitherto unexpected, due to the bipartite nature of the honeycomb lattice and hence the absence of frustration. However, our results indicate that sufficiently strong fluctuations, which develop close to the quantum critical point where AF order sets in, lead to such an exotic state of matter. It could be expected that such fluctuations would promote some broken-symmetry states like superconductivity. However, the vanishing density of states at the Fermi energy may be responsible for its absence,


• Zi Yang Meng 1
• Thomas C. Lang 2
• Stefan Wessel 1
• Fakher F. Assaad 2
• Alejandro Muramatsu 1

1 Institut für Theoretische Physik III, Universität Stuttgart, Germany

2 Institut für Theoretische Physik und Astrophysik, Universität Würzburg, Germany

since in this case, a finite coupling strength is needed, at least in the BCS framework.

Having an unexpected realization of a short-range RVB state, it would be highly interesting to explore the consequences of doping, in a spirit rather close to the original scenario proposed by Anderson [3] and Kivelson et al. [9] for the cuprates. In particular, for the fully gapped short-range RVB state, the finite spin gap sets the energy scale of pairing in the superconducting state [9]. In this respect, the value obtained for the spin gap is rather promising. The largest value attained is Δs ∼ 0.025t (Fig. 1), which for t in the range of 1.5 to 2.5 eV (in graphene, t = 2.8 eV [2]) corresponds to a temperature scale ranging from 400 to 700 K.
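
The temperature scale quoted above can be checked by dividing the gap energy by the Boltzmann constant. The following is a minimal sketch of that arithmetic, assuming the simple estimate T = Δs / k_B with Δs = 0.025 t; the function name and the choice of this bare estimate are illustrative assumptions, not part of the original analysis.

```python
# Boltzmann constant in eV/K (rounded CODATA value).
K_B_EV_PER_K = 8.617333e-5


def gap_temperature_kelvin(t_ev, gap_over_t=0.025):
    """Temperature scale T = gap / k_B for a spin gap of gap_over_t * t.

    t_ev is the hopping amplitude in eV; gap_over_t is the spin gap in
    units of t (0.025 is the largest value quoted in the text).
    """
    return gap_over_t * t_ev / K_B_EV_PER_K


t_low = gap_temperature_kelvin(1.5)
t_high = gap_temperature_kelvin(2.5)
print(f"t = 1.5 eV -> {t_low:.0f} K")
print(f"t = 2.5 eV -> {t_high:.0f} K")
```

Under these assumptions the estimate gives roughly 435 K and 725 K, in line with the rounded range of 400 to 700 K stated in the text.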

Although studies of doping are beyond the power of our quantum Monte Carlo approach due to the sign problem, they could open interesting perspectives, e.g. in future experiments with ultra-cold atoms on a honeycomb optical lattice, or with honeycomb lattices based on group IV elements like expanded graphene (to enhance the ratio U/t) or Si, where the nearest-neighbour distance is expected to be approximately 50% larger than in graphene [10], such that correlation effects are enhanced.

Acknowledgments
We thank L. Balents, S. Capponi, A. H. Castro Neto, A. Georges, M. Hermele, A. Läuchli, E. Molinari, Y. Motome, S. Sachdev, K. P. Schmidt and S. Sorella for discussions. We are grateful to S. A. Kivelson for thoroughly reading our manuscript and providing important suggestions. F. F. A. is grateful to the KITP Santa Barbara for hospitality and acknowledges support by the DFG through AS 120/4 and FG 1162. A. M. thanks the Aspen Center for Physics for hospitality and acknowledges partial support by the DFG through SFB/TRR 21. S. W. acknowledges support by the DFG through SFB/TRR 21 and WE 3649. We thank NIC Jülich, HLRS Stuttgart, the BWGrid, and the LRZ München for the allocation of CPU time.

References
[1] Nature Insight: Exotic Matter, Nature 464, 175, 2010

[2] Neto, A. H. C., Guinea, F., Peres, N. M. R., Novoselov, K. S. and Geim, A. K. Rev. Mod. Phys. 81, 109, 2009

[3] Anderson, P. W. Science 235, 1196, 1987

[4] Jördens, R., Strohmaier, N., Günter, K., Moritz, H. and Esslinger, T. Nature 455, 204, 2008

[5] Schneider, U., Hackermüller, L., Will, S., Best, T., Bloch, I., Costi, T. A., Helmes, R. W., Rasch, D. and Rosch, A. Science 322, 1520, 2008

[6] Assaad, F. F. and Evertz, H. G. Computational Many-Particle Physics, Lecture Notes in Physics, p. 739, Springer-Verlag, Berlin, 2008

[7] Meng, Z.-Y., Lang, T., Wessel, S., Assaad, F. F. and Muramatsu, A. Nature 464, 847, 2010

[8] Pauling, L. C. The Nature of the Chemical Bond and the Structure of Molecules and Crystals: an Introduction to Modern Structural Chemistry, Cornell University Press, 20th Edition, Ithaca, New York, USA, 1986

[9] Kivelson, S. A., Rokhsar, D. S. and Sethna, J. P. Phys. Rev. B 35, 8865, 1987

[10] Cahangirov, S., Topsakal, M., Aktürk, E., Sahin, H. and Ciraci, S. Phys. Rev. Lett. 102, 236804, 2009


Figure 1: Surface circulation (snapshot) around South Africa. The Agulhas Current (red band) flows along the east coast of South Africa, retroflecting back into the Indian Ocean. During this process Agulhas rings are cut off and drift into the Atlantic Ocean.

The ocean currents around South Africa are an important element in the global ocean circulation. Under present climate conditions the flow of warm and salty waters from the Indian Ocean into the Atlantic Ocean around the southern tip of Africa, the “Agulhas leakage”, provides the bulk of the upper limb of the thermohaline circulation in the Atlantic Ocean. Parts of this water later feed into the Gulf Stream system of the North Atlantic that is responsible for the mild climatic conditions in Europe. The understanding of the dynamical factors determining the intensity and variability of Agulhas leakage is still incomplete, and so is its behaviour under a changing climate.

In contrast to its large-scale importance, the circulation in the Agulhas region is a dynamical mixture of different time and space scales (Fig. 1): A strong western boundary current, the Agulhas Current, transports the warm and salty water southward in the Indian Ocean. South of Africa it overshoots the continental slope and abruptly turns back into the Indian Ocean, while shedding enormous mesoscale rings several hundred kilometres in diameter that extend over large parts of the water column. These rings transport the heat and salt as pulsating elements into the Atlantic Ocean. The circulation dynamics in the region also includes small-scale upstream perturbations as an important element. Eddies are formed in the Mozambique Channel and east of Madagascar; these drift southwards towards the Agulhas Current, displacing it up to 200 km offshore. The corresponding meanders rapidly progress downstream and trigger the shedding of Agulhas rings and therefore Agulhas leakage.

To examine the role of Agulhas leakage in the global oceanic circulation, an innovative ocean modelling program has been set up that advances new methodologies developed in international cooperation with French and South African colleagues, as part of the European model collaboration DRAKKAR [1].

The Agulhas System as a Key Region of the global oceanic Circulation

Golden Spike Award by the HLRS Steering Committee in 2010


Figure 2: Schematics of AGRIF nesting. Time stepping of the base (left) and nested (right) grids. The green boxes and arrows indicate an interpolation from the base grid onto the outer boundaries of the nest, the red ones an averaging of the outer and surface boundaries of the nest onto the base grid; the mesh indicates an averaging of the whole nest onto its base grid points in the Agulhas region. Grey arrows and numbers indicate the time steps of base (Bn) and nest (Nn) and their respective updates (Bn’, Nn’).

The model hierarchy is based on the “Nucleus for European Modelling of the Ocean” (NEMO, v.2.3) [2], consisting of a coupled ocean/sea-ice model. The ocean component is a finite-difference discretization of a variant of the Navier-Stokes equations (the “primitive equations”), stepping three-dimensional velocities, temperatures and salinities forward in time. A free surface formulation (e.g. by a conjugate gradient solver), a high-order polynomial fit of the density equation and numerous parameterizations of different ocean physics and small-scale processes mean that the complete program combines a wide range of numerical methods, while its modular formulation keeps it flexible in use. It is written in FORTRAN 90 and uses a geographical domain decomposition in the horizontal for MPI parallelization. Traditionally the grid space layout and


Figure 3: Large-scale impact of the Agulhas dynamics. Temperatures and currents at 450 m depth in the high-resolution nest and its embedding in the global model. A similar figure was the basis for the Nature cover page on November 26, 2009 [8].

the use of the vertical axis in form of vectors lead to a good performance on vector systems (up to 33% of the peak performance). With about 37 × 10^6 grid points and the high temporal (5-daily) resolution needed, the output of a typical 50-year experiment is quite large, at more than 5 TB.
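
To put the quoted output volume in perspective, the following back-of-the-envelope sketch estimates the storage for one three-dimensional field. It assumes single-precision (4-byte) values, a 365-day model year and that every one of the 37 × 10^6 grid points is written at each 5-daily output; none of these details are stated in the text, and the function name is purely illustrative.

```python
def output_volume_bytes(n_points, years, output_interval_days,
                        bytes_per_value=4, days_per_year=365):
    """Bytes needed to store one 3D field at every output time."""
    n_snapshots = years * days_per_year // output_interval_days
    return n_points * n_snapshots * bytes_per_value


# Figures taken from the text: 37 million grid points, 50 years, 5-daily output.
per_field = output_volume_bytes(37_000_000, years=50, output_interval_days=5)
fields_in_5tb = 5e12 / per_field

print(f"one 3D field: {per_field / 1e9:.0f} GB")
print(f"5 TB corresponds to about {fields_in_5tb:.1f} such fields")
```

Under these assumptions a single field already occupies about 540 GB, so a total of more than 5 TB would correspond to on the order of nine or ten full three-dimensional fields.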

The Agulhas model is a combination of a coarse-resolution global base model and a high-resolution nest around South Africa (Fig. 1). With a nominal grid size of 1/2° the base configuration (ORCA05) successfully simulates the large-scale wind-driven and thermohaline circulation [3]. It is forced by observed atmospheric conditions during the period 1958-2004. However, for a full representation of the Agulhas dynamics a high spatial resolution with grid scales of less than 10 kilometres is needed; this is achieved here by nesting a 1/10° grid into the base model using AGRIF (“Adaptive Grid Refinement In Fortran”, [4]).

AGRIF recombines the subroutines in the model code via a preprocessing step and provides routines for interpolation and averaging between the two grids (Fig. 2). It allows both models to interact at any given base model time step, where (i) the base updates the boundaries of the nest, and (ii) the nest updates the coarser grid points of the base. Due to the different resolution of physical processes the nested model has to perform 4-5 time steps before the base model is stepped forward in time. This effective and novel “two-way” nesting approach ensures that the system not only simulates the current system around South Africa with great verisimilitude; it also allows unravelling how the explicitly simulated mesoscale variability in the Agulhas dynamics feeds back to the global ocean.
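
The order of operations in such a two-way nesting scheme can be sketched as a simple schedule. The following Python sketch is schematic only: AGRIF itself is a Fortran library, the function and event names are hypothetical, and the exact placement of the base step relative to the nest sub-steps is an assumption based on the description above (base provides boundaries, nest performs several sub-steps, nest is averaged back onto the base).

```python
def two_way_nesting_schedule(n_base_steps, nest_substeps=4):
    """Return the sequence of operations for a base grid with one nest.

    For each base time step: the base is advanced, its solution is
    interpolated onto the nest boundaries, the nest performs
    `nest_substeps` shorter steps, and the nest solution is averaged
    back onto the coarse base grid points (the "two-way" update).
    """
    events = []
    for b in range(1, n_base_steps + 1):
        events.append(f"base: step B{b}")
        events.append(f"base -> nest: interpolate boundaries for B{b}")
        for s in range(1, nest_substeps + 1):
            events.append(f"nest: substep {s}/{nest_substeps}")
        events.append(f"nest -> base: average nest onto base points (B{b}')")
    return events


schedule = two_way_nesting_schedule(n_base_steps=2, nest_substeps=4)
for line in schedule:
    print(line)
```

With 4 to 5 nest sub-steps per base step, as stated in the text, most of the time stepping is spent on the nest; the feedback step at the end of each cycle is what distinguishes two-way from one-way nesting.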

First analyses addressed the importance of mesoscale processes, not only in the representation of the circulation around South Africa [5], but also in the net volume transfer between the Indian and Atlantic Ocean (the Agulhas leakage) [6]. Comparison with the coarse-resolution base model alone confirmed that Agulhas leakage is significantly overestimated at coarse resolution, and therefore in current IPCC-type coupled climate models. The upstream eddies originating from the Mozambique Channel and east of Madagascar, which are simulated explicitly, drift towards the Agulhas Current and do trigger the shedding of Agulhas rings; however, they do not have a significant impact on the variability of Agulhas leakage on timescales of a few years and longer.


• Arne Biastoch

Leibniz-Institut für Meereswissenschaften (IFM-GEOMAR), Kiel

What is the effect of the Agulhas Current system on the large-scale circulation in the Atlantic Ocean? Comparing the circulation in solutions with and without the high-resolution Agulhas nest allowed us to identify an intriguing contribution of the mesoscale Agulhas dynamics to decadal current fluctuations reaching far into the North Atlantic [7]. The dynamical signal originating south of Africa rapidly travels northward by boundary waves. In the tropical and subtropical North Atlantic the Agulhas-induced variability has amplitudes similar to the variability introduced by subpolar deep water formation events, a mechanism that has been known for its climatic impact and that has been extensively studied in the past.

In addition to the decadal fluctuations caused by the Agulhas mesoscale, another climate-relevant process emerges from the Agulhas dynamics. Observations report a progressive poleward migration of the Southern Hemisphere westerly winds during the last two to three decades and link it to anthropogenic forcing. Because of the sparse observational records it has not been possible to determine whether there has been a concomitant response of Agulhas leakage. Results with the nested Agulhas model showed that the transport of Indian Ocean waters into the South Atlantic via the Agulhas leakage has increased during the last decades in response to the change in wind forcing [8]. The increased leakage has contributed to the observed salinification of South Atlantic thermocline waters. Both the model and historic measurements off South America suggest that the additional Indian Ocean waters have begun to invade the North Atlantic, with potential implications for a stabilization of the thermohaline circulation.

The findings highlight the importance of studying the Agulhas regime and its associated interoceanic transport as a prominent key region of the global thermohaline circulation.

References
[1] The DRAKKAR Group. Eddy-Permitting Ocean Circulation Hindcasts of Past Decades. Clivar Exchanges 12, 8-10, 2007

[2] Madec, G. NEMO = the OPA9 ocean engine. Technical report, Note du Pole de modelisation, Institut Pierre Simon Laplace (IPSL), France, 2008

[3] Biastoch, A., Böning, C. W., Getzlaff, J., Molines, J.-M., Madec, G. Causes of interannual-decadal variability in the meridional overturning circulation of the mid-latitude North Atlantic Ocean, J. Climate 21, 6599-6615, 2008

[4] Debreu, L., Vouland, C., Blayo, E. AGRIF: Adaptive grid refinement in Fortran, Computers and Geosciences 34, 8-13, 2008

[5] Biastoch, A., Beal, L., Casal, T. G. D., Lutjeharms, J. R. E. Variability and coherence of the Agulhas Undercurrent in a high-resolution ocean general circulation model, J. Phys. Oceanogr. 39, 2417-2435, 2009

[6] Biastoch, A., Lutjeharms, J. R. E., Böning, C. W., Scheinert, M. Mesoscale perturbations control inter-ocean exchange south of Africa, Geophys. Res. Lett. 35, L20602, 2008

[7] Biastoch, A., Böning, C. W., Lutjeharms, J. R. E. Agulhas leakage dynamics affects decadal variability in Atlantic overturning circulation, Nature 456, 489-492, 2008

[8] Biastoch, A., Böning, C. W., Lutjeharms, J. R. E., Schwarzkopf, F. U. Increase in Agulhas leakage due to poleward shift of the Southern Hemisphere westerlies, Nature 462, 495-498, 2009


In the African Sudanian (10°N-15°N) and Sahelian (15°N-18°N) climate zones, convective systems play a key role in the water cycle, because they contribute about 80% of the annual rainfall. The convective systems are thunderstorm complexes with a horizontal extent of several hundred kilometers. They are part of the West African monsoon (WAM), which also impacts the downstream tropical Atlantic by providing the seedling disturbances for the majority of Atlantic tropical cyclones [1].

The WAM system is characterized by the interaction of the African easterly jet (AEJ), the African easterly waves (AEWs), the Saharan air layer (SAL), as well as by the low-level monsoon flow, the Harmattan, and the mesoscale convective systems (MCSs). The AEJ can be observed between 10-12°N at a height of about 600 hPa and has a typical wind speed of about 12 m s⁻¹. The AEJ develops as a result of the reversed meridional temperature gradient due to the relatively cool and moist monsoon layer and the hot and dry SAL, which is located above the monsoon layer.

The AEWs are synoptic-scale disturbances that propagate westwards across tropical West Africa toward the eastern Atlantic and the eastern Pacific. The AEWs are characterized by propagation speeds of 7-8 m s⁻¹, a period of 2-5 days, and a wavelength of about 2,500-4,000 km. Important parameters for the initiation of convection are the spatial distribution and temporal development of water vapour in the convective boundary layer (CBL). Besides advective processes, water vapour is made available in the atmosphere locally through evapotranspiration from soil and vegetation. Many research findings show that the soil moisture exerts greater influence on the CBL than vegetation.

In the scope of the African Monsoon Multidisciplinary Analyses (AMMA) project we investigate the development of MCSs, the sensitivity of their life cycle to differing surface properties, the role of larger-scale weather systems (AEWs, the Saharan Heat Low) in their evolution, and the development of tropical cyclones out of such systems. In addition, we study the interaction of the SAL with African monsoon weather systems.

To simulate the weather systems over West Africa we use the COSMO (COnsortium for Small scale MOdelling, www.cosmo-model.org) model [2, 3]. COSMO is an operational weather forecast model used by several European weather services, e.g. the German Weather Service (DWD). Additionally, we

Modelling Convection over West Africa

Golden Spike Award by the HLRS Steering Committee in 2009


use COSMO coupled with the aerosol and reactive trace gases module (COSMO-ART) to investigate the interaction of the SAL with WAM systems. COSMO-ART [4, 5] was developed in Karlsruhe and computes the emission and the transport of mineral dust. We used the HP XC4000 at the Steinbuch Centre for Computing (SCC), as the COSMO model requires substantial supercomputer resources. In the following, we focus on two different topics. On the one hand, the life cycle of an MCS over West Africa is modelled with respect to the soil conditions. On the other hand, the focus lies on a comparison between the simulated MCSs over West Africa and over the Eastern Atlantic.

The first part of this study investigates the sensitivity of the life cycle of an MCS to surface conditions [6]. The analysis is based on simulations of a real MCS event on 11 June 2006, which occurred in the pre-onset phase of the monsoon when vegetation cover is low and the impact of soil moisture is assumed to be dominant. Different conditions for soil moisture were applied for initialization of the soil model TERRA-ML. The run based on the COSMO soil type distribution and on original ECMWF fields was denoted MOI.

However, comparison with AMSR-E satellite data showed that the MOI field contained too much soil moisture in the upper surface layer. Therefore, we reduced the volumetric soil moisture content in all layers by 35% compared to the initial conditions of MOI. This resulted in a similar soil moisture content in the uppermost level of TERRA-ML, compared to the soil moisture values of about 12% derived by the AMSR-E satellite and in-situ measurements of about 18% for the uppermost 5 cm taken at Dano (3°W and 11°N) [7] for the region around 11°N, where the MCS was observed. The corresponding simulation was designated as the CTRL experiment. To eliminate the effect of spatial soil moisture variability on the initiation of convection, an additional simulation with a homogeneous (HOM) distribution of soil moisture and soil texture was performed. In this case the volumetric soil moisture was specified as the mean value of the volumetric soil moisture in the CTRL experiment along 11°N from 4.5°W to 4.5°E. To investigate the effect of dry regions on convective systems where the soil moisture structure is less complex than the conditions present in the CTRL run, a dry band of 2 degrees longitudinal extension was inserted into the homogeneous soil moisture field. In this band the volumetric soil moisture was reduced by 35% compared to the homogeneous environment. The corresponding experiment was denoted BAND.
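
The 35% reduction used for the dry band can be checked against the moisture values reported later in the text for the BAND experiment (12.7% in the homogeneous environment, 8.3% inside the band). The following minimal sketch applies the reduction as a simple multiplicative factor; the function name is hypothetical and the multiplicative interpretation is an assumption.

```python
def reduce_soil_moisture(volumetric_percent, reduction_fraction=0.35):
    """Apply a relative reduction to a volumetric soil moisture value (in %)."""
    return volumetric_percent * (1.0 - reduction_fraction)


homogeneous = 12.7                      # % by volume, homogeneous environment
band = reduce_soil_moisture(homogeneous)  # % by volume inside the dry band
print(f"dry band moisture: {band:.1f} %")
```

Reducing 12.7% by 35% gives about 8.3%, which matches the dry band value quoted for the BAND experiment.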


In the CTRL case three separate cells were initiated in the south-eastern part of Burkina Faso. Precipitation of up to 6 mm h⁻¹ was simulated at 17 UTC (Figure 1a). The south-westernmost cell developed in the lee of an area with orographically induced upward motion. Triggering of convection often occurs in this way in West Africa. Two further cells developed in the east, at 1.9°E and 11.9°N and at 2.2°E and 11.6°N. In Figure 1a the precipitation pattern of the CTRL case at 17 UTC is overlaid on the soil moisture distribution in the uppermost layer at 15 UTC. This figure shows that all three cells developed in the transition zone from a wetter to a dryer surface, while the centres of the precipitating cells were positioned over the dryer surface. In comparison to the CTRL case, only two precipitating cells had developed at 17 UTC in the HOM case (Figure 1b).

These two cells were observed at roughly the same locations as the most intensive cells in the CTRL case. However, the precipitation of both cells was less intense than that of the CTRL case at the same time. In the MOI case the favourable conditions with high convective available potential energy (CAPE) and low convective inhibition (CIN) values were more limited in space than in the other cases. In addition, the surface temperature in the region of interest in the MOI case was about 3°C lower than in the CTRL case. Under these conditions only one weak precipitating cell had developed at 17 UTC.

Once triggered, the convective cells developed quickly in the CTRL and HOM case and moved with the AEJ towards the west. About two hours after their initiation, the cells had already organized into an MCS in the CTRL run. In the

Figure 2: 24-h accumulated precipitation in mm (color shaded) starting from 06 UTC on 11 June 2006 (a), Hovmöller diagram of precipitation in mm h⁻¹ (color shaded) averaged between 10.5°N and 13.5°N on June 11 and 12, 2006 (b), and CIN (c) in J kg⁻¹ (color shaded) on June 11, 2006 at 18 UTC for the BAND case. The solid lines frame the area of the dry band. Taken from [6].


HOM case three separate cells could still be distinguished at 19 UTC, which were less intense than in the CTRL case. The MOI run showed only weak convective activity. The impact of the dry band with a volumetric moisture content of 8.3%, surrounded by a homogeneous moisture content of 12.7%, on the modification of a mature MCS is shown in Figures 2a and 2b. Precipitation was significantly reduced between 0 and 2°W, i.e. shifted by about one degree to the east of the dry band. Precipitation was interrupted when the MCS approached the dry band, but regenerated when the system reached the western part of the band (Figure 2b). Convection was also initiated in the western part of the dry band (around 3°W) at about 19 UTC. The cloud cluster was accompanied by significant rainfall and moved to the west. At 18 UTC an area with lower CAPE values had developed within the dry band, while CIN increased to the east and west of it (Figure 2c). East of the band (0.83°W, 12.2°N) CIN further increased during the subsequent hours. Inside the band (1.8°W, 12.2°N) CIN was significantly lower. The lower CAPE inside the dry band resulted mainly from lower near-surface humidity. East of the dry band a lower near-surface temperature led to a higher CIN, which inhibited the convection of the approaching MCS. The increase of CIN inside the dry band later that night resulted from the nocturnal decrease of near-surface temperature and the passage of the MCS in the surroundings.

The simulations showed that convection was initiated in all model experiments, regardless of the initial soil moisture distribution. The area where

Figure 1: Volumetric soil moisture in % at 15 UTC in the uppermost layer (color shaded) and precipitation in mm h⁻¹ (isolines, interval 2) on June 11, 2006 at 17 UTC for the CTRL (a), HOM (b), and MOI (c) case. Taken from [6].


convection was initiated in the simulations corresponded roughly with the observations. In the CTRL case all three cells were initiated along soil moisture inhomogeneities and shifted towards the dry surface. Triggering of convection and optimal evolution was simulated in areas with low CIN, high CAPE and low soil moisture content (< 15%) or soil moisture inhomogeneities. In CTRL and HOM the cells developed quickly and merged into organized mesoscale systems while moving westwards. The MCS in the CTRL run experienced a significant modification. The precipitation disappeared when the MCS reached a region which was characterized by high CIN values (> 150 J kg⁻¹), reduced total water content and soil moisture inhomogeneities.

In conclusion, triggering of convection on this day was favoured by drier surfaces and/or soil moisture inhomogeneities, while a mature system weakened in the vicinity of a drier surface. This means that a positive feedback between soil moisture and precipitation existed for a mature system whereas a negative feedback was found for triggering of convection. The two mechanisms are illustrative for the complexity

Figure 3: Comparison between the Rapid Developing Thunderstorm Product (RDT) (left) and COSMO model runs (right) on September 10, 2006 at 04 UTC (upper row) and on September 12, 2006 at 8 UTC (lower row). Left: Convective objects are superimposed over the Meteosat infrared images using shadings of grey up to -65°C, orange-red colors between -65° and -81°C, and black above -81°C in the RDT images. Courtesy of Météo France. Right: The vertical integral of cloud water, cloud ice and humidity (kg m⁻²), indicating the convective updraught cores, from the 2.8-km COSMO run which was initialized on September 9, 2006 at 12 UTC. Taken from [10].


of soil-precipitation feedbacks in an area where a high sensitivity of precipitation to soil moisture has been demonstrated.

The second study investigates convective systems over West Africa and the eastern Atlantic. The convective systems are embedded in the AEW out of which a hurricane developed. In the afternoon hours on September 9, 2006 an MCS was initiated over land ahead of the trough of this AEW. The convective system increased quickly in intensity and size and developed three westward-moving arc-shaped convective systems (Figure 3a, b). The near-surface winds depict the mostly westerly monsoon inflow and the weak easterly inflow into the system. As the system decayed, new convective bursts occurred in the remains of the old MCS. This led to structural changes in the form of a squall line crossing the West African coastline at around midnight on September 11, 2006. During the next 24 hours, intense convective bursts occurred over the eastern Atlantic. These convective bursts were embedded in a cyclonic circulation which intensified and became a tropical depression on September 12, 2006, 12 UTC, out of which Hurricane Helene developed. The largest and longest-lived convective burst in this intensification period (Figure 3c, d) was analyzed here and compared to the convective system over land. The convective systems modify their environment and these changes can be related to the structure of the system itself.

A series of COSMO runs over large model domains (1000 × 500 grid points) with 2.8 km horizontal resolution were carried out such that the model area was centred around the MCS. As the system moved across West Africa the position of the model region was adjusted for each subsequent run. This enabled us to identify structural features of convective systems over West Africa and the Eastern Atlantic, and to analyze and discuss their differences. All the runs are 72 h in duration. The parameterization of convection is switched off. The model source code was adapted to provide information for moisture, temperature and momentum budgets [8, 9, 10].
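
The physical size of each model domain follows directly from the grid dimensions and the grid spacing. The short sketch below works this out; the function name is illustrative, and treating the 1000-point dimension as the east-west direction is an assumption not stated in the text.

```python
def domain_extent_km(nx, ny, dx_km):
    """Horizontal extent of a regular nx-by-ny grid with spacing dx_km."""
    return nx * dx_km, ny * dx_km


# 1000 x 500 grid points at 2.8 km resolution, as given in the text.
extent_x_km, extent_y_km = domain_extent_km(1000, 500, 2.8)
n_columns = 1000 * 500

print(f"domain: {extent_x_km:.0f} km x {extent_y_km:.0f} km")
print(f"horizontal grid columns: {n_columns}")
```

A domain of roughly 2800 km by 1400 km is of the same order as the wavelength of an African easterly wave quoted earlier, which is consistent with the need to re-centre the domain as the system moves westwards.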

The model was able to capture the convective systems described above, as well as their structure and intensity changes, very well. We identified three stages in the life cycle of the MCS over West Africa: the developing, the mature, and the decaying phase. To analyze the structure of these convective systems in more detail, an east-west cross section (Figure 4a) is drawn through the convective system shown in Figures 3a, b. During the mature phase, low-level convergence occurs between a strong westerly inflow and an easterly inflow from behind the system and is partly enhanced due to the descending air from the rear of the system. The associated ascent region extends up to about 200 hPa and is tilted eastwards with height (Figure 4a). The convergence continues up to 700 hPa. At this stage strong downdraughts have developed around 7.5°W just behind the low-level convergence zone that is located under the tilted updraught region. The area


of the westerly inflow reaches deep into the system up to about 700 hPa and the AEJ has its maximum around 600 hPa. Divergence associated with the upper-level outflow can be seen at around 200 hPa with very strong easterly winds ahead of the system and westerly winds in the rear. Thus, there is strong mid-level convergence east of the tilted updraught.

The cross section through a convective system embedded in the developing tropical depression over the eastern Atlantic is shown in Figure 4b. A major difference between the MCS over West Africa and the convective bursts that are embedded in the circulation over the Atlantic is their lifetime. The MCS over land lasted for about 3 days and the succession of MCSs over the ocean only for about 6 to 24 hours. The region of maximum heating and ascent is vertical and not tilted as in the convective system over land. The AEJ seems to occur at around 700 hPa west of the convective system. This is about 100 hPa lower than for the MCS over West Africa, where the air is accelerated due to air that exited the updraught core at lower levels and slightly descends. It is also apparent from the cross section that the downdraughts are not as strong as for the MCS over land. Furthermore, the low-level convergence covers a much broader region than over the continent.

To quantify the difference between the convective systems over land and over water and to assess the influence of the convective systems on the environment, potential temperature, relative vorticity, and potential vorticity budgets for regions encompassing the convective region were calculated. Details and the results of the budget calculations can be found in [10]. In future studies the analysis could be applied to other cases in order to generalise these results.

Figure 4: Cross sections through the convective system over land (a) and over the ocean (b). The vertical velocity (shaded) and zonal wind (contour interval is 3 m s⁻¹) are shown. The cross section (a) is along 13.1°N on September 10, 2006, 05 UTC, and (b) along 12.8°N on September 12, 2006, 09 UTC. Taken from [10].


• Juliane Schwendike
• Leonhard Gantner
• Norbert Kalthoff
• Sarah Jones

Institut für Meteorologie und Klimaforschung, Karlsruher Institut für Technologie

Acknowledgment
This project received support from the AMMA-EU project. Based on a French initiative, AMMA was built by an international scientific group and funded by a large number of agencies, especially from France, UK, US and Africa. It has been the beneficiary of a major financial contribution from the European Community’s Sixth Framework Research Program.

Detailed information on scientific coordination and funding is available on the AMMA International website: http://www.amma-international.org

References
[1] Avila, L. A., Pasch, R. J. “Atlantic Tropical Systems of 1993”, Monthly Weather Review, 123: 887-896, 1995

[2] Steppeler, J., Doms, G., Schättler, U., Bitzer, H. W., Gassmann, A., Damrath, U., Gregoric, G. “Meso-gamma Scale Forecasts using the Nonhydrostatic Model LM”, Meteorology and Atmospheric Physics, 82: 75–97, 2003

[3] Schättler, U., Doms, G., Schraff, C. “A Description of the Nonhydrostatic Regional Model LM”, part VII: User’s guide, Deutscher Wetterdienst, www.cosmo-model.org, 2008

[4] Vogel, B., Hoose, C., Vogel, H., Kottmeier, C. “A Model of Dust Transport Applied to the Dead Sea Area”, Meteorologische Zeitschrift, 15, DOI: 10.1127/0941-2948/2006/0168, 2006

[5] Vogel, B., Vogel, H., Bäumer, D., Bangert, M., Lundgren, K., Rinke, R., Stanelle, T. “The Comprehensive Model System COSMO-ART – Radiative Impact of Aerosol on the State of the Atmosphere on the Regional Scale”, Atmospheric Chemistry and Physics, 9: 14483–14528, 2009

[6] Gantner, L., Kalthoff, N. “Sensitivity of a Modelled Life Cycle of a Mesoscale Convective System to Soil Conditions over West Africa”, Quarterly Journal of the Royal Meteorological Society, 136(s1): 471-482, 2010

[7] Kohler, M., Kalthoff, N., Kottmeier, C. “The Impact of Soil Moisture Modifications on CBL Characteristics in West Africa: A Case-Study from the AMMA Campaign”, Quarterly Journal of the Royal Meteorological Society, 136(s1): 442-455, 2010

[8] Grams, C. M. “The Atlantic Inflow: Atmosphere-land-ocean Interaction at the South Western Edge of the Saharan Heat Low”, Master’s thesis, Institut für Meteorologie und Klimaforschung, Universität Karlsruhe, Karlsruhe, Germany, March 2008

[9] Grams, C. M., Jones, S. C., Marsham, J. H., Parker, D. J., Haywood, J. M., Heuveline, V. “The Atlantic Inflow to the Saharan Heat Low: Observations and Modelling”, Quarterly Journal of the Royal Meteorological Society, 136(s1): 125-140, 2010

[10] Schwendike, J., Jones, S. C. “Convection in an African Easterly Wave over West Africa and the Eastern Atlantic: a Model Case Study of Helene (2006)”, Quarterly Journal of the Royal Meteorological Society, 136(s1): 364-396, 2010


Extragalactic jets are amongst the most spectacular phenomena in astrophysics: these dilute but highly energetic beams of plasma are formed in the environs of active black holes, moving with speeds very near the speed of light. They run into the hot gas surrounding the galaxy, where they are eventually decelerated and heat up the gas considerably, digging large cavities (the so-called jet cocoon) into the extragalactic gas.

Both the jets and the cavities are observable today with radio and X-ray telescopes on earth or in space. These observations have revealed that the power of these jets is even greater than was thought before: more than one trillion times the total power of our sun (10^39 watts) for the most powerful ones, with activity durations of some ten million years. This power ultimately originates from the active supermassive black hole of the galaxy, since the jet taps the enormous rotational energy of the black hole. This huge amount of energy clearly has a considerable impact on the energy budget of the respective galaxy and its environment. Since extragalactic jets are most frequent in the most massive galaxies (observed in roughly every third among them), they have become a common explanation for cosmological simulations failing to reproduce giant galaxies correctly, despite their otherwise great success in modeling the evolution of our universe and its galaxies. While astronomical observations show that giant elliptical galaxies consist of old and red stars, showing almost no signs of current formation of new stars, these galaxies still grow in cosmological simulations and are therefore much bluer, actively form stars and have considerably greater masses. This has caused an increased interest in jet physics and in research projects focusing on the interaction of jets with their host galaxy and its environment, now referred to as “jet feedback”. Yet, the detailed processes involved and

Figure 1: Velocity field of a jet (density contrast 0.001) after 30 million years. The jet beam (in red) reaches out to the jet head and then turns back, inflating a turbulent cocoon (mostly green).

The Maturing of Giant Galaxies by Black Hole Activity

Golden Spike Award by the HLRS Steering Committee in 2009


the exact importance of jets are still unknown. This becomes obvious in two opposing processes that are invoked in the current literature: positive and negative feedback.

Think Positive – or Negative?
Dense and very cool clouds of gas are the birthplaces of stars, since only then can gravity surmount thermal pressure and cause the clouds to collapse, ultimately forming new stars as nuclear fusion sets in. If clouds are hit by a jet or its preceding bow shock, they are compressed by the impact and can become gravitationally unstable. This would result in an increased rate of star formation (positive feedback). On the other hand, the same impact could as well disrupt the cold clouds and thereby destroy the seeds necessary for the formation of new stars (negative feedback). Additionally, the jet would heat up the hot gas surrounding the galaxy to higher temperatures and prevent it from cooling down, getting compressed and forming new cold clouds. Both processes have been demonstrated to be possible, and only a detailed study with a realistic underlying model can help us reveal their real importance.

Observationsofdistantradiogalaxies,showingthemastheywere10billionyearsago,giveadditionalhintsontheinteractionbetweenthejetandthedisk.There,extendedemission-linenebulaealignedwiththejetaxisareobservedanditisconjecturedthatthesenebulaeareactuallycreatedbytheinteractionofthejetwithagalaxy'sgaseousdisk.

Computational Challenges

We have conducted a series of magnetohydrodynamic jet simulations to examine the interaction of jets with their environment at very high resolution assuming axisymmetry. These computations are extremely demanding if realistic parameters are used: although the jet plasma moves with almost the speed of light, its density is much lower than the ambient density of the extragalactic gas (in our simulations typically 1,000 times smaller), which results in a much slower propagation of the jet. Also, a considerable range of spatial scales has to be covered. While the jets extend over more than 100 kpc (1 kpc = 3 × 10^16 km), the jet beams are 100 times narrower and have to be resolved in detail. All this results in simulations with more than 6 million cells and more than one million time steps necessary. The very large number of time steps makes simulations time-consuming but does not allow a massively parallel approach since the “problem per processor” would become too small then and communication costs between processors would become unwieldy. These simulations were

Figure 2: 3D simulation showing the interaction of a jet and a galactic gaseous disk (volume rendering of density)


therefore only possible on a powerful supercomputer such as the NEC SX-8 at the High Performance Computing Center Stuttgart (HLRS), to which we have adjusted and optimized our code. Due to its vector capabilities and shared memory architecture per node, the code ran very efficiently and allowed us to reach realistic sizes for the jets.

We found that the expansion of the jet cocoon follows self-similar models only at early times, but deviates from that significantly once the cocoon approaches pressure balance with its environment. We found this to be consistent with observations and also achieved bow shock shapes and strengths such as those observed typically. Magnetic fields turned out to be important to stabilize the contact surface between the jet and the ambient medium and were moreover found to be efficiently amplified by a shearing mechanism in the jet head.
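The parallelization argument can be illustrated with a rough back-of-the-envelope sketch. It uses only the figures quoted in the text (1 kpc = 3 × 10^16 km, about 6 million cells) and assumes, purely for illustration, that the axisymmetric grid is split into equal square blocks that each exchange a one-cell-wide halo per time step. The function name and the processor counts are hypothetical and are not taken from the authors' code.

```python
import math

KM_PER_KPC = 3e16        # conversion quoted in the text
TOTAL_CELLS = 6_000_000  # "more than 6 million cells"


def halo_fraction(total_cells: int, n_procs: int) -> float:
    """Ratio of halo (boundary) cells to owned cells per processor.

    Assumes a 2D grid split into equal square blocks, each exchanging
    a one-cell-wide halo on its four sides (an illustrative assumption).
    """
    cells_per_proc = total_cells / n_procs
    side = math.sqrt(cells_per_proc)
    return 4.0 * side / cells_per_proc


# Jets extend over more than 100 kpc.
jet_length_km = 100 * KM_PER_KPC

if __name__ == "__main__":
    print(f"jet length: {jet_length_km:.1e} km")
    for n_procs in (8, 64, 512, 4096):  # hypothetical processor counts
        cells = TOTAL_CELLS / n_procs
        frac = halo_fraction(TOTAL_CELLS, n_procs)
        print(f"{n_procs:5d} procs: {cells:9.0f} cells each, "
              f"halo/owned = {frac:.3f}")
```

Under these assumptions the share of communicated cells grows as the blocks shrink, while the million time steps still have to be executed one after another. This is the "problem per processor" effect the text describes.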

The Mist of Distant Galaxies

With respect to the riddle of the emission-line nebulae, we were able to test two models for the location and kinematics of the line-emitting clouds against observed properties and found that clouds embedded in the jet cocoon were able to much better reproduce observed velocities and morphologies than clouds embedded in the shocked ambient gas.

About 10 percent of the total jet power was measured as kinetic energy in the jet cocoon, which made it possible to link these findings to simulations of multi-phase turbulence that model the amounts and emission power of the cool gas phase embedded in these turbulent regions. The expected emission line luminosities, however, disagree considerably with the observed range of luminosities and we concluded that the models are still too simple to include all the necessary physics. This was, admittedly, not too surprising since the models relied on clouds passively advected with and spread all across the cocoon plasma, while the interaction of real clouds of cold gas would be considerably more complex.

Moving on – Simulations in 3D

To improve our model of the jet–cloud interaction, we had to extend our simulations to full three dimensions since a clumpy galactic or intergalactic gas cannot be modeled properly within axisymmetry. This also made it necessary to move to another code: RAMSES, an adaptive mesh refinement code, that includes cooling, gravity, star formation, cosmological evolution and magnetic fields. The mesh refinement is highly suitable for resolving small clouds in an otherwise large computational domain,

Figure 3: The gaseous clouds in the disk (green) are compressed (blue) by the action of the jet and the remaining gas is pushed outwards.


• Volker Gaibler

Max-Planck-Institut für extraterrestrische Physik, Garching

in contrast to a uniformly resolved mesh. The current implementation allows a maximum dynamic range in length scale of 6 orders of magnitude. The code is parallelized by MPI and includes dynamic load-balancing between the different MPI processes. The 3D simulations are clearly computationally very demanding, since an additional dimension results in a much larger number of cells, even if the resolution is somewhat lower. On the other hand, a higher number of cells (in contrast to time steps) generally can be handled by massive parallelization if an efficient method is used.

Outlook

We have successfully tested RAMSES on both the Nehalem cluster at the HLRS and the HLRB-II at the Leibniz-Rechenzentrum (LRZ), and found it to be scaling almost linearly with up to 4,000 cores, if sufficient bandwidth is available. On the Nehalem cluster, the performance was ~50 percent lower than expected from the single-core performance since memory bandwidth became a bottleneck for more than 4 cores per node; scaling then increased almost linearly for a larger number of nodes. HLRB-II, however, did not suffer from this and showed excellent scaling behaviour. We conjecture that the memory bandwidth bottleneck on the Nehalem cluster may be related to details of the MPI implementation, since MPI adjustments on a local Harpertown Beowulf cluster were able to get around this limitation.
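Scaling statements of this kind are usually expressed as parallel efficiency relative to a reference run. The following is a minimal sketch of that bookkeeping; the function names are hypothetical and the timings in the usage example are invented placeholders, not measurements from the RAMSES tests.

```python
def speedup(t_ref: float, t_par: float) -> float:
    """Speedup of a parallel run relative to a reference run."""
    return t_ref / t_par


def parallel_efficiency(t_ref: float, cores_ref: int,
                        t_par: float, cores_par: int) -> float:
    """Efficiency relative to perfect linear scaling from the reference run.

    1.0 means perfectly linear scaling; 0.5 means the run delivers half
    of the performance that linear scaling would predict.
    """
    return (t_ref * cores_ref) / (t_par * cores_par)


if __name__ == "__main__":
    # Placeholder timings in seconds (hypothetical, for illustration only).
    t_one_core = 100.0
    t_four_cores = 50.0
    eff = parallel_efficiency(t_one_core, 1, t_four_cores, 4)
    print(f"speedup    = {speedup(t_one_core, t_four_cores):.2f}")
    print(f"efficiency = {eff:.2f}")
```

An efficiency of 0.5 within a node followed by near-constant efficiency across nodes would correspond to the behaviour described for the Nehalem cluster.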

For a small test simulation, we have successfully set up a clumpy galactic gaseous disk, similar to disks actually observed in distant galaxies. A jet is positioned in the center of the galaxy and interaction of the jet with the cold gas clouds embedded in the ambient medium is computed explicitly. Preliminary results indicate that actually both positive feedback by compression of clouds in the center as well as negative feedback by removal of gas along the jet axis may be acting at the same time. So far, the simulations have only covered a time much shorter than our previous simulations, but runs on a large number of processors will allow us to eventually examine the action of jet feedback in detail and over realistically long times.

References

[1] Gaibler, V., Krause, M., Camenzind, M. “Very Light Magnetized Jets on Large Scales – I. Evolution and Magnetic Fields”, Monthly Notices of the Royal Astronomical Society, 400, pp. 1782-1802, 2009

[2] Krause, M., Gaibler, V. “Physics and Fate of Jet Related Emission Line Regions”, Conference contribution, at arXiv:0906.2122

[3] Gaibler, V., Camenzind, M. “Numerical Models for Emission Line Nebulae in High Redshift Radio Galaxies”, Wolfgang E. Nagel, Dietmar B. Kröner, Michael M. Resch (Eds.), Springer 2010, High Performance Computing in Science and Engineering ‘09

[4] Gaibler, V. “Very Light Jets with Magnetic Fields”, PhD Thesis, Ruprecht-Karls-Universität Heidelberg, 2008


The value of leading edge high performance computing systems can only be revealed if the programming environment for application developers allows the realization of efficient simulation programs. Users of such applications expect a reliable and robust operating environment with a good ratio between performance and energy costs.

The fast development in the hardware sector and the complexity of the task of exploiting more and more computing cores to solve increasingly large problems, or to solve the same problem in a shorter time, cannot be addressed with commodity off-the-shelf products but requires intensive research activities in close collaboration between application developers, hardware vendors and supercomputing centres such as HLRS. Consequently HLRS is involved in a wide range of research activities and has established close vendor collaborations such as the Teraflop Workbench and the Cray Exascale Research Centre.

Starting from projects providing the infrastructure such as the Partnership for Advanced Computing in Europe (PRACE), VISIONAIR or HPC-Europa, more short-term oriented research projects aim to realize an efficient operating environment. As an example, the projects GAMES and CoolEmAll develop indicators, methods and solutions for realizing energy-efficient data centres. Realizing reliable service provision demands controlled processes and managed infrastructures. The plugIT project has been addressing the need for alignment of business goals and the IT infrastructure layer.

Another important part of the research activities is focused on developer tools. HLRS is not only involved in the MPI standardization process and an active developer of the open source reference implementation Open MPI, but has also developed innovative debugging tools such as Temanejo for emerging new task-based programming models.

Research Projects

• Stefan Wesner,

University of Stuttgart, HLRS

The major challenge for the next eight to ten years is to find solutions for exascale computing systems. The significant changes on the hardware level demand radically new approaches and an even closer collaboration with application developers. HLRS is a leading partner in CRESTA (Collaborative Research into Exascale Systemware, Tools and Applications), one of the three exascale projects at the European level. In this large-scale project all aspects of the problem, from new mathematical models and approaches over applications and libraries down to the hardware level, are investigated. Another exascale project of HLRS is TEXT (Towards EXascale ApplicaTions), validating how hybrid computing approaches mixing existing and new programming models support increased scalability.

The examples on the following pages were chosen from our journal inSiDE and can only present a shallow impression of the results achieved in more than 40 currently running research projects with more than 60 researchers at HLRS working on these projects on state, national, and European level.


Figure 1: CoolEmAll Concept

IT infrastructures are responsible for at least 2% of the global energy consumption, making it equal to the demand of the aviation industry. Furthermore, in many current data centres the IT equipment uses only about half of the total energy for computing, whilst most of the remaining energy is required for cooling and air movement. This often results in poor Power Usage Effectiveness (PUE) values and significant CO2 emissions. For this reason issues related to cooling, heat transfer, and IT infrastructure location are gaining more attention and are carefully studied during planning and operation of data centres.

In this context, the construction of data centres by using modular building blocks is gaining more and more attention, in particular as a potential antipole to specialized facilities. This modular approach is becoming increasingly popular due to flexibility of design, lower costs and shorter building times. The term can refer to a variety of approaches; one of the most popular is data centres housed in standard shipping containers. In addition, it can also refer to pre-configured units which are joined together to build up large computing facilities, e.g. with hundreds of square meters in size.

CoolEmAll - Platform for Optimizing the Design, Operation and Cooling of modular configurable IT Infrastructures

Projects


blocks, and energy re-used by facilities connected to IT infrastructures are all crucial to understand and improve the energy efficiency of data centres as a whole. To carefully study these issues, simulation, visualization, and decision supporting tools are needed, supporting the optimization of the design and operation of new energy-efficient modular IT infrastructures and facilities.

To address the aforementioned IT energy efficiency issues, the main goal of CoolEmAll is to provide advanced simulation, visualization and decision support tools along with blueprints of computing building blocks for modular data centre environments. Once developed, these tools and blueprints are going to make it possible to minimize the energy consumption, and consequently the CO2 emissions, of the entire IT infrastructure, taking into account the corresponding facilities as well. This will be achieved by:

1. the design of diverse types of modular computing building blocks (ComputeBox Blueprints), which are going to be well defined by energy efficiency metrics

2. the development of a simulation, visualization and decision support toolkit (SVD Toolkit) that will enable the analysis and optimization of IT infrastructures assembled with these building blocks.

Therefore, these modular computing modules as well as the toolkit are going to take into account three aspects reflecting the major impact on actual energy consumption, namely the cooling model, the corresponding application properties and workloads, as well as workload and resource management policies.

To this end, the energy efficiency of computing building blocks will be precisely defined by a set of novel metrics ex-

However, while specialised facilities are established in current environments, there is a significant need to analyse the energy efficiency aspects of such a modular approach in order to allow for a comparison of these approaches. In particular, gaining deep insight into the total energy consumption of both large data centres and smaller facilities requires additional research to determine how efficient this approach is. An important aspect when considering the energy efficiency of modular data centres is the cooling technique. The use of approaches such as “free air cooling”, where external air is used to cool systems rather than electrical chillers, can help to improve efficiency and achieve PUE ratings close to the ideal of 1.0.
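For orientation, PUE is commonly defined as the total energy drawn by the facility divided by the energy delivered to the IT equipment. The sketch below applies that definition; the function name and the kWh figures are illustrative assumptions and are not CoolEmAll data.

```python
def pue(total_facility_energy_kwh: float, it_equipment_energy_kwh: float) -> float:
    """Power Usage Effectiveness: total facility energy / IT equipment energy."""
    if it_equipment_energy_kwh <= 0:
        raise ValueError("IT equipment energy must be positive")
    if total_facility_energy_kwh < it_equipment_energy_kwh:
        raise ValueError("total facility energy cannot be below IT energy")
    return total_facility_energy_kwh / it_equipment_energy_kwh


if __name__ == "__main__":
    # IT equipment uses about half of the total energy (illustrative figures).
    print(pue(total_facility_energy_kwh=200.0, it_equipment_energy_kwh=100.0))
    # A hypothetical free-air-cooled facility with little overhead.
    print(pue(total_facility_energy_kwh=110.0, it_equipment_energy_kwh=100.0))
```

A data centre in which the IT equipment accounts for only half of the total consumption therefore has a PUE of 2.0, whereas the ideal of 1.0 would mean that all energy reaches the IT equipment.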

The cooling and heat transfer processes are not the only important aspects influencing the energy efficiency of data centres. Actual power usage and effectiveness of energy saving methods heavily depend on the types of IT applications and workload properties. However, to take full advantage of these methods,

(i) application power usage and performance must be monitored in a fine-grained manner, and

(ii) parameters and metrics characterising both applications and resources must be precisely defined.

Consequently, there is a large number of parameters impacting the energy efficiency of IT infrastructures. All these parameters should be taken into account during the design and configuration of data centres. Issues such as types and parameters of applications, workload and resource management policy based scheduling, hardware configuration, metrics defining efficiency of building


Figure 2: Air flow visualisation in a data centre

pressing relations between the energy efficiency and essential factors listed above. In addition to common static approaches, the CoolEmAll platform will enable studies of dynamic states of IT infrastructures based on changing workloads, management policies, cooling method, and ambient temperature. The main concept of the project is presented in Figure 1.

Therefore, CoolEmAll is going to address the following technical objectives:

1. The development of a simulation, visualization, and decision support toolkit (SVD Toolkit), allowing for analysing and designing modular IT infrastructures and facilities with resource-efficient cooling. This platform will support IT infrastructure designers, decision makers and administrators in the process of planning new infrastructures or improving existing ones, as shown by way of example in Figure 2. The intended modular approach to build and model IT infrastructures and facilities allows for many extension possibilities and a high level of customization. CoolEmAll will develop a flexible simulation platform integrating models of applications, workload scheduling policies, hardware characteristics, cooling and air and thermal flows using Computational Fluid Dynamics (CFD) simulation tools. The flexibility of these models, based on model parameter settings, will ensure flexibility of the entire CoolEmAll SVD platform, allowing the required model settings to be captured and these models to be simulated for a wide range of applications, workload scheduling policies, IT infrastructure and hardware characteristics. Advanced visualization tools and user interfaces will allow users to easily analyse various options and optimize energy efficiency of planned IT infrastructures and facilities, as shown in Figure 3.

2. The provisioning of blueprints of computing modules and a basic prototype. The basic version of this module will enable tests and real-life experiments providing realistic behavioural information for the simulation models, allowing thermal and energy efficiency behaviour to be captured on node and rack level, and will also enable the verification of these


Figure 3: Visualisation Workflow

virtualization and HPC applications. The proposed policies will be applied in simulations to study their impact on energy-efficiency in diverse configurations and in large scale.

4. The analysis and parameterization of applications and workloads. The CoolEmAll simulations as well as workload management techniques will take into account specific workload and application characteristics. To this end, CoolEmAll will prepare benchmarks and classification of applications and workloads. This knowledge about application properties will be used (i) to simulate their impact on thermal issues and energy efficiency and (ii) to propose thermal-aware management policies.

5. The definition of specific energy efficiency metrics. Metrics expressing trade-offs between energy and performance will be precisely defined. These metrics will go beyond existing ones (e.g. those defined in the Code of Conduct on Data Centers Energy Efficiency). In this respect, CoolEmAll is going to take into account metrics

models. This prototype will include fine-grained monitoring capabilities, allowing for a detailed inspection of the entire environment. Based on this evaluation, a refined and optimized prototype will be designed for diverse scenarios including various hardware densities, cooling methods, workloads and requirements (Figure 4).

3. The definition and evaluation of thermal- and energy-aware workload scheduling and resource management policies. The proposed policies will include intelligent workload scheduling and resource management (e.g. dynamically switching off nodes, lowering frequency and voltage to avoid excessive heat generation). The corresponding decisions will be based on fine-grained hardware and application monitoring. The selection and setup of the corresponding hardware will depend on application types, workload requirements, cooling method, and ambient temperatures. In order to reflect the evaluation of the CoolEmAll approach within a realistic environment, two major types of workload will be considered: data centre cloud workloads using


Figure 4: SVD Architecture

defined by other projects as well, extend them or propose additional metrics expressing classes of efficiency including relation between energy efficiency and application characteristics, workload properties, ambient temperatures, required heat re-use efficiency, etc. In particular, the metrics defined and evaluated within the GAMES project [1] are going to be taken into account.

6. Verification of simulation tools and their application for specific scenarios. The verification of the proposed methods and software will be performed by tests in real environments using a basic prototype module, real applications, as well as enhanced monitoring systems based on sensors. CoolEmAll will also perform coupled simulations for several diverse settings including large scale IT infrastructures such as whole data centres. These simulations will use collected traces (e.g. from the EU project GAMES or partners) to plan, design and operate new and existing IT infrastructures and facilities. In this way, the final blueprints of the computing modules will be evaluated and optimized in specific settings.

Simulation and visualization technologies are essential pillars of the CoolEmAll concept, as they make it possible to explore new data centre arrangements and solutions. Therefore, the open source CFD package OpenFOAM [3] and the COVISE software [4], developed by HLRS, will be integrated and enhanced to provide a real-time CFD modelling capability. The resulting package enables integration of collected operational data into a simulation to achieve optimal energy-efficient and thermal-aware design of data centres consisting of modular computing units. The first prototypes of these modular computing units are going to be developed based on the existing experience of Christmann with their RECS (Resource Efficient Computing System [2]).

The use of these modular compute units is entirely in line with CoolEmAll’s research into the impact of cooling solutions on the energy efficiency of IT infrastructures, in particular leveraging outside air ventilation for cooling without artificial equipment as well as reusing waste heat generated by computation. The modular compute units developed


• Alexander Kipp
• Uwe Wössner

University of Stuttgart, HLRS

scheduling of workloads, taking advantage of colder machines or specific hardware best suited for a given job, can significantly influence the air flow inside a data centre, reducing or even eliminating the need for artificial cooling. The impact of the workload management based on the SVD Toolkit-designed data centres emphasises CoolEmAll’s goal of a holistic approach to next generation green data centres.
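The idea of "taking advantage of colder machines" can be sketched as a very simple placement rule. This is only an illustration under stated assumptions: the `Node` fields, the function name and the sample values are hypothetical, and the rule is not the scheduling policy developed in CoolEmAll.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Node:
    name: str            # hypothetical node identifier
    inlet_temp_c: float  # monitored inlet temperature in degrees Celsius
    free_cores: int      # cores currently available on the node


def pick_coolest_node(nodes: Sequence[Node], cores_needed: int) -> Optional[Node]:
    """Return the coolest node that can still host the job, or None."""
    candidates = [n for n in nodes if n.free_cores >= cores_needed]
    if not candidates:
        return None
    return min(candidates, key=lambda n: n.inlet_temp_c)


if __name__ == "__main__":
    cluster = [
        Node("n1", inlet_temp_c=27.5, free_cores=8),
        Node("n2", inlet_temp_c=22.0, free_cores=4),
        Node("n3", inlet_temp_c=20.5, free_cores=2),
    ]
    chosen = pick_coolest_node(cluster, cores_needed=4)
    print(chosen.name if chosen else "no node available")
```

A realistic policy would combine such thermal readings with application characteristics and energy measurements, as the surrounding text describes; the sketch only shows where fine-grained monitoring data would enter the decision.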

HLRS contributes to CoolEmAll by acting as the technical manager of this project as well as simulation and visualization expert. In particular, HLRS is going to coordinate the design and realization of the SVD platform and contribute its broad experience in system monitoring and management. Finally, HLRS is going to contribute its knowledge in the definition and analysis of energy-efficiency related metrics.

Participants
• Instytut Chemii Bioorganicznej PAN, PL
• High Performance Computing Centre Stuttgart (HLRS), D
• Universite Paul Sabatier, F
• Christmann Informationstechnik + Medien GmbH & Co. KG, D
• The 451 Group LTD, UK
• Institut de Recerca en Energia de Catalunya, E
• Atos Origin, E

References
[1] http://www.green-datacenters.eu
[2] http://shared.christmann.info/download/project-recs.pdf
[3] http://www.openfoam.com/
[4] http://www.hlrs.de/organization/av/vis/covise

as part of CoolEmAll can be used in a symbiosis deployment scenario, where computing facilities benefit other buildings that surround them, resulting in improved overall energy-efficiency of an urban area. Another result of the CoolEmAll research and simulations of the modular compute unit design and deployments could directly benefit the field of high-density server racks by modelling the air flow around them and help in finding solutions for the cooling problem of dense HPC data centres. Both modular high-density data centres and symbiosis deployments directly benefit from the SVD Toolkit (simulation, visualization and decision support), which is one of the major outcomes of the CoolEmAll project.

Software, especially applications, plays a significant role both in terms of the performance and energy-efficiency of computations. Therefore, the CoolEmAll project aims to enhance existing and to develop and standardize novel fine-grained energy and thermal-aware application and hardware metrics. These will take into account both the energy budget and thermal air flow impacts of an application running on a particular hardware. Given the granularity of these metrics, the existing monitoring platforms are insufficient because of excessive bandwidth and processing power requirements; CoolEmAll will therefore develop new monitoring solutions to address the problem for next generation green IT infrastructure.

Given these metrics and provided a fine-grained monitoring of both hardware and software, CoolEmAll is going to advance the field of cluster scheduling with new workload management algorithms and policies leveraging application characteristics and energy and thermal measurements. Proper


VISIONAIR is establishing a European infrastructure for high level visualization facilities that is open to research communities across Europe and around the world. By integrating existing facilities into a pan-European network, it will create a world-class research infrastructure enabling cutting-edge research to be conducted. Current scientific challenges such as climate evolution, environmental risks, molecular biology, health, and energy require the management of increasingly complex and voluminous information, thus calling for the development of ever more powerful visualization methods and tools. On many sites across Europe, it is infeasible to fund the necessary visualization facilities that are needed to tackle high fidelity, large screen and/or immersive visualization. VISIONAIR is targeted to fill in this gap by providing access to the partner facilities, opening its doors for interested researchers to use the multitude of services available across the European visualization facilities. After submitting a successful research proposal, international researchers are invited to visit the partner facility of their choice or the one that best fits the scientific goals they have in mind to conduct their research. They are not only given access to the top visualization facilities in Europe, but are also supported in their research by funding their living and travel expenses. Researchers can choose from over 20 facilities located in 12 countries in Europe and Israel.

The project targets different fields of visualization. Scientific Visualization offers access to methods, software and hardware needed for successfully visualising scientific data, including - but not limited to - engineering, medical visualization, biology, chemistry and physics. Ultra-High-Definition facilities connected by high speed networks are targeted at users that want to create high resolution, high quality images (up to 8k) and access those by high speed networks. VISIONAIR provides the hardware and the unique network distribution services needed for transmission of the images to their endpoints. The network services enable multiple high-resolution digital-media streams to be transported among global sites, using dynamically provisioned optical lightpaths across multiple domains, which can be used on a scheduled or on-demand basis. While Scientific and Ultra-High-Definition Visualization can be done in any environment, researchers specifically targeting VR can also apply at a multitude of facilities. Here, the focus is on immersive - possibly also haptic - experiences in virtual environments. Equipment available for researchers ranges from head mounted displays to fully fledged stereoscopic PowerWalls and CAVEs. Further

VISIONAIR


• Uwe Woessner

University of Stuttgart, HLRS, Germany

specialized equipment available allows research to be carried out using Augmented Reality, a technique that overlays the real environment with context-dependent computer generated images. Researchers also have access to the latest developments in display technology, like holographic displays or the above-mentioned 8k displays.

The project maintains an already huge database of visualization software and models that is available to all researchers for free. Thus, experts can explore the multitude of visualization packages that are already available. Software covered here ranges from processing filters, converters and readers to fully fledged modellers and visualization packages. VISIONAIR is rounded off by several research activities concerning the usability and accessibility of the facilities and their software for external researchers with a strong focus on collaboration.

VISIONAIR is a common infrastructure that grants researchers access to high level visualisation facilities and resources. Both physical access and virtual services will be provided by the infrastructure. Full access to visualization-dedicated software is organized, while physical access to high level platforms is available to other scientists, free of charge, based on the quality of the project submitted. Indeed, researchers from Europe and around the world are welcome to carry out their research projects using the visualization facilities provided by the infrastructure. By creating this European visualization network it will be possible to create a landmark that will have a broad visibility throughout the research communities around the world.

Within this project, the HLRS is providing access to its CAVE, PowerWalls, head mounted displays and its haptic driving simulator. Visitors will be able to interactively visualize large simulation results or even realize computational steering and interactive simulations by leveraging the power of a 40-node visualization cluster. The visualization software COVISE will not only allow visitors to analyse their data in Virtual Reality but they can also overlay their visualization over physical prototypes or testbeds using Augmented Reality techniques. The collaborative features will allow them to analyse the simulations together with their colleagues at home or with remote scientists.

Project calls are expected to open at the end of 2011. Interested researchers are invited to submit a proposal at www.infra-visionair.eu

Partners

INPG Grenoble (F), Grenoble INP (F), University of Patras (GR), Cranfield University (UK), Universiteit Twente (NL), Universität Stuttgart (D), PSNC Poznan (PL), Université de la Méditerranée Marseille (F), CNR Genova (I), INRIA Rennes (F), KTH Stockholm (S), Technion Haifa (IL), RWTH Aachen (D), Poznan University of Technology (PL), ENSAM Aix-en-Provence (F), TU Kaiserslautern (D), University of Salford (UK), Fraunhofer IPK Berlin (D), i2cat Barcelona (ES), University of Essex Colchester (UK), MTA SZTAKI Budapest (HU), ECN Nantes (F), University College London (UK), Politecnico di Milano (I), EMIRACLE Brussels (B).


For the past thirty years, the need for ever greater supercomputer performance has driven the development of many computing technologies which have subsequently been exploited in the mass market. Delivering an exaflop (or 10^18 calculations per second) by the end of this decade is the challenge that the supercomputing community worldwide has set itself. The Collaborative Research into Exascale Systemware, Tools and Applications project (CRESTA) brings together four of Europe’s leading supercomputing centres, with one of the world’s major equipment vendors, two of Europe’s leading programming tools providers and six application and problem owners to explore how the exaflop challenge can be met. CRESTA focuses on the use of six applications with exascale potential and uses them as co-design vehicles to develop: the development environment, algorithms and libraries, user tools, and the underpinning and cross-cutting technologies required to support the execution of applications at the exascale. The applications represented in CRESTA have been chosen as a representative sample from across the supercomputing domain including: biomolecular systems, fusion energy, the virtual physiological human, numerical weather prediction and engineering.

No one organization, be they a hardware or software vendor or service provider, can deliver the necessary range of technological innovations required to enable computing at the exascale. This is recognized through the on-going work of the International Exascale Software Project [1] and, in Europe, the European Exascale Software Initiative [2]. CRESTA will actively engage with European and International collaborative activities to ensure that Europe plays its full role worldwide. Over its 36 month duration the project will deliver key, exploitable technologies that will allow the co-design applications to successfully execute on multi-petaflop systems in preparation for the first exascale systems towards the end of this decade.

Overall Concept and Objectives of the CRESTA Project

HPC systems exist to deliver results to their users from numerical simulation and modelling applications. At the centre of CRESTA are therefore six applications designed to execute well on petascale systems today and that will be expected to execute well on exascale systems tomorrow.

Co-design by Applications

Each of these applications has been carefully chosen (a) as an application that can be reasonably expected to need to run at the exascale (for reasons of problem size, time to compute etc.) and (b) to represent a key user community with a grand challenge who have the need to compute at the exascale in order to deliver their scientific or engineering results.

Collaborative Research into Exascale Systemware, Tools and Applications (CRESTA)


Table 1: CRESTA’s co-design applications

By understanding the current state of the limitations of the algorithms and problem sizes for each of these applications, CRESTA will be able to develop improved application performance at the petascale on then-current systems (perhaps 100 petaflop/s by 2014) and define a clear roadmap for each application to get it to the exascale by the end of this decade, mapped against the hardware designs we expect to see by the end of the decade.

Systemware for Exascale

However, application optimization and algorithmic modifications only represent part of the exascale challenge.

Systems of the scale envisaged present enormous challenges in terms of power requirements, operating system issues such as resiliency and process management, communication and programming libraries, languages, compilers, debuggers and profilers. Applications must interact with many of these aspects of the exascale systemware required to compile, link, run, debug and profile application codes.

CRESTA will therefore use the knowledge available from our target Cray platform, with estimates of what an exascale system built using these hardware technologies will look like

Application name   Scientific grand challenge domain   Partner responsible
GROMACS            Biomolecular systems                KUNGLIGA TEKNISKA HOEGSKOLAN (KTH)
ELMFIRE            Fusion energy                       ABO AKADEMI UNIVERSITY (ABO)
HemeLB             Virtual Physiological Human         UNIVERSITY COLLEGE LONDON (UCL)
IFS                Numerical Weather Prediction        EUROPEAN CENTRE FOR MEDIUM-RANGE WEATHER FORECASTS (ECMWF)
OpenFOAM           Engineering                         THE UNIVERSITY OF EDINBURGH, UNIVERSITY OF STUTTGART (HLRS, IHS), CENTRALE RECHERCHE SA (CRSA)
Nek5000            Engineering                         KUNGLIGA TEKNISKA HOEGSKOLAN (KTH)


Table 2: Role of applications for different research challenges

from a systems perspective, and the requirements and form of each of our applications codes, to build and explore appropriate systemware. In addition to operating system features we will look at compiler and library issues, how to debug at the exascale (with Allinea’s DDT debugger), how to optimize application performance at the exascale (with TUD’s Vampir tool-suite and KTH’s perfminer) and how to manage the pre- and post-processing of data resulting from simulations (an often neglected area in systems design). The balance of technical work in CRESTA will be apportioned 40% on applications and 60% on systemware.

Incremental vs. disruptive Approaches

A key feature of CRESTA is its use of dual pathways to exascale solutions. Many problems in HPC hardware and software have been solved over the years using an incremental approach.

Most of today’s systems have been developed incrementally, growing larger and more powerful with each product release. However, we know that some issues at the exascale, particularly on the software side, will require a completely new, disruptive approach. For example, the communications overhead of a particular numerical solver may grow quickly with the number of cores. With million-core systems on the horizon, the performance of such algorithms may be so poor that all speedup stops. In these cases a different method of numerical solution will be required. This may be highly disruptive to the application code but it will set it on the path to executing at the exascale.

CRESTA will therefore employ incremental and disruptive approaches to technical innovation - sometimes following both paths for a particular problem to compare and contrast the challenges associated with each approach.
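The point at which "all speedup stops" can be illustrated with a toy cost model. The model below is an assumption made purely for illustration: compute time shrinks as 1/p while communication overhead grows linearly with the core count p. The function names and the constants are hypothetical and do not describe any particular CRESTA solver.

```python
import math


def run_time(p: int, t_serial: float = 1.0e6, comm_cost: float = 1.0e-3) -> float:
    """Toy model: perfectly divisible work plus overhead growing with p."""
    return t_serial / p + comm_cost * p


def speedup(p: int, t_serial: float = 1.0e6, comm_cost: float = 1.0e-3) -> float:
    """Speedup over the serial run time under the toy model."""
    return t_serial / run_time(p, t_serial, comm_cost)


def best_core_count(t_serial: float = 1.0e6, comm_cost: float = 1.0e-3) -> float:
    """Core count minimising run_time: d/dp (T/p + c*p) = 0  =>  p = sqrt(T/c)."""
    return math.sqrt(t_serial / comm_cost)


if __name__ == "__main__":
    for p in (1, 1_000, 31_623, 1_000_000):
        print(f"{p:>9d} cores: speedup = {speedup(p):10.1f}")
    print(f"model optimum near {best_core_count():.0f} cores")
```

In this model, adding cores beyond the optimum makes the run slower rather than faster. Tuning the constants (the incremental path) only shifts the optimum; changing how the overhead scales with p requires a different algorithm, which is the disruptive path described in the text.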

Research challenge / Application                  GROMACS      ELMFIRE      HemeLB       IFS          OpenFOAM     Nek5000
Underpinning and cross-cutting technologies       Significant  Significant  Significant  Significant  Significant  Significant
Development environment                           Significant  Essential    Essential    Essential    Significant  Significant
Algorithms and libraries                          Essential    Useful       Useful       Essential    Essential    Essential
User tools (including pre and post processing)    Useful       Significant  Essential    Useful       Essential    Useful


• Mark Parsons 1
• Stefan Wesner 2

1 EPCC, The University of Edinburgh
2 University of Stuttgart, HLRS, Germany

The Co-design Process

At the heart of the proposal is the co-design process, involving guidance and feedback between the co-design applications and the systemware work packages. A high level roadmap to achieving exascale for each co-design vehicle will be developed, as will a more detailed needs analysis to guide development work in the systemware WPs. They in turn will input expertise and provide solutions to the challenges. This will be a cyclical process: the needs analysis will be dynamic and updated regularly based on the experiences of all the developers across WPs.

The solutions to the challenges will be different for each application and their integration with different work packages will depend on these. For each application we have characterized its interaction with each work package on a three-point scale. Tasks are either Essential, Significant or Useful based on how critical the development is to enabling the application for exascale.

Collaboration and Exploitation

No one organization, be it a hardware or software vendor or a service provider, can deliver the necessary range of technological innovations required to enable computing at the exascale. This is recognized through the on-going work of the International Exascale Software Project and, in Europe, the European Exascale Software Initiative. At the same time, the PRACE Research Infrastructure (PRACE-RI) [3] gives Europe, for the first time, a leadership-class HPC research infrastructure accessible by any suitably qualified European scientist with a suitable problem to solve.

Within this decade, the PRACE-RI will provide exascale computing resources for Europe’s scientists. As four of the leading partners in the various PRACE projects, the supercomputing centres represented in CRESTA will ensure exploitation of the results of the CRESTA project by the PRACE-RI.

CRESTA will collaborate with EESI [4] and any subsequent project to ensure that Europe’s exascale research community has a natural meeting place to share discoveries and demonstrate leadership on the world stage. The CRESTA partners who are involved in IESP will continue this activity and seek, where appropriate, to collaborate on the international stage as evidenced by the letters of support included with this proposal.

All new software developed by CRESTA will be made available as Open Source Software.

References
[1] The International Exascale Software Project, http://www.exascale.org/ [Last accessed August 23, 2011]
[2] The European Exascale Software Initiative, http://www.eesi-project.eu/ [Last accessed August 23, 2011]
[3] PRACE RI, http://www.prace-ri.eu/
[4] European Exascale Software Initiative, http://www.eesi-project.eu/


Introduction

Supercomputers nowadays consist of hundreds of thousands of processing units. For example, HLRS will install a 1 PFlops Cray XE6 with dual-socket 16-core AMD Interlagos nodes in over 3,500 nodes, i.e. 112,000 cores. The parallelization techniques of the programs which run on these machines, e.g. MPI and OpenMP, just to name two of the most used, were however designed when the largest computers were 100 times smaller. Due to constant development, these techniques still work, but they have to be expanded by additional programming models in order to keep up the scalability.
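The core count quoted above follows directly from the node configuration. As a quick check, using exactly 3,500 nodes (the text says "over 3,500", so the real figure may be slightly higher):

```python
nodes = 3500            # "over 3,500 nodes" in the text; 3,500 used here
sockets_per_node = 2    # dual-socket
cores_per_socket = 16   # 16-core AMD Interlagos

total_cores = nodes * sockets_per_node * cores_per_socket
print(total_cores)
```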

A promising approach is to combine the widely used MPI with shared memory parallelization using task-based parallelism, for instance SMPSs [1]. In this programming model, pieces of code, e.g. functions, are specified as potentially parallel using special markers. Given the tasks' data dependencies, either provided by the programmer or extracted automatically by the compiler, the program's tasks are scheduled to worker threads at run-time. Programming threads is already hard, due to the sharing of resources and possible race conditions; adding to that the issues of identifying and correctly stating the tasks'

Debugging on the Next Level: TEMANEJO


Figure 1: A simple dependency graph with ten tasks in two independent subtrees visualized by TEMANEJO. The node's interior color represents the task (the taskified function), the surrounding margin's color represents the task's state, e.g. queued (yellow) or running (green). The red margins indicate that the task has unfulfilled dependencies and can therefore not be queued yet. The node shapes (here triangle and box) denote two different worker threads for running tasks, which may be scheduled to different cores. The text labels and colours of the edges indicate the memory addresses of the dependencies.

dependencies and performance implications makes it difficult for the programmer to come up with an efficient parallelization strategy of a given program.

We developed a graphical debugger [2] which is capable of displaying the relevant information in an accessible manner and giving the programmer many possibilities for interaction with the running program.

The difference between task-based programming models and other ways of parallelising applications is that it is not known a priori when a task will be executed or which core will execute it. The only information per task is the data dependencies (basically the data's memory address), and therefore which other tasks depend on it. As a result a dependency graph, to be precise a directed acyclic graph (DAG), is created at runtime, with nodes being tasks and edges being the dependencies as seen in Figure 1.

New tasks will be added to the dependency graph while the program is running and finished tasks are removed from it. When all the dependencies of a given task are fulfilled, it can be executed by any thread at any time. Therefore every run of an application can result in a different order of execution on different processing units in the system. This makes task-based parallel programs extremely hard to debug with normal debuggers such as gdb. Two examples of task-based programming models are SMPSs and OMPSs. Both are developed in the TEXT project with participation of the Application, Models and Tools group at the HLRS [3].
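The scheduling behaviour described above can be imitated in a few lines. The following is a minimal sketch, not the SMPSs or OMPSs runtime: the task names and the `run_tasks` and `respects_dependencies` helpers are hypothetical, and a random choice among the ready tasks stands in for "any thread at any time".

```python
import random


def run_tasks(dependencies, seed):
    """Run tasks in an order that respects the dependency graph.
    Any task whose dependencies are fulfilled may run next; the
    random pick models the non-deterministic runtime scheduler."""
    rng = random.Random(seed)
    remaining = {task: set(deps) for task, deps in dependencies.items()}
    order = []
    while remaining:
        ready = sorted(t for t, deps in remaining.items() if not deps)
        if not ready:
            raise ValueError("cycle detected: graph is not a DAG")
        task = rng.choice(ready)
        order.append(task)
        del remaining[task]              # finished tasks leave the graph
        for deps in remaining.values():
            deps.discard(task)           # dependency is now fulfilled
    return order


def respects_dependencies(order, dependencies):
    """True if every task appears after all tasks it depends on."""
    position = {task: i for i, task in enumerate(order)}
    return all(position[d] < position[t]
               for t, deps in dependencies.items() for d in deps)


# Hypothetical graph: task -> list of tasks it depends on
graph = {"a": [], "b": [], "c": ["a"], "d": ["a", "b"], "e": ["c", "d"]}
orders = {tuple(run_tasks(graph, seed)) for seed in range(20)}
for order in sorted(orders):
    print(" -> ".join(order))
```

Each printed line is a valid execution order of the same program, which is why stepping through source lines in a conventional debugger says little about what a task-based program will do on the next run.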

A New Debugger

Traditional debuggers work command-based on a per-thread basis: to switch between concurrently executing threads, the programmer needs to issue commands and retrieve the state of the thread.

That means one steps through lines of code and switches from one thread to another. In task-parallel programming, one cannot see why a task can or cannot be executed. Moreover, up to now the programmer was left to her own devices, imagining the dependency graph or viewing post-processed graphs from program outputs.

We developed a graphical debugger for task-parallel programs in order to enable the programmer to visualize the dependency graph and interact


with the runtime while the program is executing (see Fig. 2).

Debugging the parallel code can be separated into two phases: analysis of the dependency graph and controlling, i.e. changing, the actual execution of the application. For the former, the information about the nodes (tasks) and edges (dependencies) of the dependency graph has to be extracted and communicated to a visualization tool. For the latter, a control tool (which can be the same as the visualization tool) will enable the programmer to send commands (in the following called requests) to the application.

Instead of stepping from line to line or putting breakpoints at certain lines or functions, the programmer of a task-based application wants to step through the dependency graph and inhibit or force the execution of tasks.

This is supplemented by attaching a debugger like gdb to the running process on the actual compute node.

In order to analyze and control the execution of an application we instrumented the runtime developed by Barcelona Supercomputing Center (BSC) with event handlers at certain points of the task's life cycle. The event handler is implemented as a weak reference to a library function which is dynamically linked to the application using the LD_PRELOAD environment variable.

This event handler is implemented as a library called AYUDAME (Spanish for "help me") and is called at distinct moments during runtime. It performs numerous tasks, such as passing the information to an external tool like TEMANEJO via a TCP connection or reacting to the event itself.


• Steffen Brinkmann
• Christoph Niethammer
• José Gracia
• Rainer Keller

University of Stuttgart, HLRS

Figure 2: Screenshot of TEMANEJO running a sparse matrix inversion parallelized with SMPSs.

On the front-end side any tool may attach via the open protocol. The debugger TEMANEJO (Spanish for "I handle you") performs two tasks:

1. It visualizes the dependency graph of the StarSs application, giving the user the possibility to verify the correctness of the task interdependencies and to optimize the dependency structure.

2. It controls the StarSs runtime environment by sending requests to the library. Requests can be to pause the runtime under specific conditions, to block tasks, to single- or multi-step through the graph, or to attach a debugger like gdb.

For displaying the dependency graph TEMANEJO needs at least two pieces of information: which tasks exist and what their dependencies are. The former is received when the program framework creates tasks, while the latter is gathered when the dependencies on the predecessors of each newly created task are analyzed. Further information passed is the status of each task, i.e. whether it is queued, running or finished, the memory address of a dependency variable and a timestamp.

All of this information is stored in a data structure in TEMANEJO for further analysis (e.g. number of dependencies of a task, longest and shortest paths through the graph, execution duration of tasks etc.). Using the node colour, node margin colour, node shape and edge colour as configurable indicators, the programmer can access the needed information in a convenient and intuitive way, allowing reduction of information, e.g. colour-coding the executing thread or showing imbalance.
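The kind of analysis mentioned above can be sketched on a small in-memory record of tasks. This is an illustrative data layout only; the field names (`deps`, `status`, `duration`) and the sample values are assumptions for the example and do not reflect TEMANEJO's internal data structure.

```python
# Hypothetical task records: id -> dependencies, status and measured duration (s)
tasks = {
    1: {"deps": [], "status": "finished", "duration": 2.0},
    2: {"deps": [1], "status": "finished", "duration": 1.0},
    3: {"deps": [1], "status": "running", "duration": 4.0},
    4: {"deps": [2, 3], "status": "queued", "duration": 1.5},
}


def dependency_counts(tasks):
    """Number of direct dependencies per task."""
    return {task_id: len(info["deps"]) for task_id, info in tasks.items()}


def critical_path_length(tasks):
    """Longest path through the graph, weighted by task duration."""
    finish_time = {}

    def finish(task_id):
        if task_id not in finish_time:
            start = max((finish(d) for d in tasks[task_id]["deps"]), default=0.0)
            finish_time[task_id] = start + tasks[task_id]["duration"]
        return finish_time[task_id]

    return max(finish(task_id) for task_id in tasks)


print(dependency_counts(tasks))
print(critical_path_length(tasks))
```

The longest duration-weighted path is a lower bound on the run time of the task graph regardless of the number of worker threads, which makes it a natural figure to surface when optimizing the dependency structure.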

Conclusions

With AYUDAME a scalable and extensible framework for tools for task-based programming models has been developed. It has been tested with SMPSs and OMPSs, but can be used with any task-based parallelization model as long as it is instrumented with calls to the AYUDAME event handler. The TEMANEJO debugger allows visualising the dependency graph, stepping through the task graph, blocking tasks and prioritising tasks.

Offering this functionality, TEMANEJO empowers the programmer of task-parallel applications to debug and optimize the application efficiently.

References
[1] Marjanovic, V., Labarta, J., Ayguade, E., Valero, M. Overlapping communication and computation by using a hybrid MPI/SMPSs approach. In ICS '10: Proceedings of the 24th ACM International Conference on Supercomputing, pages 5-16, New York, NY, USA, June 2-4, 2010, ACM

[2] Brinkmann, S., Gracia, J., Niethammer, C., Keller, R. TEMANEJO - a debugger for task-based parallel programming models. In Proceedings of ParCo '11, 2011, submitted for publication

[3] TEXT - Towards EXaflop applicaTions (www.project-text.eu)


Figure 1: LarKC – a high performance Semantic Web reasoning platform

The essence of the Semantic Web is the idea that the Web can exploit techniques from, e.g., formal Knowledge Representation to make information available in a machine-processable format, so that more intelligent user support can be achieved on the Web [1]. Such machine-understandable data formats, for instance the Resource Description Framework (RDF), enable novel uses of the Web such as semantic search, data integration, statistical analysis and others. Recent advances in the Semantic Web have forced Web applications to scale up to the requirements of the rapidly increasing amount of interconnected and distributed data, such as observed in the Linked Open Data repository for data located across the Web or the Linked Life Data semantic integration platform for the biomedical domain, but also in e-Science and e-Commerce (e.g. Ontoprise).

The massive and tremendously growing amount of data requires effective exploitation and is hence a great challenge for modern IT platforms and infrastructures. Another big challenge for achieving the efficiency and web-scaling of Semantic Web applications is the heterogeneous nature of explored data on the Web, which results in data inconsistencies, incompleteness, but also redundancies due to varying methodologies used during the data generation and collection.

Given the large problem sizes addressed by the Semantic Web and considering the complexity of some data exploration algorithms such as Random Indexing described below, it seems natural to explore the benefits of porting Semantic Web applications to High Performance Computing architectures.

Large Knowledge Collider

One of the major practical attempts to build a Semantic Web engine capable of processing billions of structured data items, i.e. web-scale data, is performed in the EU FP7 project LarKC (www.larkc.eu). LarKC, which stands for the Large Knowledge Collider, builds an experimental platform for massive distributed incomplete reasoning (see Figure 1), which aims at removing the scalability barriers recognized for most of the currently available reasoning

Towards High Performance Semantic Web – Experience of the LarKC Project

[Figure 1 diagram labels: Networks, Semantic Services, e-Infrastructure, Applications, Computing, Storage, Developers, Users, Access, Aggregation, Inference, Transformation, Sharing]


Figure 3: Comparison of time (a) and bandwidth (b) of inter-node communication of different MPI libraries for Java (on the HLRS NEC Nehalem cluster with Ethernet and Infiniband interconnects).

Figure 2: Parallelization in Semantic Web application workflows

engines. This goal is achieved by means of a number of original techniques adopted by LarKC, e.g. a highly innovative reasoning approach for merging the retrieval process and the reasoning by means of selection, identification, or transformation [2]. On the other hand, LarKC enables numerous novel IT infrastructure solutions to support those optimization techniques, such as High Performance or Grid Computing e-Infrastructures. Optimal resource provisioning is of special importance for ensuring the web-scale property of Semantic Web applications. However, since the introduction of a special e-Infrastructure for the Semantic Web, as done in LarKC, processing vast amounts of data is no longer a major bottleneck.

Nevertheless, leveraging those resources requires adaptations of the traditional (serial) application codes, i.e. their parallelization. Parallelization thus becomes a major challenge for the next-generation Semantic Web applications executed in the context of an e-Infrastructure.

Parallelization Strategies adopted by LarKC

In solving the issues related to large-scale Semantic Web applications, LarKC allows a reasoning application to be built on top of numerous lightweight Semantic Web computational blocks (plug-ins, see the current list on the LarKC Marketplace at http://www.larkc.eu/plug-in-marketplace), used for identification, selection, transformation, and actual reasoning. When combined in a common workflow, such as the one shown in Figure 2, these plug-ins can be efficiently utilized for solving problems of almost virtual dimensionality.

[Figure 2 diagram labels: Data Parallelism; Instruction Parallelism; Identifier, Selecter 1, Selecter 2, Reasoner, Decider, Query Transformer; Multithreading, MPI, Map Reduce; Operation 1, Operation 2, Operation 3, Operation 4]

[Figure 3 axis label, both panels: Message Length [Byte]]


Figure 4: Hybrid MPI + Java Threads communication pattern.

To support this feature, the composition of the plug-ins in a workflow enables parallel execution of the plug-ins on the high performance resources. In terms of LarKC applications, parallelization means identifying the concurrent regions of the application's data flow as well as its instruction flow, and mapping them to the independent processor units of a parallel system.

Among the most widely utilized and sustainable Semantic Web parallelization approaches, such as Multithreading, MapReduce, and the Message-Passing Interface (MPI), the latter is the most promising one in terms of the implementation effort needed for a serial application as well as in terms of the provided scalability. There have been several initiatives striving to provide HPC support for Java, which is the de-facto default programming language in the LarKC Semantic Web community. One of the most successful MPI implementations for Java has proved to be mpiJava, chosen for adoption in LarKC (Figure 3). The mpiJava framework is currently developed and supported by HLRS.

MPI Realization of Random Indexing

Random Indexing is a novel approach for vector space modelling [3]. The vector space represents the distributional profile of the words in relation to the considered contexts/documents. The main methodical value of this profile is that it enables calculation of the semantic similarity between the words in the scope of the document collection (text corpus), based on the cosine similarity function of the given words’ context vectors (1),

where f_q is a co-occurrence function between the word x from the word set X and each of the contexts c_j ∈ C_m, m is the total number of contexts, and n is the total number of words in all contexts.
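The cosine similarity between two words' context vectors can be computed as in the sketch below. It uses plain co-occurrence counts over a handful of contexts; actual Random Indexing additionally compresses these vectors using random index vectors, which is omitted here. The words and counts are invented for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equally long context vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # a word that occurs in no context is similar to nothing
    return dot / (norm_u * norm_v)


# Hypothetical context vectors: entry j counts occurrences of the word in context c_j
context_vectors = {
    "protein": [3, 0, 2, 0],
    "enzyme":  [2, 0, 3, 0],
    "cluster": [0, 4, 0, 1],
}

print(cosine_similarity(context_vectors["protein"], context_vectors["enzyme"]))
print(cosine_similarity(context_vectors["protein"], context_vectors["cluster"]))
```

Words that occur in the same contexts obtain a similarity close to 1, while words with no shared contexts obtain 0.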

However, popular Random Indexing implementation packages such as Airhead [4] become increasingly ineffective when handling large amounts of data, e.g. as collected by Linked Life Data. LarKC has examined a domain-decomposition-based parallel implementation of Random Indexing, as depicted in Figure 4.
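The idea of domain decomposition can be sketched without an MPI installation by simulating the ranks in a single process. In the sketch below each simulated rank counts word/document co-occurrences on its own share of the documents and the partial results are then merged, which is the step a reduction would perform in a real MPI program. The function names and the tiny corpus are hypothetical and do not reproduce the LarKC implementation.

```python
from collections import Counter


def partition(items, n_parts):
    """Round-robin split of the problem domain into n_parts chunks."""
    return [items[i::n_parts] for i in range(n_parts)]


def local_counts(docs):
    """Work done by one rank: co-occurrence counts for its own documents."""
    counts = Counter()
    for doc_id, text in docs:
        for word in text.split():
            counts[(word, doc_id)] += 1
    return counts


def merged_counts(docs, n_ranks):
    """Simulate n_ranks processes and merge their partial results."""
    total = Counter()
    for chunk in partition(docs, n_ranks):
        total.update(local_counts(chunk))  # stands in for an MPI reduction
    return total


corpus = [
    (0, "semantic web reasoning"),
    (1, "web scale data"),
    (2, "parallel reasoning on web data"),
    (3, "semantic search"),
]
print(merged_counts(corpus, n_ranks=2) == merged_counts(corpus, n_ranks=1))
```

Because each document is processed by exactly one rank, the merged result is independent of the number of ranks; only the work per rank changes.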

With regard to the aforementioned Airhead library [5], very promising performance characteristics were obtained for both pure MPI and hybrid MPI-Java Threads implementations (see Figure 5). The document set based on a selection of the Wikipedia articles (1M high density entries, 16 GB disk space) was used for this performance

[Figure 4 diagram labels: Problem Domain; MPI Process 1, MPI Process 2; Compute Node 1, Compute Node 2; Thread Pool]

$$\forall x \in X_n,\ \exists\, v = \left[\, f_q(x, c_j) \,\right],\quad j = 1, \dots, m \qquad (1)$$


• Alexey Cheptsov
• Matthias Assel

University of Stuttgart, HLRS

benchmark. Detailed results for different input document sizes as well as cluster configurations are reported in [6].

Outlook

Recent advances in the Semantic Web require the underlying (Java) applications to scale up to the requirements of the rapidly increasing amount of processed data, such as data coming from millions of sensors and mobile devices, or terabytes of data collected during scientific experiments using laboratory equipment. Introducing HPC in the Semantic Web domain can greatly help to meet this challenge.

Traditionally, the Semantic Web and the High Performance Computing community have been somewhat disjoint. However, as the needs and capabilities of these two communities continue to converge, it turns out to be beneficial for both to leverage their respective technologies. Parallel realization of serial codes is a key enabler of high performance architectures and is therefore a great challenge for the majority of Semantic Web applications. LarKC aims at simplifying the development of high performance, parallelized applications, and thus bridging the gap between Semantic Web and HPC.

References
[1] Daconta, M.C., Smith, K.T., Obrst, L.J. The Semantic Web: a Guide to the Future of XML, Web Services, and Knowledge Management. John Wiley & Sons, Inc., 2003

[2] Fensel, D., van Harmelen, F. Unifying Reasoning and Search to Web Scale. In: IEEE Internet Computing 11(2), pp. 96-95, 2007

[3] Sahlgren, M. An introduction to random indexing, Methods and Applications of Semantic Indexing Workshop at the 7th International Conference on Terminology and Knowledge Engineering TKE 2005, pp. 1-9, 2005

[4] Jurgens, D., Stevens, K. The S-Space Package, An Open Source Package for Word Space Models. Proceedings of the ACL 2010 System Demonstrations, pp. 30-35, Association for Computational Linguistics, 2010

[5] Cheptsov, A., Assel, M., Koller, B., Gallizo, G. Enabling High Performance Computing for Java Applications using the Message-Passing Interface, P. Iványi and B.H.V. Topping (eds.), Proceedings of the Second International Conference on parallel, distributed, grid and cloud computing for engineering, 2011

[6] Assel, M., Cheptsov, A., Czink, B., Damljanovic, D., Quesada, J. MPI Realization of High Performance Search for Querying Large RDF Graphs using Statistical Semantics. Proceedings of the 1st Workshop on High-Performance Computing for the Semantic Web, collocated with the 8th Extended Semantic Web Conference (ESWC 2011), to appear in 2011

Figure 5: Performance characteristics of the parallel Random Indexing realization (a) and comparison of pure MPI vs. MPI + Java Threads communication performance (b).

[Figure 5 legend: Total Execution, Similarity Search, Doc. space Loading, MPI Communication; Pure MPI, MPI + Java Threads. Horizontal axis, both panels: CPU Nodes]


Figure 1: “Plug your Business into IT”

Business and IT Alignment using a Model-based Plug-in Framework

The alignment of Information Technology and Business is still a highly complex and hard-to-automate process and therefore remains mainly driven and performed by humans. By nature, the background and knowledge of those humans can differ, depending on their role within their organization. Since different parties often don't share a common knowledge space, the whole situation is likely to become complex.

plugIT in General

The plugIT project [1] is based on the observation of the necessity to align Business and IT [2] due to the role change of IT from an enabler to an industrial sector. plugIT aims at developing an IT Socket that realizes the vision of “plugging” business into IT in a way similar to the one used to provide electricity via a socket to any device that can be plugged in. This challenge can be taken up by capitalizing on semantic technologies for IT Governance. In plugIT, the Next Generation Modelling Framework [3] is developed, which relies on research advances yielding the following benefits:

• A tighter involvement of domain experts is made possible to express formal and semi-formal knowledge via the use of graphical modelling languages

• Different graphical modelling languages for different views and different levels of formal expressiveness can be used

• Domain specific notations for semantics are introduced by merging formal concepts of semantics with graphical notations

plugIT – Plug Your Business into IT


Figure 2: Interaction of the plugIT IT Socket and the OPS system

An HLRS Use Case

HLRS is one of three use case partners within plugIT. The detailed use case of HLRS covers an Online Proposal Submission process (OPS) in which project applicants can submit a project request to access HLRS computing resources and perform their computational tasks.

Based on the requirements of the project applicant, model translations are used to find the best fitting offer, represented as a set of recommendations for Service Level Agreements (SLAs) [4]. The IT Socket supports the whole process of creating a proposal, analysing the proposal parameters with respect to existing models and finally recommending and generating SLAs. This is enabled thanks to the use of a so-called semantic kernel which uses graphical models combined with semantic information.

In the current production version of the OPS process, HLRS uses a web-form-based application enabling a project applicant to make requests for computing resources at HLRS. The applicant can enter various pieces of information describing his/her computing resource needs. Once submitted, the request is analyzed by a project approver and approved or, in case modifications are necessary, sent back for updates. So far, all this has been done without any automated supporting processes and has relied heavily on the knowledge of the project approver.

Necessary Enhancements to cover future Developments

Now, with the advent of new paradigms like Cloud Computing, this process needs to be enhanced. Whilst up to now most of the applicants can be assumed to be specialists within their domain, which makes the process simple to manage, the offering of computing resources needs to become more intelligent in the future. In the long term, we need to ensure that applicants with only moderate knowledge of the underlying system infrastructure are also able to apply for resources.

[Figure 2 diagram labels: IT Provider, Business Client, „Project“ Description, IT-Infrastructure & SLA Description, Reference Models, Semantic Technology, Challenge!, Consultant]


Figure 3: HLRS in plugIT

Moreover, the resources themselves are also getting more and more complex. The acquisition of a new Cray supercomputer at HLRS [5] is just one step towards a new generation of highly complex infrastructures. The role of the project approver thus gets more and more difficult and the benefit of any supporting technology becomes obvious.

The plugIT Enhancements

By means of the plugIT IT Socket, HLRS has concentrated on the provisioning of support for the project approvers. In parallel, the realization of the necessary steps for introducing SLAs into the OPS process has been addressed.

As plugIT follows a model-based approach [6][7][8], the first action within the project was to create a number of reference models which were intended to provide the necessary foundations for the project development. This was done via an online model repository containing graphical models related to the IT infrastructure and IT services of HLRS. Modelling was performed with the Next Generation Modelling Framework, one of the deliverables of the project. All the models are linked to each other and they represent the knowledge of HLRS project approvers.

The OPS process is now executed as follows: A project applicant sends a request for computing resources to the HLRS project approver. Upon reception of the request, the approver makes use of the IT Socket to obtain recommendations on which SLA to offer. This is realized through the automated processing of the project applicant’s request by a semantic workflow of the IT Socket. The recommendations are SLA offers that define the category of the SLA that could be

[Figure 3 diagram labels: plugIT IT Socket; plugIT NGMF; HLRS Project Approver; Project Applicant; HLRS Modeller; 1. IT Infrastructure (Compute Resources + Configuration) Models; 2. IT Service (SLAs + Criteria) Models; create models; send models; make request for compute resources; offer specific to compute resources; OPS application enacts HLRS workflow]


• Axel Tenschert
• Pierre Gilet
• Bastian Koller

University of Stuttgart, HLRS

proposed to the project applicant. In the current scenario, the categories are bronze, silver or gold, based on the quality of the offer. Each offer relates to a dedicated computing resource. In addition, the recommendations are ranked and visualized in a way that makes it simple for the project approver to understand which SLA offer recommendations best fit the project applicant’s requirements. The HLRS project approver has the possibility to view the graphical models if he/she requires more information regarding the SLA offer recommendations.
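The ranking step can be pictured with a small sketch. Only the three categories (bronze, silver, gold) come from the text; the resource names, core counts and the "smallest surplus first" criterion are assumptions made up for this illustration, since the actual ranking is produced by the semantic workflow of the IT Socket.

```python
def rank_offers(offers, requested_cores):
    """Keep offers that can satisfy the request and put the
    closest fit (smallest surplus of cores) first."""
    feasible = [o for o in offers if o["cores"] >= requested_cores]
    return sorted(feasible, key=lambda o: o["cores"] - requested_cores)


# Hypothetical SLA offers, each tied to a dedicated computing resource
offers = [
    {"category": "bronze", "resource": "cluster-a", "cores": 64},
    {"category": "silver", "resource": "cluster-b", "cores": 512},
    {"category": "gold", "resource": "cluster-c", "cores": 4096},
]

ranked = rank_offers(offers, requested_cores=256)
print([o["category"] for o in ranked])
```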

The possibility to plug the business requirements (the project applicant’s request) into the IT domain improves the efficiency and overall performance of the OPS system and allows HLRS to broaden its customer base. The graphical modelling approach also shows obvious advantages in terms of maintenance of information and knowledge transfer.

Facts

plugIT is a project funded by the European Union within the 7th Framework Programme. The consortium consists of eight project members. plugIT started on March 1, 2009 and will run until August 31, 2011.

The Partners
• BOC Asset Management GmbH (AT)
• Telespazio Italia (IT)
• University of Vienna, Department of Knowledge & Business Engineering (AT)
• Foundation for Research and Technology Hellas (GR)
• Fachhochschule Nordwestschweiz (CH)
• CINECA (IT)
• Innovation Technology Group SA (PL)
• University of Stuttgart, HLRS (GER)

Website
http://plug-it.org/

References
[1] plugIT website, http://plug-it.org

[2] Woitsch, R., Karagiannis, D., Plexousakis, D., Hinkelmann, K. Business and IT Alignment: The IT-Socket. E&I Elektrotechnik und Informationstechnik 126, pp. 308-321, 2009

[3] Next Generation Modelling Framework Portal of the 2nd plugIT Prototype: http://83.65.190.84/plugIT/workbench/

[4] Koller, B. Enhanced SLA Management in the High Performance Computing Domain, Ph.D. Dissertation, University of Stuttgart, 2011

[5] Cray wins Supercomputer Contract From the University of Stuttgart valued at more than $60 Million, http://investors.cray.com/phoenix.zhtml?c=98390&p=irol-newsArticle&ID=1486975&highlight

[6] Woitsch, R., Karagiannis, D., Plexousakis, D., Hinkelmann, K. Plug your Business into IT: Business and IT-Alignment using a Model-based IT-Socket, eChallenges e-2009 Conference (eChallenges09), Turkey, IOS Press, 2009

[7] Bork, D., Sinz, E.J. Design of a SOM Business Process Modelling Tool based on the ADOxx Meta-modelling Platform. In de Lara, J., Varro, D., Margaria, T., Padberg, J., Taentzer, G., eds.: 4th International Workshop on Graph-based Tools (GraBaTs 2010), Enschede, The Netherlands, pp. 89-101, 2010

[8] Bézivin, J. Model Driven Engineering: An Emerging Technical Space, Generative and Transformational Techniques in Software Engineering, Volume 4143, pp. 36-64, 2006


Figure 1: Sample of simulation in the design phase

Energy consumption and implicit CO2 emissions of computing and data centres have increased drastically over recent years and are expected to increase even further. Besides the rising costs for energy consumed by IT service centres, people are becoming more and more aware of the consequences of the high energy demand of the IT sector, like the impact on global warming and CO2 emissions. As an example, worldwide data centres’ CO2 emissions are already equivalent to about half of the total airlines’ CO2 emissions and are expected to exceed 40% of the Total Cost of Ownership of worldwide IT by 2012. Data centre electricity consumption accounts for almost 2% of the world production and their overall carbon emissions are greater than those of Argentina and the Netherlands together [1].

GAMES (Green Active Management of Energy in IT Service centres)


Since computing demand and electricity prices are rising whilst energy is becoming a dwindling resource, energy consumption of IT systems and data centre energy efficiency are expected to become a priority for the industry. Despite the fact that many stakeholders have been undertaking significant efforts in delivering new and more energy-efficient IT equipment allowing significant cost and energy savings, the problem of the energy efficiency of Information Systems as a whole has unfortunately not been properly addressed so far.

Green Computing is a new discipline and practice aimed at designing and using computing resources in an environmentally aware way. It originated more than a decade ago with the main goal of reducing energy consumption of computing resources, yet maintaining a clear focus on the impact on the environment. Although much progress has been made by Green Computing, making new chips and servers available which undoubtedly consume less energy, in most cases improvements in efficiency are devoured by increasing demand for computing power and capacity, driven by new digitized business processes and services.

The GAMES project [2] aims at developing tools and methodologies to improve the energy efficiency of IT service centres by enabling active management of resources and software equally. Whilst storage hosts can principally reduce the computing frequency to save energy, compute providers and in particular High Performance Computing centres have more dynamic and changing demands towards the infrastructure usage, such as the flexible degree of scale-out of a process or the different scope of data access and storage of different applications. Most resources in such an environment do not allow for fast enough adaptation of their energy parameters without affecting the overall performance. What is more, most parameters and relationships between usage and consumption are not even known as yet, e.g. the total energy profile of an application that runs at maximum CPU clock rate for a short time may be lower than that of the same application running at half clock rate for a longer time, depending on the behavioural profile.
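The clock-rate trade-off comes down to energy being power multiplied by time. The numbers in the sketch below are hypothetical and only serve to show that either setting can win, depending on how strongly the run time of an application depends on the clock rate.

```python
def energy_joules(power_watts, runtime_seconds):
    """Energy consumed at constant power draw."""
    return power_watts * runtime_seconds


# Hypothetical node: 200 W at full clock, 120 W at half clock
full_clock = energy_joules(200.0, 100.0)

# Compute-bound profile (assumed): halving the clock doubles the run time
half_clock_compute_bound = energy_joules(120.0, 200.0)

# Memory-bound profile (assumed): run time grows by only 10%
half_clock_memory_bound = energy_joules(120.0, 110.0)

print(full_clock, half_clock_compute_bound, half_clock_memory_bound)
```

With these assumed values the full clock rate is the more economical choice for the compute-bound profile, while the half clock rate wins for the memory-bound one, which is why the behavioural profile of the application has to be known.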

Therefore the GAMES project will examine the energy profiles of different applications and systems according to their specific behaviour in more detail, deriving energy profiles from this which indicate how to configure the infrastructure for best energy and performance efficiency. It will expose the profile parameters to enable developers to write energy efficient applications and configure performance according to their needs. GAMES will furthermore develop a monitoring and management system tightly coupled to the resource infrastructure, thus enabling dynamic, flexible and immediate reaction to changing requirements, without affecting the overall execution performance.


In particular, the GAMES project aims at developing a set of innovative methodologies, metrics, Open Source ICT services and tools for the active management of energy efficiency of IT Service Centres. It focuses on the following two aspects:

1. Co-design of energy-aware information systems and their underlying services and IT service centre architectures in order to satisfy users' requirements (Quality of Service), service performance and context, addressing energy efficiency and controlling emissions (cp. Fig. 1). A combination of Green Performance Indicators is proposed to evaluate if and to what extent a given service and workload configuration affects the carbon footprint emissions’ levels;

2. Run-time management of IT Service Centre energy efficiency, exploiting the adaptive behaviour of the system at run time, both at the service/application and IT architecture levels (including IT components like servers and storage), whilst considering the interactions with the facility management as well in an overall unifying vision.

In particular, GAMES will advance the current scientific and technological state of the art in energy efficiency for IT service centres in the following domains:

• GAMES will create and make available an integrated methodology (GAMES co-design methodology) for the shared design of “Green IT Service Centres”, trading off Quality


• Alexander Kipp

University of Stuttgart, HLRS

of Services, users’ business and functional requirements against energy efficiency and emissions;

• GAMES will complement and extend one of the most leading-edge Open Source data centre monitoring tools, namely NAGIOS, with the capacity of assessing, monitoring and controlling, both in a reactive and proactive way, energy costs and emissions (GAMES Energy Efficiency Tool) in real time of alternative yet viable options/configurations for distributing services among the virtualized machines, workload among servers and storage devices as well as balancing power against heat/temperature at facility level;

• The GAMES web-tool will be enriched with advanced knowledge-based and information extraction features (GAMES Knowledge & Mining Module) by originally combining data mining, semantic and context management technologies for closely aligning users' business requirements for power demand with historical trend and real available resources;

• The GAMES tooling framework will provide an adaptive control feature (GAMES Monitoring & Adaptive Control Infrastructure), matching the planned behaviour with the output dynamically provided by the energy sensing and monitoring infrastructure, the user context and historical patterns, for evaluating if and to what extent the adopted course of actions will contribute to effectively manage the energy efficiency;

• GAMES will define comprehensive energy efficiency assessment metrics (GAMES Green Performance Indicators) as an enabler to combine energy efficiency facility features with IT infrastructure and business/application energy features.

HLRS is participating in two roles within GAMES. First of all, HLRS is in the role of a potential end-user of GAMES as a national supercomputing centre with increasing power demands of currently 5 MW for operating the different hardware systems for academia and industry. In this role HLRS also supports the activities in creating the knowledge and information base for the management framework. Additionally, HLRS contributes to GAMES as a technology and software provider of highly scalable monitoring solutions and IT solutions driven by service-oriented architecture, with a High Performance Computing focus.

Participants
• Engineering Ingegneria Informatica, I
• Politecnico di Milano, I
• High Performance Computing Center Stuttgart (HLRS), GER
• Technical University of Cluj-Napoca, RO
• IBM ISRAEL - Science and Technology LTD, IL
• Christmann Informationstechnik, GER
• ENERGOECO, RO
• ENEL Si, I

References
[1] Kaplan, J. M., Forrest, W. and Kindler, N. Revolutionizing Data Center Energy Efficiency. McKinsey & Company, July 2008

[2] http://www.green-datacenters.eu


Figure 1: Part of an exemplary scheduling graph of SMPSs with communicating tasks

With many-core processors offering ever more compute power per socket, and large-scale supercomputers built from these bricks, the long-standing questions about parallel programming models are being raised again. Venturing into the realm of Petascale applications, several key questions regarding scalability in terms of memory and processing overhead per parallel instantiation are considered and weighed against the need for portability, readability and maintainability.

The TEXT project is funded by the EC as part of the INFRA-2010 call for two years. The nine partners from Spain, Germany, the UK, France, Greece and Switzerland share the vision that the key component to support high productivity and efficient use of a system is the programming model. Among the partners are four HPC centers, also members of the PRACE collaboration, with JSC having a Petaflops machine in production. The project centers around MPI/SMPSs, which is part of the Star-SuperScalar (StarSs) model developed by Barcelona Supercomputing Center.

Overview

The TEXT project's technology combines the available scalability of the Message Passing Interface (MPI) across compute nodes with the possibilities of per-node concurrency via asynchronous tasks of SMP-Superscalar (SMPSs). Given an existing application using MPI for work decomposition, the programmer may further parallelize the application into so-called tasks using SMPSs. These tasks are then dynamically scheduled by the SMPSs runtime environment to be executed asynchronously. The runtime generates a graph and efficiently maps ready-to-execute tasks onto the available cores, taking care of dependencies among the tasks. With fewer MPI ranks running, one has lower connectivity, therefore lower memory overhead for MPI-internal buffers, and potentially bigger message sizes, which together with the SMPSs task model allow for better communication-computation overlap. In the latter case, the MPI communication is handled within an SMPSs task and scheduled by the runtime whenever the computed data is available at the sender.
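The idea of mapping ready-to-execute tasks onto cores while respecting dependencies can be pictured with a small sketch in plain C. It is not the SMPSs runtime's actual algorithm, only a simplified list-scheduling illustration; the function name schedule_waves, the edge arrays and the notion of a "wave" (one time step in which at most one task runs per core) are assumptions made for this example.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper, not part of SMPSs.
 * Assigns each of n tasks to a wave so that a task only runs after all
 * tasks it depends on, with at most `cores` tasks per wave.
 * Dependency e means: task from[e] must finish before task to[e] starts.
 * Returns the number of waves, or -1 on a cyclic graph or allocation failure. */
int schedule_waves(int n, int n_edges, const int *from, const int *to,
                   int cores, int *wave_of)
{
    assert(n > 0 && n_edges >= 0 && cores > 0);

    int *pending = calloc((size_t)n, sizeof *pending); /* unmet dependencies */
    int *ready = malloc((size_t)n * sizeof *ready);    /* tasks picked per wave */
    if (pending == NULL || ready == NULL) {
        free(pending);
        free(ready);
        return -1;
    }

    for (int e = 0; e < n_edges; e++)
        pending[to[e]]++;
    for (int t = 0; t < n; t++)
        wave_of[t] = -1;                               /* -1 = not yet scheduled */

    int done = 0, wave = 0;
    while (done < n) {
        /* Pick up to `cores` tasks whose dependencies are all satisfied. */
        int n_ready = 0;
        for (int t = 0; t < n && n_ready < cores; t++)
            if (wave_of[t] < 0 && pending[t] == 0)
                ready[n_ready++] = t;

        if (n_ready == 0) {                            /* nothing can run: cycle */
            free(pending);
            free(ready);
            return -1;
        }

        /* "Execute" the wave and release the successors of each task. */
        for (int i = 0; i < n_ready; i++) {
            int t = ready[i];
            wave_of[t] = wave;
            for (int e = 0; e < n_edges; e++)
                if (from[e] == t)
                    pending[to[e]]--;
        }
        done += n_ready;
        wave++;
    }

    free(pending);
    free(ready);
    return wave;
}
```

For a diamond-shaped graph (task 0 feeds tasks 1 and 2, which both feed task 3), two cores suffice to run tasks 1 and 2 side by side, whereas a single core serializes all four tasks.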

Similar to OpenMP, the programmer annotates her application using pragmas, specifying the functions to be run as tasks, their input, output and inout parameters, as well as their sizes. An example of a simple function may be:

#pragma css task input(SIZE) inout(v[SIZE])
void compute_vector(float *v, int SIZE) {...}

Towards EXascale ApplicaTions (TEXT)

Rainer Keller, José Gracia
University of Stuttgart, HLRS

After the programmer has initialized the environment using #pragma css start, any call to the above function will be assigned to a thread and executed asynchronously on the cores of the node. Furthermore, synchronization points may be necessary to wait for and guarantee a common view on the computed data. The main point here, however, is the simple point-to-point dependency between asynchronously executed tasks, which allows the graph scheduler to parallelize independent tasks more flexibly.
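A minimal sketch of the call sequence described above might look as follows. The task and start pragmas follow the syntax quoted in this article; the barrier and finish pragmas, the comma-separated parameter lists and all function names (scale_vector, add_vector, run_pipeline) are assumptions for illustration and should be checked against the SMPSs documentation. A plain C compiler ignores the unknown pragmas, so the same source also runs sequentially.

```c
#include <assert.h>

/* Task: multiply every element of v by factor, in place. */
#pragma css task input(n, factor) inout(v[n])
void scale_vector(float *v, int n, float factor)
{
    for (int i = 0; i < n; i++)
        v[i] *= factor;
}

/* Task: add src to dst element by element. */
#pragma css task input(n, src[n]) inout(dst[n])
void add_vector(float *dst, const float *src, int n)
{
    for (int i = 0; i < n; i++)
        dst[i] += src[i];
}

/* Hypothetical driver: scales a and b, adds b into a, returns the sum of a. */
float run_pipeline(float *a, float *b, int n)
{
    assert(n > 0);
    float sum = 0.0f;

#pragma css start                 /* initialize the task environment */
    scale_vector(a, n, 2.0f);     /* touches only a */
    scale_vector(b, n, 3.0f);     /* touches only b, independent of the call above */
    add_vector(a, b, n);          /* reads b, updates a: depends on both calls */
#pragma css barrier               /* assumed pragma: wait for all tasks */
    for (int i = 0; i < n; i++)   /* safe to read a on the main thread now */
        sum += a[i];
#pragma css finish                /* assumed pragma: shut the environment down */

    return sum;
}
```

The two scale_vector calls share no data and could run concurrently, while add_vector has a point-to-point dependency on each of them through its parameters. The synchronization point is only needed because the main thread reads a directly.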

Using a pre-compiler, in our case SMPSs-cc, the annotated C or Fortran source code is amended with further administrative code, and finally passed to the native back-end compiler. Using the -keep compiler option, the programmer may see the actually generated intermediate source code, which is compiled by the back-end compiler.

Aim of the Project

The StarSs programming model has shown good results in its GridSs and CellSs incarnations for execution in the Grid and on the IBM Cell architecture, respectively. In the TEXT project, we hope to extend the programming model onto the existing MPI-parallel applications, which are important to the compute centers.

The applications chosen have already been used in the context of the PRACE project, and include SPECFEM3d (UPPA), PSC and PEPC (JSC), BEST and LS1 (HLRS) and CPMD (IBM). Based on the MPI-parallel version, the combination of node-local parallelization using SMPSs plus MPI will be investigated using performance analysis.

Each application offers its own challenges, e.g. LS1, being a C++ code, has a very elaborate class structure and is one of the first C++ codes to be used with StarSs, while BEST uses some of the more intricate features of Fortran 95 and Fortran 2003. Both of these codes are being ported to MPI+SMPSs.

While the option to keep the intermediate code allows traditional tools to work, this is cumbersome. Therefore performance and debugging tools will be enhanced to support the special requirements of SMPSs, for example being able to debug without having to fall back to debugging the intermediate code, or being able to debug using break-points in tasks being generated.

To enhance performance, it will be necessary to evaluate proper chunk sizes of the tasks and estimate the overhead introduced due to dependencies.
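The trade-off behind the chunk size can be illustrated with a deliberately simple cost model, sketched here in C. It is an assumption made for this example and not a measurement or a model used by the project: every task is charged a fixed overhead (task creation plus dependency tracking) on top of a cost proportional to its chunk size, and tasks run in rounds of one task per core. The function names and parameters are hypothetical.

```c
#include <assert.h>

/* Hypothetical cost model: run time for n_elems elements split into tasks
 * of `chunk` elements each, executed in rounds on `cores` cores.
 * t_elem is the cost per element, t_overhead the fixed cost per task.
 * A short final chunk is charged as a full one, so this is an upper bound. */
double estimate_time(long n_elems, long chunk, int cores,
                     double t_elem, double t_overhead)
{
    assert(n_elems > 0 && chunk > 0 && cores > 0);
    long tasks = (n_elems + chunk - 1) / chunk;   /* ceiling division */
    long rounds = (tasks + cores - 1) / cores;    /* tasks per core, rounded up */
    return (double)rounds * ((double)chunk * t_elem + t_overhead);
}

/* Tries power-of-two chunk sizes up to n_elems and returns the smallest
 * chunk size with the lowest estimated time. */
long best_chunk(long n_elems, int cores, double t_elem, double t_overhead)
{
    long best = 1;
    double best_time = estimate_time(n_elems, 1, cores, t_elem, t_overhead);

    for (long c = 2; c <= n_elems; c *= 2) {
        double t = estimate_time(n_elems, c, cores, t_elem, t_overhead);
        if (t < best_time) {
            best_time = t;
            best = c;
        }
    }
    return best;
}
```

Under this model very small chunks are dominated by per-task overhead, while very large chunks leave cores idle; for 1024 elements on 4 cores with an overhead worth 10 elements per task, the estimate is lowest at a chunk size of 256, i.e. one task per core.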

The Partners
• Barcelona Supercomputing Center (BSC)
• High Performance Computing Center Stuttgart (HLRS)
• Jülich Supercomputing Center (JSC)
• Edinburgh Parallel Computing Center (EPCC)
• Foundation for Research and Technology Hellas (FORTH)
• University of Manchester (UMAN)
• Université de Pau et des Pays de l’Adour (UPPA)
• Universitat Jaume I de Castellón (UJI)
• IBM Research, Zurich


Authors

Christoph Altmann altmann@iag.uni-stuttgart.de
Fakher F. Assaad assaad@physik.uni-wuerzburg.de
Matthias Assel assel@hlrs.de
Felix Bensing bensing@iag.uni-stuttgart.de
Arne Biastoch abiastoch@ifm.geomar.de
Steffen Brinkman brinkmann@hlrs.de
Alexey Cheptsov cheptsov@hlrs.de
Tillmann A. Friederich friederich@iag.uni-stuttgart.de
Volker Gaibler vgaibler@mpe.mpg.de
Leonhard Gantner leonhard.gantner@imk.fzk.de
Gregor Gassner gassner@iag.uni-stuttgart.de
Pierre Gilet gilet@hlrs.de
José Gracia gracia@hlrs.de
Sarah Jones sarah.jones@kit.edu
Norbert Kalthoff norbert.kalthoff@imk.fzk.de
Rainer Keller keller@hlrs.de
Manuel Keßler kessler@iag.uni-stuttgart.de
Alexander Kipp kipp@hlrs.de
Markus J. Kloker kloker@iag.uni-stuttgart.de
Bastian Koller koller@hlrs.de
Thomas C. Lang lang@physik.uni-wuerzburg.de
Zi Yang Meng meng@theo3.physik.uni-stuttgart.de
Claus-Dieter Munz munz@iag.uni-stuttgart.de
Alejandro Muramatsu mu@theo3.physik.uni-stuttgart.de
Christoph Niethammer niethammer@hlrs.de
Mark Parsons m.parsons@epcc.ed.ac.uk
Michael Resch resch@hlrs.de
Ulrich Rist rist@iag.uni-stuttgart.de
Juliane Schwendike juliane.schwendike@kit.edu
Björn Selent selent@iag.uni-stuttgart.de
Marc Staudenmaier staudenmaier@iag.uni-stuttgart.de
Axel Tenschert tenschert@hlrs.de
Stefan Wesner wesner@hlrs.de
Stefan Wessel wessel@theo3.physik.uni-stuttgart.de
Uwe Wössner woessner@hlrs.de

Publisher
Prof. Dr.-Ing. Dr. h.c. Dr. h.c. Michael M. Resch

Editor & Design
F. Rainer Klank, HLRS klank@hlrs.de
Carina Möhlig, HLRS moehlig@hlrs.de
Antje A. Häusser, HLRS haeusser@hlrs.de


Articles are reprints of inSiDE Vol. 8 No. 1 Spring 2010 - Vol. 9 No. 2 Autumn 2011

inSiDE is published two times a year by The Gauss Centre for Supercomputing (HLRS, LRZ, JSC)


GCS HLRS

High Performance Computing Center Stuttgart
Nobelstrasse 19 | 70550 Stuttgart | Germany
phone ++49 (0)7 11 - 685 - 8 72 69
fax ++49 (0)7 11 - 685 - 8 72 09
www.hlrs.de

© HLRS 2012