The Alabama Metadata Portal: http://portal.gsa.state.al.us
By Philip T. Patterson
Geological Survey of Alabama
420 Hackberry Lane
P.O. Box 869999
Tuscaloosa, AL 35468-6999
Telephone: (205) 247-3611
Fax: (205) 349-2861
e-mail: [email protected]
INTRODUCTION
In recent years, federal, state, and local government entities in Alabama have made substantial investments in the collection, management, and use of geospatial data. However, there has been no large-scale effort to share data effectively and efficiently. The result has been unnecessary expenditure on redundant data creation.
Most Alabama Geographic Information Systems (GIS) users currently have broadband internet access. The increased network connectivity and high data-transmission rates have produced the expectation that large amounts of data can be accessed instantly. This demand for data access has motivated the Alabama Emergency Management Agency (AEMA) and the Geological Survey of Alabama (GSA) to collaborate in developing the geospatial data portal, which allows cooperators and users to search for, discover, and access geospatial data (GSA, 2006).
BACKGROUND
Before the project started, extensive research was performed on a variety of data delivery options. The majority of the options were related to data clearinghouses, which are mainly useful for specific types of static data such as imagery, civic boundaries, and centerlines. However, the data delivery website to be built was not intended for static data alone. The need was to build a robust compilation of all types of vector and raster data, ranging from general datasets to obscure data specific to individual projects. Long-term administration responsibilities for this type of complex compilation site were also a concern for GSA. Eventually, the grant for site development would end, and GSA would have to support managing and updating the site from internal resources.
With support from Environmental Systems Research Institute, Inc. (ESRI), we addressed this concern with a modified out-of-the-box application using open-source web applications in conjunction with ArcIMS, ArcSDE, and an underlying database management system (DMS).
The resulting site provides the functions of a clearinghouse for general data and a search engine for unique data. It also offers semi-automated administration, which allows users, as well as the administrator, to manage the site. This solution addresses both the data delivery goals and the long-term administration concerns posed by this project.
CONNECTION
This search engine and download site provide the framework for a mutual geospatial user community of organizations and stakeholders that facilitates discovery, sharing, and delivery of GIS content and services. The portal also facilitates the organization of content and services such as directories, search tools, community information, and support resources and applications.
The underlying structure of the portal is a generalized three-part connection, as follows (Figure 1): (1) the portal connects to a data provider’s metadata library, which grants users the rights to publish specified metadata records to the portal’s online catalog; (2) the data user connects to the portal’s search option to locate data using the portal’s search engine without physically browsing through the stakeholder’s data; and (3) the data user connects to the data provider for download, data capture, or identification of the data resource.
By storing only metadata records in our catalog, we can index a large amount of virtual data, and, more importantly, GSA and AEMA do not have to store the physical data. Our goal is to automate the tasks of data discovery and distribution so that, once portal connections are complete, minimal maintenance is required from the hosting agency.
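The three-part connection can be illustrated with a minimal sketch. It is not the portal's implementation: the class names, fields, provider, and URL below are all hypothetical, chosen only to show a catalog that holds metadata and refers users to the provider for the data itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    """A metadata-only record; the dataset itself stays with the provider."""
    title: str
    provider: str
    keywords: tuple
    data_url: str  # where the provider serves the actual data


class Catalog:
    def __init__(self):
        self._entries = []

    def publish(self, entry):
        # Part 1: a provider publishes a metadata record to the catalog.
        self._entries.append(entry)

    def search(self, keyword):
        # Part 2: a user searches the catalog, not the provider's data.
        needle = keyword.lower()
        return [e for e in self._entries
                if needle in e.title.lower()
                or any(needle == k.lower() for k in e.keywords)]

    @staticmethod
    def locate(entry):
        # Part 3: the user is referred to the provider for the data.
        return entry.data_url


catalog = Catalog()
catalog.publish(CatalogEntry(
    title="County road centerlines",
    provider="Example County GIS Office",      # hypothetical provider
    keywords=("roads", "transportation"),
    data_url="http://gis.example.org/roads",   # hypothetical URL
))
hits = catalog.search("roads")
print([Catalog.locate(e) for e in hits])
```

Because each entry carries only descriptive fields and a link, the catalog can grow without the hosting agency storing any of the underlying datasets.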
ARCHITECTURE
A portal is essentially a master website, which is connected to a web server and contains a database of metadata information about geographic data and services. The services are exposed as web applications using open-source environments (Tomcat, Java, HTML, HTTP, XSLT, XML, and JSP) to provide a user-friendly and visually appealing interface.
114 DIGITAL MAPPING TECHNIQUES ‘06
Figure 1. Generalized data partnership and user connection concept (modified from ESRI, 2004).
The architecture of the metadata server, which connects to all indexed metadata records, relies on three existing ESRI products: ArcIMS, ArcGIS, and ArcSDE. ArcIMS provides the framework and architecture on which the metadata server runs. The ArcGIS ArcCatalog application serves as an authoring and publishing tool. ArcSDE stores published metadata as records inside a relational database (ESRI, 2004). ArcIMS introduces a new approach to serving map products over the internet through a Java-based application management environment that includes mapping services and map design tools to support a variety of internet map services (ESRI, 2004). The main components associated with the ArcIMS communication architecture and web applications are identified in Figure 2.
Figure 2. GIS software environment (modified from ESRI, 2004).
THE PORTAL’S ONLINE INTERFACE COMPONENTS
Home Page
The home page shown in Figure 3 is the access point for all online components, and it provides quick access to the most popular data applications. From the home page, a user can do a basic keyword search, navigate to the map viewer, find help information, and access the quick links to downloadable data, GIS projects and services, and GIS resources.
The home page is also where users log in to their accounts. A user account is not necessary to access the portal, but it increases user capability and enhances functionality. There are five distinct user levels of the portal based on a top-down hierarchy; that is, higher-level users can do everything a lower-level user can do. From lowest to highest, these are:
1. Anonymous users can be anyone. These users have the ability to browse the site and use three basic online components: home page, map viewer, and search page.
2. Public Users have the ability to save their created maps from the map viewer and save their data searches, which will be available on the users’ home page.
3. Publisher Users have the ability to create, publish, and manage their metadata online.
4. Channel Managers have the ability to create and publish a quick link on the home page.
5. Administrators check metadata for accuracy, batch-upload metadata, harvest publisher metadata, and manage users.
Figure 3. Example of the portal home page.
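The top-down hierarchy of user levels can be modeled as an ordered enumeration, where a capability is granted to the level that introduces it and to every level above. The following is a sketch only; the capability names are hypothetical labels for the abilities listed above, not identifiers from the portal software.

```python
from enum import IntEnum


class UserLevel(IntEnum):
    ANONYMOUS = 1
    PUBLIC = 2
    PUBLISHER = 3
    CHANNEL_MANAGER = 4
    ADMINISTRATOR = 5


# Lowest level at which each (hypothetically named) capability is available.
REQUIRED_LEVEL = {
    "browse_site": UserLevel.ANONYMOUS,
    "save_map": UserLevel.PUBLIC,
    "save_search": UserLevel.PUBLIC,
    "publish_metadata": UserLevel.PUBLISHER,
    "publish_quick_link": UserLevel.CHANNEL_MANAGER,
    "manage_users": UserLevel.ADMINISTRATOR,
}


def can(user_level, capability):
    """Higher levels inherit every capability of the levels below them."""
    return user_level >= REQUIRED_LEVEL[capability]


print(can(UserLevel.PUBLISHER, "save_map"))          # inherited from Public
print(can(UserLevel.PUBLIC, "publish_metadata"))     # requires Publisher
```

Encoding the levels as integers keeps the inheritance rule to a single comparison instead of a per-level permission list.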
Map Viewer
The portal map viewer shown in Figure 4 is a mapping application that allows users to view one or multiple internet map services at the same time in their web browser. Access to selected federal, state, and local Web Map Services (WMS) is provided through the “add service” menu, and this limited number of services can be expanded by entering other map server URL addresses to access other WMS available online. Viewing internet map services through the portal map viewer allows users to:
• add map services from the portal and other map servers
• display one or multiple map services in a single map view
• set the transparency of map services for overlaying multiple images
• turn map layers on or off within a map service
• find latitude/longitude anywhere in the state for automatic navigation of the map
• find street addresses in the state for automatic navigation of the map
• identify attribute information about features in a map service.
The portal is not limited to ArcIMS WMS; it also supports several specifications and services of the Open Geospatial Consortium (OGC). The OGC is a non-profit, international, voluntary consensus standards organization that is leading the development of standards for geospatial and location-based services (OGC, 2006). The portal supports the following specifications from the OGC:
• Web Map Services versions 1.0, 1.1, and 1.1.1
• Web Feature Services version 1.0.0
• Web Coverage Services version 1.0.0
• Web Map Context Documents version 1.0.0
• Geography Markup Language versions 2.0 and 3.0 (when approved)
• OpenGIS Location Services version 1.0.
Figure 4. The portal’s map viewer.
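To show what a map viewer sends to a WMS once a user enters a map server URL, the sketch below assembles a WMS 1.1.1 GetMap request. The parameter names come from the OGC WMS specification; the server address, the layer name, and the rough bounding box for Alabama are assumptions for illustration, and this is not the portal's own code.

```python
from urllib.parse import urlencode


def wms_getmap_url(base_url, layers, bbox, width=600, height=400,
                   srs="EPSG:4326", image_format="image/png",
                   transparent=True):
    """Build a WMS 1.1.1 GetMap URL.

    bbox is (min_x, min_y, max_x, max_y) in the units of the given SRS.
    """
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.1.1",
        "REQUEST": "GetMap",
        "LAYERS": ",".join(layers),
        "STYLES": "",                      # empty value requests default styles
        "SRS": srs,
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": image_format,
        # A transparent background lets one service be overlaid on another.
        "TRANSPARENT": "TRUE" if transparent else "FALSE",
    }
    separator = "&" if "?" in base_url else "?"
    return base_url + separator + urlencode(params)


url = wms_getmap_url(
    "http://maps.example.org/wms",         # hypothetical map server
    layers=["county_boundaries"],          # hypothetical layer name
    bbox=(-88.5, 30.1, -84.9, 35.0),       # approximate extent of Alabama
)
print(url)
```

Requesting images with a transparent background is what makes it practical to display several services in a single map view.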
Search Function
The search page shown in Figure 5 is the tool for searching and discovering the metadata of content offered by the many publishers of the Alabama Metadata Portal. The search page allows users to specify the geographic extent, keywords, content type, or content theme criteria to find matching metadata of map services, data, maps, web services, activities, or documents published in the Alabama Metadata Portal. Users can search the portal by defining “where” they would like to search, “what” in the state they would like to search for, and “when” they would like the content to have been created or updated. Users need only one parameter for a simple search; however, each additional parameter helps to narrow the search.
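The “where”, “what”, and “when” parameters can be read as optional filters that are combined with a logical AND, so each added parameter can only shrink the result set. The sketch below illustrates that behavior under assumed record fields and sample data; it does not reproduce the portal's search engine.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Record:
    title: str
    keywords: tuple
    bbox: tuple      # (west, south, east, north) in decimal degrees
    updated: date


def intersects(a, b):
    """True when two (west, south, east, north) boxes overlap."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def search(records, where=None, what=None, when=None):
    """Apply only the filters that were supplied; all must match."""
    results = []
    for r in records:
        if where is not None and not intersects(r.bbox, where):
            continue
        if what is not None:
            text = (r.title + " " + " ".join(r.keywords)).lower()
            if what.lower() not in text:
                continue
        if when is not None and r.updated < when:
            continue
        results.append(r)
    return results


# Hypothetical sample records with approximate extents.
records = [
    Record("Geologic map of Tuscaloosa County", ("geology",),
           (-87.9, 33.0, -87.2, 33.6), date(2005, 3, 1)),
    Record("Statewide orthoimagery", ("imagery",),
           (-88.5, 30.1, -84.9, 35.0), date(2006, 6, 15)),
]

print([r.title for r in search(records, what="geology")])
print([r.title for r in search(records, where=(-86.5, 32.0, -86.0, 32.5))])
```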
THE METADATA PUBLISHING FUNCTION
The importance of writing good metadata is difficult to communicate to potential publishers of the portal. The success of the connection in Figure 1, however, depends on accurate and current metadata. Metadata answers the who, what, when, where, why, and how questions about the data, which gives users the knowledge to decide whether the data is appropriate for their desired application. Writing good metadata also mitigates the overall burden and cost of data maintenance. The standard for including metadata records in the portal is the Federal Geographic Data Committee’s (FGDC) Content Standard for Digital Geospatial Metadata (CSDGM).
Three user levels have metadata administration rights: Publisher, Channel Manager, and Administrator. The administration of metadata includes the ability to create, manage, and add metadata to the portal. There are three options for making metadata records available for search in the portal. The first option is to publish a metadata collection to a metadata repository where the portal can harvest it. The second option is to upload an individual or batch Extensible Markup Language (XML) formatted metadata record to the portal. The third option is to create a metadata record online using the portal’s metadata creation tool.
Figure 5. The portal’s advanced search page.
Metadata Harvesting
Metadata harvesting is a self-regulated, scheduled process for collecting new and updated metadata from various metadata collection libraries. Harvesting allows the portal to synchronize its metadata repository with the publisher’s metadata catalog. If publishers participate in metadata harvesting, any updates made to their local metadata collection will be reflected in the portal during the next harvesting session. Currently, the portal can harvest FGDC-compliant metadata through three types of harvesting protocols: Z39.50 metadata clearinghouse node, ArcIMS metadata service, and Web Accessible Folder.
Metadata harvesting in the Alabama Metadata Portal is performed in three steps, as shown in Figure 6:
1. Harvesting: Based on the harvesting protocol specified at the time of registration, the portal connects to the user’s local metadata repository and retrieves all new and updated metadata records.
2. Validation: During validation, the portal administrator examines each metadata record to confirm that minimum portal requirements are met. Records that are rejected are sent back via e-mail with a list of invalid fields that need to be corrected. A rejected record is not added until it is corrected and revalidated.
3. Publishing: All successfully validated and accepted metadata is published in the portal database. Once the metadata is published, it is searchable through the portal’s search interface by all users.
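The three steps above can be sketched as a small pipeline. This is an illustration under stated assumptions, not the portal's code: the required-field list is invented, the repository is a plain dictionary standing in for a harvesting protocol, and an automated field check stands in for the administrator's review.

```python
from dataclasses import dataclass, field

# Assumed stand-in for the portal's minimum requirements.
REQUIRED_FIELDS = ("title", "abstract", "originator")


@dataclass
class Portal:
    published: dict = field(default_factory=dict)  # record id -> record
    rejected: dict = field(default_factory=dict)   # record id -> missing fields

    def harvest(self, repository):
        """Step 1: collect records that are new or differ from the published copy."""
        return {rid: rec for rid, rec in repository.items()
                if self.published.get(rid) != rec}

    def validate(self, record):
        """Step 2: list required fields that are missing or empty."""
        return [f for f in REQUIRED_FIELDS if not record.get(f)]

    def run_session(self, repository):
        for rid, rec in self.harvest(repository).items():
            missing = self.validate(rec)
            if missing:
                # In the portal, this list would be e-mailed to the publisher.
                self.rejected[rid] = missing
            else:
                # Step 3: store a copy so later local edits are detected.
                self.published[rid] = dict(rec)
                self.rejected.pop(rid, None)


# Hypothetical publisher repository.
repository = {
    "roads": {"title": "Roads", "abstract": "Centerlines", "originator": "County"},
    "wells": {"title": "Wells", "abstract": "", "originator": "Agency"},
}
portal = Portal()
portal.run_session(repository)
print(sorted(portal.published), portal.rejected)

repository["wells"]["abstract"] = "Well locations"   # publisher corrects the record
portal.run_session(repository)
print(sorted(portal.published), portal.rejected)
```

Running a second session after the correction publishes the previously rejected record, which mirrors the correct-and-revalidate loop described in step 2.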
Direct Metadata Upload
If users do not have access to any of the metadata distribution server protocols described above, they can upload their XML-formatted metadata records directly to the portal. A metadata publisher can, through the online administration tool, add and manage metadata on their home page. By selecting the “Upload Metadata” button, users can upload individual metadata records saved on their local computer. These records are validated and either rejected or published through the same process as harvested metadata. A drawback of the direct upload option is that uploaded, published metadata is not linked to the local metadata repository. That is, updates to a local metadata record must be re-uploaded or changed manually in the portal because the portal does not update them automatically when the user updates local records.
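A publisher may want to check an XML record before uploading it. The sketch below parses an FGDC CSDGM record with Python's standard library and reports empty or absent elements. The element paths use the CSDGM short names, but the choice of which elements to require is an assumption, since the portal's actual minimum requirements are not listed here.

```python
import xml.etree.ElementTree as ET

# Assumed minimum set; paths are relative to the <metadata> root element.
REQUIRED_PATHS = {
    "title": "idinfo/citation/citeinfo/title",
    "originator": "idinfo/citation/citeinfo/origin",
    "abstract": "idinfo/descript/abstract",
    "west bounding coordinate": "idinfo/spdom/bounding/westbc",
}


def missing_fields(xml_text):
    """Return labels of required elements that are absent or empty."""
    root = ET.fromstring(xml_text)
    missing = []
    for label, path in REQUIRED_PATHS.items():
        element = root.find(path)
        if element is None or not (element.text or "").strip():
            missing.append(label)
    return missing


# Hypothetical record with an empty abstract and no bounding coordinates.
sample = """<metadata>
  <idinfo>
    <citation>
      <citeinfo>
        <origin>Example Agency</origin>
        <title>Example dataset</title>
      </citeinfo>
    </citation>
    <descript>
      <abstract></abstract>
    </descript>
  </idinfo>
</metadata>"""

print(missing_fields(sample))
```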
ArcCatalog Direct Metadata Upload
Batch uploading of metadata records directly to the portal’s metadata IMS service is possible if the user is using ESRI’s ArcGIS suite. Through ArcCatalog, the user connects directly to the portal’s ArcIMS metadata server; a metadata publisher account name and password must be specified. With this connection to the portal in place, users can drag and drop their folder of metadata records into the Publish Metadata Service. An added benefit of this drag-and-drop method is that each metadata record is validated automatically, and an error message is displayed for all incorrect field values. The drawback of this method is the same as that of the direct upload option: uploaded, published metadata is not linked to the local metadata repository as it would be with harvesting. Updates to a local metadata record must be re-uploaded or changed manually because the portal does not update them automatically when the user updates local records.
Figure 6. Diagram of the harvesting process (modified from ESRI, 2004).
Metadata Direct Entry
The user might not have access to metadata creation or editing software, or may have very few records to contribute to the portal. In this case, the user can use the metadata creation tool provided on the home page. Users log in to their account and find the “publish online form” button under the “My Function” section. This button takes users to an online form designed to assist in the development and production of FGDC metadata quickly and efficiently. The form provides drop-down menus, required fields (indicated by *), and help definitions and suggestions for each of the requested metadata fields.
The minimal compliance of the direct entry method provides only the elements necessary for data discovery and is only moderately functional for users searching for data. Direct entry is a means of encouraging users to write metadata in the hope that they will see its importance and progress toward creating comprehensive FGDC-compliant records in the future. Metadata created with the online creation tool is stored only in the portal, and all updates must be made through the portal.
CONCLUSION
Data download sites and web applications have dramatically improved GIS productivity. To complete jobs faster, it is critical that the GIS community share data effectively and efficiently; the portal is a powerful tool that benefits all users and addresses these needs. A few of the benefits available through the portal are faster discovery of specific datasets and projects, access to data through download sites and in the online Map Viewer, lower data costs through reduced data redundancy, comparison of multiple providers to find data that suits users’ needs, and improved data quality and coverage through constant updating of agency metadata. More importantly, the portal heightens the visibility of participating organizations by displaying the quality and quantity of their data offerings, which is an indication of their GIS capabilities. This allows a better understanding of how an organization could partner for future projects or initiatives.
In the first 18 months after the activation of the Alabama Metadata Portal, there were 378,225 total domain hits, representing 16,197 visits by 5,544 unique users (unique IP addresses), as shown in Figure 7. We estimate that each return user has viewed an average of 68 pages. This current assessment shows the effectiveness of the Alabama Metadata Portal and the public’s interest in accessing the data provided. It is important to note that the portal initiative is by no means the sole solution for producing an integrated GIS community; rather, the portal represents a fundamental step in moving Alabama into the next generation of GIS productivity.
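The usage figures can be related with simple ratios. The short calculation below uses only the three totals reported above; the method behind the 68-page estimate is not stated, so reading it as total hits divided by unique users is an inference, not a documented calculation.

```python
hits = 378_225
visits = 16_197
unique_users = 5_544

# Ratios derived from the reported totals.
print(round(hits / unique_users, 1))     # hits per unique user
print(round(hits / visits, 1))           # hits per visit
print(round(visits / unique_users, 1))   # visits per unique user
```

The first ratio rounds to 68, which matches the reported per-user average.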
REFERENCES
Environmental Systems Research Institute, Inc. (ESRI), 2004, GIS portal technology, an Environmental Systems Research Institute, Inc. White Paper: Redlands, California, 9 p.
Geological Survey of Alabama (GSA), 2006, Geological Survey of Alabama Portal Project: Alabama Metadata Portal, accessed 01 Sept. 2006 at <http://portal.gsa.state.al.us>.
Open Geospatial Consortium, Inc. (OGC), 2006, Welcome to the OGC website: accessed 01 Sept. 2006 at <http://www.opengeospatial.org/>.
Figure 7. Daily hits on the portal from 07/01/2005–10/01/2006.