Post on 29-Dec-2015
Conference Schedule
• Legislative Process Overview
• Legislative Branch Update
• Bulk Data Task Force update on provisioning legislative data
• Library of Congress and GPO Electronic Access Plans and Developments
• Official Tools Demo
  – Administrative Interface (docs.house.gov)
  – Democratic Caucus Intranet
  – Committee Roll Call Vote Utility
• International Update
• Electronic Legislative Archiving: Panel of legislative archivists discuss how to preserve and curate electronic legislative records
• Extending Legislative XML for and by third parties: Address XML data standards and how to extend them for new applications
• Under-digitized legislative data: What are the evolving standards and practices for integration and use of legislative data?
Legislative Process Overview
Kirsten Gullickson, Sr. Systems Analyst
Office of the Clerk
Rep. Ludlow placing bill into hopper, 12/30/1936
http://www.loc.gov/pictures/item/hec2009008605/
The Challenge
• Legislative documents and related data must be
  – prepared,
  – managed,
  – distributed, and
  – archived.
• This includes paper and electronic means for handling the official documents.
How a bill becomes a law. After the vote has been taken, the result is noted in the Journal of Action by Louis Sirkey, House Journal Clerk. If the bill receives a passing vote, it is sent to the other chamber for action. If the bill fails to pass, it must be reintroduced unless it is voted to refer it back to the committee for reconsideration.
The Challenge (cont’d)
Government data should be
  – Public
  – Accessible
  – Described
  – Reusable
  – Complete
  – Timely
  – Managed Post-Release
White House M-13-13, Open Data Policy, Managing Information as an Asset
Where are the documents? Data?
• GOVERNMENT PRINTING OFFICE
  – www.gpo.gov
• LIBRARY OF CONGRESS
  – Thomas.loc.gov
  – Beta.congress.gov
• THE HOUSE
  – Clerk.house.gov
  – Docs.house.gov
  – www.house.gov
  – Committee websites
• THE SENATE
  – www.senate.gov
  – Committee websites
Introduction and Referral to Committee
How Our Laws Are Made
Doc. 110-49, page 8
http://history.house.gov/Collection/Listing/2004/2004-019-000/
The Hopper
Questions and Answers
Until Jurgensen, Jr., a tally clerk, designed this electric voting machine, it took at least three months, using the old rubber stamp system, to compile the voting records of the 435 members of the House. Recording the yeas and nays, absent and present, paired for and paired against votes of each individual member, the machine, which is similar to an adding machine, does the same job in less than two weeks. Greater accuracy is assured in counting votes with the Jurgensen-designed machine.
New time saving voting machine, 05/10/1938
http://www.loc.gov/pictures/item/hec2009015711/
Bulk Data Task Force and Transparency Updates
Since our last meeting on January 30, 2013 here’s what we’ve been up to:
Other projects:
• Bulk Data Bill Summaries
• House Modernization Project
• Data Challenge
• Data Dashboard
• Clerk Twitter Account
• Clerk/History Arts & Archives YouTube
Library of Congress and GPO Plans and Developments
Tammie Nelson
Library of Congress
Matt Landgraf
Government Printing Office
Background
Joint Committee on Printing approved collaboration on digitization of:
Statutes at Large
Bound Congressional Record
Roles and Responsibilities
Library of Congress:
Performs digitization
Provides files to GPO
GPO:
Creates access copies
Creates metadata
Statutes at Large Status
All work for volumes from 1951-2002 has been completed
Currently available via FDsys
Access files and metadata have been provided to LOC (to be available on congress.gov in the future)
Bound Congressional Record Status
LOC Digitization (1873-1998) to be completed by the end of calendar year 2013
FDsys development underway
Resources being identified for metadata creation
Content will be released on an iterative basis via FDsys, beginning in FY 2014
Bound Congressional Record: Key Issues
Size of collection
Large effort required to create descriptive metadata and access files at the article level
Official Tools Demonstration Panel
Michael Baker
House Committee on Ways and Means
Stephen Dwyer
Office of the Democratic Whip
Kathleen Swiatek
Government Printing Office
The Official Intranet for House Democratic Staff
Presentation by Steve Dwyer, Office of the Democratic Whip
HISTORY & ORIGIN
• Originally launched in early 2009
• We recently launched our 3rd major iteration
• Private: only House Democratic staffers have access
• Why did we build it?
• Why Democrats-only?
ORGANIZATION
• Over 120,000 nodes and counting
• How do we organize content?
  – Primarily by legislation
  – General issue tags
  – “Specific Topics” for big non-bill items
  – Authoring office and staffer
DATA SOURCES UTILIZED
• GovTrack for legislative information
• House LDAP for permissions and credentials
• Housenet’s e-Dear Colleague system
• DemocraticWhip.gov for House Floor schedule
DATA SOURCES UTILIZED (CONTINUED)
• Docs.house.gov for Committee schedules
• POPVOX for organization letters and public sentiment
• Staffer data from a commercial vendor
• Significant private listservs are auto-consumed
Electronic Legislative Archiving Panel
James Jacobs
Government Information Librarian, Stanford Univ.
Lisa LaPlant
Government Printing Office
Marc Levitt
Byrd Center for Legislative Studies
Preserving Electronic Legislative Information in FDsys
Legislative Data Transparency Conference
May 22, 2013
Lisa LaPlant
GPO
GPO’s Mission
Keeping America Informed by producing, protecting, preserving, and distributing the official publications and information products of the Federal Government.
Legislative Publications
• Bills and Resolutions
• Committee Materials
• Congressional Calendars
• Congressional Directory
• Congressional Record
• United States Code
• Journal of the House of Representatives
• Procedural and Precedential Materials
Digital Preservation
Combination of the policies, strategies, and actions that ensure access to reformatted and born digital content regardless of the challenges of media failure and technology change.
Preservation Objectives
• Safeguard digital content along with all relevant metadata.
• Assess the condition and needs of collections of digital information.
• Meaningfully render content despite continuously changing technology.
• Manage processes which are auditable, replicable, and that build the basis for trust.
OAIS Reference Model
[Diagram labels: Producer, Consumer, System Administration, Ingest, Access, Data Management, Archival Storage, Preservation Planning]
Package Based Approach
[Diagram: Package 1 holds mods.xml, aip.xml, and premis.xml alongside Rendition 1 and Rendition 2, each with its own content files]
PREMIS
Record each significant event in the lifecycle of content in PREMIS metadata. Record the content source, changes that have occurred since the content was created or acquired, and who has custody of the content.
Events Recorded in PREMIS
Software Activities: Digest Calculation, Ingest, Fixity Check, Rendition Creation, ACP Creation, Digital Signing, Parsing
User Activities: Rendition Upload, Rendition Deletion, Submission Replacement, AIP Deletion
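As a rough illustration of how a software activity such as a fixity check could be logged as an event, here is a minimal Python sketch. It is not FDsys code and does not follow the premis.xml schema: the function names and dictionary keys are assumptions chosen to loosely echo PREMIS event fields (type, date/time, outcome).

```python
import hashlib
from datetime import datetime, timezone


def sha256_digest(content: bytes) -> str:
    """Return the hex SHA-256 digest of the given bytes."""
    return hashlib.sha256(content).hexdigest()


def fixity_check_event(content: bytes, stored_digest: str) -> dict:
    """Build a PREMIS-style event record for a fixity check.

    The keys are illustrative only (hypothetical, not the FDsys schema).
    """
    current = sha256_digest(content)
    return {
        "eventType": "fixity check",
        "eventDateTime": datetime.now(timezone.utc).isoformat(),
        # The check passes only if the content still matches the stored digest.
        "eventOutcome": "pass" if current == stored_digest else "fail",
        "messageDigestAlgorithm": "SHA-256",
        "messageDigest": current,
    }


if __name__ == "__main__":
    original = b"<bill>example content</bill>"
    digest_at_ingest = sha256_digest(original)  # recorded at ingest time
    print(fixity_check_event(original, digest_at_ingest)["eventOutcome"])
    print(fixity_check_event(original + b" ", digest_at_ingest)["eventOutcome"])
```

Each event is a self-contained record, so a sequence of them can serve as the kind of auditable history of the content's lifecycle that the slide describes.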
Preservation Strategies
• Refreshment (bit-level preservation): Content is transferred from one physical medium to another.
• Migration: Content is converted or transformed into a more recent version or a more widely used format.
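Refreshment can be pictured as a copy that is only trusted once the bits are shown to be unchanged. The Python sketch below is an assumption-laden illustration, not GPO's procedure: `refresh` and `file_digest` are hypothetical helper names, and two temporary directories stand in for the old and new storage media.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def refresh(source: Path, target: Path) -> str:
    """Copy source to target and verify the copy is bit-identical.

    Returns the shared digest, or raises OSError if the copy differs.
    """
    before = file_digest(source)
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(source, target)
    after = file_digest(target)
    if before != after:
        raise OSError(f"digest mismatch after copying {source} to {target}")
    return after


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as old, tempfile.TemporaryDirectory() as new:
        src = Path(old) / "record.pdf"          # hypothetical file name
        src.write_bytes(b"example content")
        print(refresh(src, Path(new) / "record.pdf"))
```

Migration, by contrast, changes the bytes on purpose, so a digest comparison like this one would not apply to it.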
More Information
Lisa LaPlant
Office of Programs, Strategy, and Technology, GPO
llaplant@gpo.gov
GPO’s FDsys: www.fdsys.gov
Preservation in FDsys: www.gpo.gov/preservation
Archiving Senator Byrd’s E-Records
Marc Levitt
Director of Archives
Robert C. Byrd Center for Legislative Studies
Records Received & Migrated
• Early Petitions (1790-1817): PDFs with OCR
• Byrd Migration Projects:
  – Photographs: TIFF
  – A/V Material: Outside Vendor
  – Microfilm: PDFs, then OCR (in-house)
• Byrd Capture Projects:
  – CSPAN floor speeches
  – Congressional Record PDFs
• Byrd Office Files Received:
  – Hard drive with files from the shared drive
  – Constituent Services System (CSS) data on 2 DVDs
Case Study: CSS Processing
• Hired a contractor
• Script to automate ingestion of data
• CSV tables cleaned and optimized with Google Refine
• SQL database created
• Waiting for installation
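The ingestion step above, going from cleaned CSV tables to a SQL database, might look something like the following Python sketch. It is not the contractor's script: the `cases` table, its columns, and the sample rows are all hypothetical, and SQLite stands in for whatever database was actually used.

```python
import csv
import io
import sqlite3

# Hypothetical sample of a cleaned CSV export; real CSS tables will differ.
SAMPLE_CSV = """case_id,received,topic
1, 1999-03-02 ,Veterans
2,2001-07-15,Highways
"""


def load_cases(conn: sqlite3.Connection, csv_text: str) -> int:
    """Create the cases table if needed and insert every CSV row.

    Returns the number of rows inserted.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS cases "
        "(case_id INTEGER PRIMARY KEY, received TEXT, topic TEXT)"
    )
    reader = csv.DictReader(io.StringIO(csv_text))
    # Strip stray whitespace left over from the export.
    rows = [
        (int(r["case_id"]), r["received"].strip(), r["topic"].strip())
        for r in reader
    ]
    conn.executemany("INSERT INTO cases VALUES (?, ?, ?)", rows)
    conn.commit()
    return len(rows)


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    print(load_cases(conn, SAMPLE_CSV), "rows loaded")
    for row in conn.execute("SELECT * FROM cases ORDER BY case_id"):
        print(row)
```

Parameterized inserts (`?` placeholders) keep the constituent text from being interpreted as SQL, which matters for free-form correspondence data.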
What the Office Uses:
Senator Byrd confers with President Jimmy Carter at the White House. (August 23, 1977). Official White House Photo.
What is Archived by the Vendor:
• <A color photograph of Senator Byrd (left) and President Carter discuss issues in an office.>
• <Senator Byrd is seated on a floral print couch.>
• <President Carter is seated on a blue chair.>
• <Flower curtains hang behind the men.>
• <A white lamp sits on a brown table between them.>
Not the Same:
Full picture and functionality in original record
Loss of information and context through 3 phases of data migration
Issues
• Authenticity and Reliability
• Standardization
• Organization Schema
• What to Save (and why it’s okay to do so)
Third Party Extensions of Legislative XML Panel
Daniel Bennett
eCitizen
Jim Harper
CATO Institute
Eric Mill
Sunlight Foundation
Daniel Bennett: Adding Financial Metadata to Legislative Docs
Extending XML
“Soup to Nuts”
- American English idiom conveying the meaning of "from beginning to end"
- Derived from the description of a full course dinner, in which courses progress from soup to a dessert of nuts
Under-digitized Legislative Data Panel
Anne Washington
George Mason University
Grant Vergottini
Xcential, Inc.
Josh Tauberer
GovTrack
Why Digitize?
Anne L. Washington, PhD
George Mason University, School of Public Policy
May 2013
Legislative Data Standards Conference
US House of Representatives
Political Informatics
Poli-Informatics
• Computational science & "big data"
  – Data visualization
  – Machine learning
• Study of politics and government
http://poliinformatics.org
Poli-Informatics could…
• Visualize complex policy solutions.
• Predict procedural progress through language.
• View nested organizational hierarchies impacted by a policy.
• Gather single policy idea across multiple ideological discourses.
• Track policy developments over time.
Joint PI-net
• George Mason University
• University of Washington
• Northwestern University
• Cornell University
• Carnegie Mellon University
• Pennsylvania State University
• & YOU !
http://poliinformatics.org
Anne L. Washington, PhD
Assistant Professor
School of Public Policy
Organizational Development & Knowledge Management
George Mason University, Arlington VA
http://washington.gmu.edu
awashi14@gmu.edu
Digitizing Legislative Data
From documents to data to information and beyond
Grant Vergottini
May 22, 2013
[Diagram: timeline from Past (Data Scraping, Proprietary XML) through Now (XML Download, Open XML) to Future (Web Services, Standards-Based XML, Akoma Ntoso)]
Step 1: Legislative Documents Online
Putting the documents online
Past: Data Scraping, Proprietary XML
• Simple systems
• Geared towards people rather than programs
• Data Scraping for programs
• Roll your own XML
• Maintain your own repository
Step 2: Legislative Data Sources
Improving data accuracy
Now: XML Download, Open XML
• Authentic data
• More sophisticated Web Sites
• Download XML directly
• Open Gov. data formats
• Still need your own repository
Next: Legislative Information Services
Step 3: Legislative Information Services
Connecting the information
Future: Web Services, Standards-Based XML, Akoma Ntoso
• More reliable data
  – Authentic HTML & XML
• More useful data
  – Consumer rather than producer oriented
  – Simpler standards-based information models
  – Linked citations & other metadata
  – Microformats & Microdata for HTML
• More timely data
  – Web services rather than download
  – Link services stitch data together
  – Robust repository services: search, query
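To make "linked citations" concrete, the Python sketch below turns plain-text bill citations into HTML links. It is a simplified illustration only: the regular expression covers just "H.R." and "S." citations, and the `example.org` URL pattern is a hypothetical placeholder rather than any real link service.

```python
import re

# Matches citations such as "H.R. 1234" or "S. 56". The lookbehind avoids
# matching the "S." inside abbreviations like "U.S.". Deliberately simplistic.
BILL_CITATION = re.compile(r"(?<![\w.])(H\.R\.|S\.)\s?(\d+)\b")

# Hypothetical URL pattern; a real link service would define its own.
URL_TEMPLATE = "https://example.org/bills/{congress}/{bill_type}{number}"


def link_citations(text: str, congress: int) -> str:
    """Wrap each recognized bill citation in an HTML anchor."""

    def to_anchor(match: re.Match) -> str:
        bill_type = match.group(1).replace(".", "").lower()  # "H.R." -> "hr"
        url = URL_TEMPLATE.format(
            congress=congress, bill_type=bill_type, number=match.group(2)
        )
        return f'<a href="{url}">{match.group(0)}</a>'

    return BILL_CITATION.sub(to_anchor, text)


if __name__ == "__main__":
    print(link_citations("See H.R. 1234 and S. 56.", congress=113))
```

In a standards-based XML format the citation would already be marked up by the producer, so consumers would not need pattern matching like this at all.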
Step 4: The Vision
Connecting the world
• State & Federal Laws
• Regulations to Legislation
• Treaties & Trade Agreements
So what’s left to do?
Joshua Tauberer (@JoshData)
GovTrack.us
Legislative Data & Standards
May 22, 2013
All legislative events are recorded in structured data.
All legislative artifacts are publicly available.
(How hard could that be, right?)
Legislative Data
• Bill Summary & Status
• Amendment Status & Text
• List of Members
• Committee Artifacts
• Historical Bill Text, Statutes, and so on.