Legal Language Explorer
daniel martin katz
michael j bommarito ii
julie seaman
adam candeub
eugene agichtein
Presentation @ 24th International Conference on Legal Knowledge and Information Systems (Jurix 2011)
Law is characterized by a relatively small number
of highly influential cases and jurists whose conceptualization of law
comes to dominate
Most Judges and Cases Are Quickly Forgotten
But A Select Few Persist ...
Extreme Skewing is a Typical Feature of Legal Systems
Katz et al. (2011) American Legal Academy
Katz & Stafford (2010) American Federal Judges
Geist (2009) Austrian Supreme Court
Smith (2007) U.S. Supreme Court
Smith (2007) U.S. Law Reviews
Post & Eisen (2000) NY Ct of Appeals
Innovative use of language and metaphor is, in part,
How Holmes became “Holmes” and how Posner became “Posner”
And how certain cases
came to dominate the rest of the corpus
We Believe
The decision corpus is our archeological record
and that record can be usefully explored using the
tools of computational linguistics
We Want
To Democratize the Exploration of Legal Language ...
to folks who are not programmers
to folks who are not technically inclined
We Develop a Simple Web Interface that Leverages
the Visual Cortex
Relies on Our Ability to Engage in
Pattern Detection
To Explore Linguistic Patterns in Large Corpora of Legal
Documents
We Start with the Full Text Corpus of Decisions of the
United States Supreme Court 1791-2005
Develop an N-Gram Based Explorer with a Portal to the Full Text, etc.
From Text to N-Grams...
Generate the N-Gram Mapping
(1) extract the opinion text from each case and store it as a sequence of characters
(2) convert this sequence of characters into a sequence of words through word tokenization (Penn Treebank algorithm)
(3) iterate over the words in the length-M sequence from index 1 to index M + 1 - N, recording the N-gram that begins at each index
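The three steps above can be sketched in Python. This is only an illustration under stated assumptions, not the authors' implementation: `tokenize` is a simplified regex stand-in for Penn Treebank tokenization, and the lowercasing and the `max_n=3` cutoff are assumptions that the slides do not specify.

```python
import re
from collections import Counter


def tokenize(text):
    # Simplified stand-in for Penn Treebank tokenization (an assumption):
    # alphanumeric words, optionally with an apostrophe suffix, or single
    # punctuation marks.
    return re.findall(r"[A-Za-z0-9]+(?:'[A-Za-z]+)?|[^\sA-Za-z0-9]", text)


def ngrams(words, n):
    # 0-based start positions 0 .. M - N correspond to the 1-based
    # range 1 .. M + 1 - N described in step (3).
    m = len(words)
    return [tuple(words[i:i + n]) for i in range(m - n + 1)]


def ngram_counts(text, max_n=3):
    # Lowercasing and the max_n cutoff are illustrative assumptions.
    words = [w.lower() for w in tokenize(text)]
    counts = Counter()
    for n in range(1, max_n + 1):
        counts.update(ngrams(words, n))
    return counts


if __name__ == "__main__":
    counts = ngram_counts("Interstate commerce and interstate commerce")
    print(counts[("interstate", "commerce")])
```

A sequence shorter than N simply yields no N-grams, because the range of start positions is empty.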
Legal Language Explorer: A Brief Tour of The Interface
Go to LegalLanguageExplorer.com And Access This Interface
Notice The Default Search Is Interstate Commerce, Railroad, Deed
For Each Comma Separated Phrase The Frequency Plot Appears Below
Currently Supporting Every U.S. Supreme Court Decision From 1791 - 2005
Years Can Be Changed By The End User
Click Here to Access Various Advanced Search Features
The Advanced Search Features
Are the Observed Trends a Function of changes in the volume of the case docket?
Normalization allows End User to control for the Size of the Case Docket
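One plausible reading of this normalization is to divide each year's raw phrase count by the number of cases decided that year. The sketch below assumes that reading; the slides do not state the exact denominator the site uses, and the function name and the sample numbers are hypothetical.

```python
def normalize_by_docket(phrase_counts, cases_per_year):
    # Occurrences of a phrase per decided case, year by year.
    # Years with an empty docket are skipped to avoid dividing by zero.
    return {
        year: phrase_counts.get(year, 0) / n_cases
        for year, n_cases in cases_per_year.items()
        if n_cases > 0
    }


if __name__ == "__main__":
    # Hypothetical numbers, for illustration only.
    raw = {1900: 30, 1950: 30}
    docket = {1900: 200, 1950: 100}
    print(normalize_by_docket(raw, docket))
```

With these made-up inputs the raw counts are flat, but the per-case rate doubles because the later docket is half the size, which is the kind of effect normalization is meant to expose.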
Change the Graph Type By Clicking Here
An Alternative Presentation of the Data
Export the Chart Data
(If You Want to Replot in Stata, Excel, R, etc.)
The Results (If You Want to Replot in Stata, Excel, R, etc.)
Access a Case List and Full Text Results
Here are the Returned Results
Download List in Excel
(or any .csv)
Access the Full Text of a Particular Case
Click Through to Access
Results from BulkResource.org
For More Information - Full Version of the Paper
http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1971953
For Implementation Details (Including the Key Value Storage Method, etc.)
http://www.michaelbommarito.com/blog/2011/12/16/building-legal-language-explorer-interactivity-and-drill-down-nosql-and-sql/
Slides Will Be Posted to CLS Blog: http://computationallegalstudies.com/
This is Only The Beginning. Stay Tuned for More in 2012
:)