Page 1
Legal Language Explorer
Daniel Martin Katz, Michael J. Bommarito II
Julie Seaman, Adam Candeub
Eugene Agichtein
Presentation @ 24th International Conference on Legal Knowledge and Information Systems (Jurix 2011)
Page 2
Law is characterized by a relatively small number
of highly influential cases and jurists whose conceptualization of law
comes to dominate
Page 3
Most Judges and Cases Are Quickly Forgotten
Page 4
But A Select Few Persist ...
Page 5
Extreme Skewing is a Typical Feature of Legal Systems
Katz et al. (2011): American Legal Academy
Katz & Stafford (2010): American Federal Judges
Geist (2009): Austrian Supreme Court
Smith (2007): U.S. Supreme Court
Smith (2007): U.S. Law Reviews
Post & Eisen (2000): NY Ct of Appeals
Page 6
In Part, Innovative Use of Language and Metaphor Is
Page 7
How Holmes became “Holmes” and how Posner became “Posner”
Page 8
And how certain cases
came to dominate the rest of the corpus
Page 10
We Believe
The decision corpus is our archeological record
and that record can be usefully explored using the
tools of computational linguistics
Page 13
We Want
To Democratize the Exploration of Legal Language ...
to folks who are not programmers
to folks who are not technically inclined
Page 14
We Develop a Simple Web Interface that Leverages
the Visual Cortex
Page 15
Relies on Our Ability to Engage in
Pattern Detection
Page 16
To Explore Linguistic Patterns in Large Corpora of Legal
Documents
Page 17
We Start with the Full Text Corpus of Decisions of the
United States Supreme Court 1791-2005
Page 18
Develop an N-Gram Based Explorer with a Portal to the Full Text, etc.
Page 19
From Text to N-Grams...
Page 20
Generate the N-Gram Mapping
(1) extract the opinion text from each case and store it as a sequence of characters
(2) convert this sequence of characters into a sequence of words through word tokenization (Penn Treebank algorithm)
(3) iterate over the words in the length-M sequence from index 1 to index M + 1 – N, taking the N consecutive words that start at each index as one N-gram
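The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the regex tokenizer is a simple stand-in for Penn Treebank tokenization, and the names `tokenize`, `ngrams`, and `ngram_counts` are hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    # Stand-in for Penn Treebank tokenization (an assumption for this sketch):
    # split into runs of word characters and single punctuation marks.
    return re.findall(r"\w+|[^\w\s]", text)


def ngrams(words, n):
    # 0-based equivalent of iterating from index 1 to index M + 1 - N:
    # there are M - N + 1 start positions, each yielding N consecutive words.
    m = len(words)
    return [tuple(words[i:i + n]) for i in range(m - n + 1)]


def ngram_counts(text, max_n=3):
    # Map each N-gram (for N = 1 .. max_n) to its number of occurrences.
    # max_n is an illustrative parameter, not a value taken from the slides.
    words = tokenize(text)
    counts = Counter()
    for n in range(1, max_n + 1):
        counts.update(ngrams(words, n))
    return counts
```

Calling `ngram_counts` on the opinion text of one case yields that case's N-gram mapping; a sequence shorter than N simply produces no N-grams of that length.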
Legal Language Explorer: A Brief Tour of The Interface
Page 23
Go to LegalLanguageExplorer.com And Access This Interface
Page 24
Notice The Default Search Is Interstate Commerce, Railroad, Deed
Page 26
For Each Comma Separated Phrase The Frequency Plot Appears Below
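As a rough sketch of what sits behind such a plot, the query can be split on commas and each phrase counted per decision year. Everything here is an assumption for illustration: the function names, the whitespace tokenization, the case-insensitive matching, and the toy `(year, text)` input format are hypothetical and not taken from the tool itself.

```python
from collections import defaultdict


def parse_query(query):
    # Split a comma-separated search box entry into cleaned phrases.
    return [p.strip().lower() for p in query.split(",") if p.strip()]


def yearly_counts(cases, phrase):
    # cases: iterable of (year, opinion_text) pairs (hypothetical format).
    # Counts occurrences of the phrase as a run of consecutive words.
    target = phrase.lower().split()
    n = len(target)
    counts = defaultdict(int)
    for year, text in cases:
        words = text.lower().split()
        counts[year] += sum(
            1 for i in range(len(words) - n + 1) if words[i:i + n] == target
        )
    return dict(counts)
```

Each phrase returned by `parse_query` would get its own series from `yearly_counts`, one line per phrase on the chart.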
Page 28
Currently Supporting Every U.S. Supreme Court Decision From 1791 - 2005
Page 30
Years Can Be Changed By The End User
Page 32
Click Here to Access Various Advanced Search Features
Page 33
The Advanced Search Features
Page 34
Are the Observed Trends a Function of Changes in the Volume of the Case Docket?
Page 35
Normalization Allows the End User to Control for the Size of the Case Docket
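One simple way to control for docket size is to divide each year's raw count by the number of decisions issued that year. The sketch below assumes that denominator; the slides do not say which one the tool actually uses, and the function name and input dictionaries are hypothetical.

```python
def normalize(counts_by_year, cases_by_year):
    # Convert raw per-year phrase counts into a per-decision rate.
    # Assumption: the denominator is the number of decisions in that year.
    # Years with no decisions are skipped to avoid dividing by zero.
    return {
        year: counts_by_year.get(year, 0) / total
        for year, total in cases_by_year.items()
        if total > 0
    }
```

With this adjustment, a phrase that appears 10 times among 100 decisions and 30 times among 300 decisions plots as a flat line, since the rate is the same in both years.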
Page 37
Change the Graph Type By Clicking Here
Page 38
An Alternative Presentation of the Data
Page 39
Export the Chart Data
(If You Want to Replot in Stata, Excel, R, etc.)
Page 41
The Results
(If You Want to Replot in Stata, Excel, R, etc.)
Page 42
Access a Case List and Full Text Results
Page 43
Here are the Returned Results
Page 44
Download List in Excel
(or any .csv)
Page 46
Access the Full Text of a Particular Case
Page 47
Click Through to Access
Results from BulkResource.org
Page 48
For More Information - Full Version of the Paper
http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1971953
Page 49
For Implementation
Details (Including the Key Value Storage Method, etc.)
http://www.michaelbommarito.com/blog/2011/12/16/building-legal-language-explorer-interactivity-and-drill-down-nosql-and-sql/
Page 50
Slides Will Be Posted to CLS Blog
http://computationallegalstudies.com/
Page 51
This Is Only the Beginning. Stay Tuned for More in 2012
:)