CassBeth Inc. www.cassbeth.com 1
Natural Language Analysis of Requirements Tool Study
INCOSE Delaware Valley Chapter
April 2007
Content
• Where did this come from and why do it
– Natural Language Analysis (NLA) of specifications
• NASA IVV Study
• Natural language analysis introduction
• Who are the vendors in the study
• What are the initial findings
• What is the technology spread
• Final comments
Where did this come from?
• Carnegie Mellon SEI Report, September 2005
– QuARS: A Tool for Analyzing Requirements, by Giuseppe Lami, Report CMU/SEI-2005-TR-014
• NASA Automated Quality Analysis of Natural Language Requirement Specifications (ARM)
– William M. Wilson, Software Assurance Technology
– Linda H. Rosenberg, Ph.D., Unisys / GSFC
– Lawrence E. Hyatt, NASA Goddard Space Flight Center
• TIGER replaced FRED
– Joseph Kasser, University of South Australia
• Requirements Assistant at Hughes, circa 1980
– Dr. Herman Dreissen, Netherlands
The Idea - NLA of Reqs
[Diagram labels: Prelim Spec Doc; Spec Analysis; Final Spec; Reports; Updates; Authors; Previously Manual Inspections]
• Let machines do what they do well
– search, count, filter, categorize, profile, visualize
• Let humans do what they do well
– creativity, critical thinking, inspiration, intuition
CMU SEI Abstract
• Numerous tools and techniques are available for managing requirements.
• Many are designed to define requirements, provide configuration management, and control distribution.
• However, there are few automatic tools to support the quality analysis of natural language (NL) requirements.
CMU SEI Abstract
• Ambiguity analysis and consistency and completeness verification are usually carried out by human reviewers who read requirements documents and look for defects.
• This clerical activity is boring, time consuming, and often ineffective.
• This report describes a disciplined method and a related automated tool that can be used for the analysis of NL requirements documents.
NASA ARM Abstract
• An early life cycle tool for assessing requirements that are specified in natural language.
• The ARM tool searches the requirements document for terms the SATC has identified as quality indicators.
• Reports produced by the tool are used to identify specification statements and structural areas of the requirements document that need to be improved.
Why Do It
• Requirement Management
– Automated database oriented tools
• Modeling and Simulation
– Automated visualization and calculation tools
• Requirement Text Authoring
– Manual mentors and check lists
Why Do It
• Over 50% of software defects are requirements problems
– Source: CMU SEI QuARS Presentation and James Martin, INCOSE, 21 June 05
• Over 80% of rework is spent on requirements-related defects
– Source: CMU SEI QuARS Presentation and Dean Leffingwell, INCOSE, 21 June 05
Why Do It
• Specifications written in natural language
– Initial text is rarely perfect
• Everyone relies on specification text
– Users, designers, testers, vendors, policy makers
• Inspections used for surfacing defects, but
– Time consuming, costly, only some defects found
• Inspections may not even be performed
– Fear of findings, not sure how to proceed once surfaced
NASA IVV Entry
• 2006 May 24
– Working Group meeting
– IV&V analysts identify capabilities they believe are beneficial in an automated tool
• 2006 June 8
– Tool demo day, 5 of 6 identified tools presented
• 2006 June - December
– Studied tools
• 2007 January - April
– Selected 2 tools for pilot project
http://sarpresults.ivv.nasa.gov/ViewResearch/106.jsp
Introduction to Natural Language Analysis
• Lexical Analysis
– Uses dictionary words and phrases
– Vague, subjective, imply choice or option
• Syntactical Analysis
– Relates to the syntax or grammar of the language
– Weak phrases, multiplicity, implicit, under-spec
• Statistical Analysis
– Statistical properties of language structure and usage
• Consistency check
– Areas such as units of measure
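The lexical and syntactical checks listed above can be sketched in a few lines of Python. This is only an illustration: the word lists and the `flag_requirement` helper are assumptions made for this sketch, not the rule sets of any tool in the study.

```python
import re

# Illustrative indicator lists (assumptions, not taken from any studied tool).
VAGUE_WORDS = {"easy", "efficient", "clear", "adequate", "fast"}
OPTION_WORDS = {"may", "might", "optionally", "possibly"}
WEAK_PHRASES = ("as appropriate", "if practical", "as a minimum")


def flag_requirement(text):
    """Return a sorted list of (category, match) findings for one requirement."""
    lowered = text.lower()
    words = re.findall(r"[a-z]+", lowered)
    findings = set()
    # Lexical checks: look each word up in the indicator dictionaries.
    for word in words:
        if word in VAGUE_WORDS:
            findings.add(("vague", word))
        if word in OPTION_WORDS:
            findings.add(("option", word))
    # Syntactical checks: weak phrases and possible multiple requirements.
    for phrase in WEAK_PHRASES:
        if phrase in lowered:
            findings.add(("weak phrase", phrase))
    for conjunction in ("and", "or"):
        if conjunction in words:
            findings.add(("multiplicity", conjunction))
    return sorted(findings)


if __name__ == "__main__":
    req = "The system shall be easy to use and may log errors as appropriate."
    for category, match in flag_requirement(req):
        print(f"{category}: {match}")
```

A finding is only a prompt for a human reviewer; for example, not every "and" joins two separate requirements.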
Examples
• Lexical examples
– Ambiguous words: low, bad, clear, easy, efficient, etc.
• Syntactical examples
– Multiple requirements: use of and / or
– Under-specification: e.g. ‘report’, what kind of report
• Statistical Analysis
– Count frequency of words, such as ‘strip’
– If it occurs 50 times in one document, this would indicate that this is an important concept (domain term)
• Consistency Check
– check units e.g. 5 Hz and 5kHz, 10 ft and 10 meters
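The statistical and consistency examples can be sketched as follows. The stop-word list, the unit table, and the function names are assumptions for illustration; the threshold of 50 mirrors the example above. The unit check only reports that a document mixes units of the same kind; it does not judge whether the values themselves agree.

```python
import re
from collections import Counter

# Illustrative stop words and unit table (assumptions for this sketch).
STOP_WORDS = {"the", "a", "an", "of", "to", "and", "or", "shall", "be", "in", "is"}
UNIT_FAMILIES = {
    "hz": "frequency", "khz": "frequency",
    "ft": "length", "meters": "length", "m": "length",
}


def domain_terms(text, min_count=50):
    """Return words that occur at least min_count times (candidate domain terms)."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS)
    return {word: n for word, n in counts.items() if n >= min_count}


def mixed_units(text):
    """Return unit families for which the text uses more than one unit."""
    found = {}
    # Match a number followed by an optional space and a unit-like token.
    for _, unit in re.findall(r"(\d+(?:\.\d+)?)\s*([A-Za-z]+)", text):
        family = UNIT_FAMILIES.get(unit.lower())
        if family:
            found.setdefault(family, set()).add(unit.lower())
    return {family: sorted(units) for family, units in found.items() if len(units) > 1}


if __name__ == "__main__":
    spec = (
        "The strip shall be 10 ft long. The strip holder is 10 meters wide. "
        "The strip sensor samples at 5 Hz and reports at 5kHz."
    )
    print(domain_terms(spec, min_count=3))
    print(mixed_units(spec))
```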
Tools Considered
• TEKchecker USA
• Requirements Assistant (RA) Netherlands
• Specification Analysis Tool (SAT) USA
• Lexior France
• E-Smart/ARM USA NASA
• QuARS Italy
• TIGER (not in study) Australia
=> International Effort <=
Criteria
• Technical Requirements
• Analyst Support Capabilities
• Accuracy
• Interoperability
• Security
• Quality Attributes
• Reliability
• Usability
• Efficiency
• Maintainability
• Portability
• General Vendor Information
• Licensing
• Tool Installation
• Analysis Preparation
• Upload documents into tool
• Execution of tool
• Capture analysis results
• Analyze defects identified
• Generation of reports
Initial Findings
• NASA IVV Selects SAT & RA for pilot program
– Because of their percentage rates in identifying issues
• NASA IVV Objectives moving forward
– Identify issues with implementing an automated tool into the process
– Identify changes or updates needed to best meet the needs of analysts
– Capture and measure effectiveness of tool to assist analysts in identifying issues
– Measure improvement to process (time to analyze with assistance)
SAT and RA
• SAT
– Real time on the fly analysis
– Based on search engine concept
– User defined rules via templates
• RA
– Batch type / overnight processing
– PROLOG expert engine
– Proprietary rules growing since early 80’s
Architecture
Templates
Previous Analysis
SAT Engine
Apache Server
Web Browser
User Documents
Services & Rules
SAT Exports
- Metrics & Results
- Excel & HTML
Help
Works like an Internet search engine, but:
• Runs on your computer
• Returns document text blocks
• Search criteria have many attributes
• Searches saved as templates
• Grouped by rules & services
• Fast: 150 pages in 60 seconds
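The idea of saved searches grouped by rules and services, returning document text blocks, can be sketched as below. This is not SAT's implementation or file format: the `TEMPLATE` layout, the rule patterns, and the function names are hypothetical, and blocks are assumed to be separated by blank lines.

```python
import re

# Hypothetical template: service name -> rule name -> search pattern.
TEMPLATE = {
    "Ambiguity": {
        "Vague terms": r"\b(easy|efficient|adequate)\b",
        "Options": r"\b(may|might|optionally)\b",
    },
    "Structure": {
        "Imperatives": r"\bshall\b",
    },
}


def split_blocks(document):
    """Split a document into text blocks on blank lines."""
    return [block.strip() for block in re.split(r"\n\s*\n", document) if block.strip()]


def run_template(document, template):
    """Return {service: {rule: [text blocks matching the rule]}}."""
    blocks = split_blocks(document)
    results = {}
    for service, rules in template.items():
        results[service] = {}
        for rule, pattern in rules.items():
            regex = re.compile(pattern, re.IGNORECASE)
            results[service][rule] = [b for b in blocks if regex.search(b)]
    return results


def search_counts(results):
    """Reduce the matching blocks to a count per rule, grouped by service."""
    return {
        service: {rule: len(hits) for rule, hits in rules.items()}
        for service, rules in results.items()
    }


if __name__ == "__main__":
    doc = (
        "The system shall be easy to use.\n\n"
        "The operator may cancel a request.\n\n"
        "Background text only."
    )
    print(search_counts(run_template(doc, TEMPLATE)))
```

Keeping the matching text blocks, not just the counts, lets a reviewer read each hit in context.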
Reports
Text Blocks
Metrics
Search Counts
All Words Counts
Settings
Shape
Reading Levels
Evolution
SAT SAT SAT SAT SAT
spec writers
Designers QA Test IV&V
Related Documents
Small 1-10 page
Problem Statements, Project Summaries
Architects
Uncommitted, unclear position: do not begin modeling, decomposing, designing, or implementing until you understand & commit your stakeholders
Surface key Reqs
Consistency, completeness, testability, metrics
Evaluate docs, outline compliance, quality reqs
Specifications
Non Engineering Domains
Started Here
NASA Study
IV&V Challenge
• They are catching problems after the fact
– It is all done unless they arrive day one
• They can use requirement findings to look for potential problems in design and implementation
– If the design, implementation, and test teams were really good, then all problems were wrung out of the system
• Costs still sunk unless team had excellent reqs
• Industry needs to bring these tools into process
– Reduce risk and friction with evaluators
– Put money and schedule where needed
Others' Challenges
• Architects / front end staff forced to accept poor project statements & summaries
– As system engineers rail against management in non system engineering driven organizations
• Designers / Spec writers try to establish baselines
– As politics whittles away at reason & logic
• QA asked to review requirement documents
– Can be impartial watchdog against poor req baseline
• Test asked to create effective test programs
– Like IV&V they can focus attention in poor req areas
NASA Tool Demo Day
• Review time not shortened
– Time usually arbitrarily set
– People hunt & peck until bored or exhausted
• Tool findings more consistent
– Humans tend to miss categories
• Tool finds all problems of certain type
– Humans tend to miss full sets
• Humans better finding domain specific problems
– So give them time to do so and let tool excel in its area
• There are more findings when tool is used
– Humans and machines complement each other
As interpreted by SAT staff
Evolution
[Diagram labels:]
Plain Language Analysis: tech writing
Legislative Analysis: education
Constitutional Analysis: education
Med Transcript Analysis: professional
General Document Analysis (GDA), for policy makers: EU Green Paper (30 pgs), Stern Review (700 pgs), US Climate Change Strategic Plan (400 pgs)
How do individuals, organizations, countries, international bodies solve problems, what are the tools and techniques
From Engineering Domain
Technology Spread
Broad Approaches Service
[Figure: three bar charts of Broad Approaches Service counts, one each for the EU, UK, and US documents. Categories in each chart: Alternatives, Education, Engineering, Government, International, Markets, People, Science, Systems, Technology. Axis ranges: EU 0 to 100, UK 0 to 1500, US 0 to 2000.]
95 Rules grouped into 12 Services
If objects examined, counts are zero
Primary themes
Technology Spread
Broad Approaches Service
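[Figure: bar chart of Broad Approaches Service results as percentages (legend: US %, UK %, EU %). Categories: Alternatives, Education, Engineering, Government, International, Markets, People, Science, Systems, Technology. Axis range: 0 to 35.]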
This is misleading because the count is low
Technology Spread
Again, basically zero
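The caution about percentages built on low counts can be made concrete with a small sketch. The slides do not say how the percentages were computed; the code assumes each category's percentage is its share of a document's total hits within the service. The counts, the `min_total` threshold, and the function names are hypothetical.

```python
def service_percentages(counts):
    """Convert raw hit counts per category into percentages of the total."""
    total = sum(counts.values())
    if total == 0:
        return {name: 0.0 for name in counts}
    return {name: 100.0 * n / total for name, n in counts.items()}


def is_low_count(counts, min_total=100):
    """Flag a chart whose percentages rest on too few hits to compare."""
    return sum(counts.values()) < min_total


if __name__ == "__main__":
    # Hypothetical counts for a short document: 3 of 4 hits looks like 75%.
    short_doc = {"Markets": 3, "Technology": 1}
    print(service_percentages(short_doc))
    print("low count, treat percentages with care:", is_low_count(short_doc))
```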
Policy Services & Rules
• Broad Approaches
– Markets, Technology, Education, Science, Government, People, International, Engineering, Systems, Alternatives
• Institutions
– Academia, Government, Industry, Labs, Think Tanks, NGO, Non Profits, Popular Media, Technical Media, Associations
• Societal Approaches
– Peaceful, Sacrifice, Stressed, Violent
• Organizational Tools
– Technical, Process, Management, Gestalts, Implementation
• Nation State Tools
– Policy, Legislation, Regulation, Deregulate, Tax, Military, Departments, Resources, Proactive, Transparency
Technology Spread
Policy Services & Rules
• International Tools
– Policy, Resources, Proactive
• Social Warnings
– General, Human, Anger, Environment, Health, Economic, Earth
• National Roles
– Religion, Defense, Money, Wealth, Education, Agriculture, Health, State, People, Labor
• International Roles
– People, Government, Industry, International, Academia, Community, Money, Time
• Special Interests
– Defense, Intelligence, Food, Health, Pharmaceuticals, Chemicals, Mining, Energy
Technology Spread
Other Services
[Figure: two bar charts of service results.
Chart 1 (axis 0 to 35; legend: US %, UK %, EU %): Alternatives, Education, Engineering, Government, International, Markets, People, Science, Systems, Technology, Academia, Associations, Government, Industry, Labs, NGO, Non Profits, Popular Media, Technical Media.
Chart 2 (axis 0 to 12): Think Tanks, Peaceful, Sacrifice, Stressed, Violent, Gestalts, Implementation, Management, Process, Technical, Departments, Deregulate, Legislation, Military, Policy, Proactive, Regulation, Resources, Tax, Transparency, Policy, Proactive, Resources, Anger, Earth, Economic, Environment, General, Health, Human.]
Technology Spread
[Figure: two bar charts of service results.
Chart 1 (axis 0 to 70): Agriculture, Defense, Education, Health, Labor, Money, People, Religion, State, Wealth, Academia, Community, Government, Industry, International, Money, People, Time, Chemicals, Communications, Defense, Education, Energy, Entertainment, Environment.
Chart 2 (axis 0 to 25): Financial, Food, Groups, Health, Intelligence, Manufacturing, Mining, Non Profit, Pharmaceuticals, Property, Science, Technology, Transportation, Utilities, Fear, Happy, Hate, Love, Sad, Negative, Positive.]
Other Services
Click around, look at text blocks, refine searches, gain other insights
There are Hits here, you just can’t see them
Technology Spread
Evolution
• Next time you hear a politician speak, listen for the tools and techniques used to solve problems
– There is a difference, and after using GDA on a major policy document you will never view a policy speech the same way again
• How did we get from NASA IV&V Natural Language Analysis of Requirements to policy document analysis?
– Something about technology spread
Technology Spread
Where Do These Tools Fit
Original view of supporting development was significantly expanded
• Concept time
– Mine related documents prior to writing req’s
• Development Time
– Help write clean req’s
• Verification and Validation
– Assess req’s
Engineering Domain
Why Do It
• Building helicopters & ATC systems was hard for our parents but should be easy for us
– Everyone takes it for granted, rife with politics & agendas
– Expectations high so projects overloaded with features
• Global warming & sustainable development are new challenges in this century for us
– Rife with politics & external agendas
– Civilization has techniques individuals, organizations, countries, international bodies use to solve problems
• This is hard - doing things
Final Comments
• Something like impartial document analysis can stop the politics and external agendas
– Professor Harry G. Frankfurt, On Bullshit, Princeton University, ISBN: 0-691-12294-6
• This is very important technology
– Creating rules & analyzing 700+ page documents in 3 days in areas you have no expertise is an eye opener
• It is an international effort, it will not go away
• In 10 years NLA of documents may be as common as word processing, email, & the Internet
Challenge
Can these tools be used for architecture validation?
• Pull a view of the architecture - picture
• Using the picture create services in SAT
– Interfaces, Subsystems, Capabilities
• Drop the specifications into SAT
• Review the mined results
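One way to picture this challenge is a coverage check: treat each architecture element as a search term, mine the specification statements that mention it, and list the elements no statement refers to. The sketch below is an assumption about how such a check might look, not a description of SAT; the element names, sample statements, and function names are hypothetical.

```python
import re


def mine_architecture(architecture, spec_statements):
    """Return {group: {element: [statements that mention the element]}}."""
    mined = {}
    for group, elements in architecture.items():
        mined[group] = {}
        for element in elements:
            # Whole-word, case-insensitive match on the element name.
            regex = re.compile(r"\b" + re.escape(element) + r"\b", re.IGNORECASE)
            mined[group][element] = [s for s in spec_statements if regex.search(s)]
    return mined


def unreferenced(mined):
    """List architecture elements that no specification statement mentions."""
    return sorted(
        element
        for group in mined.values()
        for element, hits in group.items()
        if not hits
    )


if __name__ == "__main__":
    # Hypothetical architecture view and specification statements.
    architecture = {
        "Subsystems": ["Radar", "Display"],
        "Interfaces": ["Ethernet"],
    }
    specs = [
        "The radar shall update tracks every second.",
        "The display shall show radar tracks.",
    ]
    mined = mine_architecture(architecture, specs)
    print("Not mentioned in the specification:", unreferenced(mined))
```

An element with no hits is a question for the reviewer, since the specification may describe it under a different name.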
Links
• Evaluation of Current Requirements Analysis (RA) Tools Capabilities for IVV in the RA Phase
– http://sarpresults.ivv.nasa.gov/ViewResearch/106.jsp
• Specification Analysis Tool (SAT)
– http://www.cassbeth.com/sat/index.html
• Requirements Assistant (RA)
– http://www.requirementsassistant.nl/