[Figure: processing pipeline. Text → Parser → Graph → ObjectSage → Repository → Language → Program, e.g. class CClass1 { public: void Method1(); private: int Field1; };]
The ObjectSage: Building object models from textual descriptions
Andrew [email protected]
2
Software CASE tools: state of the art
• UML modeling
• Partially automatic code generation
• Refactoring browsers (occasionally)
• Context-sensitive search and filters
• Visual interface building
• Business logic support
Design and coding only – no analysis
3
Abbott’s method
In 1983 Russell J. Abbott formulated a method of program development based on informal English descriptions.
Basics:
• English syntax carries enough information to classify an abstraction as an object, an attribute or a method.
4
Abbott’s method
• A (syntactic) subject is considered to be an object.
• A predicate is considered to be a method.
• All modifiers are considered to be attributes of objects or parameters of methods.
The suggested approach helps to perform object-oriented analysis.
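The three rules above can be written down as a small lookup. This is only a sketch of the mapping as stated on the slide; the names GrammaticalRole, ModelElement and classify are hypothetical and are not taken from The ObjectSage.

```cpp
// Grammatical roles named in Abbott's mapping.
enum class GrammaticalRole { Subject, Predicate, Modifier };

// Kinds of object-model element the roles map to.
enum class ModelElement { Object, Method, AttributeOrParameter };

// Direct transcription of the three rules: subject -> object,
// predicate -> method, modifier -> attribute or parameter.
constexpr ModelElement classify(GrammaticalRole role) {
    switch (role) {
        case GrammaticalRole::Subject:   return ModelElement::Object;
        case GrammaticalRole::Predicate: return ModelElement::Method;
        case GrammaticalRole::Modifier:  return ModelElement::AttributeOrParameter;
    }
    return ModelElement::AttributeOrParameter;  // not reached for valid roles
}
```

Whether a modifier becomes an attribute or a parameter depends on what it modifies, which this sketch deliberately leaves open.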
5
Abbott’s problem
«Although the process we follow in formalizing the strategy may appear mechanical, it is not (given the current state-of-the-art of computer science) an automatable procedure.
The process of identifying the data types, objects, operators, and control structures, even given the English informal strategy, requires a great deal of real-world knowledge and an intuitive understanding of the problem domain.
It is not just a matter of examining the English syntax.»
Russell J. Abbott
6
The ObjectSage
The ObjectSage is an attempt to solve Abbott’s problem automatically.
Input: an informal textual description.
Output: a C++ *.h file with class declarations.
The existing prototype has some restrictions described below.
7
Current restrictions
Text:
– no pronouns
– just a few syntactical structures
– no modal verbs etc.
Object model:
– no variables (types only)
– no return values for functions
– almost no simple types
8
Demo
Just look at this! It seems to be working!
Launch the demo
9
Troubles
• Different meanings of the same word
“Time flies like arrows” © M. Geipel
• Insufficient information
“It’s raining”: who or what is “it”?
• Unknown words (specific to the problem domain)
“A fricosoid could be either cormable or discormable”
• …
Note: there is no need to understand the exact meaning of each word or sentence; in most cases relative semantics is enough!
10
Mathematical model
First, the input text is translated into an Object Relation Graph (ORG).
This is done using the Link Grammar Parser (LGP), developed at Carnegie Mellon University (USA):
http://www.link.cs.cmu.edu/link/
LGP outputs a separate link graph for each sentence. From these we build ORG subgraphs, which could be connected through pronouns (currently unsupported).
[Figure: Text → Parser → Graph]
11
«There is a shop assistant in the shop.»
[Figure: ORG fragment. Nodes: Shop assistant, Shop, Shop assistant. Edge labels: Typification; Aggregation or Attribute. Legend: Type, Belong, Object, Attribute. Note: the two words are joined into a phrase, and this phrase gives out two nodes.]
Result: the Shop object has a shopAssistant attribute of the type ShopAssistant.
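For this sentence, the declarations one might expect in the generated header look roughly as follows. This is a hand-written sketch that follows the public/private layout of the CClass1 sample; it is not actual output of the prototype.

```cpp
// Type node "Shop assistant": this sentence alone tells us nothing about its members.
class ShopAssistant {
};

// Object node "Shop" with one attribute whose type is ShopAssistant.
class Shop {
private:
    ShopAssistant shopAssistant;
};
```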
12
«Pete, shop assistant, sells food.»
[Figure: ORG fragment. Nodes: Pete, Shop assistant, Sell, Food (parameter), Food (type). Edge labels: Generalization, Parametrization. Legend: Type, Belong, Class, Attribute, Param, Method, Parameter, Inherit, Formal parameter type.]
Result: the Pete object has a sell method which takes a Food parameter of the type Food.
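Transcribed by hand into C++, the generalization and parametrization edges of this sentence would give something like the sketch below. The void return type reflects the prototype's restriction that functions have no return values; the compile-time checks are added here only for illustration.

```cpp
#include <type_traits>

class Food {
};

class ShopAssistant {
};

// "Pete, shop assistant, ..." yields a generalization edge: Pete inherits ShopAssistant.
class Pete : public ShopAssistant {
public:
    // "... sells food": method sell with a formal parameter of the type Food.
    void sell(Food food);
};

// Illustrative checks that both edges were transcribed as intended.
static_assert(std::is_base_of<ShopAssistant, Pete>::value,
              "Pete should derive from ShopAssistant");
static_assert(std::is_same<decltype(&Pete::sell), void (Pete::*)(Food)>::value,
              "sell should take a Food and return nothing");
```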
13
Pronoun connection between two sentences
With a pronoun: There are shop assistants in the shop. They sell things to customers.
With the pronoun resolved: There are shop assistants in the shop. Shop assistants sell things to customers.
14
Principles of operation
The ORGraph produced by the LGP part is processed by the ObjectSage according to the following principle:
ORG vertices with the same name give out a single element of the resulting structure.
Objects are joined into classes, attribute vertices into attributes, and so on.
Finally we get a set of classes called the Repository.
[Figure: Graph → ObjectSage → Repository]
15
Objects are joined into classes
[Figure: ORG fragments pairing Shop assistant with Sell, with Language and with Talk, plus the nodes Language and Experience, are joined into a new class:]
ShopAssistant
experience, language
sell(), talk()
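The joining step can be sketched as grouping graph edges by the name of the object they start from. The types OrgEdge, ClassRecord and Repository and the function buildRepository are hypothetical names chosen for this illustration; the real ObjectSage data structures are not described in the slides.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

enum class EdgeKind { Attribute, Method };

// One simplified ORG edge: an object vertex linked to an attribute or a method vertex.
struct OrgEdge {
    std::string object;
    EdgeKind kind;
    std::string target;
};

// One repository record: the members collected for a class.
struct ClassRecord {
    std::set<std::string> attributes;
    std::set<std::string> methods;
};

using Repository = std::map<std::string, ClassRecord>;

// Vertices with the same name end up in the same record, so repeated
// mentions of an object across sentences accumulate into one class.
Repository buildRepository(const std::vector<OrgEdge>& edges) {
    Repository repo;
    for (const OrgEdge& edge : edges) {
        assert(!edge.object.empty());  // vertex names are expected to be non-empty
        ClassRecord& record = repo[edge.object];
        if (edge.kind == EdgeKind::Attribute) {
            record.attributes.insert(edge.target);
        } else {
            record.methods.insert(edge.target);
        }
    }
    return repo;
}
```

Feeding it the four edges from the figure (sell, talk, language, experience, all starting at ShopAssistant) gives a single ShopAssistant record with two attributes and two methods.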
16
Attribute group gives out an attribute
[Figure: three ORG fragments attach Language vertices to Shop assistant, with the values European language, Language and English. The group is merged into the single language attribute of the class:]
ShopAssistant
experience, language
sell(), talk()
17
Data manipulations
All the actions are performed using data structures specially organized for this purpose. The two main ones are:
Word dictionary: contains all the words used in the original description, each connected with the vertex or repository record that represents it.
Category structure (thesaurus): organizes all the words into semantically related blocks, called categories.
Currently ObjectSage supports only a flat category structure, but it should be organized like a file system, i.e. as a tree.
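A tree-like category structure of the kind suggested here could be kept as a child-to-parent map, much like directories in a file system. The class CategoryTree and its methods are hypothetical, and the example nesting of categories in the usage note is an assumption made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Categories stored as child -> parent links; root categories have no entry.
class CategoryTree {
public:
    void add(const std::string& category, const std::string& parent) {
        assert(category != parent);  // a category cannot contain itself
        parent_[category] = parent;
    }

    // True if `category` is `ancestor` itself or lies somewhere below it.
    bool isWithin(const std::string& category, const std::string& ancestor) const {
        std::string current = category;
        // The walk is bounded by the number of links, so a malformed
        // (cyclic) structure cannot loop forever.
        for (std::size_t step = 0; step <= parent_.size(); ++step) {
            if (current == ancestor) return true;
            auto it = parent_.find(current);
            if (it == parent_.end()) return false;
            current = it->second;
        }
        return false;
    }

private:
    std::map<std::string, std::string> parent_;
};
```

With add("Goods", "Shop") and add("Food", "Goods"), the query isWithin("Food", "Shop") is true, while isWithin("Shop", "Food") is false. A flat structure is the special case in which no category has a parent.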
18
Background knowledge
• Categories (partially supported)
Words are grouped by semantics.
• Privileges (unsupported)
The problem domain may have more and less closely related areas, which could be described by semantic privileges.
• Primitive types (unsupported)
Not all data is represented by classes; there are also simple integers, characters and strings.
• Existing classes (unsupported)
An existing architecture could give some guidelines when adding new classes.
19
Data scheme
[Figure: the main data structures and their contents for a sample source text.
Dictionary: Human, Name, Say, Word
Object Relation Graph: Human, Name, Say, Word
Categories: Humanity, Personal, Actions, Languages
Class Repository:
Word: data, charset; setAt(), getAt()
Human: name, language; walk(), say()]
20
Repository refactoring
Initially we do not get any class hierarchy, just a heap of unconnected classes.
To improve the quality of the model we use several heuristics, most of which are aimed at determining inheritance.
21
Specifiers
One of the most difficult tasks performed by The ObjectSage is inheritance recognition. The most reliable method here is to use specifiers.
A specifier is an attribute that is expressive enough for its presence to signal that an object belongs to a new subclass.
When a specifier has been found, we decide to create a new subclass; this is the most reliable heuristic used by The ObjectSage.
Specifiers are identified as attributes that occur very consistently with a group of objects.
22
Class specifiers
[Figure: class Goods with getCost(), and the classes PieceGoods and WeightGoods, each with its own getCost().]
For goods sold by piece, the cost is the product of an integer piece count and the price.
For goods sold by weight, the cost is the product of a fractional weight, a unit coefficient and the price.
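The two cost rules show why the specifier leads to separate subclasses: each one needs its own getCost(). The sketch below is a hand-written illustration, not generated output; the double return type, the constructor parameters and the member names are assumptions, since the prototype emits neither return values nor simple types.

```cpp
#include <cassert>

class Goods {
public:
    virtual ~Goods() = default;
    virtual double getCost() const = 0;
};

// Goods sold by piece: cost = piece count * price.
class PieceGoods : public Goods {
public:
    PieceGoods(int pieceCount, double price)
        : pieceCount_(pieceCount), price_(price) {
        assert(pieceCount >= 0);  // a negative number of pieces makes no sense
    }
    double getCost() const override { return pieceCount_ * price_; }

private:
    int pieceCount_;
    double price_;
};

// Goods sold by weight: cost = weight * unit coefficient * price.
class WeightGoods : public Goods {
public:
    WeightGoods(double weight, double unitCoefficient, double price)
        : weight_(weight), unitCoefficient_(unitCoefficient), price_(price) {}
    double getCost() const override { return weight_ * unitCoefficient_ * price_; }

private:
    double weight_;
    double unitCoefficient_;
    double price_;
};
```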
23
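Why do we take a whole category as a specifier?
[Figure: the modifiers Weight, Piece and Pack each occur with Goods and belong to the category “Sold by ...”. Some of them are marked as frequently used and others as rarely used. The category as a whole gives PackGoods, WeightGoods and PieceGoods, each with getCost().]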
24
Attribute merging
When we create subclasses, we have to define their interfaces according to the ORGraph.
Attributes pulled up into the superclass may have different types; they are to be merged into one attribute of a superclass type.
When only the values of an attribute (not its name) occur in the description, those values are to be merged into one attribute according to the category structure.
25
Attribute typification
[Figure: the values English, Russian and German are each linked to Language, which becomes the type of a single Language attribute.]
Attribute type cases:
1. Full type information (Class)
2. Category only
3. No type information
26
Categories are used as attributes
[Figure: Green, Red and Blue each occur with Car and belong to the category Color.]
Car
brand, color
drive(), stop()
Color
RED, GREEN, BLUE
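In C++ the result on this slide maps naturally onto an enumeration plus an attribute of that type. This is a hand-written sketch; the std::string type for brand and the default colour are assumptions, because the text gives no type information for brand and the prototype supports almost no simple types.

```cpp
#include <string>

// The category Color becomes a type whose values are the words found in the text.
enum class Color { RED, GREEN, BLUE };

class Car {
public:
    void drive() {}
    void stop() {}

    std::string brand;         // assumed type: the description gives none
    Color color = Color::RED;  // arbitrary default, chosen only for this sketch
};
```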
27
Method merging
Methods are similar to classes. Arguments (parameters) are processed like attributes: they are to be merged, typified, etc.
That is why methods may have specifiers too.
A method specifier is a parameter that is expressive enough for its presence to signal the existence of a new method.
28
Method specifiers
[Figure: three ORG fragments with the same class (Human) and the same method name (Do), but different, frequently used parameters: Dance, Deal and Sum.]
Human
doDeal(), doSum(), doDance()
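The naming pattern visible in the result (doDance, doDeal, doSum) can be reproduced by joining the method name and the specifier in lower camel case. The function specializedMethodName is a hypothetical helper, and the naming rule is inferred from the three examples on the slide rather than stated by the authors.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Joins a method name and its specifier, e.g. "Do" + "Dance" -> "doDance".
std::string specializedMethodName(std::string method, std::string specifier) {
    assert(!method.empty() && !specifier.empty());
    method[0] = static_cast<char>(std::tolower(static_cast<unsigned char>(method[0])));
    specifier[0] = static_cast<char>(std::toupper(static_cast<unsigned char>(specifier[0])));
    return method + specifier;
}
```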
29
State-of-the-art inheritance hierarchy
Note that the Food class is not identified as a descendant of Goods, although semantically it should have been.
[Figure: Goods with getCost(), PieceGoods and WeightGoods each with getCost(); other classes shown: Food, Dealer, ShopAssistant, SlotMachine.]
This problem is solved using categories...
30
Category-driven inheritance
[Figure: the repository (Goods with getCost(), PieceGoods, WeightGoods, Food) next to the category structure (Shop, Goods, Food, Machines, Humanity). A category-driven generalization links Food to Goods.]
An edge is moved from the category structure to the repository.
31
Pull up method or attribute
[Figure: Food, Goods and ConsumerGoods, each shown with price and sell().]
Interface elements with the same names are pulled up into the superclass.
A new subclass might have some members that also occur in its sibling classes. These members are to be pulled up into the superclass.
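One way to read the pull-up rule is as a set intersection: members present in every sibling move to the superclass. The function pullUpCommon is a hypothetical sketch, and requiring a member to occur in all siblings is an assumption, since the slide does not say how many siblings must share it.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Removes the members shared by all sibling classes from each of them
// and returns those members so they can be added to the superclass.
std::set<std::string> pullUpCommon(std::vector<std::set<std::string>>& siblings) {
    if (siblings.empty()) return {};

    std::set<std::string> common = siblings.front();
    for (const std::set<std::string>& members : siblings) {
        std::set<std::string> kept;
        std::set_intersection(common.begin(), common.end(),
                              members.begin(), members.end(),
                              std::inserter(kept, kept.begin()));
        common = std::move(kept);
    }

    for (std::set<std::string>& members : siblings) {
        for (const std::string& name : common) {
            members.erase(name);
        }
    }
    return common;
}
```

For Food {price, sell} and ConsumerGoods {price, sell, warranty}, where warranty is a made-up member used only to show the effect, the call returns {price, sell} for the superclass and leaves warranty in ConsumerGoods.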
32
C++ Output
[Figure: Repository → Language → Program, e.g. class CClass1 { public: void Method1(); private: int Field1; };]
The constructed repository is isomorphic to a UML class diagram, so it could be transcribed into any object-oriented language.
C++ is supported now.
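Transcribing a repository record into C++ amounts to printing a class declaration. The sketch below reproduces the layout of the CClass1 sample (public methods returning void, private fields); the ClassDecl and Attribute structures and the emitHeader function are hypothetical and do not describe the prototype's actual code generator.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

struct Attribute {
    std::string type;
    std::string name;
};

// One repository record, reduced to what a declaration needs.
struct ClassDecl {
    std::string name;
    std::string base;  // empty when the class has no superclass
    std::vector<Attribute> attributes;
    std::vector<std::string> methods;
};

// Renders the record as a C++ class declaration.
std::string emitHeader(const ClassDecl& decl) {
    assert(!decl.name.empty());
    std::ostringstream out;
    out << "class " << decl.name;
    if (!decl.base.empty()) {
        out << " : public " << decl.base;
    }
    out << " {\npublic:\n";
    for (const std::string& method : decl.methods) {
        out << "    void " << method << "();\n";  // the prototype has no return values
    }
    out << "private:\n";
    for (const Attribute& attribute : decl.attributes) {
        out << "    " << attribute.type << " " << attribute.name << ";\n";
    }
    out << "};\n";
    return out.str();
}
```

Because the repository only stores names, bases and members, the same record could be rendered for another object-oriented language by swapping this one function.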
33
Usage
• Pre-processing of the requirements
• Incremental architecture building
• Extending an existing architecture
Classes already created by human developers can give guidelines to The ObjectSage.
34
eXtreme Programming
The ObjectSage seems to be useful in the XP process:
• User stories could be processed incrementally, using the existing architecture
• Each user story is semantically homogeneous, so there is no misunderstanding
• Refactoring makes it possible to improve the architecture quickly
35
Thank you for your attention.
Any questions?