1 DL For Software Engineering Semantic Web Salman Mirghasemi salmir19[at]yahoo.com.

Date post: 14-Dec-2015

DL For Software Engineering

Semantic Web

Salman Mirghasemi

salmir19[at]yahoo.com


Content

DLHB (Chapter 11)
  Introduction
  Background
  LaSSIE
  CODEBASE
  CSIS & CBMS

Overview


DLHB11- Introduction

One of the first applications of DL was in software engineering.

Specifically, for program understanding during software maintenance of enormous systems (over a million lines of code).

Q: How can DL play a role?

A: Answering this requires understanding the basic problems of software engineering “in the large”.


DLHB11- Background

Principal tasks:

1. Pro-active (testing)

2. Reactive (debugging)

3. Enhancement

All three require understanding the software.


DLHB11- Background-2

Programmers typically solve problems by realizing “plans” in their programs.

Delocalized plans are a serious impediment to plan recognition.

Studies in this area had started, but most of them looked at small, domain-independent programs.

But how do these results apply to huge, domain-specific programs?

A group at AT&T attempted to address this problem by studying the maintainers of a large software system.


DLHB11- Background-3

They found:

60% of the time was spent performing simple searches across the entire software system.

For a large software system whose source code is spread out over a large number of files in a deep and complex directory structure, finding the definition of a data-type with tools such as find and grep was both difficult and time-consuming.

Also, understanding the domain in which the software operates is required for understanding the software.


DLHB11- LaSSIE

They developed a new concept: the Software Information System.

Software Information System (SIS): An information system which treats the software system source code itself as data and helps maintainers find the required information faster.

LaSSIE (the first SIS) was developed to assist in understanding AT&T’s Definity 75/85 software system.

LaSSIE contained two components: a Domain Model and a Code Model.


DLHB11- LaSSIE-2

Code Model

The code model was implemented with a simple ontology of source code elements, which was derived empirically from the basic kinds of searches maintainers performed.

The knowledge-base (the actual assertions about individual functions, files, data-types, etc.) was populated automatically from the source code.
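
As a rough illustration of populating such a knowledge-base automatically, the sketch below walks a parsed source file and emits assertions about functions, variables, and calls. It is not LaSSIE's extractor: it uses Python's standard ast module on a made-up Python snippet (the original system was not written in Python), and the role names instance-of, defined-in and calls, as well as the file name calls.py, are assumptions made for the example.

```python
import ast

# Hypothetical stand-in for one file of the maintained system.
SOURCE = '''
dial_tone = 0

def start_call():
    global dial_tone
    dial_tone = 1

def end_call():
    start_call()
'''

def populate_abox(source, filename):
    """Return a set of (individual, role, filler) assertions found in source."""
    facts = set()
    tree = ast.parse(source)
    for node in tree.body:
        if isinstance(node, ast.FunctionDef):
            facts.add((node.name, "instance-of", "FUNCTION"))
            facts.add((node.name, "defined-in", filename))
            # Record direct calls to plainly named functions inside this function.
            for inner in ast.walk(node):
                if isinstance(inner, ast.Call) and isinstance(inner.func, ast.Name):
                    facts.add((node.name, "calls", inner.func.id))
        elif isinstance(node, ast.Assign):
            # Module-level assignments are treated as global variable definitions.
            for target in node.targets:
                if isinstance(target, ast.Name):
                    facts.add((target.id, "instance-of", "VARIABLE"))
                    facts.add((target.id, "defined-in", filename))
    return facts

facts = populate_abox(SOURCE, "calls.py")
for fact in sorted(facts):
    print(fact)
```

Because the assertions are derived from the code alone, rerunning the extractor after every change keeps the knowledge-base in step with the software.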


DLHB11- LaSSIE-3

Domain Model

The domain model was reverse engineered from the code and from contact with the domain experts.

It contained knowledge about the telephony domain, i.e. the things the software system dealt with. These included entities such as telephones, microphones, cables, cable-trunks, etc.


DLHB11- LaSSIE-4

Model          Code                                                Domain
TBOX (size)    20                                                  200
ABOX (size)    thousands                                           few
Example        What is the data-type of the variable dial-tone?    What is a dial-tone?


DLHB11- LaSSIE-5

Example 1

Question: “Where is this variable used?”

If a maintainer uses grep to find the answer, he/she must remove various kinds of “semantic noise” such as:

1. variables with longer names that include the desired variable name.

2. names of functions that include the desired variable name.

3. comments that include the variable name.

4. other non-variable string matches.


DLHB11- LaSSIE-6

LaSSIE immediately solved these problems by identifying “variable” as a semantic category.

Limiting the search to variables would remove up to 80% of the noise.

For a variable, it is simple to automatically determine:

1. The file it was defined in.

2. Each function in which the variable was used.
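
The difference between a textual search and a search limited to a semantic category can be sketched in a few lines. The entity names and the category labels below are hypothetical, chosen only to reproduce the kinds of noise listed in Example 1; this is not LaSSIE's query language.

```python
# Hypothetical code-model individuals: (name, semantic category).
ENTITIES = [
    ("dial_tone", "VARIABLE"),
    ("dial_tone_timeout", "VARIABLE"),          # longer variable name containing the target
    ("play_dial_tone", "FUNCTION"),             # function name containing the target
    ("/* reset dial_tone here */", "COMMENT"),  # comment mentioning the target
]

def grep_like(pattern):
    """Plain substring match, as a textual search would do."""
    return [name for name, _ in ENTITIES if pattern in name]

def find_variable(name):
    """Match only individuals classified as VARIABLE with exactly this name."""
    return [n for n, category in ENTITIES if category == "VARIABLE" and n == name]

print(grep_like("dial_tone"))      # every entry matches, including the noise
print(find_variable("dial_tone"))  # only the variable itself
```

In this toy data the textual search returns all four entries, while the category-limited query returns only the variable being asked about.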


DLHB11- LaSSIE-7

Example 2

Question: “Which actions are initiated by a user?”

If a maintainer uses grep to find the answer, the semantic noise would clearly be quite high in such a case.

The code is organized around specific functions, not around specific domain concepts, and of course multiple “views” of the code are not supported.

LaSSIE contained a facility that let maintainers define new domain concepts and add them to the domain model simply by assigning them a name (e.g., USER-ACTION).


DLHB11- LaSSIE-8

To further improve the process, it was observed that:

The connection between the domain and code models needed to be made by hand. This was time-consuming to create and difficult to maintain, since the domain model changed over time as new features were added to the software system.

The code model, though incredibly simple, was used far more frequently than the domain model, and became an important part of every maintainer’s tool set. It did not, however, eliminate the searches maintainers made, and therefore did not completely replace find and grep.


DLHB11- CODEBASE

Because the code model proved quite useful and easy to maintain, the demand for it began to increase. This introduced two problems for the LaSSIE SIS:

1. It was main memory based. The software contained many thousands of functions, variables, and files. More importantly, the complexity of the function call graph, variable usage graph, and location maps exceeded one million. It was not possible to store this amount of information in the main memory of any computer at that time.

2. The natural language interface, while simple and easy to understand, did not facilitate using the system quickly.


DLHB11- CODEBASE-2

The CODEBASE system offered solutions to both of these problems.

Solution for the first problem: Offline storage of individuals. The TBOX was always kept in memory. Individuals were kept on disk, in a technique similar to virtual memory.

Solution for the second problem: Provided numerous graphical tools for viewing and browsing the information in the knowledge base.
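
The storage idea can be sketched as follows, under loud assumptions: the class IndividualStore, the SQLite table layout, and the least-recently-used cache are all invented for illustration and say nothing about how CODEBASE was actually built. The sketch only shows the split described above, with concept definitions resident in memory and individuals paged in from disk on demand.

```python
import json
import os
import sqlite3
import tempfile
from collections import OrderedDict

class IndividualStore:
    """Keeps concept definitions in memory and pages individuals in from disk."""

    def __init__(self, path, tbox, cache_size=2):
        self.tbox = tbox              # always resident, like the TBOX
        self.cache = OrderedDict()    # small in-memory cache of recently used individuals
        self.cache_size = cache_size
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS individuals (name TEXT PRIMARY KEY, facts TEXT)"
        )

    def add(self, name, facts):
        self.db.execute(
            "INSERT OR REPLACE INTO individuals VALUES (?, ?)", (name, json.dumps(facts))
        )
        self.db.commit()
        self.cache.pop(name, None)    # drop any stale cached copy

    def get(self, name):
        if name in self.cache:
            self.cache.move_to_end(name)      # mark as most recently used
            return self.cache[name]
        row = self.db.execute(
            "SELECT facts FROM individuals WHERE name = ?", (name,)
        ).fetchone()
        if row is None:
            raise KeyError(name)
        facts = json.loads(row[0])
        self.cache[name] = facts
        if len(self.cache) > self.cache_size:
            self.cache.popitem(last=False)    # evict the least recently used individual
        return facts

path = os.path.join(tempfile.mkdtemp(), "codebase.db")
store = IndividualStore(path, tbox={"FUNCTION": "CODE-ELEMENT", "VARIABLE": "CODE-ELEMENT"})
store.add("start_call", {"instance-of": "FUNCTION", "defined-in": "calls.c"})
store.add("dial_tone", {"instance-of": "VARIABLE", "defined-in": "tones.c"})
store.add("end_call", {"instance-of": "FUNCTION", "defined-in": "calls.c"})
for name in ("start_call", "dial_tone", "end_call"):
    store.get(name)   # the third lookup evicts start_call from the cache
```

A later lookup of start_call still succeeds: it is simply read back from disk, much as a paged-out memory page would be.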


DLHB11-CSIS & CBMS

Two issues were brought to light by the LaSSIE system.

1. The deterioration of the domain model over time was another manifestation of the classic software documentation problem: the same information being stored in different ways. The code model stayed relevant because it was automatically generated from the only thing that had to be maintained: the software. It did not, therefore, need to be maintained separately to remain accurate. The documentation and the domain model were different representations of the knowledge that was, perhaps implicitly, in the code. These representations always lagged the “real” one, since they had to be maintained independently.

2. The delocalization of information in software, which is the central obstacle to code understanding, required new ways of viewing the code. Looking at code on the screen, analogously to the heuristics for operating system virtual memories, is inherently two-dimensional. It does not allow for relationships between code-level entities to be viewed, or localized.


DLHB11-CSIS & CBMS-2

The first step in determining how to address these problems was to perform further studies of programmers involved in discovery to gain more detailed insight into specifically what they were doing. One such study, in this case of programmers maintaining a moderate-sized object-oriented software system, found that the most common high level queries were:

1. Where is this variable modified?

2. What are the available slots and methods on this instance?

3. What is the data-type of this variable or function?

4. What are the super-classes of this class?

5. What does this function return?

6. Does this function have side-effects?

7. Is this data-type used?


DLHB11-CSIS & CBMS-4

The ability to represent everything in the code requires a deeper ontology of code-level software elements than the original LaSSIE ontology, one that includes statements, blocks, conditions, etc. In fact, every syntactic element of the programming language is in the ontology.


DLHB11-CSIS & CBMS-3

Comprehensive Software Information System (CSIS)

Code-Based Management System (CBMS)

The idea of CBMS was to define the most precise level of granularity of representation needed to have complete knowledge of the software system in the knowledge-base.

A CBMS is based on a full-scale parse of the code to construct an abstract syntax tree (AST), which is basically the parse tree. The AST has all the information of the source code, such that the source code can be completely generated from the AST.
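
The parse-then-regenerate idea can be tried with Python's standard ast module (ast.unparse requires Python 3.9 or later). This is only an analogy to the CBMS approach, not its implementation, and it is weaker than the claim above: Python's AST drops comments and layout, so the regenerated text is equivalent code rather than a byte-for-byte copy. The two-line snippet is made up for the example.

```python
import ast

# Hypothetical two-statement program to parse.
source = "total = price * quantity\nprint(total)\n"

tree = ast.parse(source)

# Every syntactic element is a typed node, which is what a deep code ontology classifies.
node_kinds = [type(node).__name__ for node in ast.walk(tree)]
print(node_kinds)

# Regenerate source text from the tree alone (Python 3.9+).
regenerated = ast.unparse(tree)
print(regenerated)

# Parsing the regenerated text yields a structurally identical tree.
same_structure = ast.dump(ast.parse(regenerated)) == ast.dump(tree)
```

Node kinds such as Assign, BinOp and Call play the role that statements, blocks and conditions play in the deeper ontology.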


DLHB11-CSIS & CBMS-5

control flow, dataflow, call-graphs.


DLHB11-CSIS & CBMS-6

Using hypertext links.

Anything in {…} is a hypertext link.

Control flow: “next”

Data flow: “new-value”


DLHB11-CSIS & CBMS-7

Other advantages of the CBMS approach:

Role inverse: Changes & Changed-By

Path tracing: ChangedInFunction instead of ChangedBy o implementationOf (only one click)

Subsumption
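
Role inverse and path tracing can be sketched by treating each role as a set of pairs. The statement, variable and function names are hypothetical, and the reading of the roles is an assumption: here Changes relates a statement to the variable it modifies, and implementationOf relates a statement to the function it belongs to.

```python
def inverse(role):
    """Swap the direction of every pair in a role."""
    return {(b, a) for a, b in role}

def compose(first, second):
    """Follow first, then second: (a, c) whenever a -first-> b and b -second-> c."""
    return {(a, c) for a, b in first for b2, c in second if b == b2}

# Assumed reading: statement Changes variable; statement implementationOf function.
changes = {("stmt_12", "dial_tone"), ("stmt_40", "ring_count")}
implementation_of = {("stmt_12", "start_call"), ("stmt_40", "handle_ring")}

changed_by = inverse(changes)                                   # variable -> statement
changed_in_function = compose(changed_by, implementation_of)    # variable -> function

print(sorted(changed_in_function))
```

Naming the composed path once means a maintainer follows a single link from a variable to the functions that change it, instead of two.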


DLHB11-CSIS & CBMS-8

Answering the sixth most commonly asked question.

Does this function have side-effects?

Side effects:

1. A change to a global variable

2. Any sort of output

3. A call to a method that has a side-effect
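
Because the third case is recursive, answering the question amounts to propagating the side-effect property backwards along the call graph until nothing changes. The sketch below does this with a simple fixed-point loop; the function names and the call graph are hypothetical, and a DL reasoner would reach the same classification through concept definitions rather than an explicit loop.

```python
def functions_with_side_effects(direct, calls):
    """Return every function that has a side-effect.

    direct: functions that change a global variable or perform output.
    calls:  dict mapping each function to the set of functions it calls.
    """
    tainted = set(direct)
    changed = True
    while changed:                      # repeat until no new function is added
        changed = False
        for caller, callees in calls.items():
            if caller not in tainted and callees & tainted:
                tainted.add(caller)     # calls something that has a side-effect
                changed = True
    return tainted

# Hypothetical call graph.
calls = {
    "end_call": {"start_call"},
    "start_call": {"set_tone"},
    "set_tone": set(),
    "compute_delay": {"clamp"},
    "clamp": set(),
}
direct = {"set_tone"}                   # assumed to modify a global variable

result = functions_with_side_effects(direct, calls)
print(sorted(result))
```

Here end_call is flagged even though it touches no global itself, because it reaches set_tone through start_call; compute_delay and clamp stay clean.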


Overview

We now know:

How DL can help software engineering in the maintenance of huge systems.

The definition of an SIS.

How LaSSIE works.

How CODEBASE solved LaSSIE's problems.

The features of CBMS compared to LaSSIE.


Any questions?

Thanks for your attention.