COSC 6431 9/8/06
Software Re-Engineering 1
Software Re-Engineering
COSC 6431
http://www.cse.yorku.ca/course/6431
V. Tzerpos
bil@cse.yorku.ca
Sep-8-06 COSC6431 2
Legacy Systems
• Older software systems that remain vital to an organization
• Software systems that are developed specially for an organization have a long lifetime.
• Many software systems that are still in use were developed many years ago using technologies that are now obsolete.
Legacy Systems
• Legacy systems are still essential for the normal functioning of the business.
• Many changes have been incorporated in the system by many different people throughout the years.
• It is not unusual that no one has a complete understanding of the system.
Legacy System Replacement
• There is a business risk in scrapping a legacy
system and replacing it with a modern system:
– Legacy systems rarely have a complete
specification.
– Business processes rely on the legacy system.
– The system may embed business rules that are not
formally documented elsewhere.
– New software development is risky and may not be
successful.
Lehman’s Second Law
• “The entropy of a software system
increases with time unless specific work is
executed to maintain or reduce it”
• Lehman’s Law of Continuing Growth:
“The functional capability of most
software systems must be continually
increased to maintain user satisfaction over
the system lifetime”
Legacy System Change
• Systems must change in order to remain useful.
• Changing legacy systems is often expensive:
– Different parts of the system are implemented by different teams.
– The system may use an obsolete programming language.
– The system documentation is often out-of-date.
– The system structure may be corrupted by many years of maintenance.
– Techniques to save space or increase speed at the expense of understandability may have been used.
The Legacy Dilemma
• It is expensive and risky to replace the
legacy system.
• It is expensive to maintain the legacy
system.
• Businesses may choose to extend the
system lifetime using techniques such as
reverse engineering.
Example of a Legacy Application System
[Figure: files File 1 to File 6 connected to programs Program 1 to Program 7]
After Re-engineering … Database-centred System
[Figure: Program 1 to Program 4 connected to a database management system, which is described by logical and physical data models]
Legacy System Design
• Most legacy systems were designed before
object-oriented development was used.
• Rather than being organized as a set of
interacting objects, these systems have
been designed using a function-oriented
design strategy.
Legacy System Assessment
• Organizations that rely on legacy systems must choose a strategy for evolving these systems:
– Replace the old system with a new one.
– Continue maintaining the system.
– Transform the system by re-engineering to improve its maintainability.
• The strategy chosen should depend on the system quality and its business value.
System quality and business value
[Figure: ten systems, numbered 1 to 10, plotted on axes of system quality and business value. Quadrants: high business value, low quality; high business value, high quality; low business value, low quality; low business value, high quality]
Legacy System Categories
• Low quality, low business value
– These systems should be scrapped.
• Low quality, high business value
– Should be re-engineered or replaced.
• High quality, low business value
– Replace, scrap, or maintain.
• High quality, high business value
– Continue in operation using normal system maintenance.
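The four categories above can be sketched as a small lookup. This is only an illustration: the function name `recommend_strategy`, the 0-to-1 scoring scale, and the 0.5 threshold are assumptions, not part of the lecture.

```python
def recommend_strategy(quality: float, business_value: float,
                       threshold: float = 0.5) -> str:
    """Map a (quality, business value) assessment to a strategy.

    Scores are assumed to lie in [0, 1]; the threshold separating
    "low" from "high" is a hypothetical choice.
    """
    high_quality = quality >= threshold
    high_value = business_value >= threshold
    if high_quality and high_value:
        return "continue normal maintenance"
    if high_value:
        return "re-engineer or replace"
    if high_quality:
        return "replace, scrap, or maintain"
    return "scrap"


if __name__ == "__main__":
    print(recommend_strategy(quality=0.2, business_value=0.9))
```

In practice the assessment itself (how quality and business value are scored) is the hard part; the mapping only records the decision rule.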
Cost/Benefit factors for RE
• Current annual maintenance cost
• Current annual operation cost
• Current annual business value
• Predicted annual maintenance cost after RE
• Predicted annual operation cost after RE
• Predicted annual business value after RE
• Estimated re-engineering costs
• Estimated re-engineering time
• Expected life of the system
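One way these factors might be combined is sketched below. The slide lists the factors but gives no formula, so the formula here is an assumption: it compares the yearly net value before and after re-engineering over the remaining life, ignores discounting, and treats the system as unchanged while re-engineering is under way. All names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ReEngineeringCase:
    current_maintenance: float    # per year
    current_operation: float      # per year
    current_value: float          # per year
    predicted_maintenance: float  # per year, after RE
    predicted_operation: float    # per year, after RE
    predicted_value: float        # per year, after RE
    re_cost: float                # one-off re-engineering cost
    re_time_years: float          # time needed to re-engineer
    expected_life_years: float    # remaining life of the system


def net_benefit(case: ReEngineeringCase) -> float:
    """Estimated gain from re-engineering versus leaving the system as it is."""
    current_net = (case.current_value - case.current_maintenance
                   - case.current_operation)
    predicted_net = (case.predicted_value - case.predicted_maintenance
                     - case.predicted_operation)
    # The improvement only applies once re-engineering is finished.
    years_of_benefit = case.expected_life_years - case.re_time_years
    return (predicted_net - current_net) * years_of_benefit - case.re_cost


if __name__ == "__main__":
    case = ReEngineeringCase(200, 100, 500, 100, 80, 550, 400, 1, 6)
    print(net_benefit(case))
```

A positive result suggests re-engineering pays off under these assumptions; a negative one suggests continued maintenance or replacement deserves a closer look.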
Software Maintenance
• Managing the processes of system
change
Maintenance is Inevitable
• The system requirements are likely to
change while the system is being
developed because the environment is
changing.
• When a system is installed in an
environment it changes that environment
and therefore changes the system
requirements.
Types of Maintenance
• Perfective maintenance
– Adding or modifying the system’s functionality to
meet new requirements.
• Adaptive maintenance
– Changing a system to adapt it to new hardware or
operating system.
• Corrective maintenance
– Changing a system to fix coding, design, or
requirements errors.
Distribution of Maintenance Effort
• Perfective maintenance: 65%
• Corrective maintenance: 17%
• Adaptive maintenance: 18%
Evolving Systems
• It is usually more expensive to add functionality
after a system has been developed than to
design it into the system:
– Maintenance staff are often inexperienced and unfamiliar
with the application domain.
– Programs may be poorly structured and hard to understand.
– Changes may introduce new faults as the complexity of the
system makes impact assessment difficult.
– The structure may be degraded due to continual change.
– There may be no documentation available to describe the
program.
The Maintenance Process
• Maintenance is triggered by change requests
from customers or marketing requirements.
• Changes are normally batched and implemented
in a new release of the system.
• Programs sometimes need to be repaired without
a complete process iteration but this is dangerous
as it leads to documentation and programs getting
out of step.
System Documentation
• Requirements document
• System architecture description
• Program design documentation
• Source code listings
• Test plans and validation reports
• System maintenance guide
Maintenance Costs
• Usually greater than development costs (2 to
100 times, depending on the application).
• Affected by both technical and non-technical
factors.
• Maintenance corrupts the software structure and
so makes further maintenance more difficult.
• Aging software can have high support costs
(e.g., old languages, compilers, etc.).
Maintenance Cost Factors
• Module independence
– It should be possible to change one module without
affecting others.
• Programming language
– High-level language programs are easier to maintain.
• Programming style
– Well-structured programs are easier to maintain.
• Program validation and testing
– Well-validated programs tend to require fewer changes due
to corrective maintenance.
Maintenance Cost Factors
• Documentation
– Good documentation makes programs easier to understand.
• Configuration management
– Good CM means that links between programs and their
documentation are maintained.
• Application domain
– Maintenance is easier in mature and well-understood
application domains.
• Staff stability
– Maintenance costs are reduced if the same staff stay involved
with a system for some time.
Maintenance Cost Factors
• Program age
– The older the program, the more expensive it is to
maintain (usually).
• External environment
– If a program is dependent on its external environment,
it may have to be changed to reflect environmental changes.
• Hardware stability
– Programs designed for stable hardware will not
need to change as the hardware changes.
Maintenance Measurements
• Control complexity:
– Can be measured by examining the conditional statements in the program.
• Data complexity:
– Complexity of data structures and component interfaces.
• Length of identifier names:
– Longer names imply readability.
• Program comments:
– Perhaps more comments mean easier maintenance.
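Three of these measurements can be approximated for Python source with the standard `ast` and `tokenize` modules, as in the sketch below. The choice of which syntax nodes count as "conditional statements" and which names count as identifiers is an assumption made for illustration; the function names are hypothetical.

```python
import ast
import io
import tokenize


def control_complexity(source: str) -> int:
    """Count branching constructs (if/elif, while, for, conditional expressions)."""
    branching = (ast.If, ast.While, ast.For, ast.IfExp)
    return sum(isinstance(node, branching) for node in ast.walk(ast.parse(source)))


def mean_identifier_length(source: str) -> float:
    """Average length of the distinct variable, function, class, and parameter names."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name):
            names.add(node.id)
        elif isinstance(node, (ast.FunctionDef, ast.ClassDef)):
            names.add(node.name)
        elif isinstance(node, ast.arg):
            names.add(node.arg)
    return sum(map(len, names)) / len(names) if names else 0.0


def comment_count(source: str) -> int:
    """Count comment tokens in the source."""
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    return sum(tok.type == tokenize.COMMENT for tok in tokens)


SAMPLE = '''
def pay(hours, rate):
    # overtime is paid at 1.5x
    if hours > 40:
        return 40 * rate + (hours - 40) * rate * 1.5
    elif hours < 0:
        raise ValueError("negative hours")
    return hours * rate
'''

if __name__ == "__main__":
    print(control_complexity(SAMPLE), mean_identifier_length(SAMPLE), comment_count(SAMPLE))
```

These counts are crude proxies: as the slide's "perhaps" suggests, more comments or longer names do not guarantee easier maintenance.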
Maintenance Measurements
• Coupling:
– How much use is made of other components or data
structures.
• Degree of user interaction:
– The more user I/O, the more likely the component is
to require change.
• Speed and space requirements:
– Require tricky programming, harder to maintain.
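Coupling is often summarized as fan-out (how many other components a component uses) and fan-in (how many components use it). The sketch below assumes the "uses" relation has already been extracted into a dictionary; the component names and the function `coupling` are hypothetical.

```python
def coupling(uses: dict[str, set[str]]) -> tuple[dict[str, int], dict[str, int]]:
    """Return (fan_out, fan_in) for each component.

    `uses` maps a component to the set of components it depends on.
    Self-references are ignored.
    """
    fan_out = {name: len(deps - {name}) for name, deps in uses.items()}
    fan_in = {name: 0 for name in uses}
    for name, deps in uses.items():
        for dep in deps - {name}:
            fan_in[dep] = fan_in.get(dep, 0) + 1
    return fan_out, fan_in


if __name__ == "__main__":
    uses = {"billing": {"db", "report"}, "report": {"db"}, "db": set()}
    print(coupling(uses))
```

A component with high fan-in is risky to change because many others depend on it; one with high fan-out is likely to be affected by changes elsewhere.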
Process Measurements
• Number of requests for corrective
maintenance.
• Average time taken to implement a change
request.
• Number of outstanding change requests.
• If any or all of these is increasing, this may
indicate a decline in maintainability.
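A minimal sketch of such a check follows, assuming each measurement is recorded once per release. Treating "increasing" as a strict rise over the last three observations is an assumption chosen for illustration; the metric names are hypothetical.

```python
def is_increasing(series: list[float], window: int = 3) -> bool:
    """True if the last `window` observations rise strictly."""
    recent = series[-window:]
    return len(recent) == window and all(a < b for a, b in zip(recent, recent[1:]))


def maintainability_warnings(metrics: dict[str, list[float]],
                             window: int = 3) -> list[str]:
    """Names of the process measurements that are currently trending upward."""
    return [name for name, series in metrics.items() if is_increasing(series, window)]


if __name__ == "__main__":
    metrics = {
        "corrective_requests": [4, 5, 7, 9],
        "avg_days_per_change": [3.0, 3.1, 2.9, 3.0],
        "outstanding_requests": [10, 12, 15, 21],
    }
    print(maintainability_warnings(metrics))
```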
Software Re-engineering
• Reorganizing and modifying existing
software systems to make them more
maintainable.
Forward engineering and re-engineering
• Forward engineering: System specification → Design and implementation → New system
• Software re-engineering: Existing software system → Understanding and transformation → Re-engineered system
When to Re-engineer
• When system changes are mostly confined
to part of the system then re-engineer that
part.
• When hardware or software support
becomes obsolete.
• When tools to support re-structuring are
available.
Re-engineering Advantages
• Reduced risk
– There is a high risk in new software
development.
• Reduced cost
– The cost of re-engineering is often
significantly less than the costs of developing
new software.
Re-engineering Cost Factors
• The quality of the software to be re-engineered.
• The tool support available for re-engineering.
• The extent of the data conversion which is required.
• The availability of expert staff for re-engineering.
The re-engineering process
[Figure: the original program passes through source code translation, reverse engineering (producing program documentation), program structure improvement (producing a structured program), and program modularisation (producing a modularised program); the original data passes through data reengineering, producing reengineered data]
Source Code Translation
• Involves converting the code from one language (or language version) to another, e.g., FORTRAN to C
• May be necessary because of:
– Hardware platform update
– Staff skill shortages
– Organizational policy changes
• Only realistic if an automatic translator is available.
Reverse Engineering
• Reverse Engineering is the process of
determining how a system works by analyzing its
internal constituents and/or its external behavior.
• In the software world one would say that reverse
engineering is trying to figure out how a system
works by:
– Inspecting the source code and documentation (if it exists)
– Exercising the executable programs and observing their behavior.
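The two approaches, inspecting the source and exercising the program, can be illustrated on a toy Python program. The sketch below is an assumption-laden example rather than a method from the lecture: `static_calls` reads only the source text using `ast`, while `observed_calls` runs the code and records which Python functions are entered using `sys.setprofile`. The sample program and both function names are hypothetical.

```python
import ast
import sys

SOURCE = '''
def parse(record):
    return record.split(",")

def total(records):
    return sum(int(parse(r)[1]) for r in records)
'''


def static_calls(source: str) -> dict[str, set[str]]:
    """Inspect the source: map each function to the plain names it calls."""
    calls = {}
    for func in ast.walk(ast.parse(source)):
        if isinstance(func, ast.FunctionDef):
            calls[func.name] = {
                node.func.id
                for node in ast.walk(func)
                if isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
            }
    return calls


def observed_calls(func, *args) -> list[str]:
    """Exercise the program: record the names of Python functions entered."""
    seen = []

    def profiler(frame, event, arg):
        if event == "call":
            seen.append(frame.f_code.co_name)

    sys.setprofile(profiler)
    try:
        func(*args)
    finally:
        sys.setprofile(None)
    return seen


if __name__ == "__main__":
    namespace = {}
    exec(SOURCE, namespace)
    print(static_calls(SOURCE))
    print(set(observed_calls(namespace["total"], ["a,1", "b,2"])))
```

Static inspection sees every call that could happen; the trace only sees what a particular run actually exercised, which is why the two are usually combined.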
The Reverse Engineering Process
[Figure: System to be re-engineered → Automated analysis and Manual annotation → System information store → Document generation → Program structure diagrams, Data structure diagrams, Traceability matrices]
Reverse Engineering
• Reverse engineering often precedes re-engineering but is sometimes worthwhile in its own right
– The design and specification of a system may be reverse engineered so that they can be an input to the requirements specification process for the system’s replacement.
– The design and specification may be reverse engineered to support program maintenance.
Why is Reverse Engineering
Important/Necessary?
• Most software that is developed is not
“from scratch”.
• Understanding someone else’s source
code, specifications, designs, is difficult.
– Why is this so?
– What makes software more difficult to
understand than a toaster or a car?
Software Maintenance Problem
• A company hires a bright software developer to
maintain a system.
• The project manager points the developer to a
source code directory and says “become an
expert in the system as soon as possible”.
• The IBM TOBEY back-end compiler project
allowed for a 1 year learning curve (but this is
quite rare).
Program Structure Improvement
• Maintenance tends to corrupt the structure of a program. It becomes harder to understand.
• The program may be automatically restructured to remove unconditional branches.
• Conditions may be simplified to make them more readable.
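As an illustration of condition simplification, the sketch below shows a hypothetical routine before and after restructuring, plus an exhaustive check that the two versions agree on every combination of inputs. Python has no unconditional branch, so only the condition-simplification step is shown; all names are invented for the example.

```python
from itertools import product


def eligible_before(is_member, has_debt, is_suspended):
    # Typical result of years of patches: nested tests and double negation.
    if not (not is_member or has_debt):
        if not is_suspended:
            return True
        else:
            return False
    else:
        return False


def eligible_after(is_member, has_debt, is_suspended):
    # Same logic after applying De Morgan's law and flattening the nesting.
    return is_member and not has_debt and not is_suspended


def equivalent(f, g, arity: int) -> bool:
    """Check two boolean functions on all 2**arity input combinations."""
    return all(f(*values) == g(*values)
               for values in product([False, True], repeat=arity))


if __name__ == "__main__":
    print(equivalent(eligible_before, eligible_after, 3))
```

Exhaustive checking only works for small boolean conditions; for larger restructurings a regression test suite plays the same role.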
Program Modularization
• The process of re-organising a program so
that related program parts are collected
together in a single module.
• Usually a manual process that is carried
out by program inspection and re-organization.
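Although the slide describes modularization as usually manual, one simple heuristic for proposing candidate modules is to group program parts that touch the same data. The sketch below is an assumption, not the lecture's method: it takes a hypothetical map from function names to the data items they reference and merges functions that share any item.

```python
from collections import defaultdict


def group_by_shared_data(uses: dict[str, set[str]]) -> list[set[str]]:
    """Group functions that (transitively) reference a common data item."""
    parent = {func: func for func in uses}

    def find(x):
        # Follow parent links to the group representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    first_user = {}
    for func, items in uses.items():
        for item in items:
            if item in first_user:
                parent[find(func)] = find(first_user[item])
            else:
                first_user[item] = func

    groups = defaultdict(set)
    for func in uses:
        groups[find(func)].add(func)
    return sorted(groups.values(), key=sorted)


if __name__ == "__main__":
    uses = {
        "read_order": {"order_tbl"},
        "price_order": {"order_tbl", "price_tbl"},
        "print_invoice": {"invoice_buf"},
        "format_invoice": {"invoice_buf"},
    }
    print(group_by_shared_data(uses))
```

The groups are only suggestions for a human to inspect; a single widely shared table would pull everything into one group.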
Recovering Data Abstractions
• Many legacy systems use shared tables and
global data to save memory space.
• This causes problems because changes have a
wide impact in the system.
• Shared global data may be converted to objects:
– Analyse common data areas to identify logical abstractions.
– Create an object for these abstractions.
– Find all data references and replace them with reference to the
data abstraction.
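The three steps can be sketched on a toy example. Everything here is hypothetical: a shared global table of account balances is identified as one logical abstraction, wrapped in a class, and the direct references are replaced by calls on that class.

```python
# Before: a shared global table that any routine may read or write.
ACCOUNT_TABLE = {}


def legacy_deposit(account_id, amount):
    ACCOUNT_TABLE[account_id] = ACCOUNT_TABLE.get(account_id, 0) + amount


# After: the same data behind an object, so changes to its
# representation no longer ripple through the whole program.
class AccountStore:
    def __init__(self):
        self._balances = {}

    def deposit(self, account_id: str, amount: int) -> None:
        if amount <= 0:
            raise ValueError("deposit must be positive")
        self._balances[account_id] = self.balance(account_id) + amount

    def balance(self, account_id: str) -> int:
        return self._balances.get(account_id, 0)


if __name__ == "__main__":
    store = AccountStore()
    store.deposit("A-1", 50)
    print(store.balance("A-1"))
```

Once every reference goes through `AccountStore`, rules such as the positive-deposit check live in one place instead of being repeated (or forgotten) at each use.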
Data Re-engineering
• Involves analyzing and reorganizing the data
structures (and sometimes the data values) in a
program.
• May be part of the process of migrating from a
file-based system to a DBMS-based system or
changing from one DBMS to another.
• Objective is to create a managed data
environment.
Data Problems
• Data naming problems
– Names may be hard to understand. The same
data may have different names in different programs.
• Field length problems
– The same item may be assigned different
lengths in different programs.
• Hard-coded literals
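The naming and field-length problems can be illustrated by a small migration from fixed-width file records into a DBMS. The sketch below uses Python's built-in `sqlite3`; the two record layouts, the field names, and the canonical schema are all invented for the example.

```python
import sqlite3

# Hypothetical layouts: (field name, start, end). The two programs name the
# same items differently and give them different lengths.
PAYROLL_LAYOUT = [("EMPNO", 0, 6), ("NAME", 6, 26)]
BILLING_LAYOUT = [("EMP-ID", 0, 8), ("EMP-NAME", 8, 38)]

# One canonical name per logical data item.
CANONICAL = {"EMPNO": "employee_id", "EMP-ID": "employee_id",
             "NAME": "name", "EMP-NAME": "name"}


def parse_record(line: str, layout) -> dict[str, str]:
    """Slice a fixed-width record and normalise its names and values."""
    record = {CANONICAL[field]: line[start:end].strip()
              for field, start, end in layout}
    # Remove the zero padding so IDs of different widths compare equal.
    record["employee_id"] = str(int(record["employee_id"]))
    return record


def load(conn: sqlite3.Connection, records) -> None:
    """Store the normalised records in a single managed table."""
    conn.execute("CREATE TABLE IF NOT EXISTS employee "
                 "(employee_id TEXT PRIMARY KEY, name TEXT NOT NULL)")
    conn.executemany(
        "INSERT OR REPLACE INTO employee VALUES (:employee_id, :name)", records)


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    payroll_line = "000042" + "Ada Lovelace".ljust(20)
    billing_line = "00000042" + "Ada Lovelace".ljust(30)
    load(conn, [parse_record(payroll_line, PAYROLL_LAYOUT),
                parse_record(billing_line, BILLING_LAYOUT)])
    print(conn.execute("SELECT employee_id, name FROM employee").fetchall())
```

Both source records end up as one row, which is the point of a managed data environment: each item has a single name, length, and home.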
Reverse Engineering Research
• The focus has been primarily on the
development of tools to help software
developers understand software more quickly
and with less effort.
• Not much work has been done on reverse
engineering methods, however.
Sherlock Holmes Analogy
• “We have developed good detective tools
(e.g., magnifying glasses, fingerprint
matchers, etc) but we have little insight on
how to train someone to be a good
detective (e.g., guidelines, processes, etc)”
S. Mancoridis
Progress Has Been Made In …
• Source code analysis
• Program tracing and profiling
• Automatic modularization (software
clustering)
But still a research area in its infancy …
Lecture schedule
• Sep 8: Introduction, administrivia
• Sep 15: Static and dynamic analysis
• Sep 22: Software clustering
• Sep 29: Evaluation of clustering techniques
• Oct 6: Intro to Design Patterns
• Oct 13: Design Pattern Detection
Lecture schedule
• Oct 20: Refactoring
• Oct 27: Program transformation
• Nov 3: Re-Engineering Patterns
• Nov 10: Mining Software Repositories
• Nov 17: Research paper presentations
• Nov 24: Research paper presentations
• Dec 1: Research paper presentations
Grading
• 10% - Participation (paper discussion)
• 20% - Assignment
• 30% - Research paper presentation
• 40% - Project report
Workload
• September: 2 papers a week
• October: Assignment + 2 papers a week
• November: 1 paper to present + project