
The Mathematics of IT Simplification

by

Roger Sessions

ObjectWatch, Inc.

1 April 2011 Version 1.03

Most Recent Version at

www.objectwatch.com/white_papers.htm#Math


Good fences make good neighbors.

- Robert Frost

Contents

Author Bio
Acknowledgements
Executive Summary
Introduction
Complexity
Functional Complexity
Coordination Complexity
Total Complexity
Partitions
Set Notation
Partitions and Complexity
Directed vs. Non-Directed Methodologies
Equivalence Relations
Driving Partitions with Equivalence Relations
Properties of Equivalence Driven Partitions
    Uniqueness of Partition
    Conservation of Structure
Synergy
The SIP Process
Wrap-up
Legal Notices


Author Bio

Roger Sessions is the CTO of ObjectWatch, a company he founded thirteen years ago. He has written

seven books including his most recent, Simple Architectures for Complex Enterprises, and dozens of

articles and white papers. He assists both public and private sector organizations in reducing IT

complexity by blending existing architectural methodologies and SIP (Simple Iterative Partitions). In

addition, Sessions provides architectural reviews and analysis. Sessions holds multiple patents in

software and architectural methodology. He is a Fellow of the International Association of Software

Architects (IASA), past Editor-in-Chief of the IASA Perspectives Journal, and a past Microsoft recognized

MVP in Enterprise Architecture. A frequent keynote speaker, Sessions has presented in countries around

the world on the topics of IT Complexity and Enterprise Architecture. Sessions has a Masters Degree in

Computer Science from the University of Pennsylvania. He lives in Chappell Hill, Texas. His blog is

SimpleArchitectures.blogspot.com and his Twitter ID is @RSessions.

Comments on this white paper can be sent to emailID: roger domain: objectwatch.com

Acknowledgements

This white paper has benefited greatly from the careful reading of its first draft by some excellent

reviewers. These reviewers include:

Christer Berglund, Enterprise Architect, aRway AB

Aleks Buterman, Principal, SenseAgility Group

G. Hussain Chinoy, Partner, Enterprise Architect, Bespoke Systems

Dr. H. Nebi Gursoy

Jeffrey J. Kreutzer, MBA, MPM, PMP

Michael J. Krouze, CTO, Charter Solutions, Inc.

Patrick E. Lujan, Managing Member, Bennu Consulting, LLC

John Polgreen, Ph.D., Chief Architect, Architecting the Enterprise

Balaji Prasad, Technology Partner and Chief Architect, Cognizant Technology Solutions

Mark Reiners

David Renton, III

Anonymous

I appreciate all of their help.

The towel photos on the front cover are by EvelynGiggles, Flickr, and licensed under Creative

Commons Attribution 2.0 Generic.


Executive Summary

Large IT systems frequently fail, at a huge cost to the U.S. and world economy. The reason for these

failures has to do with complexity. The more complex a system is, the more likely it is to fail. It is difficult

to figure out the requirements for complex systems. It is hard to design complex systems. And it is hard

to implement complex systems. At every step in the system life cycle, errors accumulate at an ever

increasing rate.

The solution to this problem is to break large complex systems up into small simple systems that can be

worked on independently. But in breaking up the large complex systems, great care must be taken to

minimize the overall system complexity. There are many ways to break up a system and the vast

majority of these make the problem worse, not better.

If we are going to compare approaches to reducing the complexity, it is useful to have a way of

measuring complexity. The problem of measuring complexity is orthogonal to the problem of reducing

complexity, in the same way that measuring heat is orthogonal to the problem of cooling, but our ability

to measure an attribute gives us more confidence in our ability to reduce that attribute.

To measure complexity, I consider two aspects of a system. The first is the amount of functionality. The

second is the number of other systems with which this system needs to coordinate. In both cases,

complexity increases exponentially. This indicates that an effective “breaking up” of a large system

needs to take into account both the size of the resulting smaller systems (I call this functional

complexity) and the amount of coordination that must be done between them (I call this coordination

complexity.) Mathematically, I describe the process of “breaking up” a system as partitioning that

system.

The problem with trying to simultaneously reduce both functional and coordination complexity is that

these two pull against each other. As you partition a large system into smaller and smaller systems, you

reduce the functional complexity but increase the coordination complexity. As you consolidate smaller

systems to reduce the coordination complexity, you increase the functional complexity. So the challenge

in reducing overall complexity is to find the perfect balance between these two forms of complexity.

What makes this challenge even more difficult is that one must determine how to split up the project

before one knows what functionality is needed in the system or what coordination will be required.

Existing approaches to this problem use decompositional design, in which a large system is split up into

smaller systems, and those smaller systems are then split into yet smaller systems. But decompositional

design is a relatively random process. The outcome is highly dependent on who is running the analysis

and what information they happen to have. It is highly unlikely that decompositional design will find the simplest possible way of partitioning a system.

This paper suggests another way of partitioning a system called synergistic partitioning. Synergistic

partitioning is based on the mathematics of sets, equivalence relations, and partitions. This approach

has several advantages over traditional decompositional design:


- It produces the simplest possible partition for a given system; in other words, it produces the best possible solution.

- It is directed, meaning that it always comes out with the same solution.

- It can determine the optimal partitioning of a system with very little information about the functionality of that system and no information about its coordination dependencies, so it can be used very early in the system life cycle.

To someone who doesn’t understand the mathematics of synergistic partitioning, the process can seem

like magic. How can you figure out how many baskets you will need to sort your eggs if you don’t know

how many eggs you have, which eggs will end up in which baskets, or how the eggs are related to each

other? But synergistic partitioning is not magic. It is mathematics. This paper describes the mathematics

of synergistic partitioning.

The concept of synergistic partitioning is delivered with a methodology known as Simple Iterative

Partitions (SIP.) It is not necessary to understand the mathematics of synergistic partitioning to use SIP.

But if you are one of those who feel most comfortable with a practice when you understand the theory

behind the practice, this paper is for you.

If you are not interested in the theory, then just focus on the results: reduced complexity, reduced costs,

reduced failure rates, better deliverables, higher return on investment. If this seems like magic,

remember what Arthur C. Clarke said: “Any sufficiently advanced technology is indistinguishable from

magic.” The difference between magic and science is understanding.


Introduction

Big IT projects usually fail, and the bigger the project, the more likely the failure. Estimates on the cost of

these failures in the U.S. range from $60B in direct costs1 to over $2 trillion when indirect costs are

included2. By any estimate, the cost is huge.

The typical approach to large IT projects is four phases, as shown in Figure 1. This approach is essentially

a waterfall approach; each phase must be largely completed before the next is begun. While the

waterfall approach is used less and less on smaller IT projects, on large multi-million dollar systems it is

still the norm.

Figure 1. Waterfall Approach to Large IT Projects

Notice that in Figure 1, the Business Case is shown as being much smaller than the remaining boxes. This

reflects the unfortunate reality that too little attention is usually paid to this critical phase of most

projects.

The problem with the waterfall approach when used to deliver large IT systems is that each phase is very

complex. The requirements for a $100 million system can easily be a foot thick. The chances that these

requirements will accurately reflect all of the needs of a large complex system are poor. Then these

most likely flawed requirements will be translated into a very large complex design. The chance that this

design will accurately encompass all of the requirements is also small. Then this flawed design will be

implemented, probably by hundreds of thousands of lines of code. The chance that this complex design

will accurately be implemented is also small. So we have poorly understood business needs specified by flawed requirements, translated into an even more flawed design, delivered by an even more flawed

implementation. On top of this, the process takes so long that by the time the implementation is

delivered, the requirements and technology have probably changed. No wonder the failure rates are so

high!

I advocate another approach known as Simple Iterative Partitions (SIP). The SIP approach to delivering a

large IT system is to first split it up into smaller, much simpler subsystems as shown in Figure 2. These

systems are more amenable to fast, iterative, agile development approaches.

1 CIOInsight, 6/10/2010 http://www.cioinsight.com/c/a/IT-Management/Why-IT-Projects-Fail-762340

2 The IT Complexity Crisis; Danger and Opportunity. White Paper by Roger Sessions, available at

http://objectwatch.com/white_papers.htm#ITComplexity


Figure 2. SIP Approach to Large IT

We are generally good at requirements gathering, design, and implementation of small IT systems. So if

we can find a way to partition the large complex project into small simple systems, we can reduce our

costs and improve our success rates. While SIP is not the only methodology to make use of

partitioning3, it is the only methodology to approach partitioning from a rigorous, mathematical

perspective which yields the many important benefits that I will explore further in this paper.

What makes partitioning difficult is that it must be done very early in the project life cycle, before we

even know the requirements of the project. Why? Because the larger the project, the less likely the

requirements are to be accurate. So if we split the project before requirements gathering, then we can

do a much more accurate job of gathering requirements. But how can you split up a project when

you know nothing about the project?

SIP does this through a pre-design of the project as part of the partitioning phase. In this pre-design, we

gather some rudimentary information about the project, analyze that information in a methodical way,

and make highly accurate mathematical predictions about the best way to carve up the project.

This claim seems like magic. How can you effectively partition something before you know what that

something is? SIP claims to do this based on mathematical models. And that is the purpose of this white

paper: to describe the mathematical foundations and models that allow SIP to do its magic.

Complexity

SIP promises to find the best possible way to split up a project. In order to evaluate this claim, we need

to have a definition for “best.” For SIP, the best way to split up the project is the way that will give the

maximum return on investment (ROI), that is, the maximum value for the minimum cost. The best way

3 TOGAF® 9, for example, uses the term partitioning to describe different views of the enterprise.


to maximize ROI is to reduce complexity. Complexity is defined as the attribute of a system that makes

that system difficult to use, understand, manage, and/or implement.4

Complexity, cost, and value are all intimately connected. Since the cost of implementing a system is

primarily determined by the complexity of that system5, if we can reduce the complexity of a system

through effective partitioning, then we can reduce its cost. And since our ability to accurately specify

requirements is also impacted by complexity, reducing complexity also increases the likelihood that the

business will derive the expected value from the system. Understanding complexity is therefore critical.

We can’t reduce something unless we understand it.

Of course, reducing the complexity of a system does not guarantee that we can deliver it. There may be

technology, political, or business obstacles that complexity reduction does not address. But it is a

necessary part of the solution.

IT complexity at the architecture levels comes in two flavors: functional complexity and coordination

complexity. I’ll look at each of these in the next two sections. There are other types of complexity such

as implementation complexity, which considers the complexity of the code, and communications

complexity, which considers the difficulty in getting parties to communicate. These other types of

complexity are also important but fall outside of the scope of SIP.

Functional Complexity

Functional complexity refers to the complexity that arises as the functionality of a system increases. This

is shown in Figure 3.

4 CUEC (Consortium for Untangling Enterprise Complexity) – Standard Definitions of Common Terms (www.cuec.info/standards/CUEC-Std-CommonTerminology-Latest.pdf)

5 See, for example, The Impact of Size and Volatility on IT Project Performance by Chris Sauer, Andrew Gemino, and Blaize Horner Reich, Communications of the ACM, November 2007, and the 2009 Chaos Report published by the Standish Group.


Figure 3. Increasing Complexity with Increasing Functionality

Consider a system that has a set of functions, F. If you have a problem in the system, how many

possibilities are there for where that problem could reside? The more possibilities, the more complexity.

Let’s say you have a system with one function: F1. If a problem arises, there is only one place to look for

that problem: F1.

Let’s say you have a system with two functions: F1 and F2. If a problem arises, there are three places you

need to look. The problem could be in F1, the problem could be in F2, or the problem could be in both.

Let’s say you have a system with three functions: F1, F2, and F3. Now there are seven places the problem

could reside: F1, F2, F3, F1+F2, F1+F3, F2+F3, F1+F2+F3.

You can see that the complexity of solving the problem is increasing faster than the amount of

functionality in the system. Going from 2 to 3 functions is a 50% increase in functionality, but more than

a 100% increase in complexity.

In general, the relationship between functionality and complexity is an exponential one. That means

that as the functionality in the system increases by a factor of F, the complexity of the system increases by a factor of C, where C > F. What is the relationship between C and F?

One supposition is made by Robert Glass in his book Facts and Fallacies of Software Engineering. He

says that every 25% increase in functionality results in a doubling of complexity. In other words, F=1.25;

C=2.

Glass may or may not be right about these exact numbers, but they seem a reasonable assumption. We

have many examples of how adding more “stuff” into a system exponentially increases the complexity

of that system. In gambling, adding one die to a system of dice increases the complexity of the system


six-fold. In software systems, adding one if statement within the scope of another if statement doubles the number of possible execution paths through that region of code.

Assuming Glass’s numbers are correct, we can compare the complexity of an arbitrary system relative to

a system with a single function in it. We may not know how much complexity is in a system with only a

single function, but whatever that is, we will call it one Standard Complexity Unit (SCU).

We can calculate the complexity, in SCUs, of a system of N functions.

SCU(N) = 10^((log(C)/log(F)) × log(N))

Since log(2) and log(1.25) are both constants, their ratio can be combined into the single constant 3.11. This equation simplifies to

SCU(N) = 10^(3.11 × log(N))

This equation can be further simplified to SCU(N) = N^3.11

Since the constant 3.11 is derived from Glass, I refer to that number as Glass’s Constant. If you don’t

believe Glass had the right numbers, feel free to come up with a new constant based on your own

observations and name it after yourself. The precise value of the constant (3.11) is less important than

where it appears (the exponent.) For the purposes of this paper, I will use Glass’s Constant.

In this white paper I will not be using large values for N, so rather than go through the arithmetic to

determine the SCU for a given value of N, you can look it up in Table 1 which gives the SCU

transformations for the numbers 1-25.

N   SCU(N)      N    SCU(N)      N    SCU(N)      N    SCU(N)      N    SCU(N)
1   1           6    261         11   1,717       16   5,500       21   12,799
2   9           7    422         12   2,250       17   6,639       22   14,789
3   30          8    639         13   2,886       18   7,929       23   16,979
4   74          9    921         14   3,632       19   9,379       24   19,379
5   148         10   1,277       15   4,501       20   10,999      25   21,999

Table 1. SCU Transformations6 for 1-25

Using Table 1, you can calculate the relative complexity of a system of 20 functions compared to a

system of 25 functions. A system of 20 functions has, according to Table 1, 10,999 SCUs. A system of 25

functions has, according to Table 1, 21,999 SCUs. So we calculate relative complexity as follows:

20 functions = 10,999 SCUs
25 functions = 21,999 SCUs
Difference = 11,000 SCUs
Increase = 11,000 / 10,999 × 100 ≈ 100% increase

Notice that the above calculation agrees with Glass's prediction, that a 25% increase in functionality results in a 100% increase in complexity. No surprise here, since it was Glass's constant I used to calculate this table.

6 The numbers in this table are calculated using a more precise Glass’s constant, 3.1062837. The numbers you get

with 3.11 will be somewhat off from these, especially in larger values of N.
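For readers who want to check Table 1 or experiment with a different constant, here is a minimal Python sketch of the SCU transform. It is my own illustration, not part of SIP; the function name scu is hypothetical, and the constant is the more precise value from footnote 6.

```python
GLASS_CONSTANT = 3.1062837   # log(2) / log(1.25), per footnote 6

def scu(n: int) -> float:
    """Standard Complexity Units of a system with n functions (or n dependencies)."""
    return n ** GLASS_CONSTANT

# Reproduce the 20-function vs. 25-function comparison from the text.
c20, c25 = scu(20), scu(25)
print(round(c20), round(c25))        # 10999 21999, matching Table 1
print(round((c25 - c20) / c20, 3))   # ~1.0, i.e. a 100% increase in complexity
```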


Coordination Complexity

The complexity of a system can increase not only because of the amount of functionality it contains, but also because of

how that functionality needs to coordinate with other systems, either internal or external. In a service-

oriented architecture (SOA), this dependency will probably be expressed with messages. But messages

aren’t the only expression of dependencies. Dependencies could be expressed as shared data in a

database, function invocations, or shared transactions, to give a few examples.

Figure 4 shows the increasing complexity of a system with increasing coordination dependencies.

Figure 4. Increasing Complexity with Increasing Coordination Dependencies

If we assume that the amount of complexity added by increasing the dependencies by one is roughly

equal to the amount of complexity added by increasing the number of functions by one, then we can

use similar logic to the previous section to calculate the number of standard complexity units (SCUs) a

system has by virtue of its dependencies:

C(M) = 10^(3.11 × log(M)), where M is the number of system dependencies.

This simplifies to C(M) = M^3.11

This arithmetic is exactly the same as the SCU transformation shown in Table 1. So you can use the same

table to calculate the complexity (in SCUs) of a system with N dependencies. The three systems in Figure

4 thus have SCU values of 0, 9, and 422, which are the SCU transformations of 0, 2, and 7.

Of course, it is unlikely that the amount of complexity one adds with an additional coordination

dependency is exactly the same as the amount one adds with an additional function. Both of these

constants are just educated guesses. As before, the value of the constant (3.11) is less important than

the location of the constant (the exponent.) I won’t be calculating absolute complexity; I will be using

these numbers to compare the complexity in two or more systems. Since we don’t know how much


complexity is in an SCU, knowing that System A has 1000 SCUs tells us little about its absolute

complexity. On the other hand, knowing that System B has twice the complexity of System A tells us a

great deal about the relative complexity of the two systems.

Total Complexity

The complexity of a system has two components: the complexity due to the amount of

functionality and the complexity due to the number of coordination dependencies. Assuming these two

are independent of each other, the total complexity of the system is the addition of the two. So the

complexity of a system with N functions and M coordination dependencies is given by

C(M, N) = M^3.11 + N^3.11

Of course, feel free to use Table 1 rather than go through the arithmetic.

Many people have suggested to me that the addition of the two types of complexity is too simplistic.

They feel that the two should be multiplied instead. They may well be right. However I have chosen to

assume independence (and thus, addition) because it is a more conservative assumption. If we instead

assume the two should be multiplied, then the measurements later in this paper would be even more

dramatic. Multiplication would magnify, not lessen, my conclusions.

One last thing. Systems rarely exist in isolation. In a real IT architecture, we have many systems working

together. So if we want to know the full complexity of the system of systems, we need to sum the

complexity of each of the individual systems. This is straightforward, now that we know how to calculate

the complexity of any one of the systems.

A good example of a “system of systems” is a service-oriented architecture (SOA). In this architecture,

there are a number of services. Each service has complexity due to the number of functions it

implements and complexity due to the number of other services on which it has dependencies. Each of

these can be calculated using the SCU transform (or Table 1) and the complexity of the SOA as a whole is

the sum of the complexity of each of the services.

Partitions

Much of the discussion of simplification will center on partitioning. I'll start by describing the formal

mathematical concept of a partition, since that is how I will be using the term in this white paper.

A set of elements can be divided into subsets. The collection of subsets is a set of subsets. A partition is a

set of subsets such that each of the elements of the original set is now in exactly one of the subsets. It is

incorrect to use the word partition to refer to one of the subsets. The word partition refers to the set of

subsets.

For a non-trivial number of elements, there are a number of possible partitions. For example, Figure 5

shows four ways of partitioning 8 elements. One of the partitions contains 1 subset, two contain 2

subsets, and one contains 8 subsets.


Figure 5. Four Ways of Partitioning Eight Elements

This brings up an interesting question: how many ways are there of partitioning N elements? The answer

to this is known as the Nth Bell Number. I won’t try to explain how to calculate the Nth Bell Number7. I

will only point out that for non-trivial numbers, the Nth Bell Number is very large. Table 2 gives the Nth

Bell Number for integers (N) between 0 and 19. The Bell Number of 10, for example, is more than

100,000 and by the time N reaches 13, the Bell Number exceeds 27 million. This means that if you are

looking at all of the possible ways of partitioning 13 elements, you have more than 27 million options to

consider.

N    Bell Number                N    Bell Number
0    1                          10   115,975
1    1                          11   678,570
2    2                          12   4,213,597
3    5                          13   27,644,437
4    15                         14   190,899,322
5    52                         15   1,382,958,545
6    203                        16   10,480,142,147
7    877                        17   82,864,869,804
8    4,140                      18   682,076,806,159
9    21,147                     19   5,832,742,205,057

Table 2. Bell Numbers for N from 0 to 19

You can see some of the practical problems this causes in IT. Let’s say, for example, we want to

implement a relatively meager 19 functions in an SOA. This means that we need to partition the 19

functions into some number of services. Which is the simplest way to do this? If we are going to

exhaustively check each of the possibilities, we have more than 10^12 possibilities to go through! Finding

a needle in a haystack is trivial by comparison.
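As a sanity check on Table 2, the Bell Numbers can be reproduced with the standard Bell-triangle recurrence. The short sketch below is my own illustration and is not part of SIP.

```python
def bell_numbers(limit: int) -> list[int]:
    """Return Bell Numbers B(0)..B(limit) using the Bell triangle recurrence."""
    row = [1]                      # triangle row for B(0)
    bells = [1]
    for _ in range(limit):
        new_row = [row[-1]]        # each row starts with the last entry of the previous row
        for value in row:
            new_row.append(new_row[-1] + value)
        row = new_row
        bells.append(row[0])       # the first entry of each new row is the next Bell Number
    return bells

print(bell_numbers(19)[10])   # 115975, the number of ways to partition 10 elements
print(bell_numbers(19)[19])   # 5832742205057, as in Table 2
```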

7 For those interested in exploring the Bell Number, see http://en.wikipedia.org/wiki/Bell_number.



Set Notation

Partitioning Theory is part of Set Theory, so I will often be using set notation in the discussions. Just to

be sure we are clear on set notation, let me review how we write sets.

A = {a, b, c} is to be read A is the set of elements a, b, and c. It is worth noting that order in sets is not

relevant, so if

A = {a, b, c} and B = {b, c, a} then A = B.

The elements of a set can themselves be sets, as in

A = {{a, b}, {c}} which should be read

A is a set containing two sets, the first is the set of elements a and b, and the second is the set with only

c in it.

We do not allow duplicates in sets, so this is invalid: A = {a, b, a}.

The same element can be in two sets of a set of sets, so this is valid: A = {{a, b, c}, {a}}.

If C is the entire collection of elements of interest then a set of sets P of those elements is a partition if

every element is in exactly one of the sets of P. So if

C = {a, b, c, d} and A = {{a, b}, {c, d}} and B = {{a}, {b, c, d}} and D = {{a, b}, {b, c}} and E = {{a, b}, {d}}

Then A and B are partitions of C. D is not a partition because b is in two sets of D. E is not a partition

because c is not in any of the sets of E.
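These rules are easy to check mechanically. The sketch below is my own illustration of the example above; the helper name is_partition is hypothetical.

```python
def is_partition(collection: set, candidate: list[set]) -> bool:
    """A set of subsets is a partition of `collection` if every element
    of `collection` appears in exactly one of the subsets."""
    counts = {element: 0 for element in collection}
    for subset in candidate:
        for element in subset:
            if element not in counts:
                return False          # stray element not in the collection
            counts[element] += 1
    return all(count == 1 for count in counts.values())

C = {"a", "b", "c", "d"}
A = [{"a", "b"}, {"c", "d"}]
B = [{"a"}, {"b", "c", "d"}]
D = [{"a", "b"}, {"b", "c"}]          # b appears in two subsets
E = [{"a", "b"}, {"d"}]               # c is missing

print(is_partition(C, A), is_partition(C, B))   # True True
print(is_partition(C, D), is_partition(C, E))   # False False
```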

Partitions and Complexity

Earlier I suggested that the main goals of the SIP preplanning process are reducing cost and increasing

value. Both of these are influenced by complexity. Complexity adds cost to a system. And complexity

makes it difficult to deliver, or even understand, the system’s value requirements. Partitioning is an

important tool for reducing complexity. But it is also a tool that, if not well understood, can do more

harm than good.

Let’s say you are trying to figure out the best way to partition eight business functions, A-H, into

services. And let’s say that you are given the following dependencies:


A/C (I will use "/" to indicate "leaning on," or dependency, so A/C means A is dependent on C), A/F, A/G, G/F, F/B, B/E, B/D, B/H, D/H, E/D

I stated that we want to partition this the “best way.” Remember we defined best as the one with the

least overall complexity. Given these eight functions and their dependencies, you might want to take a

moment to try to determine the best possible partition. Perhaps the best possible partition is {{ABEF},

{CDGH}}. This partition is shown in Figure 6.

Figure 6. Partition {{ABEF}, {CDGH}}

The partition shown above can be simplified by eliminating any dependencies within a subset (service.)

It isn’t that these connections aren’t important; it’s just that they have already been accounted for in

the implementation calculations. Figure 7 shows this same partition with internal connections removed.

Figure 7. Partition {{ABEF}, {CDGH}} With Internal Connections Removed

Now we can calculate the complexity of the proposed partition. The complexity of the left subset is SCU

(4 functions) + SCU (5 dependencies), which equals 74 + 148 = 222 SCUs. The complexity of the right subset is SCU (4


functions) + SCU (1 coordination dependency), which equals 74 + 1 = 75 SCUs. The complexity of the entire system is therefore 222 + 75 = 297 SCUs.

We know that the complexity of {{ABEF}, {CDGH}} is 297 SCUs. But is that the best we can do? Let's

consider another partition: {{ACGF}, {BDEH}}. Figure 8 shows this partition with internal connections

removed.

Figure 8. Partition {{ACGF}, {BDEH}}

You can already see how much simpler this second partition is than the first. But let’s go through the

calculations.

Left: SCU(4 functions) + SCU(1 dependency) = 74 + 1 = 75 SCUs
Right: SCU(4 functions) + SCU(0 dependencies) = 74 + 0 = 74 SCUs
Total = 75 + 74 = 149 SCUs

The difference in complexity between the two partitions is striking. The second is 149 SCUs. The first is

297 SCUs. The difference is 148 SCUs. Thus we have achieved roughly a 50% reduction in complexity just by shifting the partitions a bit. This equates to a comparable reduction in the cost of the project and a comparable increase in

the likelihood we will deliver all of the value the business wants.

Of course, this example is a trivially small system. As systems grow in size, the difference in complexity

between different partition choices becomes much greater. For very large systems (those over $10M)

the choice of partition becomes critical to the success of the project.
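For readers who want to reproduce this arithmetic, here is a small sketch (my own illustration, with hypothetical helper names) that scores any proposed partition of the A-H example directly from the dependency list, using rounded Table 1 values.

```python
GLASS_CONSTANT = 3.1062837

def scu(n: int) -> int:
    """Rounded Standard Complexity Units, matching Table 1."""
    return round(n ** GLASS_CONSTANT)

# "X/Y" in the text means X is dependent on Y.
dependencies = [("A", "C"), ("A", "F"), ("A", "G"), ("G", "F"), ("F", "B"),
                ("B", "E"), ("B", "D"), ("B", "H"), ("D", "H"), ("E", "D")]

def partition_complexity(partition: list[set[str]]) -> int:
    """Sum over each subset: SCU(functions) + SCU(dependencies that leave the subset)."""
    total = 0
    for subset in partition:
        external = sum(1 for source, target in dependencies
                       if source in subset and target not in subset)
        total += scu(len(subset)) + scu(external)
    return total

print(partition_complexity([set("ABEF"), set("CDGH")]))   # 297 SCUs (222 + 75)
print(partition_complexity([set("ACGF"), set("BDEH")]))   # 149 SCUs (75 + 74)
```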

Directed vs. Non-Directed Methodologies

You can see why the ability to choose the best (i.e., simplest) partition is important. There are two

typical approaches to the problem. The first is directed, or deterministic. The second is non-directed, or

non-deterministic.

Directed methodologies are those that assume that there are many possible outcomes but only one that

is best. The directed methodology focuses on leading us directly to that best possible outcome. The

childhood game of “hot/cold” is a directed methodology. Participants are trying to lead you directly to

your goal by telling you when you are hot (getting closer) or cold (getting further away.)


Non-directed methodologies are those that assume you aren’t looking for a specific goal, just some

general category of outcome. The game of poker is a non-directed methodology. It doesn’t matter how

you end up winning all the money, as long as you win it.

From the perspective of IT complexity reduction, the directed approach has advantages and

disadvantages. The advantage of the directed approach is that it will lead you to the simplest solution,

guaranteed, no questions asked. The disadvantage is that you need complete information to find that

solution. That is, you must know every function and every coordination dependency. However this

degree of information doesn’t materialize until the very end of the design cycle or even later, long after

the partitioning needs to be done. So directed approaches, at least the common ones in use today, are

not helpful in the particular problem space we are considering, that is, IT complexity reduction.

The non-directed methodologies also have advantages and disadvantages. The non-directed

methodologies used in IT design are all some flavor of decompositional design. Decompositional design

works by starting with the larger system and breaking it down into smaller and smaller pieces. The

advantage of decompositional design is that you can use the methodology early on in the design cycle,

at a point when your knowledge is very incomplete. The disadvantage of decompositional design is that

it is more than non-directed, it is close to random. Any one of the many possible partitions could be the outcome

from a decompositional design and there are a huge number of partitions (The Bell Number, to be

precise.) Experienced designers may be expected to eliminate the worst of these, but the chances that

decompositional design, even in the hands of highly experienced personnel, will end up with the best

possible partition are, for non-trivial systems, extremely low. If we are using decompositional design to

find the needle in the haystack, the chances are high that all we will find is hay.

So we have a big problem in IT preplanning. We need to do our partitioning early on in the design cycle.

The only design methodologies that can be used early in the design cycle are non-directed

methodologies, particularly decompositional design. But decompositional design methodologies are

good at finding hay, very poor at finding needles. So they offer little hope of solving our problem,

namely, finding the least complex possible partition. What do we do?

The solution to this dilemma is to find another methodology, one that combines the benefits of both

directed and non-directed methodologies. Such a methodology could be used very early in the design

cycle (like non-directed methodologies) but still promise to lead us to the best possible partition (like the

directed methodologies.) But before I can introduce such a methodology, I need to give you some more

background.

Equivalence Relations

Equivalence relations are a mathematical concept that will be critical to the needle in the haystack

problem. In this section, I will describe equivalence relations and their relevance to partitions.

An equivalence relation E is any function that has these characteristics:


- It returns TRUE or FALSE (i.e., it is Boolean.)

- It takes two arguments, both of which are elements of some collection (i.e., it is binary.)

- E(a,a) is always TRUE (i.e., it is reflexive.)

- If E(a,b) is TRUE, then E(b,a) is TRUE (i.e., it is symmetric.)

- If E(a,b) is TRUE and E(b,c) is TRUE, then E(a,c) is TRUE (i.e., it is transitive.)

For example, consider the collection of people in this room and the function BirthYear that takes as

arguments two people in the room and returns TRUE if they are born in the same year and FALSE

otherwise. This is an equivalence relation, because:

- It is Boolean (it returns TRUE or FALSE.)

- It is binary (it compares two people in this room.)

- It is reflexive (everybody is born in the same year as themselves.)

- It is symmetric (if I am born in the same year as you, then you are born in the same year as me.)

- It is transitive (If I am born in the same year as you and you are born in the same year as Sally,

then I am born in the same year as Sally.)

Driving Partitions with Equivalence Relations

Equivalence relations are important because they are highly efficient at driving partitions. The general

algorithm for using an equivalence relation E to drive a partition P is as follows:

Initialization:

1. Say C is the collection of elements that are to be partitioned.

2. Say P is the set of sets that will (eventually) be the partition of C.

3. Initialize a set U (unassigned elements) to contain all the elements of C.

4. Initialize P to contain a single set which is the Null set, i.e., P = {{null}}

5. Take a random element out of U and put it in the single set of P, i.e., P = {{e1}}

Iteration:

1. If U is empty, we are done. Otherwise, choose a random element of U, say em, and remove it from U.

2. Choose a set in P that has not yet been checked against em.

3. Choose any element in that set, say en.

4. If E(em, en) is TRUE, then add em to that set and start the next iteration (go to step 1).

5. If E(em, en) is FALSE, then check the next set in P (i.e., go to step 2).

6. If there are no more sets in P left to check, then create a new set, add em to that set, add the new set to P, and start the next iteration (go to step 1).

This sounds confusing, but it becomes clear when you see an example. Let’s say we have five people we

want to partition whose names and birth years are as follows:


- Anne (1990)

- John (1989)

- Tom (1990)

- Jerry (1989)

- Lucy (1970)

Let’s assume that we want to partition them by birth year. Let’s follow the algorithm:

Initialization:

1. C = {Anne, John, Tom, Jerry, Lucy}

2. Initialize U = {Anne, John, Tom, Jerry, Lucy}

3. Initialize P = {{null}}

4. Take a random element out of U, say John, and assign John to the single set in P. So P now looks

like {{John}} and U now looks like {Anne, Tom, Jerry, Lucy}.

Now we iterate:

1. U is not null (it has four elements) so we are not done.

2. Choose a random element of U, say Lucy.

3. Choose any set in P; there is only one, which is {John}.

4. Choose any element in that set; there is only one, which is John.

5. Check whether BirthYear(Lucy, John) is TRUE; if it were, we would add Lucy to the set {John}.

6. BirthYear(Lucy, John) is FALSE, so we check the next set in P. But there are no more sets in P, so we create a new set, add Lucy to it, and add that set to P. P now looks like {{John}, {Lucy}}.

Let’s go through one more iteration:

1. U is not null (it now has three elements) so we are not done.

2. Choose a random element of U, say Jerry.

3. Choose any set in P, say {John}.

4. Choose any element in that set; there is only one, which is John.

5. BirthYear(Jerry, John) is TRUE, so we add Jerry to that set and start

the next iteration. We now have P = {{John, Jerry}, {Lucy}}

As you continue this algorithm, you will eventually end up with the desired partition.
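A direct translation of the algorithm into code may make it easier to follow. The sketch below is my own illustration; partition_by and same_birth_year are hypothetical names, and the five people are those from the example.

```python
import random

def partition_by(elements, equivalent):
    """Drive a partition of `elements` using the equivalence relation `equivalent`."""
    unassigned = list(elements)
    random.shuffle(unassigned)       # the order in which we pick elements does not change the result
    partition = []                   # the set of sets we are building
    while unassigned:
        element = unassigned.pop()   # take a random element out of U
        for subset in partition:
            # Testing against any one member is enough, because E is symmetric and transitive.
            if equivalent(element, subset[0]):
                subset.append(element)
                break
        else:                        # no existing set matched: start a new one
            partition.append([element])
    return partition

people = [("Anne", 1990), ("John", 1989), ("Tom", 1990), ("Jerry", 1989), ("Lucy", 1970)]
same_birth_year = lambda a, b: a[1] == b[1]       # the BirthYear equivalence relation

print(partition_by(people, same_birth_year))
# e.g. [[('Lucy', 1970)], [('Jerry', 1989), ('John', 1989)], [('Tom', 1990), ('Anne', 1990)]]
```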

Properties of Equivalence Driven Partitions

When we use an equivalence relation to drive a partition as described in the last section, that partition

has two properties that will be important when we apply these ideas to IT preplanning. These are:

- Uniqueness of Partition

- Conservation of Structure


Let’s look at each of these in turn.

Uniqueness of Partition

Uniqueness of partition means that for a given collection of elements and a given equivalence relation

there is only one possible outcome from the partitioning algorithm. This means that there is only one

possible partition that can be generated. Let’s go back to the problem of partitioning the room of people

using the BirthYear equivalence relation. As long as the collection of people we are partitioning doesn’t

change and the nature of the equivalence relation doesn’t change, there is only one possible outcome. It

doesn’t matter how many times we run the algorithm, we will always get the same result. And it doesn’t

matter who runs the algorithm or in what order we choose elements from the set.

This property is important in solving our needle in the haystack problem. We are trying to determine

which partition is the optimal, or simplest, partition out of the almost infinite set of possible partitions.

Going through all possible partitions one by one and measuring their complexity is logistically

impossible. But if we can show that the simplest partition is one that would be generated by running a

particular equivalence relation, then we have a practical way of finding our optimal partition.

Conservation of Structure

Conservation of structure means that if we expand the collection of elements, we will expand but we

will never invalidate our partition. So let’s say we have a collection C of elements that we partition using

equivalence relation E. If we now add more elements to C and continue the partition generating

algorithm, we may get more elements added to existing sets in P and/or we may get more sets added to

P with new elements. But we will never invalidate the work we have done to date. That is, none of the

following can ever occur:

- We will never find that we need to remove a set from P.

- We will never find that we need to split a set in P.

- We will never find that we need to move an element in one of the sets of P to another set in P.

Let’s consider an example. Suppose we have 1000 pieces of mail to sort by zip code. Assume that we

don’t know in advance which zip codes or how many of them we will be using for our sort. We can use

the equivalence relation SameZipCode and our partition generating algorithm to sort the mail.

Let’s say that after sorting the first 100 pieces of mail, we have 10 zip codes identified with an average of

10 pieces of mail in each set. As we continue sorting the remaining mail, only two things can happen: we

can find new zip codes and/or we can add new mail to existing zip codes. But we will never invalidate the

work we have done to date. Our work is conserved.

This brings up an interesting point. If you are sorting e elements into s sets of a partition, then how

many of e do you need to run through the partitioning algorithm before you have identified most of the

sets that will eventually make up P? In other words, how far do you need to go in the algorithm

before you have carved out the basic structure of P? The answer to this question is going to depend on

three things:


- The magnitude of s, the number of sets we will eventually end up with.

- The distribution of e into s.

- How much of P we need to find before we can say we have found its basic structure.

To give an example calculation, I set up a test where I assumed that we would be partitioning 1000

business functions randomly among 20 sets. I then checked to see how many business functions I would

need to run through the partitioning algorithm to have found 50%, 75%, 90%, and 100% of the sets. The

results are shown in Table 3.

Run #   50%   75%   90%   100%
1       10    23    45    83
2       14    40    48    68
3       13    22    65    93
4       12    18    28    31
5       16    32    41    111
6       12    22    29    47
7       11    23    35    83
8       14    22    41    51
9       15    27    46    63
10      18    29    45    81

Table 3. Runs to Identify Different Percentages of Partition

As Table 3 shows, in the worst case I had identified 90% of the final sets after analyzing at most 65 of the

1000 business functions. The average number of iterations I had to run to find 90% of the sets was 42.3

with a standard deviation of 10.6. This tells us that 95% of the time we will find 90% of the 20 sets with

no more than 42.3 + (2 × 10.6) ≈ 64 iterations. Putting this all together, by examining 6.4% of the 1000

business functions, we can predict with 95% confidence that we have found at least 90% of the structure

of the partition. From then on, we are just filling in details.
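The experiment behind the table above is easy to reproduce. The sketch below is my own simulation under the stated assumptions (1,000 business functions assigned uniformly at random to 20 capability sets); exact numbers will vary from run to run.

```python
import random

def iterations_to_find(fraction: float, elements: int = 1000, sets: int = 20) -> int:
    """How many elements must be examined before `fraction` of the sets has been seen?"""
    assignment = [random.randrange(sets) for _ in range(elements)]  # random capability per function
    seen = set()
    for iteration, capability in enumerate(assignment, start=1):
        seen.add(capability)
        if len(seen) >= fraction * sets:
            return iteration
    return elements

runs = [iterations_to_find(0.90) for _ in range(10)]
print(runs)                    # ten runs, e.g. values roughly in the 28-65 range
print(sum(runs) / len(runs))   # average, typically in the low 40s
```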

This is all well and good, you may say, if we know in advance that there are 20 sets. What if we have no

idea how many sets we have?

It turns out that you can predict with remarkable accuracy when you have found 90% of the sets even

without any idea how many sets you have. You can do this by plotting how many sets have been

identified against the number of iterations. With one random sample, I got the plot shown in Figure 9.


Figure 9. Number of Sets Found vs. Iterations

As you can see in Figure 9, we rapidly find new sets early in the iteration loop. In the first five iterations

we find five sets. This is to be expected, since if the elements are randomly distributed, we would expect

almost every element of the early iterations to land in a new subset. But as more subsets are discovered

there are fewer left to discover, so for later iterations elements are more likely to land in existing

subsets. By the time we have found 15 of the subsets, there are only 5 left to be discovered. The next

iteration after the 15th subset is discovered has only a 5/20th chance of discovering yet another

undiscovered subset. And the next iteration after the 19th subset has only a 1/20th chance of finding the

next subset. It will take on average 20 iterations to go from the 19th subset to the 20th subset.

So by observing the shape of the curve, we can predict when we are plateauing out on the set discovery

opportunities. The particular curve shown in Figure 9 seems to flatten out at about 17 subsets, which

took us about 35 iterations to reach. So by the time we have completed 35 iterations, we can predict that we have found somewhere around 90% of the existing sets.

Synergy Let’s take a temporary break from partitions and equivalence relations to examine another concept that

is important in complexity reduction, that is, synergy. Synergy is a relationship that may or may not exist

between two business functions. When two business functions have synergy, we call them synergistic.

Synergy is of course a word that is used in many different contexts. I define synergy as follows:

Synergy: Two business functions have synergy with each other if and only if from the business perspective neither is useful without the other.


So if we want to test two business functions A and B for synergy, we take them to the business and tell

them we can give them A or B but not both. If they are willing to take either without the other, then the

two functions are not synergistic. If they won’t take either without the other, then they are synergistic.

For example, consider these business functions:

- Deposit

- Withdraw

- Check Overdraft

- Check Balance

- Validate Credit Card

- Validate Debit Card

- Process Credit Charge

- Process Debit Charge

If you take these functions two by two to the business and ask which have mutual requirements, you are

likely to get these answers:

- Deposit and Withdraw are needed as a pair. The business has no interest in a system that can

deposit but not withdraw or can withdraw but not deposit. So these two are synergistic.

- Withdraw and Check Balance are both needed. The business won’t accept a system that can

withdraw but not show you the new balance and vice versa, so these two are synergistic.

- Process Credit Charge and Withdraw are not needed as a pair. Process Credit Charge is used to

process a credit card charge, and the business can see use for this regardless of whether that

same system can also handle account withdrawals.

Now let’s say that we have three business functions, A, B, and C. Here are some observations we can

make about synergy.

1. It is Boolean, meaning that the statement “A and B are synergistic” is either TRUE or FALSE.

2. It is binary, meaning that the statement takes two arguments.

3. It is reflexive, meaning that any business function is synergistic with itself. You can't

meaningfully ask the question, “Can I give you Deposit but not Deposit?”

4. It is symmetric, meaning that if A is synergistic with B, then B must be synergistic with A. So if you

ask the question “Can I give you only one of Deposit and Withdraw?” you will get the same

answer as “Can I give you only one of Withdraw and Deposit?”

5. It is transitive, meaning that if A and B are synergistic and B and C are synergistic, then A and C

must be synergistic. So once we know that Deposit and Withdraw are synergistic and we also

know that Withdraw and Check Balance are synergistic, then we can predict that Deposit and

Check Balance are also synergistic.

So we have a function that is Boolean, binary, reflexive, transitive, and symmetric. What is such a

function called? As I discussed earlier, such a function is called an equivalence relation. So synergy is not


just a function, it is an equivalence relation. And now that we know that synergy is an equivalence

relation, we know a great deal about it. In particular, we know these important facts:

- It can be used to drive a partition.

- That partition will be unique.

- That partition will have conservation of structure.

Now of course the business side doesn’t think of business functions as elements in a partition. It thinks

of business functions as steps in a particular business process. It thinks of Deposit and Withdrawal, for

example, as part of the process of Managing An Account. Managing An Account is an example of a

cluster of functions that is often referred to as a capability. From the business perspective, a capability is

a unified business responsibility. From the mathematical perspective, a capability is one of the sets in

the synergy driven partition.

An element in the partition then corresponds to a discrete step in the business process. The synergistic

analysis ensures that elements of a given set are all working on a unified business responsibility, that is,

the responsibility of their respective capability. I will start using the term capability set to refer to one of

the synergy driven sets in the partition.

When we ask the business to analyze these eight functions, they are not going to take long to figure out

which are synergistically related (or, as they might put it, living in the same capability.) They are likely to

conclude that the following groups are synergistic:

Manage Cards
- Validate Credit Card
- Validate Debit Card
- Process Credit Charge
- Process Debit Charge

Manage Account
- Deposit
- Check Overdraft
- Withdraw
- Check Balance
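Because synergy is an equivalence relation, the same partition-driving procedure sketched earlier produces these capability sets mechanically. In the illustration below (my own, not part of SIP tooling), a lookup table of hypothetical business answers stands in for the synergy interviews.

```python
# Hypothetical record of the business's answers to the synergy questions.
capability_of = {
    "Deposit": "Manage Account", "Withdraw": "Manage Account",
    "Check Overdraft": "Manage Account", "Check Balance": "Manage Account",
    "Validate Credit Card": "Manage Cards", "Validate Debit Card": "Manage Cards",
    "Process Credit Charge": "Manage Cards", "Process Debit Charge": "Manage Cards",
}

def synergistic(a: str, b: str) -> bool:
    """Stand-in for asking the business whether a and b are only useful together."""
    return capability_of[a] == capability_of[b]

# Same driving loop as in the earlier sketch: put each function in the first
# capability set it is synergistic with, or start a new set.
capability_sets: list[list[str]] = []
for function in capability_of:
    for capability_set in capability_sets:
        if synergistic(function, capability_set[0]):
            capability_set.append(function)
            break
    else:
        capability_sets.append([function])

for capability_set in capability_sets:
    print(capability_set)
# ['Deposit', 'Withdraw', 'Check Overdraft', 'Check Balance']
# ['Validate Credit Card', 'Validate Debit Card', 'Process Credit Charge', 'Process Debit Charge']
```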

Now let’s watch what happens when this project is turned over to IT for implementation. Four different

architects (John, Anne, Jim, Sally) examine the requirements. They each come back with their

recommendation.

John doesn’t like SOAs. He thinks the messaging is too complex. He recommends that all eight of the

functions be packaged together. If anything goes wrong, debugging is much easier, he says.


Anne really likes SOAs. She thinks everything should be a service. She recommends that each function

be a separate service. This will give the system the maximum flexibility, she says.

Jim believes in reuse. He thinks functions that are going to have similar implementations should be

packaged together. This will give the maximum possible opportunity to reuse code, he says.

Sally believes that the implementation packages should follow the business capabilities as defined by

the synergistic relationships. This will give the maximum possible alignment between the business and

IT, she says.

Who is right? Notice that all of these arguments are entirely reasonable. Let’s approach the problem

from a complexity perspective.

Since each function implements one step in a unified business process, it stands to reason that there will

be many mutual dependencies between implementations of members of a capability set. Consider the

functions Deposit, Withdraw, Check Balance and Check Overdraft. If we change the format of money

that Deposit stores in a database, we will need to change Withdraw to use the same storage format. If

we change Deposit’s understanding of an account ID, that understanding will need to be reflected in

Check Balance.

We will be facing similar issues with Validate Credit Card, Process Credit Charge, Validate Debit Card,

and Process Debit Charge. These are all functions related to processing credit and debit cards. Changes

to Process Credit Charge are likely to force changes in Validate Debit Card and vice versa.

On the other hand, changes in Validate Credit Card are not likely to force changes in Check Balance and

changes in Check Balance are not likely to force changes in Validate Credit Card. Graphically, we can map

these dependencies as shown in Figure 10.

Figure 10. Dependencies in Capability Sets

Now let’s look at each of the four architectural proposals. Figures 11-14 show the four different

packaging possibilities with dependencies internal to the package not shown.

John’s consolidated packaging is shown in Figure 11.


Figure 11. Monolithic Package

Anne’s extreme SOA packaging is shown in Figure 12.

Figure 12. Extreme SOA Package

Jim’s reuse intensive packaging is shown in Figure 13.


Figure 13. Reuse Driven Package

Sally’s synergy driven package is shown in Figure 14.

Figure 14. Synergy Driven Package

Now let’s do the complexity calculations on the four packages. Remember that the complexity of a

system of systems is equal to the sum of the complexities of the individual systems, and the complexity of an individual system is the sum of its functional complexity and its coordination complexity. Using Table 1, we get

these values in standard complexity units (SCUs) for the four packages:

Monolithic Package

System   Functional Complexity      Coordination Complexity
A        8 Functions, 639 SCUs      0 Dependencies, 0 SCUs

Total Complexity: 639 SCUs


Extreme SOA Package

System   Functional Complexity      Coordination Complexity
A        1 Function, 1 SCU          3 Dependencies, 30 SCUs
B        1 Function, 1 SCU          3 Dependencies, 30 SCUs
C        1 Function, 1 SCU          3 Dependencies, 30 SCUs
D        1 Function, 1 SCU          3 Dependencies, 30 SCUs
E        1 Function, 1 SCU          3 Dependencies, 30 SCUs
F        1 Function, 1 SCU          3 Dependencies, 30 SCUs
G        1 Function, 1 SCU          3 Dependencies, 30 SCUs
H        1 Function, 1 SCU          3 Dependencies, 30 SCUs

Total Complexity: 248 SCUs

Reuse Package

System   Functional Complexity      Coordination Complexity
A        3 Functions, 30 SCUs       7 Dependencies, 422 SCUs
B        1 Function, 1 SCU          3 Dependencies, 30 SCUs
C        3 Functions, 30 SCUs       7 Dependencies, 422 SCUs
D        1 Function, 1 SCU          3 Dependencies, 30 SCUs

Total Complexity: 966 SCUs

Synergy Package

System   Functional Complexity      Coordination Complexity
A        4 Functions, 74 SCUs       0 Dependencies, 0 SCUs
B        4 Functions, 74 SCUs       0 Dependencies, 0 SCUs

Total Complexity: 148 SCUs

The implementation package that follows the synergy relationships is the least complex. In fact, it is

substantially less complex. It has only 60% of the complexity of the next simplest solution (the extreme

SOA package) and only 15% of the complexity of the most complex solution (the reuse-driven package).
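The arithmetic behind these totals is easy to check. The sketch below (Python, illustrative only) takes the SCU values quoted in the tables above (they come from Table 1) as a simple lookup and reproduces the four totals.

# Illustrative only: recomputing the package totals quoted above. The SCU values
# for a given number of functions or dependencies are taken from the figures in
# the tables (they come from Table 1 earlier in the paper).
SCU = {0: 0, 1: 1, 3: 30, 4: 74, 7: 422, 8: 639}

def system_complexity(functions, dependencies):
    # complexity of one system = functional complexity + coordination complexity
    return SCU[functions] + SCU[dependencies]

packages = {
    "Monolithic":  [(8, 0)],                           # (functions, dependencies) per system
    "Extreme SOA": [(1, 3)] * 8,
    "Reuse":       [(3, 7), (1, 3), (3, 7), (1, 3)],
    "Synergy":     [(4, 0), (4, 0)],
}

for name, systems in packages.items():
    total = sum(system_complexity(f, d) for f, d in systems)
    print(name, total, "SCUs")   # 639, 248, 966, and 148 SCUs respectively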

This result cannot be overemphasized. The least complex implementation package is the one that precisely reflects the synergy-driven capability sets. Not only is it the least complex; it is by far the least complex. That means it will be by far the least costly to implement and by far the most likely to succeed.

At this point, some readers will question the need for all of this math. Why, they will ask, do we need to

go through this complex synergy algorithm? Why don’t we just let the business tell us how it views the capability sets and drive the technical architecture from that?

In fact, in most cases, this is exactly what we do. For most large systems, the majority of functions fall

into obvious sets. It is only when we run into questionable or politically motivated placements that we

need to resort to the formalism of synergy analysis. But we mustn’t underestimate the importance of

this. Even a few misplacements can introduce huge amounts of complexity.


Some other readers will question how mathematical this process really is. After all, they will point out,

the synergy decision is made by human beings and human beings are notoriously fickle. So what have

we gained?

The way we get around the fickleness of human beings is to sharpen the focus of the question as much

as possible. So instead of asking a fuzzy and potentially politically charged question, such as “Which group should own credit card processing?”, we keep our questions tightly focused and politically neutral: “Is Deposit synergistic with Process Credit Charge?” No politics, no vagueness. If the business groups can’t

agree on the answer, we have a very simple question to escalate to the next highest level of expertise.

We must still deal with human beings. But their role is limited to answering pointed questions not much

more complicated than “are these two people born in the same year?”

The SIP Process

Now that we have some mathematical foundations, I can explain how SIP accomplishes its magic. Keep

in mind that it is not my goal in this white paper to explain the SIP process. I have done that elsewhere8.

My goal here is to describe the mathematics of SIP.

Recall that most of the SIP activity occurs in the partitioning phase, or what I often call the preplanning

phase. The main goal of this phase is to identify the basic structure of the project partition. To do this we

generate a representative selection of business functions, say 10% of the expected total. We then use

synergy analysis on that selection to drive the partition. This partition will define the structure and

packaging of the technical solution.

Since synergy is an equivalence relation, the partition that is created is unique and can be validated.

Because the methodology is deterministic, there is only one way it can turn out regardless of who is

driving the process and which random selection of business functions happens to be picked for the

analysis.
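To see why the outcome is order-independent, it helps to notice that building the partition amounts to computing equivalence classes. The sketch below is a minimal illustration (Python; it is not the SIP tooling, and the helper names are mine): given only yes/no answers to pairwise synergy questions, it produces the same capability sets no matter which pairs are asked first.

# Illustrative only: deriving a partition from pairwise synergy answers with a
# union-find structure. Because synergy is an equivalence relation, the capability
# sets that come out are the same regardless of the order in which pairs are asked.
from itertools import combinations

def synergy_partition(functions, synergistic):
    # `synergistic(a, b)` is the yes/no answer to "is a synergistic with b?"
    parent = {f: f for f in functions}

    def find(x):                      # find the representative of x's class
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(functions, 2):
        if synergistic(a, b):         # merge the two classes
            parent[find(a)] = find(b)

    classes = {}
    for f in functions:
        classes.setdefault(find(f), set()).add(f)
    return list(classes.values())

# With the eight functions of the banking example and answers that follow the two
# business areas, the result is the two capability sets of the synergy-driven package.
banking = {"Deposit", "Withdraw", "Check Balance", "Check Overdraft"}
cards = {"Validate Credit Card", "Process Credit Charge",
         "Validate Debit Card", "Process Debit Charge"}
same_area = lambda a, b: (a in banking) == (b in banking)
print(synergy_partition(banking | cards, same_area))   # two sets of four functions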

The partition generated will be the simplest possible, because synergy limits the number of business

functions that will live together. Only those with synergy are allowed to cohabitate. And it also limits the

coordination complexity by grouping together those with the most likely dependencies. The fact that we

can limit the coordination dependencies without knowing what those dependencies are is part of what

appears to be magic.

Remember that all partitions generated with equivalence relations have the property of conservation of

structure. This guarantees that the partition we identified with the small group of business functions will

not change as we discover more business functions. This is important because upwards of 90% of the

business functions and all of their dependencies are still unknown during the partitioning phase. They

won’t be discovered until the design phase or even later.

8 See, for example, my book Simple Architectures for Complex Enterprises, Microsoft Press.


Even after implementation, the conservation of structure gives the system resiliency against future

changes. New functions may get added to the system in the future or existing functions may get re-

implemented, but the overall structure is highly stable.

Once SIP has identified how the project should be partitioned into smaller, simpler subprojects, each of

these subprojects can proceed relatively independently, carried out by smaller, more focused teams that have

specialized expertise. This means that the requirements of the subprojects are more likely to accurately

reflect the needs of the business, the design of the subprojects is more likely to deliver on the

requirements, and the implementation of the functions is more likely to meet the specifications of the

design. These smaller projects can more successfully use agile development methodologies, if desired.

The main role for SIP in the design phase is to ensure that newly discovered business functions are

assigned to the proper subsystem based on synergy. This is critical, because if independent groups start

making their own decisions about where to place newly discovered business functions, the original

partition will quickly degrade. When that happens, all bets are off. Complexity will flood the system.
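A one-function sketch (Python, illustrative only; the helper is hypothetical) shows how mechanical this placement step can be. Because synergy is an equivalence relation, a newly discovered function can be synergistic with at most one existing capability set, so placing it never disturbs the sets that are already there, which is exactly what conservation of structure promises.

# Illustrative only: placing a newly discovered business function by synergy
# rather than by local judgment.
def assign(new_function, capability_sets, synergistic):
    # capability_sets is a list of sets of business functions.
    for s in capability_sets:
        if any(synergistic(new_function, existing) for existing in s):
            s.add(new_function)        # synergy can match at most one existing set
            return s
    new_set = {new_function}           # no synergy anywhere: start a new capability set
    capability_sets.append(new_set)
    return new_set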

It is important that dependencies between subsystems are tracked as they are identified. These will

become part of the specifications to the implementers. These dependencies will also be used to define

an integration harness that will eventually unify the various subprojects.

As the subprojects move from design to implementation, SIP takes on another role. SIP needs to ensure

that the partition carved out at the business level is projected precisely onto the technical architecture.

Mathematically, the relationship between the business architecture and the technical architecture

should be an isomorphic projection. An isomorphism is a mapping between two sets of elements such

that every element in one set maps to one and only one element in the other set and vice versa. There

are two isomorphisms of interest to SIP. First, every boundary at the business level should map to a

boundary at the technical level. Second, every coordination dependency at the business level should

map to a coordination dependency at the technical level. Since the technical level will be driven by the

business level, I call this relationship an isomorphic projection (the business architecture projects onto

the technical architecture).
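Stated as a check rather than a definition, a hypothetical validation might look like the sketch below (Python, illustrative only): the mapping from business subsystems to technical subsystems must be one-to-one, and the business-level dependencies, pushed through that mapping, must coincide with the technical-level dependencies.

# Illustrative only: checking that a business partition projects isomorphically
# onto a technical partition.
def is_isomorphic_projection(mapping, business_deps, technical_deps):
    # mapping: business subsystem -> technical subsystem (must be one-to-one)
    if len(set(mapping.values())) != len(mapping):
        return False
    # every business-level dependency must map to a technical-level dependency,
    # and vice versa
    projected = {(mapping[a], mapping[b]) for a, b in business_deps}
    return projected == set(technical_deps)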

This isomorphic projection is critical if SIP is to be used to simplify IT systems. Since we don’t analyze IT

systems directly for synergy (we can’t, because IT folks are not qualified to make synergy decisions), we

get the IT simplification indirectly through the business synergy analysis and the isomorphic projection.

This isomorphic projection must be guided by those whose vision encompasses both the business and

technical architecture, a group commonly referred to as enterprise architects. The projection must not only be initiated but also diligently maintained throughout the system lifecycle; otherwise the simplicity gained in the design phase will be eroded in the maintenance phase.


The net result of this isomorphic projection is an architectural style that is unique to SIP. The contrast between this style and a traditional IT style (e.g., the Federal Enterprise Architecture Framework9)

is shown in Figure 15.

Figure 15. Comparing SIP IT Architecture to Traditional IT Architecture

I often refer to the SIP architecture as a snowman architecture, because it appears to be a collection of

snowmen holding hands. The snowman formation is a direct reflection of the isomorphic projection of

the business boundaries/dependencies onto the technical boundaries/dependencies. For more on the

snowman architecture, see my recent white paper with Nikos Salingaros10.

The snowman architecture has a number of advantages over a traditional IT architecture, such as:

- It is less complex.

- It is more amenable to agile development methodologies.

- It is easier to adapt as the business processes change.

- It is more efficient for cloud deployment.

The snowman architecture does have one feature that some will consider a disadvantage: it emphasizes

simplicity at the expense of reusability. This is in keeping with the SIP philosophy that simplicity is the most important design goal. It isn’t that we eliminate the possibility of reuse. We still seek

opportunities for code reuse through shared libraries and, to a lesser extent, through shared services. It

is just that all reuse is accomplished within the tight constraints of isomorphic projection. When we are

forced to choose between simplicity and reuse, simplicity wins.

9 http://en.wikipedia.org/wiki/Technical_Reference_Model#Technical_Reference_Model

10 Urban and Enterprise Architectures: A Cross-Disciplinary Look at Complexity by Roger Sessions and Nikos A. Salingaros, available at http://objectwatch.com/white_papers.htm#SessionsSalingaros.


Wrap-up

We are not doing a good job building large complex IT systems. Many people feel that the design of

large complex IT systems is an art, not a science. If it is an art, it is a failing art. We are spending too

much to deliver too little too late. This will not change until we change our approaches to delivering

large complex IT systems.

We have tried many approaches to managing complexity. None have solved the problem. I don’t believe

the problem is solvable. We can’t manage complexity. We must get rid of it. And our best hope for

getting rid of complexity is partitioning. We must partition large complex systems into small simple

systems. And we must do it at the earliest possible phase of the system life cycle, before complexity has

had a chance to take over.

But partitioning alone is not the answer. There are many ways of partitioning systems, many of which

make complexity worse and very few of which are optimal. If we are looking for the optimal partitioning,

we need to understand something about the nature of complexity and the nature of partitioning. And

then we need reproducible, verifiable methodologies for leveraging the mathematics of partitions to

address the problem of complexity.

That is the focus of the methodology known as SIP (Simple Iterative Partitions). In this white paper, I

have discussed the mathematics behind SIP and IT simplification. The mathematical foundations are not

needed to apply the principles of simplification, but they are critical to understanding how simplification

works. And that is the first step in moving IT design from an art to a science. The artistic side of design

may help us appreciate complexity, but only the scientific side of design can help us eliminate it.

Legal Notices

This white paper describes the mathematics of the methodology known as Simple Iterative Partitions

(SIP). SIP is protected by U.S. patent 7,756,735. Organizations interested in using SIP should contact the

author.

Simple Iterative Partitions is a trademark of ObjectWatch, Inc. ObjectWatch is a registered trademark of

ObjectWatch, Inc.

This white paper is copyright (c) 2011 by ObjectWatch, Inc. It may be freely redistributed as long as it is

redistributed in its entirety and not changed in any way.

