Linked Lists

MSJ-2

Linked Lists

Introduction and Motivation
    Linked Lists versus Linear Lists
    Abstract Properties and Operations
    Animation of a Dynamic Linear Structure Implemented as a Linked List

Building (Insertion Into) a Linked List: Simplest Case

Traversal and Traversal-Based Operations

Variations, Embellishments, and Elaborations
    Bi-directional Lists
    Circular Lists
    Headed Lists

Summary

Motivation: Both Real World Engineering and Pedagogical

Linked lists are the simplest of the fundamental computer science objects known as dynamically linked structures.

They are dynamic in that their size can vary during execution as individual items are inserted into or removed from the list, like the list of students in CS315, which can and usually does change after the semester starts as some students add the course and others drop it.

Linked lists are extremely useful in and of themselves – e.g., the CS315 roster – and they provide the basis for even more interesting (both theoretically and practically) objects such as stacks, queues, and binary trees, the most beautiful data structures in the universe

Implementation of even the simplest of linked list operations requires careful thought, selection among multiple design alternatives, and excruciating care with the coding details – all skills that apprentice software engineers need to cultivate

MSJ-3

Linked Lists and Linear Lists

Linked lists are one of two common implementations of a slightly more abstract concept known as a linear list (a one-dimensional array is the other implementation)

Unlike, for example, binary trees, the abstract properties of linear lists by themselves are not to me that fascinating; what makes them worth our study here is their tremendous real world utility and the importance of mastering their linked implementation alternatives that we’ll use throughout CS315 (and that you’ll often use professionally later)

But I promised in my introduction to this course that we would begin our discussion of every data structure this semester with a discussion of its abstract properties so I’ll include them here for logical consistency – but I’ll be brief (for me ;-)

MSJ-4

MSJ-5

Abstract Properties of Linear Lists

Each item in the list except the first has a unique predecessor and each except the last has a unique successor

The uniqueness of the predecessor/successor relationship means that the list can be laid out as illustrated above and is the reason that simple lists are often known as linear lists

A list may or may not be empty, but, in the abstract, there is no maximum size.

Note that an array does have a maximum size, the one you declare, and, as far as C is concerned, its size is constant.

This is a theme we’ll see again and again: an implementation of an abstract data structure may induce properties or limitations that the abstract structure itself does not have.

George Washington John Adams Thomas Jefferson … George Bush Barack Obama

Abstract Operations on Linear Lists

Traversal – visiting (e.g., printing out) each item in the list; comes in two flavors: first-to-last and last-to-first

Search or find – determining whether a given item is present

Deletion – of a given item

Note that insertion can’t be defined without more information. There are many possible places where insertion could occur:

In front of the first item
Behind the last
Somewhere in the middle (where?)

So a definition of “insertion” will require more knowledge about where and why; for linear lists, its behavior is not uniquely defined just from the linearity property

MSJ-6

Abstract Operations on Linear Lists (cont’d)

Any abstract data structure has certain operations and characteristics defined even in the abstract, regardless of implementation

Whether or not a given application needs a given operation on some data structure is an engineering issue, not a theory of data structures one

As an engineer, you need to know what the properties and operations are so that you can pick an appropriate structure for your problem ─ it may have more than what you need, but it better not have less

MSJ-7

Abstract Operations on Linear Lists (cont’d)

For linear lists, the abstract properties and operations may seem pretty obvious and hence this discussion mostly unnecessary, but later this semester we’ll be dealing with data structures whose properties and operations are much less obvious and I want to start out doing things properly, i.e., defining our structures in the abstract before moving to questions of implementation

MSJ-8

MSJ-9


MSJ-10

Example of a Dynamically Linked Data Structure

Suppose we want to write a program that works with a set of numbers whose cardinality varies widely. Perhaps a variable number of temporary data entry clerks input the numbers one at a time until they get tired and tell our program “no more”; we couldn’t know in advance how many clerks we would have available on any given day nor how many items any given clerk would actually enter in any given day.

If our code declared too big an array size, most of the time we'd be wasting most of it.

But if we declared too small an array and filled it up, we couldn't keep going without major problems.

A dynamic data structure whose size can increase one cell at a time whenever we want it to is what we need

A linked list implemented with pointers is an important technique for implementing such a dynamic data structure

MSJ-11

Dynamically Growing a Linked List

[Animation: start → 18 → -2 → 461 → 23 → 314, the list growing by one cell each time a number arrives]

When we get a number, we use malloc to dynamically obtain some storage space for it and link it into the list.

The next time we get a new number, we again get some new space for it and link it in to the end of the growing list.

And so on ...

Note that the list elements are not simple integer cells but structures of some sort, since each must have space for a pointer in it as well as the integer.

Notice that the structures in the list have no names; they can only be accessed via some pointer. Names are part of the communication from the programmer to the compiler, and the objects above are not compiled objects; they are created dynamically during program execution, when the compiler is long gone.
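As a rough sketch of the animation above (not code from these slides), growing a list at its far end might look something like this. The names struct cell, appendCell, and last are made up for this illustration, and keeping a pointer to the last cell is just one assumed way to find the end of the list quickly.

```c
#include <stdlib.h>

/* Hypothetical cell type for this sketch: one integer plus a link. */
struct cell {
    int value;
    struct cell *next;
};

/* Append value after the current last cell.
   Returns the new last cell, or NULL if malloc fails.
   *start is set when the list was previously empty. */
struct cell *appendCell(struct cell **start, struct cell *last, int value)
{
    struct cell *fresh = malloc(sizeof *fresh);
    if (fresh == NULL)
        return NULL;
    fresh->value = value;
    fresh->next = NULL;          /* the new cell is now the end of the list */
    if (*start == NULL)
        *start = fresh;          /* very first cell: start points to it */
    else
        last->next = fresh;      /* link it in after the old last cell */
    return fresh;
}
```

A caller would keep two pointers, start and last, both initially NULL, and write last = appendCell(&start, last, 18); each time a new number arrives.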

MSJ-12


Example of Building a Linked List

[Diagram: start → older items → … → newer items]

The previous animation was designed for simplicity of imagery, to get us started visualizing the concept of a linked list.

start pointed to the original (oldest) item in the list, with new items being inserted to the right, at the far end of the list, after previous items; the list grew in the direction of its links.

Although the code for this version of insertion is simple enough, it’s not actually the very simplest version to code.

Insertion at the front of the list, with new items being inserted before (to the left of) previous items, is actually slightly simpler, so to keep our sample code as simple as possible, that’s what we’ll look at.

Later, we’ll also look at why and how we might insert in the middle of the list rather than at one of the ends.

Animation for the Code Example We’re About to Do

MSJ-14

[Diagram: start, plus a legend showing each list item as two cells: an integer and a pointer to our structure]

start will always point to the most recent item in the list, with new items being inserted in front (to the left) of older items.

And we’ll color code the various components of our structure to help us tell them apart.

This is what we’ll code:

MSJ-15

The Creation of a Linked List: The First Insertion

    typedef struct listItem {
        int someInteger;
        struct listItem *next;
    } LIST_ITEM;

    LIST_ITEM *start = NULL, *temp;

    while (???) {
        temp = malloc(sizeof(LIST_ITEM));
        temp->next = start;
        printf("Enter your integer: ");
        scanf("%d", &(temp->someInteger));
        start = temp;
    }

[Diagram: after the first pass through the loop, start and temp both point to a single item containing 37]

The typedef:

This definition defines a recursive structure, also known as a self-referential structure, since each item of this type contains a pointer to another item of the same type.

Here it's embedded within a typedef just to make the rest of the program a little easier to read.

Note that the name LIST_ITEM here needn't have any relationship to the name listItem, above, as far as the compiler is concerned.

Stylistically, however, it seems a good idea to make them related somehow, and C is case sensitive, so all caps for typedef's is a common convention (but not a requirement).

Look at the next line, the declaration of start, and think about how you'd write it if there were no typedef.

That's one of the major uses for a typedef: brevity and improved comprehensibility.

The declarations:

Now that our typedef is set up, we need to declare a pointer to the start of the list, like the one we saw in the simpler animation earlier.

Without the typedef, this line would be struct listItem *start = NULL; which is completely equivalent, except possibly for readability.

Also note that at this point we actually have a completely valid linked list; it's just empty at the moment.

We'll also need a temporary pointer to hold the address returned by malloc for a new cell before it's linked in to the list.

Note that we still need to put the * in front of the variable name to indicate that what we want to declare here is a pointer to a LIST_ITEM, not a LIST_ITEM itself, which, I'm sure you remember, is what we'd get if we left out the * in front of the name temp.

The loop:

Now let's look at the logic to add a new LIST_ITEM into the list.

We'll do this inside of a while loop since we don't know how many items the user will want to add.

First, we need to get some new memory for our new cell with our old friend the malloc function.

Note the use of sizeof(LIST_ITEM) to make sure that malloc obtains the right amount of space for us.

The malloc will return the address of the new memory it just allocated to us.

We need to store that address in a pointer variable somewhere or we'll never be able to use this new chunk of memory in the future.

Although this very first time through this insertion loop we could store the address returned by malloc directly into start instead of temp, that won't work for any subsequent insertions.

On the next slide, we'll see that this logic we’re writing here is general enough so that it actually works for all our insertions.

Since we want to insert our new LIST_ITEM in front of the current start of our list, we need to make our new one point to the item currently at the front, i.e., pointed to by start.

At this point in our example here there aren't any items in the list, so we just wind up setting temp->next to NULL (the current value of start), which is fine; it indicates that this item is the last item in the list, which it will be. When there's only one item in the list, as there is for now in this current example, it is both the start and the end of the list, no?

Anyway, now we'll ask the user to provide an integer value for our new list item.

Here's where we want to store the user input; and this integer cell can be referenced as temp->someInteger, correct?

Note that the & is used normally to provide an address to scanf for the location of the storage for the user's input.

As it happens, we could have left out the parentheses and just written &temp->someInteger, but unless one remembers the operator precedence tables, better safe than sorry; and I think the version here is easier to read in any event, no?

And the statement start = temp, by setting start equal to temp, makes start point to the same place as temp does, thus completing the insertion of the new LIST_ITEM into our list at the very front.

Note that temp->next = start and start = temp have to be done in exactly that order or the list will wind up looking rather strange!

After the first insertion:

At this point we have completed the insertion of the first element into a previously empty list.

The temp pointer actually still points into our list, too; but so what? We're done with it for now and won't refer to it again until we create a new item.

Our list is the structure (set of items) pointed to by start; the existence and properties of that list are not affected by the existence of other “leftover” pointers pointing to any of those items.

start, or whatever other name you choose in your code, always points to the complete list; temp is just that, a “working” variable used to help construct the list, but not part of the list itself. A more descriptive (i.e., better) name for it would have been newItemForInsertion, but I didn’t have room for anything that long on these charts.

MSJ-16

The Creation of a Linked List: More Insertions

[Diagram: start → -2 → 23 → 5 → 37, with temp pointing to the most recently inserted item]

    while (???) {
        temp = malloc(sizeof(LIST_ITEM));
        temp->next = start;
        printf("Enter your integer: ");
        scanf("%d", &(temp->someInteger));
        start = temp;
    }

This little while loop of five statements that we just developed (and two of them are printf and scanf, which are really not about linked lists at all) is all it takes to build a linked list as long as we want, until our operator tells us to quit, for example, since it's 5PM.

Let's watch it at work a few more times in our loop:

Make the new item point to the item currently at the front of the list, thus positioning the new item in front of the old start of the list.

Reset the start pointer to point to the new item that is now at the front of the list.

And so on …
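For anyone who wants to try the insertion loop outside the animation, here is one way it might be packaged as compilable C; treat it as a sketch rather than the official course code. The function names insertAtFront and buildList are invented for this example, the loop condition (which the slide leaves as ???) is assumed here to be "keep going until the input runs out or isn't a number", and a check of malloc's return value has been added.

```c
#include <stdio.h>
#include <stdlib.h>

typedef struct listItem {
    int someInteger;
    struct listItem *next;
} LIST_ITEM;

/* Insert value in front of the list.
   Returns the new start, or the old start unchanged if malloc fails. */
LIST_ITEM *insertAtFront(LIST_ITEM *start, int value)
{
    LIST_ITEM *temp = malloc(sizeof(LIST_ITEM));
    if (temp == NULL)
        return start;
    temp->someInteger = value;
    temp->next = start;   /* new item points to the old front ... */
    return temp;          /* ... and becomes the new front */
}

/* Assumed loop condition: read integers until fscanf stops succeeding. */
LIST_ITEM *buildList(FILE *in)
{
    LIST_ITEM *start = NULL;
    int value;
    while (fscanf(in, "%d", &value) == 1)
        start = insertAtFront(start, value);
    return start;
}
```

Calling buildList(stdin) and typing 37 5 23 -2 should leave start pointing to the -2 item, with 37 at the far end, just as in the diagram.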

MSJ-17

Linked Lists

Traversal and Traversal-Based Operations
    Traversal
    Search/Find
    Deletion
    Insertion into the middle (building an ordered list)

MSJ-18

Traversal: A Fundamental Operation on Linked Lists (and Linked Structures in General, For That Matter)

[Diagram: start → -2 → 23 → 5 → 37, with trvPtr stepping along the list one item at a time]

To traverse a data structure is to go through it one item at a time, “visiting” each item in turn.

The purpose of the visit is application dependent; maybe, for example, we just want to add up all the values in the list, or maybe we’re searching to find some specific value.

For our example here, where we’re concerned with the mechanics of traversal and not what we do on a visit, let’s just print out the integer in each item as we visit it.

Here’s the traversal code. It’s pretty simple; let’s step through it:

    LIST_ITEM *trvPtr = start;
    while (trvPtr != NULL)
    {
        printf("%d", trvPtr->someInteger);
        trvPtr = trvPtr->next;
    }

We’ll need a traversal pointer to keep track of where we currently are in the list.

We’ll initialize it to the start of the list, since, as the name suggests, that’s the usual starting point for doing much of anything with a linked list ;-)

Note that we do not want to use our start pointer as our traversal pointer.

If start is the only pointer we have to the start of our list and we change it, as we’re certainly going to do with our trvPtr, we’ll lose forever all ability to access the actual start of our list.

[Diagram: start moved one item along the list] We’ve lost access to -2 and can never get it back.

So we need a separate traversal pointer that we can change.

Since the traversal pointer is still pointing to a valid item, we’re not done with the traversal yet …

… so let’s visit the current item …

… and then move on to the next one.

The syntax and semantics of the statement trvPtr = trvPtr->next; are very typical of those we use in navigating linked structures in general, so let’s make sure we understand the details of how it works.

Since this is an assignment statement, we’re going to change the bit pattern that’s stored in trvPtr.

Since trvPtr is a pointer, that means we’re changing what it points to.

But first, to know what to store in trvPtr, we have to evaluate the expression trvPtr->next.

Currently, trvPtr contains the address of (designates, or points to) this structure.

So the current value of the expression trvPtr->next is the bit pattern stored here, in the component named next of the structure currently pointed to by trvPtr.

Since next is a pointer, what’s stored here is an address, which we represent in these diagrams as an arrow, so the value of the expression trvPtr->next is this arrow.

So it is this address (arrow) that will be stored in trvPtr by the assignment statement we’re currently executing.

So the result of this statement is to advance trvPtr to the next item in our list.

We’re still not done with the traversal yet …

And since trvPtr is now NULL, we’ve reached the end of the list and are therefore done with the traversal.
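To see that the same traversal skeleton serves different kinds of "visit", here is a small sketch with two visits: printing each integer, as above, and adding the values up, as mentioned earlier. The function names printList and sumList are made up for this example; passing start by value means each function gets its own copy to walk with, so the caller's start is never disturbed.

```c
#include <stdio.h>

typedef struct listItem {
    int someInteger;
    struct listItem *next;
} LIST_ITEM;

/* Visit = print the integer in each item, one per line. */
void printList(const LIST_ITEM *start)
{
    const LIST_ITEM *trvPtr = start;
    while (trvPtr != NULL) {
        printf("%d\n", trvPtr->someInteger);
        trvPtr = trvPtr->next;
    }
}

/* Visit = add the integer in each item to a running total. */
int sumList(const LIST_ITEM *start)
{
    int total = 0;
    for (const LIST_ITEM *trvPtr = start; trvPtr != NULL; trvPtr = trvPtr->next)
        total += trvPtr->someInteger;
    return total;
}
```

Only the body of the loop changes from one operation to the next; the initialize, test-for-NULL, advance pattern stays the same.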

Reminder On Interpreting and Using the Graphics of Linked Structures

MSJ-19

[Diagram: start → -2 → 23 → 5 → 37]

The arrows portray the logical connectivity, or topology, of our structures, which is all that we really care about.

The picture above emphasizes that this structure is a linear list (one item after another); other topologies are possible and used for other purposes, and we’ll look at some later this semester.

The actual layout in memory need not look anything at all like our logical view, so long as the pointers still provide the correct topology; here, below, is a partial memory map showing how our linear list might actually be laid out in memory:

[Memory map: the cells containing 37, -2, 5, and 23 scattered through memory in that order, with start pointing to the -2 cell and the arrows still linking -2 → 23 → 5 → 37]

Traversal Is a Fundamental Operation

It is important all by itself: E.g., print out the list of all students enrolled in CS315

It is used at the beginning of several other key operations:

Search/find
Delete
Insert-in-order

MSJ-20

Search/Find: E.g., Look Up a Phone Number Given a Last Name

Traverse the list, checking each item to see if it is the one being searched for, stopping either when we reach the searched for item or when we complete the traversal and have no more items to check

If we find what we were searching for, the search is said to be successful

If the traversal completes without finding the desired data, the search is unsuccessful; the target item is not present in the list
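A minimal sketch of search built on the traversal loop might look like the following. The name findItem is an invention for this example, as is the convention of returning a pointer to the item on a successful search and NULL on an unsuccessful one.

```c
#include <stddef.h>

typedef struct listItem {
    int someInteger;
    struct listItem *next;
} LIST_ITEM;

/* Returns a pointer to the first item containing target,
   or NULL if the traversal completes without finding it. */
LIST_ITEM *findItem(LIST_ITEM *start, int target)
{
    LIST_ITEM *trvPtr = start;
    /* The NULL test must come first: && short-circuits, so we never
       look inside an item that isn't there. */
    while (trvPtr != NULL && trvPtr->someInteger != target)
        trvPtr = trvPtr->next;
    return trvPtr;
}
```

The loop has two ways to stop, matching the two outcomes described above: trvPtr lands on the searched-for item, or it runs off the end of the list and becomes NULL.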

MSJ-21

MSJ-22

Delete a Specified Item

[Diagram: start → -2 → 23 → 5 → 37, with delPtr pointing to the item containing 5; the call being illustrated is deleteItem(5);]

High level pseudocode:

Search for the item to be deleted using the standard search/traversal logic

If the search is unsuccessful, do nothing, or maybe report that the item to be deleted was not found

If the item to be deleted is found, adjust list pointers as necessary to remove it from the list

There are some interesting issues here; let’s look at an example.

Let’s say, for example, that it’s the item containing 5 that we want to delete.

Here’s the situation after we have traversed the list and successfully found the item to be deleted.

As far as the list itself is concerned, all we need to do to delete the item is to adjust one pointer.

We know what to set it to:

    ??? = delPtr->next;

But how do we designate this cell that we want to set to delPtr->next?

Non-trivial question: Just exactly how do we designate the pointer we need to adjust?

[Diagram: the next pointer of item 23 now bypasses item 5] Item 5 has been deleted from the list.

Note that the item itself still exists; it’s just not part of the linked list anymore: item 23 now points to item 37.

As a programming matter, we should save the address of this item in some pointer variable (it’s currently in delPtr).

Maybe we’d want to insert it into a different structure after we deleted it from this one, e.g., after a student flunks CS315, remove the student record from the list of current Computer Science majors and then insert it into a list of Sociology majors.

At the very least, if we were really all done with this item, we’d want to give its storage back.

free is the standard library function that releases memory obtained from malloc, e.g., free(delPtr);

free is thus the opposite of malloc.

Note that we are freeing up the memory that delPtr points to, not actually delPtr itself.

As a matter of good programming practice, everything obtained via malloc should eventually be explicitly released via a free call, as part of a program’s cleanup-before-shutdown processing, if not before.

But since what we do with an item after deleting it from a structure is application dependent, we won’t show this step in our code here; we’re only concerned with the theory and practice of deletion.
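As one possible illustration of the cleanup-before-shutdown idea, here is a sketch of a function that walks a list and frees every item. The name freeList and the convention of returning NULL (so the caller can write start = freeList(start);) are assumptions of this example. It also assumes every item in the list was obtained from malloc.

```c
#include <stdlib.h>

typedef struct listItem {
    int someInteger;
    struct listItem *next;
} LIST_ITEM;

/* Free every item in the list; always returns NULL. */
LIST_ITEM *freeList(LIST_ITEM *start)
{
    while (start != NULL) {
        LIST_ITEM *doomed = start;
        start = start->next;   /* step past the item BEFORE freeing it */
        free(doomed);          /* doomed->next must not be read after this */
    }
    return NULL;
}
```

The order of the two statements in the loop body matters: once an item has been freed, reading its next field is no longer allowed, so the address of the following item has to be picked up first.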

MSJ-23

Delete a Specified Item (cont’d): Adjusting the Pointers

[Diagram: start → -2 → 23 → 5 → 37, with lead pointing to the item containing 5 and trail pointing to the item before it; the call being illustrated is deleteItem(5);]

Designating the pointer to the item to be deleted (bypassed) will require some additional work.

There are two more or less obvious approaches.

One standard method involves the use of two traversal pointers: a leader and a trailer, where the trailer is always kept one item behind the leader so that when the leader finds the item to be deleted, the trailer’s next pointer is the one that must be adjusted to actually do the deletion of the item pointed to by the leader.

So here’s the actual deletion code:

    trail->next = lead->next;

Note that there will be a special case to handle the deletion of the very first item in the list, e.g., deleteItem(-2);

    /* Traverse/search to find the item to be deleted */

    if (lead == start)   /* It is the first item that is to be deleted */
        start = start->next;
    else                 /* “Normal” deletion logic */
        trail->next = lead->next;

Reminder

In terms of code development, do the general case first; in this case, that’s when the item to be deleted is somewhere in the middle of the list.

Then figure out if you need special cases, i.e., places where the logic for the general case won’t work.

Typical special cases involve working at the ends of the list.

In this example, the general purpose logic works correctly to delete the last item, but fails to delete the first item properly, so here we only needed one special case; sometimes you’ll need more, sometimes fewer; but when trying to figure out your algorithm, start with the general case.

And start by drawing/annotating pictures showing which pointers get adjusted when and to point to where.

Only then, when you’re sure you’ve got a workable algorithm, worry about translating it into code.

That’s known as separation of concerns, a more formal name for doing one thing at a time.
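Putting the leader/trailer pieces together, a complete version might look roughly like this sketch. The function name unlinkItem, the pointer-to-pointer parameter (so the function can change the caller's start), and the choice to return the removed item to the caller are all assumptions made for this example; the check for "not found" is the one the slides later note should have been included.

```c
#include <stddef.h>

typedef struct listItem {
    int someInteger;
    struct listItem *next;
} LIST_ITEM;

/* Remove the first item containing target from the list.
   Returns the removed item (still allocated, for the caller to reuse
   or free), or NULL if target is not in the list. */
LIST_ITEM *unlinkItem(LIST_ITEM **start, int target)
{
    LIST_ITEM *lead = *start;
    LIST_ITEM *trail = NULL;

    /* Traverse/search, keeping trail one item behind lead */
    while (lead != NULL && lead->someInteger != target) {
        trail = lead;
        lead = lead->next;
    }
    if (lead == NULL)            /* empty list or unsuccessful search */
        return NULL;

    if (lead == *start)          /* special case: first item */
        *start = lead->next;
    else                         /* "normal" deletion logic */
        trail->next = lead->next;

    lead->next = NULL;           /* the removed item no longer points into the list */
    return lead;
}
```

Returning the removed item leaves the application free to insert it into some other list or to hand it to free, whichever it needs.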

Another Way to Delete a Specified Item

[Diagram: start → -2 → 23 → 5 → 37, with delPtr stopping at the item containing 23, the one before the item to be deleted; the call being illustrated is deleteItem(5);]

MSJ-24

An alternative to the two pointer (leader/trailer) method is to only use a single, trailing pointer at the cost of slightly more complicated expressions in both the “find-the-item-to-be-deleted” logic and the actual deletion logic itself.

Here’s our old traversal loop, which stops when delPtr points to the item to be deleted …

    while ( delPtr->someInteger != 5 )
        delPtr = delPtr->next;

… leaving us no way to designate the pointer that needed to be adjusted to actually do the deletion.

Here’s the loop condition we now want to use so that the loop stops with delPtr pointing to the item before the item to be deleted, the item whose next field is the one that will have to be adjusted to actually do the desired deletion:

    while ((delPtr->next)->someInteger != 5)
        delPtr = delPtr->next;

It’s important to understand the meaning of expressions like this that have multiple -> operators.

delPtr->next designates the next component of the item pointed to by delPtr.

Since the next component is itself a pointer type, the value of the expression delPtr->next is an address/arrow that points to this, the next item of our list.

Since this item, designated by delPtr->next, is itself a structure that has components, (delPtr->next)->someInteger designates this component, named someInteger, so the value of the expression (delPtr->next)->someInteger is 5.

Similarly, (delPtr->next)->next designates the next field here …

… so here’s the pointer adjustment logic that does the actual deletion …

    delPtr->next = (delPtr->next)->next;

… by copying this value … … into this cell.

And we have, as before, deleted item 5 from the list, although, as before, the item itself continues to exist.

Note that if the illustration here were the whole story, we never saved the address of this cell anywhere (it’s what used to be in delPtr->next), so now we can’t access this cell anymore.

We can’t even give it back, since the free call requires the address of the cell to be freed up, and our code has no record of it anymore.

Oopsy, we probably should have saved the old value of delPtr->next somewhere.

C programming note: the -> operator associates from left to right, so the parentheses can be omitted from (delPtr->next)->someInteger.

E.g., delPtr->next->next evaluates to what we want and, to me, is slightly easier to read; this is a rare case where knowing and taking advantage of the C operator precedence and associativity rules and leaving out unnecessary parentheses makes code more readable, not less.

Still Need to Check for Special Cases

[Diagram: start → -2 → 23 → 5 → 37, with delPtr initialized to start]

MSJ-25

As before, when we check for problems we’ll see that the general case logic doesn’t handle all possible cases.

To make this general case look truly general, let’s replace 5, the specific value used in the last example, with a more general valueToBeDeleted.

One problem with this “single trailing pointer” approach is that since delPtr is initialized to start, the first item the traversal loop actually looks in for the valueToBeDeleted is the second item in the list, i.e., it starts looking at 23, not at -2.

And there’s no way to initialize delPtr to anything earlier than the start so that this code could check the first item in the list.

The result is that the traversal loop can’t find the item containing -2 and so can’t delete it.

Here’s the special case our general logic can’t handle: deleting the first item in the list.

Putting the traversal ‘while’ loop as one case of an enclosing ‘if’ statement solves the problem.

And the same general case logic we used before correctly handles everything else.

    if (start->someInteger == valueToBeDeleted)
        start = start->next;   /* Delete the very first item in the list */
    else   /* Traverse to find the valueToBeDeleted */
    {
        delPtr = start;
        while ((delPtr->next)->someInteger != valueToBeDeleted)
            delPtr = delPtr->next;
        delPtr->next = (delPtr->next)->next;   /* Do the deletion */
    }

Note that this time the general case traversal is inside an ‘if’ statement, whereas last time (separate leading and trailing pointers) the traversal logic was the same in all cases and only the actual deletion logic itself had the special case, so the ‘if’ statement followed the traversal instead of enclosing it.

There’s no general pattern to special cases; the only firm rule to follow is that one should develop the general case logic first, then figure out what the special cases are (they vary from problem to problem), then add logic for them wherever and however necessary.

More Special Cases, Still Using Only One (Trailing) Traversal Pointer

MSJ-26

[Diagram: start → -2 → 23 → 5 → 37, with delPtr pointing to the item containing 23]

We should also deal with the case that the valueToBeDeleted isn’t in our list at all, which comes in two flavors:

The list is empty

The list isn’t empty but doesn’t contain the valueToBeDeleted

We didn’t check for these cases in the two pointer (leading and trailing) code that we developed earlier; we should have.

Maybe the same code can cover both cases here, maybe it can’t; we’ll have to check and see.

In any event, let’s look here at the complete code for deletion using only a single (trailing) traversal pointer:

    if (start == NULL)   /* List is empty */
        printf("Can't delete from an empty list, dummy");
    else if (start->someInteger == valueToBeDeleted)
        start = start->next;   /* Delete first item from list */
    else   /* Traverse the list looking for the valueToBeDeleted */
    {
        delPtr = start;   /* Initialize the traversal pointer */
        while ( (delPtr->next != NULL) &&
                (delPtr->next->someInteger != valueToBeDeleted) )
            delPtr = delPtr->next;   /* Move to next item */
        if (delPtr->next == NULL)   /* Fell off the end of the list */
            printf("Can't find %d in the list", valueToBeDeleted);
        else   /* The traversal loop found the item to be deleted */
            delPtr->next = (delPtr->next)->next;   /* Do the deletion */
    }

Walking through the code from top to bottom:

The first test is the empty list case.

The second test is the special case for deleting the first item in the list.

The search loop looks for the valueToBeDeleted.

If delPtr->next is NULL afterwards, the search loop completed without ever finding the valueToBeDeleted.

Otherwise the search loop terminated when it found the valueToBeDeleted (that’s 5, in the single trailing pointer example illustrated here, not 23), and the last line of code deletes it.

Note that we need to add a check to our traversal/search loop to keep us from advancing beyond the end of the list and then trying to de-reference a NULL pointer, which would cause our program to blow up.

Also note that we are relying on the fact that the && operator is a short-circuit operator to keep our code from trying to evaluate delPtr->next->someInteger when delPtr->next is NULL.

The code above covers all the bases.
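One way to address the "we probably should have saved the old value of delPtr->next" problem noted earlier is sketched below. This is a variation on the single trailing pointer code, not the slides' own version: the function wrapper, the pointer-to-pointer parameter, the 1/0 return value, and the local name doomed are all assumptions of this example, and it assumes every item came from malloc so that calling free on it is legitimate.

```c
#include <stdlib.h>

typedef struct listItem {
    int someInteger;
    struct listItem *next;
} LIST_ITEM;

/* Delete the first item containing valueToBeDeleted and free its storage.
   Returns 1 if an item was deleted, 0 if the list is empty or the value
   is not present. */
int deleteItem(LIST_ITEM **start, int valueToBeDeleted)
{
    LIST_ITEM *delPtr;
    LIST_ITEM *doomed;

    if (*start == NULL)                      /* list is empty */
        return 0;

    if ((*start)->someInteger == valueToBeDeleted) {
        doomed = *start;                     /* save the address first ... */
        *start = doomed->next;               /* ... then bypass the first item */
    } else {
        delPtr = *start;
        while (delPtr->next != NULL &&
               delPtr->next->someInteger != valueToBeDeleted)
            delPtr = delPtr->next;
        if (delPtr->next == NULL)            /* fell off the end of the list */
            return 0;
        doomed = delPtr->next;               /* save the address first ... */
        delPtr->next = doomed->next;         /* ... then do the deletion */
    }
    free(doomed);                            /* we still know where it is */
    return 1;
}
```

Saving the address in doomed before adjusting any pointer is the whole trick; after the adjustment, the list itself no longer has any record of the deleted item.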

MSJ-27

Ordered Lists

MSJ-28

In an ordered list, items are ordered on the basis of some data item, known as the key, as illustrated in the ascending list below.

[Diagram: start → 5 → 10 → 30 → 37]

Traversal, search, and deletion are the same as we saw before; but insertion is different.

The earlier (simpler) insertion animations only inserted at one end of the list.

Now, to keep the list in order, we’ll usually need to insert into the middle, no?

Suppose we wish to insert a node with a key of 24 into the list above; how would you go about it?

[Diagram: newItemForInsertion points to a new node containing 24] We need to insert the new 24 node before the 30 one.

Here’s some pseudocode:

Traverse the list to find the place to insert the new item so as to preserve the ordering

Adjust the appropriate pointers to actually do the insertion
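The two pseudocode steps might be turned into C along the following lines; this is a sketch under a few assumptions, not the course's official solution. It reuses the LIST_ITEM type with someInteger as the key, uses a single trailing pointer (named trail here), keeps the list in ascending order, and uses a made-up function name, insertInOrder, that returns the possibly changed start pointer.

```c
#include <stddef.h>

typedef struct listItem {
    int someInteger;          /* used as the key in this sketch */
    struct listItem *next;
} LIST_ITEM;

/* Insert an already-allocated item into an ascending list.
   Returns the start of the list, which changes only when the
   new item belongs at the very front. */
LIST_ITEM *insertInOrder(LIST_ITEM *start, LIST_ITEM *newItemForInsertion)
{
    LIST_ITEM *trail;

    /* Special cases: empty list, or new key belongs before the first item */
    if (start == NULL ||
        newItemForInsertion->someInteger <= start->someInteger) {
        newItemForInsertion->next = start;
        return newItemForInsertion;
    }

    /* Traverse until the item AFTER trail has a key that is not smaller */
    trail = start;
    while (trail->next != NULL &&
           trail->next->someInteger < newItemForInsertion->someInteger)
        trail = trail->next;

    /* Adjust the pointers: new item first, then the item before it */
    newItemForInsertion->next = trail->next;
    trail->next = newItemForInsertion;
    return start;
}
```

For the example above, the loop stops with trail pointing to the 10 item, so the 24 node ends up between 10 and 30. As with front insertion, the order of the two pointer adjustments matters.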

MSJ-29

Linked Lists

Introduction and Motivation

Building (Insertion Into) a Linear Linked List

Traversal and Traversal-Based Operations

Variations, Embellishments, and Elaborations
    Bi-directional lists, including fascinating (or at least important ;-) sidebars on:
        Topology again
        l-values, r-values, and how compilers really process an assignment operator; all of which are necessary to understand expressions like a->b->c->d->e = a->b->c->d->e
    Circular Lists
    Headed Lists

Summary

Bi-Directional Lists

MSJ-30

start

Just as a structure can have more than one integer or floating point component, if we find it useful for whatever problem we’re trying to solve, it can certainly have more than one pointer struct biDirectionalListItem{ … struct biDirectionalListItem *previous, *next;};

Since all we’re interested in is the list mechanics, I’m not going to bother making up other data items to be past of our list items

So here’s what a bidirectional list might look like:

Later we’ll look at other topologies – multi-linked structures that are not linear (a bi-directional list is still linear) such as orthogonal lists (for sparse matrices) or binary trees, the most beautiful data structures in the universe

Note that here we used multiple (two, to be precise) pointers of the same type, since that’s all we need for a bidirectional list

More complex problems (not simply bidirectional lists) might require multiple pointers of multiple types

Benefits for Bi-Directional Lists

It makes deletion a lot cleaner (neither of our two previous deletion algorithms was exactly elegant, were they?)

After we find the item to be deleted, we don’t need leading or trailing pointers to help with the deletion

MSJ-31

[Figure: bi-directional list start → A ↔ B ↔ C ↔ D ↔ E, with delPtr pointing at D]

Insertion into an ordered list is simplified, too, much as deletion is

Traversal in reverse order becomes possible; with a one-directional list, it isn’t possible without a great deal of difficulty

i. Is that important? It depends; some applications need to be able to traverse in both directions, some don’t

ii. Your job, as an engineer, is to know the capabilities, plusses, and minuses of each of the tools/techniques in the standard armamentarium

delete(D)

Sidebar: Programmers and Topologies

MSJ-32

It’s up to you, the programmer, to ensure that you set up the pointers so that your lists have the desired topologic properties – e.g., bidirectional linearity

If you wanted to, you could use the struct biDirectionalListItem that we defined on the last slide to make a linked thingy that looked like so:

[Figure: start and items A through E, with their previous and next pointers connected in a tangled, non-linear arrangement]

I can’t imagine why anyone would want to create a monstrosity like that (although some of you will probably manage something like it the first few times you try to create a bidirectional linear list ;-) but each item in the monster thingy would still be a struct biDirectionalListItem; only the connection topology would be different

C provides features to support pointers and structures; lists are up to you. There are languages that directly support lists and basic list operations (LISP, for example), but neither C nor any of its descendants are among them: the topologic properties of linked structures are up to the programmer

Another Sidebar (Fairly Important): The Assignment Operator (or ‘=’ Sign)

The point of this digression is to make sure you know what really happens when you write things like

delPtr->next->previous = delPtr->previous, and

delPtr->previous->next = delPtr->next,

the two lines used to do the deletion from the bidirectional list

There’s nothing particularly new or tricky here, but it’s crucial to what we’re doing so I want to go over the concepts pretty precisely
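As a concrete reference point, here is a minimal sketch of those two deletion assignments wrapped in a function. The function name unlinkItem, the someInteger field, and the startPtr parameter are assumptions for illustration. The NULL checks are also an addition: the two lines by themselves assume delPtr has both a predecessor and a successor, as D does in the picture.

```c
#include <assert.h>
#include <stddef.h>

struct biDirectionalListItem {
    int someInteger;                               /* hypothetical data item */
    struct biDirectionalListItem *previous, *next;
};

/* Unlink delPtr from the list whose start pointer is *startPtr.
   The node itself is not freed; that is left to the caller. */
void unlinkItem(struct biDirectionalListItem **startPtr,
                struct biDirectionalListItem *delPtr)
{
    assert(startPtr != NULL && delPtr != NULL);

    /* The successor (if any) now points back past delPtr. */
    if (delPtr->next != NULL)
        delPtr->next->previous = delPtr->previous;

    /* The predecessor (if any) now points forward past delPtr. */
    if (delPtr->previous != NULL)
        delPtr->previous->next = delPtr->next;
    else
        *startPtr = delPtr->next;                  /* delPtr was the first item */

    delPtr->previous = delPtr->next = NULL;        /* detach the removed node */
}
```

No leading or trailing pointer is needed: everything the deletion has to touch is reachable from delPtr itself.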

MSJ-33

[Figure: Delete(D) on the bi-directional list start → A ↔ B ↔ C ↔ D ↔ E, with delPtr pointing at D; the two assignments above redirect the pointers around D]

Another Sidebar: The Assignment Operator In More Detail


The compiler must do three things for an assignment (=)

1. For the left side:

a) Figure out the type of the value of the expression to the left of the = sign (e.g., delPtr->next->previous); it must be an address value (pointer type) or the expression won’t compile

b) Generate the code to evaluate that expression at run time when the assignment is executed

2. For the right side:

a) Figure out the type of the value for the expression to the right of the = sign; it must be a type that’s legal for the address from the left-side expression to contain (e.g., delPtr->previous = 3.1416*diameter won’t compile)

b) Generate the code to evaluate that expression at run time when the assignment is executed

3. Generate the code to store (assign) the computed value from the right hand expression in the memory location whose address will be computed by the code generated for the expression on the left side

MSJ-34

The Problem with an Overly Simplistic View of the Assignment Operator (cont’d)

MSJ-35

But given, for example, the declarations int x,y; if the assignment operation looks like x = y+2, the compiler would seem to have a problem: As far as we know so far, the type of a simple variable, like x, in an expression is the type the variable was declared as, int, in this case

But when we look more formally at the details, as we just did, the assignment operator requires the expression on the left to be an address type, not an integer

The Solution

MSJ-36

So here’s the compiler’s real logic:

Ordinarily, the value of a simple variable name like x is what we expect, i.e., the value stored there and the type of that value is the type the variable was declared as

But, if the next operator to the right is an assignment operator*, the value of the variable name is the address of the variable and the type of that value/expression is the correct pointer type – e.g., an integer pointer if x is an integer

* Not just =, but +=, -=, *=, /=, %=, &=, |=, ^=, <<=, and >>=; several of these are used rarely, if ever, but the compiler allows them and they are assignments

The Exception in Evaluating the Assignment Operator (cont’d)

MSJ-37

To continue being technically precise, a simple variable name is thus overloaded in C, as in most modern imperative languages: It can be evaluated by the compiler to one of two completely different values (bit patterns) depending on where it is in a larger expression

To the immediate left of an assignment operator, its value is its address; elsewhere, it’s the bit pattern stored at that address (unless we explicitly use the & operator to ask for its address)

The two different possible values for the same variable name are referred to as its l-value and its r-value

The l- (or left) value being the address

The r- (or right) value being the bit pattern stored there

Which value the compiler chooses to use depends on where the variable name is, if it’s in an assignment statement
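A tiny sketch may make the l-value/r-value distinction more tangible. The function name assignThreeWays and the particular numbers are invented for illustration; the point is only that the same name, x, designates a cell when it sits to the left of an assignment operator and yields the stored value everywhere else.

```c
#include <assert.h>

/* Returns the final value of x after three assignments that all store
   into the same cell. */
int assignThreeWays(void)
{
    int x = 0, y = 5;
    int *p = &x;     /* & explicitly asks for the address of x */

    x = y + 2;       /* left: the cell named x; right: the value stored in y, plus 2 */
    *p = x + 1;      /* *p designates the same cell; the x on the right yields 7 */
    x += 1;          /* compound assignment: x is read (8) and then stored into (9) */
    /* y + 2 = x;       would not compile: y + 2 designates no cell to store into */
    return x;
}
```

Calling assignThreeWays() returns 9: the three assignments store 7, then 8, then 9 into the one cell named x.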

Sidebar: Assignment Evaluation (cont’d)

The evaluation of an expression like

delPtr->next->previous = delPtr->previous has the same issue and the same resolution

MSJ-38

[Figure: bi-directional list of three items holding 5, 10, and 15, with delPtr pointing into it]

Because it’s to the left of the assignment operator, the value of the expression delPtr->next->previous will be the address of the cell it designates, namely the address of the cell for the component named previous of the structure pointed to by delPtr->next

Because it’s to the right of the = sign, the value of the expression delPtr->previous will be the bit pattern stored in the cell it designates

Because both expressions are the same type, struct biDirectionalListItem *, or pointer to a struct biDirectionalListItem, the compiler is happy to generate the code to do the evaluations and make the assignment

Last Sidebar: C is Being Nice to Us for a Change


MSJ-39

[Figure: the same three-item bi-directional list (5, 10, 15) with delPtr, and with the arrows followed by the highlighted expression marked]

The highlighted expression, above, is designating the component cell named previous in the structure pointed to by the highlighted arrows in this picture

If the expression is on the left of an = sign, as it is here, its value is the l-value (address) of this cell …

… but if the highlighted expression above were not to the left of an assignment operator, it would be evaluated to its r-value, which, since the type of a previous cell is a pointer type, is the address of some other cell

Each -> operator here …

… corresponds to one arrow that must be followed to evaluate (understand) the expression

Note the lovely (and hardly coincidental) correspondence between the syntax of C and the pictures we need to understand the semantics

Got it? OK, then how about this as a possible exam question: Given the picture below, consider the expression delPtr->next->previous->previous->next->someInteger

a) Is it legal (will it compile)?

b) If so, what is its r-value?

I can’t imagine ever needing to write such an expression; usually we’re working “closer” to some named pointer (e.g., delPtr), but if we need to, C allows it and you now know how to decipher it (congratulations ;-)
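One way to check your reading of such an expression is to build the picture and let C follow the arrows. The sketch below assumes a three-item list holding 5, 10, and 15 with delPtr pointing at the middle item; that placement is an assumption (the expression only works if delPtr has both a predecessor and a successor), and the helper names buildThree and followTheArrows are made up.

```c
#include <assert.h>
#include <stddef.h>

struct biDirectionalListItem {
    int someInteger;
    struct biDirectionalListItem *previous, *next;
};

/* Link items[0..2] as 5 <-> 10 <-> 15 and return a pointer to the middle
   item, which plays the role of delPtr. */
struct biDirectionalListItem *buildThree(struct biDirectionalListItem items[3])
{
    int values[3] = {5, 10, 15};
    for (int k = 0; k < 3; k++) {
        items[k].someInteger = values[k];
        items[k].previous = (k > 0) ? &items[k - 1] : NULL;
        items[k].next = (k < 2) ? &items[k + 1] : NULL;
    }
    return &items[1];
}

/* Used as an r-value: one arrow per -> operator.
   delPtr -> next (15) -> previous (10) -> previous (5) -> next (10). */
int followTheArrows(struct biDirectionalListItem *delPtr)
{
    return delPtr->next->previous->previous->next->someInteger;
}
```

Under that assumption the walk ends back at the item delPtr points to, so the r-value is 10. The same expression is also legal on the left of an = sign, where it designates that item's someInteger cell.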

MSJ-40

Linked Lists

Introduction and Motivation Building (Insertion Into) a Linear Linked List Traversal and Traversal-Based Operations Variations, Embellishments, and Elaborations

Bi-directional Lists Circular Lists Headed Lists Summary

Circular Lists

MSJ-41

[Figure: a circular list, with aPtrToTheList pointing to one of its items]

Not as ubiquitous as linear lists, but still quite useful

Often used in operating systems – which we’ll explore in a bit more detail in CS420, e.g.:

Round robin scheduling

Contiguous memory management with a first fit allocation policy

This is the general case illustrated here

There are two special cases, one pretty normal for linked lists in general, one not

An empty circular list

What’s this? And what line of code could you write to achieve it?
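Here is a small sketch of the general case and the two special cases. It reads the "What's this?" picture as a one-item circular list whose single item is its own successor; that reading, the node type circularItem, and the function names are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

struct circularItem {              /* hypothetical node type */
    int data;
    struct circularItem *next;
};

/* Special case: a one-item circular list. The single line that achieves
   it makes the item its own successor. */
void makeSingleton(struct circularItem *item)
{
    assert(item != NULL);
    item->next = item;
}

/* General case: splice newItem in right after the item pointed to. */
void insertAfter(struct circularItem *aPtrToTheList, struct circularItem *newItem)
{
    assert(aPtrToTheList != NULL && newItem != NULL);
    newItem->next = aPtrToTheList->next;
    aPtrToTheList->next = newItem;
}

/* Count the items. An empty circular list is just a NULL pointer; otherwise
   there is no NULL to stop at, so we stop on returning to where we began. */
int countItems(const struct circularItem *aPtrToTheList)
{
    if (aPtrToTheList == NULL)
        return 0;

    int count = 0;
    const struct circularItem *p = aPtrToTheList;
    do {
        count++;
        p = p->next;
    } while (p != aPtrToTheList);
    return count;
}
```

The do/while loop is the design point: testing for NULL, as a linear traversal would, never terminates on a circular list.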

Note that many authors (but not all) consider a circular list to be merely an implementation technique for a linear list, since each item still has a unique predecessor and a unique successor

The successor of an item is the item it points to

The predecessor of an item is the one that points to it

If, as shown here, the list is implemented uni-directionally, it won’t have pointers to predecessors, but that’s an implementation issue, not a topological one; i.e. – the property that each node has a unique predecessor and a unique successor is the abstract, or topological, property, known as linearity, regardless of whether or not the underlying implementation is uni- or bi-directional (or circular)

MSJ-42

Linked Lists


Ordered lists Bi-directional lists Circular lists Headed lists

Basic Concept Example: Sparse Matrices

Summary

Headed Lists, A.k.a. Lists With Sentinels

A list header (a.k.a. sentinel node) is an item in a linked list that

Is the first item in the list (the head of the list)

May not be deleted

Has some special (application specific) semantics or meaning associated with some value in some field that identifies it as a sentinel (header) node ─ the examples will make this clearer, I hope; bear with me

Headed lists are very useful and very common; some problems cannot easily be solved any other way

We’ll see a particularly lovely and important example when we look at an implementation of sparse matrices via orthogonal lists, but let’s start with a simpler example

MSJ-43

[Figure: two versions of the same headed list, each beginning with a sentinel node holding -4 followed by the list’s positive integers; in one, start is a declared structure, in the other a simple pointer]

For our simple example, the list is to contain only positive integers, but assume we also need to keep track of the length of the list for some reason

Rather than declaring a separate variable for length, store the length in a sentinel node, but make it negative so as to ensure that the header (sentinel) is not mistaken for an ordinary (positive) node

Sometimes, the header is a declared node that starts the list, all other list nodes being created normally (i.e., dynamically, during execution)

Here, start is a declared structure …

Note the difference: Here, start is a declared variable, but it is a simple pointer, not a structure …

… and the sentinel must be created dynamically and any deletion algorithm must be sure not to later delete it by accident

Anyway, the motivation for all this stuff will, I hope, become much clearer when we look at orthogonal lists to represent sparse matrices, coming up next

It (sparse matrices) really is a pretty solution to an important problem

Remind me to bring in and show you a fairly good textbook (Gollmann, Computer Security), where the author, whom I otherwise mostly like, states that a certain implementation of a key data structure in computer security can’t be used in most cases because there’s no good, general solution to a problem with one aspect of the implementation

Bullshit – you’ll know how to solve that problem after a few more charts here
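A minimal sketch of this headed list follows, using the variant in which start is a simple pointer and the sentinel is created dynamically. The type and function names are invented for illustration, and an empty list stores 0 in the sentinel (not positive, so still distinguishable from an ordinary node).

```c
#include <assert.h>
#include <stdlib.h>

struct listItem {
    int value;                 /* sentinel: -(length); ordinary node: a positive integer */
    struct listItem *next;
};

/* Create an empty headed list: just the sentinel, recording length 0. */
struct listItem *createHeadedList(void)
{
    struct listItem *header = malloc(sizeof *header);
    if (header != NULL) {
        header->value = 0;
        header->next = NULL;
    }
    return header;
}

/* Insert a positive value right after the sentinel. Returns 1 on success. */
int insertValue(struct listItem *header, int value)
{
    assert(header != NULL && header->value <= 0);
    if (value <= 0)
        return 0;                          /* only positive integers are allowed */
    struct listItem *item = malloc(sizeof *item);
    if (item == NULL)
        return 0;
    item->value = value;
    item->next = header->next;
    header->next = item;
    header->value -= 1;                    /* length grows, so the stored value gets more negative */
    return 1;
}

/* Delete the first ordinary node holding value. The search starts after
   the sentinel, so the sentinel itself can never be deleted. */
int deleteValue(struct listItem *header, int value)
{
    assert(header != NULL);
    struct listItem *trail = header;
    while (trail->next != NULL && trail->next->value != value)
        trail = trail->next;
    if (trail->next == NULL)
        return 0;
    struct listItem *victim = trail->next;
    trail->next = victim->next;
    free(victim);
    header->value += 1;
    return 1;
}

int listLength(const struct listItem *header)
{
    return -header->value;
}
```

A side benefit of the sentinel shows up in deleteValue: because every ordinary node has a predecessor, deleting the first real item needs no special case.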

MSJ-44

Linked Lists


Ordered Lists Bi-directional Lists Circular Lists Headed Lists

Basic Concept Example: Sparse Matrices

Summary

Introduction and Motivation for Sparse Matrices

Although sparse matrices themselves are interesting and important objects, they don’t really belong here since they’re not linear lists

But they are built from linear lists and what interests us here is that the lists must be headed or we can’t get this sparse matrix structure to do what we need to

So looking at sparse matrices will give us a chance to see what drives the need for headed lists and how we work with them

MSJ-45

A Multi-Linked Structures Example: Orthogonal Lists for Sparse Matrices

A sparse matrix is one where the majority of the entries in the matrix are 0

Economists, for example, might want to keep track of the extent to which changes in the price of one commodity, product, or service (CPS) are correlated with changes in others

They prepare a complete list of CPS’s, possibly millions of entries long for a large national economy, then make a square matrix M, where each entry 0 ≤ mi,j ≤ 1 is the correlation coefficient between CPSi and CPSj

But the vast majority of the mi,j are all 0; I mean, how much do you think the price of steel correlates with the price of, for example, bubble gum?

MSJ-46

A Multi-Linked Structures Example: Orthogonal Lists for Sparse Matrices (cont’d)

The natural representation of a matrix in a programming language like C is obviously a 2-dimensional array

But if there were a million commodities, the array would have a trillion entries; if most of them were 0, that would be filling a lot of memory with zeroes; that seems wasteful

Very few modern systems will let you use a 10^6 x 10^6 array in any event

Even if such a declaration compiled, it would probably blow up in execution (that’s what happens on prclab)
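To put a number on "a lot of memory", the arithmetic can be sketched as below. The function name denseBytes is made up, and the 8-byte element size used in the example is an assumption (a typical size for a double, though C does not guarantee it).

```c
#include <assert.h>
#include <stdint.h>

/* Bytes needed to store a dense n x n matrix whose entries each occupy
   elementBytes bytes. */
uint64_t denseBytes(uint64_t n, uint64_t elementBytes)
{
    return n * n * elementBytes;
}
```

For a million commodities at 8 bytes per entry, denseBytes(1000000, 8) is 8,000,000,000,000 bytes, roughly 8 terabytes, nearly all of it holding zeroes.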

MSJ-47


What we want is some other data structure (not an array) that just stores the non-zero elements of M

To find the value of some mi,j, we search the structure; if the search is successful, we know the (non-zero) value of mi,j; if the search is unsuccessful, we know its value is 0

One implementation technique for sparse matrices involves what are called orthogonal lists

MSJ-48


Here’s a diagram of what each member of this sparse matrix structure* would look like:

Each item is a member of two separate lists

All the non-zero items in row i form a single ordered list, ordered by j (their column number), linked by their nextInRowPtr

All the non-zero items in column j form a single ordered list, ordered by i (their row number), linked by their nextInColPtr

MSJ-49

[Figure: one element of the sparse matrix, drawn as a node with fields i, j, mi,j, nextInColPtr, and nextInRowPtr]

i and j are commodity numbers

mi,j is the correlation coefficient between commodity i and commodity j

* The term structure is overloaded here. The sparse matrix is an example of the sort of theoretic object called a data structure that we study in computer science, particularly in CS315. As an implementation matter, each element of the sparse matrix will be a structure in the C programming sense. (Many other languages don’t use the word “structure” this way; Ada, for example, calls such things “records”.) Anyway, the figure, above, is a pictorial representation of the C structure that would comprise one element of a sparse matrix; the next slide illustrates how such elements fit together to make a sparse matrix
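In C, the element pictured above might be declared along the following lines. The pointer names nextInRowPtr and nextInColPtr come from the slides; the struct name sparseMatrixItem, the field name m, the choice of int and double, and the createItem helper are assumptions for illustration.

```c
#include <assert.h>
#include <stdlib.h>

struct sparseMatrixItem {                      /* hypothetical struct name */
    int i, j;                                  /* row and column (commodity) numbers */
    double m;                                  /* the value of m(i,j) */
    struct sparseMatrixItem *nextInRowPtr;     /* next non-zero item in the same row */
    struct sparseMatrixItem *nextInColPtr;     /* next non-zero item in the same column */
};

/* Allocate one element that is not yet linked into any list.
   Returns NULL if malloc fails. */
struct sparseMatrixItem *createItem(int i, int j, double m)
{
    struct sparseMatrixItem *item = malloc(sizeof *item);
    if (item != NULL) {
        item->i = i;
        item->j = j;
        item->m = m;
        item->nextInRowPtr = NULL;
        item->nextInColPtr = NULL;
    }
    return item;
}
```

Note that the two pointers have the same type, just as previous and next did in the bi-directional list; only the way they are connected differs.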

An Example of a Multi-Linked Structure: Orthogonal Lists for Sparse Matrices (cont’d)

MSJ-50

So here’s an illustration of a (very!) sparse 1000 x 1000 matrix having only 7 non-zero elements in it: m152,326, m168,667, m168,801, m617,713, m882,667, m882,801, and m927,713

[Figure: the sparse matrix as orthogonal lists, built up as the slide progresses. The elements are m152,326 = 0.31, m168,667 = 0.62, m168,801 = 0.38, m617,713 = 0.22, m882,667 = 0.53, m882,801 = 0.17, and m927,713 = 0.46. startOfTheMatrix points to a corner node (-1, -1); below it runs the column of row headers (152, -1), (168, -1), (617, -1), (882, -1), (927, -1); to its right runs the row of column headers (-1, 326), (-1, 667), (-1, 713), (-1, 801). The node to be inserted is m882,713 = 0.05]

Not very realistic, but it makes the artwork here a lot easier and it will still show the key issues we’ll have to deal with

Assume, for example, that you wish to know the value of m56,492

Since this data structure is not an array, you can’t simply ask C to retrieve it for you by writing m[56][492]

As discussed earlier, your code has to search for it

Where do you start? And what happens, for example, when you get to m152,326 or m927,713; where do you go next and how do you get there?

What we’re really asking here, of course, is how do we traverse this beast? It’s not obvious

The solution here is to add a column of row headers

The column is an ordered linked list, ordered by row #, linked by the nextInColPtr

A row of our sparse matrix will have a row header in this column if and only if the matrix has at least one non-zero value in the row

Now the search logic is easy:

Traverse the column of row headers by following the nextInColPtr

Each time you move down to a new row, traverse it by following the nextInRowPtr before moving down to the next row

OK, things are going pretty well so far; we can search this structure to find out if mi,j is in it

How about insertion? Let’s try inserting m882,713

Inserting the new node into the correct row list is easy enough:

Search the column of row headers to find the correct row number, 882

In the example here, row 882 already exists; but if that row header didn’t exist, it would mean that there were no non-zero elements in that row yet; so we’d create the new row by inserting a new node with row number 882 and column number of -1 into the ordered column of row headers

Once the correct row header is found or created, insert m882,713 into that row

Since the row lists are ordered by column number, the new node, m882,713, in this example, must be inserted between the nodes for m882,667 and m882,801

The problem now is finding the column list for column 713, if it even exists, which in this example it does, but of course we won’t always be inserting into existing columns

If column 713 does exist, we need to find it so that we can find this node (m617,713 in the picture), so that we can insert our new node after it by adjusting the nextInColPtr pointer here and in our new node

So how do we find this node, or figure out that the entire column doesn’t even exist yet?

One answer could be to search the entire matrix, which we can do now that we added the column of row headers, looking for entries that have the column number we want

But that seems awfully inefficient; there’s got to be a better way

The answer is to add a row of column headers to our structure

Then, after we insert the new node into the correct row list by searching the column of row headers, we can then search the row of column headers for the correct column (713 in this case), creating a new column if necessary …

… and then insert the new node into its column list, column lists being ordered by row number, of course
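The search logic described on this slide (walk down the column of row headers, then along the matching row) might look something like the sketch below. It assumes the row lists and the column of row headers are ordinary NULL-terminated ordered lists at this stage; the struct name sparseMatrixItem, the field name m, and the function name lookup are made up.

```c
#include <assert.h>
#include <stddef.h>

struct sparseMatrixItem {                      /* hypothetical struct name */
    int i, j;                                  /* a row header has j == -1 */
    double m;
    struct sparseMatrixItem *nextInRowPtr, *nextInColPtr;
};

/* Return the value of m(i,j): the stored value if the search succeeds,
   0 if it does not. */
double lookup(const struct sparseMatrixItem *startOfTheMatrix, int i, int j)
{
    assert(startOfTheMatrix != NULL);

    /* Walk down the column of row headers, which is ordered by row number. */
    const struct sparseMatrixItem *row = startOfTheMatrix->nextInColPtr;
    while (row != NULL && row->i < i)
        row = row->nextInColPtr;
    if (row == NULL || row->i != i)
        return 0.0;                            /* no header: the whole row is zero */

    /* Walk along that row, which is ordered by column number,
       starting just past the header. */
    const struct sparseMatrixItem *item = row->nextInRowPtr;
    while (item != NULL && item->j < j)
        item = item->nextInRowPtr;
    return (item != NULL && item->j == j) ? item->m : 0.0;
}
```

Because both lists are ordered, each loop can stop as soon as it passes the place where the wanted row or column would have been, so looking up m56,492 in the example gives up at the very first row header (152).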

Here is the Complete Algorithm (Pseudocode) for the Insertion of mi,j

MSJ-51

Create the node for mi,j

Row insertion:

Search the column of row headers to find the header for row i

If row i does not yet exist, create a header node for row i and insert it into the column of row headers

Insert the mi,j node into row i

Column insertion:

Search the row of column headers to find the header for column j

If column j does not yet exist, create a header node for column j and insert it into the row of column headers

Insert the mi,j node into column j

Note that in this particular problem (insertion into a sparse matrix), the order of insertion into row and column lists is irrelevant; you could do the column insertion first, followed by the row insertion, or the row insertion first followed by the column insertion; there’s no ultimate difference, since the two operations are independent of one another
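A hedged C sketch of this insertion pseudocode is given below. It assumes NULL-terminated (not yet circular) lists, a corner node (-1, -1) pointed to by start, and that m(i,j) is not already present; all names other than nextInRowPtr and nextInColPtr are invented. As a small departure from the pseudocode's ordering, both headers are located before any element link is changed, so a failed allocation cannot leave a half-inserted node.

```c
#include <assert.h>
#include <stdlib.h>

struct sparseMatrixItem {                      /* hypothetical struct name */
    int i, j;                                  /* -1 in i or j marks a header */
    double m;
    struct sparseMatrixItem *nextInRowPtr, *nextInColPtr;
};

struct sparseMatrixItem *createItem(int i, int j, double m)
{
    struct sparseMatrixItem *item = malloc(sizeof *item);
    if (item != NULL) {
        item->i = i;
        item->j = j;
        item->m = m;
        item->nextInRowPtr = item->nextInColPtr = NULL;
    }
    return item;
}

/* Search the column of row headers for row i; create the header if absent. */
struct sparseMatrixItem *rowHeaderFor(struct sparseMatrixItem *start, int i)
{
    struct sparseMatrixItem *trail = start;
    while (trail->nextInColPtr != NULL && trail->nextInColPtr->i < i)
        trail = trail->nextInColPtr;
    if (trail->nextInColPtr != NULL && trail->nextInColPtr->i == i)
        return trail->nextInColPtr;
    struct sparseMatrixItem *header = createItem(i, -1, 0.0);
    if (header != NULL) {
        header->nextInColPtr = trail->nextInColPtr;
        trail->nextInColPtr = header;
    }
    return header;
}

/* Search the row of column headers for column j; create the header if absent. */
struct sparseMatrixItem *colHeaderFor(struct sparseMatrixItem *start, int j)
{
    struct sparseMatrixItem *trail = start;
    while (trail->nextInRowPtr != NULL && trail->nextInRowPtr->j < j)
        trail = trail->nextInRowPtr;
    if (trail->nextInRowPtr != NULL && trail->nextInRowPtr->j == j)
        return trail->nextInRowPtr;
    struct sparseMatrixItem *header = createItem(-1, j, 0.0);
    if (header != NULL) {
        header->nextInRowPtr = trail->nextInRowPtr;
        trail->nextInRowPtr = header;
    }
    return header;
}

/* Insert m(i,j). Returns 1 on success, 0 if an allocation failed
   (in which case an empty header may be left behind). */
int insertElement(struct sparseMatrixItem *start, int i, int j, double m)
{
    assert(start != NULL && i >= 0 && j >= 0);
    struct sparseMatrixItem *rowHeader = rowHeaderFor(start, i);
    struct sparseMatrixItem *colHeader = colHeaderFor(start, j);
    struct sparseMatrixItem *item = createItem(i, j, m);
    if (rowHeader == NULL || colHeader == NULL || item == NULL) {
        free(item);                            /* free(NULL) is a harmless no-op */
        return 0;
    }

    /* Row insertion: row i is ordered by column number. */
    struct sparseMatrixItem *trail = rowHeader;
    while (trail->nextInRowPtr != NULL && trail->nextInRowPtr->j < j)
        trail = trail->nextInRowPtr;
    item->nextInRowPtr = trail->nextInRowPtr;
    trail->nextInRowPtr = item;

    /* Column insertion: column j is ordered by row number. */
    trail = colHeader;
    while (trail->nextInColPtr != NULL && trail->nextInColPtr->i < i)
        trail = trail->nextInColPtr;
    item->nextInColPtr = trail->nextInColPtr;
    trail->nextInColPtr = item;
    return 1;
}
```

Because every row and column list begins with a header, each of the four list insertions here always has a predecessor node to start from; that is what the headers buy us.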

Sentinel Marks

MSJ-52

[Figure: the same sparse matrix structure, now including m882,713 = 0.05, with startOfTheMatrix, the column of row headers, and the row of column headers; the “real” sparse matrix elements are distinguished from the header nodes]

Note that these header nodes are not elements of the sparse matrix from the mathematical standpoint

But to C, a node is a node is a node

The sentinel mark is some attribute of some value for some data item in the node that’s “special” and so can be used to identify a node as a sentinel or header rather than a node containing real data

Here, I used a negative column # for a row header …

… and a negative row # for a column header

Alternatively, for this example matrix, I could have set the mi,j to some negative value for both row and column headers, since any valid correlation coefficient, as defined here (0 ≤ mi,j ≤ 1), must be ≥ 0

For that matter, I could actually have used mi,j = 0 to mark a header, since no real element in this structure would have a 0 for mi,j; by definition, the only elements that are supposed to be here are those with non-zero mi,j values

In this example so far, our algorithms haven’t actually needed to identify sentinel nodes, but let’s complicate our life a little bit ;-)

Sentinel Marks (cont’d)

MSJ-53

[Figure: the same sparse matrix structure, with startOfTheMatrix and the two traversal pointers currentRowPtr and travPtr]

Let’s look in slightly more detail at how the traversal algorithm for this structure would work

We’ll need a pointer to the current row being traversed, currentRowPtr, initialized to startOfTheMatrix->nextInColPtr

And a row traversal pointer, travPtr, which starts each row at currentRowPtr->nextInRowPtr (the first real element after the row header)

And here’s a complete traversal algorithm:

    currentRowPtr = startOfTheMatrix->nextInColPtr;    /* first row header */
    while (currentRowPtr != NULL)
    {
        travPtr = currentRowPtr->nextInRowPtr;         /* step past the row header */
        while (travPtr != NULL)
        {
            visit(travPtr);
            travPtr = travPtr->nextInRowPtr;
        }
        currentRowPtr = currentRowPtr->nextInColPtr;   /* down to the next row */
    }

Sentinel Marks (cont’d)

MSJ-54

[Figure: the same sparse matrix structure, with startOfTheMatrix and a single traversal pointer, travPtr]

Now let’s do it with only a single traversal pointer ‒ no separate currentRowPtr to keep track of what row is being traversed

Starting from the startOfTheMatrix…

… we could certainly move travPtr down to a new row by setting travPtr = travPtr->nextInColPtr …

… and then traverse the new row as before by repeatedly setting travPtr = travPtr->nextInRowPtr until we reached the end (NULL)

The problem comes at the end of a row …

… we need to get travPtr back here so we can move down to the next row by following the nextInColPtr; but we have no way to get back here

No sweat; let’s just circularize the row lists!

Now at the end of a row, the normal row traversal movement follows this nextInRowPtr pointer…

… taking travPtr right back to where we need it to be so that we can move down to the next row

Of course we recognize by the sentinel mark that we’ve just arrived at a row header rather than an ordinary sparse matrix element and so we must have just completed traversing a row and so it’s time to follow the nextInColPtr to get down to a new row rather than following the nextInRowPtr again and going into an infinite loop ─ circular lists being easily prone to that ;-)
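The single-pointer traversal with circularized row lists might be sketched as follows. It assumes every row list is circular (the last element's nextInRowPtr points back at its row header), that the column of row headers is still NULL-terminated, and that a negative column number is the sentinel mark for a row header. The names sparseMatrixItem, m, isRowHeader, and traverseRowMajor are invented, and the sketch simply copies values into an array in place of a visit() call.

```c
#include <assert.h>
#include <stddef.h>

struct sparseMatrixItem {                      /* hypothetical struct name */
    int i, j;
    double m;
    struct sparseMatrixItem *nextInRowPtr, *nextInColPtr;
};

/* Sentinel mark: a row header carries a negative column number. */
int isRowHeader(const struct sparseMatrixItem *item)
{
    return item->j < 0;
}

/* Row major traversal using one pointer. Copies up to capacity element
   values into values[] and returns how many elements were visited. */
int traverseRowMajor(struct sparseMatrixItem *startOfTheMatrix,
                     double values[], int capacity)
{
    assert(startOfTheMatrix != NULL);

    int count = 0;
    struct sparseMatrixItem *travPtr = startOfTheMatrix->nextInColPtr;  /* first row header */
    while (travPtr != NULL) {
        travPtr = travPtr->nextInRowPtr;       /* step off the header into its row */
        while (!isRowHeader(travPtr)) {        /* stop when the circle brings us back */
            if (count < capacity)
                values[count] = travPtr->m;
            count++;
            travPtr = travPtr->nextInRowPtr;
        }
        travPtr = travPtr->nextInColPtr;       /* back at the header: move down a row */
    }
    return count;
}
```

The inner loop never tests for NULL; the sentinel mark is the only thing that ends a row, which is exactly why forgetting it would send the traversal around the same row forever.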

We already saw the row major traversal: We traversed the column of row headers and traversed each new row we came to

To do a column major traversal, we’d traverse the row of column headers and traverse each new column as we came to it

Most implementations would probably circularize the column lists as well as the row lists, so that the matrix could be traversed in either row major order or column major order

That’s it; the implementation of a sparse matrix, a multi-linked structure, built out of orthogonal circular lists with headers with sentinel marks

Pretty slick, no?

Summary of the Sparse Matrix

The sparse matrix implementation we just saw built a multi-linked structure out of orthogonal circular headed lists

The sparse matrix is our first (it surely will not be our last) example of a multi-linked structure, one where each element has more than one pointer component

This is a theme we will see over and over again in CS315: complex data structures being built up out of simpler data structures which in turn are built out of simpler data structures until we eventually get to something the language itself supports (pointers, in this example)

MSJ-55

Summary of the Sparse Matrix (cont’d)

There are multi-linked structures like (yes!) binary trees where the linked elements are not organized into linear lists, but let’s leave that for another day (month, actually ;-)

The reason I put this sparse matrix problem here rather than waiting until later in this course has less to do with multi-linked structures, although it is a great example, than that the sparse matrix is a pretty example (well, I think it’s pretty ;-) of an important real world problem that uses linked lists and that can’t be done without making those lists headed lists

The circularity was added mostly just to torture you – but it did eliminate the need for a separate named pointer to the current row, hardly worth the effort in this case, but under other circumstances, circularity can be more important

MSJ-56

MSJ-57

Linked Lists


Ordered Lists Bi-directional Lists Circular Lists Headed Lists

Summary

Linear List Variants and Embellishments: Summing It All Up

You can mix and match all the linked list variants discussed here as your application requires; they are completely independent and every possible combination – e.g., unordered circular uni-directional with a header, bidirectional with a header but not circular, ordered circular no header, etc, etc – has been used for some problem or other

Your job, as always, is to know what techniques are available (that’s where Gollmann slipped up) and which are called for by what types of problems

Sometimes it’s obvious – if you need to print out ordered lists both forward and backward, bi-directional lists are clearly the way to go

Sometimes it’s not – which is why there’s more to good software engineering than merely being a good code-slinger

MSJ-58

Date post:	30-Mar-2015
Category:	Documents
Upload:	payton-arrowsmith