Computer Science 121
Scientific Computing
Winter 2016

Chapter 4: Collections and Indexing

We've seen two kinds of collection:
– Array (sequence of numbers)
– Text/string (sequence of characters)

Two main issues:
– How to access individual elements of a collection
– How to group related elements together (even when their types differ)
4.1 Indexing

Consider census data for a single street:
>>> elmstreet = array([3, 5, 2, 0, 4, 5, 1])

NumPy can give us various stats about this data:
>>> sum(elmstreet)    # total residents
20
>>> mean(elmstreet)   # mean household size
2.857142857142857
>>> max(elmstreet)    # largest household size
5
>>> min(elmstreet)    # smallest household size
0
Some data may be bogus:
>>> min(elmstreet)    # smallest size
0

We need to know the bogus values and where they "live". In general, we need to know:
– Value of an element
– Position (index) of the element
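As a minimal sketch of getting both the value and the position of a suspicious entry, the snippet below uses NumPy's argmin alongside min. It imports NumPy explicitly as np (an assumption; the slides use the bare names), and the variable names smallest and position are only illustrative.

```python
import numpy as np

elmstreet = np.array([3, 5, 2, 0, 4, 5, 1])

smallest = elmstreet.min()             # value of the smallest element
position = int(np.argmin(elmstreet))   # index of its first occurrence

print(smallest, position)
```

Note that argmin reports only the first position if the minimum occurs more than once.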
4.1 Indexing: where

Boolean operators on arrays give us arrays of Booleans:
>>> elmstreet == 0
array([False, False, False, True, False, False, False], dtype=bool)

The where operator tells us the indices of the True elements:
>>> where(elmstreet == 0)
(array([3]),)
>>> where(elmstreet > 2)
(array([0, 1, 4, 5]),)
>>> where(elmstreet < 0)
(array([], dtype=int64),)
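The relationship between a Boolean array and where can be sketched as follows. This assumes NumPy is imported as np; the names mask, indices, and values are illustrative and do not appear in the slides.

```python
import numpy as np

elmstreet = np.array([3, 5, 2, 0, 4, 5, 1])

mask = elmstreet > 2           # Boolean array, one entry per household
indices = np.where(mask)[0]    # where returns a tuple; take its first item
values = elmstreet[mask]       # a Boolean array can also index directly

print(indices)
print(values)
```

Use where when the positions themselves matter, and the Boolean array alone when only the matching values are needed.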
4.1 Indexing: First and Last Elements

The first element has index 0:
>>> elmstreet
array([3, 5, 2, 0, 4, 5, 1])
>>> elmstreet[0]
3

The last element can be referenced with the special index -1:
>>> elmstreet[-1]
1
4.1 Indexing: Subsequences

We can use a list of indices instead of a single index:
>>> elmstreet[[0,2,4]]
array([3, 2, 4])
>>> elmstreet[[0,2,4]] = -1
>>> elmstreet
array([-1, 5, -1, 0, -1, 5, 1])
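One detail worth seeing in code: selecting with a list of indices produces a new array, while assigning through a list of indices changes the original. The sketch below assumes NumPy imported as np; picked is a hypothetical name.

```python
import numpy as np

elmstreet = np.array([3, 5, 2, 0, 4, 5, 1])

picked = elmstreet[[0, 2, 4]]   # selection by index list returns a copy
picked[0] = 99                  # changing the copy leaves elmstreet alone

elmstreet[[1, 5]] = -1          # assignment by index list changes elmstreet

print(picked)
print(elmstreet)
```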
4.1 Indexing: Extending an Array

Use append to add an element at the end of an array:
>>> elmstreet = append(elmstreet, 8)
>>> elmstreet
array([3, 5, 2, 0, 4, 5, 1, 8])

We can append more than one element:
>>> elmstreet = append(elmstreet, [9, 10, 11])
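The reason the result is assigned back to elmstreet is that NumPy's append builds a new array rather than growing the existing one. A small sketch, assuming NumPy imported as np and using the hypothetical name longer:

```python
import numpy as np

elmstreet = np.array([3, 5, 2, 0, 4, 5, 1])

longer = np.append(elmstreet, [9, 10, 11])   # new array with the extra values

print(len(elmstreet))   # the original array is unchanged
print(longer)
```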
Fibonacci Redux

With arrays, we only need a single variable and a single line (versus three) to do Fibonacci:
>>> fib = arange(2)
>>> fib
array([0, 1])
>>> fib = append(fib, fib[-1] + fib[-2])
>>> fib = append(fib, fib[-1] + fib[-2])
>>> fib = append(fib, fib[-1] + fib[-2])
>>> fib = append(fib, fib[-1] + fib[-2])
>>> fib
array([0, 1, 1, 2, 3, 5])
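The repeated line above can be wrapped in a loop. This is only a sketch of the same idea, assuming NumPy imported as np; the loop count of 4 matches the four appends shown.

```python
import numpy as np

fib = np.arange(2)                           # starts as [0, 1]
for _ in range(4):                           # repeat the single line four times
    fib = np.append(fib, fib[-1] + fib[-2])  # next term = sum of the last two

print(fib)
```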
4.2: 2D Arrays, a.k.a. Matrices

Lots of data are best represented as tables.

We can store such data in a matrix:
>>> elmstreet = array([[3,2,1,35000],\
                       [5,2,3,41000],\
                       [2,1,1,25000],\
                       [2,2,0,56000],\
                       [4,2,2,62000],\
                       [5,3,2,83000],\
                       [1,1,0,52000]])

Backslash says "continued on next row..." (inside brackets, Python continues the line even without it).
Household index is implicit (as row number).
Like the len operator for 1D arrays (a.k.a. vectors), the shape operator reports the size of a matrix:
>>> shape(elmstreet)
(7, 4)

With matrices, we use two indices (instead of one) for referencing values:
>>> elmstreet[2,3]
25000
>>> elmstreet[3,2]
0
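A short sketch of shape and two-index access, assuming NumPy imported as np. The names rows, cols, income_house2, and children_house3 are illustrative; the column meanings (column 2 is children, column 3 is income) follow the later slides in this chapter.

```python
import numpy as np

elmstreet = np.array([[3, 2, 1, 35000],
                      [5, 2, 3, 41000],
                      [2, 1, 1, 25000],
                      [2, 2, 0, 56000],
                      [4, 2, 2, 62000],
                      [5, 3, 2, 83000],
                      [1, 1, 0, 52000]])

rows, cols = elmstreet.shape       # shape is a (rows, columns) tuple
income_house2 = elmstreet[2, 3]    # row index first, then column index
children_house3 = elmstreet[3, 2]

print(rows, cols, income_house2, children_house3)
```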
As with 1D, we can access part of a matrix by using an array of indices:
>>> elmstreet[[3,4,6], 3]
array([56000, 62000, 52000])

Grab a whole row using colon notation:
>>> elmstreet[0,:]    # whole first row
array([3, 2, 1, 35000])

This also works for columns:
>>> elmstreet[:, 0]   # whole first col
array([3, 5, 2, 2, 4, 5, 1])
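Once a whole row or column has been pulled out with colon notation, it behaves like any 1D array, so the statistics from section 4.1 apply directly. A sketch, assuming NumPy imported as np; the names first_row, children_total, and top_income are hypothetical.

```python
import numpy as np

elmstreet = np.array([[3, 2, 1, 35000],
                      [5, 2, 3, 41000],
                      [2, 1, 1, 25000],
                      [2, 2, 0, 56000],
                      [4, 2, 2, 62000],
                      [5, 3, 2, 83000],
                      [1, 1, 0, 52000]])

first_row = elmstreet[0, :]               # everything about household 0
children_total = elmstreet[:, 2].sum()    # add up the children column
top_income = elmstreet[:, 3].max()        # largest value in the income column

print(first_row, children_total, top_income)
```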
As with a vector, we can do operations on a scalar and a matrix:
>>> elmstreet*2
array([[6,4,2,70000],
       [10,4,6,82000],
       [4,2,2,50000],
       [4,4,0,112000],
       [8,4,4,124000],
       [10,6,4,166000],
       [2,2,0,104000]])

... and element-by-element on two matrices:
>>> a = array([[1,2,3],[4,5,6],[7,8,9]])
>>> b = array([[2,4,6],[0,1,0],[0,3,1]])
>>> a + b
array([[ 3, 6, 9],
       [ 4, 6, 6],
       [ 7, 11, 10]])
>>> a * b
array([[ 2, 8, 18],
       [ 0, 5, 0],
       [ 0, 24, 9]])

Of course, the matrices must be the same size:
>>> a + array([[1,2],[0,5]])
ValueError: operands could not be broadcast together with shapes (3,3) (2,2)
And your socks don't match either.
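In a script (as opposed to the interactive prompt), a size mismatch like the one above stops the program unless it is handled. The sketch below shows one way to check shapes first and to catch the error; it assumes NumPy imported as np, and the names small, same_shape, and mismatch are illustrative.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
small = np.array([[1, 2], [0, 5]])

same_shape = a.shape == small.shape   # compare the (rows, cols) tuples

try:
    a + small                         # element-by-element add needs matching sizes
    mismatch = False
except ValueError as err:
    mismatch = True
    print("cannot add:", err)
```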
We can get a lot of mileage by combining colon and other operations:
>>> children = elmstreet[:, 2]
>>> children
array([1, 3, 1, 0, 2, 2, 0])
>>> nokidshouses = where(children == 0)
>>> nokidshouses
(array([3, 6]),)
>>> incomenokids = elmstreet[nokidshouses, 3]
>>> incomenokids
array([[56000, 52000]])
>>> mean(incomenokids)
54000.0
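The same question (mean income of households with no children) can also be answered by indexing the rows with the Boolean array directly, which gives a plain 1D result. This is an alternative sketch rather than the slides' method; it assumes NumPy imported as np.

```python
import numpy as np

elmstreet = np.array([[3, 2, 1, 35000],
                      [5, 2, 3, 41000],
                      [2, 1, 1, 25000],
                      [2, 2, 0, 56000],
                      [4, 2, 2, 62000],
                      [5, 3, 2, 83000],
                      [1, 1, 0, 52000]])

children = elmstreet[:, 2]
incomenokids = elmstreet[children == 0, 3]   # rows chosen by a Boolean array

print(incomenokids)
print(incomenokids.mean())
```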
We can get rows and cols at the same time with where:
>>> r,c = where(elmstreet > 3)
>>> r
array([0, 1, 1, 2, 3, 4, 4, 5, 5, 6])
>>> c
array([3, 0, 3, 3, 3, 0, 3, 0, 3, 3])

To combine Boolean conditions, use the logical_ functions (logical_and, logical_or, logical_not):
>>> r,c = where(logical_and(elmstreet > 3, elmstreet <= 5))
>>> r
array([1,4,5])
>>> c
array([0,0,0])
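The row and column arrays returned by where line up position by position, so they can be paired into (row, col) cells or used together as indices. A sketch, assuming NumPy imported as np; in_range, cells, and values are hypothetical names.

```python
import numpy as np

elmstreet = np.array([[3, 2, 1, 35000],
                      [5, 2, 3, 41000],
                      [2, 1, 1, 25000],
                      [2, 2, 0, 56000],
                      [4, 2, 2, 62000],
                      [5, 3, 2, 83000],
                      [1, 1, 0, 52000]])

in_range = np.logical_and(elmstreet > 3, elmstreet <= 5)
r, c = np.where(in_range)

cells = list(zip(r.tolist(), c.tolist()))   # pair each row with its column
values = elmstreet[r, c]                    # the matching values themselves

print(cells)
print(values)
```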
4.3 Dictionaries: Mixed Data Types, with Names as Indices

Dictionaries (a.k.a. Data Structures) allow us to put different types of data into the same collection:
>>> pt = {}
>>> pt["x"] = 3
>>> pt["name"] = "Henry"
>>> pt
{'x': 3, 'name': 'Henry'}
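A few more everyday dictionary operations, sketched with the same pt example. The extra key "y" and the names has_name and keys are illustrative additions, not part of the slides.

```python
pt = {"x": 3, "name": "Henry"}   # same contents, written as a literal

pt["y"] = 4                      # add another named entry (hypothetical key)
has_name = "name" in pt          # test whether a key is present
keys = list(pt.keys())           # keys come back in insertion order

print(pt["name"], has_name, keys)
```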
4.3 Arrays of arbitrary types: Lists

Arrays are great for numerical computing. For other types of elements in a sequence, use "bare" lists:
>>> friends = ["Sally", "Bob", "Jane"]
>>> friends
['Sally', 'Bob', 'Jane']
>>> friends[2]
'Jane'

Unlike arrays, lists can mix types (not recommended):
>>> stuff = [3.14159, "Einstein", arange(5)]
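One way to see why arrays suit numerical work and lists suit everything else is to compare how each responds to the same operator. The sketch below assumes NumPy imported as np; the name "Raj" and the numbers in sizes are made up for illustration.

```python
import numpy as np

friends = ["Sally", "Bob", "Jane"]
friends.append("Raj")             # lists grow in place (hypothetical name)
last = friends[-1]                # -1 indexes the last element, as with arrays

sizes = [3, 5, 2]
doubled_list = sizes * 2          # on a list, * repeats the sequence
doubled_array = np.array(sizes) * 2   # on an array, * multiplies each element

print(last, doubled_list, doubled_array)
```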