
CS1512 1 CS1512 Foundations of Computing Science 2 (Theoretical part) Kees van Deemter Probability...

Date post: 28-Mar-2015
Upload: emily-gomez
CS1512 CS1512 1 CS1512 Foundations of Computing Science 2 (Theoretical part) Kees van Deemter Probability and statistics Propositional Logic Elementary set theory © J R W Hunter, 2006; C J van Deemter 2007
Transcript

CS1512 Foundations of Computing Science 2 (Theoretical part)

Kees van Deemter

Probability and statistics
Propositional Logic
Elementary set theory

© J R W Hunter, 2006; C J van Deemter 2007

What this is going to be about

1. Suppose you know that the statement p is true and the statement q is true. What can you say about the statement "p and q"?
In this case, "p and q" is also true.

2. Suppose you know that the statement p has a probability of 0.5 and the statement q has a probability of 0.5. What can you say about the statement "p and q"?

It depends! If p and q are independent of each other, then you know that "p and q" has a probability of 0.25.
But suppose (1) p = It will snow (some time) tomorrow and q = It will be below zero (some time) tomorrow.
Then "p and q" has a probability > 0.25.

But suppose instead (2) p = It will snow (some time) tomorrow and q = It will not snow (any time) tomorrow.
Then "p and q" has a probability of 0.
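
The "it depends" answer can be made concrete. The following is a minimal sketch, not part of the original slides: it computes the probability of "p and q" under independence, and the smallest and largest values that probability can take when only the two individual probabilities are known. The function names are hypothetical.

```python
from fractions import Fraction


def prob_and_if_independent(prob_p, prob_q):
    """P(p and q) when p and q are independent: the product rule."""
    return prob_p * prob_q


def prob_and_bounds(prob_p, prob_q):
    """Smallest and largest possible P(p and q), given only P(p) and P(q)."""
    lower = max(Fraction(0), prob_p + prob_q - 1)
    upper = min(prob_p, prob_q)
    return lower, upper


half = Fraction(1, 2)
print(prob_and_if_independent(half, half))  # 1/4
lower, upper = prob_and_bounds(half, half)
print(lower, upper)                         # 0 1/2
```

Example (2), where q is the negation of p, sits at the lower bound 0. Example (1), snow and below-zero temperatures, lies somewhere between the independent value 0.25 and the upper bound 0.5.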

Before we get there ...

Some basic concepts in statistics
• different kinds of data
• ways of representing data
• ways of summarising data

This will be useful later in your CS career. For example, to
• assess whether a computer simulation is accurate
• assess whether one user interface is more user-friendly than another
• estimate the expected run time of a computer program (on typical data)

Lecture slides on statistics and probability are based on originals by Professor Jim Hunter.

CS1512 Foundations of Computing Science 2

Lecture 1

Probability and statistics (1)

© J R W Hunter, 2006; C J van Deemter 2007

Sources

Text book (parts of chapters 1-6):
Essential Statistics (Fourth Edition)
D. G. Rees
Chapman and Hall, 2001
(Blackwells, ~£28)

Courses:
ST1505
Mathematical Sciences
http://maths.abdn.ac.uk/~ap/st1505/

CS1012
Sets

Some definitions

Sample space (population)
• set of entities of interest, also called elements
• this set may be infinite
• entities can be physical objects, events, etc.

Sample
• subset of the sample space

More definitions

Variable
• an attribute of an element which has a value (e.g., its height, weight, etc.)

Observation
• the value of a variable as recorded for a particular element
• an element will have variables with values, but they are not observations until we record them

Sample data
• set of observations derived from a sample

Descriptive and Inferential Statistics

Descriptive statistics:

• Summarising the sample data (as a number, graphic ...)

Inferential statistics:

• Using data from a sample to infer properties of the sample space
• Choose a ‘representative sample’ (properties of the sample match those of the sample space; this is difficult)
• In practice, use a ‘random sample’ (each element has the same likelihood of being chosen)
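
As an illustration of drawing a random sample, here is a minimal sketch using Python's standard `random` module. The sample space of 100 numbered elements, the sample size, and the function name are all assumptions made for the example; the slides do not prescribe any particular method.

```python
import random


def draw_random_sample(sample_space, sample_size, seed=None):
    """Draw sample_size distinct elements, each equally likely to be chosen."""
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    return rng.sample(list(sample_space), sample_size)


# Hypothetical sample space: elements identified by the numbers 1..100.
population = range(1, 101)
sample = draw_random_sample(population, 10, seed=42)
print(sample)
```

`random.sample` draws without replacement, so no element appears twice in the sample.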

Variable types

Qualitative:
• Nominal/Categorical (no ordering in values)
  e.g. sex, occupation
• Ordinal (ranked)
  e.g. class of degree (1, 2.1, 2.2, ...)

Quantitative:
• Discrete (countable) – [integer]
  e.g. number of people in a room
• Continuous – [double]
  e.g. height

Examples

1. A person’s marital status

2. The length of a CD

3. The size of a litter of piglets

4. The temperature in degrees centigrade

Examples

1. A person’s marital status
   Nominal/categorical

2. The length of a CD
   Quantitative; continuous or discrete? This depends on how you model length (minutes or bits)

3. The size of a litter of piglets
   Quantitative, discrete (if we mean the number of pigs)

4. The temperature in degrees centigrade
   Quantitative, continuous (even though it does not make sense to say that 20° is twice as warm as 10°)

Footnote: We use the term ‘continuous’ a bit loosely: for us, a variable is continuous/dense (as opposed to discrete) if between any two values x and y there lies a third value z.

Summarising data

Categorical (one variable):

• X is a categorical variable with values: a_1, a_2, a_3, ... a_k, ... a_K (k = 1, 2, 3, ... K)

• f_k = number of times that a_k appears in the sample
  f_k is the frequency of a_k

• if we have n observations then:
  relative frequency = frequency / n

• percentage relative frequency = relative frequency × 100

Frequency

Blood Type   Frequency   Relative Frequency   Percentage RF
A            210         0.37                  37%
AB            35         0.06                   6%
B             93         0.16                  16%
O            234         0.41                  41%
Totals       572         1.00                 100%

sample of 572 patients (n = 572)

sum of frequencies = n
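
The frequency table can be recomputed from the four counts alone. The sketch below is an illustration rather than part of the slides: it derives n, the relative frequencies and the percentage relative frequencies using the definitions given earlier. The function name and the dictionary layout are assumptions.

```python
# Frequencies f_k taken from the blood type table.
blood_type_counts = {"A": 210, "AB": 35, "B": 93, "O": 234}


def frequency_table(counts):
    """Map each value to (frequency, relative frequency, percentage RF)."""
    n = sum(counts.values())  # the sum of the frequencies equals n
    return {value: (f, f / n, 100 * f / n) for value, f in counts.items()}


table = frequency_table(blood_type_counts)
for value, (f, rel, pct) in table.items():
    print(f"{value:<3}{f:>5}{rel:>7.2f}{pct:>6.0f}%")
```

Rounded to two decimals, the computed relative frequencies agree with the table.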

Bar Chart

Summarising data

Categorical (two variables):
• contingency table
• number of patients with blood type A who are female is 108

Blood Type      Sex             Totals   % Blood Type by sex
              male   female              male   female
A             102    108        210      49%    51%
AB             12     23         35      34%    66%
B              46     47         93      50%    50%
O             120    114        234      51%    49%
Totals        280    292        572
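
The row and column totals and the percentages by sex can be derived from the eight cell counts. The sketch below is illustrative only; the nested-dictionary layout and the function names are assumptions. Computed exactly, blood type B comes out at about 49.5% male and 50.5% female, which the table shows as 50% and 50%.

```python
# Cell counts from the contingency table: blood type by sex.
contingency = {
    "A": {"male": 102, "female": 108},
    "AB": {"male": 12, "female": 23},
    "B": {"male": 46, "female": 47},
    "O": {"male": 120, "female": 114},
}


def row_percentages(table):
    """For each row, express every cell as a percentage of the row total."""
    result = {}
    for row, cells in table.items():
        total = sum(cells.values())
        result[row] = {col: 100 * count / total for col, count in cells.items()}
    return result


def column_totals(table):
    """Sum each column over all rows."""
    totals = {}
    for cells in table.values():
        for col, count in cells.items():
            totals[col] = totals.get(col, 0) + count
    return totals


print(column_totals(contingency))
for row, pcts in row_percentages(contingency).items():
    print(row, {col: round(p, 1) for col, p in pcts.items()})
```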

Bar Chart

Ordinal data

• X is an ordinal variable with values: a_1, a_2, a_3, ... a_k, ... a_K

• ‘ordinal’ means that:
  a_1 ≤ a_2 ≤ a_3 ≤ ... ≤ a_k ≤ ... ≤ a_K

• cumulative frequency at level k:
  c_k = sum of frequencies of values less than or equal to a_k
  c_k = f_1 + f_2 + f_3 + ... + f_k
      = (f_1 + f_2 + f_3 + ... + f_{k-1}) + f_k
      = c_{k-1} + f_k

• Can be applied to quantitative data as well ...

Cumulative frequencies

Number of piglets in a litter (discrete data):

c_1 = f_1 = 1, c_2 = f_1 + f_2 = 1, c_3 = f_1 + f_2 + f_3 = 3, c_4 = f_1 + f_2 + f_3 + f_4 = 6, etc.

Litter size   Frequency = f   Cum. Freq. = c
 5            1                1
 6            0                1
 7            2                3
 8            3                6
 9            3                9
10            9               18
11            8               26
12            5               31
13            3               34
14            2               36
Total        36

c_K = n
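
The cumulative frequencies are running sums of the frequencies, which is the recurrence c_k = c_{k-1} + f_k. As a sketch (the variable and function names are assumptions), Python's `itertools.accumulate` computes them directly from the piglet data:

```python
from itertools import accumulate

# Litter sizes a_k and their frequencies f_k, from the piglet table.
litter_sizes = [5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
frequencies = [1, 0, 2, 3, 3, 9, 8, 5, 3, 2]


def cumulative_frequencies(freqs):
    """Running sums: c_k = f_1 + ... + f_k."""
    return list(accumulate(freqs))


cumulative = cumulative_frequencies(frequencies)
for size, f, c in zip(litter_sizes, frequencies, cumulative):
    print(f"{size:>2} {f:>2} {c:>2}")
```

The last cumulative frequency equals the number of observations, n = 36.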

Plotting

[Figure: two plots, frequency and cumulative frequency]

Continuous data

• A way to obtain discrete numbers from continuous data: divide the range of observations into non-overlapping intervals (bins)

• Count the number of observations in each bin

• Enzyme concentration data in 30 observations:

121  25  83 110  60 101  95  81 123  67
113  78  85 145 100  70  93 118 119  57
 64 151  48  92  62 104 139 201  68  95

Range: 25 to 201
For example, you can use 10 bins of width 20:

Enzyme concentrations

Concentration         Freq.   Rel. Freq.   % Cum. Rel. Freq.
 19.5 ≤ c <  39.5       1      0.033          3.3%
 39.5 ≤ c <  59.5       2      0.067         10.0%
 59.5 ≤ c <  79.5       7      0.233         33.3%
 79.5 ≤ c <  99.5       7      0.233         56.6%
 99.5 ≤ c < 119.5       7      0.233         79.9%
119.5 ≤ c < 139.5       3      0.100         89.9%
139.5 ≤ c < 159.5       2      0.067         96.6%
159.5 ≤ c < 179.5       0      0.000         96.6%
179.5 ≤ c < 199.5       0      0.000         96.6%
199.5 ≤ c < 219.5       1      0.033        100.0%
Totals                 30      1.000
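
The binning can be reproduced from the 30 observations. The sketch below is an illustration under the bin layout used in the table (first bin starting at 19.5, width 20, 10 bins); the function name is an assumption. Computed from the cumulative counts, some percentages differ from the table in the last digit (for example 17/30 = 56.7% rather than 56.6%), because the table adds up relative frequencies that were already rounded.

```python
# The 30 enzyme concentration observations from the slides.
observations = [
    121, 25, 83, 110, 60, 101, 95, 81, 123, 67,
    113, 78, 85, 145, 100, 70, 93, 118, 119, 57,
    64, 151, 48, 92, 62, 104, 139, 201, 68, 95,
]


def bin_counts(values, start, width, num_bins):
    """Count the values in each interval [start + k*width, start + (k+1)*width)."""
    counts = [0] * num_bins
    for v in values:
        index = int((v - start) // width)
        if 0 <= index < num_bins:  # values outside all bins are ignored
            counts[index] += 1
    return counts


freqs = bin_counts(observations, start=19.5, width=20, num_bins=10)
n = len(observations)
running = 0
for k, f in enumerate(freqs):
    running += f
    low = 19.5 + 20 * k
    print(f"{low:>5.1f} <= c < {low + 20:>5.1f}  {f:>2}  {f / n:.3f}  {100 * running / n:5.1f}%")
```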

Cumulative histogram

