
IAM 530 ELEMENTS OF PROBABILITY AND STATISTICS

Date post: 31-Dec-2015
Upload: veda-briggs
  • IAM 530 ELEMENTS OF PROBABILITY AND STATISTICS: INTRODUCTION

  • WHAT IS STATISTICS?

    Statistics is the science of collecting data, organizing and describing it, and drawing conclusions from it. That is, statistics is a way to get information from data. It is the science of uncertainty.

  • WHAT IS STATISTICS?

    Examples:

      • A pharmaceutical CEO wants to know whether a new drug is superior to existing drugs, or what its possible side effects are.
      • How fuel-efficient is a certain car model?
      • Is there any relationship between your GPA and employment opportunities?
      • Actuaries want to identify risky customers for insurance companies.

  • STEPS OF STATISTICAL PRACTICE

      • Preparation: Set clearly defined goals and questions of interest for the investigation.
      • Data collection: Plan which data to collect and how to collect it.
      • Data analysis: Apply appropriate statistical methods to extract information from the data.
      • Data interpretation: Interpret the information and draw conclusions.

  • STATISTICAL METHODS

      • Descriptive statistics covers the collection, presentation and description of numerical data.
      • Inferential statistics covers making inferences and decisions by applying appropriate statistical methods to the collected data.
      • Model building covers developing prediction equations to understand a complex system.

  • BASIC DEFINITIONS

      • POPULATION: The collection of all items of interest in a particular study.
      • VARIABLE: A characteristic of interest about each element of a population or sample.
      • STATISTIC: A descriptive measure of a sample.
      • SAMPLE: A set of data drawn from the population; a subset of the population available for observation.
      • PARAMETER: A descriptive measure of the population, e.g., the mean.

  • EXAMPLE

    Population                                  Unit        Sample            Variables
    All students currently enrolled in school   Student     Any department    GPA; hours of work per week
    All books in library                        Book        Statistics books  Replacement cost; frequency of check-out; repair needs
    All campus fast food restaurants            Restaurant  Burger King       Number of employees; seating capacity; hiring/not hiring

    Note that some samples are not representative of the population and shouldn't be used to draw conclusions about the population.

  • How not to run a presidential poll

    For the 1936 election, the Literary Digest picked names at random out of telephone books in some cities and sent these people some ballots, attempting to predict the election results, Roosevelt versus Landon, by the returns. Now, even if 100% returned the ballots, even if all told how they really felt, even if all would vote, even if none would change their minds by election day, still this method could be (and was) in trouble: They estimated a conditional probability in that part of the American population which had phones and showed that that part was not typical of the total population. [Dudewicz & Mishra, 1988]

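  • STATISTIC

    A statistic (or estimator) is any function of the random variables (r.v.s) in a random sample (r.s.) that does not contain any unknown quantity. [The example expressions of statistics and non-statistics on this slide are missing from the transcript.]

    Any observed or particular value of an estimator is an estimate.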
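A small illustration of the definition, assuming a random sample $X_1,\dots,X_n$ from a distribution whose mean $\mu$ and standard deviation $\sigma$ are unknown. The symbols and examples here are assumptions chosen for illustration; they are not necessarily the expressions shown on the original slide.

```latex
% Statistics: computable from the sample alone
\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i, \qquad
\sum_{i=1}^{n} X_i, \qquad
\min_{1 \le i \le n} X_i

% Not statistics: each contains an unknown parameter
\bar{X} - \mu, \qquad
\frac{\bar{X} - \mu}{\sigma/\sqrt{n}}
```

Once data are observed, for example $x_1 = 2$, $x_2 = 4$, $x_3 = 6$, the value $\bar{x} = 4$ is an estimate.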

  • RANDOM VARIABLES

      • Variables whose observed value is determined by chance.
      • A r.v. is a function defined on the sample space S that associates a real number with each outcome in S.
      • R.v.s are denoted by uppercase letters, and their observed values by lowercase letters.

    Example: Consider the random variable X, the number of brown-eyed children born to a couple heterozygous for eye color (each with genes for both brown and blue eyes). If the couple is assumed to have 2 children, X can assume any of the values 0, 1, or 2. The variable is random in that brown eyes depend on the chance inheritance of a dominant gene at conception. If for a particular couple there are two brown-eyed children, we have x = 2.
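The brown-eyed children example can be sketched in code by writing X literally as a function on a sample space. This is a minimal sketch under a simplifying assumption that is not stated on the slide: each parent passes on the brown allele "B" or the blue allele "b" with equal chance, independently for each child. The names `ALLELES`, `is_brown` and `X` are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Assumption: each parent passes on "B" (brown, dominant) or "b" (blue)
# with equal chance, independently for every child.
ALLELES = ("B", "b")

def is_brown(genotype):
    """A child is brown-eyed if at least one inherited allele is B."""
    return "B" in genotype

# Sample space S: one (allele from mother, allele from father) pair per child.
child_genotypes = list(product(ALLELES, repeat=2))       # 4 outcomes per child
sample_space = list(product(child_genotypes, repeat=2))  # 16 outcomes for 2 children

def X(outcome):
    """The random variable: number of brown-eyed children in an outcome."""
    return sum(is_brown(child) for child in outcome)

# Distribution of X, treating all 16 outcomes as equally likely.
counts = Counter(X(outcome) for outcome in sample_space)
distribution = {x: Fraction(counts[x], len(sample_space)) for x in sorted(counts)}
print(distribution)
```

Under this assumption X takes the values 0, 1 and 2 with probabilities 1/16, 6/16 and 9/16.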

  • COLLECTING DATA

      • Target Population: The population about which we want to draw inferences.
      • Sampled Population: The actual population from which the sample has been taken.

  • SAMPLING PLAN

      • Simple Random Sample (SRS): All possible samples with the same number of observations are equally likely to be selected.
      • Stratified Sampling: The population is separated into mutually exclusive sets (strata), and then a simple random sample is drawn from each stratum.
      • Convenience Sample: Obtained by selecting individuals or objects without systematic randomization.
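The first two sampling plans can be sketched with Python's standard `random` module. The population of student IDs, the stratum names and the sample sizes are hypothetical and chosen only for illustration.

```python
import random

def simple_random_sample(population, n, rng):
    """Draw n units so that every subset of size n is equally likely."""
    return rng.sample(population, n)

def stratified_sample(strata, n_per_stratum, rng):
    """Draw a simple random sample separately within each stratum."""
    return {
        name: rng.sample(units, n_per_stratum[name])
        for name, units in strata.items()
    }

rng = random.Random(530)  # fixed seed so the sketch is repeatable

# Hypothetical population: 60 student IDs split into two departments (strata).
strata = {
    "statistics": [f"S{i:02d}" for i in range(40)],
    "economics": [f"E{i:02d}" for i in range(20)],
}
population = [unit for units in strata.values() for unit in units]

srs = simple_random_sample(population, 6, rng)
stratified = stratified_sample(strata, {"statistics": 4, "economics": 2}, rng)
print(srs)
print(stratified)
```

A simple random sample may by chance contain no unit from a small stratum, whereas the stratified plan fixes the number drawn from each stratum in advance.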


  • EXAMPLE

    A manufacturer of computer chips claims that less than 10% of his products are defective. When 1000 chips were drawn from a large production run, 7.5% were found to be defective.

      • What is the population of interest? The complete production run of the computer chips.
      • What is the sample? The 1000 chips.
      • What is the parameter? The proportion of all chips that are defective.
      • What is the statistic? The proportion of sample chips that are defective.
      • Does the value 10% refer to a parameter or a statistic? A parameter.
      • Explain briefly how the statistic can be used to make inferences about the parameter to test the claim. Because the sample proportion (7.5%) is less than 10%, we can conclude that the claim may be true.
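One common way to formalize "the claim may be true" is a one-sided z-test for a proportion. The slide does not name a test, so the sketch below is an assumption. It uses the large-sample normal approximation and takes 10% as the boundary value of the claim; the function name `one_sided_z_test` is illustrative.

```python
import math

def one_sided_z_test(p_hat, p0, n):
    """Return (z, p-value) for testing H0: p >= p0 against H1: p < p0.

    Uses the large-sample normal approximation to the sample proportion.
    """
    standard_error = math.sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / standard_error
    # Standard normal CDF at z, written with the error function.
    p_value = 0.5 * (1 + math.erf(z / math.sqrt(2)))
    return z, p_value

# Values from the example: 7.5% defective in a sample of 1000, claim "less than 10%".
z, p_value = one_sided_z_test(p_hat=0.075, p0=0.10, n=1000)
print(f"z = {z:.2f}, one-sided p-value = {p_value:.4f}")
```

A small p-value means a sample proportion this low would be unlikely if the true defective rate were 10% or more, which supports the manufacturer's claim.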

  • DESCRIPTIVE STATISTICS

    Descriptive statistics involves the arrangement, summary, and presentation of data to enable meaningful interpretation and to support decision making.

    Descriptive statistics methods make use of

      • graphical techniques
      • numerical descriptive measures.

    The methods presented apply both to

      • the entire population
      • the sample.

  • Types of data and information

    A variable is a characteristic of a population or sample that is of interest to us. Examples:

      • Cereal choice
      • Expenditure
      • The waiting time for medical services

    Data are the observed values of variables:

      • Interval and ratio data are numerical observations. In ratio data, the ratio of two observations is meaningful and the value 0 has a clear interpretation as "none". Example of ratio data: weight; example of interval data: temperature.
      • Nominal data are categorical observations.
      • Ordinal data are ordered categorical observations.

  • Types of data examples

      • Quantitative
          • Continuous: blood pressure, height, weight, age
          • Discrete: number of children; number of attacks of asthma per week
      • Categorical (Qualitative)
          • Ordinal (ordered categories): grade of breast cancer; better, same, worse; disagree, neutral, agree
          • Nominal (unordered categories): sex (male/female); alive or dead; blood group O, A, B, AB

  • Types of data analysis

    Knowing the type of data is necessary to properly select the technique to be used when analyzing data.

    Types of descriptive analysis allowed for each type of data:

      • Numerical data: arithmetic calculations
      • Nominal data: counting the number of observations in each category
      • Ordinal data: computations based on an ordering process
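As a rough sketch of how the data type determines the computation, the snippet below applies one typical operation to each type. All three data sets are hypothetical, and the median is used as one example of a computation based on ordering.

```python
from collections import Counter
from statistics import mean

# Numerical data: arithmetic is meaningful (hypothetical incomes).
incomes = [75000, 68000, 52000]
average_income = mean(incomes)

# Nominal data: only counting per category is meaningful (hypothetical statuses).
marital_status = ["married", "single", "single"]
status_counts = Counter(marital_status)

# Ordinal data: use the ordering of the categories, e.g. to find the median response.
ORDER = ["disagree", "neutral", "agree"]
responses = ["agree", "disagree", "neutral", "agree", "neutral"]
ranks = sorted(ORDER.index(response) for response in responses)
median_response = ORDER[ranks[len(ranks) // 2]]  # middle of an odd-length list

print(average_income, dict(status_counts), median_response)
```

Averaging the category labels themselves would be meaningless for the nominal and ordinal examples, which is why each type gets its own kind of summary.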

  • Types of data - examples

    Numerical data

        Age   Income
        55    75000
        42    68000
        ...   ...

        Weight gain
        +10
        +5
        ...

    Nominal data

        Person  Marital status
        1       married
        2       single
        3       single
        ...     ...

        Computer  Brand
        1         IBM
        2         Dell
        3         IBM
        ...       ...

  • Types of data - examples

    Numerical data

        Age   Income
        55    75000
        42    68000
        ...   ...

        Weight gain
        +10
        +5
        ...

    Nominal data

    A descriptive statistic for nominal data is the proportion of data that falls into each category.

                    IBM   Dell  Compaq  Other  Total
        Count       25    11    8       6      50
        Proportion  50%   22%   16%     12%
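The proportions in the brand table can be reproduced by counting categories. This is a minimal sketch; the raw list `brands` is a hypothetical data set built only to match the counts in the table.

```python
from collections import Counter

# Hypothetical raw observations that match the counts in the table.
brands = ["IBM"] * 25 + ["Dell"] * 11 + ["Compaq"] * 8 + ["Other"] * 6

counts = Counter(brands)
total = sum(counts.values())
proportions = {brand: count / total for brand, count in counts.items()}

for brand, count in counts.items():
    print(f"{brand:<7} {count:>3} {proportions[brand]:.0%}")
```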

  • Cross-Sectional/Time-Series/Panel Data

      • Cross-sectional data are collected at a certain point in time, e.g.:
          • Test scores in a statistics course
          • Starting salaries of an MBA program's graduates
      • Time series data are collected over successive points in time, e.g.:
          • Weekly closing price of gold
          • Amount of crude oil imported monthly
      • Panel data are collected over successive points in time as well.

  • Differences

      • Change in time
          • Cross-sectional: cannot measure
          • Time series: can measure
          • Panel: can measure
      • Properties of the series
          • Cross-sectional: no series
          • Time series: long; usually just one or a few series
          • Panel: short; hundreds of series
      • Measurement time
          • Cross-sectional: measurement only at one time point; even if more than one time point, samples are independent from each other
          • Time series: usually at regular time points (all series are taken at the same time points and time points are equally spaced)
          • Panel: varies
      • Measurements
          • Cross-sectional: response(s); time-independent covariates
          • Time series: response(s); time; usually no covariate
          • Panel: response(s); time; time-dependent and time-independent covariates

  • GAMES OF CHANCE

  • COUNTING TECHNIQUES

    Methods to determine how many subsets can be obtained from a set of objects are called counting techniques.

    FUNDAMENTAL THEOREM OF COUNTING

    If a job consists of k separate tasks, the i-th of which can be done in $n_i$ ways, $i = 1, 2, \ldots, k$, then the entire job can be done in $n_1 \times n_2 \times \cdots \times n_k$ ways.
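The theorem can be checked by brute force on a small case. In the sketch below the "job" is choosing a three-course meal; the menu is a hypothetical example, not taken from the slides.

```python
from itertools import product
from math import prod

# Hypothetical job with k = 3 tasks; each inner list holds the n_i options of task i.
tasks = [
    ["soup", "salad"],          # n_1 = 2
    ["beef", "fish", "pasta"],  # n_2 = 3
    ["cake", "fruit"],          # n_3 = 2
]

# Count predicted by the theorem: n_1 x n_2 x ... x n_k.
predicted = prod(len(options) for options in tasks)

# Brute-force enumeration of every way to do the whole job.
all_jobs = list(product(*tasks))

print(predicted, len(all_jobs))
```

Both numbers equal 12 here, since 2 x 3 x 2 = 12.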

  • THE FACTORIAL

    n! is the number of ways in which n objects can be permuted.

    $n! = n(n-1)(n-2)\cdots 2 \cdot 1$, with $0! = 1$ and $1! = 1$.

    Example: The possible permutations of {1,2,3} are (1,2,3), (1,3,2), (3,1,2), (2,1,3), (2,3,1), (3,2,1). So, there are 3! = 6 different permutations.
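The permutations of {1,2,3} can be listed with the standard library and compared against the factorial. This is a minimal sketch using `itertools.permutations` and `math.factorial`.

```python
from itertools import permutations
from math import factorial

items = [1, 2, 3]

# Every ordering of the three items, as tuples.
all_permutations = list(permutations(items))
for p in all_permutations:
    print(p)

# The number of orderings agrees with n!.
print(len(all_permutations), factorial(len(items)))
```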

  • COUNTING

    Partition Rule: Suppose a single set of N distinct elements is partitioned into k sets, the first set containing $n_1$ elements, ..., the k-th set containing $n_k$ elements, with $n_1 + \cdots + n_k = N$. The number of different partitions is

    $\dfrac{N!}{n_1!\, n_2! \cdots n_k!}$

  • COUNTING

    Example: Let's partition {1,2,3} into two sets; the first with 1 element, the second with 2 elements.

    Solution:

      • Partition 1: {1} {2,3}
      • Partition 2: {2} {1,3}
      • Partition 3: {3} {1,2}

    So there are 3!/(1! 2!) = 3 different partitions.
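The three partitions above can also be generated programmatically. The sketch below assumes distinct elements and group sizes that sum to the number of elements; the helper name `ordered_partitions` is illustrative.

```python
from itertools import combinations
from math import factorial

def ordered_partitions(elements, sizes):
    """Yield every split of distinct `elements` into groups of the given sizes."""
    if not sizes:
        yield ()
        return
    first_size, remaining_sizes = sizes[0], sizes[1:]
    for group in combinations(elements, first_size):
        rest = [e for e in elements if e not in group]
        for tail in ordered_partitions(rest, remaining_sizes):
            yield (group,) + tail

partitions = list(ordered_partitions([1, 2, 3], [1, 2]))
for first, second in partitions:
    print(set(first), set(second))

# Count from the partition rule: N! / (n_1! n_2!)
rule_count = factorial(3) // (factorial(1) * factorial(2))
print(len(partitions), rule_count)
```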

  • Example

    How many different arrangements can be made of the letters ISI?

        1st letter  2nd letter  3rd letter
        I           I           S
        I           S           I
        S           I           I

    N = 3, $n_1$ = 2, $n_2$ = 1; 3!/(2! 1!) = 3

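  • Example

    How many different arrangements can be made of the letters "statistics"?

    N = 10, with $n_1$ = 3 (s), $n_2$ = 3 (t), $n_3$ = 1 (a), $n_4$ = 2 (i), $n_5$ = 1 (c), so the number of arrangements is 10!/(3! 3! 1! 2! 1!) = 50400.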
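The partition rule for arranging letters can be sketched as a short function and checked by brute force on the small word. The function name `distinct_arrangements` is illustrative.

```python
from collections import Counter
from itertools import permutations
from math import factorial

def distinct_arrangements(word):
    """N! divided by the factorial of each letter's multiplicity."""
    count = factorial(len(word))
    for multiplicity in Counter(word).values():
        count //= factorial(multiplicity)
    return count

# Brute-force check on the small example: collect the distinct orderings of ISI.
brute_force_isi = {"".join(p) for p in permutations("ISI")}

print(distinct_arrangements("ISI"), sorted(brute_force_isi))
print(distinct_arrangements("statistics"))
```

Brute force is only practical for short words; for "statistics" it would mean generating all 10! orderings, so the formula is used directly.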

  • COUNTING

    1. Ordered, without replacement (e.g. picking the first 3 winners of a competition)
    2. Ordered, with replacement (e.g. tossing a coin and observing a Head in the k-th toss)
    3. Unordered, without replacement (e.g. 6/49 lottery)
    4. Unordered, with replacement (e.g. picking up red balls from an urn that has both red and green balls and putting them back)

  • PERMUTATIONS

    Any ordered sequence of r objects taken from a set of n distinct objects is called a permutation of size r of the objects. The number of such permutations is $n!/(n-r)!$.

  • COMBINATION

    Given a set of n distinct objects, any unordered subset of size r of the objects is called a combination.

    Properties: [the formulas on this slide are missing from the transcript]
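For reference, the number of combinations and some standard identities are given below. These are well-known facts about binomial coefficients; whether they are the same properties listed on the original slide is an assumption.

```latex
\binom{n}{r} = \frac{n!}{r!\,(n-r)!}, \qquad 0 \le r \le n

\binom{n}{0} = \binom{n}{n} = 1, \qquad
\binom{n}{1} = n, \qquad
\binom{n}{r} = \binom{n}{n-r}

\binom{n}{r} = \binom{n-1}{r-1} + \binom{n-1}{r}, \qquad 1 \le r \le n-1
```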

  • COUNTING

    Number of possible arrangements of size r from n objects:

                   Without replacement      With replacement
        Ordered    $\frac{n!}{(n-r)!}$      $n^r$
        Unordered  $\binom{n}{r}$           $\binom{n+r-1}{r}$
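Each of the four cases (ordered or unordered, with or without replacement) corresponds to a function in Python's `itertools`, so the standard counting formulas can be checked by enumeration. The set `objects` and the choice r = 2 are hypothetical.

```python
from itertools import (combinations, combinations_with_replacement,
                       permutations, product)
from math import comb, perm

objects = ["a", "b", "c", "d"]  # hypothetical set, n = 4
n, r = len(objects), 2

# Brute-force enumeration of each case.
enumerated = {
    ("ordered", "without replacement"): len(list(permutations(objects, r))),
    ("ordered", "with replacement"): len(list(product(objects, repeat=r))),
    ("unordered", "without replacement"): len(list(combinations(objects, r))),
    ("unordered", "with replacement"): len(list(combinations_with_replacement(objects, r))),
}

# Closed-form counts for the same cases.
formulas = {
    ("ordered", "without replacement"): perm(n, r),        # n! / (n-r)!
    ("ordered", "with replacement"): n ** r,               # n^r
    ("unordered", "without replacement"): comb(n, r),      # C(n, r)
    ("unordered", "with replacement"): comb(n + r - 1, r), # C(n+r-1, r)
}

for case, count in enumerated.items():
    print(case, count, formulas[case])
```

`math.comb` and `math.perm` require Python 3.8 or later.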

  • EXAMPLE

    In how many different ways can we arrange 3 books (A, B and C) on a shelf?

    Order is important; without replacement. With n = 3 and r = 3: n!/(n-r)! = 3!/0! = 6, or, position by position:

      • Possible number of books for the 1st place on the shelf: 3
      • Possible number of books for the 2nd place on the shelf: 2
      • Possible number of books for the 3rd place on the shelf: 1

    so there are 3 x 2 x 1 = 6 arrangements.

  • EXAMPLE, cont.

    In how many different ways can we arrange 3 books (A, B and C) on a shelf?

        1st book  2nd book  3rd book
        A         B         C
        A         C         B
        B         A         C
        B         C         A
        C         A         B
        C         B         A

  • EXAMPLE

    Lotto games: Suppose that you pick 6 numbers out of 49. What is the number of possible choices

      • if the order does not matter and no repetition is allowed?
      • if the order matters and no repetition is allowed?
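Both lotto counts follow from the standard formulas for selections without replacement. This is a minimal sketch using `math.comb` and `math.perm` (Python 3.8 or later); the variable names are illustrative.

```python
from math import comb, perm

n, r = 49, 6

# Order does not matter, no repetition: C(49, 6) = 49! / (6! 43!)
unordered_choices = comb(n, r)

# Order matters, no repetition: 49! / 43! = 49 x 48 x 47 x 46 x 45 x 44
ordered_choices = perm(n, r)

print(f"{unordered_choices:,}")
print(f"{ordered_choices:,}")
```

The ordered count is 6! = 720 times the unordered count, because each set of 6 numbers can be listed in 720 different orders.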


