Basics of Biostatistics for Health Research
Session 1 – February 7th, 2013
Dr. Scott Patten, Professor of Epidemiology
Department of Community Health Sciences & Department of Psychiatry
Objective 1:
Upon completion, students will be (more) able to ….
• Read,
• Understand,
• Critically interpret,
The statistical portions of articles in the medical literature.
Objective 2:
Given a dataset, students will be able to ….
• Select appropriate statistical procedures for basic analyses
• Implement these analyses using typical statistical software (we will use Stata)
Objective 3:
Upon completion, students will be (more) able to ….
• Define and interpret specialized parameters found in the clinical epidemiology literature, for example…
– Sensitivity
– Specificity
– Predictive values
Topics for Session 1:
• Why do we need statistics?
• Calculating a 95% confidence interval for a proportion
Why Do We Need Statistics?
• We don’t always need statistics.
• However, statistics are the most powerful tools for answering questions in medicine, for example…
– Determining whether treatments work
– Comparing different treatments
– Identifying the causes of diseases
The Power of Statistics
• Where does it come from?
– Fundamentally, from the laws of probability
• A familiar example:
– Flipping one coin versus flipping many coins
Coin Flipping
• First, I’ll flip a coin and you can try to guess what I got.
• Then, I’ll ask you to flip many coins and I’ll guess how many heads you get
The Power of Statistics
• A set of observations can allow us to make statements of a sort that we generally cannot make based on a single observation
– E.g. how well does a treatment work?
• Larger and larger sets of observations allow us to make stronger and stronger statements
Formal Terminology
• The source of the observations is a sample
• The sample is a subset of a population
• The observations are data
• The collection of observations is a dataset
Inference
Generally: “A conclusion reached on the basis of evidence and reasoning.”
Statistical: “Making a statement about a population based on observations from a sample (a dataset).”
Stata’s Graphical Interface
Let’s do a Study!
• We’ll select a sample of half the class
• Tabulate the frequency of male/female
• Estimate the proportion of women
Select a Sample
• We’ll consider the class of ‘N’ students as our population
• The first step in obtaining a sample is to have a sampling frame – a list of the population
• Let’s make one in Stata
• For notation, I’ll type Stata commands in red.
• These go into the command window
• To execute a command, press Enter: <Enter>
Command Menus (an alternative to the command window)
• I’ll use screen captures and add red numbers if things need to be done in more than one step.
• Use the drop-down list to select the new variable, then click OK.
• Let’s create a sampling frame in Stata.
• We’ll start by typing into the command window.
• Tell Stata that we want to create a list with N* rows:
set obs 30
<Enter>
generate id = _n
<Enter>
* Instead of 30, we’ll use the number of students in the class.
The data viewer
• Now, let’s sample half of these
sample 50
• Click on the data viewer to see our sample
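The sampling steps above can be collected into a few lines, sketched here under some assumptions. The class size of 30 is the placeholder used on the slides, and the seed value is arbitrary. The set seed line is not part of the original slides; it is added only so that the same random sample is drawn each time the commands are run.

```stata
* Build a sampling frame of 30 students (placeholder class size)
clear
set obs 30
generate id = _n

* Arbitrary seed, an addition to the slides, so the draw can be repeated
set seed 20130207

* Keep a random 50% of the rows in the frame
sample 50

* Show the ids that were selected
list id
```

Without set seed, each run will usually pick a different half of the class, which is fine for the classroom exercise but makes results harder to reproduce later.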
Data Collection
• From each member of our sample, we’ll record the person’s sex
Male = 0
Female = 1
• Let’s create a variable called “sex” in which to enter our data
generate sex = .
The data viewer
Look at the Dataset!
The data editor
Enter the Data
Highlight a cell (click on it) and start entering data!
Closing the Data Editor
Click Exit
Making a Table
• At this point, we could make a table to show the frequency of men and women in our sample.
[Screen captures: menu steps 1 to 4 for making the table]
• Use the drop-down list to select the new variable, then click OK.
A few things to note….
• Our table doesn’t look so great
• The command that our menus created is executed by Stata (see the “. tab var2” in the output window)
• We can do the same thing by typing the following in the command line:
tab var2
Command Line
Our Table is Still Very Ugly
(not exactly, but something like this)
Renaming a Variable
[Screen capture: menu steps 1 and 2]
The Variables Manager
1. Select “var2” (click)
2. Type “sex” here, under Name
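Using the Command Window
• Another way to do it is just to type into the command window:
rename var2 sex
• If a variable called “sex” already exists (for example, the empty one created earlier with generate), Stata will refuse the rename; remove the empty variable first with drop sex.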
Our Table is Still Very Ugly
(not exactly, but something like this)
Creating a Label
[Screen captures: menu and dialog steps 1 to 4]
• In Stata, you need to give your label a name.
• Our values are 0 and 1.
• Our labels are men and women.
• After adding women, add a second value label for men.
Attaching the Label
[Screen capture: steps 1 to 4]
Assigning the Label
[Screen capture: steps 1 to 3]
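The labelling steps done through the menus can also be typed as commands. The sketch below assumes the variable is called sex and is coded 0/1 as on the earlier slides; the label name sexlbl and the four rows of placeholder data are hypothetical and only there to make the example run on its own.

```stata
* Placeholder data: four people coded 0 (men) and 1 (women)
clear
set obs 4
generate sex = mod(_n, 2)

* Create a value label (sexlbl is a hypothetical name)
label define sexlbl 0 "men" 1 "women"

* Attach the value label to the variable
label values sex sexlbl

* The table now shows the words instead of 0 and 1
tabulate sex
```

Defining the label and attaching it are two separate steps in Stata, which is why the menus also handle them in two separate dialogs.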
A Good Looking Table
Saving a Dataset
Click here to save (screen capture)
Let’s do Statistics!
• We need to enter the Statistics menu
[Screen capture: menu steps 1 to 4]
Entering the Command
[Screen capture: dialog steps 1 to 4]
Our Output
• What is the 95% confidence interval?
• What does it mean?
• What kind of statement can be made about the population (our class)?
• Is the statement true?
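For reference, the confidence interval produced through the menus can also be requested from the command window. This is a sketch only: the data are made up (9 women among 15 sampled students), and the command shown is the Stata 14 and later syntax. In Stata 13 and earlier the equivalent command is ci sex, binomial.

```stata
* Made-up sample: 15 students, the first 9 coded as women (sex = 1)
clear
set obs 15
generate sex = (_n <= 9)

* Exact binomial 95% confidence interval for the proportion of women
* (Stata 14 or newer; in Stata 13 and earlier use: ci sex, binomial)
ci proportions sex, exact
```

The output reports the sample proportion together with the lower and upper limits of the interval, which are the numbers needed to answer the questions above.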
Introducing the “do file” editor
[Screen capture: steps 1 to 3]
Executing a “do” file
[Screen capture: running the do file]
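A do file is simply a text file of commands, one per line, that Stata runs in order. A minimal sketch is shown below; the file name class_study.do, the dataset name class_study, and the placeholder values for sex are all hypothetical.

```stata
* class_study.do (hypothetical file name)
* Lines beginning with an asterisk are comments and are ignored by Stata

clear
set obs 30
generate id = _n

* Placeholder values standing in for the data typed into the editor
generate sex = mod(id, 2)

tabulate sex

* Save the dataset; replace overwrites an earlier copy of the file
save class_study, replace
```

Besides the execute button in the do-file editor, a saved do file can be run by typing do class_study in the command window, provided the file is in Stata's current working directory.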
Something more Realistic
• Go to www.ucalgary.ca/~patten
• Scroll to the bottom.
• Right click to download the two files described as being “for PGME Students”
• Save them on your desktop
Open the Datafile
Explore the Datafile
• Click on the data browser in Stata
• Type describe into the command bar
• Open the data documentation file
• Note that sex is not labeled properly and that it is coded differently than in our example
Recode the Sex Variable as 0/1
• Let’s use the command window:
generate female = sex
recode female 1=0 2=1
• Double check you’ve done it right:
tab female sex
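The recoding can be tried out on a tiny made-up dataset before running it on the real file. The sketch below assumes, as the recode rule above implies, that the original variable uses 1 for men and 2 for women; the four rows are invented, and the final assert line is an extra check that is not in the slides.

```stata
* Four invented observations using the 1/2 coding
clear
input sex
1
2
2
1
end

* Copy the variable, then recode the copy to 0/1
generate female = sex
recode female 1=0 2=1

* Cross-tabulate new against old: every 1 should map to 0, every 2 to 1
tabulate female sex

* Optional extra check: Stata stops with an error if any row disagrees
assert female == sex - 1
```

Working on a copy (female) rather than overwriting sex keeps the original coding available for this kind of check.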
Your Task…
• Create a good label for this new variable
• Make a good table of the new variable
• Create a 95% exact binomial confidence interval for the proportion of females in Framingham
• Interpret what this 95% confidence interval means
• Create a do file that will do all of these steps automatically
Creating a Log File
[Screen capture: steps 1 to 3]
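A log file can also be opened and closed with commands, which is convenient inside a do file. In this sketch the log name class_ci.log and the placeholder data are hypothetical; the ci line uses the Stata 14 and later syntax (in Stata 13 and earlier, use ci female, binomial).

```stata
* Placeholder data standing in for the real dataset
clear
set obs 20
generate female = (_n <= 11)

* Start recording output to a plain-text log (hypothetical file name)
log using class_ci.log, replace text

tabulate female
ci proportions female, exact

* Stop recording and close the file
log close
```

Everything shown in the results window between log using and log close is written to the file, so the log can be handed in as a record of the calculation.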
Additional Tasks
• Create a log file for your calculation of the proportion of women in Framingham, and an associated 95% confidence interval.
Additional Tasks
• Calculate an estimate of the proportion of people in Framingham with greater than high school education (and 95% confidence interval) – generate and save a log file that shows this calculation.