While We Are Waiting…
• If you want to work along with the presentation, all the materials are available on the PRISM website
– Go to: http://polisci.osu.edu/prism/luncheons.htm
– Download the following zip file onto your desktop
• StataIntro_08.zip
• Extract all of the contents of the zip folder to your desktop
– Double click to open the presentation file: IntroStata08_Vfinal.pdf
– Double click on Stata to open the program
– Note: Included in the zip folder
• Presentation: IntroToStata08_Vfinal.pdf
• Datasets: NES04_VstataIntro08.dta, ICPSR_08865
• Do file: IntroStata_V08.do
1/25/2008 Christenson & Powell: Intro to Stata 1
PRISM Brownbag:
An Introduction to Stata
Dino Christenson & Scott Powell
Ohio State University
January 25th, 2008
Intro to Stata
I. GUI
II. Log file
III. Basic stats
IV. Data manipulation
V. Descriptions of variables
VI. Help files!
VII. Graphing
VIII. Do files
IX. Exporting tables, graphs and data
X. Importing foreign data
XI. Closing

GUI
• First, let's identify what we're looking at.
• Stata has several different viewing windows, each with a different function.

GUI
• Review: Lists commands that have recently been entered
• Results: Shows recently obtained results

GUI
• Variables: All the existing variables in your data set
• Command: Where commands are entered

GUI
• File: More than open and save.
• Edit: What you expect
• Data: Multiple avenues for data manipulation
• Graphics: More to come later
• Statistics: Statistical modeling options
• Help: More to come later
• Bottom Line: These menus offer graphical alternatives to directly typing commands into Stata

GUI
• Begin a new log file
• Bring up the Help Viewer
• Bring up the Results Window
• Edit/View Data
• Begin a new do file
• STOP! (The Number Crunching)

The Log File
• Log or Perish! (or at the very least you might do some crying)
• Log files keep track of everything you do in Stata, both input and output
• However, a log does not record what appears in additional windows that open up (e.g. graphs, help window)

The Log File
• To start a log file, access the "File" menu, choose "Log" and select "Begin"
• Log files will automatically close when you end your session. However, you can also close a log manually, as well as suspend it during a session.

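The menu steps above also have command equivalents. The following is a minimal sketch; the file name intro_session.log is a hypothetical choice, not one of the presentation materials.

```stata
* Start a plain-text log; "intro_session.log" is a hypothetical file name
log using "intro_session.log", text replace

* Suspend logging, run something you do not want recorded, then resume
log off
display "this line is not written to the log"
log on

* Close the log manually (Stata also closes it when the session ends)
log close
```
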
Stata as a Calculator
• Stata can be used to compute both basic and advanced mathematical operations
• Use the display command, or di, followed by the mathematical expression
• di 20*1.5-17

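A few more calculator-style expressions, offered as a sketch of what display accepts. The first line repeats the slide's example; the others use built-in functions chosen only for illustration.

```stata
* The example from the slide: 20*1.5 = 30, minus 17 gives 13
di 20*1.5-17
* Powers use the ^ operator: 2 to the 10th is 1024
di 2^10
* Built-in math functions can be used inside the expression
di sqrt(16)
di exp(1)
di ln(100)
```
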
Some Basic Statistics
• Stata can also evaluate several probability functions
• Example: What's the probability of tossing a coin ten times and getting five or more heads?
• di Binomial(10,5,.5)
– Note: Binomial(n,k,p) returns the probability of k or more successes in n trials, so this is not the probability of exactly five heads

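If the quantity of interest is exactly five heads, one way to get it is sketched below. It assumes the Stata 10 function Binomial(), which returns an upper-tail probability; later releases use different function names (for example binomialtail() and binomialp()), so check help density functions for your version.

```stata
* Exactly five heads in ten fair tosses, from the binomial formula
* (10 choose 5) * 0.5^10 = 252/1024, about 0.246
di comb(10,5) * 0.5^10

* The same quantity as a difference of upper tails: P(X >= 5) - P(X >= 6)
di Binomial(10,5,.5) - Binomial(10,6,.5)
```
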
Some Basic Statistics
• Example: CDF for the normal distribution, z = 1.96
• di normal(1.96)

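The normal CDF can be combined with ordinary arithmetic or inverted. A short sketch using the built-in normal() and invnormal() functions:

```stata
* Cumulative probability below z = 1.96 (about 0.975)
di normal(1.96)
* Two-sided tail probability for |z| >= 1.96 (about 0.05)
di 2*(1 - normal(1.96))
* Inverse CDF: the z value with 97.5% of the distribution below it (about 1.96)
di invnormal(0.975)
```
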
Some Basic Statistics
• Stata has many more distribution functions that can be implemented
• For a summary of these, use the following command:
• help density functions

In The Beginning… (Opening a Data Set)
• Several options exist for opening data sets
• Using the GUI allows you to browse or access recent data sets
• It is also possible to type in the use command

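A sketch of the typed alternative, assuming the zip contents were extracted to a desktop folder. The folder path is hypothetical; the dataset name is the one listed in the presentation materials.

```stata
* Move to the folder holding the extracted materials (hypothetical path)
cd "C:\Users\yourname\Desktop\StataIntro_08"
* Load the NES data; the clear option drops any data currently in memory
use "NES04_VstataIntro08.dta", clear
```
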
The Data Editor
• Sort data by selected variable
• Move variable to first or last position
• Hide selected variable
• Preserve changes that you've made
• Restore data to the state of the last "Preserve"
• Delete selected variable or observation
• And, of course, you can edit each cell

Manipulating the Data
• Stata can generate new variables and edit existing ones
• Let's create a new variable called "left" using generate and replace
• gen left = 1 if ideology < 0
• replace left = 0 if ideology >= 0

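One caution worth a sketch: Stata treats numeric missing values as larger than any number, so the condition ideology >= 0 is also true when ideology is missing. Whether the NES file has missing ideology values is not stated here, so treat this as a precaution; left2 and left3 are hypothetical variable names.

```stata
* Same coding as the slide, but observations with missing ideology stay missing
gen left2 = 1 if ideology < 0
replace left2 = 0 if ideology >= 0 & !missing(ideology)

* One-line equivalent: the logical expression evaluates to 1 or 0
gen left3 = (ideology < 0) if !missing(ideology)

* Compare the two codings, showing missing values as their own category
tab left2 left3, missing
```
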
Manipulating the Data
• Notice that we now have a new variable in our list
• Let's create a new variable by recoding an existing one
• recode marr (0=1) (1=0), gen(single)
• Other expressions to know: >=, <=, &, |, ~, ^, -, /, *, +, ~=

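The operators are easiest to see inside if conditions. The sketch below assumes the left and single variables created above and assumes marr is coded 1 for married, which the recode suggests but the slides do not state.

```stata
* == tests equality and & means "and"
count if left == 1 & marr == 1
* | means "or"
count if left == 1 | single == 1
* ~= means "not equal to" (!= is an equivalent spelling)
count if marr ~= 1 & !missing(marr)
```
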
Manipulating the Data
• Stata also has the ability to generate vectors and matrices
• matrix input mat1 = (1\2\3)
• matrix input mat2 = (1,2,3)
• matrix mat3 = mat1*mat2
• matrix list mat3

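Since mat1 is 3x1 and mat2 is 1x3, mat3 is a 3x3 matrix. Reversing the order gives a 1x1 result instead, as this sketch shows; mat4 and mat1t are hypothetical names.

```stata
* Row vector times column vector: 1*1 + 2*2 + 3*3 = 14
matrix mat4 = mat2*mat1
matrix list mat4

* The apostrophe transposes, turning the column vector into a row vector
matrix mat1t = mat1'
matrix list mat1t

* Dimensions of the 3x3 product from the slide
di rowsof(mat3)
di colsof(mat3)
```
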
Describing the Data
• Now let's have a look at what we created
• tab ideology
• tab left
• tab marr
• tab single

Describing the Data
• To produce a list and summary of all variables, use the sum command
• sum
• You can also use this command to summarize individual variables
• sum ideology

Describing the Data
• The tab command can also be used to create cross-tabs when implemented with two variables
• tab left marr
• Summary statistics can be separated using the by command, but you have to sort first
• sort left
• by left: sum educ

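The sort-then-by pattern has a one-step form, and tab accepts options for percentages. A brief sketch using the same variables:

```stata
* bysort sorts on left and then runs the command within each group
bysort left: sum educ

* Cross-tab with row percentages added to the cell counts
tab left marr, row

* Compact table of group means, standard deviations and counts
tabstat educ, by(left) statistics(mean sd n)
```
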
Describing the Data
• In Stata, data exist in several formats
• For a summary of data types in your data set, use the describe command
• describe

Describing the Data
• Strings are non-numeric variables
• Floats are numeric data types that store up to 7 digits of accuracy, rounding thereafter
• byte, int, long, and double are other numeric types
• Useful commands for changing data types and how values display: destring, encode, format

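A sketch of how these commands are typically used. The variables zip_str and state_name are hypothetical string variables that are not in the presentation data; educ is assumed to be numeric.

```stata
* zip_str and state_name are hypothetical string variables for illustration
* Convert a string of digits into a numeric variable
destring zip_str, gen(zip_num)
* Convert string categories into labeled numeric codes
encode state_name, gen(state_id)
* Change the storage type of a numeric variable
recast double educ
* Change only how values are displayed, not how they are stored
format educ %9.2f
* Confirm the resulting types and formats
describe zip_num state_id educ
```
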
Help Viewer
• The capabilities of Stata are vast
• What you can do with Stata depends on your knowledge of the commands
• Fortunately Stata comes with user friendly help
• Stata's greatest selling point
– All commands are easily referenced
– All commands come with helpful descriptions and examples
– All commands have been peer reviewed

Help Viewer
• To open the Help Viewer click on Help > Contents
• The Help Viewer opens and allows you to browse the entire Stata database and online resources
• It acts like an internet browser…

Help Viewer
• Take the now familiar tab command
• In command prompt or in help viewer prompt: help tabulate
• Provides information on:
– Command title
– Command syntax
• Note: blue font is linked; click on it to get more info on the given word

Help Viewer
• Also provides information on:
– Command Description
– Command Options
– Command Examples
– Related commands

Help Viewer
• Add-on packages are also easy to find with the help viewer
– E.g., "Clarify" by G. King
– Search clarify: Help > Search…
– Type: clarify
– Help finds the add-on package site and provides links for its description and download

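The same lookups can be typed at the command prompt. A minimal sketch; findit is the Stata 10 era command for searching both official help and user-written packages.

```stata
* Open the help file for a command you already know by name
help tabulate
* Search official help, FAQs and user-written packages for a keyword
findit clarify
```
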
Graphing
• Stata has numerous modeling and graphing capabilities
– ANOVA and post-estimation OLS
– Time Series: ARCH, ARIMA, VAR…
– Duration Analysis: exponential, Weibull, Cox…
– Event Count: negative binomial, Poisson, hurdle…
– Limited Dependent Variables: logit, probit, multinomial logit and probit, ordered logit and probit…
– Selection Models: Heckman, censored probit, tobit…
– And, if it is not canned, we can program it, but that is for another brownbag
• Furthermore, Stata 10 is supposed to be a drastic improvement in the flexibility of graphing functions
– Competition with R?
• Let's quickly look at some of the basic graphs you can create

Scatterplots
• Perhaps we want to check if our data hint that people become more favorable to conservative values as they age
• We can graph the variables with respect to one another
– scatter reptherm age
• Graph viewer appears above the results viewer
• Toggle to and fro with graph viewer buttons on toolbar

Scatterplots
• We can also look at the same relationship by a particular sample of our data
• Perhaps there is a difference between those that voted for Bush (1) and Kerry (0)
• Let's sort by vote
• Try scatter reptherm age, by(vote)

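To make any age trend easier to see, a fitted line can be overlaid on the scatterplot. This is a sketch using twoway with the same variables, not a command from the presentation's do file.

```stata
* Scatterplot with a linear fit overlaid
twoway (scatter reptherm age) (lfit reptherm age)
* The same overlay drawn separately for Kerry (0) and Bush (1) voters
twoway (scatter reptherm age) (lfit reptherm age), by(vote)
```
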
Bar Charts & Histograms
• Say we are interested in the distribution of a categorical variable
• Try creating a bar chart for our measure of political ideology
• Type
• hist ideology, discrete width(1)

UFPC
• Say you need to paint a really basic picture of party id strength for your coworkers
• Try a pie chart
– graph pie, over(pid)
– Use options for presentation:
– title(UFPC) subtitle(Strength of Party Identification) caption(-3 = Strong Rep to 3 = Strong Dem) plabel(_all percent) cw
• Then quit your job; you're working with imbeciles
[Figure: pie chart titled "UFPC", subtitle "Strength of Party Identification", one slice per pid value from -3 to 3 with percentage labels; caption "-3 = Strong Rep to 3 = Strong Dem"]

Graphing with GUI
• Of course, we did not need the exact commands to create the graphs above
• We could have used the GUI toolbar to create any of those graphs
• Just go to Graphics and select the appropriate graph
• A new viewer will appear
• Select from the drop-down menu to fill in the necessary variables and options

Exporting Graphs & Tables
• So why did the last chart, the UFPC, look so nice and the others… not so much?
– 1. Used titles
– 2. Used a key
• The graph was understandable on its own
– 3. Exported the graph as a picture
• Stata allows you to export its output, both tables and graphs, in various formats
– Depending on your typesetting system you will want to save the output in different manners

Exporting Graphs
• To save and export a graph, right click on the graph (control click to my Mac friends)
– Click Save Graph
– Save in the appropriate format
• Word: .wmf or .png
• LaTeX: .eps
• Alternatively, go to the main toolbar and click File > Save Graph
– Follow same procedure

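Exporting can also be scripted, which helps when graphs are regenerated from a do file. A sketch with the hypothetical file name reptherm_age; which formats are available depends on your platform.

```stata
* Draw the graph first; graph export writes the graph currently displayed
scatter reptherm age
* The file suffix selects the format (hypothetical file names)
graph export "reptherm_age.eps", replace
graph export "reptherm_age.png", replace
* Keep an editable Stata copy as well
graph save "reptherm_age.gph", replace
```
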
Exporting Graphs
• Shortcut for Word users
• To merely copy a graph, right click on the graph (control click to my Mac friends)
– Click Copy
– Paste it in your word processor
– Note: you do not have a separate saved graph in this case

Exporting Tables
• The Stata table output is not appropriate for a conference paper or article submission
• Why not?
– 1. Too much information
– 2. Vertical lines
– 3. Variable names
– 4. No title or explanation
• Therefore, when you write a paper you will need to transform the output
• You've all seen article-worthy tables (e.g. Balla & Wright 2001)

Exporting Tables
• Let's run a simple OLS regression of the Republican thermometer measure on some key political and demographic variables
– Explanatory variables: educ black south pray pid ideology
– Dependent variable: reptherm
• Stata output

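The regression described above can be sketched as a single command. It assumes the thermometer variable is named reptherm, as in the scatterplot commands.

```stata
* Dependent variable first, followed by the explanatory variables
regress reptherm educ black south pray pid ideology
```
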
Exporting Tables
• To export, highlight the table with the mouse
• Right click on the highlighted table
– For Word: Copy Text
– For Excel: Copy Table
• Edit in your chosen program in accord with journal specifications

Do Files
• We've accomplished quite a bit and we have a log file of our work to prove it
• But is there an easy way to rerun all our work?
• What if we wanted to make some small changes to our analyses and largely repeat this work?
• Use a Do file!
• Click here to open a new or saved .do file

Do Files
• A Stata do file saves text in a text editor format
– It is often easier to create your commands in an editor than at the command prompt
– Also easier to record your commands for future use and manipulation
• Do-file editor toolbar buttons: New .do file, Save your .do file, Print your .do file, Find text in .do file, Copy .do file, Undo last edit, Run .do file and show output

Do Files
• Typical text editing functions can be used in here: replace, copy… etc.
• Asterisk * tells Stata not to run that line
– Therefore annotate your .do file with titles and explanations beginning with an *
• Let's look at all the commands used in today's presentation
– Open a do file
– Select Open… in do file toolbar
– Select IntroStata_V08.do
– Click Open

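For orientation, here is a sketch of what a short annotated do file might look like. It is not the contents of IntroStata_V08.do; the file names intro_example.do and intro_example.log are hypothetical, and the commands are the ones introduced earlier.

```stata
* intro_example.do (hypothetical file name): a short annotated do file
* Lines beginning with an asterisk are comments and are not run

* Close any log left open from an earlier run, then start a new one
capture log close
log using "intro_example.log", text replace

* Load the data and build the variables used in the presentation
use "NES04_VstataIntro08.dta", clear
gen left = 1 if ideology < 0
replace left = 0 if ideology >= 0
recode marr (0=1) (1=0), gen(single)

* Describe the new variables
tab left single

log close
```
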
Do Files
• The .do file presents all the commands from today in a simple editor
• From here we can edit the commands
• We can run the entire series of commands in one fell swoop
– Bring cursor to the first line and click on the Run button
• We can also select portions to run by highlighting the appropriate text and clicking the same button

Do Files
• Note: if you forget to work in the do file, you can capture all your commands from the review editor:
– Right click in the review editor > Copy Review Contents to Clipboard
– Paste into your do file and edit

Importing Foreign Data
• Oftentimes we aren't lucky enough to have data in Stata's database format
• Stata's data files are stored as .dta files
– They are just EZ-form data files
• Used in various programs
– Not to be confused with .dat files
• Which are usually ASCII comma delimited and often viewed in text editors
• Not to worry!
• Beyond working with .dta files, Stata allows you to import data in various formats:
– ASCII (.txt, .raw, .csv)
– FDA (SAS export)
– XML (.xml)

Importing Foreign Data
• For example, say we wanted to use data stored at ICPSR
• www.icpsr.umich.edu
• ICPSR has tons of data on various topics
• Hover on "Data" and select "Browse" to view their many datasets
• You can also search for a particular dataset

Importing Foreign Data
• Today I'm interested in American state politics
• I find that ICPSR has 14 relevant datasets
• I simply select to download the dataset I'm interested in: 8655 Survey of City Council Members…
• If you are a returning user, it will request your login and password
• If you are a new user, you will have to register first
– It's free and easy to register
– No self-respecting methods student will make it through their first year without registering & downloading a dataset here

Importing Foreign Data
• Download will usually allow you to import the data with various set-up files
– These files make importing to your preferred program easier
• In this case we just want the "Stata Setup" files with the data file
• Add these to the "Data Cart" in Step 3
• Then select "Download" in Step 5 (you can review your cart in Step 4)

Importing Foreign Data
• After agreeing to their terms and conditions
– The data files are compressed in a zip file
– You are prompted to open or save the files
• Save the zip file in your preferred folder

Importing Foreign Data
• Now we have the data and setup files in a zip file on our computer
– Extract the contents from your zip file
– View the contents
• Codebook as .pdf
• Data as .txt
• Setup dictionary as .dct
• Setup do file as .do

Importing Foreign Data
• Let's return to your Stata GUI
• Type clear to completely reset your data
– Doing so removes from memory any variables you have stored or created (files saved on disk are untouched)
• Click here to open a "do file"
– In the do file, select open
– Browse for the setup do file:
– 08655-0001-Setup.do

Importing Foreign Data
• The setup do file
• This file will define and label your data for the Stata editor by calling
– The dataset
– A corresponding dictionary file

Importing Foreign Data
• Edit the do file to pull from the appropriate folder
– You must tell it where to find the raw data (.txt) and the dictionary file (.dct); we stored them on the desktop
– And you must specify the name of the output file (.dta)

Importing Foreign Data
• Once we've told the do file editor where to find the dictionary and data…
• We run the do file
• In result viewer, Stata returns in green
– file C:\council.dta saved
– Or, if you got it wrong, it returns an error code in red
– If wrong, make sure you specified the right directory

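In outline, the setup do file reads the raw text file through the dictionary and saves the result. The sketch below shows that pattern only; the actual ICPSR setup file is longer, and the folder path and the .dct and .txt file names here are hypothetical stand-ins.

```stata
* Sketch of what a setup do file does; paths and file names are hypothetical
* Clear any data currently in memory
clear
* Read the raw ASCII data using the layout described in the dictionary file
infile using "C:\Desktop\08655-0001-Setup.dct", using("C:\Desktop\08655-0001-Data.txt")
* Save the result in Stata format under the output name used on the slide
save "C:\council.dta", replace
```
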
Imported Foreign Data
• Now you have a Stata formatted dataset (.dta) from an ASCII file (.txt)
• Properly saved data file
• Variables listed and labeled

Other Importing Options
• SPSS data (.sav) can be easily exported to Stata format (.dta) from SPSS
– In SPSS, just click Save As and select the appropriate Stata version (an export wizard is now available in SPSS as well)
– FYI: You can also export from SPSS to just about anything else (SAS, Excel, ASCII, dBase)
• The PRL lab has Stat/Transfer
– An easy way to move data between packages and into different databases
– Especially good with large and labeled databases

Congratulations
• By now you can move comfortably around Stata
• You can
– Keep a log of your work
– Use Stata as a statistics calculator
– Create variables
– Load a Stata dataset
– Examine your data
– Run some descriptive functions
– Make basic graphs
– Search for help on commands and packages
– Export Stata output into your preferred document
– Create, edit, run and save commands from a do file
– And even import foreign datasets

Remember
1. Begin by opening a log
– Always keep a log
2. To increase memory for large datasets, type set mem 100m
3. Begin all analyses with simple descriptives
– Know your data
4. Utilize gen to generate variables
– The egen command is a helpful extension to gen
5. Usefulness of the Review window
– Don't need to retype the command (just click from the review)
– Also helpful are the page up/down keys within the command prompt
6. _n is Stata programming code for observation number
7. Use .do files
– Annotate your do files utilizing the *

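Points 2, 4 and 6 can be sketched together. The set mem syntax is for Stata 10, where memory must be set while no data are loaded; obs_id and mean_educ are hypothetical variable names.

```stata
* Memory can only be changed with no data in memory, so clear first
clear
set mem 100m
use "NES04_VstataIntro08.dta", clear

* _n is the current observation number
gen obs_id = _n

* egen computes group-level summaries: mean education within each vote group
bysort vote: egen mean_educ = mean(educ)
```
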
See You Next Time
• PRISM's next brownbag
Contemporary Methods of Ideal Point Estimation
Presenter: Josh Clinton of Princeton University
January 30, 2008, 12:00-1:00pm
• PRISM's Spring brownbag
Bayesian Inference with WinBUGS
Presenters: Dino Christenson & Scott Powell
Date & Time TBA (Spring 2008)
Updates at http://polisci.osu.edu/prism/luncheons.htm
• PRISM's next methods lunch
February 5th, 12 noon