Getting the most out of
your data: a pivot table
workshop
Julia Hill, Jon Ma, William Elliott
Student Data Management and Analysis
Tuesday 27 September 2016
Welcome
• Pivot tables:
• Data summarization tool
• Used with “flat” datasets with defined columns of
data
• Can automatically sort, count, total and give
averages – among other functions
• Available in many data processing and
visualization tools; most commonly used in
Microsoft Excel
Outline of session
• Cleaning and shaping data
• Data Analysis I - getting started with Pivots
• Joining data
• Data Analysis II - more Pivot functions
• Data workshop
CLEANING AND SHAPING
YOUR DATA
Extracting data
• Sign in to eVision using your SSO
• Navigate to your chosen report
• Filter using parameters as required
• Watch out for required fields
• Always export the data file
Reviewing your data
• Familiarise yourself with
the data
• Watch out for data
issues:
• Issues with column
headers
• Mismatched data types
• Numbers stored as text
• Error values
• Obvious outliers
• Active formulas
• “Blank” cells
• Duplicate values
(depends on the context)
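For attendees who also script, the review checklist above can be sketched in plain Python. This is only an illustration under stated assumptions: the sample rows, the column names (`student_id`, `college`, `credits`) and the `review` helper are all hypothetical, and the checks cover just blanks, error values, numbers stored as text and duplicate IDs.

```python
# Hypothetical rows as they might look after an export; every value is made up.
rows = [
    {"student_id": "1001", "college": "St Example", "credits": "120"},
    {"student_id": "1002", "college": " ", "credits": 60},
    {"student_id": "1001", "college": "St Example", "credits": "#N/A"},
]


def review(rows):
    """Return a list of (row_index, column, issue) tuples for common data issues."""
    issues = []
    seen_ids = set()
    for i, row in enumerate(rows):
        for col, value in row.items():
            if isinstance(value, str) and value.strip() == "":
                # A cell holding only spaces looks empty but is not
                issues.append((i, col, "blank"))
            elif isinstance(value, str) and value.startswith("#"):
                # Spreadsheet error values such as #N/A or #DIV/0!
                issues.append((i, col, "error value"))
            elif col == "credits" and isinstance(value, str) and value.strip().isdigit():
                # Assumption: "credits" is the column that should be numeric
                issues.append((i, col, "number stored as text"))
        if row["student_id"] in seen_ids:
            issues.append((i, "student_id", "duplicate"))
        seen_ids.add(row["student_id"])
    return issues


issues = review(rows)
for issue in issues:
    print(issue)
```

Whether a duplicate is a real problem still depends on the context, so the sketch only reports it rather than removing anything.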
Cleaning and shaping data
• Use filters
• Use formulae:
• =COUNT for numerical values
• =COUNTIF for duplicates
• Error checking
• Find and replace
• Pivot tables themselves can be useful to identify
data integrity issues
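As a rough illustration of what the two formulae above check, here is a minimal Python sketch. The column values are hypothetical, and `None` stands in for an empty cell; the sketch mirrors the idea of =COUNT and =COUNTIF rather than reproducing Excel's behaviour in full.

```python
from collections import Counter

# Hypothetical column values; None stands for an empty cell.
credits = [120, "60", None, 90, "n/a"]
student_ids = ["1001", "1002", "1001", "1003"]

# Like =COUNT(range): only genuinely numeric cells are counted,
# so the text value "60" is ignored.
numeric_count = sum(
    1 for v in credits if isinstance(v, (int, float)) and not isinstance(v, bool)
)

# Like =COUNTIF(range, cell) filled down a helper column:
# how many times does each ID occur?
occurrences = Counter(student_ids)
duplicates = sorted(k for k, n in occurrences.items() if n > 1)

print(numeric_count, "numeric cells out of", len(credits))
print("IDs appearing more than once:", duplicates)
```

If the numeric count is lower than the number of rows, some values are probably blank or stored as text.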
DATA ANALYSIS I
Data Analysis – Basic Methods
Autosum
• Useful for summarising whole data ranges
• Can change aggregation type
Selection
• Useful for summarising on the fly
• Can choose summary types
Filter
• Allows review of specific data
PivotTables I – Why use them?
• A powerful feature for summarising larger datasets
• Much easier and more flexible than summarising
separate subsets
• Can help answer the following questions
• How many students are in each college?
• Which college has the most undergraduate
students?
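The two example questions can be answered with a simple count per group, which is what a pivot table does behind the scenes. A minimal Python sketch, assuming a hypothetical flat dataset with one row per student and made-up college names:

```python
from collections import Counter

# Hypothetical flat dataset: one row per student, with defined columns.
students = [
    {"student_id": "1001", "college": "Alpha", "level": "UG"},
    {"student_id": "1002", "college": "Beta", "level": "PG"},
    {"student_id": "1003", "college": "Alpha", "level": "UG"},
    {"student_id": "1004", "college": "Beta", "level": "UG"},
    {"student_id": "1005", "college": "Alpha", "level": "PG"},
]

# How many students are in each college?
per_college = Counter(s["college"] for s in students)

# Which college has the most undergraduate students?
ug_per_college = Counter(s["college"] for s in students if s["level"] == "UG")
top_college, top_count = ug_per_college.most_common(1)[0]

print(dict(per_college))
print(top_college, top_count)
```

In a pivot table the same result comes from placing the college field in Rows, a count of students in Values, and the level field in Filters.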
PivotTables I – Things you can do
• Choose the relevant aggregation
e.g. Sum / Count / Average
• Filter to show the exact data you want to analyse
• Create crosstabs to analyse multiple dimensions
• Drill down to see the data
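Crosstabs and drill-down can be pictured as follows. This is a sketch only: the dataset is hypothetical, and `crosstab` and `drill_down` are illustrative helper names, not Excel features.

```python
from collections import defaultdict

# Hypothetical flat dataset: one row per student.
students = [
    {"student_id": "1001", "college": "Alpha", "level": "UG"},
    {"student_id": "1002", "college": "Beta", "level": "PG"},
    {"student_id": "1003", "college": "Alpha", "level": "UG"},
    {"student_id": "1004", "college": "Beta", "level": "UG"},
    {"student_id": "1005", "college": "Alpha", "level": "PG"},
]


def crosstab(rows, row_field, col_field):
    """Count rows for every combination of two fields (a two-dimensional pivot)."""
    table = defaultdict(lambda: defaultdict(int))
    for r in rows:
        table[r[row_field]][r[col_field]] += 1
    return {key: dict(cols) for key, cols in table.items()}


def drill_down(rows, **criteria):
    """Return the underlying rows behind one cell of the crosstab."""
    return [r for r in rows if all(r[k] == v for k, v in criteria.items())]


table = crosstab(students, "college", "level")
detail = drill_down(students, college="Alpha", level="UG")

print(table)
print([r["student_id"] for r in detail])
```

Drilling down returns the original rows, which is a quick way to sanity-check a surprising total.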
JOINING DATASETS
Data Joins
• How can I join data from different spreadsheets?
• Common field between both data sources:
• Student Number
• UCAS Number … ?
• Is it truly unique?
• May need to create a unique ID if there are multiple
instances (e.g. student applied more than once)
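Before joining, it helps to test whether the common field really is unique. The sketch below assumes hypothetical application records in which one applicant applied twice; the `cycle` field and the `composite_key` helper are assumptions made for illustration, not fields from any real report.

```python
from collections import Counter

# Hypothetical application records: applicant U1 applied in two different cycles.
applications = [
    {"ucas_number": "U1", "cycle": 2015, "course": "Physics"},
    {"ucas_number": "U1", "cycle": 2016, "course": "Chemistry"},
    {"ucas_number": "U2", "cycle": 2016, "course": "History"},
]


def is_unique(rows, key_func):
    """True if key_func gives a different value for every row."""
    counts = Counter(key_func(r) for r in rows)
    return all(n == 1 for n in counts.values())


def composite_key(row):
    """Build a unique ID by combining two fields (the field choice is an assumption)."""
    return f"{row['ucas_number']}-{row['cycle']}"


by_ucas = is_unique(applications, lambda r: r["ucas_number"])
by_composite = is_unique(applications, composite_key)

print("UCAS number alone unique?", by_ucas)
print("UCAS number + cycle unique?", by_composite)
```

In a spreadsheet the same composite ID could be built in a helper column by concatenating the two fields.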
• VLOOKUP and INDEX/MATCH
Absolute Cell References
• Not absolute (A2): rows and columns will both change
• Row only absolute (A$2): will always reference row 2, but
columns will change
• Column only absolute ($A2): will always reference column A, but
rows will change
• All absolute ($A$2): will always reference cell A2
If you copy or drag a formula, what happens?
Consider this carefully when using lookup formulas!
F4: while editing a formula, cycles the selected reference through these four types
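To see what happens when a formula is copied or dragged, the rule can be simulated in a few lines of Python. This is a simplified sketch: `copy_reference` is a hypothetical helper that handles single-cell references only, not ranges or sheet names.

```python
import re


def col_to_num(col):
    """Convert column letters to a number: A -> 1, Z -> 26, AA -> 27."""
    n = 0
    for ch in col:
        n = n * 26 + (ord(ch) - ord("A") + 1)
    return n


def num_to_col(n):
    """Convert a column number back to letters: 1 -> A, 27 -> AA."""
    letters = ""
    while n > 0:
        n, rem = divmod(n - 1, 26)
        letters = chr(ord("A") + rem) + letters
    return letters


def copy_reference(ref, rows_down, cols_right):
    """Show how a reference changes when its formula is copied elsewhere."""
    m = re.fullmatch(r"(\$?)([A-Z]+)(\$?)(\d+)", ref)
    if m is None:
        raise ValueError(f"not a single-cell reference: {ref}")
    col_lock, col, row_lock, row = m.groups()
    if not col_lock:  # no $ before the column letters, so the column moves
        col = num_to_col(col_to_num(col) + cols_right)
    if not row_lock:  # no $ before the row number, so the row moves
        row = str(int(row) + rows_down)
    return f"{col_lock}{col}{row_lock}{row}"


# Copy each reference type one row down and one column to the right.
for ref in ["A2", "A$2", "$A2", "$A$2"]:
    print(ref, "->", copy_reference(ref, 1, 1))
```

For lookup formulas, the lookup range usually needs to be fully absolute so that it does not drift as the formula is filled down.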
VLOOKUP
1. The unique lookup value
2. The range (array) of cells your additional data is found in. Can be specified as:
• Columns: tab!B:D
• Specified cells: tab!A2:E300
• Named Range: NewDataV
3. The column reference where your new data is found. The ID column is always 1; count along to the right
4. Approximate match (TRUE or 1) or exact match (FALSE or 0). Use exact match 99% of the time
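The four arguments can be mimicked with a short Python function. It is a sketch of the exact-match case only; the function name `vlookup`, the sample table and the use of `None` in place of Excel's #N/A are assumptions for illustration.

```python
def vlookup(lookup_value, table, col_index, exact=True):
    """Mimic an exact-match VLOOKUP.

    lookup_value: the unique ID to find (argument 1)
    table:        rows of the new dataset, ID in the first column (argument 2)
    col_index:    1-based column to return; the ID column is 1 (argument 3)
    exact:        True corresponds to FALSE/0 in Excel (argument 4)
    """
    if not exact:
        raise NotImplementedError("only exact match is sketched here")
    for row in table:
        if row[0] == lookup_value:
            return row[col_index - 1]
    return None  # Excel would display #N/A here


# Hypothetical new dataset: ID first, additional data to the right of it.
new_data = [
    ["1001", "Alpha", "UG"],
    ["1002", "Beta", "PG"],
]

college = vlookup("1002", new_data, 2)
missing = vlookup("9999", new_data, 3)

print(college, missing)
```

Because the search always runs down the first column, any data to the left of the ID column cannot be returned.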
INDEX/MATCH
1. The column where your new data is
2. The unique lookup ID value in your original dataset
3. The matching ID column in your new dataset
4. Match type: 0 = exact
5. Column index number (1)
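The same lookup split into its two parts might look like this in Python. The helper names `match`, `index` and `index_match` and the sample columns are hypothetical; only the exact match type (0) is sketched, and the column index is left out because each result column here is a single list.

```python
def match(lookup_value, lookup_column):
    """Return the 1-based position of lookup_value (exact match), or None."""
    for position, value in enumerate(lookup_column, start=1):
        if value == lookup_value:
            return position
    return None


def index(result_column, position):
    """Return the value at a 1-based position in a column."""
    return result_column[position - 1]


def index_match(result_column, lookup_value, lookup_column):
    """Combine the two steps: find the row with MATCH, then fetch with INDEX."""
    position = match(lookup_value, lookup_column)
    return None if position is None else index(result_column, position)


# Hypothetical new dataset held as separate columns; the result column can sit
# anywhere relative to the ID column, including to its left.
colleges = ["Alpha", "Beta", "Alpha"]
ids = ["1001", "1002", "1003"]

found = index_match(colleges, "1002", ids)
not_found = index_match(colleges, "9999", ids)

print(found, not_found)
```

Keeping the ID column and the result column as separate arguments is what removes the left-to-right restriction.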
Demo
Matching Data
• VLOOKUP
• Simpler
• All new data must be to the RIGHT of the unique ID
column in the new dataset
• INDEX/MATCH
• Slightly more complex
• New data can be anywhere in the new dataset
DATA ANALYSIS II
PivotTables II
• Refresh a Dataset
• Use other summaries
• Average / Maximum / Minimum
• Add a calculation, e.g. to show % of Total
• Add multiple summaries, e.g. count and percentage
of students
• Group Data
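The additional summaries listed above can be sketched in plain Python as well. The dataset, the `summarise` and `age_band` helpers, and the ten-year band width are all assumptions chosen for illustration.

```python
from collections import defaultdict

# Hypothetical flat dataset; all values are made up.
students = [
    {"college": "Alpha", "age": 18, "credits": 120},
    {"college": "Alpha", "age": 24, "credits": 60},
    {"college": "Beta", "age": 19, "credits": 120},
    {"college": "Beta", "age": 31, "credits": 90},
]


def summarise(rows, group_field, value_field):
    """Several summaries per group: count, % of total, average, max, min."""
    groups = defaultdict(list)
    for r in rows:
        groups[r[group_field]].append(r[value_field])
    total_rows = len(rows)
    return {
        key: {
            "count": len(vals),
            "pct_of_total": round(100 * len(vals) / total_rows, 1),
            "average": sum(vals) / len(vals),
            "max": max(vals),
            "min": min(vals),
        }
        for key, vals in groups.items()
    }


def age_band(age, width=10):
    """Group a numeric value into bands such as 10-19 or 20-29."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


by_college = summarise(students, "college", "credits")

# Grouping: add a band column first, then summarise on it.
banded = [dict(r, age_band=age_band(r["age"])) for r in students]
by_band = summarise(banded, "age_band", "credits")

print(by_college)
print(by_band)
```

Refreshing a pivot table corresponds to re-running the summary after the underlying rows change.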
DATA WORKSHOP
Data Workshop
• Use your own data to explore the techniques we
have covered
• Or – use the model data provided!
• Play with pivots and data joins
• Think about how these techniques can help you in
your job
• Ask for help – don’t be shy!
Conclusion
• A Pivot Table is a powerful tool for providing
summaries of large datasets
• Underlying data must be cleaned and shaped
before pivoting
• Underlying data can be enriched with data from
other spreadsheets using VLOOKUP or
INDEX/MATCH
QUESTIONS?