An Introduction to Qualitative
Data Analysis for NVivo
Nyree Mason,
Quantitative & Qualitative Research Consultant,
Statistician & Trainer
Purpose of this Presentation
NVivo is a complex program which can be
confusing for people with little or no
experience in conducting qualitative
research.
This presentation aims to provide you with
an introduction to the process of qualitative
data analysis with specific reference to
working in NVivo.
You will learn about:
Creating Coding Schemes
Applying Coding Schemes
Ensuring Reliability and Validity
NVivo-Specific Terminology
The Basic Steps Involved in Setting
Up an NVivo Project
The Purpose of Qualitative Data
Analysis
To use qualitative data to answer
research questions or collect
evidence by:
• Categorising,
• Synthesising, and
• Summarising data
Qualitative Research Methods
To do this we use a standardised method of
analysis called a CODING SCHEME.
The Codes are research themes and act like the
Dependent Variables in your project.
Coding is a process of identifying sections of
data where research themes of interest occur
and applying descriptive labels to them.
We can use Codes to quantify qualitative data
to some extent.
Defining a Coding Scheme
Coding schemes can be constructed
(at least initially) based on:
A Priori (Confirmatory):
research questions/hypotheses
previous research findings
A Posteriori (Exploratory):
info from a random sample or all
of your data (Grounded Theory)
an NVivo Word Frequency Query
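To give a rough feel for what a word frequency count can surface when building an exploratory coding scheme, here is a minimal Python sketch. It is not NVivo's implementation; the function name `word_frequencies`, the stop-word list and the sample responses are all made up for illustration.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real project would use a fuller one.
STOP_WORDS = {"the", "a", "is", "and", "of", "to", "in", "are", "i", "it"}

def word_frequencies(responses, top_n=5):
    """Count the most common non-stop-words across a list of text responses."""
    counts = Counter()
    for text in responses:
        words = re.findall(r"[a-z']+", text.lower())
        counts.update(w for w in words if w not in STOP_WORDS)
    return counts.most_common(top_n)

# Made-up responses, for illustration only.
responses = [
    "The fishing season is shorter and the weather is changing.",
    "Weather patterns are changing; fishing is harder.",
    "Pollution is hurting the fishing economy.",
]
top_words = word_frequencies(responses, top_n=3)
print(top_words)
```

Frequent words are only candidate themes; whether each one becomes a code still depends on its relevance to the research questions.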
Defining a Coding Scheme
It is usually an iterative process using both top-down & bottom-up methods. Some general considerations are:
Is it relevant vs interesting?
The KISS principle (Keep It Simple Stupid)
Positive vs Negative contexts and attitudes
“Outliers”
Free vs Tree coding
Tree vs Free Coding
Codes can be:
Hierarchical (“Tree”):
– Parent Nodes (e.g., Assessment)
– Child Nodes (e.g., MCQ, SAQ, Essay)
[allows for “aggregation” of children]
Stand-Alone (“Free”)
NB: Avoid “grandchildren”. NO duplicates. Have
succinct labels.
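The idea of "aggregating" children into a parent can be sketched in Python as follows. This is only an illustration of the concept, not how NVivo stores Nodes; the reference IDs and the `aggregate` helper are hypothetical.

```python
# Parent Node -> Child Nodes -> set of coded reference IDs (made-up IDs).
tree_nodes = {
    "Assessment": {
        "MCQ": {"ref1", "ref2"},
        "SAQ": {"ref2", "ref3"},
        "Essay": {"ref4"},
    },
}

def aggregate(parent):
    """Union of all references coded at the parent's child Nodes."""
    refs = set()
    for child_refs in tree_nodes[parent].values():
        refs |= child_refs
    return refs

assessment_refs = aggregate("Assessment")
print(sorted(assessment_refs))
```

Using a union rather than a sum means a reference coded at two children (here `ref2`) is counted once at the parent.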
The problem of subjectivity
You’ll need to devise a coding scheme which is
as objective as possible and satisfies the
requirements of:
Reliability – other researchers would find the
same results & you would find the same
results if you re-analysed the data;
Validity – you are measuring/assessing what
you believe/say you are.
Defining Codes
Ideally, specify exactly & unambiguously
what the code means and under what
conditions it will be applied to your data.
If someone else was coding your data, they
should be able to understand exactly how to
correctly apply the codes and they would apply
them the same way you would.
Ensuring Reliability & Validity
Assess the Reliability and Validity of your
coding scheme early in the coding
process using:
Coding Comparison measures
Discussions with colleagues/subject experts
Ensuring Objectivity
Try to eliminate bias as much as possible:
Read with an open mind
Read to both confirm AND falsify any preconceptions/theories/hypotheses you have (and all humans have them so it’s best to make them explicit if you can).
How to Code
Applying Codes
Reliability and Validity Checks
Coding Context
Applying a Coding Scheme
In simple terms, the process of coding is
like highlighting sections of text in an
article and then applying one or more
descriptive labels to it.
Consider Coding Context
How much information you include in your coding
reference (e.g., partial/whole sentences,
paragraphs) is very important for revision and
analysis.
Include enough information that, during revision,
you can judge whether your coding was appropriate.
Exclude superfluous information if you want to
avoid spurious correlations in analysis.
Example Code: “TV is not educational”
“… television....”
vs.
“I find television very educating.”
vs.
“I find television very educating. Every
time somebody turns on the set, I go into
the other room and read a book.”
Example Coding Query
[Figure: coded passage labelled Relevant Text, Irrelevant Text, Relevant Text]
CONDUCTING QUALITATIVE
DATA ANALYSIS
The process in NVivo
Terminology in NVivo: Nodes
Coding is stored in NVivo as project
items called Nodes (folders containing
all the data you code as belonging to a
theme). You “code your data at Nodes”.
E.g., The “Habitat” Node stores all the
references you have coded as being
related to the concept of “Habitat”. Open
the Node, and you see all the references.
Handy Hint: Contextual Nodes
It’s also a good idea to have a Node for each
question in a survey or interview. Then you can
analyse responses to individual questions more
easily.
Attitudes and non-verbal cues are other
examples of Nodes you can use for analysing
themes in context (positive/negative, sad/angry
etc.)
Terminology in NVivo: Cases
Cases are very important NVivo project
items and store all data references relating
to individual “Cases” (units of measurement
such as people, articles, policies etc.).
Case Nodes allow you to count how many
people/articles mentioned themes X, Y and Z
E.g., The Case item called “Bob” is where all of
Bob’s survey responses are stored.
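Conceptually, Cases let you ask "which people mentioned theme X?". A minimal Python sketch of that idea is shown below; it is not NVivo's data model, and every Case and theme other than "Bob" and "Habitat" is made up for illustration.

```python
# Case -> set of theme Nodes that the Case's data has been coded at.
coding = {
    "Bob": {"Habitat", "Pollution"},
    "Alice": {"Habitat"},
    "Chen": {"Pollution"},
}

def cases_mentioning(theme):
    """Return the Cases (alphabetically) with at least one reference at a theme."""
    return sorted(case for case, themes in coding.items() if theme in themes)

habitat_cases = cases_mentioning("Habitat")
print(habitat_cases, len(habitat_cases))
```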
Terminology in NVivo: Classifications
Case Classifications: describe the Cases in
your analysis (e.g., demographics like age and
gender). You MUST include all the information
you need for analysis (independent/breakdown
variables).
Source Classifications: describe your
reference material and data sources (e.g.,
interviews, surveys, bibliographic information).
These mainly help with limiting searches and
keeping track of important information.
Terminology in NVivo:
Attributes & Values
Attributes: define all the variables that
each Classification needs to have.
Values: the possible values each Attribute
has.
[Setting values allows you to create a drop-down
menu making data entry easier and reducing
errors]
Classifications, Attributes & Values
Case Classification: Person
Attribute: Age
Values: 18-21, 22-25, 26-29
Attribute: Gender
Values: Male, Female, Other
Classification Sheets
“Person” Classification Sheet
(column headings after ID are Attributes; cell entries are Values)
ID   Gender   Age     State/Territory
1    Female   26-29   NSW
2    Male     26-29   ACT
3    Male     22-25   Vic
4    Other    18-21   SA
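The way pre-set Values reduce data entry errors can be sketched in Python. This is an illustration under assumptions, not NVivo functionality: the `invalid_entries` helper is hypothetical, State/Territory is left out because its full Value list is not given here, and row 5 is an invented entry with a deliberate typo.

```python
# The "Person" Case Classification: Attribute -> allowed Values.
PERSON = {
    "Gender": ["Male", "Female", "Other"],
    "Age": ["18-21", "22-25", "26-29"],
}

def invalid_entries(sheet, classification):
    """Return (case_id, attribute, value) for every value not in the allowed list."""
    problems = []
    for case_id, row in sheet.items():
        for attribute, allowed in classification.items():
            value = row.get(attribute)
            if value not in allowed:
                problems.append((case_id, attribute, value))
    return problems

sheet = {
    1: {"Gender": "Female", "Age": "26-29"},
    2: {"Gender": "Male", "Age": "26-29"},
    3: {"Gender": "Male", "Age": "22-25"},
    4: {"Gender": "Other", "Age": "18-21"},
    5: {"Gender": "Mal", "Age": "22-25"},  # invented row with a typo
}
problems = invalid_entries(sheet, PERSON)
print(problems)
```

A drop-down menu prevents this kind of typo at entry time; the check above only shows what such a typo would look like in free-typed data.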
Handy Hint
Think of your Classification Attributes as the
Independent Variables in your research design.
E.g., if you think Political Affiliation might affect
how people respond to questions, “Affiliation”
is an IV and needs to be an Attribute of a
Person Classification.
Handy Hint
The Values of an Attribute MUST be mutually exclusive.
E.g., if one Case can be engaged in both
recreational and commercial fishing, you cannot
have one Fishing Attribute, because you can’t tick
more than one box. You need one Attribute for
Recreational AND one for Commercial, with
yes/no/not applicable options.
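The fishing example can be sketched in Python as a small helper that turns a "tick all that apply" answer into one yes/no/not applicable Attribute per option. The helper name `split_multi_select` and the use of `None` for a Case the question does not apply to are assumptions made for illustration.

```python
FISHING_TYPES = ["Recreational", "Commercial"]

def split_multi_select(selected, options):
    """Turn a multi-select answer into one Attribute per option.

    `selected` is the set of options the Case ticked, or None when the
    question does not apply to that Case.
    """
    if selected is None:
        return {option: "not applicable" for option in options}
    return {option: ("yes" if option in selected else "no") for option in options}

both = split_multi_select({"Recreational", "Commercial"}, FISHING_TYPES)
one = split_multi_select({"Commercial"}, FISHING_TYPES)
print(both)
print(one)
```

Each resulting Attribute has exactly one Value per Case, so a Case engaged in both kinds of fishing is recorded without breaking the one-Value rule.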
Handy Hint
For interviews conducted by multiple people,
consider creating an Attribute for Individual
Interviewers within your “Interview” Source
Classification. Sometimes responses vary
greatly depending on who asks the questions,
and it’s worth assessing.
Analysing in NVivo: STEP 1
Determine your units of measurement, and
these will become your Cases. For example, if you want to know the number of:
people saying X, Y & Z
→ need a Case item for each person
journal articles including concept Y
→ need a Case item for each article
[NB Case project items are more easily created during or
after importing data]
Analysing in NVivo: STEP 2
Create all your Case & Data File
Classifications, associated Attributes and
Values.
It is much quicker and easier if you set these
up BEFORE importing data, as you can
Classify your Data Files and Cases during
data import.
Analysing in NVivo: STEP 3
Import and organise your qualitative data.
Put all the data you need to code & analyse in the
Files Folder. Other peripheral documents can be
included as linked files in the Externals Folder.
This will save computing power and make NVivo
run better.
Analysing in NVivo: STEP 4
Create your Cases (if you haven’t done so
during import), and give them relevant
unique ID numbers/Names etc.
Classify your Cases so they appear in the
Classification Sheet for data entry.
Analysing in NVivo: STEP 5
Open the Classification Sheet and apply
the relevant Attribute Values to your
Cases.
Analysing in NVivo: STEP 6
RECOMMENDED: Define your coding
scheme and create a Node for each code
you have in the Nodes Folder of the
Code area.
This can be done/altered at any time, but it
does help to keep the coding within the scope
of your research project. Plus it makes any
implicit assumptions explicit; thus allowing for
active reading to confirm & falsify (à la Popper).
Analysing in NVivo: STEP 7
CODE! ☺
Start with a small (random) sample of data first
and check reliability & validity.
Checking Reliability
Ideally 3 people code the same small
random sample of your data. You can
check the inter-coder reliability in NVivo
using a Coding Comparison Query.
Handy Hints: You could also just recode the
same data weeks later under a different user
ID. Focus only on coding difficult/ambiguous
concepts to save time. Delete duplicate coding
once reliability is established.
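NVivo's Coding Comparison Query reports percentage agreement and a Kappa coefficient. The Python sketch below shows the standard Cohen's kappa calculation for two coders so you can see what those figures mean. It is a simplification under assumptions: each data unit (e.g., a sentence) gets a single yes/no decision for one Node, the ratings are made up, and this is not a reproduction of NVivo's own calculation. With three coders, a measure like this would be computed for each pair.

```python
def agreement_and_kappa(coder_a, coder_b):
    """Percentage agreement and Cohen's kappa for two coders' yes/no decisions.

    Each list holds 1 (unit coded at the Node) or 0 (not coded), one per unit.
    """
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("Both coders must rate the same non-empty set of units")
    n = len(coder_a)
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    p_a = sum(coder_a) / n  # proportion of units coder A coded
    p_b = sum(coder_b) / n  # proportion of units coder B coded
    # Agreement expected by chance: both say yes, or both say no.
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    if expected == 1:
        return observed, 1.0  # both coders gave the same constant rating
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa

# Made-up decisions for ten units.
coder_a = [1, 1, 0, 0, 1, 0, 1, 0, 0, 0]
coder_b = [1, 0, 0, 0, 1, 0, 1, 1, 0, 0]
observed, kappa = agreement_and_kappa(coder_a, coder_b)
print(f"agreement = {observed:.2f}, kappa = {kappa:.2f}")
```

Kappa is lower than raw agreement because it discounts the agreement two coders would reach by chance alone.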
Analysing in NVivo: STEP 8
ANALYSE! ☺
When you have completed coding, you can
conduct Queries in NVivo to help analyse and
summarise your data.
NOW YOU’RE READY TO
USE NVIVO!
EXTRA EXAMPLES
Developing a Coding Scheme
Example: Climate Change
Example Hypotheses
RESEARCH QUESTION: What factors do
respondents believe are responsible for the
decline in the fishing economy?
• How many believe Climate Change is a factor?
• Of these people, do they see Climate Change
as being caused by human influences, natural
influences or both?
• How many do not believe Climate Change is a
factor because it does not exist?
Coding Scheme Examples
PARENT CODE - Climate change
RULES - Any mention of the following:
terms such as “climate change” or “global warming”,
substantial change in average weather conditions,
increasing/decreasing average temperatures,
change in the time variation of weather,
increase/decrease in the length of seasons or in
number of distinct seasons,
change in amount of extreme weather events.
Coding Scheme Examples cont.
CHILD CODES & RULES:
Caused by Human Influences: Mentions that it is
directly/indirectly caused by pollution, agriculture,
increased population, etc.
Caused by Natural Influences: Mentions that it is caused by
factors beyond human influence such as ocean
variation, orbital variations, solar output, volcanic
activity, seismic activity, etc.
Does not exist: Explicitly rejects the notion that
Climate Change is responsible because it isn’t real.
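Once Cases are coded with this scheme, the example hypotheses reduce to simple counts. The Python sketch below illustrates the logic only; the coded Cases are invented, and the `summarise` helper and its category labels are hypothetical. In NVivo you would obtain these counts by running Queries.

```python
CLIMATE = "Climate change"
HUMAN = "Caused by Human Influences"
NATURAL = "Caused by Natural Influences"
DENIES = "Does not exist"

# Case -> codes applied to that Case's responses (invented data).
coded_cases = {
    "Case 1": {CLIMATE, HUMAN},
    "Case 2": {CLIMATE, HUMAN, NATURAL},
    "Case 3": {CLIMATE, NATURAL},
    "Case 4": {CLIMATE, DENIES},
    "Case 5": set(),  # never mentioned climate change
}

def summarise(cases):
    """Count Cases for each of the example hypotheses."""
    summary = {"factor": 0, "human only": 0, "natural only": 0,
               "both": 0, "cause not stated": 0, "does not exist": 0}
    for codes in cases.values():
        if DENIES in codes:
            summary["does not exist"] += 1
        elif CLIMATE in codes:
            summary["factor"] += 1
            human, natural = HUMAN in codes, NATURAL in codes
            if human and natural:
                summary["both"] += 1
            elif human:
                summary["human only"] += 1
            elif natural:
                summary["natural only"] += 1
            else:
                summary["cause not stated"] += 1
    return summary

summary = summarise(coded_cases)
print(summary)
```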