Date post: | 21-Dec-2015 |
Data Collection in the Field, Response Error, and Questionnaire Screening
Nonsampling Error in Marketing Research
• Nonsampling (administrative) error includes:
  • All types of nonresponse error
  • Data gathering errors
  • Data handling errors
  • Data analysis errors
  • Interpretation errors
Possible Errors in Field Data Collection
• Field worker error: errors committed by the persons who administer the questionnaires
• Respondent error: errors committed on the part of the respondent
Nonsampling Errors Associated With Fieldwork
Field-Worker Errors: Intentional
• Intentional field worker error: errors committed when a fieldworker willfully violates the data collection requirements set forth by the researcher
  • Interviewer cheating: occurs when the interviewer intentionally misrepresents respondents. May be caused by an unrealistic workload and/or a poor questionnaire
  • Leading respondents: occurs when the interviewer influences the respondent’s answers through wording, voice inflection, or body language
Field-Worker Errors: Unintentional
• Unintentional field worker error: errors committed when an interviewer believes he or she is performing correctly
  • Interviewer personal characteristics: occurs because of the interviewer’s personal characteristics such as accent, sex, and demeanor
  • Interviewer misunderstanding: occurs when the interviewer believes he or she knows how to administer a survey but instead does it incorrectly
  • Fatigue-related mistakes: occur when the interviewer becomes tired
Respondent Errors: Intentional
• Intentional respondent error: errors committed when respondents willfully misrepresent themselves in surveys
  • Falsehoods: occur when respondents fail to tell the truth in surveys
  • Nonresponse: occurs when the prospective respondent fails 1) to take part in a survey or 2) to answer specific survey questions
  • Refusals (respondent does not answer any questions) vs. terminations (respondent answers at least one question, then stops)
Respondent Errors: Intentional…cont.
• Refusals typically result from the topic of the study or the potential respondent’s lack of time, energy, or desire to participate
• Terminations result from a poorly designed questionnaire, questionnaire length, lack of time or energy, and/or external interruption
Respondent Errors: Unintentional
• Unintentional respondent error: errors committed when a respondent gives a response that is not valid but that he or she believes is the truth
Respondent Errors: Unintentional…cont.
• Respondent misunderstanding: occurs when a respondent gives an answer without comprehending the question and/or the accompanying instructions
• Guessing: occurs when a respondent gives an answer when he or she is uncertain of its accuracy
• Attention loss: occurs when a respondent’s interest in the survey wanes
• Distractions: interruptions and similar disturbances that may occur while the questionnaire is being administered
• Fatigue: occurs when a respondent becomes tired of participating in a survey
How to Control Data Collection Errors
Types of Errors: Control Mechanisms
Intentional field worker errors
• Cheating: good questionnaire, reasonable work expectations, supervision, random checks
• Leading respondent: validation
Unintentional field worker errors
• Interviewer characteristics: selection and training of interviewers
• Misunderstandings: orientation sessions and role playing
• Fatigue: require breaks and alternate surveys
How to Control Data Collection Errors…cont.
Types of Errors: Control Mechanisms
Intentional respondent errors
• Falsehoods: assuring anonymity and confidentiality, incentives, validation checks, third-person technique
• Nonresponse: assuring anonymity and confidentiality, incentives, third-person technique
How to Control Data Collection Errors…cont.
Types of Errors: Control Mechanisms
Unintentional respondent errors
• Misunderstandings: well-drafted questionnaire, direct questions (“Do you understand?”)
• Guessing: well-drafted questionnaire, response options (e.g., “unsure”)
• Attention loss, distractions, fatigue: reversal of scale endpoints, prompters
Data Collection Errors with Online Surveys
• Multiple submissions by the same respondent (not able to identify such situations)
• Bogus respondents and/or responses (“fictitious person,” disguises or misrepresents self)
• Misrepresentation of the population (over-representing or under-representing segments with/without online access and use)
Nonresponse Error
• Nonresponse: failure on the part of a prospective respondent to take part in a survey or to answer specific questions on the survey
  • Refusals to participate in the survey
  • Break-offs (terminations) during the interview
  • Refusals to answer certain questions (item omissions)
• A completed interview must be defined (acceptable levels and types of non-answered questions).
Nonresponse Error…cont.
• Response rate: the percentage of the total sample with which interviews were completed
  • Refusals to participate in the survey
  • Break-offs (terminations) during the interview
  • Refusals to answer certain questions (item omissions)
Nonresponse Error…cont.
CASRO response rate formula (not mathematically correct):
Reducing Nonresponse Error
• Mail surveys:
  • Advance notification
  • Monetary incentives
  • Follow-up mailings
• Telephone surveys:
  • Callback attempts
Preliminary Questionnaire Screening
• Unsystematic (flip through questionnaire stack and look at some) and systematic (random or systematic sampling procedure to select) checks of completed questionnaires
• What to look for in questionnaire inspection:
  • Incomplete questionnaires?
  • Nonresponses to specific questions?
  • Yea- or nay-saying patterns (use of scale extremes only)?
  • Middle-of-the-road patterns (neutrals on all)?
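The inspection checks above can be sketched in code. This is only an illustration: the function name, the 1-to-5 scale, and the assumption that the top of the scale means agreement are hypothetical and do not come from the slides.

```python
def screening_flags(answers, scale_min=1, scale_max=5):
    """Flag one completed questionnaire for the patterns listed above.

    answers: list of scale responses; None marks an unanswered question.
    Assumption: scale_max means "agree" (yea-saying) and scale_min
    means "disagree" (nay-saying).
    """
    answered = [a for a in answers if a is not None]
    flags = []
    if len(answered) < len(answers):
        flags.append("item nonresponse")
    if answered:
        midpoint = (scale_min + scale_max) / 2
        if all(a == scale_max for a in answered):
            flags.append("yea-saying")
        elif all(a == scale_min for a in answered):
            flags.append("nay-saying")
        elif all(a == midpoint for a in answered):
            flags.append("middle-of-the-road")
    return flags


print(screening_flags([5, 5, 5, 5]))
print(screening_flags([3, 3, None, 3]))
```

A flagged questionnaire is a candidate for closer inspection, not an automatic rejection.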
Unreliable Responses
• Unreliable responses are found when conducting questionnaire screening, and an inconsistent or unreliable respondent may need to be eliminated from the sample.
Determining the Sample Plan
The sample plan is the process followed to select units from the population to be used in the sample
Basic Concepts in Samples and Sampling
• Population: the entire group under study as defined by research objectives. Sometimes called the “universe.”
Researchers define populations in specific terms such as heads of households, individual person types, families, types of retail outlets, etc.
Population geographic location and time of study are also considered.
Basic Concepts in Samples and Sampling
• Sample: a subset of the population that should represent the entire group
• Sample unit: the basic level of investigation…consumers, store managers, shelf-facings, teens, etc. The research objective should define the sample unit
• Census: an accounting of the complete population
Basic Concepts in Samples and Sampling…cont.
• Sampling error: any error that occurs in a survey because a sample is used (random error)
• Sample frame: a master list of the population (total or partial) from which the sample will be drawn
• Sample frame error (SFE): the degree to which the sample frame fails to account for all of the defined units in the population (e.g., a telephone book listing does not contain unlisted numbers).
Basic Concepts in Samples and Sampling…cont.
• Calculating sample frame error (SFE): Subtract the number of items on the sampling list from the total number of items in the population.
Take this number and divide it by the total population. Multiply this decimal by 100 to convert to percent (SFE must be expressed in %)
If the SFE was 40% this would mean that 40% of the population was not in the sampling frame
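The calculation described above can be written as a short function. The function name and the example counts are illustrative assumptions.

```python
def sample_frame_error(population_size, frame_size):
    """Sample frame error (SFE) as a percentage of the population.

    SFE = (population items - items on the sampling list) / population * 100
    """
    if population_size <= 0:
        raise ValueError("population_size must be positive")
    missing = population_size - frame_size
    return 100 * missing / population_size


# Hypothetical numbers: 10,000 households, 6,000 of them on the list.
print(sample_frame_error(10_000, 6_000))
```

With these numbers the result is 40, meaning 40% of the population is not in the sampling frame.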
Reasons for Taking a Sample
• Practical considerations such as cost and population size
• Inability of researcher to analyze large quantities of data potentially generated by a census
• Samples can produce sound results if proper rules are followed for the draw
Basic Sampling Classifications
• Probability samples: ones in which members of the population have a known chance (probability) of being selected
• Non-probability samples: instances in which the chances (probability) of selecting members from the population are unknown
Probability Sampling Methods: Simple Random Sampling
• Simple random sampling: the probability of being selected is “known and equal” for all members of the population
  • Blind draw method (e.g., names “placed in a hat” and then drawn randomly)
  • Random numbers method (all items in the sampling frame are given numbers, and numbers are then drawn using a table or computer program)
• Advantages:
  • Known and equal chance of selection
  • Easy method when there is an electronic database
• Disadvantages (the first two are overcome with an electronic database):
  • Complete accounting of the population needed
  • Cumbersome to provide unique designations to every population member
  • Very inefficient when applied to a skewed population distribution (over- and under-sampling problems); this is not overcome by using an electronic database
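A minimal sketch of the random numbers method, assuming the sampling frame is held as a Python list. The function name and the `seed` parameter are illustrative; `random.sample` draws without replacement, so every member has the same chance of selection.

```python
import random


def simple_random_sample(frame, n, seed=None):
    """Draw n members from the sampling frame with equal probability."""
    rng = random.Random(seed)  # a seed makes the draw repeatable
    return rng.sample(frame, n)


# Hypothetical frame of 100 numbered customers.
frame = [f"customer_{i}" for i in range(1, 101)]
print(simple_random_sample(frame, 5, seed=42))
```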
Probability Sampling Methods: Systematic Sampling
• Systematic sampling: a way to select a probability-based sample from a directory or list. This method is at times more efficient than simple random sampling.
• Sampling interval (SI) = population list size (N) divided by a pre-determined sample size (n)
• How to draw:
  1) calculate SI,
  2) randomly select a number between 1 and SI,
  3) go to this number as the starting point; the item at this position on the list is the first in the sample,
  4) add SI to the position number of this item; the item at the new position is the second sampled item,
  5) continue this process until the desired sample size is reached.
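The drawing steps above can be sketched as follows. The function name is illustrative, and rounding SI down to a whole number is an assumption the slides do not address.

```python
import random


def systematic_sample(frame, n, seed=None):
    """Systematic sample of n items from a list (the sampling frame)."""
    rng = random.Random(seed)
    interval = len(frame) // n          # 1) SI = N / n, rounded down (assumption)
    if interval < 1:
        raise ValueError("sample size exceeds list size")
    start = rng.randint(1, interval)    # 2) random start between 1 and SI
    # 3)-5) take the start position, then every SI-th position after it
    positions = [start + k * interval for k in range(n)]
    return [frame[p - 1] for p in positions]  # positions are 1-based


frame = list(range(1, 1001))            # hypothetical list of 1,000 entries
sample = systematic_sample(frame, 50, seed=7)
print(sample[:3])
```

Only the starting position is random; every later pick is fixed by SI, which is why periodic patterns in the list can bias the result.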
Probability Sampling Methods: Systematic Sampling…cont.
• Advantages:
  • Known and equal chance of any of the SI “clusters” being selected
  • Efficiency: no need to designate (assign a number to) every population member, just those early on the list (unless there is a very large sampling frame)
  • Less expensive and faster than SRS
• Disadvantages:
  • Small loss in sampling precision
  • Potential “periodicity” problems
Probability Sampling Methods: Cluster Sampling
• Cluster sampling: method by which the population is divided into groups (clusters), any of which can be considered a representative sample.
• These clusters are mini-populations and therefore are heterogeneous.
• Once clusters are established, a random draw is done to select one (or more) clusters to represent the population.
• Area and systematic sampling (discussed earlier) are two common methods.
Probability Sampling Methods: Cluster Sampling…cont.
• Advantages:
  • Economic efficiency: faster and less expensive than SRS
  • Does not require a list of all members of the universe
• Disadvantage:
  • Cluster specification error: the more homogeneous the cluster chosen, the more imprecise the sample results
Probability Sampling Methods: Cluster Sampling, Area Method
• Drawing the area sample:
  • Divide the geographic area into sectors (sub-areas) and give them names/numbers, determine how many sectors are to be sampled (typically a judgment call), and randomly select these sub-areas. Do either a census or a systematic draw within each selected sub-area.
  • To estimate the total for the whole geographic area, add the counts in the sampled sub-areas together and multiply this number by the ratio of the total number of sub-areas to the number of sub-areas sampled.
A two-step area cluster sample (sampling several clusters) is preferable to a one-step (selecting only one cluster) sample unless the clusters are homogeneous
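The area estimate described above is a simple scaling of the sampled counts. In this sketch the function name and the example counts are hypothetical.

```python
def area_total_estimate(sampled_counts, total_subareas):
    """Estimate the total for the whole area from the sampled sub-areas.

    estimate = (sum of sampled counts) * (total sub-areas / sub-areas sampled)
    """
    if not sampled_counts:
        raise ValueError("at least one sub-area must be sampled")
    return sum(sampled_counts) * total_subareas / len(sampled_counts)


# Hypothetical: 3 of 30 sectors were sampled, with counts 120, 80, and 100.
print(area_total_estimate([120, 80, 100], 30))
```

Here the three sampled sectors hold 300 units, so the estimate for all 30 sectors is 300 × 30 / 3 = 3,000.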
Probability Sampling Methods: Stratified Sampling
• This method is used when the population distribution of items is skewed.
• It allows us to draw a more representative sample.
• Hence, if there are more of a certain type of item in the population, the sample has more of this type, and if there are fewer of another type, there are fewer in the sample.
• Stratified sampling: the population is separated into homogeneous groups/segments/strata and a sample is taken from each. The results are then combined to get the picture of the total population.
• Sample stratum size determination:
  • Proportional method (stratum share of total sample is stratum share of total population)
  • Disproportionate method (variances among strata affect sample size for each stratum)
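The proportional method can be sketched as below. The function name, the strata, and their sizes are hypothetical; the disproportionate method is not shown because the slides give no allocation rule for it.

```python
def proportional_allocation(stratum_sizes, total_sample):
    """Give each stratum the same share of the sample as it has of the population."""
    population = sum(stratum_sizes.values())
    # Rounding can make the allocations differ slightly from total_sample.
    return {
        name: round(total_sample * size / population)
        for name, size in stratum_sizes.items()
    }


# Hypothetical population of 10,000 firms in three strata, sample of 500.
strata = {"small": 6000, "medium": 3000, "large": 1000}
print(proportional_allocation(strata, 500))
```

In this example the small, medium, and large strata receive 300, 150, and 50 sample units respectively.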
Probability Sampling Methods: Stratified Sampling…cont.
• Advantage:
  • More accurate overall sample of a skewed population…see next slide for WHY
• Disadvantage:
  • More complex sampling plan requiring different sample sizes for each stratum
Why is Stratified Sampling more accurate when there are skewed populations?
The less the variance in a group, the smaller the sample size it takes to produce a precise answer.
Why? If 99% of the population (low variance) agreed on the choice of brand A, it would be easy to make a precise estimate that the population preferred brand A even with a small sample size.
But, if 33% chose brand A, and 23% chose B, and so on (high variance) it would be difficult to make a precise estimate of the population’s preferred brand…it would take a larger sample size….
Stratified sampling allows the researcher to allocate a larger sample size to strata with more variance and a smaller sample size to strata with less variance. Thus, for the same total sample size, more precision is achieved.
This is normally accomplished by disproportionate sampling.
Nonprobability Sampling Methods: Convenience Sampling Method
• Convenience samples: samples drawn at the convenience of the interviewer. Interviewers tend to make the selection at familiar locations and to choose respondents who are like themselves.
  • Error occurs because 1) members of the population who are infrequent users or non-users of that location are missed, and 2) those selected may not be typical of the population
Nonprobability Sampling Methods: Judgment Sampling Method
• Judgment samples: samples that require a judgment or an “educated guess” on the part of the interviewer as to who should represent the population. Also, “judges” (informed individuals) may be asked to suggest who should be in the sample.
  • Subjectivity enters in here, and certain members of the population will have a smaller chance, or no chance, of selection compared to others
Nonprobability Sampling Methods: Referral and Quota Sampling Methods
• Referral samples (snowball samples): samples which require respondents to provide the names of additional respondents
  • Members of the population who are less known, disliked, or whose opinions conflict with the respondent have a low probability of being selected.
• Quota samples: samples that set a specific number of certain types of individuals to be interviewed
  • Often used to ensure that convenience samples will have the desired proportion of different respondent classes
Online Sampling Techniques
• Random online intercept sampling: relies on a random selection of Web site visitors
• Invitation online sampling: is when potential respondents are alerted that they may fill out a questionnaire that is hosted at a specific Web site
• Online panel sampling: refers to consumer or other respondent panels that are set up by marketing research companies for the explicit purpose of conducting online surveys with representative samples
Developing a Sample Plan
• Sample plan: definite sequence of steps that the researcher goes through in order to draw and ultimately arrive at the final sample
Developing a Sample Plan: Six Steps
• Step 1: Define the relevant population.
  • Specify the descriptors, geographic locations, and time for the sampling units.
• Step 2: Obtain a population list, if possible; it may only be some type of sample frame.
  • List brokers, government units, customer lists, competitors’ lists, association lists, directories, etc.
  • Incidence rate (occurrence of certain types in the population): the lower the incidence, the larger the list needed to draw the sample from
• Step 3: Design the sample method (size and method).
  • Determine the specific sampling method to be used. All necessary steps must be specified (sample frame, n, … recontacts, and replacements)
• Step 4: Draw the sample.
  • Select the sample unit and gain the information
  • Drop-down substitution
  • Oversampling
  • Resampling
• Step 5: Assess the sample.
  • Sample validation: compare sample profile with population profile; check non-responders
• Step 6: Resample if necessary.
Determining the Size of a Sample
Sample Accuracy
• Sample accuracy: refers to how close a random sample’s statistic (e.g. mean, variance, proportion) is to the population’s value it represents (mean, variance, proportion)
• Important points:
  • Sample size is NOT related to representativeness: you could sample 20,000 persons walking by a street corner and the results would still not represent the city; however, an n of 100 could be “right on.”
  • Sample size, however, IS related to accuracy. How close the sample statistic is to the actual population parameter (e.g., sample mean vs. population mean) is a function of sample size.
Sample Size Axioms
To properly understand how to determine sample size, it helps to understand the following axioms:
• The only perfectly accurate sample is a census.
• A probability sample will always have some inaccuracy (sample error).
• The larger a probability sample is, the more accurate it is (less sample error).
• Probability sample accuracy (error) can be calculated with a simple formula and expressed as a ± % value.
• You can take any finding in the survey, replicate the survey with the same probability sample plan and size, and you will be “very likely” to find the same result within the ± range of the original findings.
• In almost all cases, the accuracy (sample error) of a probability sample is independent of the size of the population.
• A probability sample can be a very tiny percentage of the population size and still be very accurate (have little sample error).
• The size of the probability sample depends on the client’s desired accuracy (acceptable sample error) balanced against the cost of data collection for that sample size.
There is only one method of determining sample size that allows the researcher to PREDETERMINE the accuracy of the sample results…
The Confidence Interval Method of Determining Sample Size
Notion of Confidence Interval
• Confidence interval: range whose endpoints define a certain percentage of the responses to a question
• Central limit theorem: a theory that holds that the values (e.g., means or percentages) obtained from repeated samples of a survey within a population would be distributed like a normal curve. The mean of all sample means is the mean of the population.
• Confidence interval approach: applies the concepts of accuracy, variability, and confidence interval to create a “correct” sample size
• Two types of error:
  • Nonsampling error: pertains to all sources of error other than sample selection method and sample size
  • Sampling error: involves sample selection and sample size; this is the error that we are controlling through formulas
• Sample error formula:
• The relationship between sample size and sample error:
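The formula and chart from these slides are not reproduced in the text. As a hedged sketch, the standard sample error formula for a percentage, which is consistent with the z, p, q, and e values used in the worked example later in this deck, is sample error = z × √(p × q / n). The function name below is illustrative.

```python
import math


def sample_error(p, n, z=1.96):
    """± sample error in percentage points for a percentage p and sample size n.

    Assumed standard formula: e = z * sqrt(p * q / n), with q = 100 - p.
    """
    q = 100 - p
    return z * math.sqrt(p * q / n)


# Worst-case variability (p = 50%) at 95% confidence for several sample sizes.
for n in (100, 500, 1000, 2000):
    print(n, round(sample_error(50, n), 1))
```

Under this formula the error shrinks with the square root of n, so each doubling of the sample buys a smaller gain in accuracy than the one before.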
The Confidence Interval Method of Determining Sample Size: Proportions
Variability
• Variability: refers to how similar or dissimilar responses are to a given question
  • P (%): share that “have” or “are” or “will do,” etc.
  • Q (%): 100% - P%, share of “have nots” or “are nots” or “won’t dos,” etc.
N.B.: The more variability in the population being studied, the larger the sample size needed to achieve a stated accuracy level.
With nominal data (e.g., Yes/No), we can conceptualize answer variability with bar charts; the highest variability is 50/50.
The Central Limit Theorem allows us to use the logic of the Normal Curve Distribution
• Since 95% of the results of samples drawn from a population will fall within ± 1.96 standard errors of the population value
• (this logic is based upon our understanding of the normal curve)
• we can make the following statement:
If we conducted our study over and over, e.g., 1,000 times, we would expect our result to fall within a known range (± 1.96 s.d.’s of the mean). Based upon this, there are 95 chances in 100 that the true value in the universe (proportion, share, mean) falls within this range!
Normal Distribution
± 1.96 × s.d. defines the endpoints for 95% of the distribution
We also know that, given the amount of variability in the population, the sample size affects the size of the confidence interval; as n goes down, the interval widens (more “sloppy”)
So, what have we learned thus far?
There is a relationship among:
• the level of confidence we desire that our results be repeated within some known range if we were to conduct the study again, and…
• the variability (in responses) in the population and…
• the amount of acceptable sample error (desired accuracy) we wish to have and…
• the size of the sample.
Sample Size Formula
• The formula requires that we (a) specify the amount of confidence we wish to have, (b) estimate the variance in the population, and (c) specify the level of desired accuracy we want.
• When we specify the above, the formula tells us what sample size, n, we need to use.
Sample Size Formula - Proportion
• The sample size formula for estimating a proportion (also called a percentage or share):
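The formula on this slide is not reproduced in the text. A hedged reconstruction that matches the worked example later in the deck (z = 1.96, p = 42%, q = 58%, e = 5% gives n = 374) is n = z² × (p × q) / e². The function name below is illustrative.

```python
def sample_size_proportion(p, e, z=1.96):
    """Sample size needed to estimate a percentage p within ± e.

    Assumed standard formula: n = z^2 * (p * q) / e^2, with q = 100 - p.
    p and e must be in the same units (percentage points here).
    """
    q = 100 - p
    return z ** 2 * p * q / e ** 2


print(round(sample_size_proportion(42, 5)))   # values from the deck's example
print(round(sample_size_proportion(50, 5)))   # worst case, p = q = 50%
```

The function returns an unrounded value; the deck rounds to the nearest whole respondent, while some practitioners round up to stay within the target error.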
Practical Considerations in Sample Size Determination
• How to estimate variability (p and q shares) in the population
  • Expect the worst case (p = 50%; q = 50%)
  • Estimate variability from the results of previous studies, or conduct a pilot study
• How to determine the amount of desired sample error
  • Researchers should work with managers to make this decision. How much error is the manager willing to tolerate (less error = more accuracy)?
  • Convention is ± 5%
  • The more important the decision, the smaller the acceptable level of sample error should be
• How to decide on the level of confidence desired
  • Researchers should work with managers to make this decision. The higher the desired confidence level, the larger the sample size needed
  • Convention is the 95% confidence level (z = 1.96, which is ± 1.96 s.d.’s)
  • The more important the decision, the more likely the manager will want more confidence. For example, a 99% confidence level has z = 2.58.
Example: Estimating a Percentage (Proportion or Share) in the Population
What is the required sample size?
• Five years ago a survey showed that 42% of consumers were aware of the company’s brand (consumers were either “aware” or “not aware”)
• After an intense ad campaign, management will conduct another survey. They want to be 95% confident (95 chances in 100) that the survey estimate will be within ± 5% of the true share of “aware” consumers in the population.
• What is n?
Estimating a Percentage: What is n?
z = 1.96 (95% confidence)
p = 42% (p, q, and e must be in the same units)
q = 100% - p = 58%
e = ± 5%
n = 374
What does this mean?
It means that if we use a sample size of 374, after the survey, we can say the following of the results (assume results show that 55% are aware):
“Our most likely estimate of the percentage of consumers that are “aware” of our brand name is 55%. In addition, we are 95% confident that the true share of “aware” customers in the population falls between 50% and 60%.”
Note that e is expressed in percentage points, the same units as p and q, so the interval is 55% ± 5%.
Estimating a Mean
This requires a different formula
z is determined the same way (1.96 or 2.58)
e is expressed in terms of the units we are estimating, i.e., if we are measuring attitudes on a 1-7 scale, we may want our error to be no more than ± .5 scale units. If we are estimating dollars being paid for a product, we may want our error to be no more than ± $3.00.
s is a little more difficult to estimate, but must be in the same units as e.
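The formula for a mean is not reproduced in the text. A hedged reconstruction that matches the satisfaction example later in the deck (s = 1.7, z = 2.58, e = .5 gives n = 77) is n = s² × z² / e². The function name below is illustrative.

```python
def sample_size_mean(s, e, z=1.96):
    """Sample size needed to estimate a mean within ± e.

    Assumed standard formula: n = (s^2 * z^2) / e^2, where s is the
    estimated standard deviation in the same units as e.
    """
    return (s * z / e) ** 2


# Values from the deck's satisfaction example: s = 1.7, e = 0.5, 99% confidence.
print(round(sample_size_mean(1.7, 0.5, z=2.58)))
```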
Estimating “s” in the Formula to Determine the Sample Size Required to Estimate a Mean
Since we are estimating a mean, we can assume that our data are either interval or ratio. When we have interval or ratio data, the standard deviation of the sample, s, may be used as a measure of variability.
How to estimate s?
• Use the standard deviation of the sample from a previous study on the target population
• Conduct a pilot study of a few members of the target population and calculate s
Example: Estimating the Mean of a Population
What is the required sample size, n?
Management wants to know customers’ level of satisfaction with their service. They propose conducting a survey and asking for satisfaction on a scale from 1 to 10 (since there are 10 possible answers, the range = 10).
Management wants to be 99% confident in the results (99 chances in 100 that the true value is captured) and they do not want the allowed error to be more than ± .5 scale points.
What is n?
s = 1.7 (from a pilot study), z = 2.58 (99% confidence), and e = .5 scale points
n = 77
Assume the survey average score was 7.3. What does this “tell us?” A 10 is very satisfied and a 1 is not satisfied at all.
Answer: “Our most likely estimate of the level of consumer satisfaction is 7.3 on a 10-point scale. In addition, we are 99% confident that the true level of satisfaction in our consumer population falls between 6.8 and 7.8 on the scale.”
Other Methods of Sample Size Determination
• Arbitrary “percentage rule of thumb” sample size:
  • Arbitrary sample size approaches rely on erroneous rules of thumb (e.g., “n must be at least 5% of the population”).
  • Arbitrary sample sizes are simple and easy to apply, but they are neither efficient nor economical (e.g., using the “5 percent rule,” if the universe is 12 million, n = 600,000, a very large and costly result).
• Conventional sample size specification:
  • The conventional approach follows some “convention” or number believed somehow to be the right sample size (e.g., 1,000 to 1,200 used for national opinion polls with ± 3% error).
  • Using a conventional sample size can result in a sample that is too large or too small.
  • Conventional sample sizes ignore the special circumstances of the survey at hand.
• Statistical analysis requirements of sample size specification:
  • Sometimes the researcher’s desire to use a particular statistical technique influences sample size. As cross comparisons go up, cell-size requirements go up and n goes up.
• Cost basis of sample size specification:
  • With the “all you can afford” method, the sample size is based on budget factors rather than on the value of the information to be gained from the survey.
Special Sample Size Determination Situations
Sample Size Using Nonprobability Sampling
• When using nonprobability sampling, sample size is unrelated to accuracy, so cost-benefit considerations must be used
Theoretical Framework and Hypothesis Development
A theoretical framework is a conceptual model that shows the relationships between the factors affecting a phenomenon.
It is based on previous research that has been tested.
When developing theoretical frameworks…
1. Determine the relevant variables and define them
2. State the relationships between 2 or more variables and their directions
3. Determine the direction of relationships among variables
4. Explain why this direction of the relationship is expected
Types of Variables
Dependent variable:
• The variable that is the main interest of the research
• The aim is to explain the change in this variable
• Brand preference, brand loyalty, customer satisfaction, evaluation of advertising campaign
• Export performance, perceived image of Brand X
Independent variable:
• The variable which affects the dependent variable; in other words, the variable which causes the change in the dependent variable
• Store preference -------- planned shopping behaviour
• Adoption of internet banking ------- age
• Factors affecting supermarket preference -------- the importance given to price
Types of Variables
Mediating variable:
• The variable which creates the necessary condition for the relationship between the dependent and the independent variable
• Emotional attachment ------ consumer-company identification---corporate image
• Age------- shopping in supermarkets ------- frozen food
Intervening variable:
• The variable which emerges during the period in which the independent variable’s effect on the dependent variable is assessed
Hypothesis
Testable statements that assert relationships pre-determined on the basis of the theoretical framework
1. If-then statements
2. Directional or non-directional statements
NULL and ALTERNATIVE HYPOTHESIS
H0: states the relationship that we do not want to find.
• We expect to reject this hypothesis
• Therefore, we should formulate the statement in the NULL hypothesis as something we do not prefer to happen.
Examples
The firm will launch the product in a certain market if the market share (π) is more than 10%
H0: π ≤ 0.10
Ha: π > 0.10
The new formula of the X product should bring a better market share than the existing version of the product X
H0 : 0.10 Ha : 0.10
H0 : Ha :
Alternative Hypotheses: Examples
Ha: There is a relationship between internet banking and prior experience with technological products.
Ha: There is a relationship between usage of marketing research in international markets and firm size.
Ha: Status of foreign partnership in capital affects technology usage in logistics activities.
Ha: When brand preference is assessed, there is a difference between less loyal and more loyal consumer groups on brand reputation.
Ha: Age and gender affect purchase intention.
Hypothesis test types
Two major aims:
1. Understanding differences
2. Understanding relationships
Univariate tests:
• There is only one measurement for each item in a sample
• Variables are tested individually
Multivariate tests:
• There are 2 or more measurements for each observation
• Variables are tested simultaneously