Identifying Sources of Error: the 2007 Classification Error Survey for the US Census of
Agriculture
Jaki McCarthy and Denise Abreu
USDA’s National Agricultural Statistics Service
Presented at the International Total Survey Error Workshop
Tallberg, Sweden
June 2009
Errors in one survey can be measured with matching information from other sources
Target: Census of Agriculture
Alternate Source of Information: June Agricultural Survey
Error of Interest: Scoping Errors, i.e., Census Misclassification
Census farms incorrectly classified as non-farms
Census non-farms incorrectly classified as farms
Census of Agriculture
Conducted every 5 years
Count of all US Ag operations ($1000 or more in sales)
Primarily mail data collection
Data collected December - March
June Agricultural Survey (JAS)
Annual area frame based sample survey in June
JAS is primarily face to face interviews
Data collected in first 2 weeks of June
JAS has been used to measure undercoverage and misclassification on census
JAS – Area Frame Based
NASS Area Frame – Segment
Theoretically complete sampling frame
No overlap or gaps
Segments of land sampled
NASS Area Frame – Segment Enumeration
Sampled segments divided into tracts representing unique land operating arrangements
In-person interviewers screen for whether a tract is part of an agricultural operation and, if so, collect crop and livestock information
Background: Previous Classification Error Studies
Measured census classification error – records incorrectly classified as farms or non-farms and duplication
Census records matched to JAS
JAS was assumed as truth; differences between the two sources were designated as census misclassification
Overall census misclassification error was estimated
Net classification error was small and was not used to adjust census numbers
For these reasons, the study’s primary objective shifted
Current Classification Error Survey
To identify REASONS for discrepancies between the JAS and the Census
Qualitative examination of why errors occur
Classification errors
Reporting errors also examined
To provide information to improve quality of the data, reduce analyst review and editing
2007 CES Objective
Determine whether acreage/scoping differences are legitimate changes or errors
Determine why people report incorrectly
Determine if the forms were correctly processed
Methods
Census records matched to JAS records
Respondent records with scoping or acreage discrepancies were identified
Respondents re-interviewed and asked to resolve discrepancies
 | Census farm | Census non-farm
JAS farm | Match | Misclassification - undercount
JAS non-farm | Misclassification - overcount | Match
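The cross-classification above can be sketched in a few lines of Python. This is only an illustration of the 2x2 logic, not NASS's matching code; the function name `classify_pair` and its boolean inputs are hypothetical. The labels take the JAS as the reference, in line with the table.

```python
def classify_pair(census_is_farm: bool, jas_is_farm: bool) -> str:
    """Label one matched Census/JAS record pair using the 2x2 table.

    Hypothetical helper: the inputs are the farm/non-farm classifications
    each source assigned to the same operation.
    """
    if census_is_farm == jas_is_farm:
        return "match"
    if jas_is_farm:
        # JAS farm, Census non-farm: a farm the Census may have missed.
        return "misclassification - undercount"
    # JAS non-farm, Census farm: a farm the Census may have counted in error.
    return "misclassification - overcount"


# One example pair for each cell of the table (Census, JAS).
pairs = [(True, True), (False, True), (True, False), (False, False)]
labels = [classify_pair(census, jas) for census, jas in pairs]
print(labels)
```

A disagreement flagged this way is only a potential misclassification until the re-interview determines which source was right.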
Identifying Groups with Discrepancies
Group | Description | Action | Total
MATCH: Classification in agreement, acres comparable | Census farm / JAS farm OR Census non-farm / JAS non-farm | No Action | 1,629
MATCH: Classification in agreement, acres not within 25% | Census farm / JAS estimated farm | No Action; JAS Incorrect | 240
MATCH: Classification in agreement, acres not within 25% | Census farm / JAS farm | Re-interview | 1,122
Potential Scoping Errors: Classification Conflict | Census non-farm by NASS / JAS farm | FO Review Only | 158
Potential Undercount | Census non-farm / JAS farm | Re-interview | 185
Potential Undercount | Census non-farm / JAS estimated farm | No Action; JAS Incorrect | 53
Potential Overcount | Census farm / JAS non-farm | Re-interview | 279
Total | | | 3,666
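The 25% acreage screen used to separate the two MATCH groups could look something like the sketch below. The slides do not say which acreage is the base of the comparison, so measuring the difference relative to the JAS value is an assumption here, as are the function name `acres_comparable` and the handling of a zero base.

```python
def acres_comparable(census_acres: float, jas_acres: float,
                     tolerance: float = 0.25) -> bool:
    """Return True when the Census acreage is within `tolerance` of the JAS acreage.

    Assumption: the relative difference is taken against the JAS value.
    """
    if jas_acres == 0:
        # No base to compare against; treat as comparable only if both are zero.
        return census_acres == 0
    return abs(census_acres - jas_acres) / jas_acres <= tolerance


# Made-up acreages for illustration only.
print(acres_comparable(120, 100))  # 20% apart
print(acres_comparable(130, 100))  # 30% apart
```

Records failing a check like this, with classifications otherwise in agreement, would fall into the "acres not within 25%" groups of the table.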
Discrepancies between Census and JAS
Scoping differences: 18.4% of matched records had discrepancies in classification (~3% net classification error)
Acreage differences: 37.2% of matched records had acreage differing by more than 25%
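Both percentages can be reproduced from the group totals in the "Identifying Groups with Discrepancies" table. The sketch below does the arithmetic; the dictionary keys are hypothetical labels chosen for illustration. The last figure is one possible reading of the net classification error (potential undercount minus potential overcount) and is an assumption, since the slides do not give the formula.

```python
# Group totals copied from the table; keys are hypothetical labels.
groups = {
    "match_acres_comparable": 1629,
    "acres_diff_jas_estimated": 240,
    "acres_diff_reinterview": 1122,
    "conflict_fo_review": 158,
    "undercount_reinterview": 185,
    "undercount_jas_estimated": 53,
    "overcount_reinterview": 279,
}

total = sum(groups.values())  # all matched records

# Records where the Census and JAS classifications disagree.
undercount_side = (groups["conflict_fo_review"]
                   + groups["undercount_reinterview"]
                   + groups["undercount_jas_estimated"])
overcount_side = groups["overcount_reinterview"]
scoping_pct = 100 * (undercount_side + overcount_side) / total

# Records where classifications agree but acreage differs by more than 25%.
acreage_pct = 100 * (groups["acres_diff_jas_estimated"]
                     + groups["acres_diff_reinterview"]) / total

# Assumption: net error read as undercount-side minus overcount-side records.
net_pct = 100 * (undercount_side - overcount_side) / total

print(f"total matched records: {total}")           # 3666
print(f"scoping discrepancies: {scoping_pct:.1f}%")  # 18.4%
print(f"acreage discrepancies: {acreage_pct:.1f}%")  # 37.2%
print(f"net (assumed formula): {net_pct:.1f}%")      # 3.2%
```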
Methods
67 respondents were re-interviewed by enumerators in July 2008
Respondents reviewed questionnaires from both the 2007 Census and the 2007 JAS
Then asked to identify which was correct and why they were different
Scoping Differences: Which Source is Correct?
Census 59.7%
Neither 11.9%
Both 13.4%
JAS 15.0%
Chart callout: TRUE Census Misclassification
Scoping Differences – Census is Correct
[Chart: Number of Responses (n=39)]
Scoping Differences – JAS is Correct
[Chart: Number of Responses (n=10)]
Scoping Differences – Both Sources Correct
[Chart: Number of Responses by Reasons for Discrepancies (n=9)]
Categories: Different Operation, NASS O/S, Land Sold, Land Purchased
Values shown: 4, 2, 1, 1
Legend: True Change – reported correctly; True Change – reported incorrectly
Scoping Differences – Neither Source Correct
[Chart: Number of Responses (n=9)]
Scoping Differences – Overall Summary by Category
[Chart: Number of Responses (n=67)]
Legend: True Change – Incorrect; True Change – Correct
Summary – Scoping Differences
Very few of these cases are real changes between JAS and the Census
Census was correct more often than JAS
Most discrepancies are actual errors
JAS tracts screened out incorrectly
Proxy respondents reporting incorrectly in JAS
Specific types of land excluded (government program land, woods, rented)
Source Used to Report Acres
Source Used to Report Acres* | Percent (n=67)
No Records, I know my acreage | 50.8%
Tax records | 10.5%
FSA records | 6.0%
Operation books | 14.9%
Other records (i.e., deed, GPS #s) | 1.5%
* Multiple answers allowed
What did we learn about Census misclassification?
Classification error remains minimal and is probably smaller than previous estimates
JAS cannot be used as “truth”
Re-interview with resolution shows both the Census and the JAS have errors
JAS is not the GOLD STANDARD: personal interviews are not always the best way to get accurate responses
Some errors are due to respondents and won’t be eliminated
Our external source had more errors than our target:
Recommendations to improve the JAS:
Avoid proxy respondents in JAS
Review of screening in JAS
Intensive re-screening of all non-ag tracts is in progress
Estimation of farms missing from JAS
Capture/Re-capture estimates in progress
To examine errors, you need a good measure
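As an illustration of the capture/re-capture idea mentioned in the recommendations, the sketch below uses the classic Lincoln-Petersen two-sample estimator. The slides do not say which estimator NASS is using, so this is a stand-in under that assumption; the function name and all counts are made up for the example.

```python
def lincoln_petersen(n_first: int, n_second: int, n_both: int) -> float:
    """Estimate a population total from two overlapping sources.

    n_first  -- units found by the first source
    n_second -- units found by the second source
    n_both   -- units found by both (the matched records)
    """
    if n_both <= 0:
        raise ValueError("need at least one unit found by both sources")
    return n_first * n_second / n_both


# Hypothetical counts, not survey results.
n_first, n_second, n_both = 900, 800, 720
estimated_total = lincoln_petersen(n_first, n_second, n_both)
missed_by_first = estimated_total - n_first
print(estimated_total)   # 1000.0
print(missed_by_first)   # 100.0
```

The estimator assumes the two sources find units independently and that matching is error-free, which is a strong assumption given the matching and screening errors described in this presentation.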