The Use of First Names to Evaluate Reports of Gender and Its Effect on the Distribution of Married and
Unmarried Couple Households
Martin O’Connell and Gretchen Gooding
Fertility and Family Statistics Branch
US Census Bureau
Poster presented at the Annual Meeting of the Population Association of America, Los Angeles, CA, March 30-April 1, 2006. This report is released to inform interested parties of ongoing research and to encourage discussion. The views expressed on statistical, methodological, or technical issues are those of the authors and not necessarily those of the U.S. Census Bureau.
Classifying Unmarried Couple Households
Household classification depends on reports of relationship and gender.
Sex is usually the best reported item on surveys [1], but reports of names may sometimes seem inconsistent with reports on gender.
[1] Less than 1 percent had inconsistent responses on the Census 2000 Content Reinterview Survey.
Three Types of Coupled Households
Distribution of coupled households in Census 2000 (60 million coupled households):
• Married couple: 91%
• Opposite-sex unmarried couple: 8%
• Same-sex unmarried couple: 1%
Why Study Responses to Names and Gender?
Minor errors in reporting gender can have a substantial impact on the overall estimates of unmarried couple households.
The 1996 Federal Defense of Marriage Act instructs federal agencies to recognize only opposite-sex marriages.
"In determining the meaning of any Act of Congress, or of any ruling, regulation, or interpretation of the various administrative bureaus and agencies of the United States, the word 'marriage' means only a legal union between one man and one woman as husband and wife, and the word 'spouse' refers only to a person of the opposite sex who is a husband or a wife."
Current Census Bureau editing programs assign reported same-sex “married” couples to same-sex “unmarried” couples.
With several states issuing marriage licenses to same-sex couples, the issues of collecting, processing, editing, and presenting estimates of same-sex couples will become more important, especially when examining estimates for specific states and cities.
What This Project Will Examine—
Can a person’s first name be used to verify reports of gender?
How accurate is the reporting of names?
What would be the effect on different household estimates when using a person’s name to alter reports of gender?
How sensitive would these estimates be to changes in responses of gender?
2004 American Community Survey
To address these issues, two data sources are used in this presentation:
First, the 2004 American Community Survey (ACS).
This survey will be used to illustrate differences in the reporting of gender for specific names by:
• Major Census divisions
• Age cohorts
The 2004 ACS selected a nationwide sample of 838,000 households. Starting in 2005, it consisted of 3 million households in the yearly sample.
2004 Test Census of New York
Second, the 2004 Test Census of New York. We will examine:
• The likelihood that a person’s name is male or female
• Age and race differences in reporting masculine or feminine names
• The gains or losses to different types of households if first names are used to verify/change reports of gender
The 2004 Test Census of New York was conducted in the county of Queens. Overall, there were 130,756 households.
The test census consisted of 60,244 “coupled households”:
• Married couple: 91%
• Opposite-sex unmarried couple: 7%
• Same-sex unmarried couple: 2%
2004 Test Census Items on Name, Relationship, and Gender
Common Errors in Collecting Names
Inconsistencies in the collection of names on forms may result from the following types of errors:
• Scanning errors of forms
• Keying errors of names
• Respondent/enumerator misspellings
• Illegible handwriting
• Transposing first and last names (e.g., Mary Thomas written as Thomas Mary)
• Concatenating names (e.g., Jack’s son as Jackson)
• Names with non-alphabetic characters (e.g., *, @, $, 4); spaces and hyphens are accepted
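The non-alphabetic character rule above amounts to a simple character check. The following is a minimal sketch, assuming that letters, spaces, and hyphens are the only accepted characters. The function name is hypothetical, and the poster does not say how apostrophes or accented letters were actually handled.

```python
def has_only_accepted_characters(first_name: str) -> bool:
    """Return True if the name holds only letters, spaces, and hyphens.

    Assumption: digits, apostrophes, and symbols all count as
    non-alphabetic; the source only says spaces and hyphens are accepted.
    """
    name = first_name.strip()
    # An empty or letter-free entry cannot be looked up in a names dictionary.
    if not any(ch.isalpha() for ch in name):
        return False
    return all(ch.isalpha() or ch in " -" for ch in name)


for candidate in ["Mary", "Mary-Ann", "M@ry", "J4ck"]:
    print(candidate, has_only_accepted_characters(candidate))
```

A check like this only flags entries for review; it cannot detect transposed or concatenated names, which still look like valid names.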
Are All Names Created Equal?
The same name may be correctly reported as being a different sex for several reasons:
• Geographical
• Ethnic/cultural
• Different age cohorts
Gender of Some Names May Change Over Time While Others May Remain the Same
[Figure: line chart of the percent reporting male (0 to 100) by age cohort in 2004 (<10, 10-19, 20-29, 30-39, 40+) for the names John, Morgan, Leslie, and Elizabeth.]
Source: U.S. Census Bureau, American Community Survey 2004.
Methodological Issues
Considerations when using a first name to override reported sex responses:
• Develop an objective/statistical indicator with a variable range
• Define acceptable levels for altering a sex response (e.g., should a name be “male” 50%, 90%, or 99% of the time to override a response of “female”?)
• Evaluate the impact of the procedure on estimates for all couple types, since anyone can make a mistake
• Indicator should be sensitive to geographical variations
• Indicator should be usable in large-scale processing applications
2004 Test Census of New York First Name Index
On each person’s record, there is a first name index:
• Based on millions of observations in the names dictionary from the 2000 Census of New York State
• Index = 1000 × (people with that name who were male) / (all people with that name)
• Index ranges from 0 to 1000
• A high value, e.g., 990, means 990 out of every 1000 people with that name in 2000 reported themselves as male: a very masculine name
• A low value, e.g., 50, means that only 50 out of every 1000 people were reported as male (that is, 950 were female): a very feminine name
For consistency in this presentation:
• Index scale reverses for females
• A value of 990 for females now indicates that 990 out of 1000 people with that name in 2000 reported themselves as female
• Index scale for males unchanged
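The index definition and the reversed scale for females can be sketched in a few lines. This is an illustration under stated assumptions, not the Census Bureau's processing code: the function names, the "M"/"F" codes, and the rounding to a whole number are all hypothetical.

```python
def first_name_index(male_count: int, total_count: int) -> int:
    """Males per 1000 people with a given first name (0 to 1000)."""
    if total_count <= 0:
        raise ValueError("total_count must be positive")
    # Assumption: the index is rounded to a whole number.
    return round(1000 * male_count / total_count)


def agreement_index(male_index: int, reported_sex: str) -> int:
    """Index on the presentation scale: a high value means the name
    agrees with the reported sex, whether that sex is male or female."""
    if reported_sex == "M":
        return male_index            # scale unchanged for males
    if reported_sex == "F":
        return 1000 - male_index     # scale reversed for females
    raise ValueError("reported_sex must be 'M' or 'F'")


# A name reported as male by 50 of every 1000 people in the dictionary:
idx = first_name_index(50, 1000)
print(idx, agreement_index(idx, "F"), agreement_index(idx, "M"))
```

On this scale a woman with that name scores 950 (strong agreement), while a man with the same name scores 50 (low agreement).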
First Name Index Characteristics
About half of men (50%) and a majority of women (59%) had first names very strongly associated with their sex (Index = 990–1000).
Only 1% had first names that were inconsistent with their reported sex (Index <= 10).
However, there are some shortcomings in the applicability of this index for the total population:
• 11% of people did not report their names
• 7-8% had first names that could not be found in the dictionary
• Using first names to invalidate sex responses may not cover large segments of the population
Coverage and Properties
For men in the 2004 Test Census:
• Name not reported or not in dictionary higher for
  • Men 15 to 44
  • Chinese, Korean, Asian Indian men
• Highest percent of men with first name indices 990-1000
  • Older men
  • Whites and Blacks
• Percent of men with first name indices 100 or less
  • About 1% for all ages
  • About 2% for Korean, Asian Indian men
Similar patterns were found for women.
Percent of Males With First Name Index 100 or Less
[Figure: stacked bar chart of the percent of males with first name index values 0-10, 11-50, and 51-100, by age (0-14, 15-29, 30-44, 45-64, 65+) and by race (White, Black, Chinese, Korean, Asian Indian).]
[1] Race categories include the specified race in combination with other races.
Note: An index less than 100 indicates that 900 out of every 1000 people with this name reported that they were female in Census 2000.
Source: U.S. Census Bureau, Test Census of New York, 2004.
First Name Index: Characteristics of Coupled Households
For couples in the 2004 Test Census:
• Name not reported or not in dictionary
  • Lowest for opposite-sex unmarried couples
  • Highest for same-sex couples
• Couples with first name indices 990-1000, indicating high agreement between reported sex in 2004 and gender orientation of name
  • Highest proportion for opposite-sex unmarried couples
  • Lowest proportion for same-sex couples, especially for male partners and female householders
• Couples with first name indices 100 or less, indicating low agreement between reported sex in 2004 and gender orientation of name
  • About 1% for married couples and opposite-sex unmarried couples
  • Data suggest that errors in marking the sex item may affect 8-16% of male same-sex couples and 10-27% of female same-sex couples.
Percent of People Not Reporting First Name or First Name Not Found in Names Dictionary

Couple type             Person        No report   Not in dictionary   Total
Married couples         Husband           9.3            7.9           17.2
Married couples         Wife              9.9            8.5           18.4
Opposite-sex couples    Male              6.7            4.4           11.1
Opposite-sex couples    Female            6.8            4.8           11.6
Male-male couples       Householder      12.2            5.1           17.3
Male-male couples       Partner          16.4            7.1           23.5
Female-female couples   Householder      11.3            9.3           20.6
Female-female couples   Partner          17.9            7.7           25.6

Source: U.S. Census Bureau, Test Census of New York, 2004
Percent of People With First Name Index Over 500

                                      Index values
Couple type             Person        501-899   900-949   950-989   990-1000   Total
Married couples         Husband          7.5       2.5      18.2      51.2      79.4
Married couples         Wife             5.6       2.2      12.5      57.4      77.7
Opposite-sex couples    Male             5.1       2.1      18.6      60.6      86.4
Opposite-sex couples    Female           4.5       2.3      12.6      66.6      86.0
Male-male couples       Householder      4.8       1.8      13.9      53.0      73.5
Male-male couples       Partner          4.7       2.4      12.1      38.3      57.5
Female-female couples   Householder      3.6       2.0       5.4      37.3      48.3
Female-female couples   Partner          4.3       0.5       8.8      47.3      60.9

Note: An index greater than 500 indicates that more than half of the males/females with this name reported they were male/female in Census 2000.
Source: U.S. Census Bureau, Test Census of New York, 2004
Percent of People With First Name Index 100 or Less

                                      Index values
Couple type             Person        0-10    11-50   51-100   Total
Married couples         Husband         -       -       -       0.9*
Married couples         Wife            -       -       -       1.1*
Opposite-sex couples    Male            -       -       -       0.8*
Opposite-sex couples    Female          -       -       -       1.0*
Male-male couples       Householder    5.1     1.8     0.8      7.7
Male-male couples       Partner       13.4     2.4     0.5     16.3
Female-female couples   Householder   18.3     7.7     0.7     26.7
Female-female couples   Partner        5.9     3.6     0.9     10.4

*Total percent for index values 0-100.
Note: An index less than 100 indicates that 900 out of every 1000 males/females with this name reported that they were of a different sex in Census 2000.
Source: U.S. Census Bureau, Test Census of New York, 2004.
Using First Names To Edit Sex Responses:
How willing are you to accept a first name over a sex response?
• The lower the index level of a respondent’s first name, the more frequently that name was associated with the opposite sex
  • Index 0-10 = at least 99% of people with that name were of the opposite sex in Census 2000
  • Index 0-50 = at least 95% were of the opposite sex
  • Index 0-100 = at least 90% were of the opposite sex
• By using different index ranges
  • Respondents can be reassigned their sex on the basis of first names
  • Different levels of name “acceptance” produce changes in estimates of household types
Whose sex can change?
• Anyone: reassignment rules apply to all people in all household types
• Regardless of sex or living arrangement, anyone can mistakenly mark a form
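The reassignment rule can be sketched as a cutoff on the index, followed by reclassifying the couple. This is a hedged illustration, not the actual edit program: the function names, the "M"/"F" and relationship codes, and the treatment of missing indices (left unedited) are assumptions. The only classification rule taken from the poster is that same-sex “married” couples are assigned to same-sex unmarried couples.

```python
from typing import Optional

OPPOSITE = {"M": "F", "F": "M"}


def edit_sex(reported_sex: str, index: Optional[int], cutoff: int) -> str:
    """Flip the reported sex when the first name index is at or below cutoff.

    `index` is on the presentation scale (low = name disagrees with the
    reported sex). None means the name was missing or not in the
    dictionary; assumption: such records are left as reported.
    """
    if index is None or index > cutoff:
        return reported_sex
    return OPPOSITE[reported_sex]


def couple_type(relationship: str, householder_sex: str, partner_sex: str) -> str:
    """Classify a couple from the relationship item and the two sexes."""
    if householder_sex == partner_sex:
        # Same-sex "married" couples are tabulated as unmarried couples.
        return "same-sex unmarried couple"
    if relationship == "spouse":
        return "married couple"
    return "opposite-sex unmarried couple"


# A reported wife whose first name index is 8, edited at the 0-10 level:
wife_sex = edit_sex("F", 8, cutoff=10)
print(couple_type("spouse", "M", wife_sex))
```

Raising the cutoff from 10 to 50 or 100 flips more records, which is what drives the changes in household estimates at the different index levels.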
Estimates of Married and Unmarried Couple Households After Reassigning Sex of Respondent at Different First Name Index Levels

                                         Distribution after sex reassignment
                         Original        at different index levels
Household type           distribution    0-10       0-50       0-100
Total                       60,244       60,244     60,244     60,244
Married couples             55,026       54,692     54,537     54,349
Opposite-sex couples         4,112        4,103      4,092      4,076
Same-sex couples             1,106        1,449      1,615      1,819
  Male partners                664          831        935      1,043
  Female partners              442          618        680        776

Note: Sex of respondent in the 2004 Test Census was reassigned to the opposite sex if their first name index was in this range.
Source: U.S. Census Bureau, Test Census of New York, 2004.
Estimates of Same-Sex Couples After Reassigning Sex, by First Name Index Level and Transfer Source

Source of same-sex couples   Original        0-10     0-50     0-100
                             distribution    level    level    level
Female same-sex couples           442         337      293      288
Male same-sex couples             664         541      515      507
Opposite-sex couples                -          28       49       68
Married couples                     -         543      758      956
Total                           1,106       1,449    1,615    1,819

Source: U.S. Census Bureau, Test Census of New York, 2004.
Results of Model Simulation
Overall results for same-sex couple estimates:
• An increase from 1,106 to 1,449 using the most conservative index level (0-10)
• Increases continue to 1,819 at the 0-100 level
Less than 1% decline in opposite-sex couples.
Married couples experience the greatest loss:
• From 55,026 to 54,349 using the 0-100 index level
• Although low index levels are reported by a small percentage of married couples, the magnitude of this population produces relatively large additions to the same-sex population
Net increase in same-sex couples to 1,819 at the 0-100 level:
• Loss of 311 from the original sample
• Offset by transfer of 956 married couples and 68 opposite-sex couples
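The accounting behind these totals can be checked directly from the counts reported on the poster: the same-sex couples retained at each level plus the couples transferred in from married and opposite-sex households. The sketch below only re-adds the published figures; the variable and function names are hypothetical.

```python
# Counts as reported on the poster (2004 Test Census of New York).
original = {"female": 442, "male": 664}

# Same-sex couples still classified as same-sex after the edit.
retained = {
    "0-10": {"female": 337, "male": 541},
    "0-50": {"female": 293, "male": 515},
    "0-100": {"female": 288, "male": 507},
}

# Couples transferred into the same-sex category by the edit.
transferred_in = {
    "0-10": {"opposite_sex": 28, "married": 543},
    "0-50": {"opposite_sex": 49, "married": 758},
    "0-100": {"opposite_sex": 68, "married": 956},
}


def same_sex_total(level: str) -> int:
    """Same-sex couples after the edit: retained plus transferred in."""
    return sum(retained[level].values()) + sum(transferred_in[level].values())


def loss_from_original(level: str) -> int:
    """Original same-sex couples reclassified out of the category."""
    return sum(original.values()) - sum(retained[level].values())


for level in retained:
    print(level, same_sex_total(level), loss_from_original(level))
```

At the 0-100 level this reproduces the stated figures: 1,106 − 311 + 956 + 68 = 1,819.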
Summary
First names offer the potential to edit/verify reports of sex on questionnaires.
Problems to face if considering this option:
• Not all population groups report names
• Geographical/cultural differences in gender of names
• Choosing the degree of uncertainty in deciding if a name is “Male” or “Female”
Using 2004 Test Census of New York data and the Census 2000 names dictionary:
• An objective first name index was developed
• Reassignment of sex was made at different levels of acceptance
• Model simulation showed losses to same-sex couples more than offset by gains to this population from married couples
Conclusion: using first names to invalidate reported sex responses will yield more same-sex couples than originally reported.
Contact Information
Fertility and Family Statistics Branch
Phone: 301-763-2416
• Martin T. O’Connell
E-mail: [email protected]
• Gretchen E. Gooding
E-mail: [email protected]