FILE INFORMATION
Geographic Coverage
State codes are shown except for nine
states which are collapsed into three groups.
The three state groupings are as follows: Maine and Vermont; Iowa, North
Dakota and South Dakota; and Alaska, Idaho, Montana, and Wyoming. The sample was not designed to produce
state-level estimates. State codes are
primarily useful in relating a respondent's recipiency of benefits to welfare
reform thresholds and policies which may vary from state to state.
Identification Number System/Match Key Variables
The SPD identification scheme uses match
key variables designed to uniquely identify individuals, link data for the
same individuals across files, and group individuals
into households and families across files over time. The various components of the identification scheme are listed
below:
SIPP Panel Number                     SIPP_PNL
Sample Unit Identification Number     PP_ID
Entry ID                              PP_ENTRY
Person Number                         PP_PNUM
Address ID                            ADDIDE(2,3,4,7,8,9,0)
SIPP_PNL, PP_ID, ADDID                IHHKEY(2,3,4,7,8,9,0)
File Match Keys
This file includes match keys to merge
with SPD and 1992/1993 SIPP data. These
match keys link data for each person between the 1992 and 1993 SIPP Panels, the
1997 SPD Bridge, the 1998 SPD file and the first longitudinal SPD file.
Use the following variables to match back
at the person level:
SIPP Panel Number                     SIPP_PNL
Sample Unit Identification Number     PP_ID
Person Number                         PP_PNUM
Entry ID                              PP_ENTRY
Use the following variables to match back
at the household level:
SIPP Panel Number                     SIPP_PNL
Sample Unit Identification Number     PP_ID
Address ID                            ADDIDE(2,3,4,7,8,9,0)
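The person-level match described above can be sketched as follows. This is a minimal illustration, not Census Bureau code: it assumes each file has already been read into a list of dictionaries keyed by variable name, and the sample values are hypothetical.

```python
def person_key(record):
    """Return the person-level match key named in the text."""
    return (
        record["SIPP_PNL"],
        record["PP_ID"],
        record["PP_PNUM"],
        record["PP_ENTRY"],
    )


def match_persons(left_records, right_records):
    """Pair records from two files that share the same person-level key."""
    right_by_key = {person_key(r): r for r in right_records}
    matched = []
    for left in left_records:
        right = right_by_key.get(person_key(left))
        if right is not None:
            matched.append((left, right))
    return matched


# Hypothetical records; real values come from the data files.
spd_1998 = [
    {"SIPP_PNL": "1992", "PP_ID": "123456789", "PP_PNUM": "0101", "PP_ENTRY": "011"},
    {"SIPP_PNL": "1993", "PP_ID": "123456789", "PP_PNUM": "0101", "PP_ENTRY": "011"},
]
longitudinal = [
    {"SIPP_PNL": "1992", "PP_ID": "123456789", "PP_PNUM": "0101", "PP_ENTRY": "011"},
]
pairs = match_persons(spd_1998, longitudinal)
```

The second hypothetical record has the same PP_ID but a different panel number, so it is not matched. This is why SIPP_PNL belongs in the key.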
The SIPP panel number identifies
the panel in which the respondent participated. Each sample person should have an entry of either 1992 or 1993 for
the panel number.
The sample unit identification number
was created by scrambling together the Primary Sampling Unit, segment, and
serial numbers used for Census Bureau administrative purposes. These identifiers are constructed in the
same manner as the 1992 and 1993 SIPP panel files, to enable matching to these
files. To uniquely identify a
household, you must use the sample unit identification (ID) number, the address
ID, and the SIPP panel number. The
sample unit identification number, the address ID, and the SIPP panel number
can be used to link all households back to the original household.
The entry ID represents the
address of the person at the time he/she was first interviewed and does not
change even if the person moves. It is
used in conjunction with the person number to uniquely identify people within
the sample unit. This variable is the
number 011 for all original sample people.
For additional sample people, this variable can be 011 or greater than
011 depending on the current address ID of the unit which the new sample person
joined. For example, a person who moves
into a household with an ADDIDE0 of 011 will receive a PP_ENTRY of 011,
whereas a person who enters a household spawned in 2000 (ADDIDE0=141) will
have a PP_ENTRY of 141.
The person number indicates the
wave in which the person entered the sample.
Person numbers such as 0101 and 0102 are assigned in Wave 1 of the
SIPP. Person numbers such as 0201 and
0202 are assigned to people added to the roster in
Wave 2 of the SIPP. People added to the
roster in the 2000 SPD have person numbers 1401 and 1402 and following
sequentially as needed. People added to
the roster in the 1999 SPD have person numbers 1301 and 1302 and following
sequentially as needed. People added to the roster in the 1998 SPD have person
numbers 1201 and 1202 and following sequentially as needed. People added to the
roster in the 1997 SPD Bridge have person numbers 1101 and 1102 and following
sequentially as needed.
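The numbering convention above can be decoded with a short helper. This is an illustrative sketch: the prefix table is taken from the examples in the text, and treating every other prefix as a SIPP wave number is an assumption.

```python
# First two digits of PP_PNUM for people added during SPD collections,
# taken from the examples in the text.
SPD_ENTRY_BY_PREFIX = {
    "11": "1997 SPD Bridge",
    "12": "1998 SPD",
    "13": "1999 SPD",
    "14": "2000 SPD",
}


def describe_person_number(pp_pnum):
    """Split a person number into (point of entry, sequence within that entry)."""
    text = str(pp_pnum).zfill(4)
    prefix, sequence = text[:2], int(text[2:])
    # Assumption: any prefix not listed above is a SIPP wave number.
    entry = SPD_ENTRY_BY_PREFIX.get(prefix, f"SIPP Wave {int(prefix)}")
    return entry, sequence


wave1_person = describe_person_number("0102")
added_in_2000 = describe_person_number("1401")
```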
The address ID is a three-digit
code that identifies the various household addresses associated with the same
sample unit identification number. The
first two digits of the address ID code indicate the wave or year in which that
address was first interviewed. The
third digit sequentially numbers the households that split into multiple households and
would otherwise have the same address ID. The address ID
code is 011 for all sample addresses that are the same as in Wave 1 of the 1992
and 1993 SIPP panels. As the SIPP
sample people move to new addresses, new address ID codes are assigned. For example, any new address to which sample
unit members moved during the 2000 SPD is numbered from 141 to 149.
Households are defined at each cross
sectional time point in this file:
1992, 1993, 1994, 1997, 1998, 1999, and 2000. If you would like to look at the household configuration in any
given year, use the IHHKEY variable appropriate for that year. For example, to look at 1999 household
structure, use IHHKEY99. The IHHKEY
variables are a concatenation of SIPP_PNL, PP_ID, and ADDID for the specific year. For example, IHHKEY99 is the concatenation of SIPP_PNL, PP_ID, and
ADDIDE9.
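As an illustration of how the household key works, the sketch below groups person records into households for one year using the three components of the key. It is not Census Bureau code. The year-to-suffix table is inferred from the examples in the text (ADDIDE9 for 1999, ADDIDE0 for 2000), a tuple stands in for the concatenated IHHKEY value because field widths are not given here, and skipping records with a blank address ID is an assumption.

```python
from collections import defaultdict

# Suffix of the ADDIDE variable for each cross-sectional year (inferred from the text).
ADDID_SUFFIX = {1992: "2", 1993: "3", 1994: "4", 1997: "7",
                1998: "8", 1999: "9", 2000: "0"}


def households_in_year(records, year):
    """Group person numbers by (SIPP_PNL, PP_ID, address ID) for the given year."""
    addid_var = "ADDIDE" + ADDID_SUFFIX[year]
    groups = defaultdict(list)
    for rec in records:
        address_id = rec.get(addid_var)
        if not address_id:
            # Assumption: a blank address ID means no household in that year.
            continue
        groups[(rec["SIPP_PNL"], rec["PP_ID"], address_id)].append(rec["PP_PNUM"])
    return dict(groups)


# Hypothetical records for one sample unit.
records = [
    {"SIPP_PNL": "1992", "PP_ID": "100", "PP_PNUM": "0101", "ADDIDE9": "011"},
    {"SIPP_PNL": "1992", "PP_ID": "100", "PP_PNUM": "0102", "ADDIDE9": "011"},
    {"SIPP_PNL": "1992", "PP_ID": "100", "PP_PNUM": "0103", "ADDIDE9": "131"},
    {"SIPP_PNL": "1992", "PP_ID": "100", "PP_PNUM": "0104", "ADDIDE9": ""},
]
households_1999 = households_in_year(records, 1999)
```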
File Structure
The file is a rectangular person-level
file. Household- and family-level
variables are included on the record for every person in the household.
Edits
The second longitudinal file is a fully
edited and imputed data set.
Topcoding of Variables
To protect against the possibility that a
user may recognize the identity of an SPD respondent with a very high income,
income from every source is topcoded so that no individual amounts above
$100,000 are revealed. This topcode
amount is consistent with the topcoding for the 1992 and 1993 SIPP panels.
Other economic variables are topcoded at the 97th percentile, meaning the
top 3 percent of values are not disclosed.
Variables that have been topcoded will have a "T" in the
second to the right position. NOTE:
Aggregate amounts (PTOTVLR, PERNVLR, etc.) use topcoded amounts as input.
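The effect of income topcoding on an aggregate can be sketched as follows. This is only an illustration: it assumes that a topcoded amount is replaced by the $100,000 limit itself, which the text does not state, and the function names and sample amounts are hypothetical.

```python
INCOME_TOPCODE = 100_000  # dollar limit stated in the text


def topcode_income(amount, limit=INCOME_TOPCODE):
    """Return (disclosed amount, was_topcoded).

    Assumption: amounts above the limit are replaced by the limit.
    """
    if amount > limit:
        return limit, True
    return amount, False


def aggregate_from_topcoded(amounts):
    """Sum income sources after topcoding each one, as the NOTE describes."""
    return sum(topcode_income(a)[0] for a in amounts)


# Hypothetical income sources for one person.
sources = [250_000, 40_000, 1_500]
total = aggregate_from_topcoded(sources)
```

Because the aggregate is built from topcoded inputs, it understates the true total whenever any single source exceeds the limit.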
Age is topcoded by bottom-coding year of
birth. On the Second Longitudinal SPD
file, no age is greater than 88.
Weighting
The SPD panel universe is initially
represented by all of the original sample people from the SIPP Panels 1992 and
1993. Each and every one of these
original sample people has an initial weight.
Because of nonresponse in both the SIPP and the SPD, and a sample cut in the
SPD in 1998, the initial weights of the respondents are adjusted to
compensate for the resulting bias.
Through the weighting process, the SPD panel universe is represented by
the original sample people who have positive weights. The SPD second longitudinal file is the same as the SPD first
longitudinal file except that it also includes the data collected from the SPD
1999 and 2000. Thus, like the SPD first
longitudinal file, only part of the original sample people from the SIPP Panels
1992 and 1993 are included on the second longitudinal file because the SPD
Bridge sample did not include all of the original sample people. Namely, the SPD Bridge (1997) sample, which
is the starting sample for the SPD, consists only of the sample households that
were interviewed in the last interview waves of the SIPP Panels 1992 and
1993. For those original sample
people who were not included on the SPD second longitudinal file, the
characteristics needed for the weighting process are obtained from the cross-sectional
and longitudinal files of the SIPP Panels 1992 and 1993.
There are three weights provided for
sample people on the SPD second longitudinal file. These three weights are
referred to as the traditional longitudinal panel weight, the 1999 quasi‑longitudinal
panel weight, and the 2000 quasi‑longitudinal panel weight.
The sample people who meet the following
definition have a positive traditional longitudinal panel final weight:
1) Lived in a 1992/1993 SIPP panel household during the Wave 1 interview.
2) Were interviewed (self, proxy, or imputed) in each and every reference month in SIPP.
3) Were interviewed (self, proxy, or imputed) in the 1997 SPD Bridge and the 1998, 1999, and 2000 SPD.
All the other sample people included on the file have
zero traditional longitudinal panel final weights.
The sample people who meet the following definition
have a positive 1999 quasi‑longitudinal panel final weight:
1) Lived in a 1992/1993 SIPP panel household during the Wave 1 interview.
2) Were interviewed (self, proxy, or imputed) in each and every reference month in SIPP.
3) Were interviewed (self, proxy, or imputed) in the 1997 SPD Bridge and the 1999 SPD, but may
or may not have been interviewed in either or both of the 1998 and 2000 SPD.
All the other sample people included on
the file have zero 1999 quasi‑longitudinal panel final weights.
The sample people who meet the following
definition have a positive 2000 quasi‑longitudinal panel final
weight:
1) Lived in a 1992/1993 SIPP panel household during the Wave 1 interview.
2) Were interviewed (self, proxy, or imputed) in each and every reference month in SIPP.
3) Were interviewed (self, proxy, or imputed) in the 1997 SPD Bridge and the 2000 SPD, but may
or may not have been interviewed in either or both of the 1998 and 1999 SPD.
All other sample people included on the
file have zero 2000 quasi‑longitudinal panel final weights.
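The three definitions above can be summarized in a short sketch. It is illustrative only: the dictionary field names are hypothetical, "interviewed" is taken to include self, proxy, and imputed interviews as in the definitions, and the function reports only which definitions a person meets, not the weight values.

```python
def positive_weight_types(person):
    """Return the panel weights whose definition the person meets."""
    base = (
        person["in_wave1_household"]          # condition 1
        and person["all_sipp_months"]         # condition 2
        and person["interviewed"][1997]       # 1997 SPD Bridge, part of condition 3
    )
    if not base:
        return []
    seen = person["interviewed"]
    types = []
    if seen[1998] and seen[1999] and seen[2000]:
        types.append("traditional")
    if seen[1999]:
        types.append("quasi_1999")
    if seen[2000]:
        types.append("quasi_2000")
    return types


# Hypothetical person who missed the 1998 SPD interview only.
person = {
    "in_wave1_household": True,
    "all_sipp_months": True,
    "interviewed": {1997: True, 1998: False, 1999: True, 2000: True},
}
weights_met = positive_weight_types(person)
```

A person who meets the traditional definition necessarily meets both quasi-longitudinal definitions, since those require a subset of the same interviews.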
To provide a means for estimating the
annual or calendar-year characteristics of young children based on the data on the
SPD second longitudinal file, the characteristics of the non-original
sample children born after Wave 1 of the SIPP Panels 1992 and 1993 must
also be taken into account. To account for the characteristics of
these non-original sample children in any annual estimates, positive
weights are assigned to a cohort of these children.
A non-original sample child receives a positive weight, based on the procedure
described in more detail below, if the child was aged eight or below in 2000
(if originating from the SIPP Panel 1992) or aged seven or below in 2000
(if originating from the SIPP Panel 1993) and has a designated parent who is an
original sample person with a positive traditional longitudinal (1999 quasi-longitudinal,
2000 quasi-longitudinal) weight. The weights of
these non-original children, together with the traditional longitudinal
(1999 quasi-longitudinal, 2000 quasi-longitudinal) weights of the
original sample adults and children, enable users to make annual estimates
of young children's characteristics. We refer to
the three resulting SPD longitudinal annual weights as the traditional longitudinal
annual weight, the 1999 quasi-longitudinal
annual weight, and the 2000 quasi-longitudinal annual weight. They are
derived from the traditional longitudinal panel final weight, the 1999 quasi-longitudinal
panel final weight, and the 2000 quasi-longitudinal panel final weight,
respectively.
The procedure used for calculating the
three longitudinal panel final weights and the three corresponding longitudinal
annual weights of the sample people on the second longitudinal file is briefly
described in the eight steps provided below.
Refer to section 4 (Glossary) of the Technical Documentation for
definitions for the key terms used in the eight steps of weighting described
below.
STEP 1 - March 1993 was chosen as the reference
point in time (control date) for the SPD panel universe used for measuring the
effect of welfare reform. Namely,
the SPD panel universe consists of persons (including children) residing in
United States households and persons living in group quarters in March 1993.
(Note that persons living in military barracks and institutions, such as
prisons and nursing homes, are excluded.) The SPD panel universe is represented
by combining the SIPP Panels 1992 and 1993 into one sample. The effect of
welfare reform can be assessed based on the various longitudinal (time-dependent)
characteristics of the people in the SPD panel universe.
STEP 2 - An initial weight was assigned to each
original sample person (including children). An original sample
person is a person who, at the time of the SIPP Wave 1 interview, resided in an
interviewed sample household or group quarters. The inverse of this initial
weight represents the probability of an original sample person residing in an
interviewed Wave 1 sample household in either the SIPP Panel 1992 or 1993,
depending on the SIPP panel to which he/she originally belonged.
Every sample person who was not an
original sample person was assigned SPD traditional longitudinal, 1999 quasi-longitudinal,
and 2000 quasi-longitudinal panel final weights of zero.
Sample children who were not original sample persons and who, as of March 2000,
were aged eight or less (if spawned from the SIPP Panel 1992) or aged seven or less
(if spawned from the SIPP Panel 1993) are
assigned annual weights according to the procedure provided in Step 8. Note
that this group of non-original sample persons nominally represents the
children born after the inception of the SPD panel universe.
STEP 3 - Since each of the SIPP Panels 1992 and 1993 was a
nationally representative sample by itself and their sample sizes are
approximately the same, the weight of each sample person was
halved when the two panels were combined into one sample.
STEP 4 - Original sample persons were then divided into two cohorts. The
first cohort consists of the original sample persons who qualify as
longitudinally interviewed from Wave 1 of the SIPP Panels 1992 and 1993 up
to the SPD Bridge. The second cohort consists of the original sample persons
who do not qualify as longitudinally interviewed over that period. The first group is referred to as
the SPD Bridge longitudinally interviewed.
The second group is referred to as the SPD Bridge longitudinally
non-interviewed.
The SPD Bridge non-interview adjustment
served as a means of appropriately transferring the weights from the SPD Bridge
longitudinally non-interviewed to the SPD Bridge longitudinally interviewed with similar demographic and
economic characteristics [1].
Sample persons were classified based on the following variables: age, race,
ethnicity, education, labor force status, employment status, income types,
assets, and income level. Every SPD Bridge longitudinally
interviewed person was assigned a non-interview adjustment factor to
inflate his/her initial weight.
Every SPD Bridge longitudinally
non-interviewed person was assigned a traditional longitudinal, 1999 quasi‑longitudinal,
and 2000 quasi‑longitudinal panel final weight of zero, and excluded from
further processing.
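Steps 3 and 4 can be illustrated with a small sketch. The exact specification of the adjustment factors is in the memorandum cited in footnote 1; the version here assumes a conventional weighting-class adjustment, in which the factor for a cell is the total weight of all persons in the cell divided by the total weight of the interviewed persons in that cell. The field names and values are hypothetical.

```python
from collections import defaultdict


def combine_panels(initial_weight):
    """Step 3: halve the weight when the two panels are combined."""
    return initial_weight / 2


def noninterview_adjust(persons):
    """Step 4 (assumed form): move non-interviewed weight to interviewed
    persons in the same adjustment cell; non-interviewed persons get zero."""
    total = defaultdict(float)
    interviewed_total = defaultdict(float)
    for p in persons:
        total[p["cell"]] += p["weight"]
        if p["interviewed"]:
            interviewed_total[p["cell"]] += p["weight"]
    adjusted = []
    for p in persons:
        if p["interviewed"]:
            factor = total[p["cell"]] / interviewed_total[p["cell"]]
            adjusted.append(p["weight"] * factor)
        else:
            adjusted.append(0.0)
    return adjusted


# Hypothetical initial weights, cells, and interview status.
raw = [(200, "A", True), (200, "A", False), (400, "A", True),
       (100, "B", True), (300, "B", False)]
persons = [{"weight": combine_panels(w), "cell": c, "interviewed": i}
           for w, c, i in raw]
adjusted = noninterview_adjust(persons)
```

Under this assumed form the sum of the weights is unchanged by the adjustment; only its distribution across persons changes.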
STEP 5 - Due to budget constraints, some sample
household units were cut [2]
in the SPD 1998. SPD Bridge longitudinally interviewed persons were divided
into two cohorts. The first consists of
those belonging to the sample households selected into the SPD 1998 sample, and
we call this group "the selected SPD Bridge longitudinally
interviewed." Those belonging to the sample households
not selected into the SPD 1998 sample are called "the
not-selected SPD Bridge longitudinally interviewed."
A sample cut factor was assigned to the
selected group in accordance with the household demographic characteristics
provided in the document referred to in Footnote 2. The adjustment served as a
means to appropriately transfer the weights from the not-selected SPD Bridge
longitudinally interviewed to the selected SPD Bridge longitudinally
interviewed. Every not-selected SPD Bridge
longitudinally interviewed person was assigned an SPD traditional longitudinal,
1999 quasi‑longitudinal, and 2000 quasi‑longitudinal panel final
weight of zero and excluded from further processing.
STEP 6 - For each of the three panel weights, the
selected SPD Bridge longitudinally interviewed persons were divided into two
cohorts.
For each panel weight, the first cohort
consists of the selected SPD Bridge longitudinally interviewed persons who do
qualify as interviewed. The second cohort consists of the selected SPD Bridge
longitudinally interviewed persons who do not qualify as interviewed. A list of requirements that must be met to
qualify as interviewed for each of the panel weights was given earlier in the
weighting section.
For each panel weight (traditional, 1999
quasi‑longitudinal, and 2000 quasi‑longitudinal), a second non‑interview
adjustment procedure was independently performed in the same manner as the
first non‑interview adjustment in Step 4. Namely, this second adjustment served as a means to appropriately transfer the weights
from the non‑interviewed to the interviewed with similar demographic and economic characteristics
as classified in Step 4.
For each panel weight, those people who
qualify as interviewed were assigned a second non‑interview adjustment
factor to inflate their weights.
For each panel weight, those people who
qualify as non‑interviewed were assigned a weight of zero and excluded
from further processing.
STEP 7 - A ratio adjustment procedure was
independently performed to finish each panel weight (traditional, 1999 quasi‑longitudinal,
and 2000 quasi‑longitudinal). The
ratio adjustment procedure involved raking to match a set of SPD population
estimates with a corresponding set of control (benchmark) population estimates
at a representative control date. The control population estimates were
available for the classifications based on the following demographic variables:
age, sex, race, ethnicity, householder living with or not living with relative,
non-householder related to or not related to householder. As described in Step
1, the representative control date for the SPD panel universe was March 1993.
The control population estimates are from the CPS population estimates produced
for March 1993. The ratio adjustment procedure served as a means to improve the
population coverage of the SPD sample. The classification of the sample persons into groups for the
ratio adjustment also served as a post-sampling stratification. Therefore, as a
byproduct of the ratio adjustment, the variances of estimates
based on the SPD sample are generally reduced as well.
The three sets of positive weights
obtained after the ratio adjustment were the final weights for the SPD
traditional longitudinal panel interviewed persons, the 1999 quasi‑longitudinal
panel interviewed persons, and the 2000 quasi‑longitudinal panel
interviewed persons.
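The raking in Step 7 can be illustrated with a generic iterative proportional fitting sketch. This is not the production procedure: the two dimensions, the control totals, and the fixed number of iterations are assumptions made for the example, whereas the actual adjustment used the CPS-based controls for March 1993 and the classifications listed in Step 7.

```python
def rake(weights, categories, controls, iterations=50):
    """Scale weights so that the weighted total in each level of each
    dimension matches its control total, cycling through the dimensions."""
    w = list(weights)
    for _ in range(iterations):
        for dim, targets in controls.items():
            totals = {level: 0.0 for level in targets}
            for wi, cat in zip(w, categories):
                totals[cat[dim]] += wi
            w = [
                wi * targets[cat[dim]] / totals[cat[dim]]
                if totals[cat[dim]] > 0 else wi
                for wi, cat in zip(w, categories)
            ]
    return w


def weighted_total(weights, categories, dim, level):
    """Sum the weights of persons in one level of one dimension."""
    return sum(wi for wi, cat in zip(weights, categories) if cat[dim] == level)


# Hypothetical sample of four persons and hypothetical control totals.
categories = [
    {"sex": "F", "age": "young"}, {"sex": "F", "age": "old"},
    {"sex": "M", "age": "young"}, {"sex": "M", "age": "old"},
]
controls = {"sex": {"F": 220.0, "M": 180.0},
            "age": {"young": 210.0, "old": 190.0}}
raked = rake([100.0, 100.0, 100.0, 100.0], categories, controls)
```

The control totals for each dimension must sum to the same grand total, otherwise the procedure cannot satisfy all margins at once.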
STEP 8 - The traditional
longitudinal, 1999 quasi-longitudinal, and 2000 quasi-longitudinal
annual weights were created in two parts.
In the first part, for all sample people except the non-original
young sample children, the traditional longitudinal, 1999 quasi-longitudinal,
and 2000 quasi-longitudinal annual weights were set equal to their
traditional longitudinal, 1999 quasi-longitudinal, and 2000 quasi-longitudinal
panel final weights, respectively.
In the second part, for each non-original
young sample child (aged eight or less) whose designated parent is an original
sample person, three intermediate weights were created by giving the child the
designated parent's traditional longitudinal, 1999 quasi-longitudinal,
and 2000 quasi-longitudinal annual weights, respectively.
Each of the three intermediate weights of
these weighted non‑original children was independently adjusted by ratio
adjusting their weighted totals to various controls (classified by sex, race,
and ethnicity) for June 2000. Once the
ratio adjustment factors were applied to the intermediate weights, these non‑original
children were assigned their final traditional longitudinal, 1999 quasi‑longitudinal,
and 2000 quasi‑longitudinal annual weights. Their application is
described in the next section (see below).
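The second part of Step 8 can be sketched as follows for one of the three weights. This is an illustration under stated assumptions: the adjustment cells stand in for the sex, race, and ethnicity classification, the control totals are hypothetical rather than the June 2000 controls, and the ratio factor is taken to be the control total divided by the weighted total of the children in the cell.

```python
def child_annual_weights(children, parent_weights, controls):
    """Give each child the designated parent's annual weight, then ratio
    adjust the children's weighted total in each cell to its control."""
    intermediate = [parent_weights[c["parent_id"]] for c in children]
    totals = {}
    for child, w in zip(children, intermediate):
        totals[child["cell"]] = totals.get(child["cell"], 0.0) + w
    final = []
    for child, w in zip(children, intermediate):
        cell_total = totals[child["cell"]]
        factor = controls[child["cell"]] / cell_total if cell_total > 0 else 0.0
        final.append(w * factor)
    return final


# Hypothetical parents, children, and control totals.
parent_weights = {"p1": 1000.0, "p2": 3000.0, "p3": 0.0}
children = [
    {"parent_id": "p1", "cell": "male"},
    {"parent_id": "p2", "cell": "male"},
    {"parent_id": "p1", "cell": "female"},
    {"parent_id": "p3", "cell": "female"},
]
controls = {"male": 5000.0, "female": 1200.0}
child_weights = child_annual_weights(children, parent_weights, controls)
```

A child whose designated parent has a zero weight keeps a zero weight, which matches the requirement that the parent have a positive weight.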
Longitudinal Weight Associated with Non-original Sample Children
The traditional longitudinal, 1999 quasi‑longitudinal,
and 2000 quasi‑longitudinal annual weights are for calculating
annual or calendar year estimates.
The traditional longitudinal annual
weight is positive only for original sample people who provided a self or
proxy interview, or whose data were imputed, in each and every reference month for SIPP, SPD 1997 Bridge, SPD 1998, SPD
1999, and SPD 2000, and for non-original sample children (aged eight or
less) born after the SIPP 1992/1993 Wave 1 interview whose designated parents
are original sample people and have a positive traditional longitudinal annual
weight.
The 1999 quasi-longitudinal annual
weight is positive only for original sample people who provided a self or
proxy interview, or whose data were imputed, in each and every reference month for
SIPP, SPD 1997 Bridge, and SPD 1999, and for non-original sample children
(aged eight or less) born after the SIPP 1992/1993 Wave 1 interview whose
designated parents are original sample people and have a positive 1999 quasi-longitudinal
annual weight.
The 2000 quasi-longitudinal annual
weight is positive only for original sample people who provided a self or
proxy interview, or whose data were imputed, in each and every reference month for
SIPP, SPD 1997 Bridge, and SPD 2000, and for non-original sample children
(aged eight or less) born after the SIPP
1992/1993 Wave 1 interview whose designated parents are original sample people
and have a positive 2000 quasi-longitudinal annual weight.
The annual weights of the weighted
non-original sample children, together with the annual weights of the original
sample adults and children associated with the traditional longitudinal (1999
quasi-longitudinal, 2000 quasi-longitudinal) panel weight, enable users to
make annual estimates of young children's characteristics. When calculating an annual estimate of a
characteristic of young children, the portion of the estimate
contributed by the weighted non-original sample children must be calculated
based only on the weighted non-original sample children born in or prior
to the calendar year under consideration.
The non-original sample children weights, for those aged eight or less,
are intended only to approximate growth in the universe. Caution should be used when
interpreting results for children aged eight or less. The annual estimates of the
characteristics of these children for a calendar year closer to 2000 are likely
to be more accurate than those for a calendar year further away from 2000. This is because all three types of annual
weights for these children were adjusted to improve the population coverage
for June 2000.
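The birth-year restriction described above can be sketched as follows. The field names (`annual_weight`, `non_original_child`, `birth_year`) are hypothetical stand-ins for whichever annual weight and birth-year variables the analysis uses, and the values are invented for the example.

```python
def annual_child_estimate(persons, year, has_characteristic):
    """Sum annual weights over children with the characteristic, counting a
    non-original sample child only if born in or before the given year."""
    total = 0.0
    for p in persons:
        if p["non_original_child"] and p["birth_year"] > year:
            continue  # not yet born in the calendar year under consideration
        if has_characteristic(p):
            total += p["annual_weight"]
    return total


# Hypothetical children; the characteristic is program participation.
persons = [
    {"annual_weight": 1500.0, "non_original_child": False, "birth_year": 1991, "on_program": True},
    {"annual_weight": 1200.0, "non_original_child": True, "birth_year": 1996, "on_program": True},
    {"annual_weight": 900.0, "non_original_child": True, "birth_year": 1999, "on_program": True},
]
estimate_1998 = annual_child_estimate(persons, 1998, lambda p: p["on_program"])
estimate_2000 = annual_child_estimate(persons, 2000, lambda p: p["on_program"])
```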
Estimation of Person Characteristics
Some basic types of longitudinal
estimates that can be constructed using the longitudinal weight are described
below in terms of estimated numbers. Of
course, more complex estimates, such as percents, averages, and ratios can be
constructed from the estimated numbers.
The SPD longitudinal weights can be used to construct the following
types of longitudinal estimates:
1. The number of people who have ever experienced a characteristic or situation during
a given time. (For example, the number of people who experienced unemployment during 1999.)
To construct such an estimate, sum the weights over all people who
possessed the characteristic of interest at some point during the time period
of interest.
2. The amount of a characteristic accumulated by people during a given time. (For example, the amount of unemployment
compensation received by unemployed people during 1999.) To construct such an estimate, compute the
product of the weight times the amount of the characteristic and sum this
product over all appropriate people.
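The two estimate types above reduce to a weighted count and a weighted sum. The sketch below is illustrative: `weight`, `unemployed_1999`, and `uc_amount_1999` are hypothetical field names, and the values are invented.

```python
def estimate_count(persons, had_characteristic):
    """Type 1: sum the weights of people who ever had the characteristic."""
    return sum(p["weight"] for p in persons if had_characteristic(p))


def estimate_amount(persons, amount_var, in_scope):
    """Type 2: sum weight times amount over all appropriate people."""
    return sum(p["weight"] * p[amount_var] for p in persons if in_scope(p))


# Hypothetical person records with a longitudinal weight.
persons = [
    {"weight": 2000.0, "unemployed_1999": True, "uc_amount_1999": 3000.0},
    {"weight": 1500.0, "unemployed_1999": False, "uc_amount_1999": 0.0},
    {"weight": 1000.0, "unemployed_1999": True, "uc_amount_1999": 500.0},
]
n_unemployed = estimate_count(persons, lambda p: p["unemployed_1999"])
uc_total = estimate_amount(persons, "uc_amount_1999", lambda p: p["unemployed_1999"])
```

Percents, averages, and ratios follow by dividing one such estimate by another, for example `uc_total / n_unemployed` for the average amount per unemployed person.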
Longitudinal Research Using This File
The SPD is designed exclusively to
support longitudinal analysis of the impact of welfare reform. The Second Longitudinal SPD data can be
linked to the 1992 and 1993 SIPP Panel and Cross-sectional files, the 1997 SPD
Bridge, the 1998 SPD file and the First Longitudinal SPD file using the
following variables:
SIPP Panel Number                     SIPP_PNL
Sample Unit Identification Number     PP_ID
Entry ID                              PP_ENTRY
Person Number                         PP_PNUM
Address ID                            ADDIDE(2,3,4,7,8,9,0)
A Longitudinal weight is assigned to
100-level persons with full panel weights in the 1992/1993 SIPP file who were
successfully interviewed in 2000. Note
the full panel weights on the SIPP files were assigned to 100-level people who
were interviewed for the entire time they remained in the SIPP universe or who
had at most 1 missing interview bounded by successful interviews.
The SPD data represent the behavior and
characteristics of people in two fixed cohorts. One cohort represents the population as it existed in March 1992
from the 1992 panel of the SIPP and the other population as of March 1993 from
the 1993 panel. This is not a
traditional longitudinal survey in that it does not repeat the same measure
throughout the period. The rounds of
SPD interviewing, beginning with the Bridge in 1997, do not represent
cross-sectional snapshots of the U.S. population. They do offer insight into the current condition of the
people who were in the U.S. population in the early 1990s, just prior to welfare reform.
The core information common throughout
the data collection (although reference periods and question phrasing vary)
consists of basic demographics, labor force activity, income, and program
participation. The longitudinal file
consists of data collected using three different instruments, each with
variations in wording and context.
Obtaining Access to SAQ Data
The SAQ data (adult and adolescent) will
only be available through the Census Bureau's
Research Data Centers. Contact the
Research Data Center staff for the requirements for reviewing the SAQ data.
[1] A full detailed
specification for calculating adjustment factors can be obtained from the
memorandum dated January 4, 2002, on "Weighting Procedure for the SPD Second
Longitudinal File" for C. E. Bowie from A. R. Tupek.
[2] The sample cut procedure
and the sampling rates (sample cut factor) for each of the six household
strata of the SPD Bridge sample can be obtained from the memorandum
dated January 22, 1999, on "1998 SPD: Sampling Specifications" for C. E. Bowie from L. S. Cahoon.
Contact: (dsd.survey.program.dynamics@census.gov)