Brief Overall Description of the Dataset:
“The Consumer Price Index (CPI) is a measure of the average change over time in the prices paid by urban consumers for a market basket of consumer goods and services.”
Link: http://www.bls.gov/cpi/cpiovrvw.htm#item1
Date Inventory Completed: 6/18/2015
Screening
- Is the data collected opinion-based?
- Is the data collection recurring (must be collected at least annually)?
- Is there data available for 2013?
- Is the data collected at the property or housing unit level?
- Can we access the data by August 15th?
Purpose
What is the purpose of the organization collecting the data?
The Bureau of Labor Statistics (BLS) is the federal agency that measures such things as the unemployment rate and the labor force participation rate. Its purpose is to provide the federal government and the public with official statistics on labor-related topics.
Why is it collected and how does the organization use it?
Producing statistics of this kind is part of the BLS's mandate: it collects price data, computes the CPI from that data, and uses the resulting statistics in the reports it is required to publish.
More specifically, these are the purposes the BLS cites:
“As an economic indicator. As the most widely used measure of inflation, the CPI is an indicator of the effectiveness of government policy. In addition, business executives, labor leaders and other private citizens use the index as a guide in making economic decisions.
As a deflator of other economic series. The CPI and its components are used to adjust other economic series for price change and to translate these series into inflation-free dollars.
As a means for adjusting income payments. Over 2 million workers are covered by collective bargaining agreements which tie wages to the CPI. The index affects the income of almost 80 million people as a result of statutory action: 47.8 million Social Security beneficiaries, about 4.1 million military and Federal Civil Service retirees and survivors, and about 22.4 million food stamp recipients. Changes in the CPI also affect the cost of lunches for the 26.7 million children who eat lunch at school. Some private firms and individuals use the CPI to keep rents, royalties, alimony payments and child support payments in line with changing prices. Since 1985, the CPI has been used to adjust the Federal income tax structure to prevent inflation-induced increases in taxes.”
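The deflator and payment-adjustment uses quoted above both come down to scaling a dollar amount by a ratio of index values. The following is a minimal sketch of that arithmetic; the function names and the index values (200.0 and 250.0) are hypothetical illustrations, not published BLS figures or BLS code.

```python
def to_real_dollars(nominal, cpi_period, cpi_base):
    """Express a nominal amount in base-period dollars using a price index."""
    if cpi_period <= 0 or cpi_base <= 0:
        raise ValueError("index values must be positive")
    return nominal * cpi_base / cpi_period


def escalate_payment(payment, cpi_old, cpi_new):
    """Scale a payment by the relative change in the index between two periods."""
    if cpi_old <= 0 or cpi_new <= 0:
        raise ValueError("index values must be positive")
    return payment * cpi_new / cpi_old


# Hypothetical index values chosen for easy arithmetic (not BLS data).
cpi_base_year = 200.0
cpi_later_year = 250.0

# A 50,000 salary paid in the later year, expressed in base-year dollars.
real_wage = to_real_dollars(50_000.0, cpi_later_year, cpi_base_year)

# A 1,000 rent set in the base year, escalated to keep pace with prices.
adjusted_rent = escalate_payment(1_000.0, cpi_base_year, cpi_later_year)

print(real_wage, adjusted_rent)
```

With these made-up values, a 25 percent rise in the index shrinks the salary to 40,000 in base-year dollars and raises the rent to 1,250.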
Who else uses the data?
Business, policy-makers, researchers, students
Who do they sell the data to?
They do not sell the data.
Method
What is the data collection method?
To create ‘the basket’: “The CPI market basket is developed from detailed expenditure information provided by families and individuals on what they actually bought. For the current CPI, this information was collected from the Consumer Expenditure Surveys for 2011 and 2012. In each of those years, about 7,000 families from around the country provided information each quarter on their spending habits in the interview survey. To collect information on frequently purchased items, such as food and personal care products, another 7,000 families in each of these years kept diaries listing everything they bought during a 2-week period. Over the 2 year period, then, expenditure information came from approximately 28,000 weekly diaries and 60,000 quarterly interviews used to determine the importance, or weight, of the more than 200 item categories in the CPI index structure.”
To create the index once the basket is established: “Each month, BLS data collectors called economic assistants visit or call thousands of retail stores, service establishments, rental units, and doctors' offices, all over the United States, to obtain information on the prices of the thousands of items used to track and measure price changes in the CPI.”
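To make the two steps above concrete, here is a simplified sketch: expenditure data is turned into item weights, and those weights are then applied to price changes to produce an index. This is a generic fixed-basket calculation for illustration only; it is not the BLS's actual estimation procedure, and the item names, expenditures, and prices are all hypothetical.

```python
def expenditure_shares(expenditures):
    """Turn reported spending per item category into weights that sum to 1."""
    total = sum(expenditures.values())
    if total <= 0:
        raise ValueError("total expenditure must be positive")
    return {item: amount / total for item, amount in expenditures.items()}


def fixed_basket_index(weights, base_prices, current_prices, base_value=100.0):
    """Weighted average of price relatives, scaled so the base period equals base_value."""
    index = 0.0
    for item, weight in weights.items():
        price_relative = current_prices[item] / base_prices[item]
        index += weight * price_relative
    return base_value * index


# Hypothetical survey spending (stands in for the expenditure survey results).
expenditures = {"food": 250.0, "housing": 500.0, "transport": 250.0}

# Hypothetical prices collected in the base period and in the current month.
base_prices = {"food": 4.0, "housing": 1000.0, "transport": 2.0}
current_prices = {"food": 5.0, "housing": 1100.0, "transport": 2.0}

weights = expenditure_shares(expenditures)
index_value = fixed_basket_index(weights, base_prices, current_prices)

print(round(index_value, 2))
```

Because housing carries half of the weight in this made-up basket, its 10 percent price increase moves the index more than the 25 percent increase in food does.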
What is the type of data collected?
Survey data/Designed collection
If designed, who created the questions?
BLS itself (government)
What is the raw source of the collected data (prior to any aggregation)?
The forms filled out by the individuals surveyed during the ‘basket creation’ process
The forms the economic assistants fill out every month when they gather information about prices
Description
What is the general topic of the data (1-2 words)?
Price Index
What are the earliest and latest dates for which data is available?
For monthly reports: October 2000 to the present (April 2015 at the time this form was filled out)
Is data collected and available periodically?
Yes
How soon after a reference period ends can a data source be prepared and provided?
About two to three weeks according to the BLS
Selectivity
What is the universe (e.g., population) that the data represents?
“The CPI reflects spending patterns for each of two population groups: all urban consumers and urban wage earners and clerical workers. The all urban consumer group represents about 87 percent of the total U.S. population. It is based on the expenditures of almost all residents of urban or metropolitan areas, including professionals, the self-employed, the poor, the unemployed, and retired people, as well as urban wage earners and clerical workers. Not included in the CPI are the spending patterns of people living in rural nonmetropolitan areas, farm families, people in the Armed Forces, and those in institutions, such as prisons and mental hospitals. Consumer inflation for all urban consumers is measured by two indexes, namely, the Consumer Price Index for All Urban Consumers (CPI-U) and the Chained Consumer Price Index for All Urban Consumers (C-CPI-U). (See the answer to Question 4 for an explanation of the differences between the CPI-U and C-CPI-U.)
The Consumer Price Index for Urban Wage Earners and Clerical Workers (CPI-W) is based on the expenditures of households included in the CPI-U definition that also meet two requirements: more than one-half of the household's income must come from clerical or wage occupations, and at least one of the household's earners must have been employed for at least 37 weeks during the previous 12 months. The CPI-W population represents about 32 percent of the total U.S. population and is a subset, or part, of the CPI-U population.”
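The two CPI-W requirements quoted above can be read as a simple membership test. The sketch below encodes them as written in the quote; the function name and its inputs are hypothetical, and it assumes the household already falls within the CPI-U population.

```python
def in_cpi_w_population(wage_clerical_income, total_income, weeks_employed_by_earner):
    """Check the two CPI-W requirements for a household already in the CPI-U population.

    wage_clerical_income: income from clerical or wage occupations
    total_income: total household income
    weeks_employed_by_earner: weeks worked in the previous 12 months, one entry per earner
    """
    if total_income <= 0:
        return False
    # Requirement 1: more than one-half of income from clerical or wage occupations.
    income_test = wage_clerical_income / total_income > 0.5
    # Requirement 2: at least one earner employed for at least 37 weeks.
    employment_test = any(weeks >= 37 for weeks in weeks_employed_by_earner)
    return income_test and employment_test


# Hypothetical households.
print(in_cpi_w_population(30_000, 50_000, [40, 10]))  # both requirements met
print(in_cpi_w_population(20_000, 50_000, [52]))      # income share too low
print(in_cpi_w_population(30_000, 50_000, [36, 20]))  # no earner reaches 37 weeks
```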
“It is important to understand that BLS bases the market baskets and pricing procedures for the CPI-U and CPI-W populations on the experience of the relevant average household, not of any specific family or individual. It is unlikely that your experience will correspond precisely with either the national indexes or the indexes for specific cities or regions.” (In other words, the CPI measures the prices faced by the “average household,” and it does so on an ongoing basis, since the index is released monthly.)
Accessibility
- How is the data accessed?
PDF and Excel (there might be others)
- Is it open data?
Yes (all the data on the website is open data; see below for more information about “restricted access”)
Any legal, regulatory, or administrative restrictions on accessing the data source?
No (unless you get access to the “restricted data”)
- Cost? - One time or annual or project based payment?
None
Does this dataset appear to meet our needs for the Census study? Yes
Full Inventory
Description
- Features
- What is the temporal nature of the data: longitudinal, time-series, or one time point?
Time series
- Geospatial? If Yes, at what level?
Yes, to a limited extent. The CPI is computed for some smaller areas, but not for every state or county. (“Four broad geographic regions, size of city distinctions, a region and city size-class cross classification, and 27 Metropolitan Statistical Areas”)
Metadata
- Is there information available to assess the transparency and soundness of the methods to gather the data for our purposes?
Yes, the BLS is very good about giving information about how the CPI is computed and, if we needed more detail, we could contact the BLS directly.
- Is there a description of each variable in the source along with their valid values?
The variables are identified (at times they could use more description). The BLS is a very reputable source, so we should be able to trust its data.
- Are there unique IDs for unique elements that can be used for linking data?
Not applicable
- Is there a data dictionary or codebook?
I could not find one
Selectivity
What unit is represented at the record level of the data source?
Does this universe match the stated intentions for the data collection? If not, what has been included or excluded and why?
To a certain extent. The CPI is supposed to measure prices for the average consumer, but it is not perfect: ideally, the basket would be updated more often and more prices would be collected.
What is the sampling technique used (if applicable)?
See the “What is the data collection method?” question above. The sampling technique would be documented with the Consumer Expenditure Survey. (I looked into it briefly and could not find the sampling technique. My guess would be some kind of random sampling.)
What was the coverage?
Response rate information can be found here: http://stats.bls.gov/cpi/publications.htm
Stability/Coherence
- Were there any changes to the universe of data being captured (including geographical areas covered) and if so what were they?
None explicitly stated
- Were there any changes in the data capture method and if so what were they?
None explicitly stated
- Were there any changes in the sources of data and if so what were they?
None explicitly stated
Accuracy
- Any known sources of error?
Yes
- Describe any quality control checks performed by the data’s owner.
Unknown
Accessibility
- Any records or fields collected, but not included in the data source (such as for confidentiality reasons)?
Unknown
- Is there a subset of variables and/or data that must be obtained through a separate process? If yes, are there separate legal, regulatory, or administrative restrictions on accessing the data source? Cost? - One time or annual or project based payment?
You can obtain more detailed data collected by the BLS for the CPI, but the BLS website does not say exactly which dimensions of the CPI are covered. Information on how to request access is here: http://www.bls.gov/bls/blsresda.htm#availabledata Most, if not all, of the access is free, but you have to be “on site” to do research.
Privacy and security
- Was consent given by participant? If so, how was consent given?
Yes, consent was given when the BLS surveyed people with the Consumer Expenditure Survey.
- Are there legal limitations or restrictions on the use of the data?
There are no limitations on the use of the public data. However, there are limitations on the restricted data that you have to access “on site.”
- What confidentiality policies does the source have?
See information regarding the “restricted data”
Research
- What research has been done with this dataset? (e.g., impact of policies, predictors of student success)
See: http://stats.bls.gov/cpi/publications.htm
- Include any links to research if provided:
- List any other data use notes provided by the supplier.
Gaps/Concerns
- Feasibility - can all jurisdiction levels provide the data (if applicable)?
- Data ownership - a lack of clarity in legal guidance stemming from a lack of clarity with who owns digital data?
- Data collection authority - what data is reasonably private and what constitutes unwarranted intrusion?
- Describe any other notes you have or any gaps/concerns you see with this dataset