Brief Overall Description of the Dataset: The National Association of REALTORS conducts research on many topics, including market data, commercial, international, home buying and selling, NAR member information, and technology. The majority of the data reports median housing prices across the country. Local associations in the four census regions of the United States (Northeast, South, Midwest, and West) report the data as houses are purchased and sold. Data is collected monthly and reported within a month of collection. Datasets dating from 1989 to 2015 are available for purchase.
Link:
http://www.realtor.org/research-and-statistics/research-reports
http://www.realtor.org/topics/existing-home-sales/expansion-and-survey
Date Inventory Completed: 5/22/2015
Screening
- Is the data collected opinion-based?
- Is the data collection recurring (must be collected at least annually)?
- Is there data available for 2013?
- For Housing: Is the data collected at the property or housing unit level?
- Can we access the data by August 15th?
Purpose
What is the purpose of the organization collecting the data?
To better monitor and analyze the housing market in the United States.
Why is it collected and how does the organization use it?
This data is collected to “preserve the free enterprise system and the right to own real property”.
Who else uses the data?
Realtors, researchers
Who do they sell the data to?
Businesses, government
Method
What is the data collection method?
After receiving the raw data in online questionnaire format from local associations, the National Association of Realtors divides the nation into the four census regions: Northeast, South, Midwest, and West. The raw data is cleaned and “problematic data” is removed. The aggregated raw volume figures are then weighted so that they accurately represent sales activity in each of the four census regions. These weights are re-benchmarked every ten years to measure shifts in regional demand. Finally, the non-seasonally adjusted volume is converted into a seasonally adjusted annualized rate. Source: http://www.realtor.org/topics/existing-home-sales/methodology
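The weighting and annualization steps described above can be illustrated with a minimal Python sketch. All numbers here are hypothetical: the regional counts, the benchmark weights, and the single multiplicative seasonal factor are made up for illustration, and the arithmetic is a common simplification rather than NAR's actual procedure, which is documented at the methodology link.

```python
# Hypothetical monthly sales counts (not seasonally adjusted) per census region.
raw_volume = {"Northeast": 40_000, "Midwest": 80_000, "South": 120_000, "West": 88_000}

# Hypothetical benchmark weights that scale reported counts up to regional totals.
weights = {"Northeast": 1.5, "Midwest": 1.25, "South": 1.5, "West": 1.25}

# Hypothetical multiplicative seasonal factor for the month (> 1 means a busy month).
seasonal_factor = 1.25


def weighted_volume(raw, region_weights):
    """Apply each region's weight to its raw monthly volume."""
    return {region: raw[region] * region_weights[region] for region in raw}


def to_saar(monthly_volume, factor):
    """Remove the seasonal effect, then annualize by multiplying by 12 months."""
    return monthly_volume / factor * 12


regional = weighted_volume(raw_volume, weights)
national_monthly = sum(regional.values())
national_saar = to_saar(national_monthly, seasonal_factor)

print(f"Weighted national monthly volume: {national_monthly:,.0f}")
print(f"Seasonally adjusted annualized rate: {national_saar:,.0f}")
```

In this sketch the annualized figure answers the question "how many homes would sell in a year if this month's seasonally adjusted pace continued for twelve months", which is why it is much larger than the monthly count.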
What is the type of data collected?
Designed collection through survey
If designed, who created the questions?
Researchers
What is the raw source of the collected data (prior to any aggregation)?
Local Realtor Associations fill out information on past and existing listings http://www.realtor.org/topics/existing-home-sales/expansion-and-survey
Description
- What is the general topic of the data (1-2 words)?
Median housing prices
What are the earliest and latest dates for which data is available?
1989-2015
Is data collected and available periodically?
Yes, monthly
How soon after a reference period ends can a data source be prepared and provided?
Within a month
Selectivity
What is the universe (e.g., population) that the data represents?
Single-family homes, condos, and co-ops in the United States
Accessibility
How is the data accessed?
Digital download (file type not specified)
Is it open data?
No. It must be purchased through NAR's online store.
Any legal, regulatory, or administrative restrictions on accessing the data source?
No
Cost? - One time or annual or project based payment?
One time cost
Existing Home Sales Historical Data File (1989-2015) - 15 spreadsheets - $795
Metro Area Median Price Historical Data File (1989-2015) - 4 spreadsheets - $795
Pending Home Sales Historical Data File (2001-2015) - 2 spreadsheets - $395
Housing Affordability Index Historical Data File (1989-2015) - 7 spreadsheets - $395
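For budgeting, the one-time cost of buying all four files can be totaled from the prices listed above. The short Python sketch below only restates those list prices; it assumes no taxes, discounts, or member pricing, none of which are described in the source.

```python
# Prices in USD as listed above (one-time purchase per file).
prices = {
    "Existing Home Sales Historical Data File": 795,
    "Metro Area Median Price Historical Data File": 795,
    "Pending Home Sales Historical Data File": 395,
    "Housing Affordability Index Historical Data File": 395,
}

total_cost = sum(prices.values())
print(f"Total for all four files: ${total_cost:,}")  # prints "Total for all four files: $2,380"
```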
Does this dataset appear to meet our needs for the Census study? YES
Full Inventory
Description
- What is the general contents of the data source?
Bed/Baths, sale price, location
- Features
- What is the temporal nature of the data: longitudinal, time-series, or one time point?
Time series
- Geospatial? If Yes, at what level?
Yes, but only at the level of the four census regions: each data point is assigned to the Northeast, South (which includes Virginia), West, or Midwest.
Metadata
- Is there information available to assess the transparency and soundness of the methods to gather the data for our purposes?
None specified
- Does this universe match the stated intentions for the data collection? If not, what has been included or excluded and why?
Unknown
- Is there a description of each variable in the source along with their valid values?
None specified
- Are there unique IDs for unique elements that can be used for linking data?
None specified
- Is there a data dictionary or codebook?
None specified
Selectivity
- What unit is represented at the record level of the data source?
Property
What is the sampling technique used (if applicable)?
None specified
What was the coverage?
None specified
Stability/Coherence
- Were there any changes to the universe of data being captured (including geographical areas covered) and if so what were they?
None specified
- Were there any changes in the data capture method and if so what were they? (e.g., revised questions, data collection mode, classification categories, algorithms for social media data)
None specified
- Were there any changes in the sources of data and if so what were they?
Additional associations continue to join the data collection monthly.
Accuracy
- Any known sources of error?
None specified
- Describe any quality control checks performed by the data’s owner.
Checks for problematic data caused by “changes in association/board/MLS physical jurisdiction, changes in MLS vendors and/or staff, lack of response by associations/boards and erroneous data.”
Accessibility
- Any records or fields collected but not included in the data source (such as for confidentiality reasons)?
None are listed
- Is there a subset of variables and/or data that must be obtained through a separate process (e.g. state level data openly available, but one must apply to get census tract)? If yes, are there separate legal, regulatory, or administrative restrictions on accessing the data source? Cost? - One time or annual or project based payment?
None are listed
Privacy and security
- Was consent given by participant? If so, how was consent given?
Local associations entered housing data voluntarily; no PII was provided for the owners of the homes.
- Are there legal limitations or restrictions on the use of the data?
No
- What confidentiality policies does the source have?
Does not include PII.
Research
- What research has been done with this dataset?
Housing outlook, Economics outlook, rent studies
- Include any links to research if provided:
http://www.realtor.org/research-and-statistics
- List any other data use notes provided by the supplier.
Uses data from SentriLock to record foot traffic into homes for sale: http://www.realtor.org/infographics/foot-traffic
Gaps/Concerns
- Feasibility - can all jurisdiction levels provide the data (if applicable)?
Yes, it is available for the whole country
- Data ownership - a lack of clarity in legal guidance stemming from a lack of clarity with who owns digital data?
- Data collection authority - what data is reasonably private and what constitutes unwarranted intrusion?
- Describe any other notes you have or any gaps/concerns you see with this dataset:
The data is currently reported only by the four census regions, which are quite broad. It might not be as helpful as a more geographically specific data source.