VT Census Case Studies: National Association of REALTORS

Brief Overall Description of the Dataset: The National Association of REALTORS (NAR) conducts research on many topics, including market data, commercial, international, home buying and selling, NAR member information, and technology. The majority of the data reports median housing prices across the country. Local associations in the four census regions of the United States (Northeast, South, Midwest, and West) report this data as houses are purchased and sold. Data is collected monthly and reported within a month after collection. Datasets are available for purchase covering 1989 to 2015.

Screening

  • Is the data collected opinion-based?
  • Is the data collection recurring (must be collected at least annually)?
  • Is there data available for 2013?
  • For Housing: Is the data collected at the property or housing unit level? 
  • Can we access the data by August 15th?

Purpose

  • What is the purpose of the organization collecting the data?

To better monitor and analyze the housing market in the United States. 

  • Why is it collected and how does the organization use it?

This data is collected to “preserve the free enterprise system and the right to own real property”.

  • Who else uses the data?

Realtors, researchers

  • Who do they sell the data to?

Businesses, government

 

Method

  • What is the data collection method?  

After receiving the raw data in online questionnaire format from local associations, the National Association of Realtors divides the nation into four census regions: Northeast, South, Midwest, and West. The raw data is cleaned and “problematic data” is removed. The aggregated raw volume figures are then weighted so that they accurately represent the sales activity in each of the four census regions. These weights are re-benchmarked every ten years to measure shifts in regional demand. Finally, the non-seasonally adjusted volume is converted into seasonally adjusted annualized rates. http://www.realtor.org/topics/existing-home-sales/methodology
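As a rough illustration of the weighting and annualizing steps described above, the sketch below scales each region's raw count by a regional weight and then converts a monthly total to an annualized rate. The weights, the seasonal factor, and the function names are all hypothetical; NAR's actual weights and seasonal adjustment procedure are not given in this inventory. The only assumption about the method is the common convention that an annualized rate is the seasonally adjusted monthly volume multiplied by 12.

```python
# Illustrative sketch only: the weights, seasonal factor, and function names
# below are hypothetical and are not NAR's published values or code.

REGIONS = ("Northeast", "Midwest", "South", "West")


def weighted_regional_sales(raw_counts, weights):
    """Scale each region's raw reported sales by its regional weight."""
    return {region: raw_counts[region] * weights[region] for region in REGIONS}


def seasonally_adjusted_annual_rate(monthly_volume, seasonal_factor):
    """Convert one month's non-seasonally-adjusted volume to an annualized rate.

    Dividing by the month's seasonal factor removes the seasonal pattern;
    multiplying by 12 expresses the result as a yearly pace.
    """
    return monthly_volume / seasonal_factor * 12


# Made-up sample inputs for a single month.
raw = {"Northeast": 1000, "Midwest": 2000, "South": 4000, "West": 3000}
weights = {"Northeast": 50.0, "Midwest": 45.0, "South": 40.0, "West": 42.0}

regional = weighted_regional_sales(raw, weights)
national_monthly = sum(regional.values())
saar = seasonally_adjusted_annual_rate(national_monthly, seasonal_factor=1.10)
```

A seasonal factor above 1 marks a month that is typically busier than average, so dividing by it lowers that month's figure before annualizing.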

  • What is the type of data collected? 

Designed collection through survey

  • If designed, who created the questions?

Researchers

  • What is the raw source of the collected data (prior to any aggregation)? 

Local Realtor Associations fill out information on past and existing listings http://www.realtor.org/topics/existing-home-sales/expansion-and-survey


Description

  • What is the general topic of the data (1-2 words)?

 Median housing prices

  • What are the earliest and latest dates for which data is available?

1989-2015

  • Is data collected and available periodically?

Yes, monthly

  • How soon after a reference period ends can a data source be prepared and provided? 

Within a month


Selectivity

  • What is the universe (e.g., population) that the data represents?

Single-family homes, condos, and co-ops in the United States

 

Accessibility

  • How is the data accessed? 

Digital download (file type not specified)

  • Is it open data?

No, it must be purchased through NAR's online store

  • Any legal, regulatory, or administrative restrictions on accessing the data source?

No

  • Cost? - One time or annual or project based payment?

One time cost

 Existing Home Sales Historical Data File (1989-2015) - 15 spreadsheets - $795

 Metro Area Median Price Historical Data File (1989-2015) - 4 spreadsheets - $795

 Pending Home Sales Historical Data File (2001-2015) - 2 spreadsheets - $395

 Housing Affordability Index Historical Data File (1989-2015) - 7 spreadsheets - $395

Does this dataset appear to meet our needs for the Census study? YES

Full Inventory

Description

  • What are the general contents of the data source?

Number of bedrooms and bathrooms, sale price, location

  • Features
    • What is the temporal nature of the data: longitudinal, time-series, or one time point?

Time series

    • Geospatial? If Yes, at what level?

Yes, but only at the level of the four census regions: each data point is assigned to the Northeast, South (which includes Virginia), Midwest, or West.
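Because the geography is limited to census regions, any state-level data linked to this source would first need to be rolled up to the region level. A minimal sketch of that lookup follows, using the standard U.S. Census Bureau region definitions (including the District of Columbia). The names `CENSUS_REGIONS` and `region_for_state` are illustrative and are not part of the NAR files.

```python
# U.S. Census Bureau regions by two-letter state abbreviation (DC included).
# The names used here are illustrative; they do not come from the NAR data.
CENSUS_REGIONS = {
    "Northeast": "CT ME MA NH RI VT NJ NY PA",
    "Midwest": "IL IN MI OH WI IA KS MN MO NE ND SD",
    "South": "DE DC FL GA MD NC SC VA WV AL KY MS TN AR LA OK TX",
    "West": "AZ CO ID MT NV NM UT WY AK CA HI OR WA",
}

# Invert the table so each state maps directly to its region.
STATE_TO_REGION = {
    state: region
    for region, states in CENSUS_REGIONS.items()
    for state in states.split()
}


def region_for_state(state_abbr):
    """Return the census region for a two-letter state abbreviation."""
    return STATE_TO_REGION[state_abbr.upper()]
```

For example, `region_for_state("VA")` returns `"South"`, matching the note above that Virginia falls in the South region.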


Metadata

  • Is there information available to assess the transparency and soundness of the methods to gather the data for our purposes?

None specified

 

  • Does this universe match the stated intentions for the data collection? If not, what has been included or excluded and why?

Unknown

  • Is there a description of each variable in the source along with their valid values?

None specified

  • Are there unique IDs for unique elements that can be used for linking data?

None specified

  • Is there a data dictionary or codebook?

None specified

 

Selectivity

  • What unit is represented at the record level of the data source? 

 Property

  • What is the sampling technique used (if applicable)?

None specified

  • What was the coverage? 

None specified


Stability/Coherence

  • Were there any changes to the universe of data being captured (including geographical areas covered) and if so what were they?

None specified

  • Were there any changes in the data capture method and if so what were they? (e.g., revised questions, data collection mode, classification categories, algorithms for social media data)

None specified

  • Were there any changes in the sources of data and if so what were they? 

More associations continue to be added to the data collection monthly

 

Accuracy

  • Any known sources of error?

None specified

  • Describe any quality control checks performed by the data’s owner.

NAR checks for problematic data caused by “changes in association/board/MLS physical jurisdiction, changes in MLS vendors and/or staff, lack of response by associations/boards and erroneous data.”

 

Accessibility

  • Any records or fields collected but not included in the data source (such as for confidentiality reasons)?

None are listed

  • Is there a subset of variables and/or data that must be obtained through a separate process (e.g., state-level data is openly available, but one must apply to get census tract data)? If yes, are there separate legal, regulatory, or administrative restrictions on accessing the data source? Cost? - One time or annual or project based payment?

None are listed


Privacy and security

  • Was consent given by participant? If so, how was consent given?

Local associations entered housing data voluntarily; no PII was given for the owners of the homes.

  • Are there legal limitations or restrictions on the use of the data? 

No

  • What confidentiality policies does the source have? 

Does not include PII.

 

Research

  • What research has been done with this dataset?

Housing outlook, Economics outlook, rent studies

  • Include any links to research if provided:

http://www.realtor.org/research-and-statistics

  • List any other data use notes provided by the supplier.

Data from SentriLock is used to record foot traffic into homes for sale. http://www.realtor.org/infographics/foot-traffic

 

Gaps/Concerns

  • Feasibility - can all jurisdiction levels provide the data (if applicable)?

Yes, it is available for the whole country

  • Data ownership - a lack of clarity in legal guidance stemming from a lack of clarity with who owns digital data?
  • Data collection authority - what data is reasonably private and what constitutes unwarranted intrusion?
  • Describe any other notes you have or any gaps/concerns you see with this dataset:

The data is currently reported only for the four census regions, which are quite broad. It might not be as helpful as a more geographically specific data source.