VT Census Case Studies : Williamsburg Local MLS Data

Brief Overall Description of the Dataset: We obtained data from the Williamsburg Local MLS Data Base on properties in James City County (and, I believe, some properties outside the county) from 2005 to 2015 (2015 is a partial year).  The data was obtained by contacting the Williamsburg Area Association of Realtors.  From looking at the data, it appears that data was collected each year on homes that sold during that year.  For each sold home, the data lists the taxes on the home, the number of bedrooms, the sale price, and other variables.

Link: http://www.waarealtor.com/members.html 

Date Inventory Completed: 6/2/2015

Screening

  • Is the data collected opinion-based?
  • Is the data collection recurring (must be collected at least annually)?
  • Is there data available for 2013?
  • Is the data collected at the property or housing unit level? 
  • Can we access the data by August 15th?

Purpose

  • What is the purpose of the organization collecting the data?

This is the listed purpose of the organization: “The Williamsburg Area Association of REALTOR®S is a professional trade association of over 400 licensed real estate agents.  Our members abide by a strict code of ethics and have access to a wide variety of business services that are not available to non-REALTORS. This gives them a competitive edge in the marketplace, enabling them to provide superior services to buyers and sellers of real property.   One of the most valuable tools only available to REALTORS® is a state-of-the-art MLS (Multiple Listing Service) which provides accurate, current information about the local market and more. Through the Williamsburg Multiple Listing Service (WMLS) REALTORS® have access to many “Value Added Services” not found elsewhere.  And, ever serving our members, the WMLS has become part of a new venture – the Commonwealth MLS Co-op – which upon launch late fall 2015, will greatly expand the footprint of area covered.”

  • Why is it collected and how does the organization use it? 

The organization collects the data to make it available to members of the Association.

  • Who else uses the data?

Realtors are normally the only people who would access this data.  From our experience, it seems that policy-makers and researchers can access the data as well.

  • Who do they sell the data to?

They do not sell the data to anyone, but I believe you have to pay to be a member (and therefore the data is available to paying members).

 

Method

  • What is the data collection method?

The data is administrative data, but I’m not sure how it is collected.

  • What is the type of data collected?

The data is administrative data.

  • If designed, who created the questions?

Not designed.

  • What is the raw source of the collected data (prior to any aggregation)?

The raw source would be the administrative forms.


Description

  • What is the general topic of the data (1-2 words)? 

MLS James City Data (Williamsburg Area Association Property Data)

  • What are the earliest and latest dates for which data is available?

2005 to 2015 (2015 is a partial year)

  • Timeliness

    • Is data collected and available periodically?

Yes

    • How soon after a reference period ends can a data source be prepared and provided?

I believe the turn-around time is a few days at most.


Selectivity

  • What is the universe that the data represents?

Properties in James City County (and other places in Virginia: York County, New Kent, and City of Williamsburg) from 2005 to 2015.


Accessibility

  • How is the data accessed? 

It can be obtained in Excel format.

    • Is it open data? 

No

    • Any legal, regulatory, or administrative restrictions on accessing the data source?

Unclear

    • Cost? - One time or annual or project based payment?

There does not seem to be any cost.


Does this dataset appear to meet our needs for the Census study? YES


Full Inventory 

Description

  • Features

    • What is the temporal nature of the data: longitudinal, time-series, or one time point?

Time-series

    • Geospatial? If Yes, at what level?

Yes, there is geospatial data at the latitude and longitude level.

  • What is the scope of the records?

Properties in James City County (and possibly York County, New Kent, and City of Williamsburg)


Metadata

  • Is there information available to assess the transparency and soundness of the methods to gather the data for our purposes (i.e., supplementing the census)?

The data is from a Realtor Association, which keeps this data up-to-date for its members.  Therefore, since this Association wants to provide good “products” to its members, we can say the data collection method should be sound.  With regard to transparency, one would have to contact the Association directly to see if the organization would tell you about the methods it used to collect the data.

  • Is there a description of each variable in the source along with their valid values?

Each column has a header; many of these headers are pretty self-explanatory.  However, there is not a separate page with descriptions of all of the variables (which would be useful).    

  • Are there unique IDs for unique elements that can be used for linking data?

There is a parcel ID number that goes with each sold property.  One would have to do a bit more digging to see if one could use these ID numbers to track properties.  The addresses are also included for each sold property, so you could link by address as well.
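
As a starting point for that digging, here is a minimal sketch of how one could check whether parcel IDs repeat across sale records.  The field names (`parcel_id`, `year`, `address`) and the sample rows are made up for illustration; the actual column headers in the MLS files may differ.

```python
from collections import defaultdict


def group_sales_by_parcel(records, id_field="parcel_id"):
    """Group sale records by parcel ID, skipping records with a blank ID."""
    by_parcel = defaultdict(list)
    for rec in records:
        pid = (rec.get(id_field) or "").strip()
        if pid:
            by_parcel[pid].append(rec)
    return dict(by_parcel)


def repeat_sales(records, id_field="parcel_id"):
    """Return only the parcels that appear in more than one sale record."""
    grouped = group_sales_by_parcel(records, id_field)
    return {pid: recs for pid, recs in grouped.items() if len(recs) > 1}


# Made-up rows for illustration only; not taken from the MLS files.
sample = [
    {"parcel_id": "A-100", "year": "2006", "address": "1 Example Ln"},
    {"parcel_id": "B-200", "year": "2009", "address": "2 Example Ln"},
    {"parcel_id": "A-100", "year": "2013", "address": "1 Example Ln"},
    {"parcel_id": "", "year": "2010", "address": "3 Example Ln"},
]
repeats = repeat_sales(sample)
```

If the same parcel ID shows up in several years, that would suggest the IDs can be used to track a property over time.  Records with a blank ID would need to fall back on address matching.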

  • Is there a data dictionary or codebook? If so, put the link here and add to folder.

No, there is not a data dictionary or codebook.


Selectivity

  • What unit is represented at the record level of the data source?

Selling of a property

  • Does this universe match the stated intentions for the data collection? If not, what has been included or excluded and why?

MLS data does not include self-sold homes

  • What is the sampling technique used (if applicable)? 

Not applicable

  • What was the coverage?

Unknown


Stability/Coherence

  • Were there any changes to the universe of data being captured (including geographical areas covered) and if so what were they? 

I am not aware of any changes.

  • Were there any changes in the data capture method and if so what were they?

Unknown

  • Were there any changes in the sources of data and if so what were they? 

Unknown: from a quick look at the data over time, it seems the variables have stayed pretty consistent (which would mean the Association has been recording the same data over time).  However, one would need to do a more thorough analysis to know for sure whether the variables have changed.
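
One possible way to start the more thorough analysis is to compare the column headers of each year's file against the earliest year.  The sketch below assumes the headers have already been read into a dictionary keyed by year; the header names shown are hypothetical and are not the actual MLS column names.

```python
def compare_headers(headers_by_year):
    """Report, per year, columns added or dropped relative to the earliest year."""
    years = sorted(headers_by_year)
    baseline = set(headers_by_year[years[0]])
    report = {}
    for year in years[1:]:
        current = set(headers_by_year[year])
        report[year] = {
            "added": sorted(current - baseline),
            "dropped": sorted(baseline - current),
        }
    return report


# Hypothetical headers for illustration only.
headers_by_year = {
    2005: ["Address", "Bedrooms", "SoldPrice", "Taxes"],
    2006: ["Address", "Bedrooms", "SoldPrice", "Taxes"],
    2007: ["Address", "Bedrooms", "SoldPrice", "TaxAmount"],
}
report = compare_headers(headers_by_year)
```

A year with empty "added" and "dropped" lists has the same variables as the first year.  This only checks header names; it would not catch a variable whose definition changed while its name stayed the same.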


Accuracy

  • Any known sources of error?

There are NA values/blanks throughout the data sets, but I believe this is because, for certain properties, some of the variables would not apply.
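
To check whether the blanks really are concentrated in variables that would not apply to certain properties, one could count missing values per column.  This is a rough sketch only: the set of strings treated as missing and the column names in the sample rows are assumptions, not taken from the data.

```python
from collections import Counter

# Assumed markers for a missing value; adjust to what the files actually use.
MISSING = {"", "NA", "N/A"}


def missing_counts(rows):
    """Count, per column, how many rows have a blank or NA-like value."""
    counts = Counter()
    for row in rows:
        for col, val in row.items():
            if val is None or str(val).strip().upper() in MISSING:
                counts[col] += 1
    return dict(counts)


# Made-up rows for illustration only.
rows = [
    {"Bedrooms": "3", "LotAcres": "NA", "Taxes": "2100"},
    {"Bedrooms": "", "LotAcres": "0.5", "Taxes": "1800"},
    {"Bedrooms": "4", "LotAcres": "na", "Taxes": None},
]
counts = missing_counts(rows)
```

Columns with many missing values could then be reviewed by property type to see whether the blanks are structural or a sign of data entry gaps.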

  • Describe any quality control checks performed by the data’s owner.

Unknown 


Accessibility

  • Any records or fields collected but not included in the data source (such as for confidentiality reasons)?

The names of the people who bought the properties are not included in the data sets.

  • Is there a subset of variables and/or data that must be obtained through a separate process? If yes, are there separate legal, regulatory, or administrative restrictions on accessing the data source?  Cost? - One time or annual or project based payment?

No, there is not a subset of variables that must be obtained via a separate process.

 

Privacy and security

  • Was consent given by participant? If so, how was consent given?

Unknown

  • Are there legal limitations or restrictions on the use of the data?

Since we were given the data by the Association, there shouldn’t be any limitations or restrictions.

  • What confidentiality policies does the source have?

We are not aware of any at the present time because we did not have to go through a process to get the data; it was just given to us.  However, before we start using the data, we should double-check whether there are any confidentiality policies.


Research

  • What research has been done with this dataset?

  • Include any links to research if provided:  

  • List any other data use notes provided by the supplier.

  • Describe any other notes you have or any gaps/concerns you see with this dataset:

It was slightly concerning that the person I contacted at the Association sent the data to me via email.  

From the description of the Association itself, we know there will be a new MLS resource at the Association in the Fall.  Therefore, I believe we should contact the Association again at that time to see if we can obtain the new data made available by this source.