Brief Overall Description of the Dataset:
Black Knight Financial Services (BKFS) is “the mortgage and finance industries’ leading provider of integrated technology, services and data solutions that facilitate and automate many of the business processes across the entire loan lifecycle.” The dataset originates from U.S. property records, directly collected and verified by BKFS. The dataset includes information on housing amenities, demographics, and building quality, which is used to better estimate any given house’s value and a homeowner’s expected mortgage rate for a given area. On January 3, 2014, Fidelity National Financial acquired Lender Processing Services (LPS), renaming it Black Knight.
Link: http://www.bkfs.com/Products/RealEstate/Pages/Real-Estate-Data-Analytics.aspx
Date Inventory Completed: 05/22/2015
Screening
- Is the data collected opinion-based?
- Is the data collection recurring?
- Is there data available for 2013?
- For Housing: Is the data collected at the property or housing unit level?
- Can we access the data by August 15th?
Purpose
What is the purpose of the organization collecting the data?
The organization collects this data to support financial analyses of housing and to make mortgage performance data available to investors.
Why is it collected and how does the organization use it?
The data is collected to give investors, lenders, and homebuyers better estimates of mortgages and loans when purchasing real estate.
Who else uses the data?
Investors, lenders, government agencies
Who do they sell the data to?
Investors, lenders, government agencies
Method
What is the data collection method?
Collected directly from multiple U.S. property records
What is the type of data collected?
Administrative data
If designed, who created the questions?
What is the raw source of the collected data (prior to any aggregation)?
Researchers collect and scrape data from public records
Description
What is the general topic of the data (1-2 words)?
Assessment data
What are the earliest and latest dates for which data is available?
Earliest date unknown; latest is the most recent month of 2015
Is data collected and available periodically?
Yes, monthly
How soon after a reference period ends can a data source be prepared and provided?
Approximately a month
Selectivity
What is the universe (e.g., population) that the data represents?
Properties in the U.S.
Accessibility
How is the data accessed?
The purchased data was made available through their API.
Is it open data?
No
- Any legal, regulatory, or administrative restrictions on accessing the data source?
No
- Cost? - One time or annual or project based payment?
$3,250 for Virginia, June 1, 2013 - December 31, 2015; $1,000 for each additional year
Does this dataset appear to meet our needs for the Census study? YES
Full Inventory
Description
- What are the general contents of the data source?
Housing amenities, bed/baths, demographics, building quality
- Features
- What is the temporal nature of the data: longitudinal, time-series, or one time point?
Time-series
- Geospatial? If Yes, at what level?
Yes, Addresses
Metadata
- Is there information available to assess the transparency and soundness of the methods to gather the data for our purposes?
No
- Is there a description of each variable in the source along with their valid values?
Yes
- Are there unique IDs for unique elements that can be used for linking data?
Addresses
- Is there a data dictionary or codebook?
Yes (emailed)
Selectivity
- What unit is represented at the record level of the data source?
Household
Does this universe match the stated intentions for the data collection? If not, what has been included or excluded and why?
Unknown
What is the sampling technique used (if applicable)?
None listed
- What was the coverage?
Covers 70% of the mortgage industry
99.9% coverage of the U.S. population and households
Stability/Coherence
- Were there any changes to the universe of data being captured (including geographical areas covered) and if so what were they?
None listed
- Were there any changes in the data capture method and if so what were they?
None listed
- Were there any changes in the sources of data and if so what were they?
None listed
Accuracy
- Any known sources of error?
None listed
- Describe any quality control checks performed by the data’s owner.
None listed
Accessibility
- Any records or fields collected, but not included in data source, such as for confidentiality reasons?
None listed
- Is there a subset of variables and/or data that must be obtained through a separate process? If yes, are there separate legal, regulatory, or administrative restrictions on accessing the data source? Cost? - One time or annual or project based payment?
None listed
Privacy and security
- Was consent given by participant? If so, how was consent given?
Not stated (possibly in mortgage information)
The assessment data this is based on is public data
- Are there legal limitations or restrictions on the use of the data?
None listed
- What confidentiality policies does the source have?
Signed non-disclosure agreement
Research
- What research has been done with this dataset?
None specifically listed
- Include any links to research if provided:
None were provided
- List any other data use notes provided by the supplier.
Gaps/Concerns
- Feasibility - can all jurisdiction levels provide the data (if applicable)?
- Data ownership - is there a lack of clarity in legal guidance stemming from uncertainty about who owns digital data?
- Data collection authority - what data is reasonably private and what constitutes unwarranted intrusion?
- Describe any other notes you have or any gaps/concerns you see with this dataset: