VT Census Case Studies: Black Knight Financial Services (BKFS)

Brief Overall Description of the Dataset:

Black Knight Financial is “the mortgage and finance industries’ leading provider of integrated technology, services and data solutions that facilitate and automate many of the business processes across the entire loan lifecycle.” The dataset originates from U.S. property records collected and verified directly by BKFS. It includes information on housing amenities, demographics, and building quality, which is used to better estimate a given house’s value and a homeowner’s expected mortgage rate for a given area. On January 3, 2014, Fidelity National Financial acquired Lender Processing Services (LPS), renaming it Black Knight.

Screening

  • Is the data collected opinion-based?
  • Is the data collection recurring?
  • Is there data available for 2013?
  • For Housing: Is the data collected at the property or housing unit level? 
  • Can we access the data by August 15th?

Purpose

  • What is the purpose of the organization collecting the data?

The organization collects this data to support financial analyses of housing and to make mortgage performance data available to investors.

  • Why is it collected and how does the organization use it?

The data is collected to give investors, lenders, and homebuyers better estimates of mortgages and loans when purchasing real estate.

  • Who else uses the data?

Investors, lenders, government agencies

  • Who do they sell the data to?

Investors, lenders, government agencies

 

Method

  • What is the data collection method? 

Collected directly from multiple U.S. property records

  • What is the type of data collected? 

Administrative data

  • If designed, who created the questions?

  • What is the raw source of the collected data (prior to any aggregation)? 

Researchers collect and scrape data from public records


Description

  • What is the general topic of the data (1-2 words)?

Assessment data

  • What are the earliest and latest dates for which data is available?

Earliest: unknown; latest: most recent month (2015)

  • Is data collected and available periodically?

Yes, monthly

  • How soon after a reference period ends can a data source be prepared and provided? 

Approximately a month


Selectivity

  • What is the universe (e.g., population) that the data represents?

Properties in the U.S.


Accessibility

  • How is the data accessed? 

The purchased data was made available through their API.

  • Is it open data?

No

  • Any legal, regulatory, or administrative restrictions on accessing the data source?

No

  • Cost? - One time or annual or project based payment?

$3,250 for Virginia, June 1, 2013 - December 31, 2015; $1,000 for each additional year
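
The quoted pricing can be turned into a quick budget estimate. The sketch below is illustrative only: it assumes that "each additional year" means a whole year added on top of the base Virginia purchase, which the quote does not spell out, and the function name estimated_cost is hypothetical.

```python
# Figures taken from the quote above; how partial years are billed is not stated.
BASE_COST = 3250             # Virginia, June 1, 2013 - December 31, 2015
ADDITIONAL_YEAR_COST = 1000  # per additional year (assumed to mean whole years)


def estimated_cost(additional_years: int) -> int:
    """Return the estimated total cost in dollars for the base period plus extra years."""
    if additional_years < 0:
        raise ValueError("additional_years must be zero or positive")
    return BASE_COST + ADDITIONAL_YEAR_COST * additional_years


print(estimated_cost(0))  # 3250
print(estimated_cost(2))  # 5250
```

Any real budget should be confirmed with the supplier, since billing for partial years or other states is not covered by the quote.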


Does this dataset appear to meet our needs for the Census study? YES

Full Inventory

Description

  • What are the general contents of the data source?

Housing amenities, bed/baths, demographics, building quality

  • Features
    • What is the temporal nature of the data: longitudinal, time-series, or one time point?

Time-series

    • Geospatial? If Yes, at what level?

Yes, Addresses

 

Metadata

  • Is there information available to assess the transparency and soundness of the methods to gather the data for our purposes?

No

  • Is there a description of each variable in the source along with their valid values?

Yes

  • Are there unique IDs for unique elements that can be used for linking data?

Addresses
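
Because addresses are free text, using them as a linking key usually requires some normalization first. The following is a minimal sketch of one possible approach, not a description of how BKFS formats its address fields: the suffix table, the function name normalize_address, and the sample addresses are all hypothetical.

```python
import re

# Hypothetical, deliberately small abbreviation table; a real project
# would need a fuller list of street-suffix variants.
SUFFIXES = {
    "street": "st",
    "avenue": "ave",
    "road": "rd",
    "drive": "dr",
    "lane": "ln",
    "court": "ct",
    "boulevard": "blvd",
}


def normalize_address(raw: str) -> str:
    """Lowercase, strip punctuation, and abbreviate common street suffixes."""
    text = re.sub(r"[^\w\s]", " ", raw.lower())  # punctuation becomes whitespace
    tokens = [SUFFIXES.get(token, token) for token in text.split()]
    return " ".join(tokens)


# Two spellings of the same (made-up) address collapse to one key.
a = normalize_address("123 Main Street, Blacksburg, VA")
b = normalize_address("123 MAIN ST. Blacksburg VA")
print(a == b)
```

Exact matching on a normalized string will still miss unit numbers, misspellings, and renamed streets, so a match rate should be checked before relying on addresses as the only linking key.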

  • Is there a data dictionary or codebook?

Yes (emailed)

 

Selectivity

  • What unit is represented at the record level of the data source? 

Household

  • Does this universe match the stated intentions for the data collection? If not, what has been included or excluded and why?

Unknown

  • What is the sampling technique used (if applicable)? 

None listed

  • What was the coverage?

Covers 70% of the mortgage industry

99.9% coverage of the U.S. population and households 


Stability/Coherence

  • Were there any changes to the universe of data being captured (including geographical areas covered) and if so what were they?

None listed

  • Were there any changes in the data capture method and if so what were they?

None listed

  • Were there any changes in the sources of data and if so what were they? 

None listed

 

Accuracy

  • Any known sources of error?

None listed

  • Describe any quality control checks performed by the data’s owner.

None listed

 

Accessibility

  • Any records or fields collected, but not included in data source, such as for confidentiality reasons? 

None listed

  • Is there a subset of variables and/or data that must be obtained through a separate process? If yes, are there separate legal, regulatory, or administrative restrictions on accessing the data source? Cost? - One time or annual or project based payment?

None listed

 

Privacy and security

  • Was consent given by participant? If so, how was consent given?

Not stated (possibly in mortgage information)

The assessment data this is based on is public data

  • Are there legal limitations or restrictions on the use of the data? 

None listed

  • What confidentiality policies does the source have? 

Signed non-disclosure agreement


Research

  • What research has been done with this dataset? 

None specifically listed

  • Include any links to research if provided:

None were provided

  • List any other data use notes provided by the supplier.

 

Gaps/Concerns

  • Feasibility - can all jurisdiction levels provide the data (if applicable)?
  • Data ownership - is there a lack of clarity in legal guidance stemming from uncertainty about who owns digital data?
  • Data collection authority - what data is reasonably private and what constitutes unwarranted intrusion?
  • Describe any other notes you have or any gaps/concerns you see with this dataset: