VT Census Case Studies: TransUnion

Brief Overall Description of the Dataset:

TransUnion would be able to provide us with Credit Score information - NOT housing information.  The company states that its “database contains more than 200 million files, which profile nearly every credit-active consumer in the United States. This database contains information provided by more than 85,000 credit-granting institutions and is updated, audited and monitored on a regular basis.”  Some of the “products” they offer include “recovery scores” and “new account risk scores.”

After some inquiry, our contact was able to obtain census tract information for the data, which makes the data usable for our purposes.

The data include information on trade lines, which are individual accounts (e.g., a mortgage trade line is a mortgage account with a given bank).

On their end, TransUnion will go through the yearly data and extract consumers based on ZIP codes. With each extract, new people may be added.

Types of monthly mortgage payment data: date opened, date closed, delinquency, aggregate current balance, and aggregate monthly payments (for multiple mortgages)

The payment is the amount paid to the bank (including all taxes), which is how the banks then report it to the credit bureaus.
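To make the relationship between trade lines and the aggregate mortgage fields concrete, here is a minimal sketch of how balances and payments could be summed across one person's mortgages. The field names, the `record_key` stand-in, and the rule that only open mortgage accounts are counted are all assumptions made for illustration; they do not describe TransUnion's actual schema or method.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TradeLine:
    # Hypothetical field names for illustration; not TransUnion's actual schema.
    record_key: str              # stand-in for one individual's record
    account_type: str            # e.g. "mortgage", "credit_card"
    date_opened: str             # ISO date string
    date_closed: Optional[str]   # None while the account is still open
    current_balance: float
    monthly_payment: float       # amount paid to the bank, taxes included


def aggregate_mortgages(trade_lines):
    """Sum balance and payment over each record's open mortgage trade lines."""
    totals = {}
    for tl in trade_lines:
        # Assumption: only open mortgage accounts count toward the aggregates.
        if tl.account_type != "mortgage" or tl.date_closed is not None:
            continue
        agg = totals.setdefault(
            tl.record_key, {"balance": 0.0, "payment": 0.0, "mortgages": 0}
        )
        agg["balance"] += tl.current_balance
        agg["payment"] += tl.monthly_payment
        agg["mortgages"] += 1
    return totals


# Made-up sample records, not real data.
sample = [
    TradeLine("A", "mortgage", "2009-05-01", None, 150000.0, 1200.0),
    TradeLine("A", "mortgage", "2012-08-15", None, 50000.0, 400.0),
    TradeLine("A", "credit_card", "2010-01-10", None, 2500.0, 75.0),
    TradeLine("B", "mortgage", "2005-03-20", "2014-03-20", 0.0, 0.0),
]
result = aggregate_mortgages(sample)
print(result)
```

In this made-up sample, record A has two open mortgages that are combined into one aggregate balance and one aggregate monthly payment, while record B's closed mortgage and A's credit card are left out.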

Link: Products TransUnion provides: https://www.transunion.com/corporate/business/solutionsbyneed/credit-reporting.page

Date Inventory Completed: 6/3/2015

Screening

  • Is the data collected opinion-based?
  • Is the data collection recurring (must be collected at least annually)?
  • Is there data available for 2013?
  • For Housing: Is the data collected at the property or housing unit level? 
  • Can we access the data by August 15th?

Purpose

  • What is the purpose of the organization collecting the data?

This is a description of the company: “We've always been more than just a credit reporting agency. We're a sophisticated information provider with an eye toward making a better world.

Our diverse sets of data and analytic solutions deliver meaningful insights to help businesses and consumers spot opportunities and manage risk. We see information not for what it is, but for what it can help people achieve. And we believe, with the right information, people can achieve great things.”

  • Why is it collected and how does the organization use it?

The organization collects credit scores to “provide products” to businesses.

  • Who else uses the data? 

Businesses and citizens (it is unknown which classes of data each group has access to)

  • Who do they sell the data to?

Businesses and insurers


Method

  • What is the data collection method?

This is administrative data, so it is collected through whatever administrative forms feed the records.

  • What is the type of data collected?

Administrative data

  • If designed, who created the questions?

Not applicable

  • What is the raw source of the collected data (prior to any aggregation)?

Forms that go into the creation of administrative records


Description

  • What is the general topic of the data (1-2 words)?

Credit information

  • What are the earliest and latest dates for which data is available? 

Unknown

  • Timeliness

    • Is data collected and available periodically?

Unknown

    • How soon after a reference period ends can a data source be prepared and provided?

Unknown


Selectivity

  • What is the universe (e.g., population) that the data represents?

Individuals with credit histories in 33 counties


Accessibility

  • How is the data accessed?

Data exchange gateway

We must first create a membership with TransUnion and get the project approved. The membership requires an onsite inspection of where the data will be stored and analyzed. After membership is established, a data service request (DSR), similar to a statement of work (SOW), is submitted.

  • Is it open data?

No

  • Any legal, regulatory, or administrative restrictions on accessing the data source?

We had to obtain board approval for the project in order to access the data.

  • Cost? - One time or annual or project based payment? 

$27,500 for 5 years of data ($7,000 for the first year and $5,000 for each remaining year)
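As a quick check of the quoted pricing, the sketch below adds up the per-year breakdown, assuming the $5,000 rate applies to each of the remaining four years. Under that assumption the breakdown comes to $27,000, which is $500 less than the quoted $27,500, so the figures are worth confirming with TransUnion. The function name is illustrative only.

```python
# Figures as quoted in this inventory entry.
QUOTED_TOTAL = 27_500
FIRST_YEAR = 7_000
LATER_YEAR = 5_000
YEARS = 5


def breakdown_total(first_year, later_year, years):
    """Total cost when the first year is priced separately from later years."""
    # Assumption: the later-year rate applies to every year after the first.
    return first_year + later_year * (years - 1)


computed = breakdown_total(FIRST_YEAR, LATER_YEAR, YEARS)
difference = QUOTED_TOTAL - computed

print(f"Breakdown total: ${computed:,}")
print(f"Quoted total:    ${QUOTED_TOTAL:,}")
print(f"Difference:      ${difference:,}")
```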

Does this dataset appear to meet our needs for the Census study? Yes

Full Inventory

Description

  • Features
    • What is the temporal nature of the data: longitudinal, time-series, or one time point?

 Time Series

    • Geospatial? If yes, at what level?

Census tract (we had to ask for this specifically)

Tract information is only available for the 2000 census tract maps.


Metadata

  • Is there information available to assess the transparency and soundness of the methods to gather the data for our purposes?

 No

  • Is there a description of each variable in the source along with their valid values?

No 

  • Are there unique IDs for unique elements that can be used for linking data?

No

  • Is there a data dictionary or codebook?

We had to ask for a list of the available variables (received)


Selectivity

  • What unit is represented at the record level of the data source?

Individuals

  • Does this universe match the stated intentions for the data collection? If not, what has been included or excluded and why?

"All major and minor lenders report. Small mom-and-pop community banks may not report (they are not really in the mortgage world)."

  • What is the sampling technique used (if applicable)? 

  • What was the coverage?

 Unknown


Stability/Coherence

  • Were there any changes to the universe of data being captured (including geographical areas covered) and if so what were they?

 Unknown

  • Were there any changes in the data capture method and if so what were they? 

 Unknown

  • Were there any changes in the sources of data and if so what were they? 

 Unknown


Accuracy

  • Any known sources of error?

 Unknown

  • Describe any quality control checks performed by the data’s owner.

 Unknown

 

Accessibility

  • Any records or fields collected but not included in the data source (such as for confidentiality reasons)?

Personally identifiable information (PII)

  • Is there a subset of variables and/or data that must be obtained through a separate process? If yes, are there separate legal, regulatory, or administrative restrictions on accessing the data source? Cost? - One time or annual or project based payment?

Unknown


Privacy and security

  • Was consent given by participant? If so, how was consent given?

Not explicitly stated, but consent may be included in the paperwork from which TransUnion's data originate (e.g., mortgages and credit cards)

  • Are there legal limitations or restrictions on the use of the data? 

Data access is project-specific; we must reapply for each different project.

  • What confidentiality policies does the source have?

None listed

 

Research

  • What research has been done with this dataset? (e.g., impact of policies, predictors of student success)

Unknown

  • Include any links to research if provided:
  • List any other data use notes provided by the supplier.

 

Gaps/Concerns

  • Feasibility - can all jurisdiction levels provide the data (if applicable)?
  • Data ownership - is there a lack of clarity in legal guidance stemming from uncertainty about who owns digital data?
  • Data collection authority - what data is reasonably private and what constitutes unwarranted intrusion?
  • Describe any other notes you have or any gaps/concerns you see with this dataset: 
