Brief Overall Description of the Dataset:
Freddie Mac aims to “provide information, data, analysis, and insight across a wide range of housing and economic indicators.” The agency’s data sets are as follows: “Mortgage Rates Survey, Market Outlook, Multi-Indicator Market Index, House Price Index, Refinance Report, Single-Family Loan Level Data set, and ‘additional datasets.’”
Link: http://www.freddiemac.com/finance/
Date Inventory Completed: 6/17/2015
Screening
- Is the data collected opinion-based?
- Is the data collection recurring (must be collected at least annually)?
- Is there data available for 2013?
- Is the data collected at the property or housing unit level? (loan or lender level)
- Can we access the data by August 15th?
Purpose
What is the purpose of the organization collecting the data?
“Freddie Mac was chartered by Congress in 1970 with a public mission to stabilize the nation's residential mortgage markets and expand opportunities for homeownership and affordable rental housing. Our statutory mission is to provide liquidity, stability and affordability to the U.S. housing market.
We participate in the secondary mortgage market by purchasing mortgage loans and mortgage-related securities for investment and by issuing guaranteed mortgage-related securities, principally those we call PCs. The secondary mortgage market consists of institutions engaged in buying and selling mortgages in the form of whole loans (i.e., mortgages that have not been securitized) and mortgage-related securities. We do not lend money directly to homeowners.
Freddie Mac is operating under a conservatorship that began on September 6, 2008, conducting our business under the direction of the Federal Housing Finance Agency (FHFA).”
Why is it collected and how does the organization use it?
Freddie Mac collects the data to aid in its own operations and also to make the data available to the public.
Who else uses the data?
Potentially businesses, policy-makers, and researchers
Who do they sell the data to?
The data under their “Economic and Housing Research” section is freely available to the public, except for the “Single Family Loan-Level Data set,” for which you need to “request a login” in order to look at the data. However, this data set is still free so long as your use is non-commercial (per its terms of use).
Method
What is the data collection method?
It depends upon the survey.
For the Primary Mortgage Market Survey, “Freddie Mac surveys lenders (all different types) each week on the rates, fees and points for the most popular mortgage products.”
For the House Price Index, Freddie Mac states that it is “based on an ever expanding database of loans purchased by either Freddie Mac or Fannie Mae.”
For US Economic and Housing Market Outlook, Freddie Mac “compiles data on major economic, housing and mortgage indicators” and uses the Outlook to “offer forecasts.” (From a brief look at the data set, I would say Freddie Mac definitely uses some other federal agencies’ data for this Outlook)
For the Multi-Indicator Market Index (MiMi): “MiMi measures the stability of local housing activity by combining current local market data with Freddie Mac data for all 50 states plus the District of Columbia, the top 100 metros, and the nation.”
For the Refinance Report: “Freddie Mac compiles statistics and produces its quarterly Refinance Report based on loans refinanced in our retained portfolio.”
For the “single family loan-level data set”: “As part of a larger effort to increase transparency, Freddie Mac is making available loan-level credit performance data on a portion of fully amortizing 30-year fixed-rate mortgages that the company purchased or guaranteed from 1999 to 2013.”
The “additional data sets” use Freddie Mac surveys and data about “Treasury bills”: the Annual ARMs survey, the Federal Cost of Funds Index (the Treasury bills), and the Refinance and ARM Share survey (which is “part of Primary Mortgage Market Survey once a month”).
What is the type of data collected?
See above: It is a mix of designed collection and administrative data (some from Freddie Mac and some from other places).
If designed, who created the questions?
Freddie Mac or other government agencies (when surveys were used)
What is the raw source of the collected data (prior to any aggregation)?
The survey forms (which represent the respondents to Freddie Mac and other agency surveys), administrative forms located in Freddie Mac, and administrative forms from other agencies
Description
What is the general topic of the data (1-2 words)?
Loans, home prices, and interest rates
What are the earliest and latest dates for which data is available?
Depends on the data set. For the ones we are particularly interested in:
Primary Mortgage Market Survey: It has been carried out since April 1971, but the data readily available on the website runs from the beginning of 2010 to the present.
House Price Index: “Began in January 1975” and data is available from January 1975 until now
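As a rough illustration of the 2013 screening question, the following Python sketch checks whether a given year falls inside each data set's readily available span, using the dates noted above. It is only a sketch under stated assumptions: "now" is approximated by the inventory date (6/17/2015), the PMMS start is taken as January 1, 2010, and the names `AVAILABLE` and `covers_year` are hypothetical.

```python
from datetime import date

# Readily available spans taken from the inventory notes above.
# ASSUMPTION: "now" is approximated by the inventory date, 6/17/2015.
INVENTORY_DATE = date(2015, 6, 17)
AVAILABLE = {
    "PMMS": (date(2010, 1, 1), INVENTORY_DATE),  # website data starts at the beginning of 2010
    "HPI": (date(1975, 1, 1), INVENTORY_DATE),   # series begins January 1975
}

def covers_year(span, year):
    """Return True if the whole calendar year lies inside the (start, end) span."""
    start, end = span
    return start <= date(year, 1, 1) and date(year, 12, 31) <= end

for name, span in AVAILABLE.items():
    print(name, covers_year(span, 2013))
```

Both data sets cover all of 2013 under these assumptions; a year before 2010 would fail for the PMMS because only the readily available website data is considered here.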
Is data collected and available periodically?
Yes
How soon after a reference period ends can a data source be prepared and provided?
Would vary from dataset to dataset
PMMS: Within a week after the data is collected (at most)
HPI: About 2-3 months after each three-month batch is recorded (Values are calculated monthly but are released at the end of the following quarter. For example, the FMHPI values for October, November, and December are published in late February of the following year. Series are available at three levels of geographical aggregation: Metropolitan Statistical Area (MSA), state, and national.)
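The HPI timing can be sketched as a small Python helper that maps a reference month to the approximate month in which its value is published. This is an assumption-laden sketch, not Freddie Mac's stated schedule: it generalizes the single example given (October-December values published in late February) to every quarter, and the function name `approx_release` is hypothetical.

```python
def approx_release(year, month):
    """Approximate (year, month) in which the FMHPI value for a reference month is published.

    ASSUMPTION: every quarter follows the one example in the notes
    (Oct-Dec values published in late February), i.e. roughly two months
    after the last month of the quarter.
    """
    quarter_end = ((month - 1) // 3 + 1) * 3  # last month of the quarter: 3, 6, 9, or 12
    release_month = quarter_end + 2
    if release_month > 12:
        # Fourth-quarter values roll over into the following year.
        return year + 1, release_month - 12
    return year, release_month

print(approx_release(2014, 10))  # reference month October 2014
```

Under this rule the lag between a reference month and its publication ranges from about two months (last month of a quarter) to about four months (first month of a quarter).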
Selectivity
What is the universe (e.g., population) that the data represents?
All of the data sets attempt to measure their respective areas across the US for differing spans of time; each data set also has a different “level of granularity.”
PMMS: US and “five Freddie Mac regions”
HPI: US, State, and “Metro Area”
Accessibility
- How is the data accessed?
Through Excel files. Freddie Mac also allows one to “display” some of the surveys “on your website” (in other words, Freddie Mac allows you to “syndicate” them).
- Is it open data?
Yes (for the Single-Family Loan Level data set, you would just need to “register”)
Any legal, regulatory, or administrative restrictions on accessing the data source?
The Single-Family Loan Level data set has a ‘terms of use.’ I do not believe the two primary data sets we are looking at have restrictions. For the others, we would need to look more closely
Cost? - One time or annual or project based payment?
None for this data (unless we wanted to do a license for the Single-Family Loan Level data - which I do not believe we would have to do since we are an academic organization)
Does this dataset appear to meet our needs for the Census study? Maybe (if applicable)
Full Inventory
Description
- Features
- What is the temporal nature of the data: longitudinal, time-series, or one time point?
Time series
- Geospatial? If Yes, at what level?
The House Price Index dataset gives data down to the “Metro” level (and even has a map for it). This would be the closest to ‘geospatial data’ that the data sets would come. (Some of the other data sets might have similar levels of geography.)
Metadata
- Is there information available to assess the transparency and soundness of the methods to gather the data for our purposes?
Yes, because each of the data sets was relatively clear about where the information was coming from, and any nebulous “dimensions” of the data collection process could be answered by contacting them (they seem pretty willing to answer questions).
- Is there a description of each variable in the source along with their valid values?
Most if not all of the variables are well explained/notated. It would be difficult to check the validity of Freddie Mac’s own data but any data they use from other sources would be fairly easy to verify.
- Are there unique IDs for unique elements that can be used for linking data?
No, these are mainly aggregates, so there would not be the need for the type of linkages found at a more ‘granular’ level of data collection.
- Is there a data dictionary or codebook?
No, but everything is well annotated.
Selectivity
What unit is represented at the record level of the data source?
At the record level, the unit varies by data set: for some it is the lender, for others an individual’s loan/mortgage, etc.
Does this universe match the stated intentions for the data collection? If not, what has been included or excluded and why?
Nothing stands out to suggest that it would not.
What is the sampling technique used (if applicable)?
For the data sets that involve sampling, the type of sampling used is not clear.
What was the coverage?
Unknown
Stability/Coherence
- Were there any changes to the universe of data being captured (including geographical areas covered) and if so what were they?
If there were changes, I believe they would be documented somewhere either in the dataset or in the information pertaining to the dataset.
- Were there any changes in the data capture method and if so what were they?
Unknown
- Were there any changes in the sources of data and if so what were they?
Unknown
Accuracy
- Any known sources of error?
No known sources of error, but any problems should be noted on the site itself.
- Describe any quality control checks performed by the data’s owner.
Unknown
Accessibility
- Any records or fields collected, but not included in data source (such as for confidentiality reasons)?
Some of the information regarding the lenders and the people who have taken out the loans would have been omitted in aggregation.
Is there a subset of variables and/or data that must be obtained through a separate process? If yes, are there separate legal, regulatory, or administrative restrictions on accessing the data source? Cost? - One time or annual or project based payment?
Yes, we would need to be “registered” (at http://www.freddiemac.com/news/finance/sf_loanlevel_dataset.html) to have access to the “Single-Family Loan-Level Dataset.” These are the terms of agreement: https://freddiemac.embs.com/FLoan/HistoricalDataTerms.html
“Use of the dataset continues to be free for non-commercial, academic/research and for limited use” (NOTE Freddie Mac has had some relationship with CoreLogic; I believe CoreLogic may have gotten a “licensing agreement for commercial redistribution of the data” - see link mentioned in the previous question.)
Privacy and security
- Was consent given by participant? If so, how was consent given?
The surveys would have obtained consent (it is unknown how consent was given), but for datasets such as the “Single Family Loan-Level Dataset” and the “House Price Index” (where information is drawn from Freddie Mac’s data sources), it is unclear how consent was given.
- Are there legal limitations or restrictions on the use of the data?
Only the Terms and Conditions on the Single Family Loan-Level Dataset
- What confidentiality policies does the source have?
Freddie Mac will not give out personally identifiable information (its data sets do not include PII). Specifically, there are some confidentiality policies in the Terms and Conditions on the Single Family Loan-Level Dataset (I do not believe there are any other specific confidentiality policies.)
Research
- What research has been done with this dataset? (e.g., impact of policies, predictors of student success)
Unclear
- Include any links to research if provided:
None
- List any other data use notes provided by the supplier.
Gaps/Concerns
- Feasibility - can all jurisdiction levels provide the data (if applicable)?
- Data ownership - a lack of clarity in legal guidance stemming from a lack of clarity with who owns digital data?
- Data collection authority - what data is reasonably private and what constitutes unwarranted intrusion?
- Describe any other notes you have or any gaps/concerns you see with this dataset: