Brief Overall Description of the Dataset:
“This system allows selective access to data from HUD's Low-Income Housing Tax Credit Database. Data output is in either easy-to-read HTML tables, or a comma-delimited text file suitable for further analysis with spreadsheet, database, or statistical software. Data are now available for projects placed in service through 2012.” The output includes the locations of buildings that contain low-income units.
More information: “The Low-Income Housing Tax Credit (LIHTC) is the most important resource for creating affordable housing in the United States today. The LIHTC database, created by HUD and available to the public since 1997, contains information on 39,094 projects and almost 2,458,000 housing units placed in service between 1987 and 2012.
Created by the Tax Reform Act of 1986, the LIHTC program gives State and local LIHTC-allocating agencies the equivalent of nearly $8 billion in annual budget authority to issue tax credits for the acquisition, rehabilitation, or new construction of rental housing targeted to lower-income households. Although some data about the program have been made available by various sources, HUD's database is the only complete national source of information on the size, unit mix, and location of individual projects. With the continued support of the national LIHTC database, HUD hopes to enable researchers to learn more about the effects of the tax credit program.
The database includes project address, number of units and low-income units, number of bedrooms, year the credit was allocated, year the project was placed in service, whether the project was new construction or rehab, type of credit provided, and other sources of project financing. The database has been geocoded, enabling researchers to look at the geographical distribution and neighborhood characteristics of tax credit projects. It may also help show how incentives to locate projects in low-income areas and other underserved markets are working.”
Link: http://lihtc.huduser.org/
Date Inventory Completed: 6/22/2015
Screening
- Is the data collected opinion-based?
- Is the data collection recurring (must be collected at least annually)?
- Is there data available for 2013?
- Is the data collected at the property or housing unit level?
- Can we access the data by August 15th?
Purpose
What is the purpose of the organization collecting the data?
“HUD’s mission is to create strong, sustainable, inclusive communities and quality affordable homes for all. HUD is working to strengthen the housing market to bolster the economy and protect consumers; meet the need for quality affordable rental homes; utilize housing as a platform for improving quality of life; build inclusive and sustainable communities free from discrimination, and transform the way HUD does business.”
Why is it collected and how does the organization use it?
HUD collects the data to keep track of low-income housing and uses it to inform its decisions about low-income housing initiatives.
Who else uses the data?
Policy-makers, researchers
Who do they sell the data to?
No one
Method
What is the data collection method?
Not known for certain. Because the LIHTC program gives “State and local LIHTC-allocating agencies” budget authority to issue tax credits, these agencies probably have to report back to the federal government, and HUD probably obtains the data through that reporting chain.
What is the type of data collected?
Administrative Data
If designed, who created the questions?
What is the raw source of the collected data (prior to any aggregation)?
Not known for certain, but probably the records of “State and local LIHTC-allocating agencies”
Description
What is the general topic of the data (1-2 words)?
Affordable housing
What are the earliest and latest dates for which data is available?
For general data: 1987-2012 (but certain variables are only available for a subset of those years)
Is data collected and available periodically?
Somewhat - the data are organized by year, but 2013 and 2014 data have not been uploaded to the system yet.
How soon after a reference period ends can a data source be prepared and provided?
About 1.5-2 years (see http://www.huduser.org/portal/datasets/lihtc.html#data for more information)
Selectivity
What is the universe (e.g., population) that the data represents?
“Affordable housing funded by the LIHTC from 1987 to 2012”
Accessibility
- How is the data accessed?
HTML tables and “comma-delimited text files”
- Is it open data?
Yes
Any legal, regulatory, or administrative restrictions on accessing the data source?
None that I know of.
Cost? - One time or annual or project based payment?
None
Does this dataset appear to meet our needs for the Census study? Maybe, if applicable
Full Inventory
Description
- Features
- What is the temporal nature of the data: longitudinal, time-series, or one time point?
Time Series
- Geospatial? If Yes, at what level?
Yes, at the Census tract level (plus a “Metropolitan area code”); some records also appear to include latitude and longitude.
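Because the records carry a Census tract, the comma-delimited output could be summarized by tract. The following is only a minimal sketch: the column names (hud_id, tract, li_units) and the sample rows are hypothetical placeholders, not the documented layout of the HUD file.

```python
import csv
import io
from collections import defaultdict

# Made-up rows standing in for a comma-delimited export.
# Column names and tract codes are assumptions for illustration only.
SAMPLE = """hud_id,tract,li_units
AAA0001,00000000100,40
AAA0002,00000000100,24
AAA0003,00000000200,12
"""

def low_income_units_by_tract(csv_text):
    """Sum the low-income unit counts for each tract code."""
    totals = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Tract codes stay as strings so leading zeros are preserved.
        totals[row["tract"]] += int(row["li_units"])
    return dict(totals)

tract_totals = low_income_units_by_tract(SAMPLE)
print(tract_totals)
```

Keeping the tract code as text rather than converting it to a number avoids losing leading zeros, which matters if the totals are later matched against Census files.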
Metadata
- Is there information available to assess the transparency and soundness of the methods to gather the data for our purposes?
All of that information can be found here: http://www.huduser.org/portal/datasets/lihtc.html
- Is there a description of each variable in the source along with their valid values?
Yes
- Are there unique IDs for unique elements that can be used for linking data?
There are HUD ID numbers, but I’m not sure if this would be counted as ‘unique IDs for linking.’
- Is there a data dictionary or codebook?
Unknown
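Whether the HUD ID numbers can serve as unique IDs for linking could be checked directly on a downloaded comma-delimited file. This is a rough sketch under stated assumptions: the column name hud_id and the sample rows are hypothetical, and the real file may label the field differently.

```python
import csv
import io
from collections import Counter

# Made-up rows standing in for a comma-delimited export.
# The column names (hud_id, project, n_units) are assumptions.
SAMPLE = """hud_id,project,n_units
AAA0001,Elm Court,40
AAA0002,Oak Terrace,24
AAA0002,Oak Terrace,24
"""

def duplicate_ids(csv_text, id_column="hud_id"):
    """Return the ID values that appear on more than one row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = Counter(row[id_column] for row in reader)
    return sorted(value for value, n in counts.items() if n > 1)

dupes = duplicate_ids(SAMPLE)
print(dupes)
```

An empty result would suggest the ID is unique per record and usable as a linking key; any repeated values would need to be explained before linking.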
Selectivity
What unit is represented at the record level of the data source?
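Project (the database description indicates that each record describes an individual LIHTC project and reports its counts of total and low-income housing units)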
Does this universe match the stated intentions for the data collection? If not, what has been included or excluded and why?
Yes (the stated intention was to collect data on LIHTC-funded housing, and the database does this)
What is the sampling technique used (if applicable)?
What was the coverage?
Stability/Coherence
- Were there any changes to the universe of data being captured (including geographical areas covered) and if so what were they?
Unknown (see: http://www.huduser.org/portal/datasets/lihtc.html)
- Were there any changes in the data capture method and if so what were they?
Unknown
- Were there any changes in the sources of data and if so what were they?
Unknown
Accuracy
- Any known sources of error?
Yes: statistics on “missing records and values” can be found here: http://lihtc.huduser.org/missing.htm
- Describe any quality control checks performed by the data’s owner.
Unknown
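The published statistics on missing records and values could be cross-checked against a downloaded file by tallying empty fields per column. The sketch below assumes that a missing value appears as an empty field and uses hypothetical column names (hud_id, yr_pis, n_units, li_units); the actual export may use different names or a special code for missing values.

```python
import csv
import io

# Made-up rows standing in for a comma-delimited export.
# Column names are assumptions; blanks represent missing values.
SAMPLE = """hud_id,yr_pis,n_units,li_units
AAA0001,1995,40,40
AAA0002,,24,
AAA0003,2004,,12
"""

def missing_counts(csv_text):
    """Count empty or whitespace-only fields in each column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = {name: 0 for name in reader.fieldnames}
    for row in reader:
        for name in counts:
            # Short rows yield None for absent fields; treat as missing.
            if (row[name] or "").strip() == "":
                counts[name] += 1
    return counts

missing = missing_counts(SAMPLE)
print(missing)
```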
Accessibility
- Any records or fields collected, but not included in the data source (such as for confidentiality reasons)?
Unknown
- Is there a subset of variables and/or data that must be obtained through a separate process? If yes, are there separate legal, regulatory, or administrative restrictions on accessing the data source? Cost? - One time or annual or project based payment?
No
Privacy and security
- Was consent given by participant? If so, how was consent given?
Unclear: since the data is administrative and probably comes from the states and the localities, direct ‘consent’ by the people who live in those apartments was probably not given.
- Are there legal limitations or restrictions on the use of the data?
None
- What confidentiality policies does the source have?
None stated
Research
- What research has been done with this dataset? (e.g., impact of policies, predictors of student success)
HUD has published summary reports based on the data, and some of the research listed on the HUD website has probably used it as well.
- Include any links to research if provided:
Summary reports: http://www.huduser.org/portal/datasets/lihtc.html#data
HUD website research: http://www.huduser.org/portal/research/home.html
- List any other data use notes provided by the supplier.
Gaps/Concerns
- Feasibility - can all jurisdiction levels provide the data (if applicable)?
- Data ownership - a lack of clarity in legal guidance stemming from a lack of clarity with who owns digital data?
- Data collection authority - what data is reasonably private and what constitutes unwarranted intrusion?
- Describe any other notes you have or any gaps/concerns you see with this dataset: