This page compares the External Data to the County Data.

Data Sampling Frame and Data Collection

For its main sample processing, the Census Bureau selects addresses from its Master Address File (MAF). "The MAF is updated twice each year with the Delivery Sequence Files (DSF) provided by the U.S. Postal Service. The DSF covers only the U.S. These files identify mail drop points and provide the best available source of changes and updates to the housing unit inventory. The MAF is also updated with the results from various Census Bureau field operations, including the ACS." The file is received in September/October of the previous year and accounts for 99% of that year's sample. – American Community Survey Design and Methodology (January 2014)

The MAF is updated a second time in January/February of the sample year and a second/supplementary sample is created. "All addresses that were in a first-phase sample within the past four years are excluded from eligibility. This ensures that no address is in sample more than once in any five-year period. The second step is to select a 20 percent systematic sample of 'new' units, i.e. those units that have never appeared on a previous MAF extract. Each new address is systematically assigned to either the current year or to one of four backsamples. This procedure maintains five equal partitions (samples) of the universe." – American Community Survey Design and Methodology (January 2014)

The figure below compares the years covered by ACS Sampling and Data Collection for 2013 to the years covered by Arlington County's Data Collection for 2013.

Rate of Change

There are several ways to see how fast housing characteristics and total counts change within the county. One way is to count how many housing units were listed in the Arlington County data as being built that year (see table below). However, this is not perfect, as a housing unit is often listed with an NA for year built in the year it is built. Thus, we also looked at the number of units listed as built in a given year in the following year's data (e.g. units built in 2009 in the 2010 data).

 20092010201120122013
Housing Units Built each Year62819210
Housing Units Built each Year (using the next year's data)112893081,017--
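
The two counts in the table above can be sketched as follows. This is only an illustration, not the code used for the table: the record layout (a dict with a `year_built` key, with `None` standing in for NA) and the function names are assumptions, and the records are made up.

```python
def count_built(records, year):
    """Count records whose year_built equals `year` (None stands in for NA)."""
    return sum(1 for record in records if record.get("year_built") == year)


def built_same_and_next(files_by_year, year):
    """Return (count in the same year's file, count in the next year's file).

    The second value is None when the next year's file is not available.
    """
    same = count_built(files_by_year.get(year, []), year)
    following = files_by_year.get(year + 1)
    later = count_built(following, year) if following is not None else None
    return same, later


# Made-up records: one unit is NA in the 2009 file and filled in by 2010.
files_by_year = {
    2009: [{"year_built": 2009}, {"year_built": None}, {"year_built": 1955}],
    2010: [{"year_built": 2009}, {"year_built": 2009}, {"year_built": 1955}],
}
same, later = built_same_and_next(files_by_year, 2009)
print(same, later)
```

Counting a year in the following year's file picks up units whose year built was still NA when they first appeared.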

Using MRIS-MLS data, we looked at how many houses were sold each year with a year built listed as the same year, that is, new homes being sold on the open market.

 | 2009 | 2010 | 2011 | 2012 | 2013
New Homes Sold via MRIS | 88 | 70 | 62 | 77 | 87

Lastly, another way to look at the rate of change is to use the permitting data to see how many permits were taken out for new bedrooms, new construction, etc. This is left for future research.

Data Transforming Process

The unit of observation for the CoreLogic, BlackKnight (BKFS), and Arlington County (AC) data is the parcel, whereas the unit of observation for the ACS is the housing unit. Thus, we had to take steps to reduce the data to residential parcels that have a building on them. The tables below, one per year, describe the steps and the results for the County Data and BlackKnight.

2009

Justification | Step taken | AC | BKFS
 | Original N | 64,827 | 64,827
Residential Only | Recode Land Use codes and select Residential Parcels* | 62,213 | 62,213
No Buildings/Vacant Land | Remove the parcels with zero Assessed Improvement Value | -- | 0
 | Remove those with a "VAC" (Vacant) Code under extension | 60,485 | --
 | Remove common areas | 60,485 | --
Parking Lots | Remove the parcels with zero Assessed Land Value | 60,283 |
Remaining Vacant Land** | Select the parcels with Assessed Land Value greater than 15,000 | -- |
Non-Arlington Properties | Remove multi-jurisdictions Properties | 60,261 | --
 | Final N | 60,261 | 0

2010

Justification | Step taken | AC | BKFS
 | Original N | 65,201 | 65,201
Residential Only | Recode Land Use codes and select Residential Parcels* | 62,608 | 62,608
No Buildings/Vacant Land | Remove the parcels with zero Assessed Improvement Value | -- | 0
 | Remove those with a "VAC" (Vacant) Code under extension | 60,933 | --
 | Remove common areas | 60,933 | --
Parking Lots | Remove the parcels with zero Assessed Land Value | 60,225 | 0
Remaining Vacant Land** | Select the parcels with Assessed Land Value greater than 15,000 | -- | 0
Non-Arlington Properties | Remove multi-jurisdictions Properties | 60,203 | --
 | Final N | 60,203 | 0

2011

Justification | Step taken | AC | BKFS
 | Original N | 65,242 | 65,264
Residential Only | Recode Land Use codes and select Residential Parcels* | 62,644 | 62,663
No Buildings/Vacant Land | Remove the parcels with zero Assessed Improvement Value | -- | 0
 | Remove those with a "VAC" (Vacant) Code under extension | 61,130 | --
 | Remove common areas | 61,130 | --
Parking Lots | Remove the parcels with zero Assessed Land Value | 60,487 | 0
Remaining Vacant Land** | Select the parcels with Assessed Land Value greater than 15,000 | -- | 0
Non-Arlington Properties | Remove multi-jurisdictions Properties | 60,465 | --
 | Final N | 60,465 | 0

2012

Justification | Step taken | AC | BKFS
 | Original N | 65,364 | 65,364
Residential Only | Recode Land Use codes and select Residential Parcels* | 62,778 | 62,778
No Buildings/Vacant Land | Remove the parcels with zero Assessed Improvement Value | -- | 0
 | Remove those with a "VAC" (Vacant) Code under extension | 61,284 | --
 | Remove common areas | 61,284 | --
Parking Lots | Remove the parcels with zero Assessed Land Value | 60,710 | 0
Remaining Vacant Land** | Select the parcels with Assessed Land Value greater than 15,000 | -- | 0
Non-Arlington Properties | Remove multi-jurisdictions Properties | 60,688 | --
 | Final N | 60,688 | 0

2013

Justification | Step taken | AC | BKFS
 | Original N | 65,433 | 65,443
Residential Only | Recode Land Use codes and select Residential Parcels* | 62,847 | 62,847
No Buildings/Vacant Land | Remove the parcels with zero Assessed Improvement Value | -- | 61,174
 | Remove those with a "VAC" (Vacant) Code under extension | 62,049 | --
 | Remove common areas | 62,049 | --
Parking Lots | Remove the parcels with zero Assessed Land Value | 60,988 | 60,592
Remaining Vacant Land** | Select the parcels with Assessed Land Value greater than 15,000 | -- | 60,413
Non-Arlington Properties | Remove multi-jurisdictions Properties | 60,966 | --
 | Final N | 60,966 | 60,413

 * "Condo", "Single Family - Detached", "Single Family - Attached", "Multifamily","Unknown Affordable Dwelling Unit", "MixedUsed"

 ** This was done because there were no vacant land or common area codes. Some vacant land/common areas do have minimal improvement value.
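
The sequence of steps in the AC column can be sketched as a chain of filters that records N after each step. This is a minimal sketch, not the code used to produce the tables: the `Parcel` fields, the function name, and the sample parcels are hypothetical, and only the list of residential categories comes from the footnote above.

```python
from dataclasses import dataclass

# Residential categories taken from the footnote (*) above.
RESIDENTIAL_TYPES = {
    "Condo", "Single Family - Detached", "Single Family - Attached",
    "Multifamily", "Unknown Affordable Dwelling Unit", "MixedUsed",
}


@dataclass
class Parcel:
    # All field names are hypothetical stand-ins for the county columns.
    land_use: str
    land_value: float
    extension_code: str = ""
    is_common_area: bool = False
    multi_jurisdiction: bool = False


def filter_county_parcels(parcels):
    """Apply the AC steps in order and record N after each one."""
    steps = [
        ("Residential only", lambda p: p.land_use in RESIDENTIAL_TYPES),
        ("No VAC extension code", lambda p: p.extension_code != "VAC"),
        ("No common areas", lambda p: not p.is_common_area),
        ("Nonzero assessed land value", lambda p: p.land_value != 0),
        ("Not multi-jurisdiction", lambda p: not p.multi_jurisdiction),
    ]
    kept = list(parcels)
    counts = [("Original N", len(kept))]
    for label, keep in steps:
        kept = [p for p in kept if keep(p)]
        counts.append((label, len(kept)))
    return kept, counts


# Made-up parcels, one dropped at each step.
sample = [
    Parcel("Condo", 100_000),
    Parcel("Commercial", 500_000),
    Parcel("Multifamily", 200_000, extension_code="VAC"),
    Parcel("Single Family - Attached", 150_000, is_common_area=True),
    Parcel("Single Family - Detached", 0),
    Parcel("Condo", 90_000, multi_jurisdiction=True),
]
kept, counts = filter_county_parcels(sample)
for label, n in counts:
    print(f"{label}: {n}")
```

Recording N after every step makes it easy to lay the result out in the same shape as the tables above.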

Fitness for Use

With knowledge of the data, it became possible to identify which of the various ACS tables we would (or would not) have confidence in comparing with the external data.

Arlington County Data

  • The best:
    • Owner occupied tables (e.g. value): The county data does not indicate whether a single family residence is being rented out, so the county data overestimates these tables. The size of this overestimation depends on the rental market for single family homes.
    • Property Information (e.g. year built; number of units) for all housing units: The county data can be weighted by number of units to better match the ACS estimates.
  • The worst:
    • Heating Fuel: Where the ACS looks at heating fuel, the AC data looked at heating type (e.g. forced hot air). Not all heating types listed could be placed within an ACS bin.
    • Bedroom Count: Where the ACS includes all housing units, the county data only has bedroom information for multifamily. 
    • Plumbing Facilities: Where the ACS defines this as having at least a full bathroom and a kitchen, it was only possible to define this as having at least a full bathroom.

BKFS Data

  • The best:
    • Tenure and by-Tenure tables (depending on the other variable): BKFS imputes a variable indicating whether a property is rented or not.
    • Property Information (e.g. year built; number of units) for all housing units: The county data can be weighted by number of units to better match the ACS estimates.
  • The worst:
    • Bedroom Count: Where the ACS includes all housing units, the BKFS data only has bedroom information for multifamily. 
    • Plumbing Facilities: Where the ACS defines this as having at least a full bathroom and a kitchen, it was only possible to define this as having at least a full bathroom.

Weighting/Number of Units

Because the external data's unit of observation is the parcel, we need to weight multifamily parcels by the number of units in the structure. For example, if an apartment building has 50 apartments, the ACS views each apartment as its own separate housing unit, each has the potential to be sampled, and the estimates are created as if each is a separate observation. In the external data, the same multifamily building would be in the data only once. Because of the significant role these weights play, it is necessary to examine whether differences in weights are creating the observed differences in benchmarking.
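
As an illustration of the weighting described above, the sketch below tabulates parcels both unweighted (one count per parcel) and weighted (one count per unit). The function name and the sample parcels are made up, and giving weight 1 to a parcel with a missing unit count is an assumption of this sketch rather than a documented rule.

```python
from collections import Counter


def tabulate(parcels):
    """Return (unweighted, weighted) category counts.

    `parcels` is an iterable of (category, unit_count) pairs. A missing or
    non-positive unit count gets weight 1 (an assumption of this sketch).
    """
    unweighted, weighted = Counter(), Counter()
    for category, unit_count in parcels:
        weight = unit_count if unit_count and unit_count > 0 else 1
        unweighted[category] += 1
        weighted[category] += weight
    return unweighted, weighted


parcels = [
    ("MultiFamily", 50),             # one parcel holding 50 apartments
    ("Single Family Detached", 1),
    ("Condo", None),                 # no unit count recorded
]
unweighted, weighted = tabulate(parcels)
print(unweighted["MultiFamily"], weighted["MultiFamily"])
```

The 50-unit building contributes one observation to the unweighted table and 50 to the weighted one, which is why a disagreement in unit counts between sources carries straight through to the weighted N.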

 

 | Year | N (Multifamily) | N (Multifamily with a Unit Count > 1) | Min Number of Units | Max Number of Units | Mean | SD
AC | 2013 | 770 | 359 | 1 | 1327 | 120 | 160
BKFS | 2013 | 570 | 345 | 1 | 3085 | 91 | 207
ATRACK | 2015 | 412 | 410 | 1 | 1318 | 121 | 175


Matrix of matching unit counts

Of the shared parcels, cells indicate the percentage of the row data source's unit counts that are the same as the column data source's.

 | AC | BKFS | ATRACK
AC | 100% | |
BKFS | 99% | 100% |
ATRACK | 66% | 66% | 100%

Of the shared parcels, cells indicate the percentage of the row data source's unit counts that are less than 10 units different from the column data source's.

 | AC | BKFS | ATRACK
AC | 100% | |
BKFS | 99% | 100% |
ATRACK | 83% | 83% | 100%

Note: Figure includes all parcels listed as multifamily with a unit count greater than 0. If a parcel is not listed in a dataset, then its unit count is set to 0.
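
A matching matrix of this kind can be computed along the following lines. This is a sketch under stated assumptions, not the code behind the matrices: each data source is represented as a hypothetical mapping from parcel ID to unit count, the parcel IDs and counts are made up, and only parcels present in both sources of a pair are compared.

```python
def agreement(row_counts, col_counts, less_than=1):
    """Percentage of shared parcels whose unit counts differ by < less_than.

    With integer counts, the default less_than=1 means identical counts.
    Returns None when the two sources share no parcels.
    """
    shared = row_counts.keys() & col_counts.keys()
    if not shared:
        return None
    close = sum(
        1 for apn in shared
        if abs(row_counts[apn] - col_counts[apn]) < less_than
    )
    return 100.0 * close / len(shared)


def agreement_matrix(sources, less_than=1):
    """Build {row: {col: percentage}} over every pair of data sources."""
    names = list(sources)
    return {
        row: {col: agreement(sources[row], sources[col], less_than)
              for col in names}
        for row in names
    }


# Made-up unit counts keyed by made-up parcel IDs.
sources = {
    "AC": {"P1": 10, "P2": 20, "P3": 1},
    "BKFS": {"P1": 10, "P2": 20, "P3": 1},
    "ATRACK": {"P1": 10, "P2": 25, "P3": 40, "P4": 5},
}
exact = agreement_matrix(sources)                 # identical counts
close = agreement_matrix(sources, less_than=10)   # fewer than 10 units apart
print(round(exact["ATRACK"]["AC"], 1), round(close["ATRACK"]["AC"], 1))
```

Calling the same function with `less_than=1` and `less_than=10` gives the two matrices, so both use the same set of shared parcels.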

Two parcels account for this difference. (These parcels are not in ATRACK.)

APN | BKFS Unit Count | AC Unit Count | diff
38011004 | 3,085 | 842 | 2,243
22001724 | 2,411 | 122 | 2,289
Total | | | 4,532

 

After excluding these two outliers and restricting to the parcels that are in both populations, the sources of the unit count differences become clear. The difference in ATRACK comes from two sources. First, the ATRACK unit count is significantly higher than both the AC and BKFS counts (this could be due to growth in apartment units between 2010 and 2015, but there is no way of knowing). Second, there are parcels where ATRACK has a unit count but both AC and BKFS list the unit count as 1.

 

 

 

Selling Price (MRIS to Assessments)

Difference between listing price and assessment

  • Median: $44,400

Difference between selling price and assessment

  • Median: $29,900
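
A median difference of this kind can be computed as sketched below. The direction of the subtraction (price minus assessment) is an assumption, since the page does not state it, and the function name and dollar figures are made up for illustration.

```python
from statistics import median


def median_difference(pairs):
    """Median of (price - assessed value) over (price, assessed) pairs.

    Subtracting the assessment from the price is an assumption here.
    """
    return median(price - assessed for price, assessed in pairs)


# Made-up (selling price, assessed value) pairs.
sales = [(500_000, 450_000), (300_000, 290_000), (410_000, 380_000)]
print(median_difference(sales))
```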

 

 

Variable Comparisons

Housing Type

Year | Data | N | Missing | Condo | MultiFamily | Single Family-Attached | Single Family Detached | Unknown Affordable Dwelling Unit
2013 | BKFS | 60,413 | 0 | 26,002 | 570 | 6,005 | 27,477 | 89
 | County | 60,991 | 0 | 26,321 | 771 | 6,292 | 27,518 | 89

Year | Data | Weighted N | Missing | Condo | MultiFamily | Single Family-Attached | Single Family Detached | Unknown Affordable Dwelling Unit
2013 | BKFS | 110,883 | 0 | 26,002 | 50,955 | 6,023 | 27,544 | 89
 | County | 103,987 | 0 | 26,321 | 43,496 | 6,292 | 27,518 | 89

Number of Units

Year | Data | N | Missing | 1-Attached | 1-Detached | 2 | 3 or 4 units | 5 to 9 units | 10 to 19 units | 20 to 49 units | 50 or more Units | Unknown
2013 | BKFS | 60,413 | 0 | 32,007 | 27,477 | 1 | 19 | 127 | 100 | 102 | 196 | 384
 | County | 60,468 | 523 | 6,024 | 27,518 | 271 | 9 | 124 | 578 | 2,229 | 23,775 | 463

Year | Data | Weighted N | Missing | 1-Attached | 1-Detached | 2 | 3 or 4 units | 5 to 9 units | 10 to 19 units | 20 to 49 units | 50 or more Units | Unknown
2013 | BKFS | 110,883 | 0 | 32,025 | 27,544 | 2 | 75 | 962 | 1,320 | 3,215 | 45,356 | 384
 | County | 103,987 | 0 | 6,024 | 27,518 | 542 | 12 | 334 | 1,220 | 4,635 | 63,239 | 463

Year Built

Year | Data | N | Missing | 1939 or earlier | 1940 - 1949 | 1950 - 1959 | 1960 - 1969 | 1970 - 1979 | 1980 - 1989 | 1990 - 1999 | 2000 - 2009 | 2010 or later
2013 | BKFS | 60,413 | 26,142 | 7,664 | 8,346 | 8,180 | 2,293 | 1,309 | 2,277 | 1,650 | 1,974 | 578
 | County | 60,511 | 480 | 8,831 | 13,741 | 11,352 | 5,033 | 2,872 | 6,711 | 3,243 | 8,416 | 459

Year | Data | Weighted N | Missing | 1939 or earlier | 1940 - 1949 | 1950 - 1959 | 1960 - 1969 | 1970 - 1979 | 1980 - 1989 | 1990 - 1999 | 2000 - 2009 | 2010 or later
2013 | BKFS | 80,877 | 30,006 | 9,261 | 16,852 | 15,071 | 10,592 | 2,560 | 7,145 | 4,545 | 10,608 | 4,243
 | County | 97,365 | 6,622 | 9,495 | 18,267 | 16,403 | 13,560 | 4,112 | 11,726 | 6,015 | 15,952 | 1,835