Variables included in the Dataset:
K-12
- Demographics
- School
- District
- Attendance/Dropout
- Grades/GPA
- Courses
- Test scores
- ACT/SAT
- Disciplinary action
- Teacher info (salaries, etc.)
Higher Education
- Demographics
- Type of college
- Enrollment
- Courses
- Major
- Grades
- Tuition/Scholarships
Workforce
- Demographics
- Salary
- Industry
Link: http://www.doe.virginia.gov/statistics_reports/research_data/index.shtml
Date Inventory Completed: 5/25/15
Screening
- Is there data available for 2013?
- Can we access the data by August 15th?
Does this dataset appear to meet our needs for the Census study? YES
Explanation: Currently in contact with Todd Masa to get the postsecondary and workforce data.
Full Inventory
Purpose
- What is the purpose of the organization collecting the data?
"The Virginia Longitudinal Data System supports critical reporting on the quality of public education – such as accurate graduation and dropout rates for high schools and school divisions – while providing information that can help policy-makers improve programs that prepare and connect Virginians with employment opportunities."
- Who else uses the data? (make a note if they sell the data to companies)
App developers (http://www.apps4va.org/), the community, researchers, and parents.
Description
- What is the general topic of the data?
- K-12 Student Information
- Higher Education Student Information
- Workforce Information
- Longitudinal Education Information (includes K-12, Higher Ed, and/or Workforce)
- What are the earliest and latest dates for which data is available?
2006-2014
- How soon after a reference period ends can a data source be prepared and provided?
Approximately one year for some datasets.
Method
- What is the data collection method (portal, other)?
Depends on the institution
- What is the raw source of the collected data (teacher, superintendent)?
In high school, schools collect some information from parents (e.g., race) and send the data to VDOE based on federal reporting requirements. In college, students report some of their information to colleges directly (e.g., race), which then send the information to SCHEV. Program administrators report enrollment and participation.
Selectivity (conversely, the representativeness)
- What is the universe (e.g., population) that the data represents?
Public schools, Virginia colleges, and wage records for people employed in Virginia by an entity that reports unemployment tax to the VEC.
Stability/Coherence
- Note any changes to the universe of data being captured (e.g., including private schools).
Prior to 2012, it was optional for private institutions of higher education (IHEs) to submit course grades to SCHEV. This field became a requirement for all IHEs in 2012.
- Note any changes to the data capture method or sources of data.
VDOE added course enrollment and completion data for each student in 2010/11. Course data from prior years are not available, although for some research, state end-of-course tests can provide a reasonable proxy for course participation. VDOE also changed its race/ethnicity codes and LEP proficiency codes. LEP Proficiency Type was removed from VDOE’s data in 2009, and VDOE changed the methods by which limited English proficient (LEP) students’ proficiency levels were measured and documented. Scores on Virginia’s Standards of Learning (SOL) tests identify students as proficient or advanced proficient in content areas. The achievement needed to meet minimum or advanced proficiency changes with each revision of the Standards of Learning. For further information: http://vlds.virginia.gov/media/287/a006_1_final-ccr-researchersguidevlds.pdf
Metadata
- Is there a description of each variable in the source along with their valid values?
Yes
- Are there unique IDs for unique elements that can be used for linking data?
Yes. However, datasets generated at different times cannot be concatenated: under Virginia state privacy law (see below), new datasets cannot be linked to previously created datasets by unique identifier.
- Can K-12 be linked to higher ed or higher ed to workforce?
Yes, but with limitations: some agencies cannot link records directly. Currently this applies to VDOE and VEC, for which there is no reliable way to link data directly. However, using SCHEV records, users can connect VDOE’s data to wage records for part of the population: high school students who at some point were enrolled in a Virginia IHE.
- Links to codebooks: http://www.doe.virginia.gov/statistics_reports/research_data/data_files/data_dictionary.pdf
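The indirect K-12 to workforce link described above (VDOE to SCHEV, then SCHEV to VEC) can be sketched as a two-step join. This is only an illustration under stated assumptions: the record layouts, the field names (`k12_id`, `college_id`, `wages`), and the idea that the higher-ed records carry both identifiers are hypothetical and are not taken from the VLDS data dictionary.

```python
# Hypothetical sample records; identifiers and field names are illustrative
# only and do not reflect actual VLDS fields.
vdoe_rows = [
    {"k12_id": "A1", "diploma": "Advanced"},
    {"k12_id": "A2", "diploma": "Standard"},
    {"k12_id": "A3", "diploma": "Standard"},
]
schev_rows = [
    {"k12_id": "A1", "college_id": "C7"},
    {"k12_id": "A3", "college_id": "C9"},
]
vec_rows = [
    {"college_id": "C7", "wages": 41000},
]


def link_via_bridge(k12_rows, bridge_rows, wage_rows):
    """Join K-12 rows to wage rows through the higher-ed bridge records.

    Students with no bridge record, or whose bridge record has no matching
    wage record, are dropped (an inner join at both steps).
    """
    bridge = {row["k12_id"]: row["college_id"] for row in bridge_rows}
    wages = {row["college_id"]: row["wages"] for row in wage_rows}
    linked = []
    for row in k12_rows:
        college_id = bridge.get(row["k12_id"])
        if college_id is None or college_id not in wages:
            continue
        linked.append({**row, "wages": wages[college_id]})
    return linked


linked = link_via_bridge(vdoe_rows, schev_rows, vec_rows)
```

In this sample only student `A1` survives both joins, which mirrors the limitation noted above: the linked file covers only students who enrolled in a Virginia IHE and also appear in the wage records.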
Accuracy
- Any known sources of error?
“VDOE’s state assessment data are based on official records collected through the state testing program. However, VDOE and SCHEV course participation and grades are not official student transcripts. These records may be missing important information that is included on transcripts, such as previously earned credits (e.g., through Advanced Placement or out-of-state dual credit courses), and information about courses that were, by local or institutional policy, excluded from transcripts.”
- Describe any quality control checks performed by the state (or data manager).
Common data definitions were developed for Virginia’s workforce data; however, not every source of VLDS data follows these definitions. “VLDS relies on both deterministic and probabilistic matching methods to connect records between agencies. Researchers working with VLDS data will need to determine the quality of the matched data. In some situations, researchers may be working with data that have never been matched before, and in other situations, data may have been matched in one or more prior research projects. The effort it takes to determine data and matching quality for first-time matching projects can be significant.”
Accessibility
- How is the school-level data accessed (note if it needs to be screen scraped)?
Freely available online in csv format.
- How is the student-level data accessed?
By contacting one of the institutions that contribute data to the VLDS and having it act as an advocate for your project. That institution will need to get permission from any other institutions involved.
- Note if IRB is needed or any other restrictions on accessing data.
No
- Any records or fields collected, but not included in data source?
Suppression rules consistent with DOE policy have been applied. Within each dataset, rows were withheld if the number of students in the group was deemed small enough to allow identification of a single student. In most cases, student groups of 9 or fewer are suppressed.
- Cost? One-time, annual, or project-based payment?
Unclear
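The small-group suppression rule noted in this section (student groups of 9 or fewer withheld in most cases) can be sketched as a simple row filter. This is a minimal illustration, not the state's actual procedure: the field name `student_count` and the sample rows are hypothetical, and the source indicates the threshold applies in most cases rather than all.

```python
# Threshold taken from the inventory note: groups of 9 or fewer are
# suppressed in most cases. Individual datasets may use other rules.
SUPPRESSION_THRESHOLD = 9


def suppress_small_groups(rows, count_key="student_count",
                          threshold=SUPPRESSION_THRESHOLD):
    """Return only the rows whose group size is above the threshold."""
    return [row for row in rows if row[count_key] > threshold]


# Hypothetical school-level rows; names and counts are made up.
sample_rows = [
    {"school": "School A", "subgroup": "All students", "student_count": 412},
    {"school": "School A", "subgroup": "Subgroup X", "student_count": 9},
    {"school": "School B", "subgroup": "Subgroup X", "student_count": 10},
]

released = suppress_small_groups(sample_rows)
```

When working with the published csv files, a missing row may therefore mean a suppressed small group rather than a true zero.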
Privacy and security
- Note any confidentiality policies or legal limitations other than FERPA:
Virginia’s privacy law: http://leg1.state.va.us/cgi-bin/legp504.exe?000+cod+2.2-3800
- What do they consider personally identifiable information?
Suppression rules are provided for each dataset.
Research
- What research has been done with this dataset?
College outcomes, teacher pipeline, vulnerable population and return on investment, impact of high school diploma types, and return on investment of workforce programs.
- Research links: http://vlds.virginia.gov/InsightInsights
Describe any other notes you have or any gaps/concerns you see with this dataset: See http://vlds.virginia.gov/media/287/a006_1_final-ccr-researchersguidevlds.pdf
“At the time of this writing, VLDS included over 775 data elements. The data elements are organized by partner agency and usually further organized by the source or type of data. Data are available in accordance with each agency’s internal data structure, which may differ. For example, VDOE makes data available by school year using a four digit code representing the fall of each school year (e.g., school year 2008 represents the 2008-2009 school year); SCHEV represents school year using a four digit code representing the fall and spring (e.g., 0809 represents the 2008-2009 school year). Similarly, data are stored and therefore delivered to researchers using each agency’s internal coding which typically differs. For example, VDOE and SCHEV’s codes for students’ gender are available with different codes: SCHEV provides data using numeric codes (1, 2, and 4) and VDOE provides data using characters (M, F, and null). The agencies’ data codes are available from the Data Dictionary and Selection Tool. Not all data are available for all years. In general, data from SCHEV is available from 2006 forward and VEC from 2005 forward. VDOE’s data system has undergone significant change over the past decade. As a result, the starting year by which authorized users can access VDOE data via VLDS varies by data set and element. For example, state assessment and demographic records are available beginning with the 2005/06 school year; student schedule data in 2011/12.”
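Because VDOE and SCHEV encode school year differently, the two codes need to be normalized before records are compared. The sketch below converts both formats described in the quoted passage to a common label such as `2008-2009`. The function names are hypothetical, and the code assumes every SCHEV code falls in the 2000s, which fits the 2006-forward availability stated above but is an assumption nonetheless.

```python
def vdoe_year_to_label(code):
    """Convert a VDOE fall-year code such as '2008' to '2008-2009'."""
    fall = int(code)
    return f"{fall}-{fall + 1}"


def schev_year_to_label(code, century=2000):
    """Convert a SCHEV fall/spring code such as '0809' to '2008-2009'.

    Assumes both two-digit halves belong to the given century.
    """
    if len(code) != 4 or not code.isdigit():
        raise ValueError(f"expected a four-digit code, got {code!r}")
    fall = century + int(code[:2])
    spring = century + int(code[2:])
    if spring != fall + 1:
        raise ValueError(f"halves of {code!r} are not consecutive years")
    return f"{fall}-{spring}"


# Both agency codes for the same school year map to one label.
same_year = vdoe_year_to_label("2008") == schev_year_to_label("0809")
```

Gender codes would need a similar crosswalk, but the meaning of each SCHEV numeric code should be taken from the Data Dictionary and Selection Tool rather than guessed.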