VT Census Case Studies: Pennsylvania Longitudinal Education Data

Variables included in the Dataset:

K-12

  • Demographics
  • School
  • District
  • Attendance/Dropout
  • Grades/GPA
  • Courses
  • Test scores
  • ACT/SAT
  • Disciplinary action
  • Teacher info (salaries, etc.)

Higher Education

  • Demographics
  • Type of college
  • Enrollment
  • Courses
  • Major
  • Grades
  • Tuition/Scholarships

Workforce

  • Demographics
  • Salary
  • Industry

Screening

  • Is there data available for 2013?
  • Can we access the data by August 15th? Likely not for student-level data.

Does this dataset appear to meet our needs for the Census study? UNDECIDED

Explanation: Pennsylvania treats test scores, GPA, course information, etc. as “confidential information”. The state requires all researchers to be in compliance with FERPA to access any of this data. Access involves filling out an extensive proposal, and we probably would not get access until after August 15. However, one can also submit requests for non-confidential information, so more variables and greater detail of school-level data could likely be acquired.

Full Inventory 

Purpose

  • What is the purpose of the organization collecting the data?

Purpose is to coordinate education for the state and make policy recommendations.

  • Who else uses the data? (make a note if they sell the data to companies)

Anyone from the general public can use the data to assess school accountability. The primary users are educators, policy makers, and Pennsylvania citizens.


Description

  • What is the general topic of the data?
  • K-12 Student Information
  • Higher Education Student Information
  • Workforce Information
  • Longitudinal Education Information (includes K-12, Higher Ed, and/or Workforce)

 

  • What are the earliest and latest dates for which data is available?

Depends on the data set: some begin as early as 1997, and some run through the current school year (2014-2015).

  • How soon after a reference period ends can a data source be prepared and provided?

Depends on the data that is being used.


Method

  • What is the data collection method (portal, other)?

Online data entry

  • What is the raw source of the collected data (teacher, superintendent)?

Depends on what data is being reported. The PIMS manual has documentation: https://www.portal.state.pa.us/portal/server.pt/directory/pims_manuals/71511


Selectivity (conversely, the representativeness)

  • What is the universe (e.g., population) that the data represents?

All students attending school in Pennsylvania, including public, charter, private, and home schools, as well as community colleges and public universities, unless the students opt out.


Stability/Coherence

  • Note any changes to the universe of data being captured (e.g., including private schools).

School district boundaries frequently change, but they are listed at http://www.portal.state.pa.us/portal/server.pt/community/data_and_statistics/7202/map_of_school_districts_and_intermediate_units/509785

  • Note any changes to the data capture method or sources of data.

Unknown

 

Metadata

  • Is there a description of each variable in the source along with their valid values? 

Yes

  • Are there unique IDs for unique elements that can be used for linking data? 

Yes

  • Can K-12 be linked to higher ed or higher ed to workforce? 

Yes
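As a rough illustration of what linking on a unique ID could look like once access is granted, here is a minimal sketch in Python. It assumes both extracts carry a shared student identifier. The field names (`student_id`, `district`, `college_type`, `major`) and the sample values are hypothetical and are not taken from the PIMS documentation.

```python
# Hypothetical records; field names and ID values are illustrative only
# and do not reflect the actual PIMS layout.
k12_records = [
    {"student_id": "A001", "district": "District 1", "gpa": 3.4},
    {"student_id": "A002", "district": "District 2", "gpa": 2.9},
]
higher_ed_records = [
    {"student_id": "A001", "college_type": "public university", "major": "Biology"},
]


def link_records(left, right, key="student_id"):
    """Left-join two lists of dicts on a shared unique ID.

    Every row in `left` is kept. Fields from the matching row in `right`
    are added when a match exists, and a boolean `linked` flag records
    whether a match was found.
    """
    right_by_id = {row[key]: row for row in right}
    linked = []
    for row in left:
        match = right_by_id.get(row[key], {})
        merged = {**row, **match}
        merged["linked"] = bool(match)
        linked.append(merged)
    return linked


linked = link_records(k12_records, higher_ed_records)
for row in linked:
    print(row["student_id"], row["linked"])
```

A left join keeps K-12 students who never appear in the higher education extract, so the share of unlinked records can be checked before any analysis.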

 

Accuracy

  • Any known sources of error?

Unknown

  • Describe any quality control checks performed by the state (or data manager).

Unknown

 

Accessibility

  • How is the school-level data accessed (note if it needs to be screen scraped)?

Pennsylvania treats test scores, GPA, course information, etc. as “confidential information”. The state requires all researchers to be in compliance with FERPA to access any of this data. Access involves filling out an extensive proposal, and we probably would not get access until after August 15. However, one can also submit requests for non-confidential information, so more variables and greater detail of school-level data could likely be acquired.

  • How is the student-level data accessed? 

See above

  • Note if IRB is needed or any other restrictions on accessing data.

N/A

  • Any records or fields collected, but not included in data source? 

Unknown

  • Cost? - One time or annual or project based payment? 

There is a cost, but it is dependent on what information is requested.

Privacy and security

  • Note any confidentiality policies or legal limitations other than FERPA: 

N/A

  • What do they consider personally identifiable information? 

Pennsylvania treats data such as test scores, GPA, and courses as personally identifiable information, and an organization must be in compliance with FERPA in order to access it.


Research

  • What research has been done with this dataset? 

Unknown

  • Research links:

N/A

Describe any other notes you have or any gaps/concerns you see with this dataset: According to the Department of Education website, data for postsecondary education is housed in the PIMS system and longitudinally linked with the K-12 data. However, public universities in Pennsylvania are governed by both the State System of Higher Education and the Commonwealth System of Higher Education. The State System collects data from its 14 public universities, then submits it to PIMS. The State System's reporting in PIMS is also inconsistent. The State System website has reports that measure the value of the education, but these rely on alumni surveys, and there is currently no direct linkage to workforce data. Pennsylvania’s State System of Higher Education (PASSHE) gives out no student-level data, but it could provide more detailed school-level data than is listed on its website if we e-mail cosmolenski@passhe.edu. Furthermore, PASSHE is trying to build a system to link its data with workforce data, but it is not complete yet.