The goal is to predict a person's income based on their age, gender, class of worker, and education level. We validated the model numerically, using R-squared and RMSE. The PINCP, AGEP, SEX, and SCHL variables are selected from the PUMS dataset. The beta coefficient for age is 0.012, which means that for every additional year of age we estimate that income increases by a factor of 10^0.012 = 1.028 (about 2.8%), with the other predictors held fixed. The beta coefficient for a bachelor’s degree is 0.39, which means that people with a bachelor’s degree are estimated to make 10^0.39 = 2.45 times as much as those who did not complete high school.
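The arithmetic behind these factors can be checked with a short sketch. It assumes the model was fit to log10 of income, which the powers of 10 above imply but the text does not state outright. The function name `multiplicative_factor` is hypothetical and used only for illustration.

```python
def multiplicative_factor(beta, base=10.0):
    """Convert a coefficient on a log-income scale into a multiplicative
    effect on income. base=10 is an assumption taken from the 10^beta
    expressions in the text."""
    return base ** beta


# Coefficients quoted in the summary above.
age_beta = 0.012
bachelors_beta = 0.39

print(round(multiplicative_factor(age_beta), 3))        # 1.028
print(round(multiplicative_factor(bachelors_beta), 2))  # 2.45
```

Because the effects multiply on the income scale, the factor for several additional years of age is the one-year factor raised to that number of years, not a sum of the percentages.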
PUMS stands for Public Use Microdata Sample, a dataset drawn from the American Community Survey (ACS), an ongoing survey that provides yearly information about the United States and its people. It contains detailed population and housing information such as class of worker, education level, gender, and income.
This workflow describes the process to create a linear regression analysis of the PUMS data.
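As a rough sketch of the validation step in such a workflow, R-squared and RMSE can be computed from actual and predicted values as shown here. The helper names `rmse` and `r_squared` and the sample numbers are assumptions made up for illustration; they are not results from the PUMS analysis.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Made-up log10(income) values, for illustration only (not PUMS data).
actual = [4.0, 4.5, 5.0]
predicted = [4.1, 4.4, 5.0]

print("RMSE:", rmse(actual, predicted))
print("R-squared:", r_squared(actual, predicted))
```

If the model is fit on a log income scale, both metrics computed this way are also on that scale, so an RMSE value describes error in log units rather than in dollars.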
I created and tested a multiple linear regression model with the following dataset from the US Census Bureau. The goal is to predict a person's income based on their age, gender, class of worker, and education level.