My Regression Tests

This post is for anyone looking to understand a simple multiple regression that predicts median income from a set of supporting variables.

My task here was to predict median income from education and household variables using linear regression in RStudio. I am entirely self-taught. My code and results are below.

MedianIncomePredictor1<-lm(medincome ~ studypercap+percentmarried+pctbachdeg25_over+
pctbachdeg18_24+pctnohs18_24+birthrate, data=Categorised)
summary(MedianIncomePredictor1)

MedianIncomePredictor2<-lm(medincome ~ studypercap+medianage+percentmarried+pctbachdeg25_over+
pctnohs18_24+birthrate, data=Categorised)
summary(MedianIncomePredictor2)

MedianIncomePredictor3<-lm(medincome ~ studypercap+medianage+percentmarried+pctbachdeg25_over+
pctbachdeg18_24+pctnohs18_24+birthrate, data=Categorised)
summary(MedianIncomePredictor3)

summary(MedianIncomePredictor1)
summary(MedianIncomePredictor2)
summary(MedianIncomePredictor3)
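
Since the goal is prediction, here is a rough sketch of how a fitted lm model can be used with predict() on a new row. It does not use Categorised: `sim` and `new_row` are made-up stand-ins with simulated values and only two of the predictors, so the printed numbers mean nothing for the real data.

```r
# Synthetic stand-in for the Categorised data frame (assumption: simulated values)
set.seed(42)
n <- 300
sim <- data.frame(
  percentmarried    = runif(n, 30, 70),
  pctbachdeg25_over = runif(n, 3, 40)
)
sim$medincome <- 1200 + 490 * sim$percentmarried +
  1360 * sim$pctbachdeg25_over + rnorm(n, sd = 7700)

fit <- lm(medincome ~ percentmarried + pctbachdeg25_over, data = sim)

# A hypothetical new observation; column names must match the model's predictors
new_row <- data.frame(percentmarried = 55, pctbachdeg25_over = 15)

# interval = "prediction" adds lower and upper bounds for a single new value
pred <- predict(fit, newdata = new_row, interval = "prediction", level = 0.95)
print(pred)
```

The result is a matrix with columns fit, lwr and upr. The prediction interval is usually wide compared with the point estimate, which is worth keeping in mind when the residual standard error is large.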

 

Results –

> summary(MedianIncomePredictor1)

Call:
lm(formula = medincome ~ studypercap + percentmarried + pctbachdeg25_over +
pctbachdeg18_24 + pctnohs18_24 + birthrate, data = Categorised)

Residuals:
   Min     1Q Median     3Q    Max
-32865  -4664   -648   4104  38731

Coefficients:
                   Estimate Std. Error t value Pr(>|t|)
(Intercept)       1171.0513  1213.3028   0.965    0.335
studypercap         -0.4382     0.2676  -1.637    0.102
percentmarried     493.2993    20.7482  23.775  < 2e-16 ***
pctbachdeg25_over 1358.9278    33.6663  40.365  < 2e-16 ***
pctbachdeg18_24    303.8023    39.6299   7.666 2.37e-14 ***
pctnohs18_24        -7.2859    19.3769  -0.376    0.707
birthrate          112.2244    72.4050   1.550    0.121

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 7752 on 3040 degrees of freedom
Multiple R-squared: 0.5863, Adjusted R-squared: 0.5855
F-statistic: 718 on 6 and 3040 DF, p-value: < 2.2e-16

> summary(MedianIncomePredictor2)

Call:
lm(formula = medincome ~ studypercap + medianage + percentmarried +
pctbachdeg25_over + +pctnohs18_24 + birthrate, data = Categorised)

Residuals:
   Min     1Q Median     3Q    Max
-34609  -4768   -669   4154  44874

Coefficients:
                   Estimate Std. Error t value Pr(>|t|)
(Intercept)       2063.8226  1225.0537   1.685   0.0922 .
studypercap         -0.4649     0.2702  -1.720   0.0854 .
medianage           -3.4855     3.1351  -1.112   0.2663
percentmarried     495.2675    20.9692  23.619   <2e-16 ***
pctbachdeg25_over 1493.2989    28.9606  51.563   <2e-16 ***
pctnohs18_24       -35.3394    19.2091  -1.840   0.0659 .
birthrate           70.5096    72.9034   0.967   0.3335

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 7825 on 3040 degrees of freedom
Multiple R-squared: 0.5784, Adjusted R-squared: 0.5776
F-statistic: 695.2 on 6 and 3040 DF, p-value: < 2.2e-16

> summary(MedianIncomePredictor3)

Call:
lm(formula = medincome ~ studypercap + medianage + percentmarried +
pctbachdeg25_over + pctbachdeg18_24 + pctnohs18_24 + birthrate,
data = Categorised)

Residuals:
   Min     1Q Median     3Q    Max
-32872  -4656   -634   4104  38718

Coefficients:
                   Estimate Std. Error t value Pr(>|t|)
(Intercept)       1283.4783  1217.8728   1.054   0.2920
studypercap         -0.4442     0.2677  -1.660   0.0971 .
medianage           -3.3033     3.1059  -1.064   0.2876
percentmarried     494.3989    20.7735  23.799  < 2e-16 ***
pctbachdeg25_over 1358.3475    33.6700  40.343  < 2e-16 ***
pctbachdeg18_24    303.4794    39.6302   7.658 2.53e-14 ***
pctnohs18_24        -7.3771    19.3767  -0.381   0.7034
birthrate          110.8941    72.4142   1.531   0.1258

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 7752 on 3039 degrees of freedom
Multiple R-squared: 0.5864, Adjusted R-squared: 0.5855
F-statistic: 615.6 on 7 and 3039 DF, p-value: < 2.2e-16
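
Model 3 is Model 1 plus medianage, and the adjusted R-squared is 0.5855 for both, so medianage does not seem to add much. One way to check a nested pair like this more formally is a partial F-test with anova(), alongside AIC(). The sketch below is not run on Categorised: `sim` is a made-up data frame with simulated values and only three of the predictors, so whatever it prints says nothing about the real data.

```r
# Synthetic stand-in for the Categorised data frame (assumption: simulated values)
set.seed(1)
n <- 500
sim <- data.frame(
  percentmarried    = rnorm(n, 52, 7),
  pctbachdeg25_over = rnorm(n, 13, 5),
  medianage         = rnorm(n, 41, 5)
)
# medianage is deliberately left out of the simulated outcome
sim$medincome <- 1000 + 500 * sim$percentmarried +
  1400 * sim$pctbachdeg25_over + rnorm(n, sd = 7500)

small <- lm(medincome ~ percentmarried + pctbachdeg25_over, data = sim)
full  <- lm(medincome ~ percentmarried + pctbachdeg25_over + medianage, data = sim)

# Partial F-test: does adding medianage reduce the residual sum of squares
# by more than chance alone would explain?
cmp <- anova(small, full)
print(cmp)

# AIC penalises extra parameters; the lower value is preferred
print(AIC(small, full))
```

On the real models the equivalent call would be anova(MedianIncomePredictor1, MedianIncomePredictor3). The same approach works for Model 2 against Model 3, since Model 2 is Model 3 without pctbachdeg18_24.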

 

Some charts exported from RStudio using:
plot(MedianIncomePredictor1)
plot(MedianIncomePredictor2)
plot(MedianIncomePredictor3)

[Chart: Model = MedianIncome1]

[Chart: Model = MedianIncome2]

[Chart: Model = MedianIncome3]
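
Calling plot() on an lm object draws four diagnostic plots one after another. A possible way to get all four on one page and save them from code, instead of exporting by hand, is sketched here. The data in `sim` is simulated and the file name is hypothetical; with the real models, `fit` would be replaced by MedianIncomePredictor1 and so on.

```r
# Synthetic stand-in data (assumption: simulated values, not the Categorised data)
set.seed(7)
n <- 200
sim <- data.frame(pctbachdeg25_over = runif(n, 3, 40))
sim$medincome <- 20000 + 1400 * sim$pctbachdeg25_over + rnorm(n, sd = 7500)

fit <- lm(medincome ~ pctbachdeg25_over, data = sim)

# Hypothetical output location; any writable path works
out_file <- file.path(tempdir(), "diagnostics.pdf")

pdf(out_file, width = 8, height = 8)  # open a PDF graphics device
par(mfrow = c(2, 2))                  # 2 x 2 grid so the four plots share a page
plot(fit)                             # residuals vs fitted, Q-Q, scale-location,
                                      # residuals vs leverage
dev.off()                             # close the device so the file is written
```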