/*
   Program to Determine Women's Labor Force Participation Rate
   Sara Gieseke
   November 2004
   Program name: workforce_part1
*/

/* Variables used in the model
   Data set for 1980 and 1990 for 50 states

   wlfp = participation rate (%) of all women over 16
   yf   = median earnings (in thousands of dollars) by females
   ym   = median earnings (in thousands of dollars) by males
   educ = percent of female high school graduates over 24 years of age
   ue   = unemployment rate (%)
   mr   = marriage rate (%) of women at least 16 years of age
   dr   = divorce rate (%)
   urb  = percentage of urban population in state
   wh   = percentage of females over 16 years who are
   d90  = dummy variable 0=1980 1=1990
*/

/* This is a continuation of the first women's labor force participation
   rate program, but using two decades rather than one. Therefore dummy
   variables must be used to see whether there are differences between
   1980 and 1990. */

/* TEST FOR CORRELATION */
proc corr data=women2;
   var wlfp yf ym educ ue mr dr urb wh;
run;

/* REGRESSION */
/* ym is left out of the model because it is highly correlated with yf.
   The regression is rerun, dropping the least significant variable each
   time, until all of the variables are significant and the model
   produces the highest R-square. */
proc reg data=women2;
   model wlfp = yf d90yf educ d90educ ue d90ue mr d90mr dr d90dr
                urb d90urb wh d90wh;
run;

/* R-square for model 1 = .8613
   drop least significant variable d90educ (p = .8813) */
proc reg data=women2;
   model wlfp = yf d90yf educ ue d90ue mr d90mr dr d90dr
                urb d90urb wh d90wh;
run;

/* R-square for model 2 = .8612
   drop least significant variable d90wh (p = .5573) */
proc reg data=women2;
   model wlfp = yf d90yf educ ue d90ue mr d90mr dr d90dr
                urb d90urb wh;
run;

/* R-square for model 3 = .8607
   drop least significant variable d90urb (p = .3391) */
proc reg data=women2;
   model wlfp = yf d90yf educ ue d90ue mr d90mr dr d90dr urb wh;
run;

/* R-square for model 4 = .8592
   drop least significant variable d90dr (p = .4350) */
proc reg data=women2;
   model wlfp = yf d90yf educ ue d90ue mr d90mr dr urb wh;
run;

/* R-square for model 5 = .8582
   All variables are significant at the .10 level, so this will be our
   final model, just like on p317. */

/* Final model re-run */
proc reg data=women2;
   /* print out predicted values and residuals */
   model wlfp = yf d90yf educ ue d90ue mr d90mr dr urb wh / p r;
   /* plot residuals to check model assumptions */
   plot residual.*predicted. = 'o';
run;
quit;

/* Best model (all variables significant at a p-value of .1 or lower) */
proc reg data=women2;
   title 'Women in the Labor Force 1980 and 1990';
   model wlfp = yf d90yf educ ue d90ue mr d90mr dr urb wh;
run;
quit;
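
/* Sketch only, not part of the original analysis: the manual steps above
   (drop the least significant variable, rerun, repeat) resemble backward
   elimination, which PROC REG can carry out with SELECTION=BACKWARD.
   SLSTAY=0.10 is chosen here to mirror the .10 significance level used
   above. This assumes women2 already contains the d90 interaction terms
   (d90yf, d90educ, ...) listed in model 1. The model it selects is meant
   as a cross-check on the hand-built final model, not a replacement. */
proc reg data=women2;
   title 'Backward elimination cross-check (SLSTAY = 0.10)';
   model wlfp = yf d90yf educ d90educ ue d90ue mr d90mr dr d90dr
                urb d90urb wh d90wh
         / selection=backward slstay=0.10;
run;
quit;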