[lbo-talk] I hope you all vote(d) for Obama

Julio Huato juliohuato at gmail.com
Fri Feb 8 07:53:56 PST 2008


Miles wrote:


> The value of t does not tell you "how significant" a predictor is in a
> regression model. I think the question you want to answer is, "Is age
> a good predictor of personal income, after controlling for the other
> sociodemographic variables in the model?" Try removing age from the
> model and compare r-squared with and without age. That will give you
> a meaningful estimate of the unique predictive power of age.

Again, I do not claim that age explains much of the total (huge!) variation in U.S. personal income. As I said, even after throwing in variables such as education and sex, less than a third of the total variation can be explained. The case I make is simpler: that the influence of age on personal income is positive and significantly different from zero. And that much the t value does tell me. (Please see, back to back below, the output of the regressions that Miles suggests.)
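As a quick check on what the t value reports, here is a minimal sketch that recomputes it from the coefficient and standard error in the lnage row of the first regression output further down. The 1.96 multiplier is an assumption: it is the normal approximation, which is reasonable with roughly 1.9 million observations.

```python
# Figures copied from the lnage row of the first regression in this post.
coef = 0.6350648
std_err = 0.0018162

# t is the coefficient measured in units of its own standard error.
t_stat = coef / std_err

# Approximate 95% interval using the normal critical value (assumption:
# the sample is large enough that t and normal quantiles coincide).
ci_low = coef - 1.96 * std_err
ci_high = coef + 1.96 * std_err

print(f"t = {t_stat:.2f}")
print(f"95% CI = [{ci_low:.4f}, {ci_high:.4f}]")
```

The interval sits well away from zero, which is all the t value is being asked to show here: the sign and the significance of the age effect, not how much of the variation age accounts for. Since both income and age are in logs, the coefficient reads as an elasticity.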

You may ask why I insist on age and race/ethnicity as predictors of personal income. In the regressions, don't gender and education (especially higher ed) come out even more strongly? The reason is this:

When people choose their race/ethnicity, i.e. when they choose their parents and the time when their parents are going to have them, they have little chance to look at their income. So the direction of causality is easy to establish. With education, you have a problem -- "endogeneity," the chicken-and-egg issue.

In the U.S., you can make an easy case that education depends on income. Even at the pre-school, elementary, middle, and high school levels. Now, if you find me *one* variable in the 2000 census that is highly correlated with education but uncorrelated with personal income (or, more precisely, uncorrelated with the residuals of the regression model), then I can isolate for you more exactly the influence of education on income without contaminating the analysis with the feedback from income to education. (I've experimented with lags and other things, but the results are not as clear and that'd take me longer to explain here.)
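The kind of variable asked for here is what econometricians call an instrument. A toy sketch on synthetic data (not the census) shows why it helps. Everything in it is an assumption made for illustration: the variable names, the coefficients, and the way endogeneity is produced, which is an unobserved factor that moves both education and income rather than a literal feedback loop.

```python
import random

rng = random.Random(0)
n = 50_000
TRUE_EFFECT = 0.5  # assumed effect of education on income in this toy model

z, educ, income = [], [], []
for _ in range(n):
    zi = rng.gauss(0, 1)  # hypothetical instrument: shifts education only
    ui = rng.gauss(0, 1)  # unobserved factor behind both education and income
    ei = 0.8 * zi + 0.6 * ui + rng.gauss(0, 1)
    yi = TRUE_EFFECT * ei + ui + rng.gauss(0, 1)
    z.append(zi)
    educ.append(ei)
    income.append(yi)


def cov(a, b):
    """Population covariance of two equal-length lists."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / len(a)


# Ordinary least squares slope: contaminated by the unobserved factor.
ols_slope = cov(educ, income) / cov(educ, educ)

# Instrumental-variable slope with a single instrument; this equals the
# two-stage least squares estimate in the one-instrument case.
iv_slope = cov(z, income) / cov(z, educ)

print(f"OLS slope: {ols_slope:.3f}")
print(f"IV slope:  {iv_slope:.3f}")
```

In this setup the plain regression overstates the effect of education, while the instrument-based estimate lands near the assumed value. The hard part in practice is the one named above: finding a census variable that moves education without being related to the income residuals.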

But how about sex or -- if you prefer -- gender? After all, when people choose their gender, they can't look at their income either. Well, I've looked into that variable too. Ignore those below 16. Among adults, the variation in the gender income gap has to do with the interdependence between sex, age, race/ethnicity, and household income! In any comparison that is as apples-to-apples as possible, the differences in income between women and men are like -- if you don't mind my putting it this way -- a luxury good that predominantly well-off (middle- to upper-class), white couples can afford. Among poor Hispanic couples, you get a bit of a gap. Perhaps that is culture. But it is nothing like what you get with rich, educated, urban and suburban, white couples. (And that, by the way, suggests to me a reason why the gender gap is such a big issue in the media. It hurts those who have a voice.) Among poor Black couples, the gender income gap is virtually nonexistent.

(I hope this also addresses some of the comments that shag just made.)

In fact, on this very issue, a colleague and I will be presenting a paper at the annual meeting of the Urban Affairs Association in Baltimore at some point in April. We took the PUMS of 1980, 1990, and 2000 -- and we've also been looking at the more recent American Community Survey data.

* * *

Here are the regressions with and without age. R-squared is 0.2582 with age and 0.2107 without. Is 4.75% of the total variation in income enough explaining? You be the judge:

. xi: regress lninct lnage i.wbho female i.ed2
i.wbho            _Iwbho_1-4          (naturally coded; _Iwbho_1 omitted)
i.ed2             _Ied2_1-6           (naturally coded; _Ied2_1 omitted)

      Source |       SS          df       MS        Number of obs  =  1908799
-------------+----------------------------------    F( 10,1908788) = 66456.15
       Model |  694711.961       10   69471.1961    Prob > F       =   0.0000
    Residual |  1995387.77  1908788   1.04536898    R-squared      =   0.2582
-------------+----------------------------------    Adj R-squared  =   0.2582
       Total |  2690099.73  1908798   1.40931609    Root MSE       =   1.0224

------------------------------------------------------------------------------
    lninctot |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       lnage |   .6350648   .0018162   349.67   0.000     .6315052    .6386245
    _Iwbho_2 |  -.1121831    .002499   -44.89   0.000    -.1170811   -.1072852
    _Iwbho_3 |    .040356   .0026666    15.13   0.000     .0351296    .0455824
    _Iwbho_4 |   -.115744   .0034079   -33.96   0.000    -.1224234   -.1090646
      female |  -.5889234   .0014854  -396.47   0.000    -.5918348    -.586012
     _Ied2_2 |    .056953   .0035532    16.03   0.000     .0499888    .0639172
     _Ied2_3 |   .6201275   .0032624   190.08   0.000     .6137333    .6265218
     _Ied2_4 |    .854046   .0033024   258.61   0.000     .8475733    .8605186
     _Ied2_5 |   1.310026   .0035962   364.28   0.000     1.302978    1.317074
     _Ied2_6 |   1.554936   .0039808   390.61   0.000     1.547134    1.562739
       _cons |   6.982544   .0079991   872.92   0.000     6.966866    6.998222
------------------------------------------------------------------------------

. xi: regress lninct i.wbho female i.ed2
i.wbho            _Iwbho_1-4          (naturally coded; _Iwbho_1 omitted)
i.ed2             _Ied2_1-6           (naturally coded; _Ied2_1 omitted)

      Source |       SS          df       MS        Number of obs  =  1908799
-------------+----------------------------------    F(  9,1908789) = 56627.21
       Model |  566893.935        9    62988.215    Prob > F       =   0.0000
    Residual |  2123205.79  1908789   1.11233132    R-squared      =   0.2107
-------------+----------------------------------    Adj R-squared  =   0.2107
       Total |  2690099.73  1908798   1.40931609    Root MSE       =   1.0547

------------------------------------------------------------------------------
    lninctot |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    _Iwbho_2 |  -.1743928   .0025712   -67.82   0.000    -.1794324   -.1693533
    _Iwbho_3 |   -.119376     .00271   -44.05   0.000    -.1246875   -.1140645
    _Iwbho_4 |  -.1969446   .0035072   -56.15   0.000    -.2038186   -.1900706
      female |  -.5590877   .0015297  -365.48   0.000    -.5620859   -.5560894
     _Ied2_2 |  -.1965181   .0035882   -54.77   0.000    -.2035508   -.1894854
     _Ied2_3 |   .4422414   .0033241   133.04   0.000     .4357262    .4487566
     _Ied2_4 |   .6138793   .0033321   184.23   0.000     .6073486    .6204101
     _Ied2_5 |   1.099146    .003657   300.56   0.000     1.091978    1.106314
     _Ied2_6 |   1.422095   .0040876   347.90   0.000     1.414083    1.430107
       _cons |   9.563309   .0031816  3005.84   0.000     9.557073    9.569545
------------------------------------------------------------------------------
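
For readers who want to see how the two outputs above connect, here is a small sketch that recomputes the change in R-squared from the reported sums of squares and then forms the partial F statistic for dropping lnage. The partial F formula is standard textbook material rather than something stated in this post, so treat it as an added assumption.

```python
# Sums of squares and degrees of freedom copied from the two outputs above.
ss_total = 2690099.73
ss_resid_full = 1995387.77     # model with lnage
ss_resid_reduced = 2123205.79  # model without lnage
df_resid_full = 1908788

r2_full = 1 - ss_resid_full / ss_total
r2_reduced = 1 - ss_resid_reduced / ss_total
delta_r2 = r2_full - r2_reduced

# Partial F for removing one regressor: the increase in residual sum of
# squares divided by the full model's residual mean square.
partial_f = (ss_resid_reduced - ss_resid_full) / (ss_resid_full / df_resid_full)

# With a single regressor removed, F should equal that regressor's t squared.
t_lnage = 349.67
t_squared = t_lnage ** 2

print(f"R-squared with age:    {r2_full:.4f}")
print(f"R-squared without age: {r2_reduced:.4f}")
print(f"Difference:            {delta_r2:.4f}")
print(f"Partial F: {partial_f:.1f}   t squared: {t_squared:.1f}")
```

The partial F and the squared t for lnage agree up to rounding in the printed figures. In other words, the with-and-without comparison and the t value are built from the same change in the residual sum of squares: the first expresses it as a share of total variation, the second as a test against zero.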


