Incidence of Reported Disability among Men: Accounting for Self Selection

A large volume of research has investigated consequences of physical and health impairments with respect to labor force attachments and earnings of persons with disabilities. Many of these studies are based on data sources in which individual disability status is self-reported. Considerably less attention has been devoted to factors that determine self-reported disability. The contribution of this paper is in its distinction between two aspects of endogenous selection in the transition from self-declared nondisability to disability status. On one hand, measured earnings and income might exert a direct effect on the individual’s propensity to report disability. On the other hand, some individuals possess unmeasured traits that might simultaneously affect their earnings and their propensity to report a disability. Based on samples of individuals from the U.S. Current Population Survey at two points in time, and using individuals who initially do not report a disability, this study first looks for an explicit role of earnings per se in the reporting decision. Second, it examines whether transitions to disability between the two periods occur in the presence of correlation between unmeasured factors present in both earnings during the first period and the subsequent decision to report a disability.

Few existing models of self-reported health or disability include individuals' earnings or income as an explanatory variable. Rivera (2001) estimates an ordered probit model of self-reported health based on a cross section of individuals from the 1993 wave of the Spanish National Health Survey. The dependent variable encompasses five categories of health, and the model purports to estimate the impact of public health expenditures on those health outcomes. The model includes as explanatory variables several background and demographic characteristics, as well as variables measuring public health expenditures in the individual's region of residence, yet it excludes measures of individuals' earnings or income. Beegle and Stock (2003) use a linear probability model for individual self-reports of disability based on U.S. Census data from 1970, 1980, and 1990. The specification includes individual characteristics such as age, education, and household size. In addition, it includes indicators of diagnosed medical conditions thought to be associated with disability. The model does not, however, allow individuals' earnings to directly affect their reported disability status. Chirikos and Nestel (1984) use cross section samples of mature men and women from the 1976 and 1977 waves of the National Longitudinal Surveys of Youth to estimate logit models of self-reported disability status. They implicitly recognize the potential endogeneity of individuals' hourly earnings, which they address by specifying the model to include estimated wages from a first step auxiliary regression. Their estimates indicate that individuals with higher expected wages are significantly less likely to report disabilities. A similar approach is used by Chirikos (1986), based on cross sectional data from the 1978 Social Security Administration Survey. His results likewise indicate a reduced propensity to report disabilities as expected wages increase.
Additional papers that include models of reported work limitation status are Kreider (1999) and Webber and Bjelland (2015). Both studies focus on simultaneous estimation of reported limitation and labor force participation. Again, unlike our approach, to be described in Section III, their disability-reporting models are reduced forms and do not incorporate individuals' wage earnings as endogenous determinants of the reporting decision.
Another feature that distinguishes the literature is the tendency to utilize data on individuals who report themselves as disabled at the time the respective surveys are conducted. Less attention is devoted to individuals who are initially nondisabled and subsequently transition to disabled status in a succeeding time period. A recent exception is Lechner and Vasquez-Alvarez (2011). Using the 1984-2002 waves of the German Socioeconomic Panel, they construct successive four year panels for each individual, in which by construction of the sample, individuals are not disabled in the first period and then subsequently transition to disability. Their principal focus is on the onset of disability and its impact on several outcomes, including employment and earnings. Their basis of comparison, utilizing propensity score matching, is the set of individuals who remain nondisabled for the entire duration of the four year panels. Based on this matching of disabled and nondisabled persons, they find that the transition to disability (rather than existing disability status) is associated with a significant reduction in the probability of employment. Their results underscore the importance of transitions into disability, which deserve more attention. The general need for additional research, which motivates this paper, is noted by Kruse and Schur (2003, p. 62): "This highlights a need for further research on what leads people to say they have a work-limiting or work-preventing disability in order to gain a better understanding of how labor market conditions, public policies, and employer accommodations may affect these self-reports and employment prospects of people with impairments and activity restrictions." Accordingly, this study has two principal objectives.
First, it recognizes that self-reporting is by definition a phenomenon that takes place before or concurrent with the individual's potential access to legally established consequences of disability, such as rights to workplace accommodation under the ADA or public support programs such as SSDI or SSI. However, as noted above, it has been common in the literature to analyze the labor market effects of disabilities for individuals who report themselves as disabled at the time they are surveyed. This leaves unanswered the important question concerning determinants of the transition to self-reported disability.
Second, this study incorporates two potential dimensions of self selection. Following Heckman (1979), the traditional definition is based on unmeasured traits of self-reported disabled individuals. Positive selection is present if individuals who possess latent tendencies to self-report are characterized by unmeasured attributes that result in higher earnings; negative selection is present when positive propensities to self-report are associated with negative latent earnings attributes. At the same time, however, selection might be related to measured earnings. In this framework, self-reporters are positively selected if they exhibit higher earnings before they report disabilities. Negative selection occurs if self-reporters originate from the lowest earners. This study incorporates both aspects of self selection in the context of a single model. It permits testable hypotheses about selection based on both observed earnings and unmeasured attributes of potential self-reporters.

Econometric Framework for Self Selection
To clarify ideas, we establish a definition of self selection and an econometric framework that permits tests of the appropriate hypotheses. To capture the transition to self-reported disability, we envision a population of initially employed males observed at consecutive points in time. During the first period, each individual is not impaired by a physical or health disability that limits his capacity for employment. For individual i, earnings in the first period are expressed as
$$y_i = x_i'\beta + \varepsilon_i \qquad (1)$$
where $x_i$ is a vector of explanatory variables and $\beta$ is a conformable vector of unknown coefficients. The random error term $\varepsilon_i$ is normally distributed with zero mean and variance $\sigma_\varepsilon^2$.
By the second period, the individual reports either that he is or is not impaired by a work-limiting disability. Let $u_i$ represent a latent adjustment, perceived by the individual as a potential disturbance to his earnings in the second period. Self selection in the reporting decision is summarized in equation (2) as the difference between individuals' expected earnings and their (counterfactual) expected earnings if they do not report a disability. Thus the essence of latent self selection is that individuals who report work-limiting disabilities are characterized by unmeasured traits that become manifest in their earnings anticipations formed during the first period. It might be, for example, that individuals who anticipate that they will report disabilities in the second period possess latent characteristics that lead them to reduce their investments in human capital, such as training or acquisition of new work experience, during the initial earnings period. In that case the latent term in (2) would be negative. On the other hand, individuals who anticipate reported disabilities might be motivated to augment their earnings during the nondisabled period, perhaps in anticipation of decreased earning capacity in the second period. In that case the latent adjustment to expected second-period earnings would be positive.
At the same time, however, there is reason to believe that self selection might also arise directly from sources that are measurable, in particular earnings in the first period. Earnings in the initial period might be significant in explaining costs or benefits associated with disability. On one hand, individuals who earn high wages are likely to be relatively productive, which implies a commensurately larger opportunity cost associated with work limitations arising from disability. In addition, since disability is often associated with costly physical adaptations and accommodations to activities of daily living, high-earning individuals are better able to finance those challenges, and self-reported disability will again be associated with higher earnings. On the other hand, the low-earning population might disproportionately consist of workers whose wage-based opportunity costs of reduced work activity are relatively low, and for whom the income replacement values of public and private benefits for disability are relatively high. In those cases the propensity to report disabilities would be negatively related to earnings. Both hypotheses are plausible, and resolution of their respective merits is an empirical issue.
These considerations suggest an econometric framework that encompasses selection into self-reported disability on the basis of measured as well as unmeasured traits. We address that in the context of a two equation model. Denote $d_i^*$ as the individual's latent propensity to self-report a work-limiting disability (for brevity, hereafter "report"):
$$d_i^* = \gamma y_i + z_i'\delta + u_i \qquad (3)$$
where $y_i$ denotes first-period earnings from equation (1), $\gamma$ is the coefficient of earnings, $z_i$ is a vector of measured characteristics, $\delta$ is a conformable vector of coefficient parameters, and $u_i$ is the idiosyncratic error term as postulated in equation (2). All variables with the exception of the reporting outcome are measured in the initial period; thus our approach is to compel self selection, if it exists, to manifest itself in the decision to report and not after the fact.
The reporting propensity is not observed. Instead we observe a dichotomous variable $d_i$, equal to one if $d_i^* > 0$ (the individual reports a disability in the second period) and zero otherwise. The error terms $\varepsilon_i$ and $u_i$ are assumed to possess a bivariate normal distribution with zero means, respective variances $\sigma_\varepsilon^2$ and 1, and covariance parameter $\sigma_{\varepsilon u}$.
This model contains two parameters that are informative about self selection. In (3), the coefficient $\gamma$ captures selection based on measured earnings. If $\gamma > 0$, then individuals who experience higher earnings, holding other factors constant, are more prone to report disabilities than those with lower earnings. This reflects positive selection on earnings; if $\gamma < 0$, negative selection on earnings is present.
The covariance parameter $\sigma_{\varepsilon u}$ reflects the presence of latent attributes. If $\sigma_{\varepsilon u} > 0$, then individuals who are high earners in the initial period due to unobserved traits likewise possess unobserved characteristics that make them more likely to report disabilities. If $\sigma_{\varepsilon u} < 0$, negative selection is present: persons with latent earnings advantages are prone to refrain from reporting disabilities. The two phenomena are not mutually exclusive, and resolution of their respective roles is the principal objective of the empirical analysis in Section V.
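To make the roles of the two selection parameters concrete, the following minimal sketch simulates data from a two-equation model of this kind. It is an illustration under stated assumptions, not the authors' procedure: the function name `simulate`, the scalar regressors `x` and `z`, the intercepts, and the value of `sigma_eps` are all hypothetical. Only the earnings coefficient (-0.488) and the covariance (0.220) are taken from the estimates reported in Table II.

```python
import math
import random


def simulate(n, beta0=10.0, beta1=0.5, gamma=-0.488, delta0=3.0, delta1=0.3,
             sigma_eps=0.6, sigma_eps_u=0.220, seed=0):
    """Draw n observations (x, z, y, d) from a two-equation selection model.

    Earnings:    y  = beta0 + beta1 * x + eps,             eps ~ N(0, sigma_eps**2)
    Propensity:  d* = gamma * y + delta0 + delta1 * z + u,   u ~ N(0, 1)
    Observed:    d  = 1 if d* > 0, else 0
    with Cov(eps, u) = sigma_eps_u. All default values except gamma and
    sigma_eps_u are hypothetical and chosen only for illustration.
    """
    if sigma_eps <= 0:
        raise ValueError("sigma_eps must be positive")
    rho = sigma_eps_u / sigma_eps  # correlation of (eps, u), because Var(u) = 1
    if abs(rho) >= 1:
        raise ValueError("need |sigma_eps_u| < sigma_eps")
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        z = rng.gauss(0.0, 1.0)
        e1 = rng.gauss(0.0, 1.0)
        e2 = rng.gauss(0.0, 1.0)
        eps = sigma_eps * e1                              # earnings error
        u = rho * e1 + math.sqrt(1.0 - rho * rho) * e2    # reporting error, unit variance
        y = beta0 + beta1 * x + eps                       # first-period (log) earnings
        d_star = gamma * y + delta0 + delta1 * z + u      # latent reporting propensity
        rows.append((x, z, y, 1 if d_star > 0 else 0))
    return rows


if __name__ == "__main__":
    sample = simulate(20000)
    share = sum(d for _, _, _, d in sample) / len(sample)
    print(f"share reporting a disability: {share:.3f}")
```

Changing the sign of `gamma` or `sigma_eps_u` in such a simulation shows how selection on measured earnings and latent selection can operate in opposite directions within the same population.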
Our empirical strategy is to estimate equations (1) and (3) jointly by the method of maximum likelihood after reformulating the model as a pair of reduced form equations. Substituting (1) into (3) yields a reduced form equation for the decision to self-report:
$$d_i^* = \gamma x_i'\beta + z_i'\delta + \gamma\varepsilon_i + u_i \qquad (4)$$
The sample is partitioned into individuals who report disabilities in the second period ($d_i = 1$) and those who do not ($d_i = 0$). For individual i, the univariate density of the error term in equation (1) is
$$f(\varepsilon_i) = \frac{1}{\sigma_\varepsilon}\,\phi\!\left(\frac{y_i - x_i'\beta}{\sigma_\varepsilon}\right)$$
where $\phi$ denotes the density function of a standard normal distribution. For the population of self-reporters, we derive the conditional density (5), which is expressed in terms of $\Phi[A]$, where $\Phi$ denotes the cumulative distribution function associated with the standard normal distribution and $A$ denotes the argument in brackets.
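As a hedged sketch of where a term of the form $\Phi[A]$ can come from, the derivation below uses only the bivariate normality assumption stated earlier. The notation is an assumption adopted for illustration: $\varepsilon_i$ is the earnings error with variance $\sigma_\varepsilon^2$, $u_i$ is the reporting error with unit variance, $\sigma_{\varepsilon u}$ is their covariance, and $\gamma$ and $\delta$ are the coefficients of earnings and of the measured characteristics $z_i$. The paper's own expression may be arranged differently.

```latex
\begin{align*}
u_i \mid \varepsilon_i
  &\sim N\!\left(\frac{\sigma_{\varepsilon u}}{\sigma_\varepsilon^{2}}\,\varepsilon_i,\;
     1 - \frac{\sigma_{\varepsilon u}^{2}}{\sigma_\varepsilon^{2}}\right), \\[4pt]
\Pr(d_i = 1 \mid y_i, x_i, z_i)
  &= \Pr\!\left(u_i > -(\gamma y_i + z_i'\delta) \,\middle|\, \varepsilon_i\right)
   = \Phi\!\left[\frac{\gamma y_i + z_i'\delta
       + (\sigma_{\varepsilon u}/\sigma_\varepsilon^{2})\,\varepsilon_i}
       {\sqrt{1 - \sigma_{\varepsilon u}^{2}/\sigma_\varepsilon^{2}}}\right]
   \equiv \Phi[A_i], \\[4pt]
f(y_i, d_i = 1)
  &= \frac{1}{\sigma_\varepsilon}\,
     \phi\!\left(\frac{\varepsilon_i}{\sigma_\varepsilon}\right)\Phi[A_i],
  \qquad \varepsilon_i = y_i - x_i'\beta .
\end{align*}
```

The first line is the standard conditional distribution of one component of a bivariate normal given the other; the second follows because $d_i^* > 0$ is equivalent to $u_i > -(\gamma y_i + z_i'\delta)$ once $y_i$ is given.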
For the population of those who do not report disabilities, the conditional density is analogous to (5), with $1 - \Phi[A]$ in place of $\Phi[A]$. Accordingly, the likelihood function (6) for the entire sample, consisting of D individuals reporting disabilities and N individuals who are nondisabled, is given by the product of the respective conditional densities over the two groups. In Section V we use the sample data to maximize the likelihood function in order to obtain estimates of (1) and (3) along with the associated variance and covariance parameters.
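A likelihood of this kind can be evaluated in a few lines. The sketch below is a minimal illustration, assuming scalar regressors `x` and `z` with intercepts and the standard factorization of the joint density into the marginal density of the earnings error and the conditional probability of reporting. The function name `log_likelihood` and its argument names are hypothetical; the code only evaluates the log-likelihood at given parameter values and does not perform the maximization.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution, used for its CDF


def log_likelihood(data, beta0, beta1, gamma, delta0, delta1, sigma_eps, sigma_eps_u):
    """Joint log-likelihood of earnings y and the reporting indicator d.

    data is an iterable of (x, z, y, d) tuples, where d is 1 for individuals
    who report a disability and 0 otherwise. Returns -inf when the variance
    and covariance parameters do not define a valid bivariate normal.
    """
    if sigma_eps <= 0:
        return -math.inf
    cond_var = 1.0 - sigma_eps_u ** 2 / sigma_eps ** 2   # Var(u | eps)
    if cond_var <= 0:
        return -math.inf
    slope = sigma_eps_u / sigma_eps ** 2                 # E(u | eps) = slope * eps
    total = 0.0
    for x, z, y, d in data:
        eps = y - beta0 - beta1 * x                      # earnings residual
        # log of (1 / sigma_eps) * phi(eps / sigma_eps)
        log_f = (-0.5 * math.log(2.0 * math.pi) - math.log(sigma_eps)
                 - 0.5 * (eps / sigma_eps) ** 2)
        # argument A of the conditional reporting probability
        a = (gamma * y + delta0 + delta1 * z + slope * eps) / math.sqrt(cond_var)
        prob = _STD_NORMAL.cdf(a) if d == 1 else _STD_NORMAL.cdf(-a)
        total += log_f + math.log(max(prob, 1e-300))     # floor guards against log(0)
    return total
```

In practice the returned value would be passed (with its sign reversed) to a numerical optimizer; the choice of optimizer and starting values is left open here.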
As noted above, the research objective advanced here emphasizes self selection based on both observed and unobserved factors. This has appeared in areas of research outside the disability literature. For example, the role of observable and unobservable factors in the process of endogenous selection has been discussed in the literature on program evaluation. Studies by economists in that literature have been concerned with labor market programs in which individuals who lack job skills or stable work histories receive training or other "treatments" intended to strengthen their employment prospects.
Early examples and extensions in this area include Ashenfelter (1978), Barnow, Cain and Goldberger (1980), and Heckman and Hotz (1989). Our model is a variation of this idea, entailing selection into self-reported disability status directly as a result of observed earnings and indirectly on the basis of covariation between latent earnings and the propensity to report disabilities.
As described in this section, the structural model necessitates a sample of employed males during the first period. Consequently this design precludes investigation of the effects of joblessness or job loss on self-reported disability. Examples of research addressing that question include Hotchkiss (2003, 2004). In addition, we restrict the analysis to males because of the generally higher labor force attachments of men. Inclusion of females would require addressing the familiar problem of nonrandom selection of females into employment (Heckman, 1979), which is beyond the scope of the immediate question in this paper.

Data and Model Specification
Explanatory variables in the estimation include the presence of young children in the household, home ownership, and the proportion of the population enrolled in Medicaid for the state in which the individual resides. Inclusion of the latter variable is due to the fact that individuals receiving SSDI benefits are eligible for the Medicaid program after two years of benefit receipt. In the empirical analysis, the earnings variable is expressed as a natural logarithm.
We restrict the sample to men who are married with spouse present in the initial period. To include single men in the sample creates the task of modelling the joint occurrence of marital status and disability incidence, which substantially extends the scope of the model described in Section III. In addition, focusing on married men is of interest in its own right. In households with married men who are wage and salary earners, those men are often significant if not the predominant sources of income in the family. Consequently, in cases where they transition to disability, the implications are likely to be significant, both for the well-being of families and for considerations of public policy. In the empirical estimates, we estimate the model for two alternative measures of income, i.e., the variable denoted y in equations (1) and (3): annual wage earnings and annual personal income. We likewise consider two indicators of the disability transition: self-reported disability, and an indicator equal to one if the individual transitions from zero to positive SSDI income and zero if reported SSDI income is zero in both periods. The upshot of these extensions is that we report estimates for four versions of the model, corresponding respectively to combinations of the two measures of individual income and two indicators of the disability transition.

Descriptive Statistics
Variable definitions and sample means are presented in Table I, which partitions the sample by disability outcome. The sample means reveal that men who are not disabled in the initial year (2002, …, 2009) and who report disabilities by the following year are slightly older and have lower earnings and personal incomes in the initial year than those who remain nondisabled. They also report lower levels of home ownership and other family income. Turning to the estimates of equations (1) and (4), the principal parameters of interest are the coefficient of earnings in the disability equation, $\gamma$, and the covariance between random error terms in equations (1) and (4), $\sigma_{\varepsilon u}$. The estimates in Table II ($\gamma$ = -0.488, $\sigma_{\varepsilon u}$ = 0.220) indicate that the parameters are opposite in sign and strongly significant.

Principal Estimates: Self-Reported Disability Based on Wage Earnings
The earnings coefficient confirms that earnings tend to reduce the incidence of reported disability. The covariance parameter suggests positive latent selection into reported disability: after controlling for measured earnings and other background attributes, men who possess unobserved positive earnings attributes are characterized by unmeasured propensities to report disability.
The positive covariance invites discussion. While we know of no previous attempts in the literature to estimate this parameter in the context of a model that explicitly controls for measured earnings, some authors have suggested what is equivalent to a positive covariance. Blanck, Schwochau, and Song (2003, p. 322), for example, point out that workers with disabilities face substantial obstacles in their requests for workplace accommodations under the ADA. They must simultaneously establish that they are limited in their work capacities but nonetheless qualified for the work in question. In addition, as Goodman and Waidman (2003, p. 357) point out, to the extent that high earnings are associated with greater education attainments, these workers are also more likely to be denied SSDI benefits, thus necessitating participation in the labor force. Once employed, and in order to receive accommodations, they are as a matter of course strongly incentivized to report themselves as disabled. In the population of working people, from which the sample for this study is drawn, it is therefore plausible that these self-reporting individuals possess unmeasured traits that are, ceteris paribus, associated with higher earnings.
Studies that focus on the simultaneous occurrence of reported work limitations and labor force participation include Kreider (1999) and Webber and Bjelland (2015). In both cases, their models attempt to control for correlation between unobserved factors in the two equations. Webber and Bjelland (2015) report positive correlations for respective samples of men. Kreider (1999), on the other hand, obtains negative correlations for samples again partitioned by gender, although his estimates do not significantly differ from zero.
Moreover, working individuals who possess latent strengths in earnings will tend to be in advantageous bargaining positions with their employers with respect to ADA accommodations. Characterized by high Marginal Revenue Products (MRP) relative to their nonworking disabled counterparts, they might expect their employers to more readily acquiesce. That is because employers will have better opportunities to exploit those high MRPs as a means of recovering their shares of the costs associated with accommodation. Since workers must declare themselves disabled to initiate the bargaining process, the plausible result is a positive covariance between random error terms in the earnings and disability-reporting equations.

Estimates for Self-Reported Disability Based on Personal Income
As noted above, recognizing that individuals generate income from sources other than wage earnings, we repeated the estimations using the log of individuals' annual personal incomes as the dependent variable in the income equation. Estimates for the control variables are largely consistent in both magnitude and significance with those in Table II. Possible exceptions are the estimated coefficients of education and age in the probit equation, which become smaller in magnitude while remaining strongly significant.
Estimates of the essential parameters in equations (1) and (3), $\gamma$ and $\sigma_{\varepsilon u}$, are presented in the second column of Table III. For ease of comparison, in column one the table reproduces the corresponding estimates from Table II. Estimates for the full model are omitted from Table III for the sake of brevity but are available from the authors on request. Thus the principal inferences in Table II, positive latent selection and negative selection based on income, are robust with respect to the use of personal income instead of the more narrowly defined wage earnings.

Additional Estimates: Transition to SSDI
It is common in the literature to define disability in terms of what individuals self-report to surveys or other data sources. To narrow that definition, we estimated the model with the dependent variable in the probit model defined as one if the individual transitions from zero SSDI income to positive SSDI income. This poses an empirical challenge, since the relative rarity of SSDI onset significantly reduces sample variation on the left hand side of the probit equation. In our sample, the proportion of individuals who self-report disability between periods t-1 and t is 0.029, whereas the proportion who transition from zero to positive SSDI income is 0.0004. The consequence of this decrease in variation in the dependent variable of the probit model is a loss of precision in the parameter estimates. That is borne out in the maximum likelihood estimates of the model when it is estimated with the more narrowly defined dependent variable. Estimated coefficients of the control variables, not presented here for the sake of brevity but available from the authors on request, are less precise and in many cases not significantly different from zero. Yet estimates of the self selection parameters remain significant, or nearly so, and they are similar in magnitude to their counterparts from the model based on self-reported disability. Those estimates are presented in the third and fourth columns of Table III. Column three is based on wage earnings as the measure of income, while column four is based on personal income. In both cases, the results continue to lend general support to this paper's central hypotheses: disability onset tends to be negatively selective in terms of measured earnings or income and positively selective in terms of unobserved individual heterogeneity.
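A rough sense of the precision loss can be obtained from a deliberately simplified calculation. The sketch below considers the estimation of a binomial proportion $p$ from $n$ independent observations, which is an illustrative stand-in and not the paper's probit estimator. The symbols $p_1 = 0.029$ and $p_2 = 0.0004$ denote the two transition proportions reported in the text, and $n$ is held fixed, so it cancels from the ratio.

```latex
\[
\frac{\mathrm{se}(\hat p)}{p} = \sqrt{\frac{1-p}{n\,p}},
\qquad
\frac{\mathrm{se}(\hat p_2)/p_2}{\mathrm{se}(\hat p_1)/p_1}
  = \sqrt{\frac{(1-p_2)/p_2}{(1-p_1)/p_1}}
  = \sqrt{\frac{0.9996/0.0004}{0.971/0.029}}
  = \sqrt{\frac{2499}{33.48}}
  \approx 8.6
\]
```

Under this simplification, the relative standard error for the rarer outcome is between eight and nine times larger at the same sample size, which is in line with the less precise coefficient estimates described above.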

The Mitigating Role of Labor Force Attachment
One way to test the robustness of these estimates is to account directly for the extent of individuals' labor force attachments in the initial (nondisabled) period. Workers who anticipate onset of disabilities, due perhaps to developing adverse health conditions, are likely to be characterized by reduced work commitments in periods leading up to the decision to self-report. Thus we might expect that individuals who report more work hours in the initial period are less inclined to report disabilities in the second period. Consequently, the measures of self selection in our model are expected to be of less import if the model is augmented to include a measure of initial work hours. To test this possibility, we estimated the model in equations (1) and (3)

Summary and Conclusion
In the voluminous literature on disability and the labor market, numerous researchers have focused attention on the effects of disability on outcomes such as income or earnings. A smaller literature has investigated the reverse phenomenon, namely the effect of income on reported disability. In the latter studies, it has been customary to treat income variables as exogenous or predetermined. Moreover, the literature has devoted far less attention to the phenomenon of latent selection into disabled status. This paper departs from that tradition in both respects. Treating annual earnings or, alternatively, annual personal income as jointly determined with disability outcomes, it offers evidence of negative overt selection: high earning men appear less prone to report themselves as disabled, a conclusion that is sustained when disability is defined strictly in terms of receipt of public disability insurance income. In addition, we estimate the earnings/income effect in the context of a model that controls for selection on the basis of unobserved characteristics. The latent selection inference is of interest in its own right, suggesting that men who are strong earners in the latent dimension are more prone to report themselves as disabled.
We reconcile these opposing effects by appealing to the argument that working individuals who possess latent strengths in earnings will tend to be in advantageous bargaining positions with their employers in seeking accommodation under the requirements of the ADA. These workers have more to gain from accommodation, and hence see it in their self interest to report themselves as disabled in order to become eligible for the ADA's mandates. Since workers must of course be disabled to initiate the bargaining process, the plausible result is a positive covariance between random error terms in the earnings and disability-reporting equations.
An important additional finding in this study is the mediating role of labor force attachments. When the model controls for workers who are engaged in relatively greater hours worked during the initial period, the evidence for latent self selection is diminished. Even in that case, however, the evidence remains suggestive of the direct negative effect of wages or income on reported disability.
This study is the first in the disability literature to incorporate both phenomena in one model, which we argue is an important step in understanding choices made in the disabled population. The evidence presented here invites additional research. An important limitation of this study is its (necessary) reliance on a two year time frame for observing and measuring disability. In addition, this study is limited to earnings and disability outcomes for men who are initially employed and thus able to furnish data on earnings and income. Although it would be useful to extend the analysis to females and nonworkers, that is complicated by the well known empirical challenge of self selection among individuals with weaker attachments to the labor force (Heckman, 1979). Thus, in the context of our model, employment decisions by women and male non-workers add another dimension of endogenous selection that has to be addressed in the likelihood function (6), a task that is beyond the scope of this paper. Another useful but challenging extension would be to include single men in the sample and model joint changes in status with respect to both marriage and disability, as we noted in Section IV. Another shortcoming is the absence in our data of variables that are thought to be antecedents to disability, such as health conditions and preexisting impairments that are present before individuals enter the sample. To the extent that these precursors are reflective of and proxies for unmeasured attributes of individuals, our model addresses them by explicit parameterization of latent selection in the model. As future research evolves, this study points to the importance of addressing both dimensions of the transition to disabled status, an issue that is of undeniable importance for public policy in the United States.