Capital Punishment and Deterrence: Understanding Disparate Results
Steven N. Durlauf* Chao Fu* Salvador Navarro**
*University of Wisconsin at Madison **University of Western Ontario
12/15/11 Draft
Abstract
The panel data literature on deterrence and capital punishment contains a wide range of empirical
claims despite the use of common data sets for analysis. We interpret the diversity of findings in
the literature in terms of differences in statistical model assumptions. Rather than attempt to determine a “best” model from which to draw empirical evidence on deterrence and the death penalty, this paper asks what conclusions about deterrence may be drawn given the presence of model uncertainty, i.e. uncertainty about which statistical assumptions are appropriate. We consider four sources of model uncertainty that capture some of the economically substantive differences that appear across studies. We explore which dimensions of these assumptions are important in generating disparate findings on capital punishment and deterrence from a standard county-level crime data set.
Corresponding Author: Durlauf, Department of Economics, University of Wisconsin, Madison WI 53706, email@example.com. We thank Timothy Conley and David Rivers for many helpful insights. Durlauf thanks the University of Wisconsin Graduate School and Vilas Trust for financial support. Hon Ho Kwok and Xiangrong Yu have provided outstanding research assistance.
1. Introduction
The effectiveness of capital punishment in deterring homicides, despite decades of empirical work, remains very unclear. This is so despite the fact that the Supreme Court’s moratorium on capital punishment and the subsequent adoption of capital punishment by a subset of states, combined with very different rates of execution in those polities with capital punishment, would seem to provide an ideal environment for identifying the magnitude of deterrence effects using panel data methods. Focusing on post-moratorium studies, one can find papers that argue that post-moratorium data reveal large deterrent effects (Dezhbakhsh, Rubin and Shepherd (2003), Zimmerman (2004)), fail to provide evidence of a deterrent effect (Donohue and Wolfers (2005), Durlauf, Navarro, and Rivers (2010)), or provide a mixture of positive deterrence and negative deterrence (brutalization) effects depending on the frequency of execution (Shepherd (2005)).
The presence of disparate results on the deterrent effect of capital punishment is not, by itself, surprising. Social scientists have long understood that the data “do not speak for themselves” and so empirical analyses that involve substantive social science questions, such as the measurement of deterrence, can only address them conditional on the choice of a statistical model. The disparate findings in the capital punishment literature reflect this model dependence. This is true even when one conditions on the modern panel literature, in which the various models typically represent statistical instantiations of Becker’s (1968) rational choice model of crime. Thus, a common basis for understanding criminal behavior, in this case murder, is compatible with contradictory empirical findings because of the nonuniqueness of the mapping of the underlying behavioral theory to a statistical representation suitable for data analysis.
This paper is designed to understand the sources of the disparate literature findings. Specifically, we consider how different substantive assumptions about the homicide process affect deterrence effect estimates. From our vantage point, alternate models of the homicide process are the result of different combinations of assumptions.
Since there is no a priori reason to assign probability 1 to any of the models that have been studied in the literature or our own new models in this paper, our approach respects and tries to constructively address the fact that the evaluation of the deterrent effect of capital punishment constitutes a context in which one must account for the presence of model uncertainty.
The closest predecessor to this paper is Cohen-Cole, Durlauf, Fagan, and Nagin (2009), which employed model averaging methods to adjudicate the different findings of Dezhbakhsh, Rubin and Shepherd (2003) and Donohue and Wolfers (2005). Cohen-Cole, Durlauf, Fagan and Nagin were especially concerned to illustrate how model averaging could address the problem of different papers coming to diametrically opposite conclusions because of minor differences in model specification. As such, the purpose of the exercise was, to a major extent, the integration of disparate deterrence estimates across papers into a single number. Cohen-Cole et al. were thus able to draw conclusions about the disagreement between Dezhbakhsh, Rubin and Shepherd and Donohue and Wolfers, finding that intermodel uncertainty was sufficiently great that no firm inferences about deterrence were possible for the data set under study.
In contrast, the current paper takes a broader view of the capital punishment and deterrence literature in attempting to understand why the literature has generated very disparate results. We explore a broader set of modeling assumptions so that a more general, and we think basic, set of sources of disagreement across papers is considered when evaluating what the data reveal about capital punishment and deterrence. Further, we consider models that have not previously appeared but, for theoretical reasons, in our judgment should be part of the model space. Finally, unlike Cohen-Cole, Durlauf, Fagan and Nagin, we are not primarily interested in reducing the model-specific estimates of deterrence down to a single number. Instead, our goal is to understand which substantive assumptions matter in determining the sign, magnitude and precision of various model-specific deterrent effects.
We explore four sources of model uncertainty. Each, we believe, represents a fundamental issue in modeling the homicide process from which deterrence estimates are obtained. In all cases our object of interest, the deterrent effect, is measured as the marginal number of lives saved by an additional execution. This is a purely statistical definition. We are not concerned with efforts such as Shepherd (2005) to draw conclusions about distinct behavioral mechanisms that distinguish brutalization effects from deterrence. The reason for this is that the statistical models we study only reveal information (at best) on the net effects of capital punishment and so do not identify distinct behavioral mechanisms.
First, we consider differences in the specification of the stochastic process for unobserved heterogeneity in individual choices as to whether or not to commit a murder.
As argued in Durlauf, Navarro, and Rivers (2010), the standard linear regression model used for deterrence regressions places very strong restrictions on the process for unobserved heterogeneity in the payoff differential between the choices to commit and not to commit a murder. These restrictions are well known to derive from the requirement that the probability that a given individual at a given point in time commits a murder lies between zero and one. We contrast this specification of unobserved heterogeneity with the logistic error assumption that is standard in discrete choice models. While this might seem to be an arcane technical issue, in fact it matters a great deal whether one works with a linear probability or a logistic probability specification because of differences in the implied restrictions on those determinants of the homicide choice that an analyst cannot observe.
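The contrast between the two specifications can be illustrated with a minimal sketch. It is not the estimation code used in this paper: the index values are hypothetical, and the function names are introduced only for illustration. The sketch shows that a linear probability specification maps the payoff index directly into a probability, so nothing keeps the implied probability inside the unit interval, whereas the logistic specification always returns a value strictly between zero and one.

```python
import numpy as np

def linear_probability(index):
    """Linear probability model: the implied probability equals the index itself."""
    return np.asarray(index, dtype=float)

def logistic_probability(index):
    """Logistic model: the implied probability is the logistic CDF of the index."""
    index = np.asarray(index, dtype=float)
    return 1.0 / (1.0 + np.exp(-index))

# Hypothetical values of the payoff index (assumed for illustration only).
index = np.array([-0.2, 0.0, 0.5, 1.3])

lpm = linear_probability(index)
logit = logistic_probability(index)

# Check which implied probabilities are admissible.
lpm_valid = (lpm >= 0.0) & (lpm <= 1.0)
logit_valid = (logit > 0.0) & (logit < 1.0)

print("LPM probabilities:     ", lpm, "admissible:", lpm_valid)
print("Logistic probabilities:", logit, "admissible:", logit_valid)
```

In the linear case, admissibility has to be imposed through restrictions on the unobserved heterogeneity, which is the sense in which the two specifications embody different assumptions about what the analyst cannot observe.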
Second, we consider model uncertainty with respect to the specification of probabilities of punishment as determinants of individual choices. Here we follow Durlauf, Navarro and Rivers (2010) in contrasting the probabilities that naturally derive from a crime choice problem, probabilities which in fact appear in Ehrlich (1975), with those that have become standard in the literature. In terms of statistical models, this amounts to asking whether joint or conditional probabilities involving apprehension, receipt of a death sentence, and the carrying out of a death sentence, are the appropriate regressors in controlling for the uncertainty facing a potential murderer with respect to possible punishments if a murder is committed. As shown in Durlauf, Navarro and Rivers (2010), the joint probabilities are those suggested by rational choice models of crime. While we are unaware of a decision framework that implies that conditional probabilities should appear additively, as is the empirical standard in the recent literature, we certainly cannot rule out such a decision process and so treat the empirical convention as an alternative to the rational choice model.
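The difference between the two sets of regressors can be made concrete with a small sketch. The numerical values are hypothetical, and the function name is introduced only for illustration; the sketch assumes the usual chain of events in which a death sentence requires apprehension and an execution requires a death sentence, so that each joint probability is a product of the conditional probabilities along the chain.

```python
def joint_punishment_probabilities(p_apprehend, p_sentence_given_apprehend,
                                   p_execute_given_sentence):
    """Convert conditional punishment probabilities into joint probabilities.

    Assumes sentencing requires apprehension and execution requires a
    death sentence, so the joint events are nested.
    """
    p_a = p_apprehend
    p_a_s = p_a * p_sentence_given_apprehend      # apprehended and sentenced
    p_a_s_e = p_a_s * p_execute_given_sentence    # apprehended, sentenced, executed
    return p_a, p_a_s, p_a_s_e

# Hypothetical conditional probabilities (not estimates from any data set).
conditional = (0.6, 0.05, 0.1)
joint = joint_punishment_probabilities(*conditional)

print("Conditional regressors:", conditional)
print("Joint regressors:      ", joint)
```

Because the joint probabilities are products, a change in the apprehension probability moves all three joint regressors at once, while in the additive conditional specification it moves only one regressor.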
Third, we consider model uncertainty from the vantage point of heterogeneity in the deterrent effect of capital punishment, contrasting the assumption in Shepherd (2005) that the deterrent effect varies across states with the standard assumption in the literature that the same parameters apply to all states. The model uncertainty we consider contrasts two cases: one in which the deterrent effect is state-specific and one in which the deterrent effect is constant across states. This form of model uncertainty addresses concerns that have been raised that the skewed distribution of executions across US states renders the interpretation of a single deterrent effect problematic. The most prominent example of this concern is Berk (2005), which questions claims of a deterrence effect from interstate data because of the concentration of executions in a few states, and their absence in many state-year observation pairs. As stated, Berk’s criticism is not an obvious a priori objection to the claims of the deterrence literature. For example, the success of a polio vaccine based on trials in Texas would not be regarded as self-evidently meaning that the vaccine’s efficacy in other states is uninformed by the Texas data. In other words, Berk’s argument has import only if the concentration in a few states means that extrapolation of findings from death penalty states to other states is not warranted.[1] Shepherd’s introduction of state-specific execution effects is a constructive way to formalize the concern over extrapolability and indeed does so in a fashion that is more general than what we interpret as Berk’s criticism.[2]
Fourth, we address model uncertainty in the handling of cases in which a given polity-time pair exhibits 0 murders. Once the murder rate is understood as a probability, standard models have difficulty in predicting 0 murders.[3] For the most part, the literature (e.g. Dezhbakhsh, Rubin and Shepherd (2003), Mocan and Gittings (2003), Zimmerman (2004), Donohue and Wolfers (2005)) has sidestepped this difficulty by using aggregate versions of the linear probability model that do not restrict predicted murder rates to lie in the [0,1] interval. By using linear regression models with the murder rate on the left-hand side, the polity-time pairs with zero murders can simply be included as part of the model. However, as shown in Durlauf, Navarro and Rivers (2010), once a properly specified probability model is used as the basis to construct the estimating equation, the left-hand side variable (an appropriately transformed version of the murder rate) is no longer defined for zero-murder observations. To deal with this problem, under an assumption of selection on observables, Durlauf, Navarro and Rivers drop these zero-murder observations from the analysis. In this paper we additionally consider what happens when, in models based on the linear probability model, one confronts the zero-murder observations by not including them in the analysis, even though mechanically they could be included. More importantly, we generalize (for both linear and non-linear probability models) the analysis to allow for selection on unobservables. To do so, we use a control function approach to account for the possibility that polity-time observations with zero murders differ from those with positive murders in ways that are unobservable to the econometrician.
[1] The treatment effect literature takes a much more sophisticated view of heterogeneity of effects of a given policy across individuals; see Abbring and Heckman (2007) for a review.
[2] Berk estimates a nonparametric regression relating number of executions to homicides. State-specific effects thus relax the assumption that two states with similar numbers of executions should exhibit the same deterrent effect.
[3] Note that this is a distinct concern from whether political unit-time pairs with zero homicides are or are not exchangeable with those that have non-zero homicides, which is subsumed in our third level of model uncertainty.
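The zero-murder problem described above can be seen in a short sketch. It assumes, purely for illustration, that the transformed left-hand side variable is the log-odds of the murder rate, as a logistic model would imply; the counts and populations are hypothetical and the helper name is not taken from any of the cited papers.

```python
import numpy as np

def log_odds(rate):
    """Return log(rate / (1 - rate)), with NaN where the transform is undefined."""
    rate = np.asarray(rate, dtype=float)
    out = np.full(rate.shape, np.nan)
    defined = (rate > 0.0) & (rate < 1.0)
    out[defined] = np.log(rate[defined] / (1.0 - rate[defined]))
    return out

# Hypothetical polity-time observations (not the county-level data set).
murders = np.array([0, 3, 12, 0, 7])
population = np.array([8_000, 50_000, 200_000, 5_000, 90_000])

rate = murders / population      # usable directly in a linear probability model
y_logit = log_odds(rate)         # undefined wherever murders == 0
usable = ~np.isnan(y_logit)

print("Murder rates:        ", rate)
print("Log-odds transform:  ", y_logit)
print("Usable observations: ", int(usable.sum()), "of", usable.size)
```

Dropping the undefined rows is innocuous only under selection on observables; if zero-murder observations differ in unobserved ways, the retained sample is selected, which is the case the control function approach is meant to handle.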
Section 2 of the paper outlines the way we think about model uncertainty. Section 3 discusses data. Section 4 describes the model space we employ to evaluate the deterrence effects of capital punishment. Section 5 provides empirical results on deterrence effects across models. Following the literature, we place particular emphasis on net lives saved per execution. Section 6 concludes.
2. Model uncertainty: basic ideas
In this section, we discuss the basic idea of model uncertainty. The intuitive idea undoubtedly has been understood throughout the history of statistics, but the way we treat the problem appears to trace back to Leamer (1978). Constructive approaches for addressing model uncertainty have been an active area of research in the last 15 years with particular interest in model averaging, which is a principled procedure for combining information across models. Draper (1995) provides a still useful conceptual discussion, Raftery, Madigan, and Hoeting (1997) is a seminal contribution on the implementation of model averaging in linear models and Hoeting, Madigan, Raftery and Volinsky (1999) and Doppelhofer (2008) are useful overviews. Our own approach is similar in spirit to Leamer (1983) in that we are interested in exploring how different assumptions affect deterrence estimates, although we do not endorse the extreme bounds approach to assessing whether an empirical claim is robust or fragile with respect to a given data set.
The basic idea of model uncertainty is simple to explain. Suppose one wants to calculate a deterrence measure D. Conventional frequentist approaches to empirical work produce estimates of this measure given available data d and the choice of a model m:
$$\hat{D}_{d,m} \qquad (1)$$
Conventional analyses of deterrence estimates derive from the choice of m in (1).
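The dependence of the estimate in (1) on the choice of m can be illustrated with a minimal simulated sketch. Everything in it is an assumption made for illustration: the data are simulated rather than drawn from any crime data set, the two "models" differ only in whether a control variable is included, and the names `estimate_D` and `model_space` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000

# Simulated data d: z is a control correlated with the policy variable x.
z = rng.normal(size=n)
x = 0.8 * z + rng.normal(size=n)
y = -1.0 * x + 2.0 * z + rng.normal(size=n)   # true effect of x on y is -1

def estimate_D(y, regressors):
    """OLS coefficient on the first regressor, with an intercept included."""
    X = np.column_stack([np.ones(len(y))] + list(regressors))
    coef = np.linalg.lstsq(X, y, rcond=None)[0]
    return coef[1]

# A two-element model space: same data, different statistical assumptions.
model_space = {
    "m1: omit control": [x],
    "m2: include control": [x, z],
}

estimates = {name: estimate_D(y, regs) for name, regs in model_space.items()}
for name, value in estimates.items():
    print(f"{name}: estimated effect = {value:.3f}")
```

The same data yield markedly different estimates under the two models, which is why reporting a single model-specific estimate understates the uncertainty a reader faces.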