Journal Articles
(Some of the programs related to my papers are available on the page “Software Programs”)
Difference-in-Differences Estimators of Intertemporal Treatment Effects
We consider treatment effect estimators with a panel of groups under parallel trends. We propose event-study estimators of the effect of being exposed to a weakly higher treatment dose for ℓ periods. We also propose normalized estimators that estimate a weighted average of the effects of the current treatment and its lags. Contrary to usual estimators, ours remain valid when the treatment effect is heterogeneous, and in designs other than staggered adoption.
Partially Linear Models under Data Combination
We consider partially linear models when the outcome of interest and some of the covariates are observed in two different datasets that cannot be linked. We obtain a simple characterization of the identified set and develop tractable inference. We apply our results to intergenerational income mobility in the US, without relying on exclusion restrictions behind the two-sample TSLS, unlike previous approaches.
Difference-in-Differences Estimators with Continuous Treatments and no Stayers
Applied researchers usually rely on two-way fixed effect regressions to estimate the effect of a continuous treatment. However, such estimators are not robust to heterogeneous treatment effects and rely on the linearity of treatment effects. We propose estimators that do not impose those restrictions, and that can be used even if the treatment of all units changes from one period to the next.
Testing and Relaxing the Exclusion Restriction in the Control Function Approach
In a nonparametric triangular structure typical of the control function literature, and provided the instrument satisfies a “local irrelevance” condition, we first show that we can test the exclusion restriction. Second, if the “instrument” also directly affects the outcome variable, we show that the identification of average causal effects can be achieved in linear random coefficients models and single index models.
(Find here Y. Sasaki’s webpage for the Stata command testex implementing our test of the exclusion restriction)
A Robust Permutation Test for Subvector Inference in Linear Regressions
We develop a new permutation test for inference on a subvector of coefficients in linear models. The test is exact when the regressors W and the error terms u are independent, but also consistent under, mainly, two conditions: (i) a slight reinforcement of cov(W, u)=0; (ii) the number of strata, defined by values of the regressors not involved in the subvector test, is small compared to the sample size.
Two-Way Fixed Effects and Differences-in-Differences with Heterogeneous Treatment Effects: A Survey
Linear regressions with period and group fixed effects are widely used to estimate policies’ effects. Yet, those regressions may produce misleading estimates, if the policy’s effect is heterogeneous between groups or over time. This survey reviews a fast-growing literature documenting this issue and proposing alternative estimators robust to heterogeneous effects. We use those alternative estimators to revisit Wolfers (2006).
Two-way Fixed Effects Regressions with Several Treatments
We extend our previous paper on two-way fixed effect regressions to cases with multiple treatments.
Note: the new version of our Stata command twowayfeweights (downloadable on SSC) now allows for multiple treatments.
Fixed Effects Binary Choice Models with Three or More Periods
We show that the impossibility result of Chamberlain for fixed effects binary choice models (with iid errors, the slope parameter is point identified only with logistic errors) only holds with T=2. With three or more periods, we exhibit a family of distributions for which point identification can be achieved. Identification leads to a GMM estimator, which reaches the semiparametric efficiency bound when T=3. We apply our method to revisit Brender and Drazen (2008).
An alternative to synthetic control for models with many covariates under sparsity
The synthetic control program does not admit a unique solution when n, the number of control units, is larger than p, the dimension of the variables used to construct the synthetic control. We propose a method that solves this issue and enjoys a double robustness property. We then extend it to the case where p is potentially larger than n, and show root-n consistency and asymptotic normality of the corresponding ATT estimator.
Nonparametric Difference-in-Differences in Repeated Cross-Sections with Continuous Treatments
We extend the change-in-change model of Athey and Imbens (2006) to continuous treatments. We require an exogenous change over time of the distribution of the treatment, inducing a crossing in the corresponding cdfs. This crossing plays the role of a “control group”, and allows us to identify average and quantile treatment effects in a nonparametric way.
Working paper version here
The Marcinkiewicz-Zygmund law of large numbers for exchangeable arrays
We show a Marcinkiewicz-Zygmund law of large numbers (both in L^r and a.s.) for jointly exchangeable and dissociated arrays. As a result, we obtain a law of the iterated logarithm for such arrays under a weaker moment condition than the existing one.
Empirical Process Results for Exchangeable Arrays
We show the weak convergence of empirical processes and some bootstrap processes with multiadic (e.g., dyadic) data or multiway clustering. These results imply asymptotic normality and the validity of the bootstrap for a large class of nonlinear estimators. We illustrate our results with trade data.
The latest version of the WP can be found here: https://arxiv.org/abs/1906.11293
(this paper supersedes our previous paper “Asymptotic results under multiway clustering”).
Rationalizing Rational Expectations: Characterization and Tests
We construct the best possible test of rational expectations (RE) when we observe expectations on a future variable on a certain sample, and realizations of that variable on another sample of different units. We apply our methodology to test for RE about future earnings.
You can find the corresponding R package here: https://github.com/cgaillac/RationalExp
Note: this version supersedes “Rationalizing Rational Expectations: Tests and Deviations” (v1 on arXiv), which also considers deviations from RE.
Segregsmall: A command to estimate segregation in the presence of small units
The Stata command Segregsmall estimates usual segregation indices (e.g. Duncan or Theil) when units (geographical areas, firms, classrooms…) are small. In such cases, naive estimators are biased upwards. The command computes in particular the estimators described in R. Rathelot’s JBES paper and in our joint QE paper.
The Provision of Wage Incentives: A Structural Estimation Using Contracts Variation
To what extent do people react to incentives? Are observed contracts (nearly) optimal? We answer these questions using a nonparametric principal-agent model and an exogenous variation in contracts between the French national institute of statistics and its interviewers.
Estimating Selection Models without Instrument with Stata
This paper presents the Stata command eqregseg, which computes the extremal quantile regression estimator for sample selection developed in our paper “Extremal Quantile Regressions for Selection Models and the Black-White Wage Gap”.
Two-way fixed effects estimators with heterogeneous treatment effects
We show that if treatment effects are not constant, regressions with group and time fixed effects identify weighted averages of treatment effects across groups and time periods, with potentially (many) negative weights. We suggest sensitivity checks and better estimands under testable restrictions on the design.
See https://arxiv.org/abs/1803.08807 for the latest WP version
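As a rough illustration of how negative weights can arise, here is a minimal Python sketch. It is not the twowayfeweights command: the function name `twfe_weights` and the toy design are hypothetical, and the sketch assumes a binary treatment and a balanced design with equally sized group-period cells. Under those assumptions, the weight on each treated cell is proportional to the residual from regressing the treatment on group and period fixed effects.

```python
import numpy as np

def twfe_weights(D):
    """Weights on treated cells' effects in a two-way fixed effects regression.

    D is a (groups x periods) array of binary treatments. Assumes a balanced
    design with equal cell sizes (an illustrative simplification).
    """
    D = np.asarray(D, dtype=float)
    # Residual of D on group and period fixed effects (double demeaning).
    eps = (D - D.mean(axis=1, keepdims=True)
             - D.mean(axis=0, keepdims=True) + D.mean())
    treated = D == 1
    w = np.zeros_like(D)
    # Normalize so that the weights on treated cells sum to one.
    w[treated] = eps[treated] / eps[treated].sum()
    return w

# Hypothetical staggered design: group 0 is treated from period 2 on,
# group 1 from period 3 on.
D = np.array([[0, 1, 1],
              [0, 0, 1]])
weights = twfe_weights(D)
print(weights)
```

In this toy design the early-treated group receives a negative weight in the last period, so the regression coefficient could be negative even if every cell-level effect were positive.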
A cautionary tale on instrument vector calibration for the treatment of unit nonresponse in surveys
We show that the calibration method based on instruments, proposed by Deville (2002), leads to a large variance when the instrumental variables are poorly related to the calibrating variables. If the exclusion restriction is violated, the bias is also large under the same condition.
Fuzzy differences-in-differences with Stata
This paper presents the Stata command fuzzydid, which computes various estimators of the LATE and LQTE for fuzzy DID designs, following our paper “Fuzzy DID”. It can handle non-binary treatments, multiple periods and groups, covariates and partial identification.
Fuzzydid Stata package available from the SSC repository. You can find the files to replicate the application on Clément’s webpage.
Automobile Prices in Market Equilibrium with Unobserved Price Discrimination
We consider inference on a demand and supply model for differentiated products with price discrimination that is unobserved by the econometrician. We show how to extend BLP’s GMM estimation to this setting, using restrictions on marginal costs. We apply our framework to the French automobile market.
Extremal Quantile Regressions for Selection Models and the Black-White Wage Gap
We consider models with endogenous selection and neither an instrument nor a large support regressor. Identification relies on the independence between the covariates and selection, when the outcome tends to infinity. We propose a simple estimator based on extremal quantile regressions and apply it to the evolution of the black-white wage gap in the US.
Fuzzy Differences-in-Differences
In many applications of the DID method, the treatment rate only increases more in the treatment group. In such fuzzy designs, we show that the popular “Wald-DID” (the DID of the outcome divided by the DID of the treatment) identifies a LATE only if two homogeneous treatment effect assumptions hold. We then propose two alternative estimands that do not rely on such assumptions.
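The Wald-DID described above can be sketched in a few lines of Python. This is only an illustration of the ratio itself, for two groups and two periods; the function name `wald_did` and the toy data are hypothetical, and the alternative estimands proposed in the paper are not implemented here.

```python
import numpy as np

def wald_did(y, d, group, period):
    """DID of the outcome divided by DID of the treatment (2 groups, 2 periods)."""
    y, d, group, period = map(np.asarray, (y, d, group, period))

    def did(v):
        # Mean of v in the cell defined by group g and period t.
        cell_mean = lambda g, t: v[(group == g) & (period == t)].mean()
        return ((cell_mean(1, 1) - cell_mean(1, 0))
                - (cell_mean(0, 1) - cell_mean(0, 0)))

    return did(y) / did(d)

# Toy fuzzy design with five units per cell: the treatment rate goes from
# 0.2 to 0.4 in the control group and from 0.2 to 0.8 in the treatment group.
group = np.repeat([0, 0, 1, 1], 5)
period = np.repeat([0, 1, 0, 1], 5)
d = np.array([0, 0, 0, 0, 1,
              0, 0, 0, 1, 1,
              0, 0, 0, 0, 1,
              0, 1, 1, 1, 1])
# Outcome with a constant treatment effect of 2 and additive group and period effects.
y = 2.0 * d + group + period
estimate = wald_did(y, d, group, period)
print(estimate)
```

Because the toy outcome has a constant treatment effect, the ratio recovers that effect here; with heterogeneous effects it need not, which is the point made in the abstract.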
Measuring Segregation on Small Units: A Partial Identification Analysis
Suppose that an individual in a small unit j (a classroom, a small firm…) belongs to a minority with a probability pj. To measure segregation of this minority, one would ideally use an inequality index on the pj, but they are unobserved. Using the observed proportion instead leads to an overestimation. The segregation indices are actually partially identified. We provide tractable bounds and develop inference.
The supplement and code can be found following the link to the journal’s website.
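The overestimation from using observed proportions can be seen in a small simulation. The sketch below is not the authors' code: the function `duncan_index` and the parameter values are illustrative assumptions. It draws minority counts in small units that all share the same probability p, so there is no segregation in the pj, and then computes the Duncan index on the observed counts.

```python
import numpy as np

def duncan_index(minority, size):
    """Duncan dissimilarity index from minority counts and unit sizes."""
    minority = np.asarray(minority, dtype=float)
    majority = np.asarray(size, dtype=float) - minority
    return 0.5 * np.abs(minority / minority.sum()
                        - majority / majority.sum()).sum()

rng = np.random.default_rng(0)
n_units, unit_size, p = 2000, 5, 0.3   # hypothetical values
sizes = np.full(n_units, unit_size)

# Every unit has the same minority probability p: no segregation.
counts = rng.binomial(unit_size, p, size=n_units)

# Index computed on the (unobserved) expected counts p * size: exactly zero.
true_index = duncan_index(p * sizes, sizes)
# Index computed on the observed counts: clearly positive.
naive_index = duncan_index(counts, sizes)
print(true_index, naive_index)
```

With units of size five, the naive index should come out around 0.4 even though the index on the pj is zero, which illustrates why small units call for a different treatment.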
Identification of Additive and Polynomial Models of Mismeasured Regressors Without Instruments
Suppose that Y = g(X*) + h(Z) + U, E(U|X*,Z)=0 but X* is measured with error. We show that g and h can be identified nonparametrically without side information provided that, basically, Z affects X*. A similar result holds when Y=P(X*,Z) + U, with P polynomial.
A Convenient Method for the Estimation of the Multinomial Logit Model with Fixed Effects
We propose a computationally convenient alternative to the conditional MLE for fixed effect multinomial logit models.
The Matlab code can be found here.
Disentangling Sources of Vehicle Emissions Reduction in France: 2003-2008
We study the factors of the decrease in average CO2 emissions of new cars in France between 2003 and 2008. We show that the evolution of consumers’ preferences accounts for 43% of this decrease, and that these changes follow two environmental policies put in place during this period.
Identification of Nonseparable Triangular Models with Discrete Instruments
Consider a model Y = g(X,U) with X endogenous, and suppose that we have instruments Z such that X = h(Z,V). If Z is independent of (U,V) and both g(X,.) and h(Z,.) are strictly monotonic, then g can be partially or point identified if Z is binary. It is fully identified in general if Z takes three values or more.
(The “pdf file” links to the first, longer version of the working paper)
Identification of Mixture Models Using Support Variation
Suppose that observed variables (X1,…,XK) are independent conditional on a continuous and unobserved variable X*. We show that the distributions of Xi conditional on X* are identified if the bounds of the conditional support of Xi are strictly increasing with X*. We also develop a test of this condition.
The Environmental Effect of Green Taxation: the Case of the French “Bonus/Malus”
In 2008, a feebate system for new automobiles was introduced in France. We investigate the effect of this policy on CO2 emissions. We find that the policy actually led to an increase in these emissions, mostly because of a substantial increase in the sales of new automobiles.
La régression quantile en pratique
This article (in French) is an introduction to quantile regression, with an emphasis on its interpretation.
Another Look at Identification at Infinity of Sample Selection Models
The sample selection model can be identified without an instrument if, basically, the probability of selection, conditional on the potential outcome and covariates, does not depend on covariates as the potential outcome tends to infinity.
Inference on an Extended Roy Model, with an Application to Schooling Decisions in France
Consider an extended Roy model where a binary decision depends on expected gains and an unobserved cost. The model is identified without instruments if, basically, the unobserved cost only depends on covariates. Applying our results to French data, we show that nonpecuniary components are a key factor for going to college.
On the Completeness Condition for Nonparametric Instrumental Problems
Sufficient conditions for the completeness condition (E(g(X)|Z) = 0 => g(X)=0) used in nonparametric IV problems are given. It holds in particular under a large support condition on n(Z) and technical restrictions on V in the generalized additive model X = m(n(Z) + V).
Le coût du bonus/malus écologique : que pouvait-on prédire ?
(in French)
A New Instrumental Method for Dealing with Endogenous Selection
Consider the sample selection model under the nonstandard IV restriction that D is independent of Z conditional on Y. Nonparametric identification is achieved under a completeness condition between Y and Z. Partial identification can also be obtained if one replaces independence by monotonicity restrictions.
Identification of Peer Effects Using Group Size Variation
A linear-in-means model close to that of Manski (1993) is identified provided that we observe groups with three distinct sizes. This applies even if one does not observe all members of the group, and can also be extended to binary outcomes.