Development economics in a developed country: how do poor Americans save?

This recent episode of NPR’s Planet Money talks about the financial lives of poor Americans. A few practices mentioned in the episode, such as group lending, private lenders, and high interest rates on loans, are strikingly similar to what poor people do in developing countries in Africa and Southeast Asia.

This resemblance leads me to wonder whether ideas and methodologies from development economics can be applied more extensively in developed-country settings. Classic models in development economics, such as health-based poverty traps and credit constraints, can readily be used to study the causes of poverty in a developed country. Comparative studies of poor individuals in developed vs. developing countries can shed light on how institutions, governance, and infrastructure affect poverty.

The data mentioned in this episode are pretty amazing: 235 poor households across America tracked over a year with high-frequency financial diaries. I bet interesting research based on these data is on the way.

Weekly NBER Digest 2/14/16

I decided to start a series of weekly blog posts on the new NBER working papers on development economics, labor economics, and international trade that I find interesting. In the past few months, I have experienced the excitement of finding an interesting research question, going through the empirical methodology to answer the question, cleaning data, and then figuring out there is not enough variation to answer my question (due to the contextual nature of the question). Now I am opening myself up to new ideas, and my NBER digests will serve this purpose as well.

1. How does taxation affect growth through corruption?

This working paper builds an endogenous growth model to examine the relationship between taxation, corruption, and economic growth. Taxes have disincentive effects on entrepreneurs, but also provide them with public infrastructure. Political corruption governs how efficiently tax revenues are translated into infrastructure. The model predicts an inverted-U relationship between taxation and growth, which is consistent with data from the Longitudinal Business Database (LBD) at the US Census Bureau.

This paper is an example of combining macro modeling with micro empirical analysis to address an interesting question.

2. Are trade policies no longer important?

This working paper by Goldberg and Pavcnik describes the declining research interest in assessing the impact of trade policies, discusses the reasons for this decline, and suggests future areas of research. It offers a lot of useful insights. As an example,

The variation in trade policy across cross-sectional units and time is only helpful for identifying the effects of trade policy in the presence of some type of friction and/or heterogeneity in exposure to policy change. … the main limitation of relying on differential exposure of economic agents to trade policy to identify its causal effects is … that this approach by its nature will generally reveal only the relative and not absolute effects of a policy change. The latter require a theoretical framework within which the relative effects can be interpreted.

2015 Year-End Review




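Everything has two sides. The many dramatic turns of this year taught me how to keep my direction in the fog. At the beginning of the year I installed the Insight Timer app on my phone and started meditating for five minutes every morning, listening to my inner voice, getting to know myself better, and seeing other people's joys more clearly. Anxiety about my studies pushed me to talk more with the senior students in my program, and I came to understand that a PhD is a process of continually surpassing yourself, and that enjoying the journey matters as much as pursuing the result.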

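This year I picked up singing again as a hobby and started studying operatic singing with a teacher at Duke. By the end of the semester I could, to my own surprise, steadily sing up to B minor, and my legs no longer shook when I performed in front of dozens of people. Thinking about it, that is no small achievement.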


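My close friends each have their own lives: some fly around the world busy with their careers, some have found their other half and happily settled down, and some are still drifting along without knowing what keeps them busy every day. My biggest realization this year is that simple comparison is meaningless. Knowing what you want and pursuing it bravely is the only way to find real happiness. This is true of both work and love.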



My Two Cents on Randomized Controlled Trials

Randomized controlled trials (RCTs) have been at the forefront of development economics research in recent years. How well do they inform us about policy alternatives to reduce poverty?

On the bright side, RCTs allow us to identify the causal impact of policy interventions, and many studies provide evidence that simple nudges can make a big difference to behavior (see Esther Duflo’s work on encouraging Kenyan farmers to use fertilizers). However, there are also a few caveats in interpreting RCT results:

Publication Bias: only significant results, whether positive or negative, get published. Are we learning about the truth or the truth we WANT to know? For instance, microfinance has been applauded as an innovative and effective way to increase savings and investments, encourage entrepreneurship, and reduce poverty. But a recent working paper has found zero effects of access to microfinance on long-term development outcomes.

Pre-analysis plan vs. manual selection after study is initiated: Here is a philosophical discussion in the Journal of Economic Perspectives by Ben Olken.

Heterogeneous treatment effects: the magnitude of the effects of policy varies a great deal. External validity is often a concern. Here is a thought-provoking paper by Eva Vivalt, the founder of AidGrade, a database on impact evaluations.

Experimental arms race: Are we simply adding more technical details to the same experiments without shedding light on the fundamental channels through which they change behavior? Here is an article by David McKenzie on the topic. More specifically, Rachel Glennerster writes about what this implies for RCTs involving governments.

Your thoughts and comments are welcome.

Technological Adoption Analysis with Panel Data

I just submitted my final paper for the panel data class. Sharing it here. Comments and feedback are welcome!

I. Introduction

This paper provides a critical review of the econometric approaches to modeling technology adoption decisions in developing countries. These decisions include the choice of whether or not to adopt a particular technology (e.g. high yielding variety seeds) and the amount of inputs to use given the technology chosen.

The developing country setting presents two additional challenges to identifying the determinants of technology adoption. First, imperfect access to credit and insurance introduces correlation between lagged productivity shocks and current input choices, thus violating the strict exogeneity assumption that is commonly maintained in panel data models. Second, the prevalence of informal networks highlights the importance of incorporating learning and externalities into the analysis.

Following Foster and Rosenzweig (1995), suppose we are interested in what factors determine the adoption of high yielding variety (HYV) seeds by farmers in developing countries. There are two broad sources of uncertainty that drive differences in technology adoption behavior. First, farmers may know the returns to HYV seeds but not the optimal levels of inputs. Therefore, a farmer needs to experiment with different input levels once she decides to use HYV seeds. Second, there may be uncertainty about the profitability of the new technology. This source of uncertainty can be especially relevant when the technology is new (Conley and Udry 2010). Although the two sources of uncertainty may co-exist, we focus on only one at a time given the complexity of the problem.

II. Input Choice as Technology Adoption

In this section, we assume that returns to technology adoption depend on how close actual input levels are to optimal input levels, i.e. we use a target input model. Foster and Rosenzweig (1995) use this framework to examine how farmers’ HYV adoption decisions depend on their own and their neighbors’ experience. In their framework, the expected profit of farmer j at time t is
where $\eta_h$ is yield using HYV varieties, $\eta_{ha}$ is the loss associated with using less suitable land as more HYVs are used, $A_j$ is the total amount of land, $H_{jt}$ is the amount of land using HYVs, $\sigma_{\theta jt}^{2}$ is the updated variance of the mean input level, and $\sigma_{u}^2$ is the variance of the error term in target input use (relative to the mean optimal input). The updating of the variance term depends on learning from own and neighbors’ experience. This will be the focus of section IV.
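To make the variance updating concrete, here is a minimal sketch that assumes a textbook normal learning rule in which precisions add up: the posterior variance of the mean optimal input is the inverse of the prior precision plus the precision contributed by own and neighbors' cumulative experience. The function and argument names are hypothetical, and the exact specification in Foster and Rosenzweig (1995) may differ.

```python
def posterior_variance(rho_prior, rho_own, rho_neighbor,
                       own_experience, neighbor_experience):
    """Posterior variance of the mean optimal input under normal learning.

    Assumption (illustrative only): each unit of own or neighbors' experience
    adds a fixed amount of precision to the prior precision.
    """
    precision = (rho_prior
                 + rho_own * own_experience
                 + rho_neighbor * neighbor_experience)
    return 1.0 / precision


# The variance shrinks as cumulative experience grows.
for s in (0, 1, 5, 20):
    var = posterior_variance(1.0, 0.5, 0.2,
                             own_experience=s, neighbor_experience=s)
    print(s, round(var, 4))
```

Under this rule, experience matters most early on: the first few parcels planted reduce the variance far more than later ones, which is one reason adoption can accelerate once a few farmers in a village have experimented.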

In the empirical analysis, the authors estimate the profit function adding education of the farmers as an additional covariate:
where $S_{jt}$ is the cumulative number of parcels planted by farmer j up to time t, $\bar{S}_{-jt}$ is the average of the cumulative experience of neighboring farmers, $\rho$’s are precision terms of own and neighbors’ experience as signals of optimal input levels. Two approaches are used for estimation.

The first approach uses IV and fixed effects to estimate a first-order reduced-form approximation of equation (2). Instrumental variables are used to address correlation between 1) contemporaneous profit shocks and production decisions, and 2) lagged profit shocks and contemporaneous adoption (potentially because of credit constraints). Fixed effects are used to eliminate individual-level heterogeneity $\mu_j$. If we maintain the assumption that input decisions are predetermined, the IV approach addresses the concern that strict exogeneity is violated. Note that predeterminedness implies that the profit shocks in first differences exhibit first-order autocorrelation but are uncorrelated at all other lags. This seems a reasonable assumption if we believe the profit shocks are unanticipated and are not persistent over time. Because Foster and Rosenzweig (1995) do not describe the nature of the profit shocks, it is difficult to evaluate the validity of the predeterminedness assumption.
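The logic of combining differencing with instruments can be illustrated with a small simulation. This is only a sketch under assumptions of my own, not the specification in Foster and Rosenzweig (1995): a linear outcome equation, serially uncorrelated shocks, and an input that responds to last period's shock (a stand-in for credit constraints). All variable names and parameter values are hypothetical. First-differencing removes the fixed effect but leaves the differenced input correlated with the differenced shock; the lagged input level is a valid instrument because it is predetermined.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T, beta, gamma = 5000, 5, 1.0, 0.8   # hypothetical parameter values

mu = rng.normal(size=N)                 # farmer fixed effect
eps = rng.normal(size=(N, T))           # serially uncorrelated profit shocks
v = rng.normal(size=(N, T))             # idiosyncratic input variation

# Input depends on the fixed effect and on LAST period's shock, so it is
# predetermined but not strictly exogenous.
x = np.empty((N, T))
x[:, 0] = 0.5 * mu + v[:, 0]
for t in range(1, T):
    x[:, t] = 0.5 * mu + gamma * eps[:, t - 1] + v[:, t]

y = beta * x + mu[:, None] + eps

# First differences for periods with an available lagged level.
dy = (y[:, 2:] - y[:, 1:-1]).ravel()
dx = (x[:, 2:] - x[:, 1:-1]).ravel()
z = x[:, 1:-1].ravel()                  # instrument: input level at t-1


def cov(a, b):
    """Sample covariance of two 1-D arrays."""
    return np.mean((a - a.mean()) * (b - b.mean()))


beta_fd_ols = cov(dx, dy) / cov(dx, dx)  # biased: dx is correlated with d(eps)
beta_fd_iv = cov(z, dy) / cov(z, dx)     # consistent under predeterminedness

print("FD-OLS:", round(beta_fd_ols, 3), " FD-IV:", round(beta_fd_iv, 3))
```

In this setup the first-difference OLS estimate is pulled below the true coefficient, while the IV estimate is close to it. The instrument would fail if the shocks were persistent, which is exactly why the nature of the profit shocks matters for the argument above.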

The second approach uses nonlinear IV with fixed effects to obtain structural estimates of the profit function. Equation (2) is differenced over time and estimated using a standard nonlinear IV procedure. This approach is subject to the same concern as the first approach.

III. Discrete Technology Adoption Decisions

In this section, the outcome variable equals one if the individual adopts technology in period t. Because technology adoption contributes to accumulated experience, adoption in the current period may induce changes in the returns to technology in the next periods in a complicated way.

Foster and Rosenzweig (1995) examine HYV adoption using reduced-form predictions from the structural model. But without solving the decision rules, they are unable to estimate the structural parameters. To address this limitation, we might use nonlinear panel data models with stronger distributional assumptions on the error terms (e.g. a logistic distribution) and use conditional maximum likelihood estimators. This, however, rules out serial correlation in the error terms and might be unrealistic. An alternative approach is Manski’s conditional maximum score estimation. This approach achieves identification from “switchers”, but observing enough individuals switching between adopting and not adopting a specific technology might be challenging, as there are often fixed costs involved in a new technology and hence persistence in adoption decisions.
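The role of switchers is easiest to see in the two-period conditional logit. The sketch below is a simulation under assumptions of my own (a single regressor, iid logistic errors, hypothetical parameter values), not an estimator taken from any of the papers discussed. Conditional on adopting in exactly one of the two periods, the fixed effect drops out and the probability of adopting in the second period is a logit in the change of the regressor, so only switchers contribute to the likelihood.

```python
import numpy as np

rng = np.random.default_rng(1)
N, beta = 20000, 1.0                          # hypothetical values

alpha = rng.normal(size=N)                    # farmer fixed effect
x = alpha[:, None] + rng.normal(size=(N, 2))  # regressor correlated with alpha
u = rng.logistic(size=(N, 2))                 # iid logistic errors
y = (beta * x + alpha[:, None] + u > 0).astype(int)

# Only farmers who adopt in exactly one of the two periods are informative.
switch = y.sum(axis=1) == 1
dx = (x[:, 1] - x[:, 0])[switch]
d = y[switch, 1]                              # 1 if the switch is into adoption

# Newton-Raphson on the conditional log-likelihood (no intercept).
b = 0.0
for _ in range(25):
    p = 1.0 / (1.0 + np.exp(-b * dx))
    score = np.sum((d - p) * dx)
    info = np.sum(p * (1.0 - p) * dx ** 2)
    b += score / info

share_switchers = switch.mean()
print("conditional logit estimate:", round(b, 3))
print("share of switchers:", round(share_switchers, 3))
```

The estimate uses only the switchers, so its precision depends on how many there are. With persistent adoption decisions the share of switchers can be small, which is the practical concern raised above.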

Suri (2011) provides an alternative framework to examine why farmers make different adoption decisions. She uses information on the correlation between productivity differences and the productivity of a technology among farmers who use both technologies to project productivity levels for farmers who use only one technology. More specifically, she assumes profits for farmer i with productivity


She estimates the following equation for yields:
Based on the primitives of the model,


The identifying assumption is mean independence of the composite error $(\tau_i+\epsilon_{it})$ from the comparative advantage component $\theta_i$ and the histories of the regressors. Translated into assumptions on what drives the hybrid switching behavior, this means that the unobserved time-varying variables that drive the switching should not be correlated with yields. Chamberlain’s (1982) correlated random effects approach is used for estimation. Dependence of the unobserved $\theta_i$’s on the endogenous input $h_{it}$ is accounted for using the linear projection of $\theta_i$ on the full history of inputs and their interactions. Structural parameters are recovered from reduced-form estimates.
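The projection idea behind the correlated random effects approach can be sketched in the simplest additive case. The simulation below is an assumption-laden illustration and not Suri's model: her specification has a farmer-specific return to hybrid use and includes interactions of the input history, whereas here the heterogeneity only shifts the intercept and the input is continuous. All names and parameter values are hypothetical. Pooled OLS is biased because $\theta_i$ is correlated with the inputs; adding the farmer's full input history as controls absorbs that correlation.

```python
import numpy as np

rng = np.random.default_rng(2)
N, T, beta = 5000, 3, 1.0               # hypothetical values
lam = np.array([0.5, 0.3, 0.2])         # projection coefficients (assumed)

h = rng.normal(size=(N, T))             # input use in each period
theta = h @ lam + rng.normal(size=N)    # heterogeneity correlated with inputs
y = beta * h + theta[:, None] + rng.normal(size=(N, T))

# Stack to long format: each farmer's T observations are consecutive.
y_long = y.ravel()
h_long = h.ravel()
history = np.repeat(h, T, axis=0)       # full input history on every row
const = np.ones(N * T)

X_pooled = np.column_stack([const, h_long])
X_cre = np.column_stack([const, h_long, history])

b_pooled = np.linalg.lstsq(X_pooled, y_long, rcond=None)[0][1]
b_cre = np.linalg.lstsq(X_cre, y_long, rcond=None)[0][1]

print("pooled OLS:", round(b_pooled, 3), " CRE:", round(b_cre, 3))
```

In this additive case the coefficient on the current input in the augmented regression is close to the true value, while pooled OLS overstates it. The coefficients on the history terms estimate the projection, which is the sense in which structural parameters are backed out from reduced-form estimates.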

The correlated random effects approach relaxes the requirements for identification, and in Suri’s setting it seems reasonable to assume that individual-level heterogeneity is uncorrelated with productivity shocks once the history of input decisions is controlled for. However, the focus of Suri (2011) is to identify the cross-sectional heterogeneity in productivity and its consequences for hybrid seed adoption. It is unclear whether this focus warrants the use of CRE models.

IV. Learning in Technology Adoption

Recent literature on technology adoption highlights the importance of learning from own experience and the experiences of informal network members.

Conley and Udry (2010) collect data on social interactions and address the unobserved variable problem when studying learning effects in the diffusion of a new technology: pineapple planting. In their model, risk-neutral farmers each have a single plot and maximize current expected profits by choosing a discrete-valued input $x_{it}$ at time t. Pineapple output realized 5 periods after the input decision is
where the $\epsilon$’s are unobserved productivity shocks that are iid with mean 0 and variance 1, and $\omega_{it}$ captures spatially and serially correlated shocks to marginal product that are observed only by the farmer (not the econometrician). Farmers do not know the function $f$ but learn about it with a learning rule.

Identification uses the specific timing of plantings to identify opportunities for information transmission. Variation in planting decisions generates a sequence of dates on which new information may be revealed to the farmer. Conditional on measures of growing conditions, Conley and Udry isolate events when new productivity information is revealed to the farmer. They then investigate whether new information is associated with changes in the farmer’s input use that are consistent with social learning. A logistic regression is used to estimate how farmers’ input decisions respond to the actions and outcomes of other farmers in their information networks (data collected by the authors).

The baseline regression model is
where $M_{it}$ is an index of good news on input levels constructed from inputs and profits five periods ago and now. The identification assumption is that, conditional on measures of changes in growing conditions $\Gamma_{it}$ and other farm-level characteristics, the information measure $M_{it}$ is uncorrelated with unobserved determinants of growing conditions and therefore of input use. A significant, positive $\beta_1$ is evidence for social learning.

An important limitation of this approach is that it completely ignores the endogenous formation of informal networks and potential dynamic changes in those networks. To study learning effects in technology adoption, we need a better understanding of the formation of informal networks and the nature of learning in order to evaluate whether the identification assumptions are realistic.


Chamberlain, G. (1982). “Multivariate Regression Models for Panel Data,” Journal of Econometrics 18: 5-46.

Conley, T. and Udry, C. (2010). “Learning about a New Technology: Pineapple in Ghana,” American Economic Review 100(1): 35-69.

Foster, A. and Rosenzweig, M. (1995). “Learning by Doing and Learning from Others: Human Capital and Technological Change in Agriculture,” Journal of Political Economy 103: 1176-1209.

Suri, T. (2011). “Selection and Comparative Advantage in Technology Adoption,” Econometrica 79(1): 159-209.

What I have learned from my first academic presentation

Yesterday I presented my work on parental migration and health outcomes of children in Indonesia at the development lunch at Duke. It was my first time presenting my own research in front of a (relatively) large academic audience. The presentation did not go as planned (as with most research initiatives), but I learned a great deal from it. Here are a few lessons.

  1. Talk about key facts instead of broad histories when you are introducing the context of your study. Providing a description of broad histories is easy for you as a presenter but usually makes the audience more confused about your main argument.
  2. Related to the first point, structure your presentation to focus on the key questions you are interested in answering, the strategies you use to address these questions, and the places where you have run into difficulty and need advice.
  3. In a short presentation, avoid doing a detailed literature review. You are almost guaranteed to miss some papers in the literature, and it is easy to spend a long time answering tangential questions.
  4. Know your question really, really well. Present it to different people and see if anything confuses them. If they are confused, try to diagnose the problem and clarify your question. If there are broad terms in your main question, try to narrow them down to clear-cut, specific definitions that people can directly relate to.
  5. Know when to answer questions, when to delay them, and when to politely turn them down. Always answer clarification questions, but delay questions which you are going to address later in your presentation.
  6. Practice. Practice. Practice. You cannot anticipate everything, but if you do not practice, there will be too many awkward moments.

I encourage other students to present their work early on in the PhD program to practice thinking deeply about a question and explaining it to other people. It will be painful at first, but you will get better at it over time.

A few notes on academic presentations

For our writing and presentation class we are asked to explain a standard concept in intermediate microeconomics in a ten-minute presentation. Here are a few good practices I drew from my own and others’ presentations.

1. Practice your script and appear confident on the stage.

2. Make sure your graphs are legible. Fonts should be large enough. Use contrasting colors that will show up clearly against your background.

3. If your graphs are not legible, explain the key messages in the graph verbally or draw the graphs on the board (if they are simple).

4. Stay consistent with your notation.

5. Cite the sources of your materials, even if they come from widely used textbooks or online resources.

6. Don’t include information that you are not going to talk about in your slides.

7. It helps if you stick with the same examples and go through facts->explanation->solution for each of them in the same order throughout your presentation.

8. When you are explaining a model, start from the infrastructure (agents/players, relationships, basic assumptions, etc) and continue to the superstructure.

9. Don’t include too much information in your slides! This will make the audience overwhelmed and eventually bored.

10. Don’t read off the slides. Treasure the dynamic nature of presentations and interact with your audience.