The fun of delving into household surveys

One week this semester I was chatting with another grad student in our department, and he was curious about my research project.

“So what are you gonna do for your independent study?”
“Well, I plan to use some Chinese household survey data to analyze the consequences of China’s internal migration for left-behind children.”
“Sounds interesting! But where do you get the data?”
“I got the data from UNC. It’s a longitudinal data set and contains a lot of information that will be useful for my analysis. But I still need some time to clean it and make sure it’s up to the task.”
“Yeah… You can never predict what you can get from these data. That’s why I chose to write a theoretical paper.”

I have had similar conversations with other friends. While I agree that dealing with large-scale household surveys can be frustrating, I don’t think we should shy away from them for this reason. Household surveys provide rich information on the structure and operations of families. By looking at real-world household-level data, you get a better sense of how households make decisions (both as a whole and as separate individuals bargaining with each other). You also become more aware of the reasons why some variables can never be measured and why some values are always missing. This is especially evident in time-use data. Response rates for “yes or no” questions are much higher than for questions that ask for a specific number of hours spent on a particular activity. This is hardly surprising given the difficulty of keeping track of time use. Even when we have statistics about how much time mothers spend caring for their children, these are subject to severe measurement error.

I am using the China Health and Nutrition Survey, which contains a variety of questions about household structure, employment decisions, time use, and health and nutrition. It has been widely used in public health research, but economists can answer interesting questions based on it as well (here and here).


2 thoughts on “The fun of delving into household surveys”

  1. Your post reminds me of a chat with a friend who dislikes working with data, especially survey data, for a different reason. He believes that it takes, if not wastes, a lot of time, and that many people are merely playing with data. He defines his own work as pure modeling, which, to the best of my understanding, is intense math, such as dynamic stochastic general equilibrium.

    I would never downplay the role of theoretical economic research and its guidance for empirical research. I am only very upset by the attitude that working with data is pointless just because it is data work. We can call a paper a bad one because it ignores some important storyline behind the data or uses a flawed technique. However, data work itself is not intrinsically inferior to math work. In fact, I believe economics as a discipline is empirical in nature, and both types of work should serve the goal of interpreting people’s behavior in the real world, the information about which is largely embodied in the data we observe. A quick example that comes to mind is the expected utility model, which has been criticized for not being able to explain people’s choices in many cases. Then prospect theory came forward and has been accepted as a better approach.

    As to what we can learn from HH surveys, I strongly agree that they inspire a lot of discussion and shed light on intra- and inter-household resource allocation. They can also make us rethink what we used to take for granted. One paper I read yesterday tries to answer the question: Does keeping diaries really help improve the accuracy of agricultural data? (Deininger et al. 2012)

    Maybe only those who have INDEED worked with data can understand why so many scholars are still working on various data sets and why there are tons of ongoing efforts to collect more and better data.

    • Well said, Xudong! I think economics, as a social science, should be able to explain what we see in the real world. Expected utility is a great example. Theoretical and empirical work should fuel each other. We need people to develop models, but we also need people to test them using real-world data, however messy the data are. And personally I’ve found survey design immensely interesting. I’ve always admired survey collection initiatives aimed at providing future researchers with higher-quality data.
      The Deininger paper seems interesting. I only found the 2011 version on the World Bank’s website. Do you have a reference to the most recent one?
