One week during the semester, I was chatting with another grad student in our department, and he was curious about my research project.
“So what are you gonna do for your independent study?”
“Well, I plan to use some Chinese household survey data to analyze the consequences of China’s internal migration for left-behind children.”
“Sounds interesting! But where do you get the data?”
“I got the data from UNC. It’s a longitudinal data set and contains a lot of information that will be useful for my analysis. But I still need some time to clean it and make sure it’s up to the task.”
“Yeah… You can never predict what you can get from these data. That’s why I chose to write a theoretical paper.”
I have had similar conversations with other friends. While I agree that dealing with large-scale household surveys can be frustrating, I don’t think we should shy away from them for this reason. Household surveys provide rich information on the structure and operations of families. By looking at real-world household-level data, you get a better sense of how households make decisions (both as a whole and as separate individuals bargaining with each other). You also become more aware of why some variables can never be measured and why some values are always missing. This is especially evident in time-use data. Response rates for “yes or no” questions are much higher than for questions that ask for a specific number of hours spent on a particular activity. This is hardly surprising given the difficulty of keeping track of time use. Even if we have statistics on how much time mothers spend caring for their children, these are subject to severe measurement error.
I am using the China Health and Nutrition Survey, which contains a variety of questions about household structure, employment decisions, time use, and health and nutrition. It has been widely used in public health research, but economists can answer interesting questions with it as well (here and here).