ECO 120 Problem Set 2
Professor Jonathan Robinson
1. (10 points) This problem is about an agricultural training evaluation in Kenya. Download the dataset (farming.dta) and the PDF document that describes the dataset and experiment from the course website.
(a) Report the means and standard deviations of the following variables: age, gender, being able to read Swahili, being able to write Swahili, years of education, marital status, the value of animals owned, and the value of household items owned. Present these in a small, neatly formatted table.
(b) What percentage of households have ever used fertilizer and hybrid seeds? What percentage have used fertilizer and hybrid seeds in the last year?
(c) The variables revenue_blue and revenue_unmarked give the revenue obtained on the 2 plots. What are the mean and median of each? Based on the means, what is the percentage increase in revenue?
(d) Is the difference in revenue between the blue plot and the unmarked plot statistically significant at the 5% level (95% confidence)? Hint: to do this, you must use a t-test.
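As an illustration of the mechanics only, here is a minimal Python sketch of a paired t-test, which is one natural choice if every farmer has both a blue and an unmarked plot. The revenue numbers are made up for the example and are not taken from farming.dta, and the function name paired_t is hypothetical.

```python
import math
import statistics


def paired_t(x, y):
    """Return (t statistic, degrees of freedom) for H0: mean(x - y) = 0."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    # Standard error of the mean difference, using the sample standard deviation.
    se = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / se, n - 1


# Hypothetical revenues for five farmers (NOT from farming.dta).
blue = [120, 150, 130, 170, 160]
unmarked = [100, 140, 135, 150, 145]

t_stat, df = paired_t(blue, unmarked)
print(f"t = {t_stat:.3f} on {df} degrees of freedom")
```

The t statistic is then compared with the two-sided 5% critical value for the relevant degrees of freedom (2.776 for 4 degrees of freedom in this toy example). With the real data, a statistics package reports the p-value directly.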
2. (20 points) This question checks for attrition in our dataset.
(a) In our dataset, the variable “training” is equal to 1 for all farmers sampled for training and 0 for all those not sampled for training. How many farmers were there at the time of sampling?
(b) The variable “monitoring” is equal to 1 if the farmer did a monitoring survey with us, and the variable “harvest” is equal to 1 if the farmer harvested with us (these variables are missing for those that didn’t do surveys with us). How many farmers did a monitoring survey? How many harvested with us?
(c) After our randomization, the NGO we worked with added some new people into their farming cooperatives. This happened because some of the original farmers dropped out of the program. For our analysis, should we include those farmers who were added later or not? Why? The variable “added” equals 1 for any farmer that was added. How many were added?
(d) We are now going to check if the farmers who dropped out are different than those who stayed. Generate a new variable called “dropped” which is equal to 1 for any farmer that was originally sampled for the project (treatment or control), but who didn’t do a harvest survey, and equal to 0 for any farmer that was originally sampled for the project and who did a harvest survey. How many observations are there for this variable? What is the mean?
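The construction of “dropped” can be sketched in Python as follows. This is only an illustration on made-up rows: it assumes that “originally sampled” can be identified as added not equal to 1, and it uses None to stand in for a missing value. The helper name dropped_flag is hypothetical.

```python
# Hypothetical rows (NOT from farming.dta); None stands in for a missing value.
farmers = [
    {"training": 1, "harvest": 1, "added": 0},
    {"training": 1, "harvest": None, "added": 0},
    {"training": 0, "harvest": 1, "added": 0},
    {"training": 0, "harvest": None, "added": 0},
    {"training": 1, "harvest": 1, "added": 1},  # joined after randomization
]


def dropped_flag(row):
    """1 = originally sampled but no harvest survey, 0 = originally sampled
    with a harvest survey, None = not part of the original sample."""
    if row["added"] == 1:  # assumption: added farmers were not originally sampled
        return None
    return 0 if row["harvest"] == 1 else 1


dropped = [dropped_flag(r) for r in farmers]
observed = [d for d in dropped if d is not None]
n_obs = len(observed)
mean_dropped = sum(observed) / n_obs
print(n_obs, mean_dropped)
```

The mean of a 0/1 variable is the share of ones, so the mean of “dropped” is the attrition rate among originally sampled farmers.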
(e) We are interested in whether those that dropped out were more or less likely to have ever used fertilizer before. What statistical test should we run to check if those that dropped out are similar to those that stayed?
(f) Run the test you describe in (e) and report the results. Repeat for these other variables: age, gender, literacy, years of education, the value of animals owned, the value of tools owned, the value of household items owned, whether the farmer had ever used hybrid seeds before, whether the farmer had used fertilizer in the last year, and whether the farmer had used hybrid seeds in the year before.
(g) From this, would you conclude that attrition was random or non-random?
(h) How would you check whether attrition was differential across the treatment and control groups? Perform this test and report the results. Comment on what you find.
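One possible way to compare dropout rates between two groups is a two-proportion z-test; regressing “dropped” on “training” is a closely related alternative. The sketch below shows the z-test on made-up counts, not on the course data. The function name two_proportion_z and its arguments are hypothetical.

```python
import math
from statistics import NormalDist


def two_proportion_z(drop_t, n_t, drop_c, n_c):
    """Two-sided z-test of equal dropout rates in treatment and control."""
    p_t, p_c = drop_t / n_t, drop_c / n_c
    # Pooled dropout rate under the null hypothesis of equal rates.
    pooled = (drop_t + drop_c) / (n_t + n_c)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_t + 1 / n_c))
    z = (p_t - p_c) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical counts: 30 of 200 treatment and 20 of 200 control farmers dropped.
z, p_value = two_proportion_z(30, 200, 20, 200)
print(f"z = {z:.3f}, p = {p_value:.3f}")
```

A small p-value would suggest that attrition differs between the two groups; a large one means the data do not reject equal dropout rates.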
3. (20 points) (Note: this question was from a previous midterm) An NGO is implementing a program to improve health by combating malaria. Because the NGO is trying to help the neediest, it targets the poorest people in a village. To identify poor people, the NGO visits households and decides a household is “poor” if the roof of the house is made of a semi-permanent material like thatch, and the household is considered “not poor” if the roof is made of a permanent material like iron sheets. In the villages the NGO works in, 500 households are classified as poor and 2,000 are not poor. The NGO gives each poor household 2 insecticide-treated bednets and also gives them access to effective anti-malarials called ACT (artemisinin combination therapies). If anybody in the household thinks they are getting malaria, they can go to the local clinic and get tested, and if they are positive they get ACT for free.
The NGO is interested in knowing whether the program works, and hires you as a consultant to evaluate the program. The NGO has some data available – before and after the program, the NGO had tested every individual in the village for malaria. The average malaria rates are as follows:

Dependent variable: malaria prevalence (% infected)

              Before program    After program
Poor               0.17              0.10
Not poor           0.07              0.09
(a) (5 points) If you can, write out a regression to provide an estimate of the effect of the program.
(b) (5 points) You should get 4 regression coefficients. Write them out and interpret what they measure. What is the estimated treatment effect? (Hint: even if you can’t write out the regression, please just report the overall treatment effect.)
(c) (5 points) What has to hold for your treatment effect to be a valid estimate?
(d) (5 points) If you had more data, how might you test whether (c) holds?
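For reference, the arithmetic behind a difference-in-differences comparison of the four cell means in the table can be sketched as below. It assumes one common specification, prevalence = b0 + b1*Poor + b2*After + b3*Poor*After + error, which is not stated in the problem itself; the coefficient names b0 to b3 are labels chosen for this illustration.

```python
# Cell means copied from the table in the question.
means = {
    ("poor", "before"): 0.17,
    ("poor", "after"): 0.10,
    ("not_poor", "before"): 0.07,
    ("not_poor", "after"): 0.09,
}

# Assumed specification: y = b0 + b1*Poor + b2*After + b3*Poor*After + error
b0 = means[("not_poor", "before")]       # not-poor baseline level
b1 = means[("poor", "before")] - b0      # poor vs. not poor, before the program
b2 = means[("not_poor", "after")] - b0   # change over time for the not poor
b3 = (means[("poor", "after")] - means[("poor", "before")]) - b2  # interaction

for name, value in [("b0", b0), ("b1", b1), ("b2", b2), ("b3", b3)]:
    print(f"{name} = {value:+.2f}")
```

Under this specification the interaction term b3 is the change among the poor minus the change among the not poor.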