2015 Data Science Workshop


Wed, April 29, 2015
Location: Fisher Conference Center, Arrillaga Alumni Center

"Large-Scale Time Series Analysis of Food Price Spikes and Malnutrition"


Using large-scale data on food prices and on malnutrition among over 350,000 children in 26 countries, we will apply the novel time-series analysis method of convergent cross-mapping to determine the relationships between sudden, dramatic spikes in global food prices and child malnutrition rates, and to identify which food security programs were most effective in preventing malnutrition during periods of sharp price inflation.

We will link the person-level health data to food market locations using k-nearest neighbor methods. The outcome metrics will include child mortality, weight-by-height, recent fevers and diarrhea, anemia, and a standardized 12-point food security metric; control variables related to education, wealth, urban/rural residence, and other potential confounding factors have also been collected. The data are approximately 2 TB in size. To relate prices to malnutrition and mortality outcomes, we will optimize the novel method of convergent cross-mapping (CCM), a recently developed time-series analysis technique that takes advantage of a large volume of data to perform causal inference.
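As a rough illustration of the linking step, the sketch below matches one household to its k nearest markets by great-circle distance. It is a minimal example under stated assumptions, not the project's actual pipeline: the record layout, the names `haversine_km` and `k_nearest_markets`, and the coordinates are all hypothetical, and a brute-force sort stands in for whatever spatial index a 2 TB dataset would require.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def k_nearest_markets(household, markets, k=2):
    """Return the k market records closest to a household, nearest first."""
    ranked = sorted(
        markets,
        key=lambda m: haversine_km(household["lat"], household["lon"], m["lat"], m["lon"]),
    )
    return ranked[:k]

# Hypothetical coordinates, for illustration only.
markets = [
    {"name": "market_a", "lat": 0.0, "lon": 0.0},
    {"name": "market_b", "lat": 0.0, "lon": 1.0},
    {"name": "market_c", "lat": 5.0, "lon": 5.0},
]
household = {"id": "hh_1", "lat": 0.0, "lon": 0.2}

nearest = k_nearest_markets(household, markets, k=2)
print([m["name"] for m in nearest])  # ['market_a', 'market_b']
```

Each health record could then inherit the price series of its matched markets, for example by averaging over the k matches.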

In this project, we will optimize the approach using an efficient distributed parallel execution strategy that we are developing in the Julia language. This allows us to redesign the CCM method around a “shotgun” strategy that will identify multiple possible relationships between the prices and malnutrition rates, then efficiently narrow them down to the highest-likelihood results. Julia allows for high numerical accuracy in this approach while benefiting from distributed computation, and its multiple dispatch features let us efficiently define function behavior across many combinations of argument types.
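The screening idea can be sketched in a few lines: score many candidate series against an outcome series with a cross-map skill, then keep the top-ranked candidates. The example below is a simplified, single-machine illustration written in Python rather than Julia, and it is not the implementation described here. The function names, the embedding parameters `E` and `tau`, and the synthetic coupled logistic-map data are all assumptions made for the example. It also computes skill at a single library size, whereas a full CCM analysis would check that skill converges as the library grows.

```python
import math
import random

def embed(series, E, tau):
    """Delay-embed a series: each point is (s[t], s[t - tau], ..., s[t - (E - 1) * tau])."""
    start = (E - 1) * tau
    return [tuple(series[t - j * tau] for j in range(E)) for t in range(start, len(series))]

def pearson(a, b):
    """Pearson correlation; returns 0.0 if either input has zero variance."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb) if va > 0 and vb > 0 else 0.0

def cross_map_skill(source, target, E=2, tau=1):
    """Estimate `source` from the shadow manifold of `target`.

    In CCM, high skill is read as evidence that `source` influences `target`.
    """
    manifold = embed(target, E, tau)
    actual = source[(E - 1) * tau:]
    predicted = []
    for i, point in enumerate(manifold):
        # E + 1 nearest neighbours of this point, excluding the point itself.
        dists = sorted(
            (math.dist(point, other), j) for j, other in enumerate(manifold) if j != i
        )[:E + 1]
        d_min = max(dists[0][0], 1e-12)  # guard against division by zero
        weights = [math.exp(-d / d_min) for d, _ in dists]
        total = sum(weights)
        predicted.append(sum(w * actual[j] for w, (_, j) in zip(weights, dists)) / total)
    return pearson(predicted, actual)

def screen(candidates, outcome, keep=2, E=2, tau=1):
    """Score every candidate series against the outcome and keep the best `keep`."""
    scored = [(cross_map_skill(s, outcome, E, tau), name) for name, s in candidates.items()]
    return sorted(scored, reverse=True)[:keep]

# Synthetic data (illustration only): x and y are coupled logistic maps.
x, y = [0.4], [0.2]
for _ in range(299):
    x_next = x[-1] * (3.8 - 3.8 * x[-1] - 0.02 * y[-1])
    y_next = y[-1] * (3.5 - 3.5 * y[-1] - 0.1 * x[-1])
    x.append(x_next)
    y.append(y_next)

rng = random.Random(0)
noise = [rng.random() for _ in range(300)]

candidates = {"coupled_series": x, "unrelated_noise": noise}
top = screen(candidates, y, keep=1)
print(top)
```

Because each candidate is scored independently of the others, the loop inside `screen` is the natural unit of work to distribute across processes or machines.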