Machine Learning on Spark for the Optimal IDW-Based Spatiotemporal Interpolation

Document Type

Contribution to Book

Publication Date

1-1-2016

Publication Title

Proceedings of the International Conference on Geographic Information Science

DOI

10.21433/B3114dw721gn

Abstract

To improve current spatiotemporal interpolation methods for public health applications (Li et al., 2010), we combine the extension approach (Li and Revesz, 2004) with machine learning methods, use an efficient k-d tree structure to store the data, and implement our method on Apache Spark (Spark, 2016). Preliminary results demonstrate the computational power of our method, which is faster than the previous work while producing comparably accurate results (Li et al., 2014). Future research will continue to explore this method to improve interpolation accuracy and efficiency, with the long-term objective of establishing associations between air pollution exposure and adverse health effects.
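
As a rough illustration of the idea behind IDW (inverse distance weighting) under the extension approach, the sketch below treats scaled time as a third coordinate and averages the nearest samples with inverse-distance weights. It is not the authors' implementation: the function name `idw_extension`, the time scale `c`, the neighbor count `k`, and the exponent `power` are all hypothetical, and a brute-force neighbor search stands in for the k-d tree and Spark components mentioned in the abstract.

```python
import heapq
import math


def idw_extension(samples, query, c=1.0, k=3, power=2.0):
    """Estimate a value at query = (x, y, t) from samples [(x, y, t, value), ...].

    Assumption: time is treated as an extra spatial dimension scaled by c,
    so distances are Euclidean in (x, y, c*t) space.
    """
    qx, qy, qt = query
    candidates = []
    for x, y, t, value in samples:
        d = math.sqrt((x - qx) ** 2 + (y - qy) ** 2 + (c * (t - qt)) ** 2)
        if d == 0.0:
            # The query coincides with a sample: return its value directly.
            return value
        candidates.append((d, value))

    # Brute-force k-nearest-neighbor search (a k-d tree would replace this).
    nearest = heapq.nsmallest(k, candidates, key=lambda pair: pair[0])

    # Inverse distance weights: closer samples contribute more.
    weights = [1.0 / d ** power for d, _ in nearest]
    total = sum(weights)
    return sum(w * v for w, (_, v) in zip(weights, nearest)) / total


if __name__ == "__main__":
    # Two hypothetical measurements, equally far from the query point.
    data = [(0.0, 0.0, 0.0, 10.0), (2.0, 0.0, 0.0, 20.0)]
    print(idw_extension(data, (1.0, 0.0, 0.0), k=2))
```

In a setup like the one the abstract describes, parameters such as `k`, `power`, and `c` are the kind of values one might tune by cross-validation, though the abstract does not say which parameters the machine learning step selects.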
