Map-Matching on Big Data: a Distributed and Efficient Algorithm with a Hidden Markov Model
In urban mobility, map-matching projects the GPS points generated by moving objects onto the road segments that represent the objects' actual positions. So far, map-matching has found applications in traffic analysis, frequent path extraction, and location prediction. However, state-of-the-art implementations of map-matching algorithms are either private, sequential, or inefficient.
This project extends Luo et al., 2017 [1], an existing serial algorithm of known efficiency, by reformulating it in a distributed way so that it scales to real big data scenarios. It also makes the algorithm (which is based on a first-order hidden Markov model) more robust by introducing a strategy that avoids gaps in the matched road segments; such gaps may occur under sparse GPS sampling or in urban areas with highly fragmented road segments.
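As a rough illustration of how a first-order hidden Markov model picks road segments, the sketch below runs plain Viterbi decoding over per-point candidate segments. It is not the project's Spark code and does not include the gap-avoidance strategy. The function names, the Gaussian emission model with `sigma=10.0`, and the stay/switch transition probabilities are all assumptions made for this example.

```python
import math


def emission_logp(distance_m, sigma=10.0):
    """Log-density of a Gaussian GPS error model (assumed, sigma in meters)."""
    return -0.5 * (distance_m / sigma) ** 2 - math.log(sigma * math.sqrt(2 * math.pi))


def viterbi_match(emissions, transitions):
    """Return the most likely candidate index for each GPS point.

    emissions[t][i]      : log-probability of point t given candidate segment i
    transitions[t][i][j] : log-probability of moving from candidate i at point t
                           to candidate j at point t + 1
    """
    if not emissions:
        return []
    score = list(emissions[0])  # best log-score of a path ending in each candidate
    back = []                   # back-pointers, one list per step after the first
    for t in range(1, len(emissions)):
        new_score, pointers = [], []
        for j, e in enumerate(emissions[t]):
            # Best predecessor for candidate j (first-order dependency only).
            best_i = max(range(len(score)),
                         key=lambda i: score[i] + transitions[t - 1][i][j])
            new_score.append(score[best_i] + transitions[t - 1][best_i][j] + e)
            pointers.append(best_i)
        score = new_score
        back.append(pointers)
    # Backtrack from the best final candidate.
    last = max(range(len(score)), key=lambda i: score[i])
    path = [last]
    for pointers in reversed(back):
        last = pointers[last]
        path.append(last)
    path.reverse()
    return path


# Toy input: 3 GPS points, 2 candidate segments each (0 = segment A, 1 = segment B).
# Distances (meters) from each point to each candidate; values are made up.
distances = [[5.0, 30.0], [20.0, 18.0], [4.0, 40.0]]
emissions = [[emission_logp(d) for d in row] for row in distances]

# Hypothetical transition model: staying on a segment is more likely than switching.
stay, switch = math.log(0.9), math.log(0.1)
transitions = [[[stay, switch], [switch, stay]] for _ in range(len(distances) - 1)]

path = viterbi_match(emissions, transitions)
print(path)  # [0, 0, 0]
```

In this toy input the second point lies slightly closer to segment B, yet the decoded path stays on segment A, because switching segments twice costs more than the small emission gain.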
This implementation is based on Apache Spark 2.1 and GeoSpark 1.1.3.
Please refer to the following research paper:
- Francia, Matteo, Enrico Gallinucci, and Federico Vitali. "Map-matching on big data: A distributed and efficient algorithm with a hidden Markov model." Proc. of MIPRO. IEEE, 2019.
TBD
- Build the project:
./gradlew.bat
- Deploy build/libs/BIG-map-matching-all.jar to the cluster
- Run:
spark2-submit \
--class "EntryPoint" \
~/path/to/BIG-map-matching-all.jar \
alpha=4 beta=10 tau=100 theta=4 gamma=200
[1] An Luo, Shenghua Chen, and Bin Xv. Enhanced map-matching algorithm with a hidden Markov model for mobile phone positioning. ISPRS Int. J. Geo-Information, 6(11):327, 2017.