Predicting Malaria Incidence in Guinea: A Real-Time Machine Learning Tool
Abstract:
Malaria remains a pressing public health issue in Guinea, with approximately 13 million
individuals at risk of contracting the disease. Despite efforts to reduce
malaria incidence, it remains the leading cause of consultations,
hospitalizations, and deaths in the country. Machine learning (ML) techniques
have gained traction in epidemiology for predicting disease outbreaks and
identifying high-risk areas, and they offer a way to address this challenge.
During this internship, we used ensemble learning algorithms to develop a
predictive model for malaria incidence in Guinea. Our methodology involved data
integration, feature engineering, and model training on diverse datasets,
including clinical records, demographic health surveys, and climatic data
spanning six years from 2018 to 2023. We trained several ML algorithms:
logistic regression, random forest, decision tree, support vector machine,
gradient boosting machine, artificial neural network, and ensemble stacking. We
evaluated model performance using the F1-score. The ensemble stacking method,
particularly balanced stacking, achieved the best predictive performance
(F1-score = 0.74). This highlights the importance of interdisciplinary
collaboration and data integration in epidemiological research, as well as the
potential of ML in informing targeted interventions and resource allocation
strategies for malaria control. Challenges such as multicollinearity and
imbalanced datasets were addressed through robust statistical techniques and
model tuning. This research underscores the significance of translating
research findings into actionable insights for malaria control efforts in
Guinea. By harnessing the power of ML and deploying user-friendly tools, public
health authorities can make informed decisions to mitigate the burden of
malaria and improve health outcomes for affected populations.
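
The abstract reports the F1-score as the evaluation metric and mentions imbalanced data. The sketch below illustrates why F1 can be more informative than plain accuracy in that setting. It assumes a binary framing (1 = high-incidence observation, 0 = otherwise), which the abstract does not state; the labels, the function names, and both example "models" are hypothetical and are not taken from the study.

```python
from typing import Sequence


def f1_score(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """F1 = 2 * precision * recall / (precision + recall) for the positive class.

    Returns 0.0 when there are no true positives (precision/recall undefined or zero).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the true label."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


# Hypothetical, imbalanced labels: 10 positives and 90 negatives.
y_true = [1] * 10 + [0] * 90

# Hypothetical model A: always predicts the majority (negative) class.
always_negative = [0] * 100

# Hypothetical model B: finds 8 of the 10 positives and raises 4 false alarms.
detector = [1] * 8 + [0] * 2 + [1] * 4 + [0] * 86

if __name__ == "__main__":
    for name, pred in [("always_negative", always_negative), ("detector", detector)]:
        print(f"{name}: accuracy={accuracy(y_true, pred):.2f}, "
              f"F1={f1_score(y_true, pred):.2f}")
```

In this toy setup the majority-class model reaches 0.90 accuracy yet scores an F1 of 0, because it never identifies a positive case, while the second model scores an F1 of about 0.73. This is one plausible reason to prefer F1 when high-incidence observations are rare; the study's actual class definitions and averaging scheme may differ.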

