wicked_problems

Methodological Investigation: Measles

Vaccinations are an important part of healthcare, but not everyone has access to them because of failing healthcare systems and a lack of resources. Measles is a viral disease that can be fatal, and it is most prevalent in young children. It can be prevented by vaccination, but people in Africa, especially, have limited access to the vaccine because of their location or its cost. Two data science methods that have been used to analyze measles vaccine distribution in Africa are Bayesian hierarchical modeling and geospatial analysis. Bayesian networks allow the modeling of causal relationships between variables (Tanguy, 2020). In one article, this approach was used to map vaccination coverage and see the effects of vaccine delivery mechanisms. Geospatial analytics gathers, manipulates, and displays geographic information system data and imagery, including GPS and satellite photographs (What, 2020). It can also be used to create maps from the data gathered, in this case from surveys. Using these methods, data scientists have been able to show the levels of vaccination in Africa and determine where coverage is lacking and where resources need to be pooled. They have also attempted to determine what is causing the shortfall in vaccinations in these areas and how their systems differ from those in areas that are succeeding in administering immunizations. The problem is not creating a vaccine but ensuring that everyone has access to and receives this life-saving vaccine. Every country needs to advance its healthcare system or find ways to maintain sustainable vaccine administration sites so that its residents can get routine immunizations. Receiving adequate healthcare is a human right, and finding ways to distribute the measles vaccine to everyone in Africa is a step toward advancing human development.

One study used the Bayesian model to examine geospatial variation in measles vaccine coverage in Nigeria through routine and campaign strategies. In this study, household surveys were analyzed to determine vaccine distribution after a 2017-2018 vaccination campaign that included supplemental immunization activities (SIAs). Post-campaign coverage surveys (PCCSs) were conducted in every state using a two-stage sampling procedure based on enumeration areas (EAs). In the first stage, EAs were selected from across Nigeria, and household surveys were taken in each EA. In the second stage, the National Bureau of Statistics (NBS) used simple random sampling without replacement to select households to be interviewed. Eligible households had children aged 9-59 months, and seven of these households were randomly chosen from a total of 1,110 EAs in Nigeria (Utazi, 2020). The data recorded for each child included age and the centroid of the cluster from which the child's household was selected. Data on measles-containing vaccine (MCV) coverage before and after the campaign were obtained from home-based records of routine vaccination. Six indicators were used to assess the performance of routine immunizations (RIs) and SIAs: coverage before the SIA, SIA coverage among MCV zero-dose children, SIA coverage among children vaccinated previously, overall SIA coverage, coverage before and during the SIA, and coverage before and/or during the SIA. Cluster-level vaccination coverage was mapped for all six indicators. The number of children sampled per cluster ranged from 1 to 26 and the number vaccinated from 0 to 25, and certain clusters were excluded because of insufficient sample size (shown in Table 1) (Utazi, 2020). Excluding clusters with too few observations reduces the risk of skewed estimates, which supports the reliability of the data.
Additional data on settlements within the country were obtained from the GRID3 program. Lastly, population data from WorldPop corresponding to the survey years were obtained to support coverage estimates for different administrative areas.
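The two-stage sampling procedure described above can be illustrated with a minimal sketch. It assumes, for simplicity, that the first stage also uses simple random sampling; the text only states the method used in the second stage. The sampling frame, the EA and household identifiers, and the function name `two_stage_sample` are all hypothetical.

```python
import random


def two_stage_sample(households_by_ea, n_eas, n_households, seed=0):
    """Select EAs, then eligible households within each selected EA.

    households_by_ea maps an EA id to a list of eligible household ids.
    Both stages use simple random sampling without replacement; using it
    in stage one is an assumption made for this illustration.
    """
    rng = random.Random(seed)
    # Stage 1: choose enumeration areas (EAs).
    selected_eas = rng.sample(sorted(households_by_ea), n_eas)
    sample = {}
    for ea in selected_eas:
        eligible = households_by_ea[ea]
        # Stage 2: choose households; take all of them if there are too few.
        k = min(n_households, len(eligible))
        sample[ea] = rng.sample(eligible, k)
    return sample


# Hypothetical frame: 20 EAs, each with 10 to 30 eligible households.
frame_rng = random.Random(1)
frame = {
    f"EA{i:02d}": [f"EA{i:02d}-H{j}" for j in range(frame_rng.randint(10, 30))]
    for i in range(20)
}
sample = two_stage_sample(frame, n_eas=5, n_households=7)
```

Each key of `sample` is a selected EA and each value is the list of households to interview in that EA.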

image

For this study, the covariates helped improve the prediction of vaccination coverage and were included in the analysis because they were associated with the spatial distribution of vaccine coverage or were proxies for other factors that affect distribution. The covariates included data sets relating to remoteness, poverty, livestock, land cover, and land surface temperature. Covariate values were extracted from each of the standard layers at each PCCS cluster location in Nigeria. For each of the three directly modelled indicators, the covariates were selected through a combination of steps: fitting single-covariate models and ranking them by predictive ability, checking for multicollinearity, and choosing between highly correlated covariates. The final list consisted of covariates that were significant in at least two of the three best models. Urban accessibility was significant in only one of the models but was still included based on findings from previous work (Utazi, 2020).
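The screening steps above (rank covariates individually, then choose between highly correlated ones) can be sketched as follows. This is a simplified illustration, not the study's procedure: the absolute correlation with observed coverage stands in for the predictive ability of a single-covariate model, the 0.8 threshold is an assumed cut-off, and the toy values are invented.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def select_covariates(covariates, outcome, max_corr=0.8):
    """Rank covariates, then drop the weaker of any highly correlated pair."""
    # Assumption: |correlation with the outcome| is a stand-in for the
    # predictive ability of a single-covariate model.
    ranked = sorted(
        covariates,
        key=lambda name: abs(pearson(covariates[name], outcome)),
        reverse=True,
    )
    kept = []
    for name in ranked:
        # Keep a covariate only if it is not collinear with one already kept.
        if all(abs(pearson(covariates[name], covariates[k])) < max_corr for k in kept):
            kept.append(name)
    return kept


# Invented cluster-level values for illustration only.
covariates = {
    "travel_time": [10, 20, 30, 40, 50, 60],
    "distance": [1.1, 2.0, 3.2, 3.9, 5.1, 6.0],  # nearly collinear with travel_time
    "temperature": [30, 25, 33, 27, 31, 26],
}
coverage = [0.9, 0.8, 0.75, 0.6, 0.5, 0.45]
kept = select_covariates(covariates, coverage)
```

In this toy data only one of `travel_time` and `distance` survives, because the two are almost perfectly correlated, while `temperature` is retained.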

A geostatistical approach was used to model vaccination coverage using the selected covariates and to make predictions in unsampled areas of Nigeria on a 1 × 1 km grid. The equation used is: logit(p(s_i)) = x(s_i)^T β + ω(s_i) + ε(s_i). In this model, Y(s_i) is the number of children vaccinated at cluster location s_i, p(s_i) is the corresponding probability of vaccination (the coverage) at s_i, x(s_i) is the vector of covariate data associated with s_i, β is the corresponding vector of regression coefficients, ω(s_i) is a Gaussian spatial random effect used to capture residual spatial correlation, and ε(s_i) is an independent and identically distributed Gaussian random effect, with its own variance parameter, used to model non-spatial residual variation (Utazi, 2020). A Bayesian approach was adopted for fitting this model. To ensure consistency among the indicators, a conditional probabilities approach was used: the model was fitted to three of the indicators, namely coverage before the SIA, SIA coverage among MCV zero-dose children, and SIA coverage among children previously vaccinated. These models were applied to predict vaccination coverage at 1 × 1 km resolution. The estimates of the remaining three indicators were calculated using probability relations. Predictive performance was evaluated with statistical tools and tests, which supports the reliability of the estimates (Utazi, 2020).
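To make the model equation and the probability relations concrete, here is a minimal sketch. The first function simply inverts the logit link for given values of the covariates and random effects; it does not fit anything. The second derives the three remaining indicators from the three modelled ones using the law of total probability. These relations are a reading of the conditional probabilities approach described above, not formulas quoted from the study, and all function names and numbers are hypothetical.

```python
import math


def coverage_probability(x, beta, omega=0.0, eps=0.0):
    """Return p(s_i) given logit(p(s_i)) = x(s_i)^T beta + omega(s_i) + eps(s_i)."""
    eta = sum(xj * bj for xj, bj in zip(x, beta)) + omega + eps
    return 1.0 / (1.0 + math.exp(-eta))  # inverse logit


def derived_indicators(p_before, p_sia_zero_dose, p_sia_previous):
    """Combine the three modelled indicators into the remaining three.

    p_before:        coverage before the SIA
    p_sia_zero_dose: SIA coverage among MCV zero-dose children
    p_sia_previous:  SIA coverage among previously vaccinated children
    """
    # Vaccinated before AND during the SIA.
    before_and_during = p_before * p_sia_previous
    # Overall SIA coverage: previously vaccinated plus zero-dose children reached.
    sia_overall = before_and_during + (1.0 - p_before) * p_sia_zero_dose
    # Vaccinated before AND/OR during the SIA (at least one dose).
    before_or_during = p_before + (1.0 - p_before) * p_sia_zero_dose
    return {
        "sia_overall": sia_overall,
        "before_and_during": before_and_during,
        "before_or_during": before_or_during,
    }


# With all coefficients at zero the linear predictor is 0, so p = 0.5.
p_flat = coverage_probability(x=[1.0, 2.0], beta=[0.0, 0.0])

# Hypothetical values for a single grid cell.
cell = derived_indicators(p_before=0.5, p_sia_zero_dose=0.8, p_sia_previous=0.9)
```

For the hypothetical cell, overall SIA coverage is 0.85, coverage before and during the SIA is 0.45, and coverage before and/or during the SIA is 0.9. Deriving the last three indicators this way guarantees that they stay mutually consistent, for example that coverage before and during the SIA never exceeds coverage before it.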

The two methods, the Bayesian approach and the geospatial method, work hand in hand to describe vaccination coverage and distribution in Nigeria. The spatial methods show where immunized populations are located and where access to resources is improving, while the Bayesian model shows the effect each covariate has on the different indicators and supports further predictions. Together, these methods take the data obtained from surveys and other resources and analyze them to address the distribution of immunizations in different parts of Nigeria. The figure below shows the cluster-level vaccination coverage for children and the locations of the clusters.

image

The study found that the covariates that gave the best predictions were distance to the edge of cultivated areas (ECA), settlement type, land surface temperature, travel time to a health facility (HF), enhanced vegetation index, and travel time to cities (urban accessibility); these are shown in Figure 3 below. The number and type of covariates that were significant predictors varied by indicator. For example, all covariates except temperature and urban accessibility were significant predictors of coverage before the SIA, while urban accessibility was the only significant predictor of SIA coverage among previously vaccinated children. Distance to ECA and travel time to HF had consistent relationships with all three modelled indicators. The differing relationships between the indicators and the significant covariates could be due to differences in the spatial distributions of the indicators. Higher estimated coverage in urban settlements before the SIA, together with higher coverage in rural settlements for the two modelled indicators measuring SIA coverage, likely demonstrates the reach of the SIA in rural areas. The predicted vaccination coverage maps for the indicators of interest (Figure 2 below) show significant heterogeneities in the spatial distribution of coverage before the SIA, with most of the northern regions and parts of the south having low coverage levels. The predicted map of coverage before and during the SIA (Figure 4 below) suggests a greater likelihood of receiving two doses in areas with higher coverage before the SIA. For overall SIA coverage, SIA coverage among MCV zero-dose children, and coverage with at least one lifetime dose, the predicted coverage rates were generally higher and more spatially homogeneous, although there are still pockets of low coverage in parts of the country.
Areas of low uncertainty were more prevalent within and near areas with a high density of survey clusters, while prediction error was higher in unsampled areas. Lastly, Figure 6 below shows the probabilities of reaching the herd immunity threshold of 95% vaccination coverage. Some areas, especially urban areas, show higher probabilities, but others show low probabilities of reaching this benchmark. The SIAs and RIs prove extremely helpful but do not solve the lack of vaccinations in all areas of Nigeria.
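One common way to obtain a probability of reaching a coverage threshold from a Bayesian model is to count the share of posterior draws that meet it. The sketch below assumes that approach; the text does not say exactly how the study computed these probabilities, and the draws shown are invented.

```python
def prob_reaching_threshold(draws, threshold=0.95):
    """Share of posterior coverage draws at or above the threshold."""
    return sum(d >= threshold for d in draws) / len(draws)


# Invented posterior draws of coverage for two hypothetical grid cells.
urban_draws = [0.96, 0.97, 0.94, 0.98]
rural_draws = [0.60, 0.72, 0.95, 0.81]

p_urban = prob_reaching_threshold(urban_draws)  # 3 of 4 draws reach 95%
p_rural = prob_reaching_threshold(rural_draws)  # 1 of 4 draws reaches 95%
```

A real analysis would use many more draws per grid cell; four are used here only to keep the arithmetic visible.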

image Figure 2: image image Figure 6: image

In conclusion, Bayesian modeling and geospatial analysis showed where vaccination coverage is highest and lowest and how covariates might influence which areas experience better immunization rates. This study used surveys to obtain data in Nigeria and then applied these data science methods to analyze immunization rates. It predicted that there will still be some cold spots, where vaccination rates do not meet the 95% threshold for herd immunity. This means many people will still die from measles, a disease that is preventable by immunization. There is still a lack of data and reasoning on how to get resources to these specific areas around Africa. RIs and SIAs alone are not reaching these areas, so a new system needs to be created to give everyone access to the measles vaccine. It can also be hard to obtain surveys from every country in Africa, so barriers need to be broken down to collect all of these data. Nigeria is one of the wealthier countries, so surveys might not be the best option for obtaining information from other countries. On a broad scale, data need to be gathered from all of Africa so that the analysis of vaccination coverage across the continent is accurate and an in-depth summary of how vaccine distribution needs to change can be created.

Bibliography

Tanguy, A. (2020, March 2). Bayesian hierarchical modeling (or “more reasons why automl cannot replace data scientists yet”). Medium. Retrieved November 17, 2021, from https://towardsdatascience.com/bayesian-hierarchical-modeling-or-more-reasons-why-automl-cannot-replace-data-scientists-yet-d01e7d571d3d#:~:text=Bayesian%20hierarchical%20modelling%20is%20a,distribution%20using%20the%20Bayesian%20method.

Utazi, C. E., Wagai, J., Pannell, O., Cutts, F. T., Rhoda, D. A., Ferrari, M. J., Dieng, B., Oteri, J., Danovaro-Holliday, M. C., Adeniran, A., & Tatem, A. J. (2020, February 29). Geospatial variation in measles vaccine coverage through routine and campaign strategies in Nigeria: Analysis of recent household surveys. Vaccine. Retrieved October 4, 2021, from https://www.sciencedirect.com/science/article/pii/S0264410X20303017?via%3Dihub.

What is geospatial analytics? definition and related faqs. OmniSci. (2020). Retrieved November 17, 2021, from https://www.omnisci.com/technical-glossary/geospatial-analytics#:~:text=Geospatial%20analytics%20gathers%2C%20manipulates%20and,street%20address%20and%20zip%20code.