How Open-Source Big Data Can Tackle Healthcare Access using CARTO

September 1, 2020 blog-post geospatial big-data healthcare government
The Challenge

In light of the pandemic, we need a quick and scalable way to identify vulnerable populations for targeted interventions. One example is making sure that most of the population has access to health facilities they can go to should they need them. Now more than ever, government and health groups need to quickly identify where to allocate resources for the greatest possible impact.

Luckily, through the excellent work of the scientific and open-source geospatial community, we have global datasets that help inform this decision making. These cover indicators such as population, health facilities, and mobility, all of which help us quickly pinpoint these areas -- even down to house-level granularity.

However, with large datasets comes an even larger problem -- computing power. When you have to process gigabytes of information just to access these datasets, every processing step becomes a blocker, and something as simple as viewing data on a map becomes a huge endeavor. Huge processing tasks also mean huge costs for renting the compute capacity. Thus, for groups without the resources or the technical know-how around geospatial processing options, very granular countrywide geospatial analysis becomes difficult, costly, or both.

The Solution

Understanding this problem, CARTO, with the support of BigQuery, developed BigQuery Tiler -- a quick and easy tool to process, visualize, and then analyze large spatial datasets straight from BigQuery. Using this technology together with Thinking Machines’ datasets and geospatial processing expertise, we created a demo of how to quickly identify healthcare gaps at scale.

Going back to the earlier problem: how do we identify high-impact locations for the construction of new health facilities? As a proof of concept, we use two very popular datasets: a population layer and a health facilities layer.

Our goal is to identify high concentrations of settlements that do not have access to a health facility within a certain distance. For the purpose of this blog post, we focus on the Philippines, Malaysia, and Vietnam -- a total area of almost 1 million square kilometers. The population layer alone has around 19.6M rows, and the health facilities layer has a bit over a million points of interest in total.

Health Facility Access Map using Geospatial Data and CARTO's BQTiler!

Using BigQuery Tiler, we’re able to load both datasets onto a map in almost no time at all, without having to worry about any ETL, loading times or cost!

BigQuery Tiler allows us to partition our very large datasets in BigQuery into vector tiles, which makes loading and visualization of datasets much more manageable for our web maps. What this means is that we can easily view the population and health facility data of an entire country without having to worry about the dataset size or scale.
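To make the idea of tiling more concrete, here is a minimal sketch of how points can be assigned to map tiles using the standard Web Mercator (XYZ) tile scheme. It is only an illustration of the general concept, not BigQuery Tiler's actual implementation; the function names `tile_for` and `group_by_tile` are made up for this example.

```python
import math
from collections import defaultdict


def tile_for(lon, lat, zoom):
    """Return the (x, y) index of the Web Mercator tile containing a point."""
    n = 2 ** zoom  # number of tiles along each axis at this zoom level
    # Web Mercator is only defined up to roughly +/-85.0511 degrees latitude.
    lat = max(min(lat, 85.0511), -85.0511)
    x = int((lon + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n)
    # Clamp so that edge values (e.g. lon == 180) stay inside the grid.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def group_by_tile(points, zoom):
    """Bucket (lon, lat) points by the tile they fall into (hypothetical helper)."""
    tiles = defaultdict(list)
    for lon, lat in points:
        tiles[tile_for(lon, lat, zoom)].append((lon, lat))
    return dict(tiles)
```

A web map then only needs to request the tiles covering the current view, which is why the overall dataset size matters much less for rendering.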

Once it’s on the map, we can also easily build analysis layers on top. For example, we can filter out settlements that already have access to health facilities. This allows users to focus primarily on areas that are not within a certain distance to a health facility -- a distance that can be easily chosen by the user.
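The access filter can be sketched in a few lines of Python. This is a brute-force illustration under simple assumptions (great-circle distance on a spherical Earth, settlements and facilities given as `(lon, lat)` pairs), not the query used in the demo; the names `haversine_km` and `settlements_without_access` are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in kilometers between two lon/lat points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def settlements_without_access(settlements, facilities, max_km):
    """Keep settlements whose every facility is farther away than max_km."""
    return [
        (lon, lat)
        for lon, lat in settlements
        if all(haversine_km(lon, lat, f_lon, f_lat) > max_km for f_lon, f_lat in facilities)
    ]
```

Comparing every settlement against every facility does not scale to millions of rows. At that size one would typically push the check into the database, for example with BigQuery's `ST_DWITHIN` geography function, or use a spatial index.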

We can even quantify the vulnerable population within an area by using our drawing tools to select custom areas of interest and easily summarize the data based on that.
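Summarizing population inside a drawn area comes down to a point-in-polygon test followed by a sum. The sketch below uses the classic ray-casting test and treats lon/lat as planar coordinates, which is a reasonable approximation only for small areas; it is an illustration, not how CARTO's drawing tools are implemented, and `point_in_polygon` and `population_in_area` are hypothetical names.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: is (x, y) inside the polygon given as [(x, y), ...]?"""
    inside = False
    j = len(polygon) - 1  # index of the previous vertex
    for i in range(len(polygon)):
        xi, yi = polygon[i]
        xj, yj = polygon[j]
        # Does the edge (j -> i) straddle the horizontal line through y?
        if (yi > y) != (yj > y):
            # x coordinate where the edge crosses that horizontal line
            x_cross = (xj - xi) * (y - yi) / (yj - yi) + xi
            if x < x_cross:
                inside = not inside
        j = i
    return inside


def population_in_area(settlements, polygon):
    """Sum population over (lon, lat, population) rows that fall inside the polygon."""
    return sum(pop for lon, lat, pop in settlements if point_in_polygon(lon, lat, polygon))
```

For example, with a square area `[(0, 0), (2, 0), (2, 2), (0, 2)]`, a settlement at `(1, 1)` is counted and one at `(3, 1)` is not.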

A big plus is that we can generalize the same methodology across many different types of use cases and datasets. For example, at Thinking Machines, we use machine learning and AI to extract wealth information from satellite images at scale. We combine this extracted wealth information with building infrastructure data so that our telecommunications partners can identify ideal locations for cell sites based on their target wealth profiles and potential customer volume. We can easily extend this to other industries or use cases that require any sort of expansion or site selection.

And now, with BigQuery Tiler, we’re able to collect that information without having to worry about the scale and compute required to visualize and process the data.
