
Thinking Machines pushes for open science through the AI4D Research Bank
We’re thrilled to announce our work with the UNICEF Venture Fund to build the AI4D (Artificial Intelligence for Development) Research Bank.
The AI4D Research Bank will accelerate the development and adoption of effective machine learning (ML) models for development across Southeast Asia. We want more people to understand the nuances, limitations, and strengths of AI models and their uses in the development sector. Our goal is for this work to open up the black box of “AI”, so that discussions about using AI for policy work can be grounded in a clear-eyed understanding.
In the years we’ve spent working on AI applications in the development sector, we’ve seen how difficult the work can get. A short sample of the challenges we’ve run into: global benchmark models are trained on data from other countries where poverty looks quite different; extracting geospatial features from satellite imagery takes a lot of data engineering work; and machine learning measures of accuracy differ from the accuracy that policy work requires.
We aim to equip anybody who wants to integrate machine learning models into their work by increasing access to local training data and to technical resources for geospatial data handling. The research bank will provide resources including the following:
- Training data for localized datasets
- Data pipeline software that makes it easier to collect and process features
- Explorability, so users can see what datasets are out there, compare methods against each other, and understand how to use ML models as analytical tools
- Access to the methodology, model benchmarks, and datasets needed to reproduce research
Over the next couple of months, we’ll be releasing two sets of benchmarks: one for ML methods for haze detection in Thailand, and one for ML methods for poverty mapping in the Philippines, Cambodia, Myanmar, Timor-Leste, Malaysia, Thailand, Vietnam, Indonesia, and Laos. We’re building these models with the help of a data wrangling tool for geospatial feature engineering, which we will also be open-sourcing.
We’re excited to push for open science, open data, and open source with all of you! If you are a geospatial analyst, machine learning researcher, or data engineer excited about working on this programme, please look at our careers page and join us.