Using AI for Automatic Logo Detection on Store Shelves
We worked with one of the biggest pharmaceutical, consumer healthcare, and personal care manufacturers in Southeast Asia to leverage artificial intelligence models in monitoring their brand’s visibility in retail locations. Within three weeks, Thinking Machines developed a high-performance logo detection model and front-end mobile application that could identify our client’s product on shelves.
Applying Deep Learning to Logo Detection
Our client had recently set up an internal Innovation Team to champion the adoption of new technologies and spark an innovation culture within the conglomerate. They came to us with a core problem: monitoring the visibility of the company’s 350 brands across multiple marketing and sales channels. A key metric they track for each brand is its share of shelf space in retail stores, which today is measured by sending a small army of people to stores to count items on shelves.
Why not create a more efficient process with AI? Computer vision models are everywhere! They’re embedded in your phone, your doorbell, even your marketing materials. Together, we worked to train one of these models to recognize one of their brand logos.
The AI-powered mobile app allows the brand team to quickly and easily assess the brand’s visibility in thousands of product photos.
YOLO - The Single Shot Detection Guide To Logo Identification
Our team set out to build a simple mobile application that could detect the logo of one of our client’s over-the-counter (OTC) brands in photos of retail shelves. The company’s marketing team could use this app to quickly and easily assess the brand’s visibility in hundreds or even thousands of photos, without having to review each photo individually.
Gathering Training Data
We started out with an initial training data set of only 732 images of the product with its logo. We then applied a series of data augmentation techniques: we cropped every image that showed the product with its logo and applied transformations such as horizontal flip, vertical flip, decolorization, edge enhancement, and blurring. This turned the original 732 images into 10,000 examples. To add true negatives, we captured several tens of thousands of images of similar products from different brands.
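The transformations listed above can be sketched with the Pillow imaging library. This is a minimal illustration, not the pipeline we actually ran: the function name `augment`, the blur radius, and the synthetic sample image are assumptions made for the example, and the input is assumed to be already cropped to the product.

```python
from PIL import Image, ImageFilter, ImageOps


def augment(image: Image.Image) -> dict[str, Image.Image]:
    """Return one transformed copy of `image` per augmentation type."""
    rgb = image.convert("RGB")
    return {
        # Mirror left to right.
        "horizontal_flip": ImageOps.mirror(rgb),
        # Flip top to bottom.
        "vertical_flip": ImageOps.flip(rgb),
        # Drop color, then convert back so every variant has 3 channels.
        "decolorized": ImageOps.grayscale(rgb).convert("RGB"),
        # Built-in Pillow edge enhancement kernel.
        "edge_enhanced": rgb.filter(ImageFilter.EDGE_ENHANCE),
        # Radius 2 is an arbitrary choice for this sketch.
        "blurred": rgb.filter(ImageFilter.GaussianBlur(radius=2)),
    }


if __name__ == "__main__":
    # Synthetic stand-in for a cropped product photo.
    sample = Image.new("RGB", (64, 48), color=(200, 30, 30))
    for name, variant in augment(sample).items():
        print(name, variant.size, variant.mode)
```

Each source image yields five variants here. Chaining transformations (for example, a blurred horizontal flip) is one way to multiply a small data set further.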
There are a number of model architectures that perform object recognition quite nicely, and we iterated through dozens of experiments to find the best-performing deep learning model setup.
Big picture, there are two types of approaches that make sense for logo detection:
Method 1: MobileNetV2 for Image Classification
For our baseline model, we used a convolutional neural network (CNN) built with TensorFlow and loaded through the TensorFlow Hub library. It uses Google’s MobileNetV2 architecture, which is specifically designed for mobile vision applications such as classification, object detection, and semantic segmentation.
Method 2: YOLOv3 for Object Detection
We also ran a single-shot detection model using the YOLOv3 framework (YOLO is shorthand for “You Only Look Once”; who said data scientists don’t have a sense of humor?) with pre-trained weights from the Darknet-53 architecture. This model doesn’t just check whether the image contains the product; it also locates the product logo’s position within the photo.
The model is able to detect objects of multiple classes within the photo and identify their locations using bounding boxes.
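Detectors of this kind typically emit many overlapping candidate boxes per object, which are then filtered by confidence and de-duplicated with non-maximum suppression (NMS). The sketch below shows that post-processing step in plain Python. The `Detection` type, the function names, and both thresholds are illustrative assumptions, not values taken from our model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    label: str    # predicted class, e.g. a brand logo
    score: float  # model confidence in [0, 1]
    x1: float     # left edge in pixels
    y1: float     # top edge in pixels
    x2: float     # right edge in pixels
    y2: float     # bottom edge in pixels


def iou(a: Detection, b: Detection) -> float:
    """Intersection over union of two bounding boxes."""
    inter_w = max(0.0, min(a.x2, b.x2) - max(a.x1, b.x1))
    inter_h = max(0.0, min(a.y2, b.y2) - max(a.y1, b.y1))
    inter = inter_w * inter_h
    area_a = (a.x2 - a.x1) * (a.y2 - a.y1)
    area_b = (b.x2 - b.x1) * (b.y2 - b.y1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(
    detections: list[Detection],
    score_threshold: float = 0.5,  # assumed value
    iou_threshold: float = 0.45,   # assumed value
) -> list[Detection]:
    """Keep the highest-scoring box among overlapping boxes of the same class."""
    candidates = sorted(
        (d for d in detections if d.score >= score_threshold),
        key=lambda d: d.score,
        reverse=True,
    )
    kept: list[Detection] = []
    for det in candidates:
        # Discard a box only if it overlaps a kept box of the same class.
        if all(det.label != k.label or iou(det, k) < iou_threshold for k in kept):
            kept.append(det)
    return kept
```

The number of boxes that survive this step is what an app can report as the count of logos found in a photo, and their coordinates give the locations.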
Results… to classify or to recognize?
We started with image classification models, assuming that the logo would be the dominant part of the images that were going to be analyzed. TensorFlow Hub has pre-trained checkpoints which we used for transfer learning with our training data. Initial training yielded good accuracy but was very biased towards the data we used. Our training data consisted mostly of close-up, posed shots of the product with the logo, so our model learned to detect those scenarios only. The model performed poorly when we tested it with real-world photos, with different and awkward angles or lighting.
F1 SCORE: 0.615
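For readers unfamiliar with the metric, the F1 score is the harmonic mean of precision and recall: F1 = 2PR / (P + R). The sketch below computes it from counts of true positives, false positives, and false negatives. The function name is an assumption for illustration, and the counts in the usage example are hypothetical, not our evaluation data.

```python
def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Compute precision, recall, and F1 from raw prediction counts."""
    # Precision: of everything flagged as the logo, how much was right.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: of all real logos, how many were found.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1: harmonic mean of the two, defined as 0 when both are 0.
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    # Hypothetical counts, chosen only to show the calculation.
    p, r, f1 = precision_recall_f1(tp=8, fp=2, fn=2)
    print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

Because it penalizes both missed logos and false alarms, F1 is a stricter summary than plain accuracy when most photos do not contain the product.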
We tried to tweak our training data and parameters but soon realized that the problem required a different approach.
The second method we tried, Object Detection, requires training data with bounding boxes labeled manually. The tedious prep work was part of the reason why we chose to try Image Classification before this method. We bit the bullet and labeled a couple hundred photos using an open source tool that could output to the format we needed. We trained the model using a GPU and manually tested the checkpoints. The model performed really well even with the limited set of labeled training images we gave it.
F1 SCORE: 0.875
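The manual bounding-box labels mentioned above have to be written in whatever format the training framework expects. Assuming the Darknet-style text format commonly used with YOLO (one line per object: class id, then box center x, center y, width, and height, all normalized by the image size), the conversion to and from pixel coordinates can be sketched as follows. The function names are hypothetical, and the format actually used in this project is not stated here.

```python
Box = tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


def to_yolo_line(class_id: int, box: Box, image_w: int, image_h: int) -> str:
    """Convert a pixel-space corner box to a normalized YOLO label line."""
    x1, y1, x2, y2 = box
    cx = (x1 + x2) / 2 / image_w  # normalized center x
    cy = (y1 + y2) / 2 / image_h  # normalized center y
    w = (x2 - x1) / image_w       # normalized width
    h = (y2 - y1) / image_h       # normalized height
    return f"{class_id} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}"


def from_yolo_line(line: str, image_w: int, image_h: int) -> tuple[int, Box]:
    """Convert a normalized YOLO label line back to a pixel-space corner box."""
    class_id, cx, cy, w, h = line.split()
    cx_px, cy_px = float(cx) * image_w, float(cy) * image_h
    w_px, h_px = float(w) * image_w, float(h) * image_h
    box = (cx_px - w_px / 2, cy_px - h_px / 2, cx_px + w_px / 2, cy_px + h_px / 2)
    return int(class_id), box


if __name__ == "__main__":
    # A logo box in a 400x200 photo; class 0 is the only class here.
    line = to_yolo_line(0, (100, 50, 300, 150), image_w=400, image_h=200)
    print(line)
    print(from_yolo_line(line, image_w=400, image_h=200))
```

Normalized coordinates keep the labels valid when photos are resized for training, which is one reason labeling tools often export in this form.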
As the last step, we built a mobile application for our client using Firebase, Google’s mobile development platform. Using the app, our client’s marketing and sales team can point their phone camera at any number of objects and have the app automatically detect if and where the product is to speed up brand audits in retail environments.
In addition, we built a KPI dashboard for our client’s Innovation Team to track how often the app has detected the product logo.
- Google Cloud Platform
- Google Data Studio
- Google Firebase
Driving Innovation One Shot At A Time
In three weeks, we tested and built out a high-performance logo detection model. It’s a great demonstration of what’s possible now with state-of-the-art computer vision libraries, and of the speed of deployment possible with Firebase.
Our client is exploring how to scale up this application so it can be used to monitor even more over-the-counter brands. In addition, our Logo Detection Application sparked interest in driving further data-driven innovation across our client’s organization.
How can our team help you get insights from data? Leave us a note on social media or email us directly at [email protected].