
Send Your Friends a Digital Holiday Card With Our AI Holiday Greeter!

December 18, 2020 blog-post general-interest christmas holidays artificial-intelligence data-visualization nlp topic-modeling dimensionality-reduction transformers

It’s that time of the year again! As is now tradition, the Thinking Machines team challenged ourselves to use our artificial intelligence and data visualization expertise to bring you some holiday cheer through a fun side project.

In the past, we’ve come up with a snowflake generator, trained an AI to generate a unique artwork in the style of famous artists, and used geospatial data to create parol-inspired holiday cards. With this year being tough for everyone, we definitely didn’t want to miss the chance to make something special for our friends and partners at Thinking Machines. This year, we decided to highlight Natural Language Processing (NLP) and data visualization in the newest addition to our holiday card generators, the AI Holiday Greeter!

Enjoy a bokeh Christmas-lights visualization of what people are saying about the holidays in 2020, plus a series of AI-generated holiday greetings, as you make your own holiday card with our AI Holiday Greeter below. Add your personal message and voila! Your very own digital holiday card is ready to share with your friends and family.

More interested in how we came up with this? Scroll down for the details!


Step 1: Coming up with topic clusters

What better way to know what people are saying about the holidays than via social media listening? We scraped Twitter for tweets containing holiday-themed keywords like Christmas, pasko, and holidays; each tweet became one point in our dataset.

To help the topic modeling algorithm and keep things simple, we pre-processed the tweets by removing common stopwords (e.g. the, a, is), punctuation, mentions, and hashtags to leave just the core message.
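As a rough sketch of that cleaning step (the stopword list and helper function below are illustrative, not our production pipeline):

```python
import re

# A tiny stopword list for illustration; the real pipeline used a much
# fuller list, including Filipino stopwords for tweets like "pasko".
STOPWORDS = {"the", "a", "an", "is", "are", "to", "and", "of", "in", "it"}

def preprocess(tweet: str) -> str:
    """Strip mentions, hashtags, punctuation, and stopwords from a tweet."""
    text = tweet.lower()
    text = re.sub(r"@\w+|#\w+", " ", text)  # drop mentions and hashtags
    text = re.sub(r"[^a-z\s]", " ", text)   # drop punctuation and digits
    tokens = [t for t in text.split() if t not in STOPWORDS]
    return " ".join(tokens)

print(preprocess("Merry Christmas @friend! #pasko The lights are beautiful."))
# → "merry christmas lights beautiful"
```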

We then turned the text into a format that topic modeling algorithms can understand: numbers. We used a technique called TF-IDF to turn each tweet into a single vector. The vector encodes each word's importance: how often it appears within the tweet, weighted down by how common it is across the entire corpus.

Now we are ready to apply machine learning models. We used non-negative matrix factorization (NMF) to discover latent topics shared by similar tweets. NMF assigns each document (in our case, each tweet) to a topic based on how relevant the words in the tweet are to that topic.

With this information, we can now assign a christmas-y color to each cluster of tweets and map everything in 2D space. But wait! How exactly do we map the tweets in 2D space if the size of their vector representations equals the vocabulary size, which is definitely much larger than two? This is where dimensionality reduction techniques like Uniform Manifold Approximation and Projection for Dimension Reduction (UMAP) and t-distributed Stochastic Neighbor Embedding (t-SNE) come in.
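A minimal t-SNE projection with scikit-learn looks like this (random data stands in for the real TF-IDF matrix, and the perplexity is small only because the demo dataset is tiny):

```python
import numpy as np
from sklearn.manifold import TSNE

# Stand-in for a TF-IDF matrix: 20 "tweets" in a 50-dimensional vocabulary.
rng = np.random.default_rng(0)
X = rng.random((20, 50))

# Project down to 2D; perplexity must be smaller than the sample count.
coords = TSNE(n_components=2, perplexity=5, random_state=0).fit_transform(X)
print(coords.shape)  # one (x, y) point per tweet
```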

Step 2: Turning scatter plots into Christmas-y visualizations

So how did we make a bunch of dots look festive and Christmas card-worthy? The answer was plain and simple: bokeh.

Using D3, a JavaScript library for data visualization, we first had to choose which set of 2D coordinates looked the most pleasantly scattered. In most of our attempts, the t-SNE mapping worked best; it was important for each dot to have some separation from the others so that each one could stand out.

The next step was coloring the dots by cluster, and the right colors weren’t hard to find at all! The christmas-y palette came from our very own Thinking Machines design system, Asimov. The primary colors in Asimov perfectly resembled holiday lights, and after carefully assigning each color to the individual clusters, we had a vibrant scatter plot.

It still wasn’t quite as christmas-y as we wanted it to be, but after a few tweaks in dot size and opacity, we went from this:

to this:

Once we'd made the bokeh look satisfyingly realistic, we added some blur and luminance touches, rendered it (in 3D!) in the app, and our christmas lights were good to go!

Step 3: Training an AI model

Of course, we didn’t want to stop at just those beautiful visualizations! To make it extra special, we decided to train an AI to generate the greetings for us.

We started with OpenAI's GPT-2 language model, an AI that takes a short prompt and uses it to generate coherent text mimicking the prompt's style and content. For example, you could give it one paragraph of a story and it could continue with a few more pages of text that, surprisingly, make sense.
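Using the Hugging Face transformers library (one common way to run GPT-2; our Colab notebook's exact setup may differ), prompt-based generation looks roughly like this:

```python
from transformers import pipeline, set_seed

# Load the small pre-trained GPT-2 (before any holiday fine-tuning).
generator = pipeline("text-generation", model="gpt2")
set_seed(42)  # make the sampled continuation reproducible

prompt = "Merry Christmas to all our friends and partners!"
result = generator(prompt, max_new_tokens=30, num_return_sequences=1, do_sample=True)
print(result[0]["generated_text"])
```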

We then tailored it for our purpose by scouring the internet for christmas-y songs, poems, and greetings, and using these sample texts to fine-tune the model further. After letting it run for a few hours on a GPU, we were done! Our GPT-2 had learned the holiday spirit and could now merrily greet us :)

(Curious to try an unabridged version? Why not train the AI yourself? Check out the instructions in our Google Colab notebook linked at the end of this article!)

Step 4: Letting it all come together

As you’ve already seen, it all comes together in a web app that is designed to look like a postcard!

We built the web app using Next.js, with various React libraries helping us with UI components, detecting whether the user is on mobile or desktop, and creating a carousel slider for the AI-generated messages. When a generated card is shared, it gets saved in Firestore so we don't have to re-run the model every time it's viewed. Each card has its own unique, random URL that you can privately share with your loved ones!

Check out our code!

We also open-sourced our greeting-generator implementation on Google Colab, so you can train the model on your own, no supercomputer needed!
