
Now it’s clear. Machines understand human emotions better than humans.

GenAI & emotions will change what’s possible in CTV

At the heart of every successful advertising campaign is the ability to evoke emotions that draw people closer to the brand. Television, known for compelling storytelling that captures viewers’ imagination, is, of course, the perfect medium to do this.

Today, as viewers shift to streaming, advertisers are following. This year, connected TV ad spending is expected to grow 25% to more than $30 billion. It’s an exciting time for our industry as sophisticated, data-driven strategies, which compelled the growth of digital, begin paving the way for CTV marketers.

There are plenty of blue-sky opportunities ahead. For instance, while both traditional TV and CTV earn more attention than any other medium, CTV viewers still only pay attention 30% of the time. So, how can advertisers tip the scales in their favor, earning more attention and fostering meaningful connections with their brand? If emotions are truly at the heart of human connection, savvy, data-driven CTV marketers who adopt strategies optimized for emotion will find it much easier to break through. As a proof point, at the end of this blog I will review how emotion influenced the performance of this year’s top Super Bowl ads.

Until now, emotions-based marketing strategies have been out of reach for most marketers. But this is where I believe generative AI (GenAI), with its ability to understand what is happening on the screen, interpret human emotions, and support scalable solutions stands to transform our industry. These technologies show promise of changing the ways brands build emotional resonance in CTV. In fact, in this blog I’ll make the argument that machines actually understand emotions better than humans. But, before we consider why and how that is, let’s first look at how emotions are structured and received. This will help us to better understand the connection between biology, psychology, and computer science – and how all three will play a role in helping marketers influence decisions, and capture attention.

The anatomy of emotions

Marketers know very well that emotions have a big impact on decision-making, as well as on short- and long-term brand recall. Many ads deliberately target specific emotions as advertisers seek to guide consumers toward choosing their brand with confidence. By focusing on emotions, these ads capitalize on something humans can't really control.

Stemming from the subcortical region of the brain, emotional responses are visceral. They are spontaneous, automatic expressions of the human subconscious. Rooted in such primal instincts, they make a lasting impact. As Charles Darwin first pointed out in 1872 in his book, The Expression of the Emotions in Man and Animals, emotions are ancient mechanisms, deeply rooted in our brains and observed in both humans and animals. Emotions have similar evolutionary origins across species and influence the decisions of the central nervous system through similar mechanics. It's as if a separate brain inside our brain is wired differently, trained separately, and makes decisions in parallel to our consciousness. We are taught from an early age to "control our emotions," but in truth, emotions affect our decisions and memory more than we like to admit.

This brain within our brain is called the amygdala. The amygdala, located deep in the center of our brain, receives conscious and subconscious visual and auditory signals and regulates hormones that are the source of what we call emotions. As we react to stimuli, changes in the levels of four key hormones (cortisol, serotonin, dopamine, and oxytocin), impact our emotional state.

In 1980, the psychologist Robert Plutchik created a helpful tool to illustrate how these emotions interconnect to create the range of feelings we experience. His wheel of emotions suggests eight primary bipolar emotions: joy versus sadness; anger versus fear; trust versus disgust; and surprise versus anticipation. As you can see below, these come together to support more nuanced emotions such as optimism (anticipation + joy) or anxiety (anticipation + fear).
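The structure of the wheel can be sketched as a small lookup: four bipolar pairs of primary emotions, plus named "dyads" formed by combining two primaries. This is a minimal illustration of Plutchik's model, not an exhaustive encoding of it.

```python
# Plutchik's eight primary emotions arranged as four bipolar pairs,
# plus a few of the dyads his model names for adjacent combinations.

OPPOSITES = {
    "joy": "sadness",
    "anger": "fear",
    "trust": "disgust",
    "surprise": "anticipation",
}

# Dyads: nuanced emotions formed by combining two primaries.
DYADS = {
    frozenset({"anticipation", "joy"}): "optimism",
    frozenset({"anticipation", "fear"}): "anxiety",
    frozenset({"joy", "trust"}): "love",
    frozenset({"fear", "trust"}): "submission",
}

def combine(a: str, b: str) -> str:
    """Return the named dyad for two primary emotions, if the model defines one."""
    return DYADS.get(frozenset({a, b}), f"no named dyad for {a} + {b}")

print(combine("anticipation", "joy"))   # optimism
print(combine("anticipation", "fear"))  # anxiety
```

Using a `frozenset` as the key makes the combination order-independent: anticipation + joy and joy + anticipation both resolve to optimism.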

What is a thought?

Let's define a thought. Our consciousness, a neural network residing in the frontal lobe, receives processed information from different parts of the brain: shapes and objects from the visual thalamus and visual cortex, auditory objects and frequency patterns from the temporal and auditory cortex, and words and sentences formed in Broca's and Wernicke's areas. Emotional hormones, like these other signals, play a role as our consciousness tries to form a thought. It goes through a process we sometimes call focus, which inhibits information it is trained to ignore. Once the focusing process is done, a new thought is formed and sometimes, when repeated enough, stored as a memory. Emotions are never fully inhibited, and are stored together with that thought.

In the brain, neurons are interconnected by synapses. Psychologist Donald Hebb's law has been summarized as "Cells that fire together, wire together." This means that any two neurons that are repeatedly active at the same time strengthen their synaptic connection, associating those neurons with each other. If one of these neurons fires as a result of input signals, it will cause the other neuron to fire as well. This is known as associative memory: if we remember one thing, it might remind us of another, because the neurons and synapses are connected. It's also the mechanism for anticipating sequences. When two signals typically come one after the other, the brain is trained to anticipate the second after receiving the first (similar to how the Transformer decoder in large language models (LLMs) predicts the next word or pixel).
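Hebb's rule can be illustrated with a toy simulation. This is a deliberately simplified sketch, not a biological model: co-active "neurons" accumulate connection strength, and inactive pairs don't.

```python
# Toy Hebbian learning: neurons that repeatedly fire together strengthen
# their connection (dw_ij += lr * x_i * x_j). Purely illustrative.

n = 4
weights = [[0.0] * n for _ in range(n)]
lr = 0.1

# Neurons 0 and 1 repeatedly fire together (say, the sight of a brand
# and the emotion it evoked); neurons 2 and 3 stay quiet.
for _ in range(100):
    activity = [1.0, 1.0, 0.0, 0.0]
    for i in range(n):
        for j in range(n):
            if i != j:  # no self-connections
                weights[i][j] += lr * activity[i] * activity[j]

# The 0-1 connection is now strong; activating one tends to recall the other.
print(round(weights[0][1], 6))  # 10.0
print(weights[2][3])            # 0.0
```

The strengthened 0-1 weight is the mechanical core of the associative memory described above: retrieving one member of the pair pulls the other along with it.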

Whether we realize it or not, this association is often what an effective CTV ad triggers. The ad evokes emotions that are cataloged in the brain together with the sights and sounds of the creative that represent the brand. This combination is what is stored in the brain as a thought. Later, when the thought is retrieved, it comes with the emotions attached. I'd argue that CTV's ability to evoke emotions that can be ingrained into thoughts related to the brand is a primary reason the medium is so effective.

Emotions are contagious

Of course, emotions serve a purpose beyond enriching our thoughts. They are also an important signal for social interaction. When you enter a room where everyone is happy, you are happy too. If no one else is laughing during a stand-up routine, you're not likely to laugh either. In fact, when I walk my dog in our friendly suburban neighborhood, I often experiment with emotional responses. I've noticed that when I say "Good morning" with a frown, I get a similar reaction. When I greet someone with a big friendly smile, I get a big friendly smile in return. When I speak in a cold, formal manner, my neighbors reply in a cold, formal tone.

CTV marketers would do well to note the contagious aspect of emotions. After all, eliciting emotional responses is part of what the medium does best. If you watch a happy movie, you feel happy. If you watch a sad story on the news, you feel sad. To continue the analogy of a person mirroring the emotions of others in a room: the program an ad runs alongside sets the tone of the "room" and defines the emotional state of the viewer. Marketers don't want their ads to be like the awkward person who walks into a room laughing when everyone else is crying.

Connecting man and machine

We've covered the anatomy of emotions, how emotions become embedded in thoughts, and how they form a critical part of our social language. Now, what does any of this have to do with GenAI?

The "trend" in GenAI right now is multimodal large language models (MLLMs). OpenAI was the first to introduce the concept early last year with the visual capabilities of GPT-4, and we've all seen the magnificent instantiation of that technology in Sora, which was recently released for "social response." Google got into the game with its recent Gemini release, claimed to be multimodal from the "ground up," with Gemini 1.5 Pro surpassing GPT-4 in token intake (over 1M tokens) and visual signal processing. Apple released a new open-source AI model called "MGIE," or "MLLM-Guided Image Editing." Meta invested in more than 350,000 of Nvidia's H100 GPUs to support the computational effort behind its open-source development of the multimodal Llama 3. And Sam Altman is reportedly seeking $7T to fund new hardware to train the ultimate Artificial General Intelligence (AGI), or at this point probably Artificial Superintelligence (ASI), surpassing humans in creativity and decision-making.

Compared to previous LLMs, multimodal LLMs can receive inputs not just as text, but also audio, image, and video in parallel. If this sounds familiar, it’s because it’s the same for the human brain. The brain and our thoughts are multimodal, just like MLLMs – we are not thinking in words or images alone, but rather in a combination of all of those inputs. The key difference between our brains and MLLMs, however, is that GenAI models are efficient, computational, scalable mathematical functions that imitate (and ultimately surpass) the brain’s behavior. Let’s explore more on that next.

Machines understand emotions better than humans

Today, thanks to the vast amount of knowledge and human experiences documented online over the past 30 years, MLLMs have much to pull from. While an individual’s intelligence might be limited to their personal experiences and creativity, MLLMs are building on the intelligence, including the emotional intelligence, of humanity as a whole.

As individuals, it’s sometimes hard to express our emotions – we don’t know what we feel most of the time and are unable to put our emotions into words. But, the best writers, artists, and creatives have figured out how to express their feelings with words, songs, and images – and have shared their work on the Internet for years. MLLMs have been trained upon these emotional assets, and, as a result, can understand and describe human emotion with much more clarity than any individual.

As an example, OpenAI trained their first model in 2017 on predicting Amazon product reviews, using an unsupervised scalable language training technique that could predict the next words to create fully intelligible sentences. One of the first things they noticed was the “sentiment neuron.” Their system, just by analyzing language used in the reviews, trained itself to understand the emotion of the review, and could apply that sentiment to its text generation, as humans would. Today, recent studies have shown that MLLMs are so well trained on human communication that if you use emotional language, they actually provide better responses that more closely imitate humans.
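The sentiment neuron emerged unsupervised inside a trained model and can't be reproduced in a few lines, but the underlying intuition, that word choice in a review carries a readable emotional signal, can be sketched with a toy lexicon scorer. The word weights below are invented purely for illustration.

```python
# A toy lexicon-based sentiment scorer. This is NOT OpenAI's sentiment
# neuron; it only illustrates the idea that the language of a review
# encodes its emotion. The lexicon weights are made up.

LEXICON = {
    "love": 2.0, "great": 1.5, "happy": 1.0,
    "broken": -2.0, "terrible": -2.0, "disappointed": -1.5,
}

def sentiment(text: str) -> float:
    """Sum the valence of known words; unknown words contribute nothing."""
    words = text.lower().replace(".", "").replace(",", "").split()
    return sum(LEXICON.get(w, 0.0) for w in words)

print(sentiment("I love this product, it makes me happy"))  # 3.0
print(sentiment("Terrible. It arrived broken."))            # -4.0
```

A learned model goes far beyond this, handling negation, sarcasm, and context, but the direction of the signal is the same: emotional language is measurable.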

The opportunity for marketers

For advertisers, MLLMs introduce a powerful opportunity to leverage GenAI to engage with audiences in new ways based on their thoughts and emotions.

GenAI, by analyzing imagery and language, can define with high clarity the emotions that any piece of content or any ad would evoke. Going back to an earlier example: if you watch a happy movie, there is a high probability that an MLLM will know it's happy, giving CTV platforms an opportunity to augment their ad requests and find a brand and creative that works well with that emotion.
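One way a platform could wire this up is sketched below. Everything here is hypothetical: the field names, the catalog, and the `analyze_content_emotions` stand-in (which in practice would be an MLLM call) are invented for illustration, not an actual ad-serving API.

```python
# Hypothetical sketch: augmenting a CTV ad request with emotion tags
# derived from MLLM analysis of the content. All names are invented.

def analyze_content_emotions(content_id: str) -> dict:
    """Stand-in for an MLLM call scoring content on primary emotions."""
    catalog = {  # hardcoded here; a real system would call a model
        "happy-movie-123": {"joy": 0.9, "anticipation": 0.6, "sadness": 0.05},
    }
    return catalog.get(content_id, {})

def augment_ad_request(request: dict) -> dict:
    """Attach the content's emotion profile so the ad server can match on it."""
    emotions = analyze_content_emotions(request["content_id"])
    if emotions:
        request["emotions"] = emotions
        request["dominant_emotion"] = max(emotions, key=emotions.get)
    return request

req = augment_ad_request({"content_id": "happy-movie-123", "device": "ctv"})
print(req["dominant_emotion"])  # joy
```

With the dominant emotion attached, the ad server can prefer creatives whose own emotion profile matches the "room" the program has set.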

To illustrate how this could work to marketers’ benefit, we ran an analysis of the emotions evoked by the top commercials shown during the recent Super Bowl, which saw nearly 124 million viewers. These viewers fell into two main personas: first, the dedicated sports fan who was there “For the Game” and was emotionally invested in its outcome, and second, the viewer who was there “For the Show,” primarily happy to socialize with friends and watch the industry’s best ads. The emotional state of the “For the Game” persona would likely swing towards anger and anticipation (high cortisol and low dopamine), while the “For the Show” viewer would likely experience joy and anticipation (high serotonin and low dopamine).

To choose which ads to consider, we used Kantar's Creative Effectiveness Report – Super Bowl LVIII 2024 Top Performing Ads, which ranked ads by impact, defined as "the ability to be noticed and create advertising memories for the brand." According to this report, the top three ads of the game were Doritos' "Dina & Mita" (impact score of 93), Popeyes' "The Wait is Over" (impact score of 92), and's "Tina Fey Books Whoever She Wants to Be" (impact score of 91).

As you can see in our GenAI analysis below, all three ads scored highly for joy and anticipation – two emotions that align well with the “For the Show” persona. Interestingly, Doritos’ “Dina & Mita” ad, a comedy-driven commercial that shows two women annoyed over someone else taking the last bag of chips off the shelf, also scored high for anger – reflecting the emotions of those dedicated sports fans. This may have helped it rank in the #1 spot in Kantar’s report, since it appealed to both personas by combining anger, joy, and anticipation.
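The matching logic behind this kind of analysis can be sketched as a similarity score between an ad's emotion profile and each persona's profile. The numeric scores below are invented for illustration; they are not the actual values from our GenAI analysis.

```python
import math

# Sketch: how well does an ad's emotion profile fit each viewer persona?
# Cosine similarity over a shared emotion vocabulary. All scores invented.

PRIMARIES = ["joy", "anticipation", "anger", "fear"]

def vec(profile: dict) -> list:
    """Project an emotion profile onto a fixed ordering of primaries."""
    return [profile.get(e, 0.0) for e in PRIMARIES]

def cosine(a: list, b: list) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

personas = {
    "for_the_game": {"anger": 0.7, "anticipation": 0.8},
    "for_the_show": {"joy": 0.9, "anticipation": 0.7},
}

# Invented profile for a comedy ad that also registers anger.
ad = {"joy": 0.8, "anticipation": 0.7, "anger": 0.6}

for name, profile in personas.items():
    print(name, round(cosine(vec(ad), vec(profile)), 3))
```

An ad that scores well against both personas, as "Dina & Mita" did by blending joy, anticipation, and anger, fits more of the audience than one tuned to a single persona.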

Looking ahead

There is a lot of excitement surrounding GenAI right now. A deeper look at the intersection of biology, psychology, and computer science sheds light on the exciting future this technology holds. By understanding the advancements of MLLMs, as well as the anatomy of the brain, our emotions, and thoughts, we can gain a deeper appreciation for the idea that machines can indeed understand emotions better than humans and, as such, have the power to create more positive experiences for both brands and consumers.

Already, a new generation of MLLM-powered solutions is poised to capitalize on the digital footprint of CTV and redefine how media is targeted (in fact, Disney recently announced a new AI-driven mood analysis available to advertisers on its streaming inventory). For marketers, better aligning advertising efforts with the emotional states of their audience can inspire trust and action, guiding consumers toward choosing their brand with confidence. Precise emotional targeting will benefit not only marketers, but also publishers, who will reap the reward of the greater media investment that comes hand-in-hand with better performance.
