I watch these shorts on YT with some science being explained, and some Minecraft dude is jumping over these blocks hanging in the sky, and it feels like bro will fall, but he doesn’t. I stop scrolling for the content, but I stay for the satisfaction of the extra, irrelevant video.
Why are we hooked on watching videos where a story is narrated in a split-screen format, with half the screen showing some irrelevant gameplay or visually appealing procedure, such as painting or cleaning? The top half is usually the content, and the lower half is Visual ASMR. This new phenomenon of content delivery and attention-grabbing tricks is a symptom of the speed economy.
I’ve provided the technical context and an “attention equation” for your analysis as a creator or consumer.
- TL;DR
- Real engagement
- Visual ASMR
- The 2 halves of the video – attention ping-pongs between the 2 as artificial competition creates stimulation
- The speed economy – the reason we need an attention “re-capture” mechanism
- Method to maximize stimulation
- The attention equation – Goldilocks zone of attention
- Is the split-screen good or bad?
- Sources
TL;DR
- The second video in split-screen content is the creator’s way to recapture our attention when it drops, so that we don’t drift off to someone else’s reel.
- This attention drop is a problem because we expect high stimulation and high speed of stimulation (the speed economy), and providing an alternative area to focus on while consuming content satisfies this stimulation requirement.
- Split-screen videos increase engagement by about 4% and users find them calming and easy to focus on.
Here’s what I mean. Check out the video below.
This is the format. 1 half of the screen is the content. The other half is the “attention anchor” half that generally contains gaming, monotonous movements, visual ASMR, or cleaning, or 5-minute-craft-style content. Its goal is to maintain the user’s attention in case the user gets distracted or habitually chooses to scroll. To unpack this format, we have to unpack quite a few things.
First, I will introduce what visual ASMR is. Then we will look at how competition between the bottom and top halves maintains a user’s attention.
Think: Which half of the screen steals your attention back?
Real engagement
- Canvas8 reports that about 54% of Gen-Zs feel[1] that multiple videos at the same time hold their attention well.
- Adweek, in their 2021 survey, reported that about 94% of Gen-Z and around 84% of Millennials[2] use a second screen while watching a TV show.
- Zebracat, a video-creation company, reports (source unverified[3]) that the number of split-screen video creators increased by 36% in 2025, and videos with standardized storytelling formats (like split-screen content) showed 38% higher engagement.
- According to a 2025 paper with experimental research[4], there is a 4% increase in user engagement from split-screen content, and users are more satisfied with it compared to single-screen videos (source and explanation below). Split-screen videos are also, counter-intuitively, NOT distracting because of their ASMR content. Nor was there any change in information recall. So, no damage, but more engagement.
In their 2025 paper submitted to MIPRO, researchers from the Quinlan School of Business, Loyola University Chicago, United States, studied this split-screen format using eye-tracking tools to measure how and where viewers engage with the videos and how they feel about them. Unsurprisingly, they found that viewers were more engaged and satisfied with split-screen content compared to only subtitles or no split-screen content, and split-screen content had a calming effect. Contrary to intuition, the split-screen content neither reduced nor increased information recall compared to no split-screen or just subtitles. Again, contrary to intuition, the split-screen did not create cognitive overload and thus was not fatiguing.
In their study, the second screen had a positive impact on engagement and emotion. Users were more engaged and felt more positively about the content compared to videos that did not have a second screen.
There are very interesting patterns in their study.

When comparing videos with a second screen versus the same videos without a second screen, the engagement is 4% higher. It is also 4% higher compared to just subtitles. Subtitles plus a second screen show a 2% increase in engagement compared to just subtitles, but a 2% decrease compared to just the second screen. This decrease appears to emerge from cognitive overstimulation and the associated fatigue of subtitles + ASMR + video content.
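The four comparisons are easy to mix up, so here is a small Python sketch that places them on one scale. It assumes the percentages are simple percentage-point differences and uses the plain video as an arbitrary 0 reference. Both are my assumptions for illustration, not something the paper states, and the condition names are hypothetical labels.

```python
# Relative engagement in percentage points. The plain video is set to 0 as an
# arbitrary reference (an assumption made only for this illustration).
engagement = {
    "plain": 0,
    "subtitles": 0,  # implied: the second screen is +4 over both plain and subtitles
    "second_screen": 4,
    "subtitles_and_second_screen": 2,
}


def diff(a: str, b: str) -> int:
    """Engagement of condition a minus condition b, in percentage points."""
    return engagement[a] - engagement[b]


# The four comparisons reported in the text above.
reported = {
    ("second_screen", "plain"): 4,
    ("second_screen", "subtitles"): 4,
    ("subtitles_and_second_screen", "subtitles"): 2,
    ("subtitles_and_second_screen", "second_screen"): -2,
}

for (a, b), value in reported.items():
    print(f"{a} vs {b}: {diff(a, b):+d} (reported {value:+d})")
```

Read this way, the numbers are consistent with each other: the second screen alone sits at the top, and adding subtitles on top of it gives back half of the gain.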

Users also showed a 4% increase in positive emotional response because of the second screen.
The magic no. 4.
You can expect a 4% increase in user engagement with split-screen ASMR content.
To understand the mechanics, read on.
Visual ASMR
If you’ve been online for long enough, you’ve seen people comment that a certain video is so satisfying to watch. That satisfaction usually comes from highly predictable but neatly organized visual stimulation that is often seen in the process of painting a wall perfectly, or sorting screws into their correct sizes, or mixing colored powders, or using pressure water jets to clean a dirty carpet. That content creates the ASMR feeling.
ASMR stands for autonomous sensory meridian response. It is a tingling sensation in the head and scalp that comes along with a “positive high” or a euphoric feeling upon listening to certain types of sounds. Whispering and wet sounds during speech typically induce this ASMR response. Auditory ASMR tends to be more intense than visual ASMR, but visual ASMR usually suits the split-screen reel format better.
Visual ASMR is very quick to process and appreciate, so people tend to watch it when they come to realize that there is a strong payoff/reward. People can predict that this reward will be delivered to their brain (the so-called dopamine hit) within a few seconds. From the very first frame on the screen, there is progress toward the payoff. So this continuous growth toward the payoff captures attention.
The 2 halves of the video – attention ping-pongs between the 2 as artificial competition creates stimulation
The first half is the “content”, usually a human voice in the context of a story or an explanation. This is the content offered by the creator. The visual ASMR is leveraged to win the attention battle – to steal your attention back when it starts to dwindle.
Stimulus competition is at the heart of why split-screen drives engagement. Our attention system is simply a way to prioritize and select 1 stimulus over another. It is a way to resolve competition between multiple competitors. When you hear 2 voices, 1 with your name and 1 with a random word, your attention will mostly prioritize your name.
The 2 halves are 2 streams of attention in competition with each other, but that competition shifts attention between the 2 streams to keep a user locked in on the whole video.

Contrary to popular belief[5], short distractions, task switching, and paying attention to a small variety of information help us stay focused for a long time. When we choose just one type of task and monotonously focus on it for a while, we get habituated (unresponsive) and pay less attention to details (and changes). Our minds wander and then become prone to distractions.
The competition for attention capture has increased. More things online are competing for your attention. This steady increase in the battle for attention eventually birthed the speed economy.
The speed economy – the reason we need an attention “re-capture” mechanism
The speed at which our attention shifts from 1 thing to another is a learned behavior from getting used to many things competing for our attention. We’ve been trained to shift our attention to maximize reward. And because we are used to it, we expect the reward to be delivered quickly. This is seen in faster food delivery, faster pacing in movies, more options to select from, and expectations of faster online transactions. Even failures have to be fast: if something has to fail, like a payment or an attempt to watch a TV show, it has to fail quickly. For example, we are habituated to making decisions about watching a 10-hour season within the first few seconds of watching an episode. Our decision is based on a very small time window as a heuristic for commitment.

So, now that our attention is trained to find the most rewarding stimulus, there is a problem.
===> How do you recover a user’s attention as it drops when the brain realizes the reward is too slow? That’s the job of the visual ASMR in the second half of the screen.
The second half of the screen is the artificial competition that is created to recapture attention. And the nature of this artificial competition is such that one can get habituated to it, so attention will drop again and go back to the other half of the screen. This way, there is a continuous feedback loop between the 2 halves of the screen. Both capture dwindling attention.
The content creator eventually wins attention because instead of the user scrolling away as attention drops, attention is re-captured by the same video. Essentially, this competition between the 2 halves (the 2 stimuli) wins against some other video that would appear after scrolling. It’s an artificial conflict to win against someone else’s content.
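The feedback loop described in this section can be sketched as a toy simulation. This is only an illustration under made-up assumptions: interest in the watched half decays (habituation), interest in the unwatched half recovers, and the viewer scrolls away once nothing on screen is interesting enough. The function name `watch` and every parameter value are hypothetical, not taken from any study.

```python
def watch(num_streams: int, steps: int = 30, decay: float = 0.7,
          recovery: float = 0.2, scroll_threshold: float = 0.3) -> int:
    """Return how many time steps the viewer stays before scrolling away.

    Toy habituation model with made-up parameters:
    - the stream being watched loses interest (multiplied by `decay`),
    - every other stream regains interest (`recovery` per step, capped at 1.0),
    - the viewer scrolls once the best stream drops below `scroll_threshold`.
    """
    interest = [1.0] * num_streams
    for step in range(steps):
        # Attention goes to whichever stream is currently most interesting.
        focus = max(range(num_streams), key=lambda i: interest[i])
        if interest[focus] < scroll_threshold:
            return step  # nothing on screen is rewarding enough: scroll away
        for i in range(num_streams):
            if i == focus:
                interest[i] *= decay
            else:
                interest[i] = min(1.0, interest[i] + recovery)
    return steps


print("single video:", watch(1), "steps")
print("split screen:", watch(2), "steps")
```

With these made-up numbers, the single video loses the viewer after 4 steps, while the split-screen version keeps attention bouncing between the 2 halves for all 30 steps. The exact values mean nothing; the point is that a second stream gives habituated attention somewhere to go other than the next reel.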
Method to maximize stimulation
We can think of rapidly switching between 2 things to do as a way to increase overall stimulation. In the case of these reels, 1 reel at a time may not be enough. This is similar to how we end up scrolling through reels while watching a rerun of a TV show or a very light comedy. In those cases, our attention isn’t fully captured, that is, we are understimulated, and there is surplus attention remaining, which gets occupied by a second screen or some form of multitasking.
So the need for attention to be fully occupied means we increase stimulation. The split-screen format is a solution to this need: more stimulation within 1 single visual frame on a phone. This expectation of speed from the speed economy promotes maximizing stimulation. The faster the reward, the higher the stimulation. It is accelerated stimulation. The user pursues either speed or maximum reward, and both approaches fit right into split-screen video content.
Maximizing stimulation is the reason we can often listen to music while studying and regain focus when it gets boring. Or why listening to a podcast while cleaning keeps you engaged in cleaning. The drop in stimulation is literally defined as boredom, and maximizing stimulation is literally a way to stay engaged. So humans instinctively learn to balance their attention needs by multitasking or increasing or decreasing the intensity of their task.
The attention equation – Goldilocks zone of attention
Stimulation need = Primary content stimulation + additional stimulation.
SN = PCS (content) + AS (second screen OR split screen ASMR)
SN is in the user. PCS is created by the content creator’s main content. AS is the split-screen content that recaptures attention.
PCS ⇄ AS | Low AS | High AS |
---|---|---|
Low PCS | Under-stimulation: PCS + AS falls short of SN → user under-engaged | Artificial engagement: PCS + AS marginally meets SN → users stay but skim; low content satisfaction |
High PCS | Optimal engagement: PCS + AS ≈ SN → full focus on primary story | Over-stimulation: PCS + AS exceeds SN → cognitive overload, poor retention |
PCS (Primary Content Stimulation): intensity of the creator’s main content.
AS (Additional Stimulation): intensity of the Visual ASMR anchor.
SN (Stimulation Need): individual’s target stimulation level for engagement.
If a user’s SN equals the total stimulation from PCS and AS, engagement is at its maximum. This is the sweet spot of maximum attention-grabbing capacity. If SN is lower than the total, the user is overstimulated and engagement drops. If SN is higher than the total, the user is understimulated and engagement also drops.
- If the difference between stimulation need and primary content stimulation is not much, additional stimulation is likely to hurt by distracting the user.
- If the additional stimulation is too much, it is cognitive overload, which leads to poor retention of information.
- If the stimulation need is high and the primary content is unstimulating, split-screen videos will increase engagement.
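The attention equation and the three cases above can be written as a small Python sketch. The function name `engagement_zone`, the arbitrary 0-to-10 scale, and the `tolerance` that sets how wide the sweet spot is are all hypothetical choices made for illustration; the article itself defines no units.

```python
def engagement_zone(sn: float, pcs: float, as_: float, tolerance: float = 0.1) -> str:
    """Compare total stimulation (PCS + AS) with a user's stimulation need (SN).

    All three values share one arbitrary scale. `tolerance` is a hypothetical
    width for the sweet spot, expressed as a fraction of SN.
    """
    total = pcs + as_
    if abs(total - sn) <= tolerance * sn:
        return "optimal"
    return "over-stimulated" if total > sn else "under-stimulated"


# A dull primary video (PCS = 3) for a viewer who needs 10 units of stimulation.
print(engagement_zone(sn=10, pcs=3, as_=0))  # no second screen
print(engagement_zone(sn=10, pcs=3, as_=7))  # the ASMR anchor fills the gap

# Already engaging content (PCS = 9): the same anchor overshoots the need.
print(engagement_zone(sn=10, pcs=9, as_=7))
```

The three calls land in the under-stimulated, optimal, and over-stimulated zones in that order. The same second screen that rescues a dull video tips an already engaging one into overload.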

AS is usually captured by 1 of 2 types of attention – the exogenous attention, which responds to distractions and important stimuli, and endogenous attention, which is a motivated, goal-directed focus.
The sweet spot described above is like the Goldilocks zone, a term commonly used in astrophysics for the narrow band of orbits that can support life on a planet. It is the small range of distances from a star where a planet is just far enough to sustain liquid water and an atmosphere, and just close enough to receive the heat needed for the chemical reactions that support life. Split-screen videos are in that attention Goldilocks zone.
As a user, or a content creator, the central question to ask is – what is the need for stimulation? Based on that, the overall stimulation can be adjusted by modifying PCS or AS.
Is the split-screen good or bad?
Are we on social media to feel good and use zero brain cells, or are we there to learn?
- The good: If you are there to feel good and use zero brain cells, split-screen content is likely good for you because it keeps you engaged without thinking.
- The bad: If you are there to learn, the value of split-screen stimulation depends on how stimulating the content itself is and how high your “need for stimulation” is. Out-of-balance stimulation will kill viewership. (There is a personality trait describing this need for stimulation called sensation seeking; you might enjoy reading about that.)
Sources
[2]: https://www.adweek.com/brand-marketing/infographic-what-is-everyone-doing-on-their-second-screens/
[3]: https://www.zebracat.ai/post/tiktok-statistics#strongcontent-trendsstrong
[4]: https://docs.mipro-proceedings.com/hci/18_20084_hci.pdf
[5]: https://www.sciencedirect.com/science/article/abs/pii/S0010027710002994

Hey! Thank you for reading; hope you enjoyed the article. I run Cognition Today to capture some of the most fascinating mechanisms that guide our lives. My content here is referenced and featured in NY Times, Forbes, CNET, and Entrepreneur, and many other books & research papers.
I’m a psychology SME consultant in EdTech with a focus on AI cognition and Behavioral Engineering. I’m also affiliated with myelin, an EdTech company in India.
I’ve studied at NIMHANS Bangalore (positive psychology), Savitribai Phule Pune University (clinical psychology), Fergusson College (BA psych), and affiliated with IIM Ahmedabad (marketing psychology). I’m currently studying Korean at Seoul National University.
I’m based in Pune, India but living in Seoul, S. Korea. Love Sci-fi, horror media; Love rock, metal, synthwave, and K-pop music; can’t whistle; can play 2 guitars at a time.