fitness influencer trends: early insights
April 26, 2025

why this topic?
I recently completed a machine learning course and wanted to build on what I learned by working with real-world data. After looking through countless datasets without finding the right fit, I ended up creating my own using fitness influencer content on Instagram.
There are patterns that influencers follow, whether they are successful or not. Let's dive into the content and see if we can uncover what those patterns look like.
the data and insights
I needed real data from a social media platform, so I used the Instaloader package to pull the ten most recent posts from ten fitness influencers on Instagram. I would've loved to retrieve more, but after running into rate limits, I decided to settle for a smaller dataset.
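For anyone who wants to try something similar, here is a rough sketch of how a pull like this could look with Instaloader. It is not the exact script behind this post: the column names, the `post_to_row` helper, and the `fetch_recent_posts` function are my own illustrative choices, and the usernames you pass in are up to you. Instaloader typically yields a profile's posts newest first, so `islice` is enough to stop after ten.

```python
from itertools import islice


def post_to_row(username, post):
    """Flatten one Instaloader Post into a plain dict (column names are illustrative)."""
    return {
        "username": username,
        "shortcode": post.shortcode,
        "caption": post.caption or "",  # caption is None when a post has no text
        "likes": post.likes,
        "comments": post.comments,  # comment count, not the comment text
        "is_video": post.is_video,
        "posted_at": post.date_utc.isoformat(),
        "thumbnail_url": post.url,  # image URL, or the thumbnail for videos
    }


def fetch_recent_posts(usernames, per_user=10):
    """Collect up to `per_user` posts for each username (hypothetical helper)."""
    import instaloader  # imported lazily so post_to_row works without it installed

    loader = instaloader.Instaloader()
    rows = []
    for username in usernames:
        profile = instaloader.Profile.from_username(loader.context, username)
        # Stop early instead of walking the whole feed; this keeps requests low,
        # which matters given Instagram's rate limits.
        for post in islice(profile.get_posts(), per_user):
            rows.append(post_to_row(username, post))
    return rows
```

Keeping the flattening step separate from the network call makes it easy to check the row format without touching Instagram at all.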
The raw dataset was useful, but I expanded it by adding image and caption sentiment scores using pre-trained sentiment models. Of course, these models aren't perfect, but they effectively capture the general emotional tone.
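As a sketch of the caption side, this is roughly how scoring could be wired up with the Hugging Face `pipeline` helper and the `cardiffnlp/twitter-roberta-base-sentiment` checkpoint. The function names and the output keys are assumptions for illustration, and the way the model is called here may differ from what I actually ran. The checkpoint reports `LABEL_0`, `LABEL_1`, and `LABEL_2`, which correspond to negative, neutral, and positive.

```python
# Label ids reported by cardiffnlp/twitter-roberta-base-sentiment
LABEL_NAMES = {"LABEL_0": "negative", "LABEL_1": "neutral", "LABEL_2": "positive"}


def build_caption_classifier():
    """Load the pre-trained sentiment model (downloads weights on first use)."""
    from transformers import pipeline  # lazy import: heavy dependency

    return pipeline(
        "sentiment-analysis",
        model="cardiffnlp/twitter-roberta-base-sentiment",
    )


def score_captions(captions, classifier):
    """Return one {'sentiment', 'confidence'} dict per caption.

    `classifier` is any callable with the pipeline's interface, so a stub
    can stand in for the real model when experimenting.
    """
    # truncation=True keeps long captions within the model's input limit
    raw = classifier(list(captions), truncation=True)
    return [
        {
            "sentiment": LABEL_NAMES.get(result["label"], result["label"]),
            "confidence": round(result["score"], 3),
        }
        for result in raw
    ]
```

A typical call would be `score_captions(captions, build_caption_classifier())`, with the results joined back onto the post rows.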
Now, let's see what the data reveals about the influencers.
how do influencers sound in captions?
From my own experience using social media, captions are rarely negative. The data supports this: positive and neutral captions make up 96% of the total, with positive being the most common. It's safe to say positive posts are widespread, but which ones drive the most engagement? I'm curious about how negative posts compare to positive ones.
what content are influencers posting?
Photos are almost non-existent when you're trapped in the Reel wormhole. With audio and motion, videos are naturally more engaging. This chart shows that most influencers post videos. I wonder which media performs better--or if it takes a combination of the two.
how does engagement correlate?
Likes and comments rise together in a roughly linear way: more likes generally means more comments. But the type of comment is more interesting. If a post's caption encourages users to respond, I'd expect fewer genuine comments. This could be a caption such as 'Comment "Diet" in the comments below to receive my diet plan via DM.' It could also be a polarizing post or something so phenomenal that users feel compelled to say something.
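One way to put a number on how linear the relationship is would be the Pearson correlation coefficient between per-post likes and comments. The sketch below is a plain-Python version; the function name is my own, and I have not reported a coefficient in this post, so treat it as a way to check the chart rather than a result.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and squared deviations from each mean
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a sequence is constant")
    return cov / sqrt(var_x * var_y)
```

Calling `pearson_r(likes, comments)` on the two columns returns a value between -1 and 1, where values near 1 back up the "more likes, more comments" reading. Posts that explicitly ask for comments would likely show up as points sitting well above the trend.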
followers per user
This chart raised a lot of questions when I compared it with the engagement chart. I had assumed that influencers with more followers would naturally have higher likes and comments, but that wasn't always the case. Some users are actually outperforming others who have far more followers.
I then looked at the media type chart to figure out why users with higher follower counts weren't performing better than those with measurably fewer followers. I couldn't find a direct correlation, so this will definitely require a deeper dive.
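A useful starting point for that deeper dive could be to normalize engagement by audience size. The sketch below uses a common definition, (likes + comments) / followers, which is an assumption on my part and not a metric I have computed for this dataset yet. The row keys are hypothetical and would need to match however the data is stored.

```python
def engagement_rate(likes, comments, followers):
    """Interactions per follower for a single post."""
    if followers <= 0:
        raise ValueError("followers must be positive")
    return (likes + comments) / followers


def mean_engagement_by_user(rows):
    """Average per-post engagement rate for each username.

    `rows` is an iterable of dicts with the (assumed) keys
    'username', 'likes', 'comments', and 'followers'.
    """
    rates = {}
    for row in rows:
        rate = engagement_rate(row["likes"], row["comments"], row["followers"])
        rates.setdefault(row["username"], []).append(rate)
    return {user: sum(values) / len(values) for user, values in rates.items()}
```

Sorting the result by value would show directly which smaller accounts punch above their follower count.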
how is emotion distributed across post media?
Happy and angry posts make up the majority of post types, but I think 'anger' here often reflects the intensity on people's faces during workouts. One thing to note is that the videos themselves weren't evaluated, just the thumbnail images. Also, not all posts show human emotion; some feature pets, and the model was not trained to recognize their emotions. This metric might be dropped or reevaluated, but it is interesting to look at.
what's next
Based on the ideas outlined while reviewing the data visuals, it's clear this could branch in many directions. To stay focused, I'll stick with the null hypothesis: Fitness influencer content has no significant variation.
Rejecting this should help answer my curiosities--like why some influencers with lower follower counts outperform those with more followers--and also highlight which posting strategies seem to work (or not).
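I haven't settled on a specific test yet, but with only around a hundred posts, a permutation test is one option that avoids assuming normally distributed likes. The sketch below compares the mean of some metric between two groups of posts (for example, videos versus photos). The function name, the choice of the mean as the statistic, and the default number of permutations are all assumptions for illustration.

```python
import random
from statistics import fmean


def permutation_test(group_a, group_b, n_permutations=10_000, seed=0):
    """Two-sided permutation test on the difference in group means.

    Returns an estimated p-value: the share of random relabelings whose
    mean difference is at least as large as the observed one.
    """
    rng = random.Random(seed)  # fixed seed so reruns give the same answer
    observed = abs(fmean(group_a) - fmean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)

    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # randomly reassign posts to the two groups
        diff = abs(fmean(pooled[:n_a]) - fmean(pooled[n_a:]))
        if diff >= observed:
            at_least_as_extreme += 1

    # Add-one correction keeps the estimate from ever being exactly zero
    return (at_least_as_extreme + 1) / (n_permutations + 1)
```

A small p-value would be evidence against "no variation" for that particular metric and grouping; with several groupings to try, the threshold would need adjusting for multiple comparisons.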
The dataset itself may still need refining. I'll have to decide whether to include non-facial posts, depending on how they're currently labeled, and whether video thumbnails should be excluded since they don't represent the full clip.
In the next post in this series, I'll start testing the null hypothesis and work toward answering the questions raised here.
shoutouts
Special thanks to the open-source tools that made this exploration possible!
- Hugging Face for providing easy-to-use models and Python tools
- Instaloader for simplifying Instagram data scraping
- TweetEval, the benchmark used to fine-tune the twitter-roberta-base-sentiment model I used for caption sentiment (check out their paper here)
- Dmytro Iakubovskyi for the facial emotion image detection model used for post images