Alright, so I decided to dive into predicting the Rune vs. Djokovic match. Here’s how it went down, step by step, from the initial idea to my attempt at putting it into action.

First things first: Gathering Data
Okay, so the very first thing I did was hit up the usual spots for tennis data. I started scraping match results, focusing on recent performances of both Rune and Djokovic. Think sets won, sets lost, their head-to-head record, and even stuff like their performance on different court surfaces.
- Got match history from a couple of tennis stats websites.
- Pulled in data on their rankings and recent tournament results.
- Even tried to find some info on their injury history (always important!).
Cleaning and Prepping the Data – Ugh, the Grunt Work
This part? Not so fun. The data was all over the place. Dates in different formats, inconsistencies in player names…you name it. I spent a good chunk of time cleaning everything up. This involved:
- Standardizing date formats.
- Making sure player names were consistent (Djokovic vs. N. Djokovic vs. Novak – you get the idea).
- Handling missing data – sometimes I imputed it based on averages, sometimes I just had to toss it.
After cleaning, I organized the data into a format that I could actually use. Think spreadsheets and maybe a bit of Python to get things in order.
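For a rough idea of what that "bit of Python" could look like, here is a minimal sketch of the three cleanup steps above. The alias table, the list of date formats, and the helper names are all made up for illustration; the real ones depend on whichever sites the data came from.

```python
from datetime import datetime
from statistics import mean

# Hypothetical alias table: every name variant maps to one canonical spelling.
NAME_ALIASES = {
    "djokovic": "Novak Djokovic",
    "n. djokovic": "Novak Djokovic",
    "novak": "Novak Djokovic",
    "novak djokovic": "Novak Djokovic",
    "rune": "Holger Rune",
    "h. rune": "Holger Rune",
    "holger rune": "Holger Rune",
}

# Assumed date formats; extend the tuple as new ones turn up in the data.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%b %d, %Y")


def normalize_name(raw):
    """Map a raw name variant to its canonical spelling (unknown names pass through)."""
    key = " ".join(raw.strip().lower().split())
    return NAME_ALIASES.get(key, raw.strip())


def parse_date(raw):
    """Try each known format; return an ISO date string, or None if nothing matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


def impute_with_mean(values):
    """Replace None entries with the mean of the known values."""
    known = [v for v in values if v is not None]
    if not known:
        return values  # nothing to average, so leave the gaps alone
    fill = mean(known)
    return [fill if v is None else v for v in values]
```

Rows where `parse_date` returns None are the ones that either need a new format added or get tossed.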
Figuring Out the Key Factors
Alright, with the data somewhat under control, I started thinking about what actually matters in a tennis match. I decided to focus on these key areas:
- Head-to-Head Record: How have they performed against each other in the past?
- Recent Form: How have they been playing in recent tournaments? Wins and losses, obviously, but also things like how many sets they’re winning and losing.
- Court Surface: Are they better on clay, grass, or hard courts?
- Ranking: A general indicator of skill level.
I gave each of these factors a weight. Head-to-head record, for example, got a slightly higher weight because I figured it was a good indicator of their psychological edge against each other.
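To turn factors like these into numbers, something along these lines would do. This is only a sketch: it assumes each match is a dict with `winner`, `loser`, and `surface` keys, listed oldest first, which is my own made-up record format, and the 0.5 fallback for "no data" is an arbitrary neutral choice.

```python
def head_to_head_share(matches, player, opponent):
    """Fraction of the meetings between the two that `player` won (0.5 if they never met)."""
    meetings = [m for m in matches if {m["winner"], m["loser"]} == {player, opponent}]
    if not meetings:
        return 0.5
    return sum(m["winner"] == player for m in meetings) / len(meetings)


def win_rate(matches, player, surface=None, last_n=None):
    """Win rate for `player`, optionally limited to one surface and/or the last N matches."""
    played = [m for m in matches if player in (m["winner"], m["loser"])]
    if surface is not None:
        played = [m for m in played if m["surface"] == surface]
    if last_n is not None:
        played = played[-last_n:]  # matches are assumed to be in chronological order
    if not played:
        return 0.5
    return sum(m["winner"] == player for m in played) / len(played)
```

With that, "recent form" is `win_rate(matches, player, last_n=10)` and "court surface" is `win_rate(matches, player, surface="clay")`, both on a 0 to 1 scale so they can be weighted side by side.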
Building a “Prediction Model” (I Use the Term Loosely)
Okay, so “model” might be a bit of an overstatement. Basically, I created a simple scoring system based on the factors I identified. Here’s how it worked:
- I assigned scores to each player based on their performance in each category. For example, if Djokovic had won their last three matches, he’d get a higher score in the “head-to-head” category.
- I multiplied each score by its weight.
- I added up all the weighted scores to get a final score for each player.
- The player with the higher score was my “predicted” winner.
It wasn’t fancy, but it was something.
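In code, the scoring steps above boil down to a weighted sum. Here is a minimal sketch; the weight values are placeholders I picked for illustration (head-to-head is just nudged a bit higher than the rest), and the function names are hypothetical.

```python
# Placeholder weights, not the actual values used; head-to-head gets slightly more.
WEIGHTS = {
    "head_to_head": 0.30,
    "recent_form": 0.25,
    "surface": 0.25,
    "ranking": 0.20,
}


def weighted_score(scores, weights=WEIGHTS):
    """Multiply each category score by its weight and add everything up."""
    return sum(weights[category] * scores[category] for category in weights)


def predict(scores_by_player, weights=WEIGHTS):
    """Return (predicted winner, totals per player); the higher total wins."""
    totals = {
        player: weighted_score(scores, weights)
        for player, scores in scores_by_player.items()
    }
    winner = max(totals, key=totals.get)
    return winner, totals
```

Calling `predict({"Player A": {...}, "Player B": {...}})` with one score per category for each player returns the name with the higher total, plus the totals themselves in case you want to see how close it was.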
The Actual “Prediction”
So, after crunching the numbers (using my super-sophisticated scoring system), I made my prediction. Drumroll, please…

Based on the data and my weighting system, my “model” predicted Djokovic would win.
What Happened?
Well, that’s where things get interesting. Did I get it right? Did I completely whiff? Honestly, I’m not going to tell you the result here. What matters is the process, right?
Lessons Learned
This whole exercise was a good reminder that:
- Data cleaning is a HUGE part of any project.
- Even simple models can be insightful.
- Predicting anything is hard, especially something as unpredictable as a tennis match.
Would I use this “model” to bet my life savings? Absolutely not. But it was a fun way to explore data and think about what factors influence a tennis match. And who knows, maybe with some tweaking and more sophisticated techniques, I could actually build a decent prediction model someday.
That’s all, folks. Thanks for coming to my TED Talk!