Okay, so here’s the deal. I messed around with some “free fights” stuff the other day, and figured I’d share what I learned. It’s not pretty, but hey, it worked…sort of.

First off, I grabbed a bunch of random fighting game footage off YouTube. I’m talking everything from Street Fighter to Mortal Kombat. Just searched for “fighting game tournament” and started downloading. I know, maybe not the most ethical, but it was for science…mostly.
Then, I needed something to analyze it all. I ended up using OpenCV with Python. It’s a bit clunky at first, but once you get the hang of it, it’s pretty powerful. I installed it with pip (`pip install opencv-python`). Easy peasy.
Next, I wanted to isolate the fighters. My brilliant plan? Background subtraction. I figured if I could get a relatively static background, I could subtract it from the current frame to get the fighters. So, I took the first few frames of each video and averaged them together to create a “background” image. Then, in a loop, I subtracted that background from each subsequent frame.
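If you want to replicate that step, here’s roughly what it looked like. Treat it as a sketch: the filename and the 30-frame averaging window are placeholders, not the exact values I used.

```python
import cv2
import numpy as np

cap = cv2.VideoCapture("fight.mp4")  # placeholder filename

# Average the first 30 frames into a rough "background" image.
frames = []
for _ in range(30):
    ok, frame = cap.read()
    if not ok:
        break
    frames.append(frame.astype(np.float32))
background = np.mean(frames, axis=0).astype(np.uint8)

# Subtract that background from every subsequent frame and threshold
# the result to get a (noisy) mask of the fighters.
while True:
    ok, frame = cap.read()
    if not ok:
        break
    diff = cv2.absdiff(frame, background)
    gray = cv2.cvtColor(diff, cv2.COLOR_BGR2GRAY)
    _, mask = cv2.threshold(gray, 30, 255, cv2.THRESH_BINARY)
    # ...do something with `mask` here...
cap.release()
```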
It kinda worked. The characters definitely showed up, but there was a lot of noise. Shadows, lighting changes, the crowd going nuts in the background…it all messed with the subtraction. I tried a bunch of different thresholding techniques (`cv2.threshold` with various flags in OpenCV), but nothing really cleaned it up perfectly.
After that, I tried to track the movement of these blob-like fighter shapes. I used contours to identify the edges of the subtracted regions (`cv2.findContours`), and then calculated the center of each contour. The idea was to track the X and Y coordinates of these centers over time to see who was moving and how fast.
This is where things got really messy. Because the background subtraction wasn’t perfect, the contour detection was picking up all sorts of random stuff. Plus, when fighters got close together, their contours would merge, and I’d lose track of them. I tried filtering contours based on size and aspect ratio, but it was a constant battle of tweaking parameters.
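For the curious, the filtering boiled down to something like this. The area and aspect-ratio cutoffs here are made-up starting points; every video wanted different numbers.

```python
import cv2

def fighter_centers(mask, min_area=500, max_aspect=2.0):
    """Find blob centers in a binary mask, dropping small noise blobs
    and blobs much wider than tall. Thresholds are guesses that needed
    re-tuning for almost every video."""
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    centers = []
    for c in contours:
        if cv2.contourArea(c) < min_area:
            continue
        x, y, w, h = cv2.boundingRect(c)
        if w / h > max_aspect:  # fighters are usually taller than wide
            continue
        m = cv2.moments(c)
        if m["m00"] > 0:  # centroid from image moments
            centers.append((int(m["m10"] / m["m00"]),
                            int(m["m01"] / m["m00"])))
    return centers
```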
I also messed around with optical flow (`cv2.calcOpticalFlowFarneback`), thinking I could track the movement of pixels more directly. This gave me a ton of data, but I struggled to make sense of it. It was like trying to drink from a firehose. All these little arrows pointing in different directions… I needed a way to aggregate them and figure out which direction the overall movement was going.
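Here’s a minimal sketch of that aggregation, assuming dense Farneback flow (my best guess at what produces that per-pixel field of little arrows). Averaging the components is crude, but it collapses the firehose into one overall direction:

```python
import cv2

def overall_motion(prev_gray, gray):
    """Dense Farneback flow between two grayscale frames, collapsed
    into one average motion vector plus an overall motion magnitude."""
    flow = cv2.calcOpticalFlowFarneback(prev_gray, gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    mag, _ = cv2.cartToPolar(flow[..., 0], flow[..., 1])
    mean_dx = float(flow[..., 0].mean())  # average horizontal motion
    mean_dy = float(flow[..., 1].mean())  # average vertical motion
    return mean_dx, mean_dy, float(mag.mean())
```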
Eventually, I gave up on trying to perfectly isolate and track the fighters. Instead, I focused on detecting changes in the video. I calculated the difference between successive frames (`cv2.absdiff`) and looked for areas where the difference was above a certain threshold. This highlighted areas where there was movement, which was good enough for my purposes.

With the areas of movement identified, I just counted how many pixels were changing in each frame. If the number of changing pixels exceeded a certain threshold, I considered it an “action event.” I then marked those frames in the video. Super simple, super crude, but it highlighted the moments when someone was throwing a punch or getting knocked down.
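Here’s that whole crude pipeline in one place, with a placeholder filename and thresholds I eyeballed (25 for pixel change, 5% of the frame for an “action event”):

```python
import cv2

cap = cv2.VideoCapture("fight.mp4")  # placeholder filename
prev_gray = None
action_frames = []
frame_idx = 0
while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    if prev_gray is not None:
        # Pixels that changed noticeably since the previous frame.
        diff = cv2.absdiff(gray, prev_gray)
        _, mask = cv2.threshold(diff, 25, 255, cv2.THRESH_BINARY)
        # Flag an "action event" when enough of the frame is moving.
        if cv2.countNonZero(mask) > 0.05 * mask.size:
            action_frames.append(frame_idx)
    prev_gray = gray
    frame_idx += 1
cap.release()
print("action events at frames:", action_frames)
```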
Lessons learned? Background subtraction is great in theory, but in practice, it’s a pain to get right. Contour detection is useful, but it’s easily fooled by noise. Optical flow is powerful, but it generates a lot of data. Sometimes, the simplest approach is the best approach.
- Grab some footage. YouTube is your friend.
- Install OpenCV (`pip install opencv-python`).
- Experiment with background subtraction. But don’t expect miracles.
- Try contour detection and tracking. Be prepared for a lot of tweaking.
- Consider optical flow. If you’re feeling ambitious.
- Don’t underestimate simple frame differencing. Sometimes, less is more.
Will I be building a real-time fight analysis system anytime soon? Probably not. But it was a fun way to waste a few hours and learn some new stuff. And hey, maybe this inspires someone else to take it further. Good luck!