It’s World Cup season, which means even articles about machine learning have to have a football angle. Today’s concession to the beautiful game is a system that takes 2D videos of matches and recreates them in 3D so you can watch them on your coffee table (assuming you have some kind of augmented reality setup, which you almost certainly don’t). It’s not as good as being there, but it might be better than watching on TV.
The “Soccer On Your Tabletop” system takes as its input a video of a match and watches it carefully, tracking each player and their movements individually. The images of the players are then mapped onto 3D models “extracted from soccer video games,” and placed on a 3D representation of the field. Basically they cross FIFA 18 with real life and produce a sort of miniature hybrid.
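To make the "placed on a 3D representation of the field" step a little more concrete, here is a minimal Python sketch of one common way to do it: mapping the pixel where a tracked player touches the ground onto the pitch plane with a homography. This is an illustration under assumptions, not the researchers’ actual method. The names `Detection`, `apply_homography`, `place_players` and the matrix `H_EXAMPLE` are all hypothetical, and the numbers in the matrix are made up rather than calibrated from any real broadcast.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """One tracked player in a single video frame (hypothetical structure)."""
    player_id: int
    foot_x: float  # pixel column where the player touches the ground
    foot_y: float  # pixel row where the player touches the ground


def apply_homography(H, x, y):
    """Map image pixel (x, y) to field-plane coordinates via a 3x3 homography H."""
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if abs(w) < 1e-12:
        raise ValueError("point maps to infinity under this homography")
    # Divide by w to go from homogeneous back to ordinary 2D coordinates.
    return u / w, v / w


def place_players(detections, H):
    """Return {player_id: (field_x, field_y)} for every detection in a frame."""
    return {d.player_id: apply_homography(H, d.foot_x, d.foot_y) for d in detections}


# Assumed, made-up homography; a real one would be estimated from the
# pitch markings visible in the broadcast footage.
H_EXAMPLE = [
    [0.05, 0.0, 0.0],
    [0.0, 0.1, 0.0],
    [0.0, 0.001, 1.0],
]

if __name__ == "__main__":
    frame = [Detection(7, 640.0, 500.0), Detection(10, 0.0, 0.0)]
    for pid, (fx, fy) in place_players(frame, H_EXAMPLE).items():
        print(f"player {pid}: x={fx:.2f}, y={fy:.2f}")
```

A flat mapping like this only says where a player stands on the pitch. Recovering each player’s 3D shape and pose, which is the hard part the system tackles, is a separate problem that this sketch does not attempt.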
Considering the source data — two-dimensional, low-resolution and in motion — it’s a pretty serious accomplishment to reliably reconstruct a realistic and reasonably accurate 3D pose for each player.
Now, it’s far from perfect. One might even say it’s a bit useless. The players’ positions are estimated, so they jump around a bit, and the ball doesn’t really appear much, so everyone appears to just be dancing around on a field. (Tracking the ball is on the to-do list.)
But the idea is great, and this is a working, if highly limited, first shot at it. Assuming the system could ingest a whole game shot from multiple angles (it could source the footage directly from the networks), you could have a 3D replay available just minutes after the actual match concluded.
Not only that, but wouldn’t it be cool to be able to gather round a central location and watch the game from multiple angles? I’ve always thought one of the worst things about watching sports on TV is that everyone sits there staring in one direction, seeing the exact same thing. Letting people spread out, pick sides and see things from different angles to analyze strategies would be fantastic.
All we need is for someone to invent a perfect, affordable holographic display that works from all angles and we’re set.
The research is being presented at the Computer Vision and Pattern Recognition conference in Salt Lake City, and it’s a collaboration between Facebook, Google and the University of Washington.