Dynamic NeRFs for Soccer Scenes

We investigate the use of neural radiance fields, a recent deep learning-based novel view synthesis method, for reconstructing soccer replays in space and time. This challenging task involves generating new views of a large-scale static environment, the stadium, in which small dynamic elements, the players and the ball, interact. Very few open solutions exist, and none of them produces satisfying results. A commercial method with impressive results exists, but it requires very precise conditions and its inner workings are unknown. We therefore propose an introduction to the use of more recent dynamic neural radiance field models for this task, considering several camera positions and identifying both the key challenges and the components that help most. Because no open data exists, we construct synthetic environments that we publicly release along with our code, which is based on a recent open-source framework, Nerfstudio.

Abstract


The long-standing problem of novel view synthesis has many applications, notably in sports broadcasting. Photorealistic novel view synthesis of soccer actions, in particular, is of enormous interest to the broadcast industry. Yet only a few industrial solutions have been proposed, and even fewer that achieve near-broadcast quality of the synthetic replays. Except for their setup of multiple static cameras around the playfield, the best proprietary systems disclose close to no information about their inner workings. Leveraging multiple static cameras for such a task indeed presents a challenge rarely tackled in the literature, for lack of public datasets: the reconstruction of a large-scale, mostly static environment, with small, fast-moving elements. Recently, the emergence of neural radiance fields has induced stunning progress in many novel view synthesis applications, leveraging deep learning principles to produce photorealistic results in the most challenging settings. In this work, we investigate the feasibility of basing a solution to the task on dynamic NeRFs, i.e., neural models purposed to reconstruct general dynamic content. We compose synthetic soccer environments and conduct multiple experiments using them, identifying key components that help reconstruct soccer scenes with dynamic NeRFs. We show that, although this approach cannot fully meet the quality requirements for the target application, it suggests promising avenues toward a cost-efficient, automatic solution. We also make our dataset and code publicly available, with the goal of encouraging further efforts from the research community on the task of novel view synthesis for dynamic soccer scenes.

Results


Close-up Views


The first scene considered consists of 30 cameras placed close around the player. This resembles typical NeRF benchmark datasets, with inward-facing cameras surrounding the content of interest. In these conditions, the models, as expected, do not require additional components to work well. However, this many close-up views would not be available in real-world conditions.
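To make the inward-facing layout concrete, here is a minimal sketch of how such a ring of cameras could be generated. The radius, camera height, look-at target, and z-up convention are illustrative assumptions, not the values used to build our scenes, and `ring_cameras` is a hypothetical helper.

```python
import math

def ring_cameras(n_cameras, radius, height, target=(0.0, 0.0, 1.0)):
    """Place cameras evenly on a horizontal circle, each looking at `target`.

    Assumes a z-up world. Returns a list of (position, forward) pairs,
    where `forward` is a unit vector pointing from the camera to the target.
    """
    cameras = []
    for k in range(n_cameras):
        angle = 2.0 * math.pi * k / n_cameras
        position = (
            target[0] + radius * math.cos(angle),
            target[1] + radius * math.sin(angle),
            height,
        )
        # Viewing direction: from the camera position toward the target.
        direction = tuple(t - p for t, p in zip(target, position))
        norm = math.sqrt(sum(d * d for d in direction))
        forward = tuple(d / norm for d in direction)
        cameras.append((position, forward))
    return cameras

# 30 cameras, as in the close-up scene; radius and height are assumed values.
cameras = ring_cameras(n_cameras=30, radius=4.0, height=1.5)
```

Every camera sees the same small volume from a different angle, which is the regime most NeRF benchmarks are designed around.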

Broadcast-style Views


To study more realistic conditions, we consider a very similar scene with 20 more distant cameras located at the borders of the field. These views more closely resemble broadcast conditions but are challenging for the models, as the cameras are farther away and the players appear smaller. We show that the models can still produce satisfying results, but require additional components to do so, chiefly ray importance sampling.
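As a rough illustration of the idea behind ray importance sampling, the sketch below draws training pixels with probability proportional to how much they differ from a static background estimate, so that the few pixels covering a player are sampled far more often than under uniform sampling. The difference-to-background weighting, the `floor` value, and the helper names are assumptions made for this example; it is not the exact scheme used by the models in the paper.

```python
import random

def importance_weights(frame, background, floor=0.05):
    """Normalized per-pixel sampling weights from the difference to a background.

    frame, background: 2D lists of grayscale intensities in [0, 1].
    floor: minimum raw weight, so static pixels are still sampled sometimes.
    """
    weights = []
    for row_f, row_b in zip(frame, background):
        for f, b in zip(row_f, row_b):
            weights.append(max(abs(f - b), floor))
    total = sum(weights)
    return [w / total for w in weights]

def sample_pixels(weights, width, n_rays, rng):
    """Draw n_rays (row, col) pixel coordinates proportionally to `weights`."""
    indices = rng.choices(range(len(weights)), weights=weights, k=n_rays)
    return [divmod(i, width) for i in indices]

# Toy 4x4 frame: one bright "player" pixel on a uniform pitch.
background = [[0.2] * 4 for _ in range(4)]
frame = [row[:] for row in background]
frame[1][2] = 1.0

weights = importance_weights(frame, background)
rays = sample_pixels(weights, width=4, n_rays=1000, rng=random.Random(0))

# Fraction of sampled rays that hit the player pixel (uniform would give 1/16).
player_share = sum(1 for r in rays if r == (1, 2)) / len(rays)
```

The `floor` term keeps the static stadium in the training batches; without it, the background would receive no supervision at all.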

Stadium-wide Views


A third, more challenging scene is considered, with additional views that might be more easily available in real-world conditions, especially with an array of static cameras as used in our work. These views are located at the top of the stadium and are very distant from the players. We show that the models can still produce interesting results, but their applications are limited by the substantial lack of detail in the players.
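The loss of detail follows directly from camera distance. As a back-of-the-envelope sketch under a pinhole camera model, the on-screen height of a player shrinks in inverse proportion to the distance. The player height, image resolution, field of view, and distances below are assumed values chosen for illustration, not measurements from our scenes.

```python
import math

def projected_height_px(object_height_m, distance_m, image_height_px, vertical_fov_deg):
    """Approximate on-screen height (in pixels) of an object under a pinhole model."""
    # Focal length in pixels, derived from the vertical field of view.
    focal_px = (image_height_px / 2.0) / math.tan(math.radians(vertical_fov_deg) / 2.0)
    return focal_px * object_height_m / distance_m

# Assumed setup: 1.8 m player, 1080-pixel-high image, 40 degree vertical FOV.
for distance in (5.0, 40.0, 120.0):
    px = projected_height_px(1.8, distance, image_height_px=1080, vertical_fov_deg=40.0)
    print(f"{distance:6.1f} m -> {px:7.1f} px")
```

Under these assumptions, moving a camera from 5 m to 120 m reduces the player's height in pixels by a factor of 24, leaving far fewer rays to supervise the dynamic content.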

Paper


Dynamic NeRFs for Soccer Scenes

Sacha Lewin, Maxime Vandegar, Thomas Hoyoux, Olivier Barnich, Gilles Louppe

arXiv version (11.5 MB)
BibTeX
Video Results
Data
Code

Please send feedback and questions to Sacha Lewin

Citation


@inproceedings{lewinsoccernerfs,
author = {Lewin, Sacha and Vandegar, Maxime and Hoyoux, Thomas and Barnich, Olivier and Louppe, Gilles},
title = {Dynamic NeRFs for Soccer Scenes},
year = {2023},
isbn = {9798400702693},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3606038.3616158},
doi = {10.1145/3606038.3616158},
booktitle = {Proceedings of the 6th International Workshop on Multimedia Content Analysis in Sports},
pages = {113–121},
numpages = {9},
keywords = {dynamic, 3d reconstruction, sports, soccer, scene representation, neural radiance fields},
location = {Ottawa ON, Canada},
series = {MMSports '23}
}

Acknowledgements


We sincerely thank EVS Broadcast Equipment for providing the compute needed for our experiments, and the Nerfstudio community for helpful insights. We also thank Joey Litalien for the framework for this website, originally made for Instant-NGP.

Stadium Blender model by MrChimp2313 (CC0 License)
Player models from the Adobe Mixamo Repository