Contributions:
- Extensive experiments on 2D video approximation and sparse 3D reconstruction.
- Analysis of activation function performance under varying data sparsity conditions.
- Implementation of a geometric initialization scheme for the Gaussian activation.
Authors: Ruben Schenk, Bruce Balfour, Alexandra Trofimova
Institution: ETH Zurich, Institute for Visual Computing
Overview
Coordinate-based Multi-Layer Perceptrons (MLPs) have gained significant traction in recent years for their ability to approximate complex signals such as 2D images, 3D shapes, and even 4D spatiotemporal data. However, one aspect that remains underexplored is the implicit bias introduced by the choice of activation function within these networks. This project examines the effect of different activation functions on the representational capacity of coordinate-MLPs, with a particular focus on their performance in 2D video approximation and sparse 3D reconstruction tasks.
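As a minimal illustration (a sketch in PyTorch with arbitrary layer sizes, not the exact architecture used in this project), a coordinate-MLP is an ordinary fully connected network that maps input coordinates directly to signal values:

```python
import torch
import torch.nn as nn

class Sine(nn.Module):
    """Sinusoidal activation as used in SIREN; w0 scales input frequency."""
    def __init__(self, w0: float = 30.0):
        super().__init__()
        self.w0 = w0

    def forward(self, x):
        return torch.sin(self.w0 * x)

def coordinate_mlp(in_dim=2, hidden=256, depth=4, out_dim=3, act=None):
    """Build an MLP mapping d-dimensional coordinates (e.g. (x, y) pixels
    or (x, y, z) points) directly to signal values (e.g. RGB or density)."""
    act = act if act is not None else Sine()
    layers = [nn.Linear(in_dim, hidden), act]
    for _ in range(depth - 1):
        layers += [nn.Linear(hidden, hidden), act]
    layers.append(nn.Linear(hidden, out_dim))
    return nn.Sequential(*layers)

# Example: predict RGB at normalized pixel coordinates in [-1, 1]^2.
model = coordinate_mlp(in_dim=2, out_dim=3)
coords = torch.rand(1024, 2) * 2 - 1
rgb = model(coords)  # shape (1024, 3)
```

Note that a faithful SIREN additionally relies on a specific uniform weight initialization scaled by w0, which this sketch omits for brevity.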
Motivation
While periodic activation functions, such as the sinusoids used in SIREN, have proven effective for learning continuous signals, they introduce inherent biases that may be suboptimal for certain data types, especially sparse or irregularly sampled datasets. The objective of this study was to evaluate non-periodic activation functions (Gaussian, Laplacian, Quadratic) against SIREN, and to propose a geometric initialization scheme aimed at improving stability in 3D approximation tasks.
Methodology
The experimental framework was divided into two main parts:
2D Video Approximation:
- Videos were represented as sequences of 2D frames, each fitted by a coordinate-MLP as a continuous function from pixel coordinates to color values.
- Non-periodic activations (Gaussian, Laplacian, Quadratic) were implemented and compared against SIREN in terms of signal fidelity and convergence speed; one common parameterization of these activations is sketched below.
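The report does not spell out the exact functional forms of these activations; a common parameterization, following prior work on non-periodic activations for coordinate-MLPs (e.g. Ramasinghe & Lucey, 2022), is sketched here, with `a` a tunable bandwidth hyperparameter (the project's exact forms may differ):

```python
import torch
import torch.nn as nn

class Gaussian(nn.Module):
    """exp(-x^2 / (2 a^2)): smooth, bump-shaped, non-periodic."""
    def __init__(self, a: float = 1.0):
        super().__init__()
        self.a = a

    def forward(self, x):
        return torch.exp(-x ** 2 / (2 * self.a ** 2))

class Laplacian(nn.Module):
    """exp(-|x| / a): sharper peak at zero than the Gaussian."""
    def __init__(self, a: float = 1.0):
        super().__init__()
        self.a = a

    def forward(self, x):
        return torch.exp(-torch.abs(x) / self.a)

class Quadratic(nn.Module):
    """1 / (1 + (a x)^2): rational bump with polynomial decay."""
    def __init__(self, a: float = 1.0):
        super().__init__()
        self.a = a

    def forward(self, x):
        return 1.0 / (1.0 + (self.a * x) ** 2)
```

In each case `a` plays a role loosely analogous to SIREN's frequency w0, controlling how sharply the activation localizes the signal.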
Sparse 3D Reconstruction:
- Sparse 3D data was generated by irregularly sampling points from volumetric data.
- The proposed geometric initialization scheme for the Gaussian activation aimed to mitigate overfitting while preserving fidelity in the reconstructed geometry (sketched below).
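The Gaussian-specific scheme is particular to this project, but geometric initialization generally follows the pattern introduced for signed-distance networks in SAL (Atzmon & Lipman, 2020): initialize the layers so that the untrained network already approximates the signed distance to a sphere. The sketch below shows that general pattern only, not the project's exact adaptation:

```python
import math
import torch.nn as nn

def geometric_init(mlp: nn.Sequential, radius: float = 1.0):
    """Initialize an MLP so its initial output approximates the signed
    distance to a sphere of the given radius (pattern from SAL; the
    Gaussian-activation variant proposed in this project may differ)."""
    linears = [m for m in mlp if isinstance(m, nn.Linear)]
    for layer in linears[:-1]:
        # Hidden layers: zero-mean Gaussian weights scaled by fan-out.
        nn.init.normal_(layer.weight, 0.0, math.sqrt(2.0 / layer.out_features))
        nn.init.zeros_(layer.bias)
    last = linears[-1]
    # Last layer: constant positive weights and a negative bias, so the
    # untrained network outputs roughly ||x|| - radius.
    nn.init.constant_(last.weight, math.sqrt(math.pi / last.in_features))
    nn.init.constant_(last.bias, -radius)
```

Starting from a well-behaved sphere gives the optimizer a sensible geometry to deform, which is consistent with the reduced training instability reported below.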
Results
- The Gaussian activation proved the most robust in sparse settings, particularly in 3D reconstruction tasks where the data were highly irregular and discontinuous.
- The Laplacian activation achieved lower error rates than SIREN in 2D video approximation, but converged more slowly.
- The geometric initialization scheme effectively reduced training instability in Gaussian-activated networks, yielding smoother reconstructions with fewer artifacts.
Conclusion
This study highlights the nuanced impact of activation functions on coordinate-MLPs: while periodic activations such as the sinusoids in SIREN remain powerful, non-periodic activations can offer significant advantages in certain data regimes, particularly when combined with tailored initialization strategies. Future work could extend these findings by integrating hybrid activation schemes or by studying the effect of network depth and width on signal representation.