Paper title: Rule-Based, Noisy Labels for Overhead Imagery Segmentation
Form of presentation: Poster
Within the past decade, modern statistical and machine learning methods have significantly advanced the field of computer vision. Many of these success stories trace back to training deep artificial neural networks on massive amounts of labeled data. However, generating labor-intensive human annotations for the ever-growing volume of Earth observation data becomes a Sisyphean task at scale.
In the realm of weakly supervised learning, methods operating on sparse labels attempt to exploit a small set of annotated data in order to train models for inference on the full input domain. Our work presents a methodology that utilizes high-resolution geospatial data for semantic segmentation of aerial imagery. Specifically, we exploit high-quality LiDAR measurements to automatically generate a set of labels for urban areas based on rules defined by domain experts. The top of the attached figure provides a visual sample of such automated classification in suburbs: vegetation (dark madder purple), roads (lime green), buildings (dark green), and bare land (yellow).
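For illustration only, such expert rules can be sketched as simple per-pixel thresholds on LiDAR-derived rasters. The inputs assumed here (a normalized digital surface model, a vegetation index, and LiDAR return intensity) and all thresholds are hypothetical, not the actual rule set used in our work:

```python
import numpy as np

# Class indices for the four categories shown in the figure.
VEGETATION, ROAD, BUILDING, BARE_LAND = 0, 1, 2, 3

def rule_based_labels(ndsm, ndvi, intensity):
    """Assign one class per pixel via illustrative threshold rules.

    ndsm      -- height above ground in meters (hypothetical input)
    ndvi      -- vegetation index in [-1, 1] (hypothetical input)
    intensity -- normalized LiDAR return intensity (hypothetical input)
    """
    labels = np.full(ndsm.shape, BARE_LAND, dtype=np.uint8)
    labels[ndvi > 0.3] = VEGETATION                                  # vegetated pixels
    labels[(ndvi <= 0.3) & (ndsm > 2.5)] = BUILDING                  # tall, non-vegetated structures
    labels[(ndvi <= 0.3) & (ndsm <= 2.5) & (intensity < 0.2)] = ROAD # flat, low-reflectance asphalt
    return labels
```

Rules of this kind are cheap to evaluate over entire urban scenes, which is what makes the auto-generated labels scale to the full imagery domain.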
A challenge of this auto-generated labeling is the noise it introduces through inaccurate label information. Through benchmarks and improved design of the deep artificial neural network architectures, we provide insights into the successes and limitations of our approach. Remarkably, we demonstrate that models trained on inaccurate labels can surpass the quality of their own annotations when evaluated against ground-truth information (cf. bottom of the attached figure).
Moreover, we investigate how results improve when weak labels are auto-corrected by noise reduction algorithms built on domain expertise. We propose technology interacting with deep neural network architectures that allows human expertise to re-enter weakly supervised learning at scale for semantic segmentation in Earth observation. Beyond the presentation of results, our contribution @LPS22 intends to start a vital scientific discussion on how the approach, substantiated here for LiDAR-based automatic annotation, might be extended to other modalities such as hyperspectral overhead imagery.
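As a minimal sketch of such label auto-correction (a generic illustration, not the expert-based algorithm referenced above), a majority filter can remove isolated, likely mislabeled pixels from a weak label map:

```python
import numpy as np

def majority_filter(labels, n_classes):
    """Replace each interior pixel by the most frequent class in its 3x3 window.

    A simple, generic noise-reduction step: isolated single-pixel label
    errors are voted out by their neighborhood. Border pixels are kept as-is.
    """
    out = labels.copy()
    h, w = labels.shape
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            window = labels[i - 1:i + 2, j - 1:j + 2]
            counts = np.bincount(window.ravel(), minlength=n_classes)
            out[i, j] = np.argmax(counts)  # majority vote over the window
    return out
```

Domain-expert variants would replace the plain majority vote with class-aware rules (e.g., never relabel a pixel contradicting strong LiDAR evidence), but the cleanup principle is the same.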