Products & Publications

An important goal of ML4Earth is to build and maintain an international community within the AI4EO domain. Together with the Space Agency of the German Aerospace Center (DLR), we pursue this goal by providing benchmark datasets as a service to the community and by offering opportunities for training and networking.

Benchmark Datasets

We are offering benchmark datasets for a wide range of application scenarios. Benchmark data products are pre-labeled EO datasets that are delivered together with a baseline solution, i.e., pre-trained AI models. With such benchmarks, the user does not need to start from scratch and train their own AI models, which can become very time-consuming and resource-intensive. Instead, developers can use the benchmarks as a head start and directly apply the delivered models to their science domain or advance the training of their own AI models.

Click Here to Visit EARTHNETS

Community Building

Every year, we organize workshops, each covering one of the research domains, and hackathons to foster knowledge exchange and democratize AI methods for EO applications. With these events, we share our methodological expertise to enable more researchers to exploit Copernicus and other EO data sources.

The DLR Space Agency is hosting a Slack channel to build the ML4Earth community. To date, more than 170 students, research staff, and stakeholders have joined the “ML4Earth” channel to discuss, ask questions, network, and exchange informally.

Would you like to join the channel?

Just send a short e-mail to Dr. Matthias Kahl at matthias.kahl@tum.de

Publications

Publications of our research team will be listed here.

Featured Paper

Accurate and up-to-date mapping of the human population is fundamental for a wide range of disciplines, from effective governance and policy-making to disaster management and crisis response. The traditional method of gathering population data through a census is costly and time-consuming. Recently, with the availability of large amounts of Earth observation datasets, deep learning methods have been explored for population estimation; however, they are either limited by census data availability, inter-regional evaluations, or transparency. In this paper, we present an end-to-end interpretable deep learning framework for large-scale population estimation at a resolution of 1 km that uses only publicly available datasets and does not rely on census data for inference. The architecture is based on a modification of the common ResNet-50 architecture tailored to analyze both image-like and vector-like data. Our best model outperforms the baseline random forest model by improving the RMSE by around 9% and also surpasses the community-standard product GHS-POP, thus yielding promising results. Furthermore, we improve the transparency of the proposed model by employing an explainable AI technique, which identified land use information as the most relevant feature for population estimation. We expect that the improved interpretability of the model outcome will inspire both academic and non-academic end users, particularly those investigating urbanization or sub-urbanization trends, to have confidence in deep learning methods for population estimation.
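
To illustrate the kind of fusion such an architecture performs, here is a minimal sketch in plain Python (all names are hypothetical, and this is not the paper's code): pooled image-like features are concatenated with vector-like covariates, and a simple linear head stands in for the regression layer.

```python
# Illustrative late-fusion sketch (hypothetical names, not the authors' code):
# pooled CNN-style features from an image patch are concatenated with
# vector-like covariates before a linear regression head.

def global_average_pool(feature_map):
    """Mean-pool a [channels][h][w] feature map to a flat vector."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
            for ch in feature_map]

def fused_prediction(feature_map, covariates, weights, bias):
    """Concatenate pooled image features with tabular covariates,
    then apply a linear head as a stand-in for the output layer."""
    fused = global_average_pool(feature_map) + list(covariates)
    return sum(w * x for w, x in zip(weights, fused)) + bias

# Toy example: a 2-channel 2x2 feature map plus 2 covariates.
fmap = [[[1.0, 3.0], [1.0, 3.0]], [[0.0, 2.0], [0.0, 2.0]]]
pred = fused_prediction(fmap, [0.5, 1.5], [1.0, 1.0, 1.0, 1.0], 0.0)
```

In the actual model, the pooled branch would be a ResNet-50 backbone and the head a trained regressor; the sketch only shows where the two data modalities meet.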

Latest Papers

Urban development in South America has experienced significant growth and transformation over the past few decades. South America’s urban development and trees are closely interconnected, and tree cover within cities plays a vital role in shaping sustainable and resilient urban landscapes. However, knowledge of urban tree canopy (UTC) coverage in the South American continent remains limited. In this study, we used high-resolution satellite images and developed a semi-supervised deep learning method to create UTC data for 888 South American cities. The proposed semi-supervised method can leverage both labeled and unlabeled data during training. By incorporating labeled data for guidance and utilizing unlabeled data to explore underlying patterns, the algorithm enhances model robustness and generalization for urban tree canopy detection across South America, with an average overall accuracy of 94.88% for the tested cities. Based on the created UTC products, we successfully assessed the UTC coverage for each city. Statistical results showed that the UTC coverage in South America is between 0.76% and 69.53%, and the average UTC coverage is approximately 19.99%. Among the 888 cities, only 357 cities that accommodate approximately 48.25% of the total population have UTC coverage greater than 20%, while the remaining 531 cities that accommodate approximately 51.75% of the total population have UTC coverage less than 20%. Natural factors (climatic and geographical) play a very important role in determining UTC coverage, followed by human activity factors (economy and urbanization level). We expect that the findings of this study and the created UTC dataset will help formulate policies and strategies to promote sustainable urban forestry, thus further improving the quality of life of residents in South America.
Trees in urban areas act as carbon sinks and provide ecosystem services for residents. However, the impact of urbanization on tree coverage in South America remains poorly understood. Here, we make use of very high resolution satellite imagery to derive urban tree coverage for 882 cities in South America and developed a tree coverage impacted (TCI) coefficient to quantify the direct and indirect impacts of urbanization on urban tree canopy (UTC) coverage. The direct effect refers to the change in tree cover due to the rise in urban intensity compared to scenarios with extremely low levels of urbanization, while the indirect impact refers to the change in tree coverage resulting from human management practices and alterations in urban environments. Our study revealed the negative direct impacts and prevalent positive indirect impacts of urbanization on UTC coverage. In South America, 841 cities exhibit positive indirect impacts, while only 41 cities show negative indirect impacts. The prevalent positive indirect effects can offset approximately 48% of the direct loss of tree coverage due to increased urban intensity, with full offsets achieved in Argentinian and arid regions of South America. In addition, human activity factors play the most important role in determining the indirect effects of urbanization on UTC coverage, followed by climatic and geographic factors. These findings will help us understand the impact of urbanization on UTC coverage along the urban intensity gradient and formulate policies and strategies to promote sustainable urban development in South America.
Semantic understanding of high-resolution remote sensing (RS) images is of great value in Earth observation; however, it heavily depends on large amounts of pixel-wise, manually labeled data, which are laborious to obtain and thereby limit its practical application. Semi-supervised semantic segmentation (SSS) of RS images is a promising solution, which uses both limited labeled data and abundant unlabeled data to train segmentation models, significantly mitigating the annotation burden. The current mainstream methods of remote sensing semi-supervised semantic segmentation (RS-SSS) utilize hard or soft pseudo-labels of unlabeled data for model training and achieve excellent performance. Nevertheless, their performance is bottlenecked by two inherent problems: irreversible wrong pseudo-labels and long-tailed distribution among classes. To address these problems, we propose a decoupled weighting learning (DWL) framework for RS-SSS in this study, which consists of two novel modules, decoupled learning and ranking weighting, corresponding to the two problems, respectively. During training, the decoupled learning module separates the predictions of the labeled and unlabeled data to reduce the negative impact of self-training on wrongly pseudo-labeled data on the supervised training of the labeled data. Furthermore, the ranking weighting module adaptively weights each pseudo-label of the unlabeled data according to its relative confidence ranking within its pseudo-class to alleviate model bias toward majority classes caused by the long-tailed distribution. To verify the effectiveness of the proposed DWL framework, extensive experiments are conducted on three widely used RS semantic segmentation datasets in the semi-supervised setting. The experimental results demonstrate the superiority of our method over several state-of-the-art SSS methods. Our code will be available at https://github.com/zhu-xlab/RS-DWL
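
The ranking-weighting idea can be sketched in a few lines of plain Python (a hypothetical simplification, not the authors' implementation): within each pseudo-class, a pseudo-label's weight is its relative confidence rank, so low-confidence predictions of a class contribute less to the loss.

```python
# Rank-based pseudo-label weighting sketch (hypothetical simplification):
# each sample's weight is its confidence rank within its pseudo-class,
# normalized by the class size.
from collections import defaultdict

def ranking_weights(pseudo_labels, confidences):
    """Return a weight in (0, 1] per sample: the rank of its confidence
    within its pseudo-class, divided by the class size."""
    by_class = defaultdict(list)
    for idx, (label, conf) in enumerate(zip(pseudo_labels, confidences)):
        by_class[label].append((conf, idx))
    weights = [0.0] * len(pseudo_labels)
    for members in by_class.values():
        members.sort()  # ascending confidence
        n = len(members)
        for rank, (_, idx) in enumerate(members, start=1):
            weights[idx] = rank / n  # most confident sample gets weight 1.0
    return weights

weights = ranking_weights(["road", "road", "tree"], [0.9, 0.4, 0.7])
```

Because the ranking is per class, minority classes are not penalized simply for having lower absolute confidences than majority classes.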
The quantification of predictive uncertainties helps to understand where existing models struggle to find the correct prediction. A useful quality-control tool is the task of detecting out-of-distribution (OOD) data by examining the model’s predictive uncertainty. For this task, deterministic single-forward-pass frameworks have recently been established as deep learning models and have shown competitive performance on certain tasks. The unique combination of spectrally normalized weight matrices and residual connection networks with an approximate Gaussian process (GP) output layer can offer the best trade-off between performance and complexity. We utilize this framework with a refined version that adds spectral batch normalization and an inducing-points approximation of the GP for the task of OOD detection in remote sensing image classification. This is an important task in the field of remote sensing, because it provides an evaluation of how reliable the model’s predictive uncertainty estimates are. By performing experiments on the benchmark datasets EuroSAT and So2Sat LCZ42, we show the effectiveness of the proposed adaptations to the residual networks (ResNets). Depending on the chosen dataset, the proposed methodology achieves OOD detection performance up to 16% higher than previously considered distance-aware networks. Compared with other uncertainty quantification methodologies, the results are on the same level and exceed them in certain experiments by up to 2%. In particular, spectral batch normalization, which normalizes the batched data as opposed to normalizing the network weights by spectral normalization (SN), plays a crucial role and leads to performance gains of up to 3% in every single experiment. For reproducibility, the code can be found here: https://github.com/ChrisKo94/DUE_Land_Cover.
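
Spectral normalization of a weight matrix, one ingredient of this framework, can be sketched with power iteration in plain Python (an illustrative stand-in, not the paper's code): estimate the largest singular value of the weights and divide by it, constraining the layer's Lipschitz constant.

```python
# Power-iteration sketch of spectral normalization (illustrative):
# estimate the top singular value of a weight matrix and rescale
# the weights so their spectral norm is 1.
import math
import random

def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def transpose(W):
    return [list(col) for col in zip(*W)]

def spectral_norm(W, iters=50):
    """Estimate the largest singular value of W by power iteration."""
    random.seed(0)
    v = [random.random() + 0.1 for _ in W[0]]
    for _ in range(iters):
        u = matvec(W, v)
        v = matvec(transpose(W), u)
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
    u = matvec(W, v)
    return math.sqrt(sum(x * x for x in u))

def spectrally_normalize(W):
    s = spectral_norm(W)
    return [[w / s for w in row] for row in W]

W_sn = spectrally_normalize([[2.0, 0.0], [0.0, 1.0]])
```

Spectral *batch* normalization, the refinement studied in the paper, instead normalizes the batched activations rather than the weights; the sketch above only covers the classic weight-side SN.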

In the remote sensing community, extracting buildings from remote sensing imagery has triggered great interest. While many studies have been conducted, a comprehensive review of these approaches that are applied to optical and synthetic aperture radar (SAR) imagery is still lacking. Therefore, we provide an in-depth review of both early efforts and recent advances, which are aimed at extracting geometrical structures or semantic attributes of buildings, including building footprint generation, building facade segmentation, roof segment and superstructure segmentation, building height retrieval, building-type classification, building change detection, and annotation data correction. Furthermore, a list of corresponding benchmark datasets is given. Finally, challenges and outlooks of existing approaches as well as promising applications are discussed to enhance comprehension within this realm of research.
The mass loss of glaciers outside the polar ice sheets has been accelerating during the past several decades and has been contributing to global sea-level rise. However, many of the mechanisms of this mass loss process are not well understood, especially the calving dynamics of marine-terminating glaciers, in part due to a lack of high-resolution calving front observations. Svalbard is an ideal site to study the climate sensitivity of glaciers as it is a region that has been undergoing amplified climate variability in both space and time compared to the global mean. Here we present a new high-resolution calving front dataset of 149 marine-terminating glaciers in Svalbard, comprising 124 919 glacier calving front positions during the period 1985–2023 (https://doi.org/10.5281/zenodo.10407266, Li et al., 2023). This dataset was generated using a novel automated deep-learning framework and multiple optical and SAR satellite images from Landsat, Terra-ASTER, Sentinel-2, and Sentinel-1 satellite missions. The overall calving front mapping uncertainty across Svalbard is 31 m. The newly derived calving front dataset agrees well with recent decadal calving front observations between 2000 and 2020 (Kochtitzky and Copland, 2022) and an annual calving front dataset between 2008 and 2022 (Moholdt et al., 2022). The calving fronts between our product and the latter deviate by 32 ± 65 m on average. The R2 of the glacier calving front change rates between these two products is 0.98, indicating an excellent match. Using this new calving front dataset, we identified widespread calving front retreats during the past four decades, across most regions in Svalbard except for a handful of glaciers draining the ice caps Vestfonna and Austfonna on Nordaustlandet. In addition, we identified complex patterns of glacier surging events overlaid with seasonal calving cycles. These data and findings provide insights into understanding glacier calving mechanisms and drivers. This new dataset can help improve estimates of glacier frontal ablation as a component of the integrated mass balance of marine-terminating glaciers.
Localizing desired objects from remote sensing images is of great use in practical applications. Referring image segmentation, which aims at segmenting out the objects to which a given expression refers, has been extensively studied in natural images. However, almost no research attention has been given to this task in remote sensing imagery. Considering its potential for real-world applications, in this article, we introduce referring remote sensing image segmentation (RRSIS) to fill this gap and make some insightful explorations. Specifically, we created a new dataset, called RefSegRS, for this task, enabling us to evaluate different methods. Afterward, we benchmark referring image segmentation methods of natural images on the RefSegRS dataset and find that these models show limited efficacy in detecting small and scattered objects. To alleviate this issue, we propose a language-guided cross-scale enhancement (LGCE) module that utilizes linguistic features to adaptively enhance multiscale visual features by integrating both deep and shallow features. The proposed dataset, benchmarking results, and the designed LGCE module provide insights into the design of a better RRSIS model. The dataset and code will be available at https://gitlab.lrz.de/ai4eo/reasoning/rrsis.
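
The core of language-guided enhancement can be sketched in plain Python (a hypothetical simplification with made-up names, not the LGCE implementation): a linguistic embedding produces a gate for each visual scale, boosting features that align with the referring expression.

```python
# Language-gated multiscale features, illustrative sketch (hypothetical):
# each scale's feature vector is scaled by a sigmoid gate computed from
# its similarity to a language embedding.
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def language_gated_features(visual_feats, lang_embedding):
    """Scale each per-scale feature vector by a gate derived from its
    dot-product similarity with the language embedding."""
    gated = []
    for feat in visual_feats:  # one vector per scale (deep ... shallow)
        gate = sigmoid(sum(f * l for f, l in zip(feat, lang_embedding)))
        gated.append([gate * f for f in feat])
    return gated

# Features aligned with the language vector are amplified more.
out = language_gated_features([[1.0, 0.0], [0.0, 1.0]], [2.0, -2.0])
```

The actual LGCE module operates on learned attention over deep and shallow feature maps; the sketch only conveys the gating principle.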
Object detection (OD) is an essential and fundamental task in computer vision (CV) and satellite image processing. Existing deep learning methods have achieved impressive performance thanks to the availability of large-scale annotated datasets. Yet, in real-world applications, the availability of labels is limited. In this context, few-shot OD (FSOD) has emerged as a promising direction, which aims at enabling the model to detect novel objects with only a few of them annotated. However, many existing FSOD algorithms overlook a critical issue: when an input image contains multiple novel objects and only a subset of them are annotated, the unlabeled objects will be considered as background during training. This can cause confusion and severely impact the model’s ability to recall novel objects. To address this issue, we propose a self-training-based FSOD (ST-FSOD) approach, which incorporates the self-training mechanism into the few-shot fine-tuning process. ST-FSOD aims to enable the discovery of novel objects that are not annotated and take them into account during training. On the one hand, we devise a two-branch region proposal network (RPN) to separate the proposal extraction of base and novel objects. On the other hand, we incorporate the student–teacher mechanism into the RPN and the region-of-interest (RoI) head to include those highly confident yet unlabeled targets as pseudo-labels. Experimental results demonstrate that our proposed method outperforms the state of the art in various FSOD settings by a large margin. The code will be publicly available at: https://github.com/zhu-xlab/ST-FSOD.
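
The pseudo-label selection step can be sketched in plain Python (an illustrative simplification, not the ST-FSOD code): teacher detections that are highly confident but overlap no annotated box are promoted to pseudo-labels, so unannotated novel objects are not treated as background.

```python
# Pseudo-label selection sketch for self-training OD (illustrative):
# keep teacher boxes that are confident and match no ground-truth box.

def iou(a, b):
    """Intersection-over-union of two boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

def select_pseudo_labels(teacher_dets, gt_boxes, conf_thr=0.9, iou_thr=0.5):
    """Promote confident teacher boxes that overlap no annotated box."""
    return [box for box, conf in teacher_dets
            if conf >= conf_thr and all(iou(box, g) < iou_thr for g in gt_boxes)]

pseudo = select_pseudo_labels(
    [((0, 0, 10, 10), 0.95),    # confident but already annotated -> dropped
     ((20, 20, 30, 30), 0.97),  # confident and unannotated -> pseudo-label
     ((40, 40, 50, 50), 0.30)], # low confidence -> dropped
    gt_boxes=[(0, 0, 10, 10)])
```

In the full method this filtering happens inside the RPN and RoI head of a student–teacher pair; the thresholds here are illustrative defaults.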
Jingliang Hu, Rong Liu, Danfeng Hong, Andrés Camero, Jing Yao, Mathias Schneider, Franz Kurz, Karl Segl, and Xiao Xiang Zhu
Teo Beker, Homa Ansari, Sina Montazeri, Qian Song, Xiao Xiang Zhu (2022)