In the realm of biodiversity conservation, the application of advanced machine learning techniques is revolutionizing our ability to analyze and predict ecological patterns. The GeoLifeCLEF 2024 competition on Kaggle provided an excellent platform for applying these innovations to predict plant species distribution using complex, multi-source geospatial data.
Predicting plant species composition and its change in space and time at a fine resolution is useful for many scenarios related to biodiversity management and conservation, for improving species identification and inventory tools, and for educational purposes.
This challenge aims to predict plant species in a given location and time using various possible predictors: satellite images and time series, climatic time series, and other rasterized environmental data such as land cover, human footprint, bioclimatic, and soil variables.
To do so, the competition provides a large-scale training set of about 5M plant occurrences in Europe (single-label, presence-only data) as well as a validation set of about 5K plots and a test set of 20K plots, with all the present species (multi-label, presence-absence data).
The difficulties of the challenge include multi-label learning from single positive labels, strong class imbalance, multi-modal learning, and large scale.
Participants were tasked with predicting the presence or absence of plant species using a rich dataset, which included:
- Satellite Imagery: Both RGB and Near-Infrared (NIR) image patches centered at observation locations.
- Climatic Time Series: Records of seasonal vegetation changes, extreme natural events, and land use changes over the past twenty years.
- Environmental Rasters: GeoTIFF rasters detailing bioclimatic conditions, soil characteristics, elevation, land cover, and human footprint.
Predicting the plant species present at a given location is useful for many biodiversity management and conservation scenarios.
First, it allows for building high-resolution maps of species composition and related biodiversity indicators such as species diversity, endangered species, and invasive species. In scientific ecology, the problem is known as Species Distribution Modelling.
Moreover, it could significantly improve the accuracy of species identification tools, such as Pl@ntNet, by reducing the list of candidate species observable at a given site.
More generally, it could facilitate biodiversity inventories by developing location-based recommendation services (e.g., on mobile phones), encouraging citizen scientist observers’ involvement, and accelerating the annotation and validation of species observations to produce large, high-quality data sets.
The dataset includes both species observations and environmental data, both crucial for predicting plant species distribution.
1. Observations Data
Presence-Absence (PA) Surveys:
- Description: Around 90,000 surveys covering roughly 10,000 species of European flora.
- Purpose: Helps to address false-absence issues in Presence-Only data and calibrate models to avoid biases.
- Data Access: Available in the file PresenceAbsenceSurveys/GLC24_PA_metadata_train.csv.
Presence-Only (PO) Occurrences:
- Description: Combines around 5 million observations from various datasets, primarily from the Global Biodiversity Information Facility (GBIF).
- Purpose: Provides a large volume of data across all study areas but may contain sampling biases.
- Data Access: Available in the file PresenceOnlyOccurences/GLC24_PO_metadata_train.csv.
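As an illustration of how presence-absence metadata can be turned into multi-label training targets, the sketch below groups rows by survey and builds a multi-hot vector for each survey. The column names `surveyId` and `speciesId`, as well as the inline sample rows, are assumptions made for this example and are not a confirmed description of the file layout.

```python
import csv
import io
from collections import defaultdict

# Tiny in-memory stand-in for the metadata CSV.
# Column names and values are assumed for illustration only.
SAMPLE_CSV = """surveyId,lat,lon,speciesId
1,43.61,3.87,101
1,43.61,3.87,205
2,48.85,2.35,101
2,48.85,2.35,309
2,48.85,2.35,412
"""


def species_per_survey(csv_file):
    """Collect the set of species IDs recorded for each survey ID."""
    labels = defaultdict(set)
    for row in csv.DictReader(csv_file):
        labels[int(row["surveyId"])].add(int(row["speciesId"]))
    return dict(labels)


def multi_hot(species, vocabulary):
    """Encode a species set as a 0/1 vector over a sorted species vocabulary."""
    return [1 if s in species else 0 for s in vocabulary]


labels = species_per_survey(io.StringIO(SAMPLE_CSV))
vocabulary = sorted(set().union(*labels.values()))
targets = {sid: multi_hot(sp, vocabulary) for sid, sp in labels.items()}

print(vocabulary)
print(targets)
```

To use the real file, replace the `io.StringIO` object with an opened file handle, after checking the actual column names.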
2. Environmental Data
The environmental data provided includes various spatialized geographic and environmental variables that serve as additional input features for the model:
Satellite Image Patches:
- Description: 3-band RGB and 1-band NIR images, each 128×128 pixels at a 10-meter resolution.
- Source: Sentinel-2 remote sensing data pre-processed by the Ecodatacube platform.
- Data Access: Available in the folder SatelliteImages/.
Satellite Time Series:
- Description: Up to 20 years of seasonal values for six satellite bands (R, G, B, NIR, SWIR1, and SWIR2).
- Source: Landsat remote sensing data.
- Data Access: Available in the folder SatelliteTimeSeries/.
Environmental Rasters:
- Bioclimatic Rasters: 19 low-resolution rasters (30 arcsec, ~1km) for climate variables commonly used in species distribution modeling.
- Soil Rasters: 9 pedologic rasters describing soil properties (e.g., pH, clay content).
- Elevation: High-resolution raster (~30 meters) describing elevation.
- Land Cover: Medium-resolution raster (~500m) describing land cover classes.
- Human Footprint: Low-resolution rasters (~1km) describing human-induced environmental pressures over time.
- Data Access: Available in the folder EnvironmentalRasters/.
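To use a raster value as a model feature, an observation's coordinates have to be mapped to a pixel. The sketch below shows that arithmetic for a north-up grid with a known top-left corner and square pixels. The origin, the toy grid, and the function names are hypothetical. In practice a GeoTIFF library such as rasterio would read the georeferencing from the file itself.

```python
import math

ARCSEC_30 = 30.0 / 3600.0  # 30 arc-seconds expressed in degrees


def lonlat_to_pixel(lon, lat, origin_lon, origin_lat, pixel_size):
    """Map a coordinate to (row, col) in a north-up raster.

    origin_lon / origin_lat describe the top-left corner of the grid and
    pixel_size is the pixel width and height in degrees.
    """
    col = math.floor((lon - origin_lon) / pixel_size)
    row = math.floor((origin_lat - lat) / pixel_size)
    return row, col


def sample(grid, lon, lat, origin_lon, origin_lat, pixel_size, nodata=None):
    """Return the grid value under a coordinate, or nodata when outside."""
    row, col = lonlat_to_pixel(lon, lat, origin_lon, origin_lat, pixel_size)
    if 0 <= row < len(grid) and 0 <= col < len(grid[0]):
        return grid[row][col]
    return nodata


# Hypothetical 3x4 raster whose top-left corner sits at lon=-10, lat=60.
toy_grid = [
    [11.0, 12.0, 13.0, 14.0],
    [21.0, 22.0, 23.0, 24.0],
    [31.0, 32.0, 33.0, 34.0],
]
origin_lon, origin_lat = -10.0, 60.0

# A point in the middle of the pixel at row 1, column 2.
lon = origin_lon + 2.5 * ARCSEC_30
lat = origin_lat - 1.5 * ARCSEC_30

inside = sample(toy_grid, lon, lat, origin_lon, origin_lat, ARCSEC_30)
outside = sample(toy_grid, -11.0, lat, origin_lon, origin_lat, ARCSEC_30)
print(inside, outside)
```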
Link to the data: https://www.kaggle.com/competitions/geolifeclef-2024/data
Our approach used a multi-model architecture to process and analyze the diverse data types effectively:
1. UNet for NIR Images
UNet’s architecture, with a contracting path to capture context and an expanding path for precise localization, is well suited to NIR images, helping to highlight vegetation patterns and stress levels in plants.
2. Swin Transformer for RGB Images
The Swin Transformer adapts the Transformer architecture for image data, focusing on local and global dependencies in RGB images. This model excels at capturing detailed textural and color information of the terrain.
3. ResNet for Environmental and Time Series Data
ResNet, known for its ability to handle deep networks through residual connections, processes environmental rasters and time series data. It extracts patterns and trends crucial for understanding long-term environmental changes and seasonal variations.
The integration of outputs from UNet, Swin Transformer, and ResNet is crucial. This is achieved through a fusion layer that harmonizes the diverse data inputs:
- Feature Alignment: Ensures that outputs from all models are consistent in scale and dimension.
- Weighted Sum: Prioritizes more predictive features through weighted combinations.
- Non-linear Combination: Applies non-linear activations to capture complex interactions between different data types.
This strategy allows the ensemble model to leverage the strengths of each component while compensating for any individual weaknesses, thus improving overall predictive accuracy.
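The three fusion steps can be sketched in plain Python as follows. This is a minimal illustration, not the actual fusion layer: the feature sizes, the random projection matrices, and the branch scores are all assumed values, and in a trained model the projections and scores would be learned parameters.

```python
import math
import random


def standardize(vec):
    """Feature alignment (scale): shift to zero mean and unit variance."""
    mean = sum(vec) / len(vec)
    var = sum((v - mean) ** 2 for v in vec) / len(vec)
    std = math.sqrt(var) or 1.0  # avoid dividing by zero for constant input
    return [(v - mean) / std for v in vec]


def project(vec, weights):
    """Feature alignment (dimension): linear map to the shared size."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]


def softmax(scores):
    """Turn raw branch scores into positive weights that sum to one."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(features, projections, branch_scores):
    """Align each branch, take a weighted sum, then apply a ReLU."""
    aligned = [project(standardize(f), p) for f, p in zip(features, projections)]
    weights = softmax(branch_scores)
    dim = len(aligned[0])
    summed = [sum(w * a[i] for w, a in zip(weights, aligned)) for i in range(dim)]
    return [max(0.0, s) for s in summed]  # non-linear combination


rng = random.Random(0)
SHARED_DIM = 5  # assumed size of the common feature space

# Stand-ins for branch outputs with different sizes and scales.
features = [
    [rng.gauss(0, 1) for _ in range(8)],
    [rng.gauss(0, 5) for _ in range(6)],
    [rng.gauss(0, 0.1) for _ in range(4)],
]
projections = [
    [[rng.uniform(-1, 1) for _ in range(len(f))] for _ in range(SHARED_DIM)]
    for f in features
]
branch_scores = [0.5, 1.0, 0.2]  # hypothetical importance scores

fused = fuse(features, projections, branch_scores)
print(fused)
```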
Training such a sophisticated ensemble required careful planning. We employed techniques like transfer learning, where models pre-trained on similar tasks were adapted to our specific problem, significantly reducing training time and improving model performance.
The Multimodal Ensemble model features four distinct branches to handle different data modalities: Landsat, Bioclim, Environmental Rasters, and Sentinel satellite image data. Each branch preprocesses its specific type of input data, using either a modified ResNet18 or a Swin Transformer architecture for feature extraction. The Landsat and Bioclim branches normalize the input data and use convolutional layers for feature extraction, while the Sentinel branch leverages a pre-trained Swin Transformer. The features extracted from all branches are concatenated into a single feature vector, combining diverse information sources for enhanced classification. This unified feature vector is processed through several fully connected layers with linear transformations and ReLU activations, culminating in a final linear layer that maps the features to the output space, providing the final classification predictions based on the learned data representations.
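The concatenate-then-classify head described above can be sketched in plain Python. The branch feature sizes, hidden layer sizes, number of output classes, and random weights below are assumptions chosen to keep the example small. The real branches (ResNet18, Swin Transformer) are replaced by random feature vectors.

```python
import random


def linear(x, weight, bias):
    """Fully connected layer: one weighted sum plus bias per output unit."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weight, bias)]


def relu(x):
    """Element-wise ReLU activation."""
    return [max(0.0, v) for v in x]


def init_linear(rng, in_dim, out_dim):
    """Small random weights and zero biases for an in_dim -> out_dim layer."""
    weight = [[rng.gauss(0, 0.1) for _ in range(in_dim)] for _ in range(out_dim)]
    return weight, [0.0] * out_dim


def concat(branch_features):
    """Join the per-branch feature vectors into one flat vector."""
    return [v for feats in branch_features for v in feats]


def classify(branch_features, layers):
    """Concatenate, apply Linear+ReLU blocks, then a final Linear layer."""
    x = concat(branch_features)
    for weight, bias in layers[:-1]:
        x = relu(linear(x, weight, bias))
    weight, bias = layers[-1]
    return linear(x, weight, bias)  # one raw score (logit) per species


rng = random.Random(42)
BRANCH_DIMS = {"landsat": 8, "bioclim": 8, "rasters": 8, "sentinel": 16}  # assumed
HIDDEN_DIMS = [32, 16]  # assumed
NUM_SPECIES = 10  # assumed; the real label space is far larger

# Random stand-ins for the features each branch would extract.
branch_features = [[rng.gauss(0, 1) for _ in range(d)] for d in BRANCH_DIMS.values()]

sizes = [sum(BRANCH_DIMS.values())] + HIDDEN_DIMS + [NUM_SPECIES]
layers = [init_linear(rng, a, b) for a, b in zip(sizes[:-1], sizes[1:])]

logits = classify(branch_features, layers)
print(len(concat(branch_features)), len(logits))
```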
The training process ran for 10 epochs, using the AdamW optimizer with a learning rate of 0.001. We employed a step-based learning rate scheduler with a step size of 30 and a gamma of 0.1 to adjust the learning rate during training. The Cross-Entropy Loss criterion was used as the training objective, measuring the error of our predictions across the various classes.
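A step-based schedule multiplies the learning rate by gamma once every `step_size` scheduler steps. The sketch below evaluates that rule for the settings quoted above, assuming the scheduler is stepped once per epoch, which the text does not state. Under that assumption the rate stays at 0.001 for all 10 epochs, because the first decay would only happen at epoch 30.

```python
def step_lr(base_lr, epoch, step_size, gamma):
    """Learning rate after `epoch` scheduler steps of a step-based schedule."""
    return base_lr * gamma ** (epoch // step_size)


BASE_LR = 0.001
STEP_SIZE = 30
GAMMA = 0.1
EPOCHS = 10

# Assumption: one scheduler step per epoch.
schedule = [step_lr(BASE_LR, epoch, STEP_SIZE, GAMMA) for epoch in range(EPOCHS)]
for epoch, lr in enumerate(schedule):
    print(f"epoch {epoch}: lr = {lr}")

# For comparison, the rate the schedule would reach at epoch 30.
print("epoch 30:", step_lr(BASE_LR, 30, STEP_SIZE, GAMMA))
```

If the scheduler were instead stepped once per batch, the decay would occur every 30 batches, which gives a very different training trajectory.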
Our evaluation metrics focused on accuracy and F1-score, ensuring that the models not only predicted the presence of species accurately but also minimized false positives and false negatives.
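For multi-label predictions, a micro-averaged F1 pools true positives, false positives, and false negatives over all surveys before computing a single score. The sketch below shows that calculation on made-up species sets. The function name and the example data are illustrative and do not reproduce the competition's official scoring code.

```python
def micro_f1(true_sets, pred_sets):
    """Micro-averaged F1 over paired sets of true and predicted species."""
    tp = fp = fn = 0
    for truth, pred in zip(true_sets, pred_sets):
        tp += len(truth & pred)  # species correctly predicted
        fp += len(pred - truth)  # predicted but not present
        fn += len(truth - pred)  # present but not predicted
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Two made-up surveys with species IDs.
true_sets = [{1, 2, 3}, {4}]
pred_sets = [{1, 2}, {4, 5}]

score = micro_f1(true_sets, pred_sets)
print(score)
```

Here the pooled counts are 3 true positives, 1 false positive, and 1 false negative, so the score is 2·3 / (2·3 + 1 + 1) = 0.75.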
The model demonstrated strong predictive performance, with significant improvements over baseline models used in previous competitions. Our model achieved a Micro-F1 score of 0.31, demonstrating its efficacy in handling imbalanced classification tasks. This metric reflects our model’s ability to balance precision and recall across all classes. Notably, our efforts culminated in securing a position among the top 20 competitors, reflecting the robustness and competitiveness of our approach in this rigorous academic challenge. The ability to accurately predict plant species distribution supports conservation efforts by identifying critical areas for protection and restoration.
This project highlights the potential of integrating various deep-learning models to address complex ecological challenges. As technology advances, we anticipate more sophisticated applications of AI in environmental science, further aiding conservation efforts worldwide.