Official implementation for
Multi-scale species richness estimation with deep learning
Victor Boussange, Bert Wuyts, Philipp Brun, Johanna T. Malle, Gabriele Midolo, Jeanne Portier, Théophile Sanchez, Niklaus E. Zimmermann, Irena Axmanová, Helge Bruelheide, Milan Chytrý, Stephan Kambach, Zdeňka Lososová, Martin Večeřa, Idoia Biurrun, Klaus T. Ecker, Jonathan Lenoir, Jens-Christian Svenning, Dirk Nikolaus Karger. arXiv: 2507.06358 (2025)
If you ❤️ the project, consider giving it a ⭐️.
We provide a self-contained tutorial to predict species richness maps from the paper's pretrained deep SAR model:
To retrain the deep SAR model, follow these steps:
- Ensure you have all required data (biodiversity data and environmental features) under data/, and install the project environment.
- Generate training data using scripts/data_processing/compile_eva_chelsa.py (see also scripts/data_processing/compile_gift_chelsa.py for test data generation).
- Train the ensemble model with train.py. The main architecture is Deep4PWeibull, defined in deepsar/deep4pweibull.py.
- Generate predictions using project.py (see also Inference).
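The retraining steps can be chained in a small driver script. The following is a minimal sketch, not part of the repository: the helper names `build_retraining_commands` and `run_pipeline` are hypothetical, the locations of train.py and project.py are passed in by the caller because they are not spelled out here, and it assumes each script runs without command-line arguments.

```python
import subprocess
import sys
from pathlib import Path

# Path taken from the steps above; the other two scripts are supplied by the caller.
COMPILE_SCRIPT = "scripts/data_processing/compile_eva_chelsa.py"


def build_retraining_commands(train_script, project_script,
                              compile_script=COMPILE_SCRIPT):
    """Return one command per step, in the order: compile data, train, predict."""
    scripts = (compile_script, train_script, project_script)
    # Assumption: each script can be launched with no extra arguments.
    return [[sys.executable, str(Path(script))] for script in scripts]


def run_pipeline(commands, dry_run=True):
    """Print each command and, unless dry_run is set, execute it."""
    for command in commands:
        print(" ".join(command))
        if not dry_run:
            # check=True stops the pipeline at the first failing step.
            subprocess.run(command, check=True)


if __name__ == "__main__":
    # Adjust the two paths to where train.py and project.py live in your checkout.
    run_pipeline(build_retraining_commands("train.py", "project.py"), dry_run=True)
```

With `dry_run=True` the script only prints the three commands, which is a cheap way to confirm the order and paths before launching a long training run.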
- deepsar/: Utility functions for generating training samples and defining deep SAR models.
- scripts/: Pipelines for data processing, model training, and mapping predictions.
- figures/: Scripts to generate figures for the paper.
- data/: Contains the data associated with the project.
To install dependencies and set up the environment, ensure you have uv installed, then run:
uv sync
uv pip install torch --torch-backend=auto
uv pip install -e .
Anonymised vegetation plot data for training is located at data/processed/EVA/anonymised and consists of:
- plot_data.parquet: Metadata for vegetation plots.
- species_data.parquet: Anonymised species names per plot.
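To inspect the two anonymised tables, they can be loaded with pandas. This is a minimal sketch under stated assumptions: the helper names `missing_eva_files` and `load_eva_tables` are hypothetical, pandas with a Parquet engine such as pyarrow is assumed to be installed, and no column names are assumed since they are not listed here.

```python
from pathlib import Path

# Directory and file names as given above.
EVA_DIR = Path("data/processed/EVA/anonymised")
EXPECTED_FILES = ("plot_data.parquet", "species_data.parquet")


def missing_eva_files(base_dir=EVA_DIR):
    """Return the expected file names that are not present in base_dir."""
    base_dir = Path(base_dir)
    return [name for name in EXPECTED_FILES if not (base_dir / name).is_file()]


def load_eva_tables(base_dir=EVA_DIR):
    """Load both tables into a dict keyed by file stem."""
    import pandas as pd  # imported lazily; requires pandas and a Parquet engine

    missing = missing_eva_files(base_dir)
    if missing:
        raise FileNotFoundError(f"Missing files in {base_dir}: {missing}")
    return {
        Path(name).stem: pd.read_parquet(Path(base_dir) / name)
        for name in EXPECTED_FILES
    }


if __name__ == "__main__":
    for table_name, frame in load_eva_tables().items():
        # Print shape and columns rather than assuming a schema.
        print(table_name, frame.shape, list(frame.columns))
```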
To obtain the full dataset, request access from the EVA database.
Regional checklists from the GIFT database, harmonized with EVA, are provided as a test dataset under data/processed/GIFT/anonymised.
Bioclimatic variables from the CHELSA dataset were used as predictors. To download them (e.g., for use with pretrained weights), navigate to data/CHELSA/ and run:
wget --no-host-directories --force-directories --input-file=envidat.txt
Pretrained weights for the ensembled deep SAR model deep4pweibull are available at scripts/results/train/. See Quick Start: Inference for usage instructions.
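For the CHELSA download step, it can help to check which files have already arrived before rerunning wget. With --force-directories and --no-host-directories, wget saves each file under the URL's path without the host name, so the expected local path can be derived from each URL. The sketch below relies on that behaviour; the helper names are hypothetical, and it assumes envidat.txt holds one plain URL per line with no query strings or percent-encoded characters.

```python
from pathlib import Path
from urllib.parse import urlsplit


def expected_local_path(url, root="."):
    """Map a URL to the path wget writes with the two directory flags above."""
    # Drop scheme and host; keep the directory structure of the URL path.
    relative = urlsplit(url).path.lstrip("/")
    return Path(root) / relative


def missing_downloads(url_list_file="envidat.txt", root="."):
    """Return the URLs from the list whose local file does not exist yet."""
    lines = Path(url_list_file).read_text().splitlines()
    urls = [line.strip() for line in lines if line.strip()]
    return [url for url in urls if not expected_local_path(url, root).is_file()]


if __name__ == "__main__":
    # Run from data/CHELSA/, the same directory the wget command is run in.
    for url in missing_downloads():
        print("not yet downloaded:", url)
```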
If you use the anonymised data, please cite:
@misc{boussange2025,
title={Multi-scale species richness estimation with deep learning},
author={Victor Boussange and Bert Wuyts and Philipp Brun and Johanna T. Malle and Gabriele Midolo and Jeanne Portier and Théophile Sanchez and Niklaus E. Zimmermann and Irena Axmanová and Helge Bruelheide and Milan Chytrý and Stephan Kambach and Zdeňka Lososová and Martin Večeřa and Idoia Biurrun and Klaus T. Ecker and Jonathan Lenoir and Jens-Christian Svenning and Dirk Nikolaus Karger},
year={2025},
eprint={2507.06358},
archivePrefix={arXiv},
primaryClass={q-bio.PE},
url={https://arxiv.org/abs/2507.06358},
}

@article{weigelt2020,
author = {Weigelt, Patrick and König, Christian and Kreft, Holger},
title = {GIFT – A Global Inventory of Floras and Traits for macroecology and biogeography},
journal = {Journal of Biogeography},
volume = {47},
number = {1},
pages = {16--43},
doi = {10.1111/jbi.13623},
eprint = {https://onlinelibrary.wiley.com/doi/pdf/10.1111/jbi.13623},
year = {2020}
}

@article{chytry2016,
title = {European Vegetation Archive (EVA): An Integrated Database of European Vegetation Plots},
author = {Chytr{\'y}, Milan and Hennekens, Stephan M. and {Jim{\'e}nez-Alfaro}, Borja and Knollov{\'a}, Ilona and Dengler, J{\"u}rgen and Jansen, Florian and Landucci, Flavia and Schamin{\'e}e, Joop H.J. and A{\'c}i{\'c}, Svetlana and Agrillo, Emiliano and Ambarl{\i}, Didem and Angelini, Pierangela and Apostolova, Iva and Attorre, Fabio and Berg, Christian and Bergmeier, Erwin and Biurrun, Idoia and {Botta-Duk{\'a}t}, Zolt{\'a}n and Brisse, Henry and Campos, Juan Antonio and Carl{\'o}n, Luis and {\v C}arni, Andra{\v z} and Casella, Laura and Csiky, J{\'a}nos and {\'C}u{\v s}terevska, Renata and Daji{\'c} Stevanovi{\'c}, Zora and Danihelka, Ji{\v r}{\'i} and De Bie, Els and {de Ruffray}, Patrice and De Sanctis, Michele and Dickor{\'e}, W. Bernhard and Dimopoulos, Panayotis and Dubyna, Dmytro and Dziuba, Tetiana and Ejrn{\ae}s, Rasmus and Ermakov, Nikolai and Ewald, J{\"o}rg and Fanelli, Giuliano and {Fern{\'a}ndez-Gonz{\'a}lez}, Federico and FitzPatrick, {\'U}na and Font, Xavier and {Garc{\'i}a-Mijangos}, Itziar and Gavil{\'a}n, Rosario G. and Golub, Valentin and Guarino, Riccardo and Haveman, Rense and Indreica, Adrian and I{\c s}{\i}k G{\"u}rsoy, Deniz and Jandt, Ute and Janssen, John A.M. and Jirou{\v s}ek, Martin and K{\k a}cki, Zygmunt and Kavgac{\i}, Ali and Kleikamp, Martin and Kolomiychuk, Vitaliy and Krstivojevi{\'c} {\'C}uk, Mirjana and Krstono{\v s}i{\'c}, Daniel and Kuzemko, Anna and Lenoir, Jonathan and Lysenko, Tatiana and Marcen{\`o}, Corrado and Martynenko, Vassiliy and Michalcov{\'a}, Dana and Moeslund, Jesper Erenskjold and Onyshchenko, Viktor and Pedashenko, Hristo and {P{\'e}rez-Haase}, Aaron and Peterka, Tom{\'a}{\v s} and Prokhorov, Vadim and Ra{\v s}omavi{\v c}ius, Valerijus and {Rodr{\'i}guez-Rojo}, Maria Pilar and Rodwell, John S. 
and Rogova, Tatiana and Ruprecht, Eszter and R{\=u}si{\c n}a, Solvita and Seidler, Gunnar and {\v S}ib{\'i}k, Jozef and {\v S}ilc, Urban and {\v S}kvorc, {\v Z}eljko and Sopotlieva, Desislava and Stan{\v c}i{\'c}, Zvjezdana and Svenning, Jens-Christian and Swacha, Grzegorz and Tsiripidis, Ioannis and Turtureanu, Pavel Dan and U{\u g}urlu, Emin and Uogintas, Domas and Valachovi{\v c}, Milan and Vashenyak, Yulia and Vassilev, Kiril and Venanzoni, Roberto and Virtanen, Risto and Weekes, Lynda and Willner, Wolfgang and Wohlgemuth, Thomas and Yamalov, Sergey},
year = {2016},
journal = {Applied Vegetation Science},
volume = {19},
number = {1},
pages = {173--180},
issn = {1654-109X},
doi = {10.1111/avsc.12191},
}