Published February 27, 2026 | Version v1.0.0
Standard Open

FEGA Metadata Technical Report: EGA v2 model

  • 1. European Bioinformatics Institute
  • 2. European Molecular Biology Laboratory
  • 3. Centre for Genomic Regulation
  • 4. Instituto Superior Técnico
  • 5. BioData.pt
  • 6. INESC-ID
  • 1. University of Oslo
  • 2. National Bioinformatics Infrastructure Sweden
  • 3. Science for Life Laboratory
  • 4. Uppsala University
  • 5. National Bioinformatics Infrastructure Sweden (NBIS)
  • 6. Centre for Genomic Regulation
  • 7. European Bioinformatics Institute
  • 8. Canada's Michael Smith Genome Sciences Centre
  • 9. University of Lausanne
  • 10. SIB Swiss Institute of Bioinformatics
  • 11. University of Łódź
  • 12. Nantes Université
  • 13. Alexander Fleming Biomedical Sciences Research Center
  • 14. Harokopio University of Athens
  • 15. Inserm
  • 16. University of Tartu
  • 17. University of Silesia in Katowice

Description

The Federated EGA (FEGA) Metadata Working Group (MWG) has designed a new, process-oriented metadata model intended to replace the EGA v1 model currently used by Central EGA (CEGA) and by FEGA nodes that use Local EGA. The new model prepares the FEGA network for FAIR, linked-data interoperability across human omics. It is under active development, and this report accompanies that work in progress. What exists today is an abstract specification and a first set of JSON Schema drafts with embedded JSON-LD contexts; detailed serialisations, production deployments and full validator roll-out are future milestones, not completed deliverables.
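As a rough illustration of what "process-oriented" records with an embedded JSON-LD context might look like, the following Python sketch builds a Process record that links an input Biomaterial to an output Datafile, then expands its short property names to full IRIs. Everything specific here is an assumption made for illustration: the property names `hasInput` and `hasOutput`, the identifiers, the `example.org` vocabulary, and the helper `expand_term` are hypothetical and are not taken from the FEGA schemas. The helper covers only a small subset of JSON-LD term expansion.

```python
# Hypothetical Process record linking an input Biomaterial to an output
# Datafile. Property names and the example.org vocabulary are placeholders,
# not the actual FEGA/EGA v2 vocabulary.
process = {
    "@context": {
        "ex": "https://example.org/fega-sketch/",
        "Process": "ex:Process",
        "hasInput": "ex:hasInput",
        "hasOutput": "ex:hasOutput",
    },
    "@type": "Process",
    "hasInput": "biomaterial-001",
    "hasOutput": "datafile-001",
}


def expand_term(context: dict, term: str) -> str:
    """Expand a term to a full IRI using a flat, string-valued context.

    Only handles direct term definitions and prefix:suffix compact IRIs;
    real JSON-LD processors support much more (nested contexts, @vocab, ...).
    """
    value = context.get(term, term)
    prefix, sep, suffix = value.partition(":")
    # A known prefix followed by a non-"//" suffix is treated as a compact IRI.
    if sep and prefix in context and not suffix.startswith("//"):
        return context[prefix] + suffix
    return value


expanded_input = expand_term(process["@context"], "hasInput")
expanded_type = expand_term(process["@context"], process["@type"])
```

The point of the embedded context is that the same JSON document can be read as plain JSON by a schema validator and as linked data by a JSON-LD processor.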

Core entities include Biomaterial, Protocol, Process, Datafile, Dataset, Policy, Data Access Committee (DAC), Study, Cohort, Project, Protocol Collection and DCAT-style Catalog objects. Validation through the open-source ELIXIR Biovalidator covers both syntactic checks and selected semantic checks (e.g., ontology term validation). Use-case workshops in genomics, microarrays, proteomics, and microbiomes confirmed that the model is flexible enough to cover these domains without schema rewrites.
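The distinction between syntactic and semantic checks mentioned above can be sketched in a few lines of standard-library Python. This is not the Biovalidator and does not use the FEGA schemas: the required fields, the `organism` property, and the CURIE pattern are hypothetical stand-ins. The "semantic" step only checks that an ontology term has a CURIE-like shape, whereas a real validator would look the term up in the ontology.

```python
import re

# Hypothetical required fields for an illustrative Biomaterial record;
# the real FEGA schemas define their own properties.
REQUIRED_FIELDS = {"@context", "@type", "name", "organism"}

# Loose CURIE shape (PREFIX:LOCAL_ID), e.g. "NCBITaxon:9606".
CURIE_PATTERN = re.compile(r"^[A-Za-z][A-Za-z0-9_]*:[A-Za-z0-9_]+$")


def check_record(record: dict) -> list[str]:
    """Return a list of problems; an empty list means both checks passed."""
    problems = []
    # Syntactic layer: every required key must be present.
    for field in sorted(REQUIRED_FIELDS - set(record)):
        problems.append(f"missing required field: {field}")
    # Semantic layer (shape only): the ontology term must look like a CURIE.
    term = record.get("organism")
    if term is not None and not (isinstance(term, str) and CURIE_PATTERN.match(term)):
        problems.append(f"organism is not a CURIE: {term!r}")
    return problems


good = {
    "@context": {"ex": "https://example.org/fega-sketch/"},
    "@type": "Biomaterial",
    "name": "sample-001",
    "organism": "NCBITaxon:9606",  # Homo sapiens
}
bad = {"@type": "Biomaterial", "organism": "human"}

good_problems = check_record(good)
bad_problems = check_record(bad)
```

Returning a list of problems rather than raising on the first one mirrors how schema validators typically report all violations for a record at once.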

A public GitHub repository contains the schemas, documentation, versioning, automated workflows, and a change process aligned to FEGA's network governance. As first steps towards a future migration from the current EGA v1 model to the EGA v2 model, we propose completing the set of model schemas, an initial v1-to-v2 model mapper, and a test implementation at CEGA. Finally, we outline a phased adoption by FEGA nodes and other stakeholders. This approach would culminate once the maturity and efficacy of the model have been demonstrated end-to-end and the model shift can take place in production environments.

Files

FEGA Metadata Technical Report v1.0.0.pdf

Files (14.2 MB)

Checksum Size
md5:fd165891ad48a92c6e4481a3a4a485c5 422.8 kB
md5:8d1efae15e50037fc9ddd249a14b7661 5.9 MB
md5:657ea405916b09cf6f69516f1d3ffa9d 1.9 MB
md5:268833cb0fde2ee99da24ccd86d303a7 7.8 kB
md5:56812c2874e89c7c0838a29109026dd5 252.9 kB
md5:1145eddae13b6a02dc96cea96a3c9034 4.2 kB
md5:f79d357f21edb41d192ccf417ff6f820 200.6 kB
md5:c587777d8ba44a98b3b66144d759cb9e 3.9 kB
md5:cdab96de4ab421f14da1b4cc3d605dab 55.2 kB
md5:c282ec2b77f261b8261398ffde84f1cd 1.7 kB
md5:b0502541b51c5eebabb4ed26dfd42d33 49.5 kB
md5:f364d7b49c5cdfb2072deebac64301da 4.6 kB
md5:77321bb5f266aa6518ec3aa171c67521 224.9 kB
md5:f5a569c1fa63489c288d44f864070d3e 6.6 kB
md5:007837434dd66f4b2c587f350b0f550d 411.8 kB
md5:0e8ef53178b6f36cae80571e6f467e31 1.9 kB
md5:3fe0b11794d59cbb713af7dc2e15bae9 96.7 kB
md5:97e5ec85c2bc5df452635892d9b53fd1 317.8 kB
md5:9909e86c56de4153b7701976b45d6eba 602.1 kB
md5:57fd306dbb90cfd282a9166c17ad3ec8 418.1 kB
md5:ee22a71e18bf5436f9e210927288d499 458 Bytes
md5:ef115fbc230f3268cc60843321ed5bac 9.8 kB
md5:bb03e11d2c50bf896950d40f51355124 270.3 kB
md5:0e1b2ec76d4b1a25ba2ca27660424298 1.2 kB
md5:4dbfbda9f18a59f948acde50aa6057f4 43.2 kB
md5:d748132060e8f194b31b335666596a66 3.4 kB
md5:4b4c28a2c303c00e10610643f07bc461 83.5 kB
md5:67653e2d1ef4e1e574e2dbf5d50730c6 180.9 kB
md5:c212224a200b6dff7b5ba0f5e616ecc9 313.9 kB
md5:a45b37ba663e79c8347d3923fa6f2724 4.3 kB
md5:3cef11afd1a1932101d5614d4852265c 224.9 kB
md5:db944e1ad8448596c017d2a3909491a3 6.0 kB
md5:c332910d09dbd02e269bb73ed16585c0 226.4 kB
md5:8a4f89fc8deeb5465949d306b40c744a 462.3 kB
md5:25bb11c366a023b933fc0cb1be5db7d1 1.3 MB
md5:16d286bf086f5aad1baf06e8ed0c1a4e 158.0 kB


Additional details

Funding

European Commission
HEREDITARY - HetERogeneous sEmantic Data integratIon for the guT-bRain interplaY (grant no. 101137074)
European Commission
ELIXIR - European Life-science Infrastructure for Biological Information (grant no. 211601)

Software

Repository URL
https://github.com/M-casado/fega-metadata-schema
Programming language
JSON-LD, Python
Development Status
Active