The OPENBIB project, maintained by the German Kompetenznetzwerk Bibliometrie,
provides access to curated OpenAlex data with a focus on the German research landscape.
Curated data is provided for the following entities:
- Address information 🏛️
- Publishers 📚
- Funding information 📄
- Document types 🗂️
- Transformative Agreements 📑️
- Authors 👩‍🎓 (tba)
Annual snapshots from the OPENBIB project are openly available through the Kompetenznetzwerk Bibliometrie (for its users), the Open Scholarly Data Warehouse of the SUB Göttingen, and Zenodo.
The current release is based on the August 2024 snapshot of OpenAlex. The OPENBIB snapshot is offered in both CSV and JSONL formats.
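Because the snapshot ships in two formats, a short sketch of reading each with the Python standard library may help. The field names `record_id` and `label` and the sample rows are made up for illustration; the actual entities and fields are those listed in the snapshot documentation.

```python
import csv
import io
import json


def read_jsonl(handle):
    """Yield one dict per non-empty line of a JSON Lines stream."""
    for line in handle:
        line = line.strip()
        if line:
            yield json.loads(line)


def read_csv(handle):
    """Yield one dict per CSV row, keyed by the header line."""
    yield from csv.DictReader(handle)


# Made-up sample data standing in for snapshot files (hypothetical fields).
jsonl_sample = io.StringIO(
    '{"record_id": "r1", "label": "example"}\n'
    '{"record_id": "r2", "label": "another"}\n'
)
csv_sample = io.StringIO("record_id,label\nr1,example\nr2,another\n")

jsonl_records = list(read_jsonl(jsonl_sample))
csv_records = list(read_csv(csv_sample))

# Both readers produce the same list of dicts for equivalent input.
print(jsonl_records == csv_records)
```

For real files, pass a handle from `open(path, encoding="utf-8", newline="")` instead of the in-memory samples. Note that CSV values always arrive as strings, whereas JSONL preserves numbers, booleans and nested structures.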
The following figure compares the assignment of publications to German institutions in OpenAlex and in OPENBIB. The figure includes publications between 2014 and 2024, but is not restricted to specific document or publication types. While OpenAlex combines rule-based and machine learning algorithms to match address affiliations in documents with institutions, OPENBIB applies a pattern matching approach. The figure only displays institutions that are present in both OpenAlex and OPENBIB and can be assigned a unique Research Organisation Registry (ROR) ID.
Fig.1: Publications assigned to German institutions in OpenAlex and OPENBIB based on ROR matching. Only publications published between 2014 and 2024 are considered.
The following figure compares the classification of articles and reviews in OpenAlex and in OPENBIB, limited to the publication years 2014 to 2024. OpenAlex counts more articles and reviews than OPENBIB because the OPENBIB classifier is stricter when classifying research contributions. Articles and reviews are assigned to German institutions exclusively via OPENBIB address information.
Fig.2: Classification of articles and reviews in journals for German institutions in OpenAlex and in OPENBIB. Only publications published between 2014 and 2024 are considered.
The following figure compares the number of publications with funding information of the German Research Foundation in OpenAlex and in OPENBIB. Only publications funded by the German Open-Access-Publikationskosten program are considered. Publications are assigned to German institutions exclusively via OPENBIB address information. No restrictions were placed on document or publication type; however, most of the records are journal articles.
Fig.3: Publications containing funding information of the German Research Foundation per German institution in OpenAlex and in OPENBIB. Only publications published between 2020 and 2024 are considered.
- If you are a user of the Kompetenznetzwerk Bibliometrie, you can access the data snapshot via the KB data infrastructure hosted by FIZ Karlsruhe.
- For big scholarly data analysis in a Google Cloud environment, you can use the Open Scholarly Data Warehouse maintained by the SUB Göttingen.
- Alternatively, you can download the snapshot from Zenodo: https://zenodo.org/records/15308680.
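For scripted downloads from Zenodo, the files attached to the record can be listed first. The following is a minimal sketch, assuming Zenodo's public REST API serves the record at `https://zenodo.org/api/records/<id>` and that each entry under `files` carries a `key` (file name) and a `links.self` download URL. The helper names `list_files` and `fetch_record` are illustrative and not part of OPENBIB.

```python
import json
import urllib.request

RECORD_ID = "15308680"  # taken from the Zenodo URL of the snapshot
API_URL = f"https://zenodo.org/api/records/{RECORD_ID}"  # assumed API endpoint


def list_files(record):
    """Return (file name, download URL) pairs from a Zenodo record dict.

    Assumes each entry in record["files"] has a "key" and a "links"/"self" URL.
    """
    return [(f["key"], f["links"]["self"]) for f in record.get("files", [])]


def fetch_record(url=API_URL):
    """Fetch the record metadata as a dict (requires network access)."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return json.load(response)


# Usage (not run here because it needs network access):
# for name, link in list_files(fetch_record()):
#     print(name, link)
```

Keeping the parsing in `list_files` separate from the network call makes it possible to check the logic against a saved copy of the record metadata before downloading anything.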
A list of all entities and fields included in the OPENBIB snapshot can be found here.
- A Jupyter notebook containing code examples for working with the OPENBIB snapshot in the KB data infrastructure can be found here.
- A Jupyter notebook containing code examples for working with the OPENBIB snapshot in the Open Scholarly Data Warehouse of the SUB Göttingen can be found here.
To export a complete OPENBIB snapshot from the KB database, use the following code. A VPN connection to FIZ Karlsruhe is required to access the database.
```python
from scripts.export_files import OpenBibDataRelease

# Replace the placeholder connection values with your own KB credentials.
openbib_snapshot = OpenBibDataRelease(
    export_directory='openbib_export',
    export_file_name='kbopenbib_release',
    host='host',
    database='database',
    port='port',
    user='user',
    password='password'
)

# Export the snapshot as an archive in CSV format.
openbib_snapshot.make_archive(export_format='csv')
```
If you see mistakes, want to suggest changes or submit feature requests, please create an issue.
Data is made available under the CC0 license.
Haupka, N., Culbert, J., Donner, P., Jahn, N., Lenke, C., Mayr, P., Meier, A., Mittermaier, B., Scheidt, B., Stahlschmidt, S., & Taubert, N. (2025). OPENBIB: Selected curated open metadata based on OpenAlex (0.1) [Data set]. Kompetenznetzwerk Bibliometrie. https://doi.org/10.5281/zenodo.15308680
Nick Haupka, SUB Göttingen. nick.haupka@sub.uni-goettingen.de