Data Centers specializing in Nucleotide, Plant and Environmental Data
The Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben and the German Plant Phenotyping Network (DPPN) have jointly initiated the Plant Genomics and Phenomics Research Data Repository (e!DAL-PGP) as an infrastructure to publish plant research data. e!DAL-PGP provides access to cross-domain, plant-related research data that are beyond the capacity of existing repositories because of their size or scope. e!DAL-PGP is registered as a research data repository at BioSharing.org and re3data.org, and at OpenAIRE as a valid EU Horizon 2020 open data archive.
Contact in GFBio context: Dr. Daniel Arend, Dr. Uwe Scholz, Dr. Matthias Lange
Contact e!DAL-PGP; e!DAL-PGP - extended profile
Service Description
A desktop application is available to upload large datasets to e!DAL-PGP. For smaller publications, an intuitive web interface is also provided. Users can authenticate with ELIXIR AAI, Google or ORCID accounts. Accepted data domains include, among others, image collections from plant phenotyping, unfinished genomes, genotyping data, visualizations of morphological plant models, data from mass spectrometry as well as software and documents. All datasets must be accompanied by technical metadata and are reviewed by two scientific reviewers and one administrative reviewer for (meta-)data quality and reusability. The user is guided through the review process by e-mail. Data submitters can define an optional embargo date. Every published dataset is referenced with a DOI to guarantee a FAIR-aware and long-term stable citation.
Citation: Arend et al. - PGP repository: a plant phenomics and genomics data publication infrastructure. Database. 2016. https://doi.org/10.1093/database/baw033
Service Levels
Data Set | Data Package | Data Management | Research Objects |
✔ | ✔ | ✔ |
Data Submission Formats
Data | No limitations, almost any file format is accepted |
Metadata | No limitations, all (standardized or other) formats are accepted |
Data Accessibility
Public access points | GFBio, PGP Repository Content Page of citable DOIs, DataCite Search |
Long-term availability | Minimum 10 years |
Data Publication Services
Data Citation | DOI |
✔ | ✔ |
The European Nucleotide Archive provides a comprehensive record of the world's nucleotide sequencing information, covering raw sequencing data, sequence assembly information and functional annotation as well as metadata (sample description, experimental setup) and interpreted information (annotations). ENA is developed and operated by the EMBL-European Bioinformatics Institute (EMBL-EBI), an academic research institute based in the UK and part of the European Molecular Biology Laboratory (EMBL). ENA is one of the three databases that make up the International Nucleotide Sequence Database Collaboration (INSDC).
Contact in GFBio context: Dr. Frank Oliver Glöckner, Dr. Ivaylo Kostadinov
ENA; ENA-extended profile
Service Description
The GFBio Brokerage Service provides the timely, standards-compliant deposition of all molecular sequence data into the public repositories of the INSDC. The key components of the service include: (a) Support for metadata standardization, curation and quality control, (b) negotiation of embargo periods, including communication with INSDC, (c) parallel submission of environmental metadata to PANGAEA, (d) cross-linking sequence data and environmental data (PANGAEA) via accession number and DOI.
Service Levels
Data Set | Data Package | Data Management | Research Objects |
✔ | ✔ | ✔ |
Data Submission Formats
Data | Sequence data has to be in one of the formats supported by ENA. |
Metadata | Molecular sequence metadata should comply with the “Minimum Information about any (x) Sequence” (MIxS) standards. It can be entered manually or uploaded either in GCDJ/JSON format or as a tab-separated values (TSV) file. Appropriate templates are available for all formats. |
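As a rough illustration of preparing a tab-separated metadata file, the sketch below checks that a TSV sheet contains a set of required columns and reports rows with empty values before submission. The column names and the `find_problems` helper are assumptions chosen for illustration only; the authoritative field list comes from the applicable MIxS checklist and the templates mentioned above.

```python
import csv
import io

# Hypothetical subset of required columns; the real list depends on the
# MIxS checklist and the submission template in use.
REQUIRED_COLUMNS = ("sample_title", "collection_date", "geo_loc_name", "lat_lon")


def find_problems(tsv_text, required=REQUIRED_COLUMNS):
    """Return a list of human-readable problems found in a TSV metadata sheet."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    header = reader.fieldnames or []
    problems = [f"missing column: {name}" for name in required if name not in header]
    # Row numbers are 1-based and count data rows only (header excluded).
    for row_number, row in enumerate(reader, start=1):
        for name in required:
            if name in header and not (row.get(name) or "").strip():
                problems.append(f"row {row_number}: empty value for {name}")
    return problems


if __name__ == "__main__":
    # Made-up example sheet, not real sample data.
    example = (
        "sample_title\tcollection_date\tgeo_loc_name\n"
        "Sediment core A\t2015-06-01\tGermany\n"
        "Sediment core B\t\tGermany\n"
    )
    for problem in find_problems(example):
        print(problem)
```

A check of this kind only catches structural gaps; whether the values themselves are valid still has to be settled during curation.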
Data Accessibility
Public access points | ENA; data are exchanged daily with the other members of the INSDC, the DNA Data Bank of Japan (DDBJ) and the National Center for Biotechnology Information (NCBI) |
Long-term availability | Unlimited |
Data Publication Services
Data Citation | DOI |
ENA issues accession numbers, which should be cited in publications that use the respective datasets. Accession numbers are available at different levels of granularity (e.g. studies/datasets, samples, etc.). ENA recommends citing the study accession number (i.e. the dataset identifier) throughout the text of the publication; for more details, see Citing ENA Data.
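To make the granularity levels more concrete, the following sketch sorts an accession string into a level based on its prefix. The patterns reflect the accession formats as assumed here and should be verified against ENA's own documentation; `classify_accession` is a hypothetical helper, and the accessions in the usage example are placeholders that only illustrate the format.

```python
import re

# Assumed accession patterns per granularity level; confirm against the
# ENA documentation before relying on them.
ACCESSION_PATTERNS = {
    "project": re.compile(r"PRJ[EDN][A-Z]\d+"),
    "study": re.compile(r"[EDS]RP\d{6,}"),
    "sample": re.compile(r"[EDS]RS\d{6,}"),
    "experiment": re.compile(r"[EDS]RX\d{6,}"),
    "run": re.compile(r"[EDS]RR\d{6,}"),
}


def classify_accession(accession):
    """Return the granularity level of an accession, or None if unrecognized."""
    candidate = accession.strip()
    for level, pattern in ACCESSION_PATTERNS.items():
        if pattern.fullmatch(candidate):
            return level
    return None


if __name__ == "__main__":
    # Placeholder accessions used purely to show the string formats.
    for accession in ("PRJEB12345", "ERS123456", "not-an-accession"):
        print(accession, classify_accession(accession))
```

A helper like this could flag manuscripts that cite only run or sample accessions where a study-level accession is the recommended citation.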
The Data Publisher for Earth & Environmental Science is a globally leading information system, long-term archive and data publisher for spatial geoscientific, biological and environmental data. Data published by PANGAEA originate from a broad range of subdisciplines of earth system research, such as biological sciences, chemistry and physics, with a special focus on earth sciences and environmental sciences. Jointly hosted by the Centre for Marine Environmental Sciences (MARUM) at the University of Bremen and the Alfred Wegener Institute, Helmholtz Centre for Polar and Marine Research (AWI), PANGAEA is laid out as a permanent facility, guaranteeing the long-term availability and accessibility of archived data and metadata in secure and machine-readable formats. It is also a World Data Center (WDC-PANGAEA) and is accredited by the ICSU World Data System.
Contact in GFBio context: Dr. Michael Diepenbroek, Dr. Robert Huber, Dr. Janine Felden
Contact PANGAEA; PANGAEA – extended profile
Service Description
Data management, including curation, long-term archiving and data publication for geoscientific, biological and environmental data. Curation includes user support, definition of data set granularity, quality control, archival format transformation, metadata description and control. Supported data types are tabular data as well as binary data, e.g. multimedia. PANGAEA supports long-tail data, that is, data acquired by individual scientists, as well as data collected during small- to large-scale research projects. Every archived data set is citable and identified by a persistent DOI.
Service Levels
Data Set | Data Package | Data Management | Research Objects |
✔ | ✔ | ✔ |
Data Submission Formats
Data | Preferably spreadsheets (CSV), databases or binary files; almost any file format is accepted |
Metadata | All (standardized or other) formats are accepted |
Data Accessibility
Public access points | GFBio, PANGAEA, Institutional landing pages of citable stable URIs, GBIF, OBIS, GEOSS and others |
Standardised exchange formats | INSPIRE, ISO 19115, Darwin Core, Dublin Core |
Data formats | ASCII, EXCEL, Darwin Core |
Long-term availability | Unlimited, certified (WDS) long-term archive |
Data Publication Services
Data Citation | DOI |
✔ | ✔ |
Data Centers at Natural Science Collections
The Botanic Garden and Botanical Museum Berlin, Freie Universität Berlin with its Research Group Biodiversity Informatics is a centre of biodiversity research in Europe, housing extensive scientific collections of herbarium specimens (about 3.5 million), one of the world's largest living plant collections, a DNA Bank as well as the most complete botanical library in Germany.
Contact in GFBio context: Anton Güntsch, David Fichtmüller and Maren Gleisberg
Contact the BGBM GFBio team; BGBM – extended profile
Service Description
Data archiving for research projects focuses on:
Type 1a | Botanical specimen data; (Physical) DNA Storage (under discussion at BGBM); Referenced multimedia objects |
Type 1b | Botanical observational data; Referenced multimedia objects |
Type 2 | Botanical systematics and monographic works |
Type 4 | RAW data (data sets and/or data packages) only if well documented and in formats and structures appropriate for long-term archiving, without further data management required |
Preference will be given to data that fall within the geographic and taxonomic research foci of the BGBM (BGBM - extended profile, BGBM: Research). Data archiving and publication include management processes with JACQ, the reBiND workflow and the EDIT Platform for Cybertaxonomy as well as the data quality service platform and the transformation and import services provided by BGBM.
Service Levels
Data Set | Data Package | Data Management | Research Objects |
✔ | ✔ | ✔ | ✔ |
Data Submission Formats
Data | Preferably via the BGBM collection data forms for botanical collections, DNA sample collections and/or tissue collections, which can be found in the GFBio collection of recommended data submission templates, and/or standardized formats of any kind (e.g. ABCD or DwC-A files); spreadsheets (CSV, Excel files) and image files; export files from external EDIT Platform installations |
Metadata | EML (GFBio submission package leaflet), ABCD, DarwinCore, DublinCore, SDD |
Data Accessibility
Public access points | GFBio, BioCASe Data Access Services at the BGBM, Institutional landing pages of citable stable URIs, BiNHum, BioCASe, GBIF, Europeana and others |
Standardised exchange formats | XML-files in ABCD, DarwinCore, EML, SDD standard; Web services of the EDIT Platform for Cybertaxonomy |
Data formats | TXT, CSV, XML |
Long-term availability | Unlimited |
Data Publication Services
Data Citation | DOI |
✔ | ✔ via ZB MED/DataCite publication |
The Leibniz Institute DSMZ - German Collection of Microorganisms and Cell Cultures, Braunschweig with Database and IT-Department is one of the largest biological resource centers worldwide. Its culture and tissue collections and DNA Bank currently comprise almost 40,000 items, including about 20,000 different bacterial and 5,000 fungal strains, 700 human and animal cell lines, 800 plant cell lines, 1,000 plant viruses and antisera, and 4,800 different types of bacterial genomic DNA.
Contact in GFBio context: Prof. Dr. Jörg Overmann and Dr. Carola Söhngen
Contact the DSMZ GFBio team; DSMZ – extended profile
Service Description
Data archiving for research projects focuses on:
Type 1 | Data accompanied by the deposit of a biological resource within the DSMZ collections (Deposit in the DSMZ) |
Type 2 | Data describing microbial diversity (e.g. taxonomic classification, morphology, physiology, cultivation, origin, natural habitat), according to our profile description (DSMZ - extended profile, DSMZ) |
Service Levels
Data Set | Data Package | Data Management | Research Objects |
✔ | ✔ | ✔ | ✔ |
Data Submission Formats
Data | Via DSMZ online accession form or spreadsheets (CSV), preferably standardized formats of any kind, MySQL Dumps |
Metadata | EML (GFBio submission package leaflet), ABCD, DarwinCore, MCL |
Data Accessibility
Public access points | GFBio, BioCASe Data Access Services at the DSMZ, GBIF, BacDive |
Standardised exchange formats | XML-files in ABCD, DarwinCore, EML standard. Web services of the DSMZ accession platform and BacDive platform |
Data formats | Text, CSV, XML |
Long-term availability | Unlimited |
Data Publication Services
Data Citation | DOI |
(✔) | ✔ via GBIF publication |
The Leibniz Institute for Research on Evolution and Biodiversity, Berlin is a research museum within the Leibniz Association. It is one of the most significant research museums worldwide focusing on biodiversity, evolution and geo-sciences. The zoological, paleontological and mineralogical collections of the Museum are directly linked to research and comprise more than 30 million items. In addition, the Museum has an Animal Sounds Archive containing approximately 120,000 animal sound recordings and a DNA Bank. The Library of the MfN is one of the most important reference libraries in zoology in the German-speaking world. Research at the Leibniz Institute for Research on Evolution and Biodiversity is organised in four Science Programmes ("Forschungsbereiche"): Evolution and Geoprocesses, Collection Development and Biodiversity Discovery, Digital World and Information Science, Public Engagement with Science.
Contact in GFBio context: Dr. Mareike Petersen and Falko Glöckler
Contact the MfN GFBio team; MfN – extended profile
Service Description
Type 1 | Occurrence data and associated media originating from zoological, environmental and paleontological studies. |
Type 2 | A data-type-driven focus (independent of scientific domain) is on archiving multimedia objects, traits, taxonomic and observation data, e.g. originating from Citizen Science initiatives (MfN - extended profile). |
Service Levels
Data Set | Data Package | Data Management | Research Objects |
✔ | ✔ | ✔ | ✔ |
Data Submission Formats
Data | Pre-structured, delimited plain-text files (CSV) and spreadsheet files (Microsoft Excel, Open Document Spreadsheet); SQL dump files; Binary files (e.g. images, audio, volume data); All submissions preferably in open formats |
Metadata | EML (GFBio submission package leaflet), ABCD, DarwinCore |
Data Accessibility
Public access points | GFBio, BioCASe Data Access Services at the MfN, Institutional landing pages of citable stable URIs, BioCASe, GBIF, BiNHum, GeoCASE, Europeana |
Standardised exchange formats | XML-files in ABCD, DarwinCore, EML standard |
Data formats | Text, CSV, XML |
Long-term availability | Unlimited (minimum guaranteed time period of 15 years) |
Data Publication Services
Data Citation | DOI |
✔ | ✔ |
The Senckenberg Gesellschaft für Naturforschung (SGN) conducts research in bio- and geosciences within six research institutes and three natural history museums in Germany. The mission of the SGN is to make science and scientific findings accessible to the public through teaching, publishing, museums and special exhibitions in Frankfurt, Dresden, Görlitz and Tübingen. Senckenberg's research activity is divided into four large research fields: Biodiversity, Systematics and Evolution; Biodiversity and Environment; Biodiversity and Climate; and Biodiversity and Earth System Dynamics.
With about 40 million objects in currently more than 200 collections, the SGN has one of the largest scientific collections in Germany. The holdings comprise a herbarium; zoological, anthropological, paleontological and mineralogical collections; and a DNA Bank. SGN is part of the Leibniz Association.
Service Description
Data archiving for research projects focuses on botanical, zoological and anthropological data, according to our profile description (SGN - extended profile).
Type 1a | Collection data, together with the deposit of physical objects, referenced multimedia objects. |
Type 1b | Observation and occurrence data, species monitoring projects, referenced multimedia objects. |
Type 3 | Any type of triple-structured data, referenced multimedia objects. |
Service Levels
Data Set | Data Package | Data Management | Research Objects |
✔ | ✔ | ✔ | ✔ |
Data Submission Formats
Data | Flexible; Preferably standardized formats like ABCD, EML; Spreadsheets (CSV) |
Metadata | EML (GFBio submission package leaflet), ABCD, DarwinCore |
Data Accessibility
Public access points | GFBio, BioCASe Data Access Services at the SGN, Institutional landing pages of citable stable URIs, BioCASe, GBIF, Metacat, SeSam/AQUiLA |
Standardised exchange formats | XML-files in ABCD, DarwinCore, EML standard |
Data formats | Text, CSV, XML |
Long-term availability | Unlimited (minimum guaranteed time period of 10 years) |
Data Publication Services
Data Citation | DOI |
✔ | ✔ |
The State Museum of Natural History Stuttgart is one of two State museums in Baden-Württemberg, southern Germany. With its important zoological, paleontological and mineralogical collections and a herbarium, together containing more than 11 million specimens (fossils, minerals, plants, insects, molluscs, and vertebrates), the museum possesses an excellent foundation for biosystematic research. Due to its diverse international scientific contacts and relations, the natural history museum contributes significantly to the identity of the State of Baden-Württemberg.
Contact in GFBio context: Dr. Joachim Holstein and Dr. Juan Carlos Monje
Contact the SMNS GFBio team; SMNS – extended profile
Service Description
Data archiving for research projects focuses on botanical, zoological and paleontological data, according to our profile description (SMNS - extended profile).
Type 1a | Collection data, together with the deposit of physical objects, referenced multimedia objects. |
Type 1b | Observation and occurrence data, species monitoring projects, referenced multimedia objects. |
Type 2 | Taxon reference list data and checklist data. |
Service Levels
Data Set | Data Package | Data Management | Research Objects |
✔ | ✔ | ✔ | ✔ |
Data Submission Formats
Data | (a) Export files from external installations of DiversityCollection and DiversityTaxonNames, (b) any spreadsheets (CSV, Excel files) structured according to existing DWB import schemes (see SMNS GitHub, SNSB GitHub and ZFMK GitHub), (c) spreadsheets and databases appropriate to create new DWB import schemes; Image formats have to be agreed for submission |
Metadata | EML (GFBio submission package leaflet), ABCD, DarwinCore |
Data Accessibility
Public access points | GFBio, BioCASe Data Access Services at the SNSB, BioCASe, GBIF, BiNHum, DTN Taxon List Services and others |
Standardised exchange formats | XML-files in ABCD, DarwinCore, EML standard; Web services of the DWB platform |
Data formats | Text, CSV, XML |
Long-term availability | Unlimited (minimum guaranteed time period of 15 years) |
Data Publication Services
Data Citation | DOI |
(✔) | ✔ via GBIF publication |
The Staatliche Naturwissenschaftliche Sammlungen Bayerns, München with SNSB IT Center are a research institution for natural history in Bavaria. They encompass five State Collections (zoology, botany, paleontology and geology, mineralogy, anthropology and paleoanatomy), the Botanical Garden Munich-Nymphenburg and eight museums with public exhibitions in Munich, Bamberg, Bayreuth, Eichstätt and Nördlingen. Our research focuses mainly on past and present bio- and geodiversity and the evolution of animals, fungi and plants. To achieve this, we have large zoological, anthropological, paleontological and mineralogical collections and a herbarium (almost 35,000,000 specimens) as well as a DNA Bank.
Contact in GFBio context: Dr. Dagmar Triebel and Tanja Weibulat
Contact the SNSB GFBio team; SNSB – extended profile
Service Description
Data archiving for research projects focuses on botanical, mycological, zoological and paleontological data. Data archiving includes management processes involving Diversity Workbench (DWB) databases. Data publication is done via the DWB network at the SNSB (SNSB – extended profile).
Type 1a | Collection data, together with the deposit of physical objects, referenced multimedia objects. |
Type 1b | Observation and occurrence data, species monitoring projects, referenced multimedia objects. |
Type 2 | Taxon reference list data and checklist data. |
Type 3 | Any type of triple-structured data, referenced multimedia objects. |
Type 4 | RAW data (data sets and/or data packages) if well documented and in formats and structures appropriate for long-term archiving, without further data management requirements. |
Service Levels
Data Set | Data Package | Data Management | Research Objects |
✔ | ✔ | ✔ | ✔ |
Data Submission Formats
Data | (a) Export files from external installations of DiversityCollection, DiversityTaxonNames and DiversityDescriptions, (b) any spreadsheets (CSV, Excel files) structured according to existing DWB import schemes (see SMNS GitHub, SNSB GitHub, ZFMK GitHub or example templates for data submission in the GFBio collection of recommended data submission templates), (c) any spreadsheets and databases in an accessible (not legacy) format appropriate to create new DWB import schemes; Image formats have to be agreed for submission |
Metadata | EML (GFBio submission package leaflet), ABCD, DarwinCore, DublinCore, SDD |
Data Accessibility
Public access points | GFBio, BioCASe Data Access Services at the SNSB, Institutional landing pages of citable stable URIs, BioCASe, GBIF, BiNHum, DTN Taxon List Services and others |
Standardised exchange formats | XML-files in ABCD, DarwinCore, EML, SDD standard; Web services of the DWB platform |
Data formats | Text, CSV, XML; Agreed image data formats |
Long-term availability | Unlimited, with tape archiving support through LRZ |
Data Publication Services
Data Citation | DOI |
✔ | ✔ via ZB MED/DataCite publication |
The Zoological Research Museum Alexander Koenig – Leibniz Institute for Animal Biodiversity, Bonn carries out species-related biodiversity research and ensures the transfer of knowledge to researchers and the general public.
Core stocks are the zoological collections of more than 5 million specimens, tissue collections and a DNA Bank. The research focusses on performing an inventory of the zoological species diversity on earth.
The results of research and the collections are made accessible to the public with permanent and temporary exhibitions and using other methods for public education.
Service Description
Data archiving for research projects focuses on terrestrial animals (ZFMK – extended profile). Data archiving includes management processes involving Diversity Workbench (DWB) databases and Morph∙D∙Base.
Type 1a | Collection data, together with the deposit of physical objects, referenced multimedia objects. |
Type 1b | Observation and occurrence data, species monitoring projects, referenced multimedia objects. |
Type 2 | Taxon reference list data and checklist data. |
Type 3 | Tissue/DNA biobank storage, referenced multimedia objects (e.g. sound data, 3d image stacks/volume data). |
Type 4 | RAW data (data sets and/or data packages) only if well documented and in formats and structures appropriate for long-term archiving, without further data management required. |
Service Levels
Data Set | Data Package | Data Management | Research Objects |
✔ | ✔ | ✔ | ✔ |
Data Submission Formats
Data | (a) Export files from external installations of DiversityCollection and DiversityTaxonNames, (b) any spreadsheets (CSV, Excel files) structured according to existing DWB import schemes (see SMNS GitHub, SNSB GitHub, ZFMK GitHub or example templates for data submission in the GFBio collection of recommended data submission templates), (c) any spreadsheets and databases in an accessible (not legacy) format appropriate to create new DWB import schemes |
Metadata | EML, ABCD, DarwinCore, GGBN, SDD, DublinCore |
Data Accessibility
Public access points | GFBio, BioCASe Data Access Services at the ZFMK, Institutional landing pages of citable stable URIs, GBIF, DTN Taxon List Services, Morph∙D∙Base and others |
Standardised exchange formats | XML-files in ABCD, DarwinCore, EML, SDD standard; Web services of the DWB platform |
Data formats | TXT, CSV, XML; Original and derivative image, audio, and sound data formats |
Long-term availability | Unlimited (minimum guaranteed time period of 10 years) |
Data Publication Services
Data Citation | DOI |
✔ | ✔ via ZB MED/DataCite publication |