
Data Centers specialized on Nucleotide and Environmental Data


The European Nucleotide Archive provides a comprehensive record of the world's nucleotide sequencing information, covering raw sequencing data, sequence assembly information and functional annotation as well as metadata (sample description, experimental setup) and interpreted information (annotations). ENA is developed and operated by the EMBL-European Bioinformatics Institute (EMBL-EBI), an academic research institute based in the UK and part of the European Molecular Biology Laboratory (EMBL). ENA is one of the three databases that make up the International Nucleotide Sequence Database Collaboration (INSDC).


Contact in GFBio context: Dr. Frank Oliver Glöckner, Dr. Ivaylo Kostadinov      ENA    ENA-extended profile

Service Description

The GFBio Brokerage Service provides timely, standards-compliant deposition of all molecular sequence data into the public repositories of the INSDC. The key components of the service are: (a) support for metadata standardization, curation and quality control, (b) negotiation of embargo periods, including communication with INSDC, (c) parallel submission of environmental metadata to PANGAEA, (d) cross-linking of sequence data and environmental data (PANGAEA) via accession number and DOI.

Service Levels

Data Set Data Package Data Management Research Objects
 

 

Data Submission Formats

Data

Sequence data has to be in one of the formats supported by ENA.

Metadata

Molecular sequence metadata should comply with the “Minimum information about any (x) sequence” (MIxS) standards. It can be entered manually or uploaded either in GCDJ/JSON format or as a tab-separated (TSV) file. Templates are available for all formats.
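Before uploading a tab-separated metadata file, it can help to check locally that no required value is missing. The following is a minimal sketch only: the column names in REQUIRED_COLUMNS and the function missing_fields are hypothetical and chosen for illustration; the authoritative list of fields comes from the MIxS checklist and the template used for the submission.

```python
import csv
import io

# Hypothetical subset of required columns (assumption for illustration);
# the real list depends on the MIxS checklist and the template in use.
REQUIRED_COLUMNS = ["sample_title", "collection_date", "geo_loc_name", "lat_lon"]


def missing_fields(tsv_text, required=REQUIRED_COLUMNS):
    """Return (row_number, column) pairs for absent or empty required values.

    Row number 0 refers to the header line; data rows start at 1.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    header = reader.fieldnames or []
    problems = []
    # Columns that are missing from the header altogether.
    for column in required:
        if column not in header:
            problems.append((0, column))
    # Required columns that exist but hold an empty value in some row.
    for row_number, row in enumerate(reader, start=1):
        for column in required:
            if column in header and not (row[column] or "").strip():
                problems.append((row_number, column))
    return problems


# Made-up example content, not real sample metadata.
example = (
    "sample_title\tcollection_date\tgeo_loc_name\tlat_lon\n"
    "Sediment core A\t2015-06-01\tGermany: North Sea\t54.18 N 7.90 E\n"
    "Sediment core B\t\tGermany: North Sea\t54.20 N 7.95 E\n"
)
print(missing_fields(example))
```

A check like this only catches missing values; it does not validate that the values themselves follow the MIxS value syntax.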

 

Data Accessibility

Public access points

ENA; data is exchanged daily with the other members of the INSDC, the DNA Data Bank of Japan (DDBJ) and the National Center for Biotechnology Information (NCBI)

Long-term availability

Unlimited

 

Data Publication Services

Data Citation DOI
   

 

ENA issues accession numbers, which should be cited in publications that use the respective datasets. Accession numbers are available at different levels of granularity (e.g. studies/datasets, samples). ENA recommends citing the study accession number (i.e. the dataset identifier) throughout the text of the publication; see Citing ENA Data for details.
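To illustrate the granularity levels, the sketch below sorts accession strings by level using regular expressions. The patterns reflect the accession formats as commonly documented by ENA, but they are reproduced here from memory as an assumption and should be checked against the current ENA documentation. The function name accession_level and the example accessions are illustrative only.

```python
import re

# Assumed accession formats per level; verify against the ENA documentation.
ACCESSION_PATTERNS = {
    "project": re.compile(r"PRJ(E|D|N)[A-Z][0-9]+"),
    "study": re.compile(r"(E|D|S)RP[0-9]{6,}"),
    "biosample": re.compile(r"SAM(E|D|N)[A-Z]?[0-9]+"),
    "sample": re.compile(r"(E|D|S)RS[0-9]{6,}"),
    "experiment": re.compile(r"(E|D|S)RX[0-9]{6,}"),
    "run": re.compile(r"(E|D|S)RR[0-9]{6,}"),
}


def accession_level(accession):
    """Return the granularity level of an accession string, or None."""
    candidate = accession.strip()
    for level, pattern in ACCESSION_PATTERNS.items():
        if pattern.fullmatch(candidate):
            return level
    return None


# Illustrative accession strings, used only to exercise the patterns.
for acc in ["PRJEB12345", "ERS000001", "ERR000001", "not-an-accession"]:
    print(acc, "->", accession_level(acc))
```

For citation purposes, the project or study level is the one that corresponds to the dataset identifier recommended above.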



PANGAEA, the Data Publisher for Earth & Environmental Science, is a globally leading information system, long-term archive and data publisher for spatial geoscientific, biological and environmental data. Data published by PANGAEA originates from a broad range of subdisciplines of earth system research, such as biological sciences, chemistry and physics, with a special focus on earth sciences and environmental sciences. Jointly hosted by the Centre for Marine Environmental Sciences (MARUM) at the University of Bremen and the Alfred Wegener Institute, Helmholtz Centre for Polar and Marine Research (AWI), PANGAEA is laid out as a permanent facility, guaranteeing the long-term availability and accessibility of archived data and metadata in secure and machine-readable formats. It is also a World Data Center (WDC-PANGAEA) and is accredited by the ICSU World Data System.


Contact in GFBio context: Dr. Michael Diepenbroek, Dr. Robert Huber, Dr. Janine Felden      Contact PANGAEA    PANGAEA – extended profile

Service Description

Data management, including curation, long-term archiving and data publication, for geoscientific, biological and environmental data. Curation includes user support, definition of data set granularity, quality control, archival format transformation, and metadata description and control. Supported data types are tabular data as well as binary data, e.g. multimedia. PANGAEA supports long-tail data, that is, data acquired by individual scientists, as well as data collected during small- to large-scale research projects. Every archived data set is citable and is assigned a persistent DOI.

Service Levels

Data Set Data Package Data Management Research Objects
 

 

Data Submission Formats

Data

Preferably spreadsheets (CSV), databases or binary files; almost any file format is accepted

Metadata

All (standardized or other) formats are accepted

 

Data Accessibility

Public access points

GFBio, PANGAEA, Institutional landing pages of citable stable URIs, GBIF, OBIS, GEOSS and others

Standardised exchange formats

INSPIRE, ISO 19115, Darwin Core, Dublin Core

Data formats ASCII, EXCEL, Darwin Core
Long-term availability

Unlimited, certified (WDS) long term archive

 

Data Publication Services

Data Citation DOI



Data Centers at Natural Science Collections


The Botanic Garden and Botanical Museum Berlin, Freie Universität Berlin with its Research Group Biodiversity Informatics is a centre of biodiversity research in Europe, housing extensive scientific collections of herbarium specimens (about 3.5 million), one of the world's largest living plants collections, a DNA Bank as well as the most complete botanical library in Germany.


Contact in GFBio context: Anton Güntsch, David Fichtmüller and Maren Gleisberg      Contact the BGBM GFBio team    BGBM – extended profile

Service Description

Data archiving for research projects focuses on:

Type 1a Botanical specimen data; (Physical) DNA Storage (under discussion at BGBM); Referenced multimedia objects
Type 1b Botanical observational data; Referenced multimedia objects
Type 2 Botanical systematics and monographic works
Type 4 RAW data (data sets and/or data packages) only if well documented and in formats and structures appropriate for long-term archiving, without further data management required

Preference is given to data that fall within the geographic and taxonomic research foci of the BGBM (BGBM - extended profile, BGBM: Research). Data archiving and publication include management processes with JACQ, the reBiND workflow and the EDIT Platform for Cybertaxonomy, as well as the data quality service platform and the transformation and import services provided by BGBM.

Service Levels

Data Set Data Package Data Management Research Objects

 

Data Submission Formats

Data

Preferably via the BGBM collection data form for botanical collections, DNA sample collections and/or tissue collections, which can be found in the GFBio collection of recommended data submission templates, and/or standardized formats of any kind (e.g. ABCD or DwC-A files); Spreadsheets (CSV, Excel files, image files); Export files from external EDIT platform installations

Metadata

EML (GFBio submission package leaflet), ABCD, DarwinCore, DublinCore, SDD

 

Data Accessibility

Public access points

GFBio, BioCASe Data Access Services at the BGBM, Institutional landing pages of citable stable URIs, BiNHum, BioCASe, GBIF, Europeana and others

Standardised exchange formats

XML-files in ABCD, DarwinCore, EML, SDD standard; Web services of the EDIT Platform for Cybertaxonomy

Data formats TXT, CSV, XML
Long-term availability Unlimited

 

Data Publication Services

Data Citation DOI


via GBIF publication



The Leibniz Institute DSMZ - German Collection of Microorganisms and Cell Cultures, Braunschweig with Database and IT-Department is one of the largest biological resource centers worldwide. Its culture and tissue collections and DNA Bank currently comprise almost 40,000 items, including about 20,000 different bacterial and 5,000 fungal strains, 700 human and animal cell lines, 800 plant cell lines, 1,000 plant viruses and antisera, and 4,800 different types of bacterial genomic DNA.


Contact in GFBio context: Prof. Dr. Jörg Overmann and Dr. Carola Söhngen      Contact the DSMZ GFBio team    DSMZ – extended profile

Service Description

Data archiving for research projects focuses on:

Type 1

Data accompanied by the deposit of a biological resource within the DSMZ collections (Deposit in the DSMZ)

Type 2 Data describing microbial diversity (e.g. taxonomic classification, morphology, physiology, cultivation, origin, natural habitat), according to our profile description (DSMZ - extended profile, DSMZ)

Service Levels

Data Set Data Package Data Management Research Objects

 

Data Submission Formats

Data

Via the DSMZ online accession form or spreadsheets (CSV), preferably in standardized formats of any kind; MySQL dumps
Example templates for data submission can be found in the GFBio collection of recommended data submission templates

Metadata EML (GFBio submission package leaflet), ABCD, DarwinCore, MCL

 

Data Accessibility

Public access points

GFBio, BioCASe Data Access Services at the DSMZ, GBIF, BacDive

Standardised exchange formats

XML-files in ABCD, DarwinCore, EML standard. Web services of the DSMZ accession platform and BacDive platform

Data formats Text, CSV, XML
Long-term availability Unlimited

 

Data Publication Services

Data Citation DOI
(✔)
via GBIF publication



The Leibniz Institute for Research on Evolution and Biodiversity, Berlin is a research museum within the Leibniz Association. It is one of the most significant research museums worldwide focusing on biodiversity, evolution and geosciences. The zoological, paleontological and mineralogical collections of the Museum are directly linked to research and comprise more than 30 million items. In addition, the Museum has an Animal Sounds Archive containing approximately 120,000 animal sound recordings and a DNA Bank. The Library of the MfN is one of the most important reference libraries in zoology in the German-speaking world. Research at the Leibniz Institute for Research on Evolution and Biodiversity is organised in four Science Programmes ("Forschungsbereiche"): Evolution and Geoprocesses, Collection Development and Biodiversity Discovery, Digital World and Information Science, and Public Engagement with Science.


Contact in GFBio context: Dr. Mareike Petersen and Falko Glöckler      Contact the MfN GFBio team    MfN – extended profile

Service Description

Type 1

Occurrence data and associated media originating from zoological, environmental and paleontological studies.

Type 2

A data type driven focus (independent from scientific domain) will be on archiving multi-media objects, traits, taxonomic and observation data, e.g. originating from Citizen Science initiatives (MfN - extended profile).

Service Levels

Data Set Data Package Data Management Research Objects

 

Data Submission Formats

Data

Pre-structured, delimited plain-text files (CSV) and spreadsheet files (Microsoft Excel, Open Document Spreadsheet); SQL dump files; Binary files (e.g. images, audio, volume data); All submissions preferably in open formats
Example templates for data submission can be found in the GFBio collection of recommended data submission templates

Metadata

EML (GFBio submission package leaflet), ABCD, DarwinCore

 

Data Accessibility

Public access points

GFBio, BioCASe Data Access Services at the MfN, Institutional landing pages of citable stable URIs, BioCASe, GBIF, BiNHum, GeoCASE, Europeana

Standardised exchange formats

XML-files in ABCD, DarwinCore, EML standard

Data formats Text, CSV, XML
Long-term availability Unlimited (minimum guaranteed time period of 15 years)

 

Data Publication Services

Data Citation DOI



The Senckenberg Gesellschaft für Naturforschung (SGN) conducts research in bio- and geosciences within six research institutes and three natural history museums in Germany. The mission of the SGN is to make science and scientific findings accessible to the public through teaching, publishing, museums and special exhibitions in Frankfurt, Dresden, Görlitz and Tübingen. Senckenberg's research activity is divided into four large research fields: Biodiversity, Systematics and Evolution; Biodiversity and Environment; Biodiversity and Climate; and Biodiversity and Earth System Dynamics.
With about 40 million objects in currently more than 200 collections, the SGN has one of the largest scientific collections in Germany. The holdings include a herbarium and zoological, anthropological, paleontological and mineralogical collections, plus a DNA Bank. SGN is part of the Leibniz Association.


Contact in GFBio context: Anke Penzlin and Dr. Thomas Hörnschemeyer      Contact the SGN GFBio team    SGN – extended profile

Service Description

Data archiving for research projects focuses on botanical, zoological and anthropological data, according to our profile description (SGN - extended profile).

Type 1a

Collection data, together with the deposit of physical objects, referenced multimedia objects.

Type 1b

Observation and occurrence data, species monitoring projects, referenced multimedia objects.

Type 2

Taxon reference list data and checklist data.

Type 3

Any type of triple-structured data, referenced multimedia objects.

Service Levels

Data Set Data Package Data Management Research Objects

 

Data Submission Formats

Data

Flexible; Preferably standardized formats like ABCD, EML; Spreadsheets (CSV)
Example templates for data submission can be found in the GFBio collection of recommended data submission templates

Metadata

EML (GFBio submission package leaflet), ABCD, DarwinCore

 

Data Accessibility

Public access points

GFBio, BioCASe Data Access Services at the SGN, Institutional landing pages of citable stable URIs, BioCASe, GBIF, Metacat, SeSam/AQUiLA

Standardised exchange formats

XML-files in ABCD, DarwinCore, EML standard

Data formats Text, CSV, XML
Long-term availability Unlimited (minimum guaranteed time period of 10 years)

 

Data Publication Services

Data Citation DOI


via GBIF publication
via DataCite publication



The State Museum of Natural History Stuttgart is one of two State museums in Baden-Württemberg, southern Germany. With its important zoological, paleontological and mineralogical collections and its herbarium, together containing more than 11 million specimens (fossils, minerals, plants, insects, molluscs, and vertebrates), the museum possesses an excellent foundation for biosystematic research. Through its diverse international scientific contacts and relations, the natural history museum contributes significantly to the identity of the State of Baden-Württemberg.


Contact in GFBio context: Dr. Joachim Holstein and Dr. Juan Carlos Monje      Contact the SMNS GFBio team    SMNS – extended profile

Service Description

Data archiving for research projects focuses on botanical, zoological and paleontological data, according to our profile description (SMNS - extended profile).

Type 1a

Collection data, together with the deposit of physical objects, referenced multimedia objects.

Type 1b

Observation and occurrence data, species monitoring projects, referenced multimedia objects.

Type 2

Taxon reference list data and checklist data.

Service Levels

Data Set Data Package Data Management Research Objects

 

Data Submission Formats

Data

(a) Export files from external installations of DiversityCollection and DiversityTaxonNames, (b) any spreadsheets (CSV, Excel files) structured according to existing DWB import schemes (see SMNS GitHub, SNSB GitHub and ZFMK GitHub), (c) spreadsheets and databases appropriate for creating new DWB import schemes; image formats have to be agreed on before submission
Example templates for data submission can be found in the GFBio collection of recommended data submission templates

Metadata

EML (GFBio submission package leaflet), ABCD, DarwinCore

 

Data Accessibility

Public access points

GFBio, BioCASe Data Access Services at the SNSB, BioCASe, GBIF, BiNHum, DTN Taxon List Services and others

Standardised exchange formats

XML-files in ABCD, DarwinCore, EML standard; Web services of the DWB platform

Data formats Text, CSV, XML
Long-term availability Unlimited (minimum guaranteed time period of 15 years)

 

Data Publication Services

Data Citation DOI
(✔)
via GBIF publication



The Staatliche Naturwissenschaftliche Sammlungen Bayerns, München with SNSB IT Center are a research institution for natural history in Bavaria. They encompass five State Collections (zoology, botany, paleontology and geology, mineralogy, anthropology and paleoanatomy), the Botanical Garden Munich-Nymphenburg and eight museums with public exhibitions in Munich, Bamberg, Bayreuth, Eichstätt and Nördlingen. Our research focuses mainly on the past and present bio- and geodiversity and the evolution of animals, fungi and plants. To achieve this we have large zoological, anthropological, paleontological and mineralogical collections and herbarium (almost 35,000,000 specimens) as well as a DNA Bank.


Contact in GFBio context: Dr. Dagmar Triebel and Tanja Weibulat      Contact the SNSB GFBio team    SNSB – extended profile

Service Description

Data archiving for research projects focuses on botanical, mycological, zoological and paleontological data. Data archiving includes management processes that involve Diversity Workbench (DWB) databases. Data publication is done via the DWB network at the SNSB (SNSB – extended profile).

Type 1a

Collection data, together with the deposit of physical objects, referenced multimedia objects.

Type 1b

Observation and occurrence data, species monitoring projects, referenced multimedia objects.

Type 2

Taxon reference list data and checklist data.

Type 3

Any type of triple-structured data, referenced multimedia objects.

Type 4

RAW data (data sets and/or data packages) if well documented and in formats and structures appropriate for long-term archiving, without further data management requirements.

Service Levels

Data Set Data Package Data Management Research Objects

 

Data Submission Formats

Data

(a) Export files from external installations of DiversityCollection, DiversityTaxonNames and DiversityDescriptions, (b) any spreadsheets (CSV, Excel files) structured according to existing DWB import schemes (see SMNS GitHub, SNSB GitHub, ZFMK GitHub or the example templates for data submission in the GFBio collection of recommended data submission templates), (c) any spreadsheets and databases in an accessible (not legacy) format appropriate for creating new DWB import schemes; image formats have to be agreed on before submission

Metadata

EML (GFBio submission package leaflet), ABCD, DarwinCore, DublinCore, SDD

 

Data Accessibility

Public access points

GFBio, BioCASe Data Access Services at the SNSB, Institutional landing pages of citable stable URIs, BioCASe, GBIF, BiNHum, DTN Taxon List Services and others

Standardised exchange formats

XML-files in ABCD, DarwinCore, EML, SDD standard; Web services of the DWB platform

Data formats Text, CSV, XML; Agreed image data formats
Long-term availability

Unlimited, with tape archiving support through LRZ

 

Data Publication Services

Data Citation DOI


via GBIF publication



The Zoological Research Museum Alexander Koenig – Leibniz Institute for Animal Biodiversity, Bonn carries out species-related biodiversity research and ensures the transfer of knowledge to researchers and the general public.
Core stocks are the zoological collections of more than 5 million specimens, tissue collections and a DNA Bank. The research focusses on performing an inventory of the zoological species diversity on earth.
The results of research and the collections are made accessible to the public with permanent and temporary exhibitions and using other methods for public education.


Contact in GFBio context: Dr. Peter Grobe, Birgit Klasen      Contact the ZFMK GFBio team    ZFMK – extended profile

Service Description

Data archiving for research projects focuses on terrestrial animals (ZFMK – extended profile). Data archiving includes management processes that involve Diversity Workbench (DWB) databases and Morph∙D∙Base.

Type 1a

Collection data, together with the deposit of physical objects, referenced multimedia objects.

Type 1b

Observation and occurrence data, species monitoring projects, referenced multimedia objects.

Type 2

Taxon reference list data and checklist data.

Type 3

Tissue/DNA biobank storage, referenced multimedia objects (e.g. sound data, 3d image stacks/volume data).

Type 4

RAW data (data sets and/or data packages) only if well documented and in formats and structures appropriate for long-term archiving, without further data management required.

Service Levels

Data Set Data Package Data Management Research Objects

 

Data Submission Formats

Data

(a) Export files from external installations of DiversityCollection and DiversityTaxonNames, (b) any spreadsheets (CSV, Excel files) structured according to existing DWB import schemes (see SMNS GitHub, SNSB GitHub, ZFMK GitHub or the example templates for data submission in the GFBio collection of recommended data submission templates), (c) any spreadsheets and databases in an accessible (not legacy) format appropriate for creating new DWB import schemes

Metadata

EML (GFBio submission package leaflet), ABCD, DarwinCore, GGBN, SDD, DublinCore

 

Data Accessibility

Public access points

GFBio, BioCASe Data Access Services at the ZFMK, Institutional landing pages of citable stable URIs, GBIF, DTN Taxon List Services, Morph∙D∙Base and others

Standardised exchange formats

XML-files in ABCD, DarwinCore, EML, SDD standard; Web services of the DWB platform

Data formats TXT, CSV, XML; Original and derivative image, audio, and sound data formats
Long-term availability

Unlimited (minimum guaranteed time period of 10 years)

 

Data Publication Services

Data Citation DOI

via GBIF publication