GFBio, the German Federation for Biological Data, is a portal that aims to facilitate data sharing in the biological and environmental sciences and to support researchers in all aspects of research data management. By joining the forces of our 19 consortium members, including ten long-term data centers, we offer a variety of data management services to meet the needs of researchers and funding agencies for reproducible and reusable data (see, e.g., the DFG guidelines on handling biodiversity data). The main components of the GFBio portal are a helpdesk for user support, data submission, the GFBio Search, the Terminology Service, and the VAT-System (a virtual research environment for the visualization, analysis, and transformation of biodiversity data). GFBio is currently funded by the German Research Foundation (DFG). In May 2016, the non-profit organization GFBio e.V. was founded to pave the way for the sustainable provision of services after the funding phase.
To make full use of our services (e.g. data submission, VAT-System), you need to register as a GFBio user. With your account, you can also keep track of all your requests. To create an account, please go to the sign-in page and follow the instructions.
- Citations of ABCD data archives accessed from the GFBio portal should follow the pattern:
<Authors> (<Publication_year>). <Title>. [Dataset]. Version: <VersionNr>. Data Publisher: <Data_center_name>. <URI>.
- If you use any of GFBio's services, please cite the following article:
Diepenbroek M., Glöckner F., Grobe P., Güntsch A., Huber R., König-Ries B., Kostadinov I., Nieschulze J., Seeger B., Tolksdorf R. & Triebel D. Towards an Integrated Biodiversity and Ecological Research Data Management and Archiving Platform: The German Federation for the Curation of Biological Data (GFBio). In: Plödereder E, Grunske L, Schneider E, Ull D, editors. Informatik 2014 – Big Data Komplexität meistern. GI-Edition: Lecture Notes in Informatics (LNI) – Proceedings. GI edn. Vol. 232. Bonn: Köllen Verlag; 2014. pp. 1711–1724.
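The dataset citation pattern in the first item above can also be filled in programmatically. The following is only a minimal sketch: the function name and all example values are hypothetical placeholders and do not refer to a real dataset or to any GFBio interface.

```python
def format_dataset_citation(authors, year, title, version, publisher, uri):
    """Fill the citation pattern for an ABCD data archive.

    Pattern: <Authors> (<Publication_year>). <Title>. [Dataset].
             Version: <VersionNr>. Data Publisher: <Data_center_name>. <URI>.
    """
    return (
        f"{authors} ({year}). {title}. [Dataset]. "
        f"Version: {version}. Data Publisher: {publisher}. {uri}."
    )


# All values below are made-up placeholders for illustration only.
example = format_dataset_citation(
    authors="Doe, J. & Roe, R.",
    year=2020,
    title="Example occurrence records",
    version="1.0",
    publisher="Example Data Center",
    uri="https://example.org/dataset/123",
)
print(example)
```

The actual author list, version number, and URI should be taken from the dataset's landing page at the publishing data center.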
- GFBio provides a single drop-off point with individual support for complex datasets
- Your data will be archived in one of ten high-standard data centers with expertise in archiving and curating data within various fields (biodiversity data, collection data, ecological data, molecular data, environmental data)
- Long-term preservation of your data is guaranteed
- By receiving a persistent identifier (e.g. DOI) or an accession number (molecular data), your data become fully citable, comparable to any paper publication
- You can also archive data that are not connected to any prior research publication
- This means that your data become findable, accessible, interoperable and reusable (FAIR data principles)
GFBio aims at supporting researchers to develop and implement effective data management strategies for findable, accessible, interoperable and reusable data (FAIR data principles). We offer help in all steps of the data life cycle, from project proposal to long-term data archival, e.g. data management planning support, data curation, consulting and cost estimation, and training. Please check the detailed description of our service portfolio.
The VAT-System is a tool to visualize, analyze and transform your spatio-temporal (biodiversity) data in a comfortable online GIS environment! You have access to collections and biodiversity data centers, international data aggregators and environmental data, and can integrate your own data!
The GFBio Terminology Service (TS) helps you connect through a single access point to various kinds of terminologies in the biological domain in a uniform and transparent manner. The TS provides services and tools to find, explore, share and reuse terminologies for semantic enhancement of research platforms.
Please contact our helpdesk if you have questions or need personal support. Your request and the subsequent communication will be documented in our system and you will be referred to an expert who will take care of your request. For data archival and publication there are experienced curators at each of our data centers who will support you. If you want to deposit your data directly you can start the submission process and upload your files here.
To sustain research data management as an integral part of your research, we recommend applying for funding as part of your research proposal and naming GFBio as a partner if you intend to archive your data through GFBio. The DFG is prepared to fund costs related to preparing research data to fulfill the FAIR criteria (more information from the DFG can be found here). Additionally, the DFG recommends that applicants consult and use established data service centers, such as GFBio (see the specific DFG recommendations for biodiversity data).
The use of, and access to, data available via GFBio is free of charge.
- For small-scale projects, we charge a fixed percentage of 3% of the total budget of the project. These costs are eligible for funding by the DFG and most probably also by other funders; just give it a try (more information on costs that can be funded by the DFG can be found here).
- For larger projects, costs need to be negotiated with GFBio on an individual basis. They usually vary from 3% to 5% of the total budget of the project. Depending on the complexity and volume of data, up to 10% might be possible. Please contact us for more information.
- There are no hidden or subsequent costs for data that are already archived.
- In any case, we ask you to apply for money for research data management in the course of your research proposal and to name GFBio as a partner if you intend to archive your data through GFBio. If you have questions about the exact wording or need an offer please contact us.
- During the GFBio funding period (until 2021) we are able to provide consultation, data archival, and publication services free of charge for individual researchers not linked to specific larger funding programs (meaning projects with moderate data complexity and volumes).
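As a rough aid for budgeting, the percentages listed above can be turned into an estimate. This is a sketch under assumptions: the function name and flags are hypothetical, the boundary between "small-scale" and "larger" projects is not defined here and is left to the caller, and actual costs for larger projects are negotiated individually with GFBio.

```python
# Percentages taken from the cost items above.
SMALL_PROJECT_PERCENT = 3
LARGE_PROJECT_PERCENT_RANGE = (3, 5)
COMPLEX_DATA_CEILING_PERCENT = 10


def estimate_cost_range(total_budget, large_project=False, complex_data=False):
    """Return a (low, high) estimate of data management costs.

    total_budget is in any currency unit; the result uses the same unit.
    large_project and complex_data are judgment calls made by the caller.
    """
    if total_budget < 0:
        raise ValueError("total_budget must be non-negative")
    if not large_project:
        low_percent = high_percent = SMALL_PROJECT_PERCENT
    else:
        low_percent, high_percent = LARGE_PROJECT_PERCENT_RANGE
        if complex_data:
            # Complex or voluminous data may raise the upper bound.
            high_percent = COMPLEX_DATA_CEILING_PERCENT
    return (total_budget * low_percent / 100, total_budget * high_percent / 100)


small = estimate_cost_range(200_000)
large = estimate_cost_range(1_000_000, large_project=True)
print(small, large)
```

The result is only a planning figure for a proposal; for an actual offer, please contact us.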
Data Management Planning
Certainly! Developing custom-tailored DMPs is a key service of GFBio. DMPs are living documents helping you to organize and update all relevant information on the handling of your research data before and during your research project. Learn more about data management planning or contact our helpdesk for information and support.
GFBio aims at supporting individual researchers, research groups, and large projects to develop and implement effective data management strategies for FAIR (Findable, Accessible, Interoperable, and Re-usable) data as part of good scientific practice.
A well-structured Data Management Plan (DMP) clarifies how and what data will be created, processed, and documented. It names means of data archiving and publication regarding costs as well as access conditions for the scientific community and the public. In short, a DMP helps you to ask all the right questions concerning data management. Also, DMPs are increasingly required as a mandatory proposal part by research institutions and funding agencies. By writing a DMP, you save a lot of time in the long run, you are aware of potential data management obstacles, and you increase the transparency as well as the integrity of your work.
In our how-to, you learn about preparing a data management plan with the GFBio Data Management Plan Tool (DMPT) and about the services GFBio offers to support you.
Data Management Plans (DMPs) are evolving into a mandatory part of the project application for many funding agencies (e.g. DFG guidelines for handling research data; DFG guidelines on the handling of research data in biodiversity research). GFBio offers custom-tailored DMPs that meet the requirements of your research project. Be sure to contact us via our helpdesk no later than four weeks before the application deadline, so that we can better assess your needs and prepare an appropriate DMP. For large-scale projects, GFBio should be involved from the early planning stage of proposal preparation.
Start the GFBio Data Management Planning Tool and create your own data management plan to be integrated into your research proposal!
More information on DMPs can be found here.
You can upload max. 20 files with a total size of 200 MB per submission. If the total size exceeds 200 MB, please only upload representative data. Later, a curator will assist you in transferring your complete data set.
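Before starting a submission, you can check locally whether your files fit within these limits. The sketch below is not part of the GFBio submission system; the function name is hypothetical, and it assumes that 1 MB means 1,000,000 bytes, which is not specified here.

```python
MAX_FILES = 20
# Assumption: 1 MB = 1,000,000 bytes (the exact definition is not stated).
MAX_TOTAL_BYTES = 200 * 1000 * 1000


def check_submission(file_sizes):
    """Check a list of file sizes (in bytes) against the upload limits.

    Returns a list of human-readable problems; an empty list means the
    files fit within both limits.
    """
    problems = []
    if len(file_sizes) > MAX_FILES:
        problems.append(
            f"{len(file_sizes)} files exceed the limit of {MAX_FILES} files"
        )
    total = sum(file_sizes)
    if total > MAX_TOTAL_BYTES:
        problems.append(
            f"total size of {total} bytes exceeds {MAX_TOTAL_BYTES} bytes"
        )
    return problems


print(check_submission([150_000_000, 100_000_000]))
```

In a real workflow, the sizes could be collected with `os.path.getsize` for each file you plan to upload. If the check reports a problem, upload representative data only and arrange the full transfer with a curator.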
Go to the data submission page and start the submission process. Log in with your GFBio account (or via your institutional DFN-AAI login). After a successful login, you are asked to answer a limited number of questions before you can upload your data (or metadata) files. Our curators will check the uploaded files for conformance with standards and will get back to you if they need additional information.
Yes, GFBio asks for a minimum set of descriptive information during data submission to ensure the findability, accessibility, interoperability, and reusability of the archived data (FAIR data principles). Metadata should contain basic structured information about your data so that others can understand and use your data without further information. You should always make sure to answer the following questions: Who collected the data, what kind of data were collected, where, when, how, and why were they collected? Our data curators will help you transfer your metadata into an appropriate metadata standard.
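The six questions above can serve as a simple checklist before submission. The following sketch is illustrative only: the dictionary keys and the function name are hypothetical and do not correspond to the fields of the GFBio submission form or to any metadata standard.

```python
# One key per question: who, what, where, when, how, why.
REQUIRED_QUESTIONS = ("who", "what", "where", "when", "how", "why")


def missing_metadata(record):
    """Return the questions that a metadata record leaves unanswered.

    A question counts as unanswered if its key is absent, None, or
    contains only whitespace.
    """
    missing = []
    for key in REQUIRED_QUESTIONS:
        value = record.get(key)
        if value is None or not str(value).strip():
            missing.append(key)
    return missing


# Made-up example record with three questions still unanswered.
draft = {
    "who": "J. Doe",
    "what": "plant observations",
    "where": "",
    "when": "2019",
}
print(missing_metadata(draft))
```

A record that passes this check still needs to be mapped to a metadata standard, which is where the data curators come in.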
Yes, there is a solution. You can choose to archive your data at one of our data centers but initially restrict access to them by putting a temporal embargo on the publication. In this case you can choose flexible access rights for single users or groups of users. It is also possible to restrict access to part of the information on your data (e.g. if you have observational data of endangered species and don’t want to publish the exact geographical coordinates of the observations).
In principle, there is no limit to the amount of data that you can archive. However, if your project plans to produce massive amounts of data (on the order of terabytes), please contact GFBio at an early planning stage.
All GFBio data centers guarantee a minimum archival period of 10 years.
Your data are deposited in one of currently ten associated/collaborating, publicly accessible data centers that are committed to the long-term preservation of research data and to providing permanent open access to them.
Data Publication, Persistent Identifiers
No, as soon as a dataset is published in one of our data centers you cannot edit the dataset anymore. In case of a minor mistake, a curator can manually adjust it. In case of significant changes, you are requested to provide a new version of the dataset which will be archived under version control and receive a new DOI.
Yes, the data are saved under version control. In case it should be necessary for you to provide a new version of the dataset (e.g. removal of a mistake) it will be indicated as being an updated version of the initial dataset you provided.
Yes, all GFBio associated data centers issue a persistent identifier (e.g. DOI). With a persistent identifier, your data become citable and their discoverability and re-use are increased.
Yes, you can set an embargo period (a time period during which the data are already deposited in a data center but are not yet publicly accessible).
Rights, Permissions, Legal Issues
The requirements for provenance, transfer, access, and the utilization of genetic resources (Nagoya Protocol) also apply to basic scientific research. Genetic material that was collected prior to December 29, 1993, is not subject to international laws on biodiversity.
For data that are submitted to GFBio, one of the Creative Commons licenses should be chosen.
We recommend CC BY, as this means that you need to be cited as the author of the data set, just as is the case for citations of scientific papers.
The Nagoya Protocol on Access and Benefit-sharing to the Convention on Biological Diversity is an international agreement which aims at sharing the benefits arising from the utilization of genetic resources in a fair and equitable way. It entered into force on 12 October 2014.
Yes, the services are open for use to any scientist. To make full use of our services it is necessary to create a GFBio user account.
Search, Access, Reuse
You can reuse the data according to the terms specified in the Creative Commons license that applies to the respective dataset you intend to reuse. Please always cite ALL data sets that you are using in your scientific work as you would cite any paper publication.
All data available via GFBio can be found through our GFBio Search. A beta version of the Semantic Search functionality is now available at the GFBio portal. The semantic search narrows down (or extends) the search output by understanding the intent and context of your search (e.g. by considering relations and synonyms of search terms). Check our how-to-search guide to find out more.
After a successful search for datasets, you might want to explore them in more depth. For this purpose, GFBio offers the VAT Tool (visualization, analysis, transformation). Please note: You have to be logged in to use this function.