Frequently Asked Questions

BASIC INFORMATION


What are research data?

“Research data are (digital) data that, depending on the scientific context, are related to, originate from, or are the result of a research process.” (Kindling et al. 2013).

Scientific data are created by a variety of methods, depending on the research question. These include studying source material, experiments, measurements, descriptions, surveys, or polls. The data are the basis of scientific results. As a result, research data are discipline- and project-specific, with different requirements for processing and managing them.

Since research data are necessary to verify the results based on them, the preservation of such data is a recognised part of good scientific practice (see, for example, “DFG-Leitlinien zum Umgang mit Forschungsdaten” (Guidelines on the handling of research data)).

Research data include measurement data, laboratory results, audio-visual information, texts, survey data, and objects from collections or samples that are created, developed, or evaluated during scientific work. Software, simulations, and images are also included.

Why should I publish research data?

Publishing research data provides opportunities not just for researchers, but also for science in general:

Opportunities for researchers
  • Your research becomes more visible. Publications are cited significantly more often when the data are publicly available (Piwowar and Vision 2013).
  • The publication of research data is gaining more and more recognition as a scientific achievement.
  • You can increase the quality and credibility of your research by offering others a chance to verify your data.
  • You comply with the current requirements of the research funding agencies (see below).
  • You can secure your own research investment by setting blocking periods.
Opportunities for science
  • The publication of data opens up new potentials for research as data become available for re-analysis in the context of new research questions and methods or for combining data from different sources.
  • It also reduces the production of redundant scientific data, which saves time and money.

Research funding agencies and the scientific community increasingly demand Open Access to research data (achieved by publication of data in Open Access) so that published research results can be verified and the data accessed for reuse.

The science ministers of the G8 signed one of the most important international commitments to Open Science in 2013: “…to the greatest extent and with the fewest constraints possible, publicly funded scientific research data should be open […] whilst acknowledging the legitimate concerns of private partners.” (G8 Science Ministers 2013). The German Federal Ministry of Research supports research data management initiatives that include the publication of research data.

Under the leadership of AGU, Earth Science Information Partners (ESIP) and the Research Data Alliance (RDA), the geoscientific community works internationally in the project Enabling FAIR Data towards open and FAIR research data, and it promotes the publication of research data in the Enabling FAIR Data Commitment Statement.

Some research funders, like the EU (Pilot on open research data in the HORIZON2020 programme) and the DFG (“Leitlinien zum Umgang mit Forschungsdaten” (Guidelines on the Handling of Research Data)), urge scientists to publish research data.

Where can I publish research data?

The Registry of Research Data Repositories re3data offers a global overview of data repositories and is especially suitable if you are searching for data. To find data repositories in re3data that A) accept data uploads from scientists worldwide and B) adhere to the high standards defined by the Enabling FAIR Data project, use Repository Finder, a tool that filters the large re3data database accordingly. You may also follow our recommendation of three data publication services:

GFZ Data Services

GFZ German Research Centre for Geosciences cooperates with FID GEO in data publishing. It has issued Digital Object Identifiers (DOIs) for data sets since 2004 and publishes data sets in GFZ Data Services, the data repository of GFZ. Almost all geoscientific disciplines are covered by this service.

Datasets are submitted online by the author and are described using an online metadata editor. This editor is easy to use and provides extensive help functions. You may refer to the “Quick Start Guide for Data Publications”.

As a special feature, GFZ Data Services offers the possibility to publish the research data of entire projects or all data records of a particular institution on websites that have the “look and feel” of the respective project or institution. Additionally, GFZ Data Services supports institutions in harvesting metadata from the GFZ repository so that they can transfer the metadata to their own systems, e.g. a university bibliography.

PANGAEA

PANGAEA is an open access database that archives, publishes and makes available georeferenced data from earth and life sciences. Long-term availability of the content is guaranteed by PANGAEA’s operating institutions, the Alfred Wegener Institute Helmholtz Center for Polar and Marine Research and the MARUM Center for Marine Environmental Sciences at the University of Bremen.

Records are electronically uploaded by the authors and described by means of an electronic form. Data and metadata uploads start with a registration and are carried out via a so-called ticket system, which is described in more detail here. It also provides detailed information on data upload, workflow and possible cost sharing.

EarthChem Library

The EarthChem Library is a data repository that archives, publishes and makes accessible geoscientific data and other digital objects. EarthChem Library publishes analytical data, data syntheses, models, technical reports, etc. The EarthChem Library Submission Guidelines provide detailed instructions for submitting research data. Public access to submitted data sets can be restricted with an embargo up to a maximum of 2 years.

Access to data in the EarthChem Library is open (Open Access) under the terms of the Creative Commons license BY-NC-SA 3.0. The EarthChem Library ensures long-term availability of its content by working with the Columbia University Libraries Digital Program. Data sets in the library are equipped with a Digital Object Identifier (DOI). The EarthChem Library is part of IEDA, a publishing agent of the DataCite Consortium.

What do I have to do if I want to publish data with FID GEO?

Just send us an e-mail or call. Publishing research data is usually straightforward, but depending on the type and amount of data different things may need to be considered. We will help you and make sure that your data are citable, licensed and permanently available and that the data can be found worldwide, all according to the latest standards.

What does data publication mean and what do I need if I want to publish my data?

Publishing data means that the data can be accessed and cited. For this purpose, it is important to create a persistent electronic “guide” to make sure that the data can always be found on the Internet, even if the web address (URL) under which the data can be accessed changes. Scientific publications usually employ the DOI (Digital Object Identifier) to fulfil this purpose. Our “Frequently Asked Questions” list contains more information on the subject of DOIs and how a DOI is created (in practice, we will create the DOI for you).

In order for your data to be found on the Internet, the data must be described in a way that ensures search engines can read the information. This description is realised by means of metadata. Our “Frequently Asked Questions” list below contains more information on the subject of metadata and how you can use metadata to describe your data set. For other people to evaluate and use your data, the data often require another, human-readable description in addition to the machine-readable metadata.

An electronic licence attached to your data will indicate how other people can reuse your data. The appeal for the use of open licences in science (“Nutzung Offener Lizenzen in der Wissenschaft”) by the Alliance of Science Organisations in Germany, which is also supported by research funders such as the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation), recommends open licences. In this context, Creative Commons Licences have proved a good way to publish openly accessible research data. You can choose between several options: a CC BY license, for example, grants other people free use of your data; they can change and even redistribute your data but they always have to include you as the author of the data. In most cases, we recommend this type of licence. Our “Frequently Asked Questions” list contains more information on the subject of electronic licences and on our recommendations.

Will I still be the owner of my data after publishing?

Yes. And only you decide what others may or may not do with your data. In order to make clear in which way your data may be reused or not, a licence directly connected to the data defines the copyrights and access rights. We will be happy to recommend a suitable licence and advise you on this topic.

Can I set an embargo on the publishing of my data?

Yes. The creator of the data has the right of first publication. It is therefore possible to set embargos, in particular (but not only) for work associated with earning academic credit, such as PhD theses. Even though the data are not publicly accessible during the embargo, they can already be published: thanks to the assigned DOI, your data can be cited, and thanks to the assigned metadata, your data publication can be found by search engines.

Do the scientific journals in which I publish my research articles support the publication of data?

Most publishers of geoscience journals support the publication of research data, and some of them even require it. A great number of publishers have signed the COPDESS “Statement of Commitment”, in which they commit to support the publication of data and to accept citations of data sets in the lists of works cited in scientific articles. Copernicus, Elsevier, Science, SpringerNature and Wiley as well as societies such as the American Geophysical Union, the European Geosciences Union and the Geological Society of London have signed this statement.

Why shouldn’t I publish my data supplement with the scientific journal in which my article is published?

Discipline-specific data repositories offer additional benefits that publishers of journals usually do not offer.

For more than 10 years now, publishers have offered the chance to add electronic data supplements to scientific articles. For a long time, this used to be the only option for publishing research data. Now there are data repositories (= special electronic archives for research data only), and these assign great value to the publication of reusable data: the quality of the metadata (and data) is checked by scientists from the corresponding field. Moreover, data repositories have a lot of experience in curation and long-term archiving of data.

This is why even many publishers of journals now recommend publishing research data through dedicated data repositories and linking the data to the research articles published by them. This recommendation is also part of the COPDESS “Statement of Commitment” signed by, for example, Copernicus, Elsevier, Science, SpringerNature, Wiley and societies such as the American Geophysical Union, the European Geosciences Union and the Geological Society of London.

PUBLISHING DATA WITH FID GEO


What are the advantages of publishing research data with FID GEO?

The service is supervised by specialists at the German Research Centre for Geosciences, GFZ, in Potsdam, who are familiar with geoscientific data. They oversee the GFZ data repository, which has been publishing data sets with DOIs since 2004, and are at the same time actively involved in the international development of state-of-the-art research data management.

  • DOIs (Digital Object Identifiers) are assigned to guarantee the unique and permanent identification of data on the Internet.
  • We make sure that your data appear in catalogues relevant for geoscientists around the world, thus guaranteeing the highest possible level of visibility in your research community.
  • We know which metadata are best suited to describe geoscientific data.
  • We also offer advice on data documentation and are able to provide information on data quality.
  • We make sure that data and corresponding published texts are electronically connected.
  • We can also advise you regarding legal aspects, in particular on topics such as control and reuse of your data. Depending on your needs, we can recommend the appropriate licenses that clearly define how others can use your data.

What type of research data can be published using FID GEO?

The FID GEO service is limited to the publication of research data that provide the basis for an article published in a scientific journal. If you would like to publish other types of data, such as data not yet linked to a text published in a scientific journal or data you don’t plan to link to a published text, we will be happy to advise you.

What are metadata?

Metadata are data that provide information about data. They consist of structured information that describes resources, helps locate them, or otherwise makes it easier to access, use, or handle them. The National Information Standards Organization offers a detailed description of what metadata are and how they are used: Understanding Metadata.

There are different types of metadata. The metadata for data discovery are the most important. In addition, there are structural and contextual metadata.

In order to make the automatic exchange of metadata possible, standardised, machine-readable metadata have been developed. These standards usually refer to the metadata for data discovery and include, for example, information on the authors and/or creators of the data, the title of the data set, the year the data were published and the geographic location, but also a brief description of the data set and the cross references to related published articles.

Contextual and structural metadata are information required for reusing the data, such as an overview of the units of the parameters in a table or information on data processing or an overview of all individual files of a data package. This type of metadata is often made available in the form of README.txt files or other supplementary documents.

How do I provide metadata to FID GEO?

You provide the metadata for data discovery (see answer to previous question) online, using an online metadata editor. The editor includes extensive user support, and most users find the form easy to fill in. Since metadata play such an important role in making your data findable on the Internet later on, we will check your entries before publishing them.

Contextual and structural metadata (see answer to previous question) are provided in a useful format depending on the data set. We will be happy to give advice.

Which metadata schema do we use?

Not all data require the same type of metadata. This is why different metadata schemas for different types of data and data from different disciplines have emerged over time. We use the DataCite Metadata Schema.
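
As a rough illustration of what a discovery metadata record can look like, the following Python sketch checks a record for the properties that, to our understanding, the DataCite Metadata Schema treats as mandatory. The field names are simplified, and the DOI, names and helper function are hypothetical placeholders; the online metadata editor, not this code, is what you actually use.

```python
# Properties we understand to be mandatory in the DataCite Metadata Schema.
# This list is an assumption for illustration; consult the current schema
# documentation before relying on it.
REQUIRED_FIELDS = (
    "identifier",
    "creators",
    "titles",
    "publisher",
    "publicationYear",
    "resourceType",
)


def missing_fields(record):
    """Return the required fields that are absent or empty in `record`."""
    return [name for name in REQUIRED_FIELDS if not record.get(name)]


# Hypothetical example record; the DOI and all names are placeholders.
example_record = {
    "identifier": "10.1234/example.2020.001",
    "creators": ["Doe, Jane"],
    "titles": ["Example geochemical data set"],
    "publisher": "Example Data Repository",
    "publicationYear": 2020,
    "resourceType": "Dataset",
}

print(missing_fields(example_record))
```

An empty list means that every field in the assumed mandatory set is present and non-empty.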

What is a Digital Object Identifier (DOI)?

A DOI is an online reference assigned to a digital resource (e.g. an article in a journal or research data) to give it a unique and permanent reference on the Internet. The DOIs are permanently connected to the digital resource – regardless of changes on websites or servers being shut down (in this case a DOI is simply rerouted to a new URL). The use of DOIs, for example, prevents the occurrence of dead links when publishers change the web address of a server. Among all the different ways to reference digital objects on the Internet permanently, DOIs have become the leading system when publishing text and data.
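
The rerouting described above can be pictured with a small Python sketch: a DOI consists of a prefix beginning with "10." and a suffix, separated by a slash, and it is turned into a stable link by placing it behind the resolver address https://doi.org/. The function name and the example DOI are hypothetical, and the check is deliberately simple rather than a full validation.

```python
from urllib.parse import quote

DOI_RESOLVER = "https://doi.org/"


def doi_to_url(doi):
    """Build the resolver URL for a DOI such as '10.1234/abc'."""
    doi = doi.strip()
    prefix, sep, suffix = doi.partition("/")
    # Minimal plausibility check: '10.<registrant>/<suffix>'.
    if not (prefix.startswith("10.") and sep and suffix):
        raise ValueError(f"not a DOI: {doi!r}")
    # Percent-encode characters that are not safe in a URL path.
    return DOI_RESOLVER + quote(doi, safe="/")


# Hypothetical DOI used only for illustration.
print(doi_to_url("10.1234/example.2020.001"))
```

The link stays the same when the landing page moves; only the target registered for the DOI is updated.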

How do I get a DOI for my data publication?

We will take care of assigning a DOI to your publication. DOIs are assigned based on the rules of the International DOI Foundation. The German Research Centre for Geosciences, GFZ, is a DOI publication agent and assigns the DOIs for data publications of GFZ Data Services. FID GEO takes advantage of the competency and infrastructure of GFZ Data Services not only to assign DOIs, but for the entire process of data publication.

What type of format is needed to submit my data?

There are no fixed specifications, but recommendations are offered by, for example, the UK Data Service and Stanford University. We will be happy to advise you.

In general, the following applies: data should be exchangeable without barriers and readable by others. Ideal formats are non-proprietary, unencrypted and commonly known across your research community and are based on open, documented standards. If the problem of proprietary formats occurs, in particular in the case of commercial software, you may be able to convert the data into open, standardised formats. Open and common formats are always preferable to proprietary formats if they achieve the same results or can be used accordingly without much effort.
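
As one possible illustration of an open, documented format, the Python sketch below writes a small table as plain CSV text and produces a matching plain-text description of the columns and units, of the kind that could go into a README.txt file. The column names, units and values are invented for the example, and CSV is only one of several suitable open formats.

```python
import csv
import io

# Hypothetical measurements; column names, units and values are illustrative.
columns = [("sample_id", ""), ("depth", "m"), ("temperature", "degC")]
rows = [("S1", 0.5, 12.3), ("S2", 1.0, 11.8)]


def to_csv(columns, rows):
    """Serialise the rows as CSV text with a single header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow([name for name, _unit in columns])
    writer.writerows(rows)
    return buffer.getvalue()


def to_readme(columns):
    """Describe each column and its unit for a README.txt file."""
    lines = ["Columns:"]
    for name, unit in columns:
        lines.append(f"  {name}: unit = {unit or 'none'}")
    return "\n".join(lines) + "\n"


print(to_csv(columns, rows))
print(to_readme(columns))
```

Keeping the units in a separate description leaves the CSV header simple to parse while still documenting what each column means.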

Where will my data be stored?

The data are stored in the data repository of the German Research Centre for Geosciences, GFZ, where they are permanently available. The GFZ has been publishing geoscientific research data since 2004 and ensures technical integrity and long-term availability of the data.

Will the data be versioned?

Data sets that have been assigned a DOI must not be changed. It is, however, possible to assign a new DOI to changed data sets.

Exceptions are constantly growing dynamic data sets, such as the time series from a climatological station. Here, new data can be added to the existing data set without changing the DOI, provided that the already published data have not been changed. However, the moment the already published data are changed (e.g. after the removal of outliers or a recalibration), a new version with a new DOI must be created. When a data set gets a new version, this will be indicated in both the original and the new version.
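
The rule for dynamic data sets can be sketched in a few lines of Python: appending records leaves the published part untouched, whereas changing or removing a published record calls for a new DOI. The function name and the sample values are hypothetical; this is an illustration of the rule, not the procedure used by the repository.

```python
def needs_new_doi(published, updated):
    """Return True if `updated` alters any already published record."""
    if len(updated) < len(published):
        # Records were removed, so the published data changed.
        return True
    # Appending is fine; any difference within the published part is a change.
    return list(updated[:len(published)]) != list(published)


# Hypothetical time series from a station.
published = [11.2, 11.5, 11.9]

print(needs_new_doi(published, [11.2, 11.5, 11.9, 12.1]))  # data appended
print(needs_new_doi(published, [11.2, 11.4, 11.9]))        # value recalibrated
```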

My home institution also offers a data publishing service. Should I publish my data with FID GEO anyway?

Subject-specific services such as FID GEO are deeply rooted in their disciplines and usually offer specific advantages for that reason, for example regarding the documentation of the data or the visibility for the research community. Our “Frequently Asked Questions” list names more advantages of publishing research data with FID GEO.

If your home institution agrees, the presentation of your data published with FID GEO on the Internet can be adapted to the look and feel of your home institution’s websites. It is possible, for example, to display all FID GEO data publications of a specific university on the Internet with the web design elements of that university. This emphasizes your affiliation with your home institution and at the same time increases the visibility of the institution on the Internet.

If your home institution also offers the publication of data, you should inform the institution of your publication with FID GEO. This is important to make sure that the metadata of your publication are also added to your home institution’s catalogue and can be used there, for example, for statistical evaluations with regard to the performance-based allocation of resources. We will be happy to contact your institution.

What principles and standards does FID GEO rely on to publish research data?

There are different initiatives that promote progress in the publication of research data. Three of the most important initiatives for the publication of geoscientific data are described below.

Coalition on Publishing Data in the Earth and Space Sciences (COPDESS)

With its initiative, this group of publishers and data facilities has created a framework for the joint development of policies and approaches for publishing and citing data. In its “Statement of Commitment” the group recommends, among other things, making specialist databases accessible. In this statement the publishers commit to providing information on the storage location and availability of the data related to a journal article. Moreover, the citation of data sets in scientific publications is treated as equivalent to the citation of published journal articles.

Since January 2015, more than 40 publishers and scientific institutions, organisations and data facilities have signed the statement. These include publishers like Copernicus, Elsevier, Science, SpringerNature and Wiley as well as societies such as the American Geophysical Union, the European Geosciences Union and the Geological Society of London.

Joint Declaration of Data Citation Principles (JDDCP)

In order to be able to use data publications to their full potential, the data must be cited appropriately. If the data are to be reused, it is, for example, important that the citations are not just human-readable but also machine-readable. The Force 11 “Data Citation Synthesis Group” has published recommendations for good practice for publishing data with the “Joint Declaration of Data Citation Principles (JDDCP)”.

Guiding Principles for Findable, Accessible, Interoperable and Re-usable Data Publishing (FAIR Principles)

The “FAIR Principles” offer further practical information in addition to the JDDCP mentioned above. The FAIR Principles are guidelines and recommendations for practical use and are intended to make it easier to find, access and reuse published data. A detailed description of the Principles was published by Wilkinson et al. 2016.

Licensed under the Creative Commons Attribution 4.0 International license.