FAIR Data Practices and Data Management Made Simple

It is the responsibility of every maize researcher to make data from publicly funded research Findable, Accessible, Interoperable, and Reusable (FAIR). Here we outline some basic guidelines for good data management. We are always happy to answer your questions on these issues!

Download slides from Lisa Harper's presentation at the 2019 Maize Meeting

  1. Put your Data in the right Database.
    Some examples:
    DNA/RNA/protein sequences and genome assemblies should go to NCBI, EBI, or DDBJ: NCBI (US), EBI (Europe), and DDBJ (Asia) provide stable, long-term databases for DNA, RNA, and protein sequence data and create stable identifiers (accessions) for datasets. These three share sequence data on a daily basis, so data deposited at one is available at all. Each has multiple sub-databases; for example, NCBI has SRA and GEO for unmapped and mapped sequence reads.
    SNPs: All non-human SNPs should be submitted to EVA at EBI.
    Genome assemblies: Please submit genome assemblies to EBI or NCBI Genomes. We understand this can take some time to complete. We can help; don’t be tempted to simply submit contigs to GenBank.
    If you are unsure where to submit data, or need help submitting, please ask anyone at MaizeGDB. If your journal article refers to data NOT published with your article, please obtain a persistent identifier for your data and include it, along with the location of the data, in your article.

  2. Don’t rename genes that already have names.
    Renaming genes that already have names is a HUGE problem in maize, especially when an existing name is reused for a different gene. Please look up your gene at MaizeGDB before assigning a name, and follow the maize nomenclature guidelines (/nomenclature).

  3. Attach complete and detailed metadata to your data sets, and use accepted file formats.
    When you deposit data, you are asked for information about your data (metadata). Please give this the same careful attention you give to your bench work and analysis. Datasets that are not adequately described are not reusable or reproducible, and they raise questions about the care and accuracy of the research.

  4. Ensure your datasets are “machine readable”.
    Computers can find data only when it matches a search query. Use complete, proper identifiers, including the proper case (LG1 is not the same as lg1), use permanent identifiers wherever possible, and include GO, PO, and PATO terms when possible. Please check and validate that your data is in a common, widely used machine-readable format.

  5. Publish your data with your paper.
    Sometimes data are too large to publish as a table or supplementary material with your paper. These data can be deposited in data repositories, which provide accessions or DOIs (stable identifiers). List these accessions or DOIs in your paper.

  6. Budget time for Data Management.
    Please budget time to do as good a job of managing your data as you do with the other aspects of your research.

  7. Familiarize yourself with the FAIR data sharing standards.
    To support the reuse of scholarly data, a group of data scientists has created a set of recommendations to make data Findable, Accessible, Interoperable, and Reusable. Here are some resources: www.go-fair.org and the AgBioData recommendations for research databases.
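
As an illustration of guideline 4, here is a minimal Python sketch of a case-sensitive identifier check. It is not a MaizeGDB tool: the function name check_identifiers and the reference list KNOWN are hypothetical, and a real check would compare against identifiers looked up at MaizeGDB. It shows only how an identifier that differs in case (Lg1 versus lg1 or LG1) can be flagged before data is deposited.

```python
def check_identifiers(submitted, known):
    """Classify each submitted identifier against a list of known identifiers.

    Returns a dict mapping each identifier to one of:
      "exact"         - matches a known identifier, including case
      "case_mismatch" - matches a known identifier only when case is ignored
      "unknown"       - matches nothing in the reference list
    """
    known_set = set(known)
    # Lowercased forms of the known identifiers, used only to detect case errors.
    known_lower = {ident.lower() for ident in known_set}

    report = {}
    for ident in submitted:
        if ident in known_set:
            report[ident] = "exact"
        elif ident.lower() in known_lower:
            report[ident] = "case_mismatch"
        else:
            report[ident] = "unknown"
    return report


# Hypothetical reference list for illustration only.
KNOWN = ["lg1", "LG1", "bm3"]

report = check_identifiers(["lg1", "Lg1", "xyz9"], KNOWN)
for ident, status in report.items():
    print(ident, status)
```

An identifier reported as "case_mismatch" should be corrected by hand rather than automatically, because two identifiers that differ only in case may refer to different things.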

The MaizeGDB.org team

MaizeGDB is a founding member of AgBioData, a consortium of agriculture-related online resources that is committed to making agriculture-related research data FAIR.