FAIR Data Practices and Data Management Made Simple
It is the responsibility of every maize researcher to make data from
publicly funded research Findable, Accessible, Interoperable, and Reusable (FAIR).
Here we outline some basic guidelines for good data management. We are always
happy to answer your questions on these issues!
Download slides from Lisa Harper's presentation at the 2019 Maize Meeting
- Put your data in the right database.
Some examples:
DNA, RNA, and protein sequences and genome assemblies should go to NCBI,
EBI, or DDBJ: NCBI (US), EBI (Europe), and DDBJ (Asia) provide stable,
long-term databases for DNA, RNA, and protein sequence data and create
stable identifiers (accessions) for datasets. These three share sequence
data on a daily basis, so data deposited at one is available at all. Each
has multiple sub-databases; for example, NCBI has SRA and GEO for un-mapped
and mapped sequence reads.
SNPs: All non-human SNPs should be submitted to EVA at EBI.
Genome assemblies: Please submit genome assemblies to EBI or NCBI Genomes.
We understand this can take some time to complete, and we can help. Don’t
be tempted to simply submit contigs to GenBank.
If you are unsure where to submit data, or need help submitting, please
ask anyone at MaizeGDB. If your journal article refers to data NOT
published with your article, please obtain a persistent identifier for
those data and give both the identifier and the location of the data in
your article.
- Don’t rename genes that already have names.
Renaming genes that already have names is a HUGE problem in maize,
especially when an existing name is reused for a different gene. Please
look up your gene at MaizeGDB before assigning a name, and follow the
maize nomenclature guidelines (/nomenclature).
- Attach complete and detailed metadata to your data sets, and use accepted file formats.
When you deposit data, you are asked for information about your data
(metadata). Please give this the same careful attention you give to your
bench work and analysis. Datasets that are not adequately described are not
reusable or reproducible, and raise questions about the carefulness and
accuracy of the research.
- Ensure your datasets are “machine readable”.
Computers find data by matching it against a search query, so details
matter. Use complete, proper identifiers, including the proper case (LG1 is
not the same as lg1), use permanent identifiers wherever possible, and
include GO, PO, and PATO terms when possible. Please check and validate
that your data are in common, widely used machine-readable formats.
- Publish your data with your paper.
Sometimes data are too large to publish as a table or supplementary
material with your paper. These data can be deposited in data repositories,
which provide stable identifiers (accessions or DOIs). These identifiers
should be listed in your paper.
- Budget time for data management.
Please budget time to do a good job of managing your data, just as you do
for the other aspects of your research.
- Familiarize yourself with the FAIR data sharing standards.
To support the reuse of scholarly data, a group of data scientists have
created a set of recommendations to make data Findable, Accessible,
Interoperable, and Reusable. Here are some resources:
recommendations for research databases
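As an illustration of the machine-readability guideline above, here is a minimal Python sketch of two checks you might run before depositing a dataset. It is not a MaizeGDB tool. The function names and the example gene list are hypothetical, and the ontology check assumes that GO, PO, and PATO term identifiers consist of the prefix, a colon, and seven digits. It only checks the format of a term identifier, not whether the term exists.

```python
import re

# Assumption: GO, PO and PATO term IDs look like "GO:" + seven digits.
ONTOLOGY_ID = re.compile(r"(GO|PO|PATO):\d{7}")


def find_case_mismatches(symbols, known_symbols):
    """Return (submitted, known) pairs where a submitted gene symbol
    matches a known symbol only when letter case is ignored."""
    known = set(known_symbols)
    by_lowercase = {}
    for name in known:
        by_lowercase.setdefault(name.lower(), []).append(name)

    mismatches = []
    for symbol in symbols:
        if symbol in known:
            continue  # exact, case-sensitive match: nothing to report
        for candidate in sorted(by_lowercase.get(symbol.lower(), [])):
            mismatches.append((symbol, candidate))
    return mismatches


def find_malformed_ontology_ids(term_ids):
    """Return the term IDs that do not fit the assumed PREFIX:0000000 form."""
    return [term for term in term_ids if not ONTOLOGY_ID.fullmatch(term)]


if __name__ == "__main__":
    # Hypothetical reference list; in practice, look symbols up at MaizeGDB.
    reference = {"lg1"}
    print(find_case_mismatches(["LG1", "lg1"], reference))
    print(find_malformed_ontology_ids(["GO:0008150", "go:0008150", "PO:9001"]))
```

Running checks like these over the identifier columns of a table catches the "LG1 versus lg1" kind of error before the data reach a repository, where a case-sensitive search would silently miss the record.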
The MaizeGDB.org team
MaizeGDB is a founding member of AgBioData,
a consortium of agriculture-related online resources that is committed to
making agriculture-related research data FAIR.