EPSRC DMP Compliance Rubric

A checklist / marking rubric for evaluating Data Management Plans (DMPs) for compliance with EPSRC data retention and sharing requirements.

Funder Template

EPSRC

Purpose of rubric

Providing feedback to researchers

Author(s)

Mary Donaldson
Service Coordinator, Research Data Management, University of Glasgow.

Version

This is version 2.0 and has been posted to enable discussion and comments within the research data community.

A PDF can be downloaded from the Zenodo repository - https://doi.org/10.5281/zenodo.247087

Notes

BBSRC expects data sharing to take place in the following research areas:

  •   Data arising from high volume experimentation.
    
  •   Low throughput data arising from long time series or cumulative approaches.
    
  •   Models generated using systems approaches.
    
    Data sharing is also expected in other areas where there is a strong scientific case and where it is cost effective.

The general expectation for research data is that it will be available at the end of the research project, and that the DMPs will be used to describe how it will be made
available. If data are not to be shared, there should be a clear case why not, and if a full dataset is not available, why a partial or restricted dataset might not
be available. (From Michael Ball, BBSRC)

Maximum plan length is one A4 page.

Documents used

Clarifications of EPSRC expectations on research data management
EPSRC policy framework on research data

Criteria and performance levels

Performance criteria

Detailed

Addressed but incomplete/unsatisfactory

Not addressed

What type of data will be collected?

Data types clearly defined. Eg experimental measurements, models, recordings, video, images, machine logs etc.

Data types mentioned for some of project / dataset but not all.

No details included.

What format of data will be collected?

Data formats clearly defined. Eg spreadsheets in .csv or .xlsx; micrographs in .tiff or .jpg; proprietary manufacturer formats where necessary.

Data formats are mentioned for some of dataset but not all.

No details included.

What scale / volume of data will be collected?

Clear estimate of dataset size given for each data type.

Dataset size given but not broken down by data type. Size not given for all data types. Dataset size is clearly unrealistic (not always possible to judge!).

No indication of data volume is given.

How will data be collected?

Methodology is clearly stated.

Methodology is mentioned for a subset of the data to be collected.

No methodology is mentioned.

What documentation will accompany the data?

Clear outline of documentation with references to existing good practice in the community or detailed project specific approach where community standards don't exist.

Some mention of documentation without detail about community standards or a project-specific approach.

No mention of documentation.

What metadata will accompany the data?

Clear outline of metadata strategy with references to existing good practice in
the community or detailed project specific approach where community standards don't exist.

Some mention of metadata without detail about community standards or a project-specific approach.

No mention of metadata.

Are there any ethical issues to consider?

Clear assessment of any ethical issues related to the project or a statement that there are no ethical issues to
consider in the project.

Some mention of ethical issues, but no details.

No mention of ethical issues related to project and no statement that there are no issues to consider.

How will ethical issues be managed?

Clear methodology for managing ethical issues identified in relation to the
project. This section might be blank if project has no related ethical issues.

Some mention of managing ethical issues, but methodology is not clear or
is clearly inadequate / inappropriate (this could be difficult to judge).

No mention of how identified ethical issues will be dealt with.

How will copyright and IPR issues be managed?

Clearly states how copyright and IPR are allocated / owned with relation to the
project.

Some mention of how copyright and IPR are allocated / owned with relation to the project, but details are lacking, or only a subset of the data is addressed.

No mention of how copyright and IPR are allocated / owned with relation to the project.

How will data be stored?

Clear description of data storage systems. Eg data stored on managed storage provided by IT services; data
stored on local machine and portable drive.

Mention of data storage systems, but lacking detail or clearly inappropriate (could be difficult to judge).

No mention of data storage systems.

How will data be backed up?

Clear description of data backup routines / protocols. Eg automatic backup every night; weekly backup of
equipment to server.

Some mention of data backup routines / protocols but detail lacking or clearly
inappropriate.

No mention of data backup systems.

How will access to and security of data be managed?

Clear description of access and security procedures. Eg data stored on password-protected drive. Data
encrypted if necessary. Office door locked when researcher is out of office.

Some mention of data security, but detail lacking or methods inappropriate.

No mention of data access / security controls.

Which data will be retained?

Clear assessment of which data will be retained for the long term. Eg data underpinning publications. Raw data versus cleaned data (or vice versa). Raw data and final version, but not interim versions. De-identified data versus complete record. Audio recordings versus transcriptions.

Data retention mentioned but detail is lacking.

No mention of data retention.

Which data will be shared?

Clear assessment of which data will be shared. Eg data underpinning publications. Raw data versus cleaned data (or vice versa). Raw data and final version, but not interim versions. De-identified data versus complete record.
Or, a clear statement that some / all data are not suitable for sharing, with justification as to why.

Data sharing is mentioned, but without detail about which subsets are suitable. Statement that some / all data are not suitable for
sharing but no justification as to why.

No mention of data sharing.

What are the plans for long-term preservation of the data?

Clear strategy for long-term
preservation of dataset. Eg deposit in an appropriate responsible repository.
Clear statement that dataset won't be preserved / is not suitable for preservation.

Preservation is mentioned but strategy is not clear or lacks detail.

No mention of preservation of dataset.

How will dataset(s) be shared?

Clear consideration of where, how and to whom the data will be made available. Strategy is in line with good practice in the area of research (if able to judge!). Assessment of specific access mechanisms if needed.

Some mention made of how the data will be shared but details missing.

No mention of how dataset will be shared.

When will data be shared outwith the study team?

A clear timescale is indicated, eg no later than the publication of the main
findings of the research (funder expectation), at the end of the award, or within 3 years of the generation of the
dataset. Where a delay to release is indicated, reasons are given.

Timescale is mentioned but not clear or not clear for all datasets.
Timescale is clearly not in accordance with funder expectations.
Delayed timescale is indicated but reasons are not given for delay.

Timescale for release of data is not mentioned.

Are there any restrictions to data sharing?

Clear assessment of any restrictions which might apply to data sharing, with reasons, eg potential patent application, ethical reasons, commercial co-funding. Clear statement that there would be no restrictions to sharing any of the data.

Mentions a need to restrict access to data/subset of data but without giving reasons why.

No mention of data sharing restrictions.

Who is responsible for study-wide data management?

Clear indication of who is responsible. This might be more than one person eg PI has overall responsibility, but postdoc / student has daily responsibility for record keeping, data entry and metadata recording.

Mentions that responsibility will be taken for data management without giving details of who / which processes.

No mention of responsibility for data management.

What resources are needed to deliver the plan?

Required resources are listed, or there is a statement that no further resources are needed. Resources requested relate to implementation of the rest of the plan.

It is stated that resources are needed, but details are not provided.

No mention of required resources.

For how long will dataset be retained?

Clear indication of appropriate retention schedule, in line with
funder expectation (10 years from the date of last access). Where retention schedule deviates from the funder's
expectation, reasons are given.

Some indication of retention schedule, but it doesn't cover all of the dataset.
Or, retention schedule deviates from funder expectation, but no reasons are given for this.

No mention of retention schedule for dataset(s).
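
When marking the retention criterion, it can help to work out the date that "10 years from the date of last access" implies for a given dataset. The following is a minimal Python sketch of that date arithmetic only. The function name retention_end, the constant RETENTION_YEARS and the handling of 29 February are illustrative assumptions and are not part of the rubric or of EPSRC guidance.

```python
from datetime import date

# Funder expectation quoted in the rubric: 10 years from the date of last access.
RETENTION_YEARS = 10


def retention_end(last_access: date, years: int = RETENTION_YEARS) -> date:
    """Return the date on which the retention period ends (hypothetical helper)."""
    try:
        return last_access.replace(year=last_access.year + years)
    except ValueError:
        # 29 February has no counterpart in a non-leap year.
        # Rolling forward to 1 March is an assumption made for this sketch.
        return date(last_access.year + years, 3, 1)


if __name__ == "__main__":
    # Example: a dataset last accessed on 20 January 2017.
    print(retention_end(date(2017, 1, 20)).isoformat())
```

Because the period runs from the date of last access, each new access moves the end date forward, so the date would need to be recalculated whenever the dataset is accessed again.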