Data management for students: Storing and archiving

Research Data Management (RDM) for students of Radboud University

Storing and archiving

Storage deals with how you can safely store your research data while you are working on it. Archiving deals with storing data after your project is over, for scientific integrity. In some cases, you will want to share your data publicly for reuse after your project is over. This is not common for students; if it applies to you, discuss it in detail with your supervisor.

Safe storage and sharing

Safe storage and sharing are important, whether you are working with personal data or not, because they prevent data loss and data leaks. Sharing personal data should be avoided in general. If your storage solution has no built-in sharing option, you can use SURFfilesender. If it is absolutely necessary to share personal data through SURFfilesender, enable file encryption.

Preferred:

  • Radboud University Microsoft 365 Teams: You can ask your supervisor to create a Team for you. They may need to request permission first, after which they can create a Team with a private channel for you and disable the synchronization function. Use the Microsoft 365 account that Radboud University gave to you and your supervisor; do not use a private Microsoft 365 account!
    Features: automatic backup, designed for sharing, accessible from anywhere
  • Workgroup folder: You can ask your supervisor to create a workgroup folder for you via the account portal. Your supervisor should choose the option “Workgroup folder (with students): folder request”.
    Features: automatic backup, limited sharing, accessible via eduVPN and RU connect

Alternatives:

  • Radboud University Microsoft 365 Teams: You can create a Team yourself. Use the Microsoft 365 account given to you by Radboud University, request permission via the account portal, create a Team to which you add your supervisor(s), and disable the synchronization function.
    Features: automatic backup, designed for sharing, accessible from anywhere
  • Local drive on personal pc/laptop: You can use the local drive on your personal pc or laptop if it is password protected and (when you have personal data) encrypted. This is a less preferred option because it has no automatic backups – if you lose your computer or it crashes, you can lose all your data.
    Features: no automatic backup, no built-in sharing

Do NOT use:

  • Commercial cloud storage: Avoid commercial cloud storage, such as Google Drive and Dropbox, especially for personal data! Also avoid using these for backups!
  • Portable drives: Try to avoid portable drives, such as a USB drive or an external hard drive, for storing or backing up your data, since they are easily lost.

For more information about safe storage, see the ICT facilities page.

Archiving data for scientific integrity

Why should you archive?

Once your research project is completed, it is important to archive your data for the sake of scientific integrity, so that your research is verifiable for others (e.g., for your supervisor or during audits). These datasets are not made public or shared with other researchers; they are only accessible to people such as your supervisor.

Once you have archived your data, remove all other copies of the data you have stored elsewhere, for example on your local drive on a personal computer. 
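
Before removing other copies, it can help to confirm that the archived copy is complete. The sketch below is one possible check, not a prescribed procedure: it assumes the archive is reachable as an ordinary folder (for example, a workgroup folder mapped as a network drive), and the function names sha256_of and find_mismatches are hypothetical. It compares the SHA-256 checksum of each local file with that of its archived counterpart.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatches(local_dir: Path, archive_dir: Path) -> list[str]:
    """List local files that are missing from, or differ in, the archive folder."""
    problems = []
    for local_file in sorted(p for p in local_dir.rglob("*") if p.is_file()):
        relative = local_file.relative_to(local_dir)
        archived = archive_dir / relative
        if not archived.is_file():
            problems.append(f"missing: {relative.as_posix()}")
        elif sha256_of(local_file) != sha256_of(archived):
            problems.append(f"differs: {relative.as_posix()}")
    return problems
```

An empty list suggests that every local file has an identical archived copy. Anything that is listed should be resolved before the local files are deleted.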

What should be archived?

You should archive all data that are relevant and necessary for an outsider to be able to reproduce your analysis and conclusions. If you have any doubts about what to archive, you should discuss this with your supervisor.

It is good practice to include documentation with your data. This documentation explains your data and makes sure that your dataset is still understandable a few years from now. The files that should ideally be included in a dataset are:

  • Research data files: This may include the collection materials (e.g., experiment and stimuli, questionnaires), the raw data (e.g., Qualtrics export, responses to interviews such as audio files, output from an experiment), (pre-)processed data (cleaned up data used for the analyses), and the analysed data consisting of the results (tables, charts, statistical outputs) as well as the analysis scripts (e.g., SPSS, R files).
  • Documentation: The goal is to provide information about the context, content and structure of the dataset.
    • A readme file provides some context for the dataset. It also lists the data files that you uploaded by name and briefly describes the content of each file. You can find an example of a readme file in the Appendix.
    • A codebook describing the variables in your research as well as a methodology file can be useful additions, too.
  • Supplementary files: You may additionally include your data management plan if you have one. If you had human participants, you could also include the ethical approval, blank informed consent forms, information sheets, and debriefing forms.
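
The file listing for a readme can be drafted automatically and then completed by hand. The following is a minimal sketch, assuming the dataset sits in a single local folder; the function name readme_inventory and the descriptions mapping are illustrative and not part of any Radboud University tool.

```python
from pathlib import Path


def readme_inventory(dataset_dir: Path, descriptions: dict[str, str]) -> str:
    """Build a readme-style listing: each file name followed by its description."""
    lines = []
    for path in sorted(p for p in dataset_dir.rglob("*") if p.is_file()):
        name = path.relative_to(dataset_dir).as_posix()
        # Files without a description get a visible placeholder to fill in by hand.
        description = descriptions.get(name, "TODO: describe this file")
        lines.append(f"- {name}:\n    {description}")
    return "\n".join(lines)
```

The placeholder makes it easy to spot files that still lack a description before the dataset is archived.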

NB: To protect your participants’ privacy, personal data that you did not need for your conclusions (e.g., administrative data such as email addresses used to contact participants) should be deleted as soon as possible and should not be archived.
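
One way to keep such administrative fields out of the archived copy is to write a version of the file without them. The following is a minimal sketch, assuming the data are stored in a CSV file with a header row; the function name drop_columns and the column name email are hypothetical. It only writes a cleaned copy, so the original file still has to be deleted separately.

```python
import csv
from pathlib import Path


def drop_columns(source: Path, target: Path, columns_to_drop: set[str]) -> None:
    """Copy a CSV file to target, leaving out the named columns."""
    with source.open(newline="", encoding="utf-8") as infile, \
            target.open("w", newline="", encoding="utf-8") as outfile:
        reader = csv.DictReader(infile)
        kept = [name for name in (reader.fieldnames or []) if name not in columns_to_drop]
        # extrasaction="ignore" silently skips the dropped columns in each row.
        writer = csv.DictWriter(outfile, fieldnames=kept, extrasaction="ignore")
        writer.writeheader()
        for row in reader:
            writer.writerow(row)


# Example call (hypothetical file and column names):
# drop_columns(Path("participants_full.csv"), Path("participants_archive.csv"), {"email"})
```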

Where to archive?

Check with your supervisor or teacher to see where you should archive your data.

  • RIS for Students: Some study programmes ask students to use RIS for Students to archive their datasets. You can find a manual on the website.
  • A workgroup folder: Your supervisor will need to create a workgroup folder via the account portal and give you access.

NB: Data from RadboudUMC cannot be archived in RIS for Students or in a workgroup folder.

Sharing data publicly for reuse

Sharing data publicly is not standard procedure for student projects. If, however, your research gets published and/or your supervisor and you think that your data could be valuable to others for reuse, you should discuss the following points with your supervisor.

DO’S

  • Ask your supervisor if they think it’s a good idea to share your data publicly.
  • Ask your supervisor what data to share exactly.
  • Ask your supervisor what an appropriate license and access level for your dataset is.
  • Ask your supervisor about an appropriate archive in your field. Otherwise DANS EASY and DANS Data Stations are good options.

DON’TS

  • Do not share personal data publicly!

A dataset that you share for reuse must not contain any personal data (unless this is required for a journal publication, participants gave specific consent for it, and your local ethics committee approved it). Thus, a dataset that you share publicly will often be different from the dataset you archived for scientific integrity. You usually have to archive raw data for scientific integrity, but you are often not allowed to share these data publicly. For example, if you collect audio recordings of participants, you will archive these for scientific integrity. If you want to share your data publicly for reuse, however, you will very likely only be able to share an anonymous transcript of the audio and not the audio files themselves.

Appendix: example readme file

Dataset title: Good arguments or a charming narrator? Exploring a text’s persuasiveness through eye-tracking

Student:                Sanne Huisman (s123456)
First supervisor:       dr. Lisa Begeleider
Second reader:          dr. Ton Lezer

Short summary

This dataset contains all relevant data files for the thesis Good arguments or a charming narrator? Exploring a text’s persuasiveness through eye-tracking, written by Sanne Huisman to obtain the degree of Bachelor of Arts and conclude the bachelor’s programme International Business Communication at Radboud University. This research was conducted at the CLS Lab in the spring of 2019 and supervised by dr. Lisa Begeleider and dr. Ton Lezer.

The goal of this thesis was to explore the persuasiveness of a text as a function of the quality of the presented arguments as well as the likability of the person making these arguments. While persuasiveness is often measured with questionnaires, we explored whether persuasiveness is also reflected in eye movements. A total of 78 participants took part in this study.

Dataset structure

This dataset contains a total of eight files as well as two zip folders:

  • README.txt:
    That is this very readme file.
  • Rawdata_eyetracking.zip:
    This zip folder contains the raw eye-tracking data of 78 participants. The data was collected using an EyeLink 1000+ eye-tracker. The folder structure of the raw output is unchanged.
  • Experiment.zip:
    This zip folder contains all files necessary to run the experiment using the experiment software Experiment Builder as installed in the CLS Lab.
  • Participant_overview.xlsx:
    This Excel file contains a pseudonymized overview of all participants. It includes the participant id (001, 002, etc.), gender and age of each participant.
  • Persuasiveness_questionnaire_empty.doc:
    This file contains an empty version of the questionnaire by McCroskey and Teven (1999), which was used to measure persuasiveness.
  • Persuasiveness_results.xlsx:
    This file contains the results of all 78 participants on McCroskey and Teven’s (1999) questionnaire.
  • Eyedata_clean.txt:
    This text file contains all the pre-processed and cleaned up eye-tracking data. The variable names are chosen in such a way that they are self-explanatory.
  • data_analysis_anova.sav:
    This file contains all the eye-tracking and questionnaire data as it was analysed in SPSS.
  • analysis_anova.sps:
    This SPSS syntax file contains the full analysis.
  • Methodology_HuismanS_2019.pdf:
    This is the methodology section of the thesis which was written based on these data.

References

McCroskey, J. C., & Teven, J. J. (1999). Goodwill: A reexamination of the construct and its measurement. Communication Monographs, 66(1), 90-103. https://doi.org/10.1080/03637759909376464