Processing & Analysing Data
Version control systems
A popular version control system for data and software is Git. Git is widely used for source code version control; researchers can also use it to track changes, create branches for different analyses, and collaborate on data-related projects.
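The workflow described above (tracking a change, then branching for a different analysis) can be sketched in a few lines. This is a minimal illustration, not a recommended project setup: it assumes Git is installed and on the PATH, it drives Git from Python purely to keep the example self-contained, and the file name, branch name and researcher identity are hypothetical.

```python
import subprocess
import tempfile
from pathlib import Path


def git(*args, cwd):
    """Run one git command in `cwd` and return its standard output."""
    result = subprocess.run(
        ["git", *args], cwd=cwd, check=True, capture_output=True, text=True
    )
    return result.stdout.strip()


def demo(project: Path) -> str:
    """Create a repository, commit one script, and branch off for a second analysis."""
    git("init", cwd=project)
    # Hypothetical identity, set for this repository only so the commit succeeds
    # on a machine without a global Git configuration.
    git("config", "user.name", "Example Researcher", cwd=project)
    git("config", "user.email", "researcher@example.org", cwd=project)

    # Track a change: add a (hypothetical) analysis script and commit it.
    (project / "analysis.py").write_text("print('baseline analysis')\n")
    git("add", "analysis.py", cwd=project)
    git("commit", "-m", "Add baseline analysis script", cwd=project)

    # Create a branch for a different analysis without touching the baseline.
    git("checkout", "-b", "alternative-model", cwd=project)
    return git("rev-parse", "--abbrev-ref", "HEAD", cwd=project)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print("current branch:", demo(Path(tmp)))
```

In everyday use the same commands are usually typed directly in a terminal; the wrapper function only exists so that the steps can be run and checked as one script.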
Many data repositories, such as Dataverse and OSF, support dataset versioning. Each version retains its own unique identifier and timestamp, which enables researchers to track the evolution of the dataset over time.
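The idea of pairing each dataset version with an identifier and a timestamp can be illustrated locally. The sketch below is an assumption-laden analogy, not how Dataverse or OSF assign identifiers: it uses a SHA-256 checksum as a stand-in identifier, and the function name and the two small CSV snippets are hypothetical.

```python
import hashlib
from datetime import datetime, timezone


def fingerprint(data: bytes) -> dict:
    """Return a checksum, timestamp and size for one version of a dataset."""
    return {
        # Stand-in identifier: identical content always gives the same checksum.
        "sha256": hashlib.sha256(data).hexdigest(),
        "recorded_at": datetime.now(timezone.utc).isoformat(),
        "size_bytes": len(data),
    }


# Two hypothetical versions of the same small CSV file; one value was corrected.
version_1 = b"id,score\n1,4.5\n2,3.9\n"
version_2 = b"id,score\n1,4.5\n2,4.1\n"

history = [fingerprint(version_1), fingerprint(version_2)]
changed = history[0]["sha256"] != history[1]["sha256"]
print("dataset changed between versions:", changed)
```

A repository does this bookkeeping for you and adds a persistent identifier; a local checksum log like this one is mainly useful for verifying that a file has not changed unnoticed.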
Data analysis tools
Some widely used quantitative analysis tools across various disciplines include: R and RStudio, Python and Jupyter Notebook, SPSS, Stata, MATLAB, and even Excel.
Qualitative data analysis involves systematically examining non-numerical data, such as text, audio, images and video, in order to identify patterns, themes and insights. An example of qualitative analysis software that supports coding, text interpretation and multimedia analysis is ATLAS.ti.
The software offered by VU Amsterdam includes: Stata, SPSS, ATLAS.ti.
Data analysis documentation
Data analysis documentation includes notes on data preprocessing (such as cleaning), a description of the statistical analysis, the code and scripts used, and a record of the software versions.
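Keeping track of software versions can be partly automated. The following is a minimal sketch for a Python-based analysis, using only the standard library; the function name, the output file name and the package list are hypothetical and should be adapted to the packages your own analysis imports.

```python
import json
import platform
from datetime import datetime, timezone
from importlib import metadata


def describe_environment(packages):
    """Return a dictionary describing the software used for an analysis."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Record the absence explicitly rather than skipping the package.
            versions[name] = "not installed"
    return {
        "recorded_at": datetime.now(timezone.utc).isoformat(),
        "python_version": platform.python_version(),
        "operating_system": platform.platform(),
        "packages": versions,
    }


if __name__ == "__main__":
    # Hypothetical package list; replace it with the packages you actually use.
    info = describe_environment(["numpy", "pandas"])
    with open("environment.json", "w", encoding="utf-8") as handle:
        json.dump(info, handle, indent=2)
```

Storing the resulting file next to the analysis scripts means the software record is versioned and archived together with the code it describes.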
Research data & software management training
Research data management (RDM) concerns the organisation, documentation, storage, archiving and sharing of digital and analogue data.
Proper RDM ensures reliable verification of results, and permits new and innovative research built on existing information. In other words, proper RDM makes data more FAIR (findable, accessible, interoperable and reusable) as it supports the transparency of scientific output. The Library Calendar gives an overview of the courses on offer and lets you register for them.
Software Carpentry workshops aim to help researchers acquire basic research computing skills. These hands-on workshops cover basic concepts and tools, including program design, version control, data management, and task automation.