Add columns to participants.tsv and participants.json (before data collection, optional)
Prepare participant files for data collection
The participant files should be updated whenever data from a new participant is collected. Before data collection starts, prepare the participants.tsv and participants.json files so that newly acquired data can be validated and is annotated immediately. This page explains how to fill in the participant files during data acquisition and how to properly extend them before data collection.
participants.tsv file
The default participants.tsv file contains three columns: participant_id, age, and sex. Whenever data belonging to a new participant is added to a dataset, a new row should be added to this file.
Example contents of participants.tsv:
participant_id age sex
sub-v01 24 m
sub-v02 28 o
sub-v03 25 f
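Adding a row for each new participant can be scripted. The following is a minimal sketch, not part of the template: the function name append_participant is hypothetical, and it assumes the file only has the three default columns.

```python
import csv
from pathlib import Path

# Default column order described above.
HEADER = ["participant_id", "age", "sex"]


def append_participant(tsv_path, participant_id, age, sex):
    """Append one tab-separated row, writing the header first if the file is empty."""
    path = Path(tsv_path)
    existing = path.read_text(encoding="utf-8") if path.exists() else ""
    with path.open("a", newline="", encoding="utf-8") as handle:
        # Make sure the new row starts on its own line.
        if existing and not existing.endswith("\n"):
            handle.write("\n")
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        if not existing.strip():
            writer.writerow(HEADER)
        writer.writerow([participant_id, age, sex])
```

A call such as append_participant("participants.tsv", "sub-v04", 31, "f") would add one row. If the file has additional columns, the row needs a value for each of them as well, which this sketch does not handle.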
The participants.tsv file has the following rules:
- Use a tab as the column separator.
- Use lower case m, f, and o for the sex values. According to BIDS, these values refer to the phenotypical sex.
- Age should be an integer. For other values, see "age different than integer".
- Use 89 to indicate ages over 89 to prevent participant identification. Do not indicate age as 89+ or any other string value.
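The rules above can be checked automatically before a file is committed. The sketch below is an illustration only and does not replace a BIDS validator; the function name check_participants_tsv is hypothetical, it assumes integer ages, and it does not treat missing-value markers specially.

```python
import csv
import io
import re

ALLOWED_SEX = {"m", "f", "o"}
REQUIRED_COLUMNS = ("participant_id", "age", "sex")


def check_participants_tsv(text):
    """Return a list of human-readable rule violations for participants.tsv content."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    fieldnames = reader.fieldnames or []
    # A file that is not tab-separated shows up here as missing columns.
    problems = [f"missing column: {name}" for name in REQUIRED_COLUMNS if name not in fieldnames]
    if problems:
        return problems
    for line_no, row in enumerate(reader, start=2):
        pid = row["participant_id"]
        age = row["age"] or ""
        if row["sex"] not in ALLOWED_SEX:
            problems.append(f"line {line_no} ({pid}): sex must be m, f, or o")
        if not re.fullmatch(r"[0-9]+", age):
            problems.append(f"line {line_no} ({pid}): age must be an integer")
        elif int(age) > 89:
            problems.append(f"line {line_no} ({pid}): ages over 89 must be written as 89")
    return problems
```

An empty list means that none of the checked rules were violated. If the age column uses a different value type, the integer check would have to be adapted.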
participants.json file
The default participants.json file contains the descriptions of the participants.tsv columns. The default annotations are in line with the annotations required for querying summarized participant-level information via Neurobagel.
age different than integer
The template assumes that the age column has integer values. If your age values are of a different type, indicate this in the section of participants.json marked below, according to this table.
"age": {
    "Annotations": {
        "IsAbout": {
            "Label": "Age",
            "TermURL": "nb:Age"
        },
        "Transformation": {
            "Label": "integer value",    <-------
            "TermURL": "nb:FromInt"      <--------
        },
        "MissingValues": []
    },
    "Description": "The age of the participant at data acquisition",
    "Unit": "years"
}
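The two marked fields can also be changed programmatically. The following sketch uses a hypothetical helper, set_age_transformation, that swaps the Transformation entry in a loaded sidecar; the label and TermURL passed to it must be taken from the table referenced above and are not checked here.

```python
import copy


def set_age_transformation(sidecar, label, term_url):
    """Return a copy of the sidecar dict with the age Transformation replaced."""
    updated = copy.deepcopy(sidecar)  # leave the caller's dict untouched
    updated["age"]["Annotations"]["Transformation"] = {
        "Label": label,
        "TermURL": term_url,
    }
    return updated
```

Typical use would be to read participants.json with json.load, call the helper, and write the result back with json.dump. For example, set_age_transformation(sidecar, "float value", "nb:FromFloat") would mark decimal ages, assuming that this pair is the one listed in the table for that value type.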
Additional columns
Your participants.tsv file may contain additional columns describing your participants. In that case, participants.json has to be extended with descriptions of all additional columns.
Additional columns MUST NOT contain personal data.
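To confirm that every column in participants.tsv has a description in participants.json, the header can be compared with the top-level keys of the sidecar. This is a minimal sketch with a hypothetical function name, undescribed_columns; it only checks that an entry exists, not that the entry is complete.

```python
import csv
import io
import json


def undescribed_columns(tsv_text, sidecar_text):
    """Return the TSV column names that have no entry in the JSON sidecar."""
    header = next(csv.reader(io.StringIO(tsv_text), delimiter="\t"), [])
    described = json.loads(sidecar_text)
    return [name for name in header if name not in described]
```

An empty result means that each column has at least a stub entry in the sidecar.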
group column with patient's diagnosis
An additional group column describes the participant's diagnosis. Use the annotation tool provided by Neurobagel to generate the column description.
Continuous data columns
Description of an example column height with values in centimeters:
"height": {
    "Description": "The self-reported height of a subject.",
    "Unit": "centimeters"
}
Categorical data columns
Description of an example column level_of_education with three categorical values (1: elementary, 2: secondary, 3: postsecondary):
"level_of_education": {
    "Description": "Level of education of the subject at the moment of data collection.",
    "Levels": {
        "1": {
            "Label": "elementary"
        },
        "2": {
            "Label": "secondary"
        },
        "3": {
            "Label": "postsecondary"
        }
    }
}
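For columns with many categories, the Levels block can be generated from a mapping instead of being typed by hand. The sketch below is an illustration; both helper names (categorical_description, unknown_levels) are hypothetical, and the output follows the structure of the level_of_education example above.

```python
def categorical_description(description, labels):
    """Build a sidecar entry for a categorical column from a value-to-label mapping."""
    return {
        "Description": description,
        # JSON object keys are strings, so the coded values are converted.
        "Levels": {str(value): {"Label": label} for value, label in labels.items()},
    }


def unknown_levels(values, entry):
    """Return the sorted coded values that are not declared in the entry's Levels."""
    declared = set(entry["Levels"])
    return sorted({str(value) for value in values} - declared)
```

The first helper produces the dictionary to store under the column name in participants.json. The second can be run on the column values from participants.tsv to find codes that were never described.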