Add columns to participants.tsv and participants.json (before data collection, optional)

Prepare participant files for data collection

Add a participant's entry whenever data from a new participant is collected. To prepare for data collection, set up the participants.tsv and participants.json files beforehand, so that newly acquired data can be validated and is annotated immediately. This page explains how to fill in the participants file during data acquisition and how to properly extend the participants files before data collection.

participants.tsv file

The default participants.tsv file contains three columns: participant_id, age, and sex. Whenever data belonging to a new participant is added to a dataset, a new row should be added to this file.

Example contents of participants.tsv:

participant_id	age	sex
sub-v01	24	m
sub-v02	28	o
sub-v03	25	f
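
Adding a row can also be scripted. The following is a minimal sketch, not part of the template: the helper name append_participant is hypothetical, and it works on the file content as a string so that reading and writing the file is left to you.

```python
import csv
import io


def append_participant(tsv_text, participant_id, age, sex):
    """Return participants.tsv content with one extra row appended."""
    # Parse the existing content with a tab as the column separator,
    # dropping completely empty lines.
    rows = [row for row in csv.reader(io.StringIO(tsv_text), delimiter="\t") if row]
    # Refuse to list the same participant twice (row 0 is the header).
    if any(row[0] == participant_id for row in rows[1:]):
        raise ValueError(f"{participant_id} is already listed")
    rows.append([participant_id, str(age), sex])
    out = io.StringIO()
    csv.writer(out, delimiter="\t", lineterminator="\n").writerows(rows)
    return out.getvalue()


existing = "participant_id\tage\tsex\nsub-v01\t24\tm\n"
updated = append_participant(existing, "sub-v02", 28, "o")
```

Here updated holds the header, the original row for sub-v01, and the new row for sub-v02.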

The participants.tsv file has the following rules:

  • Use a tab character as the column separator.
  • Use lower case m, f and o for the sex values. According to BIDS, these values refer to the phenotypical sex.
  • Age should be an integer. For other value types, see the section "age different than integer".
  • Use 89 to indicate ages over 89 to prevent participant identification. Do not indicate age as 89+ or any other string value.
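
The rules above can be checked automatically before new data is added. This is a minimal sketch under stated assumptions: the function name check_participants_tsv is hypothetical, it only covers the four rules listed here, and it does not replace a full BIDS validation.

```python
import csv
import io

ALLOWED_SEX = {"m", "f", "o"}


def check_participants_tsv(text):
    """Return a list of problems found in participants.tsv content."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    # If the file is not tab-separated, the header collapses into one
    # field and the expected columns are reported as missing.
    missing = {"participant_id", "age", "sex"} - set(reader.fieldnames or [])
    if missing:
        return [f"missing columns: {sorted(missing)}"]
    problems = []
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        sex = row["sex"] or ""
        age = row["age"] or ""
        if sex not in ALLOWED_SEX:
            problems.append(f"line {line_no}: sex must be m, f or o, got {sex!r}")
        if not (age.isascii() and age.isdigit()):
            problems.append(f"line {line_no}: age must be an integer, got {age!r}")
        elif int(age) > 89:
            problems.append(f"line {line_no}: ages over 89 must be recorded as 89")
    return problems
```

An empty list means none of the listed rules is violated. A value such as 89+ is reported as a non-integer age.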

participants.json file

The default participants.json contains the description of the participants.tsv columns. The default annotations are in line with the annotations required for the querying of summarized participant level information via Neurobagel.

age different than integer

The template assumes that the age column has integer values. If your age values are of a different type, indicate this in the section of participants.json marked with arrows below, choosing the values according to this table.

    "age": {
        "Annotations": {
            "IsAbout": {
                "Label": "Age",
                "TermURL": "nb:Age"
            },
            "Transformation": {
                "Label": "integer value", <-------
                "TermURL": "nb:FromInt" <--------
            },
            "MissingValues": []
        },
        "Description": "The age of the participant at data acquisition",
        "Unit": "years"
    }
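
The marked section can also be changed with a short script. The sketch below is illustrative only: the helper name set_age_transformation is hypothetical, and the pair "float value" / "nb:FromFloat" is an assumed example that should be checked against the table before use.

```python
import json


def set_age_transformation(sidecar, label, term_url):
    """Return a copy of the sidecar with the age Transformation replaced."""
    updated = json.loads(json.dumps(sidecar))  # deep copy via JSON round trip
    updated["age"]["Annotations"]["Transformation"] = {
        "Label": label,
        "TermURL": term_url,
    }
    return updated


# The default age entry as shown above.
sidecar = {
    "age": {
        "Annotations": {
            "IsAbout": {"Label": "Age", "TermURL": "nb:Age"},
            "Transformation": {"Label": "integer value", "TermURL": "nb:FromInt"},
            "MissingValues": [],
        },
        "Description": "The age of the participant at data acquisition",
        "Unit": "years",
    }
}

# Assumed values for decimal ages; verify them against the table.
updated = set_age_transformation(sidecar, "float value", "nb:FromFloat")
print(json.dumps(updated["age"]["Annotations"]["Transformation"], indent=4))
```

Only the two marked fields change; the rest of the age entry is carried over as is.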

Additional columns

Your participants.tsv may contain additional columns describing your participants. In that case, participants.json has to be extended with descriptions of all additional columns.

Additional columns MUST NOT contain personal data.

group column with patient's diagnosis

An additional group column describes the participant's diagnosis. Use the annotation tool provided by Neurobagel to generate the column description.

Continuous data columns

Description of an example column height with values in centimeters:

    "height": {
        "Description": "The self-reported height of a subject.",
        "Unit": "centimeters"
    }

Categorical data columns

Description of an example column level_of_education with three categorical values (1: elementary, 2: secondary, 3: postsecondary):

    "level_of_education": {
        "Description": "Level of education of the subject at the moment of data collection.",
        "Levels": {
            "1": {
                "Label": "elementary"
            },
            "2": {
                "Label": "secondary"
            },
            "3": {
                "Label": "postsecondary"
            }
        }
    }
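
To confirm that every additional column is described and that categorical values match their declared levels, a check along the following lines may help. It is a sketch only: both function names are hypothetical, and the sidecar is assumed to be the participants.json content already loaded into a dict.

```python
import csv
import io

DEFAULT_COLUMNS = {"participant_id", "age", "sex"}


def undescribed_columns(tsv_text, sidecar):
    """List additional participants.tsv columns that lack a sidecar entry."""
    header = next(csv.reader(io.StringIO(tsv_text), delimiter="\t"), [])
    return [
        name for name in header
        if name not in DEFAULT_COLUMNS and name not in sidecar
    ]


def values_outside_levels(tsv_text, sidecar, column):
    """List values of a categorical column that are not declared in Levels."""
    levels = set(sidecar[column]["Levels"])
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    # Empty cells are skipped; only filled-in values are compared.
    values = {row[column] for row in reader if row.get(column)}
    return sorted(values - levels)
```

Values are compared as strings, because the keys under "Levels" in the JSON file are strings.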