My research group recently began working with the Human Connectome Project (HCP) dataset, a large database of resting-state fMRI, task fMRI and other brain imaging and demographic data for 500+ subjects. The HCP is pretty unique in that it combines a large number of subjects with long scans and a standardized scanning protocol. This distinguishes it from other large neuroimaging datasets, such as ABIDE, ADHD-200 or the 1000 Functional Connectomes Project, which are grassroots compilations of data collected across multiple sites with different scanning equipment, acquisition protocols, preprocessing pipelines and quality control procedures. Most of the resting-state scans in those datasets are 5-10 minutes long, whereas in the HCP there are 60 minutes of resting-state data collected for each subject.
Our group and many others are interested in using the HCP to develop and evaluate different ways of analyzing brain imaging data. However, some aspects of the HCP are new to many researchers, and it’s not always obvious from an outside perspective how to work with and interpret the data. One example of this is data stored in “grayordinates” using the CIFTI file format. While the HCP also contains data in volumetric space (in NIFTI file format), working with the grayordinates is appealing for a number of reasons that I won’t elaborate on here, except to say that the idea is to isolate the gray matter, which consists of cortical, subcortical and cerebellar areas. However, working with CIFTI files can be daunting for a new user, and the documentation is not always very user-friendly.
I recently started working with these files, so based on my experience here is a basic introduction to navigating, reading and writing CIFTI files with MATLAB. This explanation is based on my limited experience, but it will hopefully be helpful for other novice users. Please feel free to make additions or corrections in the comments!
Part 1: Navigating CIFTI files
The first thing to know about CIFTI files is that they contain several parts, similar to a structure in MATLAB or a list in R. In that sense, they are very different from NIFTI files, which are essentially just 3D or 4D arrays of intensities. Furthermore, there are several types of CIFTI files, including times series (*.dtseries.nii), parcellations (*.dlabel.nii), and scalar images (*.dscalar.nii), and the parts or “fields” contained in each type of CIFTI file are different (though they have several commonalities).
Consider a CIFTI file representing time series data. The time series themselves are stored in a VxT matrix, where V is the number of grayordinate “voxels” and T is the number of time points in the series. (Technically “voxels” only refer to volumetric space, but it’s convenient to have a single term for both surface and volumetric elements… please forgive the abuse of notation.) The total number of voxels in a CIFTI file is around 90,000, including about 60,000 surface voxels (about 30,000 per hemisphere) and 30,000 subcortical and cerebellar voxels. The surface voxels are a 2D representation of the cortical gray matter, while the subcortical voxels are still in 3D/volumetric space. Each surface grayordinate is essentially an average of several cortical gray matter voxels in the original volumetric scan. So what else does a CIFTI file contain? Common to all types of CIFTI files are the following fields:
- brainstructure: a vector of length V with a numerical indicator (1-21) for the major brain structure that each voxel forms part of
- brainstructurelabel: a vector of length 21 with the name of each major brain structure (e.g. CORTEX_LEFT, CORTEX_RIGHT, CAUDATE_LEFT, CAUDATE_RIGHT, CEREBELLUM_LEFT, CEREBELLUM_RIGHT, etc.)
- pos: a Vx3 matrix of the x-y-z coordinates of each voxel. However, only the subcortical and cerebellar voxels have coordinates specified; for surface voxels, the values are missing.
In addition, each type of CIFTI file contains additional information.
- A time series (*.dtseries.nii) CIFTI file contains time (1xT) and dtseries (VxT), where time is the timing (in seconds) of each time point (0, 0.72, 1.44, …) and dtseries is the time series of each voxel.
- A parcellation (*.dlabel.nii) CIFTI file contains some parcellation field myfield (Vx1) and myfieldlabel (1xQ), where Q is the number of parcels. For example, in the HCP groupICA parcellation files (melodic_IC_ftb.dlabel.nii), the parcellation field is named “indexmax“, and Q is the number of ICA components. (As a side note, I believe “indexmax” refers to how spatial components from a groupICA were used to create a parcellation, namely by identifying the component with the maximum z-score at each voxel.) In the FreeSurfer anatomical parcellation files ending in aparc.32k_fs_LR.dlabel.nii, theparcellation field is named “x100307_aparc”, and Q=70.
- A scalar (*.dscalar.nii) CIFTI file contains one or more fields field1 (Vx1), field2 (Vx1), etc., each representing a scalar image of some sort. For example, in the HCP groupICA spatial maps are represented as this type of file, with the fields x1,….,xQ representing the Q spatial maps. Structural images for each subject are also represented as .dscalar files. For example, the files ending in MyelinMap_BC.32k_fs_LR.dscalar.nii contain the field myelinmap_bc, which contain the estimated myelin content of each of the roughly 60,000 surface voxels.
You might have noticed that I never mentioned where CIFTI files store the location of each surface voxel. That’s because they don’t! This is because the surface can be “inflated” to varying degrees, from very “wiggly” (showing full anatomical detail) to completely flattened (showing no anatomical detail). Depending on the degree of inflation, the 3-dimensional location of each surface voxel changes! For a given degree of inflation, the x-y-z location of each voxel are stored as a different type of file, called a GIFTI. I’ll mention below how to read and navigate these files in MATLAB too.
Part 2: Reading CIFTI files with MATLAB
I use the FieldTrip toolbox, which includes the function ft_read_cifti. Once you have the toolbox downloaded and have added to the path, reading a CIFTI file is as simple as mycifti = ft_read_cifti(‘fname.dsomething.nii’). mycifti is then a MATLAB structure containing fields such as the ones described above. Each field can be accessed or altered with the “.” operator, i.e. mycifti.fieldname.
As mentioned above, CIFTI files do not contain the x-y-z coordinates of the surface voxels. These are instead contained in GIFTI files with names ending in “surf.gii”. In the HCP dataset, the level of inflation is indicated in the file name and includes “flat”, “inflated”, “mid thickness”, and “very_inflated”. The left- and right-hemispheres are stored in separate GIFTI files. The GIFTI toolbox can be used to read GIFTI files into MATLAB. After installing and adding the toolbox, mygifti = gifti(‘fname.surf.gii’) results in a MATLAB structure with several fields, including vertices, a VL x 3 matrix containing the x-y-z coordinates of each voxel, where VL is around 30,000, the number of surface voxels in one hemisphere of the brain. To map the voxels in a GIFTI file to the corresponding voxels in a CIFTI file, reference the brainstructure field: all voxels with brainstructure==1 (CORTEX_LEFT) map to those in a GIFTI file for the left hemisphere; all voxels with brainstructure==2 (CORTEX_RIGHT) map to those in a GIFTI file for the right hemisphere.
Part 3: Writing CIFTI files with MATLAB
I have found this to be the most challenging aspect of working with CIFTI files, in part because the documentation for the ft_write_cifti function seems to be somewhat limited. Furthermore, since there are different types of CIFTI files, you cannot simply run ft_write_cifti(‘fname’, mycifti), even if mycifti is simply the result of mycifti = ft_read_cifti(‘fname.d*.nii’). Instead, you must provide some additional information in the form of “optional” key-value pairs. Also, in general I’ve found that you must respect the original file type. For example, to save a parcellation you should start by reading in a .dlabel file; to save a static image you should start by reading in a .dscalar.nii file; etc. Here are a few examples.
- ft_write_cifti(‘fname’, mycifti, ‘parameter’, ‘dtseries’) saves the object mycifti, which has a dtseries field containing the functional data, as the CIFTI file fname.dtseries.nii.
- ft_write_cifti(‘fname’, mycifti, ‘parameter’, ‘x1’, ‘parameter’, ‘x2’) saves the object mycifti, which has fields x1 and x2 containing two static images, as the CIFTI file fname.dscalar.nii.
- ft_write_cifti(‘fname’, ‘mycifti’, ‘parcellation’, ‘parcels’, ‘parameter’, ‘parcels’) or ft_write_cifti(‘fname’, ‘mycifti’, ‘parameter’, ‘parcels’) saves the object mycifti, which has a field parcels containing the labels of each voxel, as the CIFTI file fname.dscalar.nii. (I haven’t figured out how to save a .dlabel file. Running ft_write_cifti(‘fname’, ‘mycifti’, ‘parcellation’, ‘parcels’) throws an error that indicates that ‘parameter’ is required. Specifying ‘parameter’ and ‘parcellation’ or just ‘parameter’ results in a .dscalar file. If anyone knows how to do this, I’d love to know!)
One aspect of working with CIFTI files that I have conspiciously ignored is visualization. This is obviously a very important aspect of working with imaging data, but for now let’s just say it’s next on my to-do list. Luckily another member of our group, Mary Beth Nebel, has started working with the Connectome Workbench, and has produced some beautiful figures, including the ones below. I hope to pick her brain soon and report back!
Scan-rescan reliability by ICA parcel (Q=100):
Group ICA parcellations: