Tutorial 10 - Independent Component Analysis (ICA) and Resting State fMRI

Goals

  • To understand the principles of ICA

  • To learn how independent components can be classed as BOLD signals vs. noise based on multiple criteria

  • To learn how ICA can be used to identify noise components

  • To learn how ICA can be used to identify functional networks in task data or resting state data

Relevant Lecture

Lecture 09g: Independent Component Analysis

ADD: Brain Connectivity

Accompanying Data

Tutorial 10 Data

Tutorial 10 Questions

Independent Component Analysis (ICA)

We know that fMRI task data is comprised of known signals plus various sources of noise (such as drift, breathing, motion artifacts). Our approach thus far has been to predict the known sources of activation changes while either preprocessing data to reduce noise (i.e., residuals) or to model out known sources of noise (for example, when we include motion parameters as predictors of no interest). This approach is theoretically driven data analysis.

However, we may not always know how to model certain patterns of activation (e.g., activation related to a seizure or hallucination) and there may be sources of noise we haven't accounted for. Data-driven analyses can provide a valuable tool in such cases.

The goal of Spatial ICA is to conduct a data-driven decomposition of fMRI signals into a collection of independent sources. As we should expect with fMRI, many of these sources will be driven by noise while others will be related to experimental conditions.

In spatial ICA, we are computing a decomposition that will provide a collection of independent time courses that can be used to model observed signal over space – either all voxels, or some subset of voxels. Then, for each voxel, ICA will tell us to what degree (positive or negative) each component contributes to the observed signal.

In other words, ICA will give us a collection of components with which we can model each voxel’s observed time course as a weighted sum, which can be visualized as follows.

 
 

This basic model should seem familiar – as it’s quite similar to how we model observed signal as a weighted sum of predictor functions in the GLM . Here though, ICA will automatically compute both the independent components and the model weights such that certain statistical properties are satisfied. In particular, the independent components are generated such that they collectively share an approximately minimal amount of mutual information . A nicely balanced explanation of how ICA works, including for fMRI analysis, is available at http://users.ics.aalto.fi/whyj/publications/thesis/thesis_node8.html

Why ICA?

ICA can be a useful complement to traditional GLM analyses.

Common uses of ICA in fMRI are:

1. Removing noise

Artifacts often account for a lot of variance. The approaches we have seen so far include filtering out artifacts (e.g., low-frequency drift) and modelling out artifacts with predictors of no interest (especially motion parameters). However, these approaches may still leave considerable variance accounted for (thus increasing residuals). For example, head motion may cause artifacts that do not match expectations (e.g., an abrupt motion can yield spike artifacts). Thus another approach is to extract artifacts from the data itself.

ICA can be used to flag noise components and their beta weights. These components can then be subtracted from the data to leave cleaner data. In this approach, the removal of noise will reduce the residuals and improve statistical significance (Thomas et al., 2002). Alternatively, the time courses from noise components could be extracted and included in a GLM as predictors of no interest.

2. Identifying Networks in Resting State Data

Fluctuations in brain networks during resting state scans are spontaneous and can't be modelled with a GLM. One common approach to analyzing resting state data is to use ICA to identify networks with correlated activation (and/or deactivation). For example, ICA can identify foundational brain networks (e.g., default-mode network, DMN; task-positive network; sensory and motor networks).

3. Identifying Activation that is Not Easily Modelled

Some paradigms expect activation that is correlated with events but not in a straightforward way. For example, a participant may have a seizure or halluciantion that evokes neural processes that are not well modelled with convolved box-car predictors.

ICA in BrainVoyager

BrainVoyager offers a simple implementation of ICA for a single run that makes it easy to visualize components. Though beyond the scope of the course, BrainVoyager also offers a more complex implementation, the Group ICA Plug-in, that can investigate ICA components across participants.

In this part of the tutorial, we will interpret the results of applying ICA to two different scans – one experimental scan and one resting state scans. Our goal is to identify independent components and their associated statistical maps that correspond to BOLD signal as opposed to various sources of noise. We will use 3 techniques to help determine whether an IC is signal or noise:

1) IC time course examination,
2) IC map examination, and
3) fingerprinting.

The following instructions will guide you through interpretation of ICA applied to an task-based scan from our localizer. Then you will apply the same techniques to one resting state scans in an effort to find the components most likely to be driven by BOLD signal.

When running ICA, the experimenter must decide how many components to search for. In this example, we used Brain Voyager's default to decompose the data into 30 independent components. For all of the examples below, we have kept the default setting.

ICA for Task Data

Open the anatomical file sub-15_Localizer/sub-15_ses-01_T1w_IIHC_MNI.vmr .

From the Analysis menu, select Overlay ICA… . Click the Load .ICA… button under the Analysis tab then select and open sub-15_ses-01_task-Localizer_run-01_bold_SCCTBL_3DMCTS_MNI_Masked.ica

Check the Show component time courses box . The Time Course Plot window will then open for you to visualize IC time courses.

Click the "Fingerprint" button .

 
REBEKKA REDO AND FLAG FINGERPRINT BUTTON TOO

REBEKKA REDO AND FLAG FINGERPRINT BUTTON TOO

 

Try to arrange the windows on your computer so you can see the maps on the VMR, the ICA maps control panel, the time courses and the fingerprints all at the same time.

Figure 10-2.  Try to arrange the windows on your screen so you can see all these windows simultaneously.

Figure 10-2. Try to arrange the windows on your screen so you can see all these windows simultaneously.

Each fingerprint is associated with a spatial profile, as shown by the statistical map (upper left in Fig. 10-2), and a temporal profile (time course, lower right in Fig. 10-2). Regions of the IC map shown in hot colors are positively correlated with the IC time course; regions shown in cool colors are negatively correlated.

Fingerprints

Often the components we are most interested in are the BOLD components. The challenge is finding them. One useful tool is "fingerprints".

There are certain properties that we expect for "real" activation vs. artifacts.

1) Real activation should have a high degree of clustering because adjacent voxels fall in the same functional region; whereas, some sources of noise may be more "higgledy piggledy" (one voxel here and there.

2) Real activation should have a temporal autocorrelation (as indicated by the one-lag autocorrelation) because hemodynamics do not change dramatically from volume to volume; whereas, some artifacts will not have a strong autocorrelation.

3) Real activation should have power in intermediate temporal frequencies ; whereas, artifacts often show up in low frequencies (e.g., drift) and high frequencies (e.g., breathing, cardiac).

4) The histogram of voxel intensities should have different properties for real activation (high skewness, low kurtosis vs. artifacts.

Since this is a lot of information to evaluate, we can make the task easier by rendering a "fingerprint" to visualize these attributes in a polar plot. In the white fingerprint below (or the grey-blue one next to it), the 11 spokes represent different attributes (e.g., 12 o'clock = degree of clustering) and the length of each spoke indicates the strength of that attribute (e.g., the plot below shows a high degree of clustering but a low kurtosis). This is the expected pattern for real BOLD activation. Note that it looks a bit like a Mercedes symbol -- three prongs at 12 o'clock, ~5 o'clock, and ~8 o'clock.

 
 
Figure 10-3. An example from BrainVoyager explaining a fingerprint . The fingerprint is a polar plot that represents 11 pieces of information . Each piece of information is represented by a different angle “around the clock”. The strength of that fe…

Figure 10-3. An example from BrainVoyager explaining a fingerprint . The fingerprint is a polar plot that represents 11 pieces of information . Each piece of information is represented by a different angle “around the clock”. The strength of that feature is represented by the distance from the centre of the plot. The fingerprint in panel E shows strong clustering of activation (B) and skewness in the histogram (A) (12 and 1 o’clock positions), a strong one-lag autocorrelation (4 o’clock position), and high power in the intermediate temporal frequencies of the power spectrum (D): ( 9 and 10 o’clock positions: 0.02-0.1 Hz; cycles of 10-50 s) but not low frequencies (7 o’clock position, < .02 Hz; cycles > 50 s) or high frequencies (11 o’clock position: > 0.1 Hz, cycles < 10 s).

 

Various sources of noise will have different fingerprints as indicated by the multicoloured fingerprints below.

 
Figure 10-4. Different types of noise have different fingerprints. From De Martino et al., 2007, NeuroImage.

Figure 10-4. Different types of noise have different fingerprints. From De Martino et al., 2007, NeuroImage.

 

Sorting Components

We can sort components to make it easier to find the most interesting components. In the Independent Component Analysis control panel, you can sort components by adjusting the Sorting Option choice.

Two options are:

Nr = Component number

RMS = Amount of variance accounted for

SortingRMS2.jpg

Another option is to sort components based on the predictors from a conventional GLM. For this localizer data run, components can be sorted for similarity with the 4 convolved boxcar predictors for Bodies, Faces, Hands, and Scrambled relative to baseline.

DM Pred Max = Similarity to design matrix predictors

DM Max Pred Index = Predictor most strongly correlated (1 = Bodies; 2 = Faces, 3 = Hands, 4 = Scrambled).

Sorting Pred index2.jpg

wIn addition, components can theoretically be sorted by spatial similarity to known VOIs or networks (but BV kept crashing when we tried to do this). THESE DO NOT WORK FOR THIS ICA.

Spatial Template (max corr) = maximum correlation with VOI

Spatial Template (VOI index) = which VOI is most correlated

Exploring ICs

Use the following information to look through the each of the 30 components:

  1. Component sorting
  2. Fingerprints
  3. Spatial IC map
  4. Component time courses

Remember that the IC map shows regions that are positively (warm colors) and negatively (cool colors) correlated with the time course. Thus there are two equivalent ways to represent the data. For example, let's say the time course shows higher signal at Time A than Time B. If Voxel X shows A>B, it would appear in a warm color. If Voxel Y shows A<B, it would appear in a cool color. Mathematically, the data could be explained equally well by the same time course inverted (i.e., with every timepoint multiplied by -1) such that it shows lower signal at Time A than Time B. In this case, Voxel X would appear in a cool color and Voxel Y would appear in a warm color. Although these are mathematically equivalent, sometimes it can seem more intuitive to see one combination instead of the other. You can flip between these two displays using the "Pos <-> Neg" button in the Independent Components control panel.

Question 1: Using the four strategies described above, quickly look through each of the 30 ICs and identify two or three that you think are driven by BOLD signal, and two or three that are driven by different types of noise. Give a one-sentence summary for each as to your reasoning.

Question 2: Find a BOLD component that makes more sense to you with one version of the "Pos <-> Neg" rendering than the other and explain why this is so.

Question 3: ICs can include a combination of both signal and noise. Find one component like this and explain what makes you think it is a mixture.

Masking with ICA

Any fMRI data, analyzed with a GLM or with ICA can be masked to include only voxels inside the brain. Usually we want to exclude voxels outside the brain, though sometimes it can be informative to see voxels outside the brain, where effects can not be due to BOLD activation, only artifacts.

Generally for ICA, it's best to use a mask so that the components are used to explain brain patterns. However, there may be cases where unmasked data can reveal additional artifacts.

Click Load ICA and select sub-15_ses-01_task-Localizer_run-01_bold_SCCTBL_3DMCTS_MNI_Unmasked.ica

Question 4: Can you find (a) component(s) outside the brain that account for a lot of variance and that could be useful for knowing something important about the participants' behavior. Hint: Use the blue box mode to set the MNI coordinates to z = -38.

Resting State Functional Connectivity

Even when the brain is not engaged in a task, activation fluctuates and these fluctuations are correlated across brain regions. To investigate such fluctuations, data can be collected in Resting State Scans. Participants are asked to lie in the scanner and "not think about anything in particular" (whatever that means). The eyes can be open or closed and there are some differences in the networks evoked in these two cases.

Networks in Resting State

Two networks that are commonly found during resting state scans are the Default Mode Network (DMN) and the Task-Positive Network (TPN). The names for these two networks come from their activation during task data. The TPN has been found to be activated more than a rest baseline during a diverse range of cognitive tasks (e.g., attention or memory tasks). The DMN has been found to be activated less than a rest baseline during such cognitive tasks (thus another name could've been Task-Negative Network). Though there has been much debate about the function of these two networks, one prominent interpretation is that the TPN is engaged when a person is engaged with the outside world, whereas, the DMN is engaged when a person is in a more self-reflective state like daydreaming or mind wandering.

Figure 10-5. Default-Mode Network (blue-green) and Task-Positive Network (red-yellow) from Fox &amp; Raichle, 2007, Nature Reviews Neuroscience

Figure 10-5. Default-Mode Network (blue-green) and Task-Positive Network (red-yellow) from Fox & Raichle, 2007, Nature Reviews Neuroscience

Figure 10-6. Neurosynth.org map for “default mode”

Figure 10-6. Neurosynth.org map for “default mode”

In addition to these two prominent networks, other networks can be identified from resting state data (e.g., visual, auditory, motor, executive function). Nodes within distinct network are thought to be functionally connected due to anatomical connections, though resting-state data alone cannot determine whether connections are direct or indirect.

Preprocessing for Resting State Data

Because the fluctuations at rest typically unfold over tens of seconds, resting state data is typically preprocessed differently than task data. Most importantly, unlike task data, resting state data is typically bandpass filtered -- to remove high temporal frequencies as well as low temporal frequencies.

The data we will use for the next exercise comes from one 10-minute resting state scan from one participant. The participant's eyes were closed throughout the scan.

The data were preprocessed with moderate spatial smoothing and bandpass filtering to remove temporal frequencies slower than 6 cycles/run (6 cycles/run = 6 cycles/600 s = 0.01 Hz = 100 s/cycle) and temporal frequencies higher than 0.1 Hz (= 10 s/cycle).

Question 5: Why would it be beneficial to use bandpass filtering for resting state analysis?

Two Approaches to Analyze Resting-State Data

There are two main approaches to analyzing resting state data

  1. seed-based analysis
  2. ICA

You will explore both.

Seed-based Analysis of Resting-State Data

An experimenter may want to find the brain network associated with a particular brain region.

To find the DMN, we will use a seed based on the most robust node in the DMN -- the precuneus. We created a spherical ROI (diameter = 6 mm) around the hot spot of the precuneus from a search of "default mode" on Neurosynth.org.

Close all open BrainVoyager windows.

Select File/Open... and open sub-20_RSC/sub-20_ses-01_T1w_IIHC_MNI.vmr

Select Analysis/Link Volume Time Course (VTC) File..., and link sub-20_ses-01_task-RestEyesClosed_run-01_bold_SCCTBL_3DMCTS_MNI_SD3DVSS4.00mm_THPGLMF6c_TDTS10.0s.vtc

Next, click Analysis/Region-of-Interest Analysis..., load Neurosynth_DMN.voi

Select DMN Precuneus and click Show VOI to see the location.

We have saved the time course for this region for the heavily filtered functional data.

Select Analysis/Compute Linear Correlation Maps..., load DMN_Precuneus_Seed_Run1_filtered sub-20_ses-01_task-RestEyesClosed_run-01_bold_SCCTBL_3DMCTS_MNI_SD3DVSS4.00mm_THPGLMF6c_TDTS10.0s.rtc and click GO.

You have now computed a voxelwise correlation between this predictor function and every voxel in the brain.

Adjust the threshold to highlight only the most robust activation.

Question 6: How successful was this approach at finding the DMN?

In Analysis/Overlay Volume Maps..., go to the Map Options tab. Under the Threshold using p value section, select a Bonferroni correction with a threshold of p = .05 and click Apply.

Question 7: Considering that a Bonferroni correction is typically overly conservative, what could explain the pattern of unusual activation here? What does this indicate about the challenges of resting state analyses? What strategies could improve the map?

ICA of Resting-State Data

Now let's try ICA on the same data.

Select Analysis/Overlay ICA... amd load sub-20_ses-01_task-RestEyesClosed_run-01_bold_SCCTBL_3DMCTS_MNI_SD3DVSS4.00mm_THPGLMF6c_TDTS10.0s.ica

Using the techniques you used for exploration of ICs with task data earlier, explore the ICs for this resting state data. Look for meaningful components using all the strategies you have learned (spatial profile, temporal profile, fingerprinting, component sorting). Note that the data can be sorted (using DM pred max) for the correlation with the DMN precuneus time course. Also note that because we have heavily temporally filtered the data, the left side of the fingerprint will be less useful than before (because we have filtered out the frequencies that would normally show up in the 7 o'clock and 11 o'clock ranges). It will help to move your cursor around the volume. Remember that sometimes components make more sense when positive and negative activation are flipped.

Question 8: Which component best reveals the DMN and/or TPN? Explain the reasoning for your choice.

Question 9: Why might ICA be superior to seed-based analyses even for major networks like the DMN vs. TPN?

Question 10: Can you find evidence for three other major sensory, cognitive or motor networks? Explain the reasoning for your choice.

Question 11: Based on your experiences in this tutorial, even though ICA is cool, what factors can make it challenging to interpret? What challenges would you anticipate in applying resting state analyses to group data?