Tutorial 9 - Multivariate Analysis Methods

This tutorial was adapted from a class project developed by Geoffrey Laforge, Kathleen Lyons, and Clara Stafford. Thanks!!!

Goals

  • To understand the difference between approaches that focus on multivariate (multivoxel) analyses rather than univariate (subtraction) analyses

  • To understand the method of Representational Similarity Analysis

  • To understand how Multidimensional Scaling can be applied to RSA data

  • To understand how theoretical models can be tested with RSA data

Relevant Lectures

Accompanying Data

Tutorial 9 Questions

There is no fMRI data for this tutorial. There are two Excel files showing how Data RSMs are generated.

RSA-Example_sub-01_MainExp_Right-FFA_NoSplit.xlsx

RSA-Example_sub-01_MainExp_Right-FFA_OddEvenSplit.xlsx

Multi-voxel pattern analysis (MVPA)

Multivoxel pattern analysis (MVPA) is an analysis technique that allows us to ask questions about how information is represented spatially in different areas of the brain (Norman et al., 2006). Instead of focusing on BOLD activity averaged across an entire brain region and across all participants, MVPA investigates the spatial pattern of activity across individual voxels in individual participants to determine whether different cognitive states or "representations" of stimuli can be distinguished within those voxels. The different approaches of MVPA ask whether the patterns of activity elicited by one condition or set of conditions are different from, or similar to, the patterns of activity elicited by another condition or set of conditions in the same region (Mur et al., 2009). MVPA differs from a univariate approach to analyzing fMRI data because MVPA analyzes the relationship between experimental conditions and the pattern of activity across voxels; thus, it can characterize how multiple conditions (>2) are related to one another (Davis et al., 2014).

For example, using a univariate approach, a researcher might determine that there is greater activation for faces than hands in the fusiform face area (FFA), suggesting that the FFA is sensitive to faces and not hands; but using an MVPA approach, we may be able to determine that the FFA carries information not only about faces and hands but about a wide variety of other stimulus categories (e.g., chairs). This approach was pioneered by Haxby and colleagues (2001, Science) to show that even areas in the ventral temporal cortex that show peak activation to one stimulus category over others (such as the FFA) still carry information about those other categories (e.g., chairs, shoes) in their patterns of activity across voxels.

There are two main "flavours" of MVPA:

1) Classifiers, which will not be discussed here

2) Correlations, especially representational similarity analysis, which is the most common correlational approach and will be explored further in this tutorial

Representational similarity analysis (RSA)

RSA is a type of multi-voxel pattern analysis (MVPA) for fMRI data. This method is based on the idea that populations of neurons within a brain region jointly represent information about a stimulus in a specific population code, or pattern of activity (Diedrichsen & Kriegeskorte, 2017; Kriegeskorte & Kievit, 2013). Specifically, representations of content are understood as points in a high-dimensional space, and when the brain perceives this content, this leads to a specific pattern of neuronal firing that represents this point in representational space (Davis & Poldrack, 2013). Furthermore, we can characterize the information these populations of neurons are representing using representational geometry, that is, how far apart the points in a high-dimensional pattern space are for different represented stimuli. An important point in this theory is that downstream neurons can read out the information being encoded by the neurons representing it, and thus it is possible to decode represented information based on this population code (Diedrichsen & Kriegeskorte, 2017; Kriegeskorte & Kievit, 2013).

A note for the philosophically inclined: "representation" is always a controversial term. Researchers who use RSA operationally define the term as a spatial pattern. Note that the brain may "represent" things in other ways (e.g., at a resolution finer than we can scan or through connections this approach cannot measure). Be sure you don't take the term too literally.

Implementation and calculation of approaches

For RSA in fMRI, spatial activity patterns are computed for each ROI (or, in searchlight analyses, for each point in the brain). These activity patterns will often be beta weights for each voxel for each condition (but could alternatively be % signal change values, t values, etc.).

Comparisons of spatial patterns across conditions can be represented either in terms of the similarity of the voxel patterns or the dissimilarity. The first RSA studies used a simple Pearson correlation, r, to evaluate similarity. With this metric, dissimilarity is simple to compute: 1 - r.
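
To make this concrete, here is a minimal Python sketch of how a similarity (and dissimilarity) RSM could be computed from a conditions-by-voxels matrix of beta weights. The array names, shapes, and random values are illustrative assumptions, not the actual tutorial data.

```python
import numpy as np

rng = np.random.default_rng(0)
betas = rng.normal(size=(6, 100))        # 6 conditions x 100 voxels of made-up "beta weights"

# Pearson correlation between every pair of condition patterns -> similarity RSM
rsm_similarity = np.corrcoef(betas)      # (6 x 6); the diagonal is forced to 1 for unsplit data
rsm_dissimilarity = 1 - rsm_similarity   # 1 - r; the diagonal is forced to 0
```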

Geek note: other metrics of distance have been proposed -- Euclidean distance, Mahalanobis distance, and cross-validated Mahalanobis or "crossnobis" distance. Crossnobis distances are preferred but are more computationally intensive to calculate.

To keep things simple, because r is a relatively intuitive statistic, we will use r values for similarity and 1-r values for dissimilarity.

Computation of Representational Similarity Matrices

Unsplit Data

Open the Excel file: RSA-Example_sub-01_MainExp_Right-FFA_NoSplit.xlsx

This file shows how a Representational Similarity Matrix is computed when the data are combined, i.e., NOT split. This data is from one individual participant (sub-01).

Examine the first tab: Raw Betas.

Question 1: What do columns reflect? How would you use a GLM to generate the values in this tab? What do those values measure?

Question 2: Would you spatially smooth the data? Why or why not?

Step through each of the four tabs (Steps 1-4). Examine the computations in the cells and try to understand the series of steps performed.

Geek note: During Step 2, we normalized each voxel to its own mean by subtracting the voxel's overall mean activation from each condition's activation, so that the overall mean is zero. This is often done to deal with the 'common activation pattern' that arises because some voxels are generally more active or less active than others; this component is shared across some or all conditions and will impact the correlation across conditions (Diedrichsen & Kriegeskorte, 2017). There is some debate about whether this is a valid way to correct for the problem (see Garrido et al., 2013), but a discussion of this is beyond this tutorial.
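
As a rough illustration, Step 2 could be performed on a conditions-by-voxels array like the `betas` matrix from the earlier sketch (an assumed placeholder, not the spreadsheet data):

```python
import numpy as np

# Voxel-wise mean-centering: subtract each voxel's mean across conditions,
# so that every voxel (column) ends up with a mean of zero.
voxel_means = betas.mean(axis=0, keepdims=True)   # (1 x voxels)
betas_centered = betas - voxel_means              # same shape as betas
```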

Question 3: Using one sentence per tab, explain each of the four steps in words.

Question 4: Why is it important to perform RSA on each individual participant rather than on averaged data across participants?

Understanding RSMs

Figure 9-1. The representational similarity matrix for the left fusiform face area using unsplit data.

The figure above depicts an example of a non-split representational similarity matrix (RSM) displaying the correlations between the patterns of activity representing each stimulus condition in the left FFA. Sometimes we will refer to such matrices derived from the data as RSMdata.

Note: Warmer colours indicate higher correlations. Note: This lookup table for r values is different from the color coding in the Excel spreadsheet.

Question 5: How would you interpret this data?

Figure 9-2. The representational similarity matrix for the left motor cortex (hand knob) using unsplit data.

Let’s look at the RSMdata for the hand region of primary motor cortex (M1).

Question 6: How do you interpret this RSM and in what ways does it differ compared to the FFA RSM?

The RSMs above were based on unsplit data. You may wish to refer back to the Excel file, RSA-Example_sub-01_MainExp_Right-FFA_NoSplit.xlsx

Question 7: Why do the diagonals have a value of similarity = 1, dissimilarity = 0 in unsplit data?

Computation of Representational Similarity Matrices

Split Data

Open the Excel file: RSA-Example_sub-01_MainExp_Right-FFA_OddEvenSplit.xlsx

Now the data are split into even and odd runs. For a single split, this is the simplest and best approach. For real data, a better approach is to do multiple splits and average the results. For example, with eight runs, we could do eight splits (comparing the correlations between each of the eight runs and the remaining seven runs) and average the correlation matrices.
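
Below is a minimal sketch of how a split-data RSM might be computed. The `betas_odd` and `betas_even` arrays are made-up placeholders standing in for conditions-by-voxels beta estimates from the odd and even runs.

```python
import numpy as np

rng = np.random.default_rng(1)
betas_odd = rng.normal(size=(6, 100))    # placeholder: (conditions x voxels) betas from odd runs
betas_even = rng.normal(size=(6, 100))   # placeholder: (conditions x voxels) betas from even runs

def split_rsm(betas_a, betas_b):
    """Pearson correlation between each condition pattern in split A
    and each condition pattern in split B."""
    a = betas_a - betas_a.mean(axis=1, keepdims=True)
    b = betas_b - betas_b.mean(axis=1, keepdims=True)
    a /= np.linalg.norm(a, axis=1, keepdims=True)
    b /= np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T    # (conditions x conditions); the diagonal is NOT forced to 1

rsm_split = split_rsm(betas_odd, betas_even)
# With more runs, compute one such matrix per split and average the matrices.
```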

For simplicity, we will just examine the even-odd split.

Question 8: Examining Step 1 of the spreadsheet, what is the key difference between unsplit and split data?

Question 9: How has the correlation matrix changed with split data? Consider (a) the diagonal; (b) the diagonal symmetry of the matrix; and (c) how noisy the correlation data will be when data are split vs. unsplit.

Figure 9-3. The representational similarity matrix for the left fusiform face area using split data.

RSM displaying the correlations between the patterns of activity representing each stimulus condition split into odd and even runs in the left FFA. Note: Warmer colours indicate higher correlations.

Question 10: What is the added value of seeing actual (dis)similarity metrics along the diagonal?

Multidimensional Scaling applied to RSA Data

We can visually represent the similarities between conditions using multidimensional scaling (MDS). MDS provides a spatial visualization of the relationships between conditions by projecting the high-dimensional pattern (dis)similarities into a lower-dimensional space while preserving, as far as possible, the relative distances between conditions. For example, the MDS plot for the left FFA is shown below.
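
As a rough illustration, a 2-D MDS solution could be computed from a 1 - r dissimilarity matrix using scikit-learn. The `rsm_dissimilarity` matrix and the condition labels below are assumptions carried over from the earlier sketches, not the actual course data.

```python
import matplotlib.pyplot as plt
from sklearn.manifold import MDS

# "precomputed" tells MDS we are supplying a dissimilarity matrix directly
mds = MDS(n_components=2, dissimilarity="precomputed", random_state=0)
coords = mds.fit_transform(rsm_dissimilarity)   # (conditions x 2)

labels = ["Face-L", "Face-C", "Face-R", "Hand-L", "Hand-C", "Hand-R"]  # assumed condition order
plt.scatter(coords[:, 0], coords[:, 1])
for i, label in enumerate(labels):
    plt.annotate(label, coords[i])
plt.show()
```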

Question 11:

a) How many dimensions were in the original data? How many dimensions are represented in the MDS plots?

b) What are the advantages and disadvantages of reducing dimensionality?

Question 12: How would you explain the MDS plots for Left FFA and Left M1 Hand? Can you tell from MDS plots whether conditions are quantitatively (i.e., statistically) different? Why or why not?

Figure 9-4. Multidimensional scaling output for Left FFA and Left M1 Hand Area.

Statistical inferences in RSA

RSM Model Matrices (RSMmodel)

Prior to computing any of the processing steps described above, we want to start by defining some theoretical models that we think could account for the data. Keep in mind, these models are not data-driven but hypothesis-driven, based on our expectations about the relationships between conditions. For the rest of this tutorial, green will indicate higher correlations and red will indicate lower correlations.

As a “sanity check”, one theoretical model can test whether each condition is more similar to itself than to any other condition.

Based on this class project, we would also hypothesize that faces would be more similar to faces and hands more similar to hands, whereas faces would be relatively dissimilar to hands.

Figure 9-5. This RSMmodel tests whether similarities are higher for the exact same condition than for all other pairs of conditions.

Figure 9-6. This RSMmodel tests whether similarities are higher within categories than between categories.
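
A minimal sketch of how the two model RSMs in Figures 9-5 and 9-6 could be constructed in code, assuming the six conditions are ordered Face-Left, Face-Centre, Face-Right, Hand-Left, Hand-Centre, Hand-Right (this ordering is an illustrative assumption):

```python
import numpy as np

n_conditions = 6
categories = np.array([0, 0, 0, 1, 1, 1])   # 0 = face, 1 = hand

# Diagonal ("sanity check") model: same condition = +1, every other pair = -1
model_diagonal = np.where(np.eye(n_conditions, dtype=bool), 1, -1)

# Category (Faces vs. Hands) model: same category = +1, different category = -1
model_category = np.where(categories[:, None] == categories[None, :], 1, -1)
```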

Question 13: What other hypothesis-driven model could we potentially investigate using RSA in our data set? What would this model look like as an RSM?

Now that we have settled on some theoretical models and computed data RSMs and MDS plots for our ROIs, we can move on to the final step of RSA: statistical inference.

Comparing RSMdata to RSMmodel

Compare the data RSM of each ROI to each theoretical model. The first step is to compute the correlation between an RSMdata from a particular brain region and an RSMmodel. Because this is a correlation between actual and hypothetical correlations, Jody called this a metacorrelation. Usually, and in the case of the course data, a rank-order (Spearman) correlation is used because we do not want to assume a linear relationship between the data RSM and model RSM values. Indeed, because the model RSM has ordinal values (e.g., +1 vs. -1) rather than continuous values, a Spearman correlation is more appropriate.

For this step, the correlations will be equivalent if we use similarity metrics (r) or dissimilarity metrics (1-r). Either way, a higher correlation indicates a better fit between the RSMdata and the RSMmodel.

We would compute the metacorrelations for each region and each model in each participant.
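
A minimal sketch of such a metacorrelation, reusing `rsm_similarity` and `model_category` from the earlier sketches; restricting the correlation to the lower triangle (excluding the diagonal) is one common choice, discussed further in the Geek Notes at the end of this tutorial.

```python
import numpy as np
from scipy.stats import spearmanr

def metacorrelation(rsm_data, rsm_model):
    """Spearman correlation between the off-diagonal cells of a data RSM and a model RSM."""
    idx = np.tril_indices_from(rsm_data, k=-1)    # lower triangle, diagonal excluded
    rho, _ = spearmanr(rsm_data[idx], rsm_model[idx])
    return rho

rho_category = metacorrelation(rsm_similarity, model_category)
```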

Figure 9-7. Metacorrelations between the data RSMs and the model RSM for the diagonal model across multiple brain regions (FFA = Fusiform Face Area; OFA = Occipital Face Area; STS = face-selective Superior Temporal Sulcus; aIPS = Anterior Intraparietal Sulcus; FEF = Frontal Eye Fields).

Since we have computed similarity, for the first model (Diagonal) we can see that most of the data RSMs have positive mean metacorrelations with the model RSM, confirming our sanity check that, in the majority of the ROIs, each individual condition is more similar to itself than to any other condition. In all but three regions this mean correlation is above zero. However, note that the correlations within a stimulus category do not reach a value of 1.0.

Question 14: What could be reasons for the correlation not reaching 1?

Question 15: What do you conclude when inspecting the Hands vs. Face results?

RSA allows us to develop hypothesis-driven theoretical models that are designed to more accurately and plausibly capture the similarity of responses to the same (or different) stimulus categories within each region. Overall, the majority of the mean correlations between the data RSMs and the model RSM in our ROIs were substantially higher than those observed using the diagonal model alone, particularly in the left and right FFA, OFA, and LOTC Hand regions.

Question 16: What does the fact that Hand vs. Face is a better model than the diagonal model tell you about how these areas represent these stimuli?

Question 17:

a) How could adding error bars allow you to determine whether the models account for significant variance in the patterns? What should those error bars represent?

b) Statistically, how could you determine whether one model performs significantly better than another?

Figure 9-8. Metacorrelations between the data RSMs and the model RSM for the faces vs. hands model across multiple brain regions (FFA = Fusiform Face Area; OFA = Occipital Face Area; STS = face-selective Superior Temporal Sulcus; aIPS = Anterior Intraparietal Sulcus; FEF = Frontal Eye Fields).



If your brain is fried, you can stop here, but if you are going to use RSA in your own work, work through the next sections before you start analyzing data.


Geek Notes (Optional): Noise Ceiling

If participants' Data RSMs are highly consistent, then a Model RSM can explain the patterns well. However, if participants' Data RSMs are all very different from one another, then no model will perform well.

Brain regions differ in how consistent participants' data can be. Generally, "primary" brain areas are more consistent than "association" areas. For example, if we are examining primary visual cortex (V1) in a visual task, participants' Data RSMs will be very consistent, whereas if we are examining parietal cortex in a sensorimotor task, Data RSMs may be quite idiosyncratic.

The noise ceiling provides an estimate of how well the best possible model could perform. The noise ceiling is actually a range with a lower bound and an upper bound. These bounds are computed by correlating the Data RSM for each participant with a group-averaged Data RSM and averaging these values across participants. For the lower bound, the group-average Data RSM excludes the participant in question; for the upper bound, it includes the participant in question.
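
A minimal sketch of these bounds, assuming `rsms` is a participants x conditions x conditions stack of Data RSMs (an illustrative assumption) and reusing the `metacorrelation` helper from the earlier sketch:

```python
import numpy as np

def noise_ceiling(rsms):
    """Lower and upper bounds of the noise ceiling for a stack of Data RSMs."""
    n_subjects = rsms.shape[0]
    lower, upper = [], []
    for s in range(n_subjects):
        leave_one_out_mean = np.delete(rsms, s, axis=0).mean(axis=0)  # group mean excluding subject s
        full_mean = rsms.mean(axis=0)                                 # group mean including subject s
        lower.append(metacorrelation(rsms[s], leave_one_out_mean))
        upper.append(metacorrelation(rsms[s], full_mean))
    return np.mean(lower), np.mean(upper)

# Usage (assuming `rsms` holds one Data RSM per participant):
# lower_bound, upper_bound = noise_ceiling(rsms)
```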

If a Data-Model metacorrelation falls within the range of the noise ceiling, we can conclude that the model does as well as any model could be expected to perform.

Examine the differences in the noise ceiling for the two brain regions shown below.

Figure 9-9. Metacorrelations between the Data RSM for Left FFA and various model RSMs, including the Diagonal model (leftmost) and Hand vs. Face model (second from left). Note that the noise ceiling in FFA is relatively high (~.8) and the Hand vs. Face model falls within the range of the noise ceiling. This indicates that this model does as well as any model could be expected to do. Error bars represent 95% confidence limits. Many models, including models based on gaze/hand orientation, do no better than zero (chance).

Figure 9-9. Metacorrelations between the Data RSM for Left M1 hand and various model RSMs. No models do much better than chance (especially considering the number of comparisons), but that is unsurprising considering that the noise ceiling is very low.


Geek Notes (Optional): Fisher Transforming r Values

Correlation values (Pearson r, Spearman rho) are not normally distributed, especially for absolute values greater than .5. This violates the assumptions of many statistical tests (like a one-sample t-test to determine whether Data-Model metacorrelations are significantly greater than zero). One way to normalize correlation coefficients is to apply a Fisher (z) transformation. You should do this before plotting r values and doing stats tests, especially if r > .5.
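
A minimal sketch, using made-up per-participant metacorrelations purely for illustration:

```python
import numpy as np
from scipy.stats import ttest_1samp

r_values = np.array([0.62, 0.45, 0.71, 0.30, 0.55])   # hypothetical per-participant metacorrelations
z_values = np.arctanh(r_values)                        # Fisher z-transform: z = arctanh(r)
t, p = ttest_1samp(z_values, popmean=0)                # one-sample t-test against zero
```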


Geek Notes (Optional): Dealing with the Diagonal

Because unsplit data forces the Data RSM diagonal to have similarity = 1 (dissimilarity = 0), this can create problems for certain types of statistical testing of the metacorrelations between Data RSMs and Model RSMs. For a more detailed discussion, see Ritchie & Op de Beeck (2017).

One problem is that if the diagonal isn't excluded, Data-Model metacorrelations become spuriously high.

Figure 9-10. In theory, a data RSM with null effects should have a correlation of zero with any model. However, because unsplit data has r = 1 along the diagonal, this leads to erroneously high correlations.

One easy solution to this problem is to exclude the diagonal.

Figure 9-11. Excluding the diagonal in the model corrects for the problem shown in Figure 9-10.

The approach of excluding the diagonal works well for non-factorial designs. For example, Kriegeskorte's classic paper had 92 stimuli (human and animal faces and bodies, natural and manmade objects). This was NOT a factorial design.

Our course experiment, however, is a 2 Category (Faces/Hands) x 3 Orientation (Left/Centre/Right) factorial design. If we only discard the diagonal, the factorial contrast becomes unbalanced. For example, in a contrast of Same Category vs. Different Category with only the diagonal removed, the Same Category correlation values always come from different orientations, whereas the Different Category correlation values can come from the same or different orientations. Thus, if there is a significant effect of Orientation regardless of category, this could erroneously lead to a negative correlation between the data RSM and the category model RSM.

Figure 9-12. Excluding the diagonal in the model is NOT a good choice for unsplit data in factorial designs like ours. Here, the data RSM shows no main effect of category (faces vs. hands) but does show an effect of orientation (left/centre/right). Excluding the main diagonal in the model also excludes all the green cells with the same orientation from the model. Because the conditions with the same orientation were not excluded from the red cells in the model, the contrast is unbalanced and the correlation becomes negative.

One solution to the problem with factorial designs is to also exclude off-diagonals to ensure that contrasts for one factor remain properly balanced over the other factor.
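
A minimal sketch of this masking for our 2 x 3 design, reusing `rsm_similarity` and `model_category` from the earlier sketches and assuming the same Face-L, Face-C, Face-R, Hand-L, Hand-C, Hand-R condition order: dropping every cell where the orientation matches removes the main diagonal and the problematic same-orientation off-diagonal cells in one step.

```python
import numpy as np
from scipy.stats import spearmanr

orientations = np.array([0, 1, 2, 0, 1, 2])             # L, C, R for faces, then for hands
same_orientation = orientations[:, None] == orientations[None, :]
keep = np.tril(~same_orientation, k=-1)                  # lower triangle, same-orientation cells excluded

rho, _ = spearmanr(rsm_similarity[keep], model_category[keep])
```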

Figure 9-13. Excluding the main diagonal and off-diagonals provides one solution for analyzing factorial designs.

This all becomes rather complicated, so a cleaner solution is to analyze split data (ideally split across multiple permutations rather than just an even-odd split).

Figure 9-14. Split data RSMs do not force the diagonal to have r = 1. This enables meaningful metacorrelations with the Diagonal Model and allows contrasts in factorial designs to remain balanced.