Paper 4

Title

An Introduction to MDS


Authors

FLORIAN WICKELMAIER

Summary

Introduction

Multidimensional Scaling (MDS) has become increasingly popular as a technique for both multivariate and exploratory data analysis.

MDS is a set of data analysis methods that allow one to infer the dimensions of the perceptual space of subjects.

Input: a measure of global similarity or dissimilarity of the stimuli or objects under investigation.

Outcome: a spatial configuration, in which the objects are represented as points, arranged in a way that their distances correspond to the similarities of the objects.

Deriving proximities

The data for MDS analyses are called proximities, indicating the overall similarity or dissimilarity of the objects. There are two major groups of methods for deriving proximities:

  • Direct methods
  • Indirect methods.

The proximity matrix serves as the input for MDS programs. In many practical applications it is straightforward to ask the participants directly for their judgments of the similarity of objects. Indirect methods might be better suited when the investigation concerns basic perceptual dimensions, or when additional measures of the objects already exist.

Direct methods

Subjects might either assign a numerical value to each pair, or provide a ranking of the pairs.

(Dis)similarity ratings

In the case of a dissimilarity scale, a lower number indicates a stronger similarity between the objects of a pair.

  • Symmetric relation: the order within each pair is of no relevance; $n(n-1)/2$ pairs are needed.
  • Asymmetric proximity: e.g., object a may be more similar to object b than b is to a; $n(n-1)$ pairs are needed.
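These pair counts are easy to verify; a minimal sketch:

```python
def n_pairs(n, symmetric=True):
    """Number of stimulus pairs to be judged for n objects."""
    return n * (n - 1) // 2 if symmetric else n * (n - 1)

print(n_pairs(12))                  # 66 unordered pairs
print(n_pairs(12, symmetric=False)) # 132 ordered pairs
```

For 12 objects, a symmetric design already requires 66 judgments, and an asymmetric one twice as many.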

  • Advantages of direct ratings: the data are immediately ready for analysis; both individual investigation of each participant and aggregate analysis based on averages across the proximity matrices are possible.
  • Disadvantages of direct ratings: the number of pairs grows rapidly as the number of objects increases.

Sorting tasks

  1. Give each pair a score.
  2. Put pairs with low similarity into one pile, pairs with higher similarity into the next pile, and so on; give each pile a score.
  3. For each object, sort the remaining objects into two piles: a low-similarity pile and a high-similarity pile. Then count how many times two objects appear together in a pile.

Some of these tasks do not allow for an individual analysis of the data.
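The third sorting variant can be turned into a proximity matrix by counting co-occurrences. A minimal sketch with hypothetical data (the object names and pile contents are invented for illustration):

```python
import numpy as np

# Hypothetical sorting data: for each anchor object, the set of objects
# placed into its high-similarity pile (variant 3 above).
piles = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
objects = sorted(piles)
n = len(objects)
idx = {o: i for i, o in enumerate(objects)}

# Count how often two objects end up in the same high-similarity pile.
co = np.zeros((n, n), dtype=int)
for members in piles.values():
    for x in members:
        for y in members:
            if x != y:
                co[idx[x], idx[y]] += 1

# Convert counts to dissimilarities: frequent pairs get low values.
dissim = co.max() - co
np.fill_diagonal(dissim, 0)
```

The resulting `dissim` matrix can serve directly as input to an MDS program.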

Indirect methods

With indirect methods, subjects do not need to assign a numerical value to each pair. The proximity matrix is derived from other measures, e.g., from confusion data or from correlation matrices.

Confusion data

Confusion data record how often subjects mistake one stimulus for another. Rarely confused pairs get a high dissimilarity value; often confused pairs get a lower one.

  • Advantages: the similarity is judged on a perceptual level without involving much cognitive processing.
  • Disadvantages: confusion data are asymmetric, do not allow for an individual analysis, and require a reasonable chance of confusing one object with another, which excludes perfectly discriminable stimuli from being investigated with this method.
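One common way to turn confusion data into proximities is to symmetrize the confusion matrix and invert it; a sketch with a hypothetical confusion matrix (the values are invented for illustration):

```python
import numpy as np

# Hypothetical confusion matrix: C[i, j] = proportion of trials on which
# stimulus i was (mis)identified as stimulus j.
C = np.array([
    [0.90, 0.08, 0.02],
    [0.10, 0.85, 0.05],
    [0.01, 0.04, 0.95],
])

# Confusion data are typically asymmetric; averaging C with its transpose
# yields a symmetric similarity measure.
S = (C + C.T) / 2

# Frequently confused pairs get low dissimilarity, rarely confused pairs high.
D = 1 - S
np.fill_diagonal(D, 0)
```

The symmetrized matrix `D` can then be analyzed like any other proximity matrix.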

Correlation matrices

One application of MDS is to visualize correlational data: it is hard to detect patterns in a correlation matrix even with just a few objects.

MDS provides an explanation of the correlations by interpreting the axes of the MDS space, and might reveal the relations between the objects more vividly than a table of correlation coefficients.
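One way to feed correlations into MDS is to convert them into dissimilarities first; a sketch assuming the common $\sqrt{1 - r}$ conversion (other choices, such as $1 - r$ or $1 - r^2$, are also used; the correlation values are invented for illustration):

```python
import numpy as np

# Hypothetical correlation matrix for four variables.
R = np.array([
    [1.0, 0.8, 0.1, 0.0],
    [0.8, 1.0, 0.2, 0.1],
    [0.1, 0.2, 1.0, 0.7],
    [0.0, 0.1, 0.7, 1.0],
])

# Highly correlated variables become nearby points in the MDS space.
D = np.sqrt(1 - R)
```

The resulting `D` behaves like a measured distance matrix, which is one reason correlational data often justify classical MDS.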

Basic mechanism

Goal: find a spatial configuration of objects when only some measure of their general similarity is known.

The spatial configuration should provide some insight into how the subjects evaluate the stimuli in terms of a (small) number of potentially unknown dimensions.

There is a distinction between classical MDS and nonmetric MDS.

  1. Classical MDS

Assumption: the proximity matrix displays metric properties, like distances measured from a map.

Thus, the distances in a classical MDS space preserve the intervals and ratios between the proximities as well as possible.

For human dissimilarity ratings, such an assumption will often be too strong.

  2. Nonmetric MDS

Assumption: only the order of the proximities is meaningful.

So, the order of the distances in a nonmetric MDS configuration reflects the order of the proximities as well as possible.

Classical MDS

Consider the Inverse problem of calculating the Euclidean distance between points in a map.

**Having only the distances, is it possible to obtain the map?**

When applying classical MDS to proximities, it is assumed that the proximities behave like real measured distances. This might hold, e.g., for data derived from correlation matrices, but rarely for direct similarity ratings.

Classical MDS provides an analytical solution, requiring no iterative procedures.

Algorithm

Find a coordinate matrix $X$ such that $B = XX'$, where $B$ is the matrix of double-centered squared proximities.

  1. Square the proximities: $P^{(2)} = [p_{ij}^2]$.
  2. Apply double centering: $B = -\frac{1}{2}JP^{(2)}J$, where $J = I - n^{-1}\mathbf{1}\mathbf{1}'$.
  3. Find the $m$ largest positive eigenvalues $\lambda_1, \dots, \lambda_m$ of $B$ and the corresponding $m$ eigenvectors $e_1, \dots, e_m$.
  4. The $m$-dimensional spatial configuration is derived from the coordinate matrix $X = E_m \Lambda_m^{1/2}$, where $E_m$ is the matrix of the $m$ eigenvectors, $\Lambda_m$ the diagonal matrix of the $m$ eigenvalues, and $X$ has dimension $n \times m$.
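The four steps above can be sketched in a few lines of NumPy (a minimal illustration, not the full implementation of any particular MDS program):

```python
import numpy as np

def classical_mds(P, m=2):
    """Classical MDS of a proximity matrix P, following steps 1-4 above."""
    n = P.shape[0]
    P2 = P ** 2                                  # 1. squared proximities
    J = np.eye(n) - np.ones((n, n)) / n          # centering matrix J
    B = -0.5 * J @ P2 @ J                        # 2. double centering
    eigvals, eigvecs = np.linalg.eigh(B)         # eigh: B is symmetric
    order = np.argsort(eigvals)[::-1]            # 3. largest eigenvalues first
    lam = eigvals[order[:m]]
    E = eigvecs[:, order[:m]]
    return E * np.sqrt(np.maximum(lam, 0))       # 4. X = E_m Lambda_m^{1/2}

# Distances among three collinear points at positions 0, 3, 7:
P = np.array([[0., 3., 7.],
              [3., 0., 4.],
              [7., 4., 0.]])
X = classical_mds(P, m=1)
# The pairwise distances of the recovered 1-D configuration match P
# (up to reflection and translation).
```

Because the example distances really are Euclidean, one dimension suffices and the map is recovered exactly, which illustrates the "inverse problem" framing above.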

Question:

How to transfer objects into points in MDS space?

Nonmetric MDS

Only the ordinal information in the proximities is used.

A monotonic transformation of the proximities is calculated, which yields scaled proximities.

Optimally scaled proximities are sometimes referred to as disparities $\hat{d} = f(p)$.

The problem of nonmetric MDS is to find a configuration of points that minimizes the squared differences between the optimally scaled proximities and the distances between the points, that is, to minimize the so-called stress:

$STRESS = \sqrt{\frac{\sum (f(p) - d)^2}{\sum d^2}}$. Several other versions of stress also exist.
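Computed directly from the formula above, assuming the disparities and distances are given as flat vectors over all pairs:

```python
import numpy as np

def stress(disparities, distances):
    """Stress-1: sqrt( sum (f(p) - d)^2 / sum d^2 )."""
    disparities = np.asarray(disparities, dtype=float)
    distances = np.asarray(distances, dtype=float)
    return np.sqrt(np.sum((disparities - distances) ** 2)
                   / np.sum(distances ** 2))

# A perfect fit has zero stress:
print(stress([1, 2, 3], [1, 2, 3]))  # 0.0
```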

Goodness of fit

Stress can be used for judging the goodness of fit of an MDS solution:

| stress | goodness of fit |
|--------|-----------------|
| > .20  | poor            |
| .10    | fair            |
| .05    | good            |
| .025   | excellent       |
| .00    | perfect         |

This table provides only rough guidelines, and applies only to the stress formula given above.

Stress decreases as the number of dimensions increases; thus, a 2-dimensional solution always has larger stress than a 3-dimensional solution.

Additional techniques commonly used for judging the adequacy of an MDS solution:

  • Scree plot: number of dimensions vs. stress.

Look for the lowest number of dimensions with acceptable stress: the “elbow” (similar to the elbow method for choosing the number of clusters, which plots WSS against the number of clusters).

  • Shepard diagram: the proximities plotted against the distances of the point configuration.

Less spread in this diagram implies a better fit. In nonmetric MDS, the ideal locations for the points lie on a monotonically increasing line describing the so-called disparities; points close to this monotonically increasing line indicate a good fit.

Algorithm

Use a twofold optimization process (like coordinate descent) to iteratively find the optimal monotonic transformation of the proximities and the optimal arrangement of the points of a configuration.

  1. Start with a random configuration of points, e.g., by sampling from a normal distribution.
  2. Calculate the distances $d$ between the points.
  3. Find the optimal monotonic transformation $f$ to obtain the optimally scaled proximities $f(p)$, i.e., $f = \arg\min_f \mathrm{stress}$.
  4. Find the optimal configuration of points, i.e., $d = \arg\min_d \mathrm{stress}$.
  5. Repeat steps 3 and 4 until the stress is small enough according to some criterion.
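The five steps above can be sketched as follows. This is a minimal illustration: the monotone transformation in step 3 is computed by isotonic regression (pool-adjacent-violators), and step 4 is a plain gradient step on the stress numerator; production implementations typically use SMACOF majorization instead.

```python
import numpy as np

def pava(y):
    """Pool-adjacent-violators: least-squares increasing fit to y."""
    blocks = []  # [mean, size] of merged blocks
    for v in y:
        blocks.append([float(v), 1])
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, s2 = blocks.pop()
            m1, s1 = blocks.pop()
            blocks.append([(m1 * s1 + m2 * s2) / (s1 + s2), s1 + s2])
    return np.concatenate([[m] * s for m, s in blocks])

def nonmetric_mds(P, m=2, n_iter=200, lr=0.02, seed=0):
    """Iterative nonmetric MDS, following steps 1-5 above."""
    rng = np.random.default_rng(seed)
    n = P.shape[0]
    iu = np.triu_indices(n, 1)
    order = np.argsort(P[iu])                  # rank order of the proximities
    X = rng.normal(size=(n, m))                # 1. random configuration
    for _ in range(n_iter):
        diff = X[:, None, :] - X[None, :, :]
        D = np.sqrt((diff ** 2).sum(-1))       # 2. current distances
        dv = D[iu]
        dhat = np.empty_like(dv)
        dhat[order] = pava(dv[order])          # 3. optimal monotone transform
        Dhat = np.zeros((n, n))
        Dhat[iu] = dhat
        Dhat += Dhat.T
        # 4. gradient step on the raw stress numerator sum (d - dhat)^2
        ratio = np.where(D > 0, (D - Dhat) / np.where(D > 0, D, 1.0), 0.0)
        grad = 2 * (ratio.sum(1)[:, None] * X - ratio @ X)
        X -= lr * grad                         # 5. repeat until converged
    return X

# Hypothetical dissimilarities among four objects:
P = np.array([[0., 5., 3., 4.],
              [5., 0., 4., 3.],
              [3., 4., 0., 5.],
              [4., 3., 5., 0.]])
X = nonmetric_mds(P)
```

The alternation between fitting $f$ (with the configuration fixed) and moving the points (with $f$ fixed) is what makes this a twofold, coordinate-descent-like optimization.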

Case Study using MDS

The goal of the study was to reveal and identify the dimensions that subjects use in evaluating environmental sounds.

77 subjects participated in the experiment. They were presented with $12 \times 11 / 2 = 66$ pairs of sounds and rated the dissimilarity of each pair from 1 to 9 (a direct method of deriving proximities).

Individual analysis

Consider only one subject as an example.

Nonmetric MDS was chosen, since it is doubtful whether there is more than ordinal information in the data.

Aggregate analysis

The 77 single proximity matrices were combined by computing the average value for each pair.

Again, a nonmetric MDS model with Euclidean distances was chosen to represent the data.

If additional measurements of the stimuli are available, it is possible to search for an empirical interpretation of the dimensions by correlating them with the external measure, e.g., correlating the values on the first dimension with xxx using Spearman's rank correlation yields a statistically significant correlation $\rho = 0.69$. Roughly speaking, half of the variance along the x-axis can be explained by xxx.
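Such a correlation can be computed as Spearman's $\rho$, i.e., the Pearson correlation of the ranks. The coordinate and measurement values below are invented for illustration (the study's actual external measure is not reproduced here), and the simple rank computation assumes no ties:

```python
import numpy as np

def spearman(x, y):
    """Spearman's rank correlation (ties not handled, for brevity)."""
    rx = np.argsort(np.argsort(x))  # ranks of x
    ry = np.argsort(np.argsort(y))  # ranks of y
    return np.corrcoef(rx, ry)[0, 1]

# Hypothetical data: coordinates of six stimuli on MDS dimension 1 and an
# external acoustic measure of the same stimuli.
dim1 = np.array([-1.2, -0.8, -0.1, 0.3, 0.9, 1.4])
external = np.array([12.0, 15.0, 14.0, 20.0, 25.0, 23.0])
rho = spearman(dim1, external)
```

Squaring $\rho$ gives the rough "proportion of variance explained" reading used in the study ($0.69^2 \approx 0.48$, i.e., about half).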

Individual difference scaling (INDSCAL)

Using individual difference scaling, or weighted MDS, it is possible to represent both the stimuli in a common MDS space and the individual differences between subjects.

Input: individual proximity matrices of all subjects.

Assumption: all subjects use the same dimensions when evaluating the objects, but they might apply individual weights to these dimensions. By estimating the individual weights and plotting them, different groups of subjects can be detected.

(Figure 7 of the paper.)

Discussion: Decisions to take before you start

The outcome of MDS depends on the decisions that are taken beforehand.

  • Data collection stage:
    • A similarity judgment cannot simply be regarded as the “inverse” of a dissimilarity judgment.
    • Direct or indirect methods.
    • Symmetric or asymmetric proximities.
  • Classical or nonmetric method.
  • Euclidean or non-Euclidean distances: Euclidean distances are recommended whenever the most important goal is to visualize the structure; non-Euclidean distances might be a valuable tool for investigating specific hypotheses about the subject's perceptual space.
  • Type of stress measure.
  • Number of dimensions: Borg & Groenen (1997) suggest that a $k$-dimensional representation requires at least $4k$ objects.
  • Type of MDS analysis: individual / aggregate / INDSCAL (pay attention to the respective assumptions).
  • Software.

Reference

  • Wickelmaier, Florian. (2003). An introduction to MDS.

Thoughts
