WORKING MEMORY CAPACITY
3.1. Capacity measures
Before discussing capacity estimates in more detail, it is important to introduce the different measures that have been used to estimate this capacity. It is not our goal to exhaustively present all the measures of working memory capacity. The goal is to explain those measures that have been used in, and have contributed to an understanding of, the research on the maintenance of integrated features.
One of the most clear-cut and easy-to-interpret measures of working memory capacity is the working memory span (e.g., Daneman & Carpenter, 1980). This measure is an adaptation of the more general memory span. An individual’s memory span corresponds to the maximum number of items he or she can maintain in correct order. Working memory span then corresponds to the memory span assessed in the presence of concurrent processing activities. In practice, small methodological variations are observed in determining the span. For example, one can consider a person’s span to be the highest number of items this person can recall at least once, on half of the trials, or according to yet another predetermined criterion. Despite these small variations in calculation, the span measure allows for a straightforward estimate of working memory capacity.
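The "half of the trials" criterion mentioned above can be sketched in a few lines of code. This is only an illustration of one possible scoring rule; the function name and the data are hypothetical, and actual span tasks apply further procedural details not modeled here.

```python
def span_from_trials(results):
    """Estimate a span as the highest list length recalled correctly on
    at least half of its trials (one common criterion; others exist).

    `results` maps list length -> list of booleans, one per trial,
    where True means the list was recalled in correct order.
    """
    span = 0
    for length in sorted(results):
        trials = results[length]
        if sum(trials) / len(trials) >= 0.5:
            span = length
        else:
            break  # performance fell below criterion; stop here
    return span

# Hypothetical data: criterion met up to length 4, then not at length 5.
data = {3: [True, True], 4: [True, False], 5: [False, False]}
print(span_from_trials(data))  # 4
```

A stricter or more lenient criterion would simply replace the 0.5 threshold, which is exactly the kind of small methodological variation described in the text.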
Another measure that is likewise easy to interpret is the capacity estimate k. This measure was designed to allow the derivation of a capacity estimate from the change detection paradigm. As the change detection paradigm is commonly used in research on integrated features, this capacity estimate k has often been used to estimate the capacity limits for the maintenance of integrated features. Pashler (1988) was the first to introduce this measure in the change detection paradigm with a whole-display test. He reasoned that, in theory, the hit rate (H; the detection of a change when a change actually occurred) should correspond to
H = k/N (1)
with “k” corresponding to the maximum number of items that can be maintained in memory and “N” to the number of items presented in the array. For example, imagine that four items can be maintained in memory while arrays of eight items are presented. There is a probability of 4/8 that the change occurs on an item stored in memory. The hit rate should thus be 50%. In practice, the hit rate would be slightly higher due to guessing “yes”. This guessing “yes” rate corresponds to the false alarm rate (FA; signaling a change when no change occurred) and applies to those items not maintained in memory. The formula for the hit rate hence becomes
H = k/N + FA* (N-k)/N (2)
Inverting the formula allows a direct estimation of the capacity measure k:
k = N * (H - FA) / (1 - FA) (3)
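As a minimal sketch, equation (3) can be computed directly from the observed rates. The function name and guard clause are ours, not Pashler's; the example values reproduce the four-of-eight scenario from the text.

```python
def pashler_k(hits, false_alarms, set_size):
    """Pashler's (1988) capacity estimate for whole-display change
    detection: k = N * (H - FA) / (1 - FA), following equation (3).
    """
    if false_alarms >= 1.0:
        raise ValueError("false alarm rate must be below 1 for the correction")
    return set_size * (hits - false_alarms) / (1 - false_alarms)

# Worked example from the text: 4 of 8 items in memory gives H = .5
# before guessing; with a guessing ("yes") rate of .5, equation (2)
# gives an observed H of .75, and the inverted formula recovers
# k = 8 * (.75 - .5) / (1 - .5) = 4 items.
print(pashler_k(0.75, 0.5, 8))  # 4.0
```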
Based on the number of items presented, the hit rate, and the false alarm rate, one can thus estimate how many items are actually maintained within working memory. Later on, Cowan (2001) adapted this formula to correspond to a variation of the change detection paradigm in which only one item is tested. The test item is then circled or indicated in some other way within the array. The use of such single test probes has implications for informed decisions and the guessing rate. In the whole-display test, participants can correctly detect a change only if the change appears on an item present in memory. They can, however, never correctly reject a no change on the basis of memory alone; some guessing is always implied for those items not present in memory. When using a single test probe, participants can correctly detect a change or correctly reject a no change whenever the probed item is in memory. Cowan hence adapted Pashler’s formula for k to include this difference in informed decision making.
The hit rate was still defined as
H = k/N + FA* (N-k)/N (4)
The rate of correct rejections was defined as
CR = k/N + (1-FA)* (N-k)/N (5)
A combination of these equations leads to the following formula for the capacity estimate k:
k = N*(H-FA) (6)
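The single-probe variant in equation (6) is even simpler to compute. Again the function name and example values are illustrative, chosen to mirror the four-of-eight scenario used earlier.

```python
def cowan_k(hits, false_alarms, set_size):
    """Cowan's (2001) capacity estimate for single-probe change
    detection: k = N * (H - FA), following equation (6).
    """
    return set_size * (hits - false_alarms)

# Example: with 4 of 8 items in memory and a guessing rate of .5,
# equations (4) and (5) give H = .75 and CR = .75 (so FA = .25),
# and the formula recovers k = 8 * (.75 - .25) = 4 items.
print(cowan_k(0.75, 0.25, 8))  # 4.0
```

Note the difference from Pashler's version: no division by (1 - FA), which is why the two estimates diverge when the wrong formula is applied to a given test format.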
Despite the theoretical recommendation to use Pashler’s k in the case of a whole-display test and Cowan’s k in the case of a single test probe, several authors have nevertheless made incorrect use of these capacity measures (e.g., Saults & Cowan, 2007; Treisman & Zhang, 2006) or have reported both versions of k regardless (e.g., Vogel, Woodman, & Luck, 2006). So despite the ease of interpreting this capacity measure, caution is warranted when applying it. Additionally, the estimate k should only be used with set sizes larger than the capacity limit; otherwise, the capacity estimate risks being biased (Rouder, Morey, Morey, & Cowan, 2011).
Circumventing the inconveniences linked to the capacity estimate k, several more basic capacity estimates have been reported, based on the correct detection of a change or no change. The reported values are harder to interpret on their own, as they do not supply absolute values of the number of items in memory. They are nevertheless well suited for making comparisons between different conditions. Among these measures is, for example, the mean proportion correct, expressed as a percentage (e.g., Luck & Vogel, 1997; Morey, 2009). This percentage corresponds to the sum of the percentage of correct detections of a change and the percentage of correct rejections of a no change. From this mean proportion correct, rough capacity estimates can still be derived. Luck and Vogel (1997), for example, based their working memory capacity measure of four items on the observation of an almost perfect correct recognition score for arrays ranging from one to three items, followed by a strong decrease from four items onward. Based on signal detection theory (Tanner & Swets, 1954), some more elaborate measures have also been reported, such as the corrected recognition rate, which corresponds to the hit rate minus the false alarm rate (e.g., Allen et al., 2006). This measure takes the response bias into account. Imagine that a participant has no items in memory but responds “yes, there was a change” on all trials. As normally 50% change and 50% no-change trials are presented, this would result in a mean proportion correct of 50%. However, his corrected recognition rate would be 0%, as the false alarm rate is as high as the hit rate (both 100%). This score of 0% better reflects the actual capacity of this individual. The corrected recognition rate thus takes the response bias into account, while other measures based on signal detection theory supply a change detection measure free from response bias. These are termed measures of sensitivity; the best known are d’ and its most popular non-parametric variant, A’. These measures are likewise calculated from the hit and false alarm rates. The formula for d’ corresponds to
d’ = Z(H) – Z (FA) (7)
Larger scores indicate a higher ability to detect a change, and perfect sensitivity would correspond to a d’ score of +∞. To justify the use of d’, two assumptions about the underlying distributions must, however, be fulfilled, i.e., normality of the signal and noise distributions and equal standard deviations (see Stanislaw & Todorov, 1999, for details). If these assumptions are violated, it is still possible to calculate the non-parametric sensitivity measure A’4. This measure corresponds to
A’ = .5 + (H - FA) * (1 + H - FA) / (4 * H * (1 - FA)) when H ≥ FA (8)
A’ = .5 + (FA - H) * (1 + FA - H) / (4 * FA * (1 - H)) when H < FA (9)
An A’ score of .5 means that no ability to detect a change is present, and 1 corresponds to perfect sensitivity to detect changes. These A’ and d’ measures are hence more difficult to interpret in terms of absolute capacity. Nevertheless, they still allow comparisons to be made, for example between the maintenance of single features and integrated features, with a higher d’ or A’ corresponding to a higher capacity estimate (e.g., Allen et al., 2012; Brown & Brockmole, 2010).
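Both sensitivity measures can be sketched as follows. The function names are ours; Z is implemented here as the inverse of the standard normal cumulative distribution from Python's standard library, and the example values are hypothetical rather than taken from the cited studies.

```python
from statistics import NormalDist


def d_prime(hits, false_alarms):
    """d' = Z(H) - Z(FA), following equation (7).

    Rates of exactly 0 or 1 make Z undefined (infinite), so in
    practice they are commonly nudged inward before this is applied.
    """
    z = NormalDist().inv_cdf  # inverse standard normal CDF
    return z(hits) - z(false_alarms)


def a_prime(hits, false_alarms):
    """Non-parametric sensitivity A', following equations (8) and (9)."""
    h, fa = hits, false_alarms
    if h >= fa:
        return 0.5 + (h - fa) * (1 + h - fa) / (4 * h * (1 - fa))
    return 0.5 + (fa - h) * (1 + fa - h) / (4 * fa * (1 - h))


print(round(d_prime(0.84, 0.16), 2))  # ~1.99
print(a_prime(0.9, 0.1))              # ~0.944
print(a_prime(0.5, 0.5))              # 0.5, i.e., no sensitivity
```

The two branches of a_prime make the measure symmetric around chance, which is why an A’ of .5 corresponds to no detection ability regardless of response bias.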
Within the research literature on capacity limits, different studies have reported different capacity measures, as a function of the paradigm used or based on the straightforwardness of the results. An example of this latter scenario was given in chapter two. We reported the difference in conclusions between two studies, based on a different use of capacity measures. Brown and Brockmole (2010) reported only A’ for their results and based their conclusion on this measure. Allen et al. (2012) replicated the design used by Brown and Brockmole and showed that the conclusions drawn from the measure A’ were not corroborated by other measures like d’ or the corrected recognition rate. Reliance on a single measure may thus in some cases lead to misleading conclusions. In order to reach firm conclusions, convergent evidence from different capacity measures as well as from different methodologies should be sought.
4 A’ has remained the most popular non-parametric measure of sensitivity, despite a correction of the measure’s formula by Smith (1995, i.e., A'') and a further correction of that correction by Zhang and Mueller (2005, i.e., A).