Validity and Test Development
Order Description
A review paper describes many published research studies on a particular topic. For this project, you will be critiquing an article that is a review paper (Mitchell, 2012) related to external validity of psychological research. External validity refers to the generalizability of results of a study and how well those results will hold up outside a laboratory setting in the real world.
Complete a three- to four-page paper that summarizes the main points and findings. Describe them in a scholarly manner (using critical thinking), but also in a way that someone from outside the field can understand. Be sure to include the reference for the paper in your submission.
Mitchell, G. (2012). Revisiting truth or triviality: The external validity of research in the psychological laboratory. Perspectives on Psychological Science, 7(2), 109–117. DOI: 10.1177/1745691611432343. http://pps.sagepub.com
A widely held assumption within the social sciences is that the
rigor of experimental research is purchased at the price of generalizability
of results (Black, 1955; Locke, 1986; Wilson,
Aronson, & Carlsmith, 2010). This trade-off plays out most
directly in those fields that use laboratory experiments to study
how humans navigate complex social environments, such as in
social and industrial–organizational (I-O) psychology. In these
fields, highly controlled experiments produce internally valid
findings with suspect external validity (e.g., Flowe, Finklea, &
Ebbesen, 2009; Greenwood, 2004; Harré & Secord, 1972).
Researchers typically respond to external validity suspicions
in one of three ways: by arguing that findings from even
highly artificial laboratory studies advance theories that
explain behavior outside the laboratory (e.g., Mook, 1983;
Wilson et al., 2010), by conducting field studies that demonstrate
that causal relations observed in the laboratory hold in
the field (e.g., Behrman & Davey, 2001), or by conducting a
meta-analysis of laboratory and field studies to assess the
impact of research setting on results within a particular area of
research (e.g., Avolio, Reichard, Hannah, Walumbwa, & Chan,
2009). Anderson, Lindsay, and Bushman (1999) offered a
novel and broad response to the external validity question by
comparing 38 pairs of effect sizes from laboratory and field
studies of various psychological phenomena as compiled in 21
meta-analyses (i.e., each meta-analysis compared the mean
effect size found in the laboratory to that found in the field for
the particular phenomenon under investigation).1 Anderson
and colleagues found a high correlation between these meta-analyzed
laboratory and field effects (r = .73), leading them to
conclude that "the psychological laboratory is doing quite well
in terms of external validity; it has been discovering truth, not
triviality" (Anderson et al., 1999, p. 8).
Anderson et al. (1999) has been widely cited (as of
this writing, 150 times in PsycINFO), often for the proposition
that psychological laboratory research in general possesses
external validity and, thus, the new laboratory finding being
reported is likely to generalize (e.g., Ellis, Humphrey, Conlon,
& Tinsley, 2006; von Wittich & Antonakis, 2011; West, Patera,
& Carsten, 2009). This proposition, and its use to allay external
validity concerns about new laboratory findings, assumes
the external validity of Anderson and colleagues’ conclusion
about the external validity of laboratory studies.
Revisiting Truth or Triviality: The External Validity of Research in the Psychological Laboratory
Gregory Mitchell, University of Virginia

Corresponding Author: Gregory Mitchell, School of Law, University of Virginia, Charlottesville, VA 22903. E-mail: greg_mitchell@virginia.edu

Abstract
Anderson, Lindsay, and Bushman (1999) compared effect sizes from laboratory and field studies of 38 research topics compiled in 21 meta-analyses and concluded that psychological laboratories produced externally valid results. A replication and extension of Anderson et al. (1999) using 217 lab–field comparisons from 82 meta-analyses found that the external validity of laboratory research differed considerably by psychological subfield, research topic, and effect size. Laboratory results from industrial–organizational psychology most reliably predicted field results, effects found in social psychology laboratories most frequently changed signs in the field (from positive to negative or vice versa), and large laboratory effects were more reliably replicated in the field than medium and small laboratory effects.

Keywords: external validity, generalizability, meta-analysis, effect size

Downloaded from pps.sagepub.com at UNIV WASHINGTON LIBRARIES on April 23, 2012

However, Anderson and colleagues' conclusion was based
on a fairly small number of paired effect sizes that show
considerable variation despite the strong overall correlation
between laboratory and field results. For instance, their six
comparisons of laboratory and field effect sizes from
meta-analyses of gender differences in behavior reached
inconsistent results (r = -.03). Furthermore, their correlational
result indicated the direction and magnitude of the
relationship, but not the magnitude of differences in effect
sizes between the laboratory and the field (i.e., the rank
ordering of effects could be quite consistent despite large differences
in effect size between the lab and field). Because the
small sample examined by Anderson and his colleagues limited
the analyses that could be performed and the conclusions
that could be drawn from their study, a replication and extension
of Anderson et al. (1999) was undertaken to examine the
external validity of psychological laboratory research after
10 years using a larger database of effect sizes covering a
wider range of psychological phenomena. This larger data
set permitted a more detailed examination of external validity
by psychological subfield and area of research.2
The goal of my study, therefore, was to replicate Anderson
et al.’s (1999) study using a larger data set to determine whether
their broad positive conclusion about the external validity of
laboratory research remains defensible or whether there are
identifiable patterns of external validity variation. This study,
like Anderson and colleagues’ study, is focused on whether laboratory
and field results agree and thus employs a coarse distinction
between research settings—comparing results obtained
under laboratory conditions to those found in the field or under
more mundanely realistic conditions. To the extent that variation
between the laboratory and field is observed, a more
detailed inquiry is called for because many different design
variables could account for the variation: differences in participant
characteristics between lab and field studies and across cultures
(Henrich, Heine, & Norenzayan, 2010; Henry, 2009);
differences in guiding design principles such as the use of
“mundane realism” versus “psychological realism” (Aronson,
Wilson, & Akert, 1994, p. 58) versus representative sampling of
stimuli to develop participant tasks, environments, and measures
(Dhami, Hertwig, & Hoffrage, 2004); or differences in the
timing of the research that may be related to larger societal
or historical changes (Cook, 2001). Also, there may be fundamental
differences in the generalizability of the processes or
phenomena studied across psychological subfields: Some phenomena
at some levels of analysis may not vary with the characteristics
of the individual and situation, some phenomena may
be unique to particular laboratory designs using particular types
of participants (i.e., some phenomena may be created in the
laboratory rather than be brought into the laboratory for study),
and some phenomena may generalize across a narrow range of
persons and situations.
In short, examining the consistency of meta-analytic estimates
of effects across research settings provides a good first
test of the generalizability of laboratory results, but the limits
of this approach must be acknowledged. The inferences to be
drawn from positive results are limited by the diversity of the
participant and situation samples found in the synthesized
studies, and negative results call for deeper inquiry into the
causes of external invalidity. The meta-analytic data examined
here cover a wide range of psychological topics, research settings,
and participants. Therefore, if results based on this data
set approximate those found by Anderson et al. (1999), then
we should have greater confidence in their conclusion that
psychological laboratories reveal truths rather than trivialities.
If results based on this larger data set differ, then the task will
be to understand why some laboratory results generalize while
others do not.
Meta-Analytic Data on Effects Studied in
the Laboratory and the Field
An effort was made to identify all meta-analyses that synthesized
research on some aspect of human psychology conducted
in a laboratory setting and in an alternative research
setting (see the Appendix for details on the literature search).
In keeping with the approach taken by Anderson et al. (1999),
comparisons were not limited strictly to laboratory versus
field research on the same topic but also included comparisons
of results found under less and more mundanely realistic conditions
(e.g., the use of experimentally created versus real
groups in the study of group behavior and the use of hypothetical
versus real transgressions in the study of forgiveness). A
review of over 1,100 papers located in the literature search
identified 82 meta-analyses reporting effect sizes for at least
two research settings, for a total of 217 comparisons of results
found under laboratory, or less realistic, conditions to results
found under field, or more realistic, conditions (including two
dissertations that contributed six lab–field comparisons).3 The
full data set is provided in an online supplement.
Most meta-analyses reported effect sizes in terms of r.
When an effect size was reported in a unit other than r, the
effect size was converted to r using standard conversion formulas
(Cohen, 1988; Rosenthal, 1994). When both weighted
and unweighted effect sizes were reported, the weighted effect
sizes were used in the analyses reported here.
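The conversion step described above can be sketched in code. This is a minimal illustration of the standard d-to-r conversion from Cohen (1988); the function name and equal-n default are my own choices, not from the paper:

```python
import math

def d_to_r(d, n1=None, n2=None):
    """Convert a standardized mean difference (Cohen's d) to the
    correlation-metric effect size r.

    With equal (or unknown) group sizes the standard formula is
    r = d / sqrt(d^2 + 4); with known unequal group sizes the 4 is
    replaced by the correction factor (n1 + n2)^2 / (n1 * n2).
    """
    if n1 is None or n2 is None:
        a = 4.0  # equal-n assumption
    else:
        a = (n1 + n2) ** 2 / (n1 * n2)
    return d / math.sqrt(d ** 2 + a)

print(round(d_to_r(0.5), 3))  # a "medium" d of .5 maps to roughly r = .24
```

Rosenthal (1994) gives analogous formulas for other effect metrics (odds ratios, chi-square based statistics), all of which reduce to a value on the r scale.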
Four of the meta-analyses compared two types of laboratory
studies with one or more types of field studies, and 17 of
the meta-analyses compared two or more types of field studies
with a single type of laboratory study (see online supplement
for details). The results discussed below focus on the
comparison of laboratory effects with true field studies or
with conditions that differ most from the laboratory conditions
because these research settings possess the least “proximal
similarity” (Cook, 1990) to the laboratory and thus are
likely to raise the greatest generalizability concerns (e.g.,
McKay & Schare’s, 1999, comparison of results found in a
traditional laboratory to those found in the field serves as the
focal comparison, rather than their comparison of a traditional
lab to a “bar lab”).4
In order to examine possible variation in generalizability
across research domains, I classified the meta-analytic data in a
number of ways: (a) by PsycINFO group codes that are used to
classify studies by primary subject matter (for more information
on this classification system, see http://www.apa.org/pubs/
databases/training/class-codes.aspx), (b) by psychological subfield
as classified by the present author before knowing the
PsycINFO classifications of the meta-analyses, (c) by psychological
subfield of meta-analysis first author as determined by
the affiliation disclosed in the meta-analysis or from information
available on the Web if the first author’s subfield affiliation
was not apparent from the meta-analysis, and (d) by research
topics according to PsycINFO subgroup codes and classification
by the present author. Results using the PsycINFO classifications
are emphasized because those classifications were
made by independent coders, show consistency over time, and
cover more of the data than some alternative classifications.5
Consistency and Variation in Effects in the
Laboratory and Field
Aggregate results
A plot of the data reveals considerable correspondence in
paired laboratory and field effects (see Fig. 1). When one
potential outlier is removed, the overall correlation between
lab and field effects in this expanded sample approximates that
found in Anderson et al.’s (1999) sample: r = .71 versus r = .73
reported by Anderson and colleagues (see Table 1 for the full
correlation matrix).6
As a measure of the reliability of the direction of effects
found in the laboratory, the number of times in which a laboratory
effect changed its sign in the field (from positive to negative
or vice versa) was counted: overall, 30 of 215 laboratory
effects changed signs (14%).7 Thus, a nontrivial number of
effects observed in the laboratory produced opposite effects in
the field. With respect to the relative magnitude of effects, the
mean difference between laboratory and field effects was only
.01, but this difference had a standard deviation of .18 on a
scale in which the average laboratory and field effects were
both r = .17.
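The two reliability summaries used above (sign reversals and the mean lab-minus-field difference) are straightforward to compute. A minimal sketch with made-up paired effects, not the actual data set:

```python
import statistics

def sign_changes(lab, field):
    """Count pairs whose lab and field effects have opposite signs."""
    return sum(1 for l, f in zip(lab, field) if l * f < 0)

# Hypothetical paired effects in r units (illustration only).
lab   = [0.20, -0.10, 0.30, 0.05]
field = [-0.05, 0.10, 0.40, 0.02]

diffs = [l - f for l, f in zip(lab, field)]
print(sign_changes(lab, field))           # 2 reversals in this toy set
print(round(statistics.mean(diffs), 3))   # mean lab-minus-field difference
print(round(statistics.stdev(diffs), 3))  # spread of those differences
```

As the text notes, a near-zero mean difference can coexist with a large standard deviation, which is why both statistics are reported.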
Results by subfield
It is possible that the dispersion seen in Figure 1 is random
across research topics and domains, or it may be that the
aggregate results mask systematic differences in lab–field
correspondence. To examine possible differences in lab–field
correspondence across traditional divisions of psychological
inquiry, the paired effects were divided by two alternative subfield
classifications: first by the subfield that PsycINFO classified
each meta-analysis into, and second by the subfield
that I classified each lab–field comparison into (see Table 2).
Subfield assignments and results converged under the two
approaches to classification, indicating that there was meaning
and consistency to the partitioning of the research by psychological
subfield.
The two subfields with the greatest number of paired
effects, I-O psychology and social psychology, differed considerably
in the degree of correspondence between the lab and
the field. Laboratory and field effects from I-O psychology
correlate very highly (r = .89, n = 72, 95% CI [.83, .93]),
whereas laboratory and field effects from social psychology
show a lower correlation (r = .53, n = 80, 95% CI [.35, .67]).8
A similar result holds if we partition effects by the subfield
affiliation of the first author of each meta-analysis: The
Fig. 1. Scatter plot of paired lab and field effects across all meta-analyses (fitted line: y = .639x + .062).
Table 1. Correlation of Lab–Field Effects

                     Lab             Lab2            Field           Field2
Lab2 (n = 216)   .99 [.99, .99]       —
Field (n = 216)  .71 [.64, .77]  .70 [.63, .76]       —
Field2 (n = 42)  .68 [.48, .82]  .69 [.49, .82]  .57 [.32, .74]       —
Field3 (n = 21)  .49 [.07, .76]  .49 [.07, .76]  .63 [.27, .83]  .43 [.00, .73]

Note: "Lab" represents the collection of primary lab results; "Lab2" substitutes the second lab result for the primary lab result from the four meta-analyses that examined two types of lab studies. "Field" represents the collection of primary field results; "Field2" and "Field3" represent field studies from meta-analyses examining two or three different types of field studies. Sample sizes reflect the number of paired effect sizes. Brackets present 95% confidence intervals. Results exclude the possible outlier paired effects from Mullen et al. (1991).
lab–field correlation from meta-analyses conducted by I-O
authors is .82 (n = 107, 95% CI [.75, .87]), whereas the lab–
field correlation from meta-analyses conducted by social psychology
authors is .53 (n = 76, 95% CI [.35, .67]).9
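The bracketed intervals reported above can be reproduced with the standard Fisher r-to-z transformation. A minimal sketch (the helper function is mine; the formula is the usual large-sample one):

```python
import math

def r_confidence_interval(r, n, z_crit=1.96):
    """95% CI for a correlation via Fisher's r-to-z transformation."""
    z = math.atanh(r)              # Fisher z
    se = 1.0 / math.sqrt(n - 3)    # standard error of z
    lo = z - z_crit * se
    hi = z + z_crit * se
    return math.tanh(lo), math.tanh(hi)  # back-transform to r units

# Reproduces the social psychology interval reported in the text:
lo, hi = r_confidence_interval(0.53, 80)
print(round(lo, 2), round(hi, 2))  # 0.35 0.67
```

The same call with r = .89 and n = 72 recovers the I-O interval [.83, .93], illustrating how the CI narrows as r grows and n increases.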
A plot of paired lab and field effects for I-O psychology
and social psychology illustrates the greater convergence of
lab and field results within I-O psychology: The slope of the
fitted line is steeper for I-O psychology, with I-O lab effects
thus being better predictors of field effects (see Fig. 2).10
Also, the paired effects from I-O psychology differed less in
their magnitude, as the distribution around zero difference is
steeper for I-O psychology than for social psychology
(kurtosis(I-O) = 2.318 vs. kurtosis(Social) = -.03). For comparison
purposes, a boxplot of the differences in effect size
between the laboratory and field across all subfields is provided
in Figure 3.
Furthermore, most of the 30 laboratory effects that changed
signs in the field came from social psychology. Twenty-one of
80 (26.3%) laboratory effects from social psychology changed
signs between research settings, but only 2 of 71 (2.8%) laboratory
effects from I-O psychology changed signs; as an additional
reference point, only 1 of 22 (4.5%) laboratory effects
from personality psychology changed signs, χ2(2) = 19.12,
p < .001.11
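The chi-square statistic above compares sign-reversal rates across the three subfields. A stdlib-only sketch of that test (for 2 degrees of freedom, the chi-square p-value happens to have the closed form exp(-χ²/2), which avoids needing scipy):

```python
import math

def chi_square_2df(table):
    """Pearson chi-square for a 3 x 2 contingency table (df = 2).

    The chi-square sum is general; the p-value formula below is the
    exact chi-square survival function only for df = 2.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = sum(
        (obs - rt * ct / n) ** 2 / (rt * ct / n)
        for row, rt in zip(table, row_totals)
        for obs, ct in zip(row, col_totals)
    )
    return chi2, math.exp(-chi2 / 2)

# Changed-sign vs. unchanged-sign counts: social, I-O, personality.
table = [[21, 59], [2, 69], [1, 21]]
chi2, p = chi_square_2df(table)
print(round(chi2, 2), p < 0.001)  # 19.12 True
```

Running this on the counts reported in the text recovers the χ2(2) = 19.12 statistic.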
Table 2. Correlation of Lab–Field Effects by Subfield Classifications

PsycINFO classification (n)               r      r    Author's classification (n)
Social (80)                              .53    .60   Social (79)
I-O (72)                                 .89    .82   I-O (98)
Personality (22)                         .83    .84   Clinical (19)
Consumer (7)                             .59    .59   Marketing (7)
Education (7)                            .71    .87   Education (5)
Developmental (3)                       -.82   -.88   Developmental (6)
Psychometrics/Statistics/Methods (19)    .61
Human Experimental (5)                   .61

Note: Sample sizes reflect the number of paired effect sizes. The PsycINFO classification excludes one pair of effects classified as "Environmental Psychology," and the author classification excludes two pairs of effects classified as "Health Psychology." Results exclude possible outlier effects from Mullen et al. (1991).
Fig. 2. Scatter plot of paired lab and field effects from social psychology (fitted line: y = .522x + .087) and I-O psychology (fitted line: y = .819x + .02).
Results by effect size
A partial explanation for the relatively weaker external validity
of social psychology laboratory results appears to be a disproportionate
focus on small effect sizes. Using Cohen’s rule of
thumb to categorize laboratory effect sizes, meta-analyses
within I-O psychology examined 29 small, 22 medium, and 21
large laboratory effects, and meta-analyses within social psychology
examined 53 small, 20 medium, and 8 large laboratory
effects.12 Small laboratory effects studied by social psychologists
varied more in the field than medium effects from social
psychology labs: r(small effects) = .30 (n = 53, 95% CI [.03, .53]) vs.
r(medium effects) = .57 (n = 20, 95% CI [.17, .81]).13 Small laboratory
effects from I-O psychology likewise varied more in the field
than larger effects: r(small effects) = .53 (n = 29, 95% CI [.20, .75]) vs.
r(medium effects) = .84 (n = 22, 95% CI [.65, .93]) vs. r(large effects) = .90
(n = 21, 95% CI [.77, .96]). This trend held across all studies,
r(small effects) = .47 (n = 112, 95% CI [.31, .60]) vs. r(medium effects) = .56
(n = 66, 95% CI [.37, .71]) vs. r(large effects) = .83 (n = 38, 95% CI
[.70, .91]), and small laboratory effects more frequently changed
signs in the field than medium and large effects (22.7% vs. 6.1%
vs. 2.6%, respectively).
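Cohen's rule of thumb for r (roughly .10 small, .30 medium, .50 large) can be expressed as a simple categorizer. The cutoffs below follow Cohen (1988), but the exact bin edges used to sort the meta-analytic effects are an assumption on my part:

```python
def cohen_category(r):
    """Classify an effect size r using Cohen's (1988) benchmarks.

    Assumed bin edges: |r| < .30 -> small, .30 <= |r| < .50 -> medium,
    |r| >= .50 -> large.
    """
    magnitude = abs(r)
    if magnitude < 0.30:
        return "small"
    if magnitude < 0.50:
        return "medium"
    return "large"

for effect in (0.10, 0.35, 0.60):
    print(effect, cohen_category(effect))
```

Taking the absolute value means a reversed-sign effect of the same magnitude lands in the same bin, which matches how effect size (not direction) drives the categorization here.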
Results by research topic
Lab–field correlations for specific areas of research (e.g.,
aggression studies, leadership studies) with at least nine
meta-analytic comparisons of laboratory and field effects were
examined. These results should be interpreted cautiously
because they are more sensitive to extreme values given the
smaller number of comparisons, but these results do converge
with the subfield results because topics of primary interest to
I-O psychologists showed the highest correlations and topics
of primary interest to social psychologists showed greater
variation (see Table 3).
However, these results also illustrate the hazard of assuming
that aggregate correlations of lab–field effects are representative
of the external validity of all laboratory research within a
subfield. There were large differences in the relative magnitude
of laboratory and field results across research topics (see the
standard deviations in mean effect size differences in Table 3)
and in the magnitude of the correlations. For instance, although
results from I-O laboratories tended to be good predictors of
field results, I-O laboratory studies of performance evaluations
were less predictive than I-O laboratory studies of other topics,
and leadership studies within I-O psychology were less predictive
than leadership studies within social psychology (r = .63
for 10 paired laboratory and field effects from leadership meta-analyses
conducted by I-O-affiliated authors vs. r = .93 for 7
paired effects from leadership meta-analyses conducted by
social-affiliated authors). Laboratory studies of gender differences
fared particularly poorly compared with other types of
social psychological research, which may be due to the small
effect sizes found in these studies.14
Fig. 3. Boxplot of differences between lab and field effect sizes (lab effect minus field effect) by subfield: Social, I-O, Personality, Consumer, Psychometrics/Stats & Methods, Developmental, Environmental, Human Experimental, Education.
Discussion
This expanded comparison of laboratory and field effects replicated
Anderson and colleagues’ (1999) basic result, but it also
raises questions about treating the external validity of psychological
laboratory research as an undifferentiated whole: In the
aggregate, laboratory and field effect sizes tended to covary (r =
.71 vs. Anderson et al.’s r =.73, if we exclude a potential outlier
from social psychology), but this result depended on the
extremely high correlation of laboratory and field effects from
I-O psychology. If we exclude I-O effects, the aggregate correlation
drops considerably (to r = .55).
External validity differed across psychological subfields
and across research topics within each subfield, and all subfields
showed considerable variation in the relative size of
effects found in the laboratory versus the field. External validity
also differed by effect size: Small laboratory effects were
less likely to replicate in the field than larger effects. This latter
result empirically demonstrates the importance of considering
effect size when planning a field test, not only to
determine sample size but also to determine the sensitivity
with which measurements should be made and the type of
research design needed to isolate the influence of the variables
of interest (Cohen, 1988).
Despite the variations in generalizability observed, it is
tempting to invoke Cohen’s effect size rule of thumb and conclude
that all of psychology is performing well in terms of
external validity because all subfields showed large lab–field
correlations, but doing so would ignore Cohen’s (1988) injunction
that “the size of an effect can only be appraised in the context
of the substantive issues involved” (p. 534). For an
investigator considering whether to pursue a new line of
research building on prior work, even small lab–field correlations
may be sufficient to proceed. For an organization or government
agency considering whether to implement a program
based on psychological research, even large lab–field correlations
may be insufficient, particularly if the costs of implementation
are high relative to the likely benefits. To determine
likely benefits, the constancy of effect direction and the relative
magnitude of the effect in the lab versus that found in the field
should be considered, but aggregate correlations between lab
and field effects do not provide this information.
Reliance on a subfield’s “external validity effect size” could
be particularly misleading for results from social psychology,
where more than 20% of the laboratory effects changed signs
between research settings. Shadish, Cook, and Campbell
(2002) emphasize constancy of causal direction over constancy
of effect size in their discussion of external validity on grounds
that constancy of relations among variables is more important
to theory development and the success of applications. The
number of sign reversals observed across domains should be
cause for concern among those seeking to extend any psychological
result to a new setting before any cross-validation work
has occurred.
Whether these sign reversals should be cause for concern in
any particular case depends on the goals of the research. Mook
(1983) correctly noted that some studies require external invalidity
to test a prediction or determine what is possible. In such
studies, what matters is whether the study helps advance a
theory, not whether a specific finding will generalize. But
Mook (1983) also noted that, “[u]ltimately, what makes
research findings of interest is that they help us understand
everyday life” (p. 386). Psychologists often examine minimal,
manageable interventions to open a window on psychological
processes and causal relations among variables (Prentice &
Miller, 1992), and that approach is justifiable if it ultimately
produces theories that explain and predict behavior outside the
laboratory. Small effects found in the lab can be important, and
large effects found in the lab can be unimportant (Cortina &
Landis, 2009); whichever is the case must eventually be established
in the field.
Conclusion
My results qualify the conclusion reached by Anderson et al.
(1999): Many psychological results found in the laboratory
can be replicated in the field, but the effects often differ greatly
in their size and less often (though still with disappointing frequency)
differ in their directions. The pattern of results suggests
that there are systematic differences in the reliability of
laboratory results across subfields, research topics, and effect
sizes, but the reliability of these patterns depends on the representativeness
of the laboratory studies synthesized in the
meta-analyses that provided the data for this study.
Also, it is possible that alternative divisions of the data
would yield different patterns. The data divisions that were
Table 3. Correlation of Lab–Field Effects and Standard Deviations of Effect Size Differences by Research Topic Classifications

Classification (n)                                                    r     SD
PsycINFO classification
  Group Processes & Interpersonal Processes (33)                     .58    .18
  Social Perception & Cognition (9)                                  .53    .17
  Personality Traits & Processes (20)                                .83    .13
  Behavior Disorders & Antisocial Behavior [aggression studies] (14) .68    .14
  Personnel Management & Selection & Training (14)                   .92    .12
  Personnel Evaluation & Job Performance (21)                        .74    .16
  Organizational Behavior (18)                                       .97    .09
Author classification
  Aggression-focused comparisons (17)                                .63    .13
  Gender-focused comparisons (22)                                    .28    .13
  Group-focused comparisons (43)                                     .63    .19
  Leader-focused comparisons (18)                                    .69    .21

Note: Sample sizes reflect the number of paired effect sizes. Results exclude possible outlier effects from Mullen et al. (1991).
chosen reflect two ideas: (a) different subfields develop and
teach unique research design customs and norms (see, e.g.,
Rozin, 2001), and (b) different research topics require different
compromises to enable their study in the laboratory (e.g.,
prejudice and stereotyping research in the laboratory must
often use simulated work situations, whereas research into the
accuracy of impressions based on thin slices of behavior may
be well-suited for laboratory study;15 Secord, 1982). Determining
the mix of factors responsible for the observed variations
in external validity will require further research.
A good starting place for such further inquiry is I-O psychology.
Results from I-O labs varied in their generalizability,
but the high degree of convergence in I-O effects across
research settings indicates that something about this subfield’s
practices or research topics tends to produce externally valid
laboratory research. It may be that I-O psychologists' traditional
skepticism of laboratory studies (Stone-Romero, 2002)
is adaptive: In a culture that trusts well-done laboratory studies,
internal validity challenges will likely command the
researcher’s (and journal editor’s) attention, whereas in a culture
that distrusts even well-done laboratory studies, external
validity challenges may grab much more of the researcher’s
(and editor’s) attention.16 It may be that the topics I-O psychologists
study are more amenable to laboratory study than
those studied by social psychologists, but that seems unlikely
given the focus in both subfields on behavior in complex
social settings. It may be that I-O psychologists, as primarily
applied researchers, benefit from the trial and error of basic
researchers in other subfields and are able to devote their
attention to robust results. If the explanations all reduce down
to the applied focus of I-O psychology, then the external and
internal validity of research within the basic research subfields
could benefit from greater attention to applications, for replication
in the field reduces the chances that relations observed
in the laboratory were spurious (Anderson et al., 1999).
Anderson et al. (1999) presented a positive message about
the generalizability of psychological laboratory research, but
the message here is mixed. We should recognize those domains
of research that produce externally valid research, and we
should learn from those domains to improve the generalizability
of laboratory research in other domains. Applied lessons
are often drawn from laboratory research before any cross-validation
work has occurred, yet many small effects from the
laboratory will turn out to be unreliable, and a surprising number
of laboratory findings may turn out to be affirmatively