

We study the pattern of correlations across a large number of behavioral regularities, with the goal of creating an empirical basis for more comprehensive theories of decision-making. We elicit 21 behaviors, using an incentivized survey on a representative sample (n=1,000) of the US population. Our data show a clear and relatively simple structure underlying the correlations between these measures. Using principal components analysis, we reduce the 21 variables to six components corresponding to clear clusters of high correlations. We examine the relationship between these components, cognitive ability, and demographics. Common extant theories are not compatible with all the patterns in our data.

I.  Introduction

Decades of research in economics and psychology has identified a large number of behavioral regularities—specific patterns of behavior present in the choices of a large fraction of decision-makers—that run counter to the standard model of economic decision-making. This has led to an enormous amount of research aimed at understanding each of these behaviors. However, significantly less work has gone into linking these regularities with each other, either theoretically or empirically. Instead, most regularities have been studied in isolation, with specific models developed for each one. This has led to concerns about model proliferation in behavioral economics.1 As Fudenberg (2006, 698) notes, “[B]efore behavioral theory can be integrated into mainstream economics, the many assumptions that underlie its various models should eventually be reduced to the implications of a smaller set of more primitive assumptions.”

In this paper, we study the pattern of correlations across a large number of behavioral regularities, with the goal of creating an empirical basis for more comprehensive theories of decision-making. We use an incentivized survey to elicit 21 behaviors from a representative sample of the US population (n=1,000). These econographics—a neologism describing measures of trait-like behaviors related to economic decision-making—cover broad areas of social preferences (eight measures), attitudes toward risk and uncertainty (nine measures), overconfidence (three measures), and time preferences (one measure). Whenever possible, our elicitations are incentivized: the compensation participants receive depends on their choices. We also include two measures of cognitive abilities and several demographic variables. Moreover, we took steps to limit the effects of measurement error by eliciting many of our measures twice and using the “obviously related instrumental variables” (ORIV) technique (Gillen, Snowberg, and Yariv 2019).

Overall, our main finding is that there is a clear and relatively simple structure underlying our data. We can summarize 21 econographics with six components, as shown in section IV and summarized in table 1. These components can easily be seen in the correlational structure of the data: for example, the correlations between the measures in the Generosity component range from 0.34 to 0.86. Two of these components underlie social preferences and beliefs about others, and two underlie risk and uncertainty. One component has explanatory power in the domains of both social and risk preferences. A final component underlies overconfidence measures. Time preferences are spread across a number of components but load most heavily on the Punishment component of social preferences. Each econographic, except for time preferences, loads heavily on only one of the six components. As we detail below, existing models predict some of the relationships we observe, but to our knowledge none match the overall pattern, suggesting the need for new theories.

Table 1. Twenty-One Econographics Can Be Summarized with Six Components

Note.—Simplified summary of patterns shown in Table 7. The names of econographics correspond to those in the literature. For definitions of specific econographics, see sec. II. As noted in the text, time preferences load weakly on the Punishment component, slightly changing its meaning: this is indicated by the parentheses around that measure and the alternative name of the component. WTA = willingness to accept; WTP = willingness to pay; CR = Common Ratio.

A.  Approach and Limitations

We focus on a specific set of measures and study their empirical relationships in order to create an empirical basis that can aid the development of comprehensive theories of decision-making. Our approach is complementary to others in the literature. One standard approach is to focus on specific theoretical links (e.g., Chakraborty, Halevy, and Saito 2020, which links time and risk preferences). Others start with a specific theoretical mechanism and then study—theoretically and, in some cases, also empirically—which behaviors it can generate. This is the case of popular modern research programs on rational inattention (Stevens 2020; Woodford 2020; Khaw, Li, and Woodford 2021; Frydman and Jin 2022), salience (Bordalo, Gennaioli, and Shleifer 2012, 2013; Li and Camerer 2022), limited strategic thinking (Camerer, Ho, and Chong 2004; Crawford, Costa-Gomes, and Iriberri 2013; Farhi and Werning 2019; García-Schmidt and Woodford 2019), incomplete preferences (Masatlioglu and Ok 2014; Cerreia-Vioglio, Dillenberger, and Ortoleva 2015), preference imprecision (Butler and Loomes 2007, 2011), and cognitive uncertainty (Enke and Graeber 2019). While these important programs are complementary to our approach, our investigation is primarily empirical and does not seek to relate a large number of behaviors to a single underlying mechanism. Rather, it aims to construct a basis for theoretical modeling by exploring the empirical structure of our data, which will turn out to be more complex than what can be explained by a single mechanism.

Examining all links between econographics, as we do—including many not studied by theories that seek to link different behaviors—has a number of benefits, especially when compared with the standard approach of theorizing about a particular connection between two behaviors and testing it. First, as correlations are not transitive, identifying underlying structures requires observing all links (or lack thereof) between behaviors.2 Second, once that structure is identified, its components can be examined to see whether they are associated with existing constructs—such as demographics or cognitive ability. Third, by measuring all behaviors simultaneously in a representative sample, we ensure that the patterns we identify are not due to shifting participant populations between studies.3

The empirical structure we uncover could have taken one of three forms, with different implications. First, there may be no discernible structure, suggesting the continued usefulness of examining behaviors in isolation. Second, there may be a structure that matches well with some existing theoretical approach. Third, it could be that a structure exists but does not match any extant theory. Our results are somewhere between the second and third possibilities: extant theories match some of the correlations we observe, but no theory explains the patterns in a specific domain, let alone the overall patterns we observe. This suggests the need for further theoretical exploration, for which we hope we give an empirical basis.

There are inherent constraints on the number and type of behaviors we could elicit on an incentivized survey. This meant that we had to leave out many interesting phenomena. While we would like to have been able to capture many other behaviors, making choices was unavoidable. Some measures were infeasible in our context: for example, measuring time preferences using a real-effort task (Augenblick, Niederle, and Sprenger 2015; Cohen et al. 2020) or preferences for competition (Niederle and Vesterlund 2007); such measures, while interesting, would have taken as much time as approximately a half-dozen of our other elicitations combined. Other measures—such as inattention, character, and some noncognitive skills—lack clear gold-standard elicitations. There are standard measures of cognitive skills (IQ and cognitive reflection) and personality (see, for early examples in economics, Almlund et al. 2011; Becker et al. 2012). We choose to focus on cognitive skills because of the possibility of their central role in behavioral economics (see, e.g., Benjamin, Brown, and Shapiro 2013; Stango and Zinman, forthcoming). Future studies can and should focus on alternative sets of behaviors. Indeed, as we discuss in section VII, some contemporaneous studies already are.

B.  Analysis and Results

Describing the 378 pairwise correlations among the 21 econographic variables and the cognitive and demographic variables is a daunting task. However, we are aided by the fact that many of the econographics fall into clusters of variables featuring high intracluster correlations and low intercluster correlations. To summarize them, we make use of principal components analysis (PCA), a statistical technique that produces components (linear combinations of variables) that explain as much variation in the underlying behaviors as possible, as discussed in section III.4
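To make the idea of components concrete, the following is a minimal sketch, not our estimation code. The correlation matrix is a hypothetical four-variable toy with two clusters, and power iteration with deflation stands in for a library eigen-solver. The count of 28 variables is inferred only from the arithmetic (378 is the number of pairs among 28 items) and is an assumption.

```python
import math

# Number of distinct pairwise correlations among k variables is C(k, 2).
pairs_among_21 = math.comb(21, 2)  # the econographics alone
pairs_among_28 = math.comb(28, 2)  # 28 is inferred from the figure 378 (assumption)

def mat_vec(matrix, vector):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def leading_component(matrix, iterations=500):
    """Power iteration: largest eigenvalue and unit eigenvector of a symmetric matrix."""
    vector = [1.0] * len(matrix)
    for _ in range(iterations):
        product = mat_vec(matrix, vector)
        norm = math.sqrt(sum(p * p for p in product))
        vector = [p / norm for p in product]
    eigenvalue = sum(v * p for v, p in zip(vector, mat_vec(matrix, vector)))
    return eigenvalue, vector

def principal_components(corr, n_components):
    """Extract components one at a time, removing each from the matrix (deflation)."""
    size = len(corr)
    residual = [row[:] for row in corr]
    components = []
    for _ in range(n_components):
        eigenvalue, loading = leading_component(residual)
        components.append((eigenvalue, loading))
        residual = [
            [residual[i][j] - eigenvalue * loading[i] * loading[j] for j in range(size)]
            for i in range(size)
        ]
    return components

# Hypothetical correlation matrix: variables 0-1 form one cluster, 2-3 another.
toy_corr = [
    [1.0, 0.8, 0.0, 0.0],
    [0.8, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.5],
    [0.0, 0.0, 0.5, 1.0],
]
components = principal_components(toy_corr, 2)
# Share of total variance explained by each component (trace of a correlation
# matrix equals the number of variables).
variance_shares = [eigenvalue / len(toy_corr) for eigenvalue, _ in components]
```

In this toy case each component loads on exactly one cluster, which is the kind of pattern that high intracluster and low intercluster correlations produce.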

In our analysis, discussed at length in section IV, we are aided by a fact that was not ex ante obvious: risk and social preferences are largely—but, as we will see below, not completely—independent. Thus, we first study social and risk preferences separately, before combining them with measures of time preferences and overconfidence.

Our results show that there is significant scope for representing our eight measures of social preferences in a more parsimonious way. The behaviors we measure break down into three clusters, as summarized in table 1. In particular, altruism, trust, and two different types of reciprocity form a cluster. Pro-social punishment and antisocial punishment constitute a second cluster. The third cluster is formed by two different types of inequality aversion—dislike of having more than another person and a dislike of having less. As we discuss below, these results are not in line with any of the existing theories of social preferences.

Risk preferences show a structure that is less parsimonious than standard theory. Our nine measures form three clusters, as summarized in table 1. Two separate ones contain different measures of risk attitudes toward ordinary lotteries. One cluster is formed by the willingness to accept (WTA) for a lottery ticket, together with risk aversion for lotteries with gains, losses, and gains and losses (see sec. II.B) as measured by certainty equivalents. The other cluster is formed by the willingness to pay (WTP) for a lottery ticket, along with risk aversion as measured by lottery equivalents. While the members of each cluster are highly related to each other, they are largely independent across clusters, suggesting the presence of two separate and independent forms of risk attitude toward lotteries.5 These two components have different relationships with other features (e.g., age), as shown in section V. The third cluster consists of aversion to compound lotteries and ambiguity aversion. These behaviors are highly correlated, consistent with previous studies (Halevy 2007; Dean and Ortoleva 2019; Gillen, Snowberg, and Yariv 2019). Our richer data allow us to document that they are largely unrelated to other aspects of risk preferences.

Analyzing risk and social preferences together—along with overconfidence and time preferences—results in six components, as summarized in table 1. Two social components and two risk components are largely unchanged. The three overconfidence measures form a new component.6 Patience loads most heavily (and negatively) on the punishment component of social preferences. However, one social component—inequality aversion—and one risk component—related to WTP—combine with each other, showing the possibility of further parsimony in the representation. Of course, the generic idea of a connection between risk and inequality aversion has been suggested before. For example, from behind the “veil of ignorance” more inequality creates more risk (Carlsson, Daruvala, and Johansson-Stenman 2005). Our results indicate, on the one hand, that this has some empirical support (unlike other possible connections that have been suggested). On the other hand, they show how this holds for only one of the two components of risk attitudes toward ordinary lotteries (in particular, the one related to WTP), not the other (the one related to WTA).

We also examine the relationships between the components we identify and cognitive abilities and demographics, in section V. Four of the six components are correlated with cognitive abilities: higher cognitive ability is associated with a lower propensity to punish and greater patience, lower overconfidence, and higher generosity. Moreover, we document a relationship between cognitive abilities and risk preferences, but with only one of the two components linked with risk aversion—the one joined with inequality aversion. Connecting with demographics, the strongest links are with education and income; moreover, these variables do not seem to be simply proxying for cognitive ability but have their own individual effect.

C.  Relation to Theory and Literature

Finally, we draw out the implications of our data for existing theories, in section VI, some of which were alluded to above. Broadly speaking, we show that while there is some overlap between our data and common theories of social and risk preferences, no theory explains the overall patterns well, even within a specific domain. For social preferences, the three outcome-based models we consider—altruistic preferences, social welfare, and inequality aversion—predict that many of the measures that make up the Generosity component (described in table 1) should be positively correlated.7 However, these theories also predict a relationship with inequality aversion, which we do not observe. For risk preferences, models delineating risk and uncertainty do not match our data. Moreover, although reference points are clearly important in explaining the split between risk-aversion measures associated with WTA and those with WTP, common theories of reference dependence make additional predictions that are not consistent with our data.

Our study is uniquely suited to our aim of understanding the empirical basis for more parsimonious behavioral models across the risk, social, time, and confidence domains. There is a significant literature, summarized in section VII, that examines the correlation between two or three behavioral measures and/or cognitive ability. These more limited sets of relationships are not able to recover all of the nuanced structure we document here. There are a small number of studies that do measure more behaviors (Burks et al. 2009; Dohmen et al. 2018; Falk et al. 2018; Dean and Ortoleva 2019; Stango and Zinman, forthcoming). As discussed in section VII, these studies differ from ours in terms of behaviors examined, incentivization, representativeness, and/or how they deal with measurement error.

II.  Design and Econographics

Four design decisions followed from our goal of providing an empirical basis for the underlying structure of theories of behavioral decision-making. While the most important was the selection of behaviors to elicit, we return to this discussion after introducing the behaviors we measure.

Our second decision was to study behavioral measures rather than using those measures to estimate model parameters. For example, when measuring risk aversion, we elicit certainty equivalents for lotteries and use (a linear transform of) them in our analyses, rather than trying to identify a parameter of some utility function (e.g., a constant relative risk aversion, or CRRA, utility function). This allows our results to be used to inspire theories that connect behaviors without committing to specific functional forms, which are almost surely misspecified and for which precise individual-level parameter estimation is difficult in a short survey.8

Two final design decisions—that our study is incentivized and representative—are easy to justify. Much of the literature we build on comes from laboratory studies in economics, which are almost always incentivized. While there are sometimes good reasons to use nonincentivized measures, these reasons are largely related to feasibility and credibility. These concerns were not substantial in our case, and we were motivated to move past them. In order to make our empirical basis representative of a broad range of people, rather than just specific subgroups, we used a representative sample. These two design decisions drove and constrained a number of implementation details discussed in the next section.

A challenge of using a representative sample is that some participants will be poorly educated (relative to convenience samples of college students). For generalizability, this sampling is a feature, not a bug; but it does mean that elicitations must be simple and designed to have good internal validity for all Americans. Consequently, many of our behavioral measures are based on the same elicitation technique: indifference elicited using a multiple price list (MPL) method. This was chosen because it allows for more efficient estimation of indifference points than asking individual binary choices—which, for an experiment of our scope, would be infeasible—and it is seen as easier for participants to understand than incentivized pricing tasks (Cason and Plott 2014). A training period, with examples and supervised trial sessions, preceded the actual survey.9 Other common issues that arise in trying to draw representative samples are addressed in section III.A. The techniques we used to (statistically) deal with measurement error, detailed in section III.B, further helped in recovering valid estimates from these low-education populations.
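As an illustration of how an MPL brackets an indifference point, consider the sketch below. The row values, the function names, and the midpoint convention are all assumptions made for the example and are not taken from our survey instrument.

```python
def switch_interval(sure_amounts, chose_sure):
    """Return (lower, upper) bounds on the indifference point from one price list.

    sure_amounts: ascending sure payments, one per row of the list.
    chose_sure:   list of booleans, True where the sure payment was chosen
                  over the prospect being valued.
    """
    if len(sure_amounts) != len(chose_sure):
        raise ValueError("one choice per row is required")
    # First row at which the participant takes the sure payment.
    first_sure = chose_sure.index(True) if True in chose_sure else len(chose_sure)
    # A well-behaved response switches at most once.
    if not all(chose_sure[first_sure:]):
        raise ValueError("choices switch more than once")
    lower = sure_amounts[first_sure - 1] if first_sure > 0 else None
    upper = sure_amounts[first_sure] if first_sure < len(sure_amounts) else None
    return lower, upper

def midpoint(interval):
    """Point estimate of indifference; None if the list does not bracket it."""
    lower, upper = interval
    if lower is None or upper is None:
        return None
    return (lower + upper) / 2

# Hypothetical list: the prospect is preferred to $1-$3, the sure amount from $4 up.
rows = [1, 2, 3, 4, 5, 6]
choices = [False, False, False, True, True, True]
bounds = switch_interval(rows, choices)
estimate = midpoint(bounds)
```

A single list of six rows pins the indifference point to a one-dollar interval, which a sequence of separate binary questions would need several rounds to achieve.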

Appendix A gives implementation details omitted here. The specific question wordings, screenshots, and other details of experimental design can be found in our replication bundle (Chapman et al. 2022a).

A.  Social Preferences

There are many examples where people’s actions take into account the preferences and beliefs of others, even in nonstrategic settings. The motivating factors behind these acts are often given the broad term of social preferences.

Altruism is defined as giving to strangers while expecting nothing in return. In experimental economics, it is usually measured using the dictator game (Forsythe et al. 1994; Falk et al. 2013), in which one participant decides unilaterally how to split money between themselves and another person. Following this literature, we measure Altruism as the amount given to another person in the dictator game.

Trust and reciprocity are intertwined. To understand why, imagine that a stranger asks for money for a sure-thing investment. In order to provide money in such a venture, you must trust that he will give some of the proceeds back to you. The act of giving money back is reciprocation, which may depend on how much money you gave to the stranger. We measure these concepts through a standard trust game: one participant (the sender) decides how much of an endowment to send to a second (the receiver). This amount is doubled by the experimenter, and then the receiver decides how much to send back—which is also doubled (Berg, Dickhaut, and McCabe 1995). We measure Trust as the amount sent by the participant when they are in the role of the sender. The original sender will also take the role of receiver in a different interaction: Reciprocity: Low corresponds to the amount sent back when receiving the lowest amount, and Reciprocity: High corresponds to the amount sent back when receiving the maximum amount.10
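The payoff arithmetic of the trust game described above can be sketched as follows. The doubling in both directions follows the description in the text; the endowment of 10, the parameter names, and the default of no separate endowment for the receiver are assumptions made for illustration.

```python
MULTIPLIER = 2  # transfers in both directions are doubled, per the description above

def trust_game_payoffs(endowment, sent, sent_back, receiver_endowment=0):
    """Return (sender_payoff, receiver_payoff) for one trust-game interaction.

    receiver_endowment defaults to 0, which is an assumption of this sketch.
    """
    if not 0 <= sent <= endowment:
        raise ValueError("sender cannot send more than the endowment")
    receiver_holdings = receiver_endowment + MULTIPLIER * sent
    if not 0 <= sent_back <= receiver_holdings:
        raise ValueError("receiver cannot return more than they hold")
    sender_payoff = endowment - sent + MULTIPLIER * sent_back
    receiver_payoff = receiver_holdings - sent_back
    return sender_payoff, receiver_payoff

# Full trust met with partial reciprocation (hypothetical amounts).
reciprocated = trust_game_payoffs(endowment=10, sent=10, sent_back=10)
# Full trust met with no reciprocation.
betrayed = trust_game_payoffs(endowment=10, sent=10, sent_back=0)
```

In this sketch, Trust corresponds to the argument `sent`, and the two Reciprocity measures correspond to `sent_back` evaluated at the lowest and highest possible values of `sent`.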

People are willing to punish others for what they perceive as bad behavior—even when that behavior does not directly affect them—and punishment is costly. To measure this, we allow participants to observe a trust game in which the sender gives all the money they have and the receiver returns nothing. We give each participant a stock of points to punish the receiver—that is, to pay a cost to reduce the points the receiver gets: the amount used is Prosocial Punishment. Prior studies document that a significant minority of people—the percent possibly depending on culture—also punish the sender (Herrmann, Thöni, and Gächter 2008). Thus, we also give a separate stock of points that can be used to punish the person who sent all their money. The amount used to punish the sender is Antisocial Punishment.

Many people seem uncomfortable with having a different amount (greater or less) than others, a phenomenon known as inequality aversion (Fehr and Schmidt 1999; Charness and Rabin 2002; Kerschbamer 2015). Dislike Having Less is how much a person is willing to forgo in order to ensure that they will not have less than another person. Dislike Having More is how much a person is willing to forgo in order to ensure that they will not have more than another person.

B.  Measures of Risk Attitudes

To measure attitudes toward risk and uncertainty, we elicit the valuation of various prospects. All lotteries involve only two possible payoffs, and most assign 50% probability to each.

Following the standard approach, we identify the behavioral manifestation of risk aversion as valuing a lottery at less than its expected value. Extensive research shows that the patterns of valuation depend on whether a lottery contains positive payoffs, negative payoffs, or both a positive and a negative payoff.11 Thus, we include three measures of risk aversion: Risk Aversion: Gains elicits a participant’s certainty equivalent for a lottery containing nonnegative payoffs, Risk Aversion: Losses elicits a participant’s certainty equivalent for a lottery with nonpositive payoffs, and Risk Aversion: Gain/Loss elicits a participant’s certainty equivalent for a lottery with one positive and one negative payoff (Cohen, Jaffray, and Said 1987; Holt and Laury 2002). The difference between the expected value of the lottery and a participant’s value is used in the analysis, so larger numbers indicate more risk aversion.
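The sign convention can be made explicit with a short sketch. The payoffs and certainty equivalents below are hypothetical, and the function names are ours for illustration only.

```python
def expected_value(payoffs, probabilities):
    """Expected value of a lottery given parallel payoffs and probabilities."""
    if abs(sum(probabilities) - 1.0) > 1e-12:
        raise ValueError("probabilities must sum to one")
    return sum(x * p for x, p in zip(payoffs, probabilities))

def risk_aversion(payoffs, probabilities, certainty_equivalent):
    """Expected value minus the elicited value; larger means more risk averse."""
    return expected_value(payoffs, probabilities) - certainty_equivalent

even_odds = (0.5, 0.5)
gains = risk_aversion((10, 0), even_odds, certainty_equivalent=4)      # values a gain lottery below its mean
losses = risk_aversion((0, -10), even_odds, certainty_equivalent=-6)   # would pay 6 to avoid an expected loss of 5
mixed = risk_aversion((10, -10), even_odds, certainty_equivalent=-1)   # would pay 1 to avoid a fair gamble
seeking = risk_aversion((0, -10), even_odds, certainty_equivalent=-4)  # negative value: risk seeking over losses
```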

The endowment effect is the phenomenon that, on average, people value a good more highly if they possess, or are endowed with, it. In our implementation, WTP is the amount a participant is willing to pay for a lottery ticket, and WTA is the amount the participant is willing to accept for the same ticket when she or he is endowed with it. The difference between WTA and WTP is the Endowment Effect (Kahneman, Knetsch, and Thaler 1990). We discuss this as a risk preference because the object being bought and sold is a lottery ticket, but more importantly, because of the patterns revealed in the analysis of section IV.C. However, as the pattern of correlations in our data suggests that WTA and WTP are also fundamental behaviors, we primarily examine these rather than the endowment effect.

Risk attitudes often change when one of the available options offers certainty, as demonstrated through the common-ratio effect (Allais 1953). Under expected utility, when the winning probabilities of two lotteries are scaled down by a common factor, a person’s ranking over those lotteries should not change. However, this is often not the case. To capture this effect, we ask the participant to make two choices, one that measures risk aversion with a certain alternative and another in which both options are risky. In Risk Aversion: CR Certain (where CR = Common Ratio), we elicit the amount b such that the participant is indifferent between a certain amount a and a lottery paying b with probability α (and zero otherwise): that is, a lottery equivalent of a sure amount. In Risk Aversion: CR Lottery, we elicit the amount c such that the participant is indifferent between a lottery paying a with probability 1/x (and zero otherwise) and a lottery paying c with probability α/x (and zero otherwise). Under expected utility, b = c; the Common Ratio measure is then b − c (Dean and Ortoleva 2019). In keeping with our treatment of the endowment effect, we enter the constituent measures in most analyses.12
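The claim that expected utility implies equal answers to the two questions can be checked numerically. The sketch below assumes a hypothetical square-root utility with u(0) = 0 and hypothetical values of a, α, and x; it also reads the Common Ratio measure as the difference between the two elicited amounts, which is an assumption of the example.

```python
def lottery_equivalent(a, alpha, scale, u, u_inv):
    """Prize z that makes 'a with probability scale' indifferent to
    'z with probability alpha * scale' under expected utility with u(0) = 0.

    Indifference requires scale * u(a) = alpha * scale * u(z).
    """
    return u_inv(scale * u(a) / (alpha * scale))

def u(z):
    """Hypothetical utility function (square root)."""
    return z ** 0.5

def u_inv(v):
    """Inverse of the hypothetical utility function."""
    return v ** 2

a, alpha, x = 4.0, 0.8, 4.0  # hypothetical parameters
b = lottery_equivalent(a, alpha, scale=1.0, u=u, u_inv=u_inv)      # sure amount a versus a risky prize
c = lottery_equivalent(a, alpha, scale=1.0 / x, u=u, u_inv=u_inv)  # both options scaled down by 1/x

def common_ratio_measure(elicited_b, elicited_c):
    """Difference between the two lottery equivalents; zero under expected utility."""
    return elicited_b - elicited_c

# A participant who demands a larger prize when the alternative is certain.
certainty_effect = common_ratio_measure(7.0, 6.0)
```

The scale factor cancels from the indifference condition, so the two amounts coincide for any utility function of this form; a nonzero difference in elicited amounts therefore signals a departure from expected utility.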

Ambiguity aversion is a preference (or beliefs that lead to a preference) for prospects with known probabilities over those with unknown probabilities. To measure it, we use an ambiguous urn filled with balls of two different colors: one color gives the participant a positive payoff, and the other gives them zero. Participants do not know the proportions of the different colors of balls in the urn but are allowed to choose which color gives a positive payoff. They are then asked for their certainty equivalent for a draw from this urn. If participants have a prior over the composition of the urn, they must believe that a draw from the urn has a winning probability of at least 50%, yet many participants prefer a 50/50 lottery with known odds. The difference between the certainty equivalents for draws from the risky urn—with a known composition of 50% of each color—and the ambiguous urn is Ambiguity Aversion.13 Similarly, a draw from a risky urn is usually more highly valued than one in which the number of balls is unknown but drawn from a uniform distribution—that is, a compound lottery. The difference between the certainty equivalents for draws from the risky urn and a compound urn is Compound-Lottery Aversion (Halevy 2007).
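The argument that any prior implies a winning probability of at least 50% can be sketched in a few lines. The priors and certainty equivalents below are hypothetical, and the function names are ours for illustration.

```python
def best_color_win_probability(prior):
    """Winning probability when the participant picks the more likely color.

    prior: dict mapping the share of color-A balls in the urn to the
           probability the participant assigns to that composition.
    """
    if abs(sum(prior.values()) - 1.0) > 1e-12:
        raise ValueError("prior must sum to one")
    expected_share = sum(share * weight for share, weight in prior.items())
    # Betting on A wins with expected_share; betting on B wins with the rest.
    return max(expected_share, 1 - expected_share)

def aversion(ce_risky, ce_other):
    """Certainty equivalent of the risky urn minus that of the comparison urn."""
    return ce_risky - ce_other

symmetric = best_color_win_probability({0.0: 0.5, 1.0: 0.5})  # all A or all B, equally likely
skewed = best_color_win_probability({0.2: 0.5, 0.4: 0.5})     # believes color B is more common

# Hypothetical certainty equivalents for draws from each urn.
ambiguity_aversion = aversion(ce_risky=4.5, ce_other=3.5)
compound_lottery_aversion = aversion(ce_risky=4.5, ce_other=4.0)
```

Because the larger of a probability and its complement is never below one half, valuing the ambiguous urn below the risky urn cannot be rationalized by a single prior.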

C.  Overconfidence

Overconfidence can be divided into three types. Overestimation refers to a person’s estimate of her performance on a task (vs. her actual performance). Overplacement refers to her perceived performance relative to other participants (vs. her real relative performance). In order to measure these phenomena, we ask participants to complete two tasks: on one, we ask them to estimate their performance, and on the other, we ask them to estimate their performance relative to others taking the survey. The difference between these subjective estimations and actual performance, in absolute or percentile terms, gives us Overestimation or Overplacement, respectively (Moore and Healy 2008).

Overprecision refers to a belief that one’s information is more precise than it actually is. We ask participants to estimate a number (such as the year the telephone was invented) and then tell us how close they think they were to the correct answer. To difference out Overprecision from justified precision, we regress how close the participant thought they were on a fourth-order polynomial of their accuracy (Ortoleva and Snowberg 2015a, 2015b).14
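The residualization step can be sketched as an ordinary least squares fit of stated closeness on a fourth-order polynomial of accuracy, keeping the residuals. The data below are hypothetical, exact rational arithmetic is used only to sidestep the poor conditioning of high-order polynomial terms, and the sign convention for interpreting the residual is left open.

```python
from fractions import Fraction

def solve(matrix, rhs):
    """Solve a square linear system by Gauss-Jordan elimination (exact with Fractions)."""
    n = len(matrix)
    augmented = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        # Find a row with a nonzero entry in this column and move it into place.
        pivot = next(r for r in range(col, n) if augmented[r][col] != 0)
        augmented[col], augmented[pivot] = augmented[pivot], augmented[col]
        lead = augmented[col][col]
        augmented[col] = [v / lead for v in augmented[col]]
        for r in range(n):
            if r != col and augmented[r][col] != 0:
                factor = augmented[r][col]
                augmented[r] = [vr - factor * vc for vr, vc in zip(augmented[r], augmented[col])]
    return [augmented[i][n] for i in range(n)]

def polynomial_residuals(x, y, degree=4):
    """Residuals from regressing y on 1, x, ..., x**degree via the normal equations."""
    design = [[Fraction(xi) ** k for k in range(degree + 1)] for xi in x]
    size = degree + 1
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(size)] for i in range(size)]
    xty = [sum(row[i] * yi for row, yi in zip(design, y)) for i in range(size)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, row)) for row in design]
    return [Fraction(yi) - fi for yi, fi in zip(y, fitted)]

# Hypothetical data: actual error of each answer and the error the participant reported.
accuracy = [0, 1, 2, 3, 5, 8, 13, 21]
stated_closeness = [1, 2, 2, 5, 3, 10, 6, 20]
residuals = polynomial_residuals(accuracy, stated_closeness)
```

The residual is the part of stated closeness that accuracy does not account for, so a participant who is accurate and says so is not counted as overprecise.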

D.  Patience (Time Preferences)

A payoff sometime in the future is generally seen as less valuable than a payoff of the same size today. The value today of a fixed future payoff is Patience (Andersen et al. 2008).15

E.  Choice of Measures

The choice of measures to include was driven largely by a desire to focus on “the basics” and by time constraints inherent in the study design. This general logic still left room for many subjective choices. In this subsection, we try to lay out how this logic was applied and how subjective considerations ultimately led to the set of measures we focused on.

Most of our measures are simple deviations from the standard neoclassical model of a selfish expected-utility maximizer. Behavior under that model is relatively simple: choices involving risk and uncertainty are driven by the curvature of the utility function. Adding time adds a discount rate. Choices involving other individuals are no different than choices involving the market: all result in maximizing the welfare of the individual, with no regard for the welfare of others. Finally, complex choices are processed no differently than simple ones. In the domain of risk, deviations from this model allow for distortions in the way these choices are perceived, including reference points and violations of independence. In the domain of social choice, deviations allow for caring about the welfare of others. Deviations in the domain of cognitive limitations include biased cognitive processing, how those biases affect strategic interactions, and so on.

We focused mainly on preferences in the risk and social domains. These areas have generated enormous interest both within and beyond behavioral economics. Arguably, they constitute the areas where behavioral economic models have been most extensively applied in other fields (e.g., finance and development economics). Each area has generated multiple empirical regularities that are inconsistent with the standard model of economic decision-making, leading to model proliferation. As different modeling approaches suggest different empirical relations between these regularities, it is particularly useful to establish the correlational structure between these behaviors to guide future modeling work.

Within each domain, the measures we included were guided by a combination of theory and importance to the literature. In the case of risk, we first focused on the two classic violations of independence: common-ratio and ambiguity-aversion measures. Both have been central to generating huge literatures on non–expected utility theory. Moreover, both have been explained as manifestations of probability weighting (e.g., Quiggin 1982; Segal 1987), as has aversion to compound lotteries (Segal 1990). Probability weighting has also been advanced as a possible explanation for small-stakes risk aversion.

We then focus on the second major behavioral force in choice under risk: reference dependence. Since the seminal work of Kahneman and Tversky (1979), the ideas of reference-dependent risk attitudes and loss aversion have been central to understanding risky choice. The key behavioral patterns associated with reference dependence are differing risk attitudes in the gain-and-loss domain and an increase in risk aversion for lotteries that include both gains and losses, thus motivating our three risk-aversion measures. At the same time, the idea of loss aversion has been used to explain the endowment effect (see, e.g., Kahneman, Knetsch, and Thaler 1990; Kőszegi and Rabin 2007). This motivated us to also include a measure of the endowment effect in our study.

We would argue that our nine measures of risk attitude span the behaviors that have been most influential in guiding the development of models of risky choice and are the most obvious manifestations of the three key theoretical constructs in the literature—ambiguity attitudes, nonlinear probability weighting, and reference dependence.

In the case of social preferences, we selected a set of measures that both reflect standard experiments and connect to the most widespread theories of social preferences in a parsimonious way. In virtually all models of social preferences, people account for the other’s well-being. This is most directly captured by our measures of altruism and trust, which are both associated with classical measures in their own right. Common approaches, most prominently that of Fehr and Schmidt (1999), add the role of inequality aversion, with a specific distinction between preferences over advantageous and disadvantageous inequality. Other theories point toward the importance of reciprocity and different types of punishment. Importantly, different theories make different predictions about the distribution and correlation of these behaviors.

We made less of an attempt to cover behaviors thought to be driven by biases or mistakes, largely because of time constraints on the survey. Moreover, many of these judgment biases are carefully considered in Stango and Zinman (forthcoming), which can be seen as a companion to our more intensive focus on preferences. However, we did have space for a couple of such behaviors and chose to focus on overconfidence (Moore and Healy 2008) and level-K thinking (Nagel 1995). Overconfidence was motivated by a longer-term interest in the relationship between economic and political behavior, and prior results suggest that overconfidence is an important link between the two (Ortoleva and Snowberg 2015a, 2015b). Level-K thinking has long been a subject of interest for one of our coauthors, although our elicitations did not seem to produce reliable results; see figure E.1.

Naturally, there are other areas arguably of equal interest to behavioral economists that we have ruled out on various grounds. While there are many interesting behavioral phenomena and traits in strategic settings—for example, quantal response and competitiveness—the types of questions needed to measure these behaviors comprehensively would be too time-consuming for our survey. For similar reasons, we were able to measure only time preferences using money, rather than, say, real-effort tasks. Given the well-known issues with this approach, we restricted ourselves to a single time-preference measure. In the interests of parsimony, we measured reference dependence within the context of risk preferences, rather than separately, and did not measure violations of the independence of irrelevant alternatives, such as the compromise or asymmetric dominance effect. Overall, time constraints in the survey forced us to make choices and to leave out many measures that would be interesting to explore in future work.

III.  Implementation and Analysis

This section describes our representative, incentivized survey and the statistical techniques used to eliminate the attenuating effects of measurement error.

A.  Survey Implementation

Administering an incentivized survey to a representative population presents challenges not normally dealt with in lab environments. To surmount these challenges, we partnered with YouGov, a worldwide leader in online surveys serving the public, businesses, and governments.16 Our study was given to a representative sample of 1,000 US adults between March 30 and April 14, 2016. We consulted extensively with YouGov on our study design to utilize its expertise in survey design and implementation.

Constructing a representative sample is difficult, given variation in response rates. In order to do so, most modern surveys weight on demographics. YouGov supplements this with its own panel of participants. It continually recruits new people, especially from difficult-to-reach and low-socioeconomic-status groups. To generate a representative sample, it randomly draws people from various Census Bureau products and matches them on observables to members of its panel. Oversampling and differential response rates lead to the over- and underrepresentation of certain populations. Thus, YouGov provides sample weights to recover estimates that would be obtained from a fully representative sample. According to Pew Research, YouGov’s sampling and weighting procedure yields more representative samples than traditional probability (i.e., random) sampling methods, including Pew’s own probability sample (Kennedy et al. 2016; YouGov is Sample I). We use these weights throughout the paper.17

Incentivized questions pose additional challenges: stakes and whether the experimenter is seen as credible in making future payments or running randomizations as specified. Two randomly selected questions were chosen for payment.18 To enhance the credibility of our study, we took advantage of YouGov’s relationship with its panel and restricted the sample to those who had already been paid (in cash or prizes) for their participation in surveys.

All outcomes to incentivized questions were expressed in points. This is an internal YouGov currency used to pay participants. It can be converted to US dollars or prizes, using the approximate rate of $0.001 per point.19 The average payment to participants was around 9,000 points (or $9). The survey took participants between 45 minutes and an hour. This compensation level is quite high for an internet survey and represents a rate of pay approximately three times the average for similar surveys.

B.  Measurement Error

Measurement error causes a downward bias in correlations. This would make it more difficult to pick out the clusters of interrelated measures. Moreover, this downward bias would also make it difficult to know which behaviors are actually not related. To circumvent this issue, we use the ORIV technique of Gillen, Snowberg, and Yariv (2019), which uses the instrumental variables approach to errors in variables to produce efficient, consistent estimates. At the heart of this method are duplicate observations of variables (multiple elicitations that are similar but not exactly the same) that are likely to have orthogonal measurement error. This technique takes duplicate measures and estimates a stacked regression in which each measure of one behavior is used as both an independent variable and an instrument. This is done twice, once for each measure of the other behavior, effectively averaging across all four specifications in the stacked regression model.20
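The stacked instrumental-variables idea can be illustrated with a short simulation. The sketch below is not the authors' code: it assumes classical measurement error that is independent across elicitations, and the function name `oriv_correlation`, the simulated data, and the conversion of the slope into a correlation via the covariances of duplicates are illustrative assumptions.

```python
import numpy as np

def oriv_correlation(xa, xb, ya, yb):
    """Error-corrected correlation from duplicate elicitations (illustrative).

    xa, xb are two noisy measures of X; ya, yb are two noisy measures of Y.
    """
    xa, xb, ya, yb = (np.asarray(v, dtype=float) - np.mean(v)
                      for v in (xa, xb, ya, yb))
    # Stack the four (outcome, regressor, instrument) combinations.
    y = np.concatenate([ya, ya, yb, yb])
    x = np.concatenate([xa, xb, xa, xb])
    z = np.concatenate([xb, xa, xb, xa])  # the other elicitation of X
    beta = (z @ y) / (z @ x)              # just-identified IV slope
    # Assumption: the covariance of duplicates estimates the variance of the
    # error-free variable, which converts the slope into a correlation.
    var_x = np.mean(xa * xb)
    var_y = np.mean(ya * yb)
    return beta * np.sqrt(var_x / var_y)

def add_noise(values, rng):
    """One noisy elicitation: true value plus unit-variance error."""
    return values + rng.standard_normal(values.shape)

# Simulated (hypothetical) data with a true correlation of 0.5.
rng = np.random.default_rng(0)
n, rho = 20_000, 0.5
x_true = rng.standard_normal(n)
y_true = rho * x_true + np.sqrt(1 - rho**2) * rng.standard_normal(n)
xa, xb = add_noise(x_true, rng), add_noise(x_true, rng)
ya, yb = add_noise(y_true, rng), add_noise(y_true, rng)

naive = np.corrcoef(xa, ya)[0, 1]              # attenuated toward zero
corrected = oriv_correlation(xa, xb, ya, yb)   # close to the true value
```

In this simulation the naive correlation is roughly half the true value, while the corrected estimate is close to 0.5, which is the attenuation the text describes.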

Our setting requires us to deal with an additional issue: constructed variables will often have correlated measurement error due to the nature of their construction. For example, the Compound-Lottery Aversion and Ambiguity Aversion measures are both constructed by taking some behavior (the certainty equivalent for a compound or ambiguous urn) and subtracting off the same quantity measured with error (certainty equivalent of a risky urn). This leads to correlation in the measurement error of Ambiguity Aversion and Compound-Lottery Aversion. To avoid spurious correlations, we make use of the fact that we have two observations for each measure and modify the ORIV procedure so that the measurement error in the instrument is uncorrelated with the measurement error in the left-hand-side variable.21 We use this formulation for all sets of variables constructed from two elicitations: Compound-Lottery Aversion and Ambiguity Aversion, and Overplacement and Overprecision.
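A small simulation can show why a shared, noisily measured term creates spurious correlation, and why pairing different elicitations avoids it. This is only an illustration of the problem under assumed unit-variance errors and independent latent traits; it is not the modified ORIV procedure itself, and all variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50_000

# Hypothetical latent quantities, independent by construction.
amb_aversion = rng.standard_normal(n)
comp_aversion = rng.standard_normal(n)
risky_ce = rng.standard_normal(n)

def elicit(latent):
    """One noisy elicitation: latent value plus independent unit-variance error."""
    return latent + rng.standard_normal(n)

risky_1, risky_2 = elicit(risky_ce), elicit(risky_ce)
ambiguous_1 = elicit(risky_ce - amb_aversion)
compound_1 = elicit(risky_ce - comp_aversion)
compound_2 = elicit(risky_ce - comp_aversion)

amb_1 = risky_1 - ambiguous_1
comp_1 = risky_1 - compound_1   # shares the error in risky_1 with amb_1
comp_2 = risky_2 - compound_2   # built from the other risky elicitation

# Same elicitation of the risky urn: correlation is inflated by shared error.
same_elicitation = np.corrcoef(amb_1, comp_1)[0, 1]
# Different elicitations: errors are independent, so no spurious correlation.
cross_elicitation = np.corrcoef(amb_1, comp_2)[0, 1]
```

Although the two latent traits are unrelated here, the measures built from the same risky elicitation are clearly correlated, while the cross-elicitation pairing is not.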

C.  Multiple-Hypothesis Testing

This paper displays a large number of correlations and standard errors. There are no theoretical predictions for most of the correlations we examine. Thus, we omit any description of statistical testing or significance from our tables. However, if one were interested in null-hypothesis statistical testing, the appropriate critical value for significance at the 5% level is between 1.96 for a single-hypothesis test and 3.82 using a Bonferroni correction (Dunn 1958, 1961) for all 378 correlations underlying sections IV or V. List, Shaikh, and Xu (2016) provide an alternative approach that is generally less restrictive than a Bonferroni correction but still provides valid inference.
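The two critical values quoted above can be checked with a few lines of standard-library Python. This sketch assumes two-sided tests based on the normal distribution; the function name is illustrative.

```python
from statistics import NormalDist

def two_sided_critical_value(alpha, n_tests=1):
    """Normal critical value for a two-sided test with a Bonferroni correction."""
    return NormalDist().inv_cdf(1 - alpha / (2 * n_tests))

single = two_sided_critical_value(0.05)                  # one hypothesis
bonferroni = two_sided_critical_value(0.05, n_tests=378) # all 378 correlations
print(round(single, 2), round(bonferroni, 2))
```

The Bonferroni correction simply divides the significance level by the number of tests before looking up the critical value.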


We examine correlation matrices directly and use PCA to summarize them. As PCA is an analysis of a correlation matrix and our correlation matrices are corrected for measurement error, so too are the results of PCA. The aim of PCA is to extract the m components most useful for explaining n>m variables. Components are linear combinations of the variables. The first component is constructed to capture the highest possible fraction of variance in the data (subject to the constraint that the squared linear weights sum to 1), the second to capture the highest fraction of the remaining variance, conditional on being orthogonal to the first component, and so on.22
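As a minimal sketch of PCA applied directly to a correlation matrix, the components can be obtained from an eigendecomposition, with weight vectors normalized to unit length (the usual PCA normalization). The matrix below is a made-up example with two clusters of related measures, not data from the paper.

```python
import numpy as np

def pca_from_correlation(corr):
    """Return eigenvalues (descending) and unit-length weight vectors (columns)."""
    eigenvalues, eigenvectors = np.linalg.eigh(corr)  # corr must be symmetric
    order = np.argsort(eigenvalues)[::-1]
    return eigenvalues[order], eigenvectors[:, order]

# Hypothetical 4-variable correlation matrix with two clusters.
corr = np.array([
    [1.0, 0.6, 0.1, 0.0],
    [0.6, 1.0, 0.0, 0.1],
    [0.1, 0.0, 1.0, 0.5],
    [0.0, 0.1, 0.5, 1.0],
])
eigenvalues, weights = pca_from_correlation(corr)
share = eigenvalues / eigenvalues.sum()     # fraction of variance per component
loadings = weights * np.sqrt(eigenvalues)   # variable-component correlations
```

Because the input is a correlation matrix, the eigenvalues sum to the number of variables, so `share` gives each component's fraction of total variance.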

Once components are identified, the key question is, How many are necessary to provide a good description of the underlying data? Heuristically, we want to retain components only when the marginal explanatory power is high. In order to determine the number of components to retain, we use an approach that captures this intuition: parallel analysis. Parallel analysis creates many random data sets with the same numbers of observations and variables as the original data. The average eigenvalues of the resulting correlation matrices are then computed. Components are kept as long as their associated eigenvalues are greater in the actual data than the average in the randomly generated data.23
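The retention rule can be sketched as follows. This version works from raw simulated data and ordinary correlations, whereas the paper's correlation matrices are corrected for measurement error; the function name, the number of simulations, and the example data are assumptions for illustration.

```python
import numpy as np

def sorted_eigenvalues(data):
    """Eigenvalues of the correlation matrix of `data`, largest first."""
    corr = np.corrcoef(data, rowvar=False)
    return np.sort(np.linalg.eigvalsh(corr))[::-1]

def parallel_analysis(data, n_sims=200, seed=0):
    """Number of components whose eigenvalues beat the random-data average."""
    rng = np.random.default_rng(seed)
    n_obs, n_vars = data.shape
    actual = sorted_eigenvalues(data)
    simulated = np.empty((n_sims, n_vars))
    for s in range(n_sims):
        simulated[s] = sorted_eigenvalues(rng.standard_normal((n_obs, n_vars)))
    benchmark = simulated.mean(axis=0)
    exceeds = actual > benchmark
    # Keep components up to (not including) the first that fails the comparison.
    n_keep = n_vars if exceeds.all() else int(np.argmin(exceeds))
    return n_keep, actual, benchmark

# Hypothetical data: six measures driven by two independent latent factors.
rng = np.random.default_rng(1)
f1, f2 = rng.standard_normal((2, 1000))
data = np.column_stack([f1, f1, f1, f2, f2, f2]) + rng.standard_normal((1000, 6))
n_keep, actual, benchmark = parallel_analysis(data)
```

With two latent factors generating the six measures, the first two eigenvalues sit well above the random benchmark and the rest fall below it, so two components are retained.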

The retained components help in understanding the relationship between the original variables in the data set. The correlation between a component and a variable is called the variable’s loading on that component. Variables that load heavily on the same component are highly related. In order to facilitate interpretation, retained components can be rotated relative to the data. Following standard practice, we rotate the resulting components using the varimax rotation (Furr 2017, 92). This rotates the basis identified from the retained components to maximize the variance of the squared loadings, easing interpretation of the components. However, as we will see below, the patterns in the components largely line up with apparent patterns in the correlation matrices, making interpretation straightforward.
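For readers unfamiliar with varimax, the following is a minimal sketch of one common iterative implementation based on repeated singular value decompositions. It is not taken from the paper; the loading matrix and the mixing angle are hypothetical.

```python
import numpy as np

def varimax(loadings, max_iter=100, tol=1e-8):
    """Rotate a (variables x components) loading matrix toward simple structure."""
    p, k = loadings.shape
    rotation = np.eye(k)
    objective = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        # Target matrix for the SVD-based update of the rotation.
        target = rotated**3 - rotated @ np.diag((rotated**2).sum(axis=0)) / p
        u, s, vt = np.linalg.svd(loadings.T @ target)
        rotation = u @ vt                  # orthogonal by construction
        new_objective = s.sum()
        if objective > 0 and new_objective < objective * (1 + tol):
            break
        objective = new_objective
    return loadings @ rotation, rotation

# Hypothetical loadings with a clear two-cluster structure, then mixed by a
# rotation so that the structure is no longer visible.
simple = np.array([[0.8, 0.0], [0.7, 0.1], [0.6, 0.0],
                   [0.0, 0.8], [0.1, 0.7], [0.0, 0.6]])
angle = 0.5
mix = np.array([[np.cos(angle), -np.sin(angle)],
                [np.sin(angle), np.cos(angle)]])
unrotated = simple @ mix
rotated, rotation = varimax(unrotated)
```

Because the rotation is orthogonal, each variable's total explained variance (the row sum of squared loadings) is unchanged; only its split across components changes, which is what eases interpretation.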

In section V, we analyze whether the components we identify correlate with cognitive and demographic measures. As we have two elicitations of most variables, to compute the components we multiply the weights from the PCA by the average of the two elicitations.
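The score computation described here amounts to a matrix product. The sketch below uses made-up numbers; treating a measure with a single elicitation by using that one value (via `np.nanmean`) is an assumption for illustration, as is the absence of any further standardization.

```python
import numpy as np

rng = np.random.default_rng(0)
n_people, n_measures, n_components = 5, 4, 2

# Hypothetical data: two elicitations of each measure for each participant.
first = rng.standard_normal((n_people, n_measures))
second = first + 0.3 * rng.standard_normal((n_people, n_measures))
second[:, 3] = np.nan  # assumption: the last measure was elicited only once

# Hypothetical PCA weights (one column per retained component).
weights = rng.standard_normal((n_measures, n_components))

# Average the available elicitations, then apply the component weights.
average = np.nanmean(np.stack([first, second]), axis=0)
scores = average @ weights  # one row per participant, one column per component
```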

IV.  Relationships between Econographics

The next two sections attempt to explain, as succinctly as possible, the relationships between 21 econographics variables.24 We begin by examining each econographic separately. Next, we study the relationships between econographics through a visual inspection of correlation matrices, followed by PCAs to verify the observed patterns. This leads to our central finding that the 21 econographics are well summarized by six principal components. In the next section, we examine the relationship between these principal components and cognitive abilities and demographics. As noted in the introduction, there are many ways one might summarize these 378 relationships. Our approach is driven by the desire to create an empirical basis for an underlying structure of more comprehensive theories of behavioral decision-making and by the hope that this is a relatively straightforward way to do so.

We describe the relationships between econographics in three steps. We first examine social preferences, then examine risk preferences, and then combine social and risk preferences with overconfidence measures and patience. This is done for simplicity and because each of the first two types of preferences may be of independent interest. Moreover, as our results are largely driven by clusters of correlations, these clusters will not disappear when additional measures are added to the analysis (although they may be augmented).

A.  Summary Statistics

The summary statistics in Table 2 show that behavior in our data is consistent with standard findings in the laboratory. In addition to summary statistics, we also show the percentage of participants whose responses are in the “expected” direction.25 Surveying the information in this column, the majority of participants are risk averse or risk neutral over gains, risk loving over losses, and inequality averse; exhibit an endowment effect and the common-ratio effect; are ambiguity averse; have a negative reaction to compound lotteries; and are overconfident. (A majority of participants are not all of these things simultaneously.) Note that although patience in this table is represented as a discount rate, in the correlation analysis we code the variable as discussed in section II.D. Either coding gives the same (directional) interpretation to correlations. However, the coding in section II.D is linear in a participant’s answer, allowing for the measurement error correction discussed in section III.B.

Table 2. 

Summary Statistics of Econographics Measures

| Variable | Description/Unit | Mean Value | Standard Deviation | Expected Direction (%) | Correlation between Duplicates |
| --- | --- | --- | --- | --- | --- |
| Reciprocity: Low | Percent of possible points returned | .42 | .23 | | .80 |
| Reciprocity: High | Percent of possible points returned | .39 | .22 | | .96 |
| Trust | Percent of possible points sent | .45 | .26 | | |
| Altruism | Percent of possible points sent | .41 | .27 | | |
| Antisocial Punishment | Percent of possible points used | .21 | .35 | | |
| Prosocial Punishment | Percent of possible points used | .50 | .42 | | |
| Dislike Having More | Percent of income forgone for equal split | .08 | .45 | 78 | .74 |
| Dislike Having Less | Percent of income forgone for equal split | .02 | .37 | 58 | .69 |
| Risk Aversion: Gains | (EV − CE)/EV | −.04 | .49 | 50 | .64 |
| Risk Aversion: Losses | (EV − CE)/EV | −.29 | .52 | 73 | .69 |
| Risk Aversion: Gain/Loss | (EV − CE)/EV | −.04 | .55 | | .71 |
| WTA | Percent of EV | .91 | .41 | | .70 |
| WTP | Percent of EV | .65 | .36 | | .75 |
| Endowment Effect | Percent of EV | .26 | .57 | 74 | .75 |
| Risk Aversion: CR Certain | (1 − EV of LE) as percent of sure amount | −.32 | .44 | 71 | .76 |
| Risk Aversion: CR Lottery | (1 − EV of LE) as percent of EV of lottery | −.35 | .44 | 73 | .70 |
| Common Ratio | RA: CR Certain − RA: CR Lottery | −.03 | .52 | 62 | .62 |
| Ambiguity Aversion | (Risky CE − Ambiguous CE)/EV | .07 | .55 | 71 | .64 |
| Compound Aversion | (Risky CE − Compound CE)/EV | .03 | .56 | 68 | .57 |
| Overestimation | Perceived − real no. correct (of 3) | .69 | 1.18 | 87 | .37 |
| Overplacement | Perceived − real percentile | 6.4 | 39 | 60 | .30 |
| Overprecision | Standardized subjective precision | −.02 | 1.00 | | .42 |
| Patience | Monthly discount rate | .87 | .23 | | .78 |

Note. “Expected Direction” refers to the percentage of the participants (weighted) who give an answer in the direction expected, given the current literature: risk averse or risk neutral for most risk questions, risk loving or risk neutral for risk aversion over losses, equality seeking for distributional preferences, endowment effect greater than 0, overweighting small probabilities for common ratio, and ambiguity/compound averse or neutral. When there are two measures of a quantity, those measures are normalized and stacked, so the sample statistics are drawn from 2,000 observations from 1,000 people. EV = expected value; CE = certainty equivalent; LE = lottery equivalent; RA = risk aversion.


Our data exhibit fairly standard levels of noise. As discussed in Gillen, Snowberg, and Yariv (2019) and Snowberg and Yariv (2021), the correlation of duplicate measures—subtracted from 1—gives the level of noise in a particular elicitation. Gillen, Snowberg, and Yariv (2019) report correlations of around 0.65 between duplicate measures, using data from Caltech undergraduates. In most cases, the correlations we observe—in the final column of Table 2—are somewhat higher. This implies that our data are less noisy than similar data obtained from Caltech undergraduates. The exceptions are the overconfidence measures, which are noisier than the rest.26 When there is no correlation listed, we have only one elicitation of that behavior.

B.  Links between Social Preferences

There is ample opportunity to create a more parsimonious representation of the social preferences we measure: altruism, trust, anti- and prosocial punishment, and distributional preferences. These measures fall into three clusters shown in correlation table 3: one formed by the two measures of reciprocity, and altruism and trust; a second formed by the willingness to punish pro- and antisocial behavior; and a third formed by our two measures of inequality aversion. These clusters are characterized by high within-cluster correlations and low correlations between measures in different clusters. To make these clusters visually apparent, we present the correlation matrix in the form of a “heat map,” where the shade of red indicates the magnitude of the correlation.27

Table 3. 

ORIV Correlations of Social Measures.

Note.—Bootstrapped standard errors from 10,000 simulations are in parentheses. Colors in heat map change with each 0.05 of magnitude of correlation. Measures are as defined in Table 2.

The first correlation in the first cluster—0.86 between the two reciprocity measures—is a useful example for interpretation. There are distinctions between those who are more and less reciprocal when a partner is more or less generous, resulting in a less than perfect correlation. Yet the predominant behavioral distinction, reflected in the very high correlation, is how reciprocal someone is in all conditions. This overarching behavior is (empirically) related to both trust and altruism, although more closely to the former. We note that although some readers may have anticipated some of these correlations, the fact that the literature distinguishes between, say, different forms of reciprocity, indicates that these anticipations are not universally shared.

The second cluster contains the two punishment measures.28 Like other clusters, they are highly correlated with each other but with few other measures. However, these measures are characterized by both an extensive and an intensive margin. The extensive margin is whether or not someone punishes and the intensive margin is how much punishment a person metes out. This leads to obvious questions about how these two margins contribute to the overall relationship. The correlations on both margins are roughly equal. About two-thirds of participants engage in prosocial punishment, whereas only one-third engage in antisocial punishment. However, almost everyone who engages in antisocial punishment also engages in prosocial punishment. Of those who engage in both types of punishment, there is a correlation of ∼0.4 in the amount they choose to spend on punishment of both parties. Both the extensive and intensive margins are poorly related to other measures of social preferences.
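The split into extensive and intensive margins can be computed as in the sketch below. The data are simulated purely for illustration (the shares and the dependence between amounts are assumptions, not estimates from the survey), and the variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000

# Hypothetical data: share of available points spent on each type of punishment.
punishes_pro = rng.random(n) < 0.66
prosocial = np.where(punishes_pro, rng.random(n), 0.0)
punishes_anti = punishes_pro & (rng.random(n) < 0.5)
antisocial = np.where(punishes_anti, 0.5 * prosocial + 0.5 * rng.random(n), 0.0)

# Extensive margin: does the participant punish at all?
extensive = np.corrcoef((prosocial > 0).astype(float),
                        (antisocial > 0).astype(float))[0, 1]

# Intensive margin: amounts spent, among those who use both types of punishment.
both = (prosocial > 0) & (antisocial > 0)
intensive = np.corrcoef(prosocial[both], antisocial[both])[0, 1]
```

Separating the two margins in this way makes it possible to see whether an overall correlation comes from who punishes, from how much punishers spend, or from both.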

The third cluster contains the distributional preferences measures. It features the weakest intracluster correlation. Moreover, both measures in this cluster—Dislike Having More and Dislike Having Less—are moderately correlated with other econographics. These moderate correlations extend to some measures in the risk domain, leading to this component combining with risk preferences in the analysis of all 21 econographics in section IV.D.

Before turning to the PCA, it is worth noting that there is very little in the literature that indicates which specific correlations we should, and should not, find. This could lead to a lot of plausible storytelling. For example, one might believe that the dislike of having more (one form of inequality aversion) entirely motivates altruism and that a dislike of having less (another form) motivates punishment. Yet these patterns are not present in our data. Alternatively, one might expect that the predominant feature of distributional preferences is the presence or absence of a preference for equality, which would generate the observed correlation between Dislike Having More and Dislike Having Less. Whatever one’s priors (or lack thereof), these results should be informative. We discuss what we believe are the biggest takeaways for theory in section VI.

The PCA of these correlations shows the same patterns: results appear in Table 4. Three components are suggested for inclusion under parallel analysis—see figure C.1 for the scree plot. Together, these three components explain 68% of the variation in the eight measures of social preferences we explore here.

Table 4. 

PCA of Social Preferences

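| Measure | Principal Component: Generosity | Principal Component: Punishment | Principal Component: Inequality Aversion | Unexplained Variance |
| --- | --- | --- | --- | --- |
| Reciprocity: Low | .52 | .04 | −.04 | .27 |
| Reciprocity: High | . | | | |
| Antisocial Punishment | −. | | | |
| Prosocial Punishment | .07 | .64 | −.03 | .37 |
| Dislike Having More | .17 | −.24 | .65 | .28 |
| Dislike Having Less | −. | | | |
| Share of variation (%) | 34 | 19 | 16 | 32 |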

Note. First three principal components using the varimax rotation. Weights greater than or equal to 0.25 for those components are in boldface. Measures are as defined in Table 2.


These three components have fairly obvious interpretations. The first is generosity in behaviors that directly influence the well-being of another person. The second is a general affinity for punishment. The third and final component captures both types of inequality aversion. The components are thus named accordingly: Generosity, Punishment, and Inequality Aversion.

Overall, these results show that there is a clear structure to the connections between different social preferences. These measures tend to group in clusters with high within-cluster correlations and low across-cluster correlations, leading to the latent structure shown by the PCA.

C.  Links between Risk Attitudes

In this subsection, we show that models of decision-making under risk and uncertainty may be too parsimonious. As shown by the clear clusters in table 5, Ambiguity Aversion and Compound-Lottery Aversion group together, blurring the line between risk and uncertainty. Moreover, we find two clusters of risk attitudes: one related to WTA and the other to WTP.

Table 5. 

ORIV Correlations of Risk Measures.

Note.—Bootstrapped standard errors from 10,000 simulations are in parentheses. Colors in heat map change with each 0.05 of magnitude of correlation. Measures are as defined in Table 2.

The easiest cluster to interpret contains Ambiguity Aversion and Compound-Lottery Aversion.29 This high correlation is consistent with extant empirical work (Halevy 2007; Dean and Ortoleva 2019; Gillen, Snowberg, and Yariv 2019) and theoretical observation (Segal 1990; Dean and Ortoleva 2019). Note that both of these measures are essentially unrelated to measures of risk attitudes toward ordinary lotteries, suggesting either a delineation between risk and uncertainty, with compound lotteries grouped with ambiguity aversion, or the lack of a clean line between risk and ambiguity.

The two remaining clusters both contain risk attitudes and suggest two separate aspects of risk preferences. The first contains Risk Aversion: Gains, Risk Aversion: Losses, Risk Aversion: Gain/Loss, and WTA, and the second includes Risk Aversion: CR Certain, Risk Aversion: CR Lottery, and WTP. Crucially, there are low correlations between the measures in different clusters (with the exception of the relationship between Risk Aversion: Losses and WTP, a correlation of 0.3, meaning that risk aversion using WTP is negatively correlated with risk aversion over losses). This suggests that there are two separate aspects of risk aversion that are largely unrelated to each other or to preferences under uncertainty. Note that these two aspects are both within a fairly narrow domain: the valuation of risky lotteries. Thus, unlike psychology research, which has shown differences in risk preferences across outcome domains, these patterns suggest differences in preferences within the domain of money lotteries, the most studied domain in choice theory and experimental and behavioral economics.

These two separate aspects of risk preference align with WTA and WTP, respectively, giving this division substantive and theoretical importance. We find that certainty equivalents fall into one cluster while lottery equivalents fall into another. This is in line with prior research finding that different fixed elements in an MPL lead to different degrees of risk aversion. Here, as in the literature, this is consistent with the fixed element of the MPL acting as a reference point (Sprenger 2015). Our results add, first, that these groups are largely uncorrelated with each other and, second, that they naturally align with WTA and WTP. WTA—where the lottery is explicitly the reference point—naturally aligns with certainty equivalents: measures where the lottery is fixed. Similarly, WTP aligns with measures with a fixed monetary amount. At the same time, as we discuss in section VI and more in depth in Chapman et al. (2022b), our data are difficult to reconcile with current theoretical explanations. The most visible signature of this difficulty in table 5 is the lack of an obvious component of loss aversion positively related to WTA and negatively related to Risk Aversion: Gain/Loss.

Before proceeding, it is worth discussing the two measures in the correlation table that we do not include in the clusters or in the following PCA: the endowment effect and the common-ratio effect. Both measures are the difference of two other measures we already include: for example, the endowment effect is the difference between WTA and WTP. Thus, including all three measures in a PCA makes little sense. As described above, our own recent research compiles extensive evidence that WTA and WTP should be treated as separate behavioral measures (Chapman et al. 2022b). Thus, we include the component measures (WTA and WTP) rather than the endowment effect. We take the same approach with the common-ratio measures: entering the constituent measures into the correlation and PCA.30

Once again, the PCA in Table 6 confirms the visual patterns in the correlation table. The first three components explain 66% of the variation in the nine measures of risk preferences considered here. The first and the second clearly capture different aspects of risk attitudes. Following the discussion above, we use the names Risk Aversion: WTA and Risk Aversion: WTP to describe them. The third encompasses more complex lotteries, which may induce uncertainty: hence, we use the name Uncertainty.

Table 6. 

PCA of Risk Preferences

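| Measure | Principal Component: Risk Aversion: WTA | Principal Component: Risk Aversion: WTP | Principal Component: Uncertainty | Unexplained Variance |
| --- | --- | --- | --- | --- |
| Risk Aversion: Gains | . | | | |
| Risk Aversion: Losses | .39 | −.21 | .09 | .45 |
| Risk Aversion: Gain/Loss | .53 | −.09 | .02 | .22 |
| Risk Aversion: CR Certain | .08 | .62 | −.06 | .29 |
| Risk Aversion: CR Lottery | −. | | | |
| Ambiguity Aversion | −.01 | −.04 | .69 | .27 |
| Compound-Lottery Aversion | . | | | |
| Share of variation (%) | 29 | 21 | 17 | 34 |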

Note. First three principal components using the varimax rotation. Weights greater than or equal to 0.25 for those components are in boldface. Measures are as defined in Table 2.


D.  Putting It All Together

Analyzing all 21 econographics together leads to six components. The structure of components from the risk and social domains is preserved; however, one social and one risk component combine. The remaining component comprises the three overconfidence measures. Time preferences load on several components, most heavily on the Punishment component.

Analyzing risk and social preferences, together with overconfidence and time preferences, is straightforward because there are few important relationships between risk and social preferences. However, they are not completely unrelated. One of the social preference components (Inequality Aversion) combines with a risk-preference component (Risk Aversion: WTP). Otherwise, the structure in the previous two subsections is largely unaltered.

The first and second components in Table 7 are similar to the first components in the social and risk analyses, respectively. Thus, we retain the same names. Patience loads moderately on both components: more generous and less risk-averse people are more patient.

Table 7. 

PCA of All Measures

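| Measure | Generosity | Risk Aversion: WTA | Inequality Aversion/WTP | Overconfidence | Impulsivity | Uncertainty | Unexplained Variance |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Reciprocity: Low | . | | | | | −.01 | .29 |
| Reciprocity: High | .51 | .01 | −.02 | −. | | | |
| Antisocial Punishment | −. | | | | | −.03 | .26 |
| Prosocial Punishment | .09 | −.05 | −.04 | −. | | | |
| Dislike Having More | .22 | .00 | .26 | −.18 | −.11 | .17 | .57 |
| Dislike Having Less | −.06 | −.10 | .40 | −. | | | |
| Risk Aversion: Gains | . | | | | −.05 | .06 | .23 |
| Risk Aversion: Losses | −.09 | .36 | −. | | | | |
| Risk Aversion: Gain/Loss | −.01 | .50 | −.11 | −.04 | −.01 | .03 | .24 |
| Risk Aversion: CR Certain | −. | | | | −.02 | −.11 | .38 |
| Risk Aversion: CR Lottery | −.03 | −.03 | .42 | .11 | −.03 | .04 | .56 |
| Ambiguity Aversion | .04 | .00 | −.03 | −. | | | |
| Compound-Lottery Aversion | −. | | | | −.07 | .65 | .30 |
| Share of variation (%) | 14 | 13 | 11 | 8 | 8 | 7 | 40 |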

Note. First six principal components using the varimax rotation. Weights greater than or equal to 0.25 for those components are in boldface. Measures are as defined in Table 2.


The third component is of particular interest. It combines the measures included in the second risk component, Risk Aversion: WTP, with those of the third social component, Inequality Aversion. This suggests that some aspects of social and risk preferences are distinct, while others are more related. Moreover, this relationship follows a clear pattern building upon the previously identified components: Inequality Aversion combines with Risk Aversion: WTP. We note that both of these components deal with aversion to spreads—in possible payoffs or distributional assignments. Thus, we conjecture that a similar form of caution may lead participants to both dislike entering conditions of risk and generically dislike unequal allocations. Note, however, that this pertains only to entering a situation of risk (WTP for a lottery), rather than leaving one (WTA). In line with its constituent parts, we call this component Inequality Aversion/WTP.

The fourth component loads heavily on all three overconfidence measures; naturally, we call it Overconfidence. Given existing work on the conceptual distinctions between different types of overconfidence, the fact that they would load on a single component may not have been ex ante obvious (Moore and Healy 2008).

The fifth is the social component Punishment, to which (a bit of) time preferences have been added. People who score highly on this component enjoy both pro- and antisocial punishment and are impatient. Thus, we dub this new component Impulsivity.

The sixth component is essentially the same as the third risk component. Thus, we continue to call this component Uncertainty.

There are a number of ways to examine the robustness of our results. We discuss two here—changing the number of components and using other methods to extract latent variables—and another—adding and removing variables—in section VIII.

The Inequality Aversion/WTP component is robust to adding or subtracting components. Reducing the number of components to five, as shown in table E.3, causes the Uncertainty component to combine with both the Overconfidence and Impulsivity components but otherwise leaves the structure of the components qualitatively unchanged. Perhaps more interesting is what happens when we add components. With either seven or eight components (the latter recommended by the Kaiser criterion), the Overconfidence component splits into two, as shown in tables E.4 and E.5. Both of these overconfidence components contain overprecision, but one contains overestimation, and the other, overplacement. Adding an eighth component further splits the Generosity component into two, with the reciprocity measures on one component and the altruism and trust measures on the other. The rest of the components are largely unchanged by the inclusion of additional components.

Using other ways of computing the correlation matrix or other latent dimension recovery techniques has very little effect on our results, as shown in tables E.6–E.9. In particular, our results are robust to using unweighted measures or the average of measures (rather than ORIV) to compute the correlation matrix. Moreover, one can use Spearman rank-order correlations, which are robust to nonlinear relationships and outliers, to compute the correlation matrix, and this also results in very little change to the structure shown in Table 7. Finally, one can use factor analysis (on the average of measures), rather than PCA. This, too, produces qualitatively similar results.

Overall, the underlying structure of our data can be summarized with four points. First, six interpretable and separate components explain 60% of the variance of our 21 variables. Second, the social and risk components do not change from the earlier analyses in tables 4 and 6, but two components—Risk Aversion: WTP and Inequality Aversion—collapse into one. Third, time preferences are captured by several of the existing components, possibly signaling that they are related to many elements, although none particularly strongly. Fourth, overconfidence measures are largely separate and are captured by a single component.

V.  Econographics and Other Measures

In this section, we examine the correlation between our econographic components, cognitive abilities, and demographics. This has two potential benefits. First, it may give us clues as to the psychological processes underlying these clusters. Second, it may also provide useful information about economic preferences. For example, understanding how the components relate to demographics might be useful in understanding how preferences might change as a population becomes older or better educated.

A.  Econographics and Cognitive Measures

There are several important relationships between our six econographic components and the two measures of cognition on our survey: a six-question IQ battery and the three-question cognitive-reflection test (CRT; Frederick 2005). These two measures have an ORIV correlation of >0.8.31

Our IQ measure consists of three questions from each of two classes of questions from the International Cognitive Ability Resource, a public domain intelligence measure (Condon and Revelle 2014). We chose three matrix-reasoning questions, similar to Raven’s Progressive Matrices. In these questions, participants determined which of a set of possibilities correctly completed a graphic pattern. We also constructed a second battery of three questions based on three-dimensional rotations: a drawing of a cube was shown, and participants had to identify which of a set of six other drawings of a cube were compatible. These questions were chosen to be of progressively greater difficulty in order to try to capture a variation in fluid intelligence in the general population.32

There are several important relationships between our six econographic components and IQ or the CRT, shown in Table 8. Note that, unlike results in prior sections that were primarily concerned with magnitudes, here positive and negative correlations have very different interpretations. Consistent with the high correlation between the IQ and CRT measures, the general pattern of correlations is largely the same: higher cognitive ability is positively correlated with Generosity and negatively correlated with Inequality Aversion/WTP, Overconfidence, and Impulsivity.33

Table 8. Correlations with Measures of Cognitive Ability

Risk Aversion: WTA    .02    .04

Note. Bootstrapped standard errors from 10,000 simulations are in parentheses. Components are as defined in Table 7.
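The bootstrapped standard errors mentioned in the note can be illustrated with a simple nonparametric bootstrap of a correlation coefficient. This is only a sketch under stated assumptions: the data are simulated, the variable names are hypothetical, and the paper's own bootstrap wraps its full ORIV and PCA pipeline, which is not reproduced here. The number of resamples is also reduced from 10,000 to keep the example fast.

```python
import numpy as np

def bootstrap_corr_se(x, y, n_boot=2000, seed=0):
    """Standard error of corr(x, y) from resampling observations with replacement."""
    rng = np.random.default_rng(seed)
    n = len(x)
    draws = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)   # resample participants, keeping pairs intact
        draws[b] = np.corrcoef(x[idx], y[idx])[0, 1]
    return draws.std(ddof=1)

# Hypothetical component score and cognitive measure for 1,000 participants.
rng = np.random.default_rng(1)
component = rng.normal(size=1000)
iq = 0.3 * component + rng.normal(size=1000)

estimate = np.corrcoef(component, iq)[0, 1]
se = bootstrap_corr_se(component, iq)
print(f"correlation = {estimate:.2f} (bootstrap SE = {se:.3f})")
```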


These results show that, in general, the underlying behavioral components that we have identified are not simple proxies for intelligence. Moreover, there is a more nuanced relationship between our measures of intelligence and the extent to which a participant is “behavioral” (that is, behaves differently than a selfish, risk-neutral expected-utility maximizer). While it is the case that lower measured intelligence is related to higher Impulsivity and Inequality Aversion, it is also related to lower Generosity. Moreover, risk and uncertainty attitudes are largely unrelated to intelligence, with much of the negative relationship between Inequality Aversion/WTP, Overconfidence, and intelligence being driven by distributional preferences.34

B.  Econographics and Demographics

There are interesting relationships in our data between our components and standard economic variables—such as income and education—or demographics—such as age and gender. Table 9 shows these correlations in panel A and the same correlations, controlling for our six-question IQ battery, in panel B.35

Table 9. Correlations with Demographics

                        Income   Education   Age    Male    Attend Church
A. Correlations
Risk Aversion: WTA      .00      .06         .03    −.03    .02
B. Controlling for IQ
Risk Aversion: WTA      −.00     .06         .04    −.03    .02

Note. Bootstrapped standard errors from 10,000 simulations are in parentheses. Components are as defined in Table 7.


Greater education and income are associated with higher Generosity and lower Impulsivity, as was the case with higher cognitive ability. This may be unsurprising, as education and income are often associated with higher cognitive ability. However, as panel B of Table 9 shows, the relationships with education and income remain even after IQ is controlled for. There is no association between income or education and Overconfidence or Inequality Aversion/WTP; however, once we control for cognitive abilities, we obtain that higher education is (slightly) correlated with higher Overconfidence. That is, even though more education increases participants’ own ability and knowledge, it seems to increase their perceptions of these skills even more. This suggests, if anything, that education is not a “cure” for nonstandard preferences.36
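One common way to compute a correlation "controlling for" a third variable is a partial correlation: regress both variables on the control and correlate the residuals. The sketch below illustrates this on simulated data. It is an assumption that this matches the construction of panel B; the variable names, coefficients, and sample are hypothetical.

```python
import numpy as np

def residualize(v, control):
    """Residuals from an OLS regression of v on a constant and the control."""
    X = np.column_stack([np.ones_like(control), control])
    beta, *_ = np.linalg.lstsq(X, v, rcond=None)
    return v - X @ beta

def partial_corr(x, y, control):
    """Correlation between x and y after removing the linear effect of the control."""
    return np.corrcoef(residualize(x, control), residualize(y, control))[0, 1]

# Hypothetical data: IQ raises both education and a Generosity-like score,
# and education has its own direct association with that score.
rng = np.random.default_rng(2)
iq = rng.normal(size=1000)
education = 0.5 * iq + rng.normal(size=1000)
generosity = 0.3 * iq + 0.2 * education + rng.normal(size=1000)

raw = np.corrcoef(education, generosity)[0, 1]
controlled = partial_corr(education, generosity, iq)
print(f"raw = {raw:.2f}, controlling for IQ = {controlled:.2f}")
```

In this simulation the association shrinks but stays positive once IQ is partialled out, which mirrors the qualitative pattern described for education and income.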

Older people are more generous, and men are more overconfident, in accordance with the literature.37 These results only become stronger once we control for cognitive abilities. Additionally, older people exhibit a higher Inequality Aversion/WTP score, while men have a slightly lower one. As with cognitive abilities, we find that neither demographic is related to the other risk component, Risk Aversion: WTA, adding nuance on the relation between these aspects and risk preferences. Finally, frequency of church attendance is positively associated with Generosity. It is also positively associated with Overconfidence, although not statistically significantly at conventional levels.

VI.  Relation to Theory

As suggested in prior sections, commonly used theories can account for some, but not all, of the patterns in our data, even within the social or risk domain. In this section, we spell out the correlations that would be predicted by different theories and the points of difference between theories and our empirical patterns.

A.  Social Preferences

Theories of social preferences can capture some of the moments in our data, but no single theory can explain all the patterns we observe. Here we consider four common theories of social preferences: simple models of altruistic preferences, inequality aversion (Fehr and Schmidt 1999), and reciprocity (Rabin 1993) and a model incorporating the latter two features plus social welfare (Charness and Rabin 2002).38

A simple model of altruistic preferences—in which utility is increasing in one’s own monetary payoff and that of others—predicts a positive correlation between our measures of Altruism, Trust, Reciprocity: High, and Reciprocity: Low, in line with the Generosity component (see tables 3, 4, and 7). This simple model also predicts that more altruistic people would have to be paid more to move to an uneven split in which the other person got less but paid less to move to a split in which the other person got more. That is, it also predicts that measures in the Generosity component should be positively correlated with Dislike Having More and negatively related to Dislike Having Less. There is some evidence for such a pattern in our data; however, it does not reflect the most parsimonious reading, as expressed by PCA. Moreover, a model of altruistic preferences would also predict a negative correlation between Dislike Having More and Dislike Having Less—the opposite of what we find in our data. The altruistic preferences model also predicts that people would accept a reduction in allocation to move away from an even split to one in which another person got more. In our data, the majority of non-indifferent participants require an increase in income (79% of non-indifferent participants in question 1 and 74% in question 2) to move away from the even split, the opposite of the theoretical prediction. Finally, a purely altruistic person would never engage in costly punishment.

Social welfare preferences modify the altruistic preferences model by assuming that the utility a decision-maker gets from the consumption of others grows faster when others are poorer than the decision-maker (Charness and Rabin 2002). Thus, altruistic behavior is governed by two parameters, one that dictates behavior when a person has more than others and one when they have less. Like the altruism model, this model also predicts a positive correlation between Altruism, Trust, Reciprocity: High, and Dislike Having More, as these are all situations in which the decision-maker will (choose to) have more than the other person. In contrast, Reciprocity: Low would be governed by a separate parameter, which would also determine Dislike Having Less and lead to a negative correlation between these two measures. This is not the primary pattern we see in the data. As with altruistic preferences, a person of this type would never engage in costly punishment.

Perhaps the most common characterization of social preferences is Fehr and Schmidt’s model of inequality aversion. This adds two parameters to the standard model: one codes how much the decision-maker dislikes having more than other people and the other how much they dislike having less. As with social welfare preferences, the first of these parameters governs behavior in Altruism, Trust, Reciprocity: High, and Dislike Having More, while Dislike Having Less is governed by the second parameter. The model of Bolton and Ockenfels (2000) would imply similar correlations. Again, this does not match the most parsimonious reading of our data. Moreover, a model of inequality aversion would predict that no money should be returned in the Reciprocity: Low measure, but that is not what happens empirically. Inequality-averse participants might engage in Prosocial Punishment (as this would reduce inequality) but not Antisocial Punishment (as this would increase inequality). This is inconsistent with the strong positive correlation we find between these two measures. The single punishment factor is also inconsistent with evidence from cross-cultural studies on public goods games with punishment, such as Gächter, Herrmann, and Thöni (2010), in which cultures tend to bifurcate between prosocial and antisocial punishment.
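The two-parameter structure of the Fehr and Schmidt model can be made concrete with a short sketch. The utility function below follows the standard two-player form of the model, with `alpha` weighting disadvantageous inequality and `beta` weighting advantageous inequality. The allocation task, the stake of 10, and the parameter values are hypothetical and are not taken from our survey.

```python
def fehr_schmidt_utility(own, other, alpha, beta):
    """Two-player Fehr-Schmidt utility: payoff minus inequality penalties."""
    envy = max(other - own, 0.0)    # disadvantageous inequality (having less)
    guilt = max(own - other, 0.0)   # advantageous inequality (having more)
    return own - alpha * envy - beta * guilt

def best_amount_kept(total, alpha, beta):
    """Amount a decision-maker keeps when splitting `total` with one other person."""
    options = range(total + 1)
    return max(options,
               key=lambda kept: fehr_schmidt_utility(kept, total - kept, alpha, beta))

# Same alpha, different beta: only beta matters when the decision-maker is ahead.
selfish_choice = best_amount_kept(10, alpha=0.8, beta=0.3)
averse_choice = best_amount_kept(10, alpha=0.8, beta=0.7)
print(selfish_choice, averse_choice)
```

Because utility is linear in payoffs, giving in such a task depends only on whether `beta` exceeds one half, regardless of `alpha`. This is why the model ties behavior in all situations where the decision-maker has more to a single parameter.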

Other important theories of social preferences are not tested in our survey. Intention-based models, such as Rabin (1993), apply to only one of our choice environments, the trust game, which has a second player whose actions affect both players’ payoffs. Ideal tests of these models must collect actions, beliefs about others’ actions, and second-order beliefs (e.g., Charness and Dufwenberg 2006), which are hard to measure, especially given the constraints of our survey (Schotter and Trevino 2014). While we do not have ideal measures of intention to test reciprocity, a simple comparison of our items can plausibly lead to a negative correlation between Reciprocity: High (in which participants have been treated fairly) and Reciprocity: Low (in which they have been treated unfairly). This is in contrast to what we find in our data.39

B.  Risk Preferences

Our results exhibit inconsistencies with standard models of preferences under risk and uncertainty. These models typically separate risk and uncertainty, with the distinction being whether probabilities of outcomes are known or unknown (Arrow 1951; Gilboa and Marinacci 2016). The curvature of the utility function, and possibly aspects of non–expected utility, should affect behavior in both domains, while beliefs and ambiguity aversion should affect only choice under uncertainty. In our data, the classic delineation between risk and uncertainty fails to hold: attitudes toward compound lotteries, where probabilities are known, are related to ambiguity aversion, where they are not. This empirical relationship has been shown before, with different possible explanations (Halevy 2007; Dean and Ortoleva 2019; Gillen, Snowberg, and Yariv 2019). Segal (1990) proposes that ambiguous prospects are treated like compound lotteries, in which case probability weighting could explain both phenomena. This theory would further imply a link between Ambiguity/Compound-Lottery Aversion and the Common Ratio effect. However, we do not observe this relationship in our data. An alternative is to argue that compound lotteries are complex objects that people perceive as ambiguous (Dean and Ortoleva 2019). In either case, these results point to a different structure, where compound lotteries are associated with prospects with unknown probabilities.

The split into two separate types of risk preferences for objective, noncompound risk is also difficult to reconcile with models that treat risk aversion as a unitary phenomenon—for example, driven by the curvature of a utility function. As discussed at length in Chapman et al. (2022b), the specific clusters we find can be interpreted using the framework of Sprenger (2015), in which question structure serves as an implicit frame and determines reference points. In particular, Sprenger shows that certainty and probability equivalents yield different levels of risk aversion. He explains this finding by appealing to the different reference points these frames induce. In line with this hypothesis, the two clusters in our data are related to the two explicit frames we administer in the WTA and WTP tasks: certainty equivalents are correlated with WTA, while probability equivalents are correlated with WTP.

A natural question is whether this pattern of correlations is in line with existing theories of reference dependence. Cumulative prospect theory and the Kőszegi-Rabin model predict specific relationships between WTA, WTP, and the Endowment Effect and the three measures Risk Aversion: Gains, Risk Aversion: Losses, and Risk Aversion: Gain/Loss. While deriving these specific predictions and testing them is beyond the scope of this paper, Chapman et al. (2022b) shows that the predicted relationships fail to hold and in some cases are in the opposite direction of predictions from reference-dependence models. Thus, while the reference point induced in these questions is of clear first-order importance, existing theories of reference dependence cannot account for the overall pattern of correlations observed in our data.

VII.  Literature

Before reviewing the few papers that, like ours, examine the correlations between large sets of behavioral regularities, we note that there is a small but significant literature focusing on the connections between only two or three of them. These papers primarily focus on the relationships between different risk preferences or between risk and time preferences; a couple also study the links between social preferences and risk. Broadly speaking, where these studies overlap with our work, they find similar relationships. We discuss this literature in greater detail in appendix D.

Another group of papers study the relationship between risk and/or time preferences and cognitive ability (Burks et al. 2009; Dohmen et al. 2010; Benjamin, Brown, and Shapiro 2013—see Dohmen et al. 2018 for a recent review). In aggregate, the literature suggests that higher cognitive ability is associated with less risk aversion. Our results suggest some nuance, as we find different relationships between cognitive ability and the two components of ordinary risk attitudes we identify.

Our work is differentiated from the studies described above by its large representative sample and measures of multiple behavioral regularities. This allows us to identify the complete correlational structure without worries about confounds from differing study designs or populations. However, two contemporaneous and complementary projects also have multiple behavioral measures and use representative populations. Stango and Zinman (forthcoming) measure a broad set of 17 behavioral factors in a representative sample of US adults to study, in part, the relationships between them. Falk et al. (2018) survey 80,000 adults across 80 countries to document patterns of six behaviors. Two additional studies use multiple elicitations in nonrepresentative populations.40 These studies differ from ours in both purpose and implementation: they study different measures, do not use incentives, and do not take the steps that we do to eliminate the attenuating effects of measurement error.

Contemporary with our work, Stango and Zinman (forthcoming) also study the correlation between behavioral economic phenomena, as well as that between those phenomena, demographics, and life outcomes. They measure behavioral biases in six classes. Two of these—risk and uncertainty biases and overconfidence—are closely related to measures in our analysis, while the other four classes are not.41 Stango and Zinman also measure what they call “behavioral inputs,” three of which have equivalents in our analysis: cognitive skills, risk aversion, and patience. Unlike our study, Stango and Zinman’s does not measure social preferences.

Despite various differences in approach, there is significant consistency between our results and those of Stango and Zinman concerning concepts modeled in both studies.42 Both find strong correlation between three types of overconfidence: overestimation, overprecision, and overplacement. In the risk domain, Stango and Zinman measure ambiguity aversion, loss aversion (similar to our mixed risk), and preference for certainty (similar to our common-ratio measure). As with our study, they find them to be weakly, and in some cases negatively, correlated. In both studies, patience shows no strong correlation with either risk-aversion measures or overconfidence. Both studies also find cognitive skills to be negatively related to overconfidence, positively related to patience, and only weakly related to risk preferences.

On the basis of their factor analysis, Stango and Zinman propose a structure with four factors: two “processing” factors—related to biased beliefs and choice mistakes—and two “preference factors”—related to discounting and risk attitudes. They find that cognitive skills are strongly and negatively associated with the two processing biases, weakly associated with the discounting factor, and unrelated to the risk factor. Some similar patterns emerge in our study. Our overconfidence factor is a subset of the biased-belief factor of Stango and Zinman and is also negatively associated with cognitive skills. Our impulsivity factor is similar to their discounting factor, and we also find a negative relation between it and cognitive skills, albeit a stronger one than do Stango and Zinman. Unlike Stango and Zinman, we find two risk components.

Two other recent papers have some overlap with our work. Falk et al. (2018) collect a subset of the measures we include here: patience, risk, positive reciprocity, punishment, altruism, and trust. Each is measured through a combination of qualitative self-reports and hypothetical money questions. Three of the four measures of social preferences—“altruism,” “positive reciprocity,” and “trust”—are highly correlated, a result that is reproduced in our data.43 The purposes of that paper are simply different from ours. For example, they are interested in a huge range of 80 countries, differences between countries, and cross-country associated variables (such as language and religion). Their measures are also unincentivized.

An immediate predecessor of this paper is Dean and Ortoleva (2019), which studies relationships between many of the same behaviors in a sample of 180 Brown undergraduates. These students are likely to be less heterogeneous on some dimensions, such as cognitive ability. Dean and Ortoleva focus more on risk and time and less on social preferences. Broadly speaking, where these two studies use similar measures, they tend to find similar relationships. However, these correlations represent a limited subset of the 210 we examine here: 45 for Dean and Ortoleva and 15 for Falk et al. (2018). An exception to these similarities is that Dean and Ortoleva find a strong positive relationship between the endowment effect and loss aversion for risky choice, while for the closest measure of loss aversion (Risk Aversion: Gain/Loss) in our data, we find a strong negative relationship. This is discussed in great detail in Chapman et al. (2022b).

In recent years, others have collected information on basic economic preferences and personality and noncognitive skill variables (see, e.g., Becker et al. 2012; Jagelka 2020).44 These investigations follow an increase in interest in the links between personality and economic outcomes (Almlund et al. 2011; Heckman, Jagelka, and Kautz 2019). There are still open questions in this evolving empirical literature about how to correct for measurement error and how strongly correlated measures of all these variables are across time (Morris et al. 2021 show low cross-time correlations for many noncognitive skills). Including personality and noncognitive skills is obviously important, and we are eager to see the accumulation of knowledge from other efforts, like those above, that include more and different variables.

As all of these studies are trying to understand a complicated question of first-order importance to behavioral economics (How are many different behaviors related?), it is natural that there would be some overlap. Because this question is both fundamental and complicated and involves a large number of possible measures, it is not likely that there will be a single, definitive study soon. Instead, we believe that these many studies trace out different features of this complicated enterprise. The ideal cumulative scientific process, in our opinion, is what this area of research is currently working toward: creating a set of studies, all with distinctive strengths, that can be generally assessed as results are published.

VIII.  Conclusion

We elicit 21 econographics from a representative sample of 1,000 US adults in order to create an empirical basis for an underlying structure of more comprehensive theories of behavioral decision-making. We identify six interpretable econographic components that explain a large fraction of the variance in these 21 econographic variables: Generosity, Risk Aversion: WTA, Inequality Aversion/WTP, Overconfidence, Impulsivity, and Uncertainty. These components suggest that representations more parsimonious than current theories of social preferences are possible but that canonical theories of risk preferences are perhaps too parsimonious. Moreover, they suggest limited, and nuanced, connections between risk and social preferences. By studying the relationship between the components we identify and cognitive measures and demographics, we document several stylized facts that may be useful for theorizing.

A strength of our study is the number of behaviors included in our analysis. However, the behaviors we included were limited both by survey time and by the current literature. These, in turn, present limits for our analyses. A nuanced view of these limitations comes from thinking about what would happen if we had included more, or fewer, measures, which we do in the remainder of this paper. This exercise also speaks to the robustness of our results.

Including an elicitation extremely similar to “Risk Aversion: Gains” has little qualitative effect on our conclusions, as shown in table E.10. This extremely similar measure loads heavily on the second component in Table 7, and this component now becomes the first (in terms of percent of variation explained).45 Thus, the ranking of components may respond to the inclusion or exclusion of measures. Consequently, we have not attached meaning to the ordering in the text.

Removing Dislike Having More reduces the number of components in Table 4 to two but has little qualitative effect on the overall analysis of Table 7, as shown in table E.11.46 Dislike Having Less still combines with the variables in the Risk Aversion: WTP component to form Inequality Aversion/WTP. Thus, it appears that minor perturbations are not particularly consequential, although these exercises have little to say about larger changes.

The robustness here stems from the fact that most correlations between measures—displayed, for example, in tables 3 and 5—are either large or close to zero.47 There are a few middling correlations—such as those between Dislike Having More and other social preference measures. These are the likely sources of fragility in PCA. Adding a variable with a number of middling correlations may cause parallel analysis to suggest the inclusion of an additional component, and this inclusion may lead to extensive changes in existing components. Consequently, the components are useful for making sense of the large correlation matrices in our analysis. However, the correlation matrices themselves are the most robust—and consequential—part of the analysis, as these are the fundamental patterns upon which latent variable models, such as PCA, are built.
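For readers unfamiliar with parallel analysis, the sketch below shows one common variant (Horn's procedure): components are retained while their eigenvalues exceed those obtained from pure-noise data of the same dimensions. The use of normal noise, the 95th-percentile threshold, the number of simulations, and the simulated data are all assumptions for illustration and may differ from the implementation behind our tables.

```python
import numpy as np

def sorted_eigenvalues(data):
    """Eigenvalues of the correlation matrix of `data`, largest first."""
    corr = np.corrcoef(data, rowvar=False)
    return np.sort(np.linalg.eigvalsh(corr))[::-1]

def parallel_analysis(data, n_sims=200, quantile=0.95, seed=0):
    """Count leading components whose eigenvalues beat a noise benchmark."""
    rng = np.random.default_rng(seed)
    n, k = data.shape
    observed = sorted_eigenvalues(data)
    simulated = np.empty((n_sims, k))
    for s in range(n_sims):
        simulated[s] = sorted_eigenvalues(rng.normal(size=(n, k)))
    threshold = np.quantile(simulated, quantile, axis=0)
    keep = observed > threshold
    # Index of the first component that fails the test equals the number retained.
    n_components = k if keep.all() else int(np.argmin(keep))
    return n_components, observed, threshold

# Hypothetical data: two latent traits, three noisy measures of each.
rng = np.random.default_rng(3)
latent = rng.normal(size=(500, 2))
measures = np.repeat(latent, 3, axis=1) + 0.7 * rng.normal(size=(500, 6))

n_components, observed, threshold = parallel_analysis(measures)
print("components retained:", n_components)
```

Here every correlation is either large (within a cluster) or near zero (across clusters), so the retained count is stable. A variable with several middling correlations would push an additional eigenvalue toward the threshold, which is the source of fragility discussed above.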

Overall, this discussion suggests that our main conclusion—that there is an underlying structure to our measures that is informative for theorizing—is robust. However, we note that our findings relate to fairly specific domains of economic behaviors: choices over money lotteries, time, the distribution of resources between two people, and beliefs about oneself and others. Adding more measures might create broader clusters that increase the average number of measures per component or lead to a more diffuse set of underlying dimensions. Moreover, examining the relationship between clusters and consumer behavior, as in Stango and Zinman (forthcoming), might, in the context of specific applications, increase the salience of particular relationships and/or the importance of some components. Whatever the outcome of such explorations, it is worth noting that a failure to reduce all economically relevant behaviors to just a handful of components is not a failure of behavioral economics. Chemistry has been incredibly successful with more than 100 elements. In econographics, we have barely started down that path. It should be our goal to accurately and adequately describe economic behavior with no more, and no fewer, components than necessary. This study takes a step in that direction.

Data Availability

Code and data for replicating the tables in this article can be found in Chapman et al. (2022a) in the Harvard Dataverse.


We thank Douglas Bernheim, Ben Enke, Benedetto De Martino, Stefano DellaVigna, Xavier Gabaix, Daniel Gottlieb, Eric Johnson, David Laibson, Graham Loomes, Ulrike Malmendier, Matthew Rabin, Jörg Spenkuch, Victor Stango, Dmitry Taubinsky, Peter Wakker, Michael Woodford, Jonathan Zinman, and the participants of seminars and conferences for their useful comments and suggestions. Daniel Chawla provided excellent research assistance. Camerer, Ortoleva, and Snowberg gratefully acknowledge the financial support of National Science Foundation grant SMA-1329195. This research was conducted under Caltech Institutional Review Board approval ES-408. This paper was edited by John List.

1 See, e.g., Fudenberg (2006), Levine (2012), Kőszegi (2014), Bernheim (2016), and Chang and Ghisellini (2018). There are many important defenses of the wide range of models, including that they may be the most accurate representation of behavior; see, e.g., Conlisk (1996), Kahneman (2003), DellaVigna (2009), Chetty (2015), and Thaler (2016).

2 For example, suppose that a theory predicts a relationship between behaviors A and B that have correlation 0.5. Another theory connects behaviors B and C; a different study finds a correlation of 0.5. Even if there is no theoretically predicted relationship between A and C, their correlation (which can be anywhere between −0.5 and 1, the range for which the three-variable correlation matrix remains positive semidefinite) is important to understanding the true structure. If the correlation is 1, then A and C are redundant, so at most one of the two prior theories is correct. If the correlation is −0.5, then B=A+C (for standardized A and C), and the two theories should be complementary. If the correlation is 0.5, this suggests that an underlying latent factor determines all three behaviors, and the focus should be on developing a theory consistent with it.

3 Moreover, a correlation that might seem large in isolation (i.e., statistically significant) may be quite small in the context of other correlations.

4 We also used machine-learning techniques that are more sophisticated than PCA, such as clustering, but these did not produce additional insight. See app. B.

5 Note that this finding is distinct from the different “domains” of risk attitudes often discussed in psychology—see Weber and Johnson (2008) for a review and Barseghyan, Prince, and Teitelbaum (2011) and Einav et al. (2012) for applications in economics—as we examine only the domain of lotteries. Our findings are also distinct from economic studies that document poor correlations between different elicitations of risk attitudes—see Friedman et al. (2014) for a review—as we document a particular pattern of both high and low correlations between measures of risk attitudes.

6 While an extensive literature has studied different forms of overconfidence and their differences, to our knowledge the finding of a common component, in a large representative survey, is new evidence about this phenomenon (Moore and Dev 2020).

7 Outcome-based models are those in which utility is based only on the outcomes received by all players. This is in contrast to models where utility might be based on the set of actions that could have been taken by other players, such as the fairness model of Rabin (1993).

8 Additionally, recovering a parameter of a behavioral model through nonlinear transformations of quantities measured with error is problematic. For example, some participants state relatively high, or low, certainty equivalents for lotteries that result in huge (positive and negative) CRRA coefficients. These values dominate correlations, causing them to be more informative about measurement error than about behavior. Our data can be found in the replication bundle (Chapman et al. 2022a), and researchers interested in particular parametric formulations can use them to test those theories, if they so desire.
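To illustrate the instability described in this note, the sketch below backs out the CRRA coefficient implied by a stated certainty equivalent for a simple lottery, using bisection. The lottery (equal chances of 1 and 10), the stated values, and the function names are hypothetical and are not taken from our elicitations.

```python
import math

def crra_certainty_equivalent(rho, low, high, p_high=0.5):
    """Certainty equivalent of a two-outcome lottery under CRRA utility."""
    if abs(rho - 1.0) < 1e-12:
        # Log utility: the certainty equivalent is the weighted geometric mean.
        return math.exp(p_high * math.log(high) + (1 - p_high) * math.log(low))
    a = 1.0 - rho
    # Log of the expected value of x**a, computed stably (log-sum-exp).
    terms = [math.log(p_high) + a * math.log(high),
             math.log(1 - p_high) + a * math.log(low)]
    m = max(terms)
    log_mean = m + math.log(sum(math.exp(t - m) for t in terms))
    return math.exp(log_mean / a)

def implied_crra(ce, low, high, lo_rho=-100.0, hi_rho=100.0):
    """Bisection for the rho whose certainty equivalent matches the stated one."""
    for _ in range(200):
        mid = 0.5 * (lo_rho + hi_rho)
        if crra_certainty_equivalent(mid, low, high) > ce:
            lo_rho = mid   # model CE too high: more risk aversion is needed
        else:
            hi_rho = mid
    return 0.5 * (lo_rho + hi_rho)

# Hypothetical lottery: 50% chance of 10, 50% chance of 1 (expected value 5.5).
for stated in (9.5, 5.5, 3.0, 1.5, 1.05):
    print(stated, round(implied_crra(stated, 1.0, 10.0), 2))
```

Stated values close to either lottery outcome map into coefficients many times larger in magnitude than those implied by interior values, so a small amount of response noise near the endpoints can dominate a correlation computed on the recovered parameters.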

9 We also included dominated options at the endpoints of the MPL scales wherever possible. The undominated options included in these rows with the dominated options were preselected, following Andreoni and Sprenger (2012). The software also imposed a single crossing point.

10 These two types of reciprocity are sometimes referred to as positive and negative reciprocity. However, negative reciprocity is usually defined as the response to the lowest possible action. In this case, the lowest action is sending no money, to which the receiver cannot respond. Thus, to avoid confusion with the standard usage, we use “low” and “high” rather than “negative” and “positive.” Note also that each participant’s partner in these interactions differs between when they are the sender and when they are the receiver. Moreover, when the participant plays the role of receiver, we use the strategy method: i.e., we elicit the response for every possible amount sent. For more details of implementation, see app. A.

11 The most familiar theoretical model of these patterns of valuation is prospect theory (Kahneman and Tversky 1979).

12 We check the robustness of this analysis to adding the single Common Ratio measure in table E.2.

13 The risk measure here is Risk Aversion: Urn, which matches the description above. Empirically, this measure is highly correlated with Risk Aversion: Gains. Note that both Compound-Lottery Aversion and Ambiguity Aversion difference out the same quantity. If this quantity is measured with error, this can create a spurious correlation between Compound-Lottery Aversion and Ambiguity Aversion. This issue and our solution are discussed in sec. III.B.

14 Note that as both Overprecision and Overplacement refer to the same factual questions, this will create correlated measurement error between them. This can create spurious correlation between the two measures. This issue and our solution are discussed in sec. III.B.

15 In an earlier survey, we attempted to measure present bias but found little or no evidence of it, similar to some recent studies that attempt to elicit present bias using both financial payments and effort (see, e.g., Augenblick, Niederle, and Sprenger 2015; Imai, Rutter, and Camerer 2021). This may be due to the fact that points in our study were not (usually) instantly convertible into consumption. Thus, we did not attempt to measure it here. A general concern about experiments that attempt to measure time preferences is whether the participants trust the experimenter to follow through with payment (Andreoni and Sprenger 2012). Using a survey company with established relationships with its panelists seems to have largely mitigated this, as discount rates were quite low (correspondingly, the discount factor δ was quite high).

16 Companies that use in-person or phone surveys, such as Gallup, were unable to administer incentives.

17 As economists rarely run their own surveys in representative populations, it is worth explaining how the survey research literature uses the term “representative.” With few exceptions—censuses, samples in rural areas of developing countries based on a census—representative samples are representative on observables, not on unobservables. While random samples have the potential to be representative on both observables and unobservables, low response rates render these samples less representative on both observables and expressed preferences, as the Pew study documents. Commonly used representative surveys in economics, such as the Current Population Survey (CPS), use weighting to account for nonresponse. The CPS also uses imputation to adjust for item nonresponse, which is not present in our survey.

18 We chose to pay for two randomly selected questions to increase the stakes while making fewer participants upset about their payoffs. Paying for two questions instead of one may theoretically induce some wealth effects, but these are known to be negligible, especially in an experiment such as ours (Charness, Gneezy, and Halladay 2016). Paying for randomly selected questions is incentive compatible under expected utility but not necessarily under more general risk preferences, where it is known that no such mechanism may exist (Karni and Safra 1987; Azrieli, Chambers, and Healy 2018). A growing literature suggests that this theoretical concern may not be empirically important (Beattie and Loomes 1997; Cubitt, Starmer, and Sugden 1998; Hey and Lee 2005; Kurata, Izawa, and Okamura 2009), but there are some exceptions (Freeman, Halevy, and Kneeland 2019).

19 The conversion from points to awards can be done only at specific point values, which leads to a slightly convex payoff schedule. This is of little concern here, as these cash-out amounts are farther apart than the maximum payoff from this survey.

20 Our approach to the design of the multiple measures, as well as using bootstrapped standard errors in all specifications, follows directly from the recommendations in Gillen, Snowberg, and Yariv (2019). When we do not have an available duplicate for one of the measures, we use an approximation of ORIV detailed in n. 31 of that paper. Bootstrapped standard errors do not rely on assumptions about asymptotics, which may be particularly troublesome when dealing with correlations.
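
As a rough illustration of a bootstrapped standard error for a correlation, the sketch below resamples participants (pairs of observations) with replacement and takes the standard deviation of the resampled correlations. The exact bootstrap procedure used in the paper is not described in this note, so the pairs bootstrap, the function names `pearson` and `bootstrap_se`, the number of resamples, and the simulated data are all assumptions made for the example.

```python
import math
import random


def pearson(x, y):
    """Sample Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def bootstrap_se(x, y, reps=500, seed=0):
    """Pairs bootstrap: resample participants with replacement (assumed scheme)."""
    rng = random.Random(seed)
    n = len(x)
    draws = []
    for _ in range(reps):
        idx = [rng.randrange(n) for _ in range(n)]
        draws.append(pearson([x[i] for i in idx], [y[i] for i in idx]))
    mean_draw = sum(draws) / reps
    # Standard deviation of the resampled correlations is the bootstrap SE.
    return math.sqrt(sum((d - mean_draw) ** 2 for d in draws) / (reps - 1))


# Hypothetical data: 200 participants, two positively related measures.
rng = random.Random(1)
x = [rng.gauss(0, 1) for _ in range(200)]
y = [0.5 * a + rng.gauss(0, 1) for a in x]

r = pearson(x, y)
se = bootstrap_se(x, y)
print(f"correlation = {r:.3f}, bootstrap SE = {se:.3f}")
```

Because the standard error comes from the empirical spread of resampled estimates, no asymptotic formula for the sampling distribution of the correlation is needed.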

21 Formally, $X^i$ and $Y^i$, $i \in \{1,2\}$, are two elicitations of $X^*$ and $Y^*$ measured with error. For a given $i$, $X^i$ and $Y^i$ are constructed by subtracting the same quantity (measured with error) from two different quantities. Thus, the measurement errors in $X^i$ and $Y^i$ are correlated for a given $i$, but the measurement errors in $X^i$ and $Y^j$, $i \neq j$, are not. The stacked regression in ORIV is then modified to become $\begin{pmatrix} Y^a \\ Y^b \end{pmatrix} = \begin{pmatrix} \alpha_1 \\ \alpha_2 \end{pmatrix} + \beta \begin{pmatrix} X^a \\ X^b \end{pmatrix} + \eta$, with instruments $W = \begin{pmatrix} X^b & 0_N \\ 0_N & X^a \end{pmatrix}$.
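
The following sketch illustrates why instrumenting each elicitation with the other replicate removes the bias from the shared error. It computes only the two-stage least squares slope for a stacked system of this kind (one intercept per replicate, the first replicate of X instrumented by the second and vice versa), written in closed form for a single regressor. It is not the authors' implementation: the function names, the simulated data-generating process, and the parameter values are assumptions, and standard errors and the conversion to correlations are omitted.

```python
import random


def cov(u, v):
    """Sample covariance of two equal-length lists."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (n - 1)


def stacked_iv_slope(xa, xb, ya, yb):
    """2SLS slope: ya on xa (instrument xb) stacked with yb on xb (instrument xa)."""
    c_ab = cov(xa, xb)
    pi_a = c_ab / cov(xb, xb)  # first stage: xa regressed on xb
    pi_b = c_ab / cov(xa, xa)  # first stage: xb regressed on xa
    numerator = pi_a * cov(xb, ya) + pi_b * cov(xa, yb)
    denominator = (pi_a + pi_b) * c_ab
    return numerator / denominator


def simulate(n, beta, seed):
    """Hypothetical data: within a replicate, X and Y share one error term."""
    rng = random.Random(seed)
    xa, xb, ya, yb = [], [], [], []
    for _ in range(n):
        x_star = rng.gauss(0, 1)
        y_star = beta * x_star + rng.gauss(0, 0.5)
        # Error in the common quantity subtracted from both measures.
        shared_a, shared_b = rng.gauss(0, 0.7), rng.gauss(0, 0.7)
        xa.append(x_star + rng.gauss(0, 0.7) - shared_a)
        ya.append(y_star + rng.gauss(0, 0.7) - shared_a)
        xb.append(x_star + rng.gauss(0, 0.7) - shared_b)
        yb.append(y_star + rng.gauss(0, 0.7) - shared_b)
    return xa, xb, ya, yb


true_beta = 0.2
xa, xb, ya, yb = simulate(20000, true_beta, seed=0)

beta_naive = cov(xa, ya) / cov(xa, xa)  # OLS within one replicate
beta_iv = stacked_iv_slope(xa, xb, ya, yb)
print(f"true = {true_beta}, naive OLS = {beta_naive:.3f}, stacked IV = {beta_iv:.3f}")
```

In this simulation the within-replicate OLS slope is pushed upward by the shared error, while the cross-replicate instruments are uncorrelated with it, so the stacked IV slope should land near the true value.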

22 See Abdi and Williams (2010) for an introduction. We also tried more sophisticated machine-learning techniques, such as clustering, but these produced no additional insight. See app. B.

23 The value of additional components is graphically represented by a “scree” plot. This shows the eigenvalues of the correlation matrix of the underlying data and compares these with the average value of the eigenvalues produced by parallel analysis. The scree plots for our analysis can be found in app. C.
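
A minimal sketch of parallel analysis, using only the Python standard library, is given below. It compares the sorted eigenvalues of the observed correlation matrix with the average eigenvalues obtained from pure-noise data of the same dimensions, and keeps the leading components that exceed the noise benchmark. The retention rule, the number of noise replications, the Jacobi eigenvalue routine, and the simulated two-cluster data are assumptions chosen for illustration; they are not taken from the paper.

```python
import math
import random


def correlation_matrix(rows):
    """Pearson correlation matrix of the columns of a list of rows."""
    n, p = len(rows), len(rows[0])
    z = []
    for j in range(p):
        col = [r[j] for r in rows]
        m = sum(col) / n
        norm = math.sqrt(sum((v - m) ** 2 for v in col))
        z.append([(v - m) / norm for v in col])
    return [[sum(a * b for a, b in zip(z[i], z[j])) for j in range(p)]
            for i in range(p)]


def jacobi_eigenvalues(matrix, sweeps=50):
    """Eigenvalues of a symmetric matrix via cyclic Jacobi rotations."""
    n = len(matrix)
    a = [row[:] for row in matrix]
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < 1e-20:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(a[p][q]) < 1e-15:
                    continue
                theta = 0.5 * math.atan2(2 * a[p][q], a[q][q] - a[p][p])
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # rotate columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # rotate rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
    return sorted((a[i][i] for i in range(n)), reverse=True)


def parallel_analysis(rows, reps=20, seed=0):
    """Return observed eigenvalues, noise benchmark, and components kept."""
    rng = random.Random(seed)
    n, p = len(rows), len(rows[0])
    observed = jacobi_eigenvalues(correlation_matrix(rows))
    totals = [0.0] * p
    for _ in range(reps):
        noise = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
        for k, ev in enumerate(jacobi_eigenvalues(correlation_matrix(noise))):
            totals[k] += ev
    reference = [t / reps for t in totals]
    keep = 0
    for obs, ref in zip(observed, reference):
        if obs <= ref:
            break
        keep += 1
    return observed, reference, keep


# Hypothetical data: six variables forming two clusters of three.
rng = random.Random(42)
rows = []
for _ in range(300):
    f1, f2 = rng.gauss(0, 1), rng.gauss(0, 1)
    rows.append([f1 + rng.gauss(0, 0.5) for _ in range(3)]
                + [f2 + rng.gauss(0, 0.5) for _ in range(3)])

observed, reference, n_keep = parallel_analysis(rows)
print("observed :", [round(v, 2) for v in observed])
print("reference:", [round(v, 2) for v in reference])
print("components kept:", n_keep)
```

Plotting `observed` against `reference` by component rank would give the kind of comparison that a scree plot with a parallel-analysis benchmark displays.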

24 We have chosen to present our results first, then relate them to previous empirical results afterward. Given the number of results here, it is difficult to preview them all, relate them to the literature, and then go back and describe them all in detail.

25 One could instead code people as having “expected” or “unexpected” behavior on a dimension or define a point on each dimension that is “neutral” (such as risk neutral) and then measure “biases” in either direction from that point. These approaches are closer to that of Stango and Zinman (forthcoming). As our focus is the behavior itself, rather than how behavior compares to predictions of a model, we examine behaviors linearly in the scale in which they are measured.

26 Gillen, Snowberg, and Yariv (2019) do not report data from overconfidence measures.

27 The shading is driven by a concave function of magnitude so that there is more differentiation between magnitudes of 0 and 0.25 than there is between 0.25 and 1. We also show both the upper and lower parts of the symmetric correlation matrix.

28 As a reminder, both refer to costly punishments meted out to participants in a sender-receiver game where the sender sent his or her entire endowment and the receiver returned none. Prosocial punishment refers to the amount used to punish the receiver, and antisocial punishment refers to the amount used to punish the sender.

29 Note that these come from the certainty equivalent of a draw from an ambiguous (Ellsberg) or a compound urn minus the certainty equivalent from a 50/50 risky urn. In order to prevent spurious correlation, we adapt the ORIV procedure as described in sec. III.B.

30 Entering the common-ratio measure directly in the PCA produces results qualitatively similar to table 6; see table E.2.

31 Note that our survey also contained two instantiations of the “beauty contest” measure of strategic sophistication (Nagel 1995). Unfortunately, as in Snowberg and Yariv (2021), we found that the distribution of responses is relatively uniform, with a spike in responses at 50 (out of 100), as shown in fig. E.1. It is difficult to discern from this pattern who is strategically sophisticated. Thus, we did not include these measures in our analysis.

32 Condon and Revelle (2014) contains the percentage of people who typically get each of their questions correct. We used this information to attempt to cut the population at the 20th, 40th, 60th, 80th, 90th, and 95th percentiles. We were largely successful, with 21% answering no questions correctly, 24% answering one correctly, 24% answering two correctly, 15% answering three correctly, 7% answering four correctly, 5% answering five correctly, and 4% answering all six correctly.

33 Note that perceptions of how well one did on the IQ tasks are used to calculate one of the three main components of the Overconfidence measure and also have some weight in the other components, which might result in some spurious correlations. This is unlikely to be a matter of great concern, as the pattern of correlations with the CRT is largely the same. Moreover, this is consistent with the so-called Kruger and Dunning (1999) effect. That is, poor performers also have poor metacognition and think that their relative performance is not as bad as it is, leading to negative correlation between confidence and performance (see also Krueger and Mueller 2002).

34 Table E.1 shows the relationships between individual econographics and cognitive measures.

35 Table E.1 shows the relationships between individual econographics and demographics.

36 Further, we note that Stango and Zinman (forthcoming) find larger variation in the degree to which they classify people as “behavioral” within educational strata than across them.

37 Although not significant after a Bonferroni correction, this result is significant without one. As this fact has been previously shown, e.g., in Ortoleva and Snowberg (2015b), there is little risk of this being a spurious correlation.

38 Charness and Rabin (2002) take an approach similar to ours, as they parameterize different components of social preference to see which ones have the most predictive power. But they explicitly “have not tested for individual differences and correlation across games, and neither our analysis nor our model deals with heterogeneity of subject preferences” (849). See also Bruhin, Fehr, and Schunk (2019).

39 A related class of theories models social image concerns, in which behavior depends delicately on what people think others know about their behavior (e.g., Andreoni and Bernheim 2009; Lazear, Malmendier, and Weber 2012). Establishing common knowledge and credibility is crucial for the validity of such measures but very difficult to do in representative surveys.

40 Becker et al. (2012) measure risk aversion, time preferences, trust, and altruism, using incentivized questions in a representative sample. However, they do not report the correlations between these behaviors, as their goal is to study the links between these behaviors and personality. Burks et al. (2009) make use of a large-scale experiment carried out on a group of newly recruited truck drivers. The authors use parametric methods to measure risk aversion, short-term and long-term discounting (through a beta-delta model), and behavior in a sequential two-person prisoner’s dilemma (similar to our trust game). They find a statistically significant (though quantitatively small) relationship between risk attitudes, patience, and sender behavior in the trust game: more patient people tended to be less risk averse and send more in the trust game.

41 These four are present bias, inconsistent and dominated choices, math/statistical biases, and limited attention and memory.

42 Out of the 210 correlations we examine, about 20 are common between our study and Stango and Zinman’s. Major differences include that Stango and Zinman use exploratory factor analysis rather than PCA for dimension reduction; typically do not incentivize their measures; and use different, though sometimes related, measures.

43 Correlations between these preferences are generally lower than what we observe (all are below 0.33). This may be due to the attenuating effects of measurement error or to the fact that they examine a more diverse population (across 80 countries).

44 A typical list of such skill variables—also called soft skills or character skills—is behavioral problems, social skills, communication, self-esteem, persistence, locus of control, empathy, impulsivity, and personality (see Morris et al. 2021).

45 This measure, Risk Aversion: Urn, is described in n. 13. Note that this measure also loads on the uncertainty component, although the correlations that lead to this loading are spurious, as Risk Aversion: Urn is used in constructing both of the measures that comprise the uncertainty component.

46 The number of components is determined, as above, with parallel analysis. In analyzing just the social preference measures, Dislike Having Less loads on the punishment component.

47 The preceding two paragraphs are not comprehensive tests of robustness. Rather, they illustrate patterns in the data to build intuition about how PCA responds to small changes in which econographics are included.