Preliminary draft 1/05/01
All rights reserved
Coherent and Incoherent Valuation:
A Problem With Contingent Valuation of Cultural Amenities
Cass R. Sunstein*
Consider the following contingent valuation studies:
(b) How much would you be willing to pay to ensure the continued operation of an art museum in your city?
(b) How much would you be willing to pay to ensure the continued operation of an art museum in your city?
I believe that for the art museum, the amount that will be elicited in study (1) will be higher than the amount elicited in study (2). I also believe that for the same museum, study (3) will elicit an amount equal to or less than that elicited in study (1).
In making these claims, I aim to describe a new problem with contingent valuation studies, and to suggest the nature and severity of the difficulties in surmounting it. The source of the problem, in brief, is category-bound thinking. When people explore particular problems in isolation, they normalize them by comparing them to a cognitively accessible comparison set, consisting of cases from the same basic category. When cases from other categories are introduced into the picture, people’s judgments can be greatly affected, because the process of normalization is disrupted. The upshot is that if cases are assessed in isolation, people will produce a pattern of outcomes that will be incoherent – that will make no sense by their own lights. It is for this reason that there is often a difference between people’s judgments of items when they evaluate them separately (SE, for separate evaluation) and their judgments of the same items when they evaluate them jointly (JE, for joint evaluation). In JE, there can even be judgment reversals as compared to what emerges in SE – judgment reversals in the sense that A can be valued more highly than B in SE, but B more highly than A in JE.
People also have an inchoate but generally unmistakable ranking of categories. In the criminal context, they tend to believe that murder is worse than rape, that rape is worse than assault, and that assault is worse than theft. In the context of contingent valuation, people tend to believe that health problems are more serious than ecological problems, and that risks to life are more serious than risks to cultural amenities. Problems arise – in the sense that people are not sure how to answer – when people attempt to compare a low-ranking problem from a high-ranking category with a high-ranking problem from a low-ranking category. Here people do not easily make cross-category comparisons.
I will urge that these points present a formidable challenge for those who seek to use contingent valuation to come up with monetary amounts for cultural amenities. My simplest suggestion is that contingent valuation (CV) studies will produce less than helpful and possibly even worthless outcomes if they attempt to ask people about cultural amenities by posing single questions in the form of question (1) above. Indeed, any CV study that asks people about cultural amenities in particular, without introducing cases from other categories, will have serious problems. It is possible to go further. CV studies that ask people questions about cultural amenities are likely to produce significantly inflated outcomes, at least from the standpoint of conventional economic accounts of value. The reason is that cultural amenities tend to be part of relatively low-ranking categories, and hence people will naturally produce high monetary values simply because they will not be thinking of other categories that rank higher.
This does not prove that in the context of cultural amenities, CV studies are necessarily a waste of time. The challenge is to overcome the effects of category-bound thinking without exceeding the constraints and capacities of experimenters and subjects. I do not know if this challenge can be surmounted. One difficulty is that any effort to introduce cases from other categories creates a risk of bias and manipulation: Which cases, and which categories, should subjects be asked to explore as well? If cases from unusually low-ranking categories are introduced, the amounts for cultural amenities might be artificially inflated; if cases from unusually high-ranking categories are introduced, the amounts might be artificially reduced. An effort to introduce cases from all categories might not be practicable, assuming we are able to know what such an effort would entail. In Part V, I will also raise some doubts about the theory of economic valuation, and hence willingness to pay, in the context of cultural policy, where a goal is to shape preferences, not merely to cater to them. But my basic submission is simple: It seems clear that valuations of cultural amenities are likely to be inflated, by the standard economic criteria, if the relevant contingent valuation studies do not ask people to consider problems from other categories, involving, for example, human health, safety, and the environment.
A. Judgment Reversals
Let me begin with some work from other domains. Suppose that a contingent valuation study attempts to obtain people’s willingness to pay to reduce health and ecological risks. In the study, a number of people are asked to say how much they would be willing to pay for research on bone marrow cancer among the elderly. Suppose that different people are asked to say how much they would be willing to pay to protect coral reefs by banning certain actions that are injurious to them. Suppose, finally, that a third group of people is asked to say how much they would be willing to pay for research on bone marrow cancer among the elderly, and also to say how much they would be willing to pay to protect coral reefs. What would you expect the third group to say? How would their answers diverge from those of the first and second groups, taken together?
Experimental data show the following. The bone marrow cancer problem, taken in isolation, produces relatively low willingness to pay. The coral reef problem, taken in isolation, produces relatively high willingness to pay. But when the two problems are taken together, there is a substantial shift. People want to pay more to prevent bone marrow cancer among the elderly than to protect coral reefs. More specifically, the median WTP is about $70 for each problem, taken in isolation. But when the two are taken together, the WTP is about $105 for bone marrow cancer research, and $60 for coral reefs. WTP for bone marrow cancer research goes substantially up in joint evaluation, and WTP for coral reefs goes down. The basic result is duplicated in the context of several other problems involving human health problems and ecological problems.
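The arithmetic behind these figures can be laid out in a few lines. What follows is only a sketch in Python: the four dollar amounts are the medians quoted above, while the function names (shifts, ranking, is_strict_reversal) are hypothetical helpers introduced for illustration and come from no study.

```python
# Median WTP figures (in dollars) as quoted in the text above.
separate = {"bone marrow cancer research": 70, "coral reefs": 70}
joint = {"bone marrow cancer research": 105, "coral reefs": 60}


def shifts(se, je):
    """Change in median WTP (JE minus SE) for each item."""
    return {item: je[item] - se[item] for item in se}


def ranking(values):
    """Items ordered from highest to lowest WTP (ties keep input order)."""
    return sorted(values, key=lambda item: -values[item])


def is_strict_reversal(se, je, a, b):
    """True only if a is valued above b in SE and b above a in JE."""
    return se[a] > se[b] and je[b] > je[a]


print(shifts(separate, joint))
print(ranking(joint))
print(is_strict_reversal(separate, joint,
                         "coral reefs", "bone marrow cancer research"))
```

On the medians as quoted, the two items tie in separate evaluation, so the last check reports a shift of rankings out of a tie rather than a strict reversal; a strict reversal would require the coral reef figure to exceed the cancer figure in SE.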
B. Normalization
Now what explains these results? Why does separate evaluation produce different outcomes from joint evaluation? Here is a possible explanation. When people see the bone marrow cancer problem, they spontaneously normalize it, by comparing it to other human health problems, which are the cognitively accessible comparison set. A bone marrow cancer problem does not naturally make people think about ecological problems, or cultural amenities, or national defense. Taken within the class of human health problems, bone marrow cancer among the elderly seems a relatively trivial matter. In any case we are speaking of old people, who do not have many years left. The same process is at work when people see the coral reef problem in isolation. Here too they spontaneously normalize it, by comparing it to the comparison cases that it calls up, namely other ecological problems. Taken within the class of ecological problems, damage to coral reefs seems quite serious. And when given the coral reef case, most people do not naturally or spontaneously think about human health problems. Hence it is perfectly possible that large groups will produce a high WTP to protect coral reefs (a high-ranking case within the class of ecological problems) and a low WTP to protect against bone marrow cancer among the elderly (a low-ranking case within the class of human health problems).
Things shift when subjects see the two problems together. Here people think quite differently, simply because the comparison set has changed. People are not willing to pay more to protect coral reefs than to prevent cancer; indeed, they tend to think that this would be a foolish way to allocate resources. Joint evaluation has the effect of disrupting the process of normalization, by getting people to consider not only the category including the case at hand, but also other categories. Damage to coral reefs looks like a serious ecological problem. But it pales before human health problems. Judgments shift accordingly.
This point suggests what seems clear, that people have a kind of implicit ranking of categories, in accordance with which ecological problems rank lower than problems of human health. If this is so, we can imagine how to produce experiments that would generate SE-JE judgment reversals: Ask people to assess a low-priority problem from a high-ranking category and a high-priority item from a low-ranking category. In SE, the member of the low-ranking category will receive much higher WTP than it will receive in JE.
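The recipe for generating a reversal can be made concrete with a toy model. The sketch below, in Python, is an illustration under loud assumptions and not a model drawn from the experimental literature: the category weights, the within-category scores, the dollar scale, and the rule that a case is rescaled by its category's weight relative to the mean weight of the categories on the page are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Case:
    name: str
    category: str
    within_rank: float  # hypothetical 0..1 priority relative to same-category cases


# Hypothetical implicit ranking of categories (higher means more serious).
CATEGORY_WEIGHT = {"health": 1.0, "ecology": 0.5}
BASE_DOLLARS = 100.0  # hypothetical dollar scale


def elicited_wtp(case, shown_with=()):
    """Toy WTP for `case` when presented alongside the cases in `shown_with`.

    The case is first normalized within its own category (within_rank), then
    rescaled by its category weight relative to the mean weight of all
    categories on the page. With one category on the page the factor is 1,
    which is the separate-evaluation (category-bound) answer.
    """
    categories = {case.category} | {other.category for other in shown_with}
    mean_weight = sum(CATEGORY_WEIGHT[c] for c in categories) / len(categories)
    return BASE_DOLLARS * case.within_rank * CATEGORY_WEIGHT[case.category] / mean_weight


# A low-priority case from a high-ranking category,
# and a high-priority case from a low-ranking category.
cancer = Case("bone marrow cancer among the elderly", "health", 0.4)
reefs = Case("coral reefs", "ecology", 0.7)

se_cancer, se_reefs = elicited_wtp(cancer), elicited_wtp(reefs)
je_cancer = elicited_wtp(cancer, shown_with=[reefs])
je_reefs = elicited_wtp(reefs, shown_with=[cancer])

print("SE:", se_cancer, se_reefs)
print("JE:", round(je_cancer, 2), round(je_reefs, 2))
```

With these made-up numbers the reef case is valued above the cancer case in SE and below it in JE, and the cancer figure rises while the reef figure falls. Adding a second case from the same category leaves the figure unchanged, which is the sense in which the thinking is category-bound.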
C. Numbers, Cases, and Categories
It might be noticed that the questions, in the experiment just described, are quite vague. Numbers, for example, were not given. Is it even sensible to ask people how much they are willing to pay to prevent bone marrow cancer among the elderly, without telling them exactly what the money would do? It might even seem surprising that people are willing to answer questions of this kind. But the findings described thus far are unlikely to be affected by more specific questions. Suppose that people are asked how much they would be willing to pay to contribute to a fund designed to prevent 50 bone marrow cancer cases, and that they are asked how much they would be willing to pay to contribute to a fund designed to protect 500 miles of coral reef. Would the results be different? This is extremely doubtful.
In fact we know a great deal about how people think about the sorts of problems involved in CV studies. For present purposes, the following points are especially important. People tend to be able to agree on how to rank cases within a given category. People are able to rate problems along a numerical scale, and to come up with numbers on which there is considerable social agreement. But serious problems are created by the use of the unbounded scale of dollars. People do not have a clear sense of how to translate their moral and political judgments into the dollar scale. As a result of the translation problem, an effort to capture willingness to pay can produce variable and somewhat arbitrary results. This problem can be reduced if experimenters supply an "anchor," in the form of a numerical amount from which subjects start. The problem is that the anchor is likely to have a large influence on judgments, and is thus likely to manipulate judgments in one or another direction. As I have emphasized, people’s judgments about particular problems are usually made by consulting a narrow set of similar problems; people do not naturally or spontaneously ask about problems from other categories. This is a serious difficulty for contingent valuation of cultural amenities, as we shall now see.
III. Culture
It should be clear that SE-JE judgment reversals are exceedingly likely in the context of cultural amenities. The logic of the situation is the same as the logic of the cases just described.
Suppose, for example, that people are told that there is no art museum in a major city, and asked to state their WTP for it. A reasonable hunch is that if the problem is taken in isolation, people will be willing to pay a nontrivial amount. It is no light thing for a city to lack an art museum. Within the category of cultural amenities, an art museum seems to be a high priority. (The numbers would likely be all the higher if people were asked how much they would pay to ensure continued operation of an art museum; because of loss aversion, people greatly dislike losses of what they already have.) But suppose now that we ask people not only about an art museum, but also about a human health problem, such as skin cancer among the elderly. Once people are confronted with the skin cancer problem, they are highly likely to be willing to pay somewhat less to support art museums, on the ground that health problems simply have a higher priority than does art. As before, the reason for the reversal is that when evaluating the case of the art museum, people do not naturally or spontaneously think about other categories of problems.
Of course it would be possible to create a different kind of reversal. Suppose, for example, that people are asked to say how much they would be willing to pay for an art museum and also for a problem that seems to deserve less attention, as, for example, in the case of subsidies to tobacco farmers. In a case of this sort, it is possible that JE will actually inflate people’s WTP for the museum.
The most general conclusion is that we have great reason to distrust the results of CV studies that ask people isolated questions about cultural amenities. The point is important, for numerous studies have proceeded in exactly this way. Willis, for example, attempted to monetize the value of Durham Cathedral, by asking a single question about WTP. Hansen measured the value of the Royal Theatre in Copenhagen by asking people their WTP, without introducing other cases, let alone cases from other categories. In an especially interesting study, Santagata and Signorello attempted to measure the value of a cultural network in Naples, consisting of churches, palaces, historical squares, and a museum. They asked people about their "yearly voluntary money contribution in order to preserve" the existing situation. These and similar studies have an extremely serious problem.
The more specific conclusion is that those studies are likely to produce inflated numbers, at least if we stay within the economic paradigm. This conclusion follows from the fact that cultural amenities are, or are part of, a relatively low-ranking category. Recall that high-ranking members of low-ranking categories are likely to be inflated as a result of SE, whereas low-ranking members of high-ranking categories are likely to be deflated as a result of SE.
IV. What Can Be Done?
Can anything be done about the problem? If we are seeking to elicit people’s valuations, and to avoid the incoherence that emerges from separate evaluation, the most natural step would be to give people problems from all categories, or at least from a wide range of them. An assessment of the value of museums, or art exhibits, would be done in the context of an assessment of the value of many or most other things. But the problems with this solution should be obvious. Any such effort would be exceptionally demanding, and possibly extremely confusing to boot. Could people sensibly generate dollar figures for problems from every category under the sun? Is it really practicable for experimenters to ask people, at once, about cases from many areas of the state or local budget?
If it is not, it might seem tempting to try to produce more limited contingent valuation studies, avoiding the disadvantages of isolated questions but without attempting the impossible. Experimenters might offer cases from two or more categories, but without exhausting the universe of categories. But note that there is a serious problem here: If the set of categories is sharply limited, there is a significant risk of manipulation, whether intentional or not. If people are asked not only about cultural amenities but also about goods from a very low-ranking category, the cultural amenity in question will not decrease in value, and indeed will likely increase, in joint evaluation. If people are asked not only about cultural amenities but also about goods from a very high-ranking category, the good in question is likely to decrease in value.
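The direction of the manipulation risk can be shown with a small numerical sketch in Python. Every figure here is hypothetical: the category weights, the $60 starting amount, and the rescaling rule (culture's weight relative to the mean weight of the two categories shown) are assumptions made for illustration, not findings.

```python
# Hypothetical implicit category weights; none of these figures is from a study.
CATEGORY_WEIGHT = {
    "farm subsidies": 0.2,  # assumed very low-ranking category
    "culture": 0.4,
    "health": 1.0,          # assumed very high-ranking category
}
MUSEUM_SE_WTP = 60.0  # hypothetical WTP for the museum when asked in isolation


def museum_wtp_joint(comparison_category):
    """Toy WTP for the museum when paired with one case from another category.

    Rescales the separate-evaluation amount by culture's weight relative to
    the mean weight of the two categories on the page.
    """
    culture = CATEGORY_WEIGHT["culture"]
    mean_weight = (culture + CATEGORY_WEIGHT[comparison_category]) / 2
    return MUSEUM_SE_WTP * culture / mean_weight


paired_low = museum_wtp_joint("farm subsidies")
paired_high = museum_wtp_joint("health")

print("isolated:", MUSEUM_SE_WTP)
print("paired with a low-ranking category:", round(paired_low, 2))
print("paired with a high-ranking category:", round(paired_high, 2))
```

In this sketch the same museum receives a higher figure when paired with the low-ranking category and a lower one when paired with the high-ranking category, so the experimenter's choice of companion case drives the result.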
It follows that efforts to overcome the problem of category-bound thinking will encounter two difficulties: cognitive overload on the one hand and manipulation (whether or not intentional) on the other. I am not sure if these difficulties can be surmounted. But I am certain that when experiments elicit WTP by asking people a single question about a cultural amenity, or a series of questions about cultural amenities, the answers will not be especially helpful.
V. Is Government An Aggregating Machine? The Limits of Markets
The analysis in this essay has been conducted in terms of the conventional economic paradigm. Taking the market as the appropriate model, CV studies assume that willingness to pay is the proper measure of value; they attempt to do whatever can be done to ensure accurate elicitation of that value. I have suggested that in the cultural context in particular, and in many others as well, CV studies cannot promote that goal without overcoming the problem of category-bound thinking. But I have serious doubts about the goal itself in the context of cultural amenities, and while I cannot defend those doubts here, I will venture a few words about them. My principal concern is that those who engage in CV studies, here and elsewhere, seem insufficiently attuned to the normative problems.
Government should not be taken as a maximizing machine, with the goal of aggregating preferences in accordance with the market model. In some contexts, the point is easy to see. Many people would be willing to pay a great deal to discriminate on the basis of race or sex; but that point does not mean that they are authorized to discriminate, or that their willingness to pay should count in policy. Even the most committed contemporary utilitarians agree that malicious and sadistic preferences should not be counted. It might be tempting to respond that this kind of restriction is a paternalistic intervention, by certain people with certain values, into the neutral process of aggregation. But what is neutral about that process? How can the aggregation itself be defended? Any claim for aggregation must be urged in terms of values, and it is extremely difficult to come up with any plausible account that calls for aggregation of preferences as such. Society is not a person, and even people reflect on their preferences; they do not simply try to satisfy them. The problem is compounded by the fact that willingness to pay is a crude proxy for utility, especially in light of income inequality: Many people are willing to pay a great deal for things from which they would not get much utility, and many people are willing to pay little for things from which they would get a great deal of utility.
These are abstract points that do not say anything particular about the proper approach to cultural amenities. Of course free societies make a large place for market ordering, and hence for willingness to pay, because of the (imperfect, but pretty good) association between markets and social welfare. But art exhibits, science museums, and gardens should not be seen as ordinary commodities, subject to the forces of supply and demand. They have an educative function. More precisely, they have a preference-shaping function, helping to ensure certain sorts of values and tastes. They supplement the school system, and do so for adults; and this is widely understood. They do not simply cater to existing tastes. Now we might be concerned if undemocratic forces were engaged in preference-shaping. And of course preference-shaping can be tyrannical. But there is really no avoiding it. Even markets shape preferences, by inculcating certain attitudes and desires, and they have long been defended on just this ground.
So long as officials are subject to democratic control, there is no objection, in principle, to an effort to spend far more on culture than would emerge from aggregated WTP. Indeed, people make choices both as consumers and as citizens, and sometimes their choices as citizens diverge from their choices as consumers. One reason is that the role of citizen solves a collective action problem. Another reason is that as citizens, people often try to promote their aspirations, urging policies that diverge from their consumption decisions, of which they may not always approve. These are reasons that political choices sometimes diverge from aggregated consumption choices. The divergence should be seen, not as an abridgement of freedom, but as democracy in action. The market paradigm, suitable as it is for many contexts, is not suitable for judgments about cultural amenities.
VI. Conclusion
My major goal here has been to draw attention to a serious problem with contingent valuation: category-bound thinking. People do not spontaneously compare a problem to problems from other categories, but instead tend to normalize it by comparing it to a cognitively accessible comparison set, usually consisting of cases from the same category. For this reason, we should suspect that valuations given after seeing one problem, or even a set of problems from the same or similar categories, are different from what they would be if people saw cases from other categories.
These points suggest that for cultural goods, it is highly likely that people will produce, in separate evaluation, numbers that they will themselves consider inflated once cultural goods are placed together with problems from other categories. The natural remedy – to provide cases from all or most categories – does not appear to be feasible. The practical question is whether it is possible to design an approach that overcomes category-bound thinking while also imposing realistic demands on those who administer and respond to contingent valuation surveys. I have also suggested that the market model does not work in the context of cultural amenities, where it is legitimate to attempt to shape preferences and values, not simply to cater to them. But even for those who reject this conclusion, the likelihood of category-bound thinking means that the CV method is fraught with difficulties.