Paul Riedesel, Ph.D.
President, Action Marketing Research
Overview
"Discrete choice analysis" encompasses a variety of experimental design techniques, data collection procedures, and statistical procedures which can be used to predict the choices that consumers will make between alternatives. These techniques apply when consumers have the ability to choose between distinct ("discrete") courses of action. For example:
Not all consumer choices are clearly discrete (filling the gas tank versus some lesser amount). Not all consumer behavior results from conscious, deliberated choices (buying a gallon of the only brand of milk your store carries). Discrete choice analysis is not a magic tool for understanding all consumer behavior. However, it offers certain advantages which are unmatched by any other marketing research technique.
The most common application in marketing research is to the problem of how consumers choose between competing products. Qualitative research, motivation research, positioning research, segmentation research, etc. can all give partial answers to the question of "Why do consumers buy what they buy?" All are deficient, though, when it comes to practical predictions of what is likely to happen in the marketplace given a competitive situation.
A key advantage of discrete choice techniques is that they are based on the observation of consumer choices (real or simulated). In the end, what we do as consumers is make choices, and those choices are ultimately what matters to the marketer. All else is secondary. The closer any research technique comes to modeling and/or predicting choices, the more actionable and credible it will be to marketers.
Discrete choice research is notably useful for the study of price sensitivity. The principal example used in this paper involves a pricing decision.
Applications of Discrete Choice Analysis
Discrete choice analysis may be applied to a wide variety of issues. The following examples are anything but exhaustive:
Another variation of choice modeling uses characteristics of the respondent as predictors. Given his or her characteristics, what are the odds the respondent will choose alternative A, B.... With Y alternatives and N of these personal variables, there will be Y-1 sets of N parameters. This approach is called Multinomial Logit in some quarters as opposed to Discrete Choice Analysis, although the same basic statistical procedure applies. The marketing literature has been quiet about this second approach. It is perfectly legitimate, though again apparently rare, to create models which use both attributes of the alternatives and attributes of the respondents as predictors.
Designing Discrete Choice Research
The statistical procedures used in discrete choice analysis can be applied to many kinds of data, including purchase panel data. However, we will focus on controlled choice experiments in marketing research.
In most, though not necessarily all, situations the major choice alternatives available to the consumer can be specified in advance. These may relate to brand choices:
The choices could also be few in number though not branded. For instance:
Choices can be "nested." If you choose to take a taxi, there is no more choosing to be done. If you choose to rent a car, however, there are further choices of who to rent from and what size to rent.
When there are a large number of possible (though still discrete) choices, the design issues are more complicated, such as:
It is possible and often desirable to develop two choice models: one to determine whether an alternative passes a first screening to even be considered, and a second to determine which alternative within the consideration set is chosen. For instance, a consumer might rule out all vehicles costing less than $15,000 or more than $20,000. But within that consideration set, price could be of minor importance in the final decision.
For purposes of this paper, we will stay with the simple case of a few branded alternatives. The situation may or may not call for a residual "any other" or "none of the above" alternative.
Once the major choice alternatives (brands) are decided, the research team must decide which attributes will be used to describe the alternatives, and which levels of those attributes should be tested.
Alternatives--the brands or unbranded objects between which the respondent must make a choice
Attributes--the variables which describe and define the differences between the alternatives (in addition to brand, if any)
Levels--the values of the attributes
The next major task is to devise a set of "choice sets" (we use the term "Shopping Game"). As few as eight or nine and as many as several hundred choice sets might be needed for a given experiment. This will depend on the number of alternatives, attributes, and levels, plus other statistical considerations. In the simplest cases, a choice set shows the brand choices and the characteristics of each brand.
The construction of choice sets requires a good understanding of experimental design. Almost always, some kind of fractional factorial design is used. The number of possible combinations of attribute levels across several alternatives may number in the thousands. We therefore need to use a subset of all possible choice sets which allows us to measure the effects of the attributes within a reasonable budget.
An Illustration
Assume we are concerned with an oil-change franchise called Quickie Lube. They compete with another franchise called Fast Change as well as scores of independent service stations and garages. Quickie Lube charges a standard rate of $18.95 today. The Fast Change franchise charges $21.95 but offers a 30-minute guarantee (if the work is not done in 30 minutes, the customer pays half-price). The prevailing rate at service stations is around $17.95.
Our management is considering several questions:
We need to design a test with consumers to estimate the market response to these different situations. In so doing, we need to define what we mean by alternatives, attributes, and levels.
In this example, the choice alternatives we would offer study participants are:
The attributes we will use in this hypothetical case are:
In setting the levels of the attributes to be tested, we are able to use a higher price range for the two "branded" franchises. We also choose to state a constant price for "any other". The 30-minute guarantee applies only to the first three choices. In our example:
An alternative either offers the service guarantee, or it does not; the residual "any other" choice may have a constant "no guarantee" level
In our oil change example, there are actually three three-level price factors (one for each of three "brands") and three two-level guarantee factors (one for each of three brands). This simple-looking design is thus a 3x3x3x2x2x2 design, requiring 216 different choice sets or "Shopping Games" for a full factorial. As is typical, though, we have (hypothetically) settled for a fractional design of 27 sets. All six main effects (i.e. three price and three guarantee variables) can be estimated independently, but higher-order interaction terms have been confounded (aliased) with main effects.
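The count of 216 can be checked by enumerating the full factorial directly. The following is a minimal sketch; the level labels are placeholders we have made up, since only the number of levels per factor matters for the count. It does not construct the 27-set fraction, which requires a proper orthogonal design.

```python
from itertools import product

# Placeholder level labels (hypothetical); only the counts come from the text.
price_levels = ["low", "middle", "high"]            # three price levels per brand
guarantee_levels = ["no guarantee", "guarantee"]    # two guarantee levels per brand

# One price factor and one guarantee factor for each of the three brands.
factors = [price_levels] * 3 + [guarantee_levels] * 3

# Every combination of levels across the six factors is one candidate choice set.
full_factorial = list(product(*factors))
print(len(full_factorial))
```

Each tuple in `full_factorial` holds the three price levels followed by the three guarantee levels for one candidate Shopping Game.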
Following are facsimiles of the kind of shopping game or choice set which could be employed in this study. Almost always, each respondent sees only a subset of the choice sets used in a study (themselves usually a subset of a full factorial). Note that each respondent completes multiple choice tasks. The assignment of choice sets to respondents may be part of the larger orthogonal design.

Analysis of Discrete Choice Data
Assume that an experiment such as the one on oil change services has been completed. If we had 300 respondents and each of them completed only nine shopping games with four alternatives each, we would have
300 x 9 x 4
or 10,800 "cases" or lines of data! In the simplest mode of analysis, each "case" would include only:
Hmmm. Looks like a regression problem. But it isn't. It is true that any standard regression program could digest these data and print out results. What regression cannot do is to take into account the competitive set. The effect of a price of $21.95 for "your usual service station" will be very different if all other alternatives are priced at $21.95 instead of at $24.95. At a more technical level, ordinary regression could predict the probability of an alternative being chosen as less than .00 or more than 1.00! The choice probabilities within what we know to be a choice set will probably not sum to 1.00, as they logically should. And there are also complications of heteroscedasticity (not an STD but still a problem).
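The out-of-range problem is easy to reproduce. The sketch below fits an ordinary least-squares line to a 0/1 "chosen" indicator using a tiny data set we have invented for illustration (the `premium` and `chosen` values are not from any study) and then prints the smallest and largest fitted "probabilities."

```python
# Hypothetical data: price premium in dollars, and whether the
# alternative was chosen (1) or not (0) at that premium.
premium = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
chosen = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]

n = len(premium)
mean_x = sum(premium) / n
mean_y = sum(chosen) / n

# Ordinary least squares with a single predictor.
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(premium, chosen))
sxx = sum((x - mean_x) ** 2 for x in premium)
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Fitted values are meant to be probabilities, but nothing bounds them.
predicted = [intercept + slope * x for x in premium]
print(min(predicted), max(predicted))
```

With these made-up numbers the fitted line runs above 1 at the lowest premium and below 0 at the highest, which is exactly the behavior a choice model must avoid.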
Instead, discrete choice data should be modeled through so-called Multinomial Logit (MNL) which uses the maximum likelihood estimation procedure. MNL estimation is available in some but not all popular statistical software packages. Increased availability does not mean it is any easier to understand and use properly.
MNL produces a set of parameters which look a lot like normal regression coefficients. However, their interpretation and use are quite different. The statistical goal, so to speak, is to find a set of parameters which will optimally reproduce the distribution of choices as observed in the data. The quantiphobic may skip the following section.
Since this paper was first written, a whole new class of estimation techniques has become available. They use what are called Hierarchical Bayesian procedures to estimate utilities, not at the aggregate level but for each respondent. We will not attempt to explain these complex procedures here, but refer the reader to a paper by our friends at Sawtooth Software. The estimation procedure is different from that implied in our discussion of MNL (below), but the resulting utility coefficients are used in the same fashion.
Multinomial Logit Estimation
Assume a simple example with three alternatives A, B, and C.
Each alternative is described by a vector X of N predictor variables which could include dummy variables representing brands, continuous or discrete price variables, or other nominal or continuous attributes.
Multinomial logit solves for a vector β of N parameters, of the same length as X, according to the method of maximum likelihood. For any alternative within any set:
$$V = \sum_{i=1}^{N} \beta_i X_i$$
V can be interpreted as the utility of the alternative, and is, as seen, the outcome of a linear equation. However, it has no particular meaning until subjected to an exponential transformation:
$$U = \exp(V)$$
This is the natural antilog and always produces a positive number. As we have three alternatives,
$$U_A = \exp(V_A)$$
$$U_B = \exp(V_B)$$
$$U_C = \exp(V_C)$$
The probability that alternative A will be chosen given its attributes and those of the other two alternatives is
$$p_{A \mid [A,B,C]} = \frac{U_A}{U_A + U_B + U_C}$$
Likewise:
$$p_{B \mid [A,B,C]} = \frac{U_B}{U_A + U_B + U_C}$$
$$p_{C \mid [A,B,C]} = \frac{U_C}{U_A + U_B + U_C}$$
We hope it is obvious that:
$$p_A + p_B + p_C = 1.00$$
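The three probability equations translate directly into a few lines of code. This is a minimal sketch; the function name `choice_probabilities` and the utility values for A, B, and C are our own illustrative assumptions.

```python
import math

def choice_probabilities(utilities):
    """Turn linear utilities V into choice probabilities exp(V) / sum of exp(V)."""
    exp_u = {name: math.exp(v) for name, v in utilities.items()}
    total = sum(exp_u.values())
    return {name: u / total for name, u in exp_u.items()}

# Hypothetical linear utilities V for the three alternatives.
V = {"A": 0.5, "B": 0.0, "C": -0.5}

p = choice_probabilities(V)
print(p)
print(sum(p.values()))
```

Because every exponentiated utility is positive and each is divided by the same total, the probabilities always lie between 0 and 1 and sum to 1 (up to floating-point rounding), whatever the values of V.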
Program output typically consists of the coefficients and other estimation statistics. However, the vector of coefficients (β) enables us to make the best possible predictions of choice probabilities according to these equations.
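For readers curious about what "maximum likelihood" means in practice, the sketch below estimates two coefficients (price and guarantee) from six invented choice sets by simple gradient ascent on the log-likelihood. Everything here is an assumption made for illustration: the data, the function names, the step size, and the choice of gradient ascent itself. Commercial packages use more efficient optimizers and report standard errors as well.

```python
import math

# Hypothetical data: each choice set lists its alternatives as
# [price, has_guarantee], plus the index of the alternative chosen.
data = [
    ([[18.95, 0], [21.95, 1], [17.95, 0]], 2),
    ([[18.95, 1], [21.95, 1], [17.95, 0]], 0),
    ([[17.95, 0], [21.95, 1], [17.95, 0]], 1),
    ([[20.95, 1], [24.95, 1], [17.95, 0]], 2),
    ([[18.95, 0], [24.95, 0], [17.95, 0]], 0),
    ([[20.95, 1], [21.95, 0], [17.95, 0]], 0),
]

def probabilities(beta, alternatives):
    """MNL choice probabilities for one choice set."""
    v = [sum(b * x for b, x in zip(beta, alt)) for alt in alternatives]
    v_max = max(v)  # subtracting a constant leaves the ratios unchanged
    u = [math.exp(vi - v_max) for vi in v]
    total = sum(u)
    return [ui / total for ui in u]

def log_likelihood(beta, data):
    """Sum of log probabilities of the alternatives actually chosen."""
    return sum(math.log(probabilities(beta, alts)[chosen]) for alts, chosen in data)

def fit(data, n_params, steps=2000, rate=0.05):
    """Gradient ascent on the average log-likelihood, starting from zero."""
    beta = [0.0] * n_params
    for _ in range(steps):
        grad = [0.0] * n_params
        for alts, chosen in data:
            p = probabilities(beta, alts)
            for k in range(n_params):
                expected = sum(pj * alt[k] for pj, alt in zip(p, alts))
                grad[k] += alts[chosen][k] - expected  # observed minus expected
        beta = [b + rate * g / len(data) for b, g in zip(beta, grad)]
    return beta

beta_start = [0.0, 0.0]
beta_hat = fit(data, n_params=2)
print(beta_hat)
print(log_likelihood(beta_start, data), log_likelihood(beta_hat, data))
```

The fitted coefficients reproduce the observed choices better than the starting point of all zeros (equal shares), which is all that "optimal" means here: no other coefficient values make the observed choices more probable under the model.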
Referring again to our oil change study, the following table evaluates three (of scores of possible) scenarios. The predicted shares are based on a set of hypothetical coefficients from the MNL analysis.
Discussion
The first box represents the current situation. It suggests that Quickie Lube has a 20 percent share of preferences. With the other financial assumptions this further shows a monthly gross profit of $69,500.
The two following scenarios apply the model further. In the first scenario, Quickie Lube has added the 30-minute guarantee without an increase in price. The model says that its share could be expected to increase to 23.5 percent. Carrying through various arithmetic shows that even with the increased liability for those jobs in which the half-price guarantee was invoked, the company would come out ahead due to an overall increase in volume. The monthly gross profit goes up almost $10,000.
In the second scenario, Quickie Lube has taken a price cut of $1.00 (with no 30-minute guarantee). The model again predicts an increase in share to 22 percent. However, this is forecast to be a losing proposition. We would do more work, but by cutting our margin by $1.00 per job the net result is negative.
Many other scenarios could be evaluated.
An obvious managerial conclusion is that matching the service guarantee now offered by Fast Change would be a smarter move than trying to compete by cutting the price by a dollar.
This example is almost too simple as it does not take into account factors such as awareness and accessibility, but the application to real-life decisions should be clear.
Application of Discrete Choice Analysis to Oil Change Business
Assumptions
Current Situation
| Brand | Price | Guarantee | Predicted Share |
|---|---|---|---|
| Quickie Lube | $18.95 | None | 20.0% |
| Fast Change | $21.95 | Yes | 31.2% |
| Usual station | $17.95 | None | 43.9% |
| Any Other | $17.95 | None | 5.0% |
Monthly profit=($18.95-$12.00)*10,000 = $69,500
Scenario One
| Brand | Price | Guarantee | Predicted Share |
|---|---|---|---|
| Quickie Lube | $18.95 | Yes | 23.5% |
| Fast Change | $21.95 | Yes | 29.8% |
| Usual station | $17.95 | None | 41.8% |
| Any Other | $17.95 | None | 4.9% |
Incremental change in number of monthly jobs=(23.5/20.0-1)*10,000=1,750
Incremental revenue=1,750*$18.95=$33,162
Incremental cost=1,750*$12=($21,000)
Cost of fulfilling guarantee=.02*11,750 jobs*$9.48=($2,228)
Change in profit = ($33,162-$21,000-$2,228)=$9,934
Scenario Two
| Brand | Price | Guarantee | Predicted Share |
|---|---|---|---|
| Quickie Lube | $17.95 | None | 22.0% |
| Fast Change | $21.95 | Yes | 30.4% |
| Usual station | $17.95 | None | 42.7% |
| Any Other | $17.95 | None | 4.9% |
Incremental change in number of monthly jobs=(22.0/20.0-1)*10,000=1,000
Incremental revenue=1,000*$17.95=$17,950
Minus $10,000 (receiving $1.00 less for each base job)=$7,950
Incremental cost=1,000*$12=($12,000)
Change in profit=($4,050)
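The scenario arithmetic can be reproduced in a few lines, which makes it easy to try other shares or prices. This is a sketch using only the figures that appear in the calculations above: 10,000 base jobs, a $12.00 cost per job, and a guarantee cost of .02 times the number of jobs times $9.48 (which we read as 2 percent of jobs refunded half the $18.95 price). The variable and function names are our own.

```python
BASE_JOBS = 10_000     # monthly jobs at the current 20.0% share
BASE_SHARE = 20.0      # current predicted share, in percent
BASE_PRICE = 18.95     # current Quickie Lube price
UNIT_COST = 12.00      # cost per job

def extra_jobs(new_share):
    """Jobs gained when share moves from BASE_SHARE to new_share."""
    return round(BASE_JOBS * new_share / BASE_SHARE) - BASE_JOBS

# Scenario One: add the 30-minute guarantee at the current price.
jobs_1 = extra_jobs(23.5)
guarantee_cost = 0.02 * (BASE_JOBS + jobs_1) * 9.48
change_1 = jobs_1 * BASE_PRICE - jobs_1 * UNIT_COST - guarantee_cost

# Scenario Two: cut the price by $1.00, with no guarantee.
jobs_2 = extra_jobs(22.0)
new_price = 17.95
lost_on_base_jobs = BASE_JOBS * 1.00   # $1.00 less on every existing job
change_2 = jobs_2 * new_price - lost_on_base_jobs - jobs_2 * UNIT_COST

print(jobs_1, round(change_1, 2))
print(jobs_2, round(change_2, 2))
```

Carrying unrounded intermediate values gives a Scenario One gain of about $9,935 rather than the $9,934 shown above, which was computed from rounded revenue and guarantee figures. Scenario Two comes out at a loss of $4,050 either way.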
Shortcomings
Discrete choice research is not a cure-all.
Summary
This is only a brief outline of the advantages, disadvantages, design, and analysis of discrete choice research in marketing. Many interesting variations have not been mentioned at all. To our knowledge, textbooks have not begun to include any well-informed coverage. The only exception is Discrete Choice Analysis, written by Professors Moshe Ben-Akiva and Steven Lerman. One must look to literature in fields as disparate as transportation research and economics to get the necessary background, and to working papers by practitioners and academics as presented at research conferences. By far the most important published work in marketing has been written by Professor Jordan Louviere and collaborators. He is now at the University of Sydney, Australia. His work has definitely expanded the frontiers of choice research.
Sawtooth Software, whose products we use and trust, has a growing body of working papers and research articles that explain these methods in some detail.
© Paul Riedesel 1996, 2001