Tuesday, February 21, 2012

OR and Base Voters: Common Pitfalls

My adopted state of Michigan is currently afflicted with the Republican presidential primary. (Symptoms include repetitious attack ads on television, robocalls to one's house, and the general malaise associated with staring at any crop of candidates for political office.) Primaries tend to draw out "base" voters (those committed to one party or the other); we swing voters just stay at home, hiding under the covers until it is over.

Last night the local TV news included a sound bite from a generic Republican voter, an apparently intelligent and articulate woman (to the extent one can judge these attributes from a two-sentence interview) who said she was still undecided because she wanted to vote for the "most conservative" candidate. The logic, or lack of logic, behind that statement caused me to take notice of the similarities between how some "base" voters think and common errors in operations research.

A single criterion is easy, but multiple criteria may be correct. There are quite a few pressing issues these days, ranging from foreign policy to budget deficits to global warming to unemployment to ... (I'll stop there; I'm starting to depress myself). Our base voter, henceforth Mme. X, has apparently condensed these criteria down to a single value, on a scale from hard core liberal (arbitrarily 0) to hard core conservative (arbitrarily 1). What is not apparent is how the multiple dimensions were collapsed to a single one. OR people know that multiple criterion optimization is hard, more from a conceptual standpoint than from a computational one. Using a single composite criterion (weighted sum of criteria, distance from a Utopia point in some arbitrary metric, ...) makes the computational part easier, but there are consequences (frequently hidden) to the choice of the single criterion. Goal programming has its own somewhat arbitrary choices (aspiration levels, priorities) which again can have surprising consequences. Picking the "most conservative" candidate simplifies the cognitive process but may lead to buyer's remorse. Similarly, arbitrarily collapsing multiple objectives into a single objective may simplify modeling, but may produce solutions that do not leave the client happy.
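To make the hidden consequences of a composite criterion concrete, here is a minimal sketch in Python of a weighted-sum collapse. The three alternatives, their scores, and all three weight vectors are hypothetical numbers invented for illustration (they are not drawn from any candidate or real model); the point is only that the "best" alternative changes with the weights.

```python
# Hypothetical scores (higher is better) for three alternatives on three criteria.
# These numbers are made up purely for illustration.
scores = {
    "A": (0.9, 0.2, 0.5),
    "B": (0.5, 0.6, 0.6),
    "C": (0.2, 0.9, 0.4),
}


def weighted_sum(values, weights):
    """Collapse several criterion scores into one composite score."""
    return sum(v * w for v, w in zip(values, weights))


def best(weights):
    """Return the alternative with the highest composite score."""
    return max(scores, key=lambda name: weighted_sum(scores[name], weights))


# Three equally "reasonable" (and equally arbitrary) weightings.
for weights in [(0.6, 0.2, 0.2), (0.2, 0.6, 0.2), (1 / 3, 1 / 3, 1 / 3)]:
    print(weights, "->", best(weights))
```

With these made-up numbers each weighting crowns a different winner, even though none of the three alternatives dominates another. The choice of weights is doing as much work as the data.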

Averages can be deceptive. Point estimates also make modeling and decision making easier, but they can mask important things. (A colleague has a favorite, if politically incorrect, quotation: "Statistics are like bikinis. What they reveal is interesting, but what they conceal is critical.")

Suppose that Mme. X has narrowed her choices down to two candidates, and that they have both weighed in on five important issues (A through E). If candidate 1 is consistently to the right of candidate 2 on all issues, we have a dominated solution: Mme. X can eliminate candidate 2 and vote for candidate 1. On the other hand, consider the following scenario, where each candidate's position is rated on a scale from 0 (liberal) to 1 (conservative).
Candidate 1 is more conservative than candidate 2 in both mean (0.780 versus 0.756) and median (0.80 versus 0.75); yet candidate 2 is to the right of candidate 1 on two of five issues (A and B), and close to a wash on a third (C).  So if Mme. X truly wants a conservative candidate, it is not all that clear which she should prefer. Likewise, OR models that consider only point estimates without taking dispersion into account can result in solutions that should do well "on average" but sometimes do quite poorly.
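The OR half of that point can be sketched in a few lines of Python. The two plans and their per-scenario costs below are hypothetical, chosen only to show how ranking by the mean and ranking by the worst case can disagree; they are unrelated to the candidate ratings discussed above.

```python
from statistics import mean, pstdev

# Hypothetical costs (lower is better) of two plans across five scenarios.
# The numbers are invented for illustration only.
costs = {
    "plan_1": [10, 10, 10, 10, 50],
    "plan_2": [19, 20, 20, 20, 21],
}

# Summarize each plan by a point estimate, a dispersion measure, and the worst case.
for name, values in costs.items():
    print(name, "mean:", mean(values), "std dev:", round(pstdev(values), 2), "worst:", max(values))

# Rank by the point estimate alone, then by the worst case.
best_by_mean = min(costs, key=lambda name: mean(costs[name]))
best_by_worst = min(costs, key=lambda name: max(costs[name]))

print("best on average:", best_by_mean)
print("best in the worst case:", best_by_worst)
```

Here plan_1 wins on average, but in one scenario it costs more than twice what plan_2 ever does. A model that looks only at the mean never sees that.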

A solution that goes unimplemented is not a solution. Missing in Mme. X's search for the most conservative candidate is the quality referred to by pundits as "electability". Neither major political party claims a majority of registered voters in the U.S., so to win a general election, a candidate must capture a significant number of moderates and independents. The most ideologically pure candidate (for either party) may not be able to do so. This is a bit of a paradox in recent elections, where candidates find that they must appeal to "base" voters at one end of the political spectrum to get the nomination, then appeal to voters in the middle of the spectrum to win the election. Ideological "base" voters may not grasp this particular reality; they expect the "correctness" of their candidate's views (which are also their views) to triumph. [This may be at least partly explained by the false consensus fallacy.]

OR modelers sometimes have a similar blind spot. We can pursue perfection at the expense of good answers. We can opt for the approach that uses the most sophisticated or "elegant" mathematics or the most high-powered solution technique available. We may try for more scope or more scale in a project than what we can accomplish in a reasonable time frame (or what users can realistically cope with, in terms of data requirements and solution complexity). Professional journals often encourage this trend by requiring "novel" solution methods in order to publish a paper. The end result can be a really impressive solution that sits on a shelf because the client is unwilling or unable to implement it, or because it is too complex for the client to understand and trust.

Garbage in, garbage out. OR models rely on data, as inputs to the decision process or to calibrate parameters of the model. Feed bad data to an otherwise correct model and no good will come of it.  I have seen estimates that as much as 60% of the time in an OR project can be spent cleaning the data.

Meanwhile, Mme. X has to rely on a variety of unreliable sources to gauge how conservative each candidate may be. Candidates famously say things they may not entirely believe, or express intentions they may not carry out, either in an overt effort to curry favor with voters or because their views change between campaigning and governing. Historical data may be faked or misreported, and sometimes facts may not be what they seem. For instance, a generally pro-military candidate might vote against a military appropriation bill because there is a rider on it that would fund an inordinately wasteful project, or something unpalatable to the candidate and/or the candidate's constituents. Opponents will characterize this as an anti-military stance. Budget projections, and indeed any sort of projections, are subject to forecast errors, so a candidate's magical plan to fix deficits/unemployment/Mme. X's dripping kitchen faucet may turn out not to be so magical after all. Unfortunately for Mme. X, she probably has less ability to filter and correct bad data than an OR analyst typically does.

So, in conclusion, voters and OR analysts face similar challenges ... but OR analysts do not have to cope with a glut of robocalls.


  1. Very nice post. The only thing I don't get is why you are worried about the budget deficit ;)
    You live in the United States, which is the sole issuer of US dollars, so it can technically never run out of money. (In addition, it has no foreign debt and has a flexible exchange rate.)
    So we should not worry about the budget, but about what government ought or ought not to do.

    A well-written introduction to Modern Monetary Theory is "7 Deadly Innocent Frauds" by Warren Mosler.

  2. @Christian: Your comment about debt and exchange rate had me looking around to see if China had "repossessed" us. Although I suppose our exchange rate really is flexible -- what matters most is the dollar:yuan rate, which seems to be whatever the Chinese government thinks it should be.

    Thanks for the link.

  3. Hi Paul,

    I mentioned the flexible exchange rate only because countries that issue their own currency but peg it to another currency can run into problems; see Argentina or Russia for that. If a fixed exchange rate is set, the central bank must exchange its own currency for that of another country. But as it does not issue the other currency, it has to get hold of that currency in financial markets. The case for the US is entirely different, as I stated above.

    So China does not "possess" the United States in any way. They can however buy US goods in exchange for their Dollars.

    The only problem is that your government, and nearly everyone who is running for it, thinks that the government is broke, in its own currency that it issues in the first place. It really is nonsensical.

  4. Glad to find your blog. Now that I know I'm not alone in continual frustration with the glib illogic of pundits and pols, my head is less likely to explode! People who don't understand statistics and political taxonomy should not be allowed to cover elections! (I think the Sports departments of major networks would do an excellent job of getting passionate about numbers and showing breaking live events with great style and gusto.) I'm so sick of undefined canards like "[the candidate's] base," "the [party's] left/right wings," "independent/undecided voters," "the [vaguely described ethnic] Bloc," and unsubstantiated "trends" and "narratives."

  5. On your last point, I find it interesting that the difference between "trend" and "regression to the mean" or "noise" is based on the non-mathematical property "does it bolster my argument?".

  6. Hi Paul,

    Nice article. I recently attended a talk on the use of 'multi-objective' stochastic programming models for capacity planning, so perhaps this technique could ease some of these problems (the ill effects of point estimates and a single objective).

  7. @bengu: Thanks. Multi-objective SP may work for some problems. There's a relatively new area, I think called "robust optimization", that may work in other cases, at least when a single criterion is appropriate (and may be less demanding in terms of data requirements). Explicitly recognizing multiple criteria and randomness makes the problems harder, so I expect we'll continue to see a lot of "simplified" models -- which I think is okay, as long as we recognize and accept the implicit compromises.

