OR in an OB World: Formulating Optimization Models

Periodically, on OR Exchange and other forums, I encounter what are surely homework problems involving the construction of optimization models. "The Acme Anvil Corporation makes two types of anvils, blue ones and red ones. Blue anvil use 185 kg. of steel and have a gross revenue of $325 each; red anvils ..." Really? Does anyone seriously believe Acme Anvil uses metric units?? This is obviously a homework problem.

Not surprisingly, these sorts of questions tend not to get answered, or at least not fully. Many of the participants in the forums are academics (faculty or students) who are loathe to corrupt the homework process; most of the rest have real jobs, and better things to do than other people's homework. Still, there's something to be said for providing some nonspecific guidance on how a student might get started formulating such a model.

After bumping into one on a forum earlier today, it occurred to me to post here, somewhat paraphrased and highly condensed, my "introduction to modeling" lecture from the Dark Ages, when I taught modeling to business students. A caveat is in order: this is for beginners in the modeling process. Experienced modelers will go through these steps implicitly/subconsciously/automatically or perhaps just dive straight into a relevant model by association with previously encountered models.

In essence, what follows is a checklist for formulating an optimization model. For whatever it's worth, here it goes.

Variables: Identify your decisions.

An optimization model is typically a decision-making model. Rather than attempting to go from first reading of the problem to full-fledged algebraic formulation, start by identifying the key decisions you must make (e.g., how many red anvils to make in March at the Los Alamos plant). Each decision maps to a single variable in your model. You will not necessarily come up with all the variables at this step, but you should identify the important ones. It might be helpful to think of yourself as the supervisor of some diligent but not very bright (and thoroughly non-quantitative) employees. (Think Harvard MBAs here.) What orders would you need to give them to get things running properly? Those are at least some of your decisions/variables.

As you identify decisions, also select units for them. Are anvils produced individually or in pallets? Is paint ordered by the gallon, by the thousand gallons, by the jug ...? Also ask yourself whether the amounts are continuous or discrete. Time spent on something is pretty continuous. Number of anvils produced is rather discrete.

Since discrete variables generally make models harder to solve, you might ask whether it's reasonable to treat seemingly discrete amounts a continuous. One example I used for this is the manufacture of ball bearings. The little buggers are obviously discrete -- nobody uses 4.37 ball bearings in the manufacture of a device -- but you probably make the little guys by the rail carload. If you decide to make 2.134 carloads today, nobody's going to obsess about whether that translates to a partial ball bearing.

My classroom experience suggests that this step is critical. Almost every time I had a student foundering in an attempt to formulate a model, their difficulty traced back to incorrect identification of the decision variables.

Bounds: What are reasonable values for your decisions?

The question here is not what specific values do you want; that's the solution to the model. The question is, in general terms, how big or small can each decision quantity be? The answers give upper and lower bounds for your decision variables. Often the lower bound will be zero, signaling that (a) you are measuring things the way most people do (negative quantities are inherently unnatural for most people) and (b) you might possibly not want to do whatever this variable represents at all. Upper bounds do not need to be very tight, and definitely should not be too tight (too tight meaning it cuts off a valid decision that might prove to be optimal). How many red anvils might I choose to make in a day? Without getting mired in resource issues, simple logic might suggest that more than 1,000 is out of the question. So I can use 1,000 as the upper bound for that quantity.

Objective function: What is my criterion for selecting a program of action?

Suppose that some intern came to you with several choices for what to do (how many anvils of each type to make, how many to put into storage, which ones to drop off cliffs, ...). How would you decide which solution was best? The criterion you use will turn into your objective function. Common choices include maximizing profit or minimizing cost (or, if you're the government, maximizing cost), but it could be something non-monetary such as maximizing the number of at-risk subjects inoculated against some disease or maximizing the amount of food aid distributed to a stricken area.

Having determined your decision criterion, you need to express it as a function (preferably linear, but life isn't always cooperative) of your decision variables. In attempting to do so, you may identify quantities other than decision variables that you need to know (or that would be helpful) in order to express the objective function. Those quantities will turn into auxiliary variables. Mathematically, there's no difference between a "decision" variable and an "auxiliary" variable; the distinction is largely in whether it looks like something you would decide explicitly (previous section) or implicitly (this section).

A common example of an auxiliary variable for business students is inventory. How many blue anvils should be put into inventory in March to be shipped in April? You might not have thought of that as an explicit decision when identifying the original variables. When you try to express profit (your criterion) as a function, however, you realize that there are costs for carrying inventory, and you must account for them ... which requires knowing how much inventory you have. Thus are born the inventory variables.

There are methods for handling multiple competing criteria (I want to maximize distribution of food aid while minimizing the cost of acquiring and distributing it), but beginners are urged to start with a single criterion. If you must account for multiple criteria, you will need to do one or more of the following:

prioritize them;
set reasonable targets (aspiration levels) for them;
weight them (e.g., I'm indifferent between feeding one more person and saving $20 in cost).

Constraints: What restricts your decisions?

At this juncture, the ideal set of decisions is likely to be obvious. Make an infinite number of anvils and sell them all for a profit (at the risk of causing road runner extinction). Eat as much as you want of all the foods you like. Sadly, various factors will constrain your free will here. There's only so much iron available for anvils, your factories have limited capacity, and demand for anvils is not infinite. Your wallet will not support that all-steak diet, your wardrobe will not support the change in dimensions resulting from all the ice cream you can eat, and your doctor (or mother) (or both) absolutely insists that something green is mandatory (and mint ice cream does not qualify).

Your task now is to identify the aspects of the situation that will limit your decisions and express them as equations or inequalities involving your decision variables. Again, you may identify auxiliary variables during this process. If you do not have a specific accounting cost for holding inventory, you might not have identified it in the previous section. When it comes time to connect anvil production with anvil shipments, though, you may find it convenient to create inventory variables.

For strictly mathematical reasons, you will need to avoid strict inequalities. Where quantities are discrete, this is easy. "I need to make more than 50 blue anvils" (strict inequality, verboten) is equivalent to "I need to make at least 51 blue anvils" (weak inequality, perfectly legal). In other cases, it may not be so easy. "We need to make some blue paint" sounds a lot like "blue paint production $> 0$", but that's a strict inequality. Your choices are either "blue paint production $\ge 0$" (allowing the possibility that you produce no blue paint) or "blue paint production $\ge \epsilon$ where $\epsilon$ is a strictly positive parameter (a lower production limit that is the minimum amount you can live with).

I hope someone finds that helpful.