Sunday, July 31, 2011

The Curse of Basic Numeracy (or Why I Keep Gaining Weight)

Like all O.R. people (I hope!), and perhaps 2% of political officeholders in the U.S. (fewer if you restrict the count to Congress), I am functionally numerate. In particular, when ordering food at restaurants or high-calorie beverages (like the one I'm slurping as I type this) at coffee shops, I recognize when multiple sizes of the same order are priced so as to give the consumer an economy of scale.  For instance, the price of the frozen mocha in front of me comes from the following table:



Size
(oz.)

Price
(USD)
Marginal Cost
(USD/oz.)
Average Cost
(USD/oz.)
10 3.75 0.375 0.3750
16 4.35 0.100 0.2719
20 4.65 0.075 0.2325
24 5.05 0.100 0.2104

There is a modest diseconomy of scale in marginal cost going from the 20 oz. size to the 24 oz. size, the reason for which eludes me. That in turn bothers me at one level (I teach in a business school, so I feel that I should understand pricing models) and not at another (I occasionally fly on commercial airlines, whose ticket pricing algorithms appear to be the first "practical" application of chaos theory).

Obviously, being an O.R. person (and a business prof at that), I was compelled to order either the 20 oz. size (best marginal cost) or the 24 oz. size (best average cost). Thus do I, being perfectly rational, give the locker room scale one more reason to shudder when I walk in the door. I did order a reduced-calorie version, which is somewhat like playing stickball on a busy one-way street as opposed to a busy two-way street.

Update: I'm now sitting in another coffee shop, where I buy beans. They have a pretty sweet deal: buy a pound of beans (slightly more expensive bargain brands at the local supermarket, but not much more) and get a free drink, any type, any size.  I'm not sure exactly how many ounces in this drink, but the fact that it has its own lifeguard on duty is not a good sign. I truly do not want to think about how many calories it contains.

Tuesday, July 26, 2011

Princeton Consultants Is Hiring

This is a first for me.  I received an email this morning for someone at Princeton Consultants, asking if it would appropriate to list some job openings on the blog. Given the current unemployment rate in the U.S., who am I to say no -- especially since their name is "Princeton Consultants" and that they are headquartered in Princeton, NJ (although as far as I know are unaffiliated with Princeton University).

Quoting the message I got:
Our firm, Princeton Consultants, does IT and Management Consulting (optimization, systems integration, business process improvement, etc.) for a variety of Fortune 500 companies. We are currently looking for 10-15 new employees to start immediately as an Associate, Senior Associate or Analyst.
These are apparently all full-time positions. Interested readers can learn more about the company from their careers page, and can find a brief application form on their site. If you know any OR types in the job market, please pass the word to them.

Monday, July 25, 2011

USENET: A Remembrance

Although I've already submitted a post to the July INFORMS blog challenge ("O.R. and Social Networking"), I cannot let the topic go by without a (an?) homage to sci.op-research. It is probably premature (albeit barely) to write an epitaph for USENET: in fact, its traffic apparently continues to grow, and where once you needed access to a USENET server, now much of it is accessible via Google Groups.

(image source: Wikipedia)
I suspect, though, that the traffic growth is more a tribute to our insatiable appetite for naughty pictures, bootleg videos and software, and passionate debates about atonal musicians than a testimonial to continued use of USENET as a social network for professionals.

I'm sure a lot of younger people think "social networks" started with MySpace and Facebook. For those of use who predate the Internet (yes, kids, a few of us remain), social networks were once both virtual and analog (cf. "Old Boy Network"). After the introduction of the Internet, but before the advent of the World Wide Web (second note to the kiddies: yes, the two are distinct, and came into being several years apart), we began to enjoy a new way of connecting with people outside our sphere of physical contact: bulletin boards systems (BBSes). These tended to be text only and slightly cumbersome, but they were for many of us the first medium for broadcast communication, where your message is not directed at a specific individual, unlike letters, telephone calls etc. (For me they were the second such medium; I was an amateur radio operator briefly in my teens.)

Then came the mother of all BBSes, USENET.  I think it caught on first with system operators, programmers and other "computer geeks", then with a smattering of other technical types. (Bear in mind that, in the early days of the Internet, access was somewhat limited, largely to universities and the military here in the U.S.)  Then came the flood of, um, less technical stuff (alt.animals.otters? alt.amazon-women.admirers??).  Circa 1993, Mohan Sodhi came up with the idea for a USENET group for operations researchers, and sci.op-research was born.

For a while, at least, sci.op-research was a way for individuals in the O.R. community to ask and answer questions, and for the occasional non-O.R. person to get help.  By 2009, Mike Trick was ready to pronounce it, and the rest of USENET, irrelevant. Today, sadly, it has little non-spam activity. (Mike gets the self-fulfilling prophecy award for creating OR-Exchange, which likely is the nail in sci.op-research's coffin.) Someone with a quick O.R. question or thought today may think Twitter first; for a longer question, they likely will look to a web forum (such as OR-Exchange). Google+ may yet become another viable alternative for communications with and among O.R. people.

Still, in its heyday sci.op-research (and at least portions of USENET) served a useful purpose ... and not just to let O.R. professionals network with each other. One time I answered a question from a practitioner (MS in some engineering discipline, no formal O.R. training that I recall) that led to an exchange of emails as we pinned down the answer to his question.  A year later he contacted me again, offering co-authorship in a paper (which landed in a respectable journal). To this day I've never met my co-author. Neither the collaboration nor the paper would have happened without USENET.

Sunday, July 24, 2011

Social Networks at Work

The theme for the INFORMS Blog Challenge for July is "O.R. and Social Networking". I suspect that most posts will deal either with how O.R. tools can be applied to social networks (algorithms that recommend new "friends", managing network traffic, ...) or how social networks can benefit O.R. people. I hope to take a slightly different tack here. Operations research, management science and analytics propose to help people make better decisions and help systems operate more smoothly, and I think we have something to offer in better understanding and managing the role of social networks in the workplace.

Let me start by disclaiming any originality of the following ideas. Researchers in organizational behavior have already taken note of social networks in and between organizations. The central notion is that individual workers and groups of workers gain productivity by leveraging contacts in other units of the same organization or in other, separate organizations. Often those are direct contacts, but sometimes not. (To be concrete, if we picture a social network as a graph with actors or groups of actors as nodes and relationships as edges, direct contacts are nodes adjacent to a given node. Indirect contacts are nodes connected to a given node by a path of length greater than one.) There are various reasons why an organization might be concerned about social networks within it and between it and other organizations:
  • If workers become more productive by exploiting links, the organization might benefit from fostering the development of such links.
  • A social network within an organization may improve intra-organizational communication.
  • Conversely, a social network within an organization might contribute to the development of cliques and cause a rise in "political" behaviors.
  • Contacts with members of allied organizations might improve the relationship between the organizations (for instance, helping coordinate a supply chain).
  • Contacts with members of competing organizations might foster cooperation on some issues, but also might lead to leakage of proprietary information.
  • Individuals with many valuable connections (nodes with high degree, either weighted or unweighted) may be of particular value to the organization, warranting extra effort to retain and reward them.
  • Unmanaged development of social networks, both within the organization and leading to the outside, might result in the workforce being divided into "ins" (highly connected individuals) and "outs" (nodes with low degree). To the extent that the social network improves performance, the "outs" may be at a disadvantage in career development. This can undermine mentoring and retention efforts, and may be a particular concern for minorities and non-domestic workers.
To manage social networks, organizations need to be able to quantify them. This means:
  • defining what constitutes a network;
  • mapping the network;
  • finding a way to quantify things such as influence level and value to productivity (which may not be symmetric, meaning the network is directional);
  • identifying options for creating or manipulating networks (decision variables);
  • estimating the cost and potential impact of each option;
  • assessing risks of various options (including allowing networks to grow unmanaged); and
  • finding an optimal program of network construction/management.
(Being an optimization guy, I had to squeeze that last item in.) The measurement aspects, including mining existing data (emails, phone records, reports) to help identify existing networks, sound like analytics; the mapping, analysis of costs and impacts, and prescriptive recommendations sound like operations research. So I think we have something to contribute here.

And the great thing about being a blog author is that I'm not required to come up with any solutions (or even brilliant insights). :-)

Saturday, July 23, 2011

Farkas Certificates in CPLEX


This is a post about linear programs, specifically infeasible linear programs, and how to deal with them in CPLEX. In response to a question on a CPLEX forum, I played around a bit with Farkas certificates as implemented in CPLEX 12.2. I've never really worked with them, as I have a tendency to produce feasible models. The posted question, though, aroused my curiosity. It asked how to extend a Farkas certificate to a ray for the (unbounded) dual when the primal problem contains bounds on variables.

A Farkas certificate is a vector of constraint multipliers that proves a primal linear program to be infeasible. The name comes from Farkas's Lemma. Assume, without loss of generality, that we have a linear program of the form\[
\textrm{(P) }\begin{array}{lrcl}
\textrm{minimize} & 0\\
\textrm{subject to} & Ax & \ge & b\\
& x & \ge & l\\
& x & \le & u
\end{array}.
\]We are interested here only in the feasibility of (P), so we may as well assume the objective function (irrelevant to feasibility) is zero. Vectors $l$ and $u$ are lower and upper bounds for the variables $x$. To simplify the discussion, we will assume that $l$ and $u$ are finite (but not necessarily nonnegative). What will transpire extends easily to infinite bounds.

Assume that the lower and upper bounds are entered as bounds, rather than as constraints (so that the solver sees $Ax\ge b$ as the only set of constraints). A Farkas certificate for (P) will be a vector $y\ge0$ with the following property. Set $q'=y'A$ and define a vector $z$ of the same dimension as $x$ as follows:\[
z_{j}=\begin{cases}
l_{j} & \textrm{if }q_{j}<0\\
u_{j} & \textrm{if }q_{j}\ge0
\end{cases}.
\](Actually,the definition of $z_{j}$ is arbitrary and irrelevant if $q_{j}=0$; I chose $u_{j}$ just for concreteness.) The key property of the Farkas certificate is that it will satisfy\[
q'z=y'Az < y'b.
\]Now suppose that $x$ is any feasible solution, so that in particular $l\le x\le u$.  Then \[
q_{j}\ge0\implies z_{j}=u_{j}\ge x_{j}\implies q_{j}z_{j}\ge q_{j}x_{j}
\]and
\[
q_{j}<0\implies z_{j}=l_{j}\le x_{j}\implies q_{j}z_{j}\ge q_{j}x_{j}.
\]So $q'x\le q'z<y'b$. If $x$ is feasible in (P), though, we have $Ax\ge b$; given that $y\ge0$, it must follow that $q'x=y'Ax\ge y'b$, a contradiction. Thus the existence of a Farkas certificate for (P) tells us that there cannot be any feasible solutions $x$ to (P).

Now let's tie this to the dual (D) of (P), which is\[
\textrm{(D) }\begin{array}{lrcl}
\textrm{maximize} & y'b+v'l-w'u\\
\textrm{subject to} & y'A+v'-w' & = & 0\\
& y,v,w & \ge & 0
\end{array}
\]if we now include the bounds on $x$ as constraints (and treat $x$ as unrestricted in sign). The feasible region of (D) is a cone with vertex at the origin. Denoting by $h^{+}$ and $h^{-}$ the positive and negative parts respectively of a vector $h$ (i.e., $h_{j}^{+}=\max(h_{j,}0)$ and $h_{j}^{-}=-\min(h_{j},0)$), let $y$ be a Farkas certificate for (P), let $q'=y'A$ as before, and let $v=q^{-}$ and $w=q^{+}$. Then $(y',v',w')\ge0$ and $(y'A+v'-w')'=q+v-w=q+q^{-}-q^{+}=0$;
so $(y',v',w')$ is a feasible solution to (D). Moreover, based on our definition of $z$ above, $q'z=-(q^{-})'l+(q^{+})'u=-v'l+w'u$, and so $y'b+v'l-w'u=y'b-q'z>0$. By extending the Farkas certificate $y$ with $v=q^{-}$ and $w=q^{+}$, we obtain a feasible solution to the dual with a positive dual objective value. Since the dual is a cone, $(y',v',w')$ is a recession direction along which the dual objective is unbounded.

Let's go in the other direction. Suppose that (D) is unbounded (making (P) infeasible), and let $(y',v',w')$ be a feasible dual solution with objective value $y'b+v'l-w'u>0$. Let $q'=y'A=w'-v'$ and define $z$ as before. Then $l\le z\le u$, so\[
q'z=w'z-v'z\le w'u-v'l<y'b
\](the last step since the dual objective is positive at $(y',v',w')$); thus $y$ must be a Farkas certificate for (P).

To answer the original question (finally!), suppose we are solving an infeasible primal problem in CPLEX. CPLEX provides a mechanism (the IloCplex.dualFarkas method in the Java API) to retrieve a Farkas certificate $y$. To extend that to the corresponding dual multipliers $v$, $w$ for the lower and upper bounds respectively, compute $y'A$ and assign the positive parts to $w$ and the negative parts to $v$. It would be nice if there were a more direct method (one that would avoid the multiplication $y'A$), but I do not know of one.

One last note: the point of extending the Farkas certificate to a full dual solution is usually to obtain an extreme ray of the dual. As shown above, any solution to (D) with positive objective value provides a Farkas certificate, not just extreme rays. So we rely on the mechanism by which the solver generates the certificate to pick one corresponding to an extreme ray.

Monday, July 18, 2011

The Continuing Adventure of Mint 11

I should have left "well enough" alone.  Mint 11 Katya (AMD 64 bit version) was running well on my Acer Inspire home PC, other than one little annoyance: when I booted, before and after the grub menu, my monitor would nag about a suboptimal resolution (and tell me that 1440x900 at 60Hz was optimal).  So I decided to change the screen resolution used by the boot loader (Control > Startup-Manager).  Now I can't swear that was the cause (cf. "post hoc ergo propter hoc fallacy"), but ever since then normal boots freeze, sometimes after the battery state is reported as OK and sometimes before.  I'm still able to get Linux up and running, but by one of a couple of circuitous routes:
  • booting into recovery mode, starting in the failsafe X mode, then rebooting normally seems to work;
  • booting into recovery mode, then using the option to reboot with a file system check routinely works, even though the file system check appears to find no problems.
I also tried the dpkg repair option after booting into recovery mode.  It found no packages but did give me the option to upgrade the system core (which I did -- another slow process).  That proved to be no help either.

Channeling my inner Sherlock Holmes, I review the facts: a change to display parameters immediately preceded the problems; booting in failsafe X mode appears to help bypass the problem.  So perhaps something in the display subsystem  is the culprit (although that does not explain why a mandated disk check gets the machine to boot properly).

A search of the Internet shows that I'm not alone in experience the boot problem, although it is unclear that what's going splat on my system is the same as what's going splat on other systems.  The author of this post, for instance, appears to have encountered multiple maladies, sufficient that he abandoned the attempt to use Mint. Someone experiencing the same symptoms with an earlier release had some success hiding the xorg.conf file.  I tried moving it (and eventually deleted it), but to no avail. Another user found success by deleting and reinstalling the nVidia driver.  I tried that (strangely slow to uninstall, very slow to download and reinstall!), but no joy for me.  At least I think not.  My first reboot after reinstalling the nVidia driver failed to make it even as far as the battery check.  So I rebooted, again in normal mode, and the system started properly.  Previously, rebooting repeatedly in normal mode failed (every boot attempt got stuck in the same place).  Is this progress? Will the beast boot normally from now on?  Only time will tell.  Meanwhile, I have to go back to doing actual work.  If anything changes or I find new information, I'll update this post.

Update #1: The day after I reloaded the nVidia driver, the machine booted normally.  So I was cautiously optimistic that maybe the reload had fixed the problem, and the hang immediately after the reload (previous paragraph) was just a last gasp of the bug.  No such luck.  The first boot this morning hung somewhere around the line saying that flushing the log to disk had stopped. I did a plain reboot, and the second attempt made it to the battery status ok line.  I did a third plain boot, and lo and behold it booted properly.  So the bug is still there, but either unsuccessful boots make some sort of progress now (write something to disk?) or else it's just stochastic (race condition?).

Update #2: Thinking that perhaps I'd messed up something in grub, I reinstalled grub and reverted something (a config file? a script?) to the package maintainer's version when asked. The next couple of boots went fine, but the bug then reappeared. So no joy there. However, after a few days of win some/lose some boots, my AMD box has gone on a tear of 30 straight successful boots. I can't think of anything I did that could account for this, and none of the updates pushed down to the machine appear to be related to booting.  So maybe Loki just remembered he had business elsewhere. My laptop, however, continues to occasionally conk out during a boot, which is probably an entirely different bug.

Update #3: See this post and this later post for additional nVidia-related issues (and possible cures).

Saturday, July 16, 2011

Facts, Beliefs ... and Budgets

Melissa Moore (@mooremm), Executive Director of INFORMS, recently tweeted the following question: "What or tools would you use if you were asked to help solve the US Federal /#debt impass?"  My initial reaction (after verifying that a flammenwerfer is not considered an OR tool) was that OR and analytics would be of no use in the budget debate (debacle?).  OR and analytics rely on facts and logic; it is unclear that either side of the debate is interested in facts or willing to be constrained by logic.

The question did set me to thinking about the difference between facts and beliefs.  I have a hard time sorting out when demagogues, whether politicians or media bloviators, are espousing positions they actually believe and when they are simply pandering for ratings/votes.  (My cynicism is hard won: I grew up in New York, went to school in New Jersey, and cast my first vote to reelect Richard M. Nixon.  It's been downhill from there.)  For the sake of argument, let's stipulate that both sides are acting on beliefs they truly hold.  When I was younger it seemed to me that, however venal either side's motives might be, both the left and the right were capable of negotiating based on some common understanding of governance and the political, social and economic realities of the country they governed.  It's hard to trade horses, though, when one side can't tell a horse from a zebra and the other can't tell a horse from a camel. Today, one party thinks that the answer to any question that does not contain the phrase "gay marriage" is "cut taxes".  The other side thinks that the answer to any question that does not contain the phrase "gay marriage" is "tax the rich".  That the proposed solution might not work is simply inconceivable (as is the possibility that the other side's solution might work).

The somewhat unnerving truth, however, is that everything we think we know as a fact (raw data aside) is ultimately a belief.  My training is in mathematics. Casual users of mathematics, and even forgetful mathematicians, tend to think that what has been "proved" (i.e., a theorem) is definitively true. In reality, theorems are merely statements that must follow logically from a set of axioms (beliefs). The system of logic we accept is itself a matter of belief, but in the interest of avoiding a painful flashback to an undergraduate formal logic course I'll drop that line of thought right now. As in mathematics, so too in the physical sciences: theory arises from a mix of assumptions and empirical evidence; when new evidences clashes with the theory, modifications are made; and when the modifications become untenable, some assumption is altered or deleted and the theory is rebuilt.  (Remember when the speed of light was a constant?)

So if mathematics and physical sciences are built on leaps of faith, we really cannot fault elected representatives (and economists) from doing the same.  What we perhaps can demand, though, is that these beliefs at least be acknowledged as beliefs (not "proven facts"), and that decision makers attempt to examine the likely impact of any of those beliefs turning out false. As a parallel (pun deliberate), consider Euclid's Elements, written ca. 300BC, in which Euclid developed many theorems of what we now refer to as "Euclidean" geometry based on five postulates.  The postulates appear self-evident, and mathematicians over the centuries tried unsuccessfully to derive one from the others (turning the derived one into a theorem). In the 19th century, Nikolai Lobachevsky famously replaced Euclid's fifth postulate with a negation of it, perhaps hoping to prove the fifth postulate from the others by contradiction. Rather than finding a contradiction, he invented hyperbolic geometry, which is not only consistent as a mathematical system but has actually found use (those bleeping physicists again).

So, back to the original question: can OR bring any useful tools to bear on the budget debate? With enough time and effort, and exploiting the systems perspective that underlies OR, perhaps we could diagram out the interplay of all the assumptions being made (consciously or unconsciously) by each side; and perhaps, using simulation models based on those assumptions and calibrated to historical data, we could explore the consequences of each side's preferred solution (or, for that matter, any compromise solution) should any specific assumption not hold up. It would be a massive undertaking, and I am not confident it would be productive in the end. Zealously held beliefs will not yield easily to "what if" analyses.

Monday, July 11, 2011

Perils of "Big M"

A somewhat recent Twitter exchange with Bo Jensen, which I believe was triggered by a post on an optimization forum (details of which are fading rapidly from my memory), motivated this entry. It deals with optimization models (specifically linear or integer linear ones) that are constructed using what is sometimes referred to as the "big M" method.

Terminology

First, I need to clarify terminology.  Any reference to "big M" means the incorporation into a model of coefficients (usually represented by $M$) that are in a sense surrogates for $\infty$. The phrase "big M method" is frequently used to cover two somewhat different modeling techniques.
  • One eliminates the first phase (finding a feasible solution) of the two-phase simplex method by merging the phase I and phase II objective functions into a single function, penalizing the artificial variables with very large coefficients ($M$). It's an example of using a penalty function, and requires a sufficiently large value of $M$ to ensure that violating a constraint is never optimal.
  • The other provides a way for binary variables to turn constraints on or off. Consider, for example, a lockbox problem in which the number $x_{ij}$ of checks from customer $i$ assigned to lockbox location $j$ cannot exceed some capacity limit $M$ if the location is used (binary variable $y_j=1$), but should be zero if the location is not used ($y_j=0$). This is formulated as $$x_{ij}-My_j\le 0.$$

Rounding error

My first course in graduate school was in numerical analysis, and it came as a bit of a shock to my system. We covered rounding, truncation and representation errors rather extensively -- very disconcerting stuff to a pure mathematician. One would like to think that computers do arithmetic accurately. One would like to believe that politicians are honest and competent. One is destined to be disappointed both times.

I won't go into a lengthy introduction to the aforementioned types of error (which I'll somewhat cavalierly lump together under the name "rounding error"). Let me instead give a small example. The code is in Python (which represents all floating point numbers in double precision), but it could be any language.
x = 1.0e10
y = 1.0e-9
z = x - y
w = z - x
print x, y, z, w
print (z == x)
print (w == -y), (w == 0)
The output of the first print line is: 
1.00000000000000e10 1.00000000000000e-9 1.00000000000000e10
 0.000000000000000
 Note that z looks suspiciously like x (it should be strictly smaller), and w (which should be -y) is zero. That could just be limited precision in the output. The next three prints, however, tell the tale. The first (z==x) and third (w==0) should be false, while the second (w==-y) should be true. What we actually get:
True
False True
Whoops!

This leads me to a few truisms from that long-ago numerical analysis course:
  • Addition or subtraction of numbers introduce rounding errors, and addition or subtraction of numbers with substantially different magnitudes introduce rounding headaches. (Multiplication and division can also introduce small errors, but addition and subtraction tend to be the killers. The biggest problem with multiplication or division is probably overflow or underflow.)
  • Comparisons are closet subtractions. It's generally unsafe to test $x=y$ when $x$ and $y$ are floating point numbers. Instead, test $|x-y|\lt \epsilon$ for a suitably small (but not too small) $\epsilon \gt 0$.
  • In particular, things that should be zero often come out not quite zero (commonly referred to as "decimal dust"). More than once I've seen someone ask why some solver gave an optimal solution to his problem that had a binary variable equal to 1.0e-9 (throw in a few more junk digits if you like). It should be either 0 or 1!  Well, yes; but if the solver held out for exactly 0 or exactly 1, you'd get a lot of wrong answers due to decimal dust. So the solver accepts that "darned near 0" is 0 and "darned near 1" is 1. 

    $M$ in the objective or RHS

    Having $M$ in the objective function (as in the first definition of "big M" above, eliminating phase I of the simplex method) or having it as the right hand side of a constraint introduces rounding error (per my first bullet above). Specifically, $M$ in the objective can cause rounding errors in reduced costs, which may confuse the solver about which columns are candidates for entry to the basis. $M$ in the right hand side can introduce rounding errors in the right hand sides of subsequent tableaus, which may cause problems selecting the variable to leave the basis. As far as I know, those issues are generally survivable, and I've not seen any crash landings caused by it. Perhaps someone who has will comment with an example of a bad outcome from $M$ on the perimeter of the model.

    $M$ in the constraints

    The lockbox formulation illustrates the situation where $M$ shows up as a coefficient in the constraint matrix. This is where serious misadventures can occur. If $M$ creeps into the basis matrix $\textbf{B}$ (or a column that is a candidate for entry into the basis matrix), the ensuing rounding errors can cause $\textbf{B}$ to look singular (or can cause a column that should be ineligible for entry to the basis, because it would make $\textbf{B}$ singular, look eligible). The rounding errors can also cause severe loss of precision in the computation of $\textbf{B}^{-1}$. In technical terms, $M$ (or $1/M$) appearing in $\textbf{B}$ can make $\textbf{B}$ ill-conditioned.

    Suppose that, in a formulation with $M$ in constraints, we bump into the following candidate for a basis matrix$$\textbf{B}=\left[\begin{array}{ccc} 1 & 0 & \frac{1}{M} \\ M & 0 & 1 \\ 1 & M & 0\end{array}\right].$$Since $$\textbf{B}\ \left[\begin{array}{c}1 \\ \frac{-1}{M} \\ -M\end{array}\right]=\left[\begin{array}{c}0\\0\\0\end{array}\right]$$it is obvious that $\textbf{B}$ is singular. Right? Well, let's check the determinant of $\textbf{B}$. I did the calculations using Sage 4.7, but I have no reason to think other packages would do any better (although pathologies might well vary).

    Here is the Sage code I used to compute the determinants:
    B = matrix(3,3,[1.0,0.0,1.0/M,M,0.0,1.0,1.0,M,0.0]);
    [det(B.substitute(M=10.^k)) for k in (28, 29, 30, 31)];
    
    Here is a table of the results:$$\begin{array}{rccc}M & 10^{28} & 10^{29} \\ \det(\textbf{B}) & 0.000000000000000 & 0.000000000000000 \\  \\
    M & 10^{30} & 10^{31} \\ \det(\textbf{B}) & -1.40737488355328e14 & 0.000000000000000\end{array}$$
    So something inexplicable (but bad) happened when $M=10^{30}$, presumably due to loss of precision in some key computation.

    You won't see exactly this glitch in  a typical "big M" model, but if you play with enough "big M" formulations, sooner or later (probably sooner) you will trip over numerical instability attributable to mixing disgustingly large coefficients with considerably smaller ones.

    $M$ in MIP models

    As I mentioned above, one common use of "big M" formulations (illustrated by the lockbox example) is to allow a binary variable to turn a constraint on or off. Even if ill-conditioned basis matrices do not occur, large values of $M$ can cause branch-and-bound solvers to make slow progress solving the mixed-integer programming model (MIP). Consider our simple lockbox constraint: $x_{ij}-My_j\le 0$. There will be an associated penalty cost (say, $c_j y_j$) in the objective function for using location $j$. Now suppose that, at some point, the solver is considering a node where $x_{1j}=2$, $x_{2j}=5$ and $x_{ij}=0$ for other values of $i$. Logically, we know this requires $y_j=1$ and incurs cost $c_j$; but in the LP-relaxation of the node problem, $y_j=5/M$ is sufficient, incurring a cost of just $5c_j/M$. For large values of $M$, this substantially underestimates the true cost, leading to loose node bounds.  Loose bounds at nodes in turn make it hard for the solver to prune nodes based on objective value, and so more nodes need to be examined, slowing the solution process.

    $M$ and (dubious) pedagogy

    I can't write a post this long without at least one rant. Let me stipulate up front that neither text book authors nor instructors can be expected to cover every exigency. That said, many instructors and authors introduce students to "big M" formulations in the abstract, because they are mathematically neat and easy to explain, and then move on to the next topic without regard to the grubby implementation details. I confess to having done this myself when I was young and foolish.

    At least one thing, though, is simply notational laziness. That $M$ in my lockbox example? It should be $M_j$. There is no reason why a single value should be used for all instances of $M$ in a model, and very good reasons why the values should differ. I don't recall seeing $M$ subscripted very often, if ever. (Are text book authors charged for each subscript they use?) In certain examples, including lockbox problems, there is a natural and obvious value for $M$ (typically a capacity limit), and in those cases you see individualized values being used. More often, though, there is just this undifferentiated symbol $M$ floating throughout the model. I think that influences users to actually use a single, very large value everywhere. (I recently saw a model on a CPLEX support forum that was clearly a "big M" model with a single large value of $M$ in about half the constraints.)

    Concluding thoughts...

    ... and if you're still with me, you're very ready for this post to conclude.
    • If you are going to use a "big M" formulation, look for the smallest values of $M$ that work in the context of the model. (I once wrote a paper in which I analytically derived a sufficiently large but not horribly inflated value for $M$. There are tricks for doing so.)
    • $M$ does not need to be one size fits all.
    • Particularly if $M$ will show up as the coefficient of a variable in a constraint, be prepared for numerical instability. Learn how to detect numerical problems and what levers your solver has for coping with instability.
    • Consider alternatives to a "big M" approach (two phase simplex, Benders decomposition, ...) if you have stability or accuracy problems you cannot resolve.