OR in an OB World: August 2012

Tuesday, August 28, 2012

A MythTV Mythtery

... with apologies to the late Robert Asprin for stealing one of his puns.

My Mythbuntu installation abruptly decided, starting around August 24, to malfunction. The system is set up to boot straight into MythTV, meaning that the back end server starts automatically at boot, Mythwelcome starts automatically, and Mythwelcome in turn starts the front end automatically. The back end and front end are on the same box, with IPv4 address 127.0.0.1 and IPv6 address [::1]. The box is connected to my home wireless network (DHCP address). The PC also has an Ethernet card that is not connected to anything (and has not been since before I installed Mythbuntu).

Things worked fine for several months, but starting four days ago I observed the following symptoms after a boot:

The front end generated an error message: "Could not connect to the master backend server. Is it running? Is the IP address set for it in mythtv-setup correct?"
A Python script (sendProfile.py) would attempt to send a bug report and would itself crash with "Failed to connect to backend at 127.0.0.1:6543" (which should be the correct address).

The script crash message would only occur once, but the first message repeated every two seconds or one keystroke until I managed to fight my way out of the front end. Once I got into Mythwelcome, the same error would pop up with the same dogged determination until I exited Mythwelcome. If you've ever fought with a Microsoft system message that just won't let go, it's deja vu all over again.

I checked the back end configuration, and it was correct (127.0.0.1, [::1]). Interestingly, running through the configuration program would temporarily solve the problem (until the next boot) or break things totally. Success apparently occurred because the configuration program forces you to shut down the back end (which it restarts when you exit configuration). Failure occurred occasionally (but not always) because the initial shut down apparently did not succeed, and after I exited the configuration program I wound up with two separate mythbackend processes.

Anyway, for what it's worth, I discovered that if I exited the front end and killed the back end with sudo kill -9 <process id>, the back end would automatically restart, the front end would be able to connect to it, and all would be fine ... until the next boot.

Looking in the back end log file, I found that prior to the 24th there would be messages saying the backend was listening on TCP [::1]:6544 or TCP 127.0.0.1:6543, which was as it should be. From the 24th onward, the corresponding messages would mention TCP [fe80::218:37ff:f32b:100d%wlan0]:6544 (or 6543), with a failure message if it was after a reboot (success if it was after my kill -9 ministrations), or TCP [fe80::20d:feff:fe7e:455b%eth1]:6544 (or 6543) with apparent success, not that it did me any good. My best guess is that the change was prompted by my installing a system update, since I certainly did not touch any configuration files, or anything else, directly.

Solution? NO!

Various web searches turned up nothing useful, until I stumbled across a post from someone encountering a similar but different problem running MythTV on Windows. The cure for him was to disable his Ethernet adapter. So I changed the settings for the eth0 and eth1 interfaces on my system so that (a) they don't automatically try to connect and (b) users do not automatically have access. After a reboot, everything appeared to work. After a second reboot (I have a suspicious nature), it still worked. I'm tentatively declaring victory, but I hope someone fixes this (without breaking something else) in a future update.

Update: Boy, was that ever a mistake! Everything looked fine, until I recently discovered that nothing was recording. The PC woke on time, Mythwelcome showed the correct information for upcoming recordings, there were no error messages (until I looked in the logs) ... but when the time to record a show came, the backend experienced an error and the show went unrecorded. Midway through an afternoon of log parsing and repair attempts, I discovered that MythTV could no longer see my external tuner card (a Hauppauge WinTV-DCR that MythTV thinks is an HDHomeRun Prime). The tuner communicates via a nonrouteable local IP address, and even though it plugs into a USB port, it apparently counts as one of the Ethernet ports (probably eth1, but I'm not certain about that). When I set both interfaces to turn on manually and restricted them to root access, I guess I effectively disconnected the tuner card. Oops!

Update 2: I may have fixed it. Details are posted here.

Monday, August 27, 2012

Step Functions and Mixed Functions

Someone asked me (elsewhere) about modeling, in a mathematical program, a piece-wise linear function that was a mixture of steps and linear segments. Since it contains step functions, it is neither continuous nor convex, but nonetheless it can be modeled within a mixed-integer linear program (MILP). Here is a sketch of a sample function.

step function with diagonal linear segment on the right end

I drew this with one diagonal linear segment, at the right end, but what follows will work for any mixture of steps and linear segments. If you are wondering why the heights are listed at $y_2,y_4,y_6$, it will become clear in a minute. The labeling is a bit fuzzy (sorry about that). The annotation of the diagonal segment is $y=mx+b$.

The key to modeling this is recognizing that every point on the graph is a linear combination of two adjacent endpoints. (The endpoints are the ones represented by dots on the graph.) There are nine endpoints on the graph, which we will designate as $(x_1,y_1),\dots,(x_9,y_9)$, as shown in the second sketch. For the modeling trick to work, it is necessary that the domain of the function be bounded (in this case $0\le x\le x_9$).

step function with linear segment and endpoints labeled

Hopefully the reason for the subscripts of the ordinates of the horizontal segments is now apparent

Let $y$ be the value of the function. We introduce continuous variables $\mu_1,\dots,\mu_9\in [0,1]$, which will be the weights of a convex combination of the graph points. Both $x$ and $y$ are expressed using these weights:\begin{gather*}x=\mu_1 x_1 + \dots + \mu_9 x_9\\y=\mu_1 y_1 + \dots + \mu_9 y_9\\\mu_1+\dots+\mu_9=1.\end{gather*}
What makes this work is that we also make the weights $\{\mu_1,\dots,\mu_9\}$ a type 2 special ordered set (SOS2), specified in index order. This tells the browser that at most two of the weights can be nonzero and, if there are two nonzero weights, they must be consecutive in the stated order (e.g., $\mu_3=0.4,\mu_4=0.6$ is fine but $\mu_2=0.4,\mu_7=0.6$ is invalid). The SOS2 restriction essentially forces the solver to select one segment of the graph, whether horizontal, diagonal or even vertical; the weights then select one point on the segment. If $y$ is part of the objective, and the chosen segment happens to be vertical, it's a safe bet that one of the weights will be 1 and the rest 0 (picking whichever endpoint of the segment better suits the objective direction).

Although the weight variables are continuous, introducing the SOS2 constraint will effectively turn the problem into a MILP even if it otherwise would have been a linear program (LP).

There are other ways to model this type of function. In particular, anything done with SOS1 or SOS2 constraints can be done with binary variables and linear constraints involving those binaries.

Related posts:

Thursday, August 23, 2012

Minor Device Mysteries

I have a couple of mildly annoying, though thankfully not crippling, issues with computing devices. Honestly, the fact that I cannot figure them out (the mental itch I cannot scratch) devils me more than the issues themselves. Interestingly, they may both turn out to be sleep disorders.

MythTV

I have a PC running Mythbuntu and dedicated to recording and playing back TV shows. I've written about it before (click "MythTV" in the label cloud, bottom right, if you're curious), including setting it up to wake from a G3 (unpowered) state to record shows or capture program database information. As I mentioned in an earlier post, neither of these works correctly. I find that I have to reload the database manually, and I occasionally find the PC waking for no apparent reason. The latter always seems to happen at 8:00 p.m. local time, which (I think not coincidentally) is 0000 GMT. I'm not positive, but I suspect it only happens when the MythTV front end is showing no scheduled recordings. Initially, I thought this was when MythTV was going to grab program data, but when the PC starts at 8:00 pm (and no recording is scheduled at that time), I check the MythTV front end, and it says that MythTV is idle (which the more detailed information provided by the back end confirms).

Instructions for setting the BIOS alarm clock (see the ACPI Wakeup document for MythTV) specify that the alarm should first be cleared, as the following line from the setwakeup.sh script illustrates:

echo 0 > /sys/class/rtc/rtc0/wakealarm      #this clears your alarm.

I'm left wondering if some script is writing a zero to wakealarm after the last scheduled recording takes place, and if this somehow translates into a (persistent) wakeup call for midnight UTC.

Update: I've fixed this; details are here.

Android

I have a Toshiba Thrive tablet, which by and large has served me well. For months after Android 4 ("Ice Cream Sandwich", also known as "ICS") came out, the Thrive stubbornly remained on Android 3 ("Honeycomb"). A week or so ago, Toshiba finally rolled out the update to ICS.

I'm not sure what all the hoopla about ICS was, but I suspect the improvements most important to me (if any) are under the hood. The user interface got new icons (woo-hoo) and a feature to unlock the device using facial recognition (except this did not make it into the Toshiba version). There's now rather usable handwriting recognition built in, but only on the Google search page in the Google browser (another woo-hoo). You can now close apps by swiping them out of the list of running apps, which is a nice feature.

I use an iGo Stowaway portable Bluetooth keyboard (no longer listed on iGo's web site, so I suspect it's been discontinued). It worked fine before the upgrade, but after I installed ICS I started running into problems where the Thrive would not recognize the fact that I was typing on the keyboard (but did tell me, in the Bluetooth settings, that it was seeing the Stowaway and treating it as an input device). Eventually I deleted and paired the Stowaway, and that seems to have reduced the frequency with which it is ignored ... but not eliminated the problem completely.

The more annoying issue, though, is battery life. Google "android ics battery life" and you quickly discover that a variety of people, on a variety of devices, are complaining that their battery life has decreased greatly since installing ICS. In my case, I'd guess that it is roughly halved, although that's a very subjective estimate. (Yes, I believe in the importance of analytics. No, I haven't bothered to track this, at least not yet.) Some people have issues with their screens not turning off, but that is not the problem on the Thrive. The power indicator should blink when the Thrive is in sleep mode, and with Gingerbread and then Honeycomb it did so consistently. With ICS, sometimes it does, sometimes it does not (meaning the power light stays on, even though the screen does go dark). Either way, a surprising amount of the charge drains when the device is sleeping (about 25% over ten hours). So I have to be diligent to recharge early and often.

I just checked the battery settings. (This is another plus for ICS: it give more detailed information about the battery. Now if I could only find out how to interpret it ...) The Thrive had been on battery for 16:44 (all times here are hours:minutes -- ICS gives seconds as well, but that's rounding error to me). It was down to 72% of full charge, so 28% bled off while it was sleeping. According to the battery info, Android OS accounted for 75% of battery use. Drill down, and then turns into 1:26 for "CPU total" and 5:44 for "Keep awake".

Now I have the device in airplane mode when it sleeps, so the WiFi and Bluetooth should both be off. Why the "keep awake"? Is this because I have antivirus (Avast, which I happily endorse) installed? Is it because I did not shut down all possible apps before sleeping it? Do Google Reader or the Google Play Store try to keep the device awake when there is no Internet connection and it's supposed to be sleeping?

I guess I'll have to try to remember to clean out all running apps before I put it to sleep tonight, and see if that affects battery life.

Update: Apparently running apps are not the culprits. I shut down everything I could last night (antivirus, for one, being excluded) and let the Thrive sleep. After 15.5 hours on battery, 26% of the charge drained. Battery use by Android once again dwarfed use by anything else, and the ratio of keep-awake time to CPU cycles was 4.5:1.

Thursday, August 9, 2012

User Cuts versus Lazy Constraints

After an e-mail exchange with a contact at IBM, I recently gained a bit of clarity (I hope) about the distinction between user cuts and lazy constraints in CPLEX, which I will share here. The terminology and some of the details in this post are specific to CPLEX, but I suspect that a number of other solvers will have something analogous to user cuts and lazy constraints, albeit perhaps under different names. If the terminology is familiar to you, you can skip to the end of the post and find the one potentially useful nugget of information it contains.

Let me start with an illustration of the feasible region of a hypothetical two variable pure integer linear program.

integer program feasible region, showing LP and IP hulls

The polygon bounded in black (including portions of the two axes) is the feasible region of the linear program relaxation (also known as the linear hull or LP hull). This is what you get if you relax the integrality restrictions while enforcing all functional constraints and variable bounds. The blue dots are the integer lattice points inside the polygon, i.e., the feasible solutions that meet the integrality restrictions. The blue polygon, the integer hull or IP hull, is the smallest convex set containing all integer-feasible solutions. The IP hull is always a subset of the LP hull. Assuming a linear objective function, one of the vertices of the IP hull (all of which are lattice points) will be the optimal solution to the problem.

User Cuts

In CPLEX parlance, a cut is a constraint that is not part of the original model and does not eliminate any feasible integer solutions. Cuts are typically added for one of two reasons: to separate a node in the search tree into two nodes representing smaller subproblems; or to strengthen bounds by paring away some of the fat in the LP hull (the "fat" being the set difference between the LP hull and the IP hull). CPLEX automatically adds a variety of cuts, of which the most famous is arguably the Gomory cut. A user cut is simply a cut added by the user, rather than by CPLEX, frequently but not necessarily inside a callback. The next figure illustrates a user cut (red dashed line).

The shaded triangle shows the portion of the original LP hull that is removed by the user cut. Note that no integer solutions were harmed in the construction of this diagram.

Lazy Constraints

In contrast to user cuts, lazy constraints are allowed to pare away integer-feasible solutions. The next figure shows an example of a lazy constraint (green line).

LP and IP hulls with a lazy constraint added

The shaded polygon is again the portion of the LP (and, in part, IP) hull removed by the constraint. Note that, this time, three integer points are being rendered infeasible. The implication of using lazy constraints is that the problem originally specified (first figure) is actually a relaxation of the true problem; adding the lazy constraints to that initial problem gives you the problem you really want to solve.

Why use lazy constraints? Three reasons come to mind. First, a full specification of the model may involve an extremely large number of constraints, many of which you anticipate will be redundant, or at least not binding near the optimal solution. If you knew in advance which of those constraints were redundant, you could simply omit them, but frequently you cannot identify redundant constraints at the outset. Many modern solvers allow you to put a portion of the constraints (what we are calling the "lazy" ones) in a pool, where they are initially not a part of the active model. As solutions are generated, the solver checks to see if any lazy constraints are violated and, if so, adds them to the active set. Lazy constraints that were previously added but have not been binding for a while may be returned to the pool. This is an attempt to reduce the computation time per iteration, and possibly the memory footprint of the problem.

A second and closely related reason to use lazy constraints is that there may in fact be so many constraints that you cannot afford the time or memory to generate them all at the outset. One approach to formulating the Traveling Salesman Problem as an integer program requires the use of subtour elimination constraints. The number of such constraints grows exponentially with the number of nodes in the network. While they are easy to specify in a computer program, for decent size networks you simply cannot afford to list them all. An alternative is to treat them as lazy constraints and generate them on the fly, through a callback.

The third reason to use lazy constraints, specifically in a callback, is that you may not be able to identify them at the time the model is specified. The feasibility and optimality cuts generated during Benders decomposition fall into this category. You discover them by solving one or more subproblems at certain points in the search for the solution to the master problem.

Why Two Categories?

We've observed that the distinction between user cuts and lazy constraints is fundamentally whether or not they can cut off otherwise feasible solutions. The reason we care about that is somewhat arcane. CPLEX and other solvers do a variety of problem reductions, both at the outset (presolving) and on the fly during the search process, to tighten the feasible regions of node problems and expedite the solution process. Some of those reductions involve solutions to dual problems. User cuts do not affect the reductions, but if you hide some actual constraints in a pool of lazy constraints (or generate them after the reductions have been done), the dual solutions used in the reductions are incomplete (lacking dual variables corresponding to the constraints that are not yet included). This can lead to incorrect results.

If CPLEX sees the user adding lazy constraints (or, by virtue of the presence of a lazy constraint callback, threatening to add lazy constraints), it automatically turns of reductions that might be incorrect. I recently ran a program that uses CPLEX (12.4) with a lazy constraint callback. The output contained the following message:

Lazy constraint(s) or lazy constraint callback is present.
Disabling dual reductions (CPX_PARAM_REDUCE) in presolve.
Disabling non-linear reductions (CPX_PARAM_PRELINEAR) in presolve.

As user, you can turn off these reductions manually, but including a lazy constraint callback (or adding constraints to the lazy constraint pool during model construction) will induce CPLEX to do it for you.

So What Got Clarified?

A few versions ago, IBM refactored their implementation of cut callbacks. Of particular interest is the conditions under which cut callbacks are called.

User cuts may be checked and applied at any time during the optimization process. This includes the cut generation phase at each node. They are not guaranteed to be checked at the time a candidate solution (potential new incumbent) is found, however. Calling a constraint a user cut tells CPLEX that it cannot cut off a feasible solution, so there is no need to test it each time a feasible solution is found.
Lazy constraints, on the other hand, are guaranteed to be checked each time CPLEX finds a candidate solution, regardless of how the candidate is found (node LP solution, rounding, various heuristics, inferred from the position of the planets ...). This is important, because a lazy constraint might make the candidate solution infeasible in the full problem. (There is one small exception, at least as of version 12.4 and earlier: a candidate solution injected by the user through a heuristic callback is not checked against the lazy constraints. The user is somewhat charitably assumed to know what she or he is doing, so a user-injected solution is assumed to satisfy all lazy constraints.) What changed in the refactoring of the cut callback interface is that lazy constraints are checked only when a candidate solution is found. They are not routinely checked at every node.

I mainly use lazy constraints in Benders decomposition, and I generate Benders cuts only when a new incumbent is proposed, so this is fine with me. Discussions on various CPLEX and OR forums tells me that other people doing Benders sometimes want to check for possible Benders cuts at every node. There may be other scenarios unknown to me where users desire to check lazy constraints at each node. Before the refactoring, this was possible, but the new rule that lazy constraint callbacks are called only when potential incumbents are identified seems to throw up an obstacle. This is compounded by the fact that the current CPLEX documentation threatens the user with exile to the Phantom Zone should she or he dare lie and call a lazy constraint a user cut.

So here, at last, is the clarification. It turns out that it is legal to add a lazy constraint in a user cut callback (gasp!) as long as the user includes a lazy constraint callback or otherwise ensures that the necessary reductions are turned off. Since user cut callbacks are not called when candidate incumbents are found, at least for Benders decomposition this would suggest adding the same cut generation logic in both a lazy constraint callback (mandatory) and a user cut callback (optional).

I'm told that the next (?) release of official documentation for CPLEX will include this clarification. To close, I'll note that if you intend to sneak lazy constraints into a user cut callback, the safest way to do this may be to include a lazy constraint callback, even if it does nothing (just returns with no action). That way, CPLEX will automatically turn off everything that needs to be turned off. As I said, you can use parameter settings to turn off reductions manually, but which reductions will need to be turned off could potentially change in a new version. I, for one, am too lazy (pun deliberate) to keep up with that stuff.

Wednesday, August 1, 2012

The Olympic Fencing Controversy

I wasn't going to write about this, but Laura McLay nagged me into it. I fenced at the collegiate level 40 years ago (if you can call sitting on Princeton's bench waiting for garbage time "fencing"). I have no experience at the national or international level, let alone at the Olympic level, and the rules have changed drastically in the past 40 years. So take the following with a substantial grain of salt.

Before launching into the controversy, I'll attempt a quick primer, sticking to the bare minimum. (Trust me, there's lots more detail I could drag in.) Fencing (more precisely, Olympic fencing, as opposed to the medieval style you might see at Renaissance reenactments or Asian styles such as kendo or kumdo) involves a choice of three weapons.

The épée, based on dueling swords, is a straight weapon with a point but no edge. If you see two fencers in a movie hacking away at each other with épées, they're wasting their energy (probably because the stunt director thought it would make the action look more exciting). With the épée, it's stab or go home. Any part of your opponent's anatomy is a valid target for the épée. A duel is no time to get finicky. (In real life you try not to hit your own anatomy ... which actually applies to all three weapons ... but strangely enough in Olympic-style épée fencing, deliberately hitting your own foot sometimes gets you an undeserved touch.)
The foil is based on a practice weapon used in earlier times before graduating to the real thing (the épée). The foil is also a point-only weapon, but with a lighter, more flexible blade and a smaller bell (hand guard). Valid target area for the foil is the torso (including the groin area in the front but cut off at the waist in the back). Historically, the target area limitation was probably done in an attempt to encourage students to aim before they poked.
The sabre is based on a cavalry sword, and scores with edges (the full bottom edge and I think about a third of the top edge) and the point. With the sabre, you can both stab and cut. Valid target area is everything above the waist except the hands and the back of the head. The sabre was often used against mounted opponents, and it was considered rude to wound the opponent's horse, hence the emphasis on striking above the waist.

Of course, in sporting competitions the weapons do not literally have points (they have buttons at their tips) nor sharpened edges. Scoring is done electronically, with the weapons electrified and with foil and sabre fencers wearing lamé jackets with wires woven into them. In my day, each fencer trailed a rather bulky cable attached to a reel; today fencers connect to the scoring apparatus wirelessly.

In the Olympics, individual bouts are staged in three three-minute rounds with one minute rest between rounds. A bout ends immediately when one fencer reaches 15 touches (points). If time runs out before either fencer reaches 15, the fencer ahead at the end of the bout wins. Here comes a rule I dislike (and a factor in the controversy): if time runs out with the fencers tied, a coin toss gives one fencer "priority". The fencers then fence for one additional minute, or until the tie is broken, whichever occurs first. If still tied after one minute, the winner of the coin toss wins the bout. (And you thought penalty kicks in soccer were bad!) When I fenced in college, if a bout was tied at the end of regulation time (which admittedly was much shorter then), it continued until the tie was broken or the universe suffered heat death, whichever came first.

Bouts are officiated by a referee (or, in my day, "director"), with assistance from a scorekeeper and a timekeeper, and possibly by side judges (whose main job, in épée, is to tell whether a fencer accidentally or deliberately stabbed the floor). In foil and sabre, simultaneous touches cannot both score; if the apparatus signals hits by both fencers that are simultaneous (to within a specified tolerance), the referee determines who had priority or "right of way" and awards the touch to that fencer. To make a long and complicated explanation short and sloppy, the fencer who initiates the attack has priority until either he/she misses or the other fencer successfully deflects the attach and counterattacks. With épée, however, there is no concept of right of way; if both fencers strike simultaneously, both score a touch.

At the 2012 London Olympics, a semifinal bout in women's épée turned into a circus. South Korean Shin A-lam and German Britta Heidemann were tied 5-5 at the end of regulation. Shin won the coin toss and was given priority, meaning that Heidemann needed to break the tie in her favor within one minute of fencing time to advance. With one second left in overtime, the score was still tied.

At this point, news reports vary widely (and some if not all are pretty clearly inaccurate on at least a few details). There may have been a "clock malfunction" that gave Heidemann an extra chance to score. Some reports claimed the clock was "stuck". Since I'm pretty sure they use electronic timers, a more likely explanation is that the timekeeper failed to start the clock promptly. Some reports said that the clock ran down to zero and was reset to one second. One report thought that was because the clock was running when the action was stopped (after not running when the fencers were actually fencing). Another thought this might have been due to Shin receiving a yellow card (warning) for something. Various minor infractions, including stepping off the side of the strip, can result in a warning. The reports do seem to agree that, in that final "one second", Shin and Heidemann managed to score two simultaneous touches (which did not count, since they did not break the tie) before Heidemann scored the ostensible winner.

The Korean delegation protested, and things went from bad to worse. Partly due to a lack of clarity on the part of the officials about what to do, partly due to the fact that Olympic protests require a cash appeal fee to paid on the spot, it took at least a half hour to sort out the appeal. This leads to another rule issue and some more bad reporting (or at least a bad choice of a headline): "Fencer Shin A-Lam stages dramatic sit-down protest after losing controversial semi-final" (the Telegraph). Shin did not sit down on the piste (strip) as part of a "sit-down protest"; she sat there because a rule requires fencers to remain on the strip while a protest is being adjudicated. (The Telegraph acknowledged this in the body of the article.) By rule, leaving the strip would signal acceptance of the decision. Shin could have remained standing, but one can hardly fault her for sitting, particularly as she was busy crying at the time.

The rule is designed for relatively minor protests, where a referee may go to video replay for a quick look before deciding if, say, a touch was on the opponents foot (valid) or on the floor (not valid). Given how long it took to round up all the clowns and stuff them back in the clown car, the rule should have been waived here. That, of course, assumes that (a) the rule can be waived and (b) someone could have determined that it could be waived without another hour of consultations. I really don't think the stay-on-the-strip rule was intended for appeals this lengthy. Apparently officials did persuade Shin to leave the strip after half an hour; it is unclear from the reporting whether the appeal had been denied by then or whether the stay-put rule was being waived.

Adding injury to injury, Shin was forced to fence the bronze medal bout (which had been delayed by the protest) within something like fifteen minutes of when she left the strip. That's not much time to regain your composure and get loose again. She lost the second bout and came away with no medal. (FIE, the governing body, apparently tried to give her a made-up consolation medal, which she turned down.)

Without being an eyewitness, and particularly given the variance of the reporting, it is hard to tell exactly what happened, what went wrong, and what level of injustice was done here. One thing is clear, though. In the FIE's competition rules (English language version, PDF), rule t.32.3 seems unambiguous:

Should there be a failure of the clock or an error by the time-keeper, the Referee must himself estimate how much fencing time is left.

Since three touches in a one second time period would seem to require shifting the bout to the event horizon of a black hole, I'm pretty sure the referee could, and should, have called the bout done before Heidemann's winning touch. That would still have not been satisfactory to me -- Shin would have won by virtue of that stupid coin toss, rather than by out-fencing Heidemann in overtime -- but it would at least have been consistent with the rules (and possibly less dramatic).

This isn't the first Olympic clock controversy (see, for instance USA-USSR basketball, 1972 ... grump, grump, grump) and likely won't be the last.

OR in an OB World