Saturday, October 9, 2021

Worker Ordering

 A question ("Ordering of resources for maximizing success") on Mathematics Stack Exchange asks how to model the problem of sequencing workers under some unusual assumptions. You have 32 workers and 16 tasks. Each worker $i$ has a probability $w_i$ for success on any task. The probabilities vary by worker but not by task. Successes and failures are also assumed to be independent, meaning that if worker $i$ succeeds on one task, their probability of succeeding on the next task is still $w_i$, and if they fail on a task, the probability worker $i+1$ succeeds is still $w_{i+1}$. Once a worker fails, that worker is no longer available for any subsequent tasks. The objective is to order the workers so as to maximize the probability that all tasks are completed.

Answers posted on Math Stack Exchange asserted that sorting the workers into either ascending or descending probability order would be best. The question actually asked about how to model the decision problem, and I could not think of a good way to turn it into a tractable optimization model. So I decided to run a few experiments (in R) computing various permutations, with the thought that I would eventually land on a heuristic approach.

To my surprise, in every experiment I ran all permutations resulted in the same success probability. In other words, the answer to "what is the best order?" is "any order you like". I wrote out the explicit formula for success with three workers and two tasks, thinking I might see why the order did not matter, but derived no insights from it. So I was pretty much flummoxed.

Eventually, a retroactively obvious answer dawned on me. Suppose we put the workers in an arbitrary order, pad the list of tasks with dummy tasks (so there are essentially infinitely many tasks, the first 16 of which are real), and turn the workers loose sequentially until the last worker fails. Let $n_i$ be the number of successes for worker $i$. (Remember, $i$ stops after their first failure.) Since each attempt is an independent observation of a Bernoulli distribution with parameter $w_i$, $n_i$ has a geometric distribution with parameter $1-w_i$. As it turns out, the specific distribution does not matter to us. What matters is that, given our infinite supply of dummy tasks, the variables $n_1, \dots, n_{32}$ are independent of each other. Since the real tasks are at the front of our task queue, the probability of overall success is just $$\mathrm{Pr}\left(\sum_{i=1}^{32}n_{i}\ge16\right),$$which depends only on the sum, not the order of summation. So it becomes obvious that any ordering of the workers (and thus of the $n_i$ variables) yields the same likelihood of success.


If you want to see my R computations, they are in a notebook you can download.