Saturday, August 12, 2023

The "Set" Card Game

Someone asked a question (actually a pair of questions) about the "Set" card game on OR Stack Exchange. This was the first I had heard of the game. For purposes of this post, the game play is irrelevant. The key is that it uses a deck of 81 cards, each of which has four "features" with three possible values for each feature. Every combination of feature values appears on exactly one card. Three cards form a "set" if, for each of the four features, either all three cards have the same value or all three cards have different values. As noted in the Wikipedia page, there are 1,080 possible "sets". (I'm putting "set" in quotes to avoid confusion with, you know, sets.) The OR SE question asked if an integer program (IP) could be used to generate all 1,080 sets.

The answer is yes, with the caveat that while you can do it, an IP model is not the most efficient way to do it. I won't repeat my IP model here, since (a) it's a bit of a space hog and (b) you can read it in my answer on ORSE. The model is pretty small (97 binary variables) and finds one "set" quickly. There are two ways to get it to find all possible "sets". Both involve adding a "no good" constraint for each solution found, i.e., a constraint that eliminates that solution but no others. The more tedious approach solves the progressively growing model 1,081 times (the last time producing a declaration of infeasibility, meaning all "sets" have been found). The less tedious approach requires a solver that supports a callback function letting you reject candidate solutions by adding a constraint cutting them off (the same "no good" constraint used in the first approach).

Since there is no objective and the constraints are mostly logical in nature, constraint programming (CP) strikes me as a more intuitive approach than IP. The specifics of a CP model will depend on what types of constraints your CP solver supports, but I think the model I tried is likely to work with a lot of solvers. Since I coded the models in Java, I'll use 0-based indexing here. My variables (all integer-valued) are: $x_s\in \lbrace 0,\dots, 80\rbrace,$ the index of the card in slot $s\in \lbrace 0,1,2\rbrace$ of the "set"; and $y_{f,s}\in \lbrace 0,1,2\rbrace,$ the value of feature $f$ in the card in slot $s.$ The constraints are as follows:

  • allDifferent($x$) (no two slots can contain the same card);
  • $x_0 \lt x_1 \lt x_2$ (to avoid counting the same "set" more than once, we require that the cards be listed in ascending index order);
  • $y_{f,s} = a_{x_s,f},$ where $a_{c,f}$  is the value of feature $f$ in card $c$ (the value of a feature in a slot comes from the card in the slot); and
  • allDifferent$(\lbrace y_{f,0}, y_{f,1}, y_{f,2}\rbrace) \vee (y_{f,0}=y_{f,1} \wedge y_{f,1}=y_{f,2})\,\forall f$ (for every feature, either all three slots have different values or all three slots have the same value).

This exploits two constraints that CP solvers typically have but IP solvers do not. One is the "all different" constraint, which forces a set of variables to take values that do not repeat. The other is the ability to use a decision variable as an index to either a constant vector or another variable vector. That comes into play in the third constraint, where the variable $x_s$ (the index of the card in slot $s$) is used as a subscript of $a$ to get the feature value of that card.

Even CP is probably overkill for this problem, though. We can compute the 1,080 possible solutions by brute force. I used a recursive function for this.

Here is how each method stacks up in terms of computation time. As I said, I coded it in Java, using CPLEX 22.1.1 for the IP models and CP Optimizer 22.1.1 for the CP model.

  • brute force took between 0.1 and 0.2 seconds;
  • the CP model also took between 0.1 and 0.55 seconds;
  • solving the IP model once using a callback took between 1.4 and 3 seconds; and
  • solving the IP model 1,081 times (without a callback) took around 56 seconds, give or take.

Solution times for all methods are a bit random (they change when I rerun the program), so treat them as approximate. In the interest of full disclosure, I throttled CPLEX to a single thread for the callback approach. Using multiple threads would have required both synchronizing the code (a pain) and also checking for repeated solutions (since hypothetically two parallel threads could independently locate the same "set"). The moral, to the extent there is one, is that sometimes brute force beats more sophisticated approaches.

You can find my Java code here if interested.