To start, we can make a few simplifying assumptions.

- The height of a box is not given, so we'll assume it is zero, and only consider the top and bottom surfaces of the box. For a box (really, envelope) with side $s$, that makes the total area $2s^2$. If the boxes have uniform height $h$, the area changes to $2s^2 + 4hs$, but the model and algorithm I'll pose are unaffected.
- We'll charitably assume that a document with side $s$ fits in a box with side $s$. In practice, of course, you'd like the box to be at least slightly bigger, so that the document goes in and out with reasonable effort. Again, I'll let the user tweak the size formula while asserting that the model and algorithm work well regardless.

- Only document sizes need be considered as box sizes, i.e. for every selected size at least one document should fit "snugly".
- The number of boxes you need at each selected size equals the number of documents too large to fit in a box of the next smaller selected size but capable of fitting in a box of this size.
- You have to select the largest possible box size (since that is required to store the largest of the documents).

The first thought that crossed my mind was the possibility of solving the problem by brute force. The author of the original question supplied a data file with document dimensions. There are 1166 documents, with 384 distinct sizes. So the brute force approach would be to look at all $\binom{383}{2} = 73,153$ or $\binom{383}{3} = 9,290,431$ combinations of box sizes (in addition to the largest size), calculate the number of boxes of each size and their combined areas, and then choose the combination with the lowest total. On a decent PC, I'm pretty sure cranking through even 9 million plus combinations will only need a tolerable amount of time.

A slightly more sophisticated approach is to view the problem through the lens of a layered network. There are either three or four layers, representing progressively larger selected box sizes, plus a "layer 0" containing a start node. In the three or four layers other than "layer 0", you put one node for each possible box size, with the following restrictions:

- the last layer contains only a single node, representing the largest possible box, since you know you are going to have to choose that size;
- the smallest node in each layer is omitted from the following layer (since layers go in increasing size order); and
- the largest node in each layer is omitted from the preceding layer (for the same reason).

I wanted to confirm that the problem is solvable without special purpose software, so I coded it in Java 8. Although there are plenty of high quality open-source graph packages for Java, I wrote my own node, arc and network classes and my own coding of Dijkstra's shortest path algorithm just to prove a point about not needing external software. You are welcome to grab the source code (including the file of document sizes) from my Git repository if you like.

I ran both the three and four size cases and confirmed that my solutions had the same total surface areas that Erwin got, other than a factor of two (I count both top and bottom; he apparently counts just one of them). How long does it take to solve the problem using Dijkstra's algorithm? Including the time reading the data, the four box version takes about half a second on my decent but not workstation caliber PC. The three box version takes about 0.3 seconds, but of course gives a worse solution (since it is more tightly constrained). This is single-threaded, by the way. Both problem set up and Dijkstra's method are amenable to parallel threading, but that would be overkill given the run times.

So is it wrong to take a fancier modeling approach, along the lines of what Erwin did? Not at all. There are just trade-offs. The modeling approach produces more maintainable code (in the form of mathematical models, using some modeling language like GAMS or AMPL) that are also more easily modified if the use case changes. The brute force and basic network approaches I tried requires no extra software (so no need to pay for it, no need to learn it, ...) and works pretty well for a "one-off" situation where maintainability is not critical.

Mainly, though, I just wanted to make a point that we should not overlook simple (or even brute force) solutions to problems when the problem dimensions are small enough to make them practical ... especially with computers getting more and more powerful each year.