Pipes, Clojure and Choco (part 4: Benchmarking)
This is the fourth in a series of posts that explore constraint programming. I described a “direct search” in part 2, while part 3 extended that to included more “smarts”. But - as Scott Hunter here at ISTI pointed out yesterday - it has been difficult to draw accurate conclusions from the limited number of examples in the original problem.
(By the way, I just fixed a bug in the “smart” code, so if you read that article earlier you might want to check the updated version.)
So, provoked by Scott’s comments (which were by email - so I’ve now added commenting directly to the pages), I have had a look at automating the generation of test data.
My first idea was to generate puzzles like mazes. But puzzles don’t really look like mazes - they have more loops (mazes tends to be tree-like).
So then I tried deleting pipes from a “fully connected” puzzle (or “grid”). This worked much better: it’s easy to select edges at random and it’s easy to check whether deleting the pipe will leave a gap (by “pipe” here I mean a single connection between two cells; a central cell can have up to four pipes - one to each neigbour - a cell in a valid puzzle will always have at least two).
And it turns out (after playing around) that puzzles with many deleted pipes tend to be less ambiguous. Which I suspect is better - the original problem was probably intended for puzzles with unique solutions.
So my final approach is:
-
start with a full “grid” of pipes;
-
choose a pipe at random, and delete it if it doesn’t leave a pipe-less cell;
-
repeat until you’re not finding any more pipes to remove;
-
remove tiles at random from the puzzle until you are left with a suitable number of “clues”.
Here are some examples (the number of solutions is median/maximum, taken from a sample of 9):
╔╗╔╗ ╔▒▒▒ ╔══╗ ╔▒▒╗ ╔══╗ ▒▒═╗ ╔══╗ ▒▒▒▒
║║║║ ▒▒▒▒ ║╔╗║ ▒▒╗▒ ║╔╗║ ║▒▒▒ ╠═╗║ ╠▒╗▒
╠╝╚╣ ▒╝▒▒ ║╚╝║ ▒╚▒▒ ╠╝╚╣ ▒▒▒╣ ║╔╣║ ║▒╣║
╚══╝ ▒═▒▒ ╚══╝ ▒▒▒▒ ╚══╝ ▒══▒ ╚╝╚╝ ╚╝▒╝
4x4, 20% clues 4x4, 30% clues 4x4, 40% clues 4x4, 50% clues
1/2 solutions 1 solution 1 solution 1 solution
╔══╦╗ ▒▒▒▒╗ ╔╗╔═╗ ▒▒▒═▒ ╔═══╗ ╔══▒▒ ╔╗╔═╗ ╔╗▒═╗
╠══╝║ ▒▒▒▒▒ ║║╚═╣ ▒║▒▒▒ ║╔══╝ ▒▒▒═▒ ╚╝║╔╣ ╚▒▒▒╣
╚╗╔╗║ ╚╗╔▒▒ ╚╩═╗║ ▒▒▒▒║ ║╚═╦╗ ║▒═▒▒ ╔╦╝║║ ╔╦╝║▒
╔╣╠╝║ ▒▒▒▒▒ ╔══╝║ ╔═▒╝▒ ╠═╗║║ ▒═▒▒▒ ║╚═╝║ ║▒▒▒▒
╚╝╚═╝ ▒╝▒▒▒ ╚═══╝ ▒▒▒▒╝ ╚═╝╚╝ ▒═╝╚▒ ╚═══╝ ▒▒▒═▒
5x5, 20% clues 5x5, 30% clues 5x5, 40% clues 5x5, 50% clues
1/10 solutions 1/4 solutions 1/2 solutions 1 solution
╔╗╔══╗ ▒▒▒▒▒▒ ╔╗╔══╗ ▒▒╔▒═▒ ╔╗╔═╦╗ ╔▒▒▒▒▒ ╔══╗╔╗ ╔══╗╔╗
║║╚╗╔╝ ▒▒╚▒▒▒ ╚╝║╔═╝ ▒╝▒▒▒▒ ║╚╩╗╚╝ ║╚╩▒▒▒ ║╔═╝╚╝ ▒▒▒▒▒╝
║╠╗║╠╗ ▒╠╗║▒▒ ╔╗╠╝╔╗ ▒▒▒▒▒▒ ╚╗╔╝╔╗ ╚╗▒╝╔▒ ║╠╗╔╦╗ ▒╠▒╔╦╗
╚╝║║╚╝ ▒▒▒▒▒▒ ╚╩╝╔╣║ ╚▒▒╔▒▒ ╔╝╚═╣║ ▒▒╚═▒▒ ║║║║╚╝ ▒║▒▒╚▒
╔╗║║╔╗ ▒▒▒▒▒▒ ╔═╗╠╝║ ▒▒╗▒▒║ ╠═╦╗║║ ╠▒▒▒▒▒ ║╚╩╝╔╗ ▒▒▒╝╔╗
╚╝╚╝╚╝ ▒╝▒▒╚╝ ╚═╝╚═╝ ╚═▒▒═▒ ╚═╝╚╩╝ ▒═▒╚╩▒ ╚═══╩╝ ╚▒▒═▒▒
6x6, 20% clues 6x6, 30% clues 6x6, 40% clues 6x6, 50% clues
4/9 solutions 2/4 solutions 1/4 solutions 1/2 solutions
╔═════╗ ▒▒═▒▒▒▒ ╔═╗╔══╗ ╔═▒╔▒═▒ ╔═══╗╔╗ ╔▒▒▒╗╔
║╔═╗╔═╣ ▒╔▒▒▒▒▒ ╠═╝╚╗╔╝ ▒═▒▒╗▒▒ ║╔╗╔╝║║ ║▒▒╔╝║
║╚═╝╚╗║ ▒╚▒╝▒╗▒ ║╔══╣╠╗ ║╔▒═╣▒╗ ╚╝║║╔╣║ ▒▒▒▒▒╣
╚╗╔╗╔╩╝ ▒╗╔▒▒╩▒ ╚╝╔═╝╚╝ ╚▒▒═▒▒▒ ╔╗╚╝╚╝║ ▒╗╚╝╚▒
╔╣╚╝╚═╗ ╔▒▒▒▒▒▒ ╔╗║╔══╗ ▒▒▒╔▒═╗ ╚╩══╗╔╝ ╚╩▒═▒▒
╠╝╔╦╗╔╣ ▒▒▒╦╗╔▒ ║╚╝║╔═╣ ║▒▒▒╔═▒ ╔═══╝╚╗ ▒═══╝▒
╚═╩╝╚╩╝ ▒▒▒╝╚▒▒ ╚══╝╚═╝ ▒▒▒▒▒▒▒ ╚═════╝ ▒══▒═▒
7x7, 30% clues 7x7, 40% clues 7x7, 50% clues
6/104 solutions 1/15 solutions 1/2 solutions
╔═══╗╔═╗ ▒═▒▒╗▒═▒ ╔═╗╔═╗╔╗ ▒▒▒╔▒▒╔▒
╚╗╔╗║║╔╝ ╚╗▒▒║▒▒╝ ╠═╩╝╔╣╠╝ ╠═▒╝▒╣▒▒
╔╝╚╣║║╚╗ ▒╝▒▒▒▒▒╗ ╚═╦═╝║╚╗ ▒═╦▒╝▒╚▒
╚══╝╚╩═╝ ▒▒▒▒▒╩▒▒ ╔═╝╔╗╚═╣ ╔═▒▒╗╚═▒
╔══════╗ ▒══▒═▒═▒ ╠╗╔╝╚══╝ ▒▒▒╝▒▒═▒
║╔══╦══╝ ▒▒▒▒╦══╝ ╚╣║╔╦══╗ ╚▒║▒▒═▒╗
║║╔╗║╔╦╗ ║║╔▒▒▒▒╗ ╔╝╚╝║╔╗║ ╔╝╚╝║╔╗▒
╚╝╚╝╚╝╚╝ ▒▒╚▒╚╝▒▒ ╚═══╩╝╚╝ ╚▒══▒╝▒▒
8x8, 40% clues 8x8, 50% clues
4/70 solutions 1/3 solutions
╔╗╔═══╗╔╗ ▒╗▒═▒▒╗╔╗
║╚╩═╗╔╣║║ ▒▒╩═╗▒╣▒▒
╚══╦╝║║║║ ╚▒═▒▒║▒▒▒
╔═╗╚╗╚╣║║ ╔═▒╚╗╚╣║▒
╠╗╚═╣╔╝╠╝ ▒╗▒▒▒▒╝╠▒
╚╝╔═╝╚╗╚╗ ╚╝╔▒▒╚▒▒╗
╔═╝╔╦═╩═╝ ▒▒▒╔╦═▒▒╝
║╔╦╝╚═══╗ ║▒╦▒╚═══╗
╚╝╚═════╝ ╚▒▒▒═▒▒▒▒
9x9, 50% clues
2/11 solutions
And here are the average times for the two algorithms so far (direct time in ms, smart time in ms, the ratio smart/direct):
| Clues | 4x4 | 5x5 | 6x6 | 7x7 | 8x8 | 9x9 |
|---|---|---|---|---|---|---|
| 20% | 0.1, 0.5, 4x | 0.7, 1, 2x | 4, 4, 1x | - | - | - |
| 30% | 0.1, 0.5, 4x | 0.3, 1, 3x | 2, 2, 1x | 10, 6, 0.5x | - | - |
| 40% | 0.1, 0.4, 5x | 0.2, 0.4, 3x | 0.6, 2, 3x | 2, 3, 2x | 7, 9, 1x | - |
| 50% | 0.1, 0.3, 5x | 0.1, 0.6, 4x | 0.3, 1, 4x | 0.7, 2, 3x | 2, 4, 2x | 17, 15, 1x |
Which, in the end, doesn’t change our conclusion: smart hardly ever out-performs the direct search.
As ever, latest code is in the repo.
Andrew
Related Posts
- Pipes, Clojure and Choco (part 5: Choco)
- Pipes, Clojure and Choco (part 3: Smart Search)
- Pipes, Clojure and Choco - Optimisation
- Pipes, Clojure and Choco (part 2: Direct Search)
- An ORM for C?
- How to Average Complex Responses
- Pipes, Clojure and Choco (part 1: Limits of Laziness)
- Jekyll on Bitbucket
- CMake with Check Unit Tests
- Javascript Graphics w/ Raphael