Pipes, Clojure and Choco (part 4: Benchmarking)

This is the fourth in a series of posts that explore constraint programming. I described a “direct search” in part 2, while part 3 extended that to included more “smarts”. But - as Scott Hunter here at ISTI pointed out yesterday - it has been difficult to draw accurate conclusions from the limited number of examples in the original problem.

(By the way, I just fixed a bug in the “smart” code, so if you read that article earlier you might want to check the updated version.)

So, provoked by Scott’s comments (which were by email - so I’ve now added commenting directly to the pages), I have had a look at automating the generation of test data.

My first idea was to generate puzzles like mazes. But puzzles don’t really look like mazes - they have more loops (mazes tends to be tree-like).

So then I tried deleting pipes from a “fully connected” puzzle (or “grid”). This worked much better: it’s easy to select edges at random and it’s easy to check whether deleting the pipe will leave a gap (by “pipe” here I mean a single connection between two cells; a central cell can have up to four pipes - one to each neigbour - a cell in a valid puzzle will always have at least two).

And it turns out (after playing around) that puzzles with many deleted pipes tend to be less ambiguous. Which I suspect is better - the original problem was probably intended for puzzles with unique solutions.

So my final approach is:

Here are some examples (the number of solutions is median/maximum, taken from a sample of 9):

╔╗╔╗  ╔▒▒▒         ╔══╗  ╔▒▒╗         ╔══╗  ▒▒═╗         ╔══╗  ▒▒▒▒    
║║║║  ▒▒▒▒         ║╔╗║  ▒▒╗▒         ║╔╗║  ║▒▒▒         ╠═╗║  ╠▒╗▒    
╠╝╚╣  ▒╝▒▒         ║╚╝║  ▒╚▒▒         ╠╝╚╣  ▒▒▒╣         ║╔╣║  ║▒╣║    
╚══╝  ▒═▒▒         ╚══╝  ▒▒▒▒         ╚══╝  ▒══▒         ╚╝╚╝  ╚╝▒╝    
                                                                  
4x4, 20% clues     4x4, 30% clues     4x4, 40% clues     4x4, 50% clues
1/2 solutions      1 solution         1 solution         1 solution     

╔══╦╗  ▒▒▒▒╗       ╔╗╔═╗  ▒▒▒═▒       ╔═══╗  ╔══▒▒       ╔╗╔═╗  ╔╗▒═╗  
╠══╝║  ▒▒▒▒▒       ║║╚═╣  ▒║▒▒▒       ║╔══╝  ▒▒▒═▒       ╚╝║╔╣  ╚▒▒▒╣  
╚╗╔╗║  ╚╗╔▒▒       ╚╩═╗║  ▒▒▒▒║       ║╚═╦╗  ║▒═▒▒       ╔╦╝║║  ╔╦╝║▒  
╔╣╠╝║  ▒▒▒▒▒       ╔══╝║  ╔═▒╝▒       ╠═╗║║  ▒═▒▒▒       ║╚═╝║  ║▒▒▒▒  
╚╝╚═╝  ▒╝▒▒▒       ╚═══╝  ▒▒▒▒╝       ╚═╝╚╝  ▒═╝╚▒       ╚═══╝  ▒▒▒═▒  
                                                                 
5x5, 20% clues     5x5, 30% clues     5x5, 40% clues     5x5, 50% clues
1/10 solutions     1/4 solutions      1/2 solutions      1 solution    
 
╔╗╔══╗  ▒▒▒▒▒▒     ╔╗╔══╗  ▒▒╔▒═▒     ╔╗╔═╦╗  ╔▒▒▒▒▒     ╔══╗╔╗  ╔══╗╔╗ 
║║╚╗╔╝  ▒▒╚▒▒▒     ╚╝║╔═╝  ▒╝▒▒▒▒     ║╚╩╗╚╝  ║╚╩▒▒▒     ║╔═╝╚╝  ▒▒▒▒▒╝ 
║╠╗║╠╗  ▒╠╗║▒▒     ╔╗╠╝╔╗  ▒▒▒▒▒▒     ╚╗╔╝╔╗  ╚╗▒╝╔▒     ║╠╗╔╦╗  ▒╠▒╔╦╗ 
╚╝║║╚╝  ▒▒▒▒▒▒     ╚╩╝╔╣║  ╚▒▒╔▒▒     ╔╝╚═╣║  ▒▒╚═▒▒     ║║║║╚╝  ▒║▒▒╚▒ 
╔╗║║╔╗  ▒▒▒▒▒▒     ╔═╗╠╝║  ▒▒╗▒▒║     ╠═╦╗║║  ╠▒▒▒▒▒     ║╚╩╝╔╗  ▒▒▒╝╔╗ 
╚╝╚╝╚╝  ▒╝▒▒╚╝     ╚═╝╚═╝  ╚═▒▒═▒     ╚═╝╚╩╝  ▒═▒╚╩▒     ╚═══╩╝  ╚▒▒═▒▒ 
               
6x6, 20% clues     6x6, 30% clues     6x6, 40% clues     6x6, 50% clues 
4/9 solutions      2/4 solutions      1/4 solutions      1/2 solutions  

		   ╔═════╗  ▒▒═▒▒▒▒   ╔═╗╔══╗  ╔═▒╔▒═▒   ╔═══╗╔╗  ╔▒▒▒╗╔
		   ║╔═╗╔═╣  ▒╔▒▒▒▒▒   ╠═╝╚╗╔╝  ▒═▒▒╗▒▒   ║╔╗╔╝║║  ║▒▒╔╝║
		   ║╚═╝╚╗║  ▒╚▒╝▒╗▒   ║╔══╣╠╗  ║╔▒═╣▒╗   ╚╝║║╔╣║  ▒▒▒▒▒╣
		   ╚╗╔╗╔╩╝  ▒╗╔▒▒╩▒   ╚╝╔═╝╚╝  ╚▒▒═▒▒▒   ╔╗╚╝╚╝║  ▒╗╚╝╚▒
		   ╔╣╚╝╚═╗  ╔▒▒▒▒▒▒   ╔╗║╔══╗  ▒▒▒╔▒═╗   ╚╩══╗╔╝  ╚╩▒═▒▒
		   ╠╝╔╦╗╔╣  ▒▒▒╦╗╔▒   ║╚╝║╔═╣  ║▒▒▒╔═▒   ╔═══╝╚╗  ▒═══╝▒
		   ╚═╩╝╚╩╝  ▒▒▒╝╚▒▒   ╚══╝╚═╝  ▒▒▒▒▒▒▒   ╚═════╝  ▒══▒═▒

		   7x7, 30% clues     7x7, 40% clues     7x7, 50% clues 
		   6/104 solutions    1/15 solutions     1/2 solutions  

				      ╔═══╗╔═╗  ▒═▒▒╗▒═▒ ╔═╗╔═╗╔╗  ▒▒▒╔▒▒╔▒ 
				      ╚╗╔╗║║╔╝  ╚╗▒▒║▒▒╝ ╠═╩╝╔╣╠╝  ╠═▒╝▒╣▒▒ 
				      ╔╝╚╣║║╚╗  ▒╝▒▒▒▒▒╗ ╚═╦═╝║╚╗  ▒═╦▒╝▒╚▒ 
				      ╚══╝╚╩═╝  ▒▒▒▒▒╩▒▒ ╔═╝╔╗╚═╣  ╔═▒▒╗╚═▒ 
				      ╔══════╗  ▒══▒═▒═▒ ╠╗╔╝╚══╝  ▒▒▒╝▒▒═▒ 
				      ║╔══╦══╝  ▒▒▒▒╦══╝ ╚╣║╔╦══╗  ╚▒║▒▒═▒╗ 
				      ║║╔╗║╔╦╗  ║║╔▒▒▒▒╗ ╔╝╚╝║╔╗║  ╔╝╚╝║╔╗▒ 
				      ╚╝╚╝╚╝╚╝  ▒▒╚▒╚╝▒▒ ╚═══╩╝╚╝  ╚▒══▒╝▒▒ 

				      8x8, 40% clues     8x8, 50% clues     
				      4/70 solutions     1/3 solutions      

							 ╔╗╔═══╗╔╗  ▒╗▒═▒▒╗╔╗
							 ║╚╩═╗╔╣║║  ▒▒╩═╗▒╣▒▒
							 ╚══╦╝║║║║  ╚▒═▒▒║▒▒▒
							 ╔═╗╚╗╚╣║║  ╔═▒╚╗╚╣║▒
							 ╠╗╚═╣╔╝╠╝  ▒╗▒▒▒▒╝╠▒
							 ╚╝╔═╝╚╗╚╗  ╚╝╔▒▒╚▒▒╗
							 ╔═╝╔╦═╩═╝  ▒▒▒╔╦═▒▒╝
							 ║╔╦╝╚═══╗  ║▒╦▒╚═══╗
							 ╚╝╚═════╝  ╚▒▒▒═▒▒▒▒

							 9x9, 50% clues
							 2/11 solutions

And here are the average times for the two algorithms so far (direct time in ms, smart time in ms, the ratio smart/direct):

Clues4x45x56x67x78x89x9
20%0.1, 0.5, 4x0.7, 1, 2x4, 4, 1x---
30%0.1, 0.5, 4x0.3, 1, 3x2, 2, 1x10, 6, 0.5x--
40%0.1, 0.4, 5x0.2, 0.4, 3x0.6, 2, 3x2, 3, 2x7, 9, 1x-
50%0.1, 0.3, 5x0.1, 0.6, 4x0.3, 1, 4x0.7, 2, 3x2, 4, 2x17, 15, 1x

Which, in the end, doesn’t change our conclusion: smart hardly ever out-performs the direct search.

As ever, latest code is in the repo.

Andrew


Related Posts

blog comments powered by