SuDoKu Solver Explanation

SuDoKu background

If you are new to SuDoKu here are a couple of websites that will tell you just about everything you could want to know. The links should open in a new window.
www.sudoku.com
www.websudoku.com
The second of these websites has an effectively limitless supply of puzzles to solve. At the current level of development this solver can manage all easy, moderate, hard and most of their evil puzzles. I intend to improve on that (when I can find the time!). If you find a puzzle this program can't solve then please, please send it to me so that I can learn how to improve.

Definitions / Glossary

Just the one: Region - one of the nine sets of 3 by 3 cells that makes up the puzzle.

Overview

There is a two stage process to the solving of every cell.

Stage 1
This can consist of one step (in the case of the simpler algorithms) and multiple steps (in the case of the advanced algorithms). After each step the results of the algorithm used to make progress are displayed. This stage ends when a value is solved for a cell.

The following highlights are used:-
Green - A possible value now identified as the solution.
Yellow - For anything used as part of the solution.
Red - A possible value that has been eliminated.
Pink - A possible value that has been eliminated in an earlier step.
Pink & Yellow - A possible value that was eliminated in a previous step, quite possibly while solving a cell some time back, and that now contributes to the solution of the cell. This only occurs for the "Only possibility (E)" solve.

Stage 2
Another click and the solved value is entered into the grid and the solved cell highlighted in light blue. All other highlighting is removed. At the same time the solved value is removed as a possible solution from all other unsolved cells in the row, column and region that the solved cell appears in.

The simpler algorithms

I will refer to the algorithms by the message displayed by the solver. As you would expect, the simpler ones are applied first. If no solution is found for a particular algorithm we move to the next one until either a solution is found or we run out of algorithms to apply. They are listed to the order the solver applies them.

1a. Last in row

If, for a particular unsolved cell, all the remaining eight cells in that row have been solved, then the cell must take the ninth value.

1b. Last in column

As above but for a column.

1c. Last in region

As above but for a region.

2a. Only possibility

Here we look at each cell that has not yet got a value assigned to it. We look in the row, column and region for the cell and determine how many of the values 1 to 9 have been used. If eight different values have been filled in then the last value is clearly the remaining unused value. The above three algorithms are actually just special cases of this.

2b. Only possibility (E)

This is very similar to the above in that we have determined that there is just one value left that can go into a particular cell. The difference is that not all of the other eight values are in solved cells (in the same row, column or region), one, or more, will have been eliminated by the use of an advanced algorithm. The eliminated possibility(ies) will be highlighted in pink & yellow.

3a. Third row solve

Here we examine the rows three at a time. E.g. take the top three rows. For each possible value, 1 to 9, look for the situation where exactly two of the rows contain that number. We can now infer that that number must go in one of three cells that are defined by the row and region that don't have that number in. If two of those cells are already filled in then we're home and dry. If not, we examine the columns for each empty cell. If the number appears elsewhere in the column it may be eliminated. If at the end of this exercise there is just one possibility left then we've solved another cell.

3b. Third column solve

Same principle as above but taking the columns three at a time.

4a. By elimination (row)

The principle here is to examine all the unsolved cells in each row. For each cell we determine what the possible values are. We do this by eliminating all the values that have already been used in the row, column and region in which the cell occurs. If there is a value that only appears once (in all the possibilities in all the unsolved cells in the row we are examining) then we have a solution. Or, to put it another way, we are looking for an unused value that can only go in one of the unsolved cells for the row we're looking at.

4b. By elimination (column)

Same principle as above but by column.

4c. By elimination (region)

Same principle as above but by region.

The advanced algorithms

These algorithms don't immediately solve a cell. They will, however, gradually whittle down the possible solutions for unsolved cells until either only one value is left (see "Only possibility (E)") or until a "By elimination" algorithm produces a result.

5. Exclusive pair

Look at all the unsolved cells in a row (or column or region). Look for two of these cells that have exactly the same two possible solutions. If such a pair is found, then all occurrences of these possible values can be eliminated from the other unsolved cells in the same row / column / region. More detailed explanation.

6a. Region/Row elimination

With this algorithm we are looking at the intersection of rows and regions. If, within that intersection, there are two or more unsolved cells and within those unsolved cells there is a possible value that doesn't appear in any other unsolved cell in the region as a possible value, then, that value may not appear anywhere else in the row (outside of its intersection with the region). More detailed explanation.

6b. Row/Region elimination

This is simply the converse of the above. As before, we are looking at the intersection of rows and regions. If, within that intersection, there are two or more unsolved cells and within those unsolved cells there is a possible value that doesn't appear in any other unsolved cell in the row as a possible value, then, that value may not appear anywhere else in the region (outside of its intersection with the row). More detailed explanation.

6c. Region/Column elimination

Same principle as 6a but by column.

6d. Column/Region elimination

Same principle as 6b but by column.

7. Exclusive triple

Look at all the unsolved cells in a row (or column or region). For every combination of three of the unsolved values for the row (or column or region) look for cells that have just two or three possible values and where all of that cell's possible values occur within the combination's three values. If there are three such cells then we have an exclusive triple and all occurrences of this combination's values can be eliminated from the other unsolved cells in the same row / column / region. More detailed explanation.