In part 1, we used a missile design problem to understand the basic elements of an optimization: having a goal, having a way of generating trial solutions, and having a way of measuring the quality of our solutions against the goal. The problem was easy to solve because we could follow a continuous path from a poor starting point to an optimum design using a missile flight simulator. For this stop on our journey we look at a problem that is simple to describe but brings new difficulties. We will meet the barrier of local optima, and our path to success will be discontinuous. The new problem is to find a magic square larger than 3 x 3. As a reminder, a magic square is a square array of numbers, in sequence, arranged so that the rows, columns, and diagonals all add to the same total. Last month, we showed a magic square that was 3 x 3. Here, for the fun of it, is a computer-generated square that is 4 x 4:
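Any valid 4 x 4 square will do for checking the definition. A classic one appears in Albrecht Dürer's engraving Melencolia I; a few lines of Python (my choice of language, since the article names none) confirm that every row, column, and diagonal totals 34:

```python
# Duerer's 4 x 4 magic square, used here as a stand-in example;
# it is one of many, and not necessarily the square printed in the article.
square = [
    [16,  3,  2, 13],
    [ 5, 10, 11,  8],
    [ 9,  6,  7, 12],
    [ 4, 15, 14,  1],
]

n = len(square)
magic_total = n * (n * n + 1) // 2  # 34 for a 4 x 4 square

rows = [sum(row) for row in square]
cols = [sum(square[r][c] for r in range(n)) for c in range(n)]
diags = [sum(square[i][i] for i in range(n)),
         sum(square[i][n - 1 - i] for i in range(n))]

print(rows, cols, diags)  # every total is 34
```

The formula `n * (n*n + 1) // 2` gives the required line total for any size: 15 for 3 x 3, 34 for 4 x 4, and so on.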
This is not the only solution to the problem; others exist that differ in ways that are not simple rotations or reflections of this one. If there were space, we could print magic squares with larger dimensions: 6 x 6, 12 x 12, 20 x 20, or whatever size you might wish to choose. A computer can make them up quickly. Our question: how do we program a computer to do it, and what do we learn in the process?

We will follow the same process used with the missile optimization problem. The goal is clearly defined: the rows, columns, and diagonals must each add up to 34. We start by creating an initial trial solution, randomly arranging the numbers from 1 to 16 in the array. Next, we add up the totals for the rows, columns, and diagonals. Of course, they do not each add up to 34; some totals are higher and others are lower. We add up these errors as if they were all positive numbers: an error of +3 for one row and -5 for another adds up to a total of 8. We sum the error values for all rows, columns, and diagonals, and this sum is our total error value. Our process will seek to reduce this error value to zero. When we start, however, the total of all these errors is a large number.

The next step is to create a new square, similar to the previous one but with a small random change. We create the new square by selecting a pair of numbers at random and swapping them. We then calculate the total error again and note whether the new total is more or less than what we had before. If it is more, we return the swapped pair to their original positions; if less, we keep the swap. We repeat the process of swapping pairs and evaluating the errors. Initially, swapping pairs will often result in a better solution. As time goes on and we get closer to a final answer, it becomes more and more difficult to find a pair to swap that will reduce the error.
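The steps above can be sketched in Python. This is my own sketch of the technique the article describes, not the author's program; the function names and the choice of seed are mine:

```python
import random

N = 4
TARGET = N * (N * N + 1) // 2  # 34: each row, column, and diagonal must total this

def total_error(sq):
    """Add up the errors as if they were all positive numbers:
    the absolute deviation of every line's total from 34."""
    err = sum(abs(sum(sq[i]) - TARGET) for i in range(N))          # rows
    err += sum(abs(sum(sq[r][c] for r in range(N)) - TARGET)
               for c in range(N))                                  # columns
    err += abs(sum(sq[i][i] for i in range(N)) - TARGET)           # main diagonal
    err += abs(sum(sq[i][N - 1 - i] for i in range(N)) - TARGET)   # anti-diagonal
    return err

def random_square(rng):
    """Initial trial solution: the numbers 1..16 in random order."""
    nums = list(range(1, N * N + 1))
    rng.shuffle(nums)
    return [nums[i * N:(i + 1) * N] for i in range(N)]

def try_swap(sq, rng):
    """Swap one random pair; undo the swap if the error got worse."""
    before = total_error(sq)
    r1, c1 = rng.randrange(N), rng.randrange(N)
    r2, c2 = rng.randrange(N), rng.randrange(N)
    sq[r1][c1], sq[r2][c2] = sq[r2][c2], sq[r1][c1]
    if total_error(sq) > before:
        sq[r1][c1], sq[r2][c2] = sq[r2][c2], sq[r1][c1]  # put them back
    return total_error(sq)

rng = random.Random(0)
square = random_square(rng)
errors = [total_error(square)]
for _ in range(500):
    errors.append(try_swap(square, rng))
print(errors[0], "->", errors[-1])  # the error never increases
```

Running this shows exactly the behavior described: the error drops quickly at first, then improvements become rare, and the search usually stalls before reaching zero.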
When we get to a total error value of 4 (say, two rows or columns that add to 35 and another two that add to 33), we can keep swapping forever and never get the error to zero. If this happens, and it happens quite often, we have found a local optimum. That is to say, no matter how long we let the process run, we will never achieve an error value of zero, because our swapping technique can only look in the immediate neighborhood of the local optimum. What is the next step? What is your guess? Clearly, we need a way to get off this local optimum and start again somewhere else. We are simply on the wrong mountain, and we have to drop ourselves somewhere else on the landscape where we may find a bigger one. As in life, it is sometimes necessary to take one step backward in order to take two steps forward. And that is the case here. We could just start over and completely randomize the entire square. On the other hand, we really have made quite a bit of progress, and the higher mountain may not be that far away. So instead of starting over, we randomly select one pair of values and switch them. This increases the error, but not by as much as starting over would. We then resume swapping with the goal of reducing the error. After a good number of trials with no solution, we do another error-increasing random swap. The computer may have to increase the error several times. At some point, however, we find ourselves on a path that leads to an error of zero, which is a solution to the problem. In computer programming terms, we have an inner loop and an outer loop: the inner loop seeks the next better solution, while the outer loop kicks the search into a new location whenever the inner loop gets stuck. It works. Here is another solution, different from the first:
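The complete inner/outer loop structure might look like the following sketch, again in Python with names of my own choosing; the stall threshold and swap budget are assumptions, not values from the article:

```python
import random

N = 4
TARGET = N * (N * N + 1) // 2  # 34 for a 4 x 4 square

def total_error(sq):
    """Sum of absolute deviations of every row, column, and diagonal from 34."""
    err = sum(abs(sum(sq[i]) - TARGET) for i in range(N))          # rows
    err += sum(abs(sum(sq[r][c] for r in range(N)) - TARGET)
               for c in range(N))                                  # columns
    err += abs(sum(sq[i][i] for i in range(N)) - TARGET)           # main diagonal
    err += abs(sum(sq[i][N - 1 - i] for i in range(N)) - TARGET)   # anti-diagonal
    return err

def random_square(rng):
    """Initial trial solution: the numbers 1..16 in random order."""
    nums = list(range(1, N * N + 1))
    rng.shuffle(nums)
    return [nums[i * N:(i + 1) * N] for i in range(N)]

def solve(rng, stall_limit=300, max_swaps=1_000_000):
    """Inner loop: keep swaps that do not increase the error.
    Outer loop: after stall_limit trials with no progress, keep one
    error-increasing swap anyway, to step off the local optimum."""
    sq = random_square(rng)
    err = total_error(sq)
    stalled = 0
    for _ in range(max_swaps):
        if err == 0:
            break  # solved: every line totals 34
        r1, c1 = rng.randrange(N), rng.randrange(N)
        r2, c2 = rng.randrange(N), rng.randrange(N)
        sq[r1][c1], sq[r2][c2] = sq[r2][c2], sq[r1][c1]
        new_err = total_error(sq)
        if new_err <= err:
            # improvement or sideways move: keep the swap
            stalled = 0 if new_err < err else stalled + 1
            err = new_err
        elif stalled >= stall_limit:
            # stuck on a local optimum: one deliberate step backward
            err, stalled = new_err, 0
        else:
            # worse, and not yet stuck: return the pair to its place
            sq[r1][c1], sq[r2][c2] = sq[r2][c2], sq[r1][c1]
            stalled += 1
    return sq

solution = solve(random.Random(2024))
for row in solution:
    print(row)
```

Each run with a different random seed tends to land on a different magic square, which is why the program can print a second solution unlike the first.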
What has this
Richard Ten Dyke, a member of the Danbury Area Computer Society, has previously contributed to this newsletter on the topic of Digital Photography. He is retired from IBM and can be reached at tendyke@bedfordny.com. All opinions are his own, and he welcomes comments.