- crosses a decade boundary (e.g., 7, 8, 9, 10, 11, 12)
- includes a 9, or numbers that have 9 as a digit
- but the last / highest number should not include 9
{ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12}
A numerical sort returns exactly that order. An ASCII (alphabetical) sort, however, compares character by character and produces
{ 1, 10, 11, 12, 2, 3, 4, 5, 6, 7, 8, 9}
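To make the difference concrete, here is a minimal Python sketch. It assumes the numbers arrive as text (for example from a file or a directory listing), which the post does not say; the variable names are made up for illustration.

```python
# Hypothetical input: numbers stored as strings, as they often are
# when read from a file, a directory listing, or a text column.
values = [str(n) for n in range(1, 13)]  # "1" through "12"

# Default sort on strings compares character by character,
# so "10" sorts before "2" because "1" < "2".
lexical = sorted(values)

# Sorting with key=int compares the integer values instead.
numeric = sorted(values, key=int)

print(lexical)  # ['1', '10', '11', '12', '2', '3', '4', '5', '6', '7', '8', '9']
print(numeric)  # ['1', '2', '3', '4', '5', '6', '7', '8', '9', '10', '11', '12']
```

Note that with the lexical sort the last element is "9", not "12", which is why a dataset whose true maximum does not contain a 9 exposes the problem so well.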
If your dataset was 1 through 9, or 4203 through 4207, you would not notice a problem, because both sorts give the same order for those values. In my case the data was sufficient to trigger the problem, which was especially apparent because a developer had put a sanity check in the code:
if ( current_version > highest_version )
    print ("ERROR: current_version > highest_version")
We had to track down what created the condition, which turned out to be the bad sort. But if not for the error message, who knows what odd behavior this erroneous data would have caused - and whether I would have caught it. Everybody makes mistakes; good developers work to catch them.
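One plausible way a bad sort could lead to that message is sketched below in Python. This is an assumption for illustration only: the post does not show how highest_version was actually computed, and the helper names `pick_highest` and `sanity_check` are invented here.

```python
def pick_highest(versions, key=None):
    """Hypothetical helper: treat the last element of the sorted list as the highest."""
    return sorted(versions, key=key)[-1]

def sanity_check(current_version, highest_version):
    # Same comparison as the check quoted in the post.
    if current_version > highest_version:
        return "ERROR: current_version > highest_version"
    return "ok"

versions = [str(n) for n in range(1, 13)]  # "1" through "12", stored as text
current_version = 12

bad_highest = int(pick_highest(versions))            # lexical sort ends in "9"
good_highest = int(pick_highest(versions, key=int))  # numeric sort ends in "12"

print(sanity_check(current_version, bad_highest))   # ERROR: current_version > highest_version
print(sanity_check(current_version, good_highest))  # ok
```

Under these assumptions the lexical sort reports 9 as the highest version while the current version is 12, so the check fires. With the numeric key the two agree and the check passes.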
There is a lot that could be said about datasets for testing ... but there are a couple of interesting points on sorting, testing, and performance issues that I would like to cover next time.