Monday, 7 June 2010

Finding the Missing Number in a Sequence

This is one of those algorithm problems that come up in programmer interview tests from time to time. Suppose you receive a sequence of numbers from 1 to 100 in random order but one is missing.  How do you find the missing number?

The laborious way would be to sort the numbers in ascending order and then loop until the difference in two successive numbers is 2. But this would be very slow if we had to sort 10 million numbers.

A slick solution I encountered was to make use of the formula for the sum of an arithmetic progression which, for a sequence of n consecutive numbers starting from 1, is n(n+1)/2. The full formula is n(2a + (n- 1)d)/2 where a is the first term and d is the common difference. If we substitute a = 1 and d = 1 then it simplifies to the first formula.

To find the missing number we simply add up the numbers from the random sequence and then subtract that sum from the sum given by the arithmetic progression formula. No sorting involved.

I thought I’d try this out using my unique random number generator class. This class takes any range of numbers and randomly shuffles them. So we can take a range of consecutive numbers from 1 to n and generate a random set. Then we can remove one at random and find it using the foregoing technique.

// Generate ordered sequence of numbers
int start = 1;
int count = 10;
var orderedNumbers =
Enumerable.Range(start, count);

// Print ordered numbers
Console.WriteLine("Ordered Numbers");
orderedNumbers
.ToList()
.ForEach(x => Console.Write(x + " "));
Console.WriteLine();

// Randomly shuffle them
var randomNumbers = new HashSet<int>();
var g = new UniqueRandomNumberGenerator(orderedNumbers);

// Print random numbers
Console.WriteLine("Random Numbers");
while (g.RemainingNumbersCount > 0)
{
int number = g.NewRandomNumber();
randomNumbers.Add(number);
Console.Write(number + " ");
}
Console.WriteLine();

// Verify we have same numbers after random shuffle
Debug.Assert(randomNumbers.SetEquals(orderedNumbers));

// Randomly generate a number to remove
var rand = new Random();
int removedNumber =
rand.Next(1, orderedNumbers.Count() + 1);
Console.WriteLine(
"Number to remove is " + removedNumber
);

// Remove it from the random numbers
randomNumbers.Remove(removedNumber);

// Print random numbers with number removed
Console.WriteLine(
"Random Numbers with {0} missing", removedNumber
);
randomNumbers
.ToList()
.ForEach(x => Console.Write(x + " "));
Console.WriteLine();

// Sum random numbers
int randomNumbersSum = 0;
var enumerator =
randomNumbers.GetEnumerator();
while (enumerator.MoveNext())
{
int number = enumerator.Current;
randomNumbersSum += number;
}

// Sum ordered numbers using arithmetic progression formula
int n = orderedNumbers.Count();
int orderedNumbersSum = n * (n + 1) / 2;

// The missing number is the difference between
// the sums of the ordered and random numbers
Console.WriteLine("Searching for missing number...");
int missingNumber = orderedNumbersSum - randomNumbersSum;
Console.WriteLine("Missing number is " + missingNumber);



I could have computed the sums using IEnumerable.Sum() on the two sets of numbers but I wanted to illustrate the algorithms.



Typical output for 10 numbers is:



Ordered Numbers

1 2 3 4 5 6 7 8 9 10


Random Numbers


10 3 5 9 1 7 4 2 8 6


Number to remove is 3


Random Numbers with 3 missing


10 5 9 1 7 4 2 8 6


Searching for missing number...


Missing number is 3

1 comment:

  1. (What if 2 numbers are missing instead of only one?)

    What would be the best way to find two missing numbers in a random sequence of 1 to 100 random numbers?

    ReplyDelete