Find the duplicate element in an array in O(n) time
Solution 1:
This can be done in O(n) time and O(1) space.
(The algorithm only works because the numbers are consecutive integers in a known range, here 0 to N-1, with exactly one value replaced by a copy of another):
In a single pass through the vector, compute the sum of all the numbers, and the sum of the squares of all the numbers.
Subtract the sum of all the numbers from N(N-1)/2. Call this A.
Subtract the sum of the squares from N(N-1)(2N-1)/6. Divide this by A. Call the result B.
The number which was removed is (B + A)/2 and the number it was replaced with is (B - A)/2.
Example:
The vector is [0, 1, 1, 2, 3, 5]:
- N = 6
- Sum of the vector is 0 + 1 + 1 + 2 + 3 + 5 = 12. N(N-1)/2 is 15. A = 3.
- Sum of the squares is 0 + 1 + 1 + 4 + 9 + 25 = 40. N(N-1)(2N-1)/6 is 55. B = (55 - 40)/A = 5.
- The number which was removed is (5 + 3) / 2 = 4.
- The number it was replaced by is (5 - 3) / 2 = 1.
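The steps above can be sketched in C++ roughly as follows. This is only an illustration, not a definitive implementation: the names find_replacement and Replacement are made up for this example, and it assumes the input really is 0..N-1 with exactly one value replaced. It uses 64-bit integers, so N(N-1)(2N-1) will overflow once N gets past roughly a million; a wider type would be needed beyond that.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical result type: the value missing from 0..N-1 and the value
// that appears twice in its place.
struct Replacement {
    std::int64_t removed;
    std::int64_t duplicate;
};

// Assumes v holds the integers 0..N-1 with exactly one value replaced by a
// copy of another. Returns std::nullopt when the sums match, i.e. nothing
// appears to have been replaced.
std::optional<Replacement> find_replacement(const std::vector<int>& v) {
    const std::int64_t n = static_cast<std::int64_t>(v.size());
    std::int64_t sum = 0;
    std::int64_t sum_sq = 0;
    for (int x : v) {  // single pass, O(1) extra space
        sum += x;
        sum_sq += static_cast<std::int64_t>(x) * x;
    }

    // A = removed - duplicate
    const std::int64_t A = n * (n - 1) / 2 - sum;
    if (A == 0) {
        return std::nullopt;
    }

    // removed^2 - duplicate^2, which factors as (removed + duplicate) * A
    const std::int64_t sq_diff = n * (n - 1) * (2 * n - 1) / 6 - sum_sq;
    assert(sq_diff % A == 0);  // holds whenever the stated assumption holds

    // B = removed + duplicate
    const std::int64_t B = sq_diff / A;
    return Replacement{(B + A) / 2, (B - A) / 2};
}
```

Calling find_replacement on the example vector {0, 1, 1, 2, 3, 5} gives removed = 4 and duplicate = 1, matching the hand calculation.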
Why it works:
- The sum of the original vector [0, ..., N-1] is N(N-1)/2. Suppose the value a was removed and replaced by b. Now the sum of the modified vector will be N(N-1)/2 + b - a. If we subtract the sum of the modified vector from N(N-1)/2 we get a - b. So A = a - b.
- Similarly, the sum of the squares of the original vector is N(N-1)(2N-1)/6. The sum of the squares of the modified vector is N(N-1)(2N-1)/6 + b^2 - a^2. Subtracting the sum of the squares of the modified vector from the original sum gives a^2 - b^2, which is the same as (a+b)(a-b). So if we divide it by a - b (i.e., A), we get B = a + b.
- Now B + A = a + b + a - b = 2a and B - A = a + b - (a - b) = 2b.
Solution 2:
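We have the original array int A[N];
Create a second array bool B[N] as well, with every element initialised to false. Iterate over the first array: if B[A[i]] is false, set it to true; otherwise A[i] is the duplicate. This takes O(N) time but O(N) extra space, and it assumes every value lies in the range 0 to N-1.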
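A minimal C++ sketch of this second approach, for illustration only. The function name find_duplicate is made up here, std::vector<bool> stands in for the bool B[N] array, and the assert encodes the assumption that every value is a valid index in 0..N-1.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Assumes every value in a lies in [0, a.size()). Returns the first value
// that is seen a second time, or std::nullopt if all values are distinct.
std::optional<int> find_duplicate(const std::vector<int>& a) {
    std::vector<bool> seen(a.size(), false);  // plays the role of bool B[N]
    for (int x : a) {
        assert(x >= 0 && static_cast<std::size_t>(x) < a.size());
        const std::size_t idx = static_cast<std::size_t>(x);
        if (seen[idx]) {
            return x;  // already marked: this is the duplicate
        }
        seen[idx] = true;
    }
    return std::nullopt;
}
```

Unlike Solution 1, this does not recover the value that was removed, but it also does not depend on exactly one value having been replaced: it simply stops at the first repeat.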