Given two arrays a and b .Find all pairs of elements (a1,b1) such that a1 belongs to Array A and b1 belongs to Array B whose sum a1+b1 = k

I am looking for the solution of following algorithm with minimal time and space complexity.

Given two arrays a and b, find all pairs of elements (a1,b1) such that a1 belongs to Array A and b1 belongs to Array B whose sum a1+b1 = k (any integer).

I was able to come up with O(n log n) approach where we will sort one of the array say A and for each of the element b in array B, do binary search on sorted array A for value (K-b) .

Can we improve it any further?


If the arrays are sorted you can do it in linear time and constant storage.

  • Start with two pointers, one pointing at the smallest element of A, the other pointing to the largest element of B.
  • Calculate the sum of the pointed to elements.
  • If it is smaller than k increment the pointer into A so that it points to the next largest element.
  • If it is larger than k decrement the pointer into B so that it points to the next smallest element.
  • If it is exactly k you've found a pair. Move one of the pointers and keep going to find the next pair.

If the arrays are initially unsorted then you can first sort them then use the above algorithm. There a few different approaches for sorting them that you could use, depending on the type of data you expect:

  • A comparison sort, e.g. Mergesort or Timsort.
  • Counting sort.
  • Radix sort.

A comparison sort will require O(n log n) time on average. The last two are faster than O(n log(n)) but can be impractical if the range of possible values in the input arrays is very large.


If you have a limit on the maximum possible number (let's name it M) then you can have a solution in O(M+n).

Boolean array of false and mark as true all value for element of A. Then for each element b of B check if the element number K-b is marked as true.

You can improve it if you are using an hash-map instead of a big array. But I would not consider that in this kind of questions hash-map is kind of cheating.

Anyway it would give you O(n) for insertion and then O(n) for query, O(n) in total.

EDIT :

One case where this might be useful.

  • You have un-sorted vectors of size 10^6, so sorting them and doing the match is in O(n log n) with n = 10^6.
  • You need to do this operation 10^6 times (different vectors), complexity of O(n*n*log n).
  • Maximum value is 10^9.

Using my idea not with boolean but integer (incremented at each run) gives you a complexity of :

  • "O(10^9)" to create the array (also same complexity of space)
  • O(n) at each run, so O(n*n) for the total.

You are using more space but you've increased speed by a factor log(n) ~=20 in this case !


I would create a hash table containing the elements of one array, then iterate the other array looking up k - a(n), generating an output element if the lookup succeeded. This will use O(n) space and time.

In C#, it might look like this:

var bSet = new HashSet(B);
var results = from a in A
              let b = k - a
              where bSet.Contains(b)
              select new { a, b };

template< typename T >
std::vector< std::pair< T, T > >
find_pairs( 
    std::vector< T > const & a, std::vector< T > const & b, T const & k  ) {

    std::vector< std::pair< T, T > > matches;

    std::sort( a.begin(), a.end() );  // O( A * lg A )
    std::sort( b.begin(), b.end() );  // O( B * lg B )

    typename std::vector< T >::const_iterator acit = a.begin();
    typename std::vector< T >::const_reverse_iterator bcit = b.rbegin();

    for( ; acit != a.end(); /* inside */ ) {
        for( ; bcit != b.rend(); /* inside */ ) {

            const T sum = *acit + *bcit;

            if( sum == k ) {
                matches.push_back( std::pair< T, T >( *acit, *bcit ) );
                ++acit;
                ++bcit;
            }
            else if( sum < k ) {
                ++acit;
            }
            else {  // sum > k
                ++bcit;
            }
        }
    }  // O( A + B )
    return matches;
}