Data structure: insert, remove, contains, get random element, all at O(1)

I was given this problem in an interview. How would you have answered?

Design a data structure that offers the following operations in O(1) time:

  • insert
  • remove
  • contains
  • get random element

Solution 1:

Consider a data structure composed of a hashtable H and an array A. The hashtable keys are the elements in the data structure, and the values are their positions in the array.

  1. insert(value): append the value to array and let i be its index in A. Set H[value]=i.
  2. remove(value): We are going to replace the cell that contains value in A with the last element in A. let d be the last element in the array A at index m. let i be H[value], the index in the array of the value to be removed. Set A[i]=d, H[d]=i, decrease the size of the array by one, and remove value from H.
  3. contains(value): return H.contains(value)
  4. getRandomElement(): let r=random(current size of A). return A[r].

since the array needs to auto-increase in size, it's going to be amortize O(1) to add an element, but I guess that's OK.

Solution 2:

O(1) lookup implies a hashed data structure.

By comparison:

  • O(1) insert/delete with O(N) lookup implies a linked list.
  • O(1) insert, O(N) delete, and O(N) lookup implies an array-backed list
  • O(logN) insert/delete/lookup implies a tree or heap.

Solution 3:

You might not like this, because they're probably looking for a clever solution, but sometimes it pays to stick to your guns... A hash table already satisfies the requirements - probably better overall than anything else will (albeit obviously in amortised constant time, and with different compromises to other solutions).

The requirement that's tricky is the "random element" selection: in a hash table, you would need to scan or probe for such an element.

For closed hashing / open addressing, the chance of any given bucket being occupied is size() / capacity(), but crucially this is typically kept in a constant multiplicative range by a hash-table implementation (e.g. the table may be kept larger than its current contents by say 1.2x to ~10x depending on performance/memory tuning). This means on average we can expect to search 1.2 to 10 buckets - totally independent of the total size of the container; amortised O(1).

I can imagine two simple approaches (and a great many more fiddly ones):

  • search linearly from a random bucket

    • consider empty/value-holding buckets ala "--AC-----B--D": you can say that the first "random" selection is fair even though it favours B, because B had no more probability of being favoured than the other elements, but if you're doing repeated "random" selections using the same values then clearly having B repeatedly favoured may be undesirable (nothing in the question demands even probabilities though)
  • try random buckets repeatedly until you find a populated one

    • "only" capacity() / size() average buckets visited (as above) - but in practical terms more expensive because random number generation is relatively expensive, and infinitely bad if infinitely improbable worst-case behaviour...
      • a faster compromise would be to use a list of pre-generated random offsets from the initial randomly selected bucket, %-ing them into the bucket count

Not a great solution, but may still be a better overall compromise than the memory and performance overheads of maintaining a second index array at all times.

Solution 4:

For this Question i will use two Data Structure

  • HashMap
  • ArrayList / Array / Double LinkedList.

Steps :-

  1. Insertion :- Check if X is already present in HashMap --Time complexity O(1) . if not Present Then Add in end of ArrayList -- Time complexity O(1). add it in HashMap also x as key and last Index as a value -- Time complexity O(1).
  2. Remove :- Check if X is present in HashMap --Time complexity O(1). If present then find the its index and remove it from HashMap --Time complexity O(1). swap this element with last element in ArrayList and remove the last element --Time complexity O(1). Update the index of last Element in HashMap --Time complexity O(1).
  3. GetRandom :- Generate Random number from 0 to last index of ArrayList . return the ArrayList element at random index generated --Time complexity O(1).
  4. Search :- See in HashMap for x as a key. --Time complexity O(1).

Code :-

import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedList;
import java.util.List;
import java.util.Random;
import java.util.Scanner;


public class JavaApplication1 {

    public static void main(String args[]){
       Scanner sc = new Scanner(System.in);
        ArrayList<Integer> al =new ArrayList<Integer>();
        HashMap<Integer,Integer> mp = new HashMap<Integer,Integer>();  
        while(true){
            System.out.println("**menu**");
            System.out.println("1.insert");
            System.out.println("2.remove");
            System.out.println("3.search");
            System.out.println("4.rendom");
            int ch = sc.nextInt();
            switch(ch){
                case 1 : System.out.println("Enter the Element ");
                        int a = sc.nextInt();
                        if(mp.containsKey(a)){
                            System.out.println("Element is already present ");
                        }
                        else{
                            al.add(a);
                            mp.put(a, al.size()-1);

                        }
                        break;
                case 2 : System.out.println("Enter the Element Which u want to remove");
                        a = sc.nextInt();
                        if(mp.containsKey(a)){

                            int size = al.size();
                            int index = mp.get(a);

                            int last = al.get(size-1);
                            Collections.swap(al, index,  size-1);

                            al.remove(size-1);
                            mp.put(last, index);

                            System.out.println("Data Deleted");

                        }
                        else{
                            System.out.println("Data Not found");
                        }
                        break;
                case 3 : System.out.println("Enter the Element to Search");
                        a = sc.nextInt();
                        if(mp.containsKey(a)){
                            System.out.println(mp.get(a));
                        }
                        else{
                            System.out.println("Data Not Found");
                        }
                        break;
                case 4 : Random rm = new Random();
                        int index = rm.nextInt(al.size());
                        System.out.println(al.get(index));
                        break;

            }
        }
    }

}

-- Time complexity O(1). -- Space complexity O(N).