How would you code an efficient Circular Buffer in Java or C#?

I would use an array of T, a head and tail pointer, and add and get methods.

Like: (Bug hunting is left to the user)

// Hijack these for simplicity
import java.nio.BufferOverflowException;
import java.nio.BufferUnderflowException;

public class CircularBuffer<T> {

  private T[] buffer;

  private int tail;

  private int head;

  @SuppressWarnings("unchecked")
  public CircularBuffer(int n) {
    buffer = (T[]) new Object[n];
    tail = 0;
    head = 0;
  }

  public void add(T toAdd) {
    if (head != (tail - 1)) {
        buffer[head++] = toAdd;
    } else {
        throw new BufferOverflowException();
    }
    head = head % buffer.length;
  }

  public T get() {
    T t = null;
    int adjTail = tail > head ? tail - buffer.length : tail;
    if (adjTail < head) {
        t = (T) buffer[tail++];
        tail = tail % buffer.length;
    } else {
        throw new BufferUnderflowException();
    }
    return t;
  }

  public String toString() {
    return "CircularBuffer(size=" + buffer.length + ", head=" + head + ", tail=" + tail + ")";
  }

  public static void main(String[] args) {
    CircularBuffer<String> b = new CircularBuffer<String>(3);
    for (int i = 0; i < 10; i++) {
        System.out.println("Start: " + b);
        b.add("One");
        System.out.println("One: " + b);
        b.add("Two");
        System.out.println("Two: " + b);
        System.out.println("Got '" + b.get() + "', now " + b);

        b.add("Three");
        System.out.println("Three: " + b);
        // Test Overflow
        // b.add("Four");
        // System.out.println("Four: " + b);

        System.out.println("Got '" + b.get() + "', now " + b);
        System.out.println("Got '" + b.get() + "', now " + b);
        // Test Underflow
        // System.out.println("Got '" + b.get() + "', now " + b);

        // Back to start, let's shift on one
        b.add("Foo");
        b.get();
    }
  }
}

This is how I would (or did) write an efficient circular buffer in Java. It's backed by a simple array. For my particular use case, I needed high concurrent throughput, so I used CAS for allocation of the index. I then created mechanisms for reliable copies including a CAS copy of the entire buffer. I used this in a case which is outlined in greater detail in short article.

import java.util.concurrent.atomic.AtomicLong;
import java.lang.reflect.Array;

/**
 * A circular array buffer with a copy-and-swap cursor.
 *
 * <p>This class provides an list of T objects who's size is <em>unstable</em>.
 * It's intended for capturing data where the frequency of sampling greatly
 * outweighs the frequency of inspection (for instance, monitoring).</p>
 *
 * <p>This object keeps in memory a fixed size buffer which is used for
 * capturing objects.  It copies the objects to a snapshot array which may be
 * worked with.  The size of the snapshot array will vary based on the
 * stability of the array during the copy operation.</p>
 *
 * <p>Adding buffer to the buffer is <em>O(1)</em>, and lockless.  Taking a
 * stable copy of the sample is <em>O(n)</em>.</p>
 */
public class ConcurrentCircularBuffer <T> {
    private final AtomicLong cursor = new AtomicLong();
    private final T[]      buffer;
    private final Class<T> type;

    /**
     * Create a new concurrent circular buffer.
     *
     * @param type The type of the array.  This is captured for the same reason
     * it's required by {@link java.util.List.toArray()}.
     *
     * @param bufferSize The size of the buffer.
     *
     * @throws IllegalArgumentException if the bufferSize is a non-positive
     * value.
     */
    public ConcurrentCircularBuffer (final Class <T> type, 
                                     final int bufferSize) 
    {
        if (bufferSize < 1) {
            throw new IllegalArgumentException(
                "Buffer size must be a positive value"
                );
        }

        this.type    = type;
        this.buffer = (T[]) new Object [ bufferSize ];
    }

    /**
     * Add a new object to this buffer.
     *
     * <p>Add a new object to the cursor-point of the buffer.</p>
     *
     * @param sample The object to add.
     */
    public void add (T sample) {
        buffer[(int) (cursor.getAndIncrement() % buffer.length)] = sample;
    }

    /**
     * Return a stable snapshot of the buffer.
     *
     * <p>Capture a stable snapshot of the buffer as an array.  The snapshot
     * may not be the same length as the buffer, any objects which were
     * unstable during the copy will be factored out.</p>
     * 
     * @return An array snapshot of the buffer.
     */
    public T[] snapshot () {
        T[] snapshots = (T[]) new Object [ buffer.length ];

        /* Determine the size of the snapshot by the number of affected
         * records.  Trim the size of the snapshot by the number of records
         * which are considered to be unstable during the copy (the amount the
         * cursor may have moved while the copy took place).
         *
         * If the cursor eliminated the sample (if the sample size is so small
         * compared to the rate of mutation that it did a full-wrap during the
         * copy) then just treat the buffer as though the cursor is
         * buffer.length - 1 and it was not changed during copy (this is
         * unlikley, but it should typically provide fairly stable results).
         */
        long before = cursor.get();

        /* If the cursor hasn't yet moved, skip the copying and simply return a
         * zero-length array.
         */
        if (before == 0) {
            return (T[]) Array.newInstance(type, 0);
        }

        System.arraycopy(buffer, 0, snapshots, 0, buffer.length);

        long after          = cursor.get();
        int  size           = buffer.length - (int) (after - before);
        long snapshotCursor = before - 1;

        /* Highly unlikely, but the entire buffer was replaced while we
         * waited...so just return a zero length array, since we can't get a
         * stable snapshot...
         */
        if (size <= 0) {
            return (T[]) Array.newInstance(type, 0);
        }

        long start = snapshotCursor - (size - 1);
        long end   = snapshotCursor;

        if (snapshotCursor < snapshots.length) {
            size   = (int) snapshotCursor + 1;
            start  = 0;
        }

        /* Copy the sample snapshot to a new array the size of our stable
         * snapshot area.
         */
        T[] result = (T[]) Array.newInstance(type, size);

        int startOfCopy = (int) (start % snapshots.length);
        int endOfCopy   = (int) (end   % snapshots.length);

        /* If the buffer space wraps the physical end of the array, use two
         * copies to construct the new array.
         */
        if (startOfCopy > endOfCopy) {
            System.arraycopy(snapshots, startOfCopy,
                             result, 0, 
                             snapshots.length - startOfCopy);
            System.arraycopy(snapshots, 0,
                             result, (snapshots.length - startOfCopy),
                             endOfCopy + 1);
        }
        else {
            /* Otherwise it's a single continuous segment, copy the whole thing
             * into the result.
             */
            System.arraycopy(snapshots, startOfCopy,
                             result, 0, endOfCopy - startOfCopy + 1);
        }

        return (T[]) result;
    }

    /**
     * Get a stable snapshot of the complete buffer.
     *
     * <p>This operation fetches a snapshot of the buffer using the algorithm
     * defined in {@link snapshot()}.  If there was concurrent modification of
     * the buffer during the copy, however, it will retry until a full stable
     * snapshot of the buffer was acquired.</p>
     *
     * <p><em>Note, for very busy buffers on large symmetric multiprocessing
     * machines and supercomputers running data processing intensive
     * applications, this operation has the potential of being fairly
     * expensive.  In practice on commodity hardware, dualcore processors and
     * non-processing intensive systems (such as web services) it very rarely
     * retries.</em></p>
     *
     * @return A full copy of the internal buffer.
     */
    public T[] completeSnapshot () {
        T[] snapshot = snapshot();

        /* Try again until we get a snapshot that's the same size as the
         * buffer...  This is very often a single iteration, but it depends on
         * how busy the system is.
         */
        while (snapshot.length != buffer.length) {
            snapshot = snapshot();
        }

        return snapshot;
    }

    /**
     * The size of this buffer.
     */
    public int size () {
        return buffer.length;
    }
}

Here is a ready-to-use CircularArrayList implementation for Java which I use in production code. By overriding AbstractList in the Java-recommended way, it supports all functionality you would expect from a standard List implementation in the Java Collections Framework (generic element type, subList, iteration etc.).

The following calls complete in O(1):

  • add(item) - adds at end of list
  • remove(0) - removes from beginning of list
  • get(i) - retrieves random element in list

I would use ArrayBlockingQueue or one of the other prebuilt Queue implementations, depending on what the needs are. Very rarely there is need to implement such a data structure yourself (unless it's a school assignment).

EDIT: Now that you have added the requirement "to retrieve any element in the buffer by index", I suppose that you need to implement your own class (unless google-collections or some other library provides one). A circular buffer is quite easy to implement, as JeeBee's example shows. You may also look at ArrayBlockingQueue's source code - its code is quite clean, just remove the locking and unneeded methods, and add methods for accessing it by index.