Maximum number of characters using keystrokes A, Ctrl+A, Ctrl+C and Ctrl+V

This is an interview question from google. I am not able to solve it by myself. Can somebody shed some light?

Write a program to print the sequence of keystrokes such that it generates the maximum number of character 'A's. You are allowed to use only 4 keys: A, Ctrl+A, Ctrl+C and Ctrl+V. Only N keystrokes are allowed. All Ctrl+ characters are considered as one keystroke, so Ctrl+A is one keystroke.

For example, the sequence A, Ctrl+A, Ctrl+C, Ctrl+V generates two A's in 4 keystrokes.

  • Ctrl+A is Select All
  • Ctrl+C is Copy
  • Ctrl+V is Paste

I did some mathematics. For any N, using x numbers of A's , one Ctrl+A, one Ctrl+C and y Ctrl+V, we can generate max ((N-1)/2)2 number of A's. For some N > M, it is better to use as many Ctrl+A's, Ctrl+C and Ctrl+V sequences as it doubles the number of A's.

The sequence Ctrl+A, Ctrl+V, Ctrl+C will not overwrite the existing selection. It will append the copied selection to selected one.


There's a dynamic programming solution. We start off knowing 0 keys can make us 0 A's. Then we iterate through for i up to n, doing two things: pressing A once and pressing select all + copy followed by paste j times (actually j-i-1 below; note the trick here: the contents are still in the clipboard, so we can paste it multiple times without copying each time). We only have to consider up to 4 consecutive pastes, since select, copy, paste x 5 is equivalent to select, copy, paste, select, copy, paste and the latter is better since it leaves us with more in the clipboard. Once we've reached n, we have the desired result.

The complexity might appear to be O(N), but since the numbers grow at an exponential rate it is actually O(N2) due to the complexity of multiplying the large numbers. Below is a Python implementation. It takes about 0.5 seconds to calculate for N=50,000.

def max_chars(n):
  dp = [0] * (n+1)
  for i in xrange(n):
    dp[i+1] = max(dp[i+1], dp[i]+1) # press a
    for j in xrange(i+3, min(i+7, n+1)):
      dp[j] = max(dp[j], dp[i]*(j-i-1)) # press select all, copy, paste x (j-i-1)
  return dp[n]

In the code, j represents the total number of keys pressed after our new sequence of keypresses. We already have i keypresses at this stage, and 2 new keypresses go to select-all and copy. Therefore we're hitting paste j-i-2 times. Since pasting adds to the existing sequence of dp[i] A's, we need to add 1 making it j-i-1. This explains the j-i-1 in the 2nd-last line.

Here are some results (n => number of A's):

  • 7 => 9
  • 8 => 12
  • 9 => 16
  • 10 => 20
  • 100 => 1391569403904
  • 1,000 => 3268160001953743683783272702066311903448533894049486008426303248121757146615064636953144900245 174442911064952028008546304
  • 50,000 => a very large number!

I agree with @SB that you should always state your assumptions: Mine is that you don't need to paste twice to double the number of characters. This gets the answer for 7, so unless my solution is wrong the assumption must be right.

In case someone wonders why I'm not checking sequences of the form Ctrl+A, Ctrl+C, A, Ctrl+V: The end result will always be the same as A, Ctrl+A, Ctrl+C, Ctrl+V which I do consider.


By using marcog's solution I found a pattern that starts at n=16. To illustrate this here are the keystrokes for n=24 up to n=29, I replaced ^A with S (select), ^C with C (copy), and ^V with P (paste) for readability:

24: A,A,A,A,S,C,P,P,P,S,C,P,P,P,S,C,P,P,P,S,C,P,P,P
       4   *    4    *    4    *    4    *    4     = 1024
25: A,A,A,A,S,C,P,P,P,S,C,P,P,S,C,P,P,S,C,P,P,S,C,P,P
       4   *    4    *   3   *   3   *   3   *   3    = 1296
26: A,A,A,A,S,C,P,P,P,S,C,P,P,P,S,C,P,P,S,C,P,P,S,C,P,P
       4   *    4    *    4    *   3   *   3   *   3    = 1728
27: A,A,A,A,S,C,P,P,P,S,C,P,P,P,S,C,P,P,P,S,C,P,P,S,C,P,P
       4   *    4    *    4    *    4    *   3   *   3    = 2304
28: A,A,A,A,S,C,P,P,P,S,C,P,P,P,S,C,P,P,P,S,C,P,P,P,S,C,P,P
       4   *    4    *    4    *    4    *    4    *   3    = 3072
29: A,A,A,A,S,C,P,P,P,S,C,P,P,P,S,C,P,P,P,S,C,P,P,P,S,C,P,P,P
       4   *    4    *    4    *    4    *    4    *    4     = 4096

After an initial 4 As, the ideal pattern is to select, copy, paste, paste, paste and repeat. This will multiply the number of As by 4 every 5 keystrokes. If this 5 keystroke pattern cannot consume the remaining keystrokes on its own some number of 4 keystroke patterns (SCPP) consume the final keystrokes, replacing SCPPP (or removing one of the pastes) as necessary. The 4 keystroke patterns multiply the total by 3 every 4 keystrokes.

Using this pattern here is some Python code that gets the same results as marcog's solution, but is O(1) edit: This is actually O(log n) due to exponentiation, thanks to IVlad for pointing that out.

def max_chars(n):
  if n <= 15:
    return (0, 1, 2, 3, 4, 5, 6, 9, 12, 16, 20, 27, 36, 48, 64, 81)[n]
  e3 = (4 - n) % 5
  e4 = n // 5 - e3
  return 4 * (4 ** e4) * (3 ** e3)

Calculating e3: There are always between 0 and 4 SCPP patterns at the end of the keystroke list, for n % 5 == 4 there are 4, n % 5 == 1 there are 3, n % 5 == 2 there are 2, n % 5 == 3 there are 1, and n % 5 == 4 there are 0. This can be simplified to (4 - n) % 5.

Calculating e4: The total number of patterns increases by 1 whenever n % 5 == 0, as it turns out this number increases to exactly n / 5. Using floor division we can get the total number of patterns, the total number for e4 is the total number of patterns minus e3. For those unfamiliar with Python, // is the future-proof notation for floor division.


Here's how I would approach it:

  • assume CtrlA = select all
  • assume CtrlC = copy selection
  • assume CtrlV = paste copied selection

given some text, it takes 4 keystrokes to duplicate it:

  • CtrlA to select it all
  • CtrlC to copy it
  • CtrlV to paste (this will paste over the selection - STATE YOUR ASSUMPTIONS)
  • CtrlV to paste again which doubles it.

From there, you can consider doing 4 or 5 A's, then looping through the above. Note that doing ctrl + a, c, v, v will grow your text exponentially as you loop through. If remaining strokes < 4, just keep doing a CtrlV

The key to interviews @ places like Google is to state your assumptions, and communicate your thinking. they want to know how you solve problems.


It's solveable in O(1): Like with the Fibonacci numbers, there is a formula to calculate the number of printed As (and the sequence of keystrokes):


1) We can simplify the problem description:

  • Having only [A],[C-a]+[C-c],[C-v] and an empty copy-paste-buffer

equals

  • having only [C-a]+[C-c],[C-v] and "A" in the copy-paste-buffer.

2) We can describe the sequence of keystrokes as a string of N chars out of {'*','V','v'}, where 'v' means [C-v] and '*' means [C-a] and 'V' means [C-c]. Example: "vvvv*Vvvvv*Vvvv"

The length of that string still equals N.

The product of the lengths of the Vv-words in that string equals the number of produced As.


3) Given a fixed length N for that string and a fixed number K of words, the outcome will be maximal iff all words have nearly equal lengths. Their pair-wise difference is not more than ±1.

Now, what is the optimal number K, if N is given?


4) Suppose, we want to increase the number of words by appending one single word of length L, then we have to reduce L+1 times any previous word by one 'v'. Example: "…*Vvvv*Vvvv*Vvvv*Vvvv" -> "…*Vvv*Vvv*Vvv*Vvv*Vvv"

Now, what is the optimal word length L?

(5*5*5*5*5) < (4*4*4*4*4)*4 , (4*4*4*4) > (3*3*3*3)*3

=> Optimal is L=4.


5) Suppose, we have a sufficient large N to generate a string with many words of length 4, but a few keystrokes are left; how should we use them?

  • If there are 5 or more left: Append another word with length 4.

  • If there are 0 left: Done.

  • If there are 4 left: We could either

    a) append one word with length 3: 4*4*4*4*3=768.

    b) or increase 4 words to lenght 5: 5*5*5*5=625. => Appending one word is better.

  • If there are 3 left: We could either

    a) or append one word with length 3 by adjusting the previus word from length 4 to 3: 4*4*4*2=128 < 4*4*3*3=144.

    b) increase 3 words to lenght 5: 5*5*5=125. => Appending one word is better.

  • If there are 2 left: We could either

    a) or append one word with length 3 by adjusting the previus two words from length 4 to 3: 4*4*1=16 < 3*3*3=27.

    b) increase 2 words to lenght 5: 5*5=25. => Appending one word is better.

  • If there is 1 left: We could either

    a) or append one word with length 3 by adjusting the previus three words from length 4 to 3: 4*4*4*0=0 < 3*3*3*3=81.

    b) increase one word to lenght 5: 4*4*5=80. => Appending one word is better.


6) Now, what if we don't have a "sufficient large N" to use the rules in 5)? We have to stick with plan b), if possible! The strings for small N are:

1:"v", 2:"vv", 3:"vvv", 4:"vvvv"

5:"vvvvv" → 5 (plan b)

6:"vvvvvv" → 6 (plan b)

7:"vvv*Vvv" → 9 (plan a)

8:"vvvv*Vvv" → 12 (plan a)

9:"vvvv*Vvvv" → 16

10:"vvvv*Vvvvv" → 20 (plan b)

11:"vvv*Vvv*Vvv" → 29 (plan a)

12:"vvvv*Vvv*Vvv" → 36 (plan a)

13:"vvvv*Vvvv*Vvv" → 48 (plan a)

14:"vvvv*Vvvv*Vvvv" → 64

15:"vvv*Vvv*Vvv*Vvv" → 81 (plan a)


7) Now, what is the optimal number K of words in a string of length N?

If N < 7 then K=1 else if 6 < N < 11 then K=2 ; otherwise: K=ceil((N+1)/5)

Written in C/C++/Java: int K = (N<7)?(1) : (N<11)?(2) : ((N+5)/5);

And if N > 10, then the number of words with length 3 will be: K*5-1-N. With this, we can calculate the number of printed As:

If N > 10, the number of As will be: 4^{N+1-4K}·3^{5K-N-1}