Is this the fastest way to group in Pandas?

The following code works well. Just checking: am I using and timing Pandas correctly, and is there a faster way? Thanks.

$ python3
Python 3.4.0 (default, Apr 11 2014, 13:05:11) 
[GCC 4.8.2] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import pandas as pd
>>> import numpy as np
>>> import timeit
>>> pd.__version__
'0.14.1'

def randChar(f, numGrp, N) :
   things = [f%x for x in range(numGrp)]
   return [things[x] for x in np.random.choice(numGrp, N)]

def randFloat(numGrp, N) :
   things = [round(100*np.random.random(),4) for x in range(numGrp)]
   return [things[x] for x in np.random.choice(numGrp, N)]

N=int(1e8)
K=100
DF = pd.DataFrame({
  'id1' : randChar("id%03d", K, N),       # large groups (char)
  'id2' : randChar("id%03d", K, N),       # large groups (char)
  'id3' : randChar("id%010d", N//K, N),   # small groups (char)
  'id4' : np.random.choice(K, N),         # large groups (int)
  'id5' : np.random.choice(K, N),         # large groups (int)
  'id6' : np.random.choice(N//K, N),      # small groups (int)            
  'v1' :  np.random.choice(5, N),         # int in range [0,4]
  'v2' :  np.random.choice(5, N),         # int in range [0,4]
  'v3' :  randFloat(100,N)                # numeric e.g. 23.5749
})

Now time 5 different groupings, repeating each one twice to confirm the timing. [I realise timeit(2) would run it twice, but it then reports only the total. I'm interested in the times of the first and second runs separately.] Python uses about 10 GB of RAM according to htop during these tests.

>>> timeit.Timer("DF.groupby(['id1']).agg({'v1':'sum'})"                            ,"from __main__ import DF").timeit(1)
5.604133386000285
>>> timeit.Timer("DF.groupby(['id1']).agg({'v1':'sum'})"                            ,"from __main__ import DF").timeit(1)
5.505057081000359

>>> timeit.Timer("DF.groupby(['id1','id2']).agg({'v1':'sum'})"                      ,"from __main__ import DF").timeit(1)
14.232032927000091
>>> timeit.Timer("DF.groupby(['id1','id2']).agg({'v1':'sum'})"                      ,"from __main__ import DF").timeit(1)
14.242601240999647

>>> timeit.Timer("DF.groupby(['id3']).agg({'v1':'sum', 'v3':'mean'})"               ,"from __main__ import DF").timeit(1)
22.87025260900009
>>> timeit.Timer("DF.groupby(['id3']).agg({'v1':'sum', 'v3':'mean'})"               ,"from __main__ import DF").timeit(1)
22.393589012999655

>>> timeit.Timer("DF.groupby(['id4']).agg({'v1':'mean', 'v2':'mean', 'v3':'mean'})" ,"from __main__ import DF").timeit(1)
2.9725865330001398
>>> timeit.Timer("DF.groupby(['id4']).agg({'v1':'mean', 'v2':'mean', 'v3':'mean'})" ,"from __main__ import DF").timeit(1)
2.9683854739996605

>>> timeit.Timer("DF.groupby(['id6']).agg({'v1':'sum', 'v2':'sum', 'v3':'sum'})"    ,"from __main__ import DF").timeit(1)
12.776488024999708
>>> timeit.Timer("DF.groupby(['id6']).agg({'v1':'sum', 'v2':'sum', 'v3':'sum'})"    ,"from __main__ import DF").timeit(1)
13.558292575999076
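For what it's worth, per-run timings like the pairs above can also be collected in a single call: timeit.repeat returns a list with one timing per repeat rather than a total. Below is a minimal sketch, assuming a much smaller stand-in frame so that it runs in seconds. DF_small and group_sum are names made up for this illustration, and the numbers it prints are not comparable to the ones above.

```python
import timeit

import numpy as np
import pandas as pd

# Assumption: a small stand-in for the 1e8-row DF, only to keep the sketch quick.
N = 100000
K = 100
rng = np.random.RandomState(0)
DF_small = pd.DataFrame({
    'id1': ['id%03d' % x for x in rng.choice(K, N)],  # string group keys
    'v1': rng.choice(5, N),                           # int in range [0,4]
})


def group_sum():
    # Same shape of query as the first timing above.
    return DF_small.groupby(['id1']).agg({'v1': 'sum'})


# number=1 runs the statement once per repeat; repeat=2 gives two
# separate timings in a list instead of one summed total.
timings = timeit.repeat(group_sum, repeat=2, number=1)
print(timings)
```

Passing a callable avoids the "from __main__ import DF" setup string; the string form used above works with repeat in the same way.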

Here is the system info:

$ lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                32
On-line CPU(s) list:   0-31
Thread(s) per core:    2
Core(s) per socket:    8
Socket(s):             2
NUMA node(s):          2
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 62
Stepping:              4
CPU MHz:               2500.048
BogoMIPS:              5066.38
Hypervisor vendor:     Xen
Virtualization type:   full
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              25600K
NUMA node0 CPU(s):     0-7,16-23
NUMA node1 CPU(s):     8-15,24-31

$ free -h
             total       used       free     shared    buffers     cached
Mem:          240G        74G       166G       372K        33M       550M
-/+ buffers/cache:        73G       166G
Swap:           0B         0B         0B

I don't believe it's relevant, but just in case: the randChar function above is a workaround for a memory error in mtrand.RandomState.choice:

How to solve memory error in mtrand.RandomState.choice?

If you'd like to install the IPython shell, you can easily time your code using %timeit. After installing it, instead of typing python to launch the Python interpreter, you would type ipython.

Then you can type your code exactly as you would type it in the normal interpreter (as you did above).

Then you can type, for example:

%timeit DF.groupby(['id1']).agg({'v1':'sum'})

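This does essentially the same thing as what you've done, except that by default %timeit chooses how many loops and repeats to run and reports a summary. To time exactly one run, as you did, use %timeit -n 1 -r 1. If you're using Python a lot, I find that this will save you significant typing time :).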

IPython has a lot of other nice features, such as %paste (which I used to paste in your code and test this), %run to run a script you've saved in a file, and tab completion. http://ipython.org/
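On the "is there any faster way" part of the question, one thing that may be worth measuring is storing the string key columns as the category dtype, so the groupby works on integer codes rather than Python strings. This is only a sketch under assumptions: it needs a much newer pandas than the 0.14.1 shown in the question (the category dtype and the observed argument were added later), it uses a small made-up frame (DF_small, DF_cat), and it only checks that both forms give the same result. Whether it is actually faster on your data is something to time yourself.

```python
import numpy as np
import pandas as pd

# Assumption: a small stand-in frame; the names here are illustrative only.
N = 100000
K = 100
rng = np.random.RandomState(0)
DF_small = pd.DataFrame({
    'id1': ['id%03d' % x for x in rng.choice(K, N)],  # string group keys
    'v1': rng.choice(5, N),
})

# Same data, but with the key column stored as a categorical.
DF_cat = DF_small.copy()
DF_cat['id1'] = DF_cat['id1'].astype('category')

by_str = DF_small.groupby('id1')['v1'].sum()
# observed=True keeps only categories that actually occur in the data.
by_cat = DF_cat.groupby('id1', observed=True)['v1'].sum()

# Both groupings should produce the same keys and the same sums.
same = bool(list(by_str.index) == list(by_cat.index)
            and (by_str.values == by_cat.values).all())
print(same)
```

Either version can then be timed with %timeit as above.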
