How to use numpy.void type

According to the numpy documentation: http://docs.scipy.org/doc/numpy/reference/arrays.dtypes.html, numpy.void types are defined as flexible data types. Basically, these are data types whose size and layout are not fixed in advance, so there is no single pre-defined type associated with the variable(s) you're looking at. Compare this with the fixed numpy data types such as float, uint8 and bool.

void accommodates more generic and flexible types: data that doesn't necessarily fall into any one of the pre-defined data types. You mostly encounter it when loading a struct where each element has several fields, each with its own data type. Each structure element can combine different data types, and numpy.void is the type that represents one instance of that combination.

You can certainly do the same operations on it as you would with any other data type. Take a look at the generic data type methods here: http://docs.scipy.org/doc/numpy/reference/generated/numpy.generic.html#numpy.generic . In fact, all numpy scalar types derive from this generic class, including numpy.void.
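As a quick check of that inheritance claim, here is a minimal sketch. It assumes a reasonably recent NumPy; the variable names are only for illustration.

```python
import numpy as np

# numpy.void is a subclass of numpy.flexible, which in turn
# derives from numpy.generic (the base of all NumPy scalar types).
void_is_flexible = issubclass(np.void, np.flexible)
void_is_generic = issubclass(np.void, np.generic)

print(void_is_flexible)  # True
print(void_is_generic)   # True
```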

The first link at the beginning of this post shows a good example of how to create a custom record type, where a record is a combination of a string and a tuple of numbers. When you create an array of these records, the array's data type is the record type you defined, but each individual element you pull out of it is of type numpy.void.


For the sake of self-containment, let's re-create the example here. We'll create a custom record type with two fields for each record:

  • A string of up to 16 characters, with a field named name
  • A 2-element sub-array of 64-bit floating point numbers, with a field named grades

As such, you'd do something like:

import numpy as np
dt = np.dtype([('name', np.str_, 16), ('grades', np.float64, (2,))])
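If you want to confirm what this dtype actually contains, you can inspect it directly. This is only a sketch: the variable names are made up for illustration, and it assumes the dt definition from above.

```python
import numpy as np

dt = np.dtype([('name', np.str_, 16), ('grades', np.float64, (2,))])

field_names = dt.names              # names of the fields, in order
grades_shape = dt['grades'].shape   # shape of the grades sub-array
grades_base = dt['grades'].base     # element type of the grades sub-array
scalar_type = dt.type               # scalar class used for one record

print(field_names)   # ('name', 'grades')
print(grades_shape)  # (2,)
print(grades_base)   # float64
print(scalar_type is np.void)  # True
```

Note that dt.type is already numpy.void, which is why each record pulled out of an array with this dtype has that type.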

Next, let's create an example array of two records and fill in their fields:

x = np.array([('Sarah', (8.0, 7.0)), ('John', (6.0, 7.0))], dtype=dt)

Because we built this with np.array, we expect its type to be numpy.ndarray:

type(x)

We get:

<type 'numpy.ndarray'>

Remember, the container itself is a numpy.ndarray, but the individual elements are not.


To access the second element of this list, which is the second record, we do:

x[1]

We get:

('John', [6.0, 7.0])

To check the type of the second record, we do:

type(x[1])

We get:

<type 'numpy.void'> # As expected
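To make the distinction between the array's dtype and the type of one element explicit, here is a small sketch using the same dt and x as above. The variable names are hypothetical.

```python
import numpy as np

dt = np.dtype([('name', np.str_, 16), ('grades', np.float64, (2,))])
x = np.array([('Sarah', (8.0, 7.0)), ('John', (6.0, 7.0))], dtype=dt)

array_dtype_matches = (x.dtype == dt)          # the array carries the record dtype
element_is_void = isinstance(x[1], np.void)    # one record is a numpy.void scalar
element_dtype_matches = (x[1].dtype == dt)     # and it still knows its record dtype

print(array_dtype_matches)    # True
print(element_is_void)        # True
print(element_dtype_matches)  # True
```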

Some additional bonuses for you

To access the name of the second record, we do:

x[1]['name']

We get:

'John'

To access the grades of the second record, we do:

x[1]['grades']

We get:

array([ 6.,  7.])

To check the type of the name inside the second record, we do:

type(x[1]['name'])

We get:

<type 'numpy.string_'>

To check the type of the grades inside the second record, we do:

type(x[1]['grades'])

We get:

<type 'numpy.ndarray'>

Take note that each element in this array is of type numpy.void. However, the individual fields of each element are either a string or an array of numbers. It is the collection of these fields together, as one record, that is of type numpy.void.
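The outputs shown above come from a Python 2 session. On Python 3 the type reprs read <class '...'> instead of <type '...'>, and the name field comes back as numpy.str_. The sketch below assumes Python 3 with a recent NumPy and uses isinstance checks instead of printed reprs, so it doesn't depend on the exact formatting. It also shows that indexing the whole array by field name gives you all values of that field at once.

```python
import numpy as np

dt = np.dtype([('name', np.str_, 16), ('grades', np.float64, (2,))])
x = np.array([('Sarah', (8.0, 7.0)), ('John', (6.0, 7.0))], dtype=dt)

record = x[1]                 # a single record, of type numpy.void
name = record['name']         # string field of that record
grades = record['grades']     # sub-array field of that record

print(isinstance(record, np.void))     # True
print(isinstance(name, np.str_))       # True on Python 3
print(isinstance(grades, np.ndarray))  # True

# Indexing the array itself by field name returns that field for every record.
all_names = x['name']
all_grades = x['grades']
print(all_names.tolist())   # ['Sarah', 'John']
print(all_grades.shape)     # (2, 2)
```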