memory issues when transforming np.array using to_categorical
I have a numpy array like this:
[[0. 1. 1. ... 0. 0. 1.]
[0. 0. 0. ... 0. 0. 1.]
[0. 0. 1. ... 0. 0. 0.]
...
[0. 0. 0. ... 0. 0. 1.]
[0. 0. 0. ... 0. 0. 1.]
[0. 0. 0. ... 1. 0. 1.]]
I transform it like this to reduce the memory demand:
x_val = x_val.astype(np.int)
resulting in this:
[[0 1 1 ... 0 0 1]
[0 0 0 ... 0 0 1]
[0 0 1 ... 0 0 0]
...
[0 0 0 ... 0 0 1]
[0 0 0 ... 0 0 1]
[0 0 0 ... 1 0 1]]
However, when I do this:
x_val = to_categorical(x_val)
I get:
in to_categorical
categorical = np.zeros((n, num_classes), dtype=np.float32)
MemoryError
Any ideas why? Ultimately, the numpy array contains the labels for a binary classification problem. So far, I have used it as float32, as is, in a Keras ANN; it worked fine and I achieved pretty good performance. So is it actually necessary to run to_categorical?
You don't need to use to_categorical, since I guess you are doing multi-label classification. To avoid any confusion once and for all(!), let me explain this.
If you are doing binary classification, meaning each sample may belong to only one of two classes e.g. cat vs dog or happy vs sad or positive review vs negative review, then:
- The labels should be like [0 1 0 0 1 ... 0] with a shape of (n_samples,), i.e. each sample has a one (e.g. cat) or zero (e.g. dog) label.
- The activation function used for the last layer is usually sigmoid (or any other function that outputs a value in the range [0,1]).
- The loss function usually used is binary_crossentropy.
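To make the binary case concrete, here is a minimal NumPy sketch of the label shape, the sigmoid activation and the binary cross-entropy loss. It is not the Keras implementation; the label values, the logits and the helper names sigmoid and binary_crossentropy are made up for illustration.

```python
import numpy as np

# Hypothetical binary labels: 1 = cat, 0 = dog; shape (n_samples,)
y_true = np.array([0, 1, 0, 0, 1], dtype=np.float32)

# Hypothetical raw outputs of the last layer, one value per sample
logits = np.array([-2.0, 1.5, -0.5, 0.3, 3.0], dtype=np.float32)


def sigmoid(z):
    """Squash raw scores into the open interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy; clipping avoids log(0)."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    per_sample = -(y_true * np.log(y_pred) + (1.0 - y_true) * np.log(1.0 - y_pred))
    return float(np.mean(per_sample))


y_pred = sigmoid(logits)
loss = binary_crossentropy(y_true, y_pred)

print(y_true.shape)  # (5,)
print(y_pred.shape)  # (5,)
print(loss)
```

Note that the labels stay a flat vector with one entry per sample, so there is nothing to one-hot encode here.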
If you are doing multi-class classification, meaning each sample may belong to only one of many classes e.g. cat vs dog vs lion or happy vs neutral vs sad or positive review vs neutral review vs negative review, then:
- The labels should be either one-hot encoded, i.e. [1, 0, 0] corresponds to cat, [0, 1, 0] corresponds to dog and [0, 0, 1] corresponds to lion, in which case the labels have a shape of (n_samples, n_classes); or they can be integers (i.e. sparse labels) starting at zero, i.e. 0 for cat, 1 for dog and 2 for lion, in which case the labels have a shape of (n_samples,). The to_categorical function is used to convert sparse labels to one-hot encoded labels, if you wish to do so.
- The activation function used is usually softmax.
- The loss function used depends on the format of the labels: if they are one-hot encoded, categorical_crossentropy is used, and if they are sparse, then sparse_categorical_crossentropy is used.
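The two label formats for the multi-class case can be sketched in plain NumPy as follows. This only mimics the kind of conversion to_categorical performs, using np.eye as a stand-in, and it assumes zero-based class indices (0 = cat, 1 = dog, 2 = lion). The logits and the softmax helper are hypothetical. It also shows that, computed by hand, the one-hot and the sparse form of the cross-entropy give the same value.

```python
import numpy as np

# Hypothetical sparse labels: 0 = cat, 1 = dog, 2 = lion; shape (n_samples,)
sparse_labels = np.array([0, 2, 1, 1, 0])
n_classes = 3

# Row i of the identity matrix is the one-hot vector for class i
one_hot = np.eye(n_classes, dtype=np.float32)[sparse_labels]

print(sparse_labels.shape)  # (5,)
print(one_hot.shape)        # (5, 3)


def softmax(z):
    """Row-wise softmax; subtracting the row max keeps exp() stable."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


# Hypothetical raw outputs of the last layer, one row per sample
rng = np.random.default_rng(0)
probs = softmax(rng.normal(size=(len(sparse_labels), n_classes)))

# Cross-entropy with one-hot labels: sum over classes, then average
loss_one_hot = -np.mean(np.sum(one_hot * np.log(probs), axis=1))

# Cross-entropy with sparse labels: pick the probability of the true class
rows = np.arange(len(sparse_labels))
loss_sparse = -np.mean(np.log(probs[rows, sparse_labels]))

print(np.isclose(loss_one_hot, loss_sparse))  # True
```

So the choice between the two losses is only about how the labels are stored, not about what is being optimized.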
If you are doing multi-label classification, meaning each sample may belong to zero, one or more than one class, e.g. an image may contain both cat and dog, then:
- The labels should be like [[1 0 0 1 ... 0], ..., [0 0 1 0 ... 1]] with a shape of (n_samples, n_classes). For example, a label [1 1] means that the corresponding sample belongs to both classes (e.g. cat and dog).
- The activation function used is sigmoid, since presumably each class is independent of the others.
- The loss function used is binary_crossentropy.
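As for the MemoryError itself, here is a rough sketch of the arithmetic. It rests on some assumptions: the array sizes are made up, np.eye stands in for to_categorical, and I am judging from the traceback line in the question that a float32 array is allocated with one column per class. Also note that np.int was just an alias for Python's int (commonly a 64-bit integer, and removed in NumPy 1.24), so that cast likely did not save memory compared to float32; a small type such as np.uint8 would.

```python
import numpy as np

# Hypothetical multi-label array shaped like the one in the question
n_samples, n_labels = 1000, 50
rng = np.random.default_rng(0)
x_val = rng.integers(0, 2, size=(n_samples, n_labels)).astype(np.float32)

# Memory used by the same 0/1 values under different dtypes
print(x_val.nbytes)                   # float32: 4 bytes per entry
print(x_val.astype(np.int64).nbytes)  # int64:   8 bytes per entry
print(x_val.astype(np.uint8).nbytes)  # uint8:   1 byte per entry

# One-hot encoding every 0/1 entry adds a trailing axis of size 2,
# so the float32 result holds twice as many values as the input
one_hot = np.eye(2, dtype=np.float32)[x_val.astype(np.uint8)]
print(one_hot.shape)   # (1000, 50, 2)
print(one_hot.nbytes)  # twice the float32 original
```

For multi-label targets the extra axis carries no new information, so feeding the (n_samples, n_classes) array as is, with sigmoid and binary_crossentropy, avoids the allocation altogether.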