How do I train and test on my own dataset in Caffe?
I have started with Caffe, and the MNIST example ran well.
I have my training data and labels saved in data.mat (300 training samples with 30 features each; the labels are -1 or +1).
However, I don't quite understand how to use Caffe with my own dataset.
Is there a step-by-step tutorial that can teach me?
Many thanks! Any advice would be appreciated.
Solution 1:
I think the most straightforward way to transfer data from Matlab to Caffe is via an HDF5 file.
First, save your data in Matlab to an HDF5 file using hdf5write. I assume your training data is stored in a variable named X of size 300-by-30 and the labels are stored in y, a 300-by-1 vector:
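% Reshape to 4D and reverse the dimension order so Caffe reads 300x30x1x1.
hdf5write('my_data.h5', '/X', ...
    single( permute(reshape(X,[300, 30, 1, 1]),[4:-1:1]) ) );
% Append the labels to the same file under the name "label".
hdf5write('my_data.h5', '/label', ...
    single( permute(reshape(y,[300, 1, 1, 1]),[4:-1:1]) ), ...
    'WriteMode', 'append' );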
Note that the data is saved as a 4D array: as Caffe reads it, the first dimension is the number of samples (300), the second is the feature dimension (30), and the last two are 1 (representing no spatial dimensions). Also note that the names given to the data in the HDF5 file are "X" and "label"; these names should be used as the "top" blobs of the input data layer.
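Why permute? Matlab stores arrays in column-major order while Caffe expects row-major order, so the dimension order must be reversed before writing. Please see this answer for a fuller explanation.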
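The effect of the permute can be checked with a minimal, pure-Python sketch. It assumes that Matlab writes the array buffer in column-major order and that the reader interprets the same buffer in row-major order with the dimensions reversed. The helper names are hypothetical and exist only for this illustration.

```python
from itertools import product

def column_major_offset(index, shape):
    """Linear offset of `index` in a column-major (Matlab-style) array."""
    offset, stride = 0, 1
    for i, n in zip(index, shape):
        offset += i * stride
        stride *= n
    return offset

def row_major_offset(index, shape):
    """Linear offset of `index` in a row-major (C-style) array."""
    offset, stride = 0, 1
    for i, n in zip(reversed(index), reversed(shape)):
        offset += i * stride
        stride *= n
    return offset

# Shape in Matlab after permute(..., 4:-1:1), and the reversed shape
# seen by a row-major reader (assumption: dimensions are reported reversed).
matlab_shape = (1, 1, 30, 300)
caffe_shape = tuple(reversed(matlab_shape))  # (300, 30, 1, 1)

# With the permute: sample n, feature f lands at the same buffer offset
# in both views (indices are 0-based here).
layouts_match = all(
    row_major_offset((n, f, 0, 0), caffe_shape)
    == column_major_offset((0, 0, f, n), matlab_shape)
    for n, f in product(range(300), range(30))
)

# Without the permute: a 300x30 column-major buffer read as 300x30
# row-major scrambles samples and features.
unpermuted_match = all(
    row_major_offset((n, f), (300, 30)) == column_major_offset((n, f), (300, 30))
    for n, f in product(range(300), range(30))
)
```

Here layouts_match comes out true and unpermuted_match false, which is the reason for reversing the dimension order before writing.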
You also need to prepare a text file listing the names of all the HDF5 files you are using (in your case, only my_data.h5). The file /path/to/list/file.txt should contain a single line:
/path/to/my_data.h5
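If you have several HDF5 files, the list file can be generated rather than typed by hand. The following is a minimal sketch using only the Python standard library; write_hdf5_list is a hypothetical helper, and the demo writes to a temporary directory instead of the placeholder path above.

```python
import os
import tempfile

def write_hdf5_list(list_path, hdf5_paths):
    """Write one HDF5 file path per line, the format the list file uses."""
    with open(list_path, "w") as f:
        for path in hdf5_paths:
            f.write(path + "\n")

# Demo: write the list to a temporary directory and read it back.
demo_dir = tempfile.mkdtemp()
demo_list = os.path.join(demo_dir, "file.txt")
write_hdf5_list(demo_list, ["/path/to/my_data.h5"])

with open(demo_list) as f:
    demo_lines = f.read().splitlines()
```

In practice you would pass the real location of the list file and the real paths of your HDF5 files.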
Now you can add an input data layer to your train_val.prototxt:
layer {
  type: "HDF5Data"
  name: "data"
  top: "X"      # note: same name as in the HDF5 file
  top: "label"  # same name as in the HDF5 file
  hdf5_data_param {
    source: "/path/to/list/file.txt"
    batch_size: 20
  }
  include { phase: TRAIN }
}
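As a quick sanity check on batch_size: assuming each training iteration consumes one batch, the 300 samples from the question at 20 per batch take 15 iterations for one pass over the data. A small sketch of that arithmetic (iterations_per_epoch is a hypothetical helper, not part of Caffe):

```python
import math

def iterations_per_epoch(num_samples, batch_size):
    """Number of batches needed to see every sample once."""
    return math.ceil(num_samples / batch_size)

# 300 training samples, batch_size: 20 as in the layer definition.
iters = iterations_per_epoch(300, 20)
```

This can help when choosing how many iterations to train for.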
For more information regarding the HDF5 input layer, see this answer.