Thursday, December 26, 2013

EMNIST handwritten dataset with LibTorch (C++)

Refer to:
https://jovian.ai/goyalbhavya529/emnist-project
https://github.com/austin-hill/EMNIST-CNN
https://discuss.pytorch.org/t/libtorch-how-to-use-torch-datasets-for-custom-dataset/34221
https://github.com/pytorch/examples/blob/main/cpp/custom-dataset/custom-dataset.cpp

1. Download the EMNIST label and image files
https://www.kaggle.com/datasets/crawford/emnist

2. EMNIST data format
###emnist-balanced-train-labels-idx1-ubyte###
[offset]  [type]          [value]              [description]
0000      32 bit integer  0x00000801 (2049)    magic number     - big endian
0004      32 bit integer  0x0001b8a0 (112800)  number of items  - big endian
0008      unsigned byte   ??                   label
0009      unsigned byte   ??                   label
....
xxxx      unsigned byte   ??                   label

labels   = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabdefghnqrt"
classes = 47
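
The label file layout above can be read with a few lines of standard C++. This is only a sketch and is not taken from the linked repository: read_be32, IdxLabels, parse_idx1 and label_to_char are hypothetical names, and the code assumes the whole file has already been loaded into a byte vector.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: decode a 32-bit big-endian integer starting at `offset`.
inline std::uint32_t read_be32(const std::vector<std::uint8_t>& buf, std::size_t offset) {
    if (offset + 4 > buf.size()) throw std::runtime_error("buffer too short");
    return (std::uint32_t(buf[offset]) << 24) | (std::uint32_t(buf[offset + 1]) << 16) |
           (std::uint32_t(buf[offset + 2]) << 8) | std::uint32_t(buf[offset + 3]);
}

// Hypothetical container for a parsed idx1 label file.
struct IdxLabels {
    std::uint32_t count = 0;           // number of items from offset 0004
    std::vector<std::uint8_t> labels;  // one class index (0..46) per item
};

// Parse the 8-byte header, then copy `count` label bytes starting at offset 0008.
inline IdxLabels parse_idx1(const std::vector<std::uint8_t>& file) {
    if (read_be32(file, 0) != 0x00000801u) throw std::runtime_error("bad idx1 magic number");
    IdxLabels out;
    out.count = read_be32(file, 4);
    if (file.size() < 8 + static_cast<std::size_t>(out.count))
        throw std::runtime_error("label file is truncated");
    const auto first = file.begin() + 8;
    out.labels.assign(first, first + static_cast<std::ptrdiff_t>(out.count));
    return out;
}

// Map a class index to its character using the 47-entry table from this post.
inline char label_to_char(std::uint8_t label) {
    static const std::string kLabels = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabdefghnqrt";
    assert(kLabels.size() == 47);
    return kLabels.at(label);  // throws std::out_of_range for an invalid index
}
```

For example, class index 10 maps to 'A' and class index 46 maps to 't', the last entry of the table.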

###emnist-balanced-train-images-idx3-ubyte###
[offset]  [type]          [value]              [description]
0000      32 bit integer  0x00000803 (2051)    magic number       - big endian
0004      32 bit integer  0x0001b8a0 (112800)  number of images   - big endian
0008      32 bit integer  0x0000001c (28)      number of rows     - big endian
0012      32 bit integer  0x0000001c (28)      number of columns  - big endian
0016      unsigned byte   ??                   pixel
0017      unsigned byte   ??                   pixel
....
xxxx      unsigned byte   ??                   pixel

Pixels are stored row by row, 28 x 28 pixels per image.
Pixel values range from 0 to 255.
0   = background (white).
255 = foreground (black).
The EMNIST images as provided are flipped horizontally and rotated 90 degrees anti-clockwise.
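
A rough sketch of reading the image header and undoing the orientation, using only the standard library. Flipping an image horizontally and then rotating it a quarter turn anti-clockwise is the same as swapping rows and columns, so if the orientation note above is accurate, a plain transpose should give an upright character. The names be32, Idx3Header, parse_idx3_header, image_offset and transpose are hypothetical and do not come from the linked repository.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical struct holding the four big-endian header fields of an idx3 file.
struct Idx3Header {
    std::uint32_t magic;
    std::uint32_t count;
    std::uint32_t rows;
    std::uint32_t cols;
};

// Decode a 32-bit big-endian integer from four consecutive bytes.
inline std::uint32_t be32(const std::uint8_t* p) {
    return (std::uint32_t(p[0]) << 24) | (std::uint32_t(p[1]) << 16) |
           (std::uint32_t(p[2]) << 8) | std::uint32_t(p[3]);
}

// Read the 16-byte header at offsets 0000, 0004, 0008 and 0012.
inline Idx3Header parse_idx3_header(const std::vector<std::uint8_t>& file) {
    if (file.size() < 16) throw std::runtime_error("image file is truncated");
    Idx3Header h{be32(&file[0]), be32(&file[4]), be32(&file[8]), be32(&file[12])};
    if (h.magic != 0x00000803u) throw std::runtime_error("bad idx3 magic number");
    return h;
}

// Byte offset of the first pixel of image `index` (pixel data starts at 0016).
inline std::size_t image_offset(const Idx3Header& h, std::size_t index) {
    return 16 + index * static_cast<std::size_t>(h.rows) * h.cols;
}

// Transpose a row-major rows x cols image into a row-major cols x rows image.
inline std::vector<std::uint8_t> transpose(const std::vector<std::uint8_t>& src,
                                           std::size_t rows, std::size_t cols) {
    if (src.size() != rows * cols) throw std::invalid_argument("size mismatch");
    std::vector<std::uint8_t> dst(src.size());
    for (std::size_t r = 0; r < rows; ++r)
        for (std::size_t c = 0; c < cols; ++c)
            dst[c * rows + r] = src[r * cols + c];
    return dst;
}
```

With 28 x 28 images each sample occupies 784 bytes, so image 1 starts at byte 16 + 784 = 800.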

3. Define a custom dataset class
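// Custom dataset: derives from torch::data::Dataset so it can be passed to a data loader.
class EminstDataset : public torch::data::Dataset<EminstDataset>
{
public:
     EminstDataset();
     ~EminstDataset();

     // Returns one sample (image tensor and label tensor) for the given index.
     torch::data::Example<> get(size_t index) override;
     // Returns the number of samples in the dataset.
     torch::optional<size_t> size() const override;
     // Converts an OpenCV image to a tensor.
     torch::Tensor Transform(cv::Mat cv_image);

private:
     std::vector<char> m_class_labels;     // one label per image
     std::vector<cv::Mat> m_class_images;  // one cv::Mat per image
};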

Two member functions must be overridden:
torch::data::Example<> get(size_t index)
torch::optional<size_t> size() const

4. Declare the dataset and data loader
auto train_dataset = EminstDataset().map(torch::data::transforms::Stack<>());
auto train_dataloader = torch::data::make_data_loader<torch::data::samplers::SequentialSampler>(std::move(train_dataset), batch_size);

5. Get samples
for (auto& batch : *train_dataloader)
Iterating over the data loader in this loop calls torch::data::Example<> get for each sample in the batch.
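
To get a feel for how often get is called, here is a small standard C++ sketch of the index ranges a sequential sampler would walk through. It makes two assumptions that are not stated in this post: a batch size of 64 (purely illustrative), and that the loader keeps the final, smaller batch, which I believe is the libtorch default (drop_last = false). num_batches and batch_range are hypothetical helper names.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>

// Number of batches needed for n samples when the last partial batch is kept.
inline std::size_t num_batches(std::size_t n, std::size_t batch_size) {
    return (n + batch_size - 1) / batch_size;
}

// Half-open sample index range [first, last) covered by batch number b.
inline std::pair<std::size_t, std::size_t> batch_range(std::size_t n,
                                                       std::size_t batch_size,
                                                       std::size_t b) {
    const std::size_t first = std::min(b * batch_size, n);
    const std::size_t last = std::min(first + batch_size, n);
    return {first, last};
}
```

Under those assumptions the 112800 training samples give 1763 batches per epoch: 1762 full batches of 64 and a last batch holding the remaining 32 samples.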

6. Once images and labels are available, a simple CNN can be trained
The network uses conv2d + relu + max_pool2d + linear, and the loss function is cross_entropy.
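
Sizing the linear layer takes a little arithmetic, so here is a sketch in plain C++. The layer sizes used in the note below (5x5 convolutions without padding, 2x2 pooling, 20 output channels) are assumptions for illustration only and are not necessarily those of the linked repository. conv_out and cross_entropy are hypothetical helper names; cross_entropy computes the loss for one sample from raw scores the way a softmax cross entropy is usually defined.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Output width (or height) of a convolution or pooling window:
// floor((in + 2 * padding - kernel) / stride) + 1
inline int conv_out(int in, int kernel, int stride = 1, int padding = 0) {
    return (in + 2 * padding - kernel) / stride + 1;
}

// Cross entropy for one sample: -log(softmax(logits)[target]).
// The maximum is subtracted first so that exp() cannot overflow.
inline double cross_entropy(const std::vector<double>& logits, std::size_t target) {
    const double max_logit = *std::max_element(logits.begin(), logits.end());
    double sum = 0.0;
    for (double z : logits) sum += std::exp(z - max_logit);
    return -(logits.at(target) - max_logit - std::log(sum));
}
```

With the assumed layers a 28 x 28 input shrinks to 24 after the first convolution, 12 after pooling, 8 after the second convolution and 4 after the second pooling, so the linear layer would take 20 * 4 * 4 = 320 inputs and produce 47 outputs. A network that scores all 47 classes equally has a loss of ln(47), about 3.85, which is a handy sanity check for the first training iterations.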

download:
https://github.com/fatalfeel/emnist_dataset_torch_cpp
https://www.mediafire.com/file/vrasmlqam70tals/torch_resnet_cpp.tar.gz

PS: if you see the warning message
[W NNPACK.cpp:80] Could not initialize NNPACK! Reason: Unsupported hardware.
it means the CPU does not support AVX2 instructions.
