Visual Servoing Platform version 3.7.0
Tutorial: Read / Save arrays of data from / to NPZ file format

Introduction

Note
Please refer to the Python tutorial for a short overview of the NPZ format from a Python point of view.

The NPY / NPZ ("a zip file containing multiple NPY files") file format is a "standard binary file format in NumPy", appropriate for binary serialization of large chunks of data. A description of the NPY format is available here.
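To give a feel for what an NPY file contains, here is a minimal sketch that builds the header of an NPY version 1.0 file for a 1-D array of little-endian 64-bit floats. It is written from the public NPY format description and is not the code used by ViSP or cnpy; the function name `make_npy_header_f8` is hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: header of an NPY v1.0 file describing a 1-D array of
// `count` little-endian float64 values. The raw array bytes would follow it.
std::string make_npy_header_f8(std::size_t count)
{
  // Python-literal dictionary giving the dtype, the memory order and the shape.
  std::string dict = "{'descr': '<f8', 'fortran_order': False, 'shape': (" +
    std::to_string(count) + ",), }";

  // Preamble: 6-byte magic string + 2 version bytes + 2-byte header length.
  const std::size_t preamble_size = 10;
  // Pad with spaces so that preamble + dictionary + final '\n' is a multiple of 64.
  const std::size_t unpadded = preamble_size + dict.size() + 1;
  const std::size_t padding = (64 - unpadded % 64) % 64;
  dict.append(padding, ' ');
  dict.push_back('\n');

  std::string header;
  header.push_back(static_cast<char>(0x93)); // first byte of the magic string
  header += "NUMPY";
  header.push_back('\x01'); // major version
  header.push_back('\x00'); // minor version
  // Length of the dictionary, stored as a little-endian unsigned 16-bit integer.
  const std::uint16_t len = static_cast<std::uint16_t>(dict.size());
  header.push_back(static_cast<char>(len & 0xFF));
  header.push_back(static_cast<char>((len >> 8) & 0xFF));
  header += dict;
  return header;
}
```

An NPZ file is then a zip archive in which each entry is one such NPY file, named after its identifier.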

The C++ implementation of this binary format relies on the rogersce/cnpy library, available under the MIT license. Additional example code can be found in the rogersce/cnpy repository.

Comparison with some other file formats

The NPZ binary format is intended to provide a quick and efficient means to read/save large arrays of data, mostly for debugging purposes. While the first and most direct option for saving data would be to use a text file, the NPZ format has the following advantages:

  • it is a binary format, that is, the resulting file will be smaller than a plain text file (especially with floating-point numbers),
  • it provides exact floating-point representation, that is, there is no need to bother with floating-point precision (see for instance the std::setprecision or std::hexfloat manipulators),
  • it provides some basic compatibility with the NumPy NPZ format (numpy.load and numpy.savez),
  • large arrays of data can be easily appended, with support for multi-dimensional arrays.
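The point about exact floating-point representation can be illustrated with the standard library alone. The sketch below is independent of ViSP and uses hypothetical function names. It compares a round trip through text at the default stream precision with a round trip through raw bytes.

```cpp
#include <cstring>
#include <sstream>
#include <string>

// Round trip through text with the default stream precision
// (6 significant digits): the low-order digits are lost.
double roundtrip_text(double value)
{
  std::ostringstream oss;
  oss << value;
  return std::stod(oss.str());
}

// Round trip through the raw bytes, as a binary format does:
// the value is restored bit for bit.
double roundtrip_binary(double value)
{
  unsigned char bytes[sizeof(double)];
  std::memcpy(bytes, &value, sizeof(double));
  double restored = 0.0;
  std::memcpy(&restored, bytes, sizeof(double));
  return restored;
}
```

With a value such as `0.1 + 0.2`, the text round trip returns a slightly different number, while the binary round trip returns the original value.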

On the other hand, the main disadvantages are:

  • it is a non-human readable format, suitable for saving large arrays of data, but not for easy debugging or hierarchical information,
  • compatibility is limited to the Python/NumPy world, contrary to other formats such as XML, JSON, etc.,
  • file compression (numpy.savez_compressed) is not supported.
Note
The current implementation also works on big-endian platforms, but without much guarantee (little-endian is the predominant byte order nowadays).
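As a reminder of what endianness means in practice, the following standalone sketch (hypothetical helper names, not part of ViSP) detects the byte order of the host and reverses the bytes of a 32-bit value, which is the kind of operation needed when data written on one architecture is read on the other.

```cpp
#include <cstdint>
#include <cstring>

// Return true when the least significant byte is stored first in memory.
bool is_little_endian()
{
  const std::uint32_t probe = 1;
  unsigned char first_byte = 0;
  std::memcpy(&first_byte, &probe, 1);
  return first_byte == 1;
}

// Reverse the byte order of a 32-bit value.
std::uint32_t swap_bytes32(std::uint32_t v)
{
  return ((v & 0x000000FFu) << 24) | ((v & 0x0000FF00u) << 8) |
         ((v & 0x00FF0000u) >> 8) | ((v & 0xFF000000u) >> 24);
}
```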

You can refer to this Wikipedia page for an exhaustive comparison of data-serialization formats.

Hands-on

How to save/read string data

Saving C++ std::string data can be achieved the following way:

  • create a string object and convert it to a vector<char> object:
    const std::string save_string = "Open Source Visual Servoing Platform";
    std::vector<char> vec_save_string(save_string.begin(), save_string.end());
  • add and save the data to the .npz file; the identifier is the name under which the data is stored, "w" means write and "a" means append to the archive:
    const std::string npz_filename = "tutorial_npz_read_write.npz";
    const std::string identifier = "My string data";
    visp::cnpy::npz_save(npz_filename, identifier, &vec_save_string[0], { vec_save_string.size() }, "w");

Reading back the data can be done easily:

  • load the data:
    const std::string npz_filename = "tutorial_npz_read_write.npz";
    visp::cnpy::npz_t npz_data = visp::cnpy::npz_load(npz_filename);
  • the identifier is then needed,
  • a conversion from vector<char> to std::string object is required:
    const std::string identifier = "My string data";
    if (npz_data.find(identifier) != npz_data.end()) {
      visp::cnpy::NpyArray arr_string_data = npz_data[identifier];
      std::vector<char> vec_arr_string_data = arr_string_data.as_vec<char>();
      const std::string read_string(vec_arr_string_data.begin(), vec_arr_string_data.end());
      std::cout << "Read string: " << read_string << std::endl;
    }
Note
In the previous example, there is no need to save a null terminator, since reading uses the std::string constructor that takes iterators to the beginning and end of the character data. Additional information can be found here. The other approach would be to:
  • append the null character '\0' to the vector: "vec_save_string.push_back('\0');"
  • and use the constructor that accepts a pointer to the data: "std::string read_string(arr_string_data.data<char>());"
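The two ways of rebuilding the string can be compared without any file I/O. The sketch below uses only the standard library and hypothetical helper names; the std::vector<char> stands in for the data returned by the NPZ reader.

```cpp
#include <string>
#include <vector>

// Copy the characters of a string into a buffer, optionally adding '\0'.
std::vector<char> to_buffer(const std::string &s, bool null_terminated)
{
  std::vector<char> buffer(s.begin(), s.end());
  if (null_terminated) {
    buffer.push_back('\0');
  }
  return buffer;
}

// Approach 1: the buffer holds exactly the characters;
// the string is rebuilt from an iterator range.
std::string from_range(const std::vector<char> &buffer)
{
  return std::string(buffer.begin(), buffer.end());
}

// Approach 2: the buffer ends with '\0';
// the string is rebuilt from a pointer to a C string.
std::string from_c_string(const std::vector<char> &buffer)
{
  return std::string(buffer.data());
}
```

The first approach stores one byte less and does not depend on a terminator being present, which is why the tutorial code uses it.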
Warning
The previous approach is efficient in terms of memory storage but is not compatible with NumPy. Another approach is described below if you need compatibility with the NumPy np.load function.

The NumPy library saves string data in UTF-32 format, where each character is stored using 4 bytes (instead of 1 with the previous approach).

The following code shows how to save a std::string object in a NumPy-compatible way:

const std::string identifier2 = "My string data 2";
visp::cnpy::npz_save(npz_filename, identifier2, save_string, "a");

The value is read back with:

const std::string identifier2 = "My string data 2";
if (npz_data.find(identifier2) != npz_data.end()) {
  visp::cnpy::NpyArray arr_string_data = npz_data[identifier2];
  const std::string read_string2 = arr_string_data.as_utf8_string_vec()[0];
  std::cout << "Read string 2: " << read_string2 << std::endl;
}
Note
For simplicity, ViSP saves std::string with each character taking 4 bytes to be compatible with NumPy. When reading back the data, visp::cnpy::NpyArray::as_utf8_string_vec() keeps only 1 byte per character and ignores the rest. Dealing with UTF-8, UTF-16, UTF-32 and extended string types such as std::wstring is much more complex in C++.
An array of strings can also be saved using the visp::cnpy::npz_save() function, which takes as parameters:
  • a vector of string objects: std::vector<std::string>
  • the shape of the array: std::vector<size_t>
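To make the "4 bytes per character" layout concrete, here is a standalone sketch that packs ASCII strings into a fixed-width buffer of 32-bit code units and reads them back by keeping only the low byte. It assumes the fixed-width, zero-padded layout that NumPy uses for its Unicode string arrays; the function names are hypothetical and this is not the ViSP implementation.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Pack ASCII strings into 32-bit code units. Every string occupies `width`
// slots (the length of the longest string); shorter ones are padded with 0.
std::vector<std::uint32_t> pack_utf32(const std::vector<std::string> &strings, std::size_t &width)
{
  width = 0;
  for (const std::string &s : strings) {
    width = std::max(width, s.size());
  }
  std::vector<std::uint32_t> buffer(strings.size() * width, 0u);
  for (std::size_t i = 0; i < strings.size(); ++i) {
    for (std::size_t j = 0; j < strings[i].size(); ++j) {
      buffer[i * width + j] = static_cast<unsigned char>(strings[i][j]);
    }
  }
  return buffer;
}

// Read the strings back, keeping only the low byte of each code unit and
// stopping at the zero padding. Only valid for single-byte characters.
std::vector<std::string> unpack_utf32(const std::vector<std::uint32_t> &buffer, std::size_t width)
{
  std::vector<std::string> strings;
  if (width == 0) {
    return strings;
  }
  for (std::size_t start = 0; start + width <= buffer.size(); start += width) {
    std::string s;
    for (std::size_t j = 0; j < width && buffer[start + j] != 0u; ++j) {
      s.push_back(static_cast<char>(buffer[start + j] & 0xFFu));
    }
    strings.push_back(s);
  }
  return strings;
}
```

With this layout, three strings whose longest has 5 characters occupy 3 x 5 x 4 = 60 bytes, which shows the memory cost of NumPy compatibility compared with the 1-byte-per-character approach.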

How to save basic data types

Saving C++ basic data types such as int32_t, float or even std::complex<double> is straightforward:

const std::string npz_filename = "tutorial_npz_read_write.npz";
const std::string int_identifier = "My int data";
int int_data = 99;
visp::cnpy::npz_save(npz_filename, int_identifier, &int_data, { 1 }, "w");
const std::string double_identifier = "My double data";
double double_data = 3.14;
visp::cnpy::npz_save(npz_filename, double_identifier, &double_data, { 1 }, "a");
const std::string complex_identifier = "My complex data";
std::complex<double> complex_data(int_data, double_data);
visp::cnpy::npz_save(npz_filename, complex_identifier, &complex_data, { 1 }, "a");

Reading back the data can be done easily:

const std::string npz_filename = "tutorial_npz_read_write.npz";
visp::cnpy::npz_t npz_data = visp::cnpy::npz_load(npz_filename);
const std::string int_identifier = "My int data";
const std::string double_identifier = "My double data";
const std::string complex_identifier = "My complex data";
visp::cnpy::npz_t::iterator it_int = npz_data.find(int_identifier);
visp::cnpy::npz_t::iterator it_double = npz_data.find(double_identifier);
visp::cnpy::npz_t::iterator it_complex = npz_data.find(complex_identifier);
if (it_int != npz_data.end() && it_double != npz_data.end() && it_complex != npz_data.end()) {
  visp::cnpy::NpyArray arr_data_int = it_int->second;
  visp::cnpy::NpyArray arr_data_double = it_double->second;
  visp::cnpy::NpyArray arr_data_complex = it_complex->second;
  int int_data = *arr_data_int.data<int>();
  double double_data = *arr_data_double.data<double>();
  std::complex<double> complex_data = *arr_data_complex.data<std::complex<double>>();
  std::cout << "Read int data: " << int_data << std::endl;
  std::cout << "Read double data: " << double_data << std::endl;
  std::cout << "Read complex data, real: " << complex_data.real() << " ; imag: " << complex_data.imag() << std::endl;
}
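Each array stored in the file carries a NumPy type descriptor made of a kind character and a size in bytes (for instance "f8" for a 64-bit float), which is why the type requested at reading time must match the one used for saving. The following sketch shows how such a descriptor could be derived from a C++ type; the helper `numpy_kind_and_size` is hypothetical and the byte-order prefix ("<", ">" or "|") is deliberately left out.

```cpp
#include <complex>
#include <cstdint>
#include <string>
#include <type_traits>

// Trait detecting std::complex<T> specializations.
template <typename T> struct is_std_complex : std::false_type { };
template <typename T> struct is_std_complex<std::complex<T>> : std::true_type { };

// Hypothetical helper: NumPy kind character followed by the size in bytes.
template <typename T> std::string numpy_kind_and_size()
{
  char kind = '?';
  if (is_std_complex<T>::value) {
    kind = 'c'; // complex floating point
  }
  else if (std::is_floating_point<T>::value) {
    kind = 'f'; // floating point
  }
  else if (std::is_same<T, bool>::value) {
    kind = 'b'; // boolean
  }
  else if (std::is_integral<T>::value) {
    kind = std::is_signed<T>::value ? 'i' : 'u'; // signed / unsigned integer
  }
  return kind + std::to_string(sizeof(T));
}
```

For example, `numpy_kind_and_size<std::int32_t>()` gives "i4" and `numpy_kind_and_size<std::complex<double>>()` gives "c16".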

How to save a vpImage

Finally, one of the advantages of the NPZ format is the possibility to save multi-dimensional arrays easily. As a first example, we will save a vpImage<vpRGBa>.

The following code shows how to read an image:

vpImage<vpRGBa> img;
const std::string img_filename = "ballons.jpg";
vpImageIo::read(img, img_filename);

Then, saving a color image can be achieved as easily as:

if (img.getSize() != 0) {
  const std::string npz_filename = "tutorial_npz_read_write.npz";
  const std::string img_identifier = "My color image";
  visp::cnpy::npz_save(npz_filename, img_identifier, &img.bitmap[0], { img.getRows(), img.getCols() }, "w");
}

We have passed the address to the bitmap array, that is a vector of vpRGBa. The shape of the array is thus "height x width" since all basic elements of the bitmap are already of vpRGBa type (4 unsigned char elements).
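The relation between the "height x width" shape and the underlying bytes can be sketched with a plain 4-byte pixel structure. The `Pixel` type below is a hypothetical stand-in for vpRGBa, and the index helpers simply spell out row-major addressing.

```cpp
#include <cstddef>
#include <vector>

// Stand-in for a 4-channel pixel (hypothetical type, not vpRGBa itself).
struct Pixel
{
  unsigned char R, G, B, A;
};
static_assert(sizeof(Pixel) == 4, "a pixel is expected to take exactly 4 bytes");

// Index of pixel (row, col) in a row-major H x W array of pixels.
std::size_t pixel_index(std::size_t row, std::size_t col, std::size_t width)
{
  return row * width + col;
}

// Index of one channel when the same memory is seen as H x W x 4 bytes.
std::size_t byte_index(std::size_t row, std::size_t col, std::size_t channel, std::size_t width)
{
  return (row * width + col) * sizeof(Pixel) + channel;
}
```

Declaring the shape as { rows, cols } with 4-byte elements, or as { rows, cols, 4 } with 1-byte elements, describes the same contiguous block of memory.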

Reading back the image is done with:

const std::string npz_filename = "tutorial_npz_read_write.npz";
visp::cnpy::npz_t npz_data = visp::cnpy::npz_load(npz_filename);
const std::string img_identifier = "My color image";
visp::cnpy::npz_t::iterator it_img = npz_data.find(img_identifier);
if (it_img != npz_data.end()) {
  visp::cnpy::NpyArray arr_data_img = it_img->second;
  const bool copy_data = false;
  vpImage<vpRGBa> img(arr_data_img.data<vpRGBa>(), static_cast<unsigned int>(arr_data_img.shape[0]), static_cast<unsigned int>(arr_data_img.shape[1]), copy_data);
  std::cout << "Img: " << img.getWidth() << "x" << img.getHeight() << std::endl;
  std::shared_ptr<vpDisplay> ptr_display = vpDisplayFactory::createDisplay(img);
  vpDisplay::displayText(img, 20, 20, "vpImage<vpRGBa>", vpColor::red);
}

The vpImage constructor accepting a vpRGBa pointer is used, with the appropriate image height and width values.

Finally, the image is displayed.

How to save a multi-dimensional array

Similarly, the following code shows how to save a multi-dimensional array with a shape corresponding to {H x W x 3}:

vpImage<vpRGBa> img;
const std::string img_filename = "ballons.jpg";
vpImageIo::read(img, img_filename);
if (img.getSize() != 0) {
  std::vector<unsigned char> vec_data_img;
  vec_data_img.resize(3*img.getSize());
  vpImageConvert::RGBaToRGB(reinterpret_cast<unsigned char *>(img.bitmap), vec_data_img.data(),
                            img.getSize());
  const std::string npz_filename = "tutorial_npz_read_write.npz";
  const std::string img_identifier = "My RGB image";
  visp::cnpy::npz_save(npz_filename, img_identifier, &vec_data_img[0], { img.getRows(), img.getCols(), 3 }, "w");
}

Finally, the image can be read back and displayed with:

const std::string npz_filename = "tutorial_npz_read_write.npz";
visp::cnpy::npz_t npz_data = visp::cnpy::npz_load(npz_filename);
const std::string img_identifier = "My RGB image";
visp::cnpy::npz_t::iterator it_img = npz_data.find(img_identifier);
if (it_img != npz_data.end()) {
  visp::cnpy::NpyArray arr_data_img = it_img->second;
  vpImage<vpRGBa> img(static_cast<unsigned int>(arr_data_img.shape[0]), static_cast<unsigned int>(arr_data_img.shape[1]));
  vpImageConvert::RGBToRGBa(arr_data_img.data<unsigned char>(), reinterpret_cast<unsigned char *>(img.bitmap),
                            img.getSize());
  std::shared_ptr<vpDisplay> ptr_display = vpDisplayFactory::createDisplay(img);
  vpDisplay::displayText(img, 20, 20, "RGBToRGBa", vpColor::red);
}

A specific conversion from RGB to RGBa must be done for compatibility with the ViSP vpRGBa format.
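For readers who want to see what these conversions amount to, here is a standalone sketch of the two operations on plain byte buffers. It is not the ViSP implementation: the function names are hypothetical, and the alpha value written when going back to 4 channels is passed as a parameter because the value chosen by ViSP is not stated here.

```cpp
#include <cstddef>
#include <vector>

// Drop the alpha channel: 4 bytes per pixel in, 3 bytes per pixel out.
std::vector<unsigned char> rgba_to_rgb(const std::vector<unsigned char> &rgba)
{
  const std::size_t nb_pixels = rgba.size() / 4;
  std::vector<unsigned char> rgb(3 * nb_pixels);
  for (std::size_t i = 0; i < nb_pixels; ++i) {
    rgb[3 * i + 0] = rgba[4 * i + 0];
    rgb[3 * i + 1] = rgba[4 * i + 1];
    rgb[3 * i + 2] = rgba[4 * i + 2];
  }
  return rgb;
}

// Re-insert an alpha channel, set to `alpha` for every pixel.
std::vector<unsigned char> rgb_to_rgba(const std::vector<unsigned char> &rgb, unsigned char alpha)
{
  const std::size_t nb_pixels = rgb.size() / 3;
  std::vector<unsigned char> rgba(4 * nb_pixels);
  for (std::size_t i = 0; i < nb_pixels; ++i) {
    rgba[4 * i + 0] = rgb[3 * i + 0];
    rgba[4 * i + 1] = rgb[3 * i + 1];
    rgba[4 * i + 2] = rgb[3 * i + 2];
    rgba[4 * i + 3] = alpha;
  }
  return rgba;
}
```

The color channels survive the round trip unchanged; only the original alpha values are lost when the image is stored with 3 channels.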