
How to make a proper reference for a model

Hi there! This article is intended for IT professionals working in machine learning. Imagine that you have trained a model, achieved good accuracy, and now want to run it in a C++ application. There is plenty of information online about deploying a model, and yet your deployed model may not work correctly. Why does this happen? Let’s figure it out.

One small detail: we work in Computer Vision (CV). Research happens in Python, and models are deployed in C++. If you work in CV, you have probably encountered similar problems, because CV pipelines apply a large number of transform algorithms around the model.

Typically, to work with a model you need to implement a codebase that reproduces the data processing pipeline both before and after the prediction. But sometimes the results differ, because the researcher’s Python script and the application are two separate implementations.

For example, differences can arise from standard functions that behave differently across languages, from a coding error in C++, or from an incorrect description of the pipeline on the researcher’s part. To solve this problem, we created a special regulation.

According to our internal rules, a model may be handed over to a C++ developer only together with documentation, which we usually call a reference. This documentation lets us eliminate most deployment errors and also limits unnecessary back-and-forth between developers and researchers, so both sides are less distracted by clarifying the details of working with the model.

The reference includes several entities:

0_original_image

1_resize_image

2_convert_resized_image_to_float32

4_normalize_image

5_image_swap_axes

6_image_unsqueeze

7_predict

8_mask_squeeze

9_mask_threshold

10_mask_denormalize

11_resize_mask

In addition to the standard transformations (resize, normalization, cast to float), this pipeline required a transpose (file 5) and a mask squeeze (file 8). In this project we had to describe every step in order to find the errors made during deployment of the model. That was when we decided to create such a set of files for every model.
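The steps listed above can be generated automatically on the research side. Below is a minimal sketch of the idea: every intermediate tensor is dumped to disk under the name of the corresponding step, so the C++ developer can compare their pipeline against each file. The transform parameters, the nearest-neighbour resize, and the dummy identity model are illustrative assumptions, not our actual pipeline.

```python
import tempfile
import numpy as np

def nn_resize(img, size):
    # Nearest-neighbour resize; a stand-in for the real interpolation.
    # Any deterministic resize works for reference generation.
    h, w = size
    ys = np.arange(h) * img.shape[0] // h
    xs = np.arange(w) * img.shape[1] // w
    return img[ys][:, xs]

def make_reference(image, predict, out_dir, size=(4, 4),
                   mean=0.5, std=0.5, thr=0.5):
    steps = []
    def save(idx, name, arr):
        # Dump each intermediate tensor so deployment can be
        # checked against it step by step.
        arr = np.asarray(arr)
        np.save(f"{out_dir}/{idx}_{name}.npy", arr)
        steps.append((f"{idx}_{name}", arr.shape))

    save(0, "original_image", image)
    x = nn_resize(image, size)
    save(1, "resize_image", x)
    x = x.astype(np.float32)
    save(2, "convert_resized_image_to_float32", x)
    x = (x / 255.0 - mean) / std
    save(4, "normalize_image", x)
    x = x.transpose(2, 0, 1)          # HWC -> CHW
    save(5, "image_swap_axes", x)
    x = x[np.newaxis]                 # add batch dimension
    save(6, "image_unsqueeze", x)
    y = predict(x)
    save(7, "predict", y)
    y = y.squeeze(0)                  # drop batch dimension
    save(8, "mask_squeeze", y)
    y = (y > thr).astype(np.float32)
    save(9, "mask_threshold", y)
    y = y * 255.0
    save(10, "mask_denormalize", y)
    y = nn_resize(y[0], image.shape[:2])  # back to original size
    save(11, "resize_mask", y)
    return steps

img = np.arange(8 * 8 * 3, dtype=np.uint8).reshape(8, 8, 3)
with tempfile.TemporaryDirectory() as d:
    for name, shape in make_reference(img, predict=lambda t: t, out_dir=d):
        print(name, shape)
```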

Despite the availability of detailed documentation, some errors still occur that the rules described above cannot prevent. Let’s discuss them using the two most interesting examples.

The first is a rounding error. In our situation, the researchers rounded floats, and so did the C++ developers. But the standard float in Python is 64-bit, while in C++ float is 32-bit. Because of this, the references did not match and all the tests failed, until we found the problem: these roundings work differently.

Rounding code in Python:

>>> int(255 * float(0.9999999999999999))
254

Rounding code in C++:

std::cout << int(255 * float(0.9999999999999999));
// output: 255

It may seem that such differences are insignificant, but in a large dataset, the total error accumulates and becomes significant.
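The same difference can be reproduced without leaving Python by casting to numpy’s 32-bit float; this small sketch (assuming numpy is installed) shows why the two languages disagree:

```python
import numpy as np

x = 0.9999999999999999  # the largest 64-bit double strictly below 1.0

# Python's float is a 64-bit double: 255 * x is just below 255,
# and int() truncates it down to 254.
print(int(255 * x))              # 254

# A 32-bit float cannot represent x, so it rounds up to exactly 1.0,
# reproducing the C++ result: 255 * 1.0f == 255.
print(int(255 * np.float32(x)))  # 255
```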

The second example concerns differences in transformations (such as padding or data normalization). We scaled the image in C++ with swscale from ffmpeg using bicubic interpolation, while the researchers used scaling functions from various frameworks, also configured for bicubic interpolation.

Scaling with PIL in Python:

>>> import numpy
>>> from PIL import Image
>>>
>>> data = numpy.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
>>> data = data / 255
>>> img = Image.fromarray(data)
>>> scaled_img = img.resize((2, 2))
>>> scaled_data = numpy.array(scaled_img) * 255
>>> scaled_data
array([[2.2391655, 3.619583 ],
       [6.380418 , 7.7608347]], dtype=float32)
>>> scaled_data.astype(int)
array([[2, 3],
       [6, 7]])

Scaling with cv2 in Python:

>>> import numpy
>>> import cv2
>>>
>>> data = numpy.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
>>> data = data / 255
>>> scaled_data = cv2.resize(data, (2, 2), cv2.INTER_CUBIC)
>>> scaled_data = scaled_data * 255
>>> scaled_data
array([[2. , 3.5],
       [6.5, 8. ]])
>>> scaled_data.astype(int)
array([[2, 3],
       [6, 8]])

Scaling with swscale in C++:

extern "C" {
#include <libswscale/swscale.h>
}
#include <cstdint>
#include <iostream>
#include <vector>

int main() {
    const int size = 3;
    std::vector<uint8_t> img = {1, 2, 3, 4, 5, 6, 7, 8, 9};
    uint8_t *src[1] = { img.data() };
    int srcStride[1] = { size };

    const int scaled_size = 2;
    std::vector<uint8_t> scaled_img(scaled_size * scaled_size);
    uint8_t *dst[1] = { scaled_img.data() };
    int dstStride[1] = { scaled_size };

    SwsContext *swsCtx = sws_getContext(size, size, AV_PIX_FMT_GRAY8,
                                        scaled_size, scaled_size, AV_PIX_FMT_GRAY8,
                                        SWS_BICUBIC, NULL, NULL, NULL);
    sws_scale(swsCtx, src, srcStride, 0, size, dst, dstStride);
    sws_freeContext(swsCtx);

    for (size_t i = 0; i < scaled_size; ++i) {
        for (size_t j = 0; j < scaled_size; ++j) {
            std::cout << (int)scaled_img[i * scaled_size + j] << ' ';
        }
        std::cout << std::endl;
    }
}

C++ output:

2 3

7 9

One solution would be to let each researcher choose whichever tool suits them, which in turn would require a matching implementation in C++ for every tool. In the long term, this leads to a build-up of a large amount of code with identical functionality. Therefore, we went the other way.

We introduced a set of guidelines for selecting tools for various operations. As a result, for scaling a function from OpenCV was chosen in C++, and the corresponding function from cv2 (OpenCV’s Python binding) in Python, so both sides share one implementation.

Similar transformation tools for a large number of cases are stored in our special library, and we add new implementations to it when necessary. The set of tools in this library is unified, which allows it to be used as a pipeline constructor. If you want to try our tools, here is the link.
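To illustrate what such a pipeline constructor might look like, here is a hypothetical sketch; the class and the step names are invented for this example and are not our library’s actual API:

```python
import numpy as np

class Pipeline:
    """Chains unified transform steps; each step is a callable
    taking and returning a numpy array."""
    def __init__(self, *steps):
        self.steps = steps

    def __call__(self, x):
        for step in self.steps:
            x = step(x)
        return x

# Hypothetical unified steps; real ones would wrap the library's
# chosen implementations (e.g. cv2) so Python and C++ stay in sync.
to_float32 = lambda x: x.astype(np.float32)
normalize = lambda x: x / 255.0
add_batch_dim = lambda x: x[np.newaxis]

preprocess = Pipeline(to_float32, normalize, add_batch_dim)
out = preprocess(np.array([[0, 255]], dtype=np.uint8))
print(out.shape, out.dtype)  # (1, 1, 2) float32
```

Because every step has the same call signature, pipelines can be assembled, reordered, and documented step by step, which is exactly what the reference files above capture.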

In conclusion, I would like to highlight a few useful tips.

  1. Do not use same-purpose transformations from different libraries: their implementations may differ, and so may their results.

  2. Stick to one library for same-purpose transformations: supporting multiple libraries significantly inflates the codebase.

  3. Maintain documentation when working with models. It reduces development time by cutting the number of errors and making the remaining ones easier to detect.