Image Preprocessing

Image preprocessing steps to prepare data for models.

Preprocessing

Preprocessing should be applied to both your training and test sets to ensure learning and inference occur on images with the same properties. For example, if your model trains on 500x500 images, it should run inference on images of the same size.
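As a minimal sketch of this principle, one shared preprocessing function can be applied at both training and inference time (the paths and the 500x500 target below are illustrative, not a prescribed pipeline):

```python
from PIL import Image

TARGET_SIZE = (500, 500)  # illustrative; match whatever size your model trains on

def preprocess(path):
    """The single source of truth for preprocessing, shared by both phases."""
    img = Image.open(path).convert("RGB")
    return img.resize(TARGET_SIZE)

train_img = preprocess("train/example.jpg")  # placeholder path, training time
test_img = preprocess("test/example.jpg")    # placeholder path, inference time
```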

Auto-Orient

Auto-orient applies each image's EXIF orientation to its pixel data and strips the EXIF orientation tag, so that images are displayed the same way they are stored on disk.

EXIF data records the orientation of a given image. Applications (like Preview on macOS) use this data to display an image in a specific orientation, even if that orientation differs from how the image is stored on disk. See this front page Hacker News discussion on how this mismatch may silently ruin your object detection models.

Roboflow recommends leaving this on by default and checking how images are fed to your model at inference time.
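In Pillow, a comparable auto-orient step can be sketched with ImageOps.exif_transpose, which applies the EXIF orientation tag to the pixel data and removes it (the file paths below are placeholders):

```python
from PIL import Image, ImageOps

img = Image.open("photo.jpg")            # placeholder path
oriented = ImageOps.exif_transpose(img)  # rotate/flip per EXIF, strip the tag
oriented.save("photo_oriented.jpg")      # pixels now match what viewers display
```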

Resize

Resize changes your images' size and, optionally, scale to a desired set of dimensions. Annotations are adjusted proportionally (except in the case of “fill” below). A code sketch illustrating several of these modes follows the list.

Currently, only downsizing is supported.

Stretch to: Stretch your images to a preferred pixel-by-pixel dimension. Annotations are scaled proportionally. Images are square and distorted when the source aspect ratio differs, but no source image data is lost.

Fill (with center crop) in: The generated image is a center crop at your desired output dimensions: the source image is scaled so its shorter side matches the target, then the center is cropped. For example, if the source image is 2600x2080 and the resize option is set to 416x416, the image is scaled to 520x416 and the central 416x416 region is kept. The aspect ratio is maintained, but source image data is lost.

Fit within: The source image is scaled to fit within the output dimensions while maintaining its aspect ratio. For example, if a source image is 2600x2080 and the resize option is set to 416x416, the longer dimension (2600) is scaled to 416 and the shorter dimension (2080) is scaled to 332.8 pixels. Image aspect ratios and original data are maintained, but the outputs are not square.

Fit (reflect edges) in: The source image is scaled to fit within the output dimensions while maintaining its aspect ratio, and any newly created padding is a reflection of the source image. For example, if a source image is 2600x2080 and the resize option is set to 416x416, the longer dimension (2600) is scaled to 416 and the shorter dimension (2080) is scaled to 332.8 pixels. The remaining pixel area (416 - 332.8, or 83.2 pixels) is filled with reflected pixels from the source image. Notably, Roboflow also reflects annotations by default. Images are square and padded, and aspect ratios plus original data are maintained.

Fit (black edges) in: The source image is scaled to fit within the output dimensions while maintaining its aspect ratio, and any newly created padding is black. For example, if a source image is 2600x2080 and the resize option is set to 416x416, the longer dimension (2600) is scaled to 416 and the shorter dimension (2080) is scaled to 332.8 pixels. The remaining pixel area (416 - 332.8, or 83.2 pixels) is filled with black pixels. Images are square and black-padded, and aspect ratios plus original data are maintained.

Fit (white edges) in: The source image is scaled to fit within the output dimensions while maintaining its aspect ratio, and any newly created padding is white. For example, if a source image is 2600x2080 and the resize option is set to 416x416, the longer dimension (2600) is scaled to 416 and the shorter dimension (2080) is scaled to 332.8 pixels. The remaining pixel area (416 - 332.8, or 83.2 pixels) is filled with white pixels. Images are square and white-padded, and aspect ratios plus original data are maintained.
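For illustration, here is a minimal Pillow sketch of three of these modes, using the 2600x2080 to 416x416 example above. The file path is a placeholder, and the letterbox helper is a hand-rolled approximation of the fit-with-padding behavior, not Roboflow's implementation (it also does not adjust annotations):

```python
from PIL import Image, ImageOps

src = Image.open("source.jpg")  # placeholder path; imagine a 2600x2080 image
TARGET = (416, 416)

# Stretch to: force the exact target size; distorts if aspect ratios differ.
stretched = src.resize(TARGET)

# Fill (with center crop) in: scale the shorter side to fit, then center-crop.
filled = ImageOps.fit(src, TARGET)

# Fit (black/white edges) in: scale the longer side to fit, then pad the rest.
def letterbox(img, size, color=(0, 0, 0)):
    scale = min(size[0] / img.width, size[1] / img.height)  # 416/2600 = 0.16
    w = round(img.width * scale)   # 2600 * 0.16 = 416
    h = round(img.height * scale)  # 2080 * 0.16 = 332.8 -> 333
    canvas = Image.new("RGB", size, color)
    canvas.paste(img.resize((w, h)), ((size[0] - w) // 2, (size[1] - h) // 2))
    return canvas

fit_black = letterbox(src, TARGET)                   # black edges
fit_white = letterbox(src, TARGET, (255, 255, 255))  # white edges
```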

Grayscale

Converts an image with RGB channels into an image with a single grayscale channel. The value of each grayscale pixel is calculated as the weighted sum of the corresponding red, green, and blue pixels: Y = 0.2125 R + 0.7154 G + 0.0721 B

These weights are used by CRT phosphors as they better represent human perception of red, green, and blue than equal weights. (via scikit-image)

Converting to a single channel saves memory: one grayscale channel requires a third of the storage of three RGB channels.
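As a sketch, the same conversion with scikit-image (the source of the weights quoted above); the file path is a placeholder and a 3-channel RGB input is assumed:

```python
import numpy as np
from skimage import io
from skimage.color import rgb2gray

rgb = io.imread("photo.jpg")  # placeholder path; shape (H, W, 3), uint8
gray = rgb2gray(rgb)          # shape (H, W), float values in [0, 1]

# The equivalent explicit weighted sum, Y = 0.2125 R + 0.7154 G + 0.0721 B:
weights = np.array([0.2125, 0.7154, 0.0721])
gray_manual = (rgb / 255.0) @ weights
assert np.allclose(gray, gray_manual)
```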

Auto-Adjust Contrast

Enhances an image with low contrast.

Contrast Stretching: the image is rescaled to include all intensities that fall within the 2nd and 98th percentiles.

Histogram Equalization: “spreads out the most frequent intensity values” in an image. The equalized image has a roughly linear cumulative distribution function.

Adaptive Equalization: Contrast Limited Adaptive Histogram Equalization (CLAHE), an algorithm for local contrast enhancement that uses histograms computed over different tile regions of the image. Local details can therefore be enhanced even in regions that are darker or lighter than most of the image. (via scikit-image)
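A minimal scikit-image sketch of all three techniques, mirroring the exposure module these descriptions are drawn from; the file path is a placeholder:

```python
import numpy as np
from skimage import exposure, io

img = io.imread("photo.jpg", as_gray=True)  # placeholder path; floats in [0, 1]

# Contrast stretching: rescale to the 2nd-98th percentile intensity range.
p2, p98 = np.percentile(img, (2, 98))
stretched = exposure.rescale_intensity(img, in_range=(p2, p98))

# Histogram equalization: roughly linear cumulative distribution function.
equalized = exposure.equalize_hist(img)

# Adaptive equalization (CLAHE): histograms over local tile regions.
adaptive = exposure.equalize_adapthist(img, clip_limit=0.03)
```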