What are Fractionally Strided Convolutions?
Fractionally strided convolutions, also called transposed convolutions and sometimes loosely referred to as deconvolutions, upsample images, typically from a minimized format to a larger one. Imagine an image that has been reduced to a 2x2 pixel format. To bring the image up to a larger format, a fractionally strided convolution first expands the image's spatial resolution, typically by inserting zeros between and around the existing pixels, and then performs an ordinary convolution over the expanded grid.
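The two steps can be sketched in plain Python. This is a minimal illustration, not a reference implementation: the function names are hypothetical, and it assumes a square single-channel image, a square kernel, and no padding in the original downsampling step. The all-ones kernel is chosen only to keep the numbers easy to follow.

```python
def dilate_and_pad(image, stride, kernel_size):
    """Insert (stride - 1) zeros between pixels, then add a border
    of (kernel_size - 1) zeros on every side."""
    n = len(image)
    pad = kernel_size - 1
    size = (n - 1) * stride + 1 + 2 * pad
    out = [[0.0] * size for _ in range(size)]
    for r in range(n):
        for c in range(n):
            out[pad + r * stride][pad + c * stride] = float(image[r][c])
    return out


def convolve_unit_stride(image, kernel):
    """Ordinary stride-1 sliding-window sum of products, no padding.
    The kernel is applied without flipping; for a symmetric kernel
    such as the one used below, flipping makes no difference."""
    k = len(kernel)
    out_size = len(image) - k + 1
    out = [[0.0] * out_size for _ in range(out_size)]
    for r in range(out_size):
        for c in range(out_size):
            out[r][c] = sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(k)
                for j in range(k)
            )
    return out


def fractionally_strided_conv(image, kernel, stride):
    """Expand the spatial resolution first, then convolve."""
    expanded = dilate_and_pad(image, stride, len(kernel))
    return convolve_unit_stride(expanded, kernel)


small = [[1, 2], [3, 4]]                     # the 2x2 image
kernel = [[1.0] * 3 for _ in range(3)]       # illustrative 3x3 kernel
large = fractionally_strided_conv(small, kernel, stride=2)
print(len(large), len(large[0]))             # 5 5
```

With a stride of 2 and a 3x3 kernel, the 2x2 input is spread over a 7x7 zero-filled grid, and the stride-1 convolution over that grid yields a 5x5 output. In a trained network the kernel weights would be learned rather than fixed.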
How do Fractionally Strided Convolutions work?
An image that has passed through a convolutional layer may be transformed into a minimized version of itself according to defined parameters. These parameters typically include assigned values for stride, kernel size, and padding. The kernel size is the viewing window for the convolution, often set to 3, resulting in a 3x3 window. Stride is the step size of the kernel as it moves across the image. While the stride is often set to 1, it can be set to 2 when the image should be downsampled. Padding is the number of extra border pixels, usually zeros, added around the image before the kernel is applied. For example, if a 5x5 pixel image is processed with a stride of 2, a 3x3 kernel, and no padding, the resulting image is 2x2 in resolution. A fractionally strided convolution runs this process in the opposite direction: it first expands the spatial resolution to the target size and then performs the convolution. It is not a mathematical inverse, because it restores the shape of the original image rather than its exact pixel values, but the process is still useful in certain encoding mechanisms.
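The sizes in this example can be checked with the standard output-size formulas. The sketch below is a small illustration with hypothetical function names; it assumes square inputs and kernels and ignores options such as dilation or output padding that some libraries add.

```python
def conv_output_size(input_size, kernel_size, stride=1, padding=0):
    """Side length after an ordinary (downsampling) convolution."""
    return (input_size + 2 * padding - kernel_size) // stride + 1


def transposed_conv_output_size(input_size, kernel_size, stride=1, padding=0):
    """Side length after a fractionally strided (transposed) convolution."""
    return (input_size - 1) * stride - 2 * padding + kernel_size


down = conv_output_size(5, kernel_size=3, stride=2)
up = transposed_conv_output_size(down, kernel_size=3, stride=2)
print(down, up)  # 2 5
```

The round trip recovers the 5x5 shape from the 2x2 image, which is the sense in which the operation reverses the downsampling: the sizes match, but the pixel values are newly computed rather than recovered.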