Unlike you and me, computers only work with binary numbers, so they cannot see or understand an image directly. Instead, we represent images using pixels. In a grayscale image, each pixel takes a value between 0 (black) and 255 (white); the smaller the value, the darker the pixel, and the numbers in between form a spectrum of grays. This range of 256 values fits in exactly one byte (2⁸ in binary), the smallest addressable unit of memory for most computers.
Below is an example image I created in Python and its corresponding pixel values:
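To make this concrete, here is a minimal sketch of how such a grid of pixel values can be built with NumPy (the specific values are illustrative, not the ones from my example image):

```python
import numpy as np

# A tiny 4x4 grayscale "image": each entry is one pixel,
# 0 = black, 255 = white, values in between are grays.
image = np.array([
    [  0,  64, 128, 255],
    [ 64, 128, 255, 128],
    [128, 255, 128,  64],
    [255, 128,  64,   0],
], dtype=np.uint8)  # uint8: each pixel fits in exactly one byte

print(image.shape)              # (4, 4) -> height x width
print(image.min(), image.max()) # 0 255
```

Passing an array like this to `matplotlib.pyplot.imshow(image, cmap="gray")` renders it as an actual picture.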
Using this concept, we can develop algorithms that can see patterns in these pixels to classify images. This is exactly what a Convolutional Neural Network (CNN) does.
Most images are not grayscale; they have color. They are usually represented using RGB, which has three channels: red, green, and blue. Each channel is a two-dimensional grid of pixels, and the three channels are stacked on top of each other, so the input image is three-dimensional.
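The stacking of channels can be sketched as follows (a toy 2×2 image of pure red, just to show the shapes):

```python
import numpy as np

h, w = 2, 2  # a toy 2x2 image

# One 2-D grid of pixel values per channel.
red   = np.full((h, w), 255, dtype=np.uint8)   # red channel fully on
green = np.zeros((h, w), dtype=np.uint8)       # green channel off
blue  = np.zeros((h, w), dtype=np.uint8)       # blue channel off

# Stack the three 2-D channels into one 3-D array:
# (height, width, channels)
rgb = np.stack([red, green, blue], axis=-1)
print(rgb.shape)  # (2, 2, 3)
```

Each pixel is now a triple like `(255, 0, 0)`, and the whole image is a three-dimensional array, which is exactly the shape a CNN expects as input.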
The code used to generate the graph is available on my GitHub:
General description
The key building block of CNNs is the convolution operation. I have an entire article detailing how convolution works, but I'll give a quick summary here for completeness. If you want a deeper understanding, I recommend checking out the previous post: