I embarked on a mission to integrate Apache Flink with Kafka and PostgreSQL using Docker. What makes this effort particularly exciting is the use of PyFlink, the Python API for Flink, which is powerful yet still relatively uncommon. This setup aims to handle real-time data processing and storage efficiently. In the following sections I will demonstrate how I achieved this, discussing the challenges I ran into and how I overcame them, and I'll conclude with a step-by-step guide so you can build and experiment with this streaming pipeline yourself.
The infrastructure we will build is illustrated below. Externally, a publishing module simulates messages from IoT sensors, similar to what was discussed in a previous post. Inside the Docker environment, we will create two Kafka topics. The first topic, sensors, stores incoming messages from the IoT devices in real time. A Flink application then consumes the messages from this topic, filters the ones with temperatures above 30°C, and publishes them to the second topic, alerts. Additionally, the Flink application inserts the consumed messages into a PostgreSQL table created specifically for this purpose. Keeping the sensor data in a structured, tabular format opens the door to further transformation and analysis, and visualization tools like Tableau or Power BI can connect to it to build real-time charts and dashboards.
Additionally, other clients can consume the alerts topic to trigger actions based on its messages, such as activating air-conditioning systems or initiating fire-safety protocols.
To follow the tutorial, you can clone the following repository. A docker-compose.yml file at the root of the project initializes the multi-container application, and detailed instructions are available in the README file.
Issues with Kafka ports in docker-compose.yml
Initially, I ran into issues with the Kafka port configuration when using the Confluent Kafka Docker image, a popular choice for this kind of setup. The problem became evident in the logs, which underlines the importance of not running docker-compose up in detached (-d) mode during the initial configuration and troubleshooting phases.
The reason for the failure was that the internal and external hosts were using the same port, which caused connectivity issues. I fixed this by changing the internal port to 19092. I found this pretty insightful blog post on the topic.
KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:19092,PLAINTEXT_HOST://localhost:9092
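For context, the relevant part of the Kafka service ends up looking roughly like the sketch below (the service name, image tag, and non-listener settings are illustrative; check the repository's docker-compose.yml for the exact definition). The key point is that containers on the Docker network, including Flink, reach the broker through kafka:19092, while clients running on the host machine use localhost:9092.

kafka:
  image: confluentinc/cp-kafka:7.4.0
  ports:
    - "9092:9092"   # only the external listener is exposed to the host
  environment:
    KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
    KAFKA_LISTENERS: PLAINTEXT://0.0.0.0:19092,PLAINTEXT_HOST://0.0.0.0:9092
    KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:19092,PLAINTEXT_HOST://localhost:9092
    KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT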
Configure Flink in session mode
To run Flink in session mode (allowing multiple jobs to run on a single cluster), I use the following directives in docker-compose.yml.
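The exact directives in the repository may differ slightly, but a session cluster in docker-compose.yml typically boils down to a job manager plus one or more task managers along these lines (a condensed sketch; service names, build context, and slot count are illustrative):

jobmanager:
  build: ./pyflink          # custom PyFlink image described in the next section
  command: jobmanager
  ports:
    - "8081:8081"           # Flink web UI
  environment:
    - |
      FLINK_PROPERTIES=
      jobmanager.rpc.address: jobmanager

taskmanager:
  build: ./pyflink
  command: taskmanager
  depends_on:
    - jobmanager
  environment:
    - |
      FLINK_PROPERTIES=
      jobmanager.rpc.address: jobmanager
      taskmanager.numberOfTaskSlots: 2

Because no job is baked into the containers, the cluster stays up and accepts any number of submitted jobs, which is exactly what session mode means.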
Custom Docker image for PyFlink
Given the limitations of the default Apache Flink Docker image, which does not include Python support, I created a custom Docker image for PyFlink. This image ensures that Flink can run Python jobs and includes the dependencies needed to integrate with Kafka and PostgreSQL. The Dockerfile used for this is located in the pyflink subdirectory; a condensed sketch of it follows the list below.
- Base image: we start from the official Flink image.
- Python installation: Python and pip are installed, and pip is upgraded to the latest version.
- Dependency management: dependencies are installed from requirements.txt. Commented-out lines show how to install them manually from local files instead, which is useful for deployments in environments without Internet access.
- Connector libraries: the Kafka and PostgreSQL connectors are downloaded straight into Flink's lib directory, so Flink can interact with Kafka and PostgreSQL during job execution.
- Script copying: the repository's job scripts are copied to the /opt/flink directory so the Flink task manager can execute them.
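Condensed, the Dockerfile looks roughly like the sketch below; the Flink and connector versions shown match the ones discussed in this post, but the exact file names and URLs in the repository may differ slightly.

FROM flink:1.18.1

# Python and pip, required by PyFlink
RUN apt-get update -y && \
    apt-get install -y python3 python3-pip wget && \
    ln -s /usr/bin/python3 /usr/bin/python && \
    pip3 install --upgrade pip

# Python dependencies (alternatively, install from local wheel files)
COPY requirements.txt .
RUN pip3 install -r requirements.txt

# Kafka and JDBC/PostgreSQL connectors go straight into Flink's lib directory
RUN wget -P /opt/flink/lib https://repo.maven.apache.org/maven2/org/apache/flink/flink-sql-connector-kafka/3.1.0-1.18/flink-sql-connector-kafka-3.1.0-1.18.jar && \
    wget -P /opt/flink/lib https://repo.maven.apache.org/maven2/org/apache/flink/flink-connector-jdbc/3.1.2-1.18/flink-connector-jdbc-3.1.2-1.18.jar && \
    wget -P /opt/flink/lib https://repo.maven.apache.org/maven2/org/postgresql/postgresql/42.7.3/postgresql-42.7.3.jar

# Job scripts, executed by the Flink task manager
COPY usr_jobs/ /opt/flink/usr_jobs/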
With this custom Docker image, PyFlink can run correctly inside the container, equipped with the libraries it needs to interact with Kafka and PostgreSQL. This approach provides flexibility and suits both development and production environments.
Note: Ensure that any networking or security considerations for downloading connectors and other dependencies are addressed according to the policies of your deployment environment.
Integrating PostgreSQL
To connect Apache Flink to a PostgreSQL database, a suitable JDBC connector is required. The custom PyFlink Docker image downloads the JDBC connector for PostgreSQL, which is compatible with PostgreSQL 16.
To simplify this process, a download_libs.sh script is included in the repository, which reflects the actions performed in the Flink Docker container. This script automates the download of the necessary libraries, ensuring consistency between Docker and local environments.
Note: Connector artifacts usually carry two version numbers. In this particular case, since I am using Flink 1.18, the latest stable release available at the time of writing, I downloaded 3.1.2-1.18. I assume the first number tracks the JDBC connector implementation for the various databases, while the second one matches the targeted Flink version. Both JARs are available in the Maven repository. In the Flink job, they are registered with the execution environment:
# Register the JDBC connector and the PostgreSQL driver JARs with the Flink environment
env.add_jars(
    f"file://{current_dir}/flink-connector-jdbc-3.1.2-1.18.jar",
    f"file://{current_dir}/postgresql-42.7.3.jar",
)
JDBC sink definition
In our Flink job there is a crucial function, configure_postgre_sink, located in the usr_jobs/postgres_sink.py file. It configures a generic PostgreSQL sink; to use it, you must provide the SQL data manipulation (DML) statement and the corresponding value types. The types of the streamed values are declared in TYPE_INFO, and it took me a while to find the correct declaration.
Also notice that JdbcSink takes an optional parameter to define ExecutionOptions. For this particular case, I use a 1-second flush interval and cap each batch at 200 rows. More details are available in the official documentation. Yes, you guessed it: since I am defining an interval, this can be considered a micro-batch ETL. Thanks to Flink's parallelism, however, it can handle multiple streams at once in a single script that is still easy to follow.
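For reference, here is a minimal sketch of such a configuration using PyFlink's JDBC connector. The table columns, credentials, and database name are illustrative assumptions; the actual configure_postgre_sink and TYPE_INFO in usr_jobs/postgres_sink.py may differ in their details.

from pyflink.common.typeinfo import Types
from pyflink.datastream.connectors.jdbc import (
    JdbcConnectionOptions,
    JdbcExecutionOptions,
    JdbcSink,
)

POSTGRES_HOST = "postgres:5432"  # see the configuration section below

# Illustrative DML and row type: the '?' placeholders must match TYPE_INFO field by field
INSERT_DML = "INSERT INTO raw_sensors_data (message, sensor_id) VALUES (?, ?)"
TYPE_INFO = Types.ROW([Types.STRING(), Types.STRING()])

def configure_postgre_sink(sql_dml: str, type_info):
    """Build a generic JDBC sink that micro-batches inserts into PostgreSQL."""
    return JdbcSink.sink(
        sql_dml,
        type_info,
        JdbcConnectionOptions.JdbcConnectionOptionsBuilder()
        .with_url(f"jdbc:postgresql://{POSTGRES_HOST}/postgres")  # hypothetical database name
        .with_driver_name("org.postgresql.Driver")
        .with_user_name("flink")       # illustrative credentials
        .with_password("flink")
        .build(),
        JdbcExecutionOptions.builder()
        .with_batch_interval_ms(1000)  # flush at least once per second
        .with_batch_size(200)          # or as soon as 200 rows are buffered
        .with_max_retries(5)
        .build(),
    )

# Usage inside the job:
# sensors_stream.add_sink(configure_postgre_sink(INSERT_DML, TYPE_INFO))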
Note: Don't forget to create the raw_sensors_data table in Postgres, where the raw data coming from the IoT sensors will be received. This is covered in the step-by-step guide in the following sections.
Sink data to Kafka
I have covered how to consume data from a Kafka topic in a previous discussion. However, I haven't set up a Kafka sink yet, and that's what we'll do now. Its configuration has some intricacies and, like the Postgres sink, it is defined in a function. Additionally, you must define the stream's data type before sending it to Kafka. Notice that the alarms_data stream is explicitly converted to strings with output_type=Types.STRING() before it is handed to the sink, since I declared the serializer as SimpleStringSchema().
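As a rough sketch of the idea (the repository may use a different connector class or helper name), such a sink can be built with the KafkaSink builder and a plain string serializer:

from pyflink.common.serialization import SimpleStringSchema
from pyflink.datastream.connectors.kafka import KafkaRecordSerializationSchema, KafkaSink

KAFKA_HOST = "kafka:19092"  # see the configuration section below

def configure_kafka_sink(topic: str, bootstrap_servers: str) -> KafkaSink:
    """Build a Kafka sink that writes string messages to the given topic."""
    return (
        KafkaSink.builder()
        .set_bootstrap_servers(bootstrap_servers)
        .set_record_serializer(
            KafkaRecordSerializationSchema.builder()
            .set_topic(topic)
            .set_value_serialization_schema(SimpleStringSchema())
            .build()
        )
        .build()
    )

# alarms_data has already been converted to strings (output_type=Types.STRING()),
# so it can be handed straight to the sink:
# alarms_data.sink_to(configure_kafka_sink("alerts", KAFKA_HOST))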
I will show you how to read data back from the alerts topic in the following steps.
Local or containerized configuration
One of the best things about this Docker setup is that you can run Flink either locally or inside the container as a managed job. The local Flink configuration is shown in the figure below, where our Flink application runs outside the Docker containers. Running locally can help when troubleshooting Flink, which does not ship with a strong set of native observability tools; dedicated monitoring tools for Flink look very promising, and I would like to try them.
If you want to test the Flink application locally, you must correctly define the hosts and ports used by the script, which are actually two constants in the usr_jobs/postgres_sink.py file:
To run the job inside the containers, use:
KAFKA_HOST = "kafka:19092"
POSTGRES_HOST = "postgres:5432"
For local execution, use:
KAFKA_HOST = "localhost:9092"
POSTGRES_HOST = "localhost:5432"
By default, the repository configures the Flink application to run inside the container. You can monitor running jobs through the Flink web UI at http://localhost:8081; jobs launched locally will not appear there.
Note: If you run the job locally, you must install the Flink dependencies listed in requirements.txt. A pyproject.toml file is also provided if you prefer to set up the environment with Poetry.