Blog Archives

01: 15+ Apache Kafka must-know basics interview Q&As – Part 1

Apache Kafka is used in architectures ranging from Microservices Architecture (i.e. MSA) to Big Data & low-latency applications.

Q1. What is Apache Kafka?
A1. Apache Kafka is a distributed messaging broker. The purpose of the Kafka project is to provide a unified, high-throughput, and low-latency platform for real-time data processing. Kafka delivers the following three key functions:

1) Kafka publish-subscribe paradigm: Kafka supports the publish & subscribe model similar to traditional messaging systems like ActiveMQ, RabbitMQ, WebSphere MQ, etc. The publish-subscribe paradigm is facilitated by a number of brokers in a cluster, and each broker hosts a number of topics to which producers can publish & from which consumers can subscribe. Each topic is further split into partitions to provide parallelism & fault tolerance, whereby messages can be published & consumed in parallel. Each partition is replicated (e.g. 3 times) across the brokers for fault tolerance. One replica acts as the leader & the remaining replicas are followers. Messages are published to & consumed from the leader replica. If the broker hosting a leader replica goes down, one of the follower replicas on another active broker is elected as the new leader. (A code sketch after item 2 below shows how such a topic can be created.)

2) Storage: Kafka reliably stores streaming data in a distributed and fault-tolerant cluster. The data retention period is configurable; the default retention period is 168 hours, i.e. 1 week.… Read more ...
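To make points 1) & 2) concrete, here is a hypothetical sketch (not from the article) that uses Kafka's standard AdminClient API to create a topic with 3 partitions, a replication factor of 3, and an explicit 1-week retention; the topic name & broker address are assumptions:

import java.util.Collections;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;

public class CreateTopicExample {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed address
        try (AdminClient admin = AdminClient.create(props)) {
            // 3 partitions for parallelism; replication factor 3 for fault tolerance
            // (requires a cluster with at least 3 brokers)
            NewTopic topic = new NewTopic("orders", 3, (short) 3)
                    // per-topic retention override: 168 h = 604,800,000 ms (the default 1 week)
                    .configs(Map.of("retention.ms", "604800000"));
            admin.createTopics(Collections.singleton(topic)).all().get();
        }
    }
}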



01: 15+ Apache Kafka must-know basics interview Q&As – Part 2

This extends 15+ Apache Kafka must-know basics interview Q&As – Part 1. Q4. What do you understand by the term “data is presented to Kafka as stream”? A4. This means the data is acquired from source systems either in real time or via a scheduled extract process; the data is…

Read more ...


01: 15+ Apache Kafka must-know basics interview Q&As – Part 3

This extends 15+ Apache Kafka must-know basics interview Q&As – Part 2. Q10. What do you understand by the terms Kafka Consumer Groups & group.id? A10. Each consumer reads from one or more partitions, allowing you to scale the throughput of message consumption as depicted below. Consumers can also be organised into consumer groups…

Read more ...
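As a taster for Q10 (this is not the article's code), here is a minimal consumer sketch showing group.id in action; the broker address & topic name are assumptions carried over from the tutorials below. Running two copies of this class with the same group.id splits the topic's partitions between them:

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class ConsumerGroupApp {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9093"); // assumed address
        // Consumers sharing the same group.id divide the topic's partitions among themselves
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "my-consumer-group");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("my-first-topic")); // hypothetical topic
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    System.out.printf("partition=%d offset=%d value=%s%n",
                            record.partition(), record.offset(), record.value());
                }
            }
        }
    }
}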


01: Apache Kafka example with Java – getting started tutorial

This Apache Kafka with Java getting-started tutorial demonstrates how quickly you can get started with Kafka using Docker.

Step 1: Make sure the Docker engine is installed on your computer. For example, on macOS run $ brew install --cask docker (formerly $ brew cask install docker), or install Docker Desktop on Windows.

Step 2: Start the Docker engine on your operating system.

Kafka services on Docker

Step 3: Create the below docker-compose.yml file to run your Kafka, Zookeeper & Apache Kafka Cluster Visualization (AKHQ) services. The images for these services are sourced from Docker Hub.
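The excerpt elides the file itself; below is a minimal, hypothetical sketch (image tags & service layout are assumptions) wiring up Zookeeper, a single Kafka broker exposed on localhost:9093, and AKHQ on port 8080:

version: '3'
services:
  zookeeper:
    image: confluentinc/cp-zookeeper:7.4.0   # assumed image & tag
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
  kafka:
    image: confluentinc/cp-kafka:7.4.0       # assumed image & tag
    depends_on: [zookeeper]
    ports:
      - "9093:9093"                          # external listener for clients on the host
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_LISTENERS: INTERNAL://0.0.0.0:9092,EXTERNAL://0.0.0.0:9093
      KAFKA_ADVERTISED_LISTENERS: INTERNAL://kafka:9092,EXTERNAL://localhost:9093
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: INTERNAL:PLAINTEXT,EXTERNAL:PLAINTEXT
      KAFKA_INTER_BROKER_LISTENER_NAME: INTERNAL
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1   # single-broker dev setup
  akhq:
    image: tchiotludo/akhq
    depends_on: [kafka]
    ports:
      - "8080:8080"                          # AKHQ web UI
    environment:
      AKHQ_CONFIGURATION: |
        akhq:
          connections:
            docker-kafka:
              properties:
                bootstrap.servers: "kafka:9092"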

Read more ...


02: Apache Kafka example with Java Producer & Consumer Tutorial

This Apache Kafka with Java tutorial demonstrates how quickly you can write a producer & consumer for Kafka running on Docker. It extends Apache Kafka example with Java – getting started tutorial – Part 1.

Step 1: As discussed in Part 1, stand up the Kafka, Zookeeper & Apache Kafka Cluster Visualization (AKHQ) services in Docker containers.

Java Producer class ProducerApp.java

Step 2: Create the Java class to produce & send a message to the Kafka broker topic “my-first-topic”.
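A minimal sketch of what such a producer class might look like (the tutorial's actual code is behind the link; the broker address localhost:9093 matches the Docker setup from Part 1):

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class ProducerApp {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Assumes the Docker broker from Part 1 is listening on localhost:9093
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9093");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Send a single record to the topic created earlier
            producer.send(new ProducerRecord<>("my-first-topic", "key-1", "Hello Kafka!"));
            producer.flush();
        }
    }
}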

Read more ...


03: Apache Kafka example with toxiproxy & Producer timeout handling

This extends Apache Kafka example with Java Producer & Consumer Tutorial – Part 2, covering an Apache Kafka example with toxiproxy & producer timeout handling. The producer code needs to be robust enough to handle network issues like timeouts & latency. In this tutorial, a tool named toxiproxy is used to simulate such network issues (a sketch of this hardening appears after the steps below).

Kafka server on Docker

Step 1: As discussed in Part 1, stand up the Kafka, Zookeeper & Apache Kafka Cluster Visualization (AKHQ) services in Docker containers.

Step 2: Verify that Kafka is running on localhost:9093.

Outputs:

Install toxiproxy

Step 3: Install toxiproxy as described below.… Read more ...
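Not the article's code, but a sketch of the kind of hardening the tutorial covers: standard Kafka producer timeout settings plus a send callback that reacts to a TimeoutException when toxiproxy injects latency (topic name & timeout values are illustrative):

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.errors.TimeoutException;
import org.apache.kafka.common.serialization.StringSerializer;

public class RobustProducerApp {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9093"); // assumed address
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        // Fail fast when the broker (or the proxy in front of it) is unresponsive
        props.put(ProducerConfig.REQUEST_TIMEOUT_MS_CONFIG, "5000");    // per-request timeout
        props.put(ProducerConfig.DELIVERY_TIMEOUT_MS_CONFIG, "10000");  // overall send deadline, incl. retries
        props.put(ProducerConfig.RETRIES_CONFIG, "3");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("my-first-topic", "key-1", "payload"),
                    (metadata, exception) -> {
                        if (exception instanceof TimeoutException) {
                            // Triggered when injected latency exceeds the deadlines above
                            System.err.println("Send timed out: " + exception.getMessage());
                        } else if (exception != null) {
                            System.err.println("Send failed: " + exception.getMessage());
                        }
                    });
            producer.flush();
        }
    }
}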



05: Apache Kafka Streaming with JSON & Java Tutorial

This extends Apache Kafka JSON example with Java Producer & Consumer Tutorial. In this tutorial, let’s look at Kafka Streams, which enables you to consume from Kafka topics, analyse, transform or aggregate data, and potentially send it to another Kafka topic.

This Kafka tutorial covers stateless processing, which means there is no interaction between individual messages. In the upcoming tutorials we will look at stateful processing like count, aggregate, windowing, etc.

The pom.xml with kafka-streams

Add “kafka-streams” to the Maven pom.xml file to enable Kafka streaming. In the Kafka consumer tutorials we had the kafka-clients, Spring Kafka and Jackson libraries.
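The key addition is the kafka-streams artifact (the version shown is an assumption; align it with your Kafka client & broker versions):

<dependency>
    <groupId>org.apache.kafka</groupId>
    <artifactId>kafka-streams</artifactId>
    <!-- version is an assumption; match it to your cluster -->
    <version>3.4.0</version>
</dependency>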

Here is the complete pom.xml.

Read more ...
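To give a feel for the topology this tutorial builds (not its actual code; topic names & the transformation are placeholders), here is a minimal stateless Kafka Streams sketch that reads JSON strings from one topic, transforms each record independently, and writes the result to another topic:

import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;

public class JsonStreamApp {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "json-stream-app");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9093"); // assumed address
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();
        // Read raw JSON strings, transform each message independently (stateless),
        // and write the result to another topic
        KStream<String, String> source = builder.stream("input-json-topic"); // hypothetical topic
        source.mapValues(json -> json.toUpperCase())   // placeholder transformation
              .to("output-json-topic");                // hypothetical topic

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}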


06: Apache Kafka Streaming with JSON & Java Tutorial – stateful operations groupByKey, windowing & aggregate

This extends Apache Kafka Streaming with JSON & Java Tutorial, and you can either use Kafka on Docker as per Apache Kafka example with Java – getting started tutorial, or install Kafka on a Mac (i.e. Getting started with Apache Kafka on Mac Tutorial) or on Windows. This Kafka tutorial is stateful…

Read more ...
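A minimal sketch of the kind of stateful pipeline this tutorial builds (not its actual code; topic names are hypothetical), combining groupByKey, a 5-minute tumbling window & count as the built-in aggregation:

import java.time.Duration;
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.KeyValue;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.Produced;
import org.apache.kafka.streams.kstream.TimeWindows;

public class WindowedCountApp {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "windowed-count-app");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9093"); // assumed address
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> events = builder.stream("events-topic"); // hypothetical topic

        events.groupByKey()  // stateful: groups records sharing the same key
              // 5-minute tumbling windows; ofSizeWithNoGrace needs kafka-streams 3.0+
              .windowedBy(TimeWindows.ofSizeWithNoGrace(Duration.ofMinutes(5)))
              .count()       // built-in aggregation: record count per key per window
              .toStream()
              // flatten the windowed key into a plain string key for the output topic
              .map((windowedKey, count) ->
                      KeyValue.pair(windowedKey.key() + "@" + windowedKey.window().start(), count))
              .to("events-per-window-topic", Produced.with(Serdes.String(), Serdes.Long()));

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}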


07: Apache Kafka Streaming with stateless operations

This extends 06: Apache Kafka Streaming with JSON & Java Tutorial – stateful operations groupByKey, windowing & aggregate and Getting started with Apache Kafka on Mac Tutorial.

Kafka Streams DSL (i.e. Domain Specific Language) is built on top of the Streams Processor API. It is recommended to use the DSL, as most data processing operations can be expressed in just a few lines of DSL code. Kafka streaming can be stateful or stateless: stateful operations depend on previous events of the stream, whereas stateless operations do not.
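For instance, assuming the StreamsBuilder boilerplate from the earlier streaming sketches, a chain of purely stateless DSL operations might look like this (topic names & the filter condition are hypothetical):

// Each operation below processes records independently - no state store is required.
KStream<String, String> orders = builder.stream("orders-topic");
orders.filter((key, value) -> value != null && value.contains("CONFIRMED")) // drop non-matching records
      .mapValues(value -> value.toLowerCase())                              // per-record transformation
      .peek((key, value) -> System.out.println("order: " + key))            // side effect for debugging
      .to("confirmed-orders-topic");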

Let’s look at a non-trivial stateless streaming example.

#1. Maven pom.xml

Read more ...

