Best practices

News stories and content move from source to online news sites in real time. Apache Kafka's durable, fault-tolerant publish-subscribe messaging system is well suited to the demands of the news industry.

Streaming news articles are medium-sized, around 500 KB each. The messages include text, metadata, and small references to multimedia elements. The message rate is variable and depends largely on the size of the articles. Peak activity typically occurs in the morning, when new articles are published, and trails off in the evening.

If you’re developing an online news or content streaming system, the inputs below are good defaults for the partition calculator.


Results

Number of producers: 3
Number of consumers: 3
Expected lag: 0
Number of partitions: 3

Business size preset

The size of the company in terms of article production (not number of employees, readership, or market capitalization).
Medium

Number of brokers

The number of brokers in the cluster where the topic will be created.
Range: 1 to 140

Producer processing time

The average time it takes to produce a message in ms.
Range: 0.1 ms to 1,000 ms

Consumer processing time

The average amount of time it takes to consume a message in ms.
Range: 0.1 ms to 1,000 ms

Throughput

The number of messages that the system should process per second.
Range: 1 msg/s to 10,000 msg/s

How to increase partitions

Learn how to increase your topic partitions and what effects this will have on your cluster.


Recommended configuration for a medium business size

  • Desired throughput (T) is quite low, but appropriate for the number of articles a news company of this size publishes.
  • Time to produce a single message (P) is 50 ms on average, given the medium size of the articles.
  • Time to consume a single message (C) is 1 ms, due to the small effort needed to process the messages.
  • The number of brokers (B) is also very low and balances the need for redundancy and resource management within the given constraints.
  • The expected lag (L) is 0, which is ideal. In the realm of news, consumers demand up-to-the-minute updates, making it critical to keep latency as low as possible.
  • The recommended number of producers (RP) and consumers (RC) is 2, matching the number of brokers. This configuration maintains an optimal balance between the message production rate and consumption rate.
  • The number of partitions for topics (NP) should also be 2, fitting within the broker and cluster partition limits.

Each element of this configuration helps ensure that readers experience minimal latency in news updates, contributing to a smooth and responsive service.
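As a rough illustration of how these inputs relate, the Python sketch below applies a common rule of thumb: one producer or consumer that handles messages one at a time can process about 1000 / (time per message in ms) messages per second, so the number needed is the target throughput divided by that rate, rounded up. This is not the calculator's actual formula. The function name estimate_partitions, the throughput of 40 msg/s, and the rule that no count drops below the broker count are assumptions made for illustration. P = 50 ms, C = 1 ms, and the 2 brokers come from the list above.

```python
import math


def estimate_partitions(throughput_msg_s, produce_ms, consume_ms, brokers=1):
    """Rule-of-thumb sizing sketch (hypothetical helper, not the calculator's formula)."""
    # Work arriving per second, expressed as "seconds of processing per second".
    # Rounding up gives the number of sequential workers needed to keep pace.
    producers_needed = math.ceil(throughput_msg_s * produce_ms / 1000)
    consumers_needed = math.ceil(throughput_msg_s * consume_ms / 1000)

    # Assumption mirroring the text: no count drops below the broker count.
    producers = max(1, producers_needed, brokers)
    consumers = max(1, consumers_needed, brokers)

    # Each consumer in a group needs at least one partition to read from,
    # so the partition count must cover the larger of the two sides.
    partitions = max(producers, consumers)
    return {"producers": producers, "consumers": consumers, "partitions": partitions}


# T = 40 msg/s is a hypothetical value; the page only says throughput is "quite low".
result = estimate_partitions(throughput_msg_s=40, produce_ms=50, consume_ms=1, brokers=2)
print(result)  # {'producers': 2, 'consumers': 2, 'partitions': 2}
```

Under these assumptions, two consumers at 1 ms per message could handle roughly 2,000 msg/s, far more than the assumed 40 msg/s, which is consistent with an expected lag of 0.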