News stories and content move from source to online news sites in real time. Apache Kafka's durable, fault-tolerant publish-subscribe messaging system is tailor-made to handle the demands of the news industry.
Streaming news articles are medium-sized, around 500 KB each. The article messages include text, metadata, and small references to multimedia elements. The message rate is variable and depends largely on the size of the articles. Peak activity typically occurs in the morning when new articles are published, with activity trailing off in the evening.
If you’re developing an online news or content streaming system, the inputs below are good defaults for the partition calculator.
- The brokers are of similar capability.
- The load on the brokers’ machines is similar.
- The messages don't diverge too much in size.
- The messages are evenly distributed across all partitions.
- The number of brokers makes sense in this context.
- Brokers have similar latencies between producers and consumers.
- The throughput per producer is less than 10 MB/s.
- Individual brokers have fewer than 40k partitions.
- The cluster has fewer than 200k partitions in total.
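
The three numeric limits in the list above can be checked against a planned deployment before you rely on the calculator's output. The following is a minimal Python sketch, not part of the calculator itself: the `ClusterPlan` class, the `check_plan` and `messages_per_second` functions, and the example figures are hypothetical names and values chosen for illustration, and the conversion assumes 1 MB = 1000 KB.

```python
from dataclasses import dataclass
from typing import List

# Limits taken from the list above.
MAX_PRODUCER_THROUGHPUT_MB_S = 10
MAX_PARTITIONS_PER_BROKER = 40_000
MAX_PARTITIONS_PER_CLUSTER = 200_000


@dataclass
class ClusterPlan:
    """Hypothetical description of a planned deployment."""
    producer_throughput_mb_s: float   # throughput of a single producer
    partitions_per_broker: List[int]  # one entry per broker


def check_plan(plan: ClusterPlan) -> List[str]:
    """Return a list of violated limits; an empty list means none were found."""
    problems = []
    if plan.producer_throughput_mb_s >= MAX_PRODUCER_THROUGHPUT_MB_S:
        problems.append("producer throughput is not below 10 MB/s")
    for index, count in enumerate(plan.partitions_per_broker):
        if count >= MAX_PARTITIONS_PER_BROKER:
            problems.append(f"broker {index} does not have fewer than 40k partitions")
    if sum(plan.partitions_per_broker) >= MAX_PARTITIONS_PER_CLUSTER:
        problems.append("cluster does not have fewer than 200k partitions")
    return problems


def messages_per_second(throughput_mb_s: float, message_size_kb: float = 500) -> float:
    """Convert a throughput into a message rate, assuming 1 MB = 1000 KB."""
    return throughput_mb_s * 1000 / message_size_kb


if __name__ == "__main__":
    plan = ClusterPlan(producer_throughput_mb_s=8,
                       partitions_per_broker=[1000, 2000, 1500])
    print(check_plan(plan))
    print(messages_per_second(plan.producer_throughput_mb_s))
```

With articles of roughly 500 KB, the 10 MB/s producer ceiling corresponds to about 20 messages per second per producer under these assumptions, which gives a quick sanity check on expected peak-hour rates.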