Apache Samza 0.14 [Docs]

We are very excited to announce the release of Apache Samza 0.14.0. It is a major release with highly anticipated features, namely Samza SQL, Azure EventHubs support, and an AWS Kinesis consumer.

#### Enhancements and Bug Fixes

Overall, 65 JIRAs were resolved in this release. Here are a few highlights:

  • SAMZA-1510 Introduce SQL semantics to Samza
  • SAMZA-1438 Implement producer and consumer for Azure EventHubs
  • SAMZA-1515 Implement Kinesis consumer
  • SAMZA-1486 Checkpoint provider for Azure tables
  • SAMZA-1421 Support for durable state in high-level API
  • SAMZA-1392 Fix performance and correctness issues with concurrent sends and flushes in the Kafka system producer
  • SAMZA-1406 Enhancements to the ZooKeeper-based deployment model
  • SAMZA-1321 Support for multi-stage batch processing

#### Upgrade Notes

  • A new mandatory configuration, job.coordination.utils.factory, has been introduced. It affects applications that use non-YARN deployment models. Read more about it here.
  • The following SystemAdmin APIs were deprecated in previous versions and have now been replaced with newer APIs. If you have a custom System implementation, you must update it to use the newer APIs.
    • void createChangelogStream(String streamName, int numOfPartitions); -> boolean createStream(StreamSpec streamSpec);
    • void createCoordinatorStream(String streamName); -> boolean createStream(StreamSpec streamSpec);
    • void validateChangelogStream(String streamName, int numOfPartitions); -> void validateStream(StreamSpec streamSpec) throws StreamValidationException;
  • A new API that clears a stream has been added to SystemAdmin.
    • boolean clearStream(StreamSpec streamSpec); Read more about it in the API docs.
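
For custom System implementations, the SystemAdmin changes listed above could be approached roughly as in the following sketch. It is not the Samza API itself: StreamSpec and StreamValidationException here are simplified stand-ins declared locally so the example compiles on its own, InMemoryAdmin is a hypothetical admin that tracks streams in a map, and the meaning of the boolean return values (true when the stream was newly created or actually cleared) is an assumption of this sketch. Only the three method signatures are taken from the notes above.

```java
import java.util.HashMap;
import java.util.Map;

class SystemAdminMigrationSketch {

    /** Stand-in for Samza's StreamSpec; holds only the two fields this sketch needs. */
    static final class StreamSpec {
        final String physicalName;
        final int partitionCount;

        StreamSpec(String physicalName, int partitionCount) {
            this.physicalName = physicalName;
            this.partitionCount = partitionCount;
        }
    }

    /** Stand-in for Samza's StreamValidationException. */
    static final class StreamValidationException extends RuntimeException {
        StreamValidationException(String message) {
            super(message);
        }
    }

    /** Hypothetical custom admin that tracks streams in memory instead of a real system. */
    static final class InMemoryAdmin {
        private final Map<String, Integer> partitionsByStream = new HashMap<>();

        // Takes the place of createChangelogStream(...) and createCoordinatorStream(...).
        // Assumption: returns true only when the stream did not exist before this call.
        boolean createStream(StreamSpec spec) {
            return partitionsByStream.putIfAbsent(spec.physicalName, spec.partitionCount) == null;
        }

        // Takes the place of validateChangelogStream(...).
        void validateStream(StreamSpec spec) throws StreamValidationException {
            Integer actual = partitionsByStream.get(spec.physicalName);
            if (actual == null) {
                throw new StreamValidationException("Stream not found: " + spec.physicalName);
            }
            if (actual != spec.partitionCount) {
                throw new StreamValidationException("Expected " + spec.partitionCount
                        + " partitions for " + spec.physicalName + " but found " + actual);
            }
        }

        // New in this release. Assumption: returns true when the stream existed and was removed.
        boolean clearStream(StreamSpec spec) {
            return partitionsByStream.remove(spec.physicalName) != null;
        }
    }

    public static void main(String[] args) {
        InMemoryAdmin admin = new InMemoryAdmin();
        StreamSpec changelog = new StreamSpec("my-store-changelog", 4);

        System.out.println("created: " + admin.createStream(changelog));
        admin.validateStream(changelog); // throws if the stream is missing or mismatched
        System.out.println("cleared: " + admin.clearStream(changelog));
    }
}
```

The point of the sketch is that a single StreamSpec argument now carries what used to be passed as separate stream name and partition count parameters, so one createStream implementation can serve both the changelog and the coordinator stream cases.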

#### Sources and Artifacts

Samza-sources-0.14.tgz

For more details about this release, please check out the release blog post.