The Orchestrator: State Transitions and Kafka Routing
In the previous post, I explained why I chose the Saga Pattern over distributed transactions. Now let's look at the central piece: the orchestrator.

The orchestrator is the brain of the system. It receives events from all services and decides what happens next. It doesn't hold any business logic. It doesn't talk to databases. It just routes messages based on a state transition table.

The entire saga flow is defined in a single static array. Each row maps a (source, status) pair to the next Kafka topic:

```java
public final class SagaHandler {

    public static final Object[][] SAGA_HANDLER = {
        { ORCHESTRATOR, SUCCESS, PRODUCT_VALIDATION_SUCCESS },
        { ORCHESTRATOR, FAIL, FINISH_FAIL },

        { PRODUCT_VALIDATION_SERVICE, ROLLBACK, PRODUCT_VALIDATION_FAIL },
        { PRODUCT_VALIDATION_SERVICE, FAIL, FINISH_FAIL },
        { PRODUCT_VALIDATION_SERVICE, SUCCESS, PAYMENT_SUCCESS },

        { PAYMENT_SERVICE, ROLLBACK, PAYMENT_FAIL },
        { PAYMENT_SERVICE, FAIL, PRODUCT_VALIDATION_FAIL },
        { PAYMENT_SERVICE, SUCCESS, INVENTORY_SUCCESS },

        { INVENTORY_SERVICE, ROLLBACK, INVENTORY_FAIL },
        { INVENTORY_SERVICE, FAIL, PAYMENT_FAIL },
        { INVENTORY_SERVICE, SUCCESS, FINISH_SUCCESS }
    };

    public static final int EVENT_SOURCE_INDEX = 0;
    public static final int SAGA_STATUS_INDEX = 1;
    public static final int TOPIC_INDEX = 2;
}
```

Read it like this: when `PAYMENT_SERVICE` sends `SUCCESS`, publish to `inventory-success`. When `INVENTORY_SERVICE` sends `FAIL`, publish to `payment-fail` (rollback the payment). When `INVENTORY_SERVICE` sends `SUCCESS`, publish to `finish-success` (saga complete).

This table is the entire orchestration logic. Adding a new step means adding rows. Changing the order means reordering rows. No if/else chains. No complex routing code.
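To make "adding a new step means adding rows" concrete, here's a hypothetical extension that inserts a shipping step after inventory. Everything named `SHIPPING_*` is invented for illustration and does not exist in the repo; the point is that one existing row changes its target and three new rows describe the new service.

```java
// Hypothetical sketch: inserting a shipping step after inventory.
// All SHIPPING_* names are invented; only the rows that change or
// get added are shown here, with simplified local enums.
public class ShippingStepSketch {
    enum Source { INVENTORY_SERVICE, SHIPPING_SERVICE }
    enum Status { SUCCESS, ROLLBACK, FAIL }
    enum Topic { SHIPPING_SUCCESS, SHIPPING_FAIL, INVENTORY_FAIL, FINISH_SUCCESS }

    public static final Object[][] NEW_ROWS = {
        // was { INVENTORY_SERVICE, SUCCESS, FINISH_SUCCESS } -- now routes onward:
        { Source.INVENTORY_SERVICE, Status.SUCCESS, Topic.SHIPPING_SUCCESS },
        // the new service's own compensation topic:
        { Source.SHIPPING_SERVICE, Status.ROLLBACK, Topic.SHIPPING_FAIL },
        // shipping already rolled back, so compensate the previous step:
        { Source.SHIPPING_SERVICE, Status.FAIL, Topic.INVENTORY_FAIL },
        // shipping is now the last step before finishing:
        { Source.SHIPPING_SERVICE, Status.SUCCESS, Topic.FINISH_SUCCESS }
    };

    public static void main(String[] args) {
        System.out.println(NEW_ROWS.length + " rows changed or added");
    }
}
```

No routing code changes; the lookup stays exactly the same.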
The SagaExecutionController looks up the table on every event:

```java
@Component
public class SagaExecutionController {

    public TopicsEnum getNextTopic(Event event) {
        if (isEmpty(event.getSource()) || isEmpty(event.getStatus())) {
            throw new ValidationException("Source and status must be informed.");
        }
        return findTopicBySourceAndStatus(event);
    }

    private TopicsEnum findTopicBySourceAndStatus(Event event) {
        return (TopicsEnum) Arrays.stream(SAGA_HANDLER)
            .filter(row -> isEventSourceAndStatusValid(event, row))
            .map(row -> row[TOPIC_INDEX])
            .findFirst()
            .orElseThrow(() -> new ValidationException("Topic not found!"));
    }

    private boolean isEventSourceAndStatusValid(Event event, Object[] row) {
        var source = row[EVENT_SOURCE_INDEX];
        var status = row[SAGA_STATUS_INDEX];
        return source.toString().equals(event.getSource())
            && status.equals(event.getStatus());
    }
}
```

It streams through the table, finds the matching row, and returns the topic. No switch statements. No service-specific logic. Just a lookup.

The orchestrator listens on multiple topics. Each one triggers a different action:

```java
@KafkaListener(
    groupId = "${spring.kafka.consumer.group-id}",
    topics = "${spring.kafka.topic.start-saga}")
public void consumeStartSagaEvent(String payload) {
    var event = jsonUtil.toEvent(payload).orElseThrow();
    orchestrationService.startSaga(event);
}

@KafkaListener(
    groupId = "${spring.kafka.consumer.group-id}",
    topics = "${spring.kafka.topic.orchestrator}")
public void consumeOrchestratorEvent(String payload) {
    var event = jsonUtil.toEvent(payload).orElseThrow();
    switch (event.getStatus()) {
        case SUCCESS -> orchestrationService.continueSaga(event);
        case ROLLBACK -> orchestrationService.rollbackSaga(event);
        case FAIL -> orchestrationService.handleFail(event);
    }
}
```

Every service publishes back to the orchestrator topic. The orchestrator reads the status and decides the next action. `SUCCESS` means continue to the next step. `ROLLBACK` means the current service failed and needs its own compensation first.
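The lookup is easy to exercise in isolation. Here's a minimal, self-contained sketch (simplified local enums instead of the repo's types, no Spring) that reproduces the table and the stream lookup:

```java
import java.util.Arrays;

// Self-contained sketch of the table lookup: simplified enums stand in
// for the repo's EventSource/SagaStatus/TopicsEnum types.
public class SagaLookupDemo {
    enum Source { ORCHESTRATOR, PRODUCT_VALIDATION_SERVICE, PAYMENT_SERVICE, INVENTORY_SERVICE }
    enum Status { SUCCESS, ROLLBACK, FAIL }
    enum Topic {
        PRODUCT_VALIDATION_SUCCESS, PRODUCT_VALIDATION_FAIL,
        PAYMENT_SUCCESS, PAYMENT_FAIL,
        INVENTORY_SUCCESS, INVENTORY_FAIL,
        FINISH_SUCCESS, FINISH_FAIL
    }

    // Same rows as SAGA_HANDLER in the post, using the local enums.
    static final Object[][] SAGA_HANDLER = {
        { Source.ORCHESTRATOR, Status.SUCCESS, Topic.PRODUCT_VALIDATION_SUCCESS },
        { Source.ORCHESTRATOR, Status.FAIL, Topic.FINISH_FAIL },
        { Source.PRODUCT_VALIDATION_SERVICE, Status.ROLLBACK, Topic.PRODUCT_VALIDATION_FAIL },
        { Source.PRODUCT_VALIDATION_SERVICE, Status.FAIL, Topic.FINISH_FAIL },
        { Source.PRODUCT_VALIDATION_SERVICE, Status.SUCCESS, Topic.PAYMENT_SUCCESS },
        { Source.PAYMENT_SERVICE, Status.ROLLBACK, Topic.PAYMENT_FAIL },
        { Source.PAYMENT_SERVICE, Status.FAIL, Topic.PRODUCT_VALIDATION_FAIL },
        { Source.PAYMENT_SERVICE, Status.SUCCESS, Topic.INVENTORY_SUCCESS },
        { Source.INVENTORY_SERVICE, Status.ROLLBACK, Topic.INVENTORY_FAIL },
        { Source.INVENTORY_SERVICE, Status.FAIL, Topic.PAYMENT_FAIL },
        { Source.INVENTORY_SERVICE, Status.SUCCESS, Topic.FINISH_SUCCESS }
    };

    // Same shape as findTopicBySourceAndStatus: filter, map, findFirst.
    static Topic nextTopic(Source source, Status status) {
        return (Topic) Arrays.stream(SAGA_HANDLER)
            .filter(row -> row[0] == source && row[1] == status)
            .map(row -> row[2])
            .findFirst()
            .orElseThrow(() -> new IllegalStateException("Topic not found!"));
    }

    public static void main(String[] args) {
        System.out.println(nextTopic(Source.PAYMENT_SERVICE, Status.SUCCESS));  // INVENTORY_SUCCESS
        System.out.println(nextTopic(Source.INVENTORY_SERVICE, Status.FAIL));   // PAYMENT_FAIL
    }
}
```

Because the routing is pure data plus one pure function, tests like these need no Kafka broker at all.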
`FAIL` means the service already rolled back and now the previous service needs to compensate.

The OrchestrationService ties it all together:

```java
public void startSaga(Event event) {
    event.setSource(ORCHESTRATOR.toString());
    event.setStatus(SUCCESS);
    var topic = getTopic(event);
    addHistory(event, "Saga started!");
    sendToProducerWithTopic(event, topic);
}

public void continueSaga(Event event) {
    var topic = getTopic(event);
    sendToProducerWithTopic(event, topic);
}

public void finishSagaSuccess(Event event) {
    event.setSource(ORCHESTRATOR.toString());
    event.setStatus(SUCCESS);
    addHistory(event, "Saga finished successfully!");
    notifyFinishedSaga(event);
}

public void finishSagaFail(Event event) {
    event.setSource(ORCHESTRATOR.toString());
    event.setStatus(FAIL);
    addHistory(event, "Saga finished with errors!");
    notifyFinishedSaga(event);
}
```

`startSaga` looks up the first topic in the table (`product-validation-success`) and publishes. `continueSaga` does the same lookup based on whoever just completed their step. `finishSagaSuccess` and `finishSagaFail` publish to the `notify-ending` topic so the order-service can update the final status.

Every service has clear boundaries. Each one consumes from its own success/fail topics and produces back to the orchestrator topic:

| Service | Consumes | Produces |
| --- | --- | --- |
| order-service | notify-ending | start-saga |
| orchestrator | start-saga, orchestrator, finish-success, finish-fail | All service topics + notify-ending |
| product-validation | product-validation-success, product-validation-fail | orchestrator |
| payment-service | payment-success, payment-fail | orchestrator |
| inventory-service | inventory-success, inventory-fail | orchestrator |

The orchestrator is the only service that publishes to multiple topics. Every other service publishes to exactly one: orchestrator.
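Putting the table and the topology together, the happy path is just a chain of lookups: route on (source, SUCCESS), hand the topic to its consumer, repeat until a finish topic comes out. The sketch below simulates that chain; the `consumerOf` mapping is derived from the topic table above, and the enums are simplified stand-ins for the repo's types.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Self-contained simulation of the happy path: chained table lookups,
// assuming every service replies SUCCESS. No Kafka involved.
public class HappyPathDemo {
    enum Source { ORCHESTRATOR, PRODUCT_VALIDATION_SERVICE, PAYMENT_SERVICE, INVENTORY_SERVICE }
    enum Status { SUCCESS, ROLLBACK, FAIL }
    enum Topic {
        PRODUCT_VALIDATION_SUCCESS, PRODUCT_VALIDATION_FAIL,
        PAYMENT_SUCCESS, PAYMENT_FAIL,
        INVENTORY_SUCCESS, INVENTORY_FAIL,
        FINISH_SUCCESS, FINISH_FAIL
    }

    // The SUCCESS rows of SAGA_HANDLER (the only ones the happy path hits).
    static final Object[][] TABLE = {
        { Source.ORCHESTRATOR, Status.SUCCESS, Topic.PRODUCT_VALIDATION_SUCCESS },
        { Source.PRODUCT_VALIDATION_SERVICE, Status.SUCCESS, Topic.PAYMENT_SUCCESS },
        { Source.PAYMENT_SERVICE, Status.SUCCESS, Topic.INVENTORY_SUCCESS },
        { Source.INVENTORY_SERVICE, Status.SUCCESS, Topic.FINISH_SUCCESS }
    };

    static Topic nextTopic(Source source, Status status) {
        return (Topic) Arrays.stream(TABLE)
            .filter(row -> row[0] == source && row[1] == status)
            .map(row -> row[2])
            .findFirst()
            .orElseThrow();
    }

    // Which service consumes each success topic (from the topic table above).
    static Source consumerOf(Topic topic) {
        return switch (topic) {
            case PRODUCT_VALIDATION_SUCCESS -> Source.PRODUCT_VALIDATION_SERVICE;
            case PAYMENT_SUCCESS -> Source.PAYMENT_SERVICE;
            case INVENTORY_SUCCESS -> Source.INVENTORY_SERVICE;
            default -> null; // finish topics end the saga via notify-ending
        };
    }

    static List<Topic> happyPath() {
        var path = new ArrayList<Topic>();
        var source = Source.ORCHESTRATOR;
        while (source != null) {
            var topic = nextTopic(source, Status.SUCCESS);
            path.add(topic);
            source = consumerOf(topic);
        }
        return path;
    }

    public static void main(String[] args) {
        // [PRODUCT_VALIDATION_SUCCESS, PAYMENT_SUCCESS, INVENTORY_SUCCESS, FINISH_SUCCESS]
        System.out.println(happyPath());
    }
}
```

The loop terminates exactly when a finish topic appears, which mirrors how `finishSagaSuccess` ends the real saga.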
Every time the orchestrator processes an event, it appends a History entry:

```java
private void addHistory(Event event, String message) {
    var history = History.builder()
        .source(event.getSource())
        .status(event.getStatus().toString())
        .message(message)
        .createdAt(LocalDateTime.now())
        .build();
    event.addToHistory(history);
}
```

By the time a saga ends, the event carries a complete timeline: every service, every status change, every message. This is what makes debugging possible. You don't need to search through logs; the event itself tells you exactly what happened.

Notice that the orchestrator has no database. It doesn't persist saga state between messages. The entire state travels inside the Event object through Kafka.

This is a deliberate choice. The orchestrator can restart at any time without losing state. Kafka retains the messages, and the event carries all context. If the orchestrator crashes mid-saga, the unprocessed message is still on the topic and gets picked up after restart.

The downside: you can't query "what sagas are currently in progress" from the orchestrator. That's the order-service's job (it stores all events in MongoDB, and the notify-ending consumer updates the final status).

The state machine handles the happy path and knows which topics to publish on failure. But what actually happens inside each service when a rollback is triggered? In the next post, I'll walk through the compensation logic: how payment-service refunds a charge and how inventory-service restores stock.

The repo: github.com/pedrop3/saga-orchestration
