
Observability

What you will learn

  • How to use the health endpoint for liveness and readiness probes
  • How to scrape Prometheus metrics from a running worker
  • How to emit CloudEvents during workflow execution
  • How to control log verbosity

Health checks

Zigflow exposes an HTTP health endpoint while the worker is running:

GET http://localhost:3000/health

Returns 200 OK when the worker is healthy. Use this for liveness and readiness probes in any orchestration system.

Change the address with --health-listen-address (default 0.0.0.0:3000).
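For systems that have no built-in probe support, the same check can be scripted. The following is a minimal sketch, assuming the default address shown above; the function names `http_check` and `wait_until_healthy` are illustrative and not part of Zigflow.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_check(url: str = "http://localhost:3000/health", timeout: float = 2.0) -> bool:
    """Return True only if the endpoint answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status: treat as unhealthy.
        return False


def wait_until_healthy(check: Callable[[], bool], attempts: int = 3, delay: float = 0.0) -> bool:
    """Call `check` up to `attempts` times, sleeping `delay` seconds between tries."""
    for attempt in range(attempts):
        if check():
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

Calling `wait_until_healthy(http_check, attempts=3, delay=10)` roughly mirrors a probe with three retries at a ten-second interval.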

Kubernetes

The Helm chart configures liveness and readiness probes automatically. No additional configuration is needed for standard deployments.

Docker Compose

healthcheck:
  test: ["CMD-SHELL", "curl -f http://localhost:3000/health || exit 1"]
  interval: 10s
  timeout: 5s
  retries: 3

Prometheus metrics

Zigflow exposes Prometheus metrics at:

http://localhost:9090/metrics

Change the address with --metrics-listen-address (default 0.0.0.0:9090).

Use --metrics-prefix to add a prefix to all metric names.

CloudEvents metrics

Metric | Labels | Description
zigflow_events_emitted_total | client, type | Total events emitted per client and event type
zigflow_events_undelivered_total | client, type | Events that failed to deliver
zigflow_events_emit_duration_seconds | client, type | Emit duration histogram

These metrics are only populated when a CloudEvents configuration is active.
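As an illustration of how these counters might be consumed outside Prometheus, the sketch below parses the text exposition format and computes, per client, the ratio of undelivered to emitted events. It assumes the metric and label names listed above. The `prefix` argument is a hypothetical convenience: pass the literal string that appears in front of the metric names when --metrics-prefix is set, since this page does not say how the prefix is joined to the name. Whether undelivered events are also counted as emitted is not stated here either, so treat the ratio as a rough signal.

```python
import re
from collections import defaultdict

# name{label="v",...} value   (an optional trailing timestamp is ignored)
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_samples(text):
    """Yield (name, labels, value) for each sample line; skip comments and blanks."""
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = SAMPLE_RE.match(line)
        if not match:
            continue
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        yield name, labels, float(value)


def undelivered_ratio(text, prefix=""):
    """Return {client: undelivered / emitted}, summed over all event types."""
    emitted = defaultdict(float)
    undelivered = defaultdict(float)
    for name, labels, value in parse_samples(text):
        client = labels.get("client", "")
        if name == prefix + "zigflow_events_emitted_total":
            emitted[client] += value
        elif name == prefix + "zigflow_events_undelivered_total":
            undelivered[client] += value
    return {c: undelivered[c] / n for c, n in emitted.items() if n > 0}
```

The input is the body returned by the metrics endpoint, for example the output of `curl http://localhost:9090/metrics`.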

Scraping in Kubernetes

When using the Helm chart, add a Prometheus scrape annotation to the pod:

values.yaml
podAnnotations:
  prometheus.io/scrape: "true"
  prometheus.io/port: "9090"
  prometheus.io/path: "/metrics"

CloudEvents

Tip: For full configuration options, event structure, file output examples and debugging guidance, see Debugging workflows.

Zigflow can emit CloudEvents v1.0 at key points during workflow execution. This is the primary mechanism for real-time observability into running workflows.

Enable it by providing a configuration file:

zigflow run -f workflow.yaml --cloudevents-config ./cloudevents.yaml

Configuration file

cloudevents.yaml
clients:
  - name: file-logger
    protocol: file
    target: ./tmp/events

  - name: http-sink
    protocol: http
    target: "{{ .env.HTTP_EVENTS_URL }}"
    options:
      timeout: 5s
      method: POST

The target field supports Go template syntax with access to environment variables via .env.
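To make the substitution concrete, here is a rough Python emulation of what a `{{ .env.NAME }}` reference resolves to. It is not Go's template engine and handles only this one form. The helper name `render_target` is hypothetical, and raising on an unset variable is this sketch's own choice, since the page does not say how Zigflow treats that case.

```python
import os
import re

# Matches {{ .env.NAME }} with optional inner whitespace.
ENV_REF = re.compile(r"\{\{\s*\.env\.([A-Za-z_][A-Za-z0-9_]*)\s*\}\}")


def render_target(target, env=None):
    """Replace each {{ .env.NAME }} reference with the value of NAME."""
    env = os.environ if env is None else env

    def substitute(match):
        name = match.group(1)
        if name not in env:
            raise KeyError(f"environment variable {name} is not set")
        return env[name]

    return ENV_REF.sub(substitute, target)
```

With `HTTP_EVENTS_URL=http://sink:8080/events` exported (an example value), the http-sink target above would resolve to that URL, while a literal target such as `./tmp/events` passes through unchanged.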

Supported protocols

Protocol | Target format | Notes
file | Directory path | Events written as YAML, one file per workflow execution
http | HTTP URL | Events sent as POST requests
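For local testing of the http protocol, a small receiver can stand in for a real sink. The sketch below is an assumption-laden example: this page does not say whether events arrive in CloudEvents structured mode (the whole event as a JSON body) or binary mode (attributes in `ce-*` headers), so `parse_cloudevent` accepts both. The handler name `SinkHandler` and the 204 response are illustrative choices.

```python
import json
from http.server import BaseHTTPRequestHandler


def parse_cloudevent(headers, body):
    """Return the event as a dict of CloudEvents attributes, plus 'data' if present."""
    headers = {k.lower(): v for k, v in headers.items()}
    content_type = headers.get("content-type", "")
    if content_type.startswith("application/cloudevents+json"):
        # Structured mode: the entire event is the JSON body.
        return json.loads(body)
    # Binary mode: attributes travel as ce-* headers, the body is the data.
    event = {k[3:]: v for k, v in headers.items() if k.startswith("ce-")}
    if body:
        event["data"] = json.loads(body) if "json" in content_type else body
    return event


class SinkHandler(BaseHTTPRequestHandler):
    """Print the type and id of each received event, then acknowledge it."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        event = parse_cloudevent(dict(self.headers.items()), self.rfile.read(length))
        print(event.get("type"), event.get("id"))
        self.send_response(204)
        self.end_headers()
```

To run it, start `http.server.HTTPServer(("127.0.0.1", 8080), SinkHandler).serve_forever()` and point the client's target at `http://127.0.0.1:8080/` (the port is an arbitrary example).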

Event types

Event type | Emitted when
dev.zigflow.workflow.started | A workflow execution begins
dev.zigflow.workflow.completed | A workflow execution completes successfully
dev.zigflow.task.started | A task begins
dev.zigflow.task.retried | A task is retried after failure
dev.zigflow.task.cancelled | A task is cancelled
dev.zigflow.task.faulted | A task fails
dev.zigflow.task.completed | A task completes successfully
dev.zigflow.iteration.completed | A task iteration completes
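The event types above share a `dev.zigflow.<scope>.<phase>` shape, which a consumer can use for routing or counting. The following is a minimal sketch of that idea; the helper names are hypothetical, and it relies only on the type strings listed in the table.

```python
from collections import Counter

PREFIX = "dev.zigflow."


def split_event_type(event_type):
    """Split 'dev.zigflow.task.retried' into ('task', 'retried')."""
    if not event_type.startswith(PREFIX):
        raise ValueError(f"not a Zigflow event type: {event_type!r}")
    scope, _, phase = event_type[len(PREFIX):].partition(".")
    return scope, phase


def summarize(event_types):
    """Count events per (scope, phase), skipping types from other sources."""
    counts = Counter()
    for event_type in event_types:
        try:
            counts[split_event_type(event_type)] += 1
        except ValueError:
            continue
    return counts
```

A consumer could, for instance, alert when `counts[("task", "faulted")]` grows, or track retries through `counts[("task", "retried")]`.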

Important notes

  • CloudEvents are emitted on a best-effort basis.
  • A failed event delivery does not fail the workflow.
  • Event emission must remain within Temporal's determinism constraints.
  • For high-throughput workflows, set appropriate HTTP timeouts to avoid workflow delays.
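The best-effort behaviour described in these notes can be pictured with a small wrapper: delivery is attempted, failures are counted and logged, and nothing propagates to the caller. This is a conceptual sketch, not Zigflow's implementation; the class `BestEffortEmitter` and the `send` callable are hypothetical.

```python
import logging
from typing import Callable, Mapping

log = logging.getLogger("events")


class BestEffortEmitter:
    """Wrap a delivery function so that failures never reach the caller."""

    def __init__(self, send: Callable[[Mapping], None]):
        self._send = send
        self.emitted = 0
        self.undelivered = 0

    def emit(self, event: Mapping) -> bool:
        """Try to deliver `event`; return False instead of raising on failure."""
        self.emitted += 1
        try:
            self._send(event)
            return True
        except Exception as exc:
            # Count and log the failure, but let the calling workflow continue.
            self.undelivered += 1
            log.warning("event %s not delivered: %s", event.get("type"), exc)
            return False
```

Because the caller waits for `send` to return or fail, a short timeout inside `send` bounds how long a slow sink can hold up the work that follows.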

Logging

Control log verbosity with --log-level:

zigflow run -f workflow.yaml --log-level debug

Valid values: trace, debug, info, warn, error. The built-in default is info. If the LOG_LEVEL environment variable is set, its value takes precedence over that default.

Logs are structured JSON, written to stderr.
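Since each log line is a JSON object, captured output can be filtered with a few lines of code. The sketch below keeps records at or above a chosen severity, using the level order listed above. The field name `level` is an assumption, as this page does not document the log schema; pass a different `level_key` if the actual field differs.

```python
import json

# Ordered from most to least verbose, matching the valid --log-level values.
LEVELS = ["trace", "debug", "info", "warn", "error"]


def filter_logs(lines, min_level="warn", level_key="level"):
    """Yield parsed records whose level is at or above `min_level`."""
    threshold = LEVELS.index(min_level)
    for line in lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip blank or non-JSON lines
        if not isinstance(record, dict):
            continue
        level = str(record.get(level_key, "")).lower()
        if level in LEVELS and LEVELS.index(level) >= threshold:
            yield record
```

For example, after redirecting stderr with `zigflow run -f workflow.yaml 2> worker.log`, iterating over `filter_logs(open("worker.log"))` would yield only the warn and error records.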