telega/telemetry
Telemetry events for production observability.
Telega emits telemetry events at every key point of the update lifecycle,
so standard BEAM exporters (PromEx, opentelemetry_telemetry, custom
handlers) work out of the box.
Emitting an event with no attached handlers is nearly free (a single ETS
lookup), so instrumentation costs almost nothing until you attach a handler.
Event reference
| Event | Measurements | Metadata |
|---|---|---|
| `telega.update.start` | `system_time` | `update_type`, `chat_id`, `from_id` |
| `telega.update.stop` | `duration` | `update_type`, `chat_id`, `from_id` |
| `telega.update.exception` | `duration` | `error`, `update_type`, `chat_id`, `from_id` |
| `telega.api_call.start` | `system_time` | `method` |
| `telega.api_call.stop` | `duration` | `method`, `status` |
| `telega.api_call.exception` | `duration` | `method`, `error` |
| `telega.api_call.retry` | `retry_after` (ms) | `method`, `attempt` |
| `telega.request_queue.depth` | `depth` | `rule_id`, `priority` |
| `telega.rate_limit.hit` | `count` | `update_type`, `chat_id`, `from_id` |
| `telega.chat_instance.spawn` | `count` | `chat_id`, `from_id` |
| `telega.chat_instance.terminate` | `count` | `key`, `reason` |
| `telega.flow.step` | `duration` | `flow_name`, `step` |
| `telega.flow.timeout` | `count` | `flow_name`, `step` |
| `telega.flow.cancel` | `count` | `flow_name`, `step` |
| `telega.shutdown.start` | `system_time` | (none) |
| `telega.shutdown.stop` | `duration`, `drained` | `timed_out` |
- Update and API call events follow the span convention (`start`/`stop`/`exception` with a monotonic `duration`), the same pattern used by Phoenix and Ecto.
- `duration` and `system_time` are in native time units; convert with `native_to_millisecond`.
- `telega.update.exception` fires when a handler returns `Error(...)`, before your `catch_handler` runs.
- `telega.chat_instance.terminate` fires only for abnormal stops; `reason` tells you which.
- `telega.rate_limit.hit` fires for every update rejected by `router.with_rate_limit`.
- `telega.shutdown.start`/`stop` wrap `telega.shutdown`'s graceful drain; `drained` is the number of in-flight updates waited for and `timed_out` is `True` if the drain timeout elapsed first.
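The table writes event names in dotted form, while the handler examples in this module pass each event to `attach_many` as a list of segments (`telega.update.stop` becomes `["telega", "update", "stop"]`). A minimal sketch of that mapping, assuming it is a plain split on `.`; `event_path` is a hypothetical helper, not part of telega:

```gleam
import gleam/string

/// Hypothetical helper (not part of telega): turns a dotted event name from
/// the table into the list-of-segments form used when attaching handlers.
pub fn event_path(dotted: String) -> List(String) {
  string.split(dotted, on: ".")
}
```

For example, `event_path("telega.api_call.retry")` yields `["telega", "api_call", "retry"]`, which can be placed directly in the `events` list.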
Attaching a handler
attach_many subscribes one handler (identified by a unique id) to a
list of events. The handler receives the event name, measurements, and
metadata as lists of pairs:
    import gleam/int
    import gleam/io
    import gleam/list
    import telega/telemetry

    pub fn attach_slow_update_logger() {
      telemetry.attach_many(
        id: "my-bot-slow-updates",
        events: [["telega", "update", "stop"]],
        handler: fn(_event, measurements, metadata) {
          // The event table lists `duration` (native time units) for `stop` events.
          let assert Ok(duration) = list.key_find(measurements, "duration")
          let ms = telemetry.native_to_millisecond(duration)
          // Only log updates that took longer than one second.
          case ms > 1000 {
            True -> {
              let update_type = case list.key_find(metadata, "update_type") {
                Ok(telemetry.StringValue(t)) -> t
                _ -> "unknown"
              }
              io.println(
                "slow update: " <> update_type <> " took " <> int.to_string(ms) <> "ms",
              )
            }
            False -> Nil
          }
        },
      )
    }
Call it once at startup, before telega.init_for_polling() /
telega.init(). Detach with telemetry.detach("my-bot-slow-updates").
Handlers run synchronously in the process that emitted the event. Keep them fast, never call the Telegram API from a handler, and offload anything heavy to another process (see below). A handler that crashes is automatically detached by telemetry.
Forwarding events to a process
To get events out of the hot path, forward them to a subject and consume them from your own process (an actor, a metrics aggregator, a test assertion):
    import gleam/erlang/process
    import telega/telemetry

    pub type Event {
      Event(
        name: List(String),
        measurements: List(#(String, Int)),
        metadata: List(#(String, telemetry.Value)),
      )
    }

    pub fn attach_forwarder(
      id id: String,
      events events: List(List(String)),
    ) -> process.Subject(Event) {
      // The subject is owned by the calling process, which is the one that receives.
      let subject = process.new_subject()
      telemetry.attach_many(id:, events:, handler: fn(name, measurements, metadata) {
        // Runs in the emitting process: only hand the event off.
        process.send(subject, Event(name:, measurements:, metadata:))
      })
      subject
    }
This is also the easiest way to assert on telemetry in tests:
    pub fn api_call_emits_stop_test() {
      let subject =
        attach_forwarder(id: "test-api-call", events: [
          ["telega", "api_call", "stop"],
        ])
      // ... call the bot / client ...
      let assert Ok(Event(name:, ..)) = process.receive(subject, 100)
      assert name == ["telega", "api_call", "stop"]
      telemetry.detach("test-api-call")
    }
Exporters
Because events follow the standard telemetry span convention, any BEAM exporter can consume them. Attach it to the event names from the table above:
- Prometheus: the `prometheus` Erlang package; register counters/histograms at startup and update them from an `attach_many` handler (convert durations with `native_to_millisecond`).
- Elixir releases: `telemetry_metrics` + `telemetry_metrics_prometheus`, or a custom PromEx plugin; declare metrics like `counter("telega.update.stop.duration")`.
- OpenTelemetry: `opentelemetry_telemetry` bridges the `start`/`stop`/`exception` spans into traces.
Types
Handler invoked for each event: event name, measurements, metadata.
    pub type EventHandler =
      fn(List(String), List(#(String, Int)), List(#(String, Value))) -> Nil
Values
    pub fn attach_many(
      id id: String,
      events events: List(List(String)),
      handler handler: fn(
        List(String),
        List(#(String, Int)),
        List(#(String, Value)),
      ) -> Nil,
    ) -> Nil
Attach a handler to several events. The id must be unique.
The handler runs synchronously in the process that emitted the event — keep it fast and never call the Telegram API from it. A handler that crashes is detached by telemetry.
    pub fn execute(
      event event: List(String),
      measurements measurements: List(#(String, Int)),
      metadata metadata: List(#(String, Value)),
    ) -> Nil
Emit a telemetry event.
    // `now` stands for a system time in native units, obtained by the caller.
    telemetry.execute(["telega", "update", "start"], [#("system_time", now)], [
      #("update_type", telemetry.StringValue("text")),
    ])
pub fn monotonic_time() -> Int
Current monotonic time in native units. Use for measuring durations.
pub fn native_to_millisecond(time time: Int) -> Int
Convert a native time unit value (e.g. a duration measurement) to milliseconds.
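The two functions above combine naturally to time a piece of work by hand. A minimal sketch, assuming only the signatures documented here; `timed` is a hypothetical helper name, not part of telega:

```gleam
import telega/telemetry

/// Hypothetical helper (not part of telega): runs `work` and returns its
/// result together with the elapsed time in milliseconds.
pub fn timed(work: fn() -> a) -> #(a, Int) {
  // Monotonic time is in native units and is meant for measuring durations.
  let started = telemetry.monotonic_time()
  let result = work()
  let elapsed = telemetry.monotonic_time() - started
  // Convert the native-unit difference to milliseconds for reporting.
  #(result, telemetry.native_to_millisecond(elapsed))
}
```

Subtracting first and converting afterwards keeps the measurement in native units until the last step, so short durations are not rounded away before the subtraction.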
    pub fn span(
      event event: List(String),
      metadata metadata: List(#(String, Value)),
      run run: fn() -> Result(a, e),
    ) -> Result(a, e)
Wrap a Result-returning function in a start/stop/exception span,
following the Phoenix/Ecto span convention:
- `event + [start]` with `system_time` before the function runs
- `event + [stop]` with monotonic `duration` on `Ok`
- `event + [exception]` with `duration` and inspected `error` metadata on `Error`
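As an illustration, `span` can wrap your own `Result`-returning code so it reports the same three events. A minimal sketch under stated assumptions: the event prefix `["my_bot", "db", "load_user"]`, the `table` metadata key, and the `load_user` function are hypothetical, and the sketch assumes `span` hands back the wrapped function's result unchanged, as its signature suggests:

```gleam
import gleam/int
import telega/telemetry

/// Hypothetical lookup used only to illustrate `span`.
pub fn load_user(id: Int) -> Result(String, String) {
  telemetry.span(
    // Illustrative prefix; the start/stop/exception segment is appended by `span`.
    event: ["my_bot", "db", "load_user"],
    metadata: [#("table", telemetry.StringValue("users"))],
    run: fn() {
      case id > 0 {
        // Ok leads to a stop event with a duration measurement.
        True -> Ok("user-" <> int.to_string(id))
        // Error leads to an exception event with the inspected error.
        False -> Error("invalid id")
      }
    },
  )
}
```

A handler attached to `["my_bot", "db", "load_user", "stop"]` would then see the duration of each successful lookup.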