Changes

Started by upstream project gerrit-osmo-bts #1261
Started 9 hr 5 min ago
Queued 5.8 sec
Took 4 min 33 sec on build5-deb12build-ansible
osmo-bts-trx: fix spurious shutdown on first CLCK.ind from osmo-trx

osmo-trx starts its frame counter from a random value rather than 0.
When the first CLCK.ind arrives, last_fn_timer and last_clk_ind are
still zero-initialised (set by trx_sched_clock_started()), so:

* compute_elapsed_fn(0, fn) wraps to a large negative value for any fn
  greater than hyperframe/2 (1357824), satisfying elapsed_fn < 0;
* compute_elapsed_us({0,0}, &tv_now) returns the full CLOCK_MONOTONIC
  uptime (potentially days), satisfying the error_us threshold.
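
The two effects above can be reproduced with a minimal sketch. This is not
the osmo-bts code: sketch_elapsed_fn() and sketch_elapsed_us() are
hypothetical stand-ins, and the wrap threshold of hyperframe/2 is assumed
from the description above. The sample values are taken from the log
excerpt further down.

```c
#include <stdint.h>
#include <time.h>

/* GSM TDMA hyperframe length in frames (2048 * 26 * 51) */
#define SKETCH_HYPERFRAME 2715648u

/* Hypothetical stand-in: signed FN distance from 'last' to 'now', modulo the
 * hyperframe. Distances of hyperframe/2 or more are treated as "now is behind
 * last" (assumed threshold). */
static int32_t sketch_elapsed_fn(uint32_t last, uint32_t now)
{
	int32_t elapsed = (int32_t)((now + SKETCH_HYPERFRAME - last) % SKETCH_HYPERFRAME);
	if (elapsed >= (int32_t)(SKETCH_HYPERFRAME / 2))
		elapsed -= (int32_t)SKETCH_HYPERFRAME;
	return elapsed;
}

/* Hypothetical stand-in: microseconds elapsed between two timestamps. */
static int64_t sketch_elapsed_us(const struct timespec *last, const struct timespec *now)
{
	return (int64_t)(now->tv_sec - last->tv_sec) * 1000000
	     + (int64_t)(now->tv_nsec - last->tv_nsec) / 1000;
}
```

With a zeroed baseline, fn=1456348 yields a negative FN delta, and a
monotonic clock reading of 250957.770198 s yields the full uptime as the
elapsed time, which is the value seen in the FATAL log line.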

Together these trip the stale-clock shutdown introduced in the previous
commit (0199c108), even though the transceiver is perfectly healthy:

DL1C NOTICE scheduler_trx.c:490 GSM clock started, waiting for clock indications
DL1C FATAL scheduler_trx.c:589 Stale CLCK.ind: fn=1456348 is 250957770198 us behind
DOML NOTICE bts_shutdown_fsm.c:268 BTS_SHUTDOWN(bts0){NONE}: Shutting down BTS, exit 1, reason: TRX clock skew too high

Fix by adding clk_ind_received to osmo_trx_clock_state.  On the first
CLCK.ind after a (re)start, skip all elapsed-time checks and directly
bootstrap the scheduler from the reported FN.  The stale-clock
detection remains fully active for every subsequent indication,
where last_clk_ind holds a real baseline.
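
The approach can be sketched roughly as follows. This is an illustration
only, not the patch itself: struct sketch_clock_state,
sketch_handle_clck_ind(), the 1 s error threshold and the exact form of the
stale check are all assumptions; only the clk_ind_received flag and the
"skip checks on first indication" behaviour come from the description above.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_HYPERFRAME   2715648u   /* GSM TDMA hyperframe, in frames */
#define SKETCH_FRAME_US     4615       /* approx. TDMA frame duration in us */
#define SKETCH_MAX_ERROR_US 1000000    /* assumed threshold, not from the patch */

enum sketch_result { SKETCH_BOOTSTRAP, SKETCH_OK, SKETCH_STALE };

/* Hypothetical, reduced clock state; zero-initialise on (re)start. */
struct sketch_clock_state {
	bool clk_ind_received;   /* false until the first CLCK.ind is seen */
	uint32_t last_fn;        /* FN of the previous indication */
	int64_t last_ind_us;     /* monotonic time of the previous indication */
};

static int32_t sketch_fn_delta(uint32_t last, uint32_t now)
{
	int32_t d = (int32_t)((now + SKETCH_HYPERFRAME - last) % SKETCH_HYPERFRAME);
	return d >= (int32_t)(SKETCH_HYPERFRAME / 2) ? d - (int32_t)SKETCH_HYPERFRAME : d;
}

static enum sketch_result sketch_handle_clck_ind(struct sketch_clock_state *st,
						 uint32_t fn, int64_t now_us)
{
	if (!st->clk_ind_received) {
		/* First indication: there is no baseline to compare against,
		 * so skip all elapsed-time checks and adopt the reported FN. */
		st->clk_ind_received = true;
		st->last_fn = fn;
		st->last_ind_us = now_us;
		return SKETCH_BOOTSTRAP;
	}

	int32_t elapsed_fn = sketch_fn_delta(st->last_fn, fn);
	int64_t elapsed_us = now_us - st->last_ind_us;
	int64_t error_us = elapsed_us - (int64_t)elapsed_fn * SKETCH_FRAME_US;

	/* Assumed stale check: FN went backwards and the timing error is large. */
	if (elapsed_fn < 0 && error_us > SKETCH_MAX_ERROR_US)
		return SKETCH_STALE;

	st->last_fn = fn;
	st->last_ind_us = now_us;
	return SKETCH_OK;
}
```

In this sketch the flag lives in the state structure, so clearing the
structure on a clock (re)start also re-arms the bootstrap path.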

Change-Id: I25e76e02d29fd8f88130d15d0adfe8d90a017924
Fixes: 0199c108 ("osmo-bts-trx: shut down on stale clock indication from transceiver")
Related: OS#7021
Vadim Yanitskiy at