Changes
#11587 (Jun 30, 2026, 9:58:07 AM)
osmo-bts-trx: fix spurious shutdown on first CLCK.ind from osmo-trx
osmo-trx starts its frame counter from a random value rather than 0.
When the first CLCK.ind arrives, last_fn_timer and last_clk_ind are
still zero-initialised (set by trx_sched_clock_started()), so:
* compute_elapsed_fn(0, fn) wraps to a large negative value for any fn
  greater than hyperframe/2 (1357824), satisfying elapsed_fn < 0;
* compute_elapsed_us({0,0}, &tv_now) returns the full CLOCK_MONOTONIC
  uptime (potentially days), satisfying the error_us threshold.
Together these trip the stale-clock shutdown introduced in the previous
commit (0199c108), even though the transceiver is perfectly healthy:
DL1C NOTICE scheduler_trx.c:490 GSM clock started, waiting for clock indications
DL1C FATAL scheduler_trx.c:589 Stale CLCK.ind: fn=1456348 is 250957770198 us behind
DOML NOTICE bts_shutdown_fsm.c:268 BTS_SHUTDOWN(bts0){NONE}: Shutting down BTS, exit 1, reason: TRX clock skew too high
Fix by adding clk_ind_received to osmo_trx_clock_state. On the first
CLCK.ind after a (re)start, skip all elapsed-time checks and directly
bootstrap the scheduler from the reported FN. The stale-clock
detection remains fully active for every subsequent indication,
where last_clk_ind holds a real baseline.
Change-Id: I25e76e02d29fd8f88130d15d0adfe8d90a017924
Fixes: 0199c108 ("osmo-bts-trx: shut down on stale clock indication from transceiver")
Related: OS#7021
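The wrap-around and the first-indication bypass described in the commit above can be sketched as follows. This is a minimal illustration, not the osmo-bts code: struct clock_state, elapsed_fn_signed() and clck_ind_is_first() are hypothetical stand-ins, and the folding at exactly hyperframe/2 is an assumption taken from the commit text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define GSM_HYPERFRAME 2715648 /* 26 * 51 * 2048 TDMA frames */

/* Hypothetical stand-in for osmo_trx_clock_state, reduced to two fields. */
struct clock_state {
	bool clk_ind_received; /* false until the first CLCK.ind after a (re)start */
	uint32_t last_fn;      /* zero-initialised, so not a usable baseline at first */
};

/* Signed FN distance from 'last' to 'cur', folded around half a hyperframe. */
static int32_t elapsed_fn_signed(uint32_t last, uint32_t cur)
{
	int32_t d;

	assert(last < GSM_HYPERFRAME && cur < GSM_HYPERFRAME);
	d = (int32_t)((cur + GSM_HYPERFRAME - last) % GSM_HYPERFRAME);
	if (d > GSM_HYPERFRAME / 2)
		d -= GSM_HYPERFRAME;
	return d;
}

/* Returns true if the caller should bootstrap from 'fn' and skip all checks. */
static bool clck_ind_is_first(struct clock_state *st, uint32_t fn)
{
	if (st->clk_ind_received)
		return false;
	st->clk_ind_received = true;
	st->last_fn = fn;
	return true;
}
```

With the FN from the log excerpt (1456348) and a zero baseline, elapsed_fn_signed() yields a negative distance, which is why the first indication has to be special-cased rather than checked.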
osmo-bts-trx: fix spurious clock skew shutdown after self-compensation
When the BTS runs ahead of the transceiver (elapsed_fn < 0),
trx_sched_clock() reschedules the timerfd to deliberately delay the
next FN. osmo_timerfd_schedule() resets the timerfd and discards any
accumulated expirations, but last_fn_timer.tv was left pointing at
the previous callback. The next trx_fn_timer_cb() then measures
elapsed_us all the way back to that previous callback - spanning the
deliberate delay (or any OS stall that preceded it) - and falsely
trips the "PC clock skew too high" check, shutting the BTS down
for no good reason.
Advance last_fn_timer.tv to the projected firing time of the
rescheduled timer so that the next callback measures roughly
one FN interval, as expected.
Change-Id: Icdb7db8abe70258ae008d9514b6608bd74bb2881
AI-Assisted: yes (Claude)
Related: OS#6794
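The effect of advancing the timestamp to the projected firing time can be illustrated with plain timespec arithmetic. This is a sketch under assumptions: the helper names are hypothetical, the timestamp is assumed to be a struct timespec, and the 2000 us delay and 4615 us frame interval in the usage note are example values only.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Return 'ts' advanced by 'delay_us' microseconds, with tv_nsec normalised. */
static struct timespec timespec_add_us(struct timespec ts, int64_t delay_us)
{
	assert(delay_us >= 0);
	ts.tv_sec += (time_t)(delay_us / 1000000);
	ts.tv_nsec += (long)(delay_us % 1000000) * 1000L;
	if (ts.tv_nsec >= 1000000000L) {
		ts.tv_sec += 1;
		ts.tv_nsec -= 1000000000L;
	}
	return ts;
}

/* Microseconds elapsed from 'from' to 'to'. */
static int64_t timespec_diff_us(struct timespec from, struct timespec to)
{
	return (int64_t)(to.tv_sec - from.tv_sec) * 1000000
	     + (to.tv_nsec - from.tv_nsec) / 1000;
}
```

If the timer is rescheduled with a 2000 us delay and then fires one frame interval (about 4615 us) after the projected time, measuring from the projected time gives 4615 us, whereas measuring from the previous callback gives 6615 us, i.e. the deliberate delay is wrongly counted as skew.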
l1sap: fix duplicate RF RESOURCE INDICATION on clock bootstrap
The TTCN-3 test suite (ttcn3-bts-test) expects to receive exactly one
RF RESOURCE INDICATION message from each TRX during the bootstrap stage,
while waiting for all TRX to come up and be configured by the BSC.
l1sap_interf_meas_report() fires whenever bts->gsm_time.fn % period is
0, where period = intave * 104 (typically 624 frames). Since CLCK.ind
with FN=0 satisfies this condition, a report is sent at the very
beginning of each clock epoch.
This was not a problem before commit fcfc4e83, because the first
CLCK.ind from the transceiver was effectively a no-op: with
last_fn_timer.fn zero-initialised, the first indication at FN=0 yielded
elapsed_fn=0 (not > MAX_FN_SKEW), and the catch-up loop (while fn !=
last_fn_timer.fn) would not execute either. Downlink scheduling only
started on the second CLCK.ind (at FN=102, which is > MAX_FN_SKEW),
and 102 % 624 != 0, so no RF RESOURCE INDICATION was triggered.
fcfc4e83 changed the logic so that Downlink scheduling now begins
immediately on the first CLCK.ind, via an unconditional call to
trx_setup_clock() -> bts_sched_fn(fn). When fake_trx starts its frame
counter from FN=0, this immediately triggers l1sap_interf_meas_report()
because 0 % 624 == 0. A second report follows ~2.88s later when the
periodic timer reaches FN=624, making the bootstrap logic
in ttcn3-bts-test unhappy.
Fix by shifting the trigger to (fn + 1) % period == 0, i.e. the report
fires at the last frame of each period rather than the first. FN=0 now
yields (0+1) % 624 = 1 != 0, suppressing the spurious bootstrap report.
The periodic behaviour and report cadence are otherwise unchanged.
Change-Id: I6550178427b08e67c9763f0f37efff5b88960b1f
Related: fcfc4e83 ("osmo-bts-trx: fix spurious shutdown on first CLCK.ind from osmo-trx")
AI-Assisted: yes (Claude)
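The shifted trigger from the commit above can be checked with a small sketch. The two predicate names are hypothetical; only the period formula (intave * 104) and the two modulo conditions come from the commit text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Reporting period in TDMA frames for a given Intave Parameter. */
static uint32_t interf_period(uint8_t intave)
{
	assert(intave >= 1);
	return (uint32_t)intave * 104;
}

/* Old trigger: fires on the first frame of each period, including FN=0. */
static bool report_due_old(uint32_t fn, uint8_t intave)
{
	return fn % interf_period(intave) == 0;
}

/* New trigger: fires on the last frame of each period, so FN=0 is skipped. */
static bool report_due_new(uint32_t fn, uint8_t intave)
{
	return (fn + 1) % interf_period(intave) == 0;
}
```

For intave=6 (period 624), the old predicate is true at FN=0 and FN=624, while the new one is true at FN=623 and FN=1247. Both fire once per 624 frames, so the cadence stays the same.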
oml: validate Intave Parameter range in SET BTS ATTR
3GPP TS 52.021 §9.4.24 defines the valid range for the Intave Parameter
as 1..31, matching the fixed size of the per-lchan interference sample
buffer (interf_meas_dbm[31] in lchan.h). Previously any uint8_t value
was accepted without validation, meaning a buggy BSC could send
intave=0 (silently disabling interference reporting) or intave>31
(causing a buffer overflow in gsm_lchan_interf_meas_push()).
Let's guard against that by NACKing the SET BTS ATTR message with
cause=NM_NACK_PARAM_RANGE if the value is outside the valid range.
Change-Id: Id4d3353d4397aaa2517091b020d38ee15e084e2c
AI-Assisted: yes (Claude)
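A range check of the kind described in the commit above might look like the sketch below. The enum and function names are hypothetical; in the real code the rejection would be mapped to an OML NACK with cause NM_NACK_PARAM_RANGE, which is represented here by a local placeholder value.

```c
#include <assert.h>
#include <stdint.h>

#define INTAVE_MIN 1
#define INTAVE_MAX 31 /* matches the 31-entry per-lchan sample buffer */

/* Local placeholders, not the libosmocore NACK cause constants. */
enum intave_result {
	INTAVE_ACCEPT,
	INTAVE_REJECT_PARAM_RANGE,
};

/* Validate the Intave Parameter before it is stored or used as a buffer bound. */
static enum intave_result intave_check(uint8_t intave)
{
	if (intave < INTAVE_MIN || intave > INTAVE_MAX)
		return INTAVE_REJECT_PARAM_RANGE;
	return INTAVE_ACCEPT;
}
```

Rejecting at the attribute-parsing stage means the stored value can later be used as a loop or index bound without a second check.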
#11570 (Jun 22, 2026, 7:58:09 AM)
struct gsm_bts: drop unused ms_max_power
Change-Id: I0b02015db8b8e670eaff40c578f0474d9be9bb45
tests/meas: remove unused 'delta'
gcc 16.1.1 emits a -Wunused-but-set-variable warning.
Change-Id: I2540d701743caefb4bf54bb5b4ebe683d3257071
common: track whether gsm_time has been initialized
l1sap_info_time_ind() used 'bts->gsm_time.fn != 0' as a proxy for
"we have a previous frame number to diff against". This is unreliable:
FN=0 is a _valid_ frame number, recurring on every hyperframe wrap.
If gsm_time.fn happened to be 0 and the next time indication jumped
forward by more than one frame, the real gap was silently swallowed.
It also gave no clean way to suppress the bogus "Invalid condition
detected: Frame difference is ..." message that appears when the PHY
(re)starts its TDMA frame number (e.g. from 0) on bring-up.
Introduce an explicit 'bts->gsm_time_valid' flag instead:
* l1sap_info_time_ind() treats the first indication of an epoch as
  having no gap (frames_expired = 0): no warning, no RACH-slot
  accounting;
* the flag is cleared in st_op_disabled_notinstalled_on_enter(), so
  each BTS bring-up starts a fresh clock epoch regardless of which
  FN the PHY reports first.
Change-Id: I7022b0ad084a0c224f7e8c04aca0648915b1a1c6
AI-Assisted: yes (Claude)
Related: OS#7020
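The difference between the FN-based proxy and an explicit validity flag can be sketched as follows. struct bts_clock and frames_expired() are hypothetical simplifications; only the idea of a gsm_time_valid flag and the "first indication has no gap" rule come from the commit above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define GSM_HYPERFRAME 2715648 /* 26 * 51 * 2048 TDMA frames */

/* Hypothetical reduction of the BTS clock state to the two relevant fields. */
struct bts_clock {
	bool gsm_time_valid; /* cleared on every BTS bring-up */
	uint32_t fn;         /* last indicated frame number */
};

/* Frames elapsed since the previous indication; 0 for the first of an epoch. */
static uint32_t frames_expired(struct bts_clock *clk, uint32_t fn)
{
	uint32_t expired = 0;

	assert(fn < GSM_HYPERFRAME);
	if (clk->gsm_time_valid)
		expired = (fn + GSM_HYPERFRAME - clk->fn) % GSM_HYPERFRAME;
	clk->fn = fn;
	clk->gsm_time_valid = true;
	return expired;
}
```

With the flag, a previous FN of 0 is an ordinary baseline: a jump from FN=0 to FN=5 reports 5 expired frames instead of being swallowed, and the first indication after a restart reports 0 whatever FN the PHY starts from.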
osmo-bts-trx: shut down on stale clock indication from transceiver
We expect the transceiver to be a reliable, monotonic clock source.
If it reports an FN far behind our local timer (elapsed_fn < 0) while
far more wall-clock time elapsed than its FN advance accounts for,
its clock has likely stalled and the indication carries a stale frame
number. Acting on it drags the scheduler backwards and re-transmits
already-sent TDMA frames, corrupting lchan-internal state(s).
Detect this and shut down the process, following the same rationale as
the existing "PC clock skew too high" check in trx_fn_timer_cb().
Change-Id: If787ab7ed70aa2dcb0389ceb58620c2302c3431a
AI-Assisted: yes (Claude)
Related: OS#7020, OS#6794
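The stale-indication condition from the commit above combines a negative FN distance with a large wall-clock discrepancy. The sketch below is one possible reading of it, not the actual implementation: the function name, the error formula, the 4615 us frame duration constant and the threshold parameter are all assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FRAME_DURATION_US 4615 /* approximate TDMA frame duration */

/*
 * Assumed predicate: the indication is stale if the reported FN is behind
 * the local timer and the wall-clock time elapsed exceeds what the FN
 * advance accounts for by more than 'max_error_us'.
 */
static bool clck_ind_is_stale(int32_t elapsed_fn, int64_t elapsed_us,
			      int64_t max_error_us)
{
	int64_t error_us;

	assert(max_error_us >= 0);
	if (elapsed_fn >= 0)
		return false;
	error_us = elapsed_us - (int64_t)elapsed_fn * FRAME_DURATION_US;
	return error_us > max_error_us;
}
```

Requiring both conditions is the point of the design: a slightly negative elapsed_fn alone is the normal "BTS runs ahead" case handled by self-compensation, and only the combination with excess wall-clock time suggests a stalled transceiver clock.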