Skip to content
Success

Changes

Summary

  1. Revert "Revert "trx_toolkit/transceiver.py: implement the transmit burst (details)
  2. trxcon: Advance Uplink TDMA Fn by default again (details)
  3. trx_toolkit/transceiver: Do not forward nor log from under tx_queue_lock (details)
  4. trx_toolkit/transceiver: Do not scan tx_queue twice on tx path (details)
  5. trx_toolkit/transceiver: Use with tx_queue_lock instead of manual (details)
  6. trx_toolkit/*: Represent bursts as arrays instead of lists (details)
  7. trx_toolkit/*: Try to avoid copying burst data where possible (details)
Commit 0f4714776a9c9b64c4a7268eb8a346f304835565 by Kirill Smelkov
Revert "Revert "trx_toolkit/transceiver.py: implement the transmit burst queue""

This reverts commit d4ed09df57b3461470af501e9687ddd80eb78838,
reinstating tx queue into fake_trx.

It is ok to do so because, as explained in abc63d8d (trx_toolkit/clck_gen.py:
Fix clock generator not to accumulate timing error), the
reason for GSM clock jitter problem was timing error accumulation in
CLCKgen, not problems with py threading.Event.

Note: this restores original tx queue implementation basically as-is
with only resolve minor conflicts during the revert. The original tx
queue implementation wastes CPU cycles though because it linearly scans
the whole tx queue at every TDMA frame. If that CPU usage becomes a real
problem it should be straightforward to fix by reworking tx queue to use
priority queue instead of unordered array via heapq module from standard
library. See https://docs.python.org/3/library/heapq.html for details.

The follow-up patches will make necessarry adjastments for tx-queue to
function properly.

Related: OS#4658, OS#6672
Change-Id: I41291708effdd2c767be680fff22ffbd9a56815e
The file was modifiedsrc/target/trx_toolkit/transceiver.py
The file was modifiedsrc/target/trx_toolkit/fake_trx.py
Commit c80e193f6d95367e764684a6021ede981f44ebbd by Kirill Smelkov
trxcon: Advance Uplink TDMA Fn by default again

This essentially reverts 923e9b0b (trxcon: do not advance Uplink TDMA Fn
by default; I838b1ebc54e4c5d116f8af2155d97215a6133ba4) for the following
reason:

In trxcon TRX clock is unused, because the signal from BTS is used as
the master clock source instead (see 45c821ae/Ic8a5b6277c6b16392026e0557376257d71c9d230
"trxcon: get rid of the timer driven clock module" for details".

Before restoration of tx-queue in fake_trx this was working ok even with
fn-advance=0 on Ms side, but after I41291708effdd2c767be680fff22ffbd9a56815e
(Revert "Revert "trx_toolkit/transceiver.py: implement the transmit
burst queue"") fake_trx is sending frames having Fn when exactly same Fn
happens corresponding on fake_trx clock. This results in BTS frames
(that are sent with fn-advance=2 by default (see
I7da3d0948f38e12342fb714b29f8edc5e9d0933d in osmo-bts.git and OS#4487)
to be queued, waited to be sent, and then actually sent to Ms on
fn=msg.fn . And then even if Ms replies immediately with that same fn,
that message will be dropped by fake_trx as stalled, because fake_trx
thinks that the message is too late since that fn already happened
according to fake_trx clock.

Here is a trace of how that looks like with 1 BTS and 1 MS(*):

      7.106.927 CLOCK   fn=80 # fake_trx running
      7.111.592 CLOCK   fn=81
      7.116.289 CLOCK   fn=82
      7.120.949 CLOCK   fn=83
      7.125.523 CLOCK   fn=84
      7.130.000 CLOCK   fn=85
      7.134.575 CLOCK   fn=86

      ...

      7.209.222 CLOCK   fn=102
      7.209.897 BTS -> fn=104   tn=0 # BTS starts to emit RF
      7.210.221 BTS -> fn=104   tn=1
      7.210.556 BTS -> fn=104   tn=2
      7.210.796 BTS -> fn=104   tn=3
      7.211.019 BTS -> fn=104   tn=4
      7.211.234 BTS -> fn=104   tn=5
      7.211.479 BTS -> fn=104   tn=6
      7.211.768 BTS -> fn=104   tn=7
      7.213.086 CLOCK   fn=103
      7.214.354 BTS -> fn=105   tn=0
      7.214.566 BTS -> fn=105   tn=1
      7.214.685 BTS -> fn=105   tn=2
      7.214.792 BTS -> fn=105   tn=3
      7.214.890 BTS -> fn=105   tn=4
      7.214.985 BTS -> fn=105   tn=5
      7.215.083 BTS -> fn=105   tn=6
      7.215.184 BTS -> fn=105   tn=7
      7.217.823 CLOCK   fn=104
      7.218.869 BTS -> fn=106   tn=0
      7.219.092 BTS -> fn=106   tn=1
      7.219.224 BTS -> fn=106   tn=2
      7.219.330 BTS -> fn=106   tn=3
      7.219.431 BTS -> fn=106   tn=4
      7.219.527 BTS -> fn=106   tn=5
      7.219.621 BTS -> fn=106   tn=6
      7.219.718 BTS -> fn=106   tn=7
      7.222.535 CLOCK   fn=105

      ...

      9.995.869 CLOCK   fn=706 # MS will soon connect.
      9.997.138 BTS -> fn=709   tn=0 # Note: BTS is sending fn=709 before CLOCK fn=707
      9.997.338 BTS -> fn=709   tn=1 #       so this messages become queued before CLOCK fn=709 happens
      9.997.444 BTS -> fn=709   tn=2
      9.997.535 BTS -> fn=709   tn=3
      9.997.620 BTS -> fn=709   tn=4
      9.997.708 BTS -> fn=709   tn=5
      9.997.789 BTS -> fn=709   tn=6
      9.997.874 BTS -> fn=709   tn=7
     10.000.583 CLOCK   fn=707
     10.001.735 BTS -> fn=710   tn=0
     10.001.932 BTS -> fn=710   tn=1
     10.002.041 BTS -> fn=710   tn=2
     10.002.134 BTS -> fn=710   tn=3
     10.002.220 BTS -> fn=710   tn=4
     10.002.373 BTS -> fn=710   tn=5
     10.002.459 BTS -> fn=710   tn=6
     10.002.718 BTS -> fn=710   tn=7

    [DEBUG] ctrl_if_trx.py:115 (MS) Recv POWEROFF cmd # MS starts to connect
    [INFO] ctrl_if_trx.py:117 (MS) Stopping transceiver...
    [DEBUG] ctrl_if_trx.py:229 (MS) Ignore CMD ECHO
     10.005.203 CLOCK   fn=708
    [DEBUG] ctrl_if_trx.py:229 (MS) Ignore CMD SETSLOT
     10.006.406 BTS -> fn=711   tn=0
    [DEBUG] ctrl_if_trx.py:124 (MS) Recv RXTUNE cmd
     10.006.999 BTS -> fn=711   tn=1
     10.007.153 BTS -> fn=711   tn=2
    [DEBUG] ctrl_if_trx.py:131 (MS) Recv TXTUNE cmd
     10.007.590 BTS -> fn=711   tn=3
     10.007.728 BTS -> fn=711   tn=4
    [DEBUG] ctrl_if_trx.py:97 (MS) Recv POWERON CMD # MS connected and activated RF
    [INFO] ctrl_if_trx.py:109 (MS) Starting transceiver...
     10.008.344 BTS -> fn=711   tn=5
     10.008.471 BTS -> fn=711   tn=6
     10.008.563 BTS -> fn=711   tn=7

     10.009.868 CLOCK   fn=709 # CLOCK fn=709 happens

     10.009.987 MS  <- fn=709   tn=0 # messages of BTS queued previously with that fn=709 are forwarded to Ms
     10.010.696 MS  <- fn=709   tn=1
     10.010.904 MS  -> fn=709   tn=0 # <-- MS sends UL message with that same fn=709 _before_ CLOCK fn=710
     10.011.397 BTS -> fn=712   tn=0
     10.011.507 MS  <- fn=709   tn=2
     10.011.770 MS  <- fn=709   tn=3
     10.011.968 MS  <- fn=709   tn=4
     10.012.156 MS  <- fn=709   tn=5
     10.012.342 MS  <- fn=709   tn=6
     10.012.527 MS  <- fn=709   tn=7
     10.012.914 BTS <- fn=709   tn=0
     10.013.166 BTS -> fn=712   tn=1
     10.013.524 MS  -> fn=709   tn=1 # <-- MS sends UL message with that same fn=709 _before_ CLOCK fn=710
     10.013.832 BTS -> fn=712   tn=2
     10.013.949 MS  -> fn=709   tn=2 # <-- MS sends UL message with that same fn=709 _before_ CLOCK fn=710
     10.014.081 BTS -> fn=712   tn=3
     10.014.177 MS  -> fn=709   tn=3 # <-- MS sends UL message with that same fn=709 _before_ CLOCK fn=710
     10.014.361 BTS -> fn=712   tn=4

     10.014.475 CLOCK   fn=710 # but most of those messages of MS with fn=709 are not picked up
     10.014.713 MS  -> fn=709   tn=4 # instantly and so become dropped as stale on CLOCK fn=710
     10.014.815 MS  <- fn=710   tn=0
     10.015.032 BTS -> fn=712   tn=5
     10.015.687 MS  <- fn=710   tn=1
     10.016.189 MS  <- fn=710   tn=2
     10.016.464 MS  <- fn=710   tn=3
     10.016.648 MS  <- fn=710   tn=4
     10.016.882 MS  <- fn=710   tn=5
     10.017.110 MS  <- fn=710   tn=6
     10.017.336 MS  <- fn=710   tn=7
    [WARNING] transceiver.py:321 (MS) Stale TRXD message (fn=710): fn=709 tn=1 pwr=0
    [WARNING] transceiver.py:321 (MS) Stale TRXD message (fn=710): fn=709 tn=2 pwr=0
    [WARNING] transceiver.py:321 (MS) Stale TRXD message (fn=710): fn=709 tn=3 pwr=0
    [WARNING] transceiver.py:321 (MS) Stale TRXD message (fn=710): fn=709 tn=4 pwr=0

So without adding some fn-advance it is practically not possible for Ms
to be on time with tx-queueing on TRX even if Ms sends its uplink frames
right immediately after receiving downlink ones.

This way Ms fn-advance has to be 1 at the minimum, so that immediate UL
replies can in principle arrive before fn+1 happens on fake_trx side,
even for tn=7. And it is also better to increase fn-advance once more by
another +1, to compensate for possible jitter due to OS scheduling
latencies and similar things. This way default fn-advance=2 on Ms side
becomes symmetric to default fn-advance on BTS side and Ms<->BTS
exchange starts to work ok even with tx-queueing activated on fake_trx.

In theory it should be possible to reduce those fn-advances to 1 on both
sides, but that will likely require to switch clock granularity from Fn
to Tn increasing precision by an order of magnitude, which will likely
also result in the need to make architectural change of moving trx to
work inside BTS and MS instead of being separate service processes.
That's a big task and I'm not delving into that here.

Note: Uplink Fn advance > 0 is needed for Ms when working with regular
TRX'es as well. The reason is exactly the same as explained above. In
923e9b0b the reason for setting fn-advance=0 by default was that trxcon
is usually being used with fake_trx, and that with fake_trx it is not
needed. But after reenabling tx-queueing we have to revisit even
fake_trx case again.

(*) the trace was captured with the help of the following debugging patch:

    --- b/src/target/trx_toolkit/burst_fwd.py
    +++ a/src/target/trx_toolkit/burst_fwd.py
    @@ -22,6 +22,18 @@

     from trx_list import TRXList

    +import sys, time
    +
    +# trace logs msg to stderr with also marking it with high-resolution timestamp.
    +t0 = time.time()
    +def trace(msg):
    +    t = time.time() - t0
    +    t_ms = int(t * 1e3) / 1e3
    +    us = int((t - t_ms) * 1e6)
    +    print("%7.3f.%03d %s" % (t_ms, us, msg), file=sys.stderr)
    +
     class BurstForwarder(TRXList):
     """ Performs burst forwarding between transceivers.

    @@ -63,6 +75,7 @@ def forward_msg(self, src_trx, rx_msg):
     if trx.get_rx_freq(rx_msg.fn) != tx_freq:
     continue

    + trace("%s\t<- fn=%d\ttn=%d" % (trx, rx_msg.fn, rx_msg.tn))
     # Transform from TxMsg to RxMsg and forward
     tx_msg = rx_msg.trans(ver = trx.data_if._hdr_ver)
     trx.handle_data_msg(src_trx, rx_msg, tx_msg)

    --- b/src/target/trx_toolkit/fake_trx.py
    +++ a/src/target/trx_toolkit/fake_trx.py
    @@ -29,7 +29,7 @@
     import re

     from app_common import ApplicationBase
    -from burst_fwd import BurstForwarder
    +from burst_fwd import BurstForwarder, trace
     from transceiver import Transceiver
     from data_msg import Modulation
     from clck_gen import CLCKGen
    @@ -473,6 +473,7 @@ def run(self):

     # This method will be called by the clock thread
     def clck_handler(self, fn):
    + trace("CLOCK\tfn=%d" % fn)
     # We assume that this list is immutable at run-time
     for trx in self.trx_list.trx_list:
     trx.clck_tick(self.burst_fwd, fn)

    --- b/src/target/trx_toolkit/transceiver.py
    +++ a/src/target/trx_toolkit/transceiver.py
    @@ -25,6 +25,7 @@
     from data_if import DATAInterface
     from udp_link import UDPLink
     from trx_list import TRXList
    +from burst_fwd import trace

     from gsm_shared import HoppingParams

    @@ -198,6 +199,7 @@ def __init__(self, bind_addr, remote_addr, base_port, **kwargs):
     self._tx_queue = []

     def __str__(self):
    + return self.name
     desc = "%s:%d" % (self.remote_addr, self.base_port)
     if self.child_idx > 0:
     desc += "/%d" % self.child_idx
    @@ -289,6 +291,7 @@ def recv_data_msg(self):
     return None

     # Enque the message, it will be sent later
    + trace("%s\t-> fn=%d\ttn=%d" % (self, msg.fn, msg.tn))
     self.tx_queue_append(msg)
     return msg

Change-Id: Icf0b4568b44eb75ee0733391d94b0af86f27ee2e
The file was modifiedsrc/host/trxcon/src/trxcon_main.c
Commit fc9044895d23393f0fb81843012b83221e6183b7 by Kirill Smelkov
trx_toolkit/transceiver: Do not forward nor log from under tx_queue_lock

Even though for 1 BTS + 1 MS fake_trx works ok with tx-queuing, when I
try to run two ccch_scan's with 1 BTS fake_trx starts occupy ~ 100% of
CPU and emits lots of "Stale ..." messages:

[WARNING] transceiver.py:317 (M2@127.0.0.1:7700) Stale TRXD message (fn=2793): fn=2791 tn=7 pwr=0
[WARNING] transceiver.py:317 (M2@127.0.0.1:7700) Stale TRXD message (fn=2793): fn=2792 tn=0 pwr=0
[WARNING] transceiver.py:317 (M2@127.0.0.1:7700) Stale TRXD message (fn=2793): fn=2792 tn=1 pwr=0
[WARNING] transceiver.py:317 (M2@127.0.0.1:7700) Stale TRXD message (fn=2793): fn=2792 tn=2 pwr=0
[WARNING] transceiver.py:317 (M2@127.0.0.1:7700) Stale TRXD message (fn=2793): fn=2792 tn=3 pwr=0
...

Inspecting a bit with a profiler showed that fake_trx simply cannot keep
up with the load.

Let's try to fix this with optimizing things a bit where it is easy to
notice and easy to pick up low-hanging fruits.

This is the first patch in that optimization series. It moves blocking
calls from out of under tx_queue_lock on transmit path. The reason for
this move is not to block receive path while the transmit path is busy
more than necessary. I originally noticed tx_queue_lock.acquire being
visible in profile of the rx thread which indicates that tx/rx
contention on this lock can really happen if we do non-negligible tasks
from under this lock. Here, in particular, it was forward_msg that was
preparing and actually sending RxMsg to destination. tx_queue_lock is
needed only to protect tx_queue itself and synchronize rx and tx threads
access to it. Once necessary items are appended or popped, we can do
everything else out of this lock.

-> Move everything on the tx codepath, not actually needing access to
tx_queue out of this lock:

- only collect messages to be sent under the lock; actually forward them
  after releasing the log;
- same for logging.

Change-Id: I7d10c972c45b2b5765e7c3a28f8646508b3c8a82
The file was modifiedsrc/target/trx_toolkit/transceiver.py
Commit c186b58998dc2340a5f32a90dd100a9cd2e50e47 by Kirill Smelkov
trx_toolkit/transceiver: Do not scan tx_queue twice on tx path

Noticed while moving forwarding out of tx_queue_lock in the previous patch.

Change-Id: I225c44c4cc327b6786efce96d1278c6ec68fbc25
The file was modifiedsrc/target/trx_toolkit/transceiver.py
Commit 25b61af78e7149bb79837e9d19c5ee2808ba00a1 by Kirill Smelkov
trx_toolkit/transceiver: Use with tx_queue_lock instead of manual acquire/release

- it is a bit faster
- it is a bit more robust as the lock becomes released in case
  some exception is raised before reaching release

Noticed while moving forwarding out of tx_queue_lock in
I7d10c972c45b2b5765e7c3a28f8646508b3c8a82.

Change-Id: I74b194120bcc518d44796b57e36368bdc8de4aab
The file was modifiedsrc/target/trx_toolkit/transceiver.py
Commit abfd60b3ee7b6763661f59fce76c1e45fb9c0012 by Kirill Smelkov
trx_toolkit/*: Represent bursts as arrays instead of lists

Continuing fake_trx profiling story I noticed that on rx path a
noticeable time is spent in converting from ubits to sbits via list
comprehensions. By changing burst representation from py list, which
stores each item as full python object, to an array, which stores each
item as just byte, and by leveraging bytearray.translate, we can speed
up that conversion by ~ 10x:

before:

    In [1]: from data_msg import Msg

    In [2]: burst = [0, 1] * (142//2)

    In [3]: burst
    Out[3]:
    [0,
     1,
     0,
     1,
     0,
     ...
     0,
     1]

    In [4]: Msg.ubit2sbit(burst)
    Out[4]:
    [127,
     -127,
     127,
     -127,
     ...
     127,
     -127]

    In [5]: %timeit Msg.ubit2sbit(burst)
    3.01 µs ± 43.3 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

after:

    In [2]: burst = bytearray([0, 1] * (142//2))

    In [3]: burst
    Out[3]: bytearray(b'\x00\x01\x00\x01...\x00\x01')

    In [4]: Msg.ubit2sbit(burst)
    Out[4]: array('b', [127, -127, 127, -127, ... 127, -127])

    In [5]: %timeit Msg.ubit2sbit(burst)
    325 ns ± 12.1 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

Change-Id: I7314e9e79752e06fa86b9e346a9beacc5e59579e
The file was modifiedsrc/target/trx_toolkit/gsm_shared.py
The file was modifiedsrc/target/trx_toolkit/data_msg.py
The file was modifiedsrc/target/trx_toolkit/rand_burst_gen.py
The file was modifiedsrc/target/trx_toolkit/test_data_msg.py
Commit 06456f118d6fcd6d60a9e50df1d8f07b5fde2c8b by Kirill Smelkov
trx_toolkit/*: Try to avoid copying burst data where possible

Conveying burst data is the primary flow in data place of what fake_trx
does, so the less copies we do, the less we make CPU loaded.

After this change I can finally run 1 BTS + 2 Mobile + 1 ccch_scan
without hitting "Stale message ..." on fake_trx side. However fake_trx
cpu load is close to 100% and there are internal clock overruns often:

    [WARNING] clck_gen.py:97 CLCKGen: time overrun by -1385us; resetting the clock
    [WARNING] clck_gen.py:97 CLCKGen: time overrun by -2657us; resetting the clock
    [WARNING] clck_gen.py:97 CLCKGen: time overrun by -1264us; resetting the clock
    [WARNING] clck_gen.py:97 CLCKGen: time overrun by -2913us; resetting the clock
    [WARNING] clck_gen.py:97 CLCKGen: time overrun by -1836us; resetting the clock
    ...

This suggests that even though fake_trx.py + tx-queue started to work
somehow, the rewrite of fake_trx in C, as explained in OS#6672, is still
better to do.

Change-Id: I147da2f110dedc863361059c931f609c28a69e9c
The file was modifiedsrc/target/trx_toolkit/data_msg.py
The file was modifiedsrc/target/trx_toolkit/data_if.py