Changes

Summary

Revert "Revert "trx_toolkit/transceiver.py: implement the transmit burst queue""
trxcon: Advance Uplink TDMA Fn by default again
trx_toolkit/transceiver: Do not forward nor log from under tx_queue_lock
trx_toolkit/transceiver: Do not scan tx_queue twice on tx path
trx_toolkit/transceiver: Use with tx_queue_lock instead of manual acquire/release
trx_toolkit/*: Represent bursts as arrays instead of lists
trx_toolkit/*: Try to avoid copying burst data where possible

Commit 0f4714776a9c9b64c4a7268eb8a346f304835565 by Kirill Smelkov
Revert "Revert "trx_toolkit/transceiver.py: implement the transmit burst queue""

This reverts commit d4ed09df57b3461470af501e9687ddd80eb78838, reinstating the tx queue in fake_trx.

It is ok to do so because, as explained in abc63d8d (trx_toolkit/clck_gen.py: Fix clock generator not to accumulate timing error), the GSM clock jitter problem was caused by timing error accumulation in CLCKgen, not by problems with py threading.Event.

Note: this restores the original tx queue implementation basically as-is, resolving only minor conflicts during the revert. The original tx queue implementation wastes CPU cycles though, because it linearly scans the whole tx queue at every TDMA frame. If that CPU usage becomes a real problem, it should be straightforward to fix by reworking the tx queue to use a priority queue instead of an unordered array, via the heapq module from the standard library. See https://docs.python.org/3/library/heapq.html for details.

The follow-up patches will make the necessary adjustments for tx-queue to function properly.

Related: OS#4658, OS#6672
Change-Id: I41291708effdd2c767be680fff22ffbd9a56815e
The file was modified	src/target/trx_toolkit/transceiver.py
The file was modified	src/target/trx_toolkit/fake_trx.py
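The priority-queue rework suggested in the commit message above could look roughly like the following minimal sketch. The class `TxQueue`, its methods, and the string messages are hypothetical and are not the trx_toolkit API; frame-number wraparound at the hyperframe boundary is ignored for brevity.

```python
import heapq
import itertools


class TxQueue:
    """Hypothetical tx queue ordered by TDMA frame number (not the trx_toolkit API)."""

    def __init__(self):
        self._heap = []
        # Tie-breaker: keeps insertion order for equal fn and ensures
        # that the message objects themselves are never compared.
        self._seq = itertools.count()

    def append(self, fn, msg):
        heapq.heappush(self._heap, (fn, next(self._seq), msg))

    def pop_due(self, fn):
        """Pop everything scheduled for fn or earlier.

        Returns (due, stale): messages whose frame number equals fn,
        and messages whose frame number has already passed.
        """
        due, stale = [], []
        while self._heap and self._heap[0][0] <= fn:
            msg_fn, _, msg = heapq.heappop(self._heap)
            (due if msg_fn == fn else stale).append(msg)
        return due, stale


q = TxQueue()
for fn, tn in [(106, 0), (104, 1), (105, 0), (104, 0)]:
    q.append(fn, "fn=%d tn=%d" % (fn, tn))

due, stale = q.pop_due(105)
```

With a heap, each clock tick only touches the messages that are actually due or stale, instead of scanning the whole queue.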
Commit c80e193f6d95367e764684a6021ede981f44ebbd by Kirill Smelkov
trxcon: Advance Uplink TDMA Fn by default again

This essentially reverts 923e9b0b (trxcon: do not advance Uplink TDMA Fn by default; I838b1ebc54e4c5d116f8af2155d97215a6133ba4) for the following reason:

In trxcon the TRX clock is unused, because the signal from the BTS is used as the master clock source instead (see 45c821ae/Ic8a5b6277c6b16392026e0557376257d71c9d230 "trxcon: get rid of the timer driven clock module" for details). Before the restoration of tx-queue in fake_trx this worked ok even with fn-advance=0 on the Ms side, but after I41291708effdd2c767be680fff22ffbd9a56815e (Revert "Revert "trx_toolkit/transceiver.py: implement the transmit burst queue"") fake_trx sends a frame with a given Fn exactly when that same Fn happens on the fake_trx clock. As a result, BTS frames (which are sent with fn-advance=2 by default, see I7da3d0948f38e12342fb714b29f8edc5e9d0933d in osmo-bts.git and OS#4487) are queued, wait to be sent, and are then actually sent to Ms at fn=msg.fn. And then even if Ms replies immediately with that same fn, the reply will be dropped by fake_trx as stale, because fake_trx considers the message too late: that fn has already happened according to the fake_trx clock.

Here is a trace of how that looks with 1 BTS and 1 MS (*):

    7.106.927 CLOCK fn=80                # fake_trx running
    7.111.592 CLOCK fn=81
    7.116.289 CLOCK fn=82
    7.120.949 CLOCK fn=83
    7.125.523 CLOCK fn=84
    7.130.000 CLOCK fn=85
    7.134.575 CLOCK fn=86
    ...
    7.209.222 CLOCK fn=102
    7.209.897 BTS -> fn=104 tn=0         # BTS starts to emit RF
    7.210.221 BTS -> fn=104 tn=1
    7.210.556 BTS -> fn=104 tn=2
    7.210.796 BTS -> fn=104 tn=3
    7.211.019 BTS -> fn=104 tn=4
    7.211.234 BTS -> fn=104 tn=5
    7.211.479 BTS -> fn=104 tn=6
    7.211.768 BTS -> fn=104 tn=7
    7.213.086 CLOCK fn=103
    7.214.354 BTS -> fn=105 tn=0
    7.214.566 BTS -> fn=105 tn=1
    7.214.685 BTS -> fn=105 tn=2
    7.214.792 BTS -> fn=105 tn=3
    7.214.890 BTS -> fn=105 tn=4
    7.214.985 BTS -> fn=105 tn=5
    7.215.083 BTS -> fn=105 tn=6
    7.215.184 BTS -> fn=105 tn=7
    7.217.823 CLOCK fn=104
    7.218.869 BTS -> fn=106 tn=0
    7.219.092 BTS -> fn=106 tn=1
    7.219.224 BTS -> fn=106 tn=2
    7.219.330 BTS -> fn=106 tn=3
    7.219.431 BTS -> fn=106 tn=4
    7.219.527 BTS -> fn=106 tn=5
    7.219.621 BTS -> fn=106 tn=6
    7.219.718 BTS -> fn=106 tn=7
    7.222.535 CLOCK fn=105
    ...
    9.995.869 CLOCK fn=706               # MS will soon connect.
    9.997.138 BTS -> fn=709 tn=0         # Note: BTS is sending fn=709 before CLOCK fn=707
    9.997.338 BTS -> fn=709 tn=1         # so these messages become queued before CLOCK fn=709 happens
    9.997.444 BTS -> fn=709 tn=2
    9.997.535 BTS -> fn=709 tn=3
    9.997.620 BTS -> fn=709 tn=4
    9.997.708 BTS -> fn=709 tn=5
    9.997.789 BTS -> fn=709 tn=6
    9.997.874 BTS -> fn=709 tn=7
    10.000.583 CLOCK fn=707
    10.001.735 BTS -> fn=710 tn=0
    10.001.932 BTS -> fn=710 tn=1
    10.002.041 BTS -> fn=710 tn=2
    10.002.134 BTS -> fn=710 tn=3
    10.002.220 BTS -> fn=710 tn=4
    10.002.373 BTS -> fn=710 tn=5
    10.002.459 BTS -> fn=710 tn=6
    10.002.718 BTS -> fn=710 tn=7
    [DEBUG] ctrl_if_trx.py:115 (MS) Recv POWEROFF cmd       # MS starts to connect
    [INFO] ctrl_if_trx.py:117 (MS) Stopping transceiver...
    [DEBUG] ctrl_if_trx.py:229 (MS) Ignore CMD ECHO
    10.005.203 CLOCK fn=708
    [DEBUG] ctrl_if_trx.py:229 (MS) Ignore CMD SETSLOT
    10.006.406 BTS -> fn=711 tn=0
    [DEBUG] ctrl_if_trx.py:124 (MS) Recv RXTUNE cmd
    10.006.999 BTS -> fn=711 tn=1
    10.007.153 BTS -> fn=711 tn=2
    [DEBUG] ctrl_if_trx.py:131 (MS) Recv TXTUNE cmd
    10.007.590 BTS -> fn=711 tn=3
    10.007.728 BTS -> fn=711 tn=4
    [DEBUG] ctrl_if_trx.py:97 (MS) Recv POWERON CMD         # MS connected and activated RF
    [INFO] ctrl_if_trx.py:109 (MS) Starting transceiver...
    10.008.344 BTS -> fn=711 tn=5
    10.008.471 BTS -> fn=711 tn=6
    10.008.563 BTS -> fn=711 tn=7
    10.009.868 CLOCK fn=709              # CLOCK fn=709 happens
    10.009.987 MS <- fn=709 tn=0         # messages of BTS queued previously with that fn=709 are forwarded to Ms
    10.010.696 MS <- fn=709 tn=1
    10.010.904 MS -> fn=709 tn=0         # <-- MS sends UL message with that same fn=709 _before_ CLOCK fn=710
    10.011.397 BTS -> fn=712 tn=0
    10.011.507 MS <- fn=709 tn=2
    10.011.770 MS <- fn=709 tn=3
    10.011.968 MS <- fn=709 tn=4
    10.012.156 MS <- fn=709 tn=5
    10.012.342 MS <- fn=709 tn=6
    10.012.527 MS <- fn=709 tn=7
    10.012.914 BTS <- fn=709 tn=0
    10.013.166 BTS -> fn=712 tn=1
    10.013.524 MS -> fn=709 tn=1         # <-- MS sends UL message with that same fn=709 _before_ CLOCK fn=710
    10.013.832 BTS -> fn=712 tn=2
    10.013.949 MS -> fn=709 tn=2         # <-- MS sends UL message with that same fn=709 _before_ CLOCK fn=710
    10.014.081 BTS -> fn=712 tn=3
    10.014.177 MS -> fn=709 tn=3         # <-- MS sends UL message with that same fn=709 _before_ CLOCK fn=710
    10.014.361 BTS -> fn=712 tn=4
    10.014.475 CLOCK fn=710              # but most of those messages of MS with fn=709 are not picked up
    10.014.713 MS -> fn=709 tn=4         # instantly and so become dropped as stale on CLOCK fn=710
    10.014.815 MS <- fn=710 tn=0
    10.015.032 BTS -> fn=712 tn=5
    10.015.687 MS <- fn=710 tn=1
    10.016.189 MS <- fn=710 tn=2
    10.016.464 MS <- fn=710 tn=3
    10.016.648 MS <- fn=710 tn=4
    10.016.882 MS <- fn=710 tn=5
    10.017.110 MS <- fn=710 tn=6
    10.017.336 MS <- fn=710 tn=7
    [WARNING] transceiver.py:321 (MS) Stale TRXD message (fn=710): fn=709 tn=1 pwr=0
    [WARNING] transceiver.py:321 (MS) Stale TRXD message (fn=710): fn=709 tn=2 pwr=0
    [WARNING] transceiver.py:321 (MS) Stale TRXD message (fn=710): fn=709 tn=3 pwr=0
    [WARNING] transceiver.py:321 (MS) Stale TRXD message (fn=710): fn=709 tn=4 pwr=0

So without some fn-advance it is practically impossible for Ms to be on time with tx-queueing on TRX, even if Ms sends its uplink frames immediately after receiving the downlink ones. Ms fn-advance therefore has to be at least 1, so that immediate UL replies can in principle arrive before fn+1 happens on the fake_trx side, even for tn=7. It is also better to increase fn-advance by another +1, to compensate for possible jitter due to OS scheduling latencies and similar effects. This way the default fn-advance=2 on the Ms side becomes symmetric to the default fn-advance on the BTS side, and the Ms<->BTS exchange starts to work ok even with tx-queueing activated on fake_trx.

In theory it should be possible to reduce those fn-advances to 1 on both sides, but that would likely require switching the clock granularity from Fn to Tn, increasing precision by an order of magnitude, which would likely also require the architectural change of moving trx to work inside BTS and MS instead of being separate service processes. That's a big task and I'm not delving into that here.

Note: Uplink Fn advance > 0 is needed for Ms when working with regular TRXes as well, for exactly the reason explained above. In 923e9b0b the reason for setting fn-advance=0 by default was that trxcon is usually used with fake_trx, where it was not needed. But after re-enabling tx-queueing we have to revisit even the fake_trx case.

(*) the trace was captured with the help of the following debugging patch:

--- b/src/target/trx_toolkit/burst_fwd.py
+++ a/src/target/trx_toolkit/burst_fwd.py
@@ -22,6 +22,18 @@

 from trx_list import TRXList

+import sys, time
+
+# trace logs msg to stderr with also marking it with high-resolution timestamp.
+t0 = time.time()
+def trace(msg):
+    t = time.time() - t0
+    t_ms = int(t * 1e3) / 1e3
+    us = int((t - t_ms) * 1e6)
+    print("%7.3f.%03d %s" % (t_ms, us, msg), file=sys.stderr)
+
 class BurstForwarder(TRXList):
     """ Performs burst forwarding between transceivers.

@@ -63,6 +75,7 @@ def forward_msg(self, src_trx, rx_msg):
             if trx.get_rx_freq(rx_msg.fn) != tx_freq:
                 continue

+            trace("%s\t<- fn=%d\ttn=%d" % (trx, rx_msg.fn, rx_msg.tn))
             # Transform from TxMsg to RxMsg and forward
             tx_msg = rx_msg.trans(ver = trx.data_if._hdr_ver)
             trx.handle_data_msg(src_trx, rx_msg, tx_msg)
--- b/src/target/trx_toolkit/fake_trx.py
+++ a/src/target/trx_toolkit/fake_trx.py
@@ -29,7 +29,7 @@
 import re

 from app_common import ApplicationBase
-from burst_fwd import BurstForwarder
+from burst_fwd import BurstForwarder, trace
 from transceiver import Transceiver
 from data_msg import Modulation
 from clck_gen import CLCKGen
@@ -473,6 +473,7 @@ def run(self):

     # This method will be called by the clock thread
     def clck_handler(self, fn):
+        trace("CLOCK\tfn=%d" % fn)
         # We assume that this list is immutable at run-time
         for trx in self.trx_list.trx_list:
             trx.clck_tick(self.burst_fwd, fn)
--- b/src/target/trx_toolkit/transceiver.py
+++ a/src/target/trx_toolkit/transceiver.py
@@ -25,6 +25,7 @@
 from data_if import DATAInterface
 from udp_link import UDPLink
 from trx_list import TRXList
+from burst_fwd import trace

 from gsm_shared import HoppingParams

@@ -198,6 +199,7 @@ def __init__(self, bind_addr, remote_addr, base_port, **kwargs):
         self._tx_queue = []

     def __str__(self):
+        return self.name
         desc = "%s:%d" % (self.remote_addr, self.base_port)
         if self.child_idx > 0:
             desc += "/%d" % self.child_idx
@@ -289,6 +291,7 @@ def recv_data_msg(self):
             return None

         # Enque the message, it will be sent later
+        trace("%s\t-> fn=%d\ttn=%d" % (self, msg.fn, msg.tn))
         self.tx_queue_append(msg)
         return msg

Change-Id: Icf0b4568b44eb75ee0733391d94b0af86f27ee2e
The file was modified	src/host/trxcon/src/trxcon_main.c
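The fn-advance argument in the commit message above can be checked with a little arithmetic. The sketch below is a simplified model, not trxcon or fake_trx code: the helper names are hypothetical, and it assumes that a queued burst is transmitted on the clock tick whose Fn equals the burst's Fn, so a burst is on time only if it arrives while the clock is still behind that Fn. Fn wraparound is ignored.

```python
def uplink_fn(dl_fn, fn_advance):
    """Fn that the MS stamps on its reply to downlink frame dl_fn (simplified)."""
    return dl_fn + fn_advance


def is_on_time(msg_fn, last_tick_fn):
    """Assumed rule: the burst must be queued before the tick with Fn == msg_fn."""
    return msg_fn > last_tick_fn


def survives(dl_fn, fn_advance, ticks_late):
    """Downlink frame dl_fn is forwarded to the MS on tick dl_fn.

    The reply reaches the TRX ticks_late further clock ticks afterwards
    (0 means it arrives before the next tick happens).
    """
    return is_on_time(uplink_fn(dl_fn, fn_advance), dl_fn + ticks_late)


for adv in (0, 1, 2):
    print("fn-advance=%d: immediate reply ok=%s, reply one tick late ok=%s"
          % (adv, survives(709, adv, 0), survives(709, adv, 1)))
```

Under this model fn-advance=0 always loses, fn-advance=1 works only when the reply beats the next tick, and fn-advance=2 tolerates one tick of jitter, which matches the reasoning given for the new default.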
Commit fc9044895d23393f0fb81843012b83221e6183b7 by Kirill Smelkov
trx_toolkit/transceiver: Do not forward nor log from under tx_queue_lock

Even though fake_trx works ok with tx-queuing for 1 BTS + 1 MS, when I try to run two ccch_scan instances with 1 BTS, fake_trx starts to occupy ~100% of CPU and emits lots of "Stale ..." messages:

    [WARNING] transceiver.py:317 (M2@127.0.0.1:7700) Stale TRXD message (fn=2793): fn=2791 tn=7 pwr=0
    [WARNING] transceiver.py:317 (M2@127.0.0.1:7700) Stale TRXD message (fn=2793): fn=2792 tn=0 pwr=0
    [WARNING] transceiver.py:317 (M2@127.0.0.1:7700) Stale TRXD message (fn=2793): fn=2792 tn=1 pwr=0
    [WARNING] transceiver.py:317 (M2@127.0.0.1:7700) Stale TRXD message (fn=2793): fn=2792 tn=2 pwr=0
    [WARNING] transceiver.py:317 (M2@127.0.0.1:7700) Stale TRXD message (fn=2793): fn=2792 tn=3 pwr=0
    ...

Inspecting a bit with a profiler showed that fake_trx simply cannot keep up with the load. Let's try to fix this by optimizing the places where problems are easy to notice, picking the low-hanging fruit first.

This is the first patch in that optimization series. It moves blocking calls out from under tx_queue_lock on the transmit path, so that the receive path is not blocked more than necessary while the transmit path is busy. I originally noticed tx_queue_lock.acquire being visible in the profile of the rx thread, which indicates that tx/rx contention on this lock can really happen if we do non-negligible work under it. Here, in particular, it was forward_msg that was preparing and actually sending RxMsg to the destination.

tx_queue_lock is needed only to protect tx_queue itself and to synchronize rx and tx thread access to it. Once the necessary items are appended or popped, we can do everything else outside of this lock.

-> Move everything on the tx codepath that does not actually need access to tx_queue out of this lock:

- only collect messages to be sent under the lock; actually forward them after releasing the lock;
- same for logging.

Change-Id: I7d10c972c45b2b5765e7c3a28f8646508b3c8a82
The file was modified	src/target/trx_toolkit/transceiver.py
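The "collect under the lock, forward and log after releasing it" pattern described above might be sketched as follows. This is an illustration only: the class `TxQueueDemo`, the dict-based messages, and the `forward`/`log` callbacks are hypothetical and do not reproduce the signatures in transceiver.py.

```python
import threading


class TxQueueDemo:
    """Hypothetical stand-in for a transceiver with a lock-protected tx queue."""

    def __init__(self):
        self._tx_queue = []
        self._tx_queue_lock = threading.Lock()

    def tx_queue_append(self, msg):
        # rx thread: the lock is held only for the append itself
        with self._tx_queue_lock:
            self._tx_queue.append(msg)

    def clck_tick(self, fn, forward, log):
        emit, drop, keep = [], [], []

        # Under the lock: only sort the queued messages into three bins.
        with self._tx_queue_lock:
            for msg in self._tx_queue:
                if msg["fn"] == fn:
                    emit.append(msg)
                elif msg["fn"] < fn:
                    drop.append(msg)
                else:
                    keep.append(msg)
            self._tx_queue = keep

        # Outside the lock: potentially slow logging and forwarding.
        for msg in drop:
            log("Stale message (fn=%d): fn=%d tn=%d" % (fn, msg["fn"], msg["tn"]))
        for msg in emit:
            forward(msg)


trx = TxQueueDemo()
for fn, tn in [(709, 0), (710, 0), (711, 0)]:
    trx.tx_queue_append({"fn": fn, "tn": tn})

forwarded, logged, lock_held = [], [], []

def forward(msg):
    lock_held.append(trx._tx_queue_lock.locked())
    forwarded.append(msg)

trx.clck_tick(710, forward, logged.append)
```

Because `forward` runs after the `with` block has exited, a slow send no longer delays `tx_queue_append` on the receive thread.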
Commit c186b58998dc2340a5f32a90dd100a9cd2e50e47 by Kirill Smelkov
trx_toolkit/transceiver: Do not scan tx_queue twice on tx path

Noticed while moving forwarding out of tx_queue_lock in the previous patch.

Change-Id: I225c44c4cc327b6786efce96d1278c6ec68fbc25
The file was modified	src/target/trx_toolkit/transceiver.py
Commit 25b61af78e7149bb79837e9d19c5ee2808ba00a1 by Kirill Smelkov
trx_toolkit/transceiver: Use with tx_queue_lock instead of manual acquire/release

- it is a bit faster
- it is a bit more robust, as the lock is released even if an exception is raised before reaching release

Noticed while moving forwarding out of tx_queue_lock in I7d10c972c45b2b5765e7c3a28f8646508b3c8a82.

Change-Id: I74b194120bcc518d44796b57e36368bdc8de4aab
The file was modified	src/target/trx_toolkit/transceiver.py
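The robustness point above is easy to demonstrate with the standard `threading.Lock`. The two helper functions below are hypothetical and exist only for this illustration.

```python
import threading

lock = threading.Lock()


def manual(fail):
    """Manual acquire/release: an exception in between leaves the lock held."""
    lock.acquire()
    if fail:
        raise RuntimeError("failed before release")
    lock.release()


def scoped(fail):
    """The with statement releases the lock on every exit path."""
    with lock:
        if fail:
            raise RuntimeError("failed inside with")


try:
    scoped(True)
except RuntimeError:
    pass
held_after_scoped = lock.locked()   # lock was released despite the exception

try:
    manual(True)
except RuntimeError:
    pass
held_after_manual = lock.locked()   # lock is still held
lock.release()                      # clean up so the lock is usable again
```

A `try`/`finally` around the manual version would be equally safe, but the `with` form expresses the same thing in less code.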
Commit abfd60b3ee7b6763661f59fce76c1e45fb9c0012 by Kirill Smelkov
trx_toolkit/*: Represent bursts as arrays instead of lists

Continuing the fake_trx profiling story, I noticed that on the rx path a noticeable amount of time is spent converting from ubits to sbits via list comprehensions. By changing the burst representation from a py list, which stores each item as a full python object, to an array, which stores each item as just a byte, and by leveraging bytearray.translate, we can speed up that conversion by ~10x:

before:

    In [1]: from data_msg import Msg
    In [2]: burst = [0, 1] * (142//2)
    In [3]: burst
    Out[3]: [0, 1, 0, 1, 0, ... 0, 1]
    In [4]: Msg.ubit2sbit(burst)
    Out[4]: [127, -127, 127, -127, ... 127, -127]
    In [5]: %timeit Msg.ubit2sbit(burst)
    3.01 µs ± 43.3 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

after:

    In [2]: burst = bytearray([0, 1] * (142//2))
    In [3]: burst
    Out[3]: bytearray(b'\x00\x01\x00\x01...\x00\x01')
    In [4]: Msg.ubit2sbit(burst)
    Out[4]: array('b', [127, -127, 127, -127, ... 127, -127])
    In [5]: %timeit Msg.ubit2sbit(burst)
    325 ns ± 12.1 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

Change-Id: I7314e9e79752e06fa86b9e346a9beacc5e59579e
The file was modified	src/target/trx_toolkit/gsm_shared.py
The file was modified	src/target/trx_toolkit/data_msg.py
The file was modified	src/target/trx_toolkit/rand_burst_gen.py
The file was modified	src/target/trx_toolkit/test_data_msg.py
Commit 06456f118d6fcd6d60a9e50df1d8f07b5fde2c8b by Kirill Smelkov
trx_toolkit/*: Try to avoid copying burst data where possible

Conveying burst data is the primary flow in the data plane of what fake_trx does, so the fewer copies we make, the less we load the CPU.

After this change I can finally run 1 BTS + 2 Mobile + 1 ccch_scan without hitting "Stale message ..." on the fake_trx side. However, fake_trx CPU load is close to 100% and internal clock overruns happen often:

    [WARNING] clck_gen.py:97 CLCKGen: time overrun by -1385us; resetting the clock
    [WARNING] clck_gen.py:97 CLCKGen: time overrun by -2657us; resetting the clock
    [WARNING] clck_gen.py:97 CLCKGen: time overrun by -1264us; resetting the clock
    [WARNING] clck_gen.py:97 CLCKGen: time overrun by -2913us; resetting the clock
    [WARNING] clck_gen.py:97 CLCKGen: time overrun by -1836us; resetting the clock
    ...

This suggests that even though fake_trx.py + tx-queue now works to some extent, rewriting fake_trx in C, as explained in OS#6672, is still the better option.

Change-Id: I147da2f110dedc863361059c931f609c28a69e9c
The file was modified	src/target/trx_toolkit/data_msg.py
The file was modified	src/target/trx_toolkit/data_if.py
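The commit message does not say which copies were removed. As a general illustration of the idea only, the sketch below contrasts slicing a received datagram, which copies, with taking `memoryview` slices, which share the underlying buffer. `HDR_LEN`, the payload size, and both helper functions are hypothetical and are not taken from data_msg.py or data_if.py.

```python
HDR_LEN = 8        # hypothetical header size, not the real TRXD header length
BURST_LEN = 148    # illustrative payload size, one byte per bit


def split_copy(datagram):
    """Slicing bytes or bytearray creates two new objects and copies the data."""
    return datagram[:HDR_LEN], datagram[HDR_LEN:]


def split_view(datagram):
    """memoryview slices reference the original buffer without copying."""
    view = memoryview(datagram)
    return view[:HDR_LEN], view[HDR_LEN:]


datagram = bytearray(HDR_LEN + BURST_LEN)

hdr_copy, burst_copy = split_copy(datagram)
hdr_view, burst_view = split_view(datagram)

# Writing to the original buffer is visible through the view, not the copy.
datagram[HDR_LEN] = 1
```

Views keep the original buffer alive and reflect later writes to it, so this approach only fits code paths where the buffer is not reused while a view is still in use.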