In this case phybi->burst_len would be 444 + 2, while the size of the burst buffer in struct l1sched_burst_ind is limited to the length of an 8-PSK modulated burst (444).
Commit
9feb5057da41611033a5881409c4fef2628d98a9
by Vadim Yanitskiy
layer23: refactor the application API concept
With this set of changes we have a cleaner l23 app architecture:
* struct vty_app_info: all l23 applications must define this struct;
* struct vty_app_info: *cfg_supported() becomes a mask of L23_OPT_*;
* struct vty_app_info: explicitly set L23_OPT_* in all l23 apps;
* drop l23_app_info(), there can be only one vty_app_info per app;
It is no longer necessary to obtain the vty_app_info by calling a function and checking the returned value against NULL everywhere. This kind of information is static (not dynamically composed) and need not be encapsulated in functions.
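A minimal sketch of the mask-based approach described above. The flag values, the structure layout and the helper name are assumptions made for illustration; only the L23_OPT_* prefix comes from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical option flags; the real L23_OPT_* values may differ. */
enum {
	L23_OPT_ARFCN = 1 << 0,
	L23_OPT_TAP   = 1 << 1,
	L23_OPT_VTY   = 1 << 2,
};

/* Simplified stand-in for the per-application info structure. */
struct demo_app_info {
	const char *name;
	unsigned int opt_supported; /* mask of L23_OPT_* */
};

/* Exactly one statically allocated instance per application. */
static const struct demo_app_info demo_info = {
	.name = "demo",
	.opt_supported = L23_OPT_ARFCN | L23_OPT_VTY,
};

/* Check a capability without any NULL-returning getter function. */
static bool demo_app_supports(const struct demo_app_info *info, unsigned int opt)
{
	assert(info != NULL);
	return (info->opt_supported & opt) != 0;
}
```

Because the structure is a plain static object, callers can reference it directly instead of checking a getter's return value.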
Commit
67943df4b7e59ef4982b586271566099c2f14d4a
by Vadim Yanitskiy
layer23: fix parsing of command line options
After the recent refactoring, parsing of the command line options is broken for some arguments. Specifically, the value of '-a'/'--arfcn' is ignored and hard-coded ARFCN=871 is used instead.
The problem is that l23_app_init(), which allocates an MS state and sets the initial ARFCN, is called *before* handle_options(). So the cfg_test_arfcn is used before it gets overwritten from the argv[].
The usual approach in osmo-* apps is to parse the command line arguments first, and only then execute code which depends on configurable parameters. Let's follow this approach too.
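The ordering issue can be sketched as follows. All names here (demo_cfg, demo_ms, demo_handle_options(), demo_app_init()) are hypothetical stand-ins, and the option parser is deliberately reduced to a single argument.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical configuration and state; names are illustrative only. */
struct demo_cfg {
	uint16_t test_arfcn;
};

struct demo_ms {
	uint16_t arfcn;
};

/* Step 1: overwrite the defaults from the argument vector. */
static void demo_handle_options(struct demo_cfg *cfg, int count, char **args)
{
	for (int i = 1; i + 1 < count; i++) {
		if (!strcmp(args[i], "-a") || !strcmp(args[i], "--arfcn"))
			cfg->test_arfcn = (uint16_t)atoi(args[++i]);
	}
}

/* Step 2: only now create state that depends on the configuration. */
static void demo_app_init(struct demo_ms *ms, const struct demo_cfg *cfg)
{
	assert(ms != NULL && cfg != NULL);
	ms->arfcn = cfg->test_arfcn;
}
```

Calling demo_app_init() before demo_handle_options() would copy the default ARFCN into the MS state, which is the behaviour the commit describes as broken.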
Commit
0b51656c06c58f53d7b27ad02de2831858cf31a7
by Vadim Yanitskiy
virt_phy: rearrange and clean up header files
* Build up the usual include directory hierarchy.
* Move l1ctl_proto.h to 'include/osmocom/bb/'.
* System headers first, then libosmo*, then the local ones.
Commit
b2dfbb88591395135a494c048cac5c93865720de
by Vadim Yanitskiy
layer23/{mobile,modem}: fix segfault on VTY connection
It was a mistake to call vty_init(), passing it a pointer to the vty_app_info structure allocated on the stack, because it gets overwritten when the calling function _vty_init() returns.
Change-Id: I75843a964254243c70bedcf8ff97d854107ee21a
Fixes: 9feb5057 "layer23: refactor the application API concept"
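The lifetime problem can be illustrated with a small sketch. The structure and both functions are hypothetical; the only assumption carried over from the commit message is that the VTY code keeps the pointer it is given rather than copying the structure.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for the structure handed to the VTY code. */
struct demo_vty_info {
	const char *name;
	const char *version;
};

/* The (hypothetical) VTY layer keeps the pointer, not a copy. */
static const struct demo_vty_info *demo_registered;

static void demo_vty_register(const struct demo_vty_info *info)
{
	assert(info != NULL);
	demo_registered = info;
}

/* Static storage duration: the object stays valid after this function
 * returns. A non-static local would leave demo_registered dangling. */
static void demo_app_vty_init(void)
{
	static const struct demo_vty_info info = {
		.name = "demo",
		.version = "0.0.0",
	};

	demo_vty_register(&info);
}
```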
Commit
e8a3ad221648623dcb6f394cee407bc7cd2a3980
by Pau Espin Pedrol
layer23: modem: Test GMM layer through VTY
Recent work on libosmo-gprs-gmm already allows triggering GPRS Attach procedure. Let's add some code to use it so we can already test the entire stack GMM->LLC->RLCMAC (SM layer still missing).
Commit
2a4fb973415568b703cdda18bec073a5b8727594
by Vadim Yanitskiy
trxcon: add GSMTAP logging target if '-g' is given
Unlike the other more mature Osmocom projects, trxcon does not have its own VTY interface and thus does not support the config file parsing, so currently it's impossible to configure additional logging targets.
There is a command line option '-g', which enables GSMTAP Um logging. Let's also add a GSMTAP logging target if it's given. This is a quick hack, but good enough for occasional debugging.
Commit
a80957b617eca785e7ab3791952e9e6a6cbb1716
by Vadim Yanitskiy
trxcon: l1sched_prim_dequeue(): check TDMA Fn in PDCH prims
We shall never be transmitting Uplink PDCH blocks if the current TDMA Fn does not match the requested TDMA Fn, because Tx timing is critical for PDCH timeslots. Drop and log an error message.
Commit
ff9db9def78d9c2439c8ff3196746bf6df987886
by Vadim Yanitskiy
trxcon/l1sched: rework the primitive API
The goal is to simplify primitive management, and allow passing data between different components without having to re-allocate memory and copy it over several times. This patch has been tested by running ttcn3-bts-test, no regressions observed.
* Use msgb and prim API from libosmocore,
* Move l1sched_prim definitions to its own header file,
* Move Tx queue from per-timeslot to per-lchan state,
* Route prims via l1sched_prim_{to,from}_user() functions,
* Remove GSMTAP stuff from sched_lchan_desc[].
Commit
651426fee4e847d0d22d5b4845c3ead139a0b574
by Pau Espin Pedrol
layer23: Decouple SIM events from MMR events
Let the specific app handle the events generated from the subscriber/SIM. All the MMR specific code can for now stay in mobile/ while SIM support can be in common/ without violating layers (common/ calling functions in mobile/).
Commit
0857d47885c7b396fda2672809116cd0c90179cc
by Vadim Yanitskiy
virt_phy: fix bogus TDMA Fn check in l1ctl_rx_gprs_ul_block_req()
sched_fn_ul() does not support RSL_CHAN_OSMO_PDCH, so it would always return the current time, which in most cases is not the correct time for scheduling a block. Actually, we don't really need this function because the Tx Fn is provided to us by the upper layers - just use it.
Commit
e8fc1e922859cd300257914604c8659d8e8ae648
by Pau Espin Pedrol
layer23: modem: Unregister registered callbacks upon app exit
It's just a good practice to delete all resources allocated during startup. The main aim here is to keep resemblance to what the mobile app is doing, so that they can slowly be merged and some functionalities from the mobile app can be added to the modem app, like shutting down the MS without killing the process eventually.
Commit
7d45f4d4eea6f73e92ce8e484ca884df943d5ed5
by Pau Espin Pedrol
layer23: Use OSMO_IMSI_BUF_SIZE from libosmocore
Note: GSM_IMSI_LENGTH was 16 octets, and OSMO_IMSI_BUF_SIZE is 17 octets. Probably a bug in old osmocom-bb code since that code predates the one in libosmocore.
Commit
5dccc1fbd870dbf259efacb746d6a901e439be9d
by Vadim Yanitskiy
trxcon: use non-blocking stderr logging by default
The logging in trxcon is initialized by calling osmo_init_logging2(), which creates an stderr target in *blocking* mode. Blocking write()s may cause random burst scheduling delays (due to the whole process being stuck). This is not desired and becomes even more critical when operating in the PS domain, which imposes strict timing requirements.
trxcon does not have its own VTY interface yet, so there's currently no easy way to switch to non-blocking mode like in other osmo-apps. Let's enable it by default in trxcon_logging_init().
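For illustration, this is how a file descriptor is switched to non-blocking mode with plain POSIX calls. It is a generic sketch, not the mechanism libosmocore uses internally, and the helper name is made up.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Put a file descriptor into non-blocking mode.
 * Returns 0 on success, -1 on error (errno is set by fcntl()). */
static int demo_fd_set_nonblocking(int fd)
{
	int flags;

	assert(fd >= 0);
	flags = fcntl(fd, F_GETFL);
	if (flags < 0)
		return -1;
	return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}
```

Applied to STDERR_FILENO, a write() that cannot complete immediately fails with EAGAIN instead of stalling the process, so log output may be lost under load. That is the trade-off accepted here in favour of timing.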
Commit
c16126317d4ec6c7d499f247d8ea836b902cedfb
by Pau Espin Pedrol
layer23: modem: grr: Log ignored CCCH ImmAss
There seems to be some bug when using virtphy where sometimes the received T2 and/or T3 in the ImmAss does not match what we sent. This helps in showing the problem and not failing silently.
Commit
7ce8cdd32543312f496824bf82ebbaf565a01b6f
by Vadim Yanitskiy
trxcon/l1sched: allocate primitives of fixed size (64 + 64)
When running trxcon with GSMTAP Um logging enabled (-g cmd line arg), in handle_prim_rach_cnf() we msgb_put() one or two bytes to the given msgb. This causes a segfault, because the L1SCHED_PRIM_T_RACH prims have 0 tailroom bytes available.
While we could allocate L1SCHED_PRIM_T_RACH with a few extra bytes, a more fundamental approach is to allocate all l1sched primitives with a fixed tailroom.
Commit
1ad195e28f46c0d132406973f123b9b4e9271062
by Pau Espin Pedrol
layer23: rework store & pass of test_sim param to gsm_subscr_testcard() API
This way the gsm_subscr_testcard() API looks similar to that of other backends (sim, sap). Furthermore, the callers of the API don't need to pass tons of params. This is important since in the future there will be more params (eg. gprs related ones), so it makes no sense to keep increasing the param list in there.
Commit
7b53ad536c6c4fd8cbea4ae0f6f1a5716b72108b
by Pau Espin Pedrol
layer23: Generalize subscriber SIM insert API
With this patch, during VTY config the SIM type is selected, and the app calls a generic gsm_subscriber_insert() API which will take care of internally initializing and starting whatever backend-specific setup is needed.
Commit
2ee1e23d937eeba86d8288797ebe97570e32669c
by Pau Espin Pedrol
layer23: subscriber: Move generic APIs to the top section
This way we end up with the generic section on top, followed by each backend section clearly delimited. As a result, the separation between the generic code and each backend-specific implementation is now much clearer.
Commit
3348f491792788974c6bb3ee75f3a4f1d159aef9
by Pau Espin Pedrol
Migrate network identifier fields to modern osmocom structures
This allows using a well-tested, standardized API to print, compare, etc. the usual identifiers like PLMN, LAI, etc. It also simplifies the code by avoiding passing lots of parameters and making it easier to identify which fields belong together. This is especially important since more of those identifiers will be added for GPRS in the future.
Commit
ad8f7794c9b7c5c03f34e1d6a273e8b5f7c9da30
by Vadim Yanitskiy
trxcon/l1sched: remove redundant TCH/[FH] prim length checks
Both gsm0503_tch_[fh]r_encode() do check the given payload length in order to determine the payload and/or codec type. The same applies to gsm0503_tch_a[fh]s_encode(). There is no real need to implement additional length checks on top of that - drop them.
Commit
d400126d0fe60783c10d99c96fcf42ddf3a8ee5f
by Pau Espin Pedrol
layer23: modem: Forward Paging Request Type 1/2 to rlcmac layer
The RLCMAC layer in libosmo-gprs-rlcmac will decode the messages and if matching the MS, forward it to GMM, who will see if it requires initiating a packet access procedure.
Commit
0ee32177a28e7191bb1ddba9c4115352b1d366d7
by Vadim Yanitskiy
trxcon/l1sched: rework burst buffer shifting for TCH/[FH]
This is how the buffer shifting is implemented in osmo-bts-trx. Keep trxcon's l1sched implementation as close to osmo-bts-trx as possible in order to simplify the integration of CSD support.
Commit
a49696bc981ffe5acda576f6a076572255c38b54
by Vadim Yanitskiy
trxcon/l1sched: do not check TDMA Fn of PTCCH/U prims
The PTCCH/U primitives are basically Access Bursts. The TDMA Fn in such primitives is always 0, because there's currently no way to indicate TDMA Fn in L1CTL_RACH_REQ (only the offset).
Commit
f5959f78cd19d20c2fd13607a19b1b1d6b085835
by laforge
fake_trx.py: remove SETSLOT based burst filtering
For the sake of simplicity and due to some performance limitations, fake_trx.py does not generate TRXD NOPE indications for osmo-bts-trx on its own. It's actually trxcon sending NOPE.req (empty Tx PDUs) when it has nothing to send, and fake_trx.py simply converting them.
In a follow-up change [1] we remove trxcon's internal clock module, making the Uplink burst scheduling being driven by Downlink bursts with the respective TDMA Fn/Tn values. Given that fake_trx.py is currently dropping bursts received for inactive timeslots, we would get NOPE.req only for a single timeslot, the one being currently active. This would break several testcases in ttcn3-bts-test.
Remove SETSLOT based burst filtering, so that trxcon would still be able to generate NOPE.req for all, active and inactive timeslots. Downlink bursts for inactive timeslots are discarded anyway.
Commit
45c821aee08e5f91273b0e203a1a04cff60114c8
by laforge
trxcon: get rid of the timer driven clock module
trxcon was heavily inspired by osmo-bts-trx and, along with many other scheduling related parts, also inherited the timer driven clock module.
This clock module is driving the Uplink burst scheduling, just like it does drive the Downlink burst scheduling in osmo-bts-trx. Just like in osmo-bts-trx, the clock module relies on periodic CLCK indications from the PHY, which are needed to compensate for the clock drifting.
The key difference is that trxcon is using Downlink bursts as the CLCK indications, see 'bi.fn % 51' in trx_data_rx_cb(). This is possible because the MS is a clock slave of the BTS: the MS PHY needs to sync its freq. and clock first, and only after that it can Rx and Tx.
So far we've had no problems with the clock module in trxcon until we started adding GPRS support and integrated the l1gprs. While the CS domain is quite flexible in terms of timings and delays, the PS domain is a lot more sensitive to timing issues.
Sometimes it happens that the trxcon's clock module is ticking quicker than it should, resulting in Uplink PDCH blocks being scheduled earlier than the respective Downlink PDCH blocks are received:
20230502021957724 l1sched_pull_burst(): PDTCH/U Tx time (fn=56103)
20230502021957744 (PDCH-7) Rx DL BLOCK.ind (fn=56103, len=23): ...
20230502021957747 l1sched_pull_burst(): PDTCH/U Tx time (fn=56108)
20230502021957765 l1sched_pull_burst(): PDTCH/U Tx time (fn=56112)
20230502021957767 (PDCH-7) Rx DL BLOCK.ind (fn=56108, len=23): ...
20230502021957768 (PDCH-7) Rx UL BLOCK.req (fn=56112, len=54): ...
20230502021957784 l1sched_pull_burst(): PDTCH/U Tx time (fn=56116)
20230502021957784 TS7-PDTCH dropping Tx primitive (current Fn=56116, prim Fn=56112)
This is impossible in reality, because Uplink is intentionally lagging behind Downlink by 3 TDMA timeslot periods. In a virtual setup this causes sporadic dropping of Uplink PDCH blocks, as can be seen from the logging snippet above, and significantly degrades the RLC/MAC performance for GPRS.
Let's remove the internal clock module and trigger the Uplink burst transmission each time we receive a Downlink burst. This helps to overcome the GPRS scheduling issues and replicates the approach of osmo-trx-ms more closely.
Commit
923e9b0b90622a7977c73ddd264d7cc48439098f
by laforge
trxcon: do not advance Uplink TDMA Fn by default
The idea behind advancing Uplink TDMA Fn is to give the transceiver, which is usually a separate process, some additional time to receive and prepare Uplink bursts for transmission. This comes at a price of having an additional delay between Uplink and Downlink.
Given that trxcon, as a standalone application, is primarily used in conjunction with fake_trx.py for running ttcn3-bts-test against osmo-bts-trx, there is no reason to advance the Uplink TDMA Fn.
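With the Uplink driven by received Downlink bursts, the Uplink frame number follows directly from the Downlink one. A rough sketch, assuming a configurable advance in frames; the function and macro names are illustrative, while the hyperframe length of 26 * 51 * 2048 frames is the standard GSM value.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_HYPERFRAME 2715648u /* 26 * 51 * 2048 TDMA frames */

/* Derive the Uplink frame number from the Downlink one.
 * fn_advance = 0 corresponds to "do not advance by default". */
static uint32_t demo_ul_fn_from_dl(uint32_t dl_fn, uint32_t fn_advance)
{
	assert(dl_fn < DEMO_HYPERFRAME);
	return (dl_fn + fn_advance) % DEMO_HYPERFRAME;
}
```

A non-zero advance gives a separate transceiver process more time to prepare bursts, at the cost of extra delay between Uplink and Downlink.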
Commit
96fec1646d714ab1880bf69106c0c556fd849ba3
by Vadim Yanitskiy
mobile: fix -Wlogical-not-parentheses in gsm48_cc_init()
Found by clang:
gsm48_cc.c:54:6: warning: logical not is only applied to the left hand side of this comparison [-Wlogical-not-parentheses]
    if (!cc->mncc_upqueue.next == 0)
        ^                      ~~
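The warning is about operator precedence: '!' binds tighter than '=='. The sketch below spells out both readings with explicit parentheses (the function names are made up). When comparing against 0 the two readings happen to agree, so the warning concerns clarity in that case; for other right-hand values they differ.

```c
#include <assert.h>
#include <stdbool.h>

/* "!x == y" as the compiler parses it: negate first, then compare. */
static bool demo_parsed_reading(int x, int y)
{
	return (!x) == y;
}

/* The reading the warning suspects was intended: negate the comparison. */
static bool demo_suspected_reading(int x, int y)
{
	return !(x == y);
}
```

For example, with x = 5 and y = 2 the parsed reading is false (0 == 2), while the suspected reading is true.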
Commit
df900478de4f3931539c9f2b0387f9800a785f9f
by laforge
layer23: Update to libosmocore osmo_auth_gen_vec2
libosmogsm has recently deprecated the use of osmo_auth_gen_vec and the osmo_sub_auth_data structure in favor of newer versions of this API. Let's migrate to them.
Commit
749f0a461cc78b0fbd17f3eeee36dace4ba9c8be
by Vadim Yanitskiy
layer23: fix handling of logging category mask (-d option)
In change 67943df4 I broke handling of the logging category mask in the mobile app. Adding this option results in a segfault:
ERROR: osmo_log_info == NULL! You must call log_init() before using logging in log_parse_category_mask()!
Assert failed osmo_log_info src/libosmocore/src/core/logging.c:329
As can be seen, the problem is that we are calling log_parse_category_mask() before initializing the logging.
As a possible solution, I could rearrange the code to parse command line options after calling osmo_init_logging2(). This would fix the segfault, but would not fully solve the problem.
If we call log_parse_category_mask() before parsing the config file, then logging configuration in the config file overwrites the logging configuration specified via the command line. But we want the opposite: the command line setting should overwrite the config file parameters. This is handy because there is no need to edit the config file if you quickly need to test something.
So let's call log_parse_category_mask() after parsing the config file.
Change-Id: I1b2b7804bf99b71f96e9197f7824cfd20431e8a1
Fixes: 67943df4 "layer23: fix parsing of command line options"
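The desired precedence (built-in default, then config file, then command line) can be sketched as a simple "last writer wins" resolution. The helper and the DEMO_UNSET marker are hypothetical and stand in for the real logging configuration.

```c
#include <assert.h>

#define DEMO_UNSET (-1) /* source did not specify a value */

/* Resolve the effective log level for one category.
 * Later sources win: built-in default < config file < command line. */
static int demo_effective_level(int def, int from_file, int from_cli)
{
	int level = def;

	assert(def != DEMO_UNSET);
	if (from_file != DEMO_UNSET)
		level = from_file;
	if (from_cli != DEMO_UNSET)
		level = from_cli;
	return level;
}
```

Applying the command line value before the config file would invert the last two steps, which is the overwrite problem described above.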
Commit
60215bc051c96d87c552ece42d594be7f7388c4f
by Vadim Yanitskiy
modem: properly handle Dedicated mode or TBF IE
We need to distinguish between Uplink and Downlink TBF assignment in grr_rx_imm_ass(), because matching the Request Reference IE makes sense only for the Uplink TBF assignment.
Uplink TBFs are requested by the UEs by sending RACH, while Downlink TBFs are assigned by the network itself. The Request Reference IE is only valid for Uplink assignments and shall be ignored in messages assigning Downlink TBFs.
Now if the channel mode is GSM48_CMODE_SPEECH_AMR, UL FACCH/[FH] frames will be fed to osmo_amr_rtp_dec(), which is definitely wrong. Fix this by doing all AMR specific checks in a separate function, which is called only for speech frames.
Commit
a22acea3a9db44d891080caaed8619114386afff
by Vadim Yanitskiy
trxcon/l1sched: rework dequeueing of Tx prims
Centralized dequeueing of Tx prims in l1sched_pull_burst() is a working approach, but doing this in each logical channel handler individually is a lot more flexible. This is how it's done in osmo-bts-trx, and this allows implementing FACCH support for CSD channels.
Commit
21aacfe7096ef27fe40c22d20a86d2303f9cba91
by Vadim Yanitskiy
trxcon/l1sched: properly prioritize FACCH/H over TCH
Unlike FACCH/F, which steals one TCH frame, FACCH/H steals two TCH frames. This is what prim_dequeue_tchh() aims to implement, but the current implementation is not 100% correct.
The problem is that we're attempting to dequeue and drop two TCH frames in one go, whenever we get a FACCH/H frame. Most likely, there will be no 2nd TCH frame in the Tx queue at that time, so it will never be dropped and will clog the queue.
Let's replicate what osmo-bts-trx does:
* dequeue and drop the 1st TCH frame when sending 1st/6 burst of FACCH,
* dequeue and drop the 2nd TCH frame when sending 3rd/6 burst of FACCH.
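The two-step dropping described above can be sketched like this. The queue is reduced to a frame counter, burst IDs are assumed to run from 0 to 5, and all names are illustrative rather than taken from trxcon.

```c
#include <assert.h>

/* Tx queue reduced to the number of pending TCH frames. */
struct demo_txq {
	unsigned int tch_frames;
};

/* Number of TCH frames stolen when sending FACCH/H burst `bid` (0..5):
 * one on the 1st burst (bid 0) and one on the 3rd burst (bid 2). */
static unsigned int demo_facch_h_steal(unsigned int bid)
{
	assert(bid < 6);
	return (bid == 0 || bid == 2) ? 1 : 0;
}

/* Drop at most one queued TCH frame per stealing burst. */
static void demo_on_facch_h_burst(struct demo_txq *q, unsigned int bid)
{
	if (demo_facch_h_steal(bid) && q->tch_frames > 0)
		q->tch_frames--;
}
```

Since the second drop happens two bursts later, a TCH frame that arrives after the FACCH/H transmission has started is still removed instead of staying in the queue.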
Commit
fd8962e89144fb0af4b99199589d9f9768804640
by Vadim Yanitskiy
trxcon/l1sched: do not craft artificial BFI frames on TCH
Whenever decoding fails or FACCH stealing happens, simply send an empty DATA.ind to the upper layers. On the Uplink path, use a dummy LAPDm func=UI frame (with random padding) whenever possible.
Crafting TCH frames with zeroes is not really needed and moreover makes it hard to distinguish between valid speech frames and BFIs. This also used to be the case for osmo-bts-trx, but not anymore (see the related patch).
Commit
0cfd0bbe801d86e1a3cbc08b228f9ad2521ebd08
by Vadim Yanitskiy
trxcon/l1sched: transmit dummy speech blocks with inverted CRC3
In case when an Uplink TCH/[FH]S frame needs to be transmitted, but there is no frame available in the Tx queue, transmit an intentionally invalid block with inverted CRC3. This will induce a BFI condition in the BTS side receiver. See also the related osmo-bts-trx patch.
This works around a race condition that occurs when the upper layers send L1CTL RESET.req immediately followed by L1CTL FBSB.req. The problem is that the TRXC logic considers the transceiver powered on until a response to CMD POWEROFF is received.
Commit
89ef574fe2257f65ae4140b71620589e8c0726e9
by Pau Espin Pedrol
layer23: modem: Avoid direct transition ST_PACKET_TRANSFER->ST_PACKET_IDLE
Right now the existing code is switching to state IDLE and hence running grr_st_packet_idle_onenter() which attempts stuff like starting an attach. This is all done while the L1CTL RESET + FBSB is still in progress. We should instead wait to receive confirmation from those. As an easy implementation for now, simply switch to the GRR_ST_PACKET_NOT_READY state, which will move to GRR_ST_PACKET_IDLE once it starts receiving CCCH blocks (aka it will already have gone through L1CTL RESET + FBSB completely).
Commit
3f409eb94eac9ffa67a7528f29f58275f0b836b8
by Vadim Yanitskiy
trxcon/l1sched: emit DATA.cnf early (on bid=0)
trxcon's scheduler is currently emitting DATA.cnf whenever the last burst of a DATA.req has been transmitted. This sounds logical, but makes the implementation quite complex. It's even harder to implement sending of DATA.cnf properly for CSD specific channel modes, which are to be implemented in a follow-up patch.
The DATA.cnf prims trigger sending of L1CTL DATA.cnf/TRAFFIC.cnf, which are interpreted as Ready-to-Send by the upper layers (layer23). Additionally DATA.cnf prims trigger sending of GSMTAP PDUs containing the respective Uplink frames.
This patch changes the l1sched logic, so that a DATA.cnf primitive is emitted whenever the respective DATA.req is dequeued and encoded using the lchan specific channel coding function. This simplifies the code a lot and prepares for the upcoming CSD support.
As a bonus, this patch fixes an inconsistency between TDMA FNs reported in Uplink and Downlink GSMTAP PDUs. Now we're indicating the first Fn in both cases, so Uplink is consistent with Downlink.
Commit
171ba463828af143911d38e8a4885759f9d2e389
by Pau Espin Pedrol
layer23: modem: gmm: Adapt log string about no TLLI found
During the initial GMM Attach, the GMM layer generates an internal local TLLI and uses it for the attach. Only when it receives the GMM Attach Accept carrying the TLLI assigned by the network does it explicitly inform the other layers about the TLLI update. Hence, the GMMREG user doesn't know the TLLI in use until the GMM Attach succeeds (GMMREG-Attach.cnf). Until then, the TLLI at the app is basically unassigned (0xffffffff), so a TLLI update hook in GMMRR-Assign.req cannot work: the app is unaware of the temporary local TLLI and no match can be made. In that specific scenario this is fine, since the app is waiting for the GMMREG-Attach.cnf anyway, which will indicate the assigned TLLI. In summary, not being able to match the TLLI in GMMRR-Assign.req is not bad per se, so soften the log error there.
Commit
8bbd0d173fad3708fac3207d56dd04c14912351e
by Pau Espin Pedrol
l1ctl: Fix fill ph_data_param fn field
This commit fixes a recent commit that fills in the fn field. The dl->frame_nr is in network byte order, and we want to pass a host byte order integer in the primitive. Use tm.fn, which already holds the proper value calculated from dl->frame_nr.
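The byte order issue in a nutshell: a 32-bit field received from the wire has to go through ntohl() before it is used as a number. The structure below is a hypothetical stand-in for the L1CTL downlink info header.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical wire header carrying the frame number in network order. */
struct demo_dl_info {
	uint32_t frame_nr; /* big endian on the wire */
};

/* Return the frame number as a host byte order integer. */
static uint32_t demo_dl_fn(const struct demo_dl_info *dl)
{
	assert(dl != NULL);
	return ntohl(dl->frame_nr);
}
```

On a little-endian host, passing dl->frame_nr unconverted would yield a byte-swapped frame number.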
Commit
a93785bf437c45131aad0bac52b7c3f3db9ecee3
by Vadim Yanitskiy
trxcon/l1sched: implement CSD scheduling support
This patch adds support for TCH/[FH]2.4, TCH/[FH]4.8, TCH/F9.6 and TCH/F14.4 (including FACCH). Additional changes made:
* enlarge the maximum TCH burst buffer size to 24 * (2 * 58) bytes;
* enlarge per-l1cs UL/DL burst masks to hold up to 32 bits;
* enlarge per-l1cs DL meas ring buffer to 24 entries;
* enlarge L1SCHED_PRIM_TAILROOM from 256 to 512 bytes;
* enlarge L1CTL_LENGTH from 256 to 512 bytes;
Commit
59e649dbf1e1e71d38ab6fc40d2feb6fe6195f54
by laforge
firmware: board: add support for TR-800 target
iWOW TR-800 is a packaged GSM modem module based on Calypso+Iota+Rita chipset; it is fully quadband, and reverse engineering of its PCB confirms that this module is nothing but a mass-produced version of the core of TI's legendary Leonardo+ reference platform. The same module is also known as FreeCalypso Tango - a rebranded version of the same hardware module with different firmware and a different Responsible Party for official support.
FreeCalypso HQ is contributing OsmocomBB support for this Calypso modem module for two reasons:
1) Harm reduction - sooner or later someone in Osmocom universe is going to run OBB firmware on TR-800 once they lay their hands on this hardware, and the resulting operation will be less harmful / closer to correct if we provide the basic board support patch.
2) There exists a large surplus of FreeCalypso Caramel2 development boards that are based around FC Tango modules. Having this hw supported by both firmwares will hopefully increase the chances that these boards will find loving homes, as opposed to continuing to gather dust in a cardboard box.
Legal and ethical disclaimer: OsmocomBB firmware running on ANY Calypso+Iota+Rita target is *known*, through confirmed observations with a measuring instrument (R&S CMU200), to put out radio transmissions that are *severely out of spec*, and this defect does NOT go away with the present patch which merely adds support for a different C+I+R board target. The present patch has been produced as a harm reduction measure, to reduce (but not to zero) the harm that will be caused by parties who run OsmocomBB firmware on C+I+R hardware despite having been advised not to. As the party seeking to reduce rather than cause that harm, Mother Mychaela and her related business entities explicitly disclaim all liability for damage that will be caused by parties who continue running OsmocomBB firmware despite having been repeatedly advised to switch to manufacturer-approved published-source firmware instead.
Commit
fa833e40956a72334174eea220e4b5a20bf7864a
by Pau Espin Pedrol
l1gprs/l1ctl: Decouple RTS.ind from DL_BLOCK.ind
Before this patch, the RTS.ind was crafted up in the stack when receiving the DL_BLOCK.ind. This created some problems, since the internal low-level state has to be updated between signalling DL_BLOCK.ind and RTS.ind, as there is an fn-advance of one block between those two signals (hence the timeslot allocation has to be applied at the time the fn-advance is applied). This does not fix the whole issue: there are several timeslots, so in the following sequence of events the internal timeslot state is updated by the event in the middle, potentially causing problems in the remaining TS: DL_BLOCK.ind(FN=N, TS=1), RTS.ind(FN=N+4, TS=1), DL_BLOCK.ind(FN=N, TS=2)
In any case, this decoupling already improves the situation and is a step needed anyway towards fully fixing the problem (by, for instance, maintaining separate timeslot state for the DL and UL directions, since they are driven by different FN times, offset by 1 PDCH block).
Commit
9978b00ea0357be5a5d071562f5695c3165a3e82
by Vadim Yanitskiy
modem: grr: implement RACH.req retransmission
Sometimes sending one Access Burst is not enough, so we need to repeat sending it a few more times, changing the 3 LSBs randomly. This is what we already do in the mobile app, but not in the modem app.
* Rename GRR_EV_RACH_{REQ,CNF} to GRR_EV_CHAN_ACCESS_{REQ,CNF}.
* Rename VTY command 'grr tx-chan-req' to 'grr start-chan-access'.
* Add an intermediate state GRR_ST_PACKET_ACCESS.
** The GRR_EV_CHAN_ACCESS_REQ transitions to this state.
** One RACH.req gets transmitted when entering this state.
** The GRR_EV_CHAN_ACCESS_CNF confirms transmission of a RACH.req.
** Upon the timeout (300 ms) expiry, a loop state transition happens.
** After 3 loop-transitions, transition to GRR_ST_PACKET_NOT_READY.
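The retransmission logic can be sketched roughly as follows. The structure, the helper names and the interpretation "initial transmission plus up to three loop transitions" are assumptions; the real implementation is an osmo_fsm with a 300 ms timer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define DEMO_MAX_LOOPS 3 /* loop transitions before giving up */

struct demo_chan_access {
	uint8_t ra_base;    /* the 5 MSBs are kept for every attempt */
	unsigned int loops; /* loop transitions performed so far */
};

/* Payload for the next Access Burst: keep 5 MSBs, randomize 3 LSBs. */
static uint8_t demo_next_ra(uint8_t ra_base)
{
	return (uint8_t)((ra_base & 0xf8) | (rand() & 0x07));
}

/* Timeout expiry: returns true if the access state shall be re-entered
 * (sending one more RACH.req), false if access shall be abandoned. */
static bool demo_on_timeout(struct demo_chan_access *ca)
{
	assert(ca != NULL);
	if (ca->loops >= DEMO_MAX_LOOPS)
		return false;
	ca->loops++;
	return true;
}
```

When demo_on_timeout() returns false, the caller would move on to the "not ready" state instead of sending another burst.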
Commit
a41ca4bbc42bc66557cdc7c646c539d1a4c1e30f
by Vadim Yanitskiy
trxcon/l1sched: rework dequeueing of PDCH Tx prims
When a UL BLOCK.req is received late, i.e. after the first Tx burst of the respective TDMA Fn was requested by the PHY, a domino effect can be observed: the stale Tx primitive remains in the queue and prevents transmission of the next primitive, even if the latter was received in time. This breaks transmission of consecutive UL blocks.
Don't let stale primitives poison the Tx queue: drop them like before, but keep looking for a primitive with the matching TDMA Fn. If a primitive with a TDMA Fn past the current one is found, stop the iteration.
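A simplified sketch of that dequeueing strategy. The queue is modelled as an array ordered oldest first, a dropped primitive is only flagged rather than freed, and all names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_HYPERFRAME 2715648 /* 26 * 51 * 2048 TDMA frames */

struct demo_prim {
	uint32_t fn; /* requested TDMA frame number */
	int dropped; /* set when discarded as stale */
};

/* Signed distance a - b with hyperframe wrap-around:
 * negative if a lies before b, positive if a lies after b. */
static int32_t demo_fn_diff(uint32_t a, uint32_t b)
{
	int32_t d = (int32_t)a - (int32_t)b;

	if (d > DEMO_HYPERFRAME / 2)
		d -= DEMO_HYPERFRAME;
	else if (d < -(DEMO_HYPERFRAME / 2))
		d += DEMO_HYPERFRAME;
	return d;
}

/* Scan the queue for a primitive matching cur_fn. Stale entries are
 * marked as dropped; an entry that is not due yet stops the scan. */
static struct demo_prim *demo_dequeue_pdch(struct demo_prim *q, size_t len,
					   uint32_t cur_fn)
{
	assert(q != NULL);
	for (size_t i = 0; i < len; i++) {
		int32_t d;

		if (q[i].dropped)
			continue;
		d = demo_fn_diff(q[i].fn, cur_fn);
		if (d == 0)
			return &q[i];
		if (d > 0)
			break; /* future primitive: keep it queued */
		q[i].dropped = 1; /* stale: must not block the queue */
	}
	return NULL;
}
```

With a queue holding Fn 56112 and 56116 and the current Fn at 56116, the first entry is dropped and the second one is returned for transmission.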
Commit
245b4b92389a7fd417cfc784d13adeb0da94be03
by jolly
ASCI: Get timing advance and TX power only when included
Instead of assuming that there are TX power and timing advance IEs included in RSL message, check for existence.
gsm48_rr_rx_acch() may receive frames from FACCH that do not have these IEs included in the message. These frames are UI frames on DCCH and Bter frames. E.g. these frames are used on voice group channel to control uplink.
Commit
b7663882c0ad86b7fb709a4e5b136e7c0230a399
by jolly
ASCI: MM connections are defined by 'ref' and 'protocol' tuple
VGCS and VBS calls may share the same (call) ref or share with other protocols. Therefore the MM connection is defined by the reference and the protocol discriminator.
Commit
28d9a4880cdbc18195d759e182af3a23e9cfa236
by jolly
ASCI: Add a flag to turn transmitter off or on
This flag can be used to turn transmitter off for "group receive mode" or for handover procedure. The flag is stored in the channel description of the mobile application. It is sent to layer 1 when switching to dedicated channel or when changing TCH mode.
At the layer 1 the transmitter is turned off while the receiver is still active. This is done by:
* scheduling a TX dummy task for TCH bursts
* scheduling no TX task for SACCH bursts
* not enabling the transmit window
Commit
23d46f003f7cbb8d2c39b6285c1e22fd692f9631
by Andreas Eversberg
ASCI: Add UIC support to random access burst
A different identity code can be used on uplink access bursts on voice group channel. This is optional for the network, but mandatory for the MS side. If the network does not define a UIC, the BSIC is used instead. BSIC is used for RACH channel and handover.
Commit
bb32882adc170890dbf2e937181e6d6ee3c55b17
by Andreas Eversberg
ASCI: Increase channel request history to 5 entries
3 entries are enough for random access on CCCH. 5 are required for uplink request on VGCS channel.
The history is used to remember when the random access bursts were sent. The RR layer can check if the IMMEDIATE ASSIGNMENT or VGCS UPLINK GRANT message has a matching frame number and random value of up to 5 random access bursts previously sent.
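Such a history can be modelled as a small ring buffer. This sketch compares the full frame number for simplicity, whereas a real Request Reference carries a reduced form of it; all names are made up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_CR_HIST_LEN 5

struct demo_cr_entry {
	uint8_t ra;  /* random access value that was sent */
	uint32_t fn; /* frame number it was sent in */
	bool valid;
};

struct demo_cr_hist {
	struct demo_cr_entry e[DEMO_CR_HIST_LEN];
	unsigned int next; /* slot to overwrite next */
};

/* Remember one transmitted access burst, overwriting the oldest entry. */
static void demo_cr_hist_add(struct demo_cr_hist *h, uint8_t ra, uint32_t fn)
{
	assert(h != NULL);
	h->e[h->next] = (struct demo_cr_entry){ .ra = ra, .fn = fn, .valid = true };
	h->next = (h->next + 1) % DEMO_CR_HIST_LEN;
}

/* Does a received assignment/grant refer to one of our recent bursts? */
static bool demo_cr_hist_match(const struct demo_cr_hist *h, uint8_t ra, uint32_t fn)
{
	for (size_t i = 0; i < DEMO_CR_HIST_LEN; i++) {
		const struct demo_cr_entry *e = &h->e[i];

		if (e->valid && e->ra == ra && e->fn == fn)
			return true;
	}
	return false;
}
```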
Commit
2d9c447c3f63ea73f7f9d8dd93bbbf99dfd7f0b9
by Andreas Eversberg
ASCI: Add interface for group receive/transmit mode support to RR layer
This patch includes new messages and their description. They are used to bring the RR layer into group receive mode and from there into group transmit mode and back.
Commit
e676cf83eefaba5eb42eeef8e28ec01ed6655c14
by Andreas Eversberg
ASCI: Prepare gsm48_rr_rx_acch for voice group channel
The gsm48_rr_rx_acch function receives FACCH/SACCH. This is not only used for system information on SACCH, but also for short header messages and regular UI messages on TCH.
Commit
253e5cd1ebefcdd99c08ea7ab32dd93b67379ebe
by Andreas Eversberg
ASCI: Add group receive mode support to RR layer
This allows reception of VGCS and VBS calls. A special sub-state is used to differentiate between IDLE mode and group receive mode. Later it can be used to differentiate between dedicated and group transmit mode.
Commit
32399095be88bc05c6c4873876cc005a0b38f0ae
by Andreas Eversberg
ASCI: Add protocol type to trans_find_by_callref() function
This is required, because different protocols may share the same callref, but use different protocols. E.g. a voice group call can share the same callref with a voice broadcast call, but these calls are different transactions.
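A lookup keyed on the (protocol, callref) tuple could look like this. The structure, the protocol constants and the array-based transaction list are hypothetical simplifications of the real transaction handling.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical protocol identifiers, for illustration only. */
enum demo_proto {
	DEMO_PROTO_VGCS, /* voice group call */
	DEMO_PROTO_VBS,  /* voice broadcast call */
};

struct demo_trans {
	enum demo_proto protocol;
	uint32_t callref;
};

/* Find a transaction by protocol AND call reference; the call
 * reference alone is not unique across protocols. */
static struct demo_trans *demo_trans_find(struct demo_trans *list, size_t len,
					  enum demo_proto protocol,
					  uint32_t callref)
{
	assert(list != NULL || len == 0);
	for (size_t i = 0; i < len; i++) {
		if (list[i].protocol == protocol && list[i].callref == callref)
			return &list[i];
	}
	return NULL;
}
```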
Commit
6708699b224a1e09cedc9caec79ee511f8bfbc45
by Andreas Eversberg
Correctly detect the follow-on proceed information element
Even if follow-on proceed is not supported, the warning message about it not being supported should only be shown when the follow-on proceed information element is included in the location update accept message.
Commit
de630abfc83a88091072fb3c93e958041c3d99f0
by Vadim Yanitskiy
layer23: send UL/DL GPRS blocks over GSMTAP
Note that despite the VTY interface offers various channel type filtering facilities, the actual filtering is not implemented.
This patch simply brings PS domain in consistency with CS domain: the UL and DL GPRS blocks are now being sent over GSMTAP without any filtering, just like GSM MAC blocks.
Commit
79baca14d41663b1282655620328452944a13f4d
by Vadim Yanitskiy
firmware/layer1: mute UL/DL vocodec if it's not needed
The upper layers usually request either of the two configurations:
* (AUDIO_TX_MICROPHONE | AUDIO_RX_SPEAKER) - in this configuration the phone (PHY) is both the origin and the destination of the TCH frames. DL frames are played via the built-in speaker; UL frames recorded using the built-in microphone.
* (AUDIO_TX_TRAFFIC_REQ | AUDIO_RX_TRAFFIC_IND) - in this case the upper layers (host side) become the origin and the destination of the TCH frames. The built-in speaker and microphone are expected to be disabled.
However, when using the second configuration, one can still hear DL TCH frames being played by the built-in speaker. The built-in microphone does not seem to be causing any issues, but still we definitely don't want the vocoder to interfere with the host.
Commit
f492a99d3639b067ed7d7f98a2e95f98b1c185a4
by Vadim Yanitskiy
firmware/layer1: clean up l1s_tch_resp()
* Reset both A_DD_0 and A_DD_1 headers, like in the case of FACCH.
* Reduce nesting, fix minor coding style issues.
* Add a FIXME for proper B_BFI checking.
Even though the function works as expected and *can* return -1, which is first cast to unsigned and then back to signed, let's make the code less confusing by returning -1 straightaway.
Commit
8c190e6f927a34e5967fe0abd189604f6024f51b
by Andreas Eversberg
ASCI: Handle rejection of voice group/broadcast call correctly
If joining a call gets rejected, the call must not be released. Instead it must return to U3 state (incoming call), because the call still exists in the cell and it might be possible to join it later.
If a call notification is gone, a new event is used in the state machine to release incoming call.
Commit
bfebc813842650e1d5191d561ef4a18b3c9b7eb6
by Andreas Eversberg
ASCI: Use correct mobile identiy in TALKER INDICATION message
Use TMSI only if valid in the current location area. If the MS moves to a different location area and joins a group call before location update, TMSI is not valid. Then use IMSI instead. If no IMSI/TMSI is available, send mobile identity without IMSI/TMSI.
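The selection order described above can be sketched in Python as follows. This is only a model of the decision logic; the function name, its parameters and the tuple return value are hypothetical and do not correspond to the actual C implementation.

```python
def select_mobile_identity(tmsi, tmsi_lai, current_lai, imsi):
    """Pick the identity for TALKER INDICATION.

    tmsi / imsi are None when not available; tmsi_lai is the location
    area in which the TMSI was assigned (hypothetical representation).
    """
    # TMSI is usable only in the location area where it was assigned
    if tmsi is not None and tmsi_lai == current_lai:
        return ("TMSI", tmsi)
    # Otherwise fall back to IMSI
    if imsi is not None:
        return ("IMSI", imsi)
    # Neither is available: send a mobile identity without IMSI/TMSI
    return ("NONE", None)
```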
Commit
e73a604de0ed5af7b7ef4c61ad3a92ec7a062ee8
by Vadim Yanitskiy
mobile: add support for Circuit Switched Data calls
This patch implements the signalling part for mobile originating and mobile terminating CSD calls. The user plane interface is to be implemented in follow-up patches.
In accordance with 3GPP TS 44.021, sections 8.1.6 and 10.2.3, the transmission of idle frames to the DTE is mandated when no data is received from the radio interface. An idle frame has all data, status, and E-bits to binary '1' (excluding the alignment pattern).
This requirement is currently implemented by osmo-bts for the Uplink, and is going to be adopted for the Downlink (see the related patch).
This patch brings trxcon/l1sched in sync with osmo-bts-trx.
[ 24s] layer1/prim_tch.c: In function 'l1s_tch_meas_avg':
[ 24s] layer1/prim_tch.c:183:2: error: 'for' loop initial declarations are only allowed in C99 mode
[ 24s] layer1/prim_tch.c:183:2: note: use option -std=c99 or -std=gnu99 to compile your code
We don't specify the C standard explicitly, so let's move the variable declaration out of the for-loop in l1s_tch_meas_avg().
Change-Id: I6c65fbead4e612c81728e9c6601d5f2107616ee6 Fixes: 7286560a3 "firmware/layer1: fill-in DL info for L1CTL TRAFFIC.ind"
Commit
818133cd23b493da472daff2cda9a8e97d0c9637
by laforge
firmware: -nostartfiles -nodefaultlibs are not flags of LD but flags of GCC
It seems that those flags have always been gcc flags, and not ld flags.
After decades of tolerating this, binutils 2.36.x no longer tolerates those flags but prints an error:
arm-none-eabi-ld: Error: unable to disambiguate: -nostartfiles (did you mean --nostartfiles ?)
See also https://github.com/apache/nuttx/issues/3826 and the related https://github.com/apache/nuttx/pull/3836 how this was solved in another project - I adopted that solution here 1:1
Commit
520dd66bdb98f6b7e476e1073ed51b1b5aae7972
by Andreas Eversberg
LAPDm: Enable flag to prevent sending two subsequent REJ frame
Setting the flag was not required in earlier versions of libosmogsm, because this feature was enabled by default.
The roundtrip delay for a LAPD link must be less than T200.
Osmocom-bb runs LAPDm on the host machine via serial interface and USB interface that may cause a roundtrip delay that exceeds T200. Also osmo-bts may have that problem, due to latency between physical interface and osmo-bts software.
What may happen:
An I frame gets lost.
The sending side transmits the next I frame. The receiving side detects the send-sequence error and responds with a REJ frame.
Due to the round trip delay, the T200 expires on the sending side and causes the I frame to be retransmitted with the P bit set, it enters the timer recovery state. The receiving side detects the send-sequence error and responds with a REJ frame with the F bit set.
The sending side will then receive two REJ frames. The first REJ frame will clear the timer recovery state. The second REJ frame (with F bit set) is received when not in timer recovery state, causing an MDL-ERROR-INDICATION.
The layer 2 connection is broken.
Early tests with osmocom-bb in a real network showed exactly this problem.
The solution is to suppress every second REJ frame at the receiving side, until the sequence error condition is cleared. If the first REJ frame gets lost, the sending side would retransmit the I frame again after another expiry of T200. Then the receiving side would respond with a REJ frame again.
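The suppression rule can be modelled with a simple toggle on the receiving side. The following Python sketch is an assumption-laden illustration, not the libosmogsm implementation: the class and method names are made up, and the real code keeps this state in the LAPD datalink structure.

```python
class RejSuppressor:
    """Receiving side: send only every other REJ while a sequence error persists."""

    def __init__(self):
        self.suppress_next = False

    def on_i_frame(self, in_sequence):
        """Return True if a REJ frame should be sent in response to this I frame."""
        if in_sequence:
            # Sequence error condition cleared: start over
            self.suppress_next = False
            return False
        send_rej = not self.suppress_next
        # Alternate: after a sent REJ suppress the next one, and vice versa
        self.suppress_next = not self.suppress_next
        return send_rej
```

In the scenario above, the first out-of-sequence I frame triggers a REJ, while the retransmission caused by T200 expiry does not. If the first REJ was lost, the next retransmission triggers a REJ again.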
Commit
046ee64e3dd7bf285d0e965996bde47acae53099
by Andreas Eversberg
mobile: Fix PCS ARFCN handling: PCS can only be ARFCN 512..810
While it is correct to use the band indicator from SI1 rest octets, it may only be applied for ARFCN values in the range 512..810.
The function gsm_refer_pcs() is used to determine if the cell (which 'talks' about ARFCNs) refers to them as PCS or DCS channels. It returns true if it refers to PCS, but this only means that ARFCNs in the range 512..810 are PCS channels, not all ARFCNs.
The new function gsm_arfcn_refer_pcs() is used to add the PCS flag to an ARFCN, if the given cell refers to PCS and the given ARFCN is in the PCS range 512..810.
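A minimal Python model of that rule is shown below. The flag value 0x8000 is assumed to match ARFCN_PCS in libosmocore; the function name and signature are simplified for illustration and differ from the C function, which takes the cell state rather than a boolean.

```python
ARFCN_PCS = 0x8000  # assumed to match the PCS flag bit used by libosmocore

def arfcn_refer_pcs(cell_refers_to_pcs, arfcn):
    """Add the PCS flag only if the cell refers to PCS AND the ARFCN is in 512..810."""
    if cell_refers_to_pcs and 512 <= arfcn <= 810:
        return arfcn | ARFCN_PCS
    return arfcn
```

ARFCN 871, for example, lies outside 512..810 and therefore never gets the flag, even in a cell that indicates PCS.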
Commit
1641e07c98a68f3c038b9ee9d490f66914629d24
by Andreas Eversberg
Correctly assemble measurement result into MEASUREMENT REPORT
After adding the strongest cell to the measurement report, the variables 'strongest' and 'strongest_i' are used to prevent that already added cells are added again.
Please note that there are no neighbor cell measurements available, because current layer 1 does not report BSIC of neighbor cells. This means that there is no neighbor cell reported.
Commit
2242dbfcc6f5ed14a9ea5505f769ab49a7c5f68f
by Vadim Yanitskiy
virt_phy: fix checking stderr_target in ms_log_init()
Checking stderr makes little sense, since it's an integer value (usually equal to 2). The actual intention, most likely, was to check 'stderr_target' against NULL.
Commit
2a688ec5e97a80ee3878bbf951c0acc2b17dbf7a
by Vadim Yanitskiy
layer23/ccch_scan: use osmo_mobile_identity API
* Migrate from deprecated gsm48_mi_to_string() API.
* Take a chance to unify printing of mobile identity.
* Use osmo_load32be() for printing TMSI - this is what the osmo_mobile_identity API does internally.
Commit
697e259dc463bf744253ed27475727238a664941
by Vadim Yanitskiy
firmware: fix shebang in solve_envs.py: s/python/python3/
This patch fixes [currently missing] Jenkins build verification. Currently it's just skipping the firmware due to errors:
make -C target/firmware CROSS_COMPILE=arm-none-eabi-
make[1]: Entering directory '/build/src/target/firmware'
/usr/bin/env: 'python': No such file or directory
/usr/bin/env: 'python': No such file or directory
/usr/bin/env: 'python': No such file or directory
/usr/bin/env: 'python': No such file or directory
/usr/bin/env: 'python': No such file or directory
/usr/bin/env: 'python': No such file or directory
...
Commit
c3a1f4a39b180432e2120970bb6d4eccad4501af
by Vadim Yanitskiy
mobile: add generic signals for CC/SS/SM transactions
This allows driving logic in other modules based on transaction related events, such as allocation, deallocation, or a state change. These new signals will be used in the upcoming CSD implementation.
Commit
149da511d413d5aa19aef2101aa4299aa7b50a64
by Vadim Yanitskiy
firmware (libosmocore): fix gsm48_chan_mode for TCH/[FH]2.4
This is basically a back-port of the fix that was merged to libosmocore.git back in 2013. Our ancient copy of libosmocore, which is used for building the firmware, predates this commit.
Ideally, we should rip off this ancient copy and build the firmware against recent master (see OS#2378). But for now, let's just fix our local copy. Otherwise TCH/[FH]2.4 support is broken.
Commit
8fa524c39703d0e4a0810eb7918d6938296737d7
by Vadim Yanitskiy
mobile: properly handle different TRAFFIC.{ind,req} formats for CSD
So far we supported the Texas Instruments format (TCH_DATA_IOF_TI), which is used by Calypso based phones (e.g. Motorola C1xx), but not the format that trxcon speaks/understands (TCH_DATA_IOF_OSMO).
Commit
4b496a8c1c2717ea529ea5adf7f0d6447fa19f3a
by Vadim Yanitskiy
mobile: fix rate adaption checking for MO/MT CSD calls
Currently we unconditionally expect the rate adaption (octet 5) in the Bearer Capability IE to be GSM48_BCAP_RA_V110_X30. This is correct for UDI (GSM48_BCAP_ITCAP_UNR_DIG_INF), but not for 3.1 kHz audio (GSM48_BCAP_ITCAP_3k1_AUDIO) and fax (GSM48_BCAP_ITCAP_FAX_G3) calls. For the latter two it should be GSM48_BCAP_RA_NONE.
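The corrected check boils down to a lookup from information transfer capability to the expected rate adaption. The Python sketch below models that mapping; the string keys stand in for the GSM48_BCAP_* constants named above, and the helper name is hypothetical.

```python
# Expected rate adaption (octet 5) per information transfer capability.
# Strings are stand-ins for the GSM48_BCAP_ITCAP_* / GSM48_BCAP_RA_* constants.
EXPECTED_RA = {
    "ITCAP_UNR_DIG_INF": "RA_V110_X30",  # UDI
    "ITCAP_3k1_AUDIO": "RA_NONE",        # 3.1 kHz audio
    "ITCAP_FAX_G3": "RA_NONE",           # fax
}

def rate_adaption_ok(itcap, ra):
    """Return True if the rate adaption matches what is expected for this itcap."""
    return EXPECTED_RA.get(itcap) == ra
```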
Commit
57ef3dea1b60fa8c2e10c2589240c433cd95ce97
by Vadim Yanitskiy
mobile: VTY: store/read data call params to/from config file
We already have VTY commands to configure data call parameters at run-time, but so far there was no way to save and restore them. This commit adds the respective commands to TCH_DATA_NODE.
Commit
e344d6b7c29b2ea5a4e0ef244695bbe52b01ac66
by Vadim Yanitskiy
l1gprs: minor changes to l1gprs_handle_rts_ind()
* assert() the given TDMA Tn before accessing gprs->pdch[]
* do not check TDMA Fn, as there can be no RTS.ind for PTCCH/U
** unlike PTCCH/D, we send Access Bursts on PTCCH/U
Commit
1df9fecf168db9efb19a98cd6154c974a39bcfc5
by Pau Espin Pedrol
apn_fsm: Set default timeout for APN activation to 65s
The current timeout is too low, taking into account that SM PDP Activation timeout is already 30. When SM fails, it will retry sending PDP Context Activation Req. Hence, give it enough time to at least retry once, plus some extra buffer time (eg to go through GMM attach once).
Commit
fc02727700623fbc6f30404919c22b8a1d4bab32
by Vadim Yanitskiy
trxcon/l1sched: trigger sending UL BLOCK.cnf for PDTCH
In tx_pdtch_fn(), delay sending DATA.cnf until bid=3. Otherwise we send it too early (at bid=0) and trick the upper layers (RLC/MAC) to believe that the whole block (all bursts) has been transmitted.
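The timing of the confirmation can be illustrated with a short Python model. The callbacks send_burst and emit_data_cnf are hypothetical stand-ins; the real tx_pdtch_fn() is C code invoked once per burst by the scheduler rather than a loop.

```python
def tx_pdtch_block(send_burst, emit_data_cnf):
    """Transmit the four bursts of one block; confirm only after the last one."""
    for bid in range(4):
        send_burst(bid)
        if bid == 3:
            # All bursts are out: only now tell the upper layers (RLC/MAC)
            emit_data_cnf()

events = []
tx_pdtch_block(lambda bid: events.append("burst %d" % bid),
               lambda: events.append("DATA.cnf"))
```

With the confirmation at bid=0, "DATA.cnf" would appear right after the first burst, before the rest of the block has been transmitted.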
Commit
1b79142f0f1d7aa0e3bd62faf72ba7de7c7ba745
by Vadim Yanitskiy
mobile: init TCH state earlier (on receipt of CC ALERTING)
During a Mobile Originating voice call, we would normally start receiving traffic indications with ringback tone (or even some melody) before the call gets CONNECTed. So in order for the user to be able to hear that, we need to init the voice call handler earlier (on receipt of CC ALERTING message).
We should not be transmitting voice/data frames before the call gets CONNECTed, so add 'rx_only' flag to the TCH state. In tch_send_msg() drop msgb if this flag is set.
Rx only mode makes no sense for data calls, so in tch_recv_cb() we discard received DL frames and thus do not trigger sending UL frames.
Commit
20107916e504f7238068df8088e785255fc17c81
by Vadim Yanitskiy
mobile: set TRAFFIC.{ind,req} mode during call establishment
Now that we support data (CSD) calls in addition to voice calls, we can no longer initialize the TRAFFIC.{ind,req} routing mode in gsm48_rr_init(). We need to apply the appropriate TCH routing mode *during call establishment* based on its type and the configured I/O handler type.
After this patch, one can have the following configuration:
tch-voice
 io-handler l1phy
tch-data
 io-handler unix-sock
 io-tch-format ti
so that the io-handler setting for voice would not affect data calls. Before this patch, the L1 PHY (specifically, Calypso firmware) would not route TRAFFIC.{ind,req} during data calls at all.
Thanks to this patch, it's also no longer required to restart the mobile application after changing voice or data I/O handler.
Commit
f12b17dffb782c7428a563620aa83ec047fd99c4
by Vadim Yanitskiy
mobile: fix GAPK I/O producing too many UL frames
GAPK I/O is currently generating too many UL voice frames, causing Tx queue overflow in the L1 PHY. Change the logic to make DL voice frames drive the Uplink processing chain, like we do for CSD.
Commit
5250da87adb94479af71783bd5fe7a68d53d1221
by laforge
Add funding link to github mirror
see https://docs.github.com/en/repositories/managing-your-repositorys-settings-and-features/customizing-your-repository/displaying-a-sponsor-button-in-your-repository
Commit
d70e8a6de72c361870bd9202110dac933d91992f
by Vadim Yanitskiy
trxcon/l1sched: fix NULL pointer dereference in tx_tch[fh]_fn()
If msg is NULL, we're inducing a BFI condition at the BTS side receiver by sending a TCH/A[FH]S block with invalid CRC6. In this case we need to skip the rest of the function and jump to send_burst immediately.
Commit
ecaa0636426cc3c0142d18a86097f4fbddc9caa2
by Vadim Yanitskiy
trxcon/l1sched: make l1sched_lchan_emit_data_cnf() NULL-safe
Passing NULL to l1sched_lchan_emit_data_cnf() is not normal and generally not expected, but definitely not fatal enough to abort the process completely (due to assertion failure).
Commit
784993a54a3cadb81410cd8298caa4c179f33fa8
by Vadim Yanitskiy
trxcon/l1sched: refactor prim management in tx_tch[fh]_fn()
The code path below the switch statement in tx_tch[fh]_fn() is no longer common since we added CSD specific channel coding. This is why we had to jump over it in several case statements.
This patch significantly reduces the number of goto statements in these two functions and makes them easier to read/follow at the price of code duplication, which is tolerable.
Commit
c310fcfef7f10d026fbfb9569c2a2b46c6984186
by Vadim Yanitskiy
mobile: cosmetic: fix -Wswitch in tch_voice_state_init()
Not really critical, just make gcc happy:
tch_voice.c: In function ‘tch_voice_state_init’:
tch_voice.c:117:9: warning: enumeration value ‘TCH_VOICE_IOH_GAPK’ not handled in switch [-Wswitch]
  117 | switch (state->handler) {
      | ^~~~~~
Commit
04ea6f9cab3d9d5120c77f88b500bad526564c0a
by Vadim Yanitskiy
mobile: fix -Wmaybe-uninitialized in gsm48_rr_tx_meas_rep()
This is very unlikely to happen, because we set strongest to 127, but anyway we don't want to see those warnings:
gsm48_rr.c: In function ‘gsm48_rr_tx_meas_rep.isra’:
gsm48_rr.c:3714:74: warning: ‘strongest_i’ may be used uninitialized [-Wmaybe-uninitialized]
 3714 | if (rrmeas->nc_rxlev_dbm[i] == strongest && i <= strongest_i)
      | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~
gsm48_rr.c:3696:31: note: ‘strongest_i’ was declared here
 3696 | int i, index, strongest_i;
      | ^~~~~~~~~~~
The entire K2xx-series of Sony Ericsson ODM phone models is based on TI Calypso. The K200, K205, and K220 are identical, except that the K205 has a different case/housing, and the K220 includes an FM receiver.
Commit
60bebe94507759c56698abffe7c09bb87c09f2b4
by Vadim Yanitskiy
trxcon/l1sched: fix decoding of DL FACCH/H for TCH/H4.8 and TCH/H2.4
The mapping sched_tchh_dl_csd_map[] is valid for DL TCH/H4.8 and TCH/H2.4, but not for DL FACCH/H. We need to use a separate lookup table sched_tchh_dl_facch_map[] for DL FACCH/H.
Commit
8e626aba27c4bc571e2dd35a38f2bcd32cd93b47
by Vadim Yanitskiy
trxcon/l1sched: fix FACCH/H regression in rx_tchh_fn()
In c15084a5 we overlooked that, in order to ensure alignment to the first FACCH/H block in rx_tchh_fn(), we actually need to check if DL FACCH/H can start (not end!) at the current TDMA Fn. This means we cannot use the same mapping as we do below in that function; we need another one.
This patch fixes multiple FACCH/H regressions in ttcn3-bts-test.
Change-Id: Ia4b737cf11d4d9ce9847cabb77189e9cbcbb8840 Fixes: c15084a5 ("trxcon/l1sched: replace old API with sched_tchh_ul_facch_map[]")
Commit
5af5ee333ae9510b072d644daf33dfdb60e76928
by Oliver Smith
debian: prepare for more subpackages
Prepare to add these subpackages in follow-up patches:
* osmocom-bb-trx-toolkit
* osmocom-bb-trxcon
* osmocom-bb-virtphy
We need these components to run some of the TTCN-3 testsuites. By having them packaged, we can just install them from the binary repositories along with the SUT.
trxcon and virtphy are autotools based, so rework debian/rules to support building multiple autotools projects.
Commit
688bab508341b23746c0e0400e2bc79bd0970d21
by Oliver Smith
debian: add subpackage osmocom-bb-virtphy
I've decided to name the package osmocom-bb-virtphy so there is no underscore in it (would look weird in addition to the minus character) and because it matches the name of the binary "virtphy".
Commit
d95af8c46e46207d9a496a6e12a3e502a567db68
by Oliver Smith
Bump version: 0.1.0 → 0.2.0
Prepare a release tag, so we get binary packages for osmocom-bb-trx-toolkit, osmocom-bb-trxcon, osmocom-bb-virtphy in the osmocom:latest repository. Then we can use these packages when running TTCN-3 testsuites.
This is not an official release, as discussed here: https://gerrit.osmocom.org/c/osmo-ttcn3-hacks/+/38851/1..2//COMMIT_MSG#b9
Commit
abc63d8d825eb56fdcd7e01bf8824915c8780e18
by Kirill Smelkov
trx_toolkit/clck_gen.py: Fix clock generator not to accumulate timing error
CLCKGen currently works as follows:
sleep(ctr_interval)
some work
sleep(ctr_interval)
some work
sleep(ctr_interval)
some work
...
The intent here is to do some work at timestamps that are multiple of ctr_interval, however the implementation does not match the intent, because
1) sleep(ctr_interval) is not guaranteed by the OS to be ideal, so there will always be some jitter in actually slept time without any guarantee that the error will fluctuate over zero without accumulating.
2) "some work" takes some time to run and that time adds again and again to the current time when next sleep(ctr_interval) starts. As the result even if sleep implementation would be ideal, then n'th sleep would start not at
t₀ + n·ctr_interval
but instead at
t₀ + n·ctr_interval + Σ1..n t(work_i)
where trailing Σ term adds over and over as the timing error which can be seen as e.g. increasing trend of received GSM clock jitter in https://osmocom.org/issues/4658#note-10 .
The thinko in the clock generator logic is not so much visible if "some work" takes only a bit of time or is done infrequently. That was actually the case before fake_trx added tx queueing in 6e1c82d2 (trx_toolkit/transceiver.py: implement the transmit burst queue) because before that commit some work was only "send IND CLOCK data every ~ 100th tick". However after 6e1c82d2 the work was adjusted to do linear scan of tx queue over and over at every tick which amplified error accumulation and highlighted the problem.
With that tx queuing in fake_trx was disabled in d4ed09df (Revert "trx_toolkit/transceiver.py: implement the transmit burst queue") with the rationale being most likely, as https://osmocom.org/issues/4658#note-10 says,
Unfortunately, Python is not fast enough to handle the queues in time. Despite the relatively low CPU usage, fake_trx.py fails to scheduler everything during one TDMA frame period. This causes some of our TTCN-3 test cases to fail.
...
Most likely, the problem is that Python's threading.Event is not accurate enough. Running with SCHED_RR does not change anything.
However with the above analysis we can see that it is the logic in CLCKgen that needs fixing, not threading.Event . For the reference threading.Event indeed used dumb timeout implementation on Python2:
but on Python3 it essentially uses plain Lock.acquire(timeout) which, under the hood, uses PyThread_acquire_lock_timed - a plain wrapper over sem_timedwait:
so at least with py3 there should be no question about threading.Event .
-> Fix timing error accumulation by reworking the clock generator loop to compensate observed jitter, caused by OS noise and the work taking time, by adjusting to-sleep δt each tick accordingly.
This is generally good for correctness and will allow us to reinstate tx queueing in fake_trx.
Without the fix added test fails as
FAIL: test_no_timing_error_accumulated (test_clck_gen.CLCKGen_Test.test_no_timing_error_accumulated)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/kirr/src/osmocom/bb/src/target/trx_toolkit/test_clck_gen.py", line 60, in test_no_timing_error_accumulated
    self.assertTrue((ntick+1)*clck.ctr_interval > δT, "tick #%d: time overrun by %dµs total" %
AssertionError: False is not true : tick #200: time overrun by 572478µs total
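The compensation idea described in this commit can be sketched as a loop that sleeps until an absolute deadline instead of for a fixed interval. This is a minimal illustration under assumptions, not the actual CLCKGen code: the function run_clock and its injectable now/sleep parameters are hypothetical and exist only to keep the sketch self-contained.

```python
import time

def run_clock(interval, nticks, on_tick, now=time.monotonic, sleep=time.sleep):
    """Call on_tick(n) close to t0 + n*interval for n = 1..nticks.

    Each sleep is computed from the absolute deadline, so time spent in
    on_tick() and oversleeping in earlier ticks do not accumulate.
    """
    t0 = now()
    for n in range(1, nticks + 1):
        deadline = t0 + n * interval
        delay = deadline - now()  # shrinks if work or jitter consumed time
        if delay > 0:
            sleep(delay)
        # If delay <= 0 we are already late: tick immediately and catch up
        on_tick(n)
```

With this structure the error of tick n is bounded by the jitter of a single sleep rather than growing with the sum of all previous work times.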
Commit
008dfba7d8511912e7a1dbc3a32d89bfee007b16
by Kirill Smelkov
trx_toolkit/clck_gen: Fix DeprecationWarning about Thread.setDaemon
This warning is currently emitted each time trx_toolkit unittests are run:
(osmo.venv) kirr@deca:~/src/osmocom/bb/src/target/trx_toolkit$ python -m unittest discover
/home/kirr/src/osmocom/bb/src/target/trx_toolkit/clck_gen.py:71: DeprecationWarning: setDaemon() is deprecated, set the daemon attribute instead
  self._thread.setDaemon(True)
...............................................
----------------------------------------------------------------------
Ran 47 tests in 0.997s
OK
-> Fix it by using Thread.daemon attribute directly as suggested by https://docs.python.org/3/library/threading.html#threading.Thread.setDaemon
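For reference, the replacement looks like this in isolation. The worker function is a placeholder; only the use of the daemon attribute reflects the change described above.

```python
import threading

def worker():
    pass  # placeholder for the clock loop

thread = threading.Thread(target=worker)
thread.daemon = True  # instead of the deprecated thread.setDaemon(True)
thread.start()
thread.join()
```

The attribute has to be set before start() is called. Passing daemon=True to the Thread constructor is an equivalent alternative.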
Commit
2727bef943d4bac02eef0cb8d18b25ba4a259918
by Kirill Smelkov
trx_toolkit/clck_gen: Fix clock generator to emit ticks with exactly GSM frame period
Since fake_trx beginning in 3187c8e6 (target/fake_trx: initial release of virtual transceiver) CLCKGen was tuned to emitting ticks with sleep period being time of 1 GSM frame _decreased_ a bit by "Average loop back delay". The idea for this decrease probably was to compensate the time spent in each tick handler, so that combined sleep + tick work would occupy time of 1 GSM frame more or less.
The idea of using hardcoded compensation turned out to be not very good, because for the overall tick period to be exactly as defined the compensation should be dynamic and take into account time spent in each tick handler. For example on one machine "loopback delay" is one value, while on another it will be another value. And if we attach more work to tick handler, like it already happened with adding tx queue, the compensation needs to take all that into account as well.
abc63d8d (trx_toolkit/clck_gen.py: Fix clock generator not to accumulate timing error) explains the problem in detail and adds dynamic compensation so that the tick period stays as defined instead of drifting. But it did not stop CLCKgen from shortening the desired tick period by the "average loop back delay".
So after that patch, because CLCKgen now follows the desired period without drifting, its period was 4.615ms - 0.09ms instead of exactly 4.615ms. This caused e.g. the fake_trx and bts-trx clocks to become constantly desynchronized, with bts-trx emitting the following non-stop:
20250122135431420 <0006> scheduler_trx.c:576 GSM clock skew: old fn=0, new fn=102
20250122135431882 <0006> scheduler_trx.c:604 We were 3 FN slower than TRX, compensated
20250122135432344 <0006> scheduler_trx.c:604 We were 2 FN slower than TRX, compensated
20250122135432805 <0006> scheduler_trx.c:604 We were 2 FN slower than TRX, compensated
20250122135433267 <0006> scheduler_trx.c:604 We were 2 FN slower than TRX, compensated
20250122135433728 <0006> scheduler_trx.c:604 We were 2 FN slower than TRX, compensated
20250122135434190 <0006> scheduler_trx.c:604 We were 2 FN slower than TRX, compensated
20250122135434651 <0006> scheduler_trx.c:604 We were 2 FN slower than TRX, compensated
20250122135435113 <0006> scheduler_trx.c:604 We were 2 FN slower than TRX, compensated
20250122135435575 <0006> scheduler_trx.c:604 We were 2 FN slower than TRX, compensated
20250122135436036 <0006> scheduler_trx.c:604 We were 2 FN slower than TRX, compensated
20250122135436498 <0006> scheduler_trx.c:604 We were 2 FN slower than TRX, compensated
20250122135436959 <0006> scheduler_trx.c:604 We were 2 FN slower than TRX, compensated
20250122135437421 <0006> scheduler_trx.c:604 We were 2 FN slower than TRX, compensated
...
What happens here is that there are ~ 216 GSM frames every second, and since fake_trx drifts by 0.09ms every frame, it results in drifting by ~ 20ms every second. Which results in "2 FN slower than TRX" emitted approximately twice per second as above log excerpt confirms.
-> Fix this by adjusting CLCKgen to emit ticks with exactly GSM frame period by default.
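The drift estimate above can be reproduced with a few lines of arithmetic. The variable names are made up for this sketch; the only inputs are the 4.615ms frame period and the 0.09ms delay quoted in the commit message.

```python
GSM_FRAME_MS = 4.615   # nominal GSM TDMA frame period
LOOPBACK_MS = 0.09     # hardcoded "average loop back delay"

nominal_fps = 1000.0 / GSM_FRAME_MS                  # frames per second on bts-trx
fake_fps = 1000.0 / (GSM_FRAME_MS - LOOPBACK_MS)     # ticks per second on fake_trx

drift_ms_per_sec = nominal_fps * LOOPBACK_MS         # time drift per second
drift_fn_per_sec = fake_fps - nominal_fps            # frame numbers gained per second
secs_per_2fn = 2.0 / drift_fn_per_sec                # time to accumulate 2 FN

print("drift: %.1f ms/s, %.2f FN/s, 2 FN every %.3f s"
      % (drift_ms_per_sec, drift_fn_per_sec, secs_per_2fn))
```

The result is roughly 20ms of drift per second, or a bit more than 4 frame numbers, so a 2 FN compensation is due a little more often than twice per second. That is in line with the spacing of just under half a second between the log lines above.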
Commit
0f4714776a9c9b64c4a7268eb8a346f304835565
by Kirill Smelkov
Revert "Revert "trx_toolkit/transceiver.py: implement the transmit burst queue""
This reverts commit d4ed09df57b3461470af501e9687ddd80eb78838, reinstating tx queue into fake_trx.
It is ok to do so because, as explained in abc63d8d (trx_toolkit/clck_gen.py: Fix clock generator not to accumulate timing error), the reason for GSM clock jitter problem was timing error accumulation in CLCKgen, not problems with py threading.Event.
Note: this restores the original tx queue implementation basically as-is, with only minor conflicts resolved during the revert. The original tx queue implementation wastes CPU cycles, though, because it linearly scans the whole tx queue at every TDMA frame. If that CPU usage becomes a real problem, it should be straightforward to fix by reworking the tx queue to use a priority queue instead of an unordered array, via the heapq module from the standard library. See https://docs.python.org/3/library/heapq.html for details.
The follow-up patches will make the necessary adjustments for the tx-queue to function properly.
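As a sketch of the priority queue idea mentioned in the note, a heapq-based tx queue could look like the following. This is not code from trx_toolkit: the TxQueue class and its methods are hypothetical, and frame number wrap-around at the end of the hyperframe is deliberately ignored.

```python
import heapq
import itertools

class TxQueue:
    """Tx queue ordered by frame number; popping due messages avoids a full scan."""

    def __init__(self):
        self._heap = []
        self._seq = itertools.count()  # tie-breaker: keeps FIFO order for equal fn

    def push(self, fn, msg):
        heapq.heappush(self._heap, (fn, next(self._seq), msg))

    def pop_due(self, fn):
        """Pop and return all messages with frame number <= fn, oldest first."""
        due = []
        while self._heap and self._heap[0][0] <= fn:
            due.append(heapq.heappop(self._heap)[2])
        return due

    def __len__(self):
        return len(self._heap)
```

Each tick then only touches the messages that are actually due, instead of scanning every queued message.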
Commit
c80e193f6d95367e764684a6021ede981f44ebbd
by Kirill Smelkov
trxcon: Advance Uplink TDMA Fn by default again
This essentially reverts 923e9b0b (trxcon: do not advance Uplink TDMA Fn by default; I838b1ebc54e4c5d116f8af2155d97215a6133ba4) for the following reason:
In trxcon the TRX clock is unused, because the signal from BTS is used as the master clock source instead (see 45c821ae/Ic8a5b6277c6b16392026e0557376257d71c9d230 "trxcon: get rid of the timer driven clock module" for details).
Before the tx-queue was restored in fake_trx, this worked ok even with fn-advance=0 on the Ms side. But after I41291708effdd2c767be680fff22ffbd9a56815e (Revert "Revert "trx_toolkit/transceiver.py: implement the transmit burst queue""), fake_trx sends each frame exactly when its Fn occurs on the fake_trx clock. As a result, BTS frames (which are sent with fn-advance=2 by default; see I7da3d0948f38e12342fb714b29f8edc5e9d0933d in osmo-bts.git and OS#4487) are queued, held, and only then sent to the Ms at fn=msg.fn. And then, even if the Ms replies immediately with that same fn, that message will be dropped by fake_trx as stale, because fake_trx considers the message too late: that fn has already happened according to the fake_trx clock.
Here is a trace of how that looks like with 1 BTS and 1 MS(*):
10.009.987 MS <- fn=709 tn=0 # messages of BTS queued previously with that fn=709 are forwarded to Ms
10.010.696 MS <- fn=709 tn=1
10.010.904 MS -> fn=709 tn=0 # <-- MS sends UL message with that same fn=709 _before_ CLOCK fn=710
10.011.397 BTS -> fn=712 tn=0
10.011.507 MS <- fn=709 tn=2
10.011.770 MS <- fn=709 tn=3
10.011.968 MS <- fn=709 tn=4
10.012.156 MS <- fn=709 tn=5
10.012.342 MS <- fn=709 tn=6
10.012.527 MS <- fn=709 tn=7
10.012.914 BTS <- fn=709 tn=0
10.013.166 BTS -> fn=712 tn=1
10.013.524 MS -> fn=709 tn=1 # <-- MS sends UL message with that same fn=709 _before_ CLOCK fn=710
10.013.832 BTS -> fn=712 tn=2
10.013.949 MS -> fn=709 tn=2 # <-- MS sends UL message with that same fn=709 _before_ CLOCK fn=710
10.014.081 BTS -> fn=712 tn=3
10.014.177 MS -> fn=709 tn=3 # <-- MS sends UL message with that same fn=709 _before_ CLOCK fn=710
10.014.361 BTS -> fn=712 tn=4

10.014.475 CLOCK fn=710 # but most of those messages of MS with fn=709 are not picked up
10.014.713 MS -> fn=709 tn=4 # instantly and so become dropped as stale on CLOCK fn=710
10.014.815 MS <- fn=710 tn=0
10.015.032 BTS -> fn=712 tn=5
10.015.687 MS <- fn=710 tn=1
10.016.189 MS <- fn=710 tn=2
10.016.464 MS <- fn=710 tn=3
10.016.648 MS <- fn=710 tn=4
10.016.882 MS <- fn=710 tn=5
10.017.110 MS <- fn=710 tn=6
10.017.336 MS <- fn=710 tn=7
[WARNING] transceiver.py:321 (MS) Stale TRXD message (fn=710): fn=709 tn=1 pwr=0
[WARNING] transceiver.py:321 (MS) Stale TRXD message (fn=710): fn=709 tn=2 pwr=0
[WARNING] transceiver.py:321 (MS) Stale TRXD message (fn=710): fn=709 tn=3 pwr=0
[WARNING] transceiver.py:321 (MS) Stale TRXD message (fn=710): fn=709 tn=4 pwr=0
So without adding some fn-advance it is practically not possible for Ms to be on time with tx-queueing on TRX even if Ms sends its uplink frames right immediately after receiving downlink ones.
This way Ms fn-advance has to be 1 at the minimum, so that immediate UL replies can in principle arrive before fn+1 happens on fake_trx side, even for tn=7. And it is also better to increase fn-advance once more by another +1, to compensate for possible jitter due to OS scheduling latencies and similar things. This way default fn-advance=2 on Ms side becomes symmetric to default fn-advance on BTS side and Ms<->BTS exchange starts to work ok even with tx-queueing activated on fake_trx.
In theory it should be possible to reduce those fn-advances to 1 on both sides, but that will likely require to switch clock granularity from Fn to Tn increasing precision by an order of magnitude, which will likely also result in the need to make architectural change of moving trx to work inside BTS and MS instead of being separate service processes. That's a big task and I'm not delving into that here.
Note: Uplink Fn advance > 0 is needed for Ms when working with regular TRX'es as well. The reason is exactly the same as explained above. In 923e9b0b the reason for setting fn-advance=0 by default was that trxcon is usually being used with fake_trx, and that with fake_trx it is not needed. But after reenabling tx-queueing we have to revisit even fake_trx case again.
(*) the trace was captured with the help of the following debugging patch:
+ trace("%s\t<- fn=%d\ttn=%d" % (trx, rx_msg.fn, rx_msg.tn)) # Transform from TxMsg to RxMsg and forward tx_msg = rx_msg.trans(ver = trx.data_if._hdr_ver) trx.handle_data_msg(src_trx, rx_msg, tx_msg)
--- b/src/target/trx_toolkit/fake_trx.py +++ a/src/target/trx_toolkit/fake_trx.py @@ -29,7 +29,7 @@ import re
from app_common import ApplicationBase -from burst_fwd import BurstForwarder +from burst_fwd import BurstForwarder, trace from transceiver import Transceiver from data_msg import Modulation from clck_gen import CLCKGen @@ -473,6 +473,7 @@ def run(self):
# This method will be called by the clock thread def clck_handler(self, fn): + trace("CLOCK\tfn=%d" % fn) # We assume that this list is immutable at run-time for trx in self.trx_list.trx_list: trx.clck_tick(self.burst_fwd, fn)
--- b/src/target/trx_toolkit/transceiver.py +++ a/src/target/trx_toolkit/transceiver.py @@ -25,6 +25,7 @@ from data_if import DATAInterface from udp_link import UDPLink from trx_list import TRXList +from burst_fwd import trace
Commit
fc9044895d23393f0fb81843012b83221e6183b7
by Kirill Smelkov
trx_toolkit/transceiver: Do not forward nor log from under tx_queue_lock
Even though fake_trx works ok with tx-queuing for 1 BTS + 1 MS, when I try to run two ccch_scan's with 1 BTS, fake_trx starts to occupy ~ 100% of CPU and emits lots of "Stale ..." messages:
Inspecting a bit with a profiler showed that fake_trx simply cannot keep up with the load.
Let's try to fix this with optimizing things a bit where it is easy to notice and easy to pick up low-hanging fruits.
This is the first patch in that optimization series. It moves blocking calls from out of under tx_queue_lock on transmit path. The reason for this move is not to block receive path while the transmit path is busy more than necessary. I originally noticed tx_queue_lock.acquire being visible in profile of the rx thread which indicates that tx/rx contention on this lock can really happen if we do non-negligible tasks from under this lock. Here, in particular, it was forward_msg that was preparing and actually sending RxMsg to destination. tx_queue_lock is needed only to protect tx_queue itself and synchronize rx and tx threads access to it. Once necessary items are appended or popped, we can do everything else out of this lock.
-> Move everything on the tx codepath that does not actually need access to tx_queue out from under this lock:
- only collect messages to be sent under the lock; actually forward them after releasing the lock;
- same for logging.
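The pattern described above can be sketched roughly as follows. This is a minimal illustration, not the actual transceiver.py code: the class name TxQueueSketch, the (fn, msg) tuple layout, and the simplified stale check (which ignores frame number wrap-around) are all assumptions made for the example.

```python
import logging
import threading
from collections import deque

log = logging.getLogger("tx_queue_sketch")

class TxQueueSketch:
    """Hypothetical stand-in for a transceiver's tx queue."""

    def __init__(self, forward_msg):
        self.tx_queue_lock = threading.Lock()
        self.tx_queue = deque()          # items are (fn, msg) tuples
        self.forward_msg = forward_msg   # potentially slow: prepares and sends a message

    def push(self, fn, msg):
        # Called from the rx thread: hold the lock only for the append
        with self.tx_queue_lock:
            self.tx_queue.append((fn, msg))

    def clck_tick(self, fn):
        due, stale = [], []
        with self.tx_queue_lock:
            # Under the lock: only sort queued items into buckets
            keep = deque()
            for item in self.tx_queue:
                if item[0] == fn:
                    due.append(item)
                elif item[0] < fn:   # frame number wrap-around ignored in this sketch
                    stale.append(item)
                else:
                    keep.append(item)
            self.tx_queue = keep
        # Lock released: logging and forwarding no longer block push()
        for item_fn, msg in stale:
            log.warning("Stale message (fn=%d): %r", item_fn, msg)
        for _, msg in due:
            self.forward_msg(msg)
        return len(due), len(stale)
```

The critical section now contains only cheap list operations, so a thread calling push() waits at most for one pass over the queue rather than for message forwarding or log I/O.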
Commit
abfd60b3ee7b6763661f59fce76c1e45fb9c0012
by Kirill Smelkov
trx_toolkit/*: Represent bursts as arrays instead of lists
Continuing the fake_trx profiling story, I noticed that on the rx path a noticeable amount of time is spent converting from ubits to sbits via list comprehensions. By changing the burst representation from a Python list, which stores each item as a full Python object, to an array, which stores each item as just a byte, and by leveraging bytearray.translate, we can speed up that conversion by ~10x.
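A minimal sketch of the two representations is shown below. The function names and the bit mapping (0 to +127, 1 to -127) are assumptions made for illustration and may differ from the actual trx_toolkit helpers; the point is only that a 256-entry translation table replaces a per-item Python loop.

```python
from array import array

# 256-entry translation table: ubit 0 -> +127, ubit 1 -> -127.
# -127 is stored as the unsigned byte 129 and reinterpreted as signed below.
_UBIT2SBIT = bytearray(range(256))
_UBIT2SBIT[0] = 127
_UBIT2SBIT[1] = (-127) & 0xFF

def ubit2sbit_list(ubits):
    # List-based variant: one full Python object per bit
    return [-127 if b else 127 for b in ubits]

def ubit2sbit_array(ubits):
    # Array-based variant: a single translate() call over raw bytes,
    # then reinterpretation as signed 8-bit values
    return array('b', ubits.translate(_UBIT2SBIT))

ubits = bytearray([0, 1, 1, 0])
sbits = ubit2sbit_array(ubits)
```

Both variants produce the same values; the array-based one keeps the whole conversion inside the C implementation of translate() instead of running a Python-level loop.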
Commit
06456f118d6fcd6d60a9e50df1d8f07b5fde2c8b
by Kirill Smelkov
trx_toolkit/*: Try to avoid copying burst data where possible
Conveying burst data is the primary flow in the data plane of fake_trx, so the fewer copies we make, the less we load the CPU.
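One common way to avoid such copies in Python is to slice through a memoryview instead of slicing the buffer itself. The sketch below illustrates that general technique only; it is not necessarily what this commit does, and the function names and the 2-byte header length are hypothetical.

```python
def payload_copy(datagram, hdr_len):
    # Slicing bytes or a bytearray allocates a new object and copies the data
    return datagram[hdr_len:]

def payload_view(datagram, hdr_len):
    # Slicing a memoryview only creates a new view onto the same memory
    return memoryview(datagram)[hdr_len:]

# Hypothetical datagram: 2 header bytes followed by 4 payload bytes
datagram = bytearray(b"\xAA\xBB" + bytes(4))
view = payload_view(datagram, 2)
copy = payload_copy(datagram, 2)

# A later change to the buffer is visible through the view, not the copy
datagram[2] = 1
```

The trade-off is that a view stays valid only as long as the underlying buffer is not reused, so the buffer's lifetime has to be considered when views are passed between threads.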
After this change I can finally run 1 BTS + 2 Mobile + 1 ccch_scan without hitting "Stale message ..." on the fake_trx side. However, fake_trx CPU load is close to 100% and internal clock overruns are frequent:
[WARNING] clck_gen.py:97 CLCKGen: time overrun by -1385us; resetting the clock
[WARNING] clck_gen.py:97 CLCKGen: time overrun by -2657us; resetting the clock
[WARNING] clck_gen.py:97 CLCKGen: time overrun by -1264us; resetting the clock
[WARNING] clck_gen.py:97 CLCKGen: time overrun by -2913us; resetting the clock
[WARNING] clck_gen.py:97 CLCKGen: time overrun by -1836us; resetting the clock
...
This suggests that even though fake_trx.py + tx-queue now works to some extent, rewriting fake_trx in C, as explained in OS#6672, is still the better option.
Commit
867e849010864ecadf1f2e7adb5c70250b6a99fc
by Pau Espin Pedrol
fake_trx: Allow setting sched RR priority for clckgen thread
With this patch Python is still too slow sometimes, with frequent overruns in the range of 50-2400 microseconds. Still, with a higher priority we should hopefully see fewer cases where the process is delayed by a much larger amount, which may trigger a "no clock" error from osmo-bts-trx.
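As a rough sketch of how a thread can request round-robin real-time scheduling from Python: the helper name try_set_rr_priority and the fallback behaviour are assumptions for illustration, not the actual fake_trx code. On Linux, the underlying sched_setscheduler() syscall with pid 0 applies to the calling thread, and it normally requires root or CAP_SYS_NICE.

```python
import os
import threading

def try_set_rr_priority(prio):
    """Try to switch the calling thread to SCHED_RR at the given priority.

    Returns True on success, False if the platform lacks the API or the
    request is rejected (for example, due to insufficient privileges).
    """
    if not hasattr(os, "sched_setscheduler"):
        return False
    try:
        os.sched_setscheduler(0, os.SCHED_RR, os.sched_param(prio))
    except OSError:
        return False
    return True

def clock_thread_body(prio, results):
    # Hypothetical clock thread: raise its own priority before ticking
    results.append(try_set_rr_priority(prio))

results = []
t = threading.Thread(target=clock_thread_body, args=(1, results))
t.start()
t.join()
```

Falling back silently keeps the tool usable for unprivileged runs, at the cost of the timing guarantees the real-time priority is meant to provide.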
Commit
6f45d36e936c9209a18e02b6a8c3a04eb1ff9fd9
by Pau Espin Pedrol
Set sched RR Priority on main thread
Since we are still affected by the Python GIL, it makes sense to also set the main thread (which is actually also expected to be real-time) to a real-time priority.
Use a slightly higher RR priority (prio + 1) for the clckgen thread.