Put all io_uring related read and write states into sub structures
Commit
6b851c3bb04ca368dec916d59f5ec4aadf0ad6bc
by laforge
Allow io_uring_submit batching just ahead of poll/select
Let's add a mode (enabled via the LIBOSMO_IO_URING_BATCH environment variable), where we don't call io_uring_submit() after every operation we add to the submission queue. Rather, do that once before we go into poll.
This should massively reduce the number of io_uring_enter() syscalls we're seeing.
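The batching idea can be sketched without liburing as follows. This is only an illustration: struct batch_ring and the batch_* functions are hypothetical names, the submit_calls counter stands in for the io_uring_submit() call, and treating any set value of LIBOSMO_IO_URING_BATCH as "enabled" is an assumption, not something the commit message states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the ring state; not the libosmocore structure. */
struct batch_ring {
	unsigned int queued;       /* SQEs added but not yet submitted */
	unsigned int submit_calls; /* stands in for io_uring_submit() calls */
	bool batch;                /* batching mode enabled? */
};

static void batch_ring_init(struct batch_ring *r)
{
	assert(r != NULL);
	r->queued = 0;
	r->submit_calls = 0;
	/* Assumption: the mode is on whenever the variable is set at all. */
	r->batch = getenv("LIBOSMO_IO_URING_BATCH") != NULL;
}

/* Flush everything queued so far with a single "syscall". */
static void batch_submit(struct batch_ring *r)
{
	if (r->queued == 0)
		return;
	r->queued = 0;
	r->submit_calls++;
}

/* Called for every operation added to the submission queue. */
static void batch_queue_op(struct batch_ring *r)
{
	r->queued++;
	if (!r->batch)
		batch_submit(r); /* old behaviour: submit after each operation */
}

/* Called once, just before the main loop goes into poll/select. */
static void batch_before_poll(struct batch_ring *r)
{
	batch_submit(r);
}
```

With batching on, five queued operations cost one submit instead of five; with batching off, batch_before_poll() finds nothing queued and does nothing.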
Commit
01ae6dbed8c15182ad262bbab50ba4d23f3aaaa8
by laforge
Avoid reusing pending buffer; append incoming data instead
When reading from a stream, a single read may return only part of a message segment. In such cases, the partial data was stored in 'iofd->pending' and reused for subsequent reads to complete the message.
With upcoming changes that submit multiple read SQEs to io_uring, each read uses its own pre-submitted buffer. Reusing 'iofd->pending' for submitting the next read is not possible, as the next read buffer is already submitted.
Instead, create a new msgb which is used for the read operation and, once the read completes, copy its data (memcpy) to the end of the existing pending buffer, allowing message segments to accumulate until the message is complete.
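A minimal sketch of the append step, using a plain fixed-size byte buffer instead of msgb so that it is self-contained. struct seg_buf and seg_append() are hypothetical names for illustration only; the real code operates on msgb and 'iofd->pending'.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical fixed-size buffer standing in for a msgb. */
struct seg_buf {
	unsigned char data[64];
	size_t len; /* number of valid bytes in data[] */
};

/*
 * Append the bytes of a completed read (rx) to the pending buffer.
 * Returns 0 on success, -1 if the pending buffer has no room left.
 */
static int seg_append(struct seg_buf *pending, const struct seg_buf *rx)
{
	assert(pending != NULL && rx != NULL);
	assert(pending->len <= sizeof(pending->data));
	if (rx->len > sizeof(pending->data) - pending->len)
		return -1;
	memcpy(pending->data + pending->len, rx->data, rx->len);
	pending->len += rx->len;
	return 0;
}
```

The read buffer is never the pending buffer itself, so the next read can already be in flight while the segments accumulate.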
Commit
76ace3ef1999551944e173e7b48d7b259abf4322
by laforge
Add multiple messages buffers to io_uring read operations
Multiple message buffers can be filled by a single read operation, reported in a single CQE, when using io_uring. If less data is available than the buffers can hold, not all buffers will be filled.
Having more than one buffer is optional and the number can be controlled via environment variable.
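How one completion maps onto several buffers can be sketched like this, assuming the buffers are described by a struct iovec array and are filled in order, as readv() does. rx_split() is a hypothetical helper, not part of osmo-io.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/uio.h>

/*
 * Given the byte count reported by one completion, work out how many
 * bytes landed in each buffer. lens[] must have iovcnt entries.
 * Returns the number of buffers that actually contain data.
 */
static int rx_split(const struct iovec *iov, int iovcnt, size_t received,
		    size_t *lens)
{
	int used = 0;

	assert(iov != NULL && lens != NULL && iovcnt >= 0);
	for (int i = 0; i < iovcnt; i++) {
		/* Each buffer takes as much as it can hold, in order. */
		size_t n = received < iov[i].iov_len ? received : iov[i].iov_len;

		lens[i] = n;
		received -= n;
		if (n > 0)
			used++;
	}
	return used;
}
```

Buffers with a length of zero after the split were not touched and can be reused for the next read.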
Commit
2e75e6892765155d55bc2bfedc5000949d6eaa89
by laforge
Add multiple messages buffers to io_uring write operations
Multiple message buffers can be written by sending a single SQE when using io_uring. If less data is written than was submitted, the completely written buffers are removed and the partly written buffer is truncated. Afterwards the remaining buffers are re-queued for the next write operation.
Having more than one buffer is optional and the number can be controlled via environment variable.
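The handling of a short write can be sketched on a plain struct iovec array. tx_trim_iov() is a hypothetical helper written for this illustration; the actual code works on msgb and msghdr structures.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/uio.h>

/*
 * After a write that transferred 'written' bytes, drop the buffers that
 * were written completely and advance the first partly written buffer.
 * The remaining entries are moved to the front of iov[].
 * Returns the number of entries still to be written.
 */
static int tx_trim_iov(struct iovec *iov, int iovcnt, size_t written)
{
	int done = 0;

	assert(iov != NULL && iovcnt >= 0);
	/* Skip every buffer that was written in full. */
	while (done < iovcnt && written >= iov[done].iov_len) {
		written -= iov[done].iov_len;
		done++;
	}
	/* Truncate the buffer that was only partly written. */
	if (done < iovcnt && written > 0) {
		iov[done].iov_base = (char *)iov[done].iov_base + written;
		iov[done].iov_len -= written;
	}
	/* Move the remaining entries to the front for the next write. */
	for (int i = done; i < iovcnt; i++)
		iov[i - done] = iov[i];
	return iovcnt - done;
}
```

For three 4-byte buffers and a write that returns 6, the first buffer is dropped, the second shrinks to its last 2 bytes, and two entries remain queued.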
Commit
3c2a02db40046933b143a69d75cb2691a1208fbd
by laforge
osmo-io: Put together message buffers when dequeued from tx queue
Write operations may be incomplete. After a write operation, osmo-io removes the completely written message buffers from the msghdr and puts the msghdr with the remaining buffers back into tx_queue.
If the user requests multiple message buffers per write operation, the msghdr of an incomplete write may have fewer message buffers than the user requested. The missing buffers are taken from the next msghdr in the queue, if one exists.
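A rough sketch of topping up a msghdr from its successor, with an array of pointers standing in for the message buffers. struct tx_hdr, TX_MAX_BUFS and tx_top_up() are hypothetical; whether the real code moves single buffers or merges in another way is not stated in the commit message.

```c
#include <assert.h>
#include <stddef.h>

#define TX_MAX_BUFS 8 /* hypothetical upper bound for this sketch */

/* Hypothetical stand-in for a msghdr holding several message buffers. */
struct tx_hdr {
	const void *bufs[TX_MAX_BUFS];
	int count;
};

/*
 * Move buffers from 'next' to 'head' until 'head' holds 'want' buffers
 * or 'next' is empty. Returns the number of buffers moved.
 */
static int tx_top_up(struct tx_hdr *head, struct tx_hdr *next, int want)
{
	int moved = 0;

	assert(head != NULL && next != NULL);
	assert(want <= TX_MAX_BUFS);
	while (head->count < want && moved < next->count)
		head->bufs[head->count++] = next->bufs[moved++];
	/* Close the gap at the front of 'next'. */
	for (int i = moved; i < next->count; i++)
		next->bufs[i - moved] = next->bufs[i];
	next->count -= moved;
	return moved;
}
```

If 'next' ends up empty after the move, the caller would drop it from the queue.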
Commit
8a1588aa71cdd49e8db8c2bc9722766338e6ed6a
by laforge
Send multiple read/recvfrom/recvmsg SQEs in advance
Multiple read, recvfrom or recvmsg operations can be submitted in advance via SQEs when using io_uring. This allows reading multiple packets or more data between calls of osmo_select_main(), the main loop.
Having more than one SQE submitted is optional and the number can be controlled via environment variable.
Commit
aa980b74961ed4f392d766a95a77f27789943f26
by laforge
Add environment variable to set io_uring size
The variable is named "LIBOSMO_IO_URING_INITIAL_SIZE" because it only sets the initial size: the following patch will increase the size of the io_uring automatically if the initial size is too small.
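Reading such a size from the environment could look like the sketch below. The function names and the fallback passed by the caller are hypothetical; the default used by libosmocore and its exact parsing rules are not given here, so invalid input simply falls back to the caller's value.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>

/* Parse a positive decimal size; return 'fallback' on any invalid input. */
static unsigned int ring_parse_size(const char *val, unsigned int fallback)
{
	char *end;
	unsigned long n;

	/* Reject missing values and anything not starting with a digit. */
	if (val == NULL || !isdigit((unsigned char)val[0]))
		return fallback;
	errno = 0;
	n = strtoul(val, &end, 10);
	if (errno != 0 || *end != '\0' || n == 0 || n > UINT_MAX)
		return fallback;
	return (unsigned int)n;
}

/* Look up the variable named in the commit message. */
static unsigned int ring_initial_size(unsigned int fallback)
{
	assert(fallback > 0);
	return ring_parse_size(getenv("LIBOSMO_IO_URING_INITIAL_SIZE"), fallback);
}
```

A caller would pass its compiled-in default, for example ring_initial_size(4096), where 4096 is just a placeholder value.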
Commit
a63991d41cc41959d71c1d98413cc3349a7c7305
by laforge
Automatically increase io_uring, if too small.
The ring may be too small to store all SQEs before the kernel can handle them. If this happens, a new ring is allocated with twice the size of the old ring. The old ring will not be destroyed, as it still contains uncompleted elements. Some of them may never be completed.
A pointer to the current ring will be stored within the msghdr structure. It is used when cancelling an SQE. The cancellation must be performed in the same ring where it was created.
It is quite unlikely that the old ring cannot store the cancellation SQE. If this happens, the cancellation is queued and submitted once the ring can store it.
The old ring will not be removed, because there is currently no counter to determine when all submissions are completed.
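The growth step can be sketched with a small bookkeeping structure in place of a real ring. struct grow_ring and both functions are hypothetical; the pointer returned by grow_get_slot() plays the role of the ring pointer stored in the msghdr, so that a later cancellation can be sent to the ring that holds the original SQE.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical bookkeeping for one ring; not a real io_uring. */
struct grow_ring {
	unsigned int size;      /* number of SQE slots */
	unsigned int used;      /* slots currently occupied */
	struct grow_ring *prev; /* older, smaller ring that is kept alive */
};

static struct grow_ring *grow_ring_new(unsigned int size, struct grow_ring *prev)
{
	struct grow_ring *r = calloc(1, sizeof(*r));

	if (r == NULL)
		return NULL;
	r->size = size;
	r->prev = prev;
	return r;
}

/*
 * Take one SQE slot. If the current ring is full, allocate a ring of
 * twice the size and make it current; the old ring stays allocated.
 * Returns the ring the slot was taken from, or NULL on allocation failure.
 */
static struct grow_ring *grow_get_slot(struct grow_ring **current)
{
	struct grow_ring *r;

	assert(current != NULL && *current != NULL);
	r = *current;
	if (r->used == r->size) {
		r = grow_ring_new(r->size * 2, r);
		if (r == NULL)
			return NULL;
		*current = r;
	}
	r->used++;
	return r;
}
```

Operations submitted before the switch keep pointing at the old ring, while everything submitted afterwards lands in the new one.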
Commit
56a346335f36a6c0aea60b7c0f32409116bbafe1
by laforge
Remove old empty io_uring
A previous patch creates a new io_uring if the current one becomes too small to store all SQEs. When all SQEs of the old ring are completed, the old ring will be destroyed.
A counter is incremented whenever an SQE is submitted to an io_uring. The counter is decremented whenever a CQE is received and handled. This counter will determine when a ring is empty and can be destroyed.
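The counting scheme can be illustrated as below. struct counted_ring and its fields are hypothetical, and the 'destroyed' flag only marks the point where the real code would tear the ring down.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-ring bookkeeping for this sketch. */
struct counted_ring {
	unsigned int inflight; /* SQEs submitted, CQE not yet handled */
	bool current;          /* false once a larger ring replaced this one */
	bool destroyed;        /* set where the ring would be torn down */
};

/* Called whenever an SQE is submitted to the ring. */
static void counted_submit(struct counted_ring *r)
{
	assert(r != NULL && !r->destroyed);
	r->inflight++;
}

/* Called whenever a CQE of the ring was received and handled. */
static void counted_complete(struct counted_ring *r)
{
	assert(r != NULL && r->inflight > 0);
	r->inflight--;
	/* Only a replaced ring is destroyed when it becomes empty. */
	if (!r->current && r->inflight == 0)
		r->destroyed = true;
}
```

The current ring may drop to zero in-flight operations at any time without being destroyed, since new SQEs will still be submitted to it.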