Rework proposal for threadshare
Progress:

- Initial proposal
- Second proposal (factorize state changes, API enhancements and some design flaws corrections)
- Check the termination strategy for `Queue`
- Extra testing (note: `pipeline::test_premature_shutdown` deadlocks sometimes)
- Fix FIXMEs (some more enhancements and better use of some futures)
- Update remaining elements.
- Update tokio.
- Rebase on master
- Clippy pass
- Use beta toolchain.
- Finalize doc
Context
Disclaimer: there are parts of the threadshare model for which I still don't have a completely clear representation of the dynamics, so there might still be holes in this proposal.
While working on the tokio 0.2 migration, I came up with the following mental model for the ts-elements:

- Data streams are processed in segments (like in a partition, not in the sense of `gst::Segment`).
- A segment is shared between 2 ts-elements:
  - one of them (A) is processing a `Stream` of incoming data, using a `push_item` or `push_buffer` function
  - the other one (B) is pushing futures to a queue.
- The `push_{item,buffer}` function in (A) takes care of `drain`ing the pending future queue fed by (B).
- (A) builds a sticky custom `Event` to communicate the `IOContext` and the future queue to be used by (B).
Note: elements with a src and a sink such as `ts-proxy` and `ts-queue` both act as (A) and (B) for 2 segments of a data stream.
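To make this model more tangible, here is a minimal, self-contained sketch of the shared future-queue mechanism described above. All names here (`PendingQueue`, `schedule`, `push_buffer` as a free function) and the use of `futures::executor::block_on` are simplifications of mine for illustration; the actual ts-elements run everything on an `IOContext` rather than blocking locally.

```rust
// Simplified illustration of the (A)/(B) model; not the threadshare API.
use std::collections::VecDeque;
use std::sync::{Arc, Mutex};

use futures::executor::block_on;
use futures::future::BoxFuture;

/// The segment-wide queue that the element acting as (B) pushes futures into.
type PendingQueue = Arc<Mutex<VecDeque<BoxFuture<'static, ()>>>>;

/// Role (B): schedules work on the shared queue instead of executing it.
fn schedule(queue: &PendingQueue, fut: BoxFuture<'static, ()>) {
    queue.lock().unwrap().push_back(fut);
}

/// Role (A): the push_{item,buffer} path first drains whatever (B) queued,
/// then processes the incoming data itself.
fn push_buffer(queue: &PendingQueue, buffer: Vec<u8>) {
    // Drain the pending futures fed by (B).
    let pending: Vec<_> = queue.lock().unwrap().drain(..).collect();
    for fut in pending {
        block_on(fut);
    }
    // ... then handle `buffer` (processing elided in this sketch).
    let _ = buffer;
}

fn main() {
    let queue: PendingQueue = Arc::new(Mutex::new(VecDeque::new()));
    // (B) queues some work for the segment.
    schedule(&queue, Box::pin(async { println!("future queued by (B)") }));
    // (A) drains it on its next push.
    push_buffer(&queue, vec![0u8; 16]);
}
```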
Proposal
I tried to map the threadshare `IOContext` model to the executor model described in the async book.
- The async book defines the term "task" for a top-level future. In the current proposal, I named the futures pushed by the elements acting as (B) `Task`s and the pending future lists `TaskQueue`s. See below for a discussion about this choice.
- 2 elements sharing the same data stream segment use the same `IOContext` and the same `TaskQueueId`. A `TaskContext` `struct` gathers them together. This allows sharing them at once and avoids storing and managing them separately in each element.
- Since there is a certain separation of concerns between (A) and (B), I propose to separate the methods they use in 2 `struct`s (see the sketch below):
  - Elements acting as (A) use a `TaskExecutor`. The `TaskExecutor` runs a `StreamProcessor` on the incoming data stream and handles the errors regarding the stream or the subsequent processing. The `TaskExecutor` automatically drains the `TaskQueue`, saving the element from explicitly handling this in the `push_{item,buffer}` function. The `TaskExecutor` allows building a custom event referring to its `TaskContext`.
  - Elements acting as (B) use a `TaskSpawner`. The `TaskSpawner` is built from a custom event referring to the `TaskExecutor`'s `TaskContext`. (B) can then `spawn` a new task which will be processed by the `StreamProcessor` on the `TaskExecutor`.
- The `JitterBuffer` `drain`s its `TaskQueue`, so the behaviour is a little different.
Edit: `TaskScheduler` might be more accurate than `TaskSpawner`, since the tasks are not immediately spawned and in order to stay closer to the terminology used in `DataQueue`.
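To make the separation of concerns concrete, here is a rough sketch of what the proposed API surface could look like. Every type and signature below is an assumption of mine derived from the names used in this proposal (`TaskContext`, `TaskExecutor`, `TaskSpawner`/`TaskScheduler`, `StreamProcessor`); it is not an actual implementation and glosses over the `IOContext` integration and the real error types.

```rust
// Hypothetical API sketch for the proposal above; signatures are assumptions.
use std::time::Duration;

use futures::future::BoxFuture;
use futures::stream::BoxStream;

/// Identifies the TaskQueue shared by the two elements of a segment.
pub struct TaskQueueId(pub u32);

/// Stand-in for the existing threadshare IOContext.
pub struct IOContext;

/// Placeholder for the sticky custom event carrying a TaskContext.
pub struct CustomEvent;

/// Gathers the IOContext and the TaskQueueId so both can be shared at once
/// instead of being stored and managed separately in each element.
pub struct TaskContext {
    pub io_context: IOContext,
    pub task_queue_id: TaskQueueId,
}

/// Whatever processes each incoming item on the (A) side.
pub trait StreamProcessor: Send + 'static {
    type Item: Send;
    fn process(&mut self, item: Self::Item) -> BoxFuture<'static, Result<(), ()>>;
}

/// Used by elements acting as (A).
pub struct TaskExecutor {
    pub context: TaskContext,
}

impl TaskExecutor {
    /// Runs the StreamProcessor on the incoming data stream, handling stream
    /// and processing errors, and automatically drains the TaskQueue so the
    /// element no longer does it explicitly in push_{item,buffer}.
    pub fn run_stream_processor<P: StreamProcessor>(
        &self,
        stream: BoxStream<'static, P::Item>,
        processor: P,
    ) {
        let _ = (stream, processor);
        unimplemented!("drive the stream on the IOContext, draining the TaskQueue first")
    }

    /// Schedules a delayed future on the same IOContext (referenced later in
    /// the "Task vs Future" discussion).
    pub fn timeout(&self, delay: Duration, fut: BoxFuture<'static, ()>) {
        let _ = (delay, fut);
        unimplemented!()
    }

    /// Builds the sticky custom event referring to this executor's TaskContext.
    pub fn new_task_context_event(&self) -> CustomEvent {
        unimplemented!()
    }
}

/// Used by elements acting as (B); built from the custom event above.
/// ("TaskScheduler" might be a better name, as noted in the edit above.)
pub struct TaskSpawner {
    pub context: TaskContext,
}

impl TaskSpawner {
    pub fn from_event(event: CustomEvent) -> Self {
        let _ = event;
        unimplemented!()
    }

    /// Queues a new task to be processed by the StreamProcessor running on the
    /// TaskExecutor side (it is not spawned immediately, hence the edit above).
    pub fn spawn(&self, task: BoxFuture<'static, ()>) {
        let _ = task;
        unimplemented!()
    }
}
```

The intent of `TaskContext` in this sketch is that the `IOContext` and the `TaskQueueId` always travel together (e.g. inside the sticky event), while `TaskExecutor` and `TaskSpawner` only expose the methods relevant to (A) and (B) respectively.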
Task vs Future
As mentioned previously, the async book uses the term "task" for a top-level future. In the above model, what should be considered a top-level future might be subject to debate:
1. From a technical standpoint, the top-level futures are those invoked when calling `TaskExecutor::run_stream_processor` or `TaskExecutor::timeout`.
2. From a functional standpoint, it could be argued that `TaskExecutor::run_stream_processor` runs an executor so that (B) is able to spawn futures, which would be high-level futures, i.e. tasks, under that perspective. `TaskExecutor::timeout` is a little different since it is not implicitly related to the `TaskQueue`.
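To make the two readings concrete, here is a tiny self-contained illustration (plain `futures`, no threadshare types) of the two candidate "task" levels:

```rust
use futures::executor::block_on;

fn main() {
    // Standpoint 1: this outer future is the one actually handed to the
    // reactor/executor, so it is the "top-level future", i.e. the task.
    let outer = async {
        // Standpoint 2: the outer future mostly drives the futures queued by
        // (B); under that perspective these inner futures are the tasks.
        let inner = async { println!("future pushed by (B)") };
        inner.await;
    };
    block_on(outer);
}
```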
If we decide that 1 is more accurate, I'll rename the `TaskQueue`, `TaskSpawner`, etc. as `FutureQueue`, `FutureSpawner`, etc.