The pthread model underneath the golang runtime, which is based on the Genode port of libc, is a bit different (or at least it is emulated that way). It assumes a common space for the created threads (each of which is mapped, correct me if I am wrong, to a single OS thread in 1:1 mode): at least the memory space and some subset of the capabilities for OS resources are shared between threads. E.g. I assume that a file opened in one thread can be used in another thread; therefore the related capability is common to both, because it was created from a single session.
I think that works via TLS, at the user level: another indirection on top of the native numbers, I think in the POSIX library.
This is sufficient for my purpose. I do not use services directly, except for memory manipulation. The only problem with TLS is that in the current model it is bound to the stack's virtual address, while I need it to be bound to my threads...
Talking about the golang model: it uses a common space for memory and descriptors and utilises as few OS threads as possible. Typically the number of OS threads equals the number of processor cores, and it grows only when a thread is blocked in a syscall. In that case the blocked thread is stopped and a new thread is taken from the idle list or created from scratch.
I did read your link. If you disable the global queue, it may be possible that Go never executes the code that is invalid on Genode, although that code is still included. That is therefore not reliable, and it could be an issue with the philosophy of Genode.
I don’t think that this will work: the global queue is a kind of natural load balancer and a way to continue execution if one CPU and its related OS thread block... As a palliative solution we could have per-CPU «global» queues, but I am not sure that will work without a significant update of the scheduler code.
Talking about the optimal model on Genode: I think that if we can group a set of threads in the same way as we do with the pthread model (making a kind of «domain of threads with shared resources» derived from the main one), then we can migrate code inside this «domain».
This «domain» is the group of user-level threads pinned to one OS-level thread.
If this is so, as mentioned above in relation to the POSIX subsystem emulation, then it will work even for different OS threads. All capabilities are hidden inside its implementation. I suppose that on Genode with POSIX thread emulation I could write the following code:
0. run the main OS thread in the standard way for libc+posix
1. create a mutex/blockade/any sync primitive and store it in the main thread (OS thread 0)
2. create pthread 1 with func1 (there will be an OS thread below POSIX thread 1)
3. in func1 set the primitive as busy
4. open a file, store the file descriptor in the common memory of main thread 0, and set the sync primitive free
5. in the main thread create pthread 2 with func2, which will be another OS thread 2
6. in func2 wait for the sync primitive set in pthread 1, and after it becomes free read data from the descriptor from step 4 above
This code should work! Below the fd there should be a unique file descriptor mapped somehow to a capability, yet it is created in thread 1 and used in thread 2: it is «semi-migrated» or shared (both are OS threads). There is no need to interpret the fd or do anything else (like with memory availability): everything is already done in posix+libc on top of Genode!
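The steps above can be sketched in plain POSIX C. This is only a minimal sketch under stated assumptions, not something verified on Genode: it assumes the libc port provides standard pthread mutexes and condition variables, the names `struct shared` and `share_fd_between_threads` are made up for illustration, and the primitive is marked busy when it is created (rather than inside func1) so that thread 2 cannot observe it as free before thread 1 has run.

```c
#include <assert.h>
#include <fcntl.h>
#include <pthread.h>
#include <string.h>
#include <unistd.h>

/* State kept in the common memory of the main thread (steps 1 and 4). */
struct shared {
    pthread_mutex_t lock;
    pthread_cond_t  freed;    /* signalled when the primitive becomes free */
    int             busy;     /* 1 = busy, 0 = free */
    int             fd;       /* descriptor opened by thread 1 */
    const char     *path;     /* file to open (hypothetical input) */
    char            buf[64];  /* data read by thread 2 */
    ssize_t         nread;    /* bytes read by thread 2, -1 on error */
};

/* Step 4: thread 1 opens the file, publishes the fd, frees the primitive. */
static void *func1(void *arg)
{
    struct shared *s = arg;
    int fd = open(s->path, O_RDONLY);

    pthread_mutex_lock(&s->lock);
    s->fd = fd;
    s->busy = 0;
    pthread_cond_signal(&s->freed);
    pthread_mutex_unlock(&s->lock);
    return NULL;
}

/* Step 6: thread 2 waits until the primitive is free, then reads the fd. */
static void *func2(void *arg)
{
    struct shared *s = arg;

    pthread_mutex_lock(&s->lock);
    while (s->busy)
        pthread_cond_wait(&s->freed, &s->lock);
    int fd = s->fd;
    pthread_mutex_unlock(&s->lock);

    if (fd >= 0)
        s->nread = read(fd, s->buf, sizeof s->buf - 1);
    return NULL;
}

/* Steps 0-2 and 5: the main thread owns the state and starts both threads.
 * Returns the number of bytes thread 2 read, or -1 on error. */
ssize_t share_fd_between_threads(const char *path, char *out, size_t cap)
{
    /* Assumption: the primitive starts out busy to avoid a startup race. */
    struct shared s = { .busy = 1, .fd = -1, .path = path, .nread = -1 };
    pthread_t t1, t2;

    assert(out != NULL && cap > 0);
    pthread_mutex_init(&s.lock, NULL);
    pthread_cond_init(&s.freed, NULL);

    if (pthread_create(&t1, NULL, func1, &s) != 0)
        return -1;
    int have_t2 = pthread_create(&t2, NULL, func2, &s) == 0;
    pthread_join(t1, NULL);
    if (have_t2)
        pthread_join(t2, NULL);

    out[0] = '\0';
    if (s.nread > 0) {
        size_t n = (size_t)s.nread < cap - 1 ? (size_t)s.nread : cap - 1;
        memcpy(out, s.buf, n);
        out[n] = '\0';
    }
    if (s.fd >= 0)
        close(s.fd);
    pthread_cond_destroy(&s.freed);
    pthread_mutex_destroy(&s.lock);
    return s.nread;
}
```

Calling `share_fd_between_threads(path, buf, sizeof buf)` from main returns what thread 2 read through the descriptor that thread 1 opened. Nothing in the sketch touches capabilities directly; whether the fd really is usable across threads depends entirely on the libc layer underneath.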
I want to have the same for golang; IMHO there is no reason to make it much more complex... Not everything will work, but the basics could be enough as a first step.
In step 4 above I should not switch to the original OS thread 1: it is busy with some other goroutines...
No, you *can* switch to the original OS-level thread, *but* not to the original user-level thread within that OS-level thread. Or rather, you first message the running original thread to do a user-level switch to a dedicated user-level thread which is there to wait on the mutex.
IMHO this is too heavyweight a solution and, moreover, golang does not have a language VM or interpreter...
Not really. *But only if* you think of the VM inside out. All compiled code between preemption points is a *primitive* in this inside-out VM, and the instructions for the VM describe only the pattern of thread switches.
This assumes that it is possible to move user-thread-only goroutines between CPUs (each CPU, as mentioned below, being represented by 2 threads, one for syscalls and one for user code, preemptively scheduled by Genode itself?). So, with such an approach, can I use an arbitrary stack that is not taken from alloc_secondary_stack()? And if I really want to know on which CPU/thread I am running, how can this be done (e.g. for TLS operations)?
With such an approach it seems that we will utilise only one thread, while the main reason for having multiple threads is to run a single thread per CPU, with switchable goroutines, until one of them blocks.
No, the idea is to create 2 OS-level threads per CPU core, from the start of the program, independent of the code. Half of them run goroutines. The other half run the system calls of these goroutines. The needed task switches should be obvious.
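To make the two-threads-per-core idea concrete, here is a minimal sketch of a single such pair using plain pthreads. It is an illustration under assumptions, not the golang runtime's mechanism and not a Genode API: `struct syscall_slot`, `slot_submit`, `slot_poll`, `slow_call` and `run_pair_demo` are hypothetical names, the slot holds only one pending call, the runner polls instead of switching goroutines, and pinning the threads to a core is left out.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stddef.h>
#include <time.h>

typedef long (*blocking_fn)(void *arg);

/* One request slot shared by a runner thread and its syscall thread. */
struct syscall_slot {
    pthread_mutex_t lock;
    pthread_cond_t  wake;    /* wakes the syscall thread */
    blocking_fn     fn;      /* pending call, NULL if none */
    void           *arg;
    long            result;
    int             done;    /* 1 once result is valid */
    int             stop;    /* 1 asks the syscall thread to exit */
};

/* The syscall thread: executes whatever blocking call is handed over. */
static void *syscall_thread(void *p)
{
    struct syscall_slot *s = p;

    pthread_mutex_lock(&s->lock);
    for (;;) {
        while (s->fn == NULL && !s->stop)
            pthread_cond_wait(&s->wake, &s->lock);
        if (s->fn == NULL)
            break;                      /* stop requested, nothing pending */
        blocking_fn fn = s->fn;
        void *arg = s->arg;
        s->fn = NULL;
        pthread_mutex_unlock(&s->lock);
        long r = fn(arg);               /* may block; only this thread waits */
        pthread_mutex_lock(&s->lock);
        s->result = r;
        s->done = 1;
    }
    pthread_mutex_unlock(&s->lock);
    return NULL;
}

/* Runner side: hand a blocking call over to the syscall thread. */
static void slot_submit(struct syscall_slot *s, blocking_fn fn, void *arg)
{
    pthread_mutex_lock(&s->lock);
    s->fn = fn;
    s->arg = arg;
    s->done = 0;
    pthread_cond_signal(&s->wake);
    pthread_mutex_unlock(&s->lock);
}

/* Runner side: check without blocking whether the call has finished. */
static int slot_poll(struct syscall_slot *s, long *result)
{
    pthread_mutex_lock(&s->lock);
    int done = s->done;
    if (done)
        *result = s->result;
    pthread_mutex_unlock(&s->lock);
    return done;
}

/* Hypothetical stand-in for a blocking call such as read() from disk. */
static long slow_call(void *arg)
{
    struct timespec ts = { 0, 10 * 1000 * 1000 };  /* 10 ms */
    nanosleep(&ts, NULL);
    return *(long *)arg;
}

/* The calling thread plays the runner; returns the result of slow_call. */
long run_pair_demo(long value, long *steps)
{
    struct syscall_slot s = { .fn = NULL };
    pthread_t sys;
    long result = -1;

    assert(steps != NULL);
    pthread_mutex_init(&s.lock, NULL);
    pthread_cond_init(&s.wake, NULL);
    if (pthread_create(&sys, NULL, syscall_thread, &s) != 0)
        return -1;

    slot_submit(&s, slow_call, &value);
    *steps = 0;
    do {
        (*steps)++;      /* stands for running other goroutines meanwhile */
        sched_yield();
    } while (!slot_poll(&s, &result));

    pthread_mutex_lock(&s.lock);
    s.stop = 1;
    pthread_cond_signal(&s.wake);
    pthread_mutex_unlock(&s.lock);
    pthread_join(sys, NULL);
    pthread_cond_destroy(&s.wake);
    pthread_mutex_destroy(&s.lock);
    return result;
}
```

The runner thread never enters the blocking call itself, so it stays free to keep scheduling. The sketch has exactly one syscall thread per runner: while `slow_call` is blocked, no second call can be served, and nothing here creates additional threads.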
And what happens if the thread responsible for syscalls on a particular CPU really blocks on a Genode call (e.g. it blocks on a Genode mutex, or waits for a response because the read() syscall has to provide data from disk)? That OS thread can’t continue execution and we need to create another one (as is implemented in the golang runtime)...