Re: Re: How to switch thread stack between threads?

Uwe geno.de at public-files.de
Wed Jul 21 14:49:52 CEST 2021



> Sent: Monday, 19 July 2021 at 22:00
> From: "Alexander Tormasov via users" <users at lists.genode.org>
> To: "Genode users mailing list" <users at lists.genode.org>
> Cc: "Alexander Tormasov" <a.tormasov at innopolis.ru>
> Subject: Re: How to switch thread stack between threads?
>
> >> The pthread model below the golang runtime, which is based on the genode port of libc, is a bit different (or at least emulates a different model).
> >> It assumes a common space for created threads (each mapped, correct me if I am wrong, to a single OS thread in 1<->1 mode) and, at least, a common memory space and some subset of capabilities related to os resources, shared between threads. E.g. I assume that a file opened in one thread can be used in another thread; therefore the related capability is common between them, since it is created from a single session.
> > I think that works with TLS. At the user level. Another indirection on top of the native numbers. I think
> > in the POSIX Library.
> 
> This is sufficient for my purpose. I do not use direct services except memory manipulation. The only problem with TLS is that in the current model it is bound to the stack virtual address, while I need it to be bound to my threads...
> 
> >> Talking about golang model:
> >> they use a common space for memory and descriptors and utilise as few os threads as possible.
> >> typically the number of os threads equals the number of processor cores, and it grows only when a thread is blocked by a syscall
> >> (in such a case it is stopped and a new thread is taken from the idle list or created from scratch).
> > I did read your link. And if you disable the global queue it may be possible that go never executes code that
> > is invalid on genode although it is included. And therefore not reliable. And that could be an issue with the
> > philosophy of genode.
> 
> I don’t think that this will work; the global queue is a kind of natural load balancer and a way to continue execution if one cpu and the related os thread block…
> as a palliative solution we can have per-cpu «global» queues. Not sure that it will work without a significant scheduler code update
> 
Yes, I understood that. But this load balancing doesn't work on Genode. And if you could run your program
at all, although not efficiently, would you choose that?
> >> 
> >> Talking about the optimal model in genode, I think, if we can group a set of threads in the same way as we do with the pthread model (making a kind of «domain of threads with shared resources» as a derivative of the main one), then we can migrate code inside this «domain».
> > This «domain» is the group of user level threads pinned to one os level thread.
> 
> if this is as mentioned above in relation to the POSIX subsystem emulation, then it will work even for different ones; all capabilities are hidden inside its implementation
> I suppose that on genode with posix thread emulation I could write the following code:
> 0. run the main os thread in the standard way for libc+posix
> 1. create a mutex/blockade/any sync primitive and store it in the main thread - os thread 0
> 2. create pthread 1 with func1 (where an os thread will be below posix thread 1)
> 3. in func1 set the primitive as busy
> 4. open a file and store the file descriptor in common memory of main thread 0, and set the sync primitive free
> 5. in the main thread create pthread 2 with func2, where it will be another os thread 2
> 6. in func2 wait for the sync primitive set in pthread 1, and after it becomes free read data from the descriptor from 4 above
> 
> this code should work! but below the fd there should be a unique file descriptor mapped somehow to a capability, and it is created in thread 1 and used in thread 2: «semi migrated» or shared (both will be os threads).
> No need to interpret the fd or do anything else (like with memory availability) - everything is already done in posix+libc on top of genode!
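The six steps quoted above can be sketched in plain POSIX C. This is only a minimal sketch, not Genode-specific code: it assumes nothing beyond pthreads and open()/read() from the libc port. func1 and func2 follow the names in the steps; shared_fd, published, struct reader_result and share_fd_between_threads() are made up for illustration.

```c
#include <assert.h>
#include <fcntl.h>
#include <pthread.h>
#include <sys/types.h>
#include <unistd.h>

/* Steps 1 and 4: the sync primitive and the descriptor live in memory
 * that is shared by all threads of the process. */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  fd_published = PTHREAD_COND_INITIALIZER;
static int published = 0;   /* 0 = primitive "busy", 1 = "free" */
static int shared_fd = -1;

struct reader_result {
    char    buf[64];
    ssize_t len;            /* bytes read by func2, or -1 on error */
};

/* Steps 3 and 4: pthread 1 opens the file and publishes the descriptor. */
static void *func1(void *path)
{
    int fd = open((const char *)path, O_RDONLY);
    pthread_mutex_lock(&lock);
    shared_fd = fd;
    published = 1;
    pthread_cond_signal(&fd_published);
    pthread_mutex_unlock(&lock);
    return NULL;
}

/* Step 6: pthread 2 waits for the descriptor, then reads through it. */
static void *func2(void *arg)
{
    struct reader_result *res = arg;
    pthread_mutex_lock(&lock);
    while (!published)
        pthread_cond_wait(&fd_published, &lock);
    int fd = shared_fd;
    pthread_mutex_unlock(&lock);

    res->len = (fd >= 0) ? read(fd, res->buf, sizeof res->buf - 1) : -1;
    res->buf[res->len > 0 ? res->len : 0] = '\0';
    return NULL;
}

/* Steps 0, 2 and 5: the main thread creates both pthreads and joins them. */
ssize_t share_fd_between_threads(const char *path, struct reader_result *res)
{
    pthread_t t1, t2;
    int created2;

    assert(path != NULL && res != NULL);
    published = 0;
    shared_fd = -1;
    res->len = -1;
    if (pthread_create(&t1, NULL, func1, (void *)path) != 0)
        return -1;
    created2 = (pthread_create(&t2, NULL, func2, res) == 0);
    pthread_join(t1, NULL);
    if (created2)
        pthread_join(t2, NULL);
    if (shared_fd >= 0)
        close(shared_fd);   /* opened in thread 1, closed in the main thread */
    return res->len;
}
```

Calling share_fd_between_threads("some_file", &res) from the main thread returns the number of bytes that func2 read through the descriptor func1 opened. The descriptor is just an integer in shared memory; whatever capability sits below it stays hidden inside libc.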
> 
> I want to have the same for golang, IMHO no reasons to make it much more complex… Not everything will work - anyway, basics could be enough as a first step.
> 
> >> 
> >> in 4 above I should not switch to original os thread 1 - it is busy for some other goroutines...
> > No, you *can* switch to the original os level thread *but* not to the original user level thread in the
> > os level thread or rather you first message the running original thread to do a user level switch to a
> > dedicated user level thread which is there to wait on the mutex.
> >> 
> >> 
> >> imho this is too heavyweight a solution, and, moreover, golang does not have a language VM or interpreter...
> > Not really. *But only if* you think the VM inside-out. All compiled code between preemption points is
> > a *primitive* in this inside-out VM. And the instructions for the VM describe only the pattern for thread
> > switches.
> 
> 
> this assumes the possibility to move user-thread-only goroutines between CPUs (as mentioned below, represented by 2 threads: one for syscalls and one for user code, preemptively scheduled by genode itself?).
> so, can I use an arbitrary stack for such an approach, one not taken from alloc_secondary_stack()?
No, the thread stack must come from alloc_secondary_stack()! But nothing compels you to use the thread stack
for storing local state or the return address. At function entry you can do something like

type some_func(void) {
    typedef struct locals {
        ...;
    } local;
    local *l = malloc(sizeof(local));    /* locals live on the heap, not on the stack */
    frame *f = save_return_to_heap(&l);  /* move the return address off the stack */
    ...
}

And on return:

    restore_return(f); return (...);

This keeps all data on the heap while the function is running and restores the stack on return.
The functions restore_return() and save_return_to_heap() have to be written in assembler.
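A portable approximation of the same idea, without any assembler, is to split the function into steps and keep everything it needs between steps in a heap frame. This is not save_return_to_heap()/restore_return(); it is only a sketch of "state on the heap, nothing on the thread stack between steps". All names here (sum_frame, sum_begin, sum_step, sum_end) are made up for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical heap frame: everything the function needs across a switch. */
struct sum_frame {
    int  next;    /* next value to add */
    int  limit;   /* last value to add */
    long total;   /* running sum */
};

/* "Function entry": allocate the locals on the heap. */
struct sum_frame *sum_begin(int limit)
{
    struct sum_frame *f = malloc(sizeof *f);
    if (f != NULL) {
        f->next  = 1;
        f->limit = limit;
        f->total = 0;
    }
    return f;
}

/* Run at most `budget` iterations, then return to the caller. Nothing of
 * this function remains on the stack afterwards, so the next step may run
 * on a different stack. Returns 1 once the sum is complete, 0 otherwise. */
int sum_step(struct sum_frame *f, int budget)
{
    assert(f != NULL);
    while (budget-- > 0 && f->next <= f->limit)
        f->total += f->next++;
    return f->next > f->limit;
}

/* "Function return": hand back the result and release the heap frame. */
long sum_end(struct sum_frame *f)
{
    long total;
    assert(f != NULL);
    total = f->total;
    free(f);
    return total;
}
```

A caller would loop on sum_step() until it returns 1 and then call sum_end(). The price compared to the assembler variant is that the function has to be written as explicit steps.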
> and, if I really do want to know on which cpu/thread I run - how can this be done (e.g. for TLS operations)?
> 
> >> 
> >> with such an approach it seems that we will utilise only one thread, while the main reason for the existence of multiple threads is to run a single thread per cpu, with switchable goroutines until any of them is blocked.
> >> 
> > No, the idea is to create 2 os level threads per CPU core. From the start of the program. Independent of code.
> > Half of them run goroutines. The other half run the system calls of these goroutines. The needed task switches
> > should be obvious.
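The hand-over between one goroutine-running thread and its syscall thread can be sketched with pthreads as a one-slot mailbox. This is a minimal sketch under assumptions: syscall_box, box_start(), box_write() and box_stop() are made-up names, only write() is forwarded, and the runner simply blocks on a condition variable where a real runtime would switch to another goroutine.

```c
#include <assert.h>
#include <pthread.h>
#include <sys/types.h>
#include <unistd.h>

enum box_state { BOX_IDLE, BOX_REQUEST, BOX_RESULT, BOX_SHUTDOWN };

/* Hypothetical one-slot mailbox between a runner thread and its syscall thread. */
struct syscall_box {
    pthread_mutex_t lock;
    pthread_cond_t  changed;
    enum box_state  state;
    int             fd;      /* request: write(fd, buf, len) */
    const void     *buf;
    size_t          len;
    ssize_t         result;
};

/* Body of the syscall thread: perform posted requests until shut down. */
static void *syscall_thread(void *arg)
{
    struct syscall_box *box = arg;
    pthread_mutex_lock(&box->lock);
    for (;;) {
        while (box->state != BOX_REQUEST && box->state != BOX_SHUTDOWN)
            pthread_cond_wait(&box->changed, &box->lock);
        if (box->state == BOX_SHUTDOWN)
            break;
        box->result = write(box->fd, box->buf, box->len);
        box->state = BOX_RESULT;
        pthread_cond_broadcast(&box->changed);
    }
    pthread_mutex_unlock(&box->lock);
    return NULL;
}

/* Create the syscall thread; returns 0 on success like pthread_create(). */
int box_start(struct syscall_box *box, pthread_t *tid)
{
    assert(box != NULL && tid != NULL);
    pthread_mutex_init(&box->lock, NULL);
    pthread_cond_init(&box->changed, NULL);
    box->state = BOX_IDLE;
    return pthread_create(tid, NULL, syscall_thread, box);
}

/* Called on the runner thread: hand the write() over and wait for the result. */
ssize_t box_write(struct syscall_box *box, int fd, const void *buf, size_t len)
{
    ssize_t result;
    pthread_mutex_lock(&box->lock);
    box->fd = fd;
    box->buf = buf;
    box->len = len;
    box->state = BOX_REQUEST;
    pthread_cond_broadcast(&box->changed);
    while (box->state != BOX_RESULT)
        pthread_cond_wait(&box->changed, &box->lock);
    result = box->result;
    box->state = BOX_IDLE;
    pthread_mutex_unlock(&box->lock);
    return result;
}

/* Ask the syscall thread to exit and wait for it. */
void box_stop(struct syscall_box *box, pthread_t tid)
{
    pthread_mutex_lock(&box->lock);
    box->state = BOX_SHUTDOWN;
    pthread_cond_broadcast(&box->changed);
    pthread_mutex_unlock(&box->lock);
    pthread_join(tid, NULL);
}
```

One such pair per CPU core, created at program start, would give the 2 threads per core mentioned above.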
> 
> and what happens if the thread responsible for syscalls on a particular CPU is really blocked by a genode call (e.g. it blocks on a genode mutex, or waits for a response because syscall read() should provide data from disk)? The os thread can’t continue execution and we need to create another one (as implemented in the golang runtime)...
Only 2 syscalls make a thread really wait; all others can be emulated by these 2. They are select() and
wait(). (I mean the ones expected by Go, not the ones provided by Genode.) Both allow their parameters to be
modified so that they query instead of wait. They can store their parameters in a list and use a hook in
the preemption points of the Go language. This hook yields to the syscall thread, which yields back
immediately if the parameter list is empty. If the list is not empty, the parameters are restored from the
list and the system is queried. On failure, the syscall thread yields back, as in the case of an empty list.
On success, the waiting thread (stored in the parameter list) is unblocked, with the guarantee that the
intended syscall won't block. Whenever that thread runs, it yields to the syscall thread with the intended
syscall as a message; the syscall is performed without undue delay and the results are yielded back.
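The "query instead of wait" part can be sketched with POSIX select() and a zero timeout. This is a minimal sketch, assuming the parameter list only holds descriptors that some user-level thread wants to read from. struct pending_read, park_reader() and poll_pending() are made-up names, and "unblocking" is reduced to setting a flag.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical entry in the parameter list: one waiting user-level thread. */
struct pending_read {
    int fd;                      /* descriptor the thread wants to read from */
    int ready;                   /* 1 once read() is guaranteed not to block */
    struct pending_read *next;
};

static struct pending_read *pending_list = NULL;

/* Called instead of a blocking wait: store the parameters and return. */
void park_reader(struct pending_read *p, int fd)
{
    assert(p != NULL);
    p->fd = fd;
    p->ready = 0;
    p->next = pending_list;
    pending_list = p;
}

/* Hook for a preemption point: query every parked descriptor without
 * waiting. Returns the number of waiters that were unblocked. */
int poll_pending(void)
{
    int unblocked = 0;
    struct pending_read **link = &pending_list;

    while (*link != NULL) {
        struct pending_read *p = *link;
        fd_set readable;
        struct timeval no_wait = {0, 0};   /* zero timeout: query, don't wait */

        FD_ZERO(&readable);
        FD_SET(p->fd, &readable);
        if (select(p->fd + 1, &readable, NULL, NULL, &no_wait) > 0) {
            p->ready = 1;        /* stands in for unblocking the thread */
            *link = p->next;     /* remove the entry from the list */
            unblocked++;
        } else {
            link = &p->next;     /* not ready yet, keep it parked */
        }
    }
    return unblocked;
}
```

With an empty list poll_pending() returns 0 at once, which corresponds to the syscall thread yielding straight back. A waiter whose ready flag is set can then issue its read() knowing it will not block.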
> 
> 


