Page faults in managed dataspaces

Stefan Kalkowski stefan.kalkowski at ...1...
Tue Sep 27 11:51:46 CEST 2016


Hi Denis,

On 09/27/2016 10:48 AM, Denis Huber wrote:
> Again, thank you Stefan, you are a big help! :)

you are welcome.

> 
> The test program ran successfully. Is there a date when Genode's foc 
> version will be upgraded?
> 

If you ask the team at Genode Labs, the short answer is: no, there is
no fixed date. As you can imagine, there is a lot of work to do, apart
from the kernel component, to make Genode a serious OS alternative to
existing, established systems.
For us at Genode Labs, it is always a question of priorities, balancing
community efforts, paid projects, and our own agenda. With respect to
the kernel component and our own agenda: first and foremost, we
currently use the NOVA hypervisor to run Genode on our own laptops, and
we keep developing our self-written kernel library (ARM, x86, RISC-V)
for core, because it best matches the Genode abstractions, is under our
full control, and is completely understood by us. On the other hand, we
recently ported Genode to the seL4 kernel to attract its community, and
we did the same with the Fiasco.OC kernel in the past. If there is a
lot of usage "pressure" from the community, we will have to rethink our
priorities.

Nevertheless, Genode is an open-source _community_ project, and we
always try to motivate people to contribute. I know that in the past,
people from the Universidad Central "Marta Abreu" de Las Villas in Cuba
used the unofficial update branch of Fiasco.OC too. But sadly, I cannot
find their repository anymore, so I am unsure whether they discontinued
their work and how far they got. Nor do I know whether other people
still use Fiasco.OC with Genode, or whether they might contribute an
updated version soon.
From my perspective, it would be best if the people who use such
components anyway contributed improvements like this back to the
overall project. Otherwise, it is obvious that our small team cannot
provide more and more functionality and simultaneously keep all
third-party projects up to date.

To sum up: I cannot give you a specific date, but I would encourage you
and other people currently using the Fiasco.OC kernel to attempt the
update yourselves (using the unofficial work as a basis) and, ideally,
to contribute the results to mainline Genode. Of course, we at Genode
Labs will always try to help you in such a process. Alternatively, you
can use another kernel, or try to change our priorities in favor of
Fiasco.OC through funding or persuasion ;-).

Best regards
Stefan

> 
> Kind regards,
> Denis
> 
> On 27.09.2016 09:17, Stefan Kalkowski wrote:
>> Hi Denis,
>>
>> On 09/26/2016 04:48 PM, Denis Huber wrote:
>>> Hello Stefan,
>>>
>>> thank you for your help and finding the problem :)
>>>
>>> Can you tell me how I can obtain your unofficial upgrade of foc, and
>>> how I can replace Genode's standard version with it?
>>
>> But be warned: it is unofficial, because I started the upgrade but
>> stopped at some point due to time constraints. That means certain
>> problems we already fixed in the older version might still exist in
>> the upgrade. Moreover, it is almost completely untested. Having said
>> this, you can find it in my repository, in the branch called
>> foc_update. I have rebased it onto the current master branch of
>> Genode.
>>
>> Regards
>> Stefan
>>
>>>
>>>
>>> Kind regards,
>>> Denis
>>>
>>> On 26.09.2016 15:15, Stefan Kalkowski wrote:
>>>> Hi Denis,
>>>>
>>>> I examined the issue further. First, I found out that it is specific
>>>> to Fiasco.OC: if you use another kernel, e.g., NOVA, the same test
>>>> succeeds. So I instrumented the core component to always enter
>>>> Fiasco.OC's kernel debugger when core unmapped the corresponding managed
>>>> dataspace. Looking at the page tables, I could see that the mapping had
>>>> been successfully deleted. After that, I enabled all kinds of logging
>>>> related to page faults and mapping operations. Lo and behold, after
>>>> continuing and seeing that the "target" thread carried on, I re-entered
>>>> the kernel debugger and found that the page-table entry had reappeared,
>>>> although the kernel did not list any activity regarding page faults and
>>>> mappings. To me, this is a clear kernel bug.
>>>>
>>>> I tried out my unofficial upgrade to revision r67 of the Fiasco.OC
>>>> kernel, and with that version it seems to work correctly (I only
>>>> tested a few rounds).
>>>>
>>>> I fear the currently supported version of Fiasco.OC is buggy with
>>>> respect to the unmap call, at least in the way Genode has to use it.
>>>>
>>>> Regards
>>>> Stefan
>>>>
>>>> On 09/26/2016 11:13 AM, Stefan Kalkowski wrote:
>>>>> Hi Denis,
>>>>>
>>>>> I've looked into your code, and what struck me first was that you use
>>>>> two threads in your server which share data
>>>>> (Resource::Client_resources) without synchronization.
>>>>>
>>>>> I've rewritten your example server to use only one thread in a
>>>>> state-machine-like fashion; have a look here:
>>>>>
>>>>>
>>>>> https://github.com/skalk/genode-CheckpointRestore-SharedMemory/commit/d9732dcab331cecdfd4fcc5c8948d9ca23d95e84
>>>>>
>>>>> This way it is thread-safe and simpler (less code), and once you are
>>>>> used to the style, it becomes even easier to understand.
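>>>>>
>>>>> For illustration, here is a minimal sketch of that single-threaded
>>>>> pattern, assuming the pre-16.08 server API from os/server.h; the
>>>>> names (Main, handle_fault) are illustrative and not taken from the
>>>>> actual commit:
>>>>>
>>>>>   #include <os/server.h>
>>>>>   #include <os/signal_rpc_dispatcher.h>
>>>>>   #include <region_map/client.h>
>>>>>
>>>>>   struct Main
>>>>>   {
>>>>>     Genode::Region_map_client managed_ds;
>>>>>
>>>>>     /* fault signals are dispatched by the same entrypoint that
>>>>>        serves RPC requests, so shared state needs no lock */
>>>>>     Genode::Signal_rpc_member<Main> fault_dispatcher;
>>>>>
>>>>>     void handle_fault(unsigned)
>>>>>     {
>>>>>       /* inspect managed_ds.state() and attach the designated
>>>>>          dataspace, as in the fault-handler sketch further below */
>>>>>     }
>>>>>
>>>>>     Main(Server::Entrypoint &ep,
>>>>>          Genode::Capability<Genode::Region_map> cap)
>>>>>     : managed_ds(cap),
>>>>>       fault_dispatcher(ep, *this, &Main::handle_fault)
>>>>>     {
>>>>>       /* route fault signals of the managed dataspace to ourselves */
>>>>>       managed_ds.fault_handler(fault_dispatcher);
>>>>>     }
>>>>>   };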
>>>>>
>>>>> Nevertheless, although the possible synchronization problems are
>>>>> eliminated by design, the problem you describe remains. I'll take a
>>>>> deeper look at our attach/detach implementation for managed
>>>>> dataspaces, but I cannot promise that this will happen soon.
>>>>>
>>>>> Best regards
>>>>> Stefan
>>>>>
>>>>> On 09/26/2016 10:44 AM, Sebastian Sumpf wrote:
>>>>>> Hey Denis,
>>>>>>
>>>>>> On 09/24/2016 06:20 PM, Denis Huber wrote:
>>>>>>> Dear Genode Community,
>>>>>>>
>>>>>>> perhaps the wall of text is a bit discouraging when tackling the
>>>>>>> problem. Let me summarize the important facts of the scenario:
>>>>>>>
>>>>>>> * Two components 'ckpt' and 'target'
>>>>>>> * target shares its main thread's capability with ckpt
>>>>>>> * ckpt shares a managed dataspace with target
>>>>>>>    * this managed dataspace is initially empty
>>>>>>>
>>>>>>> target's behaviour:
>>>>>>> * target periodically reads and writes from/to the managed dataspace
>>>>>>> * target causes page faults (pf) which are handled by ckpt's pf handler
>>>>>>> thread
>>>>>>>    * pf handler attaches a pre-allocated dataspace to the managed
>>>>>>> dataspace and resolves the pf
>>>>>>>
>>>>>>> ckpt's behaviour:
>>>>>>> * ckpt periodically detaches all attached dataspaces from the managed
>>>>>>> dataspace
>>>>>>>
>>>>>>> Outcome:
>>>>>>> After two successful cycles (pf->attach->detach -> pf->attach->detach)
>>>>>>> the target does not cause a pf, but reads and writes to the managed
>>>>>>> dataspace although it is (theoretically) empty.
>>>>>>>
>>>>>>> I used Genode 16.05 with a foc_pbxa9 build. Can somebody help me with
>>>>>>> this? I actually have no idea what the cause could be.
>>>>>>>
>>>>>>>
>>>>>>
>>>>>> You are programming on fairly untested ground here. There might still
>>>>>> be bugs or corner cases in this code path. So someone might have to
>>>>>> look into this (while we are very busy right now). Your problem is
>>>>>> reproducible with [4], right?
>>>>>>
>>>>>> By the way, your way of reporting is exceptional: the more information
>>>>>> and actual test code we have, the better we can debug problems. So
>>>>>> please keep it this way, even though we might not read all of it at
>>>>>> times ;)
>>>>>>
>>>>>> Regards, and if I find the time, I will look into your issue,
>>>>>>
>>>>>> Sebastian
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 19.09.2016 15:01, Denis Huber wrote:
>>>>>>>> Dear Genode Community,
>>>>>>>>
>>>>>>>> I want to implement a mechanism to monitor the access of a component to
>>>>>>>> its address space.
>>>>>>>>
>>>>>>>> My idea is to implement a monitoring component which provides managed
>>>>>>>> dataspaces to a target component. Each managed dataspace has several
>>>>>>>> designated dataspaces (allocated, but not attached, and with a fixed
>>>>>>>> location in the managed dataspace). I want to use several dataspaces to
>>>>>>>> control the access range of the target component.
>>>>>>>>
>>>>>>>> Whenever the target component accesses an address in the managed
>>>>>>>> dataspace, a page fault is triggered, because the managed dataspace has
>>>>>>>> no dataspaces attached to it. The page fault is caught by a custom
>>>>>>>> page fault handler, which attaches the designated dataspace to the
>>>>>>>> faulting managed dataspace and thereby resolves the page fault.
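>>>>>>>>
>>>>>>>> To make the scheme concrete, here is a minimal sketch of such a
>>>>>>>> handler, assuming the region-map API of Genode 16.05; the names
>>>>>>>> (managed_ds, sub_ds, PAGE_SIZE) are illustrative and not taken
>>>>>>>> from the test code below:
>>>>>>>>
>>>>>>>>   #include <base/env.h>
>>>>>>>>   #include <base/signal.h>
>>>>>>>>   #include <rm_session/connection.h>
>>>>>>>>   #include <region_map/client.h>
>>>>>>>>
>>>>>>>>   using namespace Genode;
>>>>>>>>
>>>>>>>>   enum { PAGE_SIZE = 4096, MANAGED_SIZE = 2*PAGE_SIZE };
>>>>>>>>
>>>>>>>>   /* managed dataspace: a region map without any backing store */
>>>>>>>>   static Rm_connection     rm;
>>>>>>>>   static Region_map_client managed_ds(rm.create(MANAGED_SIZE));
>>>>>>>>
>>>>>>>>   /* designated dataspace, allocated but not yet attached */
>>>>>>>>   static Dataspace_capability sub_ds =
>>>>>>>>     env()->ram_session()->alloc(PAGE_SIZE);
>>>>>>>>
>>>>>>>>   void handle_faults()
>>>>>>>>   {
>>>>>>>>     static Signal_receiver sig_rec;
>>>>>>>>     static Signal_context  sig_ctx;
>>>>>>>>
>>>>>>>>     /* faults in managed_ds are reflected to us as signals */
>>>>>>>>     managed_ds.fault_handler(sig_rec.manage(&sig_ctx));
>>>>>>>>
>>>>>>>>     while (true) {
>>>>>>>>       sig_rec.wait_for_signal();
>>>>>>>>
>>>>>>>>       Region_map::State state = managed_ds.state();
>>>>>>>>       if (state.type == Region_map::State::READY)
>>>>>>>>         continue;
>>>>>>>>
>>>>>>>>       /* attaching the designated dataspace at the (page-aligned)
>>>>>>>>          fault address implicitly resolves the fault */
>>>>>>>>       managed_ds.attach_at(sub_ds, state.addr & ~(PAGE_SIZE - 1));
>>>>>>>>     }
>>>>>>>>   }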
>>>>>>>>
>>>>>>>> To test my concept I implemented a prototypical system with a monitoring
>>>>>>>> component (called "ckpt") [1] and a target component [2].
>>>>>>>>
>>>>>>>> [1]
>>>>>>>> https://github.com/702nADOS/genode-CheckpointRestore-SharedMemory/blob/b502ffd962a87a5f9f790808b13554d6568f6d0b/src/test/concept_session_rm/server/main.cc
>>>>>>>> [2]
>>>>>>>> https://github.com/702nADOS/genode-CheckpointRestore-SharedMemory/blob/b502ffd962a87a5f9f790808b13554d6568f6d0b/src/test/concept_session_rm/client/main.cc
>>>>>>>>
>>>>>>>> The monitoring component provides a service [3] through which it
>>>>>>>> receives a Thread capability (used to pause the target component
>>>>>>>> before detaching a dataspace and to resume it afterwards) and hands
>>>>>>>> out a managed dataspace to the client.
>>>>>>>>
>>>>>>>> [3]
>>>>>>>> https://github.com/702nADOS/genode-CheckpointRestore-SharedMemory/tree/b502ffd962a87a5f9f790808b13554d6568f6d0b/include/resource_session
>>>>>>>>
>>>>>>>> The monitoring component runs a main loop which pauses the client's
>>>>>>>> main thread and detaches all attached dataspaces from the managed
>>>>>>>> dataspace. The target component also runs a main loop, which prints
>>>>>>>> (reads) a number from the managed dataspace to the console and
>>>>>>>> increments (writes) it in the managed dataspace.
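>>>>>>>>
>>>>>>>> A rough sketch of one such monitoring iteration, reusing the
>>>>>>>> managed_ds object from the sketch above and assuming the Cpu_thread
>>>>>>>> interface of Genode 16.05 (again illustrative, not the code
>>>>>>>> from [1]):
>>>>>>>>
>>>>>>>>   #include <cpu_thread/client.h>
>>>>>>>>
>>>>>>>>   void checkpoint_iteration(Genode::Thread_capability target)
>>>>>>>>   {
>>>>>>>>     Genode::Cpu_thread_client thread(target);
>>>>>>>>
>>>>>>>>     thread.pause();        /* stop the target while detaching */
>>>>>>>>
>>>>>>>>     /* remove the dataspace attached at offset 0 of the managed
>>>>>>>>        dataspace; the target's next access should fault again */
>>>>>>>>     managed_ds.detach(0);
>>>>>>>>
>>>>>>>>     thread.resume();
>>>>>>>>   }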
>>>>>>>>
>>>>>>>> The run script is found here [4].
>>>>>>>>
>>>>>>>> [4]
>>>>>>>> https://github.com/702nADOS/genode-CheckpointRestore-SharedMemory/blob/b502ffd962a87a5f9f790808b13554d6568f6d0b/run/concept_session_rm.run
>>>>>>>>
>>>>>>>> The scenario works for the first 3 iterations of the monitoring
>>>>>>>> component: every 4 seconds it detaches the dataspaces from the
>>>>>>>> managed dataspace and afterwards resolves the resulting page faults
>>>>>>>> by attaching the dataspaces back. After the 3rd iteration, the target
>>>>>>>> component accesses the theoretically empty managed dataspace but does
>>>>>>>> not trigger a page fault. In fact, it reads and writes to the
>>>>>>>> designated dataspaces as if they were attached.
>>>>>>>>
>>>>>>>> By running the run script I get the following output:
>>>>>>>> [init -> target] Initialization started
>>>>>>>> [init -> target] Requesting session to Resource service
>>>>>>>> [init -> ckpt] Initialization started
>>>>>>>> [init -> ckpt] Creating page fault handler thread
>>>>>>>> [init -> ckpt] Announcing Resource service
>>>>>>>> [init -> target] Sending main thread cap
>>>>>>>> [init -> target] Requesting dataspace cap
>>>>>>>> [init -> target] Attaching dataspace cap
>>>>>>>> [init -> target] Initialization ended
>>>>>>>> [init -> target] Starting main loop
>>>>>>>> Genode::Pager_entrypoint::entry()::<lambda(Genode::Pager_object*)>:Could
>>>>>>>> not resolve pf=6000 ip=10034bc
>>>>>>>> [init -> ckpt] Initialization ended
>>>>>>>> [init -> ckpt] Starting main loop
>>>>>>>> [init -> ckpt] Waiting for page faults
>>>>>>>> [init -> ckpt] Handling page fault: READ_FAULT pf_addr=0x00000000
>>>>>>>> [init -> ckpt]   attached sub_ds0 at address 0x00000000
>>>>>>>> [init -> ckpt] Waiting for page faults
>>>>>>>> [init -> target] 0
>>>>>>>> [init -> target] 1
>>>>>>>> [init -> target] 2
>>>>>>>> [init -> target] 3
>>>>>>>> [init -> ckpt] Iteration #0
>>>>>>>> [init -> ckpt]   valid thread
>>>>>>>> [init -> ckpt]   detaching sub_ds_cap0
>>>>>>>> [init -> ckpt]   sub_ds_cap1 already detached
>>>>>>>> Genode::Pager_entrypoint::entry()::<lambda(Genode::Pager_object*)>:Could
>>>>>>>> not resolve pf=6000 ip=10034bc
>>>>>>>> [init -> ckpt] Handling page fault: READ_FAULT pf_addr=0x00000000
>>>>>>>> [init -> ckpt]   attached sub_ds0 at address 0x00000000
>>>>>>>> [init -> ckpt] Waiting for page faults
>>>>>>>> [init -> target] 4
>>>>>>>> [init -> target] 5
>>>>>>>> [init -> target] 6
>>>>>>>> [init -> target] 7
>>>>>>>> [init -> ckpt] Iteration #1
>>>>>>>> [init -> ckpt]   valid thread
>>>>>>>> [init -> ckpt]   detaching sub_ds_cap0
>>>>>>>> [init -> ckpt]   sub_ds_cap1 already detached
>>>>>>>> [init -> target] 8
>>>>>>>> [init -> target] 9
>>>>>>>> [init -> target] 10
>>>>>>>> [init -> target] 11
>>>>>>>> [init -> ckpt] Iteration #2
>>>>>>>> [init -> ckpt]   valid thread
>>>>>>>> [init -> ckpt]   sub_ds_cap0 already detached
>>>>>>>> [init -> ckpt]   sub_ds_cap1 already detached
>>>>>>>> [init -> target] 12
>>>>>>>> [init -> target] 13
>>>>>>>>
>>>>>>>> As you can see, after "Iteration #1" ended, no page fault was caused,
>>>>>>>> although the target component printed and incremented the integer
>>>>>>>> stored in the managed dataspace.
>>>>>>>>
>>>>>>>> Could it be that the detach method was not executed correctly?
>>>>>>>>
>>>>>>>>
>>>>>>>> Kind regards
>>>>>>>> Denis

-- 
Stefan Kalkowski
Genode Labs

https://github.com/skalk · http://genode.org/



