Why application request RAM quota from parent even if it is enough?

Alexander Tormasov a.tormasov at innopolis.ru
Fri Sep 6 14:03:53 CEST 2019


Hi Norman,
See below.
In short, I understand your ideas, though I have some (I wouldn’t say objections, rather suggestions) based on my own experience in industry and academia, including software porting.


Yes, but why does this call (waiting for a response from the parent) just hang forever instead of returning and saying “I can’t”?

even though it might be confusing, this is intentional. Let me try to
give the rationale.

I'm probably over-generalizing but in Genode, child components are
supposed to act according to the parent's wishes. They receive their
configuration and a budget of resources from their parent and act
accordingly. They should not try to be clever, or probe for resources,
or implement fall-backs. That wouldn't be in the interest of the parent.

Here I see a contradiction: in our case (and probably in most cases) we write the child process and its software, not the parent. I don’t even know who my parent is in that case. On a normal Unix/Windows/etc. system, I know it could be some kind of auxiliary process provided by the OS, over which I have no power.
So maybe it is worth distinguishing between «children of my own application, intentionally distributed inside the OS (like Genode) or over the network» and a single application running as a child of an externally provided (not mine!) parent?

Now if a child is tasked (by its parent) to do a particular job but it
has not received enough resources to accomplish it, it _must_ escalate
this situation to its parent because the parent is generally the only
one who can resolve this situation (e.g., by upgrading the child's
resources, or changing plans). If a child silently dealt with an ENOMEM
situation without telling the parent, it would deprive the parent of
the chance to take a conscious policy decision.

This is all true if it is YOUR parent. But what about a more generic, «portable» application, written on the assumption that it can work in any environment and that has no control over its parent at all…
Technically, what you suggest is to drop all code related to «self-adaptation» from such applications.
For most of them this is practically impossible; as you mention, the classic examples are the Java and even the Go runtime.
These runtimes are like an «OS» themselves and are intended to work in any environment.

And it is my firm belief that no application should «just hang» under any circumstances, at least in a production environment.
Even for research/educational systems, I can’t understand why, if I configure the system wrongly, it just shows a black screen with no reasonable way to find the cause (I can’t attach a debugger because it is not supported on my configuration, and I don’t even know the stack of the hanging application/ROM module/etc., as happened to me).

A solution could be, IMHO:
1. Separate the 2 cases: behavior with your own parent and behavior with an external parent.
2. Implement a common mechanism for evaluating the status of the parent and of myself. You have everything available for that, e.g. a dedicated channel for sending requests about the status of the last interactions, etc.
Maybe even add a kind of «back signal» like «as a parent, I can’t fulfil your request, so you can commit suicide or do something else, rather than hang in an indefinite wait».
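
To make the «back signal» idea a bit more concrete, here is a minimal sketch of the decision logic on the child side. It is plain C++ and not Genode code: Quota_reply, Action, next_action and describe are hypothetical names invented for this illustration, and the bounded number of polls is my assumption about how an indefinite wait could be avoided.

```cpp
// All names below are hypothetical; none of them is an existing Genode interface.

// What the parent has answered so far to a resource request.
enum class Quota_reply { PENDING, GRANTED, DENIED };

// What the child does next.
enum class Action { KEEP_WAITING, CONTINUE, REPORT_AND_EXIT };

// Decide the child's next step from the parent's reply.
// 'polls' counts how often the child has already checked for a reply,
// 'max_polls' bounds the wait so that PENDING cannot last forever.
inline Action next_action(Quota_reply reply, unsigned polls, unsigned max_polls)
{
    switch (reply) {
    case Quota_reply::GRANTED:
        return Action::CONTINUE;
    case Quota_reply::DENIED:
        return Action::REPORT_AND_EXIT;
    case Quota_reply::PENDING:
        return polls < max_polls ? Action::KEEP_WAITING
                                 : Action::REPORT_AND_EXIT;
    }
    return Action::REPORT_AND_EXIT; // not reached, keeps compilers quiet
}

// Short diagnostic text a child could print before acting.
inline const char *describe(Action action)
{
    switch (action) {
    case Action::KEEP_WAITING:    return "waiting for the parent's reply";
    case Action::CONTINUE:        return "request granted, continuing";
    case Action::REPORT_AND_EXIT: return "request not fulfilled, giving up";
    }
    return "unknown";
}
```

With something like this, an explicit DENIED from the parent, or a reply that never arrives, ends in a visible diagnostic instead of a silent hang.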


What do we gain from this rigid approach? Simplicity of the components,
robustness, and deterministic behavior.

*Simplicity* because the components are relieved from handling error
cases, which are - in practice - never thoroughly tested anyway. How
many applications are there in the wild who actually handle ENOMEM? How
many of those few handle it in a reasonable and deterministic way other
than panic? Returning ENOMEM is not a solution because in practice, such
rare conditions are not handled. Worse, in traditional software, you
find a lot of error-handling code that sits there but is almost never
executed or stressed. Such code is the perfect hiding place for
vulnerabilities. By relieving software from this burden, such
low-quality code can be omitted.

*Robustness* follows from simplicity and the child being transparent
about its problems instead of swallowing them up and failing later with
much less obvious symptoms. Failing early is good.

*Deterministic behavior* is reinforced because the child does not
implement an opaque policy on its own. In the few cases where an
application intends to adapt itself to its resource budgets, Genode
allows a component to request its available budgets
(Env.pd().avail_ram()) but this is a special case.


I agree with most of the arguments. However, as I mentioned above, in the real world, from an engineering point of view, we need robust behavior (not just a robust system), even for legacy applications. We are not always in a position to change old software, so it is better to find some kind of wrappers that do not compromise existing systems, without «indefinite hangs deep inside the code due to violation of invisible and non-interpretable rules».
I think we can find a reasonable compromise by adding some agreements, a way to report problems, a way to debug such situations, etc.

E.g. the mentioned Env.pd().avail_ram() could be used inside the allocation mechanism to report ENOMEM immediately «in the application», without escalation to the parent.
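
A minimal sketch of what I mean, in plain C++ so that it compiles outside Genode. Ram_budget and alloc_within_budget are hypothetical names; the avail field only stands in for the number that Env.pd().avail_ram() would report, and I am assuming that a single up-front comparison is good enough for illustration.

```cpp
#include <cerrno>
#include <cstddef>
#include <cstdlib>

// Hypothetical stand-in for the budget query. On Genode the number would
// come from Env.pd().avail_ram(); that call is not modelled here.
struct Ram_budget
{
    std::size_t avail; // bytes still available to this component
};

// Allocate 'size' bytes only if the remaining budget covers the request.
// On shortage, set errno to ENOMEM and return nullptr right away instead
// of asking the parent and waiting for an answer.
inline void *alloc_within_budget(Ram_budget &budget, std::size_t size)
{
    if (size == 0 || size > budget.avail) {
        errno = ENOMEM;
        return nullptr;
    }

    void *ptr = std::malloc(size);
    if (!ptr) {
        errno = ENOMEM;
        return nullptr;
    }

    budget.avail -= size; // account for the memory handed out
    return ptr;
}
```

A legacy application or a runtime could then treat the nullptr result like any other failed allocation, using the error path it already has.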

---

Regarding your practical problem with the Go runtime, the virtual-memory
reservation scheme you encountered is fundamentally at odds with Genode
because anonymous memory is always backed by physical memory, not a
zero-page mapping and a copy-on-write mechanism as in Linux. This is not
a limitation but in line with the philosophy outlined above. A
reservation of 500 GiB of memory is in fact an announcement by the
application that it may potentially _use_ this memory. The Linux kernel
says: fine. So the application proceeds. In the event it actually uses
all this memory, the application or even the entire system will suffer
in indeterministic and complex ways. Genode does not give such promises.
It also does not deny them. Instead, it tells the parent about it so the
parent can in principle resolve it. In your case, the parent simply

Who is this parent that resolves the problem in the case of the Go environment, for ANY Golang application? Why would it communicate with code that is not mine (but runs inside MY application)? I think this is the fundamental question in such an approach.

prints a message and keeps the child blocking infinitely.

This is completely wrong: we should not block anything without reasonable diagnostics that can help people (not the compiler/application/etc.) resolve the situation.
Does Genode on top of a microkernel have any tool to debug such an intentional black-screen situation?
No one should be penalized for incorrect code in such a way, IMHO, because this is not a constructive approach and does not lead to a resolution of the problem.

In your position, I'd try to investigate ways around the
reservation-based virtual memory management. We had to overcome similar
problems in the past, e.g., I think for the Java runtime. Unfortunately,
this usually requires one to dig deep into the runtime. If you get

I was stuck for some time; now I have found some ways around it, like asking to reserve only 1 GB of RAM and trying to run everything within it, which is somehow supported for the mipsle platform.
Now I am stuck again on setting up signal processing…
All of this requires effort to fix runtime code that does some nasty things even before the main program starts.
The next potential problems I expect are in stack switching (used by Go coroutines) and maybe in stack growth (I have temporarily disabled the common gcc mechanism from morestack.c).
I hope to have a hello-world application in Go running soon.
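
For readers who have not met the reservation scheme discussed above, here is a minimal sketch of the general pattern as it looks on Linux. It is not the Go runtime's actual code: the three function names are made up for this example, and the choice of mmap flags is my assumption about a typical way to express «reserve now, use later».

```cpp
#include <cstddef>
#include <sys/mman.h>

// Reserve 'size' bytes of address space without making them usable yet.
// PROT_NONE forbids any access, MAP_NORESERVE asks the kernel not to set
// aside swap for the range. Returns nullptr on failure.
inline void *reserve_address_space(std::size_t size)
{
    void *addr = mmap(nullptr, size, PROT_NONE,
                      MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    return addr == MAP_FAILED ? nullptr : addr;
}

// Make a page-aligned sub-range of the reservation readable and writable.
// Linux backs the pages with physical memory lazily, on first access.
inline bool commit_range(void *addr, std::size_t size)
{
    return mprotect(addr, size, PROT_READ | PROT_WRITE) == 0;
}

// Give the whole reservation back to the kernel.
inline void release_address_space(void *addr, std::size_t size)
{
    munmap(addr, size);
}
```

On Linux the reservation itself costs almost nothing, which is why a runtime can ask for a very large range up front. On a system where anonymous memory is always backed by physical memory, the same request has to be paid for in full, which is where the conflict comes from.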


