Hello Genodians
We have a somewhat complex scenario consisting of a handful of components and (sub-)inits.
The outermost deploy-config starts, among others, the following components:

- vfs
- report_rom and other helper components
- init (drivers runtime)
- init (management runtime)
- init (dynamic runtime, ram/cap saturation)
The runtimes are configured via a deploy-config. Most components and runtimes depend on the file system provided by vfs.
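For reference, the overall structure might look roughly like the following init configuration. This is a minimal sketch only; the component names, cap counts, and RAM quanta are illustrative, not the actual configuration. The last start node shows the saturation idea: its RAM quantum exceeds what is available, so init assigns it all remaining slack.

```xml
<config>
	<!-- file-system server that most other components depend on -->
	<start name="vfs" caps="200">
		<resource name="RAM" quantum="16M"/>
		<provides> <service name="File_system"/> </provides>
	</start>

	<!-- helper component for propagating state reports -->
	<start name="report_rom" caps="100">
		<resource name="RAM" quantum="2M"/>
		<provides> <service name="Report"/> <service name="ROM"/> </provides>
	</start>

	<!-- sub-init whose RAM quantum deliberately exceeds the available
	     amount, thereby soaking up all slack resources (saturation) -->
	<start name="dynamic_runtime" caps="1000">
		<binary name="init"/>
		<resource name="RAM" quantum="1G"/>
	</start>
</config>
```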
When restarting vfs by incrementing its version attribute, some components that depend on vfs no longer receive any resources after being restarted. In the state reports of init, I can see that the RAM quota of some subsystems decreases after the restart of vfs. For some, it increases for a short time but then goes back down again. For most, including the dynamic runtime, it stays constant the whole time.
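For context, the restart described above is triggered by changing the version attribute of the corresponding start node (a sketch; the attribute values shown are illustrative):

```xml
<!-- before -->
<start name="vfs" caps="200" version="1"> ... </start>

<!-- after incrementing the version, init restarts the component -->
<start name="vfs" caps="200" version="2"> ... </start>
```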
I wasn't able to create a simpler example that shows the same error.
Does anybody have an idea in which direction I could investigate?
Best regards, Pirmin
Hello Pirmin,
I think the crux lies in the combination of init's dynamic reconfiguration with resource saturation. There are multiple culprits.
First, you need to consider that once-assigned resources cannot be withdrawn without the consent of the subsystem. During reconfiguration, components can temporarily disappear, which temporarily frees their resources. Now, when using the resource-saturation mechanism, those temporarily freed slack resources end up in this single saturated subsystem. Whether or not you can eventually regain those resources (without killing/restarting the subsystem) depends on the behavior of the subsystem, which is presumably not what you want.
Second, resource peaks can occur during dynamic reconfiguration in the presence of chain reactions like in your case (the restart of the vfs implies the restart of its clients, transitively...). During the orderly destruction of inter-dependent components, init is required to keep parts of disappeared components around to maintain their resource-accounting information until the chain reaction has settled. So it can happen that two instances of the same component exist as an intermediate state - the old one that has just turned into a kind of invisible zombie, and the freshly started one. If the resources are fully saturated, this usage peak cannot be accommodated.
As the bottom line, resource saturation should only be used for static init configurations.
Regarding your specific problem: Since you are consuming init's state reports already, couldn't you avoid the use of resource saturation by using the reported 'avail_ram' and 'avail_caps' to dynamically assign a sensible amount of resources to the runtime subsystem?
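A sketch of this idea, assuming the management component consumes init's state report and generates the runtime's resource assignment from it. The element and attribute names below follow init's state-report format as I understand it, and all values are hypothetical:

```xml
<!-- abbreviated excerpt of init's state report (hypothetical values) -->
<state avail_ram="48M" avail_caps="900"/>

<!-- the management component could then generate a config that assigns
     an explicit amount somewhat below the reported headroom, instead of
     relying on saturation -->
<start name="dynamic_runtime" caps="800">
	<binary name="init"/>
	<resource name="RAM" quantum="40M"/>
</start>
```

This way, slack resources freed during a reconfiguration stay with init rather than being permanently absorbed by the saturated subsystem.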
Cheers Norman
Hello Norman
On 12.05.22 12:22, Norman Feske wrote:
> As the bottom line, resource saturation should only be used for static init configurations.
Many thanks for the in-depth explanation of what is going on when init restarts components.
I wasn't aware that resource saturation should only be used with static init configurations.
> Regarding your specific problem: Since you are consuming init's state reports already, couldn't you avoid the use of resource saturation by using the reported 'avail_ram' and 'avail_caps' to dynamically assign a sensible amount of resources to the runtime subsystem?
This is a good idea. We will probably implement that eventually. For the time being, we will only restart components inside the runtimes and reboot the system when a problem with the vfs arises.
Cheers Pirmin
Hi Pirmin,
> I wasn't aware that resource saturation should only be used with static init configurations.
our documentation actually failed to highlight the subtle interplay between these two features. It will be covered in the next version [1]. :-)
[1] https://github.com/nfeske/genode-manual/commit/4ec4910bdfbc658da0f3c1149974e...
Cheers Norman