Hi, I tried to boot Genode scenarios from my bootable flash stick, which uses GRUB 1 "Legacy". I tried to boot Genode/NOVA with bender under GRUB Legacy (actually, GRUB4DOS). I have the following entries in menu.lst:
title Sculpt scenario (Genode/Nova)
kernel /genode/bender
module /genode/hypervisor hypervisor iommu serial novpid novga
module /genode/sculpt/image.elf image.elf
This scenario just hangs when booting. Does this work with GRUB2 only?
In GRUB2, I see it loading the "multiboot2.mod" module. Is this a new version of the Multiboot protocol? Could bender work with the older version of the Multiboot protocol? Or is it possible to use Genode/NOVA without bender somehow with GRUB1?
Also, is it possible for the Genode build system to avoid assembling all modules into image.elf, so that I could load them separately as multiple modules? (At least, older Genode versions did so.)
Thanks in advance!
WBR,
valery
I installed GRUB2 on another flash stick (with FAT32) and am trying to boot the same setup as above:
set timeout=0
set gfxpayload="0x0x32"
menuentry 'Genode on NOVA' {
    insmod multiboot2
    insmod gzio
    insmod part_msdos
    insmod fat
    multiboot2 /boot/bender
    module2 /boot/hypervisor hypervisor iommu serial novpid novga
    module2 /genode/sculpt/image.elf.gz image.elf
}
I see NOVA hanging. If I remove "novga" from the NOVA command line, it prints info about the CPUs (Core2Duo, two cores) and hangs. Removing "serial" results in a reboot. Maybe the problem is with the serial port (the machine is an Asus laptop and has no COM ports). Or is the problem with IOMMU support? Core2Duo has no IOMMU support, AFAIK, but removing "iommu" does not help. Any ideas?
WBR, valery
P.S. How do I get GRUB2 to show loading progress and avoid clearing the screen? GRUB1 showed progress, but version 2 seems to be "silent" by default.
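One way to get progress output is to print a message before each load step, so a hang can be attributed to a specific file. This is only a sketch based on the entry above; `echo` is a standard GRUB2 command, and the entry title is made up:

```
# grub.cfg fragment (sketch): announce each load step
menuentry 'Genode on NOVA (verbose)' {
    insmod multiboot2
    insmod gzio
    echo 'loading bender ...'
    multiboot2 /boot/bender
    echo 'loading hypervisor ...'
    module2 /boot/hypervisor hypervisor iommu serial novpid novga
    echo 'loading image.elf.gz ...'
    module2 /genode/sculpt/image.elf.gz image.elf
    echo 'booting ...'
}
```

Whether GRUB2 clears the screen on boot depends on the terminal and gfxpayload settings; the fragment above only addresses the missing progress output.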
On 20.02.2018 01:24, Valery V. Sedletski via genode-main wrote:
Hi, I tried to boot Genode scenarios from my bootable flash stick, which uses GRUB 1 "Legacy". I tried to boot Genode/NOVA with bender under GRUB Legacy (actually, GRUB4DOS). I have the following entries in menu.lst:
title Sculpt scenario (Genode/Nova)
kernel /genode/bender
module /genode/hypervisor hypervisor iommu serial novpid novga
module /genode/sculpt/image.elf image.elf
This scenario just hangs when booting. Does this work with GRUB2 only?
Ok, it appears that it does work with GRUB1. It is NOVA that hangs, for some reason; GRUB itself works. I thought GRUB was at fault, but no. If I remove "novga", it shows the CPU list and hangs. I tried to boot NOVA on three machines and it hangs on all three. What could be the reason?
In GRUB2, I see it loading the "multiboot2.mod" module. Is this a new version of the Multiboot protocol? Could bender work with the older version of the Multiboot protocol? Or is it possible to use Genode/NOVA without bender somehow with GRUB1?
Ok, no need. As I said above, GRUB1 works, so the Multiboot 1 protocol should work.
Also, is it possible for the Genode build system to avoid assembling all modules into image.elf, so that I could load them separately as multiple modules? (At least, older Genode versions did so.)
Ok, I commented
# exec rm -rf [run_dir]/genode
out in genode/tool/run/boot_dir/nova
and now it does not remove the "genode" subdirectory containing all the binaries after "image.elf.gz" creation. But I see it creating a "core.o" object file instead of a "core" binary. How can I convert "core.o" to "core"? I tried to link it like this:
/usr/local/genode-gcc/bin/genode-x86-gcc -o core core.o
but it complains about a duplicate symbol, "__dso_handle", which has already been defined in crtbegin.o. Also, it tries to link with -lc and fails. What is the correct way to link the "core" binary?
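The symptoms (duplicate __dso_handle from crtbegin.o, an unwanted -lc) are what one sees when the default startup files and libc get pulled into a freestanding binary. A sketch of the direction that usually works: link with -nostdlib and pass the linker script that defines core's layout. The script path below is an assumption, not taken from this thread - check what the build system actually passes (for instance by building with VERBOSE= so the link command is printed):

```
# sketch: freestanding link without default startup files and libc;
# the linker-script path is hypothetical - take it from the real link command
/usr/local/genode-gcc/bin/genode-x86-gcc -nostdlib \
    -Wl,-T,<genode-dir>/repos/base/src/ld/genode.ld \
    -o core core.o
```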
WBR, valery
On 21.02.2018 22:43, Valery V. Sedletski via genode-main wrote:
Also, is it possible for the Genode build system to avoid assembling all modules into image.elf, so that I could load them separately as multiple modules? (At least, older Genode versions did so.)
Ok, I commented
# exec rm -rf [run_dir]/genode
out in genode/tool/run/boot_dir/nova
and now it does not remove the "genode" subdirectory containing all the binaries after "image.elf.gz" creation. But I see it creating a "core.o" object file instead of a "core" binary. How can I convert "core.o" to "core"? I tried to link it like this:
/usr/local/genode-gcc/bin/genode-x86-gcc -o core core.o
but it complains about a duplicate symbol, "__dso_handle", which has already been defined in crtbegin.o. Also, it tries to link with -lc and fails. What is the correct way to link the "core" binary?
Now I found the
build_core [run_dir]/genode/$core_obj $modules [run_dir]/image.elf [core_link_address]
string in genode/tool/run/run. I tried to change it to
build_core [run_dir]/genode/$core_obj [run_dir]/core [core_link_address]
but it does not like an empty $modules list. How can I create a core image with an empty modules list?
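For illustration, the generated boot-modules assembler file essentially declares begin/end symbols around a list of module headers and contents, so an "empty" list would be just the markers with nothing in between. The symbol names below are what I would expect from such a file and are an assumption - compare them against a file the run tool actually generates:

```
/* boot_modules.s sketch: an empty boot-module list.
   Symbol names are an assumption; verify them against a
   generated boot_modules file before use. */
.section .data
.global _boot_modules_headers_begin
_boot_modules_headers_begin:
.global _boot_modules_headers_end
_boot_modules_headers_end:
.global _boot_modules_binaries_begin
_boot_modules_binaries_begin:
.global _boot_modules_binaries_end
_boot_modules_binaries_end:
```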
PS: A feature request for the Genode build system: maybe it would be good to add options to etc/build.conf to disable removing the "genode" subdirectory containing all the binaries, and optionally to generate "core" together with "image.elf", i.e., a core image without embedded modules. That would be more convenient when a developer wants to copy modules to some GRUB installation manually. One could then run the run script to generate both the "image.elf" and the separate binaries, which allows one to 1) deploy the scenario on a test machine automatically, and 2) copy the scenario manually into a GRUB config file. For manual copying, separate binaries are more convenient: they are not duplicated in each scenario's image.elf (which takes much disk space), and the same binaries can be reused in many scenarios. Also, with all modules built into "image.elf", it is not very comfortable to edit the "config" files without regenerating the "image.elf".
WBR, valery
Hello Valery,
I have the impression that your undertaking goes very much against the grain of our regular work flow. Frankly speaking, I miss the point of bypassing our tooling and instead manually working with boot modules. We deliberately moved away from relying on the boot loader to load individual ROM modules, for a number of reasons:
1. Not all kernels provide a way for the roottask (Genode's core) to access individual boot modules. In particular, seL4 and OKL4 do not.
2. On ARM, there is no such concept. Boot loaders on ARM load a kernel image only.
3. For the reasons above, Genode has to provide a way to include the initial ROM modules in the boot image. Using this mechanism across all base platforms reduces Genode's complexity and ensures that the solution is well tested. In contrast, the previously used kernel-specific code was more fragile.
4. We hit limitations of several multi-boot loaders. E.g., I am thinking of GRUB's maximum number of boot modules. With iPXE, we hit different surprises, such as a slightly different convention for naming boot modules. By using one unified mechanism, we rule out these sources of trouble.
5. Shuffling boot modules manually is bug-prone. Letting a run script generate one image that contains all needed ingredients in their current versions eliminates the chance of inconsistencies between the modules.
In short, we went into the deep end, experienced the limits of the multi-boot approach, and decided for a more robust approach. Our tooling reflects that. It makes it arguably difficult to edit init's configuration on the fly - as you noted - and requires you to execute the run script after each change. You present this as a limitation, but I regard the approach of mutating the boot image manually as misguided. It is not only bug-prone but also escapes the versioning of the individual modifications. In contrast, if you embrace working with run scripts, you can always reproduce your scenario and naturally track modifications using Git.
PS: A feature request for the Genode build system: maybe it would be good to add options to etc/build.conf to disable removing the "genode" subdirectory containing all the binaries, and optionally to generate "core" together with "image.elf", i.e., a core image without embedded modules. That would be more convenient when a developer wants to copy modules to some GRUB installation manually. One could then run the run script to generate both the "image.elf" and the separate binaries, which allows one to 1) deploy the scenario on a test machine automatically, and 2) copy the scenario manually into a GRUB config file. For manual copying, separate binaries are more convenient: they are not duplicated in each scenario's image.elf (which takes much disk space), and the same binaries can be reused in many scenarios. Also, with all modules built into "image.elf", it is not very comfortable to edit the "config" files without regenerating the "image.elf".
The latter point was indeed a concern we had when unifying the boot-module handling. However, on closer inspection, we found three possible scenarios where the mechanism is used:
1. The majority of run scripts are test cases. They are small and are executed ad-hoc (or automatically) but are never permanently installed. So sharing binaries across scenarios would not give any benefit.
2. Run scripts that describe self-sustaining systems, like the Turmvilla scenario. Here we have a large base system consisting of many boot modules, maybe even including virtual disk images. This situation likely corresponds to yours. It would be nice to reuse selected boot modules (like a virtual disk image) between scenarios. The single-image approach is clearly limiting.
3. Run scripts that create the boot image of a multi-stage scenario, like the Sculpt scenario. Here, the boot image contains merely the components needed to bootstrap a second stage from within Genode. The initial boot image features a block-device driver, file system, and fs-rom server. The interesting part happens at a second stage where the Genode system can access information from the disk directly.
Of these three cases, only the second one would really benefit from loading individual ROM modules as multi-boot modules. Based on our experience with Turmvilla, we figured that scenarios of this type - where a complex system is bootstrapped by the boot loader only - do not scale well. Since this direction is inherently limiting, we should stop pursuing it and instead embrace systems of the third type.
With Sculpt as the most prominent example of the third type, one can already see the benefits. Thanks to fetching all ingredients from the depot, executing the run script is fast. The resulting boot image is quite small (less than 20 MiB) and will stay small even when the second-stage system grows. The potential benefit of sharing parts of it between multiple scenarios is negligible.
I hope that this background sheds some light on the line of thought that went into Genode's boot-module handling. Please understand that we do not want to go back to supporting earlier approaches that haven't worked out for us.
Cheers Norman
On 23.02.2018 14:55, Norman Feske wrote:
Hello Valery,
I have the impression that your undertaking goes very much against the grain of our regular work flow. Frankly speaking, I miss the point of bypassing our tooling and instead manually working with boot modules. We deliberately moved away from relying on the boot loader to load individual ROM modules, for a number of reasons:
Not all kernels provide a way for the roottask (Genode's core) to access individual boot modules. In particular, seL4 and OKL4 do not.
On ARM, there is no such concept. Boot loaders on ARM load a kernel image only.
For the reasons above, Genode has to provide a way to include the initial ROM modules in the boot image. Using this mechanism across all base platforms reduces Genode's complexity and ensures that the solution is well tested. In contrast, the previously used kernel-specific code was more fragile.
We hit limitations of several multi-boot loaders. E.g., I am thinking of GRUB's maximum number of boot modules. With iPXE, we hit different surprises, such as a slightly different convention for naming boot modules. By using one unified mechanism, we rule out these sources of trouble.
Shuffling boot modules manually is bug-prone. Letting a run script generate one image that contains all needed ingredients in their current versions eliminates the chance of inconsistencies between the modules.
All of this is understandable, and I'm aware of most of this background. But on x86 with the usual foc/nova kernels, the limitations of ARM boot loaders and of some x86 kernels do not apply. Yes, GRUB has a limitation of 99 modules max. In my own GRUB-based loader I increased it to 200 or so, which was sufficient for my use cases (I wrote a multiboot kernel that allows booting OS/2 with GRUB-like loaders; it required about 70 modules, and that was enough).
In short, we went into the deep end, experienced the limits of the multi-boot approach, and decided for a more robust approach. Our tooling reflects that. It makes it arguably difficult to edit init's configuration on the fly - as you noted - and requires you to execute the run script after each change. You present this as a limitation, but I regard the approach of mutating the boot image manually as misguided. It is not only bug-prone but also escapes the versioning of the individual modifications. In contrast, if you embrace working with run scripts, you can always reproduce your scenario and naturally track modifications using Git.
Yes, this is all understandable too. But I cannot run the run scripts after I have changed the configuration: if the config files are contained inside the core image, the image has to be rebuilt, and I cannot rebuild it without access to my development machine. I don't change binaries, I mostly change the configuration files. I usually update the binaries after changing and recompiling them, so all binaries should be of the latest version (this is regarding the versioning).
So, the reason why I want to bypass the build system is that at the moment I have no spare machine with Intel AMT, so I cannot automatically deploy the image generated by the run script to my test machine. (I think most people outside the core Genode team have no ThinkPads with Intel AMT support. I have one ThinkPad, but it runs my development Linux system, and rebooting it each time is not desirable. My third test machine is the Asus Core2Duo machine with an Intel chipset and an Nvidia video card, so it has no Intel AMT support either.) So I copy the image manually to my bootable flash stick and try to boot it on my test machine by hand.

Regenerating the image after each "config" change is not feasible here: it would require copying the whole big "core" image to the flash stick, which takes too long (and requires access to my development machine). That's why I want "core" to be split back into separate files. It is simply more convenient (for my specific case) to edit only the "config" files, without regenerating and copying the image over.

So I'd like a way to bypass the usual approach. If that is not possible via standard options in etc/build.conf, I'd like at least a way to create the "core" image without any modules included. I modified the "run" tool a bit, but it does not like an empty modules list. So my last question is: how would I create the image with an empty modules list? I see the "run" tool generating an assembler file with the module list. Is it possible to have it empty somehow?
PS: A feature request for the Genode build system: maybe it would be good to add options to etc/build.conf to disable removing the "genode" subdirectory containing all the binaries, and optionally to generate "core" together with "image.elf", i.e., a core image without embedded modules. That would be more convenient when a developer wants to copy modules to some GRUB installation manually. One could then run the run script to generate both the "image.elf" and the separate binaries, which allows one to 1) deploy the scenario on a test machine automatically, and 2) copy the scenario manually into a GRUB config file. For manual copying, separate binaries are more convenient: they are not duplicated in each scenario's image.elf (which takes much disk space), and the same binaries can be reused in many scenarios. Also, with all modules built into "image.elf", it is not very comfortable to edit the "config" files without regenerating the "image.elf".
The latter point was indeed a concern we had when unifying the boot-module handling. However, on closer inspection, we found three possible scenarios where the mechanism is used:
- The majority of run scripts are test cases. They are small and are executed ad-hoc (or automatically) but are never permanently installed. So sharing binaries across scenarios would not give any benefit.
But still, all the binaries except the test itself can be shared. The tests use the same "general" components (which are under test) as other scenarios.
- Run scripts that describe self-sustaining systems, like the Turmvilla scenario. Here we have a large base system consisting of many boot modules, maybe even including virtual disk images. This situation likely corresponds to yours. It would be nice to reuse selected boot modules (like a virtual disk image) between scenarios. The single-image approach is clearly limiting.
- Run scripts that create the boot image of a multi-stage scenario, like the Sculpt scenario. Here, the boot image contains merely the components needed to bootstrap a second stage from within Genode. The initial boot image features a block-device driver, file system, and fs-rom server. The interesting part happens at a second stage where the Genode system can access information from the disk directly.
Of these three cases, only the second one would really benefit from loading individual ROM modules as multi-boot modules. Based on our experience with Turmvilla, we figured that scenarios of this type - where a complex system is bootstrapped by the boot loader only - do not scale well. Since this direction is inherently limiting, we should stop pursuing it and instead embrace systems of the third type.
With Sculpt as the most prominent example of the third type, one can already see the benefits. Thanks to fetching all ingredients from the depot, executing the run script is fast. The resulting boot image is quite small (less than 20 MiB) and will stay small even when the second-stage system grows. The potential benefit of sharing parts of it between multiple scenarios is negligible.
I hope that this background sheds some light on the line of thought that went into Genode's boot-module handling. Please understand that we do not want to go back to supporting earlier approaches that haven't worked out for us.
Most scenarios are still of type 2. Dynamic scenarios of type 3 are not common yet; it seems only Sculpt belongs to that class.
BTW, I cannot run the "Sculpt", "VirtualBox", or "Seoul" scenarios so far. NOVA seems to hang on all three of my available machines, for some reason. So I tried the Fiasco.OC kernel instead (but VirtualBox and Seoul work on NOVA only). So far, my attempts to run Sculpt have not been successful either. It looks like acpi_drv does not like something in the ACPI tables of my test machine.
Cheers Norman
Hello,
On 23.02.2018 18:47, Valery V. Sedletski via genode-main wrote:
scenarios, so far. NOVA seems to hang on all three of my available machines, for some reason. So I tried the Fiasco.OC kernel instead (but VirtualBox and Seoul work on NOVA only). So far, my attempts to run Sculpt have not been successful either. It looks like acpi_drv does not like something in the ACPI tables of my test machine.
the provided information is very vague. What exactly does "does not like something" mean?
Please note that on Genode/Fiasco.OC you see output because the kernel also prints Genode's messages to the VGA console.
On Genode/NOVA there is no such kernel support for Genode. That means the NOVA kernel probably boots up completely, Genode gets running, and it simply gets stuck in the acpi_drv issue you encountered - you just don't see it. You have several options here:
a) Either debug the early boot on Genode/Fiasco.OC (having VGA output), which could also solve your issue with Genode/NOVA bootstrap.
b) Get serial output of Genode/NOVA on your target machines.
c) Alternatively, you may try a debugging commit [0] for Genode/NOVA which adds some very limited VGA output support. The patch is probably outdated and you will have to adjust it.
If you really want to get your hardware running, I would go with option b); if that is not possible, with a); and if nothing else helps, with c).
[0] https://github.com/alex-ab/genode/commits/experimental_vga_console_16_08
Cheers,
On 27.02.2018 22:31, Alexander Boettcher wrote:
Hello,
On 23.02.2018 18:47, Valery V. Sedletski via genode-main wrote:
scenarios, so far. NOVA seems to hang on all three of my available machines, for some reason. So I tried the Fiasco.OC kernel instead (but VirtualBox and Seoul work on NOVA only). So far, my attempts to run Sculpt have not been successful either. It looks like acpi_drv does not like something in the ACPI tables of my test machine.
the provided information is very vague. What exactly does "does not like something" mean?
I took a screenshot and put it here [1]. Unfortunately, this machine has no COM port (it is an Asus Core2Duo laptop). This is the result of trying to run "noux_bash.run" on Fiasco.OC.
Running noux_bash/foc on another machine with a COM port, I saw two different errors: [2], where boot stops at the PS/2 driver, and [3], an unhandled page fault in usb_drv. This machine is my desktop with an Athlon 64 (Socket 939) and 4 GB of RAM. It has two PS/2 ports, for keyboard and mouse, but at the moment no devices are attached. I currently use a wireless Logitech M280 mouse and a wireless Logitech K800 keyboard, both paired with a Logitech Unifying transceiver. This transceiver can be paired with up to six compatible mice or keyboards. It only requires a USB stack + a HID class driver + a USB mouse driver + a USB keyboard driver. Once paired, it works out of the box at the hardware level; no special support is required (except maybe support for "composite" devices - the transceiver is a composite HID device; I can supply a report from a utility similar to "lsusb"). It works well under Linux, so I assumed the Linux USB stack ported to Genode should work with it, but it seems there are some problems. I can help with debugging to get it working, if you'd like. Otherwise, I can bring back my USB or PS/2 keyboard and mouse (I still have them).
Please note that on Genode/Fiasco.OC you see output because the kernel also prints Genode's messages to the VGA console.
On Genode/NOVA there is no such kernel support for Genode. That means the NOVA kernel probably boots up completely, Genode gets running, and it simply gets stuck in the acpi_drv issue you encountered - you just don't see it. You have several options here:
a) Either debug the early boot on Genode/Fiasco.OC (having VGA output), which could also solve your issue with Genode/NOVA bootstrap.
b) Get serial output of Genode/NOVA on your target machines.
c) Alternatively, you may try a debugging commit [0] for Genode/NOVA which adds some very limited VGA output support. The patch is probably outdated and you will have to adjust it.
If you really want to get your hardware running, I would go with option b); if that is not possible, with a); and if nothing else helps, with c).
Yes, thanks -- I had already guessed this myself.
Regarding what's wrong with NOVA: the problem was that NOVA does not echo the COM-port log to the screen like Fiasco/Fiasco.OC do. So I ran it on a second machine with a COM port (the Athlon 64 desktop). The problem appears to be with my custom GRUB-like bootloader: I forgot to add
modaddr 0x02000000
to load modules above 32 MiB (the bootloader includes the GRUB patch from Adam Lackorzynski). So this is a problem with this loader, not NOVA - NOVA is ok. Core got stuck with an "init: Bad ELF file" error because of a problem with my bootloader. It currently needs modules loaded above the 4 GB boundary.
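As a quick sanity check of that value: modaddr takes a plain physical byte address, and 0x02000000 is exactly 32 MiB:

```shell
# 0x02000000 bytes expressed in MiB
printf '%d MiB\n' $(( 0x02000000 / 1024 / 1024 ))   # prints: 32 MiB
```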
The scenario I tried to run was the latest "Sculpt" version from the "master" branch. So, I ran Sculpt/Fiasco.OC on the Athlon 64 machine with a COM port. It stops on loading "init" with the following error:
Genode 17.11-284-ge79ce5a03 <local changes>
686 MiB RAM and 63253 caps assigned to init
[init] Error: RAM preservation exceeds available memory
[init] Warning: runtime: assigned RAM exceeds available RAM
(this log is taken in QEMU. The same problem is on the real machine. The only difference is that it has more RAM)
So it looks like some quota is insufficient, though I have no idea which one. The above error is usually not fatal; it is often shown when running in QEMU. But I see it both in QEMU and on my two real machines (the Asus Core2Duo laptop and the Athlon 64 desktop); both have 4 GB of RAM. Is "Sculpt" supposed to run in QEMU, or does it require even more than 4 GB of RAM? Three of my available machines have 4 GB, and the usual test machine (the Asus Core2Duo laptop) has only 2 GB of RAM.
[2] https://pastebin.mozilla.org/9078740
[3] https://pastebin.mozilla.org/9078741
WBR,
valery
On 28.02.2018 00:58, Valery V. Sedletski via genode-main wrote:
The ACPI table named ATKG also seems to cause trouble on other Linux machines, according to my search results. Maybe you can apply the latest BIOS update to that machine, since the root cause of the issue is wrong ACPI tables provided by the BIOS vendor. Another thing we can try is to make our ACPI parser more robust and just skip the table.
Try commenting out the "throw" after the "checksum mismatch for" message (line ~450) in
repos/os/src/drivers/acpi/acpi.cc
Fingers crossed - maybe you are lucky and get further.
You have to re-create the drivers package by invoking:
tool/depot/create genodelabs/pkg/drivers_interactive-pc UPDATE_VERSIONS=1 FORCE=1
Additionally, you may send me the dumped ACPI tables directly (not via the mailing list), so I can have a look.
[2] https://pastebin.mozilla.org/9078740 [3] https://pastebin.mozilla.org/9078741
[init -> drivers -> fb_drv] resource_request: ram_quota=0, cap_quota=4
[init -> drivers] child "fb_drv" requests resources: ram_quota=0, cap_quota=4
Increase the "caps" value by some small amount for the fb_drv start element in
repos/os/recipes/raw/drivers_interactive-pc/drivers.config
and re-create the interactive driver package (as shown above).
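For illustration, the caps value is an attribute of the corresponding <start> node in drivers.config; the number below is an arbitrary placeholder, not a recommendation:

```
<!-- drivers.config fragment (illustrative caps value) -->
<start name="fb_drv" caps="120">
    ...
</start>
```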
[init -> drivers -> usb_drv] dev_info: new low-speed USB device number 2 using uhci_hcd
no RM attachment (READ pf_addr=0x18 pf_ip=0xa15ef from pager_object: pd='init -> drivers -> usb_drv' thread='ep')
Try adding the ohci attribute to the USB driver component in repos/os/recipes/raw/drivers_interactive-pc/drivers.config, since you're running on an AMD machine, e.g.:
<config uhci="yes" ehci="yes" ohci="yes" xhci="yes"> <hid/> </config>
(rebuild the package as noted above.)
If this does not help, first try the simple run script usb_hid.run in repos/dde_linux. Use objdump to find out which source line the instruction pointer belongs to.
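A sketch of that lookup, assuming the unstripped usb_drv binary from your build directory and the fault address from the log above (pf_ip=0xa15ef); addr2line is often the quickest route:

```
# map the instruction pointer to a source line (unstripped binary required)
/usr/local/genode-gcc/bin/genode-x86-addr2line -e usb_drv 0xa15ef

# or inspect the surrounding disassembly with source annotations
/usr/local/genode-gcc/bin/genode-x86-objdump -dl usb_drv | grep -B 5 'a15ef:'
```

The tool names assume the binutils of the Genode toolchain installed under /usr/local/genode-gcc; adjust the prefix and paths to your installation.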
Cheers,
On 28.02.2018 15:25, Alexander Boettcher wrote:
On 28.02.2018 00:58, Valery V. Sedletski via genode-main wrote:
The ACPI table named ATKG also seems to cause trouble on other Linux machines, according to my search results. Maybe you can apply the latest BIOS update to that machine, since the root cause of the issue is wrong ACPI tables provided by the BIOS vendor. Another thing we can try is to make our ACPI parser more robust and just skip the table.
Try commenting out the "throw" after the "checksum mismatch for" message (line ~450) in
repos/os/src/drivers/acpi/acpi.cc
Fingers crossed - maybe you are lucky and get further.
You have to re-create the drivers package by invoking:
tool/depot/create genodelabs/pkg/drivers_interactive-pc UPDATE_VERSIONS=1 FORCE=1
Additionally, you may send me the dumped ACPI tables directly (not via the mailing list), so I can have a look.
Yes, I will try creating ACPI tables dump later.
[2] https://pastebin.mozilla.org/9078740
[3] https://pastebin.mozilla.org/9078741
[init -> drivers -> fb_drv] resource_request: ram_quota=0, cap_quota=4
[init -> drivers] child "fb_drv" requests resources: ram_quota=0, cap_quota=4
Increase the "caps" value by some small amount for the fb_drv start element in
repos/os/recipes/raw/drivers_interactive-pc/drivers.config
and re-create the interactive driver package (as shown above).
Yes, I increased the caps quota for fb_drv and commented the "throw" out in acpi.cc - this helped! Also, I rebuilt both drivers_interactive-pc and drivers_managed-pc; the first is required by noux_bash.run and the second by sculpt.run. Now noux_bash works like a charm on the Asus machine, on both the Fiasco.OC and NOVA kernels. ;-) The BIOS seems to have been updated in the service center where this machine was some months ago, though I can check this. Yes, the ACPI table seems to be broken.
On the Athlon 64 machine, noux_bash now starts ok with the NOVA kernel, but the "no RM attachment" problem remains on the Fiasco.OC kernel. I'll check the exact source line later.
So noux_bash mostly works now, but Sculpt still shows the same problems as before. I updated the sources to the latest level, but this didn't help. The only difference now is that Sculpt does not try running in QEMU automatically; only the binaries are created. Also, I added <config verbose="yes" ...> to the "init" parameters. Now it shows service announcements and resource reservations, but I still don't see any failed resource requests or other errors, except for the usual
[init] Error: RAM preservation exceeds available memory
[init] Warning: runtime: assigned RAM exceeds available RAM
Maybe there are some other useful parameters giving more details than verbose="yes" that I don't know about? The log with verbose="yes" is here: [1]
Oops, I forgot about these messages:
[init] Warning: specified quota exceeds available quota, proceeding with a quota of 3424667716
[init] Warning: runtime: assigned caps (50000) exceed available caps (1381)
Maybe I'll need to increase the caps quota for "init" somehow (is it not given all available caps in the system?). But in QEMU and on the real machine with the NOVA kernel this message is missing, and booting stops at the same messages.
Another problem with Sculpt is that fb_drv causes my FullHD monitor to go out of sync when it attempts to set up the 1024x768 mode, if I wait until Nitpicker is started. However, with noux_bash the 1024x768@...64... mode is set up correctly.
[init -> drivers -> usb_drv] dev_info: new low-speed USB device number 2 using uhci_hcd
no RM attachment (READ pf_addr=0x18 pf_ip=0xa15ef from pager_object: pd='init -> drivers -> usb_drv' thread='ep')
Try adding the ohci attribute to the USB driver component in repos/os/recipes/raw/drivers_interactive-pc/drivers.config, since you're running on an AMD machine, e.g.:
<config uhci="yes" ehci="yes" ohci="yes" xhci="yes"> <hid/> </config>
(rebuild the package as noted above.)
No, this machine has no OHCI controllers, so this is not needed. It has 1) an integrated USB chip with four UHCI controllers and one EHCI controller, and 2) a PCI USB extension board with one EHCI and two UHCI controllers. Both are based on VIA chipsets (the motherboard is an Abit AV8 "Third Eye" based on the K8T800Pro chipset, plus an external VIA6202 USB controller).
If this does not help, first try the simple run script usb_hid.run in repos/dde_linux. Use objdump to find out which source line the instruction pointer belongs to.
Cheers,
Ok, will try that soon.
[1] https://pastebin.mozilla.org/9078851 Sculpt on Genode/Fiasco.OC (Athlon 64 machine)
WBR,
valery
Hi Valery,
[init] Error: RAM preservation exceeds available memory
[init] Warning: runtime: assigned RAM exceeds available RAM
...
[init] Warning: specified quota exceeds available quota, proceeding with a quota of 3424667716
[init] Warning: runtime: assigned caps (50000) exceed available caps (1381)
these messages are normal. Sculpt assigns all remaining memory and caps to the runtime subsystem by specifying large amounts of quota in the runtime start node (the real amount is not known in advance). It is nothing to worry about - we should definitely try to silence those messages in the future.
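For context, a hedged sketch of what such an oversized start node looks like (the numbers and node contents are illustrative, not the actual values in sculpt.run):

```xml
<!-- deliberately oversized quotas; init clamps them to what is actually
     available, which produces the warnings above -->
<start name="runtime" caps="50000">
	<resource name="RAM" quantum="32G"/>
	<binary name="init"/>
</start>
```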
You don't see any additional messages with Sculpt because the log of the drivers subsystem is redirected to the report file system so that the Sculpt user can inspect it once the system is up and running. Hence, the drivers log is not routed to core's LOG service. For debugging, you may remove the following routing rule from the "drivers" start node of the sculpt.run script:
<service name="LOG"> <child name="log"/> </service>
With this rule removed, the routing falls back to the parent. Now, with the LOG session routed to core, you should be able to see what the drivers are complaining about.
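For illustration, a hedged sketch of how the route of the "drivers" start node might look with the rule commented out (the exact node contents vary between Genode versions):

```xml
<route>
	<!-- commented out for debugging, so LOG falls through to the parent:
	<service name="LOG"> <child name="log"/> </service>
	-->
	<any-service> <parent/> <any-child/> </any-service>
</route>
```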
Cheers Norman
On 01.03.2018 11:25, Norman Feske wrote:
Hi Valery,
[init] Error: RAM preservation exceeds available memory
[init] Warning: runtime: assigned RAM exceeds available RAM
...
[init] Warning: specified quota exceeds available quota, proceeding with a quota of 3424667716
[init] Warning: runtime: assigned caps (50000) exceed available caps (1381)
these messages are normal. Sculpt assigns all remaining memory and caps to the runtime subsystem by specifying large amounts of quota in the runtime start node (the real amount is not known in advance). It is nothing to worry about - we should definitely try to silence those messages in the future.
You don't see any additional messages with Sculpt because the log of the drivers subsystem is redirected to the report file system so that the Sculpt user can inspect it once the system is up and running. Hence, the drivers log is not routed to core's LOG service. For debugging, you may remove the following routing rule from the "drivers" start node of the sculpt.run script:
  <service name="LOG"> <child name="log"/> </service>
With this rule removed, the routing falls back to the parent. Now, with the LOG session routed to core, you should be able to see what the drivers are complaining about.
Cheers Norman
Thanks for the hint! Ok, I commented out the routing rule, and similar routing rules for "runtime" and "leitzentrale" too, just in case. I got the following log: [1].
So, the problem is with this:
[init -> drivers -> driver_manager] Warning: abort called - thread: ep
[init -> drivers] child "driver_manager" exited with exit value 1
Running sculpt_test.run in QEMU, I see exactly the same error! (The QEMU version I'm using is 2.8.1, the version from Debian Stretch.)
These messages:
[init -> drivers -> driver_manager] Error: Could not open ROM session for "platform_info"
[init -> drivers -> driver_manager] Error: Uncaught exception of type 'Genode::Rom_connection::Rom_connection_failed'
are strange, because "platform_info" seems to be optional. Previously it failed in different scenarios too, but it was not fatal.
PS: A good idea might be to make these routing rules conditional in the XML config somehow, similar to an "#ifdef". If I remember correctly, there are some conditional XML tags in the "config" file syntax. It could look like this: you specify a special parameter on the "core" command line, and "core" publishes a new ROM module containing the list of variables and their values (probably in XML syntax). If "init" sees this module, it reads the variable names and values and uses them like C preprocessor defines, so parts of the "config" can be conditionally included or skipped. For example, routing rules could be skipped if some variable has a special value. This would allow bypassing the routing rules for trouble-shooting purposes without rebuilding the "core" image on each "config" change: a special parameter on the "core" command line would redirect the log back to "core". This might be impossible with ARM bootloaders, of course, but it takes advantage of the features of GRUB-like loaders. So some troubleshooting modes could be implemented without the need to alter the "core" image. What do you think?
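To make the proposal concrete, a purely hypothetical sketch - none of this syntax exists in Genode; the ROM name, tag names, and attributes are invented for illustration only:

```xml
<!-- hypothetical "boot_vars" ROM that core would publish from its
     command-line parameters -->
<vars>
	<var name="debug_log" value="yes"/>
</vars>

<!-- hypothetical conditional block inside init's config: the routing
     rule would apply only when debug_log is not set -->
<if var="debug_log" value="no">
	<service name="LOG"> <child name="log"/> </service>
</if>
```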
[1] https://pastebin.mozilla.org/9078923 Sculpt on Genode/Fiasco.OC (Athlon 64 machine)
WBR,
valery
"Valery V. Sedletski via genode-main" genode-main@lists.sourceforge.net writes:
These messages:
[init -> drivers -> driver_manager] Error: Could not open ROM session for "platform_info"
[init -> drivers -> driver_manager] Error: Uncaught exception of type 'Genode::Rom_connection::Rom_connection_failed'
I've had the same info on Fiasco.OC. platform_info is available at least on NOVA and hw.
are strange, because "platform_info" seems to be optional. Previously, it failed too in different scenarios, but it was not fatal.
If I understand correctly, driver_manager is a component in Sculpt that generates the configuration for the framebuffer component, to start the appropriate one (vesa or intel) based on information acquired from (among others) platform_info.
I think it is not optional for driver_manager - at least it doesn't seem to be when looking into the code - and without driver_manager there is no configuration in Sculpt to start the framebuffer component.
Tomasz Gajewski
On 04.03.2018 13:53, Tomasz Gajewski wrote:
"Valery V. Sedletski via genode-main" genode-main@lists.sourceforge.net writes:
These messages:
[init -> drivers -> driver_manager] Error: Could not open ROM session for "platform_info"
[init -> drivers -> driver_manager] Error: Uncaught exception of type 'Genode::Rom_connection::Rom_connection_failed'
I've had the same info on Fiasco.OC. platform_info is available at least on NOVA and hw.
Yes, indeed. I tried last time on Fiasco.OC, to see the log messages on the screen. So, it looks like Fiasco.OC is unsupported for Sculpt - or is it possible to go further on it without a fatal error despite the missing platform_info? (Maybe just catch this exception.)
Now I tried Sculpt on both the "hw" and NOVA kernels, and it works fine on both. Though, trying to run "sculpt_test" on NOVA in QEMU, I see "drivers", "leitzentrale", and "runtime" starting, and then silence (no progress). I commented out the routing rules in all three subsystems, so now I see all messages in the core log. Where could I look further to see why it doesn't start? I see Nitpicker started, but no "leitzentrale" on the screen. No errors. Maybe I need to wait more, but much time has passed.
are strange, because "platform_info" seems to be optional. Previously, it failed too in different scenarios, but it was not fatal.
If I understand correctly, driver_manager is a component in Sculpt that generates the configuration for the framebuffer component, to start the appropriate one (vesa or intel) based on information acquired from (among others) platform_info.
I think it is not optional for driver_manager - at least it doesn't seem to be when looking into the code - and without driver_manager there is no configuration in Sculpt to start the framebuffer component.
Tomasz Gajewski
So, currently Sculpt works with the NOVA or "hw" kernels, because core generates this "platform_info" module. On Fiasco.OC it's missing, so it won't work? I thought it would work on any x86 kernel except Linux (at least, I see such a limitation at the beginning of the sculpt.run script).
WBR,
valery
Hello Valery,
On 04.03.2018 17:06, Valery V. Sedletski via genode-main wrote:
Though, trying to run "sculpt_test" on NOVA in QEMU, I see "drivers" and "leitzentrale", and "runtime" starting, and then silence (no progress). I commented the routing rules in all three subsystems out, so that, now I see all messages in core log. Where could I look more to see why it doesn't start? I see Nitpicker started, but no "leitzentrale" on the screen. No errors. Maybe, I need to wait more, but much time passed.
this can very well be a problem with the nit_fader component, which apparently fails to fade in the client if the startup is extremely slow. I have observed this sporadically myself, but only in Qemu. Could you try pressing F12 twice to toggle the Leitzentrale? In other words: have you tried switching it off and on again? ;-)
Cheers Norman
On 05.03.2018 14:09, Norman Feske wrote:
Hello Valery,
On 04.03.2018 17:06, Valery V. Sedletski via genode-main wrote:
Though, trying to run "sculpt_test" on NOVA in QEMU, I see "drivers" and "leitzentrale", and "runtime" starting, and then silence (no progress). I commented the routing rules in all three subsystems out, so that, now I see all messages in core log. Where could I look more to see why it doesn't start? I see Nitpicker started, but no "leitzentrale" on the screen. No errors. Maybe, I need to wait more, but much time passed.
this can very well be a problem with the nit_fader component, which apparently fails to fade in the client if the startup is extremely slow. I have observed this sporadically myself, but only in Qemu. Could you try pressing F12 twice to toggle the Leitzentrale? In other words: have you tried switching it off and on again? ;-)
Cheers Norman
Yes, looks like a redraw problem indeed. Toggling the "leitzentrale" with the F12 key helps. And it's veeeery slow :(
Also, which is strange: the 1024x768x16 video mode works fine on the Asus laptop (which has a 1400x900 display and an NVidia video chip). But on my desktop (with an ATI Radeon 9600 XT and a FullHD monitor) the monitor goes out of sync when the video mode is set up. Both are working in VESA mode. I suspect that either the video card or the display doesn't like the 16-bit 1024x768 video mode. And Genode's framebuffer currently seems to be 16-bit only. It goes out of sync when I boot Linux too, BTW.
Hi Valery,
Yes, looks like a redraw problem indeed.
FYI, I fixed this problem today: https://github.com/genodelabs/genode/commit/e0ac419ecd50594a4a001711c54413e7...
Cheers Norman
On 01.03.2018 02:01, Valery V. Sedletski via genode-main wrote:
and re-create the interactive driver package (as shown above).
Yes, I increased caps quota for fb_drv now, and commented "throw" out in acpi.cc, this
Can you please test commit [0], which should ignore the invalid ACPI table in your case.
[0] https://github.com/alex-ab/genode/commit/b2bedb8eef2d39df79db648c23d6fe8725e...
On 05.03.2018 13:16, Alexander Boettcher wrote:
On 01.03.2018 02:01, Valery V. Sedletski via genode-main wrote:
and re-create the interactive driver package (as shown above).
Yes, I increased caps quota for fb_drv now, and commented "throw" out in acpi.cc, this
Can you please test commit [0], which should ignore the invalid ACPI table in your case.
[0] https://github.com/alex-ab/genode/commit/b2bedb8eef2d39df79db648c23d6fe8725e...
Yes, I cherry-picked your commit, and everything seems to work fine with ACPI.
At least, I see NOVA and "base-hw" kernels booting fine with Sculpt on Asus laptop.
Is an ACPI table dump of the Asus machine still needed?
WBR,
valery
On 05.03.2018 15:41, Valery V. Sedletski via genode-main wrote:
Can you please test commit [0], which should ignore the invalid ACPI table in your case.
[0] https://github.com/alex-ab/genode/commit/b2bedb8eef2d39df79db648c23d6fe8725e...
Yes, I cherry-picked your commit, and everything seems to work fine with ACPI.
Thanks for testing - commit is scheduled for addition to Genode staging/master.
At least, I see NOVA and "base-hw" kernels booting fine with Sculpt on Asus laptop.
Is an ACPI table dump of the Asus machine still needed?
No, thanks,