Hi,
I'm trying to get the latest Genode running (using the base platform NOVA on a real machine, not qemu). As a simple kick-off I tried to use the demo scenario with nitpicker and launchpad. However, with USB Legacy support enabled, only the keyboard was working; the mouse could not be configured properly. Since this machine doesn't have any real PS/2 connectors and the Legacy mode tends to be a bit quirky sometimes, I decided to go for the USB HID driver. But now the driver seems to find only several hubs and ports, but no devices attached to them. Plus, during the boot process, the mouse's LED gets switched off and is never turned on again. Is there some limitation regarding scanned USB ports and/or suitable devices? Or am I missing something else?
I attached the log file of this boot process, but be warned that there seems to be some problem with the serial output, as there are several weird broken lines, mainly at the beginning of the log.
Thanks
Markus
Hi Markus,
On 07/12/2012 01:17 PM, Markus Partheymueller wrote:
Hi,
I'm trying to get the latest Genode running (using the base platform NOVA on a real machine, not qemu). As a simple kick-off I tried to use the demo scenario with nitpicker and launchpad. However, with USB Legacy support enabled, only the keyboard was working; the mouse could not be configured properly. Since this machine doesn't have any real PS/2 connectors and the Legacy mode tends to be a bit quirky sometimes, I decided to go for the USB HID driver. But now the driver seems to find only several hubs and ports, but no devices attached to them. Plus, during the boot process, the mouse's LED gets switched off and is never turned on again. Is there some limitation regarding scanned USB ports and/or suitable devices? Or am I missing something else?
I attached the log file of this boot process, but be warned that there seems to be some problem with the serial output, as there are several weird broken lines, mainly at the beginning of the log.
I assume you are using the driver from the 'dde_linux' directory, right? There is another USB driver in the 'linux_drivers' repository on github which can also handle HID devices. To tell you the truth, the 'dde_linux' version has not been tested on x86 hardware so far, since we mostly concentrated on PandaBoard support lately. So I have to admit that you're sort of an alpha tester here. Anyway, I can test the thing on some box tomorrow, since it actually does not look that bad. The UHCI controller (where HID devices usually reside) has been detected. My first uneducated guess would be that the driver does not get any interrupt from the controller. Could you enable 'DEBUG_IRQ' in 'dde_linux/src/drivers/usb/lx_emul.h' and check whether the 'dde_linux/src/drivers/usb/signal/irq.cc:handle()' function actually gets called? If not, please start the 'acpi_drv' beforehand, since it might find a different interrupt number for the controller (than the one in the PCI config space) within the ACPI tables. That's usually the fun on kernels that use APICs.
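Just to illustrate the kind of check I mean, here is a rough, made-up sketch; 'DEBUG_IRQ' and 'handle()' are the real names from the driver, but the body below is purely illustrative and not the actual code in irq.cc:

  /* Hypothetical instrumentation: count handler invocations so the log
   * immediately shows whether the controller interrupt ever fires. */

  #include <base/printf.h>   /* PDBG() (Genode 12.x API) */

  enum { DEBUG_IRQ = 1 };

  static long irq_count;

  static void handle_irq(int irq_number)
  {
      if (DEBUG_IRQ)
          PDBG("IRQ %d occurred (%ld so far)", irq_number, ++irq_count);

      /* the real handle() forwards the interrupt to the Linux USB stack here */
  }

If the message never shows up in the log, the driver simply does not receive the interrupt.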
Thx for reporting and please keep me informed about your insights,
Sebastian
P.S. If nothing works please also send your Genode configuration file.
Hi Sebastian,
On 12 July 2012 13:58, Sebastian Sumpf <Sebastian.Sumpf@...1...> wrote:
Hi Markus,
I assume you are using the driver from the 'dde_linux' directory, right? There is another USB driver in the 'linux_drivers' repository on github which can also handle HID devices. To tell you the truth, the 'dde_linux' version has not been tested on x86 hardware so far, since we mostly concentrated on PandaBoard support lately. So I have to admit that you're sort of an alpha tester here. Anyway, I can test the thing on some box tomorrow, since it actually does not look that bad. The UHCI controller (where HID devices usually reside) has been detected. My first uneducated guess would be that the driver does not get any interrupt from the controller. Could you enable 'DEBUG_IRQ' in 'dde_linux/src/drivers/usb/lx_emul.h' and check whether the 'dde_linux/src/drivers/usb/signal/irq.cc:handle()' function actually gets called? If not, please start the 'acpi_drv' beforehand, since it might find a different interrupt number for the controller (than the one in the PCI config space) within the ACPI tables. That's usually the fun on kernels that use APICs.
Thx for reporting and please keep me informed about your insights,
Sebastian
P.S. If nothing works please also send your Genode configuration file.
DEBUG_IRQ revealed no real news for me, except for the fact that IRQ 10 was discovered, but apparently no interrupt came in; handle() was not called once.
So I added acpi_drv to my setup, resulting in a different log. As you suggested, I attached the new log as well as my configuration file.
Another interesting fact is that, when the IOMMU flag of the hypervisor is removed, it looks quite different already. A log file of this behavior is also attached.
Regards
Markus
On 07/12/2012 03:21 PM, Markus Partheymueller wrote:
DEBUG_IRQ revealed no real news for me, except for the fact that IRQ 10 was discovered, but apparently no interrupt came in; handle() was not called once. So I added acpi_drv to my setup, resulting in a different log. As you suggested, I attached the new log as well as my configuration file. Another interesting fact is that, when the IOMMU flag of the hypervisor is removed, it looks quite different already. A log file of this behavior is also attached. Regards, Markus
Okay, the ACPI driver seems definitely necessary here, since the device IRQs (10/11) are overwritten (to GSI 16/18) by the driver. So far so good. What is really weird is that the driver seems to read strange values in the IOMMU case.
@Alexander Boettcher: Is there anything to consider regarding I/O memory when using Nova with enabled IOMMU?
The second example (no IOMMU) actually looks better. The strange thing here is that the available RAM quota does not seem to suffice. Are you on a 64-bit machine? For a quick fix you could try to increase the memory quota of some servers until you don't get the 'Quota exceeded' message any more. On the other hand, if this is a 64-bit machine, the GSI values produced by the ACPI driver might be wrong, since it only parses the DSDT; on 64-bit one would have to use the XDSDT table. To make a long story short: as long as you don't get interrupts from the UHCI/EHCI controllers, the driver will not work. The thing to find out is why we don't get interrupts on Nova. Unfortunately I can't really help you right now since I am pretty sick. I will test this, however, when I get my hands on a hardware box. Until then you're more than welcome to find out what is going on.
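For example, bumping the quota in your scenario config would look roughly like the excerpt below; this is only a sketch, the component name, the services, and the amount depend on your actual configuration file:

  <!-- hypothetical excerpt: adjust name, quantum, and services to your scenario -->
  <start name="usb_drv">
      <resource name="RAM" quantum="12M"/>
      <provides> <service name="Input"/> </provides>
  </start>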
Greetings,
Sebastian
@Alexander Boettcher: Is there anything to consider regarding I/O memory when using Nova with enabled IOMMU?
A quick look into the specification of the NOVA kernel shows that assign_gsi returns the MSI vector to be programmed (if the device is used in MSI mode — is it?).
In base-nova/src/core/irq_session_component.cc, however,
- first, the return value of the syscall operation is ignored
- second, the MSI values returned by the kernel are ignored.
The assign_pci system call is currently not used in Genode at all; however, it is required for working with IOMMU support enabled ...
@Sebastian: we have to revisit this part ASAP, maybe together next week.
Best,
Alex
On 07/13/2012 03:24 PM, Alexander Boettcher wrote:
@Alexander Boettcher: Is there anything to consider regarding I/O memory when using Nova with enabled IOMMU?
A quick look into the specification of the NOVA kernel shows that assign_gsi returns the MSI vector to be programmed (if the device is used in MSI mode — is it?).
Big NO here!
In base-nova/src/core/irq_session_component.cc, however,
- first, the return value of the syscall operation is ignored
- second, the MSI values returned by the kernel are ignored.
The assign_pci system call is currently not used in Genode at all; however, it is required for working with IOMMU support enabled ...
@Sebastian: we have to revisit this part ASAP, maybe together next week.
Okay, let's do that.
@Markus: You can come along as well, since our offices are not that far apart :-)
Sebastian
On Fri, 13 Jul 2012 13:07:04 +0200 Sebastian Sumpf (SS) wrote:
SS> Okay, the ACPI driver seems definitely necessary here, since the device IRQs (10/11) are overwritten (to GSI 16/18) by the driver. So far so good. What is really weird is that the driver seems to read strange values in the IOMMU case.
When the IOMMU is active, DMA requests and interrupts are restricted by the NOVA microhypervisor. Device drivers need to follow certain guidelines, see below.
SS> @Alexander Boettcher: Is there anything to consider regarding I/O memory when using Nova with enabled IOMMU?
With the IOMMU active, a user-level device driver must specify which memory regions of its PD are DMA-able. This is done by setting the D-bit in the Delegate Transfer Item. All memory mappings where the D-bit is not set will not be DMA-able (unless the IOMMU is inactive). My recommendation is that device drivers map their own code and private data non-DMA-able and only allow DMA transfers to I/O buffer regions.
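Expressed as a rough, purely hypothetical code sketch — the helper and flag names below are made up, only the "D-bit set vs. not set" distinction matters:

  #include <cstdint>
  #include <cstddef>

  enum Delegate_flags : unsigned { PLAIN = 0, DMA_ABLE = 1 /* the "D-bit" */ };

  /* stand-in for the actual delegation of a memory range to the driver PD */
  static void delegate_mem(uintptr_t base, std::size_t size, Delegate_flags flags)
  {
      (void)base; (void)size; (void)flags;   /* real code maps the range here */
  }

  static void map_driver_pd(uintptr_t text,    std::size_t text_size,
                            uintptr_t heap,    std::size_t heap_size,
                            uintptr_t dma_buf, std::size_t dma_buf_size)
  {
      /* code and private data: never reachable by devices */
      delegate_mem(text, text_size, PLAIN);
      delegate_mem(heap, heap_size, PLAIN);

      /* I/O buffers used for device transfers: the only DMA-able region */
      delegate_mem(dma_buf, dma_buf_size, DMA_ABLE);
  }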
Also, when the IOMMU is active, the addresses programmed into DMA transfers must be host-virtual addresses. This alleviates device drivers from having to know physical memory addresses. They can DMA into their virtual address space. In contrast, when the IOMMU is inactive, the addresses in DMA transfers must be host-physical addresses.
An active IOMMU will only let interrupt vectors through that have been explicitly whitelisted by the hypervisor. Applications request an interrupt by means of the assign_gsi hypercall and also specify which CPU they want the interrupt routed to. This must be done for any interrupt (IOAPIC or MSI). (I'm thinking about adding some code to the hypervisor, causing sem_down to fail on an interrupt semaphore unless the interrupt has been configured using assign_gsi.)
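For illustration, a rough sketch of such a whitelisting request; the wrapper signature below is an assumption, not a quote from the NOVA specification:

  /* assumed wrapper around the assign_gsi hypercall:
   *   gsi_sm - capability selector of the interrupt semaphore
   *   dev    - device identification (e.g. its MMCONFIG page), 0 for pure IOAPIC pins
   *   cpu    - CPU the interrupt shall be routed to
   * returns the MSI address/data pair the device has to be programmed with */

  struct Msi_info { unsigned long addr; unsigned long data; };

  Msi_info assign_gsi(unsigned gsi_sm, unsigned long dev, unsigned cpu);

  void whitelist_irq(unsigned gsi_sm, unsigned long dev_mmconf, unsigned cpu)
  {
      Msi_info const msi = assign_gsi(gsi_sm, dev_mmconf, cpu);

      /* for an MSI-capable device, msi.addr/msi.data go into its MSI
       * capability registers; for IOAPIC-routed interrupts they are unused */
      (void)msi;
  }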
Failure to whitelist a DMA region or interrupt vector will result in a diagnostic message printed to the hypervisor console when the IOMMU aborts a transaction.
Let me know if there are further questions related to the IOMMU.
Cheers, Udo
Hi Markus,
On 07/13/2012 01:07 PM, Sebastian Sumpf wrote:
I will test this, however, when I get my hands on a hardware box. Until then you're more than welcome to find out what is going on.
I've tested the USB driver on x86 as promised and fixed some issues that appeared on real hardware. For a short description you can check https://github.com/genodelabs/genode/issues/282. The most important issue was the lack of support for more than one UHCI controller. Interrupts are now also working for that controller.
Nevertheless, there are still limitations on NOVA. Because of the large number of controllers the driver might discover (seven on my box), the PCI legacy interrupts may be exhausted pretty quickly, which in turn means that other drivers that depend on PCI IRQs will likely not get access to them. The reason is that Genode currently supports neither shared interrupts nor message-signaled interrupts on NOVA's base platform. Also, there is no IOMMU support yet, so one has to disable it when using the driver.
@Udo: Thanks for your extensive explanations of the IOMMU interface on Nova.
You can find the changes in my Genode repository (git://github.com/ssumpf/genode.git) under the 'issue282' branch.
Bye, Sebastian
On Wed, 18 Jul 2012 16:15:04 +0200 Sebastian Sumpf (SS) wrote:
SS> Nevertheless, there are still limitations on NOVA. Because of the large number of controllers the driver might discover (seven on my box), the PCI legacy interrupts may be exhausted pretty quickly, which in turn means that other drivers that depend on PCI IRQs will likely not get access to them. The reason is that Genode currently supports neither shared interrupts nor message-signaled interrupts on NOVA's base platform. Also, there is no IOMMU support yet, so one has to disable it when using the driver.
Most PCI devices are only connected to one or two PCI interrupt lines. This means you don't have much of a choice which IOAPIC pin an interrupt from a particular USB controller will signal. It is very likely that several of those USB controllers can only drive the same PCI interrupt line.
So you need to implement interrupt sharing or alternatively switch to MSI. With MSI you can give each device its dedicated interrupt vector. MSI is a mandatory feature of PCI-E, but older PCI devices may not support it.
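For what it's worth, a driver can detect whether a device supports MSI at all by walking its PCI capability list. Here is a small sketch; the config-space accessor is a placeholder for whatever the platform's PCI interface provides:

  #include <cstdint>

  /* placeholder: provided by the platform's PCI driver for the device in question */
  uint32_t pci_read_config(uint8_t offset);

  /* Returns the config-space offset of the MSI capability (ID 0x05),
   * or 0 if the device does not advertise MSI. */
  uint8_t find_msi_capability()
  {
      uint16_t status = pci_read_config(0x04) >> 16;
      if (!(status & 0x10))                          /* bit 4: capability list present */
          return 0;

      uint8_t cap = pci_read_config(0x34) & 0xfc;    /* capability pointer */
      while (cap) {
          uint32_t header = pci_read_config(cap);
          if ((header & 0xff) == 0x05)               /* MSI capability ID */
              return cap;
          cap = (header >> 8) & 0xfc;                /* next-capability pointer */
      }
      return 0;
  }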
Cheers, Udo
Hello,
what is missing for MSI support?
Julian
Udo Steinberg <udo@...121...> wrote:
On Wed, 18 Jul 2012 16:15:04 +0200 Sebastian Sumpf (SS) wrote:
SS> Nevertheless, there are still limitations on NOVA. Because of the large number of controllers the driver might discover (seven on my box), the PCI legacy interrupts may be exhausted pretty quickly, which in turn means that other drivers that depend on PCI IRQs will likely not get access to them. The reason is that Genode currently supports neither shared interrupts nor message-signaled interrupts on NOVA's base platform. Also, there is no IOMMU support yet, so one has to disable it when using the driver.
Most PCI devices are only connected to one or two PCI interrupt lines. This means you don't have much of a choice which IOAPIC pin an interrupt from a particular USB controller will signal. It is very likely that several of those USB controllers can only drive the same PCI interrupt line.
So you need to implement interrupt sharing or alternatively switch to MSI. With MSI you can give each device its dedicated interrupt vector. MSI is a mandatory feature of PCI-E, but older PCI devices may not support it.
Cheers, Udo
Hi Udo,
thanks for your very helpful explanation!
I am just wondering: If DMA addresses are host-virtual addresses and there are multiple processes running, how does the IOMMU know which virt-to-phys translation to use? I vaguely remember that you once told me that devices must be associated with PDs. Is this correct? If so, how does a driver PD express to the hypervisor that it deals with a certain device?
With the IOMMU active, a user-level device driver must specify which memory regions of its PD are DMA-able. This is done by setting the D-bit in the Delegate Transfer Item. All memory mappings where the D-bit is not set will not be DMA-able (unless the IOMMU is inactive). My recommendation is that device drivers map their own code and private data non-DMA-able and only allow DMA transfers to I/O buffer regions.
This seems to fit quite nicely with the recent addition of a facility to explicitly allocate DMA buffers via core's RAM session interface:
https://github.com/genodelabs/genode/commit/288fd4e56e636a0b3eb193353cf80286...
Currently, this function is meaningful only on ARM on Fiasco.OC. But it looks like a good way to handle the D-bit on NOVA without the need for any special precautions on the driver side. Because core is the pager of the driver, it could always map DMA buffers (and only those) with the D-bit set.
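Just to sketch how this could look from a driver's point of view — the allocation function name below is made up, it is not the actual interface added by the commit:

  /* Hypothetical driver-side view. 'alloc_dma_buffer' is a stand-in name;
   * env(), ram_session() and rm_session() are the regular Genode API. */

  #include <base/env.h>

  void *alloc_usb_transfer_buffer(Genode::size_t size)
  {
      using namespace Genode;

      Ram_dataspace_capability ds =
          env()->ram_session()->alloc_dma_buffer(size);   /* name is assumed */

      /* make the buffer visible in the driver's own address space */
      return env()->rm_session()->attach(ds);
  }

The driver would then hand out addresses within that attached buffer for DMA, without ever dealing with the D-bit itself.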
Also, when the IOMMU is active, the addresses programmed into DMA transfers must be host-virtual addresses. This alleviates device drivers from having to know physical memory addresses. They can DMA into their virtual address space. In contrast, when the IOMMU is inactive, the addresses in DMA transfers must be host-physical addresses.
Is the use of virtual addresses mandatory? If so, the driver must be aware of the presence of the IOMMU, mustn't it? It would be nice to find a way of using the IOMMU in a way that is transparent to the driver.
Cheers Norman
On Wed, 18 Jul 2012 21:34:24 +0200 Norman Feske (NF) wrote:
NF> I am just wondering: If DMA addresses are host-virtual addresses and there are multiple processes running, how does the IOMMU know which virt-to-phys translation to use? I vaguely remember that you once told me that devices must be associated with PDs. Is this correct? If so, how does a driver PD express to the hypervisor that it deals with a certain device?
You can bind a device to a PD using the assign_pci hypercall. Once bound, the device uses the DMA-able part of the memory space of that PD. If you don't bind the device, it will only work with the IOMMU being inactive.
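In (hypothetical) code, the binding could look like this; the wrapper signature is assumed, only the idea of presenting the device and getting it bound to the PD matters:

  /* assumed wrapper around the assign_pci hypercall:
   *   pd_sel - capability selector of the driver PD
   *   mmconf - address of the device's MMCONFIG (extended config space) page */
  bool assign_pci(unsigned pd_sel, unsigned long mmconf_page);

  bool bind_usb_controller(unsigned pd_sel, unsigned long mmconf_page)
  {
      if (!assign_pci(pd_sel, mmconf_page))
          return false;   /* binding failed, e.g. no IOMMU present */

      /* from now on, the device's DMA requests are translated through
       * the DMA-able part of this PD's memory space */
      return true;
  }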
NF> Currently, this function is meaningful only on ARM on Fiasco.OC. But it looks like a good way to handle the D-bit on NOVA without the need for any special precautions on the driver side. Because core is the pager of the driver, it could always map DMA buffers (and only those) with the D-bit set.
Right.
NF> > Also, when the IOMMU is active, the addresses programmed into DMA transfers must be host-virtual addresses. This alleviates device drivers from having to know physical memory addresses. They can DMA into their virtual address space. In contrast, when the IOMMU is inactive, the addresses in DMA transfers must be host-physical addresses.
NF>
NF> Is the use of virtual addresses mandatory? If so, the driver must be aware of the presence of the IOMMU, mustn't it? It would be nice to find a way of using the IOMMU in a way that is transparent to the driver.
The memory space of a PD consists of three subspaces:
1) The host page table translates host-linear to host-physical addresses
2) The guest page table translates guest-physical to host-physical addresses
3) The DMA page table translates DMA addresses to host-physical addresses
When you delegate memory
* all mappings go into subspace 1) (unless they target the hypervisor region)
* mappings with the G-bit additionally go into 2)
* mappings with the D-bit additionally go into 3)
It follows that subspaces 2) and 3) are subsets of subspace 1), and also that guest-physical address = host-virtual address = DMA address.
This has a number of benefits: If a device is directly assigned to a guest, the guest will obviously program guest-physical addresses into its DMA transfers, and the IOMMU will translate to host-physical. Likewise, a host device driver is expected to program a host-virtual address into its DMA transfers, which the IOMMU will translate into host-physical the very same way. Also, guest-physical memory of a VM is equivalent to host-virtual memory of the VMM, which allows the VMM easy access to its VM.
So yes, the device driver must know if the IOMMU is active or not, and it must choose the right type of address for DMA transfers. NUL tries to assign a device to a PD: if that fails, it assumes the IOMMU is inactive and uses host-physical addresses; if that succeeds, it uses host-virtual addresses.
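As a sketch of that strategy — assign_device_to_pd() and virt_to_phys() are placeholders for the respective platform calls:

  #include <cstdint>

  bool      assign_device_to_pd(unsigned long mmconf_page);  /* assign_pci wrapper */
  uintptr_t virt_to_phys(void *va);                          /* needed only without IOMMU */

  struct Dma_policy
  {
      bool const behind_iommu;

      Dma_policy(unsigned long mmconf_page)
      : behind_iommu(assign_device_to_pd(mmconf_page)) { }

      /* address to program into the device's DMA descriptors for buffer 'va' */
      uintptr_t dma_address(void *va) const
      {
          return behind_iommu ? (uintptr_t)va       /* IOMMU translates host-virtual */
                              : virt_to_phys(va);   /* no IOMMU: host-physical */
      }
  };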
Cheers, Udo
Hey,
there is a system call that assigns PCI devices to PDs. A PD proves that it is allowed to drive a specific device by presenting the respective PCI MMCONFIG page to the kernel. In addition, memory supposed to be visible to devices must be mapped with the D-bit set.
HTH, Julian
Norman Feske <norman.feske@...1...> wrote:
Hi Udo,
thanks for your very helpful explanation!
I am just wondering: If DMA addresses are host-virtual addresses and there are multiple processes running, how does the IOMMU know which virt-to-phys translation to use? I vaguely remember that you once told me that devices must be associated with PDs. Is this correct? If so, how does a driver PD express to the hypervisor that it deals with a certain device?
With the IOMMU active, a user-level device driver must specify which memory regions of its PD are DMA-able. This is done by setting the D-bit in the Delegate Transfer Item. All memory mappings where the D-bit is not set will not be DMA-able (unless the IOMMU is inactive). My recommendation is that device drivers map their own code and private data non-DMA-able and only allow DMA transfers to I/O buffer regions.
This seems to fit quite nicely with the recent addition of a facility to explicitly allocate DMA buffers via core's RAM session interface:
https://github.com/genodelabs/genode/commit/288fd4e56e636a0b3eb193353cf80286...
Currently, this function is meaningful only on ARM on Fiasco.OC. But it looks like a good way to handle the D-bit on NOVA without the need for any special precautions on the driver side. Because core is the pager of the driver, it could always map DMA buffers (and only those) with the D-bit set.
Also, when the IOMMU is active, the addresses programmed into DMA transfers must be host-virtual addresses. This alleviates device drivers from having to know physical memory addresses. They can DMA into their virtual address space. In contrast, when the IOMMU is inactive, the addresses in DMA transfers must be host-physical addresses.
Is the use of virtual addresses mandatory? If so, the driver must be aware of the presence of the IOMMU, mustn't it? It would be nice to find a way of using the IOMMU in a way that is transparent to the driver.
Cheers Norman
On Wednesday, 18.07.2012, at 11:44 +0200, Udo Steinberg wrote:
An active IOMMU will only let interrupt vectors through that have been explicitly whitelisted by the hypervisor. Applications request an interrupt by means of the assign_gsi hypercall and also specify which CPU they want the interrupt routed to. This must be done for any interrupt (IOAPIC or MSI).
One quick addition to this: If you guys implement that, be sure to test it on a box that actually has Interrupt Remapping support. The kernel will hide the difference (it magically returns the correct MSI address/value pair to program), but if it works on a box with Interrupt Remapping support, you can be sure that you did everything correctly.
Udo, is there an easy way to figure out if a system supports Interrupt Remapping, short of looking into ACPI tables?
(I'm thinking about adding some code to the hypervisor, causing sem_down to fail on an interrupt semaphore unless the interrupt has been configured using assign_gsi.)
+1 from me. That would cause broken code to fail early.
Julian
On Fri, 20 Jul 2012 12:22:37 -0700 Julian Stecklina (JS) wrote:
JS> One quick addition to this: If you guys implement that, be sure to test it on a box that actually has Interrupt Remapping support. The kernel will hide the difference (it magically returns the correct MSI address/value pair to program), but if it works on a box with Interrupt Remapping support, you can be sure that you did everything correctly.
An application doesn't need to know if interrupt remapping is enabled or not. It just has to program the MSI values given by the hypervisor into the device (if the device uses MSI rather than IOAPIC).
I think what Julian is hinting at is that incorrect code that appears to work on machines without interrupt remapping will fail once it runs on a machine with interrupt remapping. As with DMA remapping, this is due to the tighter restrictions enforced by the hypervisor. To be on the safe side, an application that configures or uses interrupts must:
1) use assign_pci for each interrupt prior to using it
2) use the MSI values returned by assign_gsi, rather than making up its own
JS> Udo, is there an easy way to figure out if a system supports Interrupt Remapping, short of looking into ACPI tables?
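As a sketch of point 2): the MSI address/data pair handed out by the hypervisor is written into the device's MSI capability (register layout per the PCI spec); the config-space accessors below are placeholders, and where exactly the values come from (assign_gsi) is covered above:

  #include <cstdint>

  /* placeholders: config-space access to the device in question */
  uint32_t pci_read_config(uint8_t offset);
  void     pci_write_config(uint8_t offset, uint32_t value);

  struct Msi_info { uint64_t addr; uint16_t data; };   /* returned by assign_gsi */

  /* Program the hypervisor-provided values into the device's MSI capability
   * located at config-space offset 'cap'. */
  void enable_msi(uint8_t cap, Msi_info const &msi)
  {
      uint32_t ctrl = pci_read_config(cap);
      bool const is64 = ctrl & (1u << 23);          /* 64-bit address capable */

      pci_write_config(cap + 0x4, (uint32_t)msi.addr);
      if (is64) {
          pci_write_config(cap + 0x8, (uint32_t)(msi.addr >> 32));
          pci_write_config(cap + 0xc, msi.data);
      } else {
          pci_write_config(cap + 0x8, msi.data);
      }

      pci_write_config(cap, ctrl | (1u << 16));     /* set the MSI enable bit */
  }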
You could infer that from looking at the MSI data provided by the hypervisor. Or the hypervisor could announce it in the HIP. But like I said, userland doesn't really need to know.
JS> > (I'm thinking about adding some code to the hypervisor, causing sem_down to fail on an interrupt semaphore unless the interrupt has been configured using assign_gsi.)
JS>
JS> +1 from me. That would cause broken code to fail early.
The reason I haven't done it yet is that it adds a branch to every sem_down on an interrupt semaphore, which is overhead you pay every time, just to get some additional checking at initialization time.
On Fri, 20 Jul 2012 22:54:35 +0200 Udo Steinberg (US) wrote:
US> To be on the safe side, an application that configures or uses interrupts must:
US> 1) use assign_pci for each interrupt prior to using it
I mean assign_gsi of course.
- Udo
Udo Steinberg <udo@...121...> wrote:
On Fri, 20 Jul 2012 12:22:37 -0700 Julian Stecklina (JS) wrote:
JS> Udo, is there an easy way to figure out if a system supports Interrupt Remapping, short of looking into ACPI tables?
You could infer that from looking at the MSI data provided by the hypervisor. Or the hypervisor could announce it in the HIP. But like I said, userland doesn't really need to know.
I meant it more in the way of figuring out which test box to use. :-) ark.intel.com doesn't list this unfortunately.
And yes, correct userland code should be completely oblivious to whether IRQ remapping is used.
JS> > (I'm thinking about adding some code to the hypervisor, causing sem_down to fail on an interrupt semaphore unless the interrupt has been configured using assign_gsi.)
JS>
JS> +1 from me. That would cause broken code to fail early.
The reason I haven't done it yet is that it adds a branch to every sem_down on an interrupt semaphore, which is overhead you pay every time, just to get some additional checking at initialization time.
You could enable it only for debug builds, but I don't think it makes any noticeable difference either way.
Julian