Genode on RPI

Tue Jan 29 14:02:46 CET 2019

Hello Tomasz,

thank you very much for sharing your experimentation efforts and
insights and for starting the discussion. Please see my comments
inline below.

On Mon, Jan 28, 2019 at 06:31:33PM +0100, Tomasz Gajewski wrote:
> 
> Hi,
> 
> last weeks of attempts gave me some small progress with code,
> understanding of some aspects of ARM (AArch32) architecture and some
> knowledge about how low level hw kernel is implemented.
> 
> I have some thoughts and questions about how some issues should/could be
> implemented that I'd like to discuss.
> 
> 
> Firstly I think that in my ignorance I took a wrong path to experiment
> with Raspberry Pi 3B+. It has a Cortex A53 (ARMv8-A) processor but as I
> want to target AArch32 (and initially without SMP) I thought that
> modifying the current rpi target is the best choice. Now I think that I
> should have started with the newest ARM processor supported by hw
> (Cortex A15 if I correctly understand) and I would not have been stopped
> by the issues I had. Or maybe not...
> 

Indeed, using a fork of the most recent processor supported in hw as a
starting point might be better.

> 
> 
> When last time I wrote about my progress I couldn't pass through
> following code:
> 
> >     sctlr = Cpu::Sctlr::read();
> >     Cpu::Sctlr::C::set(sctlr, 1);   // enable data cache
> >     Cpu::Sctlr::I::set(sctlr, 1);   // enable instruction cache
> >     Cpu::Sctlr::M::set(sctlr, 1);   // enable mmu
> >     Cpu::Sctlr::write(sctlr);
> 
> After reading and making different attempts I checked that enabling
> instruction cache does not cause problems, enabling data cache or mmu
> individually does allow to go further but enabling both does not. But I
> thought that for initial experiments I can live without data cache. But
> soon I found out that I was wrong.
> 
> After many attempts I found a place where it halted. It was deep inside
> Genode::log in ... assembly in atomic.h in cmpxchg. And reading gave me
> another unpleasant surprise that it will not work if I don't have data
> cache enabled... So I had to go back.

Yes, we are using ldrex/strex especially to support multi-processor
systems. Those instructions use the cache-coherency mechanisms of the
processors when working on "cacheable" memory. If you do not turn on
the caches and the snoop-control unit (SMP bit in Armv7), cmpxchg
won't work properly.
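
For illustration, the compare-and-exchange pattern behind such a lock
can be sketched in portable C++. This is not the code from atomic.h:
the name Spinlock_sketch is hypothetical, and std::atomic is swapped in
for the hand-written assembly. On Armv7, a compiler typically lowers
compare_exchange_strong to an ldrex/strex loop, which is where the
dependency on the memory attributes comes from.

```cpp
#include <atomic>
#include <cassert>

/* hypothetical illustration, not Genode's atomic.h */
class Spinlock_sketch
{
    private:

        enum { UNLOCKED = 0, LOCKED = 1 };

        std::atomic<int> _state { UNLOCKED };

    public:

        /* single attempt: succeeds only if the lock was free */
        bool try_lock()
        {
            int expected = UNLOCKED;
            return _state.compare_exchange_strong(expected, LOCKED,
                                                  std::memory_order_acquire,
                                                  std::memory_order_relaxed);
        }

        /* spin until the compare-and-exchange succeeds */
        void lock() { while (!try_lock()) { } }

        void unlock() { _state.store(UNLOCKED, std::memory_order_release); }
};
```

A second try_lock() on a held lock fails until unlock() is called,
which is the behaviour the spinlock relies on.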

> 
> After two weeks of poking around finally I found a workaround that
> allowed me to go further. I've changed Page_flags for all types from
> CACHED to UNCACHED and it allowed to pass through enabling mmu with
> data cache enabled (of course without real caching) and through spinlock
> in atomic.h.
> 
> 
> Next thing was to make UART working. Unfortunately version 3 has
> different UART device enabled by default (it has two) and it required a
> different driver which I implemented.
> 
> Seeing "kernel initialized" passed through serial connection was a real
> pleasure after analyzing logs in memory for two weeks.
> 
> 
> So I thought that it is a good moment to write my thoughts and
> questions.
> 
> 
> For writing traces to memory I've created a simple utility (set of
> macros in assembly and C++) to write simple debug values to a buffer in
> memory. I used it to diagnose what is going on before serial connection
> is working.
> 
> It is specified by a buffer address and size. At the beginning a pointer
> to current position in buffer is stored. A 1-1 mapping is inserted into
> TLB for this region and I had to make changes in virtual memory layout
> of core. Currently addresses are hardcoded but I can polish it to a
> state where it could be enabled/disabled with some build option or some
> defines in one place in the code if you'd like to use this. I pushed
> current state of my work [1] (without any cleanup yet). Utility macros
> are in:
>  - repos/base/include/base/memtrace.h
>  - repos/base-hw/src/bootstrap/spec/arm/crt0.s
> Would you be interested in adding something like this to Genode? I would
> create issue and propose some version.
> 

Very cool that you could help yourself that way!
In general, we already have a tracing mechanism in Genode, where
trace points are collected in memory, and might get aggregated even
from many different components for debug or optimization reasons.
However, that mechanism is based on core's functionality and is
therefore only appropriate for components running on top of
core/kernel.

Another option used in NOVA/Genode is to write all kernel messages
into a memory buffer and export it to the userland. Thereby, you can
aggregate messages in headless systems or systems without AMT or
serial line. I think this way might be appropriate for base-hw too.
However, I would not introduce new debug macros, but just add a simple
serial driver, which isn't using a special device but a portion of
memory for printing. Then one can switch the UART to that model if
there is none, or if it is troublesome. What do you think about that
approach?
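
To make the idea a bit more concrete, here is a minimal sketch of such
a memory-backed "serial driver". All names (Memory_uart_sketch,
put_char, used, at) are made up for this mail and do not exist in
base-hw; the buffer is a plain ring buffer that overwrites the oldest
characters once it is full.

```cpp
#include <cassert>
#include <cstddef>

/*
 * Hypothetical sketch: offers put_char() like a UART driver would,
 * but stores the characters in a ring buffer in memory.
 */
template <std::size_t SIZE>
class Memory_uart_sketch
{
    private:

        char        _buf[SIZE] { };
        std::size_t _head { 0 };  /* total number of characters written */

    public:

        void put_char(char c)
        {
            _buf[_head % SIZE] = c;
            _head++;
        }

        void print(char const *s) { for (; *s; s++) put_char(*s); }

        /* number of characters currently held (at most SIZE) */
        std::size_t used() const { return _head < SIZE ? _head : SIZE; }

        /* i-th oldest character still in the buffer, requires i < used() */
        char at(std::size_t i) const
        {
            assert(i < used());
            std::size_t const first = _head < SIZE ? 0 : _head % SIZE;
            return _buf[(first + i) % SIZE];
        }
};
```

A real variant would place the buffer at a known physical address so
that it can be inspected or exported later; that part is left out here.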

I have to admit that usually I can use a JTAG debugger on most
platforms to tackle such early initialization problems. On the other
hand I'm quite cautious about adding pure debug features to the
"microkernel" ;-).

> 
> 
> 
> I have some general doubts about how some issues could be resolved. I
> started experimenting with an idea to make it possible to have a Sculpt
> for Raspberry Pi. And I thought one could be created in such a way
> that a single binary could run on any Pi. I knew that they are all ARM
> devices and even though some are 64bit they can work in AArch32 so I
> wanted to treat all versions as just 32bit ARM boards.
> 
> I knew that they differ in list of devices so I thought that some kind
> of configuration would have to be passed during booting process so
> proper device drivers could be loaded and with proper configuration. I
> checked that a solution for this problem (using one binary for different
> ARM boards) used by at least Linux and U-boot is device tree. My plan
> was (and still is unless something better is proposed) to try to
> experiment with incorporating support for device tree to a platform
> driver - if I correctly understand it is a place where similar
> functionality is implemented for x86_64.
> 
> 
> Browsing through bootstrap and core code in base-hw for past weeks made
> me realise that to support one binary for different Pi devices which
> have different generations of processors other problems would have to be
> resolved.
> 
> 
> If I correctly understand currently when building base-hw for a specific
> ARM board all configuration is provided during build process. It is
> performed by:
> 
>  a. specifying target processor version (-m for g++)
> 
>  b. specifying list of files to compile and include to binary that
>     contain implementations of functions that differ between different
>     processor generations and to enable some functionality (e.g. for
>     virtualization support for ARM)
> 
>  c. constants in code - to provide proper memory ranges, MMIO addresses,
>     etc. for device drivers
> 

Don't get me wrong, striving for the stars is welcome, but I think the
goal of having a generic kernel image for all RPIs or even all ARM
devices is a bit too ambitious right now.
Although Linux ARM developers have been working on it for many years,
they aren't there yet. Look at current RPI distro images like
Raspbian: they deliver at least two different kernels _and_ multiple
device trees. The bootloader code has to decide which one is loaded.
Currently, there is no generic convention followed by all SoC/board
vendors, bootloaders, etc. We have to be pragmatic here.
Given the current state of, e.g., the Raspberry Pi universe, where you
have to decide what needs to be loaded anyway, I do not see a general
advantage in differentiating between configuration data and some
small, statically linked image.

> 
> Having all this information during build allows to:
> 
>  - optimize code by inlining even processor specific code
> 
>  - minimize resulting binary (maybe not so important) and therefore
>    memory footprint on running system (more important)
> 
> 
> 
> On the other hand wikipedia (here [2]) currently mentions 15 models of
> Raspberry Pi which differ in:
> 
>  - processor generations (ARM11, Cortex-A7, Cortex-A53) - that breaks
>    point (a)
> 
>  - have different devices (with/without ethernet, wireless, etc.) -
>    that breaks point (b)
> 
>  - have different runtime configurations that can be changed by
>    configuring firmware (e.g. different partitioning of RAM between
>    graphics and operating system can be selected using a configuration
>    option) - that contradicts (c)

Surely, those are too many dimensions to provide different
full-fledged system images for. Also, we do not want too many platform
differentiations in the codebase. Again, I vote for pragmatism and for
targeting the different platforms step-by-step and not all-in-one.
Currently, I would think that having different targets for the
different architectures (Armv6, Armv7, Armv8) is a good starting
point. It is a natural boundary, and we can compile all components to
target the correct one.
Then the question is whether there is a mechanism in the
Broadcom SoC to identify the correct model at runtime, e.g. some
identification registers. If possible, we can of course hide the model
differences in the platform/bootstrap code within one image.
I would omit different runtime configurations of the firmware for the
moment and just support the default one.
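
As a rough sketch of what such a runtime check could look like on the
CPU side: the ARM Main ID Register (MIDR) carries a primary part
number in bits [15:4]. This only identifies the CPU core, not the
board or the Broadcom SoC, so it does not answer the question above by
itself. The names Cpu_kind and cpu_kind_from_midr are hypothetical,
and reading the register (a privileged coprocessor access) is
deliberately left out, the value is passed in.

```cpp
#include <cassert>
#include <cstdint>

/* hypothetical names, for illustration only */
enum class Cpu_kind { ARM1176, CORTEX_A7, CORTEX_A53, UNKNOWN };

/*
 * Decode the primary part number, bits [15:4] of the ARM Main ID
 * Register (MIDR), into one of the cores used on Raspberry Pi boards.
 */
inline Cpu_kind cpu_kind_from_midr(std::uint32_t midr)
{
    switch ((midr >> 4) & 0xfff) {
    case 0xb76: return Cpu_kind::ARM1176;
    case 0xc07: return Cpu_kind::CORTEX_A7;
    case 0xd03: return Cpu_kind::CORTEX_A53;
    default:    return Cpu_kind::UNKNOWN;
    }
}
```

Bootstrap code could branch on the result once, early during boot, and
keep the rest of the image free of model checks.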

> 
> 
> 
> When thinking about supporting different Pi devices (and more generally
> other ARM boards) I think that there are two areas to consider:
> 
>  A. support for runtime/startup configuration of devices that will have
>     drivers running in different processes - this is a part that I knew
>     about from the beginning of my experimentation. I still think that
>     implementing support for device tree (or some alternative) is a
>     proper solution for this part (what would be a method of passing
>     configuration to device drivers is an open question)
> 
>  B. support for passing configuration that is required by bootstrap and
>     core. Here drivers are selected with C++ code e.g.:
> 
>         //using Serial   = Genode::Pl011_uart;
>         using Serial   = Genode::Mini_uart;
> 
>     Selecting UART driver is an interesting example as RPI3B+ has two
>     UART devices and which one is available is decided by using proper
>     configuration for firmware (mini uart is available by default and
>     Pl011 uart is used internally for communication with Bluetooth but it
>     can be changed with configuration before starting operating system).
> 
> 
> Now questions:
> 
>  1. Do you (Genode) plan to have or would like to have such configurable
>     support for different ARM devices?

I can only state my point of view, which is not necessarily the way it
will go, but anyway: nowadays, configuration starts above core/kernel.
We should keep it that way for the time being to limit the risk of
over-engineering in this minimal, critical code base. I can imagine
that we weaken this claim for the bootstrap component in the future,
because that code is critical during the boot stage only and gets
thrown away afterwards.

Above core, we need to know:

* What devices exist
* What resources they need
* Whether the driver needs additional configuration beyond the
  resources, e.g.: operation mode of the PHY interface for
  ethernet devices ...
* How to power/clock the device at runtime, if dynamic power
  consumption is a topic
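
The points above can be put into a small data structure to show what
such a device description might hold. This is only a sketch: the type
Device_sketch, its members, and all values in example_ethernet()
(address, interrupt number, property and clock names) are invented for
illustration and do not describe a real board or an existing Genode
interface.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

/* all names and values below are hypothetical */
struct Mmio_range { std::uint64_t base; std::uint64_t size; };

struct Property { std::string name; std::string value; };

struct Device_sketch
{
    std::string              name;        /* which device exists        */
    std::vector<Mmio_range>  mmio;        /* resources: register areas  */
    std::vector<unsigned>    irqs;        /* resources: interrupts      */
    std::vector<Property>    properties;  /* driver-specific settings   */
    std::vector<std::string> clocks;      /* power/clock dependencies   */

    /* look up a driver-specific setting, fall back to a default */
    std::string property(std::string const &key,
                         std::string const &fallback) const
    {
        for (Property const &p : properties)
            if (p.name == key) return p.value;
        return fallback;
    }
};

/* made-up example entry for an ethernet device */
inline Device_sketch example_ethernet()
{
    Device_sketch d;
    d.name = "ethernet";
    d.mmio.push_back({ 0x10000000, 0x1000 });
    d.irqs.push_back(42);
    d.properties.push_back({ "phy-mode", "rgmii" });
    d.clocks.push_back("enet_clk");
    return d;
}
```

Whether such information comes from a device tree or from some other
configuration source is a separate decision.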

> 
>  2. Do you think that creating a generic platform driver for ARM (to
>     support A) that works using information provided by device tree
>     support is generally a good idea? What alternatives do you consider
>     better?

All functionality described above is realized using open firmware /
flattened device tree in Linux and Co. I agree that it is appealing to
be able to access all the platform data for free. But this is only
half the truth for the following reasons:

# Device tree support has been a moving target over the last years
  with lots of changes in the implementation and in the trees
  themselves, and it is not foreseeable that this will change
# There are reams of device-specific attributes. Please, have a look
  at Documentation/devicetree in the Linux kernel. There are over
  3000 files. The whole "of_" bureaucracy in the Linux kernel
  shows its complexity.
# Different device trees and their "generic" language should not
  suggest that we will have one platform driver to rule them all.
  There are tons of "platform devices" in the Linux kernel necessary
  to provide the resources that toplevel devices need. That means for
  each SoC you still need a specific platform driver that supports all
  the multiplexing units for pins, power, clocking etc.
# Device trees also describe a lot of relations between devices.
  For instance, a device might get its interrupts from the interrupt
  controller or from a GPIO controller. All routes to some kind of
  multiplexer device are part of the tree. It is much easier to
  apply this configuration to a monolithic component containing
  all multiplexer devices than to split it across multiple components.
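
The last point can be illustrated with a heavily simplified sketch.
Device trees express interrupt routing through "interrupt-parent"
links; the types and the function below are hypothetical stand-ins
that only model that one link per node. Each hop in the resulting
route is a device that, in a componentized system, may live in a
different component.

```cpp
#include <cassert>
#include <string>
#include <vector>

/* hypothetical, heavily simplified stand-in for a device-tree node */
struct Node_sketch
{
    std::string name;
    int         interrupt_parent;  /* index into the node list, -1 if none */
};

/* follow the interrupt-parent links from a device up to the root controller */
inline std::vector<std::string>
interrupt_route(std::vector<Node_sketch> const &nodes, int start)
{
    std::vector<std::string> route;
    int i = start;

    /* the size check bounds the walk even if the links form a cycle */
    while (i >= 0 && i < (int)nodes.size() && route.size() < nodes.size()) {
        route.push_back(nodes[i].name);
        i = nodes[i].interrupt_parent;
    }
    return route;
}
```

For a button wired to a GPIO controller that in turn signals the main
interrupt controller, the route contains three entries, and each of
them needs to be configured consistently.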

To sum it up: I'm not sure whether supporting device trees in Genode's
ARM platform driver is the right way to go, but I do not want to
exclude it completely. Maybe I'm wrong and I over-estimate the
complexity of the core functionality.
Surely we need something comparable at some point. Maybe, it would be
good to start it as an experiment for one specific SoC.

> 
>  3. Do you see any way to implement support for B?
> 

Surely, but at least here at Genode Labs it is not highly prioritized
right now. Currently, we try to limit the platform support and care to
the NXP i.MX* platforms. Here, we plan to support i.MX8 as soon as
possible, which has Cortex-A53 cores too.

> 
> I'd very much like to receive some comments about it.
> 

Thank you very much for sharing your thoughts.

Best regards
Stefan

> -- 
> Tomasz Gajewski
> 
> 
> [1] https://github.com/tomga/genode/tree/rpi3bplus
> [2] https://en.wikipedia.org/wiki/Raspberry_Pi#Specifications

-- 
Stefan Kalkowski
Genode Labs

https://github.com/skalk | https://genode.org