Hi,
Awesome! I know it's not a good idea to disable the timer, but I thought disabling it might give me more information about what's happening. The data-abort you get is the same one I got after disabling the timer. It's Linux trying to access the GPIO. You can take a look at what I did to disable that access here: https://github.com/envy/linux/commit/1aade6f960aec6fa5a18e6b6effa82f8ff1533c... This is basically what Stefan Kalkowski did here: https://github.com/skalk/linux/commit/eccce1c595d7962c95086d6fa60291f7c2c1a4...
I will try your fix tomorrow, I forgot to bring my development VM to uni today...
Cheers, Nico ________________________________________ From: Martin Stein Sent: Monday, 26 January 2015 12:14 To: Genode OS Framework Mailing List Subject: Re: AW: AW: Running Linux 3.14 as a Genode VM
Hi Nico,
Thank you for your detailed feedback about your progress :)
Although it enabled you to go on, I don't think that it is a good idea to disable the GPT timer in the long term. Thus I've dug a little deeper into the "Calibrating delay loop" problem. What goes on after the 'msr CPSR_c, rx' is that the kernel catches GPT timer IRQs all the time and thus doesn't get any further. The IRQs aren't handled because the driver for the interrupt controller (arch/arm/mach-imx/tzic.c) accepts secure interrupts only in 'tzic_handle_irq' (see 'stat = ... & __raw_readl(tzic_base + TZIC_INTSEC0(i))'). Looking around a bit, I've found that the driver attempts to set all IRQs secure by doing '__raw_writel(0xFFFFFFFF, tzic_base + TZIC_INTSEC0(i));' in 'tzic_init_irq' which, of course, gets ignored when Linux runs as nonsecure VM. So I've configured out the masking in 'tzic_handle_irq' via a '#define GENODE_TZ_VMM' switch and after that, my Linux got further:
Linux version 3.16.2-ga3bd210-dirty (lypo@...207...) (gcc version 4.9.2 20140904 (prerelease) (crosstool-NG linaro-1.13.1-4.9-2014.09 - Linaro GCC 4.9-2014.09) ) #1 PREEMPT Mon Jan 26 11:42:29 CET 2015
CPU: ARMv7 Processor [412fc085] revision 5 (ARMv7), cr=10c5387d
CPU: PIPT / VIPT nonaliasing data cache, VIPT aliasing instruction cache
Machine model: Freescale i.MX53 Quick Start Board
bootconsole [earlycon0] enabled
Memory policy: Data cache writeback
On node 0 totalpages: 65536
free_area_init_node: node 0, pgdat 803c1be8, node_mem_map 8fdf8000
Normal zone: 512 pages used for memmap
Normal zone: 0 pages reserved
Normal zone: 65536 pages, LIFO batch:15
CPU: All CPU(s) started in SVC mode.
pcpu-alloc: s0 r0 d32768 u32768 alloc=1*32768
pcpu-alloc: [0] 0
Built 1 zonelists in Zone order, mobility grouping on. Total pages: 65024
Kernel command line: console=ttymxc0,115200 earlyprintk loglevel=10
PID hash table entries: 1024 (order: 0, 4096 bytes)
Dentry cache hash table entries: 32768 (order: 5, 131072 bytes)
Inode-cache hash table entries: 16384 (order: 4, 65536 bytes)
Memory: 255824K/262144K available (2662K kernel code, 121K rwdata, 904K rodata, 120K init, 86K bss, 6320K reserved, 0K highmem)
Virtual kern> <40fiq el memory layout:
vector : 0xffff0000 - 0xffff1000 ( 4 kB)
fixmap : 0xffc00000 - 0xffe00000 (2048 kB)
vmalloc : 0x90800000 - 0xff000000 (1768 MB)
lowmem : 0x80000000 - 0x90000000 ( 256 MB)
pkmap : 0x7fe00000 - 0x80000000 ( 2 MB)
modules : 0x7f000000 - 0x7fe00000 ( 14 MB)
.text : 0x80008000 - 0x80383a94 (3567 kB)
.init : 0x80384000 - 0x803a21bc ( 121 kB)
.data : 0x803a4000 - 0x803c2620 ( 122 kB)
.bss : 0x803c262c - 0x803d8018 ( 87 kB)
SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
Preemptible hierarchical RCU implementation.
NR_IRQS:0 nr_irqs:0 0
TrustZone Interrupt Controller (TZIC) initialized
CPU identified as i.MX53, silicon rev 2.1
Switching to timer-based delay loop
sched_clock: 32 bits at 33MHz, resolution 29ns, wraps every 128849015778ns
clocksource_of_init: no matching clocksources found
Console: colour dummy device 80x30
Calibrating delay loop (skipped), value calculated using timer frequency.. 66.66 BogoMIPS (lpj=333333)
pid_max: default: 32768 minimum: 301
Mount-cache hash table entri> <40fiq es: 1024 (order: 0, 4096 bytes)
Mountpoint-cache hash table entries: 1024 (order: 0, 4096 bytes)
CPU: Testing write buffer coherency: ok
Setting up static identity map for 0x8028a820 - 0x8028a878
VFP support v0.3: implementor 41 architecture 3 part 30 variant c rev 2
pinctrl core: initialized pinctrl subsystem
NET: Registered protocol family 16
DMA: preallocated 256 KiB pool for atomic coherent allocations
dab f5784008
def
[init -> tz_vmm] Cpu state:
[init -> tz_vmm] Register Virt Phys
[init -> tz_vmm] ---------------------------------
[init -> tz_vmm] r0 = f5784008 [53f84008]
[init -> tz_vmm] r1 = 8f896c10 [8f896c10]
[init -> tz_vmm] r2 = 00000001 [00000000]
[init -> tz_vmm] r3 = 8018d4e0 [8018d4e0]
[init -> tz_vmm] r4 = 8f838e68 [8f838e68]
[init -> tz_vmm] r5 = 00000000 [00000000]
[init -> tz_vmm] r6 = 00000000 [00000000]
[init -> tz_vmm] r7 = f5784004 [53f84004]
[init -> tz_vmm] r8 = 8f898340 [8f898340]
[init -> tz_vmm] r9 = 8fdedfcc [8fdedfcc]
[init -> tz_vmm] r10 = 8f838e68 [8f838e68]
[init -> tz_vmm] r11 = 00000000 [00000000]
[init -> tz_vmm] r12 = f5784000 [53f84000]
[init -> tz_vmm] sp = 00000000 [00000000]
[init -> tz_vmm] lr = 00000000 [00000000]
[init -> tz_vmm] ip = 8018d4e0 [8018d4e0]
[init -> tz_vmm] cpsr = 20000013
[init -> tz_vmm] sp_und = 803c27d8 [803c27d8]
[init -> tz_vmm] lr_und = 803c27d8 [803c27d8]
[init -> tz_vmm] spsr_und = 00000000 [00000000]
[init -> tz_vmm] sp_svc = 8f853cd0 [8f853cd0]
[init -> tz_vmm] lr_svc = 8018dbbc [8018dbbc]
[init -> tz_vmm] spsr_svc = 20000013 [00000000]
[init -> tz_vmm] sp_abt = 803c27cc [803c27cc]
[init -> tz_vmm] lr_abt = 803c27cc [803c27cc]
[init -> tz_vmm] spsr_abt = 00000000 [00000000]
[init -> tz_vmm] sp_irq = 803c27c0 [803c27c0]
[init -> tz_vmm] lr_irq = 80010f80 [80010f80]
[init -> tz_vmm] spsr_irq = 20000093 [00000000]
[init -> tz_vmm] sp_fiq = 00000000 [00000000]
[init -> tz_vmm] lr_fiq = 00000000 [00000000]
[init -> tz_vmm] spsr_fiq = 00000000 [00000000]
[init -> tz_vmm] ttbr0 = 80004019
[init -> tz_vmm] ttbr1 = 80004019
[init -> tz_vmm] ttbrc = 00000000
[init -> tz_vmm] dfar = f5784008 [53f84008]
[init -> tz_vmm] exception = data_abort
[init -> tz_vmm] Could not handle data-abort will exit!
To be on the safe side, I've also applied the switch to 'tzic_set_irq_fiq' to prevent Linux from thinking it is able to use FIQs:
    /* assumes GENODE_TZ_VMM is defined to a nonzero value */
    if (GENODE_TZ_VMM) {
        printk(KERN_NOTICE "Warning: Can't use FIQ in nonsecure world\n");
        return -EINVAL;
    }
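For reference, the effect of the masking in 'tzic_handle_irq' can be modelled in plain C outside the kernel. This is only a sketch under assumptions: 'pending_to_handle' and its parameters are hypothetical names, and the two inputs stand in for the values read from TZIC_HIPND(i) and TZIC_INTSEC0(i). It is not the driver code itself.

```c
#include <stdint.h>

/*
 * User-space model of the per-register pending computation in
 * tzic_handle_irq. All names are hypothetical, for illustration only.
 *
 * hipnd  - stand-in for the value read from TZIC_HIPND(i)
 * intsec - stand-in for the value read from TZIC_INTSEC0(i)
 * tz_vmm - nonzero models a build with the GENODE_TZ_VMM switch
 */
static uint32_t pending_to_handle(uint32_t hipnd, uint32_t intsec, int tz_vmm)
{
	if (tz_vmm)
		return hipnd;      /* masking configured out */

	return hipnd & intsec;     /* stock path: filtered by INTSEC */
}
```

With hipnd = 0x00800000 and intsec = 0, the stock path yields 0, so the pending interrupt is never picked up, while the GENODE_TZ_VMM path yields 0x00800000.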
Cheers, Martin
On 21.01.2015 16:10, Nico Weichbrodt wrote:
Hi,
I have some more information which might be helpful. I uploaded all my changes so far to github: https://github.com/envy/linux/tree/weichbr-docker I didn't start with a vanilla kernel. Instead I used the script from https://eewiki.net/display/linuxonarm/i.MX53+Quick+Start#i.MX53QuickStart-Li... which downloads the vanilla kernel and then applied a set of patches to it.
Here is what I found so far: If you go to arch/arm/mach-imx/time.c at around line 320, there is a block for initializing the timer. If you comment this out, the kernel boots but the clock is (of course) wrong. I then found a data-abort exception which was caused because Linux tried to access the GPIO. I disabled this in one of my commits like Stefan Kalkowski did. Then I am able to boot up until:
[ 0.000000] sched_clock: 32 bits at 33MHz, resolution 29ns, wraps every 128849015778ns
[ 128.835985] -- end mxc_timer_init()
[ 128.835989] -- end mx53_clocks_init()
[ 128.835993] Console: colour dummy device 80x30
[ 128.835997] Calibrating delay loop (skipped) preset value.. 996.14 BogoMIPS (lpj=4980736)
[ 128.836005] pid_max: default: 32768 minimum: 301
[ 128.836010] Security Framework initialized
[ 128.836014] Mount-cache hash table entries: 1024 (order: 0, 4096 bytes)
[ 128.836021] Mountpoint-cache hash table entries: 1024 (order: 0, 4096 bytes)
[ 128.836031] Initializing cgroup subsys debug
[ 128.836036] Initializing cgroup subsys devices
[ 128.836040] Initializing cgroup subsys freezer
[ 128.836044] Initializing cgroup subsys blkio
[ 128.836049] CPU: Testing write buffer coherency: ok
[ 128.836054] Setting up static identity map for 0xb03c0200 - 0xb03c0234
[ 128.836063] devtmpfs: initialized
[ 128.836068] VFP support v0.3: implementor 41 architecture 3 part 30 variant c rev 2
[ 128.836075] pinctrl core: initialized pinctrl subsystem
[ 128.836081] regulator-dummy: no parameters
[ 128.836086] NET: Registered protocol family 16
[ 128.836091] DMA: preallocated 256 KiB pool for atomic coherent allocations
[ 128.836104] imx53-pinctrl 53fa8000.iomuxc: initialized IMX pinctrl driver
[ 128.836115] bio: create slab <bio-0> at 0
[ 128.836120] SCSI subsystem initialized
[ 128.836123] pps_core: LinuxPPS API ver. 1 registered
[ 128.836128] pps_core: Software ver. 5.3.6 - Copyright 2005-2007 Rodolfo Giometti <giometti@...292...>
[ 128.836138] PTP clock support registered
[ 128.836142] Switched to clocksource mxc_timer1
[ 128.836153] NET: Registered protocol family 2
[ 128.836158] TCP established hash table entries: 4096 (order: 2, 16384 bytes)
[ 128.836165] TCP bind hash table entries: 4096 (order: 2, 16384 bytes)
[ 128.836172] TCP: Hash tables configured (established 4096 bind 4096)
[ 128.836178] TCP: reno registered
[ 128.836181] UDP hash table entries: 256 (order: 0, 4096 bytes)
[ 128.836187] UDP-Lite hash table entries: 256 (order: 0, 4096 bytes)
[ 128.836193] NET: Registered protocol family 1
[ 128.836198] RPC: Registered named UNIX socket transport module.
[ 128.836204] RPC: Registered udp transport module.
[ 128.836209] RPC: Registered tcp transport module.
[ 128.836213] RPC: Registered tcp NFSv4.1 backchannel transport module.
[ 128.836221] futex hash table entries: 256 (order: -1, 3072 bytes)
[ 128.836237] msgmni has been set to 1004
[ 128.836242] io scheduler noop registered
[ 128.836246] io scheduler deadline registered
[ 128.836250] io scheduler cfq registered (default)
[ 128.836295] Serial: IMX driver
[ 128.836299] 53fbc000.serial: ttymxc0 at MMIO 0x53fbc000 (irq = 47, base_baud = 4166666) is a IMX
[ 128.836307] console [ttymxc0] enabled
[ 128.836311] bootconsole [earlycon0] disabled
[ 128.836319] loop: module loaded
[ 128.836322] tun: Universal TUN/TAP device driver, 1.6
[ 128.836327] tun: (C) 1999-2004 Max Krasnyansky <maxk@...293...>
[ 128.836334] 63fec000.ethernet supply phy not found, using dummy regulator
I also used the bootparams to give the kernel a preset lpj value. I took this from our working 2.6.x kernel.
I think that the kernel is still doing stuff it's not supposed to do. For example, I haven't found this whole block here: https://github.com/skalk/linux/blob/270ba1e8fb95e8b16e423e6a1bbbc9291276e35e... in the 3.14 kernel. I also wasn't able to find this block here: https://github.com/skalk/linux/blob/270ba1e8fb95e8b16e423e6a1bbbc9291276e35e... But for the second block I know that arch/arm/mach-imx/clk-imx51-imx53.c contains many calls to imx_clk_gate2() inside mx53_clocks_init() and mx5_clocks_common_init(). Those calls lead to arch/arm/mach-imx/clk-gate2.c, which sets an enable and a disable function. The disable function looks like it does the same thing as the removed code block in the 2.6.x kernel, except that it is more generic. If you add a printk() inside the disable function, you will see an endless stream of output after the [ 128.836307] console [ttymxc0] enabled line, and if you also add a dump_stack() call, you will see an endless stack trace which looks like an infinite loop to me.
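What such a generic two-bit clock gate does can be sketched in user-space C. This is a simplified model under assumptions, not the code from clk-gate2.c: the register is a plain value, 'bit_idx' is the position of the gate's two-bit field, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Model of a two-bit clock gate field inside a 32-bit register.
 * Assumption: both bits set means "clock on", both cleared means "off".
 * All function names are hypothetical, for illustration only.
 */

/* Return reg with the two-bit field at bit_idx switched on. */
static uint32_t gate2_enable(uint32_t reg, unsigned bit_idx)
{
	assert(bit_idx <= 30);
	return reg | ((uint32_t)3 << bit_idx);
}

/* Return reg with the two-bit field at bit_idx switched off. */
static uint32_t gate2_disable(uint32_t reg, unsigned bit_idx)
{
	assert(bit_idx <= 30);
	return reg & ~((uint32_t)3 << bit_idx);
}

/* Nonzero if both bits of the field at bit_idx are set. */
static int gate2_is_enabled(uint32_t reg, unsigned bit_idx)
{
	assert(bit_idx <= 30);
	return ((reg >> bit_idx) & 3u) == 3u;
}
```

In this model, disabling the field of a peripheral that something else still relies on simply clears its two bits, which is why a stream of disable calls is worth tracing.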
Maybe this helps.
Cheers, Nico ________________________________________ From: Martin Stein Sent: Tuesday, 20 January 2015 16:32 To: Genode OS Framework Mailing List Subject: Re: AW: Running Linux 3.14 as a Genode VM
Hi Nico,
I'm currently working on the issue of Linux 3.16 as TZ-VM on i.MX53. I get stuck at the same line of output as you and have found that it goes down to the ASM instruction 'msr CPSR_c, rx' in 'arch_local_irq_restore' called by 'local_irq_restore' called at the end of 'vprintk_emit' called by 'printk' that is called for printing "Calibrating delay loop (skipped), value calculated using timer frequency.. ". I have no fix so far but maybe this already helps you. I'll keep digging.
Btw. Genode uses the EPIT1 and EPIT2 timers while Linux uses the General Purpose Timer (GPT). Furthermore, Linux configures GPIO3_IRQH (ID 55) to signal the GPT timer interrupt. This interrupt is configured nonsecure and not used by the Genode side. So there seems to be no problem with the timer so far.
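To find the bit that belongs to a given interrupt ID in the TZIC's per-IRQ registers, the ID can be split into a register index and a bit position. A small sketch, assuming one bit per interrupt and 32 interrupts per 32-bit register, as the indexed 'TZIC_INTSEC0(i)' accesses in the driver suggest. The helper names are hypothetical.

```c
#include <stdint.h>

/* Assumption: one bit per IRQ, 32 IRQs per 32-bit register. */
#define IRQS_PER_REG 32u

/* Index i of the register (e.g. TZIC_INTSEC0(i)) that holds the IRQ. */
static unsigned irq_reg_index(unsigned irq)
{
	return irq / IRQS_PER_REG;
}

/* Single-bit mask of the IRQ within that register. */
static uint32_t irq_bit_mask(unsigned irq)
{
	return (uint32_t)1 << (irq % IRQS_PER_REG);
}
```

For ID 55 this gives register index 1 and bit 23, that is, mask 0x00800000.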
Cheers, Martin
On 15.01.2015 13:03, Nico Weichbrodt wrote:
Hi,
Thanks for your response. I can boot the Linux kernel with my device tree without Genode on the hardware and it boots fine.
I tried setting the lpj on the command line, but it seems that gets ignored. If I set it at the start of calibrate_delay(), the output switches to "Calibrating delay loop (skipped), preset value.." but it hangs. I think you are right with the "missing" interrupt, but how can I prevent Linux from using the same timer as Genode? I think it gets initialized in linux/arch/arm/mach-imx/time.c in mxc_clocksource_init(), but if I comment that out it doesn't work either. I also know that the clocks are initialized by Linux in linux/arch/arm/mach-imx/clk-imx51-imx53.c in mx5_clocks_common_init() and mx53_clocks_init(). But those look nothing like https://github.com/skalk/linux/blob/270ba1e8fb95e8b16e423e6a1bbbc9291276e35e... probably because of the device tree. This is my first time doing kernel hacking, so I don't really know what everything is doing yet (and my whole approach is trial-and-error).
I will try to provide you with the repositories.
Cheers, Nico
From: Stefan Kalkowski Sent: Wednesday, 14 January 2015 17:51 To: Genode OS Framework Mailing List Subject: Re: Running Linux 3.14 as a Genode VM
Hi Nico,
On 01/14/2015 04:16 PM, Nico Weichbrodt wrote:
Hi,
I'm a student at TU Braunschweig and I inherited a Genode project from another student. He used Genode on an i.MX53 Quick Start Board to boot a Linux 2.6.35 kernel (this one: https://github.com/skalk/linux) as a VM with TrustZone support. This works fine.
For my task I need a newer kernel, so I'm currently trying to upgrade to kernel 3.14. Since this kernel requires dtbs I modified Genode to load the dtb instead of ATAGs. I had to modify my dtb so the memory layout matches and to set the bootargs. My kernel now boots but hangs very soon at this point:
Uncompressing Linux... done, booting the kernel.
[ 0.000000] Booting Linux on physical CPU 0x0
[ 0.000000] Initializing cgroup subsys cpu
[ 0.000000] Initializing cgroup subsys cpuacct
[ 0.000000] Linux version 3.14.15-00329-gaed2dee-dirty (envy@...289.....) (gcc version 4.5.2 (Sourcery G++ Lite 2011.03-41) ) #40 Wed Jan 14 16:06:19 CET 2015
[ 0.000000] CPU: ARMv7 Processor [412fc085] revision 5 (ARMv7), cr=10c5387d
[ 0.000000] CPU: PIPT / VIPT nonaliasing data cache, VIPT aliasing instruction cache
[ 0.000000] Machine model: Freescale i.MX53 Quick Start-R Board
[ 0.000000] bootconsole [earlycon0] enabled
[ 0.000000] Memory policy: Data cache writeback
[ 0.000000] On node 0 totalpages: 131072
[ 0.000000] free_area_init_node: node 0, pgdat c05b36e8, node_mem_map dfbf9000
[ 0.000000] Normal zone: 1024 pages used for memmap
[ 0.000000] Normal zone: 0 pages reserved
[ 0.000000] Normal zone: 131072 pages, LIFO batch:31
[ 0.000000] CPU: All CPU(s) started in SVC mode.
[ 0.000000] pcpu-alloc: s0 r0 d32768 u32768 alloc=1*32768
[ 0.000000] pcpu-alloc: [0] 0
[ 0.000000] Built 1 zonelists in Zone order, mobility grouping on. Total pages: 130048
[ 0.000000] Kernel command line: console=ttymxc0,115200 earlyprintk loglevel=10
[ 0.000000] PID hash table entries: 2048 (order: 1, 8192 bytes)
[ 0.000000] Dentry cache hash table entries: 65536 (order: 6, 262144 bytes)
[ 0.000000] Inode-cache hash table entries: 32768 (order: 5, 131072 bytes)
[ 0.000000] Memory: 513696K/524288K available (4039K kernel code, 219K rwdata, 1384K rodata, 198K init, 123K bss, 10592K reserved, 0K highmem)
[ 0.000000] Virtual kernel memory layout:
[ 0.000000] vector : 0xffff0000 - 0xffff1000 ( 4 kB)
[ 0.000000] fixmap : 0xfff00000 - 0xfffe0000 ( 896 kB)
[ 0.000000] vmalloc : 0xe0800000 - 0xff000000 ( 488 MB)
[ 0.000000] lowmem : 0xc0000000 - 0xe0000000 ( 512 MB)
[ 0.000000] pkmap : 0xbfe00000 - 0xc0000000 ( 2 MB)
[ 0.000000] modules : 0xbf000000 - 0xbfe00000 ( 14 MB)
[ 0.000000] .text : 0xc0008000 - 0xc0553f8c (5424 kB)
[ 0.000000] .init : 0xc0554000 - 0xc0585bbc ( 199 kB)
[ 0.000000] .data : 0xc0586000 - 0xc05bcda0 ( 220 kB)
[ 0.000000] .bss : 0xc05bcda0 - 0xc05dba2c ( 124 kB)
[ 0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
[ 0.000000] NR_IRQS:16 nr_irqs:16 16
[ 0.000000] TrustZone Interrupt Controller (TZIC) initialized
[ 0.000000] CPU identified as i.MX53, silicon rev 2.1
[ 0.000000] Switching to timer-based delay loop
[ 0.000009] sched_clock: 32 bits at 33MHz, resolution 29ns, wraps every 128849015778ns
[ 0.008364] Console: colour dummy device 80x30
[ 0.012925] Calibrating delay loop (skipped), value calculated using timer frequency..
If I add some debug statements to linux/init/calibrate.c before the pr_info("Calibrating delay loop (skipped) ..."); then those get printed as well, but the "Calibrating..." text is not. I also added a loop at the top of the calibrate_delay() function which just prints an incrementing counter but this loop only prints once and then hangs.
Now I have two assumptions: Either the serial driver is hanging or it has something to do with the clocks/timer because the highest time passed number I got was 0.016800-ish.
Well, it might be that your Linux kernel version uses the same timer that Genode's "hw kernel" uses for scheduling purposes. If so, the interrupt is probably set as being secure, and timer interrupts are always delivered to the secure Genode side.
Another possibility when using the vanilla Linux sources: there were known bugs, at least in the past, according to this thread:
https://community.freescale.com/thread/327545
Did you test your kernel and device tree on the hardware without Genode underneath?
A possible workaround is to set the lpj (loops per jiffy) value on the command line instead of calculating it dynamically. When running Linux in a virtualized fashion this might be advisable anyway, because when world switches take place during the calibrating loop, you might get fantastic values.
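The relation between lpj and the printed BogoMIPS value can be checked by hand. The sketch below assumes HZ=100 (not stated in this thread, but consistent with the numbers in the posted logs) and an exact timer rate of 33333333 Hz for the "33MHz" sched_clock line. It is meant to mirror the integer arithmetic of the message printed by calibrate_delay(); the function names are made up for illustration.

```c
/* Assumption: kernel tick rate, inferred from the logged values. */
#define HZ 100UL

/* Timer-based delay loop: one jiffy corresponds to timer_hz / HZ ticks. */
static unsigned long lpj_from_timer(unsigned long timer_hz)
{
	return timer_hz / HZ;
}

/* Integer part of the BogoMIPS value for a given lpj. */
static unsigned long bogomips_int(unsigned long lpj)
{
	return lpj / (500000UL / HZ);
}

/* Two decimal digits of the BogoMIPS value for a given lpj. */
static unsigned long bogomips_frac(unsigned long lpj)
{
	return (lpj / (5000UL / HZ)) % 100UL;
}
```

Under these assumptions, lpj=333333 yields 66.66 and lpj=4980736 yields 996.14, matching both logs in this thread, and a 33333333 Hz timer gives lpj=333333.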
I have not yet replayed the TrustZone commits from Stefan Kalkowski for Linux 2.6.35 but because I set all devices as unsecure inside Genode (genode/repos/base-hw/src/core/include/spec/imx53/trustzone/csu.h) this should not be necessary, right?
Well, as already addressed: you'll run into trouble as soon as the guest OS (Linux) touches hardware already used by the kernel, like the timer. Another problem arises when it re-initializes all clocks and power domains. This works like a device reset (e.g. for the timer) and probably leads to a non-working secure side's kernel.
If that doesn't help you further, you might provide the Genode and Linux branches you are working with, as well as detailed instructions on how to reproduce your scenario from them.
Best regards Stefan
If you have any pointers/ideas/suggestions for me, this would be very helpful.
Cheers, Nico Weichbrodt
New Year. New Location. New Benefits. New Data Center in Ashburn, VA. GigeNET is offering a free month of service with a new server in Ashburn. Choose from 2 high performing configs, both with 100TB of bandwidth. Higher redundancy.Lower latency.Increased capacity.Completely compliant. http://p.sf.net/sfu/gigenet _______________________________________________ genode-main mailing list genode-main@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/genode-main
-- Stefan Kalkowski Genode Labs
http://www.genode-labs.com/ * http://genode.org/