Tomasz Gajewski tomga@wp.pl writes:
[init -> test-smp] TLB: thread started on CPU 1
[init -> test-smp] TLB: thread started on CPU 2
[init -> test-smp] TLB: thread started on CPU 3
[init -> test-smp] TLB: all threads are up and running...
[init -> test-smp] TLB: ram dataspace destroyed, all will fault...
no RM attachment (READ pf_addr=0xc00c pf_ip=0x1000d2c from pager_object: pd='init -> test-smp' thread='tlb_thread')
Warning: page fault, pager_object: pd='init -> test-smp' thread='tlb_thread' ip=0x1000d2c fault-addr=0xc00c type=no-page
Warning: core -> pager_ep: cannot submit unknown signal context
[init -> test-sm
The page fault of this tlb-thread is perfectly fine! It is exactly what is tested here. Actually, you should see three faulting tlb-threads. The test starts a thread on each CPU apart from the first one and lets them work indefinitely on a shared page, which is thereby in the TLB of the corresponding CPU. Because we do not start anything else on the other CPUs, there is little probability that the TLB entry gets evicted. Then the first CPU unmaps the page. If cross-CPU TLB shootdown is implemented correctly for the platform, you should notice a fault on all other CPUs.
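The logic of the test can be sketched with a toy model in C++. This is only an illustration under simplifying assumptions, not the actual test-smp code: each CPU caches at most one translation, and the names Toy_system, access, unmap and faulting_cpus are made up for the example.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Toy model, not Genode code: every CPU caches at most one translation.
constexpr std::size_t NUM_CPUS = 4;

struct Toy_system
{
    bool page_mapped = true;                   // state of the page table
    std::array<bool, NUM_CPUS> tlb_cached {};  // per-CPU cached translation

    // A CPU touches the shared page; returns true if the access faults.
    bool access(std::size_t cpu)
    {
        assert(cpu < NUM_CPUS);
        if (tlb_cached[cpu]) return false;     // TLB hit, even if the entry is stale
        if (!page_mapped)    return true;      // page-table walk fails -> page fault
        tlb_cached[cpu] = true;                // fill the TLB from the page table
        return false;
    }

    // CPU 0 unmaps the page; 'shootdown' selects whether remote TLBs are flushed.
    void unmap(bool shootdown)
    {
        page_mapped   = false;
        tlb_cached[0] = false;                 // the local TLB is always invalidated
        if (shootdown)
            for (std::size_t cpu = 1; cpu < NUM_CPUS; cpu++)
                tlb_cached[cpu] = false;
    }
};

// Runs the scenario and returns how many secondary CPUs fault after the unmap.
inline unsigned faulting_cpus(bool shootdown)
{
    Toy_system sys;
    for (std::size_t cpu = 1; cpu < NUM_CPUS; cpu++)
        sys.access(cpu);                       // warm up the TLBs of CPUs 1..3

    sys.unmap(shootdown);

    unsigned faults = 0;
    for (std::size_t cpu = 1; cpu < NUM_CPUS; cpu++)
        if (sys.access(cpu)) faults++;
    return faults;
}
```

In this model, faulting_cpus(true) yields 3, matching the three faulting tlb-threads expected on a four-CPU system, whereas faulting_cpus(false) yields 0 because the secondary CPUs keep using their stale entries.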
The question is why your messages end here, or did you just send a snippet?
No, unfortunately that is all I get. From the beginning I have had problems with everything hanging after faults. I would expect some handler to be invoked that would give me some debug information, but I don't get any.
I have a feeling that some exceptions may be routed to hyp mode, but I haven't put much effort into confirming that. Is there any code that installs exception handlers in hyp mode on base-hw for some other architecture that I could check or use?
It seems that the puzzle is somewhat solved.
With the help of Norman's helper for running depot scripts I could run more tests. Most of them passed, but test-timer did not, and I remembered that I had left calibrating the timer for later and forgotten about it. Now, after setting the proper value for TICS_PER_US, both tests that were not working for me, test-timer and smp, pass.
Nevertheless, I'm afraid that this "fix" just hides some problem, because with a wrong value of TICS_PER_US and some greater values in test-timer I was receiving:
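To show why a wrong TICS_PER_US distorts every timeout, here is a small C++ sketch of the arithmetic. It is not the base-hw timer driver: the helper names us_to_ticks and actual_us and the frequencies used are assumptions chosen only for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: converts a timeout in microseconds into timer ticks
// using the configured (possibly wrong) calibration constant.
inline std::uint64_t us_to_ticks(std::uint64_t us, std::uint64_t tics_per_us)
{
    return us * tics_per_us;
}

// Hypothetical helper: duration in microseconds that actually elapses when a
// timeout is programmed with 'configured_tics_per_us' while the hardware
// counter really advances 'real_tics_per_us' ticks per microsecond.
inline std::uint64_t actual_us(std::uint64_t requested_us,
                               std::uint64_t configured_tics_per_us,
                               std::uint64_t real_tics_per_us)
{
    assert(real_tics_per_us != 0);
    return us_to_ticks(requested_us, configured_tics_per_us) / real_tics_per_us;
}
```

With a made-up counter that really runs at 24 ticks per microsecond, actual_us(1000, 24, 24) gives the requested 1000 us, while a constant left at 1 gives actual_us(1000, 1, 24) == 41, so every timeout fires far too early.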
[init -> test ->Kernel: Cpu 0 error: re-entered lock. Kernel exception?!
For now the "solution" satisfies me, but I'll probably get back to it when I have a JTAG debugger. Without one it is hard to diagnose this.
Tomasz Gajewski