Hello
I want to use the Genode signalling scheme to control asynchronous data exchange between a client and a server. To get familiar with it I started with a test implementation on the server side.
The server's main function fills some data into a buffer and instantiates a Signal receiver thread which shall dump the buffer content on signal reception. Then it instantiates a Signal transmitter to fire a signal at the receiver once. The unexpected outcome is that the receiver appears to get the signal twice.
This is the implementation in main():
Signal_receiver s_rcvr;
Signal_context ct;
Receiver *ser_rx = new (env()->heap()) Receiver(s_rcvr, srv.transmit);
Signal_context_capability send_cap = s_rcvr.manage(&ct);
Signal_transmitter tx(send_cap);
tx.submit();
The receiver thread's entry() function (class Receiver is derived from class Thread<>; the constructor's 2nd parameter is a handle to the buffer already filled with data, which is accessed through _transmit inside the function):
void entry()
{
    uint8_t ch[32];
    int cnt = 0;
    while (1)
    {
        Signal signal = _receiver.wait_for_signal();
#if 1
        for (unsigned out = 0; out < sizeof(ch); ++out)
        {
            int rd = _transmit.read_byte();
            if (rd >= 0) ch[out] = uint8_t(rd);
            else break;
        }
        ++cnt;
        ch[31] = '\0';
        printf("Found characters: %s\n", &ch[0]);
        ch[0] = '\0';
        printf("Caught signals: %d\n", cnt);
#endif
    }
}
The debug output shows that the 2 printf()s inside the while() loop are called 2 times, which clearly indicates that _receiver.wait_for_signal() returned 2 times. I wonder what goes wrong, and what I should do to get rid of this odd behaviour.
Regards
Frank
Hi Frank,
> to fire a signal at the receiver for one time. The unexpected outcome is that it looks as if the receiver gets the signal twice.
> This is the implementation in main():
> Signal_receiver s_rcvr;
> Signal_context ct;
> Receiver *ser_rx = new (env()->heap()) Receiver(s_rcvr, srv.transmit);
> Signal_context_capability send_cap = s_rcvr.manage(&ct);
> Signal_transmitter tx(send_cap);
> tx.submit();
I assume that you start executing the receiver thread by calling 'Thread_base::start()' in the receiver's constructor? If this is the case, you have a race condition. The receiver thread is executed right before you register the signal context at the signal receiver 's_rcvr'. You can fix this by starting the receiver thread after calling 's_rcvr.manage':
In the 'Receiver' class:
  class Receiver : Thread<STACK_SIZE>
  {
    ...
+   using Thread_base::start;
    ...
  };
In your main function:
  Receiver *ser_rx = new (env()->heap()) Receiver(s_rcvr, srv.transmit);
  Signal_context_capability send_cap = s_rcvr.manage(&ct);
+ ser_rx->start();
  Signal_transmitter tx(send_cap);
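The ordering suggested above (construct the receiver thread, register the context, and only then start the thread) can be sketched with the C++ standard library. This is a minimal illustration under stated assumptions, not Genode code: 'Toy_receiver', 'Toy_thread', and 'run_ordered' are hypothetical names, and the toy receiver only mimics the manage/submit/wait sequence.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical stand-in for a signal receiver. NOT the Genode API.
class Toy_receiver
{
    std::mutex              _mutex;
    std::condition_variable _cond;
    bool                    _managed = false; // a context has been registered
    unsigned                _pending = 0;     // submitted but not yet fetched

public:
    void manage()
    {
        std::lock_guard<std::mutex> guard(_mutex);
        _managed = true;
    }

    bool managed()
    {
        std::lock_guard<std::mutex> guard(_mutex);
        return _managed;
    }

    void submit()
    {
        std::lock_guard<std::mutex> guard(_mutex);
        ++_pending;
        _cond.notify_one();
    }

    // Blocks until at least one signal is pending, returns the pending count.
    unsigned wait_for_signal()
    {
        std::unique_lock<std::mutex> lock(_mutex);
        _cond.wait(lock, [this] { return _pending > 0; });
        unsigned const num = _pending;
        _pending = 0;
        return num;
    }
};

// Hypothetical thread wrapper that does not run until start() is called.
class Toy_thread
{
    std::function<void()> _entry;
    std::thread           _thread;

public:
    explicit Toy_thread(std::function<void()> entry) : _entry(std::move(entry)) { }
    ~Toy_thread() { join(); }

    void start()
    {
        assert(!_thread.joinable()); // start at most once
        _thread = std::thread(_entry);
    }

    void join() { if (_thread.joinable()) _thread.join(); }
};

// Returns true if the thread saw a registered context on entry and
// received exactly one signal.
bool run_ordered()
{
    Toy_receiver receiver;
    bool         saw_context = false;
    unsigned     received    = 0;

    Toy_thread thread([&] {
        saw_context = receiver.managed();
        received    = receiver.wait_for_signal();
    });

    receiver.manage(); // 1. register the context
    thread.start();    // 2. only now let the receiver thread run
    receiver.submit(); // 3. fire the signal once
    thread.join();

    return saw_context && received == 1;
}
```

Because start() is called after manage(), the thread's entry code can never observe a receiver without a registered context, regardless of scheduling.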
BTW, if your code relies on the correct number of signals (that is, not just using signals as a wake-up mechanism), you need to take the value 'signal.num()' into account. If signals are batched, 'num' may be higher than one. In your current error case, the 'num' value returned by the first call of 'wait_for_signal' was indeed zero, indicating that it is not a valid signal.
  Signal signal = _receiver.wait_for_signal();
+ /* evaluate signal.num() */
+ ...
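To illustrate what evaluating the count could look like, here is a small standalone sketch using the C++ standard library. It is not Genode code: 'Toy_signal', 'Counting_receiver', and 'handle_all' are hypothetical names, and the toy receiver merely assumes that several submissions may be merged into one wakeup carrying a count.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical stand-in for a received signal. NOT the Genode API.
struct Toy_signal { unsigned num; };

// Hypothetical receiver that batches submissions into one wakeup.
class Counting_receiver
{
    std::mutex              _mutex;
    std::condition_variable _cond;
    unsigned                _pending = 0;

public:
    void submit()
    {
        std::lock_guard<std::mutex> guard(_mutex);
        ++_pending;
        _cond.notify_one();
    }

    // Returns all submissions that accumulated since the last call.
    Toy_signal wait_for_signal()
    {
        std::unique_lock<std::mutex> lock(_mutex);
        _cond.wait(lock, [this] { return _pending > 0; });
        Toy_signal const signal { _pending };
        _pending = 0;
        return signal;
    }
};

// Submits 'total' signals from a producer thread and returns the number
// of events the consumer handled.
unsigned handle_all(unsigned total)
{
    Counting_receiver receiver;

    std::thread producer([&] {
        for (unsigned i = 0; i < total; ++i)
            receiver.submit();
    });

    unsigned handled = 0;
    while (handled < total) {
        Toy_signal const signal = receiver.wait_for_signal();

        // Defensive check mirroring the advice above. This toy receiver
        // never returns zero, but a real one might wake up without a
        // valid signal.
        if (signal.num == 0)
            continue;

        // One wakeup may stand for several submissions.
        for (unsigned i = 0; i < signal.num; ++i)
            ++handled;

        assert(handled <= total);
    }

    producer.join();
    return handled;
}
```

Counting wakeups instead of summing 'num' would under-count whenever the producer gets ahead of the consumer, which is why the loop adds the count of each received signal.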
Regards
Norman
Hello, Norman
Thanks for your reply.
I don't think that a race condition contributes to the problem, because the first signal reception takes place after the call tx.submit(). Anyway, I made the suggested change (remove the thread start from the Receiver's constructor and do it explicitly before the instantiation of Signal_transmitter), but again the signal was received twice.
I checked the properties of the signal, and indeed, as you supposed, on the first reception the parameter num has the value 0, while on the second reception the value is 1. This would allow me to ignore the unwanted signal as a temporary workaround, but the observed behaviour casts some doubt on the reliability of the signalling mechanism. There might be other situations where a signal is not received at all although it was sent once.
Frank
-----Original Message-----
From: Norman Feske [mailto:norman.feske@...1...]
Sent: Monday, July 05, 2010 2:30 PM
To: Genode OS Framework Mailing List
Subject: Re: About Genode::Signal reception
[...]
Hi Frank,
> I don't think that a race condition contributes to the problem, because the first signal reception takes place after the call tx.submit(). Anyway, I made the suggested change (remove the thread start from the Receiver's constructor and do it explicitly before the instantiation of Signal_transmitter), but again the signal was received twice.
Yesterday, prior to answering your email, I was able to reproduce the superfluous wakeup on Linux using the code snippets you provided. After fixing the race condition, the superfluous wakeup no longer occurs with my test case. Today, I tested both variants (race, no race) on OKL4_x86 and the superfluous wakeup never occurs there. For your convenience, I have attached my test program to this email.
> I checked the properties of the signal, and indeed, as you supposed, on the first reception the parameter num has the value 0, while on the second reception the value is 1. This would allow to ignore the unwanted signal as a temporary workaround, but the observed behaviour
Evaluating the 'num' value of the returned signal should not be a temporary workaround. It is the only correct way to use the API if the exact number of signals that occurred is important to you.
> casts some doubt on the reliability of the signalling mechanism. There might be other situations where the signal is not at all received although it was sent once.
Please do not hesitate to substantiate this claim. I'm more than happy about bug reports and test cases. Sharing your gut feeling with us does not help to improve the framework, though. .-)
Regards
Norman