Hello Genodians,
we have a requirement for additional file-system tests, especially for measuring performance and checking reliability in reset/crash scenarios. The tests should be modular so that they cover different file systems and block devices, including the CBE, as well as different platforms, to detect any regressions.
Would such tests generally be welcome as an upstream contribution from us to the Genode framework?
If so, I would create an issue in the Genode project on GitHub to discuss the best way to go about this.
Best regards,
Stefan
Hello Stefan,
> we have a requirement for additional file-system tests, especially for measuring performance and checking reliability in reset/crash scenarios. The tests should be modular so that they cover different file systems and block devices, including the CBE, as well as different platforms, to detect any regressions.
>
> Would such tests generally be welcome as an upstream contribution from us to the Genode framework?
I'm admittedly struggling to give you a clear-cut answer.
On the one hand, I see the benefits of a strong base of tests, in particular bad-case tests that stress corner cases. So your offer seems generous.
On the other hand, I'm wary of the social pressure and secondary costs that come with accepting contributions in areas that are rather low on our priority list. In particular, we have no current plan to put file-system performance into the spotlight in the near future, so I see the risk of being distracted from plans that are much closer to our heart. Even if high-quality tests are contributed, integrating and maintaining them can still be extremely costly, sometimes much more so than we are comfortable with on top of our existing commitments.
If file-system performance were our priority, I would intuitively know where to look first, without new benchmarks. In fact, existing scenarios like the tool-chain test already amplify bottlenecks and could thereby readily be taken as a basis for performance analysis, asking questions like:
- Why does the 'cp -r' operation of the tool-chain test take so long?
- What would be the effect of moving the symlink resolution of path elements from the libc to the VFS library?
- What would be the effect of caching or batching stat calls? What's the speedup when implementing stat as an async file-system-session operation instead of an RPC?
- What benefits could be had by allocating file-system node handles at the client side?
- What would be the effect of delivering dir entries in batches?
- What's the effect of replacing the currently synchronous I/O backend of the vfs_rump plugin with proper asynchronous I/O operations?
Concrete answers to those questions, along with exemplary implementations, would be very welcome contributions! Such an analysis requires a deep dive through many layers of the stack, though.
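To illustrate the kind of measurement I have in mind, the following rough sketch times a recursive stat/readdir walk using only plain POSIX/libc calls, so it can run on Linux as well as in a Genode libc scenario. The default mount point "/fs", the file name, and the output format are made up for illustration; this is not an existing Genode test.

  /*
   * stat_bench.cc - rough sketch of a stat/readdir micro-benchmark
   *
   * Only plain POSIX/libc calls, no Genode-specific API. The default
   * mount point "/fs" is a made-up placeholder. Build e.g. with
   * 'g++ -O2 stat_bench.cc -o stat_bench'.
   */
  #include <dirent.h>
  #include <sys/stat.h>
  #include <time.h>
  #include <stdio.h>
  #include <string.h>
  #include <string>

  static unsigned long stat_calls = 0;

  /* recursively stat every entry below 'path' */
  static void walk(std::string const &path)
  {
      DIR *dir = opendir(path.c_str());
      if (!dir)
          return;

      while (struct dirent *entry = readdir(dir)) {
          if (!strcmp(entry->d_name, ".") || !strcmp(entry->d_name, ".."))
              continue;

          std::string const child = path + "/" + entry->d_name;

          struct stat st { };
          if (stat(child.c_str(), &st) == 0) {
              ++stat_calls;
              if (S_ISDIR(st.st_mode))
                  walk(child);
          }
      }
      closedir(dir);
  }

  int main(int argc, char **argv)
  {
      char const *root = (argc > 1) ? argv[1] : "/fs";  /* placeholder default */

      timespec start { }, end { };
      clock_gettime(CLOCK_MONOTONIC, &start);
      walk(root);
      clock_gettime(CLOCK_MONOTONIC, &end);

      double const ms = (end.tv_sec  - start.tv_sec)  * 1000.0
                      + (end.tv_nsec - start.tv_nsec) / 1000000.0;

      printf("%lu stat calls in %.2f ms (%.2f us per call)\n",
             stat_calls, ms, stat_calls ? ms*1000.0/stat_calls : 0.0);
      return 0;
  }

Running such a walk over the directory tree of the tool-chain test before and after a change, e.g., stat batching, would already give a first data point for the questions above.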
Regarding file-system integrity and reliability, the answer is much easier. Data loss would be a critical bug. So if we are presented with a reproducible test that triggers such a case, we will immediately make it our top priority.
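As a rough sketch of what such a reproducible test could look like, using only plain POSIX/libc calls and a made-up path prefix "/fs/blob" (the reset itself would be triggered externally, e.g., by a power cut or reboot): a write phase stores files with a deterministic pattern and fsyncs them, and a check phase executed after the reset re-reads the files and reports any content that no longer matches the pattern.

  /*
   * fs_consistency.cc - sketch of a write/verify pair for reset tests
   *
   * Only plain POSIX/libc calls; the path prefix "/fs/blob" is a made-up
   * placeholder and the reset itself is triggered externally (power cut,
   * reboot, ...). Run without arguments to write, with "check" to verify.
   */
  #include <fcntl.h>
  #include <unistd.h>
  #include <stdio.h>
  #include <string.h>
  #include <string>

  enum { NUM_FILES = 16, FILE_SIZE = 16*1024 };

  static std::string path_of(unsigned i)
  {
      return "/fs/blob." + std::to_string(i);  /* placeholder location */
  }

  /* fill 'buf' with a pattern derived from the file index */
  static void fill(char *buf, size_t len, unsigned seed)
  {
      for (size_t j = 0; j < len; j++)
          buf[j] = char((seed + j) & 0xff);
  }

  static int write_phase()
  {
      static char buf[FILE_SIZE];
      for (unsigned i = 0; i < NUM_FILES; i++) {
          fill(buf, sizeof(buf), i);
          int const fd = open(path_of(i).c_str(), O_CREAT | O_WRONLY | O_TRUNC, 0644);
          if (fd < 0)
              return 1;
          if (write(fd, buf, sizeof(buf)) != (ssize_t)sizeof(buf)) {
              close(fd);
              return 1;
          }
          fsync(fd);  /* data is expected to survive a reset after this point */
          close(fd);
      }
      printf("write phase complete\n");
      return 0;
  }

  static int check_phase()
  {
      static char expected[FILE_SIZE], actual[FILE_SIZE];
      int errors = 0;
      for (unsigned i = 0; i < NUM_FILES; i++) {
          fill(expected, sizeof(expected), i);
          int const fd = open(path_of(i).c_str(), O_RDONLY);
          bool const ok = (fd >= 0)
                       && read(fd, actual, sizeof(actual)) == (ssize_t)sizeof(actual)
                       && memcmp(expected, actual, sizeof(actual)) == 0;
          if (!ok) {
              printf("mismatch or read error in %s\n", path_of(i).c_str());
              errors++;
          }
          if (fd >= 0)
              close(fd);
      }
      return errors ? 1 : 0;
  }

  int main(int argc, char **argv)
  {
      bool const check = (argc > 1) && !strcmp(argv[1], "check");
      return check ? check_phase() : write_phase();
  }

Any mismatch reported by the check phase after a completed write phase would be exactly the kind of reproducible data-loss case I mean.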
Cheers,
Norman
Hello Norman,
Thanks for your feedback.
On 14.05.21 10:41, Norman Feske wrote:
> If file-system performance were our priority, I would intuitively know where to look first, without new benchmarks.
>
> Regarding file-system integrity and reliability, the answer is much easier. Data loss would be a critical bug. So if we are presented with a reproducible test that triggers such a case, we will immediately make it our top priority.
Optimizing the performance of file systems isn't our priority either, as we don't have any concrete requirements for this. Instead, our main goals with such a test suite are:
a) Increase the test coverage of the file systems provided by Genode in order to demonstrate the quality of the framework transparently and continuously, and to make sure that regressions won't go unnoticed once a file-system component is used in a product.
b) Compare the file-system scenarios available in Genode.
Would you see value in such tests?
Kind regards,
Stefan
Hi Stefan,
> a) Increase the test coverage of the file systems provided by Genode in order to demonstrate the quality of the framework transparently and continuously, and to make sure that regressions won't go unnoticed once a file-system component is used in a product.
>
> b) Compare the file-system scenarios available in Genode.
>
> Would you see value in such tests?
This aligns very well with the statement I gave at the end of my last email. Your efforts would be very much appreciated.
The risk of regressions is indeed a major concern when optimizing code that is already time-tested. Hence, thorough file-system tests might be a good pathway towards structural optimization work in the longer term.
Cheers,
Norman