Hello,
while experimenting with rumpfs I noticed that sync() is not supported by rumpfs.
I added a sync() call in ../test/libc_ffat/main.cc and executed make run/rump_ext2. I got the following message:
[init -> test-libc_vfs] DUMMY sync(): sync called, not implemented
I guess a filesystem that does not support sync() is of greatly reduced value. Has this call been blocked on purpose?
Regards, Adrian Schuur
Hello Adrian,
* a3an <a3an@...294...> [2015-01-29 22:03:44 +0100]:
while experimenting with rumpfs I noticed that sync() is not supported by rumpfs.
I added a sync() call in ../test/libc_ffat/main.cc and executed make run/rump_ext2. I got the following message:
[init -> test-libc_vfs] DUMMY sync(): sync called, not implemented
I guess, a filesystem not supporting sync() is of greatly reduced value.
rump_fs actually does support syncing (you may take a look at [1]). However, sync(2) is only implemented as a dummy function [2] in the common libc backend, hence this message. Looking at the current implementation, calling fsync(2) would have the desired effect [3]: it calls the sync method on the root directory, which is a Dir_file_system, which in turn calls sync on all its registered file systems, and so on.
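A minimal sketch of the fsync() route, assuming a plain POSIX environment. The helper name write_and_fsync and the file path passed to it are hypothetical and do not come from the test program discussed here:

```cpp
#include <fcntl.h>    // open, O_CREAT, O_WRONLY, O_TRUNC
#include <unistd.h>   // write, fsync, close
#include <cstring>    // std::strlen

// Hypothetical helper: write `text` to `path` and flush it with fsync()
// before closing. Returns true only if every step succeeded.
bool write_and_fsync(const char *path, const char *text)
{
    int fd = open(path, O_CREAT | O_WRONLY | O_TRUNC, 0644);
    if (fd < 0)
        return false;

    size_t const len = std::strlen(text);
    bool ok = write(fd, text, len) == static_cast<ssize_t>(len);

    // Use fsync() on an open descriptor rather than sync(), which is
    // only a dummy in the common libc backend according to this thread.
    ok = ok && (fsync(fd) == 0);

    // close() can report errors too, so it is part of the result.
    return (close(fd) == 0) && ok;
}
```

Checking the return value of fsync() matters here, because a failed flush is otherwise silent.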
Has this call been blocked on purpose ?
No, it just was not implemented. I vaguely remember talking about this topic, when we added the sync syscall to Noux. Since we override the common libc implementation in libc_noux [4] anyway, there was probably no immediate need to also provide a more complete implementation there.
Cheers Josef
[1] repos/dde_rump/src/server/rump_fs/main.cc:325
[2] repos/libports/src/lib/libc/dummies:136
[3] repos/libports/src/lib/libc/vfs_plugin.cc:829
[4] repos/ports/src/lib/libc_noux/plugin.cc:635
Thanks Josef.
Actually, what I am looking for is a way to safely let go of the filesystem, so that when the process ends, somebody cleans up the filesystem and leaves it in a consistent state for the next time around.
I noticed [1], but I have problems locating where the Session_component instance lives so that I could call sync(). I am not very familiar with the structure of a process, but can I assume that the Session_component instance gets destroyed in an orderly way when the process terminates? If so, wouldn't it be better if its destructor called sync()?
I tried to do that, it did NOT work.
Regards, Adrian
Hello Josef,
I've got some new indications. I used the fsync() call to see how that worked out. It did not work either. By plowing through the vfs header files, I noticed that File_system contains a virtual function called sync() followed by {}, and some comments saying that only Fs_file_system needs sync. I placed a PWRN macro call with a message within the braces and ran the test again. The message appeared on the console.
It means there is no overriding implementation for sync(), presumably in Fs_file_system. I assume rump_fs has been used in read mode only.
Regards, Adrian
Hello Adrian,
* a3an <a3an@...294...> [2015-01-30 10:56:45 +0100]:
I've got some new indications. I used the fsync() call to see how that worked out. It did not work either. By plowing through the vfs header files, I noticed that File_system contains a virtual function called sync() followed by {}, and some comments saying that only Fs_file_system needs sync. I placed a PWRN macro call with a message within the braces and ran the test again. The message appeared on the console.
I added debug messages to the code (see the patch file) to trace the fsync() call and it actually calls the sync() method of rump_fs in the end:
[init -> test-libc_vfs] open(file_name, O_CREAT | O_WRONLY) succeeded
[init -> test-libc_vfs] calling fsync(fd)
[init -> test-libc_vfs] virtual int Libc::Vfs_plugin::fsync(Libc::File_descriptor*):
[init -> test-libc_vfs] virtual void Vfs::Dir_file_system::sync():
[init -> test-libc_vfs] virtual void Vfs::Dir_file_system::sync():
[init -> test-libc_vfs] virtual void Vfs::Fs_file_system::sync():
[init -> rump_fs] virtual void File_system::Session_component::sync():
[init -> test-libc_vfs] fsync(fd) succeeded
It means there is no overriding implementation for sync(), presumably in Fs_file_system. I assume rump_fs has been used in read mode only.
We use rump_fs with Ext2 r/w on a regular basis and it has worked fine so far. Since Ext2 does not support journaling, we call sync [1] once a second to mitigate problems resulting from a potential power loss. I admit that this is not the best solution to this issue, but it has to suffice for now. There are plans to port the Ext4 driver from Linux, but there is no concrete schedule yet.
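A minimal sketch of such a once-per-interval flush, written as a helper that is polled from an event loop. This is not the rump_fs implementation; the class name Periodic_sync, its poll() method, and the injected callback are assumptions made for illustration:

```cpp
#include <chrono>
#include <functional>
#include <utility>

// Hypothetical helper: invokes a sync callback whenever at least
// `interval` has elapsed since the previous flush.
class Periodic_sync
{
    using Clock = std::chrono::steady_clock;

    std::function<void()> _sync;      // what to call, e.g. a file-system flush
    Clock::duration       _interval;  // minimum time between two flushes
    Clock::time_point     _last;      // time of the previous flush

public:
    Periodic_sync(std::function<void()> sync,
                  Clock::duration       interval,
                  Clock::time_point     start)
    : _sync(std::move(sync)), _interval(interval), _last(start) { }

    // To be called from the server's event loop (or a timer handler).
    // Returns true if a flush was issued during this call.
    bool poll(Clock::time_point now)
    {
        if (now - _last < _interval)
            return false;

        _sync();
        _last = now;
        return true;
    }
};
```

Passing the current time in explicitly keeps the helper independent of any particular timer mechanism. A flush at a fixed interval bounds how much unwritten data a power loss can affect, but it does not replace a journal.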
Cheers Josef
[1] repos/dde_rump/src/server/rump_fs/file_system.cc:91
Hi Josef,
I installed the patch and it shows that fsync() results in a call to sync(). Actually I used fsync(0).
[init -> test-libc_vfs] Trying sync()
[init -> test-libc_vfs] virtual int Libc::Vfs_plugin::fsync(Libc::File_descriptor*):
[init -> test-libc_vfs] virtual void Vfs::Dir_file_system::sync():
[init -> test-libc_vfs] virtual void Vfs::Dir_file_system::sync():
[init -> test-libc_vfs] Virtual sync() <---- this is my trace in File_system::sync()
[init -> test-libc_vfs] virtual void Vfs::Fs_file_system::sync():
[init -> rump_fs] virtual void File_system::Session_component::sync():
[init -> test-libc_vfs] sync() done
[init -> test-libc_vfs] test finished
[init] virtual void Genode::Child_policy::exit(int): child "test-libc_vfs" exited with exit value 0
I have no explanation for the invocation of virtual void File_system::sync(), which is where my trace comes from. It could well be my C++ ignorance.
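One possible reading of this trace, not confirmed anywhere in this thread: a directory file system forwards sync() to every file system registered below it, and any of those that does not override sync() ends up in the empty base-class implementation. The sketch below is a toy model under that assumption. The class names mirror the ones in the trace, but the code, the Other_file_system type, and the tree layout are made up for illustration and are not the Genode VFS:

```cpp
#include <memory>
#include <string>
#include <vector>

namespace sketch {

using Trace = std::vector<std::string>;

struct File_system
{
    virtual ~File_system() = default;

    // Base implementation: nothing to flush. Every subclass that does
    // not override sync() ends up here.
    virtual void sync(Trace &trace) { trace.push_back("File_system::sync (base)"); }
};

// Forwards file operations to a file-system server, so it overrides sync().
struct Fs_file_system : File_system
{
    void sync(Trace &trace) override { trace.push_back("Fs_file_system::sync"); }
};

// Hypothetical file system without persistent state: no override.
struct Other_file_system : File_system { };

// Directory node that forwards sync() to all registered file systems.
struct Dir_file_system : File_system
{
    std::vector<std::unique_ptr<File_system>> children;

    void sync(Trace &trace) override
    {
        trace.push_back("Dir_file_system::sync");
        for (auto &fs : children)
            fs->sync(trace);
    }
};

// Assumed layout: root { dir { other }, fs }
inline Trace sync_example_tree()
{
    auto dir = std::make_unique<Dir_file_system>();
    dir->children.push_back(std::make_unique<Other_file_system>());

    Dir_file_system root;
    root.children.push_back(std::move(dir));
    root.children.push_back(std::make_unique<Fs_file_system>());

    Trace trace;
    root.sync(trace);
    return trace;
}

} // namespace sketch
```

With this layout the calls occur in the order Dir, Dir, base, Fs, which matches the order seen above. Under this reading, the base-class trace would come from a sibling file system rather than from Fs_file_system lacking an override.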
In addition to the PWRN macro I placed between the braces, I also modified the run script to not delete ext2.raw. After the first make run/rump_ext2, I reran the actual run part with qemu manually. So if sync() was successfully executed in the first run, the second run should find a clean file system. Which is not the case:
[init -> rump_fs] Backend::Backend(): Backend blk_size 512
[init -> rump_fs] rump: /genode: file system not clean; please fsck(8)
[init -> test-libc_vfs] calling mkdir(dir_name, 0777) dir_name=testdir
[init -> test-libc_vfs] mkdir(dir_name, 0777) failed, ret=-1, errno=28
[init] virtual void Genode::Child_policy::exit(int): child "test-libc_vfs" exited with exit value -1
I have to assume that sync() does not reset the not-yet-synced flag in the superblock, which surprises me. But that would be a rump_fs internal issue.
In the Unix/Linux world one normally does not issue a sync() call at the end of a program (unless one has paranoia streaks), ideally, one should not be aware of sync(). So I was looking for a more logical place to have this done automatically.
Originally I tried to solve the lack of an explicit sync call by placing the sync() in the destructor of the Session_component class. But as far as I can see, this destructor is never called. Is there a graceful shutdown process in Nova? Or, to be more specific, is rump_fs signaled that Nova is being shut down?
What does this mean for Nova applications? Must they organize a safe shutdown of a file system? How would they do that?
Regards, Adrian
Hi Adrian,
What does this mean for Nova applications? Must they organize a safe shutdown of a file system? How would they do that?
Genode's general mechanism to destroy subsystems works as follows: When the parent decides to destroy a subsystem, it closes all sessions that were opened by the subsystem in reverse order. Naturally, those sessions comprise, among others, all file-system sessions held by the subsystem.
From the file system's point of view, this event looks like the client issued a session-close request. The current approach would be to let the file system call sync each time a file-system session is closed.
Currently, there is no separate mechanism in place to gracefully shut down services.
Cheers Norman
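The teardown order Norman describes can be modelled roughly as follows. This is a toy sketch, not Genode code: Subsystem, Fs_server, and the session labels are hypothetical stand-ins, and real sessions are closed through the parent interface rather than through callbacks:

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

namespace sketch {

// Stand-in for a file-system server that flushes on every session close.
struct Fs_server
{
    unsigned sync_count = 0;

    void sync()          { ++sync_count; }
    void close_session() { sync(); }
};

// Stand-in for a subsystem as seen by its parent.
class Subsystem
{
    struct Session
    {
        std::string           service;  // label of the used service
        std::function<void()> close;    // what closing means for the server
    };

    std::vector<Session> _sessions;     // in the order they were opened

public:
    void open(std::string service, std::function<void()> close)
    {
        _sessions.push_back({std::move(service), std::move(close)});
    }

    // Closes all sessions in reverse order of opening and returns the
    // service labels in the order in which they were closed.
    std::vector<std::string> destroy()
    {
        std::vector<std::string> closed;
        while (!_sessions.empty()) {
            Session s = std::move(_sessions.back());
            _sessions.pop_back();
            if (s.close)
                s.close();
            closed.push_back(s.service);
        }
        return closed;
    }
};

} // namespace sketch
```

In this model the client needs no shutdown logic of its own: the flush happens as a side effect of the parent closing the file-system session.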
Hello Adrian,
* a3an <a3an@...294...> [2015-02-04 22:14:37 +0100]:
I have no explanation for the invocation of virtual void File_system::sync(), which is where my trace comes from. It could well be my C++ ignorance.
No problem, it is not so easy to follow at first.
In addition to the PWRN macro I placed between the braces, I also modified the run script to not delete ext2.raw. After the first make run/rump_ext2, I reran the actual run part with qemu manually. So if sync() was successfully executed in the first run, the second run should find a clean file system. Which is not the case:
[init -> rump_fs] Backend::Backend(): Backend blk_size 512
[init -> rump_fs] rump: /genode: file system not clean; please fsck(8)
[init -> test-libc_vfs] calling mkdir(dir_name, 0777) dir_name=testdir
[init -> test-libc_vfs] mkdir(dir_name, 0777) failed, ret=-1, errno=28
[init] virtual void Genode::Child_policy::exit(int): child "test-libc_vfs" exited with exit value -1
I have to assume that sync() does not reset the not-yet-synced flag in the superblock, which surprises me. But that would be a rump_fs internal issue.
The problem here is the current implementation of rump_fs. Although we sync the file system regularly and the sync/dirty flag should be cleared, the “s_state” field in the Ext2 superblock has not been updated because the file system is never unmounted. When rump_fs mounts the used image, it checks this field and prints the message. There is a commit [1] that addresses this issue by mounting the file system when the first client opens a session and unmounting it when the last client closes its session, but it is not on the master branch yet.
So, all in all, the data written to the file system is safe, rump_fs (and thereby the rumpkernel) does its job, and the message can be ignored. I admit the current situation is not optimal. It would be nice if rump_fs always ran fsck before mounting the file system and only notified the user if something really is wrong.
Cheers Josef
[1] https://github.com/cnuke/genode/commit/5ba470b6b (Note the commit is old and may or may not apply to the current master branch.)
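The idea behind that commit, mounting with the first session and unmounting with the last, amounts to reference counting. The following sketch shows only the counting logic. It is not taken from the commit; the class name Mount_on_demand and the injected mount and unmount callbacks are assumptions:

```cpp
#include <functional>
#include <utility>

namespace sketch {

// Hypothetical helper: keeps the file system mounted exactly as long as
// at least one client session exists.
class Mount_on_demand
{
    unsigned              _sessions = 0;
    std::function<void()> _mount;
    std::function<void()> _unmount;

public:
    Mount_on_demand(std::function<void()> mount, std::function<void()> unmount)
    : _mount(std::move(mount)), _unmount(std::move(unmount)) { }

    // First session triggers the mount.
    void session_opened()
    {
        if (_sessions++ == 0)
            _mount();
    }

    // Last session triggers the unmount, which gives the file system the
    // chance to write back a clean state.
    void session_closed()
    {
        if (_sessions == 0)
            return;              // ignore an unbalanced close
        if (--_sessions == 0)
            _unmount();
    }

    bool mounted() const { return _sessions > 0; }
};

} // namespace sketch
```

Combined with closing sessions on subsystem destruction, this would let the unmount happen without any explicit call from the client.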