Making Linux Stacking-Friendly
(an abstract)

Josef Sipek and Erez Zadok

Appeared in the 2007 Linux Storage and Filesystem Workshop, co-located with USENIX FAST (February 2007)

Although stackable file systems can be used in Linux today, their maintainers must address a number of issues. Some of these issues stem from the fact that the Linux kernel was not designed with stackable file systems in mind. However, it is not necessary to redesign the entire VFS. Several specific parts of the VFS, if modified, would improve the correctness of stackable file systems and make them easier to develop.

While revising the Unionfs code for inclusion into the kernel, we have been forced to address these correctness and maintainability issues. The first involves modification of the lower file system. Currently, if the lower file system is modified, the upper (stacked) file system is not aware of the changes. This cache inconsistency can lead to system instability or data loss. Second, some stackable file systems do not need to modify the stored data when presenting it to user-space. For example, Unionfs's inodes could theoretically share the page mapping with the lower file system, yet neither the VFS nor the VM provides a simple and clean way of accomplishing this. Third, some applications expect inode numbers to persist over time. Stackable file systems must therefore reliably map the lower inode numbers to the ones they expose to user-space. This issue particularly affects fan-out file systems such as Unionfs, as they may have several lower inodes to choose from. Finally, we are addressing code reuse. Virtually all stackable file systems perform several common actions, so the code for these actions should be shared instead of duplicated in each file system.
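The inode-number mapping problem can be illustrated with a small user-space sketch in C. It is not the Unionfs implementation, and every name in it (struct ino_map, ino_map_get, INO_MAP_CAPACITY) is hypothetical. It assumes a fan-out file system that identifies each lower inode by a (branch, lower inode number) pair and hands out upper inode numbers sequentially, so that repeated lookups of the same lower inode always yield the same exposed number.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fixed table size; a real mapping would grow or live on disk. */
#define INO_MAP_CAPACITY 1024

/* Identifies one lower inode: the branch it lives on plus its number there. */
struct lower_key {
    unsigned int branch;     /* index of the lower file system (fan-out) */
    uint64_t     lower_ino;  /* inode number within that branch */
};

/* Table of known lower inodes; slot i is exposed as first_ino + i. */
struct ino_map {
    struct lower_key keys[INO_MAP_CAPACITY];
    size_t           used;
    uint64_t         first_ino;  /* must be nonzero: 0 signals "table full" */
};

static void ino_map_init(struct ino_map *map, uint64_t first_ino)
{
    assert(map != NULL);
    assert(first_ino != 0);
    map->used = 0;
    map->first_ino = first_ino;
}

/*
 * Return the upper inode number for (branch, lower_ino), allocating a new
 * one on first use. Returns 0 if the table is full.
 */
static uint64_t ino_map_get(struct ino_map *map, unsigned int branch,
                            uint64_t lower_ino)
{
    size_t i;

    assert(map != NULL);

    /* Linear search keeps the sketch short; a real table would hash. */
    for (i = 0; i < map->used; i++) {
        if (map->keys[i].branch == branch &&
            map->keys[i].lower_ino == lower_ino)
            return map->first_ino + i;
    }

    if (map->used == INO_MAP_CAPACITY)
        return 0;

    /* First sighting of this lower inode: claim the next free slot. */
    i = map->used;
    map->keys[i].branch = branch;
    map->keys[i].lower_ino = lower_ino;
    map->used++;
    return map->first_ino + i;
}
```

Keying on the branch as well as the lower inode number matters for fan-out: two branches may reuse the same lower inode number for unrelated files. The sketch keeps the table in memory only; for the numbers to persist across mounts, the table would also have to be stored somewhere stable, which is left out here.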

Some of these problems are relatively easy to solve. For example, some code can be shared by all stackable file systems. Some work has already been done in this direction with the creation of a generic fs-stack layer (currently in Andrew Morton's -mm tree): a set of functions used by all stackable file systems. Persistent inode mapping is somewhat file system specific, and Unionfs implements one of the simpler suggestions made over time. Other issues, such as cache coherency and shared memory mappings, remain open questions. These questions should be answered soon, as eCryptfs has made it into the Linux kernel (2.6.19), and we expect other stackable file systems to follow in the coming years.

In this talk, we discuss some of the issues involved in developing stackable file systems in the Linux kernel and propose solutions for them.