Prerequisites

This article is aimed at people who are curious, or who have always dreamed of using their own file system, because why not.

To follow along, you will need some familiarity with the C language (especially pointers to functions), GCC and the FUSE library. It’s okay if you don’t really know the FUSE library yet; I will explain it in more detail below.

Introduction

So first of all, you might ask: what is a file system?

It is a system that governs how files are organized and accessed. It provides a data storage service that allows your processes to share mass storage.

For example, on Linux there is the VFS (Virtual File System), the software layer in the kernel that provides the filesystem interface to userspace programs.

It also provides an abstraction within the kernel that allows different filesystem implementations to coexist.

VFS system calls (open(2), read(2), write(2), and so on) are called from a process context. But that’s not all! There is also FUSE!

FUSE (Filesystem in Userspace) is a software interface for Unix computer operating systems that lets non-privileged users create their own file system without editing kernel code.

This is possible because the file system code runs in user space, while the FUSE kernel module only provides a bridge to the actual kernel interfaces.

How convenient! We already have a kernel module, so we won’t need to go through the pain of writing one ourselves.

Here is a diagram illustrating the relation and the calls between the kernel and the userspace with FUSE.

Kernel/Userspace relation

Table of Contents

  1. FUSE
  2. FUSE Operations Structure
  3. Simple filesystem
  4. GETATTR
  5. READDIR
  6. READ
  7. Run our filesystem
  8. Conclusion
  9. Bibliography

FUSE

To use FUSE, you need to install the library. On Ubuntu/Debian, you can get it with apt-get install fuse libfuse-dev.

At this point, you might be thinking, ‘FUSE sounds interesting, but isn’t it kind of niche and obscure?’

Well, let me ask: have you heard of SSHFS (SSH Filesystem)?

It is a tool that allows you to access files and directories on a remote computer as if they were on your own machine.

It does this using SSH, a protocol that securely connects to remote systems.

With SSHFS, you can mount a remote file system over an SSH connection, meaning the files on the remote system appear in your local file explorer, and you can work with them as if they were stored locally.

It uses SFTP (SSH File Transfer Protocol) for secure file transfer and management.

This makes SSHFS useful for securely accessing and managing files from another computer over a network.

The current version of SSHFS is built using FUSE, and it was rewritten by Miklos Szeredi, who also created FUSE.

If you’re curious about other filesystems built with FUSE, you can find more examples here.

Now, let’s dive into writing our own filesystem. To get started, we’ll explore the FUSE operations structure and then implement a few key functions to create a simple working filesystem.

For our simple filesystem, we’re aiming to create a read-only system that can open files and read their contents.

FUSE Operations Structure

FUSE has a structure called fuse_operations that is especially useful for what we want to achieve.

This structure consists of pointers to various functions, and each function is called by FUSE when a specific event occurs in the filesystem.

For example, when the user tries to create a folder, the “mkdir” function is triggered; when the user tries to write to a file, the “write” function is called, and so on.

If you examine it closely, you’ll notice that many of its members mirror system calls you already know.

struct fuse_operations {
	int (*getattr) (const char *, struct stat *);
	int (*readlink) (const char *, char *, size_t);
	int (*getdir) (const char *, fuse_dirh_t, fuse_dirfil_t);
	int (*mkdir) (const char *, mode_t);
	int (*unlink) (const char *);
	int (*rmdir) (const char *);
	int (*symlink) (const char *, const char *);
	int (*rename) (const char *, const char *);
	int (*link) (const char *, const char *);
	int (*chmod) (const char *, mode_t);
	int (*chown) (const char *, uid_t, gid_t);
	int (*open) (const char *, struct fuse_file_info *);
	int (*read) (const char *, char *, size_t, off_t,
		         struct fuse_file_info *);
	int (*write) (const char *, const char *, size_t, off_t,
		          struct fuse_file_info *);
	int (*statfs) (const char *, struct statvfs *);
	int (*flush) (const char *, struct fuse_file_info *);
	int (*getxattr) (const char *, const char *, char *, size_t);
	int (*opendir) (const char *, struct fuse_file_info *);
	int (*readdir) (const char *, void *, fuse_fill_dir_t, off_t,
			        struct fuse_file_info *);
	int (*access) (const char *, int);
	int (*create) (const char *, mode_t, struct fuse_file_info *);
	int (*write_buf) (const char *, struct fuse_bufvec *buf, off_t off, 
                      struct fuse_file_info*);
	int (*read_buf) (const char *, struct fuse_bufvec **bufp, size_t size, off_t off,
                     struct fuse_file_info *);
        .
        .
        .
};

Great! We have our own functions that will be called when needed.

Next, we need to define and implement them by filling the structure with our functions.

Obviously we don’t need all of them for what we want to do, just a few.
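To make the idea of a table of function pointers more concrete, here is a minimal sketch that does not depend on FUSE at all. The names toy_operations, toy_unlink_ok and toy_dispatch_unlink are hypothetical and exist only for this illustration; it is a simplified model of the pattern, not libfuse’s actual code. It shows how a dispatcher can call a handler when one was registered and fall back to ENOSYS ("Function not implemented") when the pointer was left empty.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, cut-down stand-in for struct fuse_operations. */
struct toy_operations {
	int (*unlink)(const char *path);
	int (*rmdir)(const char *path);
};

/* Example handler: pretend every removal succeeds. */
static int toy_unlink_ok(const char *path)
{
	(void)path; /* unused in this sketch */
	return 0;
}

/* Call the handler if one was registered, otherwise report ENOSYS. */
static int toy_dispatch_unlink(const struct toy_operations *ops, const char *path)
{
	if (ops->unlink == NULL)
		return -ENOSYS; /* "Function not implemented" */
	return ops->unlink(path);
}
```

With a structure whose unlink member is NULL, toy_dispatch_unlink returns -ENOSYS; once the member points to toy_unlink_ok, the same call returns 0. This is why leaving most members of the real structure unset is a reasonable way to get a read-only filesystem.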

Let’s take a look at the FUSE documentation structure and the role of each function.

The most essential function for a functional FS is “getattr”. Similar to stat(), it retrieves file or file system status.

This function will be called when the system needs to access file properties. The other functions we need are “readdir” and “read”.

“readdir” is triggered when the user tries to list the files and directories in a specific directory.

And lastly, “read” is called when the system tries to read data from a file.

Simple filesystem

Like we said previously, we want to write a read-only FS. Let’s start by implementing the “getattr” function.

Getattr

Getattr is called when the system tries to get the attributes of a file. It should return information about each file in our FS by filling a structure of type “stat” (see man 2 stat). You can also check out the structure with man 3 stat.

For our purposes, we’ll keep it simple and won’t need to fill out every field in the structure.

You can refer to fuse getattr documentation for more details.

Now, let’s define our function. The function signature should match the pointer in the fuse_operations structure.

  • The first parameter is the path of the file the system wants information about.
  • The second parameter is the stat structure we need to update with the file’s attributes.

For fields that aren’t relevant or necessary, we can set them to 0 or provide a reasonable default value.

Here’s an example of how we can write this function.

/* Must be defined before including fuse.h */
#define FUSE_USE_VERSION 30

#include <fuse.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <time.h>
#include <string.h>
#include <stdlib.h>

/* The parameter is named st so that it does not shadow stat(2) */
static int my_getattr(const char *path, struct stat *st)
{
	/* Debug message */
	printf("[GETATTR] : Get attributes of %s\n", path);

	st->st_uid = getuid(); /* User ID of owner */
	st->st_gid = getgid(); /* Group ID of owner */
	st->st_atime = time(NULL); /* Time of last access */
	st->st_mtime = time(NULL); /* Time of last modification */

	if (strcmp(path, "/") == 0)
	{
		st->st_mode = S_IFDIR | 0755; /* File type and mode */
		st->st_nlink = 2; /* Number of hard links */
	}
	else
	{
		/* Every other path is reported as a regular file,
		   even one that readdir never lists */
		st->st_mode = S_IFREG | 0644; /* File type and mode */
		st->st_nlink = 1; /* Number of hard links */
		st->st_size = 1024; /* Total size, in bytes (placeholder value) */
	}

	return 0;
}

The first few fields are straightforward to fill using existing functions. For the remaining fields, we have two different cases depending on the file path.

The st_mode field contains the file type (regular file, directory, …) and the mode (permission bits). For the root directory we use 0755, which ls displays as drwxr-xr-x: the owner can read, write and execute (enter) the directory, while the group and others can only read and execute it.

The st_nlink field is the number of hard links pointing to path. A directory has at least two: its entry in the parent directory and its own "." entry. A regular file needs only one.

When the path is a file, we use the 0644 permission bits so that the owner can read and write, while the group and others can only read. And we don’t forget to set the size of the file.

Readdir

Next, we’ll implement the readdir function. Let’s take a quick look at the documentation. The function should list the elements (files and directories) inside a directory. In our version we only have the root directory.

The parameters for this function are as follows:

  • The first is the path of the directory.
  • The second is the buffer we need to update with the directory’s contents.
  • The third is a FUSE helper function (fuse_fill_dir_t) that we use to add entries to the buffer.

Looking at the documentation for fuse_fill_dir_t, we see that we need to provide: the buffer, the name of the element, the element attributes and the offset of the next entry or zero.

static int my_readdir(const char *path, void *buffer, fuse_fill_dir_t filler,
					  off_t offset, struct fuse_file_info *fi)
{
	/* Debug message */
	printf("[READDIR] : Get list of files in %s\n", path);

	filler(buffer, ".", NULL, 0); /* Current Directory */
	filler(buffer, "..", NULL, 0); /* Parent Directory */

	if (strcmp(path, "/") == 0)
	{
		filler(buffer, "blog_gistre.md", NULL, 0);
		filler(buffer, "yet_another_kind_of_file", NULL, 0);
	}

	return 0;
}

In our FS we hard-code the files; here the example uses “blog_gistre.md” and “yet_another_kind_of_file”.
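The filler callback can look a bit magical, so here is a sketch of the pattern without FUSE. The toy_filler function takes the same parameters as fuse_fill_dir_t in the libfuse 2.x headers, but it is a hypothetical stand-in that simply appends names to a small buffer; the real filler is provided by the library. toy_dirbuf and toy_readdir are also names invented for this example.

```c
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical buffer that collects entry names, one per line. */
struct toy_dirbuf {
	char names[256];
	size_t used;
};

/* Same parameter list as fuse_fill_dir_t; returns 1 when the buffer is full. */
static int toy_filler(void *buf, const char *name, const struct stat *stbuf, off_t off)
{
	struct toy_dirbuf *dir = buf;
	size_t len = strlen(name);

	(void)stbuf; /* attributes are not used in this sketch */
	(void)off;   /* offsets are not used in this sketch */

	/* Need room for the name, a newline and the terminating NUL. */
	if (dir->used + len + 2 > sizeof(dir->names))
		return 1;
	memcpy(dir->names + dir->used, name, len);
	dir->used += len;
	dir->names[dir->used++] = '\n';
	dir->names[dir->used] = '\0';
	return 0;
}

/* A readdir-like function that only talks to the callback it was given. */
static int toy_readdir(const char *path, void *buf,
		       int (*filler)(void *, const char *, const struct stat *, off_t))
{
	filler(buf, ".", NULL, 0);
	filler(buf, "..", NULL, 0);
	if (strcmp(path, "/") == 0) {
		filler(buf, "blog_gistre.md", NULL, 0);
		filler(buf, "yet_another_kind_of_file", NULL, 0);
	}
	return 0;
}
```

The point of the callback is that the readdir function never needs to know how the buffer is laid out: it hands over one name at a time and lets the filler decide how to store it.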

Read

Finally, we need the read function; once again, here is the documentation. This one is more basic: it reads the content of a specific file.

In order:

  • The first parameter is the path of the file the system wants to read.
  • The second is the buffer that receives the data.
  • The third is the size of the data we want to read.
  • The fourth is the offset in the file content at which the read starts. You can compare it to the offset set by lseek (see man lseek).

The function works similarly to the standard read system call: it should return the number of bytes actually read.

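#include <errno.h> /* ENOENT */

static int my_read(const char *path, char *buffer, size_t size, off_t offset,
				   struct fuse_file_info *fi)
{
	char firstFileText[] = "You can find lot of interesting articles at : blog.gistre.epita.fr";
	char secondFileText[] = "Hello World from Yet Another Kind of File !";
	char *selectedText = NULL;
	size_t length;

	if (strcmp(path, "/blog_gistre.md") == 0)
		selectedText = firstFileText;
	else if (strcmp(path, "/yet_another_kind_of_file") == 0)
		selectedText = secondFileText;
	else
		return -ENOENT; /* No such file */

	/* Debug message */
	printf("[READ] : File : %s Content : %s\n", path, selectedText);

	/* Reading at or past the end of the content yields 0 bytes */
	length = strlen(selectedText);
	if (offset < 0 || (size_t)offset >= length)
		return 0;

	/* Never copy more bytes than the content holds after offset */
	if (size > length - (size_t)offset)
		size = length - (size_t)offset;
	memcpy(buffer, selectedText + offset, size);

	/* Number of bytes actually copied into buffer */
	return (int)size;
}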

In our filesystem, we also hard-code the content of the files that we hard-coded in the directory listing.
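To see how size and offset work together, here is a small sketch, independent of FUSE, of a caller reading a text in fixed-size chunks. The helpers read_from_text and read_all are hypothetical names for this example. The first one copies at most size bytes starting at offset and never reads past the end of the text; the second one keeps calling it with a growing offset until it gets 0 back, which is how a reader detects the end of a file.

```c
#include <string.h>
#include <sys/types.h>

/* Copy at most size bytes of text, starting at offset, into buffer.
 * Returns the number of bytes copied, or 0 at or past the end. */
static int read_from_text(const char *text, char *buffer, size_t size, off_t offset)
{
	size_t length = strlen(text);

	if (offset < 0 || (size_t)offset >= length)
		return 0;
	if (size > length - (size_t)offset)
		size = length - (size_t)offset;
	memcpy(buffer, text + offset, size);
	return (int)size;
}

/* Read the text chunk by chunk into out, then NUL-terminate it.
 * Stops early if out cannot hold another chunk. Returns the total read. */
static size_t read_all(const char *text, char *out, size_t out_size, size_t chunk)
{
	size_t total = 0;
	int n;

	while (total + chunk < out_size &&
	       (n = read_from_text(text, out + total, chunk, (off_t)total)) > 0)
		total += (size_t)n;
	out[total] = '\0';
	return total;
}
```

For instance, reading "abcdef" with size 4 at offset 4 copies only the two bytes "ef", and a read at offset 6 returns 0.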

Run our filesystem

After implementing all these functions, we need a main. FUSE provides its own main function, fuse_main; we only need to call it, passing in a fuse_operations structure filled with the functions we’ve implemented.

static struct fuse_operations myfs_operations = {
    .getattr = my_getattr,
    .readdir = my_readdir,
    .read = my_read,
};

int main( int argc, char *argv[] )
{
	return fuse_main(argc, argv, &myfs_operations, NULL);
}

Now, compile it and launch it. We need to give the compiler the flags for the FUSE library; to do so we use pkg-config. To see the debug messages we added earlier with printf, use the -f flag, which keeps the program running in the foreground.

$ gcc myfs.c -o myfs `pkg-config fuse --cflags --libs`
$ ./myfs -f [mount point]

Let’s try it together.

$ mkdir mnt
$ gcc myfs.c -o myfs `pkg-config fuse --cflags --libs`
$ ./myfs -f ~/mnt
[GETATTR] : Get attributes of /.xdg-volume-info
[GETATTR] : Get attributes of /
[GETATTR] : Get attributes of /BDMV
[GETATTR] : Get attributes of /autorun.inf
[READDIR] : Get list of files in /
[GETATTR] : Get attributes of /blog_gistre.md
[GETATTR] : Get attributes of /yet_another_kind_of_file
[GETATTR] : Get attributes of /
[READDIR] : Get list of files in /

In a second terminal, let’s try a few commands to test the behavior of our filesystem.

Commands starting with $1 will be in the first terminal (where the filesystem is running), and those starting with $2 will be in the second terminal (where we interact with the filesystem).

$2 ls mnt

$1 [GETATTR] : Get attributes of /
$1 [READDIR] : Get list of files in /
$1 [GETATTR] : Get attributes of /blog_gistre.md
$1 [GETATTR] : Get attributes of /yet_another_kind_of_file

$2 cat mnt/blog_gistre.md
$2 You can find lot of interesting articles at : blog.gistre.epita.fr

$1 [READDIR] : Get list of files in /
$1 [GETATTR] : Get attributes of /blog_gistre.md
$1 [GETATTR] : Get attributes of /yet_another_kind_of_file
$1 [GETATTR] : Get attributes of /b
$1 [READDIR] : Get list of files in /
$1 [GETATTR] : Get attributes of /blog_gistre.md
$1 [GETATTR] : Get attributes of /blog_gistre.md
$1 [READ] : File : /blog_gistre.md Content : You can find lot of interesting articles at : blog.gistre.epita.fr

$2 rm mnt/yet_another_kind_of_file
$2 rm: cannot remove 'mnt/yet_another_kind_of_file': Function not implemented

$1 [GETATTR] : Get attributes of /
$1 [GETATTR] : Get attributes of /y
$1 [GETATTR] : Get attributes of /
$1 [READDIR] : Get list of files in /
$1 [GETATTR] : Get attributes of /yet_another_kind_of_file
$1 [GETATTR] : Get attributes of /yet_another_kind_of_file

Conclusion

To wrap up, you now know the basics and the methodology for writing your own FS using FUSE. The filesystem we built together is a simple one, as it doesn’t do much beyond triggering specific functions. If you want to upgrade it and create a more functional filesystem, you can add the following features:

  • Create new directories
  • Create new files
  • Write to files

To implement these features, you’ll need to add the following fuse_operations functions:

  • mkdir
  • mknod
  • write

Additionally, you’ll need to update the previous functions to ensure everything works together smoothly:

  • getattr
  • readdir
  • read
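As a starting point for the write feature, here is a rough sketch of the core logic, kept independent of FUSE: a file stored in memory that grows when data is written past its end. The structure toy_memfile and the function toy_write are hypothetical names for this illustration; a real write callback would also have to find the right file from its path and keep getattr in sync with the new size.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical in-memory file: a growable byte buffer. */
struct toy_memfile {
	char *data;
	size_t size;
};

/* Write size bytes at offset, growing the file and zero-filling any gap.
 * Returns the number of bytes written or a negative errno value. */
static int toy_write(struct toy_memfile *f, const char *buffer, size_t size, off_t offset)
{
	size_t end;

	if (offset < 0)
		return -EINVAL;
	if (size == 0)
		return 0; /* nothing to do */
	end = (size_t)offset + size;

	if (end > f->size) {
		char *grown = realloc(f->data, end);
		if (grown == NULL)
			return -ENOMEM;
		/* Zero the region between the old end and the write start, if any. */
		if ((size_t)offset > f->size)
			memset(grown + f->size, 0, (size_t)offset - f->size);
		f->data = grown;
		f->size = end;
	}
	memcpy(f->data + offset, buffer, size);
	return (int)size;
}
```

Starting from an empty file, writing "Hello" at offset 0 and then "World" at offset 6 leaves an 11-byte file with a zero byte between the two words. Remember to free the data member once the file is no longer needed.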

The full source code for this article is available on GitHub. Thanks for following along, and I hope you enjoyed it!

Bibliography