Restic: Cannot find docs for the exclude file syntax

Created on 8 Jun 2017  ·  17 Comments  ·  Source: restic/restic

I can't seem to find any docs on how the exclude file syntax is parsed.

That is: does it support wildcards? Regex? How does it differentiate between files and directories? Are path prefixes needed, and if so, relative to what (the cwd or the root)?

Some examples:

.qiv-trash (directory that could be anywhere on the filesystem)
.DS_Store
lost+found/
._*
desktop.ini
Thumbs.db (file that could be anywhere on the filesystem)
.Trash-* (the asterisk could be any number, is it needed?)
.tmp$ (file ending in .tmp)
~$ (file ending in a tilde)
~/.cache/ (cache directory in user home dir, using tilde syntax)
/full/path/to/directory/.syncthing/index*
documentation wanted

Hi, have you seen

Patterns use filepath.Glob internally, see filepath.Match for syntax. Additionally ** excludes arbitrary subdirectories. Environment-variables in exclude-files are expanded with os.ExpandEnv.

in https://github.com/restic/restic/blob/master/doc/manual.rst?

I think this should answer your questions.

I went to the documentation at https://restic.readthedocs.io/en/latest/manual.html and searched for exclude, where it unfortunately doesn't include any common examples (or a reference to the other documentation at manual.rst) :/

I read the golang doc and I think the end-user (me!) isn't going to know what restic is comparing an exclude to internally - is it the full path (e.g. /home/me/blah) or a path from the repository root (/blah or blah), or relative to the cwd (me/blah when I am at /home)?

Thanks for raising this issue, I think you have a valid point. The manual should explain the exclude filters without referencing godoc.org, and more examples are necessary.

To answer a few of your questions already:

  • All patterns are tested against the full path of a file/dir to be saved
  • Relative paths/patterns will match anywhere below the path to be saved
  • At the moment there's no way to distinguish between a file and a directory, so --exclude foo will exclude any files and directories named foo. The same goes for --exclude foo/.

From your excludes file:

  • ._* will match all files and directories whose name starts with a dot and an underscore
  • desktop.ini will match all files called desktop.ini exactly. So desktop.ini.bak is not excluded and is saved in the snapshot.
  • .Trash-* excludes files/dirs named .Trash-, .Trash-foobar, etc.
  • .tmp$ excludes all files/dirs literally named .tmp$, that is a dot, followed by tmp, followed by a dollar sign. No regexp expansion.
  • ~$ excludes all files/dirs literally named tilde dollar. For excluding all files/dirs ending in a tilde, use *~.
  • ~/.cache excludes the directory .cache in all dirs called tilde, that is, dirs literally named ~. To exclude only the cache directory in your home directory, use $HOME/.cache. The tilde is not expanded by restic. Environment variables are expanded, but only in a file read via --exclude-file. On the command line, the shell expands both.
  • /full/path/to/directory/.syncthing/index* excludes all things with names starting with index below /full/path/to/directory/.syncthing.
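The name-level behaviour in the list above can be tried out with a small sketch. It uses Python's fnmatch.fnmatchcase as a stand-in for Go's filepath.Match (an assumption: the two agree on * and ? for names without slashes, but differ in details such as character-class negation). The helper name_matches and the sample file names are made up for illustration.

```python
from fnmatch import fnmatchcase

def name_matches(pattern: str, name: str) -> bool:
    """Shell-style glob on a single file or directory name (no regex)."""
    return fnmatchcase(name, pattern)

# (pattern, candidate name) pairs based on the exclude file in this thread.
CASES = [
    ("._*", "._resource"),              # leading dot and underscore
    ("desktop.ini", "desktop.ini"),     # exact name
    ("desktop.ini", "desktop.ini.bak"), # longer name, not the same file
    (".Trash-*", ".Trash-"),            # * may match nothing at all
    (".Trash-*", ".Trash-1000"),
    (".tmp$", "report.tmp"),            # $ is a literal character, not an anchor
    (".tmp$", ".tmp$"),
    ("*~", "notes.txt~"),               # the glob way to say "ends in a tilde"
]

for pattern, name in CASES:
    print(f"{pattern!r:14} vs {name!r:18} -> {name_matches(pattern, name)}")
```

Only the name comparison is shown here. How a pattern is positioned against the full path is a separate question.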

Thanks @fd0

So with "current directory", you don't mean the directory I was in when I launched the backup, but the directory whose files restic is currently examining (apart from excludes that begin with a slash). Got it.

The behaviour for files and directories is slightly unexpected. I would have expected --exclude foo/ to back up the directory but not its contents, and --exclude foo to back up neither. Not sure why; from rsync, I guess.

My examples missed an important one: spaces! I guess I need to escape those and shell metacharacters with a backslash.

I ended up copying lots of these ones: https://gist.github.com/jult/e2eaedad6b9e29d95977fea0ddffae7d

Are comments allowed in the excludes file? Edit: https://github.com/restic/restic/pull/916/commits/c796d84fca48feea91ca3e85fbf38e16f764a468 looks like a hash is the comment character.
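For anyone wondering how such a file might be read, here is a rough sketch and not restic's actual code. It assumes that lines starting with a hash are comments (per the commit linked above), that blank lines are skipped and surrounding whitespace is trimmed (both my assumptions), and that environment variables are expanded while the tilde is left alone. The function parse_exclude_file and the sample contents are hypothetical, and string.Template only approximates Go's os.ExpandEnv (for instance, it leaves unknown variables in place instead of replacing them with an empty string).

```python
from string import Template

def parse_exclude_file(text: str, env: dict) -> list:
    """Return the patterns in an exclude file, one per non-comment line."""
    patterns = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # assumed: blank lines and hash comments are ignored
        # Expand $VAR / ${VAR}; a tilde or a lone trailing $ is left untouched.
        patterns.append(Template(line).safe_substitute(env))
    return patterns

SAMPLE = """\
# editor backups
*~

$HOME/.cache
~/.cache
.tmp$
"""

patterns = parse_exclude_file(SAMPLE, {"HOME": "/home/user"})
for p in patterns:
    print(p)
```

The env mapping is passed in explicitly so the example does not depend on the environment of whoever runs it.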

Ah, I'm afraid that's still not completely correct. I'll describe how restic evaluates the exclude patterns. Let's suppose that restic is run by a user in his home directory (/home/user) like this:

$ restic backup --exclude='*.bak' --exclude='/home/user/secret' --exclude='extra' ~

Then restic will see the following command line arguments (after expansion by the shell):

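["restic", "backup", "--exclude=*.bak", "--exclude=/home/user/secret", "--exclude=extra", "/home/user"]

Note that the shell has removed the single quotes and replaced ~ with /home/user, so restic never sees either.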
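The quote handling can be checked without a shell. This is only a sketch: Python's shlex.split follows POSIX quoting rules, so it shows that the single quotes are consumed before the program sees its arguments, but it does not perform tilde expansion, which a real shell would also do.

```python
import shlex

command = (
    "restic backup --exclude='*.bak' "
    "--exclude='/home/user/secret' --exclude='extra' ~"
)

# Split the way a POSIX shell would, minus tilde/variable/glob expansion.
argv = shlex.split(command)
for arg in argv:
    print(repr(arg))
```

The last element stays as ~ here. In a real shell it would already have been replaced by the home directory.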

Then, it starts traversing /home/user. The following list describes what happens when the named file/dir is seen. restic always tests the complete path against the patterns:

  • file /home/user/foo.bak: The pattern *.bak matches and the file is not saved. The pattern is not absolute so it matches everywhere for all files ending in .bak.
  • dir /home/user/secret: The absolute pattern /home/user/secret matches, so the dir is not saved and not traversed
  • dir /home/user/foo/home/user/secret: No pattern matches, so the dir is saved.
  • dir /home/user/work/extra: The pattern extra matches, the dir is not saved.

I hope that this is a bit clearer now, I'll add a section to the manual describing the process. The key take-away point is that the patterns are evaluated against the full path of the files during backup. So if you want to match a single directory, use the complete path, otherwise it may match several times somewhere.

Any further questions? :)

I have further questions, and I very much appreciate the time you are taking to answer them. It's really one question about automagic anchoring of patterns. I think I can guess the answer (we rely on the absolute path being fairly unique and giving us the behaviour we want), but it's best to ask and be sure.

Would the absolute pattern /home/user/secret match /home/user/secret2? (What if you don't want it to?)
Would the absolute pattern /home/user/secret match /home/user/somemount/home/user/secret?

In both cases: No, the pattern won't match.

Why is that? Edit: I'm pleased that it doesn't, but I don't see why that is :)

The matching code is modeled on what a shell would do. Ask yourself: if the file /home/user/secret2 exists but the file secret does not, what would ls /home/user/secret print? It would report that there is no such file, because the name secret is not the name secret2.

In more formal terms: If the pattern starts with a / it is absolute and the pattern must match at the beginning of the string under test, so pattern /home/user/secret does not match /home/user/somemount[...]: The pattern is not a prefix of the string.

You can imagine for yourself that the pattern and the file path are both split into their respective components:

  • /home/user/secret is split into [ROOT, "home", "user", "secret"] and the file /home/user/somemount/home/user/secret is split into [ROOT, "home", "user", "somemount", "home", "user", "secret"]. The string ROOT is used in this example to mark the root directory. You can see that the pattern is not contained in the file name.

  • Let's look at the file /home/user/secret2, which is split into [ROOT, "home", "user", "secret2"]. Again you can see that the pattern is not contained in the file name.

  • For the file /home/user/secret/secret.txt, which is split into [ROOT, "home", "user", "secret", "secret.txt"] you can see that the pattern is indeed contained in the file name, right at the beginning: [ROOT, "home", "user", "secret", ...], therefore the pattern matches and the file is excluded.
  • Let's say we have a relative exclude pattern of secret/secret.txt, which is split into ["secret", "secret.txt"]. You can see that this pattern can be found in the list for the file /home/user/secret/secret.txt, starting at offset 3: [ROOT, "home", "user", "secret", "secret.txt"], so the pattern matches.

When you have wildcards (*, ? and so on) in a path component, they are also tested. So for your first example, a pattern of /home/user/secret* would match the path /home/user/secret2.
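The component-by-component comparison described above can be sketched as follows. This is an illustration of the explanation in this thread, not restic's implementation: the function names are made up, fnmatch.fnmatchcase stands in for Go's filepath.Match, and ** is not handled.

```python
from fnmatch import fnmatchcase

def components(path: str) -> list:
    """Split a path on '/', dropping empty parts (so 'foo/' equals 'foo')."""
    return [part for part in path.split("/") if part]

def is_excluded(pattern: str, path: str) -> bool:
    """True if `pattern` matches `path` or any parent directory of it."""
    pat, parts = components(pattern), components(path)
    if not pat:
        return False
    if pattern.startswith("/"):
        offsets = [0]  # absolute pattern: must match at the start of the path
    else:
        offsets = range(len(parts) - len(pat) + 1)  # relative: any offset
    for start in offsets:
        window = parts[start:start + len(pat)]
        # Every pattern component must glob-match the component it lines up with.
        if len(window) == len(pat) and all(
            fnmatchcase(name, glob) for glob, name in zip(pat, window)
        ):
            return True
    return False

EXAMPLES = [
    ("/home/user/secret", "/home/user/secret2"),
    ("/home/user/secret", "/home/user/somemount/home/user/secret"),
    ("/home/user/secret", "/home/user/secret/secret.txt"),
    ("secret/secret.txt", "/home/user/secret/secret.txt"),
    ("/home/user/secret*", "/home/user/secret2"),
]
for pattern, path in EXAMPLES:
    print(pattern, path, is_excluded(pattern, path))
```

Because whole components are compared, secret can never match secret2 unless the pattern itself contains a wildcard.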

All these examples should be documented in the manual, I think.

Gotcha. Thanks.

Are negative excludes possible, à la .gitignore?
Say I want to exclude all content in directories named .meteor except for the nested dir .meteor/local/db. Could I do this?

/etc/restic/excludes:
.meteor/
!.meteor/local/db
restic backup --exclude-file=/etc/restic/excludes

No, that is not implemented yet.

Documenting include/exclude examples is tracked in #396, I'm closing this issue here.

@fd0 I am trying to exclude paths like jobs//jobs//builds/**/archive, in order to exclude the archive dir from all directories. Would that work? I have multiple paths like this that I need to exclude. Can you suggest the best way to deal with this kind of pattern? I cannot find any example of it in the documentation.
