Restic: Add command to copy all data to another repository

Created on 25 Oct 2015  ·  22 Comments  ·  Source: restic/restic

During the discussion in #320 we discovered that it may be helpful to have functionality that copies all data (data blobs, tree blobs, snapshots) from a repository to a new one, recreating pack files and indexes on the fly. This allows creating a new repository in a different location (e.g. moving from a local repository to an sftp server) and using that from now on without losing any history or old snapshots.

This issue tracks the implementation of this feature and can be closed when it is implemented.

work in progress  ·  feature suggestion

All 22 comments

Is this intended to handle a one-time copy from one repository (A) to a new one (B)? Or is this meant to be more general by performing a "sync" or update of changed content between (A) and (B) since the last sync?

At the moment this is intended to handle a one-time copy only, so that users can migrate to a different repository in a different location, or with a new master key.

Given a slow internet connection, I would like to be able to back up to S3 and another location as efficiently as possible.

@witeshadow I'm not sure how that can be done efficiently, as the data is encrypted in repo A with master key A' and needs to go to repo B with a different master key B'. We need to read in all the data, decrypt it with A', encrypt it with B' and write it out. There is no way to optimize this for slow bandwidth. It's gonna hurt...
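
To make the decrypt-then-encrypt step concrete, here is a minimal Go sketch. It is only an illustration: it uses AES-256-GCM from the standard library as a stand-in cipher, which is an assumption and not restic's actual on-disk format, and the helper names (`newAEAD`, `seal`, `open`, `reencrypt`) are made up for this example.

```go
package main

import (
	"bytes"
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"fmt"
)

// newAEAD builds an AES-256-GCM cipher from a 32-byte key.
// Stand-in only: this is NOT the construction restic itself uses.
func newAEAD(key []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// seal encrypts plaintext and prepends a fresh random nonce.
func seal(aead cipher.AEAD, plaintext []byte) ([]byte, error) {
	nonce := make([]byte, aead.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	return aead.Seal(nonce, nonce, plaintext, nil), nil
}

// open splits off the nonce, then authenticates and decrypts the rest.
func open(aead cipher.AEAD, sealed []byte) ([]byte, error) {
	n := aead.NonceSize()
	if len(sealed) < n {
		return nil, fmt.Errorf("ciphertext too short")
	}
	return aead.Open(nil, sealed[:n], sealed[n:], nil)
}

// reencrypt decrypts a blob with the source key and encrypts it again
// with the destination key. The full plaintext passes through memory.
func reencrypt(src, dst cipher.AEAD, sealed []byte) ([]byte, error) {
	plain, err := open(src, sealed)
	if err != nil {
		return nil, err
	}
	return seal(dst, plain)
}

func main() {
	a, err := newAEAD(bytes.Repeat([]byte{0xA1}, 32)) // hypothetical key A'
	if err != nil {
		panic(err)
	}
	b, err := newAEAD(bytes.Repeat([]byte{0xB2}, 32)) // hypothetical key B'
	if err != nil {
		panic(err)
	}
	inA, err := seal(a, []byte("blob contents"))
	if err != nil {
		panic(err)
	}
	inB, err := reencrypt(a, b, inA)
	if err != nil {
		panic(err)
	}
	plain, err := open(b, inB)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s\n", plain)
	_, err = open(a, inB)
	fmt.Println("old key still opens it:", err == nil)
}
```

Every byte has to be read, decrypted and re-encrypted, which is why nothing here can be skipped to save bandwidth when the two keys differ.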

The only optimization I can think of is having selection criteria on the source repo A, using the host, path and tag filters so you don't have to copy everything. However, that depends on your use case.
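
As a rough illustration of such a selection step, the Go sketch below filters a list of snapshots by host, tags and paths before anything is copied. The `Snapshot` and `Filter` types and the matching rules (an empty field matches everything, all listed tags and paths must be present) are assumptions made for this example; they are not restic's real types or its exact filter semantics.

```go
package main

import "fmt"

// Snapshot holds only the metadata fields the filters look at.
type Snapshot struct {
	ID       string
	Hostname string
	Tags     []string
	Paths    []string
}

// Filter is a hypothetical selection; a zero-value field matches everything.
type Filter struct {
	Host  string
	Tags  []string
	Paths []string
}

// hasAll reports whether every wanted string occurs in have.
func hasAll(have, want []string) bool {
	set := make(map[string]bool, len(have))
	for _, h := range have {
		set[h] = true
	}
	for _, w := range want {
		if !set[w] {
			return false
		}
	}
	return true
}

// Matches reports whether the snapshot satisfies every non-empty criterion.
func (f Filter) Matches(s Snapshot) bool {
	if f.Host != "" && f.Host != s.Hostname {
		return false
	}
	return hasAll(s.Tags, f.Tags) && hasAll(s.Paths, f.Paths)
}

// Select returns the snapshots that should be copied.
func Select(all []Snapshot, f Filter) []Snapshot {
	var out []Snapshot
	for _, s := range all {
		if f.Matches(s) {
			out = append(out, s)
		}
	}
	return out
}

func main() {
	snaps := []Snapshot{
		{ID: "a1", Hostname: "laptop", Tags: []string{"daily"}, Paths: []string{"/home"}},
		{ID: "b2", Hostname: "server", Tags: []string{"daily"}, Paths: []string{"/srv"}},
		{ID: "c3", Hostname: "laptop", Tags: []string{"weekly", "offsite"}, Paths: []string{"/home"}},
	}
	for _, s := range Select(snaps, Filter{Host: "laptop", Tags: []string{"weekly"}}) {
		fmt.Println(s.ID) // prints c3
	}
}
```

Only the blobs referenced by the selected snapshots would then need to be transferred, which is where the saving comes from.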

@fd0 I just wanted to add my vote for this feature request. Anything I can do to make it happen?

You could implement it... The functionality itself is not hard to do, configuring the two backends is the hard thing. We don't support accessing more than one backend (e.g. there's only one $B2_ACCOUNT_ID)... so I think this feature depends on a proper config file (see #16).

Let's say we have two repos, A and B, and you'd like to sync A->B so that after the process is finished, the set of blobs (and snapshots) in B is a superset of the set of blobs in A.

So, you open both repos and load the index files for each one. Then you iterate over the index of A, for each blob checking if the blob is also contained in B. If this is true, move on to the next. If it's false, download, decrypt, encrypt and upload it to B.

Last is copying the snapshot files over. For each snapshot file in A, decrypt the file, encrypt it again for B, store it there and it's done.

As I said, the technical implementation is rather easy :)
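
The steps described above can be sketched in Go roughly as follows. This is a toy model under stated assumptions: the `Repo` type is an in-memory stand-in, the XOR "cipher" is a placeholder for real encryption under a master key, and IDs are taken as the SHA-256 of the plaintext so that the same content has the same ID in both repositories. Skipping snapshots that already exist in B is an addition of this sketch. None of the names are restic's actual API.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// Repo is a toy in-memory repository. key is a placeholder "master key".
type Repo struct {
	key       byte
	blobs     map[string][]byte // index: blob ID -> encrypted blob
	snapshots map[string][]byte // snapshot ID -> encrypted snapshot file
}

func NewRepo(key byte) *Repo {
	return &Repo{key: key, blobs: map[string][]byte{}, snapshots: map[string][]byte{}}
}

// crypt XORs every byte with the key; applying it twice restores the input.
// Placeholder only, not real encryption.
func (r *Repo) crypt(data []byte) []byte {
	out := make([]byte, len(data))
	for i, b := range data {
		out[i] = b ^ r.key
	}
	return out
}

// id names content by the hash of its plaintext, so IDs match across repos.
func id(plain []byte) string {
	sum := sha256.Sum256(plain)
	return hex.EncodeToString(sum[:])
}

// SaveBlob encrypts and stores a plaintext blob, returning its ID.
func (r *Repo) SaveBlob(plain []byte) string {
	k := id(plain)
	r.blobs[k] = r.crypt(plain)
	return k
}

// SaveSnapshot encrypts and stores a plaintext snapshot file, returning its ID.
func (r *Repo) SaveSnapshot(plain []byte) string {
	k := id(plain)
	r.snapshots[k] = r.crypt(plain)
	return k
}

// Sync makes dst a superset of src: anything missing in dst is decrypted
// with src's key and re-encrypted with dst's key. It returns how many
// blobs and snapshots were actually transferred.
func Sync(src, dst *Repo) (blobs, snaps int) {
	for k, enc := range src.blobs {
		if _, ok := dst.blobs[k]; ok {
			continue // already contained in B, move on
		}
		dst.blobs[k] = dst.crypt(src.crypt(enc))
		blobs++
	}
	for k, enc := range src.snapshots {
		if _, ok := dst.snapshots[k]; ok {
			continue
		}
		dst.snapshots[k] = dst.crypt(src.crypt(enc))
		snaps++
	}
	return blobs, snaps
}

func main() {
	a, b := NewRepo(0x5A), NewRepo(0xC3)
	a.SaveBlob([]byte("chunk one"))
	a.SaveBlob([]byte("chunk two"))
	b.SaveBlob([]byte("chunk two")) // B already has this blob
	a.SaveSnapshot([]byte(`{"hostname":"laptop"}`))

	blobs, snaps := Sync(a, b)
	fmt.Println("blobs copied:", blobs, "snapshots copied:", snaps)
}
```

Because the index lookup happens before any transfer, running `Sync` a second time copies nothing, which is what makes repeated A->B runs cheap when the two repositories already overlap.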

Great! Thanks for the tips. I have this itch, so I will see if I can make time to scratch it -- but for the short-term I will have to go without this restic merge feature. If someone gets to it before I do, that's fine -- or I'll circle back around to this eventually!

I think I have this implemented already... /me scratches head and looks for it...
... https://github.com/middelink/restic/tree/fix-323
I need to check if it still compiles though, that branch is 228 commits behind ...

It might be useful to allow not only a full copy, but also a copy of a subset of snapshots. This would support a use case suggested by #1910 (back up to a primary repo often, and from there back up to offsite/slower/more expensive storage less often) and, I think, would not be a lot harder to implement than a full copy. Might be a future addition, though :-)

Err… Any news for mere users without dev skills to compile and try out @middelink’s suggestion?

This is mostly a "me too" comment, but I'd like to have the ability to copy only specific snapshots from one repo to another, rather than a "copy-all" or "sync" semantic; e.g., make daily backups to local storage, then once a week copy only the most recent daily to an s3 bucket, etc.

Well, then you are in luck, my copy cmd takes one or more snapshot ids. In fact copy-all is not something it does. You would have to list your snapshot ids first and then concatenate them on the "restic copy" cmdline. As I see this as a degenerate use-case, I'm good with it.

Without delving too deep into this, perhaps some discussions with ncw/rclone could be of use...

I'm also interested in the merge/copy functionality. I have a repository on a USB stick that I would like to merge/copy into my central repository (same passwords).
Any news on this?

Looks like the fork branch was updated to master, but there's not yet a PR for it.

@middelink Is your code finished / mergeable? If not, what still needs to be done? This is a feature I really want :)

@theoretical2019 The code itself is finished, but each time I sit down to create an official PR, I keep finding things I need to do before it's ready. Like documentation, like an unreleased/changelog entry...
Oh, and tests! Did I mention tests? It needs tests :P

@middelink FYI, I have tested your branch by rebasing it onto upstream master and it works pretty well. It created a new snapshot with the same host, tags and date :+1:
Waiting for PR :tada:

With such a feature, I can create a secondary repository that the clients use only when the first repository is locked for maintenance (e.g. prune). The prune task can then trigger a copy from the secondary after it finishes, so there are no missing backups and hence zero downtime on the backup service.

@middelink Would you be so kind as to create a PR of your code? When doing so, please also allow edits from maintainers - this way, we can help you with the changelog, documentation and so on.

The important thing is that we get a base PR to work on. I'd love to get your great work moving, and so would others I think :) Let me know if you need any help creating the PR!

@rawtaz Sure. Let me sync up and all that stuff. For some reason I have not found the time to do so earlier, but it looks like I have some time now.

Thank you everyone for your work on this!

I've got one question left that's not answered by the docs (at least for me): Do I need to prune both or is it enough to do it in the source and snapshot deletions are propagated?

@lfrancke When using the copy command you specifically list the snapshots that you want to copy. Any other snapshots, whether existing, non-existing or previously-existing-but-now-pruned-and-no-longer-existing, are not affected.

If you copy snapshots from repo A to repo B and then forget and prune them in repo A, they will not be forgotten and pruned in repo B automatically; you'll have to do that in repo B yourself.

Excellent, thank you very much @rawtaz for the quick and helpful response.
