borg diff: add json output

Created on 11 Apr 2018  ·  6 Comments  ·  Source: borgbackup/borg

Right now borg diff shows something like this:

+27.4 kB  -27.4 kB var/vmail/example.com/foobar/Auto/Steam und Co/dovecot-uidlist
+490.4 kB -489.3 kB var/vmail/example.com/foobar/Auto/Steam und Co/dovecot.index.cache
 +29.6 kB  -29.3 kB var/vmail/example.com/foobar/Auto/Steam und Co/dovecot.index.log
added      62.77 kB var/vmail/example.com/foobar/Auto/Steam und Co/new/1520542202.M174402P24533.turtle,S=62768,W=63599
  +4.5 kB   -4.5 kB var/vmail/example.com/foobar/maildirsize

And that's great, except that these human-readable sizes are a bit tough for a machine to parse. Ideally there would be an option to output machine-readable sizes (that is, just the byte count, without any unit prefix).
Speaking of prefixes: are these kB as in 1000 bytes, or KiB as in 1024 bytes? While at it, maybe this could be clarified, too. Thanks!
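
To illustrate why the rounded sizes are awkward for scripts, here is a minimal Python sketch that converts a string such as "+27.4 kB" back to a byte count. It assumes decimal prefixes (kB = 1000 bytes), which is exactly the point the question above leaves open; the name parse_size and the unit table are hypothetical and not part of borg.

```python
import re

# Assumption: decimal (SI) prefixes, i.e. 1 kB = 1000 bytes.
UNITS = {"B": 1, "kB": 1000, "MB": 1000**2, "GB": 1000**3, "TB": 1000**4}

# Optional sign, a decimal number, optional whitespace, then a unit.
SIZE_RE = re.compile(r"^([+-]?)(\d+(?:\.\d+)?)\s*([kMGT]?B)$")


def parse_size(text):
    """Convert a string like '+27.4 kB' to a signed, approximate byte count."""
    match = SIZE_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised size: {text!r}")
    sign, number, unit = match.groups()
    value = round(float(number) * UNITS[unit])
    return -value if sign == "-" else value
```

Even under that assumption the result is only approximate, because the displayed value has already been rounded: "62.77 kB" parses to 62770 bytes, whatever the exact size was.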

All 6 comments

Did you check whether we have json output support for that?

I did indeed check the output of borg help diff for that. The only JSON-related option I found is this:

  --log-json            Output one JSON object per log line instead of
                        formatted text.

So no luck there, I am afraid.

OK, if we do not have that yet, adding json output seems to be a good idea.

In that case I would also like to request including both the uncompressed and compressed size of each change. As far as I can understand the Borg source code, the current output refers to the uncompressed size.

I propose to add an option --json-output for diff.
And the diff output should be something like this:

{
  "added": [
    {"path": "/path/to/file", "change": "27.4 kB"}
  ],
  "modified": [
    {"path": "/path/to/file", "change": "+490.4 kB -489.3 kB"}
  ],
  "deleted": [
    {"path": "/path/to/file", "change": "directory"}
  ]
}

This way, users will get an easy way to parse the output, because they will have separate groups and fields.
So I'm going to add an inner function print_output_json to do_diff, which will produce output like the above from the diffs generator.

Also, for a non-human-readable format we could add an option --bytes that shows file sizes without units. I think it would be nice to have this option for the standard output too.
Currently the diff function uses the overridden ItemDiff.__repr__(), which in turn uses ItemDiff._content_string() to get the difference representation. As we can't add arguments to __repr__, in my opinion it would be good to add a separate function and move the current __repr__ content there. __repr__ would invoke that function to get the standard output. The function would take an argument (e.g. in_bytes) and pass it to Item.get_size(). Frankly, I haven't dived into Item.get_size() yet, but I think I can get the item size in bytes.
Any thoughts or suggestions?
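
The refactoring described in this comment could look roughly like the following Python sketch. ItemDiffSketch, format and _human are hypothetical stand-ins written for illustration only; they are not borg's actual ItemDiff, and the sizes are assumed to be available as plain integers rather than obtained through Item.get_size().

```python
class ItemDiffSketch:
    """Hypothetical stand-in for ItemDiff; only illustrates the refactoring."""

    def __init__(self, path, added, removed):
        self.path = path
        self.added = added      # bytes added (assumed to be a known int)
        self.removed = removed  # bytes removed (assumed to be a known int)

    @staticmethod
    def _human(size):
        # Simplified formatter assuming decimal units; not borg's own helper.
        if size < 1000:
            return f"{size} B"
        for unit in ("kB", "MB", "GB"):
            size /= 1000
            if size < 1000 or unit == "GB":
                return f"{size:.1f} {unit}"

    def format(self, in_bytes=False):
        # The former __repr__ body lives here so it can accept an argument.
        fmt = str if in_bytes else self._human
        return f"+{fmt(self.added)} -{fmt(self.removed)} {self.path}"

    def __repr__(self):
        # __repr__ cannot take arguments, so it delegates with the default.
        return self.format()


diff = ItemDiffSketch("maildirsize", 4500, 4500)
print(repr(diff))                  # human-readable form
print(diff.format(in_bytes=True))  # raw byte counts
```

Keeping __repr__ as a thin wrapper means the default output stays unchanged, while a caller handling a --bytes style option can request exact byte counts.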

I propose to add an option --json-output for diff.

That should either be --json or --json-lines (depending on which it outputs) for consistency with other commands.

And the diff output should be something like this:
This way, users will get an easy way to parse the output

No, that looks rather awkward to parse and recombine into something useful. The whole change part is no better than parsing the existing non-JSON output, I'm afraid.
I'd rather suggest one entry per path, with a type field for file/directory/hardlink/softlink etc. and a change list that lists the change types, e.g. added, deleted, modified (or content :thinking: ), owner, mode, etc. Sizes should be given in bytes, of course, and probably rather like sizes { old: 12345, new: 12346}
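
As a rough illustration of the per-path layout suggested here, the Python sketch below builds one record per path and serialises the records as JSON lines. The field names (path, type, changes, sizes) follow the comment above, but the exact schema and the helper names diff_entry and to_json_lines are assumptions made for this example, not an agreed format or existing borg code.

```python
import json


def diff_entry(path, item_type, changes, old_size=None, new_size=None):
    """Build one JSON-serialisable record per path (hypothetical schema)."""
    entry = {"path": path, "type": item_type, "changes": list(changes)}
    # Sizes are plain byte counts; a missing side (e.g. for an added
    # file) is left as None and becomes null in the JSON output.
    if old_size is not None or new_size is not None:
        entry["sizes"] = {"old": old_size, "new": new_size}
    return entry


def to_json_lines(entries):
    """Serialise entries as JSON lines: one compact object per line."""
    return "\n".join(json.dumps(entry, sort_keys=True) for entry in entries)


entries = [
    diff_entry("path/to/file", "file", ["modified", "mode"], 12345, 12346),
    diff_entry("path/to/dir", "directory", ["deleted"]),
]
print(to_json_lines(entries))
```

A consumer can then handle each line independently with json.loads and filter on the changes list, without having to recombine separate added/modified/deleted groups.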
