On Windows, tr_sys_path_rename could pass MOVEFILE_COPY_ALLOWED to MoveFileEx rather than relying on its own built-in "read & copy" loop from utils.c:tr_moveFile.
The Linux version could then use the more modern copy_file_range instead of looping over each buffer.
Perhaps something like this:
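ssize_t tr_sys_file_copy(tr_sys_file_t infile, tr_sys_file_t outfile, size_t len, tr_error** error)
{
    TR_ASSERT(infile != TR_BAD_SYS_FILE);
    TR_ASSERT(outfile != TR_BAD_SYS_FILE);

    unsigned int const flags = 0; /* must be 0; reserved for future use */

    /* NULL offsets: copy from and to the current file positions.
       Returns the number of bytes copied (possibly fewer than len), or -1 on error. */
    ssize_t const res = copy_file_range(infile, NULL, outfile, NULL, len, flags);

    if (res == -1)
    {
        set_system_error(error, errno);
    }

    return res;
}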
And maybe a fallback:
uint64_t tr_sys_file_copy_fallback(tr_sys_file_t infile, tr_sys_file_t outfile, size_t len, tr_error** error)
{
    TR_ASSERT(infile != TR_BAD_SYS_FILE);
    TR_ASSERT(outfile != TR_BAD_SYS_FILE);

    uint64_t bytesLeft = len;
    size_t const buflen = 1024 * 128; /* 128 KiB buffer */
    char* const buf = tr_malloc(buflen);

    while (bytesLeft > 0)
    {
        uint64_t const bytesThisPass = MIN(bytesLeft, buflen);
        uint64_t numRead;
        uint64_t bytesWritten;

        /* stop on a read error or on an unexpected end of file */
        if (!tr_sys_file_read(infile, buf, bytesThisPass, &numRead, error) || numRead == 0)
        {
            break;
        }

        if (!tr_sys_file_write(outfile, buf, numRead, &bytesWritten, error))
        {
            break;
        }

        TR_ASSERT(numRead == bytesWritten);
        TR_ASSERT(bytesWritten <= bytesLeft);
        bytesLeft -= bytesWritten;
    }

    tr_free(buf);

    /* number of bytes NOT copied; 0 means everything was copied */
    return bytesLeft;
}
Then the Linux version could use the more modern copy_file_range instead of looping over each.
This cannot be done with just one call (or in a simple loop):
If file_in is a sparse file, then copy_file_range() may expand any
holes existing in the requested range. Users may benefit from calling
copy_file_range() in a loop, and using the lseek(2) SEEK_DATA and
SEEK_HOLE operations to find the locations of data segments.
(man copy_file_range)
Ideally we should ensure proper locking and use sendfile() in a separate thread.
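To make the man page excerpt above concrete, here is a minimal sketch of a sparse-aware copy on Linux. It is not code from Transmission: sparse_copy_sketch is a hypothetical helper name, it works on raw file descriptors rather than tr_sys_file_t, and it assumes glibc 2.27 or later (for the copy_file_range wrapper) and a file system that supports SEEK_DATA and SEEK_HOLE. On file systems without hole support the whole file is reported as one data segment, so it degrades to a plain copy.

```c
#define _GNU_SOURCE /* for copy_file_range(), SEEK_DATA, SEEK_HOLE, off64_t */
#include <errno.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not part of libtransmission.
 * Copies all of fd_in to fd_out while skipping holes.
 * Returns 0 on success, -1 on error with errno set. */
static int sparse_copy_sketch(int fd_in, int fd_out)
{
    struct stat st;

    if (fstat(fd_in, &st) == -1)
    {
        return -1;
    }

    off_t pos = 0;

    while (pos < st.st_size)
    {
        /* Find the start of the next data segment at or after pos. */
        off_t const data = lseek(fd_in, pos, SEEK_DATA);

        if (data == -1)
        {
            if (errno == ENXIO) /* only a trailing hole remains */
            {
                break;
            }

            return -1;
        }

        /* Find where that data segment ends (next hole, or end of file). */
        off_t const hole = lseek(fd_in, data, SEEK_HOLE);

        if (hole == -1)
        {
            return -1;
        }

        /* Explicit offsets, so the lseek() calls above do not interfere. */
        off64_t in_off = data;
        off64_t out_off = data;

        while (in_off < hole)
        {
            /* A single call may copy fewer bytes than requested. */
            ssize_t const n = copy_file_range(fd_in, &in_off, fd_out, &out_off, (size_t)(hole - in_off), 0);

            if (n == -1)
            {
                return -1;
            }

            if (n == 0) /* unexpected end of file, e.g. source was truncated */
            {
                break;
            }
        }

        pos = hole;
    }

    /* Give the destination the same length so a trailing hole is kept. */
    return ftruncate(fd_out, st.st_size);
}
```

The outer loop walks the data segments and the inner loop handles short copies, which is why neither a single call nor a single flat loop is enough if holes are to be preserved.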
That's pretty much the same as the current "move file" code when rename fails. That doesn't keep sparse sections either, so I didn't consider that a problem when replacing it.
For reference, the current "move file" code:
https://github.com/transmission/transmission/blob/master/libtransmission/utils.c#L1682
https://github.com/transmission/transmission/blob/master/libtransmission/utils.c#L1744
That doesn't keep sparse sections either, so I didn't consider that a problem when replacing it.
And it is missing preallocation for normal files.
Yeah. That too. The current bad behaviour of "move file" is enough to cause absurd loads in some setups. If possible, it seems that the fallback in glibc's copy_file_range is a better implementation too.
I've had a play with this. It's hard to get right in a cross-platform way: there doesn't seem to be any POSIX interface for an in-kernel copy. But I think it is worth special-casing each operating system, because as file systems get more efficient at copies we want to pick up those features automatically.
I have a basic implementation up and running on macOS and FreeBSD, but I need to set up a few more systems to test it on.
I think this work is orthogonal to making the file move asynchronous. One should not stand in the way of the other. For volunteer projects, scope creep is the enemy of productivity.
Changes are here: https://github.com/transmission/transmission/compare/master...RobCrowston:kernelcopy-wip
In brief, I created a new abstraction called tr_sys_path_copy(). (I went with a copy instead of a move because, when we make this an async operation, it may be necessary to keep the original file around for some indeterminate period after the copy completes so that file handles in other threads can be closed.)
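As a rough illustration of what such an abstraction might look like on Linux, here is a minimal sketch. It is not the code from the linked branch: path_copy_sketch and copy_userspace_sketch are hypothetical names, the signature is an assumption, error reporting is reduced to a bool, and only the Linux fast path is shown (other platforms would get their own branch). It assumes glibc 2.27 or later for copy_file_range.

```c
#define _GNU_SOURCE /* for copy_file_range() on Linux/glibc */
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <sys/types.h>
#include <unistd.h>

/* User-space fallback: a plain read/write loop from the current file positions. */
static bool copy_userspace_sketch(int in, int out)
{
    char buf[128 * 1024];

    for (;;)
    {
        ssize_t const n_read = read(in, buf, sizeof(buf));

        if (n_read == 0) /* end of file */
        {
            return true;
        }

        if (n_read == -1)
        {
            if (errno == EINTR)
            {
                continue;
            }

            return false;
        }

        for (ssize_t done = 0; done < n_read;)
        {
            ssize_t const n = write(out, buf + done, (size_t)(n_read - done));

            if (n == -1)
            {
                if (errno == EINTR)
                {
                    continue;
                }

                return false;
            }

            done += n;
        }
    }
}

/* Hypothetical stand-in for the proposed tr_sys_path_copy(). */
static bool path_copy_sketch(char const* src, char const* dst)
{
    int const in = open(src, O_RDONLY);

    if (in == -1)
    {
        return false;
    }

    int const out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0666);

    if (out == -1)
    {
        close(in);
        return false;
    }

#ifdef __linux__
    /* Fast path: in-kernel copy in chunks of up to 1 GiB.
     * The loop stops at end of file (0) or on the first error (-1). */
    while (copy_file_range(in, NULL, out, NULL, 1u << 30, 0) > 0)
    {
    }
#endif

    /* Both file positions only advance on success, so the fallback simply
     * continues from wherever the fast path stopped (or from the start). */
    bool ok = copy_userspace_sketch(in, out);

    close(in);

    if (close(out) == -1)
    {
        ok = false;
    }

    return ok;
}
```

Falling through to the user-space loop on any copy_file_range failure keeps the sketch short; a real implementation would probably only fall back on errors such as ENOSYS or EXDEV and report the rest.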
So far today, I've compiled and run the new test (transmission-test-copy) on macOS 10.15.1, Windows Server 2016, Linux 5.0 (Ubuntu 19.04), Linux 5.3 (Arch 2019-12-03), FreeBSD 12.1 (uses the userspace fallback), and FreeBSD 13. In each case I verified with a debugger that we are making the appropriate system calls. Unfortunately I wasn't able to get the code to build on OpenBSD 6.5, NetBSD 8.1, or Solaris 10 because of some unrelated libevent problems, but I am not sure these are still supported platforms for libtransmission. In any event, this change should not affect those platforms.
Still to do:
Feedback welcome.
+1 to making data moves asynchronous. Moving large torrents between filesystems locks the entire application.