This was originally a Stack Overflow question of mine, although it was probably too specific and too long for Stack Overflow. I later answered it myself, so the answer follows below.
I feel this might be interesting for some developers, which is why I am posting it here.
I am looking for an efficient way to copy files on Linux, on a FS which supports copy-on-write (COW).
Specifically, I want my implementation to use copy-on-write if possible, and otherwise fall back to other efficient variants. I also care about server-side copy (supported by SMB, NFS and others) and about zero-copy (i.e. bypassing the CPU or memory if possible).
(This question is not really specific to any programming language. It could be C or C++, but also Python, Go or any other language that has bindings to the OS syscalls or some other way to do a syscall. If this is confusing to you, just answer in C.)
It looks like ioctl_ficlonerange, ioctl_ficlone (i.e. ioctl with FICLONE or FICLONERANGE) support copy-on-write (COW). Specifically, FICLONE is used by GNU cp (here, via --reflink).
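As a rough illustration of what such a FICLONE call looks like from user space, here is a minimal Python sketch. The function name try_reflink is made up for this example, and the numeric value of FICLONE is an assumption: it is the x86/ARM encoding of _IOW(0x94, 9, int) from <linux/fs.h>, and other architectures may encode it differently.

```python
import fcntl

# Assumption: FICLONE as encoded on x86 and ARM, i.e. _IOW(0x94, 9, int).
# Some architectures use a different ioctl number encoding.
FICLONE = 0x40049409


def try_reflink(src_path, dst_path):
    """Try to make dst_path a copy-on-write clone of src_path.

    Returns True on success and False if the kernel or the file system
    refuses the request. On failure dst_path is left as an empty file.
    """
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        try:
            # The ioctl is issued on the destination; the source fd is the argument.
            fcntl.ioctl(dst.fileno(), FICLONE, src.fileno())
        except OSError:
            return False
    return True
```

On a file system without reflink support this simply returns False, so the caller has to fall back to some other copy method.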
Then there is also copy_file_range, which also seems to support COW and server-side copy.
(LWN about copy_file_range.)
It sounds as if copy_file_range is more generic (e.g. it supports server-side copy; not sure if that is supported by FICLONE).
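For comparison, a minimal sketch of a whole-file copy with copy_file_range, using Python's os.copy_file_range wrapper (available since Python 3.8 on Linux). The function name and the chunk parameter are made up for this example. The call may copy fewer bytes than requested, so it has to be used in a loop.

```python
import os


def copy_with_copy_file_range(src_path, dst_path, chunk=1 << 30):
    """Copy src_path to dst_path via copy_file_range; return the bytes copied.

    Raises OSError if the kernel or file system does not support the call.
    """
    total = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        remaining = os.fstat(src.fileno()).st_size
        while remaining > 0:
            # Without explicit offsets, the kernel uses and advances the
            # file positions of both descriptors.
            n = os.copy_file_range(src.fileno(), dst.fileno(), min(chunk, remaining))
            if n == 0:
                # The source ended early, e.g. it was truncated concurrently.
                break
            total += n
            remaining -= n
    return total
```

Whether this ends up as a reflink, a server-side copy or a plain in-kernel copy is decided by the kernel and the file system, not by the caller.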
However, copy_file_range seems to have some issues.
E.g. here, Paul Eggert comments:
[copy_file_range]'s man page says it uses a size_t (not off_t) to count the number of bytes to be copied, which is a strange choice for a file-copying API.
Are there situations where FICLONE would work better/different than copy_file_range?
Are there situations where FICLONE would work better/different than FICLONERANGE?
Specifically, assume the underlying FS supports this and you want to copy a file. I am asking which of the following these functions support:
- Copy-on-write support
- Server-side copy support
- Zero-copy support
Are they (FICLONE, FICLONERANGE, copy_file_range) always performing exactly the same operation? (Assuming the underlying FS supports copy-on-write, and/or server-side copy.)
Or are there situations where it makes sense to use copy_file_range instead of FICLONE? (E.g. COW only works with copy_file_range but not with FICLONE. Or the other way around. Or can this never happen?)
Or, to formulate the same question differently: Would copy_file_range always be fine, or are there situations where I would want to use FICLONE instead?
Why does GNU cp use FICLONE and not copy_file_range? (Is there a technical reason, or is this just historic?)
Related: GNU cp originally did not use reflink by default (see comment by the GNU coreutils maintainer Pádraig Brady).
However, that was changed recently (this commit, bug report 24400), i.e. COW behavior (--reflink=auto) is the default now, if possible.
Related question about Python for COW support.
Related discussion about FICLONE vs copy_file_range by Python developers. I.e. this seems to be a valid question, and it's not totally clear whether to use FICLONE or copy_file_range.
Related Syncthing documentation about the choice of methods for copying data between files, and Syncthing issue about copy_file_range and others for efficient file copying, e.g. with COW support.
It also suggests that it is not so clear that FICLONE would do the same as copy_file_range, so their solution is to just try all of them, falling back to the next, in this order:
ioctl (with FICLONE), copy_file_range, sendfile, duplicate_extents, standard.
Related issue by Go developers on the usage of copy_file_range.
It sounds as if they agree that copy_file_range is always to be preferred over sendfile.
The answer:
See the Linux vfs doc about copy_file_range, remap_file_range, FICLONERANGE, FICLONE and FIDEDUPERANGE.
Then see vfs_copy_file_range. This first tries to call remap_file_range if possible.
FICLONE calls ioctl_file_clone (here), and FICLONERANGE calls ioctl_file_clone_range.
ioctl_file_clone_range calls the more generic ioctl_file_clone (here).
ioctl_file_clone calls vfs_clone_file_range (here).
vfs_clone_file_range calls do_clone_file_range and that calls remap_file_range (here).
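Since both ioctls end up in the same ioctl_file_clone helper, FICLONE can be seen as FICLONERANGE with all offsets and the length set to zero (a length of zero means "until the end of the source file"). A minimal Python sketch of the range variant follows. The function name is made up for this example, and the value of FICLONERANGE is an assumption: it is the x86/ARM encoding of _IOW(0x94, 13, struct file_clone_range).

```python
import fcntl
import struct

# Assumption: FICLONERANGE as encoded on x86 and ARM,
# i.e. _IOW(0x94, 13, struct file_clone_range) with a 32-byte struct.
FICLONERANGE = 0x4020940D


def try_reflink_range(src_fd, dst_fd, src_offset=0, length=0, dst_offset=0):
    """Clone a byte range between two open file descriptors.

    With the default arguments (all zero) this asks for a clone of the
    whole source file. Returns True on success, False if the request
    is refused.
    """
    # struct file_clone_range {
    #     __s64 src_fd; __u64 src_offset; __u64 src_length; __u64 dest_offset;
    # };
    arg = struct.pack("=qQQQ", src_fd, src_offset, length, dst_offset)
    try:
        fcntl.ioctl(dst_fd, FICLONERANGE, arg)
    except OSError:
        return False
    return True
```

Non-zero offsets and lengths usually have to be aligned to the file system block size, so the partial-range form is more likely to be rejected than a whole-file clone.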
That answers the question: copy_file_range is more generic, and internally it anyway tries to call remap_file_range (i.e. the same as FICLONE/FICLONERANGE) first.
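I think the copy_file_range syscall is slightly newer than FICLONE though (both appeared under these names in Linux 4.5, but the clone ioctl existed earlier as the Btrfs-specific BTRFS_IOC_CLONE), i.e. it might be possible that copy_file_range is not available in your kernel but FICLONE is.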
In any case, if copy_file_range is available, it should be the best solution.
The order used by Syncthing (ioctl (with FICLONE), copy_file_range, sendfile, duplicate_extents, standard) makes sense.
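A fallback chain in that spirit could be sketched in Python as follows. This is not Syncthing's code, only an illustration: all function names are made up, the FICLONE value is the assumed x86/ARM encoding, and duplicate_extents is left out because it is a Windows mechanism.

```python
import fcntl
import os

FICLONE = 0x40049409  # assumption: x86/ARM encoding of _IOW(0x94, 9, int)


def _clone(src_fd, dst_fd, size):
    fcntl.ioctl(dst_fd, FICLONE, src_fd)


def _copy_file_range(src_fd, dst_fd, size):
    while size > 0:
        n = os.copy_file_range(src_fd, dst_fd, size)
        if n == 0:
            raise OSError("source ended early")
        size -= n


def _sendfile(src_fd, dst_fd, size):
    offset = 0
    while offset < size:
        n = os.sendfile(dst_fd, src_fd, offset, size - offset)
        if n == 0:
            raise OSError("source ended early")
        offset += n


def _read_write(src_fd, dst_fd, size):
    while True:
        buf = os.read(src_fd, 1 << 20)
        if not buf:
            break
        while buf:  # os.write may write only part of the buffer
            buf = buf[os.write(dst_fd, buf):]


METHODS = [
    ("ficlone", _clone),
    ("copy_file_range", _copy_file_range),
    ("sendfile", _sendfile),
    ("read_write", _read_write),
]


def copy_file(src_path, dst_path):
    """Copy a file, returning the name of the first method that worked."""
    src_fd = os.open(src_path, os.O_RDONLY)
    try:
        dst_fd = os.open(dst_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
        try:
            size = os.fstat(src_fd).st_size
            for name, method in METHODS:
                # Reset both files so a partially failed attempt leaves no data.
                os.lseek(src_fd, 0, os.SEEK_SET)
                os.lseek(dst_fd, 0, os.SEEK_SET)
                os.ftruncate(dst_fd, 0)
                try:
                    method(src_fd, dst_fd, size)
                    return name
                except (OSError, AttributeError):
                    continue  # unsupported here, try the next method
            raise OSError("all copy methods failed")
        finally:
            os.close(dst_fd)
    finally:
        os.close(src_fd)
```

Given the answer above, the explicit FICLONE step mostly matters on kernels where copy_file_range is missing; where it is available, the second step should cover the same ground.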