Sparse-file handling for the Linux implementation of std::fs::copy() #55909

Closed
27 commits
52f86d2
Start of testing for fs.rs.
tarka Nov 3, 2018
2e7931b
Add failing test of sparse copy.
tarka Nov 3, 2018
822760a
Add some more test cases.
tarka Nov 3, 2018
00e8db0
Minor test cleanups.
tarka Nov 4, 2018
e838cbd
Add sparse detection routine.
tarka Nov 4, 2018
abc69fe
Add test of current simple file copy before refactoring.
tarka Nov 6, 2018
b5eef91
Disable (deliberately) non-working tests for now.
tarka Nov 6, 2018
c0d936a
Refactor copy_file_range() call to enable copying of chunks.
tarka Nov 6, 2018
e6b19ec
Break Linux copy function out to own mod for sparse refactoring.
tarka Nov 6, 2018
d356451
Move some internal fns up to root.
tarka Nov 7, 2018
69cb124
Add functions to handle sparse files, with tests. Not used yet.
tarka Nov 7, 2018
091eec7
Add tests with sparse data.
tarka Nov 8, 2018
be75c6f
Interim checkin; start moving kernel/uspace branch to lower-level fn.
tarka Nov 8, 2018
dfc92e5
Move flag of copy_file_range to thread-local, and move the decision t…
tarka Nov 9, 2018
84313fb
Add override of kernel vs userspace copy for cross-device copy.
tarka Nov 9, 2018
16c4010
Minor cleanup of tests.
tarka Nov 10, 2018
054990f
Initial cut of user-space copy range.
tarka Nov 11, 2018
95460c8
Do cross-mount detection up-front, and replace current copy impl with…
tarka Nov 11, 2018
b6f6f97
Enable disabled tests.
tarka Nov 11, 2018
5b84746
Port in external test module as we now have a local one, and use std-…
tarka Nov 11, 2018
ce17903
Test and fix for userspace copies larger than read blocks.
tarka Nov 12, 2018
9e52d35
Use MetadataExt rather than direct stat() call.
tarka Nov 13, 2018
b933836
Minor test and fmt cleanups.
tarka Nov 13, 2018
92283c8
Travis runs under 3.x kernel, so we need to disable any tests that re…
tarka Nov 14, 2018
a437a7a
Add notes about platform-specific handling in copy().
tarka Nov 14, 2018
a435874
Minimise calls to .metadata().
tarka Nov 20, 2018
f1e7e35
Add copyright header.
tarka Nov 20, 2018
Test and fix for userspace copies larger than read blocks.
tarka committed Nov 12, 2018
commit ce17903eccdf8bf5200b355b9554dc79001281d0
48 changes: 40 additions & 8 deletions src/libstd/sys/unix/fs_linux.rs
@@ -1,5 +1,6 @@

use cell::RefCell;
+use cmp;
use io::{self, Error, ErrorKind, Read, Write};
use libc;
use mem;
@@ -94,17 +95,17 @@ fn copy_bytes_kernel(reader: &File, writer: &File, nbytes: usize) -> io::Result<

// Slightly modified version of io::copy() that only copies a set amount of bytes.
fn copy_bytes_uspace(mut reader: &File, mut writer: &File, nbytes: usize) -> io::Result<u64> {
+const BLKSIZE: usize = 4 * 1024; // Assume 4k blocks on disk.

This is a reasonable default for io::copy, which handles generic Read and Write types, but for fs::copy we know we're doing syscalls and can probably do better. E.g. GNU cp has bloated up over the years and now uses a buffer of 128KiB or stat.blksize, whichever is larger.

This is probably due to ever-growing throughput (thus more syscalls per unit of time) and larger true block sizes, e.g. on flash and RAID storage.

Oh, and of course it only does this for large files.

On the other hand, the Rust std default buffer size was dropped to 8k in #32695 due to problems with jemalloc. But that might not be relevant here, since jemalloc is not the default anymore and this would only apply to copying large files.
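
The sizing rule described in the comments above (the larger of 128KiB and the filesystem's preferred block size, applied only to large files) could look roughly like the sketch below. The names `MIN_BUF_SIZE`, `copy_buf_size` and `copy_buf_size_for` are hypothetical and are not part of this PR or of GNU cp. Capping the buffer at the file length is one possible reading of "only for large files", not a confirmed detail of either implementation.

```rust
use std::cmp;
use std::fs::Metadata;
use std::os::unix::fs::MetadataExt;

/// Assumed floor for the copy buffer, taken from the 128KiB figure in the comment.
const MIN_BUF_SIZE: u64 = 128 * 1024;

/// Hypothetical helper: choose a buffer size from the preferred I/O block
/// size (`st_blksize`) and the length of the source file.
fn copy_buf_size(blksize: u64, file_len: u64) -> usize {
    // Take whichever is larger: the fixed floor or the preferred block size.
    let preferred = cmp::max(MIN_BUF_SIZE, blksize);
    // Small files do not need a large buffer; keep at least one byte so the
    // buffer is never empty.
    cmp::min(preferred, cmp::max(file_len, 1)) as usize
}

/// Convenience wrapper that reads both inputs from a file's metadata.
fn copy_buf_size_for(meta: &Metadata) -> usize {
    copy_buf_size(meta.blksize(), meta.len())
}
```

A caller would pass `file.metadata()?` to `copy_buf_size_for` once, up front, and allocate a `Vec<u8>` of that size instead of a fixed stack array.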

let mut buf = unsafe {
-// Assume 4k blocks on disk.
-let mut buf: [u8; 4 * 1024] = mem::uninitialized();
+let mut buf: [u8; BLKSIZE] = mem::uninitialized();
reader.initializer().initialize(&mut buf);
buf
};

let mut written = 0;
while written < nbytes {
-let left = nbytes - written;
-let len = match reader.read(&mut buf[..left]) {
+let next = cmp::min(nbytes - written, BLKSIZE);
+let len = match reader.read(&mut buf[..next]) {
Ok(0) => return Err(Error::new(ErrorKind::InvalidData,
"Source file ended prematurely.")),
Ok(len) => len,
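
The fix in this hunk clamps each read to the buffer size. The old code sliced `buf[..left]` with `left = nbytes - written`, which panics as soon as more than one buffer's worth of data remains. A self-contained sketch of the same loop follows, written against generic `Read`/`Write` types and a zeroed buffer so it builds on current stable Rust. The name `copy_bytes` is hypothetical, and the `Interrupted` retry arm is an assumption borrowed from how `io::copy` behaves, since the rest of the original function is not visible in this diff.

```rust
use std::cmp;
use std::io::{self, Error, ErrorKind, Read, Write};

const BLKSIZE: usize = 4 * 1024; // Same 4k block assumption as the diff.

/// Copy exactly `nbytes` from `reader` to `writer`, one block at a time.
fn copy_bytes<R: Read, W: Write>(
    reader: &mut R,
    writer: &mut W,
    nbytes: usize,
) -> io::Result<u64> {
    let mut buf = [0u8; BLKSIZE];
    let mut written = 0;
    while written < nbytes {
        // Never ask for more than the buffer holds, nor more than remains.
        let next = cmp::min(nbytes - written, BLKSIZE);
        let len = match reader.read(&mut buf[..next]) {
            Ok(0) => {
                return Err(Error::new(
                    ErrorKind::InvalidData,
                    "Source file ended prematurely.",
                ))
            }
            Ok(len) => len,
            // Assumption: retry interrupted reads, as io::copy does.
            Err(ref e) if e.kind() == ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        };
        writer.write_all(&buf[..len])?;
        written += len;
    }
    Ok(written as u64)
}
```

With a 128KiB source, as in `test_copy_bytes_uspace_large`, this loop takes at least 32 iterations of at most 4KiB each.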
@@ -235,14 +236,15 @@ pub fn copy(from: &Path, to: &Path) -> io::Result<u64> {
#[cfg(test)]
mod tests {
use super::*;
+use iter;
use sys_common::io::test::{TempDir, tmpdir};
use fs::{read, OpenOptions};
use io::{Seek, SeekFrom, Write};
use path::PathBuf;

-fn create_sparse(file: &PathBuf, len: i64) {
+fn create_sparse(file: &PathBuf, len: u64) {
let fd = File::create(file).unwrap();
-cvt(unsafe {libc::ftruncate64(fd.as_raw_fd(), len)}).unwrap();
+cvt(unsafe {libc::ftruncate64(fd.as_raw_fd(), len as i64)}).unwrap();
}
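
Outside of libstd, the same sparse test fixture can be built without `unsafe`: `File::set_len` takes a `u64` and extends the file with zeros, which on Unix is implemented with a truncate call. The sketch below is an illustration rather than the PR's helper. Returning the allocated byte count is an addition for demonstration, and whether the extended region is actually stored as a hole depends on the filesystem.

```rust
use std::fs::File;
use std::io;
use std::os::unix::fs::MetadataExt;
use std::path::Path;

/// Create a file of `len` bytes containing no written data.
/// Returns (apparent length, bytes actually allocated on disk).
fn create_sparse(path: &Path, len: u64) -> io::Result<(u64, u64)> {
    let file = File::create(path)?;
    // Extend the empty file to `len`; the new range reads back as zeros.
    file.set_len(len)?;
    let meta = file.metadata()?;
    // st_blocks is counted in 512-byte units regardless of st_blksize.
    Ok((meta.len(), meta.blocks() * 512))
}
```

On a filesystem with sparse-file support the second value should come back far smaller than the first, which is the property a sparse-aware copy has to preserve.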

fn create_sparse_with_data(file: &PathBuf, head: u64, tail: u64) -> u64 {
@@ -321,7 +323,7 @@ mod tests {
write!(fd, "{}", data);
}

-create_sparse(&from, 1024*1024);
+create_sparse_with_data(&from, 0, 0);

{
let infd = File::open(&to).unwrap();
@@ -480,7 +482,7 @@ mod tests {


#[test]
-fn test_copy_bytes_uspace() {
+fn test_copy_bytes_uspace_small() {
let dir = tmpdir();
let (from, to) = tmps(&dir);
let data = "test data";
@@ -523,6 +525,36 @@ mod tests {
}
}

+#[test]
+fn test_copy_bytes_uspace_large() {
+let dir = tmpdir();
+let (from, to) = tmps(&dir);
+let size = 128*1024;
+let data = iter::repeat("X").take(size).collect::<String>();
+
+{
+let mut fd: File = File::create(&from).unwrap();
+write!(fd, "{}", data).unwrap();
+}
+
+{
+let infd = File::open(&from).unwrap();
+let outfd = File::create(&to).unwrap();
+let written = copy_bytes_uspace(&infd, &outfd, size).unwrap();
+
+assert_eq!(written, size as u64);
+}
+
+assert_eq!(from.metadata().unwrap().len(),
+to.metadata().unwrap().len());
+
+{
+let from_data = read(&from).unwrap();
+let to_data = read(&to).unwrap();
+assert_eq!(from_data, to_data);
+}
+}
+


