Skip to content

Commit

Permalink
Merge branch 'for-4.8/core' of git://git.kernel.dk/linux-block
Browse files Browse the repository at this point in the history
Pull core block updates from Jens Axboe:

   - the big change is the cleanup from Mike Christie, cleaning up our
     uses of command types and modified flags.  This is what will throw
     some merge conflicts

   - regression fix for the above for btrfs, from Vincent

   - following up to the above, better packing of struct request from
     Christoph

   - a 2038 fix for blktrace from Arnd

   - a few trivial/spelling fixes from Bart Van Assche

   - a front merge check fix from Damien, which could cause issues on
     SMR drives

   - Atari partition fix from Gabriel

   - convert cfq to highres timers, since jiffies isn't granular enough
     for some devices these days.  From Jan and Jeff

   - CFQ priority boost fix idle classes, from me

   - cleanup series from Ming, improving our bio/bvec iteration

   - a direct issue fix for blk-mq from Omar

   - fix for plug merging not involving the IO scheduler, like we do for
     other types of merges.  From Tahsin

   - expose DAX type internally and through sysfs.  From Toshi and Yigal

* 'for-4.8/core' of git://git.kernel.dk/linux-block: (76 commits)
  block: Fix front merge check
  block: do not merge requests without consulting with io scheduler
  block: Fix spelling in a source code comment
  block: expose QUEUE_FLAG_DAX in sysfs
  block: add QUEUE_FLAG_DAX for devices to advertise their DAX support
  Btrfs: fix comparison in __btrfs_map_block()
  block: atari: Return early for unsupported sector size
  Doc: block: Fix a typo in queue-sysfs.txt
  cfq-iosched: Charge at least 1 jiffie instead of 1 ns
  cfq-iosched: Fix regression in bonnie++ rewrite performance
  cfq-iosched: Convert slice_resid from u64 to s64
  block: Convert fifo_time from ulong to u64
  blktrace: avoid using timespec
  block/blk-cgroup.c: Declare local symbols static
  block/bio-integrity.c: Add #include "blk.h"
  block/partition-generic.c: Remove a set-but-not-used variable
  block: bio: kill BIO_MAX_SIZE
  cfq-iosched: temporarily boost queue priority for idle classes
  block: drbd: avoid to use BIO_MAX_SIZE
  block: bio: remove BIO_MAX_SECTORS
  ...
  • Loading branch information
torvalds committed Jul 26, 2016
2 parents 75a442e + 17007f3 commit d05d7f4
Show file tree
Hide file tree
Showing 199 changed files with 1,918 additions and 1,523 deletions.
2 changes: 1 addition & 1 deletion Documentation/block/queue-sysfs.txt
Original file line number Diff line number Diff line change
Expand Up @@ -53,7 +53,7 @@ disk.

logical_block_size (RO)
-----------------------
This is the logcal block size of the device, in bytes.
This is the logical block size of the device, in bytes.

max_hw_sectors_kb (RO)
----------------------
Expand Down
28 changes: 14 additions & 14 deletions Documentation/block/writeback_cache_control.txt
Original file line number Diff line number Diff line change
Expand Up @@ -20,11 +20,11 @@ a forced cache flush, and the Force Unit Access (FUA) flag for requests.
Explicit cache flushes
----------------------

The REQ_FLUSH flag can be OR ed into the r/w flags of a bio submitted from
The REQ_PREFLUSH flag can be OR ed into the r/w flags of a bio submitted from
the filesystem and will make sure the volatile cache of the storage device
has been flushed before the actual I/O operation is started. This explicitly
guarantees that previously completed write requests are on non-volatile
storage before the flagged bio starts. In addition the REQ_FLUSH flag can be
storage before the flagged bio starts. In addition the REQ_PREFLUSH flag can be
set on an otherwise empty bio structure, which causes only an explicit cache
flush without any dependent I/O. It is recommend to use
the blkdev_issue_flush() helper for a pure cache flush.
Expand All @@ -41,21 +41,21 @@ signaled after the data has been committed to non-volatile storage.
Implementation details for filesystems
--------------------------------------

Filesystems can simply set the REQ_FLUSH and REQ_FUA bits and do not have to
Filesystems can simply set the REQ_PREFLUSH and REQ_FUA bits and do not have to
worry if the underlying devices need any explicit cache flushing and how
the Forced Unit Access is implemented. The REQ_FLUSH and REQ_FUA flags
the Forced Unit Access is implemented. The REQ_PREFLUSH and REQ_FUA flags
may both be set on a single bio.


Implementation details for make_request_fn based block drivers
--------------------------------------------------------------

These drivers will always see the REQ_FLUSH and REQ_FUA bits as they sit
These drivers will always see the REQ_PREFLUSH and REQ_FUA bits as they sit
directly below the submit_bio interface. For remapping drivers the REQ_FUA
bits need to be propagated to underlying devices, and a global flush needs
to be implemented for bios with the REQ_FLUSH bit set. For real device
drivers that do not have a volatile cache the REQ_FLUSH and REQ_FUA bits
on non-empty bios can simply be ignored, and REQ_FLUSH requests without
to be implemented for bios with the REQ_PREFLUSH bit set. For real device
drivers that do not have a volatile cache the REQ_PREFLUSH and REQ_FUA bits
on non-empty bios can simply be ignored, and REQ_PREFLUSH requests without
data can be completed successfully without doing any work. Drivers for
devices with volatile caches need to implement the support for these
flags themselves without any help from the block layer.
Expand All @@ -65,22 +65,22 @@ Implementation details for request_fn based block drivers
--------------------------------------------------------------

For devices that do not support volatile write caches there is no driver
support required, the block layer completes empty REQ_FLUSH requests before
entering the driver and strips off the REQ_FLUSH and REQ_FUA bits from
support required, the block layer completes empty REQ_PREFLUSH requests before
entering the driver and strips off the REQ_PREFLUSH and REQ_FUA bits from
requests that have a payload. For devices with volatile write caches the
driver needs to tell the block layer that it supports flushing caches by
doing:

blk_queue_write_cache(sdkp->disk->queue, true, false);

and handle empty REQ_FLUSH requests in its prep_fn/request_fn. Note that
REQ_FLUSH requests with a payload are automatically turned into a sequence
of an empty REQ_FLUSH request followed by the actual write by the block
and handle empty REQ_OP_FLUSH requests in its prep_fn/request_fn. Note that
REQ_PREFLUSH requests with a payload are automatically turned into a sequence
of an empty REQ_OP_FLUSH request followed by the actual write by the block
layer. For devices that also support the FUA bit the block layer needs
to be told to pass through the REQ_FUA bit using:

blk_queue_write_cache(sdkp->disk->queue, true, true);

and the driver must handle write requests that have the REQ_FUA bit set
in prep_fn/request_fn. If the FUA bit is not natively supported the block
layer turns it into an empty REQ_FLUSH request after the actual write.
layer turns it into an empty REQ_OP_FLUSH request after the actual write.
10 changes: 5 additions & 5 deletions Documentation/device-mapper/log-writes.txt
Original file line number Diff line number Diff line change
Expand Up @@ -14,14 +14,14 @@ Log Ordering

We log things in order of completion once we are sure the write is no longer in
cache. This means that normal WRITE requests are not actually logged until the
next REQ_FLUSH request. This is to make it easier for userspace to replay the
log in a way that correlates to what is on disk and not what is in cache, to
make it easier to detect improper waiting/flushing.
next REQ_PREFLUSH request. This is to make it easier for userspace to replay
the log in a way that correlates to what is on disk and not what is in cache,
to make it easier to detect improper waiting/flushing.

This works by attaching all WRITE requests to a list once the write completes.
Once we see a REQ_FLUSH request we splice this list onto the request and once
Once we see a REQ_PREFLUSH request we splice this list onto the request and once
the FLUSH request completes we log all of the WRITEs and then the FLUSH. Only
completed WRITEs, at the time the REQ_FLUSH is issued, are added in order to
completed WRITEs, at the time the REQ_PREFLUSH is issued, are added in order to
simulate the worst case scenario with regard to power failures. Consider the
following example (W means write, C means complete):

Expand Down
2 changes: 1 addition & 1 deletion arch/um/drivers/ubd_kern.c
Original file line number Diff line number Diff line change
Expand Up @@ -1286,7 +1286,7 @@ static void do_ubd_request(struct request_queue *q)

req = dev->request;

if (req->cmd_flags & REQ_FLUSH) {
if (req_op(req) == REQ_OP_FLUSH) {
io_req = kmalloc(sizeof(struct io_thread_req),
GFP_ATOMIC);
if (io_req == NULL) {
Expand Down
1 change: 1 addition & 0 deletions block/bio-integrity.c
Original file line number Diff line number Diff line change
Expand Up @@ -26,6 +26,7 @@
#include <linux/bio.h>
#include <linux/workqueue.h>
#include <linux/slab.h>
#include "blk.h"

#define BIP_INLINE_VECS 4

Expand Down
20 changes: 9 additions & 11 deletions block/bio.c
Original file line number Diff line number Diff line change
Expand Up @@ -656,16 +656,15 @@ struct bio *bio_clone_bioset(struct bio *bio_src, gfp_t gfp_mask,
bio = bio_alloc_bioset(gfp_mask, bio_segments(bio_src), bs);
if (!bio)
return NULL;

bio->bi_bdev = bio_src->bi_bdev;
bio->bi_rw = bio_src->bi_rw;
bio->bi_iter.bi_sector = bio_src->bi_iter.bi_sector;
bio->bi_iter.bi_size = bio_src->bi_iter.bi_size;

if (bio->bi_rw & REQ_DISCARD)
if (bio_op(bio) == REQ_OP_DISCARD)
goto integrity_clone;

if (bio->bi_rw & REQ_WRITE_SAME) {
if (bio_op(bio) == REQ_OP_WRITE_SAME) {
bio->bi_io_vec[bio->bi_vcnt++] = bio_src->bi_io_vec[0];
goto integrity_clone;
}
Expand Down Expand Up @@ -854,21 +853,20 @@ static void submit_bio_wait_endio(struct bio *bio)

/**
* submit_bio_wait - submit a bio, and wait until it completes
* @rw: whether to %READ or %WRITE, or maybe to %READA (read ahead)
* @bio: The &struct bio which describes the I/O
*
* Simple wrapper around submit_bio(). Returns 0 on success, or the error from
* bio_endio() on failure.
*/
int submit_bio_wait(int rw, struct bio *bio)
int submit_bio_wait(struct bio *bio)
{
struct submit_bio_ret ret;

rw |= REQ_SYNC;
init_completion(&ret.event);
bio->bi_private = &ret;
bio->bi_end_io = submit_bio_wait_endio;
submit_bio(rw, bio);
bio->bi_rw |= REQ_SYNC;
submit_bio(bio);
wait_for_completion_io(&ret.event);

return ret.error;
Expand Down Expand Up @@ -1167,7 +1165,7 @@ struct bio *bio_copy_user_iov(struct request_queue *q,
goto out_bmd;

if (iter->type & WRITE)
bio->bi_rw |= REQ_WRITE;
bio_set_op_attrs(bio, REQ_OP_WRITE, 0);

ret = 0;

Expand Down Expand Up @@ -1337,7 +1335,7 @@ struct bio *bio_map_user_iov(struct request_queue *q,
* set data direction, and check if mapped pages need bouncing
*/
if (iter->type & WRITE)
bio->bi_rw |= REQ_WRITE;
bio_set_op_attrs(bio, REQ_OP_WRITE, 0);

bio_set_flag(bio, BIO_USER_MAPPED);

Expand Down Expand Up @@ -1530,7 +1528,7 @@ struct bio *bio_copy_kern(struct request_queue *q, void *data, unsigned int len,
bio->bi_private = data;
} else {
bio->bi_end_io = bio_copy_kern_endio;
bio->bi_rw |= REQ_WRITE;
bio_set_op_attrs(bio, REQ_OP_WRITE, 0);
}

return bio;
Expand Down Expand Up @@ -1785,7 +1783,7 @@ struct bio *bio_split(struct bio *bio, int sectors,
* Discards need a mutable bio_vec to accommodate the payload
* required by the DSM TRIM and UNMAP commands.
*/
if (bio->bi_rw & REQ_DISCARD)
if (bio_op(bio) == REQ_OP_DISCARD)
split = bio_clone_bioset(bio, gfp, bs);
else
split = bio_clone_fast(bio, gfp, bs);
Expand Down
4 changes: 2 additions & 2 deletions block/blk-cgroup.c
Original file line number Diff line number Diff line change
Expand Up @@ -905,7 +905,7 @@ static int blkcg_print_stat(struct seq_file *sf, void *v)
return 0;
}

struct cftype blkcg_files[] = {
static struct cftype blkcg_files[] = {
{
.name = "stat",
.flags = CFTYPE_NOT_ON_ROOT,
Expand All @@ -914,7 +914,7 @@ struct cftype blkcg_files[] = {
{ } /* terminate */
};

struct cftype blkcg_legacy_files[] = {
static struct cftype blkcg_legacy_files[] = {
{
.name = "reset_stats",
.write_u64 = blkcg_reset_stats,
Expand Down
Loading

0 comments on commit d05d7f4

Please sign in to comment.