Message ID: 20181108130723.12101-1-anton.ivanov@cambridgegreys.com
State: Superseded
Series: Fix breakage from switching to MQ BLK framework
On 11/8/18 6:07 AM, anton.ivanov@cambridgegreys.com wrote:
> From: Anton Ivanov <anton.ivanov@cambridgegreys.com>
>
> Initial version by Jens Axboe <axboe@kernel.dk>
>
> 1. Fix use of use unititalized req->error for SYNC commands
> 2. Guard the request submission by a spinlock
>
> Signed-off-by: Anton Ivanov <anton.ivanov@cambridgegreys.com>
> ---
>  arch/um/drivers/ubd_kern.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/arch/um/drivers/ubd_kern.c b/arch/um/drivers/ubd_kern.c
> index 74c002ddc0ce..e3d587d9eff6 100644
> --- a/arch/um/drivers/ubd_kern.c
> +++ b/arch/um/drivers/ubd_kern.c
> @@ -1305,6 +1305,7 @@ static int ubd_queue_one_vec(struct blk_mq_hw_ctx *hctx, struct request *req,
>  		io_req->fds[0] = dev->cow.fd;
>  	else
>  		io_req->fds[0] = dev->fd;
> +	io_req->error = 0;
>
>  	if (req_op(req) == REQ_OP_FLUSH) {
>  		io_req->op = UBD_FLUSH;
> @@ -1313,9 +1314,7 @@ static int ubd_queue_one_vec(struct blk_mq_hw_ctx *hctx, struct request *req,
>  		io_req->cow_offset = -1;
>  		io_req->offset = off;
>  		io_req->length = bvec->bv_len;
> -		io_req->error = 0;
>  		io_req->sector_mask = 0;
> -
>  		io_req->op = rq_data_dir(req) == READ ? UBD_READ : UBD_WRITE;
>  		io_req->offsets[0] = 0;
>  		io_req->offsets[1] = dev->cow.data_offset;

These two hunks look fine.
> @@ -1341,11 +1340,15 @@ static int ubd_queue_one_vec(struct blk_mq_hw_ctx *hctx, struct request *req,
>  static blk_status_t ubd_queue_rq(struct blk_mq_hw_ctx *hctx,
>  				 const struct blk_mq_queue_data *bd)
>  {
> +	struct ubd *ubd_dev = hctx->queue->queuedata;
>  	struct request *req = bd->rq;
> +	unsigned long flags;
>  	int ret = 0;
>
>  	blk_mq_start_request(req);
>
> +	spin_lock_irqsave(&ubd_dev->lock, flags);
> +
>  	if (req_op(req) == REQ_OP_FLUSH) {
>  		ret = ubd_queue_one_vec(hctx, req, 0, NULL);
>  	} else {
> @@ -1361,6 +1364,8 @@ static blk_status_t ubd_queue_rq(struct blk_mq_hw_ctx *hctx,
>  		}
>  	}
>  out:
> +	spin_unlock_irqrestore(&ubd_dev->lock, flags);
> +
>  	if (ret < 0) {
>  		blk_mq_requeue_request(req, true);
>  	}

This one doesn't need to be save/restore. I've already queued up
the locking fix yesterday, I'll add your initialization fix.
Hi Anton,

On Thu, Nov 8, 2018 at 2:08 PM <anton.ivanov@cambridgegreys.com> wrote:
> From: Anton Ivanov <anton.ivanov@cambridgegreys.com>
>
> Initial version by Jens Axboe <axboe@kernel.dk>
>
> 1. Fix use of use unititalized req->error for SYNC commands
> 2. Guard the request submission by a spinlock
>
> Signed-off-by: Anton Ivanov <anton.ivanov@cambridgegreys.com>
> ---
> [...]
>                 io_req->fds[0] = dev->cow.fd;
>         else
>                 io_req->fds[0] = dev->fd;
> +       io_req->error = 0;

Perhaps it's safer to just use kzalloc() instead of kmalloc()?

> [...]

Gr{oetje,eeting}s,

                        Geert
On 11/8/18 1:10 PM, Jens Axboe wrote:
> On 11/8/18 6:07 AM, anton.ivanov@cambridgegreys.com wrote:
>> From: Anton Ivanov <anton.ivanov@cambridgegreys.com>
>>
>> 1. Fix use of use unititalized req->error for SYNC commands
>> 2. Guard the request submission by a spinlock
>>
>> [...]
>
> These two hunks look fine.
>> [...]
>
> This one doesn't need to be save/restore. I've already queued up
> the locking fix yesterday, I'll add your initialization fix.

Apologies, I got a hang when testing it with just _irq, but I did not
rerun the test. I suspect it was on something completely unrelated as I
cannot reproduce it any more.

That is why I switched it to irqsave, which as you explained is
unnecessary.

A.
On 11/8/18 1:15 PM, Geert Uytterhoeven wrote:
> Hi Anton,
>
> On Thu, Nov 8, 2018 at 2:08 PM <anton.ivanov@cambridgegreys.com> wrote:
>> [...]
>> +       io_req->error = 0;
>
> Perhaps it's safer to just use kzalloc() instead of kmalloc()?

That is one clear case where we would have never noticed the bug if we
were using it :)

There are quite a few things set to zero by default so it may be worth it.
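[Editor's note: the allocation site itself is not quoted anywhere in this
thread; as a rough sketch of what Geert is suggesting (the GFP flags and the
surrounding error handling are assumptions, not taken from the patch), the
change would look something like:]

	struct io_thread_req *io_req;

	/* kzalloc() returns zeroed memory, so fields such as io_req->error
	 * and io_req->sector_mask start out as 0 without any explicit
	 * per-field initialization. */
	io_req = kzalloc(sizeof(struct io_thread_req), GFP_ATOMIC);
	if (io_req == NULL)
		return -ENOMEM;
	/* no "io_req->error = 0;" needed here any more */

The trade-off is a small memset cost on every request in exchange for never
again shipping an uninitialized field to the I/O thread.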
On 11/8/18 6:22 AM, Anton Ivanov wrote:
> On 11/8/18 1:10 PM, Jens Axboe wrote:
>> On 11/8/18 6:07 AM, anton.ivanov@cambridgegreys.com wrote:
>>> [...]
>> These two hunks look fine.
>>
>>> [...]
>> This one doesn't need to be save/restore. I've already queued up
>> the locking fix yesterday, I'll add your initialization fix.
>
> Apologies, I got a hang when testing it with just _irq, but I did not
> rerun the test. I suspect it was on something completely unrelated as I
> cannot reproduce it any more.
>
> That is why I switched it to irqsave, which as you explained is
> unnecessary.

No problem, I committed a cut down version and fixed up your
commit message as well:

http://git.kernel.dk/cgit/linux-block/commit/?h=for-linus&id=0033dfd92a5646a78025e86f8df4d5b18181ba2c
On 11/8/18 1:24 PM, Jens Axboe wrote:
> On 11/8/18 6:22 AM, Anton Ivanov wrote:
>> [...]
>> That is why I switched it to irqsave, which as you explained is
>> unnecessary.
>
> No problem, I committed a cut down version and fixed up your
> commit message as well:
>
> http://git.kernel.dk/cgit/linux-block/commit/?h=for-linus&id=0033dfd92a5646a78025e86f8df4d5b18181ba2c

Thanks,

The speed difference between irq and irqsave is quite substantial - I
get 255MB/s vs 312MB/s on dd-ing a 16G image to /dev/null.

I should probably look around the rest of UML as it uses _irqsave nearly
everywhere and some of those may be converted to _irq.
On 11/8/18 6:33 AM, Anton Ivanov wrote:
> [...]
> The speed difference between irq and irqsave is quite substantial - I
> get 255MB/s vs 312MB/s on dd-ing a 16G image to /dev/null
>
> I should probably look around the rest of UML as it uses _irqsave nearly
> everywhere and some of those may be converted to _irq.

save/restore variants are probably more expensive in UML than elsewhere.
But yes, there's a difference on x86 bare metal as well.
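[Editor's note: the exact committed version is in the linux-block link above;
as an assumed sketch of the cheaper _irq variant being discussed, the dispatch
path would look roughly like this. ->queue_rq() is not called with interrupts
disabled, so there is no prior interrupt state worth saving:]

	static blk_status_t ubd_queue_rq(struct blk_mq_hw_ctx *hctx,
					 const struct blk_mq_queue_data *bd)
	{
		struct ubd *ubd_dev = hctx->queue->queuedata;
		struct request *req = bd->rq;
		int ret = 0;

		blk_mq_start_request(req);

		/* was spin_lock_irqsave(&ubd_dev->lock, flags) */
		spin_lock_irq(&ubd_dev->lock);
		/* ... submit the request, as in the patch quoted above ... */
		spin_unlock_irq(&ubd_dev->lock);

		if (ret < 0)
			blk_mq_requeue_request(req, true);

		return BLK_STS_OK;
	}

The saving is that spin_lock_irq()/spin_unlock_irq() unconditionally disable
and re-enable interrupts, while the save/restore pair must also read and
write the flags word - which in UML means extra (simulated) state traffic on
every request.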
On 08/11/2018 13:38, Jens Axboe wrote:
> On 11/8/18 6:33 AM, Anton Ivanov wrote:
>> [...]
>> The speed difference between irq and irqsave is quite substantial - I
>> get 255MB/s vs 312MB/s on dd-ing a 16G image to /dev/null
>>
>> I should probably look around the rest of UML as it uses _irqsave nearly
>> everywhere and some of those may be converted to _irq.
> save/restore variants are probably more expensive in UML than elsewhere.
> But yes, there's a difference on x86 bare metal as well.

That is indeed the case :)

I have a question.
Namely - what is the urgency to implement the other command codes?

While trying to debug this issue I noticed that we treat anything that
is not a simple READ as a WRITE. That was OK with the older API. Is that
still the case with mq?

I see a raft of additional command codes - trim, write multiple, etc.
Some of those should be easy to implement, some will take some work.
Should we try to sort those out ASAP?
On 11/8/18 2:38 PM, Anton Ivanov wrote:
> On 08/11/2018 13:38, Jens Axboe wrote:
>> [...]
> That is indeed the case :)
>
> I have a question. Namely - what is the urgency to implement the other
> command codes?
>
> While trying to debug this issue I noticed that we treat anything that
> is not a simple READ as a WRITE.
>
> That was OK with the older API. Is that still the case with mq?
>
> I see a raft of additional command codes - trim, write multiple, etc.
> Some of those should be easy to implement, some will take some work.
> Should we try to sort those out ASAP?

I answered those questions myself by doing some digging around the nbd,
loop and other "non-physical" driver trees. Loop was most informative :)

I now have a cleaned up and more robust driver with working trim and
zero fill support. Tested with ext4 and fstrim.

I am going to rebase it vs the linux-block tree and submit tomorrow
if it survives on the test rig overnight.
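[Editor's note: that follow-up patch is not part of this thread. Modeled
loosely on how drivers/block/loop.c maps block-layer operations, the dispatch
Anton describes might look something like the fragment below; UBD_DISCARD and
UBD_WRITE_ZEROES are hypothetical command codes, not names from the driver:]

	/* Map the blk-mq operation onto a per-driver command code instead
	 * of treating everything that is not a READ as a WRITE. */
	switch (req_op(req)) {
	case REQ_OP_FLUSH:
		io_req->op = UBD_FLUSH;
		break;
	case REQ_OP_READ:
		io_req->op = UBD_READ;
		break;
	case REQ_OP_WRITE:
		io_req->op = UBD_WRITE;
		break;
	case REQ_OP_DISCARD:
		io_req->op = UBD_DISCARD;	/* e.g. fallocate(PUNCH_HOLE) on the host file */
		break;
	case REQ_OP_WRITE_ZEROES:
		io_req->op = UBD_WRITE_ZEROES;	/* e.g. fallocate(ZERO_RANGE) on the host file */
		break;
	default:
		return BLK_STS_NOTSUPP;
	}

Unsupported operations are rejected explicitly with BLK_STS_NOTSUPP rather
than silently handled as writes.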
diff --git a/arch/um/drivers/ubd_kern.c b/arch/um/drivers/ubd_kern.c
index 74c002ddc0ce..e3d587d9eff6 100644
--- a/arch/um/drivers/ubd_kern.c
+++ b/arch/um/drivers/ubd_kern.c
@@ -1305,6 +1305,7 @@ static int ubd_queue_one_vec(struct blk_mq_hw_ctx *hctx, struct request *req,
 		io_req->fds[0] = dev->cow.fd;
 	else
 		io_req->fds[0] = dev->fd;
+	io_req->error = 0;
 
 	if (req_op(req) == REQ_OP_FLUSH) {
 		io_req->op = UBD_FLUSH;
@@ -1313,9 +1314,7 @@ static int ubd_queue_one_vec(struct blk_mq_hw_ctx *hctx, struct request *req,
 		io_req->cow_offset = -1;
 		io_req->offset = off;
 		io_req->length = bvec->bv_len;
-		io_req->error = 0;
 		io_req->sector_mask = 0;
-
 		io_req->op = rq_data_dir(req) == READ ? UBD_READ : UBD_WRITE;
 		io_req->offsets[0] = 0;
 		io_req->offsets[1] = dev->cow.data_offset;
@@ -1341,11 +1340,15 @@ static int ubd_queue_one_vec(struct blk_mq_hw_ctx *hctx, struct request *req,
 static blk_status_t ubd_queue_rq(struct blk_mq_hw_ctx *hctx,
 				 const struct blk_mq_queue_data *bd)
 {
+	struct ubd *ubd_dev = hctx->queue->queuedata;
 	struct request *req = bd->rq;
+	unsigned long flags;
 	int ret = 0;
 
 	blk_mq_start_request(req);
 
+	spin_lock_irqsave(&ubd_dev->lock, flags);
+
 	if (req_op(req) == REQ_OP_FLUSH) {
 		ret = ubd_queue_one_vec(hctx, req, 0, NULL);
 	} else {
@@ -1361,6 +1364,8 @@ static blk_status_t ubd_queue_rq(struct blk_mq_hw_ctx *hctx,
 		}
 	}
 out:
+	spin_unlock_irqrestore(&ubd_dev->lock, flags);
+
 	if (ret < 0) {
 		blk_mq_requeue_request(req, true);
 	}