From: Ming Lei
mainline inclusion
from mainline-5.11-rc1
commit fb01a2932e81a1fb2273f87ff92dc8172b8880ee
category: bugfix
issue: #I3ZXZF
CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
---------------------------
flush_end_io() may be called recursively from some drivers, such as
nvme-loop, so lockdep may complain about 'possible recursive locking'.
Commit b3c6a5997541 ("block: Fix a lockdep complaint triggered by
request queue flushing") tried to address this issue by assigning a
dynamically allocated lock class to each flush queue. That solution
adds a synchronize_rcu() to each hctx's release handler and causes a
horrible SCSI MQ probe delay (more than half an hour on megaraid sas).
Add a new API, blk_mq_hctx_set_fq_lock_class(), for these drivers, so
that they only need to use a driver-specific lock class to avoid the
lockdep warning of 'possible recursive locking'.
Tested-by: Kashyap Desai
Reported-by: Qian Cai
Cc: Sumit Saxena
Cc: John Garry
Cc: Kashyap Desai
Cc: Bart Van Assche
Cc: Hannes Reinecke
Signed-off-by: Ming Lei
Reviewed-by: Hannes Reinecke
Signed-off-by: Jens Axboe
Signed-off-by: yangerkun
Reviewed-by: Jason Yan
Signed-off-by: Chen Jun
Signed-off-by: Yu Changchun
---
block/blk-flush.c | 25 +++++++++++++++++++++++++
include/linux/blk-mq.h | 3 +++
2 files changed, 28 insertions(+)
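As a rough illustration of how a driver might use the helper described
in the commit message, the sketch below models the call in plain
userspace C. The kernel types and lockdep_set_class() are replaced by
minimal mocks (mock_spinlock and the stub structs are assumptions, not
the real kernel definitions), and demo_init_hctx() and
demo_hctx_fq_lock_key are hypothetical names standing in for a driver's
init_hctx callback and its static lock class key. Only the body of
blk_mq_hctx_set_fq_lock_class() mirrors this patch.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins for kernel types (assumptions, not real layouts). */
struct lock_class_key { char dummy; };
struct mock_spinlock { const struct lock_class_key *class_key; };
struct blk_flush_queue { struct mock_spinlock mq_flush_lock; };
struct blk_mq_hw_ctx { struct blk_flush_queue *fq; };

/* Mock of lockdep_set_class(): only records which key the lock uses. */
static void lockdep_set_class(struct mock_spinlock *lock,
			      struct lock_class_key *key)
{
	lock->class_key = key;
}

/* Same body as the helper added by this patch. */
void blk_mq_hctx_set_fq_lock_class(struct blk_mq_hw_ctx *hctx,
				   struct lock_class_key *key)
{
	lockdep_set_class(&hctx->fq->mq_flush_lock, key);
}

/* Hypothetical driver: one static key shared by all of its hctxs. */
static struct lock_class_key demo_hctx_fq_lock_key;

/* Hypothetical init_hctx-style callback of that driver. */
int demo_init_hctx(struct blk_mq_hw_ctx *hctx, void *data,
		   unsigned int hctx_idx)
{
	(void)data;
	(void)hctx_idx;
	assert(hctx && hctx->fq);

	/* Move this driver's flush locks into a driver-specific class. */
	blk_mq_hctx_set_fq_lock_class(hctx, &demo_hctx_fq_lock_key);
	return 0;
}
```

In a real driver the call would presumably sit in the init_hctx
callback, with a single static key shared by all of that driver's
hardware queues, so no per-hctx key registration or unregistration is
involved.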
diff --git a/block/blk-flush.c b/block/blk-flush.c
index 7ee7e5e8905d..7ede7849b059 100644
--- a/block/blk-flush.c
+++ b/block/blk-flush.c
@@ -494,3 +494,28 @@ void blk_free_flush_queue(struct blk_flush_queue *fq)
kfree(fq->flush_rq);
kfree(fq);
}
+
+/*
+ * Allow a driver to set its own lock class on fq->mq_flush_lock to
+ * avoid a lockdep complaint.
+ *
+ * flush_end_io() may be called recursively from some drivers, such as
+ * nvme-loop, so lockdep may complain about 'possible recursive locking'
+ * because all 'struct blk_flush_queue' instances share the same
+ * mq_flush_lock lock class key. We need to assign a different lock class
+ * to these drivers' fq->mq_flush_lock to avoid the lockdep warning.
+ *
+ * Using a dynamically allocated lock class key for each 'blk_flush_queue'
+ * instance is overkill and, worse, introduces a horrible boot delay
+ * because synchronize_rcu() is implied in lockdep_unregister_key(), which
+ * is called for each hctx release. SCSI probing may synchronously create and
+ * destroy lots of MQ request_queues for non-existent devices, and some robot
+ * test kernels always enable the lockdep option. More than half an hour was
+ * observed to be taken by SCSI MQ probe with a per-fq lock class.
+ */
+void blk_mq_hctx_set_fq_lock_class(struct blk_mq_hw_ctx *hctx,
+		struct lock_class_key *key)
+{
+	lockdep_set_class(&hctx->fq->mq_flush_lock, key);
+}
+EXPORT_SYMBOL_GPL(blk_mq_hctx_set_fq_lock_class);
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index f8ea27423d1d..92bbc9a72355 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -5,6 +5,7 @@
#include
#include
#include
+#include
struct blk_mq_tags;
struct blk_flush_queue;
@@ -594,5 +595,7 @@ static inline void blk_mq_cleanup_rq(struct request *rq)
}
blk_qc_t blk_mq_submit_bio(struct bio *bio);
+void blk_mq_hctx_set_fq_lock_class(struct blk_mq_hw_ctx *hctx,
+		struct lock_class_key *key);
#endif
--
2.22.0