Message ID | da2a28305637aef648846f9bf75d269c0f7c6e57.1667822611.git.ritesh.list@gmail.com |
---|---|
State | Under Review |
Delegated to: | Theodore Ts'o |
Headers | show |
Series | e2fsprogs: Parallel fsck support | expand |
On Nov 7, 2022, at 5:20 AM, Ritesh Harjani (IBM) <ritesh.list@gmail.com> wrote: > > Currently this function was not correctly comparing against the right > length of the bitmap. Also when we compare bitarray v/s rbtree bitmap > the value returned by ext2fs_test_generic_bmap() could be different in > these two implementations. Hence only check against boolean value. > > Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> > --- > lib/ext2fs/gen_bitmap.c | 9 ++++++--- > lib/ext2fs/gen_bitmap64.c | 10 +++++++--- > 2 files changed, 13 insertions(+), 6 deletions(-) > > diff --git a/lib/ext2fs/gen_bitmap.c b/lib/ext2fs/gen_bitmap.c > index 1536d4b3..f7764fca 100644 > --- a/lib/ext2fs/gen_bitmap.c > +++ b/lib/ext2fs/gen_bitmap.c > @@ -385,10 +385,13 @@ errcode_t ext2fs_compare_generic_bitmap(errcode_t magic, errcode_t neq, if ((bm1->start != bm2->start) || (bm1->end != bm2->end) || (memcmp(bm1->bitmap, bm2->bitmap, > (size_t) (bm1->end - bm1->start)/8))) > return neq; > > - for (i = bm1->end - ((bm1->end - bm1->start) % 8); i <= bm1->end; i++) On first review it appears that this is only comparing at most the last 7 bits before "end", which appears to be wrong. However, if you include the context earlier in the function (which I've added above), it is using memcmp() to compare the majority of the bitmaps in byte-wise order, so this is only needs to compare the remaining bits, and doing the full bitwise scan is much less efficient. I was going to say that the "start" value needs to be added to the bitmap offset for the comparison, but I don't think that is correct either. Looking at the code (and thinking about it some more), it doesn't make sense to allocate space for bits before "start", so this must be a virtual offset, and the *actual* bits are always starting at "bitmap", so the above code is correct, AFAICS. In summary, I don't think this patch will introduce a visible defect, but it makes the comparisons orders of magnitude less efficient when comparing each bit individually instead of using memcmp() that is likely hardware accelerated. It may well be that the calling convention of this code is such that it is always called with end=byte-aligned value so this code is never used, but that doesn't mean it shouldn't be coded properly, in case that isn't true in the future. > - if (ext2fs_fast_test_block_bitmap(gen_bm1, i) != > - ext2fs_fast_test_block_bitmap(gen_bm2, i)) > + for (i = bm1->start; i <= bm1->end; i++) { > + int ret1, ret2; > + ret1 = !!ext2fs_fast_test_block_bitmap(gen_bm1, i); > + ret2 = !!ext2fs_fast_test_block_bitmap(gen_bm2, i); Strictly speaking, the !! here is not needed, and ! is enough for comparing the inequality of the two values. However, this is only used for 0-7 bits and isn't performance critical since the memcmp() does the heavy lifting. > + if (ret1 != ret2) > return neq; > + } > > return 0; > } > diff --git a/lib/ext2fs/gen_bitmap64.c b/lib/ext2fs/gen_bitmap64.c > index c860c10e..f7710afd 100644 > --- a/lib/ext2fs/gen_bitmap64.c > +++ b/lib/ext2fs/gen_bitmap64.c > @@ -629,10 +629,14 @@ errcode_t ext2fs_compare_generic_bmap(errcode_t neq, > (bm1->end != bm2->end)) Conversely, *this* version of the function is *not* doing the memcmp() of the bulk of the bitmap contents, so it would appear to have a bug that the patch fixes, but in a very slow manner. It would be better to use memcmp(). > return neq; > > - for (i = bm1->end - ((bm1->end - bm1->start) % 8); i <= bm1->end; i++) > - if (ext2fs_test_generic_bmap(gen_bm1, i) != > - ext2fs_test_generic_bmap(gen_bm2, i)) > + for (i = bm1->start; i < bm1->end; i++) { > + int ret1, ret2; > + ret1 = !!ext2fs_test_generic_bmap(gen_bm1, i); > + ret2 = !!ext2fs_test_generic_bmap(gen_bm2, i); Cheers, Andreas
On Tue, Nov 22, 2022 at 10:04:58PM -0700, Andreas Dilger wrote: > On Nov 7, 2022, at 5:20 AM, Ritesh Harjani (IBM) <ritesh.list@gmail.com> wrote: > > > > Currently this function was not correctly comparing against the right > > length of the bitmap. Also when we compare bitarray v/s rbtree bitmap > > the value returned by ext2fs_test_generic_bmap() could be different in > > these two implementations. Hence only check against boolean value. > > > > Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> > > --- > > lib/ext2fs/gen_bitmap.c | 9 ++++++--- The gen_bitmap.c file supports only the original 32-bit bitmaps, so there is not rbtree bitmap. That's why using memcmp is safe, and as near as I can tell, no changes are needed for the lib/ext2fs/gen_bitmap.c > > diff --git a/lib/ext2fs/gen_bitmap64.c b/lib/ext2fs/gen_bitmap64.c > > index c860c10e..f7710afd 100644 > > --- a/lib/ext2fs/gen_bitmap64.c > > +++ b/lib/ext2fs/gen_bitmap64.c > > @@ -629,10 +629,14 @@ errcode_t ext2fs_compare_generic_bmap(errcode_t neq, > > (bm1->end != bm2->end)) > > Conversely, *this* version of the function is *not* doing the memcmp() of > the bulk of the bitmap contents, so it would appear to have a bug that the > patch fixes, but in a very slow manner. It would be better to use memcmp(). This is where we have a problem, and the reason why we can't use memcmp is because in the case where the bitmap is really encoded using an rbtree, a memcmp isn't applicable. We *could* extract the bitmap into a bitarray, but that would require allocating memory, and in the case of a very large file system/bitmap, especially when running on a 32-bit NAS box, we might not be able to *afford* to allocate that much memory (or it might not even be addressable :-). So yes, we need a fix here, and yes, comparing bit by bit is very slow. Fortunately, from what I can tell, no one is actually calling this function, which is why no once noticed that the 64-bit compare bitmap fucntion was totally hosed. So applying Ritesh's fix here makes sense, since it at least fixes the obvious problems. In the long term, we should probably add proper unit regression tests so can test whether or not this function is actually working correctly, and if we *want* a faster version of it, we could do something faster in the case where both bitmaps are the same type (e.g., both bitarrays or both rbtrees), and if they could do something where we look for "runs" where all of the bits are all set or not set, and then compare that against the other bitmap, etc. But given that we don't have anyusers of this function any more (we used to, but it disappeared when we optimzied e2fsck's pass5 implementation), that's probably not an urgent fix. - Ted
diff --git a/lib/ext2fs/gen_bitmap.c b/lib/ext2fs/gen_bitmap.c index 1536d4b3..f7764fca 100644 --- a/lib/ext2fs/gen_bitmap.c +++ b/lib/ext2fs/gen_bitmap.c @@ -385,10 +385,13 @@ errcode_t ext2fs_compare_generic_bitmap(errcode_t magic, errcode_t neq, (size_t) (bm1->end - bm1->start)/8))) return neq; - for (i = bm1->end - ((bm1->end - bm1->start) % 8); i <= bm1->end; i++) - if (ext2fs_fast_test_block_bitmap(gen_bm1, i) != - ext2fs_fast_test_block_bitmap(gen_bm2, i)) + for (i = bm1->start; i <= bm1->end; i++) { + int ret1, ret2; + ret1 = !!ext2fs_fast_test_block_bitmap(gen_bm1, i); + ret2 = !!ext2fs_fast_test_block_bitmap(gen_bm2, i); + if (ret1 != ret2) return neq; + } return 0; } diff --git a/lib/ext2fs/gen_bitmap64.c b/lib/ext2fs/gen_bitmap64.c index c860c10e..f7710afd 100644 --- a/lib/ext2fs/gen_bitmap64.c +++ b/lib/ext2fs/gen_bitmap64.c @@ -629,10 +629,14 @@ errcode_t ext2fs_compare_generic_bmap(errcode_t neq, (bm1->end != bm2->end)) return neq; - for (i = bm1->end - ((bm1->end - bm1->start) % 8); i <= bm1->end; i++) - if (ext2fs_test_generic_bmap(gen_bm1, i) != - ext2fs_test_generic_bmap(gen_bm2, i)) + for (i = bm1->start; i < bm1->end; i++) { + int ret1, ret2; + ret1 = !!ext2fs_test_generic_bmap(gen_bm1, i); + ret2 = !!ext2fs_test_generic_bmap(gen_bm2, i); + if (ret1 != ret2) { return neq; + } + } return 0; }
Currently this function was not correctly comparing against the right length of the bitmap. Also when we compare bitarray v/s rbtree bitmap the value returned by ext2fs_test_generic_bmap() could be different in these two implementations. Hence only check against boolean value. Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> --- lib/ext2fs/gen_bitmap.c | 9 ++++++--- lib/ext2fs/gen_bitmap64.c | 10 +++++++--- 2 files changed, 13 insertions(+), 6 deletions(-)