Message ID | 20220922223216.4730-1-ematsumiya@suse.de |
---|---|
State | New |
Headers | show |
Series | [RFC] cifs: fix signature check for error responses | expand |
On 9/22/2022 6:32 PM, Enzo Matsumiya wrote: > Hi all, > > This patch is the (apparent) correct way to fix the issues regarding some > messages with invalid signatures. My previous patch > https://lore.kernel.org/linux-cifs/20220918235442.2981-1-ematsumiya@suse.de/ > was wrong because a) it covered up the real issue (*), and b) I could never > again "reproduce" the "race" -- I had other patches in place when I tested it, > and (thus?) the system memory was not in "pristine" conditions, so it's very > likely that there's no race at all. > > (*) Thanks a lot to Tom Talpey for pointing me to the right direction here. > > Since it sucks to be wrong twice, I'm sending this one as RFC because I > wanted to confirm some things: > > 1) Can I rely on the fact that status != STATUS_SUCCESS means no variable > data in the message? I could only infer from the spec, but not really > confirm. Technically, I think the answer is "no" but practically speaking maybe "yes". Not all errors are actually errors however, e.g. STOPPED_ON_SYMLINK and the additional error contexts which accompany it. We definitely don't want to skip validation of those. There's also STATUS_PENDING, but that's special because it isn't actually a response, it's just a "hang on I'm busy" that carries no other data. Finally, MS-SMB2 lists a few others in section 2.2.2.2, all of which need validation I think. > 2) (probably only in async cases) When the signature fails, for whatever > reason, we don't take any action. This doesn't seem right IMHO, > e.g. if the message is spoofed, we show a warning that the signatures > doesn't match, but I would expect at least for the operation to stop, > or maybe even a complete dis/reconnect. I don't think so. The spec calls for dropping the message, and after all it could have been simple packet corruption. The retries will sort it out. Absolutely positively do not print a message for each received packet, good or bad. > 3) For SMB1, I couldn't really use generic/465 as a real confirmation for > this because it says "O_DIRECT is not supported". From reading the > code and 'man mount.cifs' I understood that this is supported, so what > gives? Worth noting that I don't follow/use/test SMB1 too much. > The patch does work for other cases I tried though. Honestly, I don't think we care. No amount of patching can possibly save SMB1. Tom. > I hope someone can address my questions/concerts above, and please let me > know if the patch is technically and semantically correct. > > Patch follows. > > > Enzo > > -------- > When verifying a response's signature, the computation will go through > the iov buffer (header + response structs) and the over the page data, to > verify any dynamic data appended to the message (usually on an SMB2 READ > response). > > When the response is an error, however, specifically on async reads, > the page data is allocated before receiving the expected data. Being > "valid" data (from the signature computation POV; non-NULL, >0 pages), > it's included in the parsing and generates an invalid signature for the > message. > > Fix this by checking if the status is non-zero, and skip the page data > if it is. The issue happens in all protocol versions, and this fix > applies to all. > > This issue can be observed with xfstests generic/465. > > Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de> > --- > fs/cifs/cifsencrypt.c | 17 +++++++++++++++++ > 1 file changed, 17 insertions(+) > > diff --git a/fs/cifs/cifsencrypt.c b/fs/cifs/cifsencrypt.c > index 46f5718754f9..e3260bb436b3 100644 > --- a/fs/cifs/cifsencrypt.c > +++ b/fs/cifs/cifsencrypt.c > @@ -32,15 +32,28 @@ int __cifs_calc_signature(struct smb_rqst *rqst, > int rc; > struct kvec *iov = rqst->rq_iov; > int n_vec = rqst->rq_nvec; > + bool has_error = false; > > /* iov[0] is actual data and not the rfc1002 length for SMB2+ */ > if (!is_smb1(server)) { > + struct smb2_hdr *shdr = (struct smb2_hdr *)iov[0].iov_base; > + > if (iov[0].iov_len <= 4) > return -EIO; > + > + if (shdr->Status != 0) > + has_error = true; > + > i = 0; > } else { > + struct smb_hdr *hdr = (struct smb_hdr *)iov[1].iov_base; > + > if (n_vec < 2 || iov[0].iov_len != 4) > return -EIO; > + > + if (hdr->Status.CifsError != 0) > + has_error = true; > + > i = 1; /* skip rfc1002 length */ > } > > @@ -61,6 +74,9 @@ int __cifs_calc_signature(struct smb_rqst *rqst, > } > } > > + if (has_error) > + goto out_final; > + > /* now hash over the rq_pages array */ > for (i = 0; i < rqst->rq_npages; i++) { > void *kaddr; > @@ -81,6 +97,7 @@ int __cifs_calc_signature(struct smb_rqst *rqst, > kunmap(rqst->rq_pages[i]); > } > > +out_final: > rc = crypto_shash_final(shash, signature); > if (rc) > cifs_dbg(VFS, "%s: Could not generate hash\n", __func__);
On 09/28, Tom Talpey wrote: >On 9/22/2022 6:32 PM, Enzo Matsumiya wrote: >>Hi all, >> >>This patch is the (apparent) correct way to fix the issues regarding some >>messages with invalid signatures. My previous patch >>https://lore.kernel.org/linux-cifs/20220918235442.2981-1-ematsumiya@suse.de/ >>was wrong because a) it covered up the real issue (*), and b) I could never >>again "reproduce" the "race" -- I had other patches in place when I tested it, >>and (thus?) the system memory was not in "pristine" conditions, so it's very >>likely that there's no race at all. >> >>(*) Thanks a lot to Tom Talpey for pointing me to the right direction here. >> >>Since it sucks to be wrong twice, I'm sending this one as RFC because I >>wanted to confirm some things: >> >> 1) Can I rely on the fact that status != STATUS_SUCCESS means no variable >> data in the message? I could only infer from the spec, but not really >> confirm. > >Technically, I think the answer is "no" but practically speaking maybe >"yes". > >Not all errors are actually errors however, e.g. STOPPED_ON_SYMLINK and >the additional error contexts which accompany it. We definitely don't >want to skip validation of those. > >There's also STATUS_PENDING, but that's special because it isn't actually >a response, it's just a "hang on I'm busy" that carries no other >data. > >Finally, MS-SMB2 lists a few others in section 2.2.2.2, all of which >need validation I think. Sorry, I might have not been clear. By "variable data" I actually meant data that's not going to be allocated in the requests' iovs, but rather in rqst->rq_pages. The variable buffers that are already expected (by PDU/spec), are AFAICS always allocated either through cifs*buf_get() or some kmalloc() variant, i.e. non-HIGHMEM. async IO, though, is the only place I can see that allocates rqst->rq_pages through iov_iter_get_pages_alloc2(), which is the source of my concern, and the reason of my patch. Particularly, on async read cases, the page data is pre-allocated with the size we *expect* to receive/read from the server, but if there's an error in a READ response, the response itself will contain no/incomplete data in the page data, but since those pages are allocated, the signature computation will account for it, and generate a mismatching signature. I'm running more tests now and I'll make sure to test the other possible errors in 2.2.2.2. >> 2) (probably only in async cases) When the signature fails, for whatever >> reason, we don't take any action. This doesn't seem right IMHO, >> e.g. if the message is spoofed, we show a warning that the signatures >> doesn't match, but I would expect at least for the operation to stop, >> or maybe even a complete dis/reconnect. > >I don't think so. The spec calls for dropping the message, and after >all it could have been simple packet corruption. The retries will sort >it out. > >Absolutely positively do not print a message for each received packet, >good or bad. For the good messages ok, of course, but for bad? Why not? Otherwise, e.g., spotting the misvalidation mentioned above would've taken much longer. (btw I spotted that misvalidation while debugging my AES-GMAC series, but the same behaviuour happens with AES-CMAC i.e. in current code) >> 3) For SMB1, I couldn't really use generic/465 as a real confirmation for >> this because it says "O_DIRECT is not supported". From reading the >> code and 'man mount.cifs' I understood that this is supported, so what >> gives? Worth noting that I don't follow/use/test SMB1 too much. >> The patch does work for other cases I tried though. > >Honestly, I don't think we care. No amount of patching can possibly >save SMB1. Ack :) >Tom. Cheers, Enzo >>I hope someone can address my questions/concerts above, and please let me >>know if the patch is technically and semantically correct. >> >>Patch follows. >> >> >>Enzo >> >>-------- >>When verifying a response's signature, the computation will go through >>the iov buffer (header + response structs) and the over the page data, to >>verify any dynamic data appended to the message (usually on an SMB2 READ >>response). >> >>When the response is an error, however, specifically on async reads, >>the page data is allocated before receiving the expected data. Being >>"valid" data (from the signature computation POV; non-NULL, >0 pages), >>it's included in the parsing and generates an invalid signature for the >>message. >> >>Fix this by checking if the status is non-zero, and skip the page data >>if it is. The issue happens in all protocol versions, and this fix >>applies to all. >> >>This issue can be observed with xfstests generic/465. >> >>Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de> >>--- >> fs/cifs/cifsencrypt.c | 17 +++++++++++++++++ >> 1 file changed, 17 insertions(+) >> >>diff --git a/fs/cifs/cifsencrypt.c b/fs/cifs/cifsencrypt.c >>index 46f5718754f9..e3260bb436b3 100644 >>--- a/fs/cifs/cifsencrypt.c >>+++ b/fs/cifs/cifsencrypt.c >>@@ -32,15 +32,28 @@ int __cifs_calc_signature(struct smb_rqst *rqst, >> int rc; >> struct kvec *iov = rqst->rq_iov; >> int n_vec = rqst->rq_nvec; >>+ bool has_error = false; >> /* iov[0] is actual data and not the rfc1002 length for SMB2+ */ >> if (!is_smb1(server)) { >>+ struct smb2_hdr *shdr = (struct smb2_hdr *)iov[0].iov_base; >>+ >> if (iov[0].iov_len <= 4) >> return -EIO; >>+ >>+ if (shdr->Status != 0) >>+ has_error = true; >>+ >> i = 0; >> } else { >>+ struct smb_hdr *hdr = (struct smb_hdr *)iov[1].iov_base; >>+ >> if (n_vec < 2 || iov[0].iov_len != 4) >> return -EIO; >>+ >>+ if (hdr->Status.CifsError != 0) >>+ has_error = true; >>+ >> i = 1; /* skip rfc1002 length */ >> } >>@@ -61,6 +74,9 @@ int __cifs_calc_signature(struct smb_rqst *rqst, >> } >> } >>+ if (has_error) >>+ goto out_final; >>+ >> /* now hash over the rq_pages array */ >> for (i = 0; i < rqst->rq_npages; i++) { >> void *kaddr; >>@@ -81,6 +97,7 @@ int __cifs_calc_signature(struct smb_rqst *rqst, >> kunmap(rqst->rq_pages[i]); >> } >>+out_final: >> rc = crypto_shash_final(shash, signature); >> if (rc) >> cifs_dbg(VFS, "%s: Could not generate hash\n", __func__);
On 9/28/2022 12:30 PM, Enzo Matsumiya wrote: > On 09/28, Tom Talpey wrote: >> On 9/22/2022 6:32 PM, Enzo Matsumiya wrote: >>> Hi all, >>> >>> This patch is the (apparent) correct way to fix the issues regarding >>> some >>> messages with invalid signatures. My previous patch >>> https://lore.kernel.org/linux-cifs/20220918235442.2981-1-ematsumiya@suse.de/ >>> was wrong because a) it covered up the real issue (*), and b) I could >>> never >>> again "reproduce" the "race" -- I had other patches in place when I >>> tested it, >>> and (thus?) the system memory was not in "pristine" conditions, so >>> it's very >>> likely that there's no race at all. >>> >>> (*) Thanks a lot to Tom Talpey for pointing me to the right direction >>> here. >>> >>> Since it sucks to be wrong twice, I'm sending this one as RFC because I >>> wanted to confirm some things: >>> >>> 1) Can I rely on the fact that status != STATUS_SUCCESS means no >>> variable >>> data in the message? I could only infer from the spec, but not >>> really >>> confirm. >> >> Technically, I think the answer is "no" but practically speaking maybe >> "yes". >> >> Not all errors are actually errors however, e.g. STOPPED_ON_SYMLINK and >> the additional error contexts which accompany it. We definitely don't >> want to skip validation of those. >> >> There's also STATUS_PENDING, but that's special because it isn't actually >> a response, it's just a "hang on I'm busy" that carries no other >> data. >> >> Finally, MS-SMB2 lists a few others in section 2.2.2.2, all of which >> need validation I think. > > Sorry, I might have not been clear. By "variable data" I actually meant > data that's not going to be allocated in the requests' iovs, but rather > in rqst->rq_pages. > > The variable buffers that are already expected (by PDU/spec), are AFAICS > always allocated either through cifs*buf_get() or some kmalloc() variant, > i.e. non-HIGHMEM. > > async IO, though, is the only place I can see that allocates > rqst->rq_pages through iov_iter_get_pages_alloc2(), which is the source > of my concern, and the reason of my patch. Particularly, on async read > cases, the page data is pre-allocated with the size we *expect* to > receive/read from the server, but if there's an error in a READ > response, the response itself will contain no/incomplete data in the page > data, but since those pages are allocated, the signature computation will > account for it, and generate a mismatching signature. Aha - ok, got it. And I've been looking into those codepaths too because the RDMA code fails badly when I try to optimize the completion path. Right now, the code waits for ALL smbdirect packets to be sent before returning, which is ridiculously inefficient. It does this in both client and server sending. But if I remove the wait, nothing works as it appears the queued buffers are either freed, or unwound from the stack. Completely and totally bogus. > I'm running more tests now and I'll make sure to test the other possible > errors in 2.2.2.2. > >>> 2) (probably only in async cases) When the signature fails, for >>> whatever >>> reason, we don't take any action. This doesn't seem right IMHO, >>> e.g. if the message is spoofed, we show a warning that the >>> signatures >>> doesn't match, but I would expect at least for the operation to >>> stop, >>> or maybe even a complete dis/reconnect. >> >> I don't think so. The spec calls for dropping the message, and after >> all it could have been simple packet corruption. The retries will sort >> it out. >> >> Absolutely positively do not print a message for each received packet, >> good or bad. > > For the good messages ok, of course, but for bad? Why not? Otherwise, e.g., > spotting the misvalidation mentioned above would've taken much longer. It's basic MITM. If there is such an attacker, it can trivially redirect the client, by injecting an error undetected. Inspecting the payload before doing the integrity check is just insecurely wrong. Regarding the logging on signature validation, consider that an attacker will therefore trivially be able to overrun the log, either DOSing the system, or at the very least hiding other messages. Why alarm the sysadmin, for something they are not likely to be able to act upon? Bump a statistic, maybe. To me, the kernel log is not informational, it's for critical actions. For example, it requires root to even see it. Consider what you need to tell such a person each time it happens. "Oops, I dropped a packet" doesn't cut it. If it's a developer hint, then place it under CIFS FYI, or some sort of opt-in diagnostic tracing channel. Tom. > (btw I spotted that misvalidation while debugging my AES-GMAC series, > but the same behaviuour happens with AES-CMAC i.e. in current code) > >>> 3) For SMB1, I couldn't really use generic/465 as a real >>> confirmation for >>> this because it says "O_DIRECT is not supported". From reading the >>> code and 'man mount.cifs' I understood that this is supported, >>> so what >>> gives? Worth noting that I don't follow/use/test SMB1 too much. >>> The patch does work for other cases I tried though. >> >> Honestly, I don't think we care. No amount of patching can possibly >> save SMB1. > > Ack :) > >> Tom. > > > Cheers, > > Enzo > >>> I hope someone can address my questions/concerts above, and please >>> let me >>> know if the patch is technically and semantically correct. >>> >>> Patch follows. >>> >>> >>> Enzo >>> >>> -------- >>> When verifying a response's signature, the computation will go through >>> the iov buffer (header + response structs) and the over the page >>> data, to >>> verify any dynamic data appended to the message (usually on an SMB2 READ >>> response). >>> >>> When the response is an error, however, specifically on async reads, >>> the page data is allocated before receiving the expected data. Being >>> "valid" data (from the signature computation POV; non-NULL, >0 pages), >>> it's included in the parsing and generates an invalid signature for the >>> message. >>> >>> Fix this by checking if the status is non-zero, and skip the page data >>> if it is. The issue happens in all protocol versions, and this fix >>> applies to all. >>> >>> This issue can be observed with xfstests generic/465. >>> >>> Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de> >>> --- >>> fs/cifs/cifsencrypt.c | 17 +++++++++++++++++ >>> 1 file changed, 17 insertions(+) >>> >>> diff --git a/fs/cifs/cifsencrypt.c b/fs/cifs/cifsencrypt.c >>> index 46f5718754f9..e3260bb436b3 100644 >>> --- a/fs/cifs/cifsencrypt.c >>> +++ b/fs/cifs/cifsencrypt.c >>> @@ -32,15 +32,28 @@ int __cifs_calc_signature(struct smb_rqst *rqst, >>> int rc; >>> struct kvec *iov = rqst->rq_iov; >>> int n_vec = rqst->rq_nvec; >>> + bool has_error = false; >>> /* iov[0] is actual data and not the rfc1002 length for SMB2+ */ >>> if (!is_smb1(server)) { >>> + struct smb2_hdr *shdr = (struct smb2_hdr *)iov[0].iov_base; >>> + >>> if (iov[0].iov_len <= 4) >>> return -EIO; >>> + >>> + if (shdr->Status != 0) >>> + has_error = true; >>> + >>> i = 0; >>> } else { >>> + struct smb_hdr *hdr = (struct smb_hdr *)iov[1].iov_base; >>> + >>> if (n_vec < 2 || iov[0].iov_len != 4) >>> return -EIO; >>> + >>> + if (hdr->Status.CifsError != 0) >>> + has_error = true; >>> + >>> i = 1; /* skip rfc1002 length */ >>> } >>> @@ -61,6 +74,9 @@ int __cifs_calc_signature(struct smb_rqst *rqst, >>> } >>> } >>> + if (has_error) >>> + goto out_final; >>> + >>> /* now hash over the rq_pages array */ >>> for (i = 0; i < rqst->rq_npages; i++) { >>> void *kaddr; >>> @@ -81,6 +97,7 @@ int __cifs_calc_signature(struct smb_rqst *rqst, >>> kunmap(rqst->rq_pages[i]); >>> } >>> +out_final: >>> rc = crypto_shash_final(shash, signature); >>> if (rc) >>> cifs_dbg(VFS, "%s: Could not generate hash\n", __func__); >
diff --git a/fs/cifs/cifsencrypt.c b/fs/cifs/cifsencrypt.c index 46f5718754f9..e3260bb436b3 100644 --- a/fs/cifs/cifsencrypt.c +++ b/fs/cifs/cifsencrypt.c @@ -32,15 +32,28 @@ int __cifs_calc_signature(struct smb_rqst *rqst, int rc; struct kvec *iov = rqst->rq_iov; int n_vec = rqst->rq_nvec; + bool has_error = false; /* iov[0] is actual data and not the rfc1002 length for SMB2+ */ if (!is_smb1(server)) { + struct smb2_hdr *shdr = (struct smb2_hdr *)iov[0].iov_base; + if (iov[0].iov_len <= 4) return -EIO; + + if (shdr->Status != 0) + has_error = true; + i = 0; } else { + struct smb_hdr *hdr = (struct smb_hdr *)iov[1].iov_base; + if (n_vec < 2 || iov[0].iov_len != 4) return -EIO; + + if (hdr->Status.CifsError != 0) + has_error = true; + i = 1; /* skip rfc1002 length */ } @@ -61,6 +74,9 @@ int __cifs_calc_signature(struct smb_rqst *rqst, } } + if (has_error) + goto out_final; + /* now hash over the rq_pages array */ for (i = 0; i < rqst->rq_npages; i++) { void *kaddr; @@ -81,6 +97,7 @@ int __cifs_calc_signature(struct smb_rqst *rqst, kunmap(rqst->rq_pages[i]); } +out_final: rc = crypto_shash_final(shash, signature); if (rc) cifs_dbg(VFS, "%s: Could not generate hash\n", __func__);
Hi all, This patch is the (apparent) correct way to fix the issues regarding some messages with invalid signatures. My previous patch https://lore.kernel.org/linux-cifs/20220918235442.2981-1-ematsumiya@suse.de/ was wrong because a) it covered up the real issue (*), and b) I could never again "reproduce" the "race" -- I had other patches in place when I tested it, and (thus?) the system memory was not in "pristine" conditions, so it's very likely that there's no race at all. (*) Thanks a lot to Tom Talpey for pointing me to the right direction here. Since it sucks to be wrong twice, I'm sending this one as RFC because I wanted to confirm some things: 1) Can I rely on the fact that status != STATUS_SUCCESS means no variable data in the message? I could only infer from the spec, but not really confirm. 2) (probably only in async cases) When the signature fails, for whatever reason, we don't take any action. This doesn't seem right IMHO, e.g. if the message is spoofed, we show a warning that the signatures doesn't match, but I would expect at least for the operation to stop, or maybe even a complete dis/reconnect. 3) For SMB1, I couldn't really use generic/465 as a real confirmation for this because it says "O_DIRECT is not supported". From reading the code and 'man mount.cifs' I understood that this is supported, so what gives? Worth noting that I don't follow/use/test SMB1 too much. The patch does work for other cases I tried though. I hope someone can address my questions/concerts above, and please let me know if the patch is technically and semantically correct. Patch follows. Enzo -------- When verifying a response's signature, the computation will go through the iov buffer (header + response structs) and the over the page data, to verify any dynamic data appended to the message (usually on an SMB2 READ response). When the response is an error, however, specifically on async reads, the page data is allocated before receiving the expected data. Being "valid" data (from the signature computation POV; non-NULL, >0 pages), it's included in the parsing and generates an invalid signature for the message. Fix this by checking if the status is non-zero, and skip the page data if it is. The issue happens in all protocol versions, and this fix applies to all. This issue can be observed with xfstests generic/465. Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de> --- fs/cifs/cifsencrypt.c | 17 +++++++++++++++++ 1 file changed, 17 insertions(+)