Message ID | CADZpyix6DZ93f8MQf3Aa1NVV7HCFMAXVAdzRMFBY7xWHHQMPog@mail.gmail.com |
---|---|
State | RFC, archived |
Delegated to: | David Miller |
Headers | show |
From: "Carlos O'Donell" <carlos@systemhalted.org> Date: Wed, 18 Apr 2012 00:08:47 -0400 > I don't clearly understand your position here, and perhaps that's my > own ignorance, but could you please clarify, with examples, exactly > why the change is not acceptable? My position is that since millions upon millions of Linux systems, in fact every single Linux system, exists right now with the current behavior we are not helping application writers at all by changing behavior now after it's been this way for nearly 20 years. Because if an application writer wants his code to work on systems that actually exist he has to accomodate the non-NULL termination situation if he wants to inspect or print out an AF_UNIX path. Because every system in existence right now allows the non-NULL terminated AF_UNIX paths, therefore it's possible on every system in existence right now. Catch my drift? The very thing the patch claims to help, it doesn't. We install this kernel patch now and then tell application writers that they can just assume all AF_UNIX paths are NULL terminated when they want to print it out, because such code will not actually be guarenteed to work on all deployed Linux machines out there. You cannot just ignore 20 years of precedence and say "oh let's change this in the kernel now, and that way application writers don't have to worry about that lack of NULL termination any more." It simply doesn't work like that. All of this talk about whether applications actually create non-NULL terminated AF_UNIX paths don't even factor into the conversation. So the value proposition for this patch simply does not exist. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> > Why not have: > > diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c > index d510353..f9f77a7 100644 > --- a/net/unix/af_unix.c > +++ b/net/unix/af_unix.c > @@ -216,6 +216,9 @@ static int unix_mkname(struct sockaddr_un > *sunaddr, int len, unsigned *hashp) > */ > ((char *)sunaddr)[len] = 0; > len = strlen(sunaddr->sun_path)+1+sizeof(short); > + /* No null terminator was found in the path. */ > + if (len > sizeof(*sunaddr)) > + return -EINVAL; > return len; That could generate a kernel page fault! (Depending on what follows (or rather doesn't follow!) sun_path.) You'd need to use memchr() not strlen(). David -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Apr 18, 2012 at 12:16 AM, David Miller <davem@davemloft.net> wrote: > From: "Carlos O'Donell" <carlos@systemhalted.org> > Date: Wed, 18 Apr 2012 00:08:47 -0400 > >> I don't clearly understand your position here, and perhaps that's my >> own ignorance, but could you please clarify, with examples, exactly >> why the change is not acceptable? > > My position is that since millions upon millions of Linux systems, in > fact every single Linux system, exists right now with the current > behavior we are not helping application writers at all by changing > behavior now after it's been this way for nearly 20 years. > > Because if an application writer wants his code to work on systems > that actually exist he has to accomodate the non-NULL termination > situation if he wants to inspect or print out an AF_UNIX path. > > Because every system in existence right now allows the non-NULL > terminated AF_UNIX paths, therefore it's possible on every system > in existence right now. > > Catch my drift? > > The very thing the patch claims to help, it doesn't. We install this > kernel patch now and then tell application writers that they can just > assume all AF_UNIX paths are NULL terminated when they want to print > it out, because such code will not actually be guarenteed to work on > all deployed Linux machines out there. > > You cannot just ignore 20 years of precedence and say "oh let's change > this in the kernel now, and that way application writers don't have to > worry about that lack of NULL termination any more." It simply > doesn't work like that. > > All of this talk about whether applications actually create non-NULL > terminated AF_UNIX paths don't even factor into the conversation. > > So the value proposition for this patch simply does not exist. Thank you, this is the kind of position statement I can point to if I ever get asked about this again. In summary your opinion is that the API has and always will allow up to 108 chars to be used in sun_path? In which case I will talk to the Austin group to get a good example added to POSIX showing safe usage. Cheers, Carlos. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Apr 18, 2012 at 09:17:26AM +0100, David Laight wrote: > > > > > Why not have: > > > > diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c > > index d510353..f9f77a7 100644 > > --- a/net/unix/af_unix.c > > +++ b/net/unix/af_unix.c > > @@ -216,6 +216,9 @@ static int unix_mkname(struct sockaddr_un > > *sunaddr, int len, unsigned *hashp) > > */ > > ((char *)sunaddr)[len] = 0; > > len = strlen(sunaddr->sun_path)+1+sizeof(short); > > + /* No null terminator was found in the path. */ > > + if (len > sizeof(*sunaddr)) > > + return -EINVAL; > > return len; > > That could generate a kernel page fault! > (Depending on what follows (or rather doesn't follow!) sun_path.) > You'd need to use memchr() not strlen(). > > David > Hi, David. What follows is a 0 byte, because it's set that way in the line before strlen. Note that len is tested for sizeof(*sunaddr), and there is a huge comment about that extra byte that was omitted. The whole function is at net/unix/af_unix.c:203. Regards, Cascardo. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
From: "Carlos O'Donell" <carlos@systemhalted.org> Date: Wed, 18 Apr 2012 08:57:58 -0400 > In summary your opinion is that the API has and always will allow up > to 108 chars to be used in sun_path? Yes. > In which case I will talk to the Austin group to get a good example > added to POSIX showing safe usage. Why would you add language to POSIX for Linux specific behavior? Just curious :-) -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Apr 18, 2012 at 1:31 PM, David Miller <davem@davemloft.net> wrote: > From: "Carlos O'Donell" <carlos@systemhalted.org> > Date: Wed, 18 Apr 2012 08:57:58 -0400 > >> In summary your opinion is that the API has and always will allow up >> to 108 chars to be used in sun_path? > > Yes. > >> In which case I will talk to the Austin group to get a good example >> added to POSIX showing safe usage. > > Why would you add language to POSIX for Linux specific behavior? > Just curious :-) Why not? Do you ever feel crazy when people question what you think is perfectly reasonable? ;-) POSIX doesn't exist in a vacuum, we need to harmonize reality with the standard. If an implementation exists where sun_path has no null-terminator then it is useful to have POSIX clarify that null-termination is implementation defined behaviour, just like it says that sun_path's length undefined. Under "Application Usage" or "Examples" it's valid to talk about specific implementations. See: http://pubs.opengroup.org/onlinepubs/007904975/basedefs/sys/un.h.html, where it talks about BSD in the "Application Usage." It's about time we some "Linux this" and "Linux that" in there. Cheers, Carlos. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
From: "Carlos O'Donell" <carlos@systemhalted.org> Date: Wed, 18 Apr 2012 14:48:43 -0400 > POSIX doesn't exist in a vacuum, we need to harmonize reality with the > standard. If an implementation exists where sun_path has no > null-terminator then it is useful to have POSIX clarify that > null-termination is implementation defined behaviour, just like it > says that sun_path's length undefined. Under "Application Usage" or > "Examples" it's valid to talk about specific implementations. > > See: http://pubs.opengroup.org/onlinepubs/007904975/basedefs/sys/un.h.html, > where it talks about BSD in the "Application Usage." It's about time > we some "Linux this" and "Linux that" in there. Ok, thanks for explaining. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Apr 18, 2012 at 4:16 PM, David Miller <davem@davemloft.net> wrote: > From: "Carlos O'Donell" <carlos@systemhalted.org> > Date: Wed, 18 Apr 2012 00:08:47 -0400 > >> I don't clearly understand your position here, and perhaps that's my >> own ignorance, but could you please clarify, with examples, exactly >> why the change is not acceptable? > > My position is that since millions upon millions of Linux systems, in > fact every single Linux system, exists right now with the current > behavior we are not helping application writers at all by changing > behavior now after it's been this way for nearly 20 years. > > Because if an application writer wants his code to work on systems > that actually exist he has to accomodate the non-NULL termination > situation if he wants to inspect or print out an AF_UNIX path. > > Because every system in existence right now allows the non-NULL > terminated AF_UNIX paths, therefore it's possible on every system > in existence right now. > > Catch my drift? > > The very thing the patch claims to help, it doesn't. We install this > kernel patch now and then tell application writers that they can just > assume all AF_UNIX paths are NULL terminated when they want to print > it out, because such code will not actually be guarenteed to work on > all deployed Linux machines out there. Hang on a moment. I did not suggest that we can just tell users they can forget about the past. Obviously, users will need to program to past kernel behavior here for a good long time yet. (As Alan says elsewhere in the thread "they'll be defensively coding for the existing API for another ten years for enterprise distros anyway".) However, this is about longer-term improvement of the quality of implementation; in X years (choose your X) time, a lot of new application may not need to care about the old broken behavior. See some related examples below. And you skipped past my other two points. Even if my understanding about POSIX mandates is correct, I can understand how we might ignore that point. But the last one is still germane: [[ 3. Considering these two sets: (a) [applications that rely on the assumption that there is a null terminator inside sizeof(sun_path) bytes] (b) [applications that would break if the kernel behavior changed] I suspect that set (a) is rather larger than set (b)--or, more likely still, applications ensure they go for the lowest common denominator limit of 92 (HP-UX) or 104 (historical BSD limit) bytes, and so avoid this issue completely. ]] There may well be potential breakages out there in set (a), and improving the QOI would help them. (To put things in terms of Alan's response: I suspect that there may well be existing applications that are *not* defensively handling the existing API). Taking the logic you've posed (my reading: "we shouldn't fix old brokenness because applications will still need to code to the brokenness") to the extreme, we'd *never* fix old pieces of brokenness. However, we certainly have precedents for doing exactly that: After nearly 15 years of brokenness (stretching back to the first kernels), commit 69be8f189653cd81aae5a74e26615b12871bb72e fixed this (sigaction(2)): BUGS In kernels up to and including 2.6.13, specifying SA_NODEFER in sa_flags prevents not only the delivered signal from being masked during execution of the handler, but also the signals specified in sa_mask. This bug was fixed in kernel 2.6.14. Similarly, after brokenness that had run through the entire preceding 2.4.x kernel series, Linux 2.6.4 fixed this: BUGS In kernel 2.4 (and earlier) there is some strangeness in the handling of X_OK tests for superuser. If all categories of execute permission are disabled for a nondirectory file, then the only access() test that returns -1 is when mode is speci‐ fied as just X_OK; if R_OK or W_OK is also specified in mode, then access() returns 0 for such files. Early 2.6 kernels (up to and including 2.6.3) also behaved in the same way as kernel 2.4. (A little background here: http://thread.gmane.org/gmane.linux.kernel/158814, and the fix eventually went in with http://thread.gmane.org/gmane.linux.kernel/178719) > You cannot just ignore 20 years of precedence and say "oh let's change > this in the kernel now, and that way application writers don't have to > worry about that lack of NULL termination any more." It simply > doesn't work like that. As should be clear from the above, I agree. But still, I don't think the logic "it's broken, and even if we fix it, users will still have to code to the old brokenness" is a sufficient argument against improving the QOI long-term. > All of this talk about whether applications actually create non-NULL > terminated AF_UNIX paths don't even factor into the conversation. > > So the value proposition for this patch simply does not exist. Of course, it's your call in the end, but I don't think things are as cut-and-dried as your response suggests. Cheers, Michael
From: "Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com> Date: Thu, 19 Apr 2012 10:50:40 +1200 > However, this is about longer-term improvement of the quality of > implementation; in X years (choose your X) time, a lot of new > application may not need to care about the old broken behavior. There is really no value to this, the AF_UNIX NULL termination issue is significantly different from the signal examples you mention. If we're going to, like Carlos will, make mention in POSIX documents that one must account for possible lack of NULL termination, there is absolutely ZERO value in changing things because we are telling application writers the state of reality which is that they have to allot for this. Please drop this issue, the discussion was over a long time ago, thank you very much. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> 3. Considering these two sets: > > (a) [applications that rely on the assumption that there > is a null terminator inside sizeof(sun_path) bytes] > (b) [applications that would break if the kernel behavior changed] > > I suspect that set (a) is rather larger than set (b)--or, more > likely still, applications ensure they go for the lowest common > denominator limit of 92 (HP-UX) or 104 (historical BSD limit) > bytes, and so avoid this issue completely. Or another way of putting it 3(a) Sloppy coding that may have lots of other bugs 3(b) Interfaces and code we promised not to break. Alan -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Apr 19, 2012 at 10:19 PM, Alan Cox <alan@lxorguk.ukuu.org.uk> wrote: >> 3. Considering these two sets: >> >> (a) [applications that rely on the assumption that there >> is a null terminator inside sizeof(sun_path) bytes] >> (b) [applications that would break if the kernel behavior changed] >> >> I suspect that set (a) is rather larger than set (b)--or, more >> likely still, applications ensure they go for the lowest common >> denominator limit of 92 (HP-UX) or 104 (historical BSD limit) >> bytes, and so avoid this issue completely. > > Or another way of putting it > > 3(a) Sloppy coding that may have lots of other bugs > > 3(b) Interfaces and code we promised not to break. Yes, it's another way of putting it. (Though regarding 3(b), part of the problem is that there never was a clearly specified contract.) Anyway, I've dug deeper, looking at hat happens on other platforms. It's a mess: the BSDs don't even guarantee that sun_path is null_terminated. So, here's how one has to portably deal with the variations: addrlen = sizeof(struct sockaddr_un); cfd = accept(lfd, &addr, &addlen); printf("%.*s", addrlen - offsetof(struct sockaddr_un, sun_path), addr.sun_path); That's pretty hideous! Thanks, Michael
On Thursday 2012-04-19 12:33, Michael Kerrisk (man-pages) wrote: > >Anyway, I've dug deeper, looking at hat happens on other platforms. >It's a mess: the BSDs don't even guarantee that sun_path is >null_terminated. So, here's how one has to portably deal with the >variations: > >addrlen = sizeof(struct sockaddr_un); >cfd = accept(lfd, &addr, &addlen); > >printf("%.*s", addrlen - offsetof(struct sockaddr_un, sun_path), addr.sun_path); What operating system made you write that? I just ask for fun and the record. Solaris's dirent is also constructed such that using sizeof(...) is useless. typedef struct dirent { ino_t d_ino; /* "inode number" of entry */ off_t d_off; /* offset of disk directory entry */ unsigned short d_reclen; /* length of this record */ char d_name[1]; /* name of file */ } dirent_t; -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c index d510353..f9f77a7 100644 --- a/net/unix/af_unix.c +++ b/net/unix/af_unix.c @@ -216,6 +216,9 @@ static int unix_mkname(struct sockaddr_un *sunaddr, int len, unsigned *hashp) */ ((char *)sunaddr)[len] = 0; len = strlen(sunaddr->sun_path)+1+sizeof(short); + /* No null terminator was found in the path. */ + if (len > sizeof(*sunaddr)) + return -EINVAL; return len; }