Message ID | 1434388931-24487-2-git-send-email-aarcange@redhat.com |
---|---|
State | New |
Headers | show |
On Jun 15, 2015 7:22 AM, "Andrea Arcangeli" <aarcange@redhat.com> wrote: > > + if (cmd != UFFDIO_API) { > + if (ctx->state == UFFD_STATE_WAIT_API) > + return -EINVAL; > + BUG_ON(ctx->state != UFFD_STATE_RUNNING); > + } NAK. Once again: we don't add BUG_ON() as some kind of assert. If your non-critical code has s bug in it, you do WARN_ONCE() and you return. You don't kill the machine just because of some "this can't happen" situation. It turns out "this can't happen" happens way too often, just because code changes, or programmers didn't think all the cases through. And killing the machine is just NOT ACCEPTABLE. People need to stop adding machine-killing checks to code that just doesn't merit killing the machine. And if you are so damn sure that it really cannot happen ever, then you damn well had better remove the test too! BUG_ON is not a debugging tool, or a "I think this would be bad" helper. Linus
On Mon, Jun 15, 2015 at 08:11:50AM -1000, Linus Torvalds wrote: > On Jun 15, 2015 7:22 AM, "Andrea Arcangeli" <aarcange@redhat.com> wrote: > > > > + if (cmd != UFFDIO_API) { > > + if (ctx->state == UFFD_STATE_WAIT_API) > > + return -EINVAL; > > + BUG_ON(ctx->state != UFFD_STATE_RUNNING); > > + } > > NAK. > > Once again: we don't add BUG_ON() as some kind of assert. If your > non-critical code has s bug in it, you do WARN_ONCE() and you return. You > don't kill the machine just because of some "this can't happen" situation. > > It turns out "this can't happen" happens way too often, just because code > changes, or programmers didn't think all the cases through. And killing the > machine is just NOT ACCEPTABLE. > > People need to stop adding machine-killing checks to code that just doesn't > merit killing the machine. > > And if you are so damn sure that it really cannot happen ever, then you > damn well had better remove the test too! > > BUG_ON is not a debugging tool, or a "I think this would be bad" helper. Several times I got very hardly reproducible bugs noticed purely because of BUG_ON (not VM_BUG_ON) inserted out of pure paranoia, so I know as a matter of fact that they're worth the little cost. It's hard to tell if things didn't get worse, if the workload continued, or even if I ended up getting a bugreport in the first place with only a WARN_ON variant, precisely because a WARN_ON isn't necessarily a bug. Example: when a WARN_ON in the network code showup (and they do once in a while as there are so many), nobody panics because we assume it may not actually be a bug so we can cross finger it goes away at the next git fetch... not even sure if they all get reported in the first place. BUG_ONs are terribly annoying when they trigger, and even worse if they're false positives, but they're worth the pain in my view. Of course what's unacceptable is that BUG_ON can be triggered at will by userland, that would be a security issue. Just in case I verified to run two UFFDIO_API in a row and a UFFDIO_REGISTER without an UFFDIO_API before it, and no BUG_ON triggers with this code inserted. Said that it's your choice, so I'm not going to argue further about this and I'm sure fine with WARN_ONCE too, there were a few more to convert in the state machine invariant checks. While at it I can also use VM_WARN_ONCE to cover my performance concern. Thanks, Andrea
On Jun 15, 2015 11:43 AM, "Andrea Arcangeli" <aarcange@redhat.com> wrote: > > Several times I got very hardly reproducible bugs noticed purely > because of BUG_ON (not VM_BUG_ON) Feel free to use them while developing. Don't send me patches with your broken debug code, though. For users, a dead machine means that it is less likely you will ever get a bug report. People set "reboot on oops", and when running X is not always something that can be seen anyway. They'll just see a rebooting or a dead machine, and not send you any debug output. This is not negotiable. Seriously. Get rid of the BUG_ON if you expect your patches to be merged mainline. Also, even for debugging, using something like if (WARN_ON_ONCE(...)) return -EINVAL; is the right thing to do. Then you can unwind locks etc, and return cleanly, and you'll only get one warning rather than a stream of them etc. So stop making excuses for your bad BUG_ON use. Do it in private where nobody can see your perversions, or do it right. Linus
diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c index 5f11678..b69d236 100644 --- a/fs/userfaultfd.c +++ b/fs/userfaultfd.c @@ -1115,6 +1115,12 @@ static long userfaultfd_ioctl(struct file *file, unsigned cmd, int ret = -EINVAL; struct userfaultfd_ctx *ctx = file->private_data; + if (cmd != UFFDIO_API) { + if (ctx->state == UFFD_STATE_WAIT_API) + return -EINVAL; + BUG_ON(ctx->state != UFFD_STATE_RUNNING); + } + switch(cmd) { case UFFDIO_API: ret = userfaultfd_api(ctx, arg);
UFFDIO_API was already forced before read/poll could work. This makes the code more strict to force it also for all other ioctls. All users would already have been required to call UFFDIO_API before invoking other ioctls but this makes it more explicit. This will ensure we can change all ioctls (all but UFFDIO_API/struct uffdio_api) with a bump of uffdio_api.api. There's no actual plan or need to change the API or the ioctl, the current API already should cover fine even the non cooperative usage, but this is just for the longer term future just in case. Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> --- fs/userfaultfd.c | 6 ++++++ 1 file changed, 6 insertions(+)