Message ID | 20160602001759.GF1644@akamai.com
---|---
State | Awaiting Upstream, archived
Delegated to: | David Miller
On Wed, Jun 01, 2016 at 08:17:59PM -0400, Vishwanath Pai wrote:
> libxt_hashlimit: iptables-restore does not work as expected with xt_hashlimit
>
> Add the following iptables rule:
>
> $ iptables -A INPUT -m hashlimit --hashlimit-above 200/sec \
>   --hashlimit-burst 5 --hashlimit-mode srcip --hashlimit-name hashlimit1 \
>   --hashlimit-htable-expire 30000 -j DROP
>
> $ iptables-save > save.txt
>
> Edit save.txt and change the value of --hashlimit-above to 300:
>
> -A INPUT -m hashlimit --hashlimit-above 300/sec --hashlimit-burst 5 \
>   --hashlimit-mode srcip --hashlimit-name hashlimit2 \
>   --hashlimit-htable-expire 30000 -j DROP
>
> Now restore save.txt:
>
> $ iptables-restore < save.txt

In this case we don't end up with two rules; we actually get one single hashlimit rule, given the sequence you provide:

$ iptables-save > save.txt
... edit save.txt
$ iptables-restore < save.txt

> Now userspace thinks that the value of --hashlimit-above is 300, but it is
> actually 200 in the kernel. This happens because when we add multiple
> hashlimit rules with the same name they share the same hashtable
> internally. The kernel module tries to re-use the old hashtable without
> updating the values.
>
> There are multiple problems here:
> 1) We can add two iptables rules with the same name, but the kernel does
>    not handle this well; one procfs file cannot serve two rules.
> 2) The second rule has no effect because the hashtable holds the values
>    from rule 1.
> 3) iptables-restore does not work (as described above).
>
> To fix this I have made the following design change:
> 1) If a second rule is added with the same name as an existing rule,
>    append a number when we create the procfs entry, for example
>    hashlimit_1, hashlimit_2, etc.
> 2) Two rules will not share the same hashtable unless they are similar
>    in every possible way.
> 3) This behavior has to be requested with a new userspace flag,
>    --hashlimit-enhanced-procfs; if this flag is not passed we default to
>    the old behavior.
> This is to make sure we do not break existing scripts
> that rely on the existing behavior.

We discussed this at netdev 0.1, and I think we agreed on adding a new option, something like --hashlimit-update, that would force an update of the existing hashlimit internal state (which is identified by the hashlimit name).

I think the problem here is that you may want to update the internal state of an existing hashlimit object, and currently this is not actually happening.

With an explicit --hashlimit-update flag, the kernel knows for sure that the user wants an update.

Let me know, thanks.
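The --hashlimit-update idea proposed above can be illustrated with a small Python model. This is a hypothetical sketch, not the actual kernel code: lookup stays keyed on the hashlimit name, but the flag tells the check routine to overwrite the stored parameters instead of silently reusing them.

```python
# Toy model (illustrative names only) of the proposed --hashlimit-update
# semantics: the hashtable is still found by name, but the update flag
# makes the check routine honor the new rule's parameters.

tables = {}  # name -> config dict, standing in for the kernel's htable list

def checkentry(name, cfg, update=False):
    """Return the table for `name`, creating or updating it as requested."""
    if name in tables:
        if update:
            tables[name].update(cfg)   # honor the new rule's parameters
        return tables[name]            # otherwise: old behaviour, stale cfg
    tables[name] = dict(cfg)           # first rule creates the table
    return tables[name]

checkentry("hashlimit1", {"above": 200})
stale = checkentry("hashlimit1", {"above": 300})             # old behaviour
assert stale["above"] == 200           # userspace says 300, kernel keeps 200
fresh = checkentry("hashlimit1", {"above": 300}, update=True)
assert fresh["above"] == 300           # with the flag, the state is updated
```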
On 06/23/2016 06:25 AM, Pablo Neira Ayuso wrote:
> On Wed, Jun 01, 2016 at 08:17:59PM -0400, Vishwanath Pai wrote:
>> libxt_hashlimit: iptables-restore does not work as expected with xt_hashlimit
>>
>> Add the following iptables rule:
>>
>> $ iptables -A INPUT -m hashlimit --hashlimit-above 200/sec \
>>   --hashlimit-burst 5 --hashlimit-mode srcip --hashlimit-name hashlimit1 \
>>   --hashlimit-htable-expire 30000 -j DROP
>>
>> $ iptables-save > save.txt
>>
>> Edit save.txt and change the value of --hashlimit-above to 300:
>>
>> -A INPUT -m hashlimit --hashlimit-above 300/sec --hashlimit-burst 5 \
>>   --hashlimit-mode srcip --hashlimit-name hashlimit2 \
>>   --hashlimit-htable-expire 30000 -j DROP
>>
>> Now restore save.txt:
>>
>> $ iptables-restore < save.txt
>
> In this case, we don't end up with two rules, we actually get one
> single hashlimit rule, given the sequence you provide.
>
> $ iptables-save > save.txt
> ... edit save.txt
> $ iptables-restore < save.txt

Yes, we end up with just one rule, but the kernel data structure is not updated. Userspace thinks the value is 300/s but in the kernel it is still 200/s.

>> Now userspace thinks that the value of --hashlimit-above is 300 but it is
>> actually 200 in the kernel. This happens because when we add multiple
>> hashlimit rules with the same name they will share the same hashtable
>> internally. The kernel module tries to re-use the old hashtable without
>> updating the values.
>>
>> There are multiple problems here:
>> 1) We can add two iptables rules with the same name, but the kernel does
>>    not handle this well; one procfs file cannot serve two rules.
>> 2) The second rule has no effect because the hashtable holds the values
>>    from rule 1.
>> 3) iptables-restore does not work (as described above).
>>
>> To fix this I have made the following design change:
>> 1) If a second rule is added with the same name as an existing rule,
>>    append a number when we create the procfs entry, for example
>>    hashlimit_1, hashlimit_2, etc.
>> 2) Two rules will not share the same hashtable unless they are similar
>>    in every possible way.
>> 3) This behavior has to be requested with a new userspace flag,
>>    --hashlimit-enhanced-procfs; if this flag is not passed we default to
>>    the old behavior. This is to make sure we do not break existing
>>    scripts that rely on the existing behavior.
>
> We discussed this at netdev 0.1, and I think we agreed on adding a new
> option, something like --hashlimit-update, that would force an update
> of the existing hashlimit internal state (that is identified by the
> hashlimit name).
>
> I think the problem here is that you may want to update the internal
> state of an existing hashlimit object, and currently this is not
> actually happening.
>
> With the explicit --hashlimit-update flag, from the kernel we really
> know that the user wants an update.
>
> Let me know, thanks.

Yes, I believe you had a discussion about this with Josh Hunt. This patch does add a new option, but it is called --hashlimit-enhanced-procfs instead. I am open to renaming it to something else; I chose this name because the patch affects the names of the procfs files when multiple rules with the same name exist. This generally does not happen, but it is a side effect of the way we create these files.
In the case of the restore example above, we get the call to hashlimit_mt_check for the new rule before the old rule is deleted, so there is a short window where we have two rules in the kernel with the same name.

Other than that, we are doing exactly what you said, but creating a new entry in the hashtable instead of updating it. The previous entry is automatically removed when the old rule is flushed/deleted. Users will see the new behavior only if the new option is passed; otherwise we default to the old behavior. We are also doing this only in rev2, so old userspace tools will not be affected.

Thanks, Vishwanath
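The "short window" described above can be modeled in a few lines of Python. This is a hypothetical sketch of the checkentry-before-destroy ordering, not the real xt_hashlimit code: because the new rule's check runs while the old rule is still registered, the old table is found by name, its refcount is bumped, and the later destroy only drops the refcount, leaving the stale parameters in place.

```python
# Toy refcount model of the iptables-restore ordering (illustrative
# names, not kernel code): checkentry(new) runs before destroy(old),
# so the same-name table survives with its original configuration.

import collections

Table = collections.namedtuple("Table", ["name", "cfg"])

class Registry:
    def __init__(self):
        self.refs = {}  # name -> [Table, refcount]

    def checkentry(self, name, cfg):
        if name in self.refs:            # old rule still registered
            self.refs[name][1] += 1      # reuse it, ignore the new cfg
            return self.refs[name][0]
        self.refs[name] = [Table(name, dict(cfg)), 1]
        return self.refs[name][0]

    def destroy(self, name):
        entry = self.refs[name]
        entry[1] -= 1
        if entry[1] == 0:                # only the last user frees it
            del self.refs[name]

reg = Registry()
reg.checkentry("hashlimit1", {"above": 200})      # original rule
t = reg.checkentry("hashlimit1", {"above": 300})  # restore: new rule first
reg.destroy("hashlimit1")                         # ...then old rule removed
assert t.cfg["above"] == 200                      # stale config survives
```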
Hi,

On Fri, Jun 24, 2016 at 02:24:18PM -0400, Vishwanath Pai wrote:
> On 06/23/2016 06:25 AM, Pablo Neira Ayuso wrote:
>> On Wed, Jun 01, 2016 at 08:17:59PM -0400, Vishwanath Pai wrote:
>>> libxt_hashlimit: iptables-restore does not work as expected with xt_hashlimit
>>> [...]
>>> Now restore save.txt:
>>>
>>> $ iptables-restore < save.txt
>>
>> In this case, we don't end up with two rules, we actually get one
>> single hashlimit rule, given the sequence you provide.
>>
>> $ iptables-save > save.txt
>> ... edit save.txt
>> $ iptables-restore < save.txt
>
> Yes, we end up with just one rule, but the kernel data structure is not
> updated. Userspace thinks the value is 300/s but in the kernel it is
> still 200/s.

Right, but the main point of this is to honor the new rule configuration, i.e. to update the internal hashlimit configuration of the previous rules.

>>> Now userspace thinks that the value of --hashlimit-above is 300 but it
>>> is actually 200 in the kernel. This happens because when we add
>>> multiple hashlimit rules with the same name they will share the same
>>> hashtable internally. The kernel module tries to re-use the old
>>> hashtable without updating the values.
>>>
>>> There are multiple problems here:
>>> 1) We can add two iptables rules with the same name, but the kernel
>>>    does not handle this well; one procfs file cannot serve two rules.
>>> 2) The second rule has no effect because the hashtable holds the
>>>    values from rule 1.
>>> 3) iptables-restore does not work (as described above).
>>>
>>> To fix this I have made the following design change:
>>> 1) If a second rule is added with the same name as an existing rule,
>>>    append a number when we create the procfs entry, for example
>>>    hashlimit_1, hashlimit_2, etc.
>>> 2) Two rules will not share the same hashtable unless they are similar
>>>    in every possible way.
>>> 3) This behavior has to be requested with a new userspace flag,
>>>    --hashlimit-enhanced-procfs; if this flag is not passed we default
>>>    to the old behavior. This is to make sure we do not break existing
>>>    scripts that rely on the existing behavior.
>>
>> We discussed this at netdev 0.1, and I think we agreed on adding a new
>> option, something like --hashlimit-update, that would force an update
>> of the existing hashlimit internal state (that is identified by the
>> hashlimit name).
>>
>> I think the problem here is that you may want to update the internal
>> state of an existing hashlimit object, and currently this is not
>> actually happening.
>>
>> With the explicit --hashlimit-update flag, from the kernel we really
>> know that the user wants an update.
>>
>> Let me know, thanks.
>
> Yes, I believe you had a discussion about this with Josh Hunt. This
> patch does add a new option, but it is called --hashlimit-enhanced-procfs
> instead. I am open to renaming it to something else. I chose this name
> because the patch affects the names of the procfs files when multiple
> rules with the same name exist. This generally does not happen, but it
> is a side effect of the way we create these files.
> In the case of the restore example above we get the call to
> hashlimit_mt_check for the new rule before the old rule is deleted, so
> there is a short window where we have two rules in the kernel with the
> same name.

I see, but I'm not convinced about this /proc rename feature.

I think the main point of this, as well as of other entries in bugzilla related to this, is the ability to update an existing hashlimit state.

So, I'm not proposing to rename --enhanced-procfs to something else; I think that a different approach, consisting of adding a new option like --hashlimit-update that updates the internal state of an existing hashlimit object, is just fine for your use case, right?

> Other than that, we are doing exactly what you said, but creating a new
> entry in the hashtable instead of updating it. The previous entry will
> automatically be removed when the old rule is flushed/deleted.

What I'm missing is why we need this /proc rename at all.
On 06/25/2016 05:39 AM, Pablo Neira Ayuso wrote:
> I see, but I'm not convinced about this /proc rename feature.
>
> I think the main point of this, as well as other entries in bugzilla
> related to this, is the ability to update an existing hashlimit state.
>
> So, I'm not proposing to rename --enhanced-procfs to something else,
> I think that a different approach consisting of adding a new option
> like --hashlimit-update that will update the internal state of an
> existing hashlimit object is just fine for your use case, right?
>
>> Other than that, we are doing exactly what you said, but creating a new
>> entry in the hashtable instead of updating it. The previous entry will
>> automatically be removed when the old rule is flushed/deleted.
>
> What I'm missing is why we need this /proc rename at all.

The reason we need the procfs rename feature is that procfs handling is broken at the moment. Let us assume someone adds two rules with the same name (intentionally or otherwise). We cannot prevent them from doing this, or error out when it happens, because all of this is done in hashlimit_mt_check, which is called for every iptables rule change, even for an entirely different rule. I'll demonstrate two scenarios here; I have added debug printk statements that print every time hashlimit_mt_check is called.
1) Add two rules with the same name but in different chains:

$ iptables -A chain1 -m hashlimit --hashlimit-above 200/sec \
  --hashlimit-mode srcip --hashlimit-name hashlimit1 -j DROP

$ dmesg -c
[  103.965578] hashlimit_mt_check for rule hashlimit1

$ iptables -A chain2 -m hashlimit --hashlimit-above 300/sec \
  --hashlimit-mode srcip --hashlimit-name hashlimit1 -j DROP

$ dmesg -c
[  114.613758] hashlimit_mt_check for rule hashlimit1
[  114.621360] hashlimit_mt_check for rule hashlimit1
[  114.627411] hashlimit_mt_destroy on hashlimit1

2) Replace an iptables rule with iptables-restore:

$ iptables -A chain1 -m hashlimit --hashlimit-above 200/sec \
  --hashlimit-mode srcip --hashlimit-name hashlimit1 -j DROP

$ iptables-save > /tmp/hashlimit

$ vi /tmp/hashlimit (edit 200/s to 300/s)

$ iptables-restore < /tmp/hashlimit

$ dmesg -c
[ 1585.411093] hashlimit_mt_check for rule hashlimit1
[ 1585.418948] hashlimit_mt_destroy on hashlimit1

In case 1 there exist two rules with the same name, but we cannot have procfs files for both of the rules since they have to live in the same directory. In case 2 there will be only one rule, but there is a small window where two rules with the same name exist. We cannot differentiate this from case 1: in both cases we get the call to hashlimit_mt_check for the new rule before the old rule is deleted.

Without the rename feature I do not know how to correctly handle the scenario where two rules with different parameters but the same name exist.

I believe the rest of the patch handles the --hashlimit-update feature you mentioned, but instead of updating an existing object it creates a new one, and the old object is deleted by the call to destroy. The hashtable match function is modified to compare all parameters of the object, not just the name, so that we can reuse objects that have exactly the same configuration.
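The revised match rule described in the last paragraph (compare the full configuration, not just the name) can be sketched like this; the Python names and structure are illustrative, not the kernel's.

```python
# Sketch of the patch's revised reuse rule: a table is shared only when
# every parameter matches, so two same-name rules with different
# parameters coexist as separate tables (hence the procfs suffix).

tables = []  # list of (name, cfg) pairs; duplicate names may coexist

def find_or_create(name, cfg):
    """Reuse an existing table only on a full-configuration match."""
    for n, c in tables:
        if n == name and c == cfg:   # match on all parameters, not the name
            return c
    tables.append((name, dict(cfg)))
    return tables[-1][1]

a = find_or_create("hashlimit1", {"above": 200, "mode": "srcip"})
b = find_or_create("hashlimit1", {"above": 300, "mode": "srcip"})
assert a is not b        # same name, different cfg -> separate tables
assert len(tables) == 2  # two tables whose procfs names would now clash
```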
On 07/05/2016 04:13 PM, Vishwanath Pai wrote:
> On 06/25/2016 05:39 AM, Pablo Neira Ayuso wrote:
>> I see, but I'm not convinced about this /proc rename feature.
>>
>> I think the main point of this, as well as other entries in bugzilla
>> related to this, is the ability to update an existing hashlimit state.
>>
>> So, I'm not proposing to rename --enhanced-procfs to something else,
>> I think that a different approach consisting of adding a new option
>> like --hashlimit-update that will update the internal state of an
>> existing hashlimit object is just fine for your use case, right?
>>
>>> Other than that, we are doing exactly what you said, but creating a new
>>> entry in the hashtable instead of updating it. The previous entry will
>>> automatically be removed when the old rule is flushed/deleted.
>>
>> What I'm missing is why we need this /proc rename at all.
>
> The reason we need the procfs rename feature is that procfs handling is
> broken at the moment. Let us assume someone adds two rules with the same
> name (intentionally or otherwise). We cannot prevent them from doing
> this, or error out when it happens, because all of this is done in
> hashlimit_mt_check, which is called for every iptables rule change, even
> for an entirely different rule. I'll demonstrate two scenarios here; I
> have added debug printk statements that print every time
> hashlimit_mt_check is called.
>
> 1) Add two rules with the same name but in different chains:
>
> $ iptables -A chain1 -m hashlimit --hashlimit-above 200/sec \
>   --hashlimit-mode srcip --hashlimit-name hashlimit1 -j DROP
>
> $ dmesg -c
> [  103.965578] hashlimit_mt_check for rule hashlimit1
>
> $ iptables -A chain2 -m hashlimit --hashlimit-above 300/sec \
>   --hashlimit-mode srcip --hashlimit-name hashlimit1 -j DROP
>
> $ dmesg -c
> [  114.613758] hashlimit_mt_check for rule hashlimit1
> [  114.621360] hashlimit_mt_check for rule hashlimit1
> [  114.627411] hashlimit_mt_destroy on hashlimit1
>
> 2) Replace an iptables rule with iptables-restore:
>
> $ iptables -A chain1 -m hashlimit --hashlimit-above 200/sec \
>   --hashlimit-mode srcip --hashlimit-name hashlimit1 -j DROP
>
> $ iptables-save > /tmp/hashlimit
>
> $ vi /tmp/hashlimit (edit 200/s to 300/s)
>
> $ iptables-restore < /tmp/hashlimit
>
> $ dmesg -c
> [ 1585.411093] hashlimit_mt_check for rule hashlimit1
> [ 1585.418948] hashlimit_mt_destroy on hashlimit1
>
> In case 1 there exist two rules with the same name, but we cannot have
> procfs files for both of the rules since they have to live in the same
> directory. In case 2 there will be only one rule, but there is a small
> window where two rules with the same name exist. We cannot differentiate
> this from case 1: in both cases we get the call to hashlimit_mt_check
> for the new rule before the old rule is deleted.
>
> Without the rename feature I do not know how to correctly handle the
> scenario where two rules with different parameters but the same name
> exist.
>
> I believe the rest of the patch handles the --hashlimit-update feature
> you mentioned, but instead of updating an existing object it creates a
> new one, and the old object is deleted by the call to destroy. The
> hashtable match function is modified to compare all parameters of the
> object, not just the name, so that we can reuse objects that have
> exactly the same configuration.
There is also the option of removing the procfs files altogether in revision 2. The files were only used for debugging, and any users relying on a file being present will continue to see it as long as they stay on revision 1. If we do need these files, we can add another option in rev2 that explicitly enables procfs but leaves it disabled by default, and we can retain the existing behavior assuming whoever is using it knows its caveats. Please suggest.

Thanks, Vishwanath
On Wed, Jul 06, 2016 at 06:26:39PM -0400, Vishwanath Pai wrote:
> On 07/05/2016 04:13 PM, Vishwanath Pai wrote:
>> On 06/25/2016 05:39 AM, Pablo Neira Ayuso wrote:
>>> I see, but I'm not convinced about this /proc rename feature.
>>>
>>> I think the main point of this, as well as other entries in bugzilla
>>> related to this, is the ability to update an existing hashlimit state.
>>>
>>> So, I'm not proposing to rename --enhanced-procfs to something else,
>>> I think that a different approach consisting of adding a new option
>>> like --hashlimit-update that will update the internal state of an
>>> existing hashlimit object is just fine for your use case, right?
>>>
>>>> Other than that, we are doing exactly what you said, but creating a
>>>> new entry in the hashtable instead of updating it. The previous entry
>>>> will automatically be removed when the old rule is flushed/deleted.
>>>
>>> What I'm missing is why we need this /proc rename at all.
>>
>> The reason we need the procfs rename feature is that procfs handling is
>> broken at the moment. Let us assume someone adds two rules with the
>> same name (intentionally or otherwise). We cannot prevent them from
>> doing this, or error out when it happens, because all of this is done
>> in hashlimit_mt_check, which is called for every iptables rule change,
>> even for an entirely different rule. I'll demonstrate two scenarios
>> here; I have added debug printk statements that print every time
>> hashlimit_mt_check is called.
>>
>> 1) Add two rules with the same name but in different chains:
>>
>> $ iptables -A chain1 -m hashlimit --hashlimit-above 200/sec \
>>   --hashlimit-mode srcip --hashlimit-name hashlimit1 -j DROP
>>
>> $ dmesg -c
>> [  103.965578] hashlimit_mt_check for rule hashlimit1
>>
>> $ iptables -A chain2 -m hashlimit --hashlimit-above 300/sec \
>>   --hashlimit-mode srcip --hashlimit-name hashlimit1 -j DROP
>>
>> $ dmesg -c
>> [  114.613758] hashlimit_mt_check for rule hashlimit1
>> [  114.621360] hashlimit_mt_check for rule hashlimit1
>> [  114.627411] hashlimit_mt_destroy on hashlimit1

We have to keep the existing behaviour. Yes, it's broken or ambiguous, but there may be people out there relying on it.

What I think we can do to resolve the scenario that you describe above is to provide a new option:

--hashlimit-exclusive

This turns on a flag in the configuration that makes us bail out if either of these happens:

1) there is an existing hashlimit with that name already;
2) someone tries to add a hashlimit with a clashing name with no --hashlimit-exclusive.

So --hashlimit-exclusive just makes sure the user gets a unique name and silly things don't happen. If it is not specified, we rely on the existing (broken) behaviour.

>> 2) Replace an iptables rule with iptables-restore:
>>
>> $ iptables -A chain1 -m hashlimit --hashlimit-above 200/sec \
>>   --hashlimit-mode srcip --hashlimit-name hashlimit1 -j DROP
>>
>> $ iptables-save > /tmp/hashlimit
>>
>> $ vi /tmp/hashlimit (edit 200/s to 300/s)
>>
>> $ iptables-restore < /tmp/hashlimit
>>
>> $ dmesg -c
>> [ 1585.411093] hashlimit_mt_check for rule hashlimit1
>> [ 1585.418948] hashlimit_mt_destroy on hashlimit1

For this case, you rename your --enhanced-procfs option to:

--hashlimit-update

as I proposed, and remove the /proc entry rename logic.

>> In case 1 there exist two rules with the same name but we cannot have
>> procfs files for both of the rules since they have to exist in the same
>> directory.
>> In case 2 there will be only one rule but there is a small
>> window where two rules with the same name exist. We cannot
>> differentiate this from case 1. In both cases we get the call to
>> hashlimit_mt_check for the new rule before the old rule is deleted.
>>
>> Without the rename feature I do not know how to correctly handle the
>> scenario where two rules with different parameters but the same name
>> exist.

With the explicit new options, we can indeed know where to go. If no options are specified, we just keep behaving in the same (broken) way that we provide now. This is the way hashlimit has been working since day one.

Does this sound good to you?

Thanks for not giving up on sorting out this problem.
On 07/08/2016 07:54 AM, Pablo Neira Ayuso wrote:
> We have to keep the existing behaviour. Yes, it's broken or ambiguous,
> but there may be people out there relying on it.
>
> What I think we can do to resolve the scenario that you describe above
> is to provide a new option:
>
> --hashlimit-exclusive
>
> This turns on a flag in the configuration that makes us bail out if
> either of these happens:
>
> 1) there is an existing hashlimit with that name already;
> 2) someone tries to add a hashlimit with a clashing name with no
>    --hashlimit-exclusive.
>
> So --hashlimit-exclusive just makes sure the user gets a unique name
> and silly things don't happen.
>
> If it is not specified, we rely on the existing (broken) behaviour.

Bailing out if another rule of the same name exists is not an option right now. Case 1 and case 2 look exactly the same to the kernel module: if you look at the dmesg output for case 2, hashlimit_mt_check for the new rule is called first and then the old rule is deleted, so as far as the kernel module is concerned it appears as if someone is adding a new rule with the same name as an existing rule.

Adding --hashlimit-exclusive will solve case 1, but it will error out for case 2 as well. In case 1 the user is indeed trying to add a rule with the same name as an existing rule and we can bail out, but in case 2 we are merely restoring the rule to its original state. The kernel module cannot differentiate between case 1 and case 2 because we have no way of telling whether a user is adding another rule or restoring one. So if a user tries case 2 with --hashlimit-exclusive, iptables-restore will fail, which would be totally unexpected.

One thing we can do to fix this is to make the procfs file creation optional. Right now --hashlimit-name is a mandatory option; I think we can make it optional so that no proc file is created if the option is not specified. This would work perfectly with --hashlimit-update, and with both case 1 and case 2.
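The objection above can be made concrete with a toy model (hypothetical Python, not kernel code): with a strict exclusive check, the restore sequence, where the new rule's check runs before the old rule's destroy, is indistinguishable from a genuine name clash, so a plain iptables-restore would be rejected.

```python
# Toy model of why a strict --hashlimit-exclusive check would break
# iptables-restore: during restore the old rule is still registered
# when the new rule is checked, which looks like a real name clash.

active = set()  # names of currently registered hashlimit rules

def checkentry(name, exclusive=False):
    if exclusive and name in active:
        raise ValueError("hashlimit name already in use: " + name)
    active.add(name)

def destroy(name):
    active.discard(name)

checkentry("hashlimit1")          # original rule
try:
    # restore: the edited rule's check runs while the old rule exists
    checkentry("hashlimit1", exclusive=True)
    ok = True
except ValueError:
    ok = False
destroy("hashlimit1")             # old rule torn down afterwards
assert not ok  # the restore was rejected, which is unexpected behaviour
```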
>>> 2) Replace an iptables rule with iptables-restore:
>>>
>>> $ iptables -A chain1 -m hashlimit --hashlimit-above 200/sec \
>>>   --hashlimit-mode srcip --hashlimit-name hashlimit1 -j DROP
>>>
>>> $ iptables-save > /tmp/hashlimit
>>>
>>> $ vi /tmp/hashlimit (edit 200/s to 300/s)
>>>
>>> $ iptables-restore < /tmp/hashlimit
>>>
>>> $ dmesg -c
>>> [ 1585.411093] hashlimit_mt_check for rule hashlimit1
>>> [ 1585.418948] hashlimit_mt_destroy on hashlimit1
>
> For this case, you rename your --enhanced-procfs option to:
>
> --hashlimit-update
>
> as I proposed, and remove the /proc entry rename logic.

I will remove the proc entry rename logic and change the option to --hashlimit-update. This will work for both cases, provided we make the procfs file creation optional. If a user specifies --hashlimit-name then we fall back to the old behavior.

>>> In case 1 there exist two rules with the same name but we cannot have
>>> procfs files for both of the rules since they have to exist in the
>>> same directory. In case 2 there will be only one rule but there is a
>>> small window where two rules with the same name exist. We cannot
>>> differentiate this from case 1. In both cases we get the call to
>>> hashlimit_mt_check for the new rule before the old rule is deleted.
>>>
>>> Without the rename feature I do not know how to correctly handle the
>>> scenario where two rules with different parameters but the same name
>>> exist.
>
> With the explicit new options, we can indeed know where to go. If no
> options are specified, we just keep behaving in the same (broken) way
> that we provide now.
>
> This is the way hashlimit has been working since day one.
>
> Does this sound good to you?
>
> Thanks for not giving up on sorting out this problem.

No problem, this turned out to be trickier to solve than I originally expected.
Please let me know what you think about what I proposed above, and I will send a v2 after that.

Thanks, Vishwanath
diff --git a/extensions/libxt_hashlimit.c b/extensions/libxt_hashlimit.c
index 4193464..ac67875 100644
--- a/extensions/libxt_hashlimit.c
+++ b/extensions/libxt_hashlimit.c
@@ -67,6 +67,7 @@ enum {
 	O_HTABLE_MAX,
 	O_HTABLE_GCINT,
 	O_HTABLE_EXPIRE,
+	O_PROCFS,
 	F_BURST = 1 << O_BURST,
 	F_UPTO = 1 << O_UPTO,
 	F_ABOVE = 1 << O_ABOVE,
@@ -177,6 +178,7 @@ static const struct xt_option_entry hashlimit_mt_opts[] = {
 	{.name = "hashlimit-mode", .id = O_MODE, .type = XTTYPE_STRING},
 	{.name = "hashlimit-name", .id = O_NAME, .type = XTTYPE_STRING,
 	 .flags = XTOPT_MAND | XTOPT_PUT, XTOPT_POINTER(s, name), .min = 1},
+	{.name = "hashlimit-enhanced-procfs", .id = O_PROCFS, .type = XTTYPE_NONE},
 	XTOPT_TABLEEND,
 };
 #undef s
@@ -521,6 +523,9 @@ static void hashlimit_mt_parse(struct xt_option_call *cb)
 	case O_DSTMASK:
 		info->cfg.dstmask = cb->val.hlen;
 		break;
+	case O_PROCFS:
+		info->cfg.flags |= XT_HASHLIMIT_FLAG_PROCFS;
+		break;
 	}
 }
@@ -856,6 +861,9 @@ hashlimit_mt_save(const struct hashlimit_cfg2 *cfg, const char* name, unsigned i
 		printf(" --hashlimit-srcmask %u", cfg->srcmask);
 	if (cfg->dstmask != dmask)
 		printf(" --hashlimit-dstmask %u", cfg->dstmask);
+
+	if ((revision == 2) && (cfg->flags & XT_HASHLIMIT_FLAG_PROCFS))
+		printf(" --hashlimit-enhanced-procfs");
 }

 static void
diff --git a/extensions/libxt_hashlimit.man b/extensions/libxt_hashlimit.man
index 6aac3f2..0434f03 100644
--- a/extensions/libxt_hashlimit.man
+++ b/extensions/libxt_hashlimit.man
@@ -40,6 +40,9 @@ Like \-\-hashlimit\-srcmask, but for destination addresses.
 \fB\-\-hashlimit\-name\fP \fIfoo\fP
 The name for the /proc/net/ipt_hashlimit/foo entry.
 .TP
+\fB\-\-hashlimit\-enhanced\-procfs\fP
+Append _number to the procfs file when multiple rules with the same name exist
+.TP
 \fB\-\-hashlimit\-htable\-size\fP \fIbuckets\fP
 The number of buckets of the hash table
 .TP
diff --git a/include/linux/netfilter/xt_hashlimit.h b/include/linux/netfilter/xt_hashlimit.h
index e493fc1..2954381 100644
--- a/include/linux/netfilter/xt_hashlimit.h
+++ b/include/linux/netfilter/xt_hashlimit.h
@@ -16,6 +16,10 @@ struct xt_hashlimit_htable;

 enum {
+	XT_HASHLIMIT_FLAG_PROCFS = 1
+};
+
+enum {
 	XT_HASHLIMIT_HASH_DIP = 1 << 0,
 	XT_HASHLIMIT_HASH_DPT = 1 << 1,
 	XT_HASHLIMIT_HASH_SIP = 1 << 2,
@@ -74,6 +78,9 @@ struct hashlimit_cfg2 {
 	__u32 expire;	/* when do entries expire? */
 	__u8 srcmask, dstmask;
+
+	__u8 procfs_suffix;
+	__u32 flags;
 };

 struct xt_hashlimit_mtinfo1 {
libxt_hashlimit: iptables-restore does not work as expected with xt_hashlimit

Add the following iptables rule:

$ iptables -A INPUT -m hashlimit --hashlimit-above 200/sec \
  --hashlimit-burst 5 --hashlimit-mode srcip --hashlimit-name hashlimit1 \
  --hashlimit-htable-expire 30000 -j DROP

$ iptables-save > save.txt

Edit save.txt and change the value of --hashlimit-above to 300:

-A INPUT -m hashlimit --hashlimit-above 300/sec --hashlimit-burst 5 \
  --hashlimit-mode srcip --hashlimit-name hashlimit2 \
  --hashlimit-htable-expire 30000 -j DROP

Now restore save.txt:

$ iptables-restore < save.txt

Now userspace thinks that the value of --hashlimit-above is 300, but it is actually 200 in the kernel. This happens because when we add multiple hashlimit rules with the same name they share the same hashtable internally. The kernel module tries to re-use the old hashtable without updating the values.

There are multiple problems here:
1) We can add two iptables rules with the same name, but the kernel does not handle this well; one procfs file cannot serve two rules.
2) The second rule has no effect because the hashtable holds the values from rule 1.
3) iptables-restore does not work (as described above).

To fix this I have made the following design change:
1) If a second rule is added with the same name as an existing rule, append a number when we create the procfs entry, for example hashlimit_1, hashlimit_2, etc.
2) Two rules will not share the same hashtable unless they are similar in every possible way.
3) This behavior has to be requested with a new userspace flag, --hashlimit-enhanced-procfs; if this flag is not passed we default to the old behavior. This is to make sure we do not break existing scripts that rely on the existing behavior.

Signed-off-by: Vishwanath Pai <vpai@akamai.com>