[v10,2/3] NETFILTER module xt_hmark, new target for HASH based fwmark

The target allows you to create rules in the "raw" and "mangle" tables
that set the netfilter mark (nfmark) field to a value within a given range.
First a 32-bit hash value is generated, then it is reduced modulo <limit>,
and finally an offset is added before the result is written to nfmark.
Prior to routing, the nfmark can influence the routing method (see
"Use netfilter MARK value as routing key") and can also be used by
other subsystems to change their behavior.

man page
   HMARK
       This module does the same as MARK, i.e. it sets an fwmark, but the mark
       is based on a hash value.  The hash is based on saddr, daddr, sport,
       dport and proto. The same mark will be produced independent of direction
       if no masks are set or if the same masks are used for src and dest.
       The hash can be reduced by a modulus and finally an offset can
       be added, i.e. the final mark will be within a range. For an ICMP error,
       the original message is used for the hash calculation, not the ICMP
       packet itself.
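
The direction independence described above can be illustrated with a small sketch. It assumes the simplest way to get that property, namely ordering the two addresses and the two ports before hashing. The struct flow type, the mix() step and flow_hash_symmetric() are hypothetical names for illustration only; the mixing function is a toy and is not the hash used by xt_HMARK.c.

```c
#include <stdint.h>

/* Hypothetical flow key; field names follow the man page text. */
struct flow {
	uint32_t saddr, daddr;
	uint16_t sport, dport;
	uint8_t  proto;
};

/* Toy mixing step, for illustration only (not the module's hash). */
static uint32_t mix(uint32_t h, uint32_t v)
{
	h ^= v;
	h *= 0x9e3779b1u;
	return h ^ (h >> 15);
}

/*
 * Order addresses and ports so that swapping source and destination
 * gives the same input to the hash, and therefore the same result.
 */
static uint32_t flow_hash_symmetric(const struct flow *f, uint32_t seed)
{
	uint32_t a_lo = f->saddr < f->daddr ? f->saddr : f->daddr;
	uint32_t a_hi = f->saddr < f->daddr ? f->daddr : f->saddr;
	uint16_t p_lo = f->sport < f->dport ? f->sport : f->dport;
	uint16_t p_hi = f->sport < f->dport ? f->dport : f->sport;
	uint32_t h = seed;

	h = mix(h, a_lo);
	h = mix(h, a_hi);
	h = mix(h, ((uint32_t)p_hi << 16) | p_lo);
	h = mix(h, f->proto);
	return h;
}
```

With this ordering, a packet from 10.0.0.1:1234 to 10.0.0.2:80 and the reply from 10.0.0.2:80 to 10.0.0.1:1234 hash to the same value. If different masks were applied to src and dest before this step, the two directions would no longer present the same values and the symmetry would be lost.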

       Note: IPv4 packets will be defragmented before they reach hmark if
             nf_defrag_ipv4 is loaded. IPv6 nf_defrag is not implemented this
             way, hence fragmented IPv6 packets will reach hmark.
             The default behavior is to completely ignore any fragment that reaches hmark.
             --hmark-method L3 is fragment safe since neither ports nor the L4 protocol field is used.
             None of the parameters affect the packet itself, only the calculated hash value.

       Parameters: Shorthand methods

       --hmark-method L3
              Do not use the L4 protocol field, ports or spi, only Layer 3 addresses.
              The mask lengths of the L3 addresses can still be used. Whether a
              packet is a fragment or not does not matter in this case, since only
              L3 addresses are used in the calculation of the hash value.

       --hmark-method L3-4 (Default)
              Include L4 in the calculation of the hash value, i.e. all masks below are valid.
              Fragments will be ignored (i.e. no hash value is produced).

       For all masks the default is all ones; to disable a field use mask 0.

       --hmark-src-mask length
              The length of the mask to AND the source address with (saddr & value).

       --hmark-dst-mask length
              The length of the mask to AND the dest. address with (daddr & value).

       --hmark-sport-mask value
              A 16 bit value to AND the src port with (sport & value).

       --hmark-dport-mask value
              A 16 bit value to AND the dest port with (dport & value).

       --hmark-sport-set value
              A 16 bit value to OR the src port with (sport | value).

       --hmark-dport-set value
              A 16 bit value to OR the dest port with (dport | value).
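
A minimal sketch of how the port mask and set options described above combine. It assumes the AND is applied before the OR, which the text does not state explicitly; hmark_port() is a hypothetical helper name and is not taken from xt_HMARK.c.

```c
#include <stdint.h>

/*
 * Hypothetical helper: apply --hmark-sport-mask / --hmark-dport-mask (AND)
 * and then --hmark-sport-set / --hmark-dport-set (OR) to one port value.
 * Assumption: the mask is applied first, then the set bits are OR:ed in.
 */
static uint16_t hmark_port(uint16_t port, uint16_t mask, uint16_t set)
{
	return (uint16_t)((port & mask) | set);
}
```

With mask 0 and set 0 every port contributes 0 to the hash, which is how a port field is disabled; with mask 0xff00 and set 0x0001, port 0x1234 contributes 0x1201.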

       --hmark-spi-mask value
              Value to AND the spi field with (spi & value); valid for proto ESP or AH.

       --hmark-spi-set value
              Value to OR the spi field with (spi | value); valid for proto ESP or AH.

       --hmark-proto-mask value
              An 8 bit value to AND the L4 proto field with (proto & value).

       --hmark-ct
              When this flag is set, conntrack data is used. Useful when the NAT
              internal addresses should be used in the calculation. Be careful when
              using DNAT, since the mangle table is handled before the nat table,
              i.e. putting HMARK in the mangle table's PREROUTING chain will not work
              as expected: the initial packet will have its hash based on the
              original address, while the rest of the flow will use the NATed address.

       --hmark-rnd value
              A 32 bit initial value for the hash calculation, default is 0xc175a3b8.

       Final processing of the mark in order of execution.

       --hmark-mod value (must be > 0)
              The easiest way to describe this is:  hash = hash mod <value>

       --hmark-offset value
              The easiest way to describe this is:  hash = hash + <value>
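
The two final steps above can be sketched as follows, in the stated order of execution (modulus first, then offset). hmark_finalize() is a hypothetical name for illustration and is not the function used in xt_HMARK.c.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: map a 32 bit hash into the range
 * [offset, offset + mod - 1], as --hmark-mod followed by
 * --hmark-offset is described to do.
 */
static uint32_t hmark_finalize(uint32_t hash, uint32_t mod, uint32_t offset)
{
	assert(mod > 0);	/* --hmark-mod must be > 0 */
	return hash % mod + offset;
}
```

For example, with mod 20 and offset 100 every hash value ends up in the range 100-119.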

       Examples:

       Default rule, handles all TCP, UDP, SCTP, ESP & AH

              iptables -t mangle -A PREROUTING -m state --state NEW,ESTABLISHED,RELATED \
               -j HMARK --hmark-offset 10000 --hmark-mod 10

       Handle SCTP, hash on dest port only and produce an nfmark between 100-119.

              iptables -t mangle -A PREROUTING -p SCTP -j HMARK --hmark-src-mask 0 --hmark-dst-mask 0 \
               --hmark-sport-mask 0 --hmark-offset 100 --hmark-mod 20

       Fragment safe, Layer 3 only, keeps the flows of a class C (/24) network together

              iptables -t mangle -A PREROUTING -j HMARK --hmark-method L3 --hmark-src-mask 24 \
               --hmark-mod 20 --hmark-offset 100

Rev 10
      Even more simplified NAT handling, just one switch: --hmark-ct
      Some renaming and some minor changes.
      Changes are based on Pablo's review.

Rev 9
      Simplified NAT selections, cleanup of comments, added checkentry()
      Change of #ifdef to #if IS_ENABLED and dependency.
      Some minor formatting.
      Most changes are based on Pablo's review.

Rev 8
      method L3 / L3-4 added, i.e. fragment handling changed so that
      fragments are not handled in "method L3-4"
      Syntax change in user mode, more NF compatible.
      Most changes are based on Pablo's review.

Rev 7
      IPv6 descending into the ICMP error hdr didn't work as expected
      with ipv6_find_hdr(). Now it works as expected.

Rev 6
      Compile options with or without conntrack fixed.
      __ipv6_find_hdr() replaced by ipv6_find_hdr()

Rev 5
      IPv6 rewritten, uses __ipv6_find_hdr() (P. McHardy)
      Full mask and address used for IPv6 smask and dmask (J. Engelhardt)
      Changes due to comments by Pablo Neira Ayuso and Eric Dumazet,
      i.e. use of skb_header_pointer() and NULL check of info->hmod
      Man page changes

Rev 4
      Different targets for IPv4 and IPv6
      Changes based on review by Pablo.

Rev 3
      Support added to SCTP for IPv6

Rev 2
      IPv6 header scan changed to follow RFC 2460
      Fragmented IPv4 ICMP echo now uses proto, as IPv6 does
      IPv6 pskb_may_pull() check is done every time in the header loop.
      IPv4 NAT support added.
      Default case added in the IPv6 loop, and NULL check of hp

Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
---
 include/linux/netfilter/xt_HMARK.h |   62 +++++++
 net/netfilter/Kconfig              |   18 ++
 net/netfilter/Makefile             |    1 +
 net/netfilter/xt_HMARK.c           |  319 ++++++++++++++++++++++++++++++++++++
 4 files changed, 400 insertions(+), 0 deletions(-)
 create mode 100644 include/linux/netfilter/xt_HMARK.h
 create mode 100644 net/netfilter/xt_HMARK.c
