My recent look into ufw and kernel modules has considerably increased my understanding of netfilter and iptables - including aspects like
the netfilter hooks
the built in chains
the different tables
connection tracking
kernel helper modules
As a result I began to see some weaknesses in my iptables scripts and rule sets, and to see that my rulesets were wasting resources such as processor time and memory.
So this web page revisits netfilter and iptables.
NB - this webpage only considers IPv4 - netfilter for IPv6 may have differences.
And this webpage is based on iptables. Although iptables still interacts with netfilter, ongoing development of netfilter and the kernel means that iptables is getting left behind - I should really be looking at nftables now, the newer userspace tool for programming netfilter.
Early versions of firewalld were based on iptables - newer versions are based on nftables.
My recent look at ufw showed that ufw is still based on iptables, even on a machine running kernel version 6.1.
I guess the best place to start is with the kernel - and how netfilter interacts with the data packets travelling through the kernel.
Most websites describing netfilter list five "hooks" - or places where the kernel interacts with a data packet -
NF_IP_PRE_ROUTING - all incoming packets will go through this hook - both packets for the local machine and packets that are to be forwarded on to another network
NF_IP_LOCAL_IN - all incoming packets for the local machine go through this hook - so this hook is after the routing has been done to separate out the packets for the local machine and the packets to be forwarded on
NF_IP_FORWARD - the packets that are to be forwarded on go through this hook - so this hook is after the routing has been done to separate out the packets for the local machine and the packets to be forwarded on
NF_IP_LOCAL_OUT - all outgoing packets coming from the local machine go through this hook
NF_IP_POST_ROUTING - all outgoing packets coming from both the local machine and forwarding go through this hook
So the packet flow through netfilter for packets going from the local applications and processes out from the machine is -
local applications and processes
↓
routing
↓
NF_IP_LOCAL_OUT
↓
NF_IP_POST_ROUTING
↓
network interface
And the packet flow through netfilter for packets coming into the machine heading for the local applications and processes is -
network interface
↓
NF_IP_PRE_ROUTING
↓
routing
↓
NF_IP_LOCAL_IN
↓
local applications and processes
The packet flow through netfilter for packets that are being forwarded through the firewall is -
network interface
↓
NF_IP_PRE_ROUTING
↓
routing
↓
NF_IP_FORWARD
↓
NF_IP_POST_ROUTING
↓
network interface
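As a rough sketch of the three flows above, the choice of hooks can be written as a small Python function. The function name and the "local" / "remote" labels are my own invention for illustration - nothing here talks to the kernel, and traffic from the machine to itself over loopback is not covered.

```python
# Toy model of the three packet paths listed above. It only strings hook
# names together; it does not inspect or touch real packets.
PRE_ROUTING = "NF_IP_PRE_ROUTING"
LOCAL_IN = "NF_IP_LOCAL_IN"
FORWARD = "NF_IP_FORWARD"
LOCAL_OUT = "NF_IP_LOCAL_OUT"
POST_ROUTING = "NF_IP_POST_ROUTING"


def hooks_for(origin, destination):
    """Return the hooks a packet passes, in order.

    origin and destination are each "local" (this machine) or "remote"
    (anything reached through a network interface). These labels are
    hypothetical, chosen only for this sketch.
    """
    if origin == "local":
        # Locally generated traffic: routing, then LOCAL_OUT, then POST_ROUTING.
        return [LOCAL_OUT, POST_ROUTING]
    if destination == "local":
        # Incoming traffic that routing hands to a local process.
        return [PRE_ROUTING, LOCAL_IN]
    # Incoming traffic that routing sends on to another network.
    return [PRE_ROUTING, FORWARD, POST_ROUTING]


if __name__ == "__main__":
    cases = [("local", "remote"), ("remote", "local"), ("remote", "remote")]
    for origin, destination in cases:
        path = " -> ".join(hooks_for(origin, destination))
        print(f"{origin} to {destination}: {path}")
```

Note that every incoming packet passes NF_IP_PRE_ROUTING and every departing packet passes NF_IP_POST_ROUTING - only the middle hook depends on the routing decision.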
As well as the above five hooks, a very few websites list another two - the ingress hook and the egress hook.
I think that the ingress hook was added in kernel version 4.2 - it doesn't behave like the above five hooks, as it appears to be attached to a network interface. I am not sure, but I don't think that the userspace program iptables knows anything about it - I think you would have to be using nftables instead of iptables.
The ingress hook allows for some kinds of filtering to be done even before the pre-routing hook - but it is before any defragmentation is done on incoming packets, so this filtering will not work on some fragmented packets.
An iptables chain is a collection of rules - and typically the rules in each chain will have some kind of common factor.
Within each chain netfilter will try each rule in turn to see if the packet matches that rule - if the packet matches a rule then some kind of processing is done on the packet.
If the packet doesn't match the rule, the next rule is tried, and so on until all the rules have been tried - if there is still no match the packet moves on to the next chain.
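The rule-by-rule matching described above can be sketched in a few lines of Python. This is only a toy model - the Rule class and the dictionary standing in for a packet are made up for illustration, and it ignores targets such as LOG that do not end the processing of a packet.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Rule:
    matches: Callable[[dict], bool]  # test applied to a packet (modelled as a dict)
    target: str                      # what to do on a match, e.g. "ACCEPT" or "DROP"


def run_chain(rules, packet) -> Optional[str]:
    """Try each rule in turn; return the target of the first rule that matches.

    Returns None when no rule matched, meaning the packet moves on.
    """
    for rule in rules:
        if rule.matches(packet):
            return rule.target
    return None


if __name__ == "__main__":
    chain = [
        Rule(lambda p: p["proto"] == "tcp" and p["dport"] == 22, "ACCEPT"),
        Rule(lambda p: p["proto"] == "icmp", "DROP"),
    ]
    for packet in (
        {"proto": "tcp", "dport": 22},
        {"proto": "icmp"},
        {"proto": "udp", "dport": 53},
    ):
        print(packet, "->", run_chain(chain, packet) or "no match, moves on")
```

The order of the rules matters - a packet never reaches the rules that come after its first match, so frequently matched rules are cheaper when they sit near the top of the chain.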
The userspace program iptables allows new chains to be created using a command line instruction of the form
iptables -N <chain-name>
----------------------
The default setup of iptables is that there are several built-in chains - it appears that the majority of websites describing iptables discuss three built in chains -
INPUT - for packets coming into the machine for applications and processes on the machine
FORWARD - for packets coming into the machine that are going to be routed through the machine
OUTPUT - for packets coming from the applications and processes on the machine
But there are actually five kinds of built-in chain - and each one corresponds to one of the five hooks above
PREROUTING chain - NF_IP_PRE_ROUTING hook
INPUT chain - NF_IP_LOCAL_IN hook
FORWARD chain - NF_IP_FORWARD hook
OUTPUT chain - NF_IP_LOCAL_OUT hook
POSTROUTING chain - NF_IP_POST_ROUTING hook
Note that although these chains are built into iptables by default, none of them contain any rules - there are no default rules - all rules have to be set up.
Within iptables, a table is a group of chains - there are five tables built in by default -
RAW table - includes PREROUTING + OUTPUT chains
MANGLE table - includes PREROUTING + INPUT + OUTPUT + FORWARD + POSTROUTING chains
NAT table - includes PREROUTING + OUTPUT + POSTROUTING chains
FILTER table - includes INPUT + OUTPUT + FORWARD chains
SECURITY table - includes INPUT + OUTPUT + FORWARD chains
NB - the entry above for the NAT table is the information obtained from lots of websites - but on one of my Linux installations the file
/etc/iptables/rules.v4
shows that the NAT table contains four chains - ie -
PREROUTING + INPUT + OUTPUT + POSTROUTING chains
The iptables man page also lists INPUT among the built-in chains of the NAT table, so the three-chain version seems to describe older kernels - either way it means that there are 16 or 17 chains which iptables creates by default ( 2 + 5 + 3 or 4 + 3 + 3 ).
Each of the above chains in each of the above tables can have its own policy of ACCEPT or DROP - the default policy on all these chains in all the tables is ACCEPT - so a default netfilter setup with no rules or policy changes is wide open.
Each of these sixteen policies ( or seventeen ) can be set individually.
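As a quick check on the numbers, here is a small Python tally of the built-in chains taken straight from the table list above. The function names are just for illustration. Counting the chains gives 16 with the three-chain version of the NAT table and 17 with the four-chain version.

```python
def builtin_chains(nat_has_input):
    """Built-in chains per table, copied from the list above.

    nat_has_input selects the four-chain version of the NAT table
    (with INPUT) instead of the three-chain version.
    """
    nat = ["PREROUTING", "OUTPUT", "POSTROUTING"]
    if nat_has_input:
        nat.insert(1, "INPUT")
    return {
        "RAW": ["PREROUTING", "OUTPUT"],
        "MANGLE": ["PREROUTING", "INPUT", "OUTPUT", "FORWARD", "POSTROUTING"],
        "NAT": nat,
        "FILTER": ["INPUT", "OUTPUT", "FORWARD"],
        "SECURITY": ["INPUT", "OUTPUT", "FORWARD"],
    }


def count_chains(tables):
    """Total number of built-in chains across all tables."""
    return sum(len(chains) for chains in tables.values())


def tables_using(chain, tables):
    """Names of the tables that contain the given built-in chain."""
    return [name for name, chains in tables.items() if chain in chains]


if __name__ == "__main__":
    for nat_has_input in (False, True):
        tables = builtin_chains(nat_has_input)
        print("NAT with INPUT:", nat_has_input, "- chains:", count_chains(tables))
    print("PREROUTING appears in:", tables_using("PREROUTING", builtin_chains(True)))
```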
Here is a bit more information about each of the tables -
The RAW table is the first table that a packet will meet after coming in through a network interface - conventional thinking says that the RAW table has a narrow and specific function - it marks packets so that they are not connection tracked. However that statement hides the fact that the RAW table has a unique asset - it is the only table that comes before the connection tracking hook - more on connection tracking further down this web page.
The MANGLE table is designed to provide a way to modify packet headers.
The NAT table is designed to allow for network address translation.
The FILTER table is where the bulk of packet filtering is commonly done - by default if a rule is created through the command line or a shell script then iptables will put that rule in the FILTER table unless the command explicitly states that the rule should go in one of the other tables. Most websites about writing rules for packet filtering within netfilter assume that the rule should go in the FILTER table.
The SECURITY table is a bit different from the other tables - it is a sort of add-on table within netfilter that is used by security modules such as SELinux - many distributions of Linux don't include it - I don't really discuss it much within this web page.
The other four tables are essentially built-in parts of an iptables firewall.
So going back to the previously shown flow charts - we can now produce flow charts that show the tables and chains rather than the netfilter hooks.
The packet flow through netfilter for packets going from the local applications and processes out from the machine is -
local applications and processes
↓
routing
↓
RAW table - OUTPUT chain
↓
MANGLE table - OUTPUT chain
↓
NAT table - OUTPUT chain
↓
FILTER table - OUTPUT chain
↓
MANGLE table - POSTROUTING chain
↓
NAT table - POSTROUTING chain
↓
network interface
And the packet flow through netfilter for packets coming into the machine heading for the local applications and processes is -
network interface
↓
RAW table - PREROUTING chain
↓
MANGLE table - PREROUTING chain
↓
NAT table - PREROUTING chain
↓
routing
↓
MANGLE table - INPUT chain
↓
FILTER table - INPUT chain
↓
local applications and processes
The packet flow through netfilter for packets that are being forwarded through the firewall is -
network interface
↓
RAW table - PREROUTING chain
↓
MANGLE table - PREROUTING chain
↓
NAT table - PREROUTING chain
↓
routing
↓
MANGLE table - FORWARD chain
↓
FILTER table - FORWARD chain
↓
MANGLE table - POSTROUTING chain
↓
NAT table - POSTROUTING chain
↓
network interface
These flow charts certainly show how large the default framework of tables and chains created by iptables is.
Now I can absolutely see why chains exist - it is the chains that marry rulesets to the netfilter hooks - but I have difficulty in understanding why tables exist - the different chains in each table have little relationship to each other, they don't share rules, and they are used in different parts of the netfilter stack.
The existence of tables makes it more difficult to understand the correlation between the chains and the netfilter hooks - which is the important thing.
Also worth noting that most websites that provide instruction on creating an iptables based firewall don't really discuss tables - by default if you write a rule without specifying a table then that rule will go in the FILTER table - and you can create effective firewalls without specifying tables at all.
Connection tracking - known as "conntrack" in netfilter - is an essential part of stateful packet firewalls.
In essence - if an application or process on a machine sends out a data packet, the envelope details of the packet are recorded - details such as IP addresses, port numbers, and protocol.
Then when a packet arrives at the network interface of the machine, the details of its envelope are compared to the list of sent out packets - if the incoming packet is a reply to an outgoing packet then the envelope details match, and the firewall allows the packet in.
If the envelope details of the incoming packet don't match any of the outgoing packets on the list - then the incoming packet is dropped.
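The idea can be sketched in Python as a set of recorded envelopes - a reply is simply a packet whose source and destination are the recorded ones swapped around. The Flow and TrackingList names are made up for this sketch, the addresses are documentation addresses, and the real conntrack list holds far more than this (state, timeouts and so on).

```python
from typing import NamedTuple


class Flow(NamedTuple):
    """The envelope details of a packet."""
    proto: str
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int

    def reply(self):
        """The envelope a reply would carry: source and destination swapped."""
        return Flow(self.proto, self.dst_ip, self.dst_port, self.src_ip, self.src_port)


class TrackingList:
    """Toy tracking list: remembers what was sent, admits only replies."""

    def __init__(self):
        self._sent = set()

    def record_outgoing(self, flow):
        self._sent.add(flow)

    def allows_incoming(self, flow):
        # Swapping the incoming envelope back round must give something we sent.
        return flow.reply() in self._sent


if __name__ == "__main__":
    tracking = TrackingList()
    tracking.record_outgoing(Flow("tcp", "192.0.2.10", 40000, "198.51.100.20", 443))

    answer = Flow("tcp", "198.51.100.20", 443, "192.0.2.10", 40000)
    stranger = Flow("tcp", "203.0.113.5", 443, "192.0.2.10", 40000)
    print("reply allowed:", tracking.allows_incoming(answer))
    print("unsolicited allowed:", tracking.allows_incoming(stranger))
```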
Connection tracking and stateful packet firewalls are a great invention - they are not infallible, they can be hacked, but in general terms a stateful firewall can provide a much higher level of security than a stateless firewall - ie - a firewall that doesn't do connection tracking.
----------------------
Just for clarity - the word "conntrack" is used in three different ways, with different meanings -
1 - the word "conntrack" is used to describe the mechanism within the Linux kernel and netfilter that tracks ( or doesn't track ) every packet passing through the kernel and records it in a list
2 - the word "conntrack" is used in iptables rules to cause netfilter to examine the above list to see if a packet matches or corresponds to an existing entry in the list
3 - "conntrack" is the name of a user space command line application that allows an administrator to examine and modify the above list
----------------------
As stated above, in essence, connection tracking is quite a simple concept - record outgoing packets in a list - compare incoming packets to the entries in the list - allow in packets that match an entry on the list - drop packets that don't have a match.
However the more I dug into it the more complicated it got - some websites describe the existence of two places within netfilter where packets are examined by conntrack - other websites describe three or four places.
This may be due to the version of the kernel - as said above, netfilter is continually evolving within the ongoing development of the kernel - and iptables is getting left behind.
----------------------
The connection tracking mechanism within the kernel is provided by one or more kernel modules - and it operates independently from iptables - however it only starts to operate once there is an iptables rule that requires the kernel to consult the tracking list.
----------------------
As far as iptables is concerned, it appears that there are two places within the netfilter stack where conntrack intercepts the packet flow through netfilter - these conntrack hooks lie between -
the RAW table / OUTPUT chain and the MANGLE table / OUTPUT chain
and
the RAW table / PREROUTING chain and the MANGLE table / PREROUTING chain
The first placement records and tracks the packets outgoing from the local applications and processes - the second placement examines the packets coming into the machine, and compares them with the entries in the tracking list.
So now we can extend the flow charts shown above to include the placements of the conntrack hooks.
The packet flow through netfilter for packets going from the local applications and processes out from the machine is -
local applications and processes
↓
routing
↓
RAW table - OUTPUT chain
↓
connection tracking
↓
MANGLE table - OUTPUT chain
↓
NAT table - OUTPUT chain
↓
FILTER table - OUTPUT chain
↓
MANGLE table - POSTROUTING chain
↓
NAT table - POSTROUTING chain
↓
network interface
The packet flow through netfilter for packets coming into the machine heading for the local applications and processes is -
network interface
↓
RAW table - PREROUTING chain
↓
connection tracking
↓
MANGLE table - PREROUTING chain
↓
NAT table - PREROUTING chain
↓
routing
↓
MANGLE table - INPUT chain
↓
FILTER table - INPUT chain
↓
local applications and processes
The packet flow through netfilter for packets that are being forwarded through the firewall is now -
network interface
↓
RAW table - PREROUTING chain
↓
connection tracking
↓
MANGLE table - PREROUTING chain
↓
NAT table - PREROUTING chain
↓
routing
↓
MANGLE table - FORWARD chain
↓
FILTER table - FORWARD chain
↓
MANGLE table - POSTROUTING chain
↓
NAT table - POSTROUTING chain
↓
network interface
Note that if an incoming packet does not match any entry on the tracking list, then the conntrack mechanism adds the details of that packet as a new entry on the tracking list - this is done automatically by the tracking mechanism quite independently from any action performed on the packet as a result of a rule within iptables - so even if a rule says the packet should be dropped, a new entry is created.
One of the results of this is that if a machine is attacked by the likes of DoS, DDoS, SYN floods, christmas tree attacks, nmap portscans - then for every one of these attacking packets a new entry is made in the tracking list and very quickly the tracking list will reach its maximum capacity - connection tracking will cease, and both wanted and unwanted packets will get lost.
All of this tracking also consumes a lot of CPU processing time and memory.
The only way to change this behaviour is a rule in the PREROUTING chain in the RAW table - ie - a rule that sees the packet before the connection tracking hook.
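Here is a toy Python model of that behaviour, following the description above rather than the kernel's real data structures - the class, the function names, the capacity of 100 and the packet tuples are all invented for illustration. In iptables itself, as far as I know, the rule in the RAW table would either DROP the packet or use the CT target with its --notrack option ( or the older NOTRACK target ).

```python
class ConntrackTable:
    """Toy tracking list with a fixed capacity."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = set()

    def track(self, flow):
        """Return True if the flow is (now) on the list, False if the list is full."""
        if flow in self.entries:
            return True
        if len(self.entries) >= self.capacity:
            return False
        self.entries.add(flow)
        return True


def prerouting(packet, table, raw_rules=()):
    """Run the RAW table rules first; only packets they let through reach tracking.

    Each raw rule is a function returning True when the packet should be
    dropped before connection tracking (a hypothetical stand-in for a rule).
    """
    if any(rule(packet) for rule in raw_rules):
        return "dropped in raw"
    return "tracked" if table.track(packet) else "list full"


def simulate(raw_rules):
    """Flood the machine, then send one wanted packet and report what happens to it."""
    table = ConntrackTable(capacity=100)
    for source_port in range(1000):
        # Packets are (source address, source port, destination port).
        prerouting(("203.0.113.9", source_port, 23), table, raw_rules)
    wanted = ("198.51.100.7", 51000, 443)
    return prerouting(wanted, table, raw_rules)


if __name__ == "__main__":
    print("no raw rule:", simulate(raw_rules=()))
    print("raw rule dropping port 23:", simulate(raw_rules=[lambda p: p[2] == 23]))
```

In this model the flood uses up every slot on the list unless the RAW rule removes the packets first - which is the point of putting such a rule ahead of connection tracking.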
----------------------
Also note that connection tracking doesn't care about the function of the packet or the state of the connection that the packet is part of.
It doesn't care whether the packet is part of an established TCP connection or whether it is a new TCP connection.
It doesn't care if the packet is part of a connectionless flow such as UDP or ICMP.
It is just a packet.
Unfortunately the flow charts are still not complete - because we have to consider fragmentation.
And fragmented packets have to be defragmented before iptables can analyse them.
So we need to add defragmentation to the flow charts - defragmentation is done right after the network interface driver.
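To see why the fragments have to be put back together first, here is a small Python sketch - the Fragment class and the byte offsets are simplified inventions ( real IP fragment offsets are counted in units of 8 bytes ). Only the first fragment carries the start of the TCP or UDP header, so a rule that matches on a port number has nothing sensible to look at in the later fragments.

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    ident: int    # identification value shared by all fragments of one packet
    offset: int   # where this fragment's data sits in the original payload (bytes)
    data: bytes


def dest_port(payload: bytes) -> int:
    """Read the destination port, bytes 2-3 of a TCP or UDP header."""
    return int.from_bytes(payload[2:4], "big")


def fragment(ident, payload, size):
    """Split a payload into fragments of at most `size` bytes."""
    return [
        Fragment(ident, start, payload[start:start + size])
        for start in range(0, len(payload), size)
    ]


def reassemble(fragments):
    """Put the fragments back in offset order and join their data."""
    ordered = sorted(fragments, key=lambda f: f.offset)
    return b"".join(f.data for f in ordered)


if __name__ == "__main__":
    # Source port 40000, destination port 443, then the rest of the segment.
    payload = (40000).to_bytes(2, "big") + (443).to_bytes(2, "big") + b"rest of header and data"
    first, *later = fragment(1, payload, size=8)

    print("first fragment port:", dest_port(first.data))
    print("second fragment 'port':", dest_port(later[0].data))
    print("after reassembly:", dest_port(reassemble(later + [first])))
```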
Fragmentation is not an issue for locally derived outgoing packets, so the packet flow through netfilter for packets going from the local applications and processes out from the machine is still -
local applications and processes
↓
routing
↓
RAW table - OUTPUT chain
↓
connection tracking
↓
MANGLE table - OUTPUT chain
↓
NAT table - OUTPUT chain
↓
FILTER table - OUTPUT chain
↓
MANGLE table - POSTROUTING chain
↓
NAT table - POSTROUTING chain
↓
network interface
But the packet flow through netfilter for packets coming into the machine heading for the local applications and processes is now -
network interface
↓
defragmentation
↓
RAW table - PREROUTING chain
↓
connection tracking
↓
MANGLE table - PREROUTING chain
↓
NAT table - PREROUTING chain
↓
routing
↓
MANGLE table - INPUT chain
↓
FILTER table - INPUT chain
↓
local applications and processes
The packet flow through netfilter for packets that are being forwarded through the firewall is now -
network interface
↓
defragmentation
↓
RAW table - PREROUTING chain
↓
connection tracking
↓
MANGLE table - PREROUTING chain
↓
NAT table - PREROUTING chain
↓
routing
↓
MANGLE table - FORWARD chain
↓
FILTER table - FORWARD chain
↓
MANGLE table - POSTROUTING chain
↓
NAT table - POSTROUTING chain
↓
network interface
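The order of the stages within each of these flow charts is not arbitrary - each table, and connection tracking, and defragmentation, registers itself on a hook with a priority number, and lower numbers run first. The Python sketch below rebuilds the three flow charts from those numbers. The priority values are as I recall them from the kernel header netfilter_ipv4.h, so treat them as an assumption - the sketch also only covers the stages shown in the flow charts above, and the names in it are my own.

```python
# Priority of each stage on each chain it is attached to - lower runs first.
# Values as I recall them from netfilter_ipv4.h; treat them as an assumption.
REGISTRATIONS = {
    "FILTER table": {"INPUT": 0, "FORWARD": 0, "OUTPUT": 0},
    "NAT table": {"PREROUTING": -100, "OUTPUT": -100, "POSTROUTING": 100},
    "MANGLE table": dict.fromkeys(
        ["PREROUTING", "INPUT", "FORWARD", "OUTPUT", "POSTROUTING"], -150
    ),
    "RAW table": {"PREROUTING": -300, "OUTPUT": -300},
    "connection tracking": {"PREROUTING": -200, "OUTPUT": -200},
    "defragmentation": {"PREROUTING": -400},
}

# The chains each kind of packet visits, with the routing step marked.
PATHS = {
    "outgoing": ["routing", "OUTPUT", "POSTROUTING"],
    "incoming": ["PREROUTING", "routing", "INPUT"],
    "forwarded": ["PREROUTING", "routing", "FORWARD", "POSTROUTING"],
}


def stages_at(chain):
    """Everything attached to one chain, sorted by priority."""
    hits = sorted(
        (priorities[chain], name)
        for name, priorities in REGISTRATIONS.items()
        if chain in priorities
    )
    return [
        f"{name} - {chain} chain" if name.endswith("table") else name
        for _, name in hits
    ]


def flow(path):
    """The full list of stages for one of the three packet paths."""
    steps = []
    for item in PATHS[path]:
        if item == "routing":
            steps.append(item)
        else:
            steps.extend(stages_at(item))
    return steps


if __name__ == "__main__":
    for path in PATHS:
        print(path + ":")
        for step in flow(path):
            print("  " + step)
```

Seen this way the RAW table comes before connection tracking simply because its priority number is lower - and the NAT table sits early on PREROUTING and OUTPUT but late on POSTROUTING because it uses a different priority on each.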
It is difficult to come to any conclusion other than that iptables is really rather bloated - and as a result there is little documentation about using all of it - you can build effective firewalls using less than a quarter of its default framework -
- out of the four main tables ( leaving aside the SECURITY table ) only one is usually used - the FILTER table
- out of 16 or 17 built-in chains only the INPUT, OUTPUT, and FORWARD chains in the FILTER table are commonly used
- the hidden asset of the RAW table coming before connection tracking is never mentioned in instructional websites
- I haven't seen any instructional website talking about the use of the MANGLE table
Interesting to note that nftables - the successor to iptables - doesn't create any kind of default framework of tables and chains - in nftables any required table or chain has to be specifically created, and the chains have to be married to a specific netfilter hook.
It was kernel version 2.4 ( released in 2001 ) that first produced iptables - and it was kernel version 3.13 ( released in 2014 ) that produced nftables.
Was that a tacit admission that the table and chain framework created by iptables is a bit excessive?
We are now up to kernel version 6.6 and nftables is still going strong.
Worth noting that a basic stateful firewall for a workstation on IPv4 can be made with nftables using
one table
one chain
two rules
Looks like iptables is long past its sell-by date - time to say goodbye.