UDP / TCP Checksum errors from tcpdump & NIC Hardware Offloading

If you’ve ever traced a UDP or TCP stream with tcpdump on Linux, you may have noticed that all, or at least most, of the packets you send are flagged with checksum errors. The cause is checksum offloading on your network card (NIC): tcpdump reads packets from the Linux kernel before the NIC’s chipset has computed the checksum and written it into the packet. That’s why the errors show up only in tcpdump while your network traffic works fine.

So, just to prove my point, here is tcpdump output captured while monitoring DNS traffic (udp/53):


$ sudo tcpdump -i eth0 -vvv -nn udp dst port 53

tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
17:04:48.145904 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto UDP (17), length 61)
10.0.0.2.56497 > 10.0.0.1.53: [bad udp cksum 0x8f54 -> 0xb8fc!] 30234+ AAAA? www.twitter.com. (33)
17:04:48.145925 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto UDP (17), length 61)
10.0.0.2.56497 > 10.0.0.1.53: [bad udp cksum 0x224d -> 0x2604!] 30234+ AAAA? www.twitter.com. (33)
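In each flagged line, the first value is the checksum found in the captured packet and the second is the one tcpdump computed itself. To gauge how widespread the mismatch is in tcpdump's verbose text output, a small helper can count the flagged lines. This is only a rough sketch: the function name `count_bad_cksums` is made up for this example, and the patterns cover just the two message formats that appear on this page (the UDP form above and the TCP `incorrect` form quoted in the comments).

```shell
# count_bad_cksums: hypothetical helper. Reads tcpdump -v text output on stdin
# and prints the number of lines flagged with a checksum mismatch.
count_bad_cksums() {
    # grep -c exits non-zero when the count is 0, so mask that exit status.
    grep -c -E 'bad udp cksum|cksum 0x[0-9a-f]+ \(incorrect' || true
}

# Example (needs root and a live interface):
#   sudo tcpdump -i eth0 -vvv -nn -c 100 udp dst port 53 2>/dev/null | count_bad_cksums
```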

Checking the NIC's active hardware offloading options shows the obvious:


$ sudo ethtool -k eth0 | grep ': on'
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
generic-segmentation-offload: on
generic-receive-offload: on
rx-vlan-offload: on
tx-vlan-offload: on
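A caveat when filtering this list: a bare substring match on `on` also hits feature names such as `tcp-segmentation-offload`, so features that are actually `off` can slip into the result (the first comment below shows exactly that). A minimal sketch of a stricter filter follows. The function name `enabled_offloads` is hypothetical, and it assumes the usual `name: value` layout of `ethtool -k` output.

```shell
# enabled_offloads: hypothetical helper. Reads `ethtool -k` output on stdin and
# prints only the names of features whose value starts with "on".
enabled_offloads() {
    awk -F': ' '$2 ~ /^on/ { sub(/^[ \t]+/, "", $1); print $1 }'
}

# Example (needs a real interface):
#   sudo ethtool -k eth0 | enabled_offloads
```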

After disabling checksum offloading (TCO) for TX/RX on the NIC, the problem is gone:


$ sudo ethtool -K eth0 tx off rx off


$ sudo tcpdump -i eth0 -vvv -nn udp dst port 53

tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
17:06:09.355411 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto UDP (17), length 57)
10.0.0.2.18964 > 10.0.0.1.53: [udp sum ok] 292+ AAAA? twitter.com. (29)
17:06:09.355431 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto UDP (17), length 57)
10.0.0.2.18964 > 10.0.0.1.53: [udp sum ok] 292+ AAAA? twitter.com. (29)

For the sake of performance, remember to turn TCO back on (sudo ethtool -K eth0 tx on rx on) after each tcpdump session. 😉
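One way to avoid forgetting that step is to wrap the capture in a small function that switches the offload off, runs tcpdump, and switches it back on when tcpdump exits. The sketch below is only an illustration: the function name `capture_without_csum_offload` is made up, it has to run as root, and it assumes checksum offloading was enabled before the capture (it does not save and restore the previous state).

```shell
# capture_without_csum_offload: hypothetical wrapper, run as root.
# Usage: capture_without_csum_offload IFACE [tcpdump arguments...]
# Assumption: checksum offloading was on beforehand, so it is simply switched
# back on afterwards instead of being restored from a saved state.
capture_without_csum_offload() {
    iface=$1
    shift
    ethtool -K "$iface" tx off rx off || return 1
    trap : INT                  # keep this shell alive when Ctrl-C stops tcpdump
    tcpdump -i "$iface" "$@"
    status=$?
    trap - INT                  # back to the default Ctrl-C behaviour
    ethtool -K "$iface" tx on rx on
    return "$status"
}
```

Called as `capture_without_csum_offload eth0 -vvv -nn udp dst port 53`, it runs the same capture as above. If the goal is only to silence the warnings, tcpdump's `-K` option (`--dont-verify-checksums`) skips checksum verification without touching the NIC settings.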

If you saved the tcpdump capture to a file and later need to correct the bad checksums, you can do one of the following:


$ sudo tcpreplay -i eth0 -F -w output.cap input.cap

or


$ sudo tcprewrite -i input.cap -o output.cap -C

Edit:
In this excellent article you can see the whole process illustrated as well as the impact of Generic Segmentation Offloading during packet capture.

9 responses to “UDP / TCP Checksum errors from tcpdump & NIC Hardware Offloading”

  1. Hello, in my case, even after setting the NIC (eth1) as you said, I am still getting checksum errors like these:

    root@webtv:/usr/local/WowzaMediaServer/conf# ethtool -K eth1 tx off rx off
    root@webtv:/usr/local/WowzaMediaServer/conf# ethtool -k eth1 | grep on
    tcp-segmentation-offload: off
    udp-fragmentation-offload: off
    generic-segmentation-offload: off
    generic-receive-offload: on
    rx-vlan-offload: on
    tx-vlan-offload: on
    receive-hashing: on
    tcpdump -v -i eth1 | grep -i incorrect

    server.1935 > client_host.50568: Flags [.], cksum 0x1882 (incorrect -> 0x3c65), ack 662617, win 62436, length 0
    server_side.1935 > client_host.50568: Flags [.], cksum 0x1882 (incorrect -> 0x310d), ack 665521, win 62436, length 0
    server_side.1935 > client_host.50568: Flags [.], cksum 0x1882 (incorrect -> 0x2e0c), ack 666290, win 62436, length 0
    server_side.1935 > client_host.50568: Flags [.], cksum 0x1882 (incorrect -> 0x240f), ack 668847, win 62436, length 0
    server_side.1935 > client_host.50568: Flags [.], cksum 0x1882 (incorrect -> 0x18b7), ack 671751, win 62436, length 0

    Could you please help me ?

    TIA,

    E.

  2. Can TCP checksum errors cause timeouts?
    I have terrible timeout problems on my servers.
    My clients send GET requests for plain HTML pages (no code, loops or images, just plain HTML) and get timeouts.
    What may be the problem?
    tcpdump shows TCP checksum errors.
    T.I.A.

    • Technically speaking, TCP checksum errors can degrade the performance of network-based communication. If I were in your position, I would capture traffic on both ends and check the TCP retransmission rate for all types of packets. If you see a lot of retransmissions, you should investigate the reason. It can be a buggy NIC driver, copper/fiber issues, or wrong NIC settings (offloading, buffers, TX/RX negotiation, etc.).

      If you need further help please don’t hesitate to reply back but IMHO an experienced on-site network engineer would be of great help to better investigate and solve the problem.
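      For a first number before setting up captures on both ends, the kernel's own cumulative TCP counters give a rough retransmission rate. Note that this is a different technique from comparing packet captures. The sketch below assumes a Linux host that exposes `/proc/net/snmp`; the function name `tcp_retrans_summary` is made up for this example, and the `InCsumErrors` counter (segments received with a bad checksum) is printed only if the kernel provides it.

```shell
# tcp_retrans_summary: hypothetical helper. Reads /proc/net/snmp content on
# stdin and prints retransmitted segments as a percentage of segments sent.
# The counters are cumulative, so the rate is an average over the uptime.
tcp_retrans_summary() {
    awk '
        # First "Tcp:" line is the header: remember each column position by name.
        $1 == "Tcp:" && !seen { for (i = 2; i <= NF; i++) col[$i] = i; seen = 1; next }
        # Second "Tcp:" line carries the values.
        $1 == "Tcp:" {
            sent = $(col["OutSegs"]); retr = $(col["RetransSegs"])
            pct = (sent > 0) ? 100 * retr / sent : 0
            printf "OutSegs=%s RetransSegs=%s retrans=%.2f%%\n", sent, retr, pct
            if ("InCsumErrors" in col) printf "InCsumErrors=%s\n", $(col["InCsumErrors"])
        }'
}

# Example:
#   tcp_retrans_summary < /proc/net/snmp
```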
