batman-adv and TCP problem - SOLVED

Message ID d9bba8670910070812s16f9a1bne9c3ab97d61bf6fd@mail.gmail.com (mailing list archive)
State Accepted, archived

Commit Message

a Oct. 7, 2009, 3:12 p.m. UTC
  Dear all,

I have solved my problem.

It is related to the TCP checksum offload function: my embedded platform,
the PC Engines ALIX 3d2, does not support this function (and qemu virtual
machines do not support it either).

This patch disables the tx checksum offload, and everything now works fine for
me!



I’m not a kernel expert, so please let me know what you think about this
solution/workaround.

andrea


On Tue, Oct 6, 2009 at 6:31 PM, a <a.mailevent@gmail.com> wrote:

> Dear all,
>
> analyzing the problem in depth, I have seen that the cause is the checksum:
> as you can see in my previous .cap file, the packets with data from
> 192.168.100.2 have a wrong checksum (and my nodes are not using the HW
> checksum offload function).
> So I’m starting to investigate how the checksum is calculated. If you
> have any suggestions or comments, they are welcome!
>
> andrea
>
>
>
>
>
> On Tue, Oct 6, 2009 at 3:14 PM, a <a.mailevent@gmail.com> wrote:
>
>> Dear Marek,
>>
>> On Tue, Oct 6, 2009 at 2:55 PM, Marek Lindner <lindner_marek@yahoo.de> wrote:
>>
>>>
>>> Hi,
>>>
>>> > you can find as attachment the dump on eth2 of GW
>>> > (tcpdump -ni eth2 -s 0 -w gw.cap);
>>> > the output of batctl td -p 4 eth1 is:
>>>
>>> I could not find anything revealing in the logs you provided. Could you
>>> please follow Sven's suggestion to log both ends as well ?
>>>
>> I could log on every interface; I will do it and send the .cap files.
>>
>>
>>> Just to not forget the obvious:
>>> * What batman-adv version are you running ?
>>>
>> revision 1439
>>
>>
>>> * GW1 routes the packets - does this work via NAT or do you manually add
>>> routing entries to both ends ?
>>>
>> GW1 routes packets, without NAT
>>
>>
>>>
>>>
>>> > A question: am I the first one with this problem?
>>>
>>> AFAIK batman-adv has no problem transporting TCP traffic (unless you found
>>> an undiscovered bug nobody has seen before). The most common source of
>>> trouble is the configuration of the setup, in particular MTU settings or
>>> routing issues (which is why most people simply bridge).
>>>
>>
>> I understand your point about batman-adv; for me too, everything works
>> fine among batman nodes.
>> I will also test the system with a bridge on the GW.
>>
>>
>> andrea
>>
>>>
>>> Regards,
>>> Marek
  

Comments

Marek Lindner Oct. 8, 2009, 1:33 a.m. UTC | #1
Hi,

> I have solved my problem.
> 
> It is related to the TCP checksum offload function: my embedded platform,
> the PC Engines ALIX 3d2, does not support this function (and qemu virtual
> machines do not support it either).
> 
> This patch disables the tx checksum offload, and everything now works fine
> for me!

nice finding !
I looked through kernel source a bit and it seems we can live without that flag 
(the tun driver also does not set it). I would commit your patch to the 
repository unless there are objections ?

Regards,
Marek
  

Patch

--- soft-interface.c.old    2009-10-07 17:10:25.000000000 +0200
+++ soft-interface.c    2009-10-07 17:00:29.000000000 +0200
@@ -114,7 +114,7 @@ 
 #endif /* LINUX_VERSION_CODE < KERNEL_VERSION(2, 6, 29) */
     dev->destructor = free_netdev;

-    dev->features |= NETIF_F_NO_CSUM;
+    /* dev->features |= NETIF_F_NO_CSUM; */
     dev->mtu = hardif_min_mtu();
     dev->hard_header_len = BAT_HEADER_LEN; /* reserve more space in the skbuff for our header */
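
Note on what the patch changes: NETIF_F_NO_CSUM advertises that a device needs no checksum at all, so with the flag set the stack may hand over TCP packets without filling in the checksum. With the flag removed, the checksum is computed in software before the packet reaches the soft interface. The following is only a userspace sketch of that computation (the 16-bit one's complement Internet checksum from RFC 1071), not the kernel's implementation; the names inet_checksum and inet_checksum_selftest are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Illustrative helper (hypothetical name): Internet checksum over a byte
 * buffer, treating the data as big-endian 16-bit words as on the wire.
 */
static uint16_t inet_checksum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;

    /* Add up all complete 16-bit words. */
    while (len > 1) {
        sum += ((uint32_t)data[0] << 8) | data[1];
        data += 2;
        len -= 2;
    }

    /* An odd trailing byte is padded with a zero byte on the right. */
    if (len == 1)
        sum += (uint32_t)data[0] << 8;

    /* Fold the carries back into the low 16 bits (one's complement add). */
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);

    return (uint16_t)~sum;
}

/* Self-check using the byte sequence 00 01 f2 03 f4 f5 f6 f7. */
static void inet_checksum_selftest(void)
{
    static const uint8_t sample[] = {
        0x00, 0x01, 0xf2, 0x03, 0xf4, 0xf5, 0xf6, 0xf7
    };

    /* The words sum to 0xddf2 after folding; the checksum is its complement. */
    assert(inet_checksum(sample, sizeof sample) == 0x220d);
}
```

A receiver verifies a packet by running the same sum over the data with the checksum field included; a result of 0 means the checksum is valid. For TCP the sum also covers a pseudo-header (addresses, protocol, length), which this sketch leaves out. Packets like the ones from 192.168.100.2 in the capture, sent with the field never filled in, would fail that check.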