Message ID | 1390299725-1873-1-git-send-email-antonio@meshcoding.com (mailing list archive) |
---|---|
State | Accepted, archived |
Commit | 2b108ccd0533e1375e44c73ec58c69dde9a71687 |
Headers |
Return-Path: <antonio@meshcoding.com> Received-SPF: Pass (sender SPF authorized) identity=mailfrom; client-ip=178.209.62.157; helo=s3.neomailbox.net; envelope-from=antonio@meshcoding.com; receiver=b.a.t.m.a.n@lists.open-mesh.org Received: from s3.neomailbox.net (s3.neomailbox.net [178.209.62.157]) by open-mesh.org (Postfix) with ESMTPS id 9A6AA6021D7 for <b.a.t.m.a.n@lists.open-mesh.org>; Tue, 21 Jan 2014 11:23:16 +0100 (CET) From: Antonio Quartulli <antonio@meshcoding.com> To: b.a.t.m.a.n@lists.open-mesh.org Date: Tue, 21 Jan 2014 11:22:05 +0100 Message-Id: <1390299725-1873-1-git-send-email-antonio@meshcoding.com> Cc: Antonio Quartulli <antonio@meshcoding.com> Subject: [B.A.T.M.A.N.] [PATCH maint] batman-adv: fix soft-interface MTU computation X-BeenThere: b.a.t.m.a.n@lists.open-mesh.org X-Mailman-Version: 2.1.15 Precedence: list Reply-To: The list for a Better Approach To Mobile Ad-hoc Networking <b.a.t.m.a.n@lists.open-mesh.org> List-Id: The list for a Better Approach To Mobile Ad-hoc Networking <b.a.t.m.a.n.lists.open-mesh.org> List-Unsubscribe: <https://lists.open-mesh.org/mm/options/b.a.t.m.a.n>, <mailto:b.a.t.m.a.n-request@lists.open-mesh.org?subject=unsubscribe> List-Archive: <http://lists.open-mesh.org/pipermail/b.a.t.m.a.n/> List-Post: <mailto:b.a.t.m.a.n@lists.open-mesh.org> List-Help: <mailto:b.a.t.m.a.n-request@lists.open-mesh.org?subject=help> List-Subscribe: <https://lists.open-mesh.org/mm/listinfo/b.a.t.m.a.n>, <mailto:b.a.t.m.a.n-request@lists.open-mesh.org?subject=subscribe> X-List-Received-Date: Tue, 21 Jan 2014 10:23:18 -0000 |
Commit Message
Antonio Quartulli
Jan. 21, 2014, 10:22 a.m. UTC
The current MTU computation always returns a value
smaller than 1500 bytes even if the real interfaces
have an MTU large enough to compensate for the batman-adv
overhead.
Fix the computation by properly returning the highest
admitted value.
Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>
---
This patch is missing a Reported-by clause because I did not have
"russell"'s email address at hand.
Will be added later before being merged.
Cheers,
hard-interface.c | 22 ++++++++++++++--------
1 file changed, 14 insertions(+), 8 deletions(-)
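The behaviour the commit message describes can be illustrated with a small userspace sketch. This is not the code from hard-interface.c: the function name, the 1500-byte cap constant and the treatment of fragmentation are assumptions made for illustration, and the 32-byte overhead used in the usage note is only inferred from the test figures reported later in this thread (hard interface MTU 1400, bat0 MTU 1368 with fragmentation disabled).

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Assumption: the soft interface never reports more than a standard
 * Ethernet payload size. */
#define SOFT_MTU_CAP 1500

/*
 * Hypothetical helper, not the driver's function.
 *
 * hard_mtus: MTUs of the hard interfaces attached to the soft interface
 * n:         number of entries in hard_mtus (must be > 0)
 * overhead:  per-packet batman-adv header overhead in bytes
 * frag_on:   non-zero if fragmentation is enabled
 */
int soft_iface_mtu(const int *hard_mtus, size_t n, int overhead, int frag_on)
{
	int min_mtu = INT_MAX;
	size_t i;

	assert(hard_mtus != NULL && n > 0);

	/* The smallest hard interface limits what can go over the wire. */
	for (i = 0; i < n; i++)
		if (hard_mtus[i] < min_mtu)
			min_mtu = hard_mtus[i];

	/* Simplifying assumption: with fragmentation enabled, oversized
	 * frames are split, so the hard MTU does not limit the soft MTU. */
	if (frag_on)
		return SOFT_MTU_CAP;

	/* Without fragmentation the header overhead must fit as well.
	 * Return the highest admitted value, capped at SOFT_MTU_CAP. */
	min_mtu -= overhead;
	return min_mtu < SOFT_MTU_CAP ? min_mtu : SOFT_MTU_CAP;
}
```

Under these assumptions, a single hard interface with MTU 1400 and an overhead of 32 bytes gives 1368 with fragmentation off and 1500 with it on, and a hard interface with MTU 1560 gives the full 1500 instead of a smaller value, which is the case the patch is about.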
Comments
On 21/01/14 11:22, Antonio Quartulli wrote:
> The current MTU computation always returns a value
> smaller than 1500bytes even if the real interfaces
> have an MTU large enough to compensate the batman-adv
> overhead.
>
> Fix the computation by properly returning the highest
> admitted value.
>

Introduced by f7f2fe494388fca828094a4ebdab918a7b2d64f8
("batman-adv: limit local translation table max size")

> Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>
>>>>> "Antonio" == Antonio Quartulli <antonio@meshcoding.com> writes:
Antonio> The current MTU computation always returns a value smaller
Antonio> than 1500bytes even if the real interfaces have an MTU large
Antonio> enough to compensate the batman-adv overhead.
Antonio> Fix the computation by properly returning the highest
Antonio> admitted value.
Antonio> Signed-off-by: Antonio Quartulli <antonio@meshcoding.com> ---
This seems to fix the bat0-MTU-unnecessarily-small problem I observed
last night and reported on the IRC channel. I haven't actually passed
any traffic over it yet, but the interface is up with the expected MTU
value with the patch.
Antonio> This patch is missing a Reported-by clause because I did not
Antonio> have "russell"'s email address at hand.
Antonio> Will be added later before being merged.
Reported-by: Russell Senior <russell@personaltelco.net>
On 21/01/14 19:43, Russell Senior wrote:
>>>>>> "Antonio" == Antonio Quartulli <antonio@meshcoding.com> writes:
>
> Antonio> The current MTU computation always returns a value smaller
> Antonio> than 1500bytes even if the real interfaces have an MTU large
> Antonio> enough to compensate the batman-adv overhead.
>
> Antonio> Fix the computation by properly returning the highest
> Antonio> admitted value.
>
> Antonio> Signed-off-by: Antonio Quartulli <antonio@meshcoding.com> ---
>
> This seems to fix the bat0-MTU-unnecessarily-small problem I observed
> last night and reported on the IRC channel. I haven't actually passed
> any traffic over it yet, but the interface is up with the expected MTU
> value with the patch.

Just to be sure the fix is not introducing any misbehaviour: have you
tried setting smaller MTUs to your hard interface? In that case have you
seen the bat0 reducing its MTU?

>
> Antonio> This patch is missing a Reported-by clause because I did not
> Antonio> have "russell"'s email address at hand.
>
> Antonio> Will be added later before being merged.
>
> Reported-by: Russell Senior <russell@personaltelco.net>

I'd also add Tested-by ;)

Thanks a lot!
>>>>> "Russell" == Russell Senior <russell@personaltelco.net> writes:
>>>>> "Antonio" == Antonio Quartulli <antonio@meshcoding.com> writes:

Antonio> The current MTU computation always returns a value smaller
Antonio> than 1500bytes even if the real interfaces have an MTU large
Antonio> enough to compensate the batman-adv overhead.

Antonio> Fix the computation by properly returning the highest
Antonio> admitted value.

Antonio> Signed-off-by: Antonio Quartulli <antonio@meshcoding.com> ---

Russell> This seems to fix the bat0-MTU-unnecessarily-small problem I
Russell> observed last night and reported on the IRC channel. I
Russell> haven't actually passed any traffic over it yet, but the
Russell> interface is up with the expected MTU value with the patch.

Antonio> This patch is missing a Reported-by clause because I did not
Antonio> have "russell"'s email address at hand.

Russell> Reported-by: Russell Senior <russell@personaltelco.net>

Followup, as requested: I tried setting a smaller MTU (1400) on the
adhoc0 interface. When fragmentation was enabled, this resulted in no
change to MTU (still 1500) for bat0. When I disabled fragmentation,
the bat0 MTU dropped, as expected, to 1368. Interestingly, the MTU on
the bridge that bat0 was a member of remained 1500 despite the lower
bat0 MTU. Should that be?

Also, for testing actual traffic over the batman-adv link, I built
OpenWrt r39354 with the patch on a Soekris net4526, so that there were
two nodes with the same revision (different architecture):
ubnt-bullet-m with ath9k; net4826 with ath5k. I first noticed that I
was losing about 100k of memory every couple of seconds, and pretty soon
(within 20 minutes) the net4826 started oopsing on out-of-memory.

I removed the patch, rev'd OpenWrt to r39365 and confirmed that the
net4826 build was also leaking at a substantial rate.

I am seeing a similar, though possibly slower, leak on the ubiquiti
bullet m2hp. Right before rebooting, top shows kworker/u2:$N (where
$N is 0 or 3) chewing up some cpu cycles.

Has anybody else seen this memory leak? Leads on where it's coming
from? Not a runaway process, at least not that top shows up. Just a
gradual disappearance from MemFree that /proc/sys/vm/drop_caches
doesn't fix. It isn't adhoc mode, and I can associate the two devices
over adhoc and move a bunch of data with no memory lost, but turning
on batman-adv seems to sink it.
On 22/01/14 07:04, Russell Senior wrote:
>>>>>> "Russell" == Russell Senior <russell@personaltelco.net> writes:
>
>>>>>> "Antonio" == Antonio Quartulli <antonio@meshcoding.com> writes:
> Antonio> The current MTU computation always returns a value smaller
> Antonio> than 1500bytes even if the real interfaces have an MTU large
> Antonio> enough to compensate the batman-adv overhead.
>
> Antonio> Fix the computation by properly returning the highest
> Antonio> admitted value.
>
> Antonio> Signed-off-by: Antonio Quartulli <antonio@meshcoding.com> ---
>
> Russell> This seems to fix the bat0-MTU-unnecessarily-small problem I
> Russell> observed last night and reported on the IRC channel. I
> Russell> haven't actually passed any traffic over it yet, but the
> Russell> interface is up with the expected MTU value with the patch.
>
> Antonio> This patch is missing a Reported-by clause because I did not
> Antonio> have "russell"'s email address at hand.
>
> Russell> Reported-by: Russell Senior <russell@personaltelco.net>
>
> Followup, as requested, I tried setting a smaller MTU (1400) on the
> adhoc0 interface. When fragmentation was enabled, this resulted in no
> change to MTU (still 1500) for bat0. When I disabled fragmentation,
> the bat0 MTU dropped, as expected, to 1368. Interestingly, the MTU on
> the bridge that bat0 was a member of remained 1500 despite the lower
> bat0 MTU. Should that be?
>

I don't really know how the bridge code behaves. As far as I remember it
should adapt to the smallest MTU.

But thanks for testing! This shows that the patch is working fine ;)
On 22/01/14 07:04, Russell Senior wrote:
> Also, for testing actual traffic over the batman-adv link, I build
> OpenWrt r39354 with the patch on a Soekris net4526, so that there were
> two nodes with the same revision (different architecture):
> ubnt-bullet-m with ath9k; net4826 with ath5k. I first noticed that I
> was losing about 100k of memory every couple seconds and pretty soon
> (with 20 minutes) the net4826 started oopsing on out-of-memory.
>

mh..does this happen with or without fragmentation enabled?
Does this happen even if you don't generate traffic on the interface?

> I removed the patch, rev'd OpenWrt to r39365 and confirmed that the
> net4826 build was also leaking at a substantial rate.
>
> I am seeing a similar, though possibly slower, leak on the ubiquiti
> bullet m2hp. Right before rebooting, top shows kworker/u2:$N (where
> $N is 0 or 3) chewing up some cpu cycles.
>
> Has anybody else seen this memory leak? Leads on where it's coming
> from? Not a runaway process, at least not that top shows up. Just a
> gradual disappearance from MemFree that /proc/sys/vm/drop_caches
> doesn't fix. It isn't adhoc mode, and I can associate the two devices
> over adhoc and move a bunch of data with no memory lost, but turning
> on batman-adv seems to sink it.
>
>

Thanks for reporting!
On 01/22/2014 07:04 AM, Russell Senior wrote:
> Has anybody else seen this memory leak? Leads on where it's coming
> from? Not a runaway process, at least not that top shows up. Just a
> gradual disappearance from MemFree that /proc/sys/vm/drop_caches
> doesn't fix. It isn't adhoc mode, and I can associate the two devices
> over adhoc and move a bunch of data with no memory lost, but turning
> on batman-adv seems to sink it.

Yes, and I tested (compile-time selected) with and without network
coding, and (at run-time) with and without fragmentation (as I also
bumped into the MTU calculation problem later fixed by the patch on this
list) -- any 32MB RAM device reboots after roughly 30 minutes due to OOM
without substantial traffic; if there is traffic, then apparently even
faster...
>>>>> "Daniel" == Daniel <daniel@makrotopia.org> writes:

Daniel> On 01/22/2014 07:04 AM, Russell Senior wrote:
>> Has anybody else seen this memory leak? Leads on where it's coming
>> from? Not a runaway process, at least not that top shows up. Just
>> a gradual disappearance from MemFree that /proc/sys/vm/drop_caches
>> doesn't fix. It isn't adhoc mode, and I can associate the two
>> devices over adhoc and move a bunch of data with no memory lost,
>> but turning on batman-adv seems to sink it.

Daniel> Yes, and I tested (compile-time selected) with and without
Daniel> network coding, and (at run-time) with and without
Daniel> fragmentation (as I also bumped into the MTU calculation
Daniel> problem later fixed by the patch on this list) -- any 32MB RAM
Daniel> devices reboots after roughly 30 minutes due to OOM without
Daniel> substantial traffic, if there is traffic then apparently even
Daniel> faster...

The memory leak I see seems to commence as soon as a batman-adv
neighbor (same version, in this case 15) appears and stops when the
neighbor goes away.

I am going to try enabling kmemleak and see if that tells me anything.
On 22/01/14 18:45, Russell Senior wrote:
>>>>>> "Daniel" == Daniel <daniel@makrotopia.org> writes:
>
> Daniel> On 01/22/2014 07:04 AM, Russell Senior wrote:
>>> Has anybody else seen this memory leak? Leads on where it's coming
>>> from? Not a runaway process, at least not that top shows up. Just
>>> a gradual disappearance from MemFree that /proc/sys/vm/drop_caches
>>> doesn't fix. It isn't adhoc mode, and I can associate the two
>>> devices over adhoc and move a bunch of data with no memory lost,
>>> but turning on batman-adv seems to sink it.
>
> Daniel> Yes, and I tested (compile-time selected) with and without
> Daniel> network coding, and (at run-time) with and without
> Daniel> fragmentation (as I also bumped into the MTU calculation
> Daniel> problem later fixed by the patch on this list) -- any 32MB RAM
> Daniel> devices reboots after roughly 30 minutes due to OOM without
> Daniel> substantial traffic, if there is traffic then apparently even
> Daniel> faster...
>
> The memory leak I see seems to commence as soon as a batman-adv
> neighbor (same version, in this case 15) appears and stops when the
> neighbor goes away.
>

Thank you very much for the hint, Russell!
Today I tried with one node only, but kmemleak did not report anything...

> I am going to try enabling kmemleak and see of that tells me anything.
>

Thanks! Keep us informed!

Cheers,
>>>>> "Antonio" == Antonio Quartulli <antonio@meshcoding.com> writes:
Russell> Has anybody else seen this memory leak? Leads on where it's
Russell> coming from? Not a runaway process, at least not that top
Russell> shows up. Just a gradual disappearance from MemFree that
Russell> /proc/sys/vm/drop_caches doesn't fix. It isn't adhoc mode,
Russell> and I can associate the two devices over adhoc and move a
Russell> bunch of data with no memory lost, but turning on batman-adv
Russell> seems to sink it.
Russell> The memory leak I see seems to commence as soon as a
Russell> batman-adv neighbor (same version, in this case 15) appears
Russell> and stops when the neighbor goes away.
Antonio> Thank you very much for the hint Russell! Today I tried with
Antonio> one node only, but kmemleak did not report anything...
Russell> I am going to try enabling kmemleak and see if that tells me
Russell> anything.
Antonio> Thanks! Keep us informed!
Here is a bootlog in which I spit out a bunch of kmemleak stuff into a
console (captured by /usr/bin/screen, sorry for the extraneous line
feed silliness).
https://personaltelco.net/~russell/kmemleak-batman-from-boot.log
If I count instances, it looks like batadv_orig_node_vlan_new (and the
things that are calling it) may be implicated.
Hope that helps!
I had the same problem which caused reboots but after the last
batman-adv update i am not seeing it.
all my devices are have mb ram
i am using network coding and 1560 MTU

On 01/22/2014 12:46 PM, Antonio Quartulli wrote:
> On 22/01/14 18:45, Russell Senior wrote:
>>>>>>> "Daniel" == Daniel <daniel@makrotopia.org> writes:
>>
>> Daniel> On 01/22/2014 07:04 AM, Russell Senior wrote:
>>>> Has anybody else seen this memory leak? Leads on where it's coming
>>>> from? Not a runaway process, at least not that top shows up. Just
>>>> a gradual disappearance from MemFree that /proc/sys/vm/drop_caches
>>>> doesn't fix. It isn't adhoc mode, and I can associate the two
>>>> devices over adhoc and move a bunch of data with no memory lost,
>>>> but turning on batman-adv seems to sink it.
>>
>> Daniel> Yes, and I tested (compile-time selected) with and without
>> Daniel> network coding, and (at run-time) with and without
>> Daniel> fragmentation (as I also bumped into the MTU calculation
>> Daniel> problem later fixed by the patch on this list) -- any 32MB RAM
>> Daniel> devices reboots after roughly 30 minutes due to OOM without
>> Daniel> substantial traffic, if there is traffic then apparently even
>> Daniel> faster...
>>
>> The memory leak I see seems to commence as soon as a batman-adv
>> neighbor (same version, in this case 15) appears and stops when the
>> neighbor goes away.
>>
>
> Thank you very much for the hint Russel!
> Today I tried with one node only, but kmemleak did not report anything...
>
>> I am going to try enabling kmemleak and see of that tells me anything.
>>
>
> Thanks! Keep us informed!
>
> Cheers,
>
>>>>> "cmsv" == cmsv <cmsv@wirelesspt.net> writes:
cmsv> I had the same problem which caused reboots but after the last
cmsv> batman-adv update i am not seeing it. all my devices are have
cmsv> mb ram i am using network coding and 1560 MTU
Which version are you running?
On 01/22/2014 06:57 PM, Russell Senior wrote:
>>>>>> "cmsv" == cmsv <cmsv@wirelesspt.net> writes:
>
> cmsv> I had the same problem which caused reboots but after the last
> cmsv> batman-adv update i am not seeing it. all my devices are have
> cmsv> mb ram i am using network coding and 1560 MTU
>
> Which version are you running?
>
>

Right now with openwrt AA DISTRIB_REVISION="r39154" and batctl 2014.0.0
[batman-adv: 2014.0.0]

Routers are dir 601 dir 615* tl wr703n and it is not happening.
I synced the feed less than 48h ago and recompiled.

What version are you using?
On 01/23/2014 01:10 AM, cmsv wrote:
>
>
> On 01/22/2014 06:57 PM, Russell Senior wrote:
>>>>>>> "cmsv" == cmsv <cmsv@wirelesspt.net> writes:
>>
>> cmsv> I had the same problem which caused reboots but after the last
>> cmsv> batman-adv update i am not seeing it. all my devices are have
>> cmsv> mb ram i am using network coding and 1560 MTU
>>
>> Which version are you running?
>>
>>
>
> Righ now with openwrt AA DISTRIB_REVISION="r39154" and batctl 2014.0.0
> [batman-adv: 2014.0.0]
>
> Routers are dir 601 dir 615* tl wr703n and it is not happening
> I synced the feed less than 48h ago and recompiled.
>
> what version are you using ?

out-of-memory every 20 minutes or so on OpenWrt trunk/BB r39365 on
tl-wr841nd-v8 with batman-adv 2014.0.0 from openwrt's routing feed with
"batman-adv: fix batman-adv header overhead calculation" and
"batman-adv: fix soft-interface MTU computation" on top.

A sample node is (occasionally) reachable via DN42 at 104.61.99.104 (feel
free to ask for ssh or any kind of logs, serial access, remote gdb or
whatever)
I now built OpenWrt trunk/BB r39365 with batman-adv 2013.4.0 instead of
2014.0.0, tried with all possible settings, no memory leak whatsoever,
happy uptimes of more than a day by now :)
All other system components and settings are exactly identical to my
previous setup with batman-adv 2014.0.0.

On 01/23/2014 04:35 AM, Daniel wrote:
> On 01/23/2014 01:10 AM, cmsv wrote:
>> what version are you using ?
> out-of-memory every 20 minutes or so on OpenWrt trunk/BB r39365 on tl-wr841nd-v8
> with batman-adv 2014.0.0 from openwrt's routing feed with "batman-adv: fix
> batman-adv header overhead calculation" and "batman-adv: fix soft-interface MTU
> computation" on top.
> A sample node is (occasionally) reachable via DN42 at 104.61.99.104 (feel free
> to ask for ssh or any kind of logs, serial access, remote gdb or whatever)
On 26/01/14 13:57, Daniel wrote:
> I now built OpenWrt trunk/BB r39365 with batman-adv 2013.4.0 instead of
> 2014.0.0, tried with all possible settings, no memory leak what-so-over, happy
> uptimes of more than a day by now :)
> all other system components and settings are exactly identical to my previous
> setup with batman-adv 2014.0.0.
>
> On 01/23/2014 04:35 AM, Daniel wrote:
>> On 01/23/2014 01:10 AM, cmsv wrote:
>>> what version are you using ?
>
>> out-of-memory every 20 minutes or so on OpenWrt trunk/BB r39365 on tl-wr841nd-v8
>> with batman-adv 2014.0.0 from openwrt's routing feed with "batman-adv: fix
>> batman-adv header overhead calculation" and "batman-adv: fix soft-interface MTU
>> computation" on top.
>> A sample node is (occasionally) reachable via DN42 at 104.61.99.104 (feel free
>> to ask for ssh or any kind of logs, serial access, remote gdb or whatever)
>

Thanks for testing guys!

I found something wrong in the code and I am going to send a patch soon.
I'd really appreciate if somebody could test it!

Thanks!
Inline:

On 01/26/2014 09:21 AM, Antonio Quartulli wrote:
> On 26/01/14 13:57, Daniel wrote:
>> I now built OpenWrt trunk/BB r39365 with batman-adv 2013.4.0 instead of
>> 2014.0.0, tried with all possible settings, no memory leak what-so-over, happy
>> uptimes of more than a day by now :)
>> all other system components and settings are exactly identical to my previous
>> setup with batman-adv 2014.0.0.
>>
>> On 01/23/2014 04:35 AM, Daniel wrote:
>>> On 01/23/2014 01:10 AM, cmsv wrote:
>>>> what version are you using ?
>>
>>> out-of-memory every 20 minutes or so on OpenWrt trunk/BB r39365 on tl-wr841nd-v8
>>> with batman-adv 2014.0.0 from openwrt's routing feed with "batman-adv: fix
>>> batman-adv header overhead calculation" and "batman-adv: fix soft-interface MTU
>>> computation" on top.
>>> A sample node is (occasionally) reachable via DN42 at 104.61.99.104 (feel free
>>> to ask for ssh or any kind of logs, serial access, remote gdb or whatever)
>>
>
> Thanks for testing guys!
>
> I found something wrong in the code and I am going to send a patch soon.
> I'd really appreciate if somebody could test it!

Will this patch also be relevant to Attitude Adjustment? The reason why
I ask is because with AA right now I do not experience the reboots.
I should also add that I am using mac80211 r39150 and hostapd r39155 on
top of the latest AA.

Can you explain what exactly your code findings have an impact on?

>
> Thanks!
>
On 26/01/14 17:05, cmsv wrote:
>
> Will this patch also be relevant to attitude adjustment ? The reason why
> i was if because with AA right now i do not experience the reboots.
> I should also add that i am using mac80211 r39150 and hostapd r39155 on
> top of the latest AA.
>
> Can you explain in what exactly your code findings have an impact on ?

This is a patch to fix the memleak we were discussing about.
This bug appeared with and it is meant to be applied on
batman-adv-2014.0.0 (regardless of the openwrt revision).

Cheers,
On 26/01/14 17:07, Antonio Quartulli wrote: Can you explain in what
>
> This is a patch to fix the memleak we were discussing about.
> This bug appeared with and it is meant to be applied on
> batman-adv-2014.0.0 (regardless of the openwrt revision).

sorry, bad copy/paste.

The patch is for batman-adv-2014.0.0 (I don't know what version you have
in AA).
It fixes the memleak bug that we were discussing about.
On Tuesday 21 January 2014 11:31:08 Antonio Quartulli wrote:
> On 21/01/14 11:22, Antonio Quartulli wrote:
> > The current MTU computation always returns a value
> > smaller than 1500bytes even if the real interfaces
> > have an MTU large enough to compensate the batman-adv
> > overhead.
> >
> > Fix the computation by properly returning the highest
> > admitted value.
>
> Introduced by f7f2fe494388fca828094a4ebdab918a7b2d64f8
> ("batman-adv: limit local translation table max size")
>
> > Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>

Applied in revision 2b108cc.

Thanks,
Marek
Here is an update of some tests i ran in the past 24h with the following
build:

routers used: dlink dir 601a and tplink wr703n in "ng" mode. (atheros)

My current AA DISTRIB_REVISION="r39154"
mac80211 r39150 from openwrt trunk
hostapd r39155 from trunk

From batman-adv i am using the following patches:

ls feeds/routing/batman-adv/patches/
0001-batman-adv-fix-batman-adv-header-overhead-calculatio.patch

From d72756b97529b3c6afa08933216aaa912bb16ce6 Mon Sep 17 00:00:00 2001
From: Marek Lindner <mareklindner@neomailbox.ch>
Date: Wed, 15 Jan 2014 20:31:18 +0800
Subject: [PATCH] batman-adv: fix batman-adv header overhead calculation

batman-adv/Makefile

# $Id: Makefile 5624 2006-11-23 00:29:07Z nbd $
include $(TOPDIR)/rules.mk
PKG_NAME:=batman-adv
PKG_VERSION:=2014.0.0
BATCTL_VERSION:=2014.0.0
PKG_RELEASE:=1
PKG_MD5SUM:=8d58ecaede17dc05aab1b549dc09fa7d
BATCTL_MD5SUM:=b0bcf29fef80ddcc33769e13f5937d0a

I tried to find any memory leaks that could be causing reboots and i was
unable to find any after having compiled the build with
batman-adv-header-overhead-calculatio.patch. Before this patch i did get
reboots caused by the leak.

I keep monitoring memory usage with top, htop, ps and /proc/meminfo,
since i was not able to install valgrind due to lack of available flash
memory given the size of the valgrind package.

Got some tips from here:
http://blog.thewebsitepeople.org/2011/03/linux-memory-leak-detection

Additionally i ran iperf tests on both routers against each other to
force them under heavy load during 24h:

iperf -c <ip> -t 99999 -i 5

The mtu is 1560 for the adhoc.

After 24h i still had 6 mb of ram and above on both routers. Once i
stopped the tests, the free ram increased. dmesg and logread output
nothing wrong and/or errors.

No reboots happened during this time, which leads me to conclude that the
problem might not be all from the batman-adv side, or maybe not even at
all, or maybe it only happens when in use with something very specific.

I would like to run a few more tests to be more sure about possible leaks
but are there any other tools that someone might recommend?

@ daniel
What did you use to find the leak and/or how did you troubleshoot it?

On 01/26/2014 11:13 AM, Antonio Quartulli wrote:
> On 26/01/14 17:07, Antonio Quartulli wrote:Can you explain in what
>>
>> This is a patch to fix the memleak we were discussing about.
>> This bug appeared with and it is meant to be applied on
>> batman-adv-2014.0.0 (regardless of the openwrt revision).
>
> sorry, bad copy/paste.
>
> The patch is for batman-adv-2014.0.0 (I don't know what version you have
> in AA).
> It fixes the memleak bug that we were discussing about.
>
>
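For readers who want to follow the MemFree figure from /proc/meminfo that is used throughout this thread, the following is a minimal sketch in C. The function names are hypothetical and the approach (two samples and an average rate) is only an assumption about how one might quantify a slow leak; it does not identify where the memory goes, which is what kmemleak is for.

```c
#include <stdio.h>
#include <string.h>

/* Return the MemFree value in kB from /proc/meminfo-style text,
 * or -1 if the field is not present. */
long parse_memfree_kb(const char *meminfo)
{
	const char *p = strstr(meminfo, "MemFree:");
	long kb;

	if (p == NULL || sscanf(p, "MemFree: %ld", &kb) != 1)
		return -1;
	return kb;
}

/* Read /proc/meminfo and return MemFree in kB, or -1 on error. */
long read_memfree_kb(void)
{
	char buf[4096];
	size_t len;
	FILE *f = fopen("/proc/meminfo", "r");

	if (f == NULL)
		return -1;
	len = fread(buf, 1, sizeof(buf) - 1, f);
	fclose(f);
	buf[len] = '\0';
	return parse_memfree_kb(buf);
}

/* Average change in kB per second between two samples.
 * A persistently negative value means free memory is disappearing. */
double memfree_rate_kb_per_s(long before_kb, long after_kb, double seconds)
{
	return (double)(after_kb - before_kb) / seconds;
}
```

Calling read_memfree_kb() twice with a known delay and passing both values to memfree_rate_kb_per_s() gives a rate that can be compared across builds, for example with and without a batman-adv neighbor in range.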
>>>>> "cmsv" == cmsv <cmsv@wirelesspt.net> writes:
cmsv> Here is an update of some tests i ran in the past 24h with the
cmsv> following build:
cmsv> routers used: dlink dir 601a and tplink wr703n in "ng"
cmsv> mode. (atheros)
cmsv> My current AA DISTRIB_REVISION="r39154" mac80211 r39150 from
cmsv> openwrt trunk hostapd r39155 from trunk
I just went to try to set up an AA build environment from:
git://git.openwrt.org/12.09/openwrt.git
in order to replicate. The default feeds.conf from that tree seems to
point at a 'for-12.09.x' branch of the routing feed, and the
batman-adv Makefile there seems to use 2013.4.0, not 2014.0.0.
Can you paste your feeds.conf file?
inline:

On 01/27/2014 08:21 PM, Russell Senior wrote:
>>>>>> "cmsv" == cmsv <cmsv@wirelesspt.net> writes:
>
> cmsv> Here is an update of some tests i ran in the past 24h with the
> cmsv> following build:
>
> cmsv> routers used: dlink dir 601a and tplink wr703n in "ng"
> cmsv> mode. (atheros)
>
> cmsv> My current AA DISTRIB_REVISION="r39154" mac80211 r39150 from
> cmsv> openwrt trunk hostapd r39155 from trunk
>
> I just went to try to set up an AA build environment from:
>
> git://git.openwrt.org/12.09/openwrt.git
>
> in order to replicate. The default feeds.conf from that tree seems to
> point at a 'for-12.09.x' branch of the routing feed, and the
> batman-adv Makefile there seems to use 2013.4.0, not 2014.0.0.
>
> Can you paste your feeds.conf file?

Of course:

for AA and batman-adv 2014.0.0 in feeds.default.conf

src-svn packages svn://svn.openwrt.org/openwrt/branches/packages_12.09
src-git routing git://github.com/openwrt-routing/packages.git

For the hostapd and mentioned mac80211 you will need to clone
git clone git://git.openwrt.org/12.09/openwrt.git

Then obtain the specific revisions and replace the original hostapd and
mac80211 from AA.
>>>>> "cmsv" == cmsv <cmsv@wirelesspt.net> writes:

>> Can you paste your feeds.conf file?

cmsv> Of course:

cmsv> for AA and batman-adv 2014.0.0 in feeds.default.conf

cmsv> src-svn packages svn://svn.openwrt.org/openwrt/branches/packages_12.09
cmsv> src-git routing git://github.com/openwrt-routing/packages.git

cmsv> For the hostapd and mentioned mac80211 you will need to clone
cmsv> git clone git://git.openwrt.org/12.09/openwrt.git

cmsv> Then obtain the specific revisions and replace the original
cmsv> hostapd and mac80211 from AA.

I am not following exactly. Do you know which change in particular
makes the memory leak come and go? AA implies an older kernel, 3.3.8
or something. Also, obtain specific revisions from trunk? and then copy
them into the AA tree?

package/kernel/mac80211 r39150 = commit 886b3c876b71122ed9523834488f373908224663
package/network/services/hostapd r39155 = commit 64820db4b264472e03acb9ea6b5536fa7633a8ca

Is that right? Do those mac80211/hostapd revisions come from
bisection (i.e. the last "good" rev) or happenstance?

Thanks for clarification!
inline reply:

On 01/29/2014 03:10 AM, Russell Senior wrote:
>>>>>> "cmsv" == cmsv <cmsv@wirelesspt.net> writes:
>
>
>>> Can you paste your feeds.conf file?
>
> cmsv> Of course:
>
>
> cmsv> for AA and batman-adv 2014.0.0 in feeds.default.conf
>
> cmsv> src-svn packages svn://svn.openwrt.org/openwrt/branches/packages_12.09
> cmsv> src-git routing git://github.com/openwrt-routing/packages.git
>
>
> cmsv> For the hostapd and mentioned mac80211 you will need to clone
> cmsv> git clone git://git.openwrt.org/12.09/openwrt.git
>
> cmsv> Then obtain the specific revisions and replace the original
> cmsv> hostapd and mac80211 from AA.
>
> I am not following exactly. Do you know which change in particular
> makes the memory leak come and go?

I do not know exactly what causes the leak because i don't have the leak
in my builds and have not found a better way than the ones mentioned
before to try to find what may cause it.

> AA implies an older kernel, 3.3.8
> or something.

Yes 3.3.8

>
> Also, obtain specific revisions from trunk? and then copy
> them into the AA tree?

Not from trunk. I posted the wrong git before.
git clone git://nbd.name/aa-mac80211.git

> package/kernel/mac80211 r39150 = commit 886b3c876b71122ed9523834488f373908224663
> package/network/services/hostapd r39155 = commit 64820db4b264472e03acb9ea6b5536fa7633a8ca
>
> Is that right? Do those mac80211/hostapd revisions come from
> bisection (i.e. the last "good" rev) or happenstance?

You have to ask the maintainer. To me they are in between AA and trunk
in terms of stability.

> Thanks for clarification!
>
>
I have an update regarding this matter, and I have CC'ed Felix Fietkau
from OpenWrt (athk) since I am using nbd.name/aa-mac80211.git.

I decided to compile new images with the latest batman-adv stable patches.
While testing both the new image and the old one that I thought was
stable, the routers rebooted. This time I tested with more routers in the
mesh and was able to replicate it.

The routers reboot when the gateway disappears, either because I run
batctl gw client/off on it or because I reboot the gw router. The other
routers then reboot with "Kernel panic - not syncing: Fatal exception in
interrupt". Rebooting the gw router while it is kept in gw off mode did
not seem to reboot the other routers.

For me the problem is easy to replicate: when the router that provides the
gateway to the clients disappears, its disappearance causes the clients to
reboot.

Here is the reboot log:

[ 239.410000] CPU 0 Unable to handle kernel paging request at virtual address 0000000c, epc == 80ea7914, ra == 80ea7910
[ 239.420000] Oops[#1]:
[ 239.420000] Cpu 0
[ 239.420000] $ 0 : 00000000 00000001 00000000 00000000
[ 239.420000] $ 4 : 81b12380 80f7fb00 00000000 00000000
[ 239.420000] $ 8 : 00000037 00000000 00000000 00000000
[ 239.420000] $12 : 00000000 0000015f 80e82540 00000000
[ 239.420000] $16 : 81adbc00 00000000 81b12380 80f3e802
[ 239.420000] $20 : 80f7fb00 00000000 00000189 00000000
[ 239.420000] $24 : 00000002 80e365f0
[ 239.420000] $28 : 80fe6000 80fe7ae8 00000043 80ea7910
[ 239.420000] Hi : 000001d5
[ 239.420000] Lo : 0011e189
[ 239.420000] epc : 80ea7914 0x80ea7914
[ 239.420000] Tainted: G O
[ 239.420000] ra : 80ea7910 0x80ea7910
[ 239.420000] Status: 1000f403 KERNEL EXL IE
[ 239.420000] Cause : 00800008
[ 239.420000] BadVA : 0000000c
[ 239.420000] PrId : 00019374 (MIPS 24Kc)
[ 239.420000] Modules linked in: ath79_wdt batman_adv(O) nf_nat_irc nf_conntrack_irc nf_nat_ftp nf_conntrack_ftp ipt_MASQUERADE iptable_nat nf_nat xt_conntrack xt_CT xt_NOTRACK iptable_raw xt_state nf_conntrack_ipv4 nf_defrag_ipv4 nf_conntrack ipt_REJECT xt_TCPMSS ipt_LOG xt_comment xt_multiport xt_mac xt_limit iptable_mangle iptable_filter ip_tables xt_tcpudp x_tables ath9k(O) ath9k_common(O) ath9k_hw(O) ath(O) mac80211(O) libcrc32c crc16 cfg80211(O) compat(O) arc4 aes_generic crc32c crypto_hash crypto_algapi gpio_button_hotplug(O)
[ 239.420000] Process udhcpc (pid: 1267, threadinfo=80fe6000, task=81af8850, tls=77929440)
[ 239.420000] Stack : 00000000 00000000 00000000 00000000 0000002a 81adbc00 00000000 81adbc00
[ 239.420000]         81b12000 80f3e802 81b12380 00000000 00000189 80eb1fbc 81b12000 00000000
[ 239.420000]         80e8bd00 80eb86c0 00000000 00000000 00000000 801e98ac 81adbc00 00000000
[ 239.420000]         81b12000 00000000 80e8bd00 80eb86c0 00000000 801ec874 00000000 80dae000
[ 239.420000]         00000000 00000014 80fb7ca8 0200bc00 00000001 00000001 802e0000 81adbc00
[ 239.420000]         ...
[ 239.420000] Call Trace:[<80eb1fbc>] 0x80eb1fbc
[ 239.420000] [<801e98ac>] 0x801e98ac
[ 239.420000] [<801ec874>] 0x801ec874
[ 239.420000] [<801ecd5c>] 0x801ecd5c
[ 239.420000] [<8026a388>] 0x8026a388
[ 239.420000] [<80218750>] 0x80218750
[ 239.420000] [<802689a4>] 0x802689a4
[ 239.420000] [<801dbf88>] 0x801dbf88
[ 239.420000] [<80218750>] 0x80218750
[ 239.420000] [<801ec874>] 0x801ec874
[ 239.420000] [<80216c50>] 0x80216c50
[ 239.420000] [<80218750>] 0x80218750
[ 239.420000] [<801ecd5c>] 0x801ecd5c
[ 239.420000] [<80216c50>] 0x80216c50
[ 239.420000] [<802689b4>] 0x802689b4
[ 239.420000] [<80219eb0>] 0x80219eb0
[ 239.420000] [<80237bb8>] 0x80237bb8
[ 239.420000] [<80239734>] 0x80239734
[ 239.420000] [<8024f668>] 0x8024f668
[ 239.420000] [<801101d4>] 0x801101d4
[ 239.420000] [<8020e3dc>] 0x8020e3dc
[ 239.420000] [<801fd38c>] 0x801fd38c
[ 239.420000] [<802179f8>] 0x802179f8
[ 239.420000] [<8020ff04>] 0x8020ff04
[ 239.420000] [<801d8154>] 0x801d8154
[ 239.420000] [<80211184>] 0x80211184
[ 239.420000] [<800d8890>] 0x800d8890
[ 239.420000] [<800ec6f0>] 0x800ec6f0
[ 239.420000] [<801d9f58>] 0x801d9f58
[ 239.420000] [<801d93dc>] 0x801d93dc
[ 239.420000] [<800d9114>] 0x800d9114
[ 239.420000] [<800d93dc>] 0x800d93dc
[ 239.420000] [<801d9a70>] 0x801d9a70
[ 239.420000] [<8006a284>] 0x8006a284
[ 239.420000]
[ 239.420000]
[ 239.420000] Code: 0c3a9ac3 00402821 0040a821 <8c42000c> 54400052 00008021 8e050054 10a00005 8fb10010
[ 239.730000] ---[ end trace 7d873dc004108502 ]---
[ 239.740000] Kernel panic - not syncing: Fatal exception in interrupt
[ 239.740000] Rebooting in 3 seconds..

Routers used:
dir 601a & 615c1
tplink wr703n

aa: DISTRIB_REVISION="r39154"
hostapd and mac80211 from git://nbd.name/aa-mac80211.git
hostapd: sync with trunk (as of r39155)
mac80211: sync with openwrt trunk (as of r39150)

I can confirm that this problem does not happen with
[batman-adv: 2013.4.0], but it does happen with 2014.0.0 and it is easy
to replicate.

Currently my batman-adv 2014.0.0 package has the following patches:

$ ls feeds/routing/batman-adv/patches/
0001-batman-adv-fix-batman-adv-header-overhead-calculatio.patch
0002-batman-adv-fix-potential-kernel-paging-error-for-uni.patch
0003-batman-adv-fix-soft-interface-MTU-computation.patch
0004-batman-adv-fix-TT-TVLV-parsing-on-OGM-reception.patch
0005-batman-adv-release-vlan-object-after-checking-the-CR.patch
0007-batman-adv-use-vlan_-eth_hdr-instead-of-skb-data-in-.patch

On 01/29/2014 04:48 PM, cmsv wrote:
> inline reply:
> [...]
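Even without symbol names, one detail of the oops above can be read by hand. The word marked with angle brackets in the "Code:" line, 8c42000c, is conventionally the instruction at epc. Decoded as a MIPS32 I-type instruction it is a load word (opcode 0x23) from offset 12 of register $2 into $2, and the register dump shows $2 as 00000000, which matches BadVA 0000000c. The following is a minimal sketch of that decoding; the struct and function names are made up for this illustration and do not come from the kernel or from batman-adv.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Fields of a MIPS32 I-type instruction: opcode | rs | rt | imm16.
 * All names here are hypothetical helpers for this sketch. */
struct mips_itype {
	unsigned opcode; /* bits 31..26 */
	unsigned rs;     /* bits 25..21, base register for loads/stores */
	unsigned rt;     /* bits 20..16, target register for loads */
	int32_t imm;     /* bits 15..0, sign-extended */
};

struct mips_itype mips_decode_itype(uint32_t insn)
{
	struct mips_itype d;

	d.opcode = (insn >> 26) & 0x3fu;
	d.rs = (insn >> 21) & 0x1fu;
	d.rt = (insn >> 16) & 0x1fu;
	d.imm = (int32_t)(insn & 0xffffu);
	if (d.imm & 0x8000)
		d.imm -= 0x10000; /* sign-extend the 16-bit immediate */
	return d;
}

/* Address a load/store would touch, given the base register's value. */
uint32_t mips_effective_addr(struct mips_itype d, uint32_t base)
{
	return base + (uint32_t)d.imm;
}

/* Print the decoded fields; opcode 0x23 is "lw" in MIPS32. */
void mips_print_itype(uint32_t insn)
{
	struct mips_itype d = mips_decode_itype(insn);

	assert(d.rs < 32 && d.rt < 32);
	printf("opcode=0x%02x%s rt=$%u offset=%d base=$%u\n",
	       d.opcode, d.opcode == 0x23 ? " (lw)" : "",
	       d.rt, (int)d.imm, d.rs);
}
```

Calling mips_print_itype(0x8c42000cu) reports a load through $2 at offset 12. That is consistent with a NULL pointer whose member at offset 12 was read, but which pointer in which function cannot be told from this log alone.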
On 2014-02-08 04:08, cmsv wrote:
> [ 239.410000] CPU 0 Unable to handle kernel paging request at virtual
> address 0000000c, epc == 80ea7914, ra == 80ea7910
> [ 239.420000] Oops[#1]:
> [ 239.420000] Cpu 0
> [ 239.420000] $ 0 : 00000000 00000001 00000000 00000000
> [ 239.420000] $ 4 : 81b12380 80f7fb00 00000000 00000000
> [ 239.420000] $ 8 : 00000037 00000000 00000000 00000000
> [ 239.420000] $12 : 00000000 0000015f 80e82540 00000000
> [ 239.420000] $16 : 81adbc00 00000000 81b12380 80f3e802
> [ 239.420000] $20 : 80f7fb00 00000000 00000189 00000000
> [ 239.420000] $24 : 00000002 80e365f0
> [ 239.420000] $28 : 80fe6000 80fe7ae8 00000043 80ea7910
> [ 239.420000] Hi : 000001d5
> [ 239.420000] Lo : 0011e189
> [ 239.420000] epc : 80ea7914 0x80ea7914
> [ 239.420000] Tainted: G O
> [ 239.420000] ra : 80ea7910 0x80ea7910
> [ 239.420000] Status: 1000f403 KERNEL EXL IE
> [ 239.420000] Cause : 00800008
> [ 239.420000] BadVA : 0000000c
> [ 239.420000] PrId : 00019374 (MIPS 24Kc)
> [ 239.420000] Modules linked in: ath79_wdt batman_adv(O) nf_nat_irc
> nf_conntrack_irc nf_nat_ftp nf_conntrack_ftp ipt_MASQUERADE iptable_nat
> nf_nat xt_conntrack xt_CT xt_NOTRACK iptable_raw xt_state
> nf_conntrack_ipv4 nf_defrag_ipv4 nf_conntrack ipt_REJECT
> xt_TCPMSS ipt_LOG xt_comment xt_multiport xt_mac xt_limit iptable_mangle
> iptable_filter ip_tables xt_tcpudp x_tables ath9k(O) ath9k_common(O)
> ath9k_hw(O) ath(O) mac80211(O) libcrc32c
> crc16 cfg80211(O) compat(O) arc4 aes_generic crc32c crypto_hash
> crypto_algapi gpio_button_hotplug(O)
> [ 239.420000] Process udhcpc (pid: 1267, threadinfo=80fe6000,
> task=81af8850, tls=77929440)
> [ 239.420000] Stack : 00000000 00000000 00000000 00000000 0000002a
> 81adbc00 00000000 81adbc00
> [ 239.420000] 81b12000 80f3e802 81b12380 00000000 00000189
> 80eb1fbc 81b12000 00000000
> [ 239.420000] 80e8bd00 80eb86c0 00000000 00000000 00000000
> 801e98ac 81adbc00 00000000
> [ 239.420000] 81b12000 00000000 80e8bd00 80eb86c0 00000000
> 801ec874 00000000 80dae000
> [ 239.420000] 00000000 00000014 80fb7ca8 0200bc00 00000001
> 00000001 802e0000 81adbc00
> [ 239.420000] ...
> [ 239.420000] Call Trace:[<80eb1fbc>] 0x80eb1fbc
> [ 239.420000] [<801e98ac>] 0x801e98ac
> [ 239.420000] [<801ec874>] 0x801ec874
[...]

Just a quick note about logs like this: They're completely worthless
unless you enable CONFIG_KERNEL_KALLSYMS in your .config.
Without that option, the kernel does not resolve function names, and the
addresses shown with a custom build usually do not match the addresses
of other builds.

- Felix
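To illustrate what "resolving function names" means here: with a symbol table available, each raw address in a call trace can be mapped to the nearest preceding symbol plus an offset. The sketch below does that lookup with a binary search over a sorted table. It is not the kernel's kallsyms implementation, and the table entries (names and start addresses) are invented purely for the example; only the queried address 0x80eb1fbc is taken from the trace above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* One symbol: start address and name. */
struct sym {
	uint32_t addr;
	const char *name;
};

/* Hypothetical symbol table, sorted by address. These names and
 * addresses are made up; a real table would come from the build
 * that produced the crashing kernel and modules. */
const struct sym example_symtab[] = {
	{ 0x80eb0000u, "example_func_a" },
	{ 0x80eb1000u, "example_func_b" },
	{ 0x80eb3000u, "example_func_c" },
};
const size_t example_symtab_len =
	sizeof(example_symtab) / sizeof(example_symtab[0]);

/* Return the symbol with the largest start address <= addr,
 * or NULL if addr lies below the first symbol. */
const struct sym *sym_lookup(const struct sym *tab, size_t n, uint32_t addr)
{
	const struct sym *best = NULL;
	size_t lo = 0, hi = n;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (tab[mid].addr <= addr) {
			best = &tab[mid];
			lo = mid + 1;
		} else {
			hi = mid;
		}
	}
	return best;
}

/* Format addr as "name+0xoff", or as a bare address if unresolved. */
int sym_format(char *buf, size_t len, const struct sym *tab, size_t n,
	       uint32_t addr)
{
	const struct sym *s = sym_lookup(tab, n, addr);

	assert(buf != NULL && len > 0);
	if (!s)
		return snprintf(buf, len, "0x%08x", (unsigned)addr);
	return snprintf(buf, len, "%s+0x%x", s->name,
			(unsigned)(addr - s->addr));
}
```

Because the table differs from build to build, the same raw address resolves to different symbols (or to none) on another image, which is why unresolved addresses from a custom build cannot be interpreted by someone else.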
On 08/02/14 04:08, cmsv wrote:
> [ 239.420000] [<8020ff04>] 0x8020ff04
> [ 239.420000] [<801d8154>] 0x801d8154
> [ 239.420000] [<80211184>] 0x80211184
> [ 239.420000] [<800d8890>] 0x800d8890
> [ 239.420000] [<800ec6f0>] 0x800ec6f0
> [ 239.420000] [<801d9f58>] 0x801d9f58
> [ 239.420000] [<801d93dc>] 0x801d93dc
> [ 239.420000] [<800d9114>] 0x800d9114
> [ 239.420000] [<800d93dc>] 0x800d93dc
> [ 239.420000] [<801d9a70>] 0x801d9a70
> [ 239.420000] [<8006a284>] 0x8006a284
> [ 239.420000]
> [ 239.420000]
> [ 239.420000] Code: 0c3a9ac3 00402821 0040a821 <8c42000c> 54400052
> 00008021 8e050054 10a00005 8fb10010
> [ 239.730000] ---[ end trace 7d873dc004108502 ]---
> [ 239.740000] Kernel panic - not syncing: Fatal exception in interrupt
> [ 239.740000] Rebooting in 3 seconds..
>

Hi!

Have you been able to run a test with kernel symbols enabled?
That would be a great help ;)

Cheers,
inline

On 02/12/2014 02:23 AM, Antonio Quartulli wrote:
> On 08/02/14 04:08, cmsv wrote:
>> [ 239.420000] [<8020ff04>] 0x8020ff04
>> [ 239.420000] [<801d8154>] 0x801d8154
>> [ 239.420000] [<80211184>] 0x80211184
>> [ 239.420000] [<800d8890>] 0x800d8890
>> [ 239.420000] [<800ec6f0>] 0x800ec6f0
>> [ 239.420000] [<801d9f58>] 0x801d9f58
>> [ 239.420000] [<801d93dc>] 0x801d93dc
>> [ 239.420000] [<800d9114>] 0x800d9114
>> [ 239.420000] [<800d93dc>] 0x800d93dc
>> [ 239.420000] [<801d9a70>] 0x801d9a70
>> [ 239.420000] [<8006a284>] 0x8006a284
>> [ 239.420000]
>> [ 239.420000]
>> [ 239.420000] Code: 0c3a9ac3 00402821 0040a821 <8c42000c> 54400052
>> 00008021 8e050054 10a00005 8fb10010
>> [ 239.730000] ---[ end trace 7d873dc004108502 ]---
>> [ 239.740000] Kernel panic - not syncing: Fatal exception in interrupt
>> [ 239.740000] Rebooting in 3 seconds..
>>
>
> Hi!
>
> Have you been able to run a test with kernel symbols enabled??
> That would be a great help ;)

I have tried to compile images with kernel symbols enabled, but no
matter how much I strip the build down to its essential features, I am
unable to create images that fit in the 4 MB flash of the routers I have,
which are mostly D-Link routers.
Given how little time I have at the moment, I will have to postpone this
testing and stick with batman-adv 2013.4.0 for now, since 2014 is not
giving me the same stability.

Last night I tried 2014 again with a different router as the gateway and
noticed that the reboot happened on only 1 router instead of 2.
Replicating is easy as long as I make the gateway disappear in some way.

>
> Cheers,
>
On 12/02/14 11:40, cmsv wrote:
>>
>> Hi!
>>
>> Have you been able to run a test with kernel symbols enabled??
>> That would be a great help ;)
>
> I have tried to compile images with kernel symbols enabled; but no
> matter how much i trim/strip down the build to non essential features; i
> am unable to create images that fit in 4 mb flash for the routers i have
> which are mostly dlink routers.
> Along with shortage of time that i have at the moment i will have to
> postpone this testing for later and stick with batman-adv 2013.4.0 for
> now since 2014 is not providing me the same stability.
>
> Last night i tried 2014 again and changed the router that was going to
> be the gateway and noticed that the reboot was only happening in 1
> router instead of 2.
> Replicating is easy as long as i make the gateway disappear in some way.

You should perform the same test now with the new patches that I just
sent to the ml.

Maybe your problem was merely a consequence of the bug we just fixed.

Cheers,
I have noticed a few patches being sent, but unless I am missing something
they are all for the development branch.
Next week I will be deploying new firmware and creating new access points,
and I cannot afford "testing" in a production environment.
I will return to the 2014 branch after my trip and will try to debug the
issue once and for all, at which point I will report my findings.

On 02/12/2014 06:41 AM, Antonio Quartulli wrote:
> On 12/02/14 11:40, cmsv wrote:
>>>
>>> Hi!
>>>
>>> Have you been able to run a test with kernel symbols enabled??
>>> That would be a great help ;)
>>
>> I have tried to compile images with kernel symbols enabled; but no
>> matter how much i trim/strip down the build to non essential features; i
>> am unable to create images that fit in 4 mb flash for the routers i have
>> which are mostly dlink routers.
>> Along with shortage of time that i have at the moment i will have to
>> postpone this testing for later and stick with batman-adv 2013.4.0 for
>> now since 2014 is not providing me the same stability.
>>
>> Last night i tried 2014 again and changed the router that was going to
>> be the gateway and noticed that the reboot was only happening in 1
>> router instead of 2.
>> Replicating is easy as long as i make the gateway disappear in some way.
>
> You should perform the same test now with the new patches that I just
> sent to the ml.
>
> Maybe your problem was merely a consequence of the bug we just fixed.
>
> Cheers,
>
On 13/02/14 01:55, cmsv wrote:
> I have noticed a few patches being sent but unless i am missing something
> they are all for the development branch.

No, most of them are for the maint branch (thus the 2014.0.0 branch).

> Next week i will be deploying new firmware and create new access points
> and cannot afford "testing" on production environment.

I understand, but if you could give it a try before leaving, it would be
nice! :)

Thanks a lot anyway!
diff --git a/hard-interface.c b/hard-interface.c
index 6792e03..0eb0b3b 100644
--- a/hard-interface.c
+++ b/hard-interface.c
@@ -244,7 +244,7 @@ int batadv_hardif_min_mtu(struct net_device *soft_iface)
 {
 	struct batadv_priv *bat_priv = netdev_priv(soft_iface);
 	const struct batadv_hard_iface *hard_iface;
-	int min_mtu = ETH_DATA_LEN;
+	int min_mtu = INT_MAX;
 
 	rcu_read_lock();
 	list_for_each_entry_rcu(hard_iface, &batadv_hardif_list, list) {
@@ -259,8 +259,6 @@ int batadv_hardif_min_mtu(struct net_device *soft_iface)
 	}
 	rcu_read_unlock();
 
-	atomic_set(&bat_priv->packet_size_max, min_mtu);
-
 	if (atomic_read(&bat_priv->fragmentation) == 0)
 		goto out;
 
@@ -271,13 +269,21 @@ int batadv_hardif_min_mtu(struct net_device *soft_iface)
 	min_mtu = min_t(int, min_mtu, BATADV_FRAG_MAX_FRAG_SIZE);
 	min_mtu -= sizeof(struct batadv_frag_packet);
 	min_mtu *= BATADV_FRAG_MAX_FRAGMENTS;
-	atomic_set(&bat_priv->packet_size_max, min_mtu);
-
-	/* with fragmentation enabled we can fragment external packets easily */
-	min_mtu = min_t(int, min_mtu, ETH_DATA_LEN);
 
 out:
-	return min_mtu - batadv_max_header_len();
+	/* report to the other components the maximum amount of bytes that
+	 * batman-adv can send over the wire (without considering the payload
+	 * overhead). For example, this value is used by TT to compute the
+	 * maximum local table table size
+	 */
+	atomic_set(&bat_priv->packet_size_max, min_mtu);
+
+	/* the real soft-interface MTU is computed by removing the payload
+	 * overhead from the maximum amount of bytes that was just computed.
+	 *
+	 * However batman-adv does not support MTUs bigger than ETH_DATA_LEN
+	 */
+	return min_t(int, min_mtu - batadv_max_header_len(), ETH_DATA_LEN);
 }
 
 /* adjusts the MTU if a new interface with a smaller MTU appeared. */
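
The effect of the change can be tried outside the kernel with a small user-space sketch. It follows only the logic visible in the hunks above: the old code started from ETH_DATA_LEN and subtracted the header overhead last, while the new code starts from INT_MAX and applies the ETH_DATA_LEN cap after the subtraction. Everything prefixed SKETCH_ or sketch_ is an assumption made for this illustration: the numeric values are placeholders and not the real batman-adv constants, and the per-interface filtering inside the loop is not shown in the diff, so the sketch just takes an array of hard-interface MTUs.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define ETH_DATA_LEN 1500 /* standard Ethernet payload size */

/* Placeholder values, assumed for illustration only. */
#define SKETCH_FRAG_MAX_FRAG_SIZE 1400 /* stands in for BATADV_FRAG_MAX_FRAG_SIZE */
#define SKETCH_FRAG_HDR_LEN 20         /* stands in for sizeof(struct batadv_frag_packet) */
#define SKETCH_FRAG_MAX_FRAGMENTS 16   /* stands in for BATADV_FRAG_MAX_FRAGMENTS */
#define SKETCH_MAX_HEADER_LEN 32       /* stands in for batadv_max_header_len() */

static int sketch_min(int a, int b)
{
	return a < b ? a : b;
}

/* Pre-patch logic: the result can never reach ETH_DATA_LEN because
 * the header length is subtracted after capping at ETH_DATA_LEN. */
int sketch_mtu_old(const int *hard_mtu, int n, int fragmentation)
{
	int min_mtu = ETH_DATA_LEN;
	int i;

	assert(n >= 0);
	for (i = 0; i < n; i++)
		min_mtu = sketch_min(min_mtu, hard_mtu[i]);

	if (fragmentation) {
		min_mtu = sketch_min(min_mtu, SKETCH_FRAG_MAX_FRAG_SIZE);
		min_mtu -= SKETCH_FRAG_HDR_LEN;
		min_mtu *= SKETCH_FRAG_MAX_FRAGMENTS;
		min_mtu = sketch_min(min_mtu, ETH_DATA_LEN);
	}
	return min_mtu - SKETCH_MAX_HEADER_LEN;
}

/* Post-patch logic: track the true minimum, publish it as the maximum
 * packet size, then subtract the overhead and cap at ETH_DATA_LEN. */
int sketch_mtu_new(const int *hard_mtu, int n, int fragmentation,
		   int *packet_size_max)
{
	int min_mtu = INT_MAX;
	int i;

	assert(n >= 0);
	for (i = 0; i < n; i++)
		min_mtu = sketch_min(min_mtu, hard_mtu[i]);

	if (fragmentation) {
		min_mtu = sketch_min(min_mtu, SKETCH_FRAG_MAX_FRAG_SIZE);
		min_mtu -= SKETCH_FRAG_HDR_LEN;
		min_mtu *= SKETCH_FRAG_MAX_FRAGMENTS;
	}

	if (packet_size_max)
		*packet_size_max = min_mtu; /* mirrors the atomic_set() in the patch */

	return sketch_min(min_mtu - SKETCH_MAX_HEADER_LEN, ETH_DATA_LEN);
}
```

With a single hard interface whose MTU is 1560 and fragmentation disabled, the old logic yields 1500 - 32 = 1468 under these placeholder values, while the new logic yields min(1560 - 32, 1500) = 1500. With a 1500-byte hard interface both variants give 1468, since there is no headroom to absorb the overhead.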