batman-adv: bugfix for kernel crash in batadv_tt_local_table_free

Message ID OF5389A718.644E8229-ONC1257E0D.00566F9A-C1257E0D.0056A5A0@phoenixcontact.com (mailing list archive)
State Superseded, archived
Headers

Commit Message

Andreas Pape March 19, 2015, 3:46 p.m. UTC
  This missing check lead to a kernel crash when a hard_if is removed on a 
node forwarding
untagged and tagged traffic (VLANID 0) to and from the mesh network.

Signed-off-by: Andreas Pape <apape@phoenixcontact.com>
---
 translation-table.c |    6 ++++--
 1 files changed, 4 insertions(+), 2 deletions(-)

                }
  

Comments

Antonio Quartulli March 19, 2015, 4:01 p.m. UTC | #1
Hi Andreas,

On 19/03/15 16:46, Andreas Pape wrote:
> This missing check lead to a kernel crash when a hard_if is removed on a 
> node forwarding
> untagged and tagged traffic (VLANID 0) to and from the mesh network.
> 

Did you actually see the crash? if so, can you please report the stacktrace?

When creating bat0 (untagged interface), a "fake" vlan object is created
with vid = BATADV_NO_FLAGS, therefore in this context the object "vlan"
should never be NULL because there is always an object to retrieve.

Cheers,

> Signed-off-by: Andreas Pape <apape@phoenixcontact.com>
> ---
>  translation-table.c |    6 ++++--
>  1 files changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/translation-table.c b/translation-table.c
> index b20812b..4d3ab8d 100644
> --- a/translation-table.c
> +++ b/translation-table.c
> @@ -1143,8 +1143,10 @@ static void batadv_tt_local_table_free(struct 
> batadv_priv *bat_priv)
>                         /* decrease the reference held for this vlan */
>                         vlan = batadv_softif_vlan_get(bat_priv,
>  tt_common_entry->vid);
> -                       batadv_softif_vlan_free_ref(vlan);
> -                       batadv_softif_vlan_free_ref(vlan);
> +                       if (vlan) {
> +                               batadv_softif_vlan_free_ref(vlan);
> +                               batadv_softif_vlan_free_ref(vlan);
> +                       }
>  
>                         batadv_tt_local_entry_free_ref(tt_local);
>                 }
>
  
Andreas Pape March 19, 2015, 4:54 p.m. UTC | #2
Hi Antonio,

I saw the crash, but only in the case when I add the non-IP traffic using 
VLAN ID 0. No interface of the mesh node is part of that VLAN, i.e. I 
don't use further virtual VLAN interfaces on my mesh node. This VLAN 
traffic shall only be transparently forwarded over the mesh which - by the 
way - works flawlessly. 

As soon as I remove the WLAN interface from bat0 (batctl if del ath0) I 
get the following crash report:

/ # Unable to handle kernel paging request for data at address 0x00000020
Faulting instruction address: 0xc99c35a4
Oops: Kernel access of bad area, sig: 11 [#1]
MPC831x RDB
Modules linked in: batman_adv crc32c umac ath_dev(P) ath_rate_atheros(P) 
ath_dfs(P) ath_hal(P) asf(P) adf t23xsec2 t23xrm spi_mpc8xxx ipv6
NIP: c99c35a4 LR: c99c8ca4 CTR: c00253ac
REGS: c654ddd0 TRAP: 0300   Tainted: P            (2.6.32.26-svn1313)
MSR: 00009032 <EE,ME,IR,DR>  CR: 84002082  XER: 20000000
DAR: 00000020, DSISR: 20000000
TASK = c68680e0[1105] 'bat_events' THREAD: c654c000
GPR00: 00000020 c654de80 c68680e0 00000000 00008000 c0355078 c797f3e5 
00000080
GPR08: 00000001 00000000 00000001 00000000 44002088 100d5bc0 07ffa000 
00000000
GPR16: fa3fdfef 00000000 00000000 00000000 00000000 00000000 00020000 
07e958b0
GPR24: 00000000 00000174 c7b14320 c6d382c0 00000000 00000000 00200200 
00000000
NIP [c99c35a4] batadv_softif_vlan_free_ref+0x18/0x98 [batman_adv]
LR [c99c8ca4] batadv_tt_free+0xc0/0x30c [batman_adv]
Call Trace:
[c654de80] [c99c6690] batadv_tt_local_entry_free_ref+0x38/0x48 
[batman_adv] (unreliable)
[c654de90] [c99c8ca4] batadv_tt_free+0xc0/0x30c [batman_adv]
[c654dec0] [c99bdfb8] batadv_mesh_free+0x54/0x8c [batman_adv]
[c654dee0] [c99c34e8] batadv_softif_free+0x20/0x40 [batman_adv]
[c654df00] [c01fa2f0] netdev_run_todo+0x1a4/0x244
[c654df30] [c0206888] rtnl_unlock+0x10/0x20
[c654df40] [c01fad48] unregister_netdev+0x24/0x38
[c654df60] [c99c36cc] batadv_softif_destroy_finish+0x50/0x64 [batman_adv]
[c654df80] [c0031d44] worker_thread+0x108/0x1b0
[c654dfc0] [c0035ca0] kthread+0x78/0x7c
[c654dff0] [c0011444] kernel_thread+0x4c/0x68
Instruction dump:
4e800020 480088f1 7fe3fb78 3be00000 48008935 4bffffd8 9421fff0 7c0802a6
93e1000c 7c7f1b78 90010014 38030020 <7d200028> 3129ffff 7d20012d 40a2fff4
Kernel panic - not syncing: Fatal exception in interrupt
Rebooting in 180 seconds..


If I add the "if (vlan)" check, this doesn't happen anymore. The other 
thing which prevents the crash is not using the VLAN ID 0 traffic. 

Regards,
Andreas


"B.A.T.M.A.N" <b.a.t.m.a.n-bounces@lists.open-mesh.org> schrieb am 
19.03.2015 17:01:33:

> Von: Antonio Quartulli <antonio@meshcoding.com>
> An: b.a.t.m.a.n@lists.open-mesh.org, 
> Datum: 19.03.2015 17:03
> Betreff: Re: [B.A.T.M.A.N.] [PATCH] batman-adv: bugfix for kernel 
> crash in batadv_tt_local_table_free
> Gesendet von: "B.A.T.M.A.N" <b.a.t.m.a.n-bounces@lists.open-mesh.org>
> 
> Hi Andreas,
> 
> On 19/03/15 16:46, Andreas Pape wrote:
> > This missing check lead to a kernel crash when a hard_if is removed on 
a 
> > node forwarding
> > untagged and tagged traffic (VLANID 0) to and from the mesh network.
> > 
> 
> Did you actually see the crash? if so, can you please report the 
stacktrace?
> 
> When creating bat0 (untagged interface), a "fake" vlan object is created
> with vid = BATADV_NO_FLAGS, therefore in this context the object "vlan"
> should never be NULL because there is always an object to retrieve.
> 
> Cheers,
> 
> > Signed-off-by: Andreas Pape <apape@phoenixcontact.com>
> > ---
> >  translation-table.c |    6 ++++--
> >  1 files changed, 4 insertions(+), 2 deletions(-)
> > 
> > diff --git a/translation-table.c b/translation-table.c
> > index b20812b..4d3ab8d 100644
> > --- a/translation-table.c
> > +++ b/translation-table.c
> > @@ -1143,8 +1143,10 @@ static void batadv_tt_local_table_free(struct 
> > batadv_priv *bat_priv)
> >                         /* decrease the reference held for this vlan 
*/
> >                         vlan = batadv_softif_vlan_get(bat_priv,
> >  tt_common_entry->vid);
> > -                       batadv_softif_vlan_free_ref(vlan);
> > -                       batadv_softif_vlan_free_ref(vlan);
> > +                       if (vlan) {
> > +                               batadv_softif_vlan_free_ref(vlan);
> > +                               batadv_softif_vlan_free_ref(vlan);
> > +                       }
> > 
> >                         batadv_tt_local_entry_free_ref(tt_local);
> >                 }
> > 
> 
> -- 
> Antonio Quartulli
> 
> [Anhang "signature.asc" gelöscht von Andreas Pape/Phoenix Contact] 


..................................................................
PHOENIX CONTACT ELECTRONICS GmbH

Sitz der Gesellschaft / registered office of the company: 31812 Bad Pyrmont
USt-Id-Nr.: DE811742156
Amtsgericht Hannover HRB 100528 / district court Hannover HRB 100528
Geschäftsführer / Executive Board: Roland Bent, Dr. Martin Heubeck
  
Antonio Quartulli April 22, 2015, 2:24 a.m. UTC | #3
Hi Adreas,

sorry for waiting so long before getting back to you, but this bug
slipped away of my todo list..

Anyhow, I have some questions inline:

On 19/03/15 17:54, Andreas Pape wrote:
> Hi Antonio,
> 
> I saw the crash, but only in the case when I add the non-IP traffic

What is this non-IP traffic? Can you mention what type of traffic are
you sending?

> using VLAN ID 0. 

Are you saying you are creating a bat0.0 ? If not, can you clarify the
meaning of VLAN ID 0 ?


Thanks!
Cheers,
  

Patch

diff --git a/translation-table.c b/translation-table.c
index b20812b..4d3ab8d 100644
--- a/translation-table.c
+++ b/translation-table.c
@@ -1143,8 +1143,10 @@  static void batadv_tt_local_table_free(struct 
batadv_priv *bat_priv)
                        /* decrease the reference held for this vlan */
                        vlan = batadv_softif_vlan_get(bat_priv,
 tt_common_entry->vid);
-                       batadv_softif_vlan_free_ref(vlan);
-                       batadv_softif_vlan_free_ref(vlan);
+                       if (vlan) {
+                               batadv_softif_vlan_free_ref(vlan);
+                               batadv_softif_vlan_free_ref(vlan);
+                       }
 
                        batadv_tt_local_entry_free_ref(tt_local);