From patchwork Sat Dec 13 22:32:15 2014
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Linus Lüssing
X-Patchwork-Id: 4213
From: Linus Lüssing
To: b.a.t.m.a.n@lists.open-mesh.org
Date: Sat, 13 Dec 2014 23:32:15 +0100
Message-Id: <1418509935-11849-1-git-send-email-linus.luessing@c0d3.blue>
X-Mailer: git-send-email 1.9.1
Subject: [B.A.T.M.A.N.] [PATCH maint] batman-adv: fix potential TT client + orig-node memory leak
List-Id: The list for a Better Approach To Mobile Ad-hoc Networking

This patch fixes a potential memory leak which can occur once an
originator times out. On timeout, the corresponding global translation
table entry might not get purged correctly. Furthermore, the unpurged
TT entry will cause its orig-node to leak, too.
This can additionally keep the new multicast optimization feature from
kicking in, because the leak leaves a counter with a bogus value.

In the wild, with larger mesh networks, we saw this leak quite
regularly, resulting in routers rebooting or processes being killed.

This was because of a combination of two bugs: the bug fixed by commit
"batman-adv: fix delayed foreign originator recognition" (8a2ad5204674)
amplified this memory leak heavily. Since that commit I'd expect it to
happen rarely, probably only in paused and resumed VMs and devices
previously in stand-by.

The issue this patch fixes is caused by batadv_orig_node_free_rcu()
never being called because of not yet released references to the
orig-node. These references were supposed to be released through
batadv_orig_node_free_rcu()->batadv_tt_global_del_orig(). The callback
only runs once the last reference is gone, so the references it was
meant to release kept it from ever running.

Fix the issue by moving batadv_tt_global_del_orig() out of the rcu
callback.

Signed-off-by: Linus Lüssing
Acked-by: Antonio Quartulli
---
 originator.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/originator.c b/originator.c
index 648bdba..bea8198 100644
--- a/originator.c
+++ b/originator.c
@@ -570,9 +570,6 @@ static void batadv_orig_node_free_rcu(struct rcu_head *rcu)
 
 	batadv_frag_purge_orig(orig_node, NULL);
 
-	batadv_tt_global_del_orig(orig_node->bat_priv, orig_node, -1,
-				  "originator timed out");
-
 	if (orig_node->bat_priv->bat_algo_ops->bat_orig_free)
 		orig_node->bat_priv->bat_algo_ops->bat_orig_free(orig_node);
 
@@ -978,6 +975,9 @@ static void _batadv_purge_orig(struct batadv_priv *bat_priv)
 			if (batadv_purge_orig_node(bat_priv, orig_node)) {
 				batadv_gw_node_delete(bat_priv, orig_node);
 				hlist_del_rcu(&orig_node->hash_entry);
+				batadv_tt_global_del_orig(orig_node->bat_priv,
+							  orig_node, -1,
+							  "originator timed out");
 				batadv_orig_node_free_ref(orig_node);
 				continue;
 			}
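
For readers who want to see why the ordering matters, here is a minimal
user-space sketch of the reference cycle described in the commit message.
It is not batman-adv code: struct model_orig and all model_* functions
are hypothetical stand-ins, the refcount is a plain int rather than an
atomic, and RCU deferral is not modelled. It assumes that each global TT
entry holds one reference on its originator and that the hash table
holds one more.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; none of these names are batman-adv API. */
struct model_orig {
	int refcount;          /* all references held on the node */
	int tt_entries;        /* global TT entries, one reference each */
	bool purge_in_release; /* true models the pre-patch placement */
	bool freed;            /* set by the release step */
};

static void model_orig_put(struct model_orig *orig);

/* Stand-in for the TT purge: drop every TT entry and its reference. */
static void model_tt_del_orig(struct model_orig *orig)
{
	while (orig->tt_entries > 0) {
		orig->tt_entries--;
		model_orig_put(orig);
	}
}

/* Stand-in for the release callback: reached only at refcount zero. */
static void model_orig_release(struct model_orig *orig)
{
	if (orig->purge_in_release)
		model_tt_del_orig(orig); /* too late: no TT entry can be left here */
	orig->freed = true;
}

/* Drop one reference and release the node when the last one is gone. */
static void model_orig_put(struct model_orig *orig)
{
	assert(orig->refcount > 0);
	if (--orig->refcount == 0)
		model_orig_release(orig);
}

/*
 * Purge one timed-out originator that still has two TT entries.
 * Returns true if the node ended up freed, false if it leaked.
 */
bool model_purge_orig(bool purge_before_put)
{
	struct model_orig orig = {
		.refcount = 1, /* the hash table's reference */
		.tt_entries = 0,
		.purge_in_release = !purge_before_put,
		.freed = false,
	};

	for (int i = 0; i < 2; i++) {
		orig.tt_entries++;
		orig.refcount++; /* each TT entry takes a reference */
	}

	if (purge_before_put)
		model_tt_del_orig(&orig); /* placement after this patch */

	model_orig_put(&orig); /* drop the hash table's reference */
	return orig.freed;
}
```

In this model, model_purge_orig(false) leaves the node unfreed with both
TT entries still attached, while model_purge_orig(true) frees it, which
mirrors the effect of moving the purge in front of the final reference
drop.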