batman-adv: protect bonding with rcu locks

Message ID 20101230020917.GA4707@pandem0nium (mailing list archive)
State Superseded, archived

Commit Message

Simon Wunderlich Dec. 30, 2010, 2:09 a.m. UTC
Bonding / alternating candidates need to be secured by rcu locks
as well. This patch therefore converts the bonding list
from a plain pointer list to an rcu securable list and takes a
reference on each bonding candidate.

Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>
---
 originator.c |   17 +++++++-
 routing.c    |  140 +++++++++++++++++++++++++++++++++-------------------------
 types.h      |    4 +-
 unicast.c    |    9 +---
 4 files changed, 100 insertions(+), 70 deletions(-)
  

Comments

Marek Lindner Jan. 16, 2011, 11:35 a.m. UTC | #1
On Thursday 30 December 2010 03:09:17 Simon Wunderlich wrote:
> bonding / alternating candidates need to be secured by rcu locks
> as well. This patch therefore converts the bonding list
> from a plain pointer list to a rcu securable lists and references
> the bonding candidates.

Thanks for your patch! As the bonding / alternating candidate list is the last 
item on the way to an orig_hash_lock free kernel module this patch is a very 
welcome one. However, first tests revealed some memory leaks due to misbehaving 
reference counters. Therefore I took the time to dive deeper into this code 
section and propose an addition to your patch. The changes include:

* The bonding / alternating candidate list is not destroyed & re-created with 
each incoming OGM but candidates are added & deleted in a dynamic fashion.
* The memory leaks have been fixed / a workaround has been found. The "free 
neighbors when an interface is deactivated" patch is also required to get the 
full benefit of these memory leak fixes. 
* I introduced some style changes which are meant to increase readability & 
maintainability. Feedback is welcome.  ;-)
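
To make the first point concrete, here is a minimal userspace sketch of adding and deleting candidates one at a time instead of rebuilding the list. It is not the code from the proposed addition: the list helpers are simplified copies of the kernel's, and candidate_update(), struct neigh and struct orig are hypothetical names used for illustration only. Locking and RCU are left out entirely.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified userspace copy of the kernel's doubly linked list. */
struct list_head {
	struct list_head *next, *prev;
};

static void INIT_LIST_HEAD(struct list_head *l)
{
	l->next = l;
	l->prev = l;
}

static bool list_empty(const struct list_head *h)
{
	return h->next == h;
}

static void list_add(struct list_head *entry, struct list_head *head)
{
	entry->next = head->next;
	entry->prev = head;
	head->next->prev = entry;
	head->next = entry;
}

/* Unlink the entry and re-initialise it so list_empty() is true again. */
static void list_del_init(struct list_head *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
	INIT_LIST_HEAD(entry);
}

/* Hypothetical, stripped-down stand-ins for neigh_node / orig_node. */
struct neigh {
	int refcount;
	struct list_head bonding_list;
};

struct orig {
	int candidates;
	struct list_head selected;
};

/* Add or remove a single neighbor depending on whether it qualifies now.
 * Each list membership is paired with exactly one reference. */
static void candidate_update(struct orig *o, struct neigh *n, bool qualifies)
{
	bool listed = !list_empty(&n->bonding_list);

	if (qualifies && !listed) {
		list_add(&n->bonding_list, &o->selected);
		n->refcount++;
		o->candidates++;
	} else if (!qualifies && listed) {
		list_del_init(&n->bonding_list);
		n->refcount--;
		o->candidates--;
	}
}
```

Calling candidate_update() repeatedly with the same verdict is a no-op, so the reference count and bond.candidates only change when a neighbor actually enters or leaves the list.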

Note: One of the memory leaks turned out to be part of a bigger problem which 
we may or may not have in other code sections as well. purge_orig_neighbors() 
called call_rcu() on the same struct more than once within a short time period, 
so that each consecutive call overwrote the previous one. We have to look into 
this.
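
The effect described in this note can be reproduced with a small userspace model. This is only a sketch under simplifying assumptions: the model_* names are hypothetical, the pending callbacks are modelled as a plain singly linked queue, and the real RCU implementation is considerably more involved. What the model shares with the kernel is that call_rcu() re-initialises the rcu head embedded in the object, so queueing the same object a second time rewrites a link that is still in use.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the rcu head embedded in an object. */
struct model_rcu_head {
	struct model_rcu_head *next;
	void (*func)(struct model_rcu_head *head);
};

/* Pending callbacks: singly linked queue with a tail pointer. */
struct model_rcu_queue {
	struct model_rcu_head *head;
	struct model_rcu_head **tail;
};

struct model_neigh {
	int refcount;
	struct model_rcu_head rcu;
};

static void model_queue_init(struct model_rcu_queue *q)
{
	q->head = NULL;
	q->tail = &q->head;
}

/* Models call_rcu(): (re)initialise the embedded head, then append it. */
static void model_call_rcu(struct model_rcu_queue *q,
			   struct model_rcu_head *h,
			   void (*func)(struct model_rcu_head *))
{
	h->func = func;
	h->next = NULL;
	*q->tail = h;
	q->tail = &h->next;
}

/* Number of callbacks reachable from the queue head (bounded walk). */
static int model_queue_len(const struct model_rcu_queue *q)
{
	const struct model_rcu_head *h;
	int n = 0;

	for (h = q->head; h && n < 16; h = h->next)
		n++;
	return n;
}

static void model_free_cb(struct model_rcu_head *head)
{
	(void)head; /* a real callback would free the containing object */
}

/* Drop one reference; queue the callback only when the last one is gone. */
static void model_neigh_put(struct model_rcu_queue *q, struct model_neigh *n)
{
	if (--n->refcount == 0)
		model_call_rcu(q, &n->rcu, model_free_cb);
}

/* Queue a, b, then a again: resetting a's link cuts b off the queue. */
static int demo_double_queue(void)
{
	struct model_rcu_queue q;
	struct model_neigh a = { 1, { NULL, NULL } };
	struct model_neigh b = { 1, { NULL, NULL } };

	model_queue_init(&q);
	model_call_rcu(&q, &a.rcu, model_free_cb);
	model_call_rcu(&q, &b.rcu, model_free_cb);
	model_call_rcu(&q, &a.rcu, model_free_cb); /* the bug */
	return model_queue_len(&q);
}

/* One reference per list membership: each object is queued exactly once. */
static int demo_refcounted(void)
{
	struct model_rcu_queue q;
	struct model_neigh a = { 2, { NULL, NULL } }; /* neigh list + bonding list */
	struct model_neigh b = { 1, { NULL, NULL } };

	model_queue_init(&q);
	model_neigh_put(&q, &a); /* leaves the bonding list */
	model_neigh_put(&q, &b);
	model_neigh_put(&q, &a); /* leaves the neighbor list */
	return model_queue_len(&q);
}
```

In this model demo_double_queue() ends with a single reachable callback, so the callback for b never runs and b is leaked, while demo_refcounted() keeps both. One possible way to avoid the problem, assuming every list holds its own reference, is therefore to pair each removal with a reference drop and to call call_rcu() only from the release function.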

While reorganizing the code I stumbled over these lines which are part of the 
interference check:

/* we only care if the other candidate is even considered as candidate. */
if (!list_empty(&tmp_neigh_node2->bonding_list))
         continue;

This seems wrong in this context: the test looks inverted. If the list is 
empty then this neighbor is not part of the candidate list. Wouldn't you agree?
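
The semantics behind this question can be checked with a small userspace sketch. The list helpers below are simplified copies of the kernel's, and is_bonding_candidate() and skip_in_interference_check() are hypothetical names introduced only for illustration. A node that was set up with INIT_LIST_HEAD() and never added to a list points to itself, so list_empty() on it returns true.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified userspace copy of the kernel's doubly linked list. */
struct list_head {
	struct list_head *next, *prev;
};

static void INIT_LIST_HEAD(struct list_head *l)
{
	l->next = l;
	l->prev = l;
}

static bool list_empty(const struct list_head *h)
{
	return h->next == h;
}

static void list_add(struct list_head *entry, struct list_head *head)
{
	entry->next = head->next;
	entry->prev = head;
	head->next->prev = entry;
	head->next = entry;
}

/* Hypothetical, stripped-down stand-in for struct neigh_node. */
struct neigh {
	struct list_head bonding_list;
};

/* A neighbor is a candidate when its node is linked into some list. */
static bool is_bonding_candidate(const struct neigh *n)
{
	return !list_empty(&n->bonding_list);
}

/* Corrected sense of the check: skip neighbors that are NOT candidates. */
static bool skip_in_interference_check(const struct neigh *n)
{
	return list_empty(&n->bonding_list);
}
```

With these definitions the quoted condition, `!list_empty(...)` followed by `continue`, skips exactly the neighbors that are candidates, which is the opposite of what the comment describes. Note that this test only works while unlinked nodes still point to themselves; the kernel's list_del_rcu() does not re-initialise the removed entry, so relying on list_empty() after such a removal needs extra care.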

Regards,
Marek
  
Simon Wunderlich Jan. 16, 2011, 5:07 p.m. UTC | #2
Hello Marek,

thanks for the review!

On Sun, Jan 16, 2011 at 12:35:53PM +0100, Marek Lindner wrote:
> On Thursday 30 December 2010 03:09:17 Simon Wunderlich wrote:
> [...]
> The changes include:
> 
> * The bonding / alternating candidate list is not destroyed & re-created with 
> each incoming OGM but candidates are added & deleted in a dynamic fashion.

This is a good idea. As discussed in private, this would add some latency in 
the update process, but this should not harm the bonding.

> * The memory leaks have been fixed / a workaround has been found. The "free 
> neighbors when an interface is deactivated" patch is also required to get the 
> full benefit of these memory leak fixes. 

Good job!

> * I introduced some style changes which are meant to increase readability & 
> maintainability. Feedback is welcome.  ;-)

Fine with me. :)

> While reorganizing the code I stumbled over these lines which are part of the 
> interference check:
> 
> /* we only care if the other candidate is even considered as candidate. */
> if (!list_empty(&tmp_neigh_node2->bonding_list))
>          continue;
> 
> This seems wrong in this context. If the list is empty then this neighbor is 
> not part of the candidate list. Wouldn't you agree ?

Agreed, this is wrong. Your patch fixes that, I see.

I've also checked your patch in my VM with 3 hosts and did some ping tests. 
No regressions found, the bonding and alternating worked fine. I did not look
into memleaks though, but feel free to add my signoff and/or merge the patches.

Thanks,
	Simon
  

Patch

diff --git a/batman-adv/originator.c b/batman-adv/originator.c
index 899ab0b..3e18488 100644
--- a/batman-adv/originator.c
+++ b/batman-adv/originator.c
@@ -88,6 +88,7 @@  struct neigh_node *create_neighbor(struct orig_node *orig_node,
 		return NULL;
 
 	INIT_HLIST_NODE(&neigh_node->list);
+	INIT_LIST_HEAD(&neigh_node->bonding_list);
 
 	memcpy(neigh_node->addr, neigh, ETH_ALEN);
 	neigh_node->orig_node = orig_neigh_node;
@@ -103,13 +104,20 @@  struct neigh_node *create_neighbor(struct orig_node *orig_node,
 void orig_node_free_ref(struct kref *refcount)
 {
 	struct hlist_node *node, *node_tmp;
-	struct neigh_node *neigh_node;
+	struct neigh_node *neigh_node, *tmp_neigh_node;
 	struct orig_node *orig_node;
 
 	orig_node = container_of(refcount, struct orig_node, refcount);
 
 	spin_lock_bh(&orig_node->neigh_list_lock);
 
+	/* for all bonding members ... */
+	list_for_each_entry_safe(neigh_node, tmp_neigh_node,
+				 &orig_node->bond.selected, bonding_list) {
+		list_del_rcu(&neigh_node->bonding_list);
+		call_rcu(&neigh_node->rcu, neigh_node_free_rcu);
+	}
+
 	/* for all neighbors towards this originator ... */
 	hlist_for_each_entry_safe(neigh_node, node, node_tmp,
 				  &orig_node->neigh_list, list) {
@@ -202,6 +210,7 @@  struct orig_node *get_orig_node(struct bat_priv *bat_priv, uint8_t *addr)
 		return NULL;
 
 	INIT_HLIST_HEAD(&orig_node->neigh_list);
+	INIT_LIST_HEAD(&orig_node->bond.selected);
 	spin_lock_init(&orig_node->ogm_cnt_lock);
 	spin_lock_init(&orig_node->neigh_list_lock);
 	kref_init(&orig_node->refcount);
@@ -285,6 +294,12 @@  static bool purge_orig_neighbors(struct bat_priv *bat_priv,
 			neigh_purged = true;
 
 			hlist_del_rcu(&neigh_node->list);
+
+			if (!list_empty(&neigh_node->bonding_list)) {
+				orig_node->bond.candidates--;
+				list_del_rcu(&neigh_node->bonding_list);
+				call_rcu(&neigh_node->rcu, neigh_node_free_rcu);
+			}
 			call_rcu(&neigh_node->rcu, neigh_node_free_rcu);
 		} else {
 			if ((!*best_neigh_node) ||
diff --git a/batman-adv/routing.c b/batman-adv/routing.c
index 557e7d7..ad8d237 100644
--- a/batman-adv/routing.c
+++ b/batman-adv/routing.c
@@ -517,7 +517,6 @@  void update_bonding_candidates(struct bat_priv *bat_priv,
 	int best_tq;
 	struct hlist_node *node, *node2;
 	struct neigh_node *tmp_neigh_node, *tmp_neigh_node2;
-	struct neigh_node *first_candidate, *last_candidate;
 
 	/* update the candidates for this originator */
 	if (!orig_node->router) {
@@ -525,6 +524,7 @@  void update_bonding_candidates(struct bat_priv *bat_priv,
 		return;
 	}
 
+	spin_lock_bh(&orig_node->neigh_list_lock);
 	best_tq = orig_node->router->tq_avg;
 
 	/* update bond.candidates */
@@ -535,19 +535,14 @@  void update_bonding_candidates(struct bat_priv *bat_priv,
 	 * as "bonding partner" */
 
 	/* first, zero the list */
-	rcu_read_lock();
-	hlist_for_each_entry_rcu(tmp_neigh_node, node,
-				 &orig_node->neigh_list, list) {
-		tmp_neigh_node->next_bond_candidate = NULL;
+	list_for_each_entry_safe(tmp_neigh_node, tmp_neigh_node2,
+				 &orig_node->bond.selected, bonding_list) {
+		list_del_rcu(&tmp_neigh_node->bonding_list);
+		kref_put(&tmp_neigh_node->refcount, neigh_node_free_ref);
 	}
-	rcu_read_unlock();
 
-	first_candidate = NULL;
-	last_candidate = NULL;
-
-	rcu_read_lock();
 	hlist_for_each_entry_rcu(tmp_neigh_node, node,
-				 &orig_node->neigh_list, list) {
+		&orig_node->neigh_list, list) {
 
 		/* only consider if it has the same primary address ...  */
 		if (memcmp(orig_node->orig,
@@ -572,7 +567,7 @@  void update_bonding_candidates(struct bat_priv *bat_priv,
 
 			/* we only care if the other candidate is even
 			 * considered as candidate. */
-			if (!tmp_neigh_node2->next_bond_candidate)
+			if (!list_empty(&tmp_neigh_node2->bonding_list))
 				continue;
 
 
@@ -589,24 +584,16 @@  void update_bonding_candidates(struct bat_priv *bat_priv,
 		if (interference_candidate)
 			continue;
 
-		if (!first_candidate) {
-			first_candidate = tmp_neigh_node;
-			tmp_neigh_node->next_bond_candidate = first_candidate;
-		} else
-			tmp_neigh_node->next_bond_candidate = last_candidate;
-
-		last_candidate = tmp_neigh_node;
+		list_add_rcu(&tmp_neigh_node->bonding_list,
+				&orig_node->bond.selected);
+		kref_get(&tmp_neigh_node->refcount);
 
 		candidates++;
 	}
-	rcu_read_unlock();
-
-	if (candidates > 0) {
-		first_candidate->next_bond_candidate = last_candidate;
-		orig_node->bond.selected = first_candidate;
-	}
-
 	orig_node->bond.candidates = candidates;
+
+	spin_unlock_bh(&orig_node->neigh_list_lock);
+
 }
 
 void receive_bat_packet(struct ethhdr *ethhdr,
@@ -1110,16 +1097,18 @@  out:
 }
 
 /* find a suitable router for this originator, and use
- * bonding if possible. */
+ * bonding if possible. increases the found neighbors
+ * refcount.*/
 struct neigh_node *find_router(struct bat_priv *bat_priv,
 			       struct orig_node *orig_node,
 			       struct batman_if *recv_if)
 {
 	struct orig_node *primary_orig_node;
 	struct orig_node *router_orig;
-	struct neigh_node *router, *first_candidate, *best_router;
+	struct neigh_node *router, *first_candidate, *tmp_neigh_node;
 	static uint8_t zero_mac[ETH_ALEN] = {0, 0, 0, 0, 0, 0};
 	int bonding_enabled;
+	int best_router_tq;
 
 	if (!orig_node)
 		return NULL;
@@ -1132,15 +1121,23 @@  struct neigh_node *find_router(struct bat_priv *bat_priv,
 
 	bonding_enabled = atomic_read(&bat_priv->bonding);
 
-	if ((!recv_if) && (!bonding_enabled))
-		return orig_node->router;
-
+	rcu_read_lock();
+	/* select default router to output */
+	router = orig_node->router;
 	router_orig = orig_node->router->orig_node;
+	if (!router_orig) {
+		rcu_read_unlock();
+		return NULL;
+	}
+
+
+	if ((!recv_if) && (!bonding_enabled))
+		goto return_router;
 
 	/* if we have something in the primary_addr, we can search
 	 * for a potential bonding candidate. */
 	if (memcmp(router_orig->primary_addr, zero_mac, ETH_ALEN) == 0)
-		return orig_node->router;
+		goto return_router;
 
 	/* find the orig_node which has the primary interface. might
 	 * even be the same as our router_orig in many cases */
@@ -1149,60 +1146,83 @@  struct neigh_node *find_router(struct bat_priv *bat_priv,
 				router_orig->orig, ETH_ALEN) == 0) {
 		primary_orig_node = router_orig;
 	} else {
-		rcu_read_lock();
 		primary_orig_node = hash_find(bat_priv->orig_hash, compare_orig,
 					       choose_orig,
 					       router_orig->primary_addr);
-		rcu_read_unlock();
-
 		if (!primary_orig_node)
-			return orig_node->router;
+			goto return_router;
 	}
-
 	/* with less than 2 candidates, we can't do any
 	 * bonding and prefer the original router. */
 
 	if (primary_orig_node->bond.candidates < 2)
-		return orig_node->router;
+		goto return_router;
 
 
 	/* all nodes between should choose a candidate which
 	 * is is not on the interface where the packet came
 	 * in. */
-	first_candidate = primary_orig_node->bond.selected;
-	router = first_candidate;
+
+	first_candidate = NULL;
+	router = NULL;
 
 	if (bonding_enabled) {
 		/* in the bonding case, send the packets in a round
 		 * robin fashion over the remaining interfaces. */
-		do {
+
+		list_for_each_entry_rcu(tmp_neigh_node,
+			&primary_orig_node->bond.selected, bonding_list) {
+			if (!first_candidate)
+				first_candidate = tmp_neigh_node;
 			/* recv_if == NULL on the first node. */
-			if (router->if_incoming != recv_if)
+			if (tmp_neigh_node->if_incoming != recv_if) {
+				router = tmp_neigh_node;
 				break;
+			}
+		}
 
-			router = router->next_bond_candidate;
-		} while (router != first_candidate);
+		/* use the first candidate if nothing was found. */
+		if (!router)
+			router = first_candidate;
 
-		primary_orig_node->bond.selected = router->next_bond_candidate;
+		/* selected should point to the next element
+		 * after the current router */
+		spin_lock_bh(&primary_orig_node->neigh_list_lock);
+		/* this is a list_move(), which unfortunately
+		 * does not exist as rcu version */
+		list_del_rcu(&primary_orig_node->bond.selected);
+		list_add_rcu(&primary_orig_node->bond.selected,
+				&router->bonding_list);
+		spin_unlock_bh(&primary_orig_node->neigh_list_lock);
 
 	} else {
 		/* if bonding is disabled, use the best of the
 		 * remaining candidates which are not using
 		 * this interface. */
-		best_router = first_candidate;
+		best_router_tq = 0;
+		list_for_each_entry_rcu(tmp_neigh_node,
+			&primary_orig_node->bond.selected, bonding_list) {
+			if (!first_candidate)
+				first_candidate = tmp_neigh_node;
 
-		do {
 			/* recv_if == NULL on the first node. */
-			if ((router->if_incoming != recv_if) &&
-				(router->tq_avg > best_router->tq_avg))
-					best_router = router;
+			if (tmp_neigh_node->if_incoming != recv_if)
+				/* if we don't have a router yet
+				 * or this one is better, choose it. */
+				if ((!router) ||
+				(tmp_neigh_node->tq_avg > router->tq_avg)) {
+					router = tmp_neigh_node;
+					best_router_tq = 0;
+				}
+		}
 
-			router = router->next_bond_candidate;
-		} while (router != first_candidate);
-
-		router = best_router;
+		/* use the first candidate if nothing was found. */
+		if (!router)
+			router = first_candidate;
 	}
-
+return_router:
+	kref_get(&router->refcount);
+	rcu_read_unlock();
 	return router;
 }
 
@@ -1210,7 +1230,7 @@  static int check_unicast_packet(struct sk_buff *skb, int hdr_size)
 {
 	struct ethhdr *ethhdr;
 
-	/* drop packet if it has not necessary minimum size */
+/* drop packet if it has not necessary minimum size */
 	if (unlikely(!pskb_may_pull(skb, hdr_size)))
 		return -1;
 
@@ -1262,13 +1282,13 @@  int route_unicast_packet(struct sk_buff *skb, struct batman_if *recv_if,
 		goto unlock;
 
 	kref_get(&orig_node->refcount);
+	rcu_read_unlock();
+
+	/* find_router() increases neigh_nodes refcount if found. */
 	neigh_node = find_router(bat_priv, orig_node, recv_if);
 
 	if (!neigh_node)
-		goto unlock;
-
-	kref_get(&neigh_node->refcount);
-	rcu_read_unlock();
+		goto out;
 
 	/* create a copy of the skb, if needed, to modify it. */
 	if (skb_cow(skb, sizeof(struct ethhdr)) < 0)
diff --git a/batman-adv/types.h b/batman-adv/types.h
index 52b6b08..8264050 100644
--- a/batman-adv/types.h
+++ b/batman-adv/types.h
@@ -92,7 +92,7 @@  struct orig_node {
 	spinlock_t ogm_cnt_lock; /* protects ogm counter */
 	struct {
 		uint8_t candidates;
-		struct neigh_node *selected;
+		struct list_head selected;
 	} bond;
 };
 
@@ -116,7 +116,7 @@  struct neigh_node {
 	uint8_t tq_index;
 	uint8_t tq_avg;
 	uint8_t last_ttl;
-	struct neigh_node *next_bond_candidate;
+	struct list_head bonding_list;
 	unsigned long last_valid;
 	unsigned long real_bits[NUM_WORDS];
 	struct kref refcount;
diff --git a/batman-adv/unicast.c b/batman-adv/unicast.c
index 67bed2d..cfe08dd 100644
--- a/batman-adv/unicast.c
+++ b/batman-adv/unicast.c
@@ -314,14 +314,11 @@  trans_search:
 	orig_node = transtable_search(bat_priv, ethhdr->h_dest);
 
 find_router:
-	rcu_read_lock();
+	/* find_router() increases neigh_nodes refcount if found. */
 	neigh_node = find_router(bat_priv, orig_node, NULL);
 
 	if (!neigh_node)
-		goto unlock;
-
-	kref_get(&neigh_node->refcount);
-	rcu_read_unlock();
+		goto out;
 
 	if (neigh_node->if_incoming->if_status != IF_ACTIVE)
 		goto out;
@@ -353,8 +350,6 @@  find_router:
 	ret = 0;
 	goto out;
 
-unlock:
-	rcu_read_unlock();
 out:
 	if (neigh_node)
 		kref_put(&neigh_node->refcount, neigh_node_free_ref);
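
The round-robin step in find_router() above moves the list head bond.selected so that it sits directly behind the router that was just chosen, which makes the next walk start at the following candidate. The userspace sketch below illustrates only that rotation. It makes several assumptions: the list helpers are simplified non-RCU copies of the kernel's, interfaces are plain integers with 0 standing in for recv_if == NULL, and struct cand and pick_round_robin() are hypothetical names. Locking and reference counting are omitted.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace copy of the kernel's doubly linked list. */
struct list_head {
	struct list_head *next, *prev;
};

static void INIT_LIST_HEAD(struct list_head *l)
{
	l->next = l;
	l->prev = l;
}

/* Insert entry directly after pos. */
static void list_add(struct list_head *entry, struct list_head *pos)
{
	entry->next = pos->next;
	entry->prev = pos;
	pos->next->prev = entry;
	pos->next = entry;
}

/* Insert entry directly before head, i.e. at the end of the list. */
static void list_add_tail(struct list_head *entry, struct list_head *head)
{
	list_add(entry, head->prev);
}

static void list_del(struct list_head *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
}

/* Hypothetical, stripped-down bonding candidate. */
struct cand {
	int iface;
	struct list_head bonding_list;
};

#define cand_of(ptr) \
	((struct cand *)((char *)(ptr) - offsetof(struct cand, bonding_list)))

/* Pick the first candidate not on recv_iface, then rotate the list head
 * behind it. Falls back to the first candidate; NULL if the list is empty. */
static struct cand *pick_round_robin(struct list_head *selected, int recv_iface)
{
	struct list_head *pos;
	struct cand *router = NULL, *first = NULL;

	for (pos = selected->next; pos != selected; pos = pos->next) {
		struct cand *c = cand_of(pos);

		if (!first)
			first = c;
		if (c->iface != recv_iface) {
			router = c;
			break;
		}
	}

	if (!router)
		router = first;
	if (!router)
		return NULL;

	/* "list_move" of the head: unlink it and re-insert it after router */
	list_del(selected);
	list_add(selected, &router->bonding_list);
	return router;
}
```

With three candidates a, b and c added in that order and no receiving interface, successive calls return a, b, c and then a again.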