routing loops on interconnected routers / adhoc + ethernet

Message ID 201203041613.05681.lindner_marek@yahoo.de (mailing list archive)
State Not Applicable, archived

Commit Message

Marek Lindner March 4, 2012, 8:13 a.m. UTC
  On Sunday, March 04, 2012 10:30:14 Nicolás Echániz wrote:
> When marisa-mr_wlan0 is disabled, pings lose 0% end-to-end even with
> big packets and a 0.1s interval, and transfer speed is quite good, around
> 5Mbit/s end to end and 30Mbit/s on the best links.
> 
> 
> One other thing that took some time to learn (the hard way) is that the
> network won't fully recover when we revert the changes in marisa-mr.
> Loops are gone and pings are normal, but transfer speed sucks.
> All related nodes (marisa-mr, marisa-blt, cisterna and czuk) need to be
> restarted for the transfer speed to get back to normal. We assumed this
> is some kind of problem with the wireless modules, ath9k and ath9k_htc
> that we use, but it's just an assumption.
> 
> 
> 
> Let me know if you find anything unusual in the setup I sent you.

The setup looks good as far as I can tell. I backported 2 patches for 2011.4.0 
that we currently have in the pipeline. They are meant to address temporary 
routing loops. Somewhat similar to what you are experiencing but I am not 100% 
sure it will help in your case. Could you please apply these patches to 
batman-adv and install the patched binary on all nodes? It is essential that 
all the nodes run the same version to avoid odd routing behavior.

Let us know how it goes!

Cheers,
Marek
  

Comments

Nicolás Echániz March 4, 2012, 9:32 a.m. UTC | #1
On 03/04/2012 05:13 AM, Marek Lindner wrote:
> On Sunday, March 04, 2012 10:30:14 Nicolás Echániz wrote:

>> Let me know if you find anything unusual in the setup I sent you.
> 
> The setup looks good as far as I can tell. I backported 2 patches for 2011.4.0 
> that we currently have in the pipeline. They are meant to address temporary 
> routing loops. Somewhat similar to what you are experiencing but I am not 100% 
> sure it will help in your case. Could you please apply these patches to 
> batman-adv and install the patched binary on all nodes? It is essential that 
> all the nodes run the same version to avoid odd routing behavior.
> 
> Let us know how it goes!

Marek,

these experimental nodes were installed from a daily snapshot, and batman-adv
was installed with: opkg install kmod-batman-adv. Are these patches already
present in batman-adv 2012.0.0?

If so, we might just re-flash the routers with current trunk; right?

This will take a while because routers are already installed on
roof-tops and the network is in use by some neighbors, but it's still
experimental, so we will do what needs to be done to get the best
results possible.
  
Marek Lindner March 4, 2012, 10:52 a.m. UTC | #2
On Sunday, March 04, 2012 17:32:38 Nicolás Echániz wrote:
> these experimental nodes were installed from a daily snapshot, and batman-adv
> was installed with: opkg install kmod-batman-adv. Are these patches already
> present in batman-adv 2012.0.0?
> 
> If so, we might just re-flash the routers with current trunk; right?

No, these patches are not part of any release yet. You need to compile your 
own batman-adv with the patches applied. We have a document explaining how you 
should proceed: http://www.open-mesh.org/wiki/batman-adv/Building-with-openwrt


> This will take a while because routers are already installed on
> roof-tops and the network is in use by some neighbors, but it's still
> experimental, so we will do what needs to be done to get the best
> results possible.

There is no need to reflash all the routers. You can simply build a patched 
batman-adv package and install it on top of your current image. The important 
part is that you compile the batman-adv kernel module for exactly the same 
kernel you are running.

Regards,
Marek
  
Nicolás Echániz March 18, 2012, 6:31 a.m. UTC | #3
On 03/04/2012 07:52 AM, Marek Lindner wrote:
> On Sunday, March 04, 2012 17:32:38 Nicolás Echániz wrote:
>> these experimental nodes were installed from a daily snapshot, and batman-adv
>> was installed with: opkg install kmod-batman-adv. Are these patches already
>> present in batman-adv 2012.0.0?
>>
>> If so, we might just re-flash the routers with current trunk; right?
> 
> No, these patches are not part of any release yet. You need to compile your 
> own batman-adv with the patches applied. We have a document explaining how you 
> should proceed: http://www.open-mesh.org/wiki/batman-adv/Building-with-openwrt
> 
> 
>> This will take a while because routers are already installed on
>> roof-tops and the network is in use by some neighbors, but it's still
>> experimental, so we will do what needs to be done to get the best
>> results possible.
> 
> There is no need to reflash all the routers. You can simply build a patched 
> batman-adv package and install it on top of your current image. The important 
> part is that you compile the batman-adv kernel module for exactly the same 
> kernel you are running.

Hi Marek,

I've finally had the time to look into this again.

The routers have been updated to current OpenWRT trunk and the batman-adv
version has changed. Would it be too much of an inconvenience for you to
send me your patches for 2012.0.0?

thanks in advance.

NicoEchániz
  

Patch

From 4c01bde01c9977bdcf06cf4cdb16805c28f5e634 Mon Sep 17 00:00:00 2001
From: Marek Lindner <lindner_marek@yahoo.de>
Date: Sun, 29 Jan 2012 21:45:45 +0800
Subject: [PATCH 2/2] batman-adv: avoid temporary routing loops by being
 strict on forwarded OGMs

batman-adv would forward OGMs from non-best hops while replacing the TQ
and TTL values with the values from the best hop. In certain corner cases
this leads to a temporary routing loop.
This patch changes this behavior: Only packets from best next hops are
forwarded - TQ and TTL values won't be replaced anymore. However, the protocol
needs to rebroadcast OGMs from single hop neighbors regardless of whether or
not they are the best hop. To handle this case a new flag is introduced to
alert neighboring nodes about the forwarded OGM that is not from my best
next hop. It is to be discarded by all nodes except for the one originating
the OGM.

Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
---
 bat_iv_ogm.c |   61 ++++++++++++++++++++++++++++++---------------------------
 packet.h     |    1 +
 2 files changed, 33 insertions(+), 29 deletions(-)

diff --git a/bat_iv_ogm.c b/bat_iv_ogm.c
index 52eb052..a0aa3b3 100644
--- a/bat_iv_ogm.c
+++ b/bat_iv_ogm.c
@@ -465,11 +465,10 @@  static void bat_ogm_forward(struct orig_node *orig_node,
 			    const struct ethhdr *ethhdr,
 			    struct batman_ogm_packet *batman_ogm_packet,
 			    bool is_single_hop_neigh,
+			    bool is_from_best_next_hop,
 			    struct hard_iface *if_incoming)
 {
 	struct bat_priv *bat_priv = netdev_priv(if_incoming->soft_iface);
-	struct neigh_node *router;
-	uint8_t in_tq, in_ttl, tq_avg = 0;
 	uint8_t tt_num_changes;
 
 	if (batman_ogm_packet->ttl <= 1) {
@@ -477,41 +476,31 @@  static void bat_ogm_forward(struct orig_node *orig_node,
 		return;
 	}
 
-	router = orig_node_get_router(orig_node);
+	if (!is_from_best_next_hop) {
+		/**
+		* Mark the forwarded packet when it is not coming from our best
+		* next hop. We still need to forward the packet for our neighbor
+		* link quality detection to work in case the packet originated
+		* from a single hop neighbor. Otherwise we can simply drop the
+		* ogm.
+		*/
+		if (is_single_hop_neigh)
+			batman_ogm_packet->flags |= NOT_BEST_NEXT_HOP;
+		else
+			return;
+	}
 
-	in_tq = batman_ogm_packet->tq;
-	in_ttl = batman_ogm_packet->ttl;
 	tt_num_changes = batman_ogm_packet->tt_num_changes;
 
 	batman_ogm_packet->ttl--;
 	memcpy(batman_ogm_packet->prev_sender, ethhdr->h_source, ETH_ALEN);
 
-	/* rebroadcast tq of our best ranking neighbor to ensure the rebroadcast
-	 * of our best tq value */
-	if (router && router->tq_avg != 0) {
-
-		/* rebroadcast ogm of best ranking neighbor as is */
-		if (!compare_eth(router->addr, ethhdr->h_source)) {
-			batman_ogm_packet->tq = router->tq_avg;
-
-			if (router->last_ttl)
-				batman_ogm_packet->ttl = router->last_ttl - 1;
-		}
-
-		tq_avg = router->tq_avg;
-	}
-
-	if (router)
-		neigh_node_free_ref(router);
-
 	/* apply hop penalty */
 	batman_ogm_packet->tq = hop_penalty(batman_ogm_packet->tq, bat_priv);
 
 	bat_dbg(DBG_BATMAN, bat_priv,
-		"Forwarding packet: tq_orig: %i, tq_avg: %i, "
-		"tq_forw: %i, ttl_orig: %i, ttl_forw: %i\n",
-		in_tq, tq_avg, batman_ogm_packet->tq, in_ttl - 1,
-		batman_ogm_packet->ttl);
+		"Forwarding packet: tq: %i, ttl: %i\n",
+		batman_ogm_packet->tq, batman_ogm_packet->ttl);
 
 	batman_ogm_packet->seqno = htonl(batman_ogm_packet->seqno);
 	batman_ogm_packet->tt_crc = htons(batman_ogm_packet->tt_crc);
@@ -905,6 +894,7 @@  static void bat_ogm_process(const struct ethhdr *ethhdr,
 	int is_my_addr = 0, is_my_orig = 0, is_my_oldorig = 0;
 	int is_broadcast = 0, is_bidirectional;
 	bool is_single_hop_neigh = false;
+	bool is_from_best_next_hop = false;
 	int is_duplicate;
 	uint32_t if_incoming_seqno;
 
@@ -1029,6 +1019,13 @@  static void bat_ogm_process(const struct ethhdr *ethhdr,
 		return;
 	}
 
+	if (batman_ogm_packet->flags & NOT_BEST_NEXT_HOP) {
+		bat_dbg(DBG_BATMAN, bat_priv,
+			"Drop packet: ignoring all packets not forwarded from "
+			"the best next hop (sender: %pM)\n", ethhdr->h_source);
+		return;
+	}
+
 	orig_node = get_orig_node(bat_priv, batman_ogm_packet->orig);
 	if (!orig_node)
 		return;
@@ -1053,6 +1050,10 @@  static void bat_ogm_process(const struct ethhdr *ethhdr,
 	if (router)
 		router_router = orig_node_get_router(router->orig_node);
 
+	if ((router && router->tq_avg != 0) &&
+	    (compare_eth(router->addr, ethhdr->h_source)))
+		is_from_best_next_hop = true;
+
 	/* avoid temporary routing loops */
 	if (router && router_router &&
 	    (compare_eth(router->addr, batman_ogm_packet->prev_sender)) &&
@@ -1103,7 +1104,8 @@  static void bat_ogm_process(const struct ethhdr *ethhdr,
 
 		/* mark direct link on incoming interface */
 		bat_ogm_forward(orig_node, ethhdr, batman_ogm_packet,
-				is_single_hop_neigh, if_incoming);
+				is_single_hop_neigh, is_from_best_next_hop,
+				if_incoming);
 
 		bat_dbg(DBG_BATMAN, bat_priv, "Forwarding packet: "
 			"rebroadcast neighbor packet with direct link flag\n");
@@ -1126,7 +1128,8 @@  static void bat_ogm_process(const struct ethhdr *ethhdr,
 	bat_dbg(DBG_BATMAN, bat_priv,
 		"Forwarding packet: rebroadcast originator packet\n");
 	bat_ogm_forward(orig_node, ethhdr, batman_ogm_packet,
-			is_single_hop_neigh, if_incoming);
+			is_single_hop_neigh, is_from_best_next_hop,
+			if_incoming);
 
 out_neigh:
 	if ((orig_neigh_node) && (!is_single_hop_neigh))
diff --git a/packet.h b/packet.h
index 4d9e54c..667ed75 100644
--- a/packet.h
+++ b/packet.h
@@ -39,6 +39,7 @@  enum bat_packettype {
 #define COMPAT_VERSION 14
 
 enum batman_flags {
+	NOT_BEST_NEXT_HOP   = 1 << 3,
 	PRIMARIES_FIRST_HOP = 1 << 4,
 	VIS_SERVER	    = 1 << 5,
 	DIRECTLINK	    = 1 << 6
-- 
1.7.9
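
The forwarding rule described in the commit message can be illustrated with a small standalone C sketch. It is not the kernel code: `struct ogm_sketch`, `ogm_forward_sketch()` and `ogm_accept_sketch()` are hypothetical names made up for this illustration, the packet is reduced to its flags and TTL, and the hop penalty, sequence number handling and the originator's own-OGM processing are left out. Only the flag values are taken from packet.h as changed by the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag values as they appear in packet.h after the patch. */
enum batman_flags {
	NOT_BEST_NEXT_HOP   = 1 << 3,
	PRIMARIES_FIRST_HOP = 1 << 4,
	VIS_SERVER          = 1 << 5,
	DIRECTLINK          = 1 << 6
};

/* Hypothetical, reduced stand-in for struct batman_ogm_packet. */
struct ogm_sketch {
	uint8_t flags;
	uint8_t ttl;
};

/*
 * Hypothetical helper mirroring the decision in the patched
 * bat_ogm_forward(): returns true if the OGM would be rebroadcast,
 * false if it would be dropped. TQ handling is omitted.
 */
static bool ogm_forward_sketch(struct ogm_sketch *ogm,
			       bool is_single_hop_neigh,
			       bool is_from_best_next_hop)
{
	/* TTL exceeded: never forward. */
	if (ogm->ttl <= 1)
		return false;

	if (!is_from_best_next_hop) {
		/* OGMs from non-best hops are dropped unless they come
		 * from a single hop neighbor, which still needs the echo
		 * for its link quality detection. */
		if (!is_single_hop_neigh)
			return false;

		/* Mark the echo so that other receivers discard it. */
		ogm->flags |= NOT_BEST_NEXT_HOP;
	}

	ogm->ttl--;
	return true;
}

/*
 * Hypothetical helper mirroring the new check in bat_ogm_process():
 * a marked OGM is ignored. Per the commit message, the node that
 * originated the OGM is the exception; that case is not modeled here.
 */
static bool ogm_accept_sketch(const struct ogm_sketch *ogm)
{
	return !(ogm->flags & NOT_BEST_NEXT_HOP);
}
```

With this sketch, an OGM received from the best next hop is forwarded unmarked, an OGM from a single hop neighbor that is not the best next hop is forwarded with NOT_BEST_NEXT_HOP set, and any other OGM is dropped without changing its TTL.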