[v5,3/4] batman-adv: B.A.T.M.A.N. V - implement GW selection logic

Message ID 20160612041426.26339-4-a@unstable.cc (mailing list archive)
State Superseded, archived
Delegated to: Marek Lindner
Headers

Commit Message

Antonio Quartulli June 12, 2016, 4:14 a.m. UTC
  Since the GW selection logic has been made routing protocol specific
it is now possible for B.A.T.M.A.N V to have its own mechanism by
providing the API implementation.

Implement the GW specific API in the B.A.T.M.A.N. V protocol in
order to provide a working GW selection mechanism.

Signed-off-by: Antonio Quartulli <a@unstable.cc>
---
 net/batman-adv/bat_v.c          | 220 +++++++++++++++++++++++++++++++++++++++-
 net/batman-adv/gateway_client.c |   9 +-
 net/batman-adv/gateway_client.h |   2 +
 3 files changed, 227 insertions(+), 4 deletions(-)
  

Comments

Sven Eckelmann June 13, 2016, 11:12 a.m. UTC | #1
On Sunday 12 June 2016 12:14:25 Antonio Quartulli wrote:
> --- a/net/batman-adv/gateway_client.c
> +++ b/net/batman-adv/gateway_client.c
> @@ -215,6 +215,10 @@ void batadv_gw_election(struct batadv_priv *bat_priv)
>         if (!batadv_atomic_dec_not_zero(&bat_priv->gw.reselect) && curr_gw)
>                 goto out;
>  
> +       /* if gw.reselect is set to 1 it means that a previous call to
> +        * gw.is_eligible() said that we have a new best GW, therefore it can
> +        * now be picked from the list and selected
> +        */
>         next_gw = bat_priv->algo_ops->gw.get_best_gw_node(bat_priv);

I would guess that this should be either in patch 2/4 on in an extra patch. At
least it is about nothing which is touched otherwise in this patch

Kind regards,
	Sven
  
Sven Eckelmann June 13, 2016, 11:26 a.m. UTC | #2
On Sunday 12 June 2016 12:14:25 Antonio Quartulli wrote:
> +       if (orig_throughput < (gw_throughput + threshold))
> +               goto out;

Possible overflow problem in batadv_v_gw_is_eligible. We don't
know what the user will add here and what the gw_throughput is.

We already had a similar problem in the BATMAN_IV gw selection
code which resulted in weird gw selections. See f63c54bba31d
("batman-adv: Avoid u32 overflow during gateway select").


    if (orig_throughput < gw_throughput)
        goto out;

    if ((orig_throughput - gw_throughput) < threshold)
        goto out;

Kind regards,
	Sven
  
Sven Eckelmann June 13, 2016, 11:45 a.m. UTC | #3
On Sunday 12 June 2016 12:14:25 Antonio Quartulli wrote:
> +static void batadv_v_gw_print(struct batadv_priv *bat_priv,
> +                             struct seq_file *seq)
> +{
> +       struct batadv_gw_node *gw_node;
> +       int gw_count = 0;
> +
> +       seq_printf(seq,
> +                  "      Gateway        ( throughput)           Nexthop [outgoingIF]: advertised uplink bandwidth\n");

Maybe you wanted to use seq_puts here. At least you've used it in
similar changes.

Kind regards,
	Sven
  
Antonio Quartulli June 13, 2016, 4:02 p.m. UTC | #4
On Mon, Jun 13, 2016 at 01:12:52PM +0200, Sven Eckelmann wrote:
> On Sunday 12 June 2016 12:14:25 Antonio Quartulli wrote:
> > --- a/net/batman-adv/gateway_client.c
> > +++ b/net/batman-adv/gateway_client.c
> > @@ -215,6 +215,10 @@ void batadv_gw_election(struct batadv_priv *bat_priv)
> >         if (!batadv_atomic_dec_not_zero(&bat_priv->gw.reselect) && curr_gw)
> >                 goto out;
> >  
> > +       /* if gw.reselect is set to 1 it means that a previous call to
> > +        * gw.is_eligible() said that we have a new best GW, therefore it can
> > +        * now be picked from the list and selected
> > +        */
> >         next_gw = bat_priv->algo_ops->gw.get_best_gw_node(bat_priv);
> 
> I would guess that this should be either in patch 2/4 on in an extra patch. At
> least it is about nothing which is touched otherwise in this patch

mh...you are right.

will fix this
  
Antonio Quartulli June 27, 2016, 2:36 a.m. UTC | #5
On Mon, Jun 13, 2016 at 01:26:31PM +0200, Sven Eckelmann wrote:
> On Sunday 12 June 2016 12:14:25 Antonio Quartulli wrote:
> > +       if (orig_throughput < (gw_throughput + threshold))
> > +               goto out;
> 
> Possible overflow problem in batadv_v_gw_is_eligible. We don't
> know what the user will add here and what the gw_throughput is.
> 
> We already had a similar problem in the BATMAN_IV gw selection
> code which resulted in weird gw selections. See f63c54bba31d
> ("batman-adv: Avoid u32 overflow during gateway select").

Sven, correct me if I am wrong, but in f63c54bba31d we are changing the type of
the variables aimed to contain the value computed by the "overflowing"
operation, while in this case we have everything in a if condition: should the
temporary result be implicitly casted to a larger variable as required ?

I may be wrong - but I wanted to ask you to be sure.

Cheers,
  
Antonio Quartulli June 27, 2016, 2:41 a.m. UTC | #6
On Mon, Jun 13, 2016 at 01:12:52PM +0200, Sven Eckelmann wrote:
> On Sunday 12 June 2016 12:14:25 Antonio Quartulli wrote:
> > --- a/net/batman-adv/gateway_client.c
> > +++ b/net/batman-adv/gateway_client.c
> > @@ -215,6 +215,10 @@ void batadv_gw_election(struct batadv_priv *bat_priv)
> >         if (!batadv_atomic_dec_not_zero(&bat_priv->gw.reselect) && curr_gw)
> >                 goto out;
> >  
> > +       /* if gw.reselect is set to 1 it means that a previous call to
> > +        * gw.is_eligible() said that we have a new best GW, therefore it can
> > +        * now be picked from the list and selected
> > +        */
> >         next_gw = bat_priv->algo_ops->gw.get_best_gw_node(bat_priv);
> 
> I would guess that this should be either in patch 2/4 on in an extra patch. At
> least it is about nothing which is touched otherwise in this patch

Good catch :) Thanks !

I am moving the comment to patch 2/4.

Cheers,
  
Sven Eckelmann June 27, 2016, 6:19 a.m. UTC | #7
On Monday 27 June 2016 10:36:13 Antonio Quartulli wrote:
> On Mon, Jun 13, 2016 at 01:26:31PM +0200, Sven Eckelmann wrote:
> > On Sunday 12 June 2016 12:14:25 Antonio Quartulli wrote:
> > > +       if (orig_throughput < (gw_throughput + threshold))
> > > +               goto out;
> > 
> > Possible overflow problem in batadv_v_gw_is_eligible. We don't
> > know what the user will add here and what the gw_throughput is.
> > 
> > We already had a similar problem in the BATMAN_IV gw selection
> > code which resulted in weird gw selections. See f63c54bba31d
> > ("batman-adv: Avoid u32 overflow during gateway select").
> 
> Sven, correct me if I am wrong, but in f63c54bba31d we are changing the type of
> the variables aimed to contain the value computed by the "overflowing"
> operation, while in this case we have everything in a if condition: should the
> temporary result be implicitly casted to a larger variable as required ?

No, it isn't casted to a larger type. Yes, C has some wild casting rules but I
would not be aware of an "help the programmer to avoid overflows" casting
rule.

We had no other idea how to fix it in the mentioned commit. But here we can do
it without casting to u64.

Kind regards,
	Sven
  
Antonio Quartulli June 27, 2016, 6:19 a.m. UTC | #8
On Mon, Jun 27, 2016 at 08:19:22AM +0200, Sven Eckelmann wrote:
> On Monday 27 June 2016 10:36:13 Antonio Quartulli wrote:
> > On Mon, Jun 13, 2016 at 01:26:31PM +0200, Sven Eckelmann wrote:
> > > On Sunday 12 June 2016 12:14:25 Antonio Quartulli wrote:
> > > > +       if (orig_throughput < (gw_throughput + threshold))
> > > > +               goto out;
> > > 
> > > Possible overflow problem in batadv_v_gw_is_eligible. We don't
> > > know what the user will add here and what the gw_throughput is.
> > > 
> > > We already had a similar problem in the BATMAN_IV gw selection
> > > code which resulted in weird gw selections. See f63c54bba31d
> > > ("batman-adv: Avoid u32 overflow during gateway select").
> > 
> > Sven, correct me if I am wrong, but in f63c54bba31d we are changing the type of
> > the variables aimed to contain the value computed by the "overflowing"
> > operation, while in this case we have everything in a if condition: should the
> > temporary result be implicitly casted to a larger variable as required ?
> 
> No, it isn't casted to a larger type. Yes, C has some wild casting rules but I
> would not be aware of an "help the programmer to avoid overflows" casting
> rule.
> 
> We had no other idea how to fix it in the mentioned commit. But here we can do
> it without casting to u64.

ok, thanks !

I will modify the patch in the next series.

Cheers,
  

Patch

diff --git a/net/batman-adv/bat_v.c b/net/batman-adv/bat_v.c
index 90fd5ee..f68be92 100644
--- a/net/batman-adv/bat_v.c
+++ b/net/batman-adv/bat_v.c
@@ -25,6 +25,7 @@ 
 #include <linux/init.h>
 #include <linux/jiffies.h>
 #include <linux/kernel.h>
+#include <linux/kref.h>
 #include <linux/netdevice.h>
 #include <linux/rculist.h>
 #include <linux/rcupdate.h>
@@ -40,6 +41,7 @@ 
 #include "gateway_common.h"
 #include "hard-interface.h"
 #include "hash.h"
+#include "log.h"
 #include "originator.h"
 #include "packet.h"
 
@@ -350,6 +352,210 @@  static ssize_t batadv_v_show_sel_class(struct batadv_priv *bat_priv, char *buff)
 	return sprintf(buff, "%u.%u MBit\n", class / 10, class % 10);
 }
 
+/**
+ * batadv_v_gw_throughput_get - retrieve the GW-bandwidth for a given GW
+ * @gw_node: the GW to retrieve the metric for
+ * @bw: the pointer where the metric will be stored. The metric is computed as
+ *  the minimum between the GW advertised throughput and the path throughput to
+ *  it in the mesh
+ *
+ * Return: 0 on success, -1 on failure
+ */
+static int batadv_v_gw_throughput_get(struct batadv_gw_node *gw_node, u32 *bw)
+{
+	struct batadv_neigh_ifinfo *router_ifinfo = NULL;
+	struct batadv_orig_node *orig_node;
+	struct batadv_neigh_node *router;
+	int ret = -1;
+
+	orig_node = gw_node->orig_node;
+	router = batadv_orig_router_get(orig_node, BATADV_IF_DEFAULT);
+	if (!router)
+		goto out;
+
+	router_ifinfo = batadv_neigh_ifinfo_get(router, BATADV_IF_DEFAULT);
+	if (!router_ifinfo)
+		goto out;
+
+	/* the GW metric is computed as the minimum between the path throughput
+	 * to reach the GW itself and the advertised bandwidth.
+	 * This gives us an approximation of the effective throughput that the
+	 * client can expect via this particular GW node
+	 */
+	*bw = router_ifinfo->bat_v.throughput;
+	*bw = min_t(u32, *bw, gw_node->bandwidth_down);
+
+	ret = 0;
+out:
+	if (router)
+		batadv_neigh_node_put(router);
+	if (router_ifinfo)
+		batadv_neigh_ifinfo_put(router_ifinfo);
+
+	return ret;
+}
+
+/**
+ * batadv_v_gw_get_best_gw_node - retrieve the best GW node
+ * @bat_priv: the bat priv with all the soft interface information
+ *
+ * Return: the GW node having the best GW-metric, NULL if no GW is known
+ */
+static struct batadv_gw_node *
+batadv_v_gw_get_best_gw_node(struct batadv_priv *bat_priv)
+{
+	struct batadv_gw_node *gw_node, *curr_gw = NULL;
+	u32 max_bw = 0, bw;
+
+	rcu_read_lock();
+	hlist_for_each_entry_rcu(gw_node, &bat_priv->gw.list, list) {
+		if (!kref_get_unless_zero(&gw_node->refcount))
+			continue;
+
+		if (batadv_v_gw_throughput_get(gw_node, &bw) < 0)
+			goto next;
+
+		if (curr_gw && (bw <= max_bw))
+			goto next;
+
+		if (curr_gw)
+			batadv_gw_node_put(curr_gw);
+
+		curr_gw = gw_node;
+		kref_get(&curr_gw->refcount);
+		max_bw = bw;
+
+next:
+		batadv_gw_node_put(gw_node);
+	}
+	rcu_read_unlock();
+
+	return curr_gw;
+}
+
+/**
+ * batadv_v_gw_is_eligible - check if a originator would be selected as GW
+ * @bat_priv: the bat priv with all the soft interface information
+ * @curr_gw_orig: originator representing the currently selected GW
+ * @orig_node: the originator representing the new candidate
+ *
+ * Return: true if orig_node can be selected as current GW, false otherwise
+ */
+static bool batadv_v_gw_is_eligible(struct batadv_priv *bat_priv,
+				    struct batadv_orig_node *curr_gw_orig,
+				    struct batadv_orig_node *orig_node)
+{
+	struct batadv_gw_node *curr_gw = NULL, *orig_gw = NULL;
+	u32 gw_throughput, orig_throughput, threshold;
+	bool ret = false;
+
+	threshold = atomic_read(&bat_priv->gw.sel_class);
+
+	curr_gw = batadv_gw_node_get(bat_priv, curr_gw_orig);
+	if (!curr_gw) {
+		ret = true;
+		goto out;
+	}
+
+	if (batadv_v_gw_throughput_get(curr_gw, &gw_throughput) < 0) {
+		ret = true;
+		goto out;
+	}
+
+	orig_gw = batadv_gw_node_get(bat_priv, orig_node);
+	if (!orig_node)
+		goto out;
+
+	if (batadv_v_gw_throughput_get(orig_gw, &orig_throughput) < 0)
+		goto out;
+
+	if (orig_throughput < (gw_throughput + threshold))
+		goto out;
+
+	batadv_dbg(BATADV_DBG_BATMAN, bat_priv,
+		   "Restarting gateway selection: better gateway found (throughput curr: %u, throughput new: %u)\n",
+		   gw_throughput, orig_throughput);
+
+	ret = true;
+out:
+	if (curr_gw)
+		batadv_gw_node_put(curr_gw);
+	if (orig_gw)
+		batadv_gw_node_put(orig_gw);
+
+	return ret;
+}
+
+/* fails if orig_node has no router */
+static int batadv_v_gw_write_buffer_text(struct batadv_priv *bat_priv,
+					 struct seq_file *seq,
+					 const struct batadv_gw_node *gw_node)
+{
+	struct batadv_gw_node *curr_gw;
+	struct batadv_neigh_node *router;
+	struct batadv_neigh_ifinfo *router_ifinfo = NULL;
+	int ret = -1;
+
+	router = batadv_orig_router_get(gw_node->orig_node, BATADV_IF_DEFAULT);
+	if (!router)
+		goto out;
+
+	router_ifinfo = batadv_neigh_ifinfo_get(router, BATADV_IF_DEFAULT);
+	if (!router_ifinfo)
+		goto out;
+
+	curr_gw = batadv_gw_get_selected_gw_node(bat_priv);
+
+	seq_printf(seq, "%s %pM (%9u.%1u) %pM [%10s]: %u.%u/%u.%u MBit\n",
+		   (curr_gw == gw_node ? "=>" : "  "),
+		   gw_node->orig_node->orig,
+		   router_ifinfo->bat_v.throughput / 10,
+		   router_ifinfo->bat_v.throughput % 10, router->addr,
+		   router->if_incoming->net_dev->name,
+		   gw_node->bandwidth_down / 10,
+		   gw_node->bandwidth_down % 10,
+		   gw_node->bandwidth_up / 10,
+		   gw_node->bandwidth_up % 10);
+	ret = seq_has_overflowed(seq) ? -1 : 0;
+
+	if (curr_gw)
+		batadv_gw_node_put(curr_gw);
+out:
+	if (router_ifinfo)
+		batadv_neigh_ifinfo_put(router_ifinfo);
+	if (router)
+		batadv_neigh_node_put(router);
+	return ret;
+}
+
+/**
+ * batadv_v_gw_print - print the gateway list
+ * @bat_priv: the bat priv with all the soft interface information
+ * @seq: gateway table seq_file struct
+ */
+static void batadv_v_gw_print(struct batadv_priv *bat_priv,
+			      struct seq_file *seq)
+{
+	struct batadv_gw_node *gw_node;
+	int gw_count = 0;
+
+	seq_printf(seq,
+		   "      Gateway        ( throughput)           Nexthop [outgoingIF]: advertised uplink bandwidth\n");
+
+	rcu_read_lock();
+	hlist_for_each_entry_rcu(gw_node, &bat_priv->gw.list, list) {
+		/* fails if orig_node has no router */
+		if (batadv_v_gw_write_buffer_text(bat_priv, seq, gw_node) < 0)
+			continue;
+
+		gw_count++;
+	}
+	rcu_read_unlock();
+
+	if (gw_count == 0)
+		seq_puts(seq, "No gateways in range ...\n");
+}
+
 static struct batadv_algo_ops batadv_batman_v __read_mostly = {
 	.name = "BATMAN_V",
 	.iface = {
@@ -371,6 +577,9 @@  static struct batadv_algo_ops batadv_batman_v __read_mostly = {
 	.gw = {
 		.store_sel_class = batadv_v_store_sel_class,
 		.show_sel_class = batadv_v_show_sel_class,
+		.get_best_gw_node = batadv_v_gw_get_best_gw_node,
+		.is_eligible = batadv_v_gw_is_eligible,
+		.print = batadv_v_gw_print,
 	},
 };
 
@@ -397,7 +606,16 @@  void batadv_v_hardif_init(struct batadv_hard_iface *hard_iface)
  */
 int batadv_v_mesh_init(struct batadv_priv *bat_priv)
 {
-	return batadv_v_ogm_init(bat_priv);
+	int ret = 0;
+
+	ret = batadv_v_ogm_init(bat_priv);
+	if (ret < 0)
+		return ret;
+
+	/* set default throughput difference threshold to 5Mbps */
+	atomic_set(&bat_priv->gw.sel_class, 50);
+
+	return 0;
 }
 
 /**
diff --git a/net/batman-adv/gateway_client.c b/net/batman-adv/gateway_client.c
index 564f9d9..a77a179 100644
--- a/net/batman-adv/gateway_client.c
+++ b/net/batman-adv/gateway_client.c
@@ -215,6 +215,10 @@  void batadv_gw_election(struct batadv_priv *bat_priv)
 	if (!batadv_atomic_dec_not_zero(&bat_priv->gw.reselect) && curr_gw)
 		goto out;
 
+	/* if gw.reselect is set to 1 it means that a previous call to
+	 * gw.is_eligible() said that we have a new best GW, therefore it can
+	 * now be picked from the list and selected
+	 */
 	next_gw = bat_priv->algo_ops->gw.get_best_gw_node(bat_priv);
 
 	if (curr_gw == next_gw)
@@ -356,9 +360,8 @@  static void batadv_gw_node_add(struct batadv_priv *bat_priv,
  *
  * Return: gateway node if found or NULL otherwise.
  */
-static struct batadv_gw_node *
-batadv_gw_node_get(struct batadv_priv *bat_priv,
-		   struct batadv_orig_node *orig_node)
+struct batadv_gw_node *batadv_gw_node_get(struct batadv_priv *bat_priv,
+					  struct batadv_orig_node *orig_node)
 {
 	struct batadv_gw_node *gw_node_tmp, *gw_node = NULL;
 
diff --git a/net/batman-adv/gateway_client.h b/net/batman-adv/gateway_client.h
index 4c9edde..6b40432 100644
--- a/net/batman-adv/gateway_client.h
+++ b/net/batman-adv/gateway_client.h
@@ -47,5 +47,7 @@  bool batadv_gw_out_of_range(struct batadv_priv *bat_priv, struct sk_buff *skb);
 enum batadv_dhcp_recipient
 batadv_gw_dhcp_recipient_get(struct sk_buff *skb, unsigned int *header_len,
 			     u8 *chaddr);
+struct batadv_gw_node *batadv_gw_node_get(struct batadv_priv *bat_priv,
+					  struct batadv_orig_node *orig_node);
 
 #endif /* _NET_BATMAN_ADV_GATEWAY_CLIENT_H_ */