# Patch which fixes a uint32overflow ; fix algorithm which was unintented changed on 2014 update

## Commit Message

Changelog
* fix overflow of uint32-value while multiplying
* restore algorithm to origin calculation which expect kbit/s values
Signed-off-by: Ruben Wisniewsi <ruben@vfn-nrw.de>
---
net/batman-adv/gateway_client.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

## Comments

On Saturday 16 May 2015 22:15:37 Ruben Wisniewski wrote:
> Changelog
>
> * fix overflow of uint32-value while multiplying
> * restore algorithm to origin calculation which expect kbit/s values
>
> Signed-off-by: Ruben Wisniewsi <ruben@vfn-nrw.de>
> ---
Please read
http://www.open-mesh.org/projects/open-mesh/wiki/Contribute
before submitting patches.
And please also include the commit which caused the regression. This makes
it easier to find out which kernel versions must be patched in case the
problem is critical. Here is an example how this information should be
included:
http://git.open-mesh.org/batman-adv.git/commit/210a3cb5a5e09f0067386fd07b3a7a843fcf66c5
Kind regards,
Sven

On Sunday 17 May 2015 00:11:52 Sven Eckelmann wrote:
> And please also include the commit which caused the regression. This makes
> it easier to find out which kernel versions must be patched in case the
> problem is critical. Here is an example how this information should be
> included:
You are most likely want to fix the regression introduced in 0853ec7 ("batman-
adv: tvlv - gateway download/upload bandwidth container"). The gateway client
code for fast node selection was previously using batadv_gw_bandwidth_to_kbit
to convert the input parameters to Kibit/s. After the TLV restructuring it
should have multiplied the TLV value by 100 to get the exact same result
because the batadv_tvlv_gateway_data bandwidth_down/bandwidth_up stores the
throughput with the base unit of 100 Kibit/s.
The problem here would be to decide if it changes the result of the algorithm.
The code goes through all gateways. For each gateway it calculates the formula
(gw_down is in 100 Kibit/s; gw_divisor is constant ):
gw_factor = |_ (tq_avg ** 2 * gw_down * 10 ** 4) / gw_divisor _|
You want to change it to
gw_factor = |_ (tq_avg ** 2 * gw_down * 10 ** 6) / gw_divisor _|
The rest of the algorithm is just comparing the gw_factor of each gateway. The
question which should have been answered by the commit message would be:
Why is this constant of 100 affecting your results? Is it rounding related?
How much does it affect the algorithm?
tq_avg is a value <= 255. A realistic value for bandwidth_down is maybe
something like 20. gw_divisor is 2 ** 18.
So each difference in "tq_avg ** 2 * gw_down * 10 ** 4" for each gateway must
be larger than 2 ** 18 to always be distinguishable by the algorithm. For a
fixed tq_avg value of 255 the algorithm could currently compare two gateways
with a gw_down difference of (1. / 2480) and still distinguish them. It would
still detect a difference when the tq_avg is 6 and the gw_down difference is
1.
For a fixed gw_down value of 1, the current algorithm would have problems to
distinguish the tq_avg values < 14 when the tq_avg difference is small.
Your change would make it 100x more precise for gw_down. But I am currently
not seeing the actual impact. Maybe you could explain it in more detail in the
commit message.
Kind regards,
Sven

Please don't drop the mailing list when replying (unless it is actually
private conversation).
On Sunday 17 May 2015 11:21:10 Ruben Wisniewski wrote:
> > should be included:
> Since the overflow-bug seem to be is since epoch, the patch have to be
> applied to all versions.
I am talking about the part which you described yourself as regression. But I
agree that the overflow bug is in there since the beginning of the gateway
code.
> But I think batman-adv 2013.x is that old that
> nobody is using it anymore or kernels with that version. So we just
> have to fix all 2014.x versions.
There is a stable branch for Linux 2.6.32.x. Linux 2.6.32 came out 2009. So it
is possible that David Miller decides to submit the overflow patch to stable
for 2.6.32. This is not in our jurisdiction. The same may be the case for the
regression - but then somebody (David Miller/stable team) must know how severe
the problem is and when it was introduced.
Kind regards,
Sven

b/net/batman-adv/gateway_client.c index bb01586..6f00584 100644
@@ -153,7 +153,7 @@ batadv_gw_get_best_gw_node(struct batadv_priv
*bat_priv) struct batadv_neigh_node *router;
struct batadv_neigh_ifinfo *router_ifinfo;
struct batadv_gw_node *gw_node, *curr_gw = NULL;
- uint32_t max_gw_factor = 0, tmp_gw_factor = 0;
+ uint64_t max_gw_factor = 0, tmp_gw_factor = 0;
uint32_t gw_divisor;
uint8_t max_tq = 0;
uint8_t tq_avg;
@@ -185,7 +185,7 @@ batadv_gw_get_best_gw_node(struct batadv_priv
*bat_priv) switch (atomic_read(&bat_priv->gw_sel_class)) {
case 1: /* fast connection */
tmp_gw_factor = tq_avg * tq_avg;
- tmp_gw_factor *= gw_node->bandwidth_down;
+ tmp_gw_factor *= gw_node->bandwidth_down * 100;
tmp_gw_factor *= 100 * 100;
tmp_gw_factor /= gw_divisor;