[maintv2,3/3] batman-adv: detect local excess vlans in TT request

Message ID 1440600113-21305-4-git-send-email-sw@simonwunderlich.de (mailing list archive)
State Superseded, archived
Commit 2dd1d9f06ac1208b1921aa90d479c3940bc70b4f
Headers

Commit Message

Simon Wunderlich Aug. 26, 2015, 2:41 p.m. UTC
  If the local representation of the global TT table of one originator has
more VLAN entries than the respective TT update, there is some
inconsistency present. By detecting and reporting this inconsistency,
the global table gets updated and the excess VLAN will get removed in
the process.

Reported-by: Alessandro Bolletta <alessandro@mediaspot.net>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
---
Changes to PATCH:
 * count instead of explicitly comparing VLANs, which makes it much
   simpler (thank Antonio)
---
 net/batman-adv/translation-table.c | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)
  

Comments

Antonio Quartulli Sept. 1, 2015, 10:07 a.m. UTC | #1
On 26/08/15 16:41, Simon Wunderlich wrote:
> If the local representation of the global TT table of one originator has
> more VLAN entries than the respective TT update, there is some
> inconsistency present. By detecting and reporting this inconsistency,
> the global table gets updated and the excess VLAN will get removed in
> the process.

This a nice catch, but I am not sure this is the right way of
implementing the fix.

Imagine this sequence of events:
1) originator O1 sends an OGM
2) client C1 connects to O1 on a newly created VLAN and starts sending
traffic
3) originator O2 detects (speedy join) C1 before receiving the O1's OGM
4) O2 receives O1's OGM and the check will kill C1 because its VLAN is
not advertised in the OGM. O2 needs to wait for O1's next OGM before
getting to know C1 again

Maybe this scenario is rather unlikely? What do you think?
  
Simon Wunderlich Sept. 2, 2015, 5:39 p.m. UTC | #2
On Tuesday 01 September 2015 12:07:36 Antonio Quartulli wrote:
> On 26/08/15 16:41, Simon Wunderlich wrote:
> > If the local representation of the global TT table of one originator has
> > more VLAN entries than the respective TT update, there is some
> > inconsistency present. By detecting and reporting this inconsistency,
> > the global table gets updated and the excess VLAN will get removed in
> > the process.
> 
> This a nice catch, but I am not sure this is the right way of
> implementing the fix.
> 
> Imagine this sequence of events:
> 1) originator O1 sends an OGM
> 2) client C1 connects to O1 on a newly created VLAN and starts sending
> traffic
> 3) originator O2 detects (speedy join) C1 before receiving the O1's OGM
> 4) O2 receives O1's OGM and the check will kill C1 because its VLAN is
> not advertised in the OGM. O2 needs to wait for O1's next OGM before
> getting to know C1 again
> 
> Maybe this scenario is rather unlikely? What do you think?

Mhm, I'd argue its rather unlikely. This would imply that the data frame is 
reaching O2 faster the OGM. I don't think its very likely that it overtakes 
it, and even if it happens, its probably a very tight race.

Also, I'd think that nodes typically just use or don't use a VLAN, and don't 
create VLANs all the time ...

Even if that happens, the race is resolved with one originator interval.

I'm open for better ideas, though. :)

Cheers,
    Simon
  
Antonio Quartulli Sept. 2, 2015, 5:57 p.m. UTC | #3
On 02/09/15 19:39, Simon Wunderlich wrote:
> On Tuesday 01 September 2015 12:07:36 Antonio Quartulli wrote:
>> On 26/08/15 16:41, Simon Wunderlich wrote:
>>> If the local representation of the global TT table of one originator has
>>> more VLAN entries than the respective TT update, there is some
>>> inconsistency present. By detecting and reporting this inconsistency,
>>> the global table gets updated and the excess VLAN will get removed in
>>> the process.
>>
>> This a nice catch, but I am not sure this is the right way of
>> implementing the fix.
>>
>> Imagine this sequence of events:
>> 1) originator O1 sends an OGM
>> 2) client C1 connects to O1 on a newly created VLAN and starts sending
>> traffic
>> 3) originator O2 detects (speedy join) C1 before receiving the O1's OGM
>> 4) O2 receives O1's OGM and the check will kill C1 because its VLAN is
>> not advertised in the OGM. O2 needs to wait for O1's next OGM before
>> getting to know C1 again
>>
>> Maybe this scenario is rather unlikely? What do you think?
> 
> Mhm, I'd argue its rather unlikely. This would imply that the data frame is 
> reaching O2 faster the OGM. I don't think its very likely that it overtakes 
> it, and even if it happens, its probably a very tight race.
> 
> Also, I'd think that nodes typically just use or don't use a VLAN, and don't 
> create VLANs all the time ...
> 
> Even if that happens, the race is resolved with one originator interval.
> 

Yeah, I think we can live with this "limitation".


Cheers,
  

Patch

diff --git a/net/batman-adv/translation-table.c b/net/batman-adv/translation-table.c
index f629c21..e7f2998 100644
--- a/net/batman-adv/translation-table.c
+++ b/net/batman-adv/translation-table.c
@@ -2391,8 +2391,8 @@  static bool batadv_tt_global_check_crc(struct batadv_orig_node *orig_node,
 {
 	struct batadv_tvlv_tt_vlan_data *tt_vlan_tmp;
 	struct batadv_orig_node_vlan *vlan;
+	int i, orig_num_vlan;
 	uint32_t crc;
-	int i;
 
 	/* check if each received CRC matches the locally stored one */
 	for (i = 0; i < num_vlan; i++) {
@@ -2418,6 +2418,18 @@  static bool batadv_tt_global_check_crc(struct batadv_orig_node *orig_node,
 			return false;
 	}
 
+	/* check if any excess VLANs exist locally for the originator
+	 * which are not mentioned in the TVLV from the originator.
+	 */
+	rcu_read_lock();
+	orig_num_vlan = 0;
+	list_for_each_entry_rcu(vlan, &orig_node->vlan_list, list)
+		orig_num_vlan++;
+	rcu_read_unlock();
+
+	if (orig_num_vlan > num_vlan)
+		return false;
+
 	return true;
 }