batman-adv: Map VID 0 to untagged TT VLAN

Message ID 20241216-no-vlan-0-v1-1-62586f97fd88@narfation.org (mailing list archive)
State Accepted, archived
Delegated to: Antonio Quartulli

Commit Message

Sven Eckelmann Dec. 16, 2024, 6:37 p.m. UTC
VID 0 is not a valid VLAN according to "802.1Q-2011" "Table 9-2—Reserved
VID values". It is only used to indicate "priority tag" frames which only
contain priority information and no VID.

The 8021q module has also redirected priority-tagged frames to the underlying
interface since commit ad1afb003939 ("vlan_dev: VLAN 0 should be treated as
"no vlan tag" (802.1p packet)"). But at the same time, it automatically
adds VID 0 to all devices to ensure that VID 0 is in the allowed list
of the HW filter. As a result, batman-adv always announced a VLAN 0 in
OGM messages.

batman-adv should therefore not create a new batadv_softif_vlan for VID 0
and handle all VID 0 related frames using the "untagged" global/local
translation tables.

Signed-off-by: Sven Eckelmann <sven@narfation.org>
---
 net/batman-adv/main.c           |  7 +++++++
 net/batman-adv/soft-interface.c | 14 ++++++++++++++
 2 files changed, 21 insertions(+)


---
base-commit: 4e395d4d6908da373f00752c363c82ccf99a427e
change-id: 20241216-no-vlan-0-5855407b9c6c

Best regards,
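
As an illustration of the tag layout the commit message relies on, here is a minimal userspace sketch (not kernel code) that splits a 16-bit TCI into its priority, DEI and VID fields. The names tci_encode, tci_decode and tci_is_priority_tag are made up for this example; only the bit layout is taken from 802.1Q.

```c
#include <assert.h>
#include <stdint.h>

/* Field layout of the 16-bit VLAN TCI in host byte order. */
#define TCI_PCP_SHIFT 13      /* priority code point: bits 15..13 */
#define TCI_DEI_MASK  0x1000u /* drop eligible indicator: bit 12 */
#define TCI_VID_MASK  0x0fffu /* VLAN identifier: bits 11..0 */

struct tci_fields {
	uint8_t pcp;
	uint8_t dei;
	uint16_t vid;
};

/* Build a TCI from its three fields (hypothetical helper). */
uint16_t tci_encode(uint8_t pcp, uint8_t dei, uint16_t vid)
{
	assert(pcp <= 7 && dei <= 1 && vid <= TCI_VID_MASK);
	return (uint16_t)((pcp << TCI_PCP_SHIFT) |
			  (dei ? TCI_DEI_MASK : 0) | vid);
}

/* Split a TCI into its three fields (hypothetical helper). */
struct tci_fields tci_decode(uint16_t tci)
{
	struct tci_fields f;

	f.pcp = (uint8_t)(tci >> TCI_PCP_SHIFT);
	f.dei = (tci & TCI_DEI_MASK) ? 1 : 0;
	f.vid = (uint16_t)(tci & TCI_VID_MASK);
	return f;
}

/* A "priority tag" carries priority bits but a VID field of 0. */
int tci_is_priority_tag(uint16_t tci)
{
	return (tci & TCI_VID_MASK) == 0;
}
```

With these helpers, tci_encode(5, 0, 0) describes a frame that carries priority 5 and no VLAN membership, which is the case the patch maps to the untagged translation tables.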
  

Comments

Antonio Quartulli Dec. 16, 2024, 8:51 p.m. UTC | #1
On 16/12/2024 19:37, Sven Eckelmann wrote:
> VID 0 is not a valid VLAN according to "802.1Q-2011" "Table 9-2—Reserved
> VID values". It is only used to indicate "priority tag" frames which only
> contain priority information and no VID.
> 
> The 8021q is also redirecting the priority tagged frames to the underlying
> interface since commit ad1afb003939 ("vlan_dev: VLAN 0 should be treated as
> "no vlan tag" (802.1p packet)"). But at the same time, it automatically
> adds the VID 0 to all devices to ensure that VID 0 is in the allowed list
> of the HW filter. This resulted in a VLAN 0 which was always announced in
> OGM messages.
> 
> batman-adv should therefore not create a new batadv_softif_vlan for VID 0
> and handle all VID 0 related frames using the "untagged" global/local
> translation tables.
> 
> Signed-off-by: Sven Eckelmann <sven@narfation.org>

Acked-by: Antonio Quartulli <antonio@mandelbit.com>
  
Linus Lüssing Dec. 17, 2024, 1:53 p.m. UTC | #2
On Mon, Dec 16, 2024 at 07:37:12PM +0100, Sven Eckelmann wrote:
> diff --git a/net/batman-adv/main.c b/net/batman-adv/main.c
> index 8e0f44c71696f642d80304ec2724e8b5e56a5d93..333e947afcce7ca4128be8406f23295df723515c 100644
> --- a/net/batman-adv/main.c
> +++ b/net/batman-adv/main.c
> @@ -637,6 +637,13 @@ unsigned short batadv_get_vid(struct sk_buff *skb, size_t header_len)
>  
>  	vhdr = (struct vlan_ethhdr *)(skb->data + header_len);
>  	vid = ntohs(vhdr->h_vlan_TCI) & VLAN_VID_MASK;
> +
> +	/* VID 0 is only used to indicate "priority tag" frames which only
> +	 * contain priority information and no VID.
> +	 */
> +	if (vid == 0)
> +		return BATADV_NO_FLAGS;
> +
>  	vid |= BATADV_VLAN_HAS_TAG;
>  
>  	return vid;

I guess with this patch all TT entries previously in
TT VLAN 0 would be moved to untagged/NO_FLAGS TT entries, right?

Wouldn't that technically break compatibility? Let's say someone
uses VLAN headers with VID 0 to be able to use priorities / QoS.
What if some old nodes still announced+used VLAN 0 in batman-adv
while others used it after this patch, with the mapping to
NO_FLAGS?
  
Sven Eckelmann Dec. 17, 2024, 4:38 p.m. UTC | #3
On Tuesday, 17 December 2024 14:53:23 CET Linus Lüssing wrote:
> On Mon, Dec 16, 2024 at 07:37:12PM +0100, Sven Eckelmann wrote:
> > diff --git a/net/batman-adv/main.c b/net/batman-adv/main.c
> > index 8e0f44c71696f642d80304ec2724e8b5e56a5d93..333e947afcce7ca4128be8406f23295df723515c 100644
> > --- a/net/batman-adv/main.c
> > +++ b/net/batman-adv/main.c
> > @@ -637,6 +637,13 @@ unsigned short batadv_get_vid(struct sk_buff *skb, size_t header_len)
> >  
> >  	vhdr = (struct vlan_ethhdr *)(skb->data + header_len);
> >  	vid = ntohs(vhdr->h_vlan_TCI) & VLAN_VID_MASK;
> > +
> > +	/* VID 0 is only used to indicate "priority tag" frames which only
> > +	 * contain priority information and no VID.
> > +	 */
> > +	if (vid == 0)
> > +		return BATADV_NO_FLAGS;
> > +
> >  	vid |= BATADV_VLAN_HAS_TAG;
> >  
> >  	return vid;
> 
> I guess with this patch all TT entries previously in
> TT VLAN 0 would be moved to untagged/NO_FLAGS TT entries, right?

Yes, as specified by 802.1Q-2011, it is meant to transport only priority
information and not a VID. For a switch, the PVID would be used, but because
batman-adv is used here as the lower device (for either a VLAN-aware bridge or
an 8021q device), we don't have to add the PVID. The VID is simply missing
(because it is !BATADV_VLAN_HAS_TAG), and batman-adv therefore has to check the
"untagged" TT global entries (or add entries to the "untagged" TT local part).

> Wouldn't that technically break compatibility? Let's say someone
> uses VLAN headers with VID 0 to be able to use priorities / QoS.

Then this person should have noticed that it is broken at the moment and doesn't
work as expected (to reach the "untagged remotes" with the priority tagged
packets)

> What if some old nodes still announced+used VLAN 0 in batman-adv
> while others used it after this patch, with the mapping to
> NO_FLAGS?

Then the misbehaving old node would still misbehave. Because you should 
actually be able to talk with VID 0 to the untagged global TT entries - which 
the old node fails to do. So I could also add

  Fixes: 0ffa9e8d86d6 ("batman-adv: use vid when computing local and global TT CRC")
  Fixes: 5d2c05b21337 ("batman-adv: add per VLAN interface attribute framework")

if you prefer and transmit it via the batadv/net queue.

But I considered VID 0 somewhat esoteric for in-Linux usage because most tools 
just use DSCP. I am only aware of tools like isochron-send which just inject
raw packets with the VLAN headers directly. And using another 8021q device 
with VID on one side is a good way to create a unidirectional communication 
(when you want a bidirectional one) because the other end will just reply
with a vanilla, untagged packet. And because of that, things like ARP will
not be able to "finish" because the answers are received on the non-VID0 
interface.

But maybe I am wrong about that.

Kind regards,
	Sven
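
To make the mixed-version scenario from this exchange concrete, the following userspace sketch compares the VID value computed before and after the patch. It is an approximation, not the kernel function: the helper names are hypothetical, the constants are copied here on the assumption that BATADV_VLAN_HAS_TAG is bit 15 and VLAN_VID_MASK is 0x0fff, and the helpers take a host-order TCI instead of an skb.

```c
#include <assert.h>
#include <stdint.h>

/* Values assumed to mirror the kernel headers; redefined for userspace. */
#define VLAN_VID_MASK       0x0fffu
#define BATADV_NO_FLAGS     0u
#define BATADV_VLAN_HAS_TAG (1u << 15)

/* Pre-patch mapping: every 802.1Q-tagged frame, VID 0 included,
 * is marked as tagged.
 */
uint16_t vid_from_tci_old(uint16_t tci)
{
	return (uint16_t)((tci & VLAN_VID_MASK) | BATADV_VLAN_HAS_TAG);
}

/* Post-patch mapping: VID 0 is reported like an untagged frame. */
uint16_t vid_from_tci_new(uint16_t tci)
{
	uint16_t vid = (uint16_t)(tci & VLAN_VID_MASK);

	if (vid == 0)
		return BATADV_NO_FLAGS;

	return (uint16_t)(vid | BATADV_VLAN_HAS_TAG);
}

/* Nonzero when an old and a new node would look up the same TT VLAN. */
int nodes_agree(uint16_t tci)
{
	return vid_from_tci_old(tci) == vid_from_tci_new(tci);
}

void mixed_version_demo(void)
{
	/* Priority 5, VID 0: old node uses "VLAN 0", new node uses untagged. */
	assert(vid_from_tci_old(0xa000) == 0x8000);
	assert(vid_from_tci_new(0xa000) == 0x0000);
	assert(!nodes_agree(0xa000));

	/* Priority 5, VID 10: both versions compute the same value. */
	assert(nodes_agree(0xa00a));
}
```

Only priority-tagged frames are affected: for any non-zero VID the two mappings are identical, so the disagreement between old and new nodes is limited to the VID 0 case discussed in this thread.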
  
Antonio Quartulli Dec. 18, 2024, 8:02 a.m. UTC | #4
On 17/12/2024 17:38, Sven Eckelmann wrote:
> On Tuesday, 17 December 2024 14:53:23 CET Linus Lüssing wrote:
>> On Mon, Dec 16, 2024 at 07:37:12PM +0100, Sven Eckelmann wrote:
>>> diff --git a/net/batman-adv/main.c b/net/batman-adv/main.c
>>> index 8e0f44c71696f642d80304ec2724e8b5e56a5d93..333e947afcce7ca4128be8406f23295df723515c 100644
>>> --- a/net/batman-adv/main.c
>>> +++ b/net/batman-adv/main.c
>>> @@ -637,6 +637,13 @@ unsigned short batadv_get_vid(struct sk_buff *skb, size_t header_len)
>>>   
>>>   	vhdr = (struct vlan_ethhdr *)(skb->data + header_len);
>>>   	vid = ntohs(vhdr->h_vlan_TCI) & VLAN_VID_MASK;
>>> +
>>> +	/* VID 0 is only used to indicate "priority tag" frames which only
>>> +	 * contain priority information and no VID.
>>> +	 */
>>> +	if (vid == 0)
>>> +		return BATADV_NO_FLAGS;
>>> +
>>>   	vid |= BATADV_VLAN_HAS_TAG;
>>>   
>>>   	return vid;
>>
>> I guess with this patch all TT entries previously in
>> TT VLAN 0 would be moved to untagged/NO_FLAGS TT entries, right?
> 
> Yes, as specified by 802.1Q-2011, it is meant to transport only priority
> information and not a VID. For a switch, the PVID would be used but because
> batman-adv is here used as the lower device (for either a VLAN aware bridge or
> 8021q device), we don't have to add the PVID - the VID is simply missing
> (because it is !BATADV_VLAN_HAS_TAG) and therefore has to check for the
> "untagged" TT global entries (or add entries to the "untagged" TT local part)
> 
>> Wouldn't that technically break compatibility? Let's say someone
>> uses VLAN headers with VID 0 to be able to use priorities / QoS.
> 
> Then this person should have noticed that it is broken at the moment and doesn't
> work as expected (to reach the "untagged remotes" with the priority tagged
> packets)
> 
>> What if some old nodes still announced+used VLAN 0 in batman-adv
>> while others used it after this patch, with the mapping to
>> NO_FLAGS?
> 
> Then the misbehaving old node would still misbehave. Because you should
> actually be able to talk with VID 0 to the untagged global TT entries - which
> the old node fails to do. So I could also add
> 
>    Fixes: 0ffa9e8d86d6 ("batman-adv: use vid when computing local and global TT CRC")
>    Fixes: 5d2c05b21337 ("batman-adv: add per VLAN interface attribute framework")
> 
> if you prefer and transmit it via the batadv/net queue.
> 
> But I considered VID 0 somewhat esoteric for in-Linux usage because most tools
> just use DSCP. I am only aware of tools like isochron-send which just inject
> raw packets with the VLAN headers directly. And using another 8021q device
> with VID on one side is a good way to create a unidirectional communication
> (when you want a bidirectional one) because the other end will just reply
> with a vanilla, untagged packet. And because of that, things like ARP will
> not be able to "finish" because the answers are received on the non-VID0
> interface.

I am not aware of 0 being used as a sane VID.
I have seen it mostly used internally, but never truly used with 
something like "eth0.0".

Therefore I agree with Sven that this should not cause any real compat 
issue. If we truly break something, then we can probably assume that the 
scenario was already ill formed.

Regards,
  
Linus Lüssing Dec. 30, 2024, 9:50 p.m. UTC | #5
On Wed, Dec 18, 2024 at 09:02:27AM +0100, Antonio Quartulli wrote:
> I am not aware of 0 being used as a sane VID.
> I have seen it mostly used internally, but never truly used with something
> like "eth0.0".
> 
> Therefore I agree with Sven that this should not cause any real compat
> issue. If we truly break something, then we can probably assume that the
> scenario was already ill formed.

Ok, thank you both for elaborating on this. These are good
points. If no one uses VLAN 0 (in some partially broken/worked-around/
out-of-spec way) and no one screams right now, then I also like this change.

It would probably also make sense to make this change now, before
adding (some) VLAN learning to batman-adv. Otherwise, potentially
less informed, random users (compared to node administrators?)
would be more likely to use VLAN 0 in an unintended/out-of-spec way.
  

Patch

diff --git a/net/batman-adv/main.c b/net/batman-adv/main.c
index 8e0f44c71696f642d80304ec2724e8b5e56a5d93..333e947afcce7ca4128be8406f23295df723515c 100644
--- a/net/batman-adv/main.c
+++ b/net/batman-adv/main.c
@@ -637,6 +637,13 @@  unsigned short batadv_get_vid(struct sk_buff *skb, size_t header_len)
 
 	vhdr = (struct vlan_ethhdr *)(skb->data + header_len);
 	vid = ntohs(vhdr->h_vlan_TCI) & VLAN_VID_MASK;
+
+	/* VID 0 is only used to indicate "priority tag" frames which only
+	 * contain priority information and no VID.
+	 */
+	if (vid == 0)
+		return BATADV_NO_FLAGS;
+
 	vid |= BATADV_VLAN_HAS_TAG;
 
 	return vid;
diff --git a/net/batman-adv/soft-interface.c b/net/batman-adv/soft-interface.c
index 1f06861bc86a7d23c48d91e61298f48f6ec6b3f9..282a8f9b144471b12f62a547b3e57666cbb22c6d 100644
--- a/net/batman-adv/soft-interface.c
+++ b/net/batman-adv/soft-interface.c
@@ -637,6 +637,14 @@  static int batadv_interface_add_vid(struct net_device *dev, __be16 proto,
 	if (proto != htons(ETH_P_8021Q))
 		return -EINVAL;
 
+	/* VID 0 is only used to indicate "priority tag" frames which only
+	 * contain priority information and no VID. No management structures
+	 * should be created for this VID and it should be handled like an
+	 * untagged frame.
+	 */
+	if (vid == 0)
+		return 0;
+
 	vid |= BATADV_VLAN_HAS_TAG;
 
 	/* if a new vlan is getting created and it already exists, it means that
@@ -684,6 +692,12 @@  static int batadv_interface_kill_vid(struct net_device *dev, __be16 proto,
 	if (proto != htons(ETH_P_8021Q))
 		return -EINVAL;
 
+	/* "priority tag" frames are handled like "untagged" frames
+	 * and no softif_vlan needs to be destroyed
+	 */
+	if (vid == 0)
+		return 0;
+
 	vlan = batadv_softif_vlan_get(bat_priv, vid | BATADV_VLAN_HAS_TAG);
 	if (!vlan)
 		return -ENOENT;
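
The two soft-interface.c hunks belong together: once the add path stops creating an object for VID 0, the kill path would presumably hit its -ENOENT branch for VID 0 unless it also returns early. The userspace model below sketches that symmetry. It is heavily simplified: the array stands in for the list of batadv_softif_vlan objects, the model_* names are hypothetical, and the real callbacks also check the protocol and manage reference counts.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_N_VID 4096

/* Hypothetical stand-in for the per-VLAN objects of a soft interface. */
static bool vlan_present[MODEL_N_VID];

/* Model of the add callback: VID 0 succeeds without creating anything. */
int model_add_vid(uint16_t vid)
{
	if (vid >= MODEL_N_VID)
		return -EINVAL;
	if (vid == 0)
		return 0;

	vlan_present[vid] = true;
	return 0;
}

/* Model of the kill callback: VID 0 succeeds without a lookup. */
int model_kill_vid(uint16_t vid)
{
	if (vid >= MODEL_N_VID)
		return -EINVAL;
	if (vid == 0)
		return 0;
	if (!vlan_present[vid])
		return -ENOENT;

	vlan_present[vid] = false;
	return 0;
}

bool model_has_vlan(uint16_t vid)
{
	return vid < MODEL_N_VID && vlan_present[vid];
}

void model_demo(void)
{
	/* VID 0 is accepted on both paths but never tracked. */
	assert(model_add_vid(0) == 0);
	assert(!model_has_vlan(0));
	assert(model_kill_vid(0) == 0);

	/* A regular VID is created once and removed once. */
	assert(model_add_vid(10) == 0);
	assert(model_has_vlan(10));
	assert(model_kill_vid(10) == 0);
	assert(model_kill_vid(10) == -ENOENT);
}
```

Returning 0 instead of an error for VID 0 matters here because, as the commit message notes, the 8021q code adds VID 0 to all devices automatically.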