Inter-VLAN routing on CURRENT: any known issues?

From: O. Hartmann <ohartmann_at_walstatt.org>
Date: Wed, 12 Jul 2017 21:43:34 +0200
For a couple of days now I have failed to set up VLAN trunking on a FreeBSD 12-CURRENT box
(FreeBSD 12.0-CURRENT #9 r320913: Wed Jul 12 17:26:22 CEST 2017 amd64), which is based on
a PCEngines APU 2C4 board with three Intel i210 NICs.

igb0 is connected to an Allnet VDSL modem via tun0/ppp.
igb2 is unused.

igb1 is considered "multihomed" and comprises several VLANs:

[/etc/rc.conf]
gateway_enable="YES"
...
ifconfig_igb1="up"
vlans_igb1="1000 2 3 10 66 100"
ifconfig_igb1_1000="inet 192.168.0.1/24"
create_args_igb1_1000="vlanpcp 7"
ifconfig_igb1_2="inet 192.168.2.1/24"
ifconfig_igb1_3="inet 192.168.3.1/24"
ifconfig_igb1_10="inet 192.168.10.1/24"
ifconfig_igb1_66="inet 192.168.66.1/24"
ifconfig_igb1_100="inet 192.168.100.1/24"
...
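
For reference, the rc.conf lines above should correspond roughly to a handful of manual
ifconfig calls. The following is only a sketch that prints those commands instead of
running them. The helper names (vlan_addr, print_vlan_cmds) are made up for illustration,
and it assumes that a dotted name such as igb1.1000 lets ifconfig derive the parent NIC
and the VLAN tag by itself.

```shell
#!/bin/sh
# Dry run: print (do not execute) manual ifconfig commands that should
# roughly match the rc.conf VLAN lines above.
# Assumption: dotted names (igb1.<tag>) imply parent NIC and VLAN tag.

parent="igb1"

# Address for a given VLAN tag, copied from the rc.conf lines above.
vlan_addr() {
    case "$1" in
        1000) echo "192.168.0.1/24" ;;
        *)    echo "192.168.$1.1/24" ;;
    esac
}

print_vlan_cmds() {
    echo "ifconfig ${parent} up"
    for tag in 1000 2 3 10 66 100; do
        extra=""
        # Only VLAN 1000 has create_args (vlanpcp 7) in rc.conf.
        if [ "$tag" = "1000" ]; then
            extra=" vlanpcp 7"
        fi
        echo "ifconfig ${parent}.${tag} create${extra}"
        echo "ifconfig ${parent}.${tag} inet $(vlan_addr "$tag")"
    done
}

print_vlan_cmds
```

Running the printed commands by hand, one VLAN at a time, would make it easier to see
whether a single VLAN interface already shows the problem.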

VLAN 1000 is considered my internal network; the others are for special purposes, e.g.
VLAN 2 is for VoIP equipment.

After booting a (customised) kernel, the router shows the following routing table:

root_at_gate:~ # netstat -Warn
Routing tables

Internet:
Destination        Gateway            Flags       Use    Mtu      Netif Expire
default            111.111.111.111    US          570   1492       tun0
111.111.111.111    link#12            UHS           0   1492       tun0
22.33.44.55        link#12            UHS           0  16384        lo0
127.0.0.1          link#4             UH          115  16384        lo0
192.168.0.0/24     link#2             U         13930   1500  igb1.1000
192.168.0.1        link#2             UHS           0  16384        lo0
192.168.2.0/24     link#7             U             1   1500     igb1.2
192.168.2.1        link#7             UHS           0  16384        lo0
192.168.3.0/24     link#8             U             0   1500     igb1.3
192.168.3.1        link#8             UHS           0  16384        lo0
192.168.10.0/24    link#9             U             0   1500    igb1.10
192.168.10.1       link#9             UHS           0  16384        lo0
192.168.66.0/24    link#10            U             0   1500    igb1.66
192.168.66.1       link#10            UHS           0  16384        lo0
192.168.100.0/24   link#11            U             0   1500   igb1.100
192.168.100.1      link#11            UHS           0  16384        lo0

All interfaces (including the VLAN interfaces) show "UP" in their status.

sshd, named and other services on the router are bound to 192.168.0.1, which is its IP
address on the internal network.

The router's igb1-NIC is physically connected to a SoHo switch Netgear GS110TP.

Its configuration, in short and following the manual
(http://www.netgear.com/support/product/GS110TP.aspx#docs , chapter 3, page 84), is as
follows.

Port g9 is considered the trunk/etherchannel port (via a 1 Gbit GBIC). According to my
setup, the VLANs 1, 2, 3 (switch-native), 10, 66, 100 and 1000 are defined. In the VLAN
membership configuration for VLAN 1, only port g1 is marked "U"; this is my
maintenance port. For VLAN 1000, ports g1-g4 are "U" (untagged) and g9 is "T" (tagged).
For VLAN 2, port g7 is "U", g8 is "T" (the VoIP telephone has VLAN tag 2) and the trunk
g9 is "T". VLAN 100 occupies port g5 as "U", with port g9 as "T". The other VLANs are
unused at the moment.

According to the manual's section "Port VLAN ID Configuration" (PVID), g1-g4 have PVID
1000, Accept. Frame Type is "Admit All" and Ingress Filtering is "disabled". The other
so-called "access ports" are set up accordingly.
g9, the trunk port, has PVID 1, "Admit All", and Ingress Filtering disabled. The other
settings are mostly as the switch leaves them after a factory reset.

On ports g1 - g4 I have a dual-port NIC'ed server (one port vlan 1000, other vlan 100)
running and a notebook, which I can configure freely.

Now the FUN PART:

From any host in any VLAN I'm able to ping hosts on the wild internet via their IP; on
VLAN 1000 there is a DNS server running, so I'm also able to resolve names like google.com
or FreeBSD.org. But I can NOT(!) access any host via http/www or ssh.

I also cannot access a host's sshd in a neighbouring VLAN routed via the router, say
from a host/server on VLAN 1000 to a host/VoIP telephone on VLAN 2. I can ping the hosts
from each VLAN to the other (so ICMP flows), but any other IP service seems to get sucked
into a black hole. From hosts on VLAN 1000 I can access the router's sshd (192.168.0.1).

More disturbing: from the router itself, I'm able to access the sshd of each host on
each VLAN, i.e. VLAN 1000 and VLAN 2 (VoIP). Likewise, when I set up a notebook (FreeBSD
12-CURRENT of the same or similar revision) in VLAN 2, VLAN 100 or VLAN 66 with sshd
listening on all interfaces, I'm able to connect to that system from the router. Also,
from the router itself, I can ping any host on any VLAN and the internet (routed via
tun0/igb0/modem). From any host on any VLAN, I can ping the router, I can ping the world,
and I can ping other hosts on other VLANs. Obviously, ICMP is routed.

Any attempt to access a service from a host in any VLAN to a host's service on another
VLAN fails. IP traffic other than ICMP is not routed, and I do not see why.

The kernel is compiled with in-kernel IPFW. No matter what I do, whether ipfw is set to
"OPEN" or uses my own ruleset (which works in the special case I describe later), routing
between VLANs does not seem to work for any IP packet!

Using tcpdump on the router while trying to ssh into another host, I see the initial
[S]-marked connection attempt, i.e. 192.168.0.128 > 192.168.2.50: [S]. Once the packet has
been sent from the sender to the router, it is never passed on to the recipient.
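
To tell whether the SYN is dropped inside the router or only after leaving it, one
capture on each side of the router would probably help. Below is a sketch that only
prints the two tcpdump commands. The helper names (vlan_if_for, print_captures) are
hypothetical, the address-to-interface mapping is taken from the routing table above, and
port 22 assumes that sshd listens on its default port.

```shell
#!/bin/sh
# Map an IPv4 address to the VLAN interface serving its /24,
# following the routing table shown earlier.
vlan_if_for() {
    case "$1" in
        192.168.0.*)   echo "igb1.1000" ;;
        192.168.2.*)   echo "igb1.2" ;;
        192.168.3.*)   echo "igb1.3" ;;
        192.168.10.*)  echo "igb1.10" ;;
        192.168.66.*)  echo "igb1.66" ;;
        192.168.100.*) echo "igb1.100" ;;
        *)             return 1 ;;
    esac
}

# Print (do not run) one tcpdump command for the ingress VLAN interface
# and one for the egress VLAN interface of a routed TCP connection.
print_captures() {
    src="$1"; dst="$2"; port="$3"
    in_if=$(vlan_if_for "$src") || return 1
    out_if=$(vlan_if_for "$dst") || return 1
    filter="host ${src} and host ${dst} and tcp port ${port}"
    echo "tcpdump -n -i ${in_if} '${filter}'"
    echo "tcpdump -n -i ${out_if} '${filter}'"
}

# The ssh attempt described above; port 22 is an assumption.
print_captures 192.168.0.128 192.168.2.50 22
```

If the [S] packet shows up on the first interface but never on the second, the packet is
lost inside the router; if it shows up on both, the switch or the target host would be
the next place to look.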

Before I start making weird speculations, I must confess that using tcpdump and other
network tools is not my favourite pastime and I'm quite a novice in that field.

I need advice. Also, I need to know whether the setup I showed should work or whether I
am making a serious and stupid mistake (maybe due to not having understood FreeBSD's
routing, or routing at all).

If I drop the VLANs from the setup shown above and use only igb1 as a "vanilla" NIC,
everything works smoothly, except for the fact that I do not have network separation. But
it shows me that in principle the complete setup isn't complete bullshit. From that
perspective, even just changing igb1 to igb1.1000 (a tagged VLAN) should work. But it
doesn't.

I'm not sure whether IPFW or some other knob is the culprit. For the record, these are
the ipfw settings in the kernel configuration:

[...]
options         NETGRAPH                # netgraph(4) system
options         NETGRAPH_IPFW
options         NETGRAPH_NETFLOW
options         NETGRAPH_ETHER
options         NETGRAPH_NAT
options         NETGRAPH_DEVICE
options         NETGRAPH_PPPOE
options         NETGRAPH_SOCKET
options         NETGRAPH_ASYNC
options         NETGRAPH_TEE

# IPFW firewall
options         IPFIREWALL
options         IPFIREWALL_VERBOSE
options         IPFIREWALL_VERBOSE_LIMIT=0
options         IPFIREWALL_NAT          # ipfw kernel nat support
options         LIBALIAS                # ipfw kernel nat support
options         IPDIVERT                # NAT
options         DUMMYNET                # traffic shaper
#
#options                IPFIREWALL_DEFAULT_TO_ACCEPT
[...]

and from sysctl:
kern.features.ipfw_ctl3: 1
net.link.ether.ipfw: 0
net.link.bridge.ipfw: 0
net.link.bridge.ipfw_arp: 0
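
Output of this kind can be summarised with a few lines of sh. The following is just a
sketch: the function name report_knobs is made up, net.inet.ip.forwarding is the sysctl
that gateway_enable="YES" switches on, and the forwarding value in the sample input is a
placeholder, not something read from the router.

```shell
#!/bin/sh
# Read "name: value" lines as printed by sysctl and comment on the
# knobs that matter for forwarding packets between the VLANs.
report_knobs() {
    while IFS=': ' read -r name value; do
        case "$name" in
            net.inet.ip.forwarding)
                if [ "$value" = "1" ]; then
                    echo "IPv4 forwarding: enabled"
                else
                    echo "IPv4 forwarding: DISABLED"
                fi ;;
            net.link.ether.ipfw|net.link.bridge.ipfw)
                if [ "$value" = "0" ]; then
                    echo "$name: layer-2 ipfw filtering off"
                else
                    echo "$name: layer-2 ipfw filtering on"
                fi ;;
        esac
    done
}

# Sample input: the first value is a hypothetical placeholder,
# the second is the one shown in the sysctl output above.
printf '%s\n' 'net.inet.ip.forwarding: 1' 'net.link.ether.ipfw: 0' | report_knobs
```

On the router itself the input would come from something like
"sysctl net.inet.ip.forwarding net.link.ether.ipfw | report_knobs".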


So, if someone is willing to give me some hints, I'd be glad to hear from you. I'm
starting getting insane over this problem :-(

Kind regards and thanks for your patience,

Oliver



-- 
O. Hartmann

I object to the use or transfer of my data for advertising purposes or for market or
opinion research (§ 28 Abs. 4 BDSG).

Received on Wed Jul 12 2017 - 17:43:55 UTC
