new arp code snapshot for review...

From: Luigi Rizzo <rizzo_at_icir.org>
Date: Sun, 25 Apr 2004 09:49:40 -0700
Here is a snapshot of the new arp code that i have been working on
lately, based on Andre's ideas. (I say 'ARP' for brevity, what i
mean is the layer3-to-layer2 address translation code -- arp, aarp, nd6
all fit in the category).

The basic idea is to have per-ifp, per-af tables linked to the
ifnet itself. Each table is address-family specific, and as such
is managed by the protocol itself. It can be structured as a
list, an array with direct access, or a hash table depending on
the requirements. The search key is always the layer3 address.
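
As a rough illustration of the layout described above, here is a minimal userland sketch, assuming the simplest of the three structures (a linked list). All demo_* names are hypothetical and are not taken from the patch; the real code hangs 'struct lltable' off the ifnet.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* All demo_* names are hypothetical; this is userland code, not the patch. */
struct demo_llentry {
	struct demo_llentry *next;
	uint32_t l3_addr;		/* search key: the layer3 address */
	uint8_t ll_addr[6];		/* resolved layer2 address */
};

struct demo_lltable {
	struct demo_lltable *next;
	int af;				/* address family of this table */
	struct demo_llentry *head;	/* here: a plain list */
};

struct demo_ifnet {
	struct demo_lltable *lltables;	/* one table per address family */
};

/* Find the table for 'af' on this interface, optionally creating it. */
struct demo_lltable *
demo_table_get(struct demo_ifnet *ifp, int af, int create)
{
	struct demo_lltable *t;

	assert(ifp != NULL);
	for (t = ifp->lltables; t != NULL; t = t->next)
		if (t->af == af)
			return t;
	if (!create)
		return NULL;
	t = calloc(1, sizeof(*t));
	if (t != NULL) {
		t->af = af;
		t->next = ifp->lltables;
		ifp->lltables = t;
	}
	return t;
}

/* Look up 'l3_addr'; on a miss, insert a new entry if 'create' is set. */
struct demo_llentry *
demo_lookup(struct demo_ifnet *ifp, int af, uint32_t l3_addr, int create)
{
	struct demo_lltable *t = demo_table_get(ifp, af, create);
	struct demo_llentry *e;

	if (t == NULL)
		return NULL;
	for (e = t->head; e != NULL; e = e->next)
		if (e->l3_addr == l3_addr)
			return e;
	if (!create)
		return NULL;
	e = calloc(1, sizeof(*e));
	if (e != NULL) {
		e->l3_addr = l3_addr;
		e->next = t->head;
		t->head = e;
	}
	return e;
}
```

A hash table or a directly indexed array could replace the list inside demo_lltable without changing the callers, which is the point of keeping the table private to the protocol.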

The advantage is a reduction in size of the routing table, because
it does not have to store ARP entries anymore, and
a likely speedup of the arp lookups because now the table lends
itself nicely to quick lookup and easy management.

Also, when the approach is used for INET6 as well (which is the
only other AF using the routing table to store arp entries) rtentry's
will not need to support cloning anymore, nor store 'rt_gwroute',
'rt_llinfo', 'rt_genmask' and 'rt_parent' fields, which means another
large chunk of code simply goes away.

Entries in the table are tagged with some flags so the code knows
which ones refer to dynamic entries, local interface addresses,
or statically configured entries.
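
To make the role of the flags concrete, the following sketch shows how a periodic timer might decide whether to free an entry. The flag values mirror those in the if_ether.h part of the patch, but the DEMO_ prefix and the function name are hypothetical, and the logic is only an approximation of what newarptimer() does.

```c
#include <stdint.h>
#include <time.h>

/* Values copied from the patch; the DEMO_ names are hypothetical. */
#define DEMO_LLE_DELETED 0x0001	/* entry must be deleted */
#define DEMO_LLE_STATIC  0x0002	/* entry is static */
#define DEMO_LLE_IFADDR  0x0004	/* entry is an interface address */

/*
 * Return 1 if the entry should be freed at time 'now'.
 * Deleted entries always go, static ones never expire,
 * dynamic ones go once their expire time has been reached.
 */
int
demo_should_free(uint16_t flags, time_t expire, time_t now)
{
	if (flags & DEMO_LLE_DELETED)
		return 1;
	if (flags & DEMO_LLE_STATIC)
		return 0;
	return now >= expire;
}
```

Note that the deleted check comes first, so a static entry that has been explicitly deleted is still reclaimed.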

Compatibility with userland tools is preserved using some stub
routines which trap requests on the routing sockets and manipulate
the arp tables accordingly.
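
The trapping can be pictured as a dispatch on the address family of the gateway in the request. The sketch below is only an assumption-laden illustration: the demo_* names are hypothetical, and the numeric family values are placeholders rather than the system's definitions.

```c
#include <stddef.h>

/* Placeholder values for illustration; real code uses the system headers. */
enum demo_family { DEMO_AF_INET = 2, DEMO_AF_LINK = 18 };
enum demo_target { DEMO_INVALID, DEMO_TO_ROUTING_TABLE, DEMO_TO_ARP_TABLE };

struct demo_sockaddr {
	int family;
};

/*
 * Decide where an add/delete request goes: a link-layer gateway
 * means the request is about the arp table, anything else is an
 * ordinary routing table request.
 */
enum demo_target
demo_dispatch(const struct demo_sockaddr *gateway)
{
	if (gateway == NULL)
		return DEMO_INVALID;
	if (gateway->family == DEMO_AF_LINK)
		return DEMO_TO_ARP_TABLE;
	return DEMO_TO_ROUTING_TABLE;
}
```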

I have tried to keep the changes to a minimum (see below).
Basically all the existing functionality should be preserved, with
a few minor differences:

+ routing entries associated to interfaces are now non-clonable

+ the 'useloopback' flag is not yet implemented, because i have some
  doubts about its semantics. At the moment, and despite what you might
  think, 'useloopback' means "when you create (by cloning) a routing
  entry to the local host, use the loopback interface if useloopback
  is set at the time of cloning".
  Because there is no cloning anymore, the above semantics
  (which is not a design decision, just an accident) has to change
  slightly, to one of these two forms:
  - use the loopback interface for any local traffic if useloopback=1
  - create a routing entry that uses the loopback interface if
    useloopback is set when you assign an address to an interface
  The former is a lot simpler, so i would vote for that.
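
A minimal sketch of the first form, assuming one IPv4 address per interface and a caller that has already picked the normally routed interface. The demo_* names are hypothetical; this only illustrates the proposed semantics and is not code from the patch.

```c
#include <stddef.h>
#include <stdint.h>

struct demo_if {
	const char *name;
	uint32_t addr;		/* the single IPv4 address assigned */
};

/*
 * Pick the output interface under the first proposed semantics:
 * if useloopback is set and 'dst' is one of our own addresses,
 * use the loopback interface 'lo'; otherwise keep 'routed'.
 */
const struct demo_if *
demo_pick_output(const struct demo_if *ifs, size_t n,
    const struct demo_if *lo, const struct demo_if *routed,
    uint32_t dst, int useloopback)
{
	size_t i;

	if (useloopback)
		for (i = 0; i < n; i++)
			if (ifs[i].addr == dst)
				return lo;	/* dst is a local address */
	return routed;
}
```

The decision is taken per packet from the current value of useloopback, so nothing has to be recorded when an address is assigned, which is why this form is the simpler one.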

I also have patches for nd6, but these are a bit more extensive
and i am trying to see if i can write them in a way to minimize
differences with the existing code. In any case, ipv6 should work
unmodified.

--- Code changes: ---

src/usr.sbin/arp/arp.c
    one small change to make 'arp' requests clearly identifiable;

src/sys/net/route.c
    rtinit calls the new arp code, arp_ifscrub(), to remove an
    interface address when the address goes away.
    It also creates a non-clonable route-to-interface entry
    when a new interface address is configured.

src/sys/net/rtsock.c
    route_output() calls the new arp code, arp_rt_output(), to
    implement routing socket requests that relate to the arp table.
    Another method, sysctl_dumparp(), is used for the sysctl interface
    to the arp table. In both cases, the input and output data format
    is the same as before.

src/sys/netinet/if_ether.c
    this is the core of the new arp code for ipv4.
    At the moment this file also contains a number of generic routines
    which are not specific to ipv4 and so could well be moved to a
    different file.

    Note that arpresolve now completely ignores the 'rtentry' parameter
    passed by the upper layer.

src/sys/netinet/if_ether.h
    contains the definition of the 'struct lltable' and various
    flags that control the behaviour of each entry.
    All this should probably go elsewhere as it is not INET specific.


----------------------

comments welcome. The question i have is mainly:

    Have i forgotten anything ?

(the routing API is quite hard to follow...)
Please keep in mind that some things such as malloc vs uma,
field and variable names, location of code are going to change, so
if you have preferences please state them.
Also, as you see, there is no locking in place yet, i am leaving that task
to the locking gurus.

cheers
luigi

=========================================

Index: src/usr.sbin/arp/arp.c
===================================================================
RCS file: /home/ncvs/src/usr.sbin/arp/arp.c,v
retrieving revision 1.50
diff -u -p -r1.50 arp.c
--- src/usr.sbin/arp/arp.c	13 Apr 2004 14:16:37 -0000	1.50
+++ src/usr.sbin/arp/arp.c	25 Apr 2004 15:42:21 -0000
@@ -439,6 +439,17 @@ delete(char *host, int do_proxy)
 		    !(rtm->rtm_flags & RTF_GATEWAY) &&
 		    valid_type(sdl->sdl_type) )
 			break;	/* found it */
+		/* check the new arp interface */
+		if (sdl->sdl_family == AF_LINK &&
+		    !(rtm->rtm_flags & RTF_GATEWAY) &&
+		    valid_type(sdl->sdl_type) ) {
+			/*
+			 * found it. But overwrite the address to make
+			 * sure that we really get it.
+			 */
+			addr->sin_addr.s_addr = dst->sin_addr.s_addr;
+			break;
+		}
 		if (dst->sin_other & SIN_PROXY) {
 			fprintf(stderr, "delete: cannot locate %s\n",host);
 			return (1);
Index: src/sys/net/route.c
===================================================================
RCS file: /home/ncvs/src/sys/net/route.c,v
retrieving revision 1.104
diff -u -p -r1.104 route.c
--- src/sys/net/route.c	25 Apr 2004 01:39:00 -0000	1.104
+++ src/sys/net/route.c	25 Apr 2004 16:13:39 -0000
@@ -42,6 +42,7 @@
 #include <sys/kernel.h>
 
 #include <net/if.h>
+#include <net/if_dl.h>		/* for sockaddr_dl */
 #include <net/route.h>
 
 #include <netinet/in.h>
@@ -1105,9 +1106,13 @@ rt_maskedcopy(struct sockaddr *src, stru
 		bzero((caddr_t)cp2, (unsigned)(cplim2 - cp2));
 }
 
+void arp_ifscrub(struct ifnet *ifp, uint32_t addr);
+
 /*
  * Set up a routing table entry, normally
  * for an interface.
+ * Instead of the destination address, use a sockaddr_dl for the
+ * gateway, using the index and type of the interface.
  */
 int
 rtinit(struct ifaddr *ifa, int cmd, int flags)
@@ -1118,6 +1123,7 @@ rtinit(struct ifaddr *ifa, int cmd, int 
 	struct rtentry *rt = NULL;
 	struct rt_addrinfo info;
 	int error;
+	static struct sockaddr_dl null_sdl = {sizeof(null_sdl), AF_LINK};
 
 	if (flags & RTF_HOST) {
 		dst = ifa->ifa_dstaddr;
@@ -1126,6 +1132,13 @@ rtinit(struct ifaddr *ifa, int cmd, int 
 		dst = ifa->ifa_addr;
 		netmask = ifa->ifa_netmask;
 	}
+	printf("rtinit cmd %d flags 0x%x, ifa_ifp %p dst %d:0x%x gw %d:0x%x\n",
+	    cmd, flags, ifa->ifa_ifp,
+	    dst->sa_family,
+	    ntohl(((struct sockaddr_in *)dst)->sin_addr.s_addr),
+	    ifa->ifa_addr->sa_family,
+	    ntohl(((struct sockaddr_in *)ifa->ifa_addr)->sin_addr.s_addr));
+
 	/*
 	 * If it's a delete, check that if it exists, it's on the correct
 	 * interface or we might scrub a route to another ifa which would
@@ -1136,6 +1149,9 @@ rtinit(struct ifaddr *ifa, int cmd, int 
 		struct radix_node_head *rnh;
 		struct radix_node *rn;
 
+		if (dst->sa_family == AF_INET)
+		    arp_ifscrub(ifa->ifa_ifp,
+			((struct sockaddr_in *)dst)->sin_addr.s_addr);
 		/*
 		 * It's a delete, so it should already exist..
 		 * If it's a net, mask off the host bits
@@ -1175,10 +1191,14 @@ bad:
 	info.rti_ifa = ifa;
 	info.rti_flags = flags | ifa->ifa_flags;
 	info.rti_info[RTAX_DST] = dst;
-	info.rti_info[RTAX_GATEWAY] = ifa->ifa_addr;
+	info.rti_info[RTAX_GATEWAY] = (struct sockaddr *)&null_sdl;
 	info.rti_info[RTAX_NETMASK] = netmask;
 	error = rtrequest1(cmd, &info, &rt);
 	if (error == 0 && rt != NULL) {
+		((struct sockaddr_dl *)rt->rt_gateway)->sdl_type =
+			rt->rt_ifp->if_type;
+		((struct sockaddr_dl *)rt->rt_gateway)->sdl_index =
+			rt->rt_ifp->if_index;
 		/*
 		 * notify any listening routing agents of the change
 		 */
Index: src/sys/net/rtsock.c
===================================================================
RCS file: /home/ncvs/src/sys/net/rtsock.c,v
retrieving revision 1.107
diff -u -p -r1.107 rtsock.c
--- src/sys/net/rtsock.c	19 Apr 2004 07:20:32 -0000	1.107
+++ src/sys/net/rtsock.c	25 Apr 2004 15:39:49 -0000
@@ -91,6 +91,10 @@ static void	rt_getmetrics(const struct r
 			struct rt_metrics *out);
 static void	rt_dispatch(struct mbuf *, const struct sockaddr *);
 
+/* support new arp code */
+int arp_rt_output(struct rt_msghdr *rtm, struct rt_addrinfo *info);
+int sysctl_dumparp(int af, struct sysctl_req *wr);
+
 /*
  * It really doesn't make any sense at all for this code to share much
  * with raw_usrreq.c, since its functionality is so restricted.  XXX
@@ -275,6 +279,8 @@ static struct pr_usrreqs route_usrreqs =
 	sosend, soreceive, sopoll, pru_sosetlabel_null
 };
 
+
+
 /*ARGSUSED*/
 static int
 route_output(struct mbuf *m, struct socket *so)
@@ -350,6 +356,11 @@ route_output(struct mbuf *m, struct sock
 		if (info.rti_info[RTAX_GATEWAY] == NULL)
 			senderr(EINVAL);
 		saved_nrt = NULL;
+		if (info.rti_info[RTAX_GATEWAY]->sa_family == AF_LINK) {
+			/* support for new ARP code */
+			arp_rt_output(rtm, &info);
+			break;
+		}
 		error = rtrequest1(RTM_ADD, &info, &saved_nrt);
 		if (error == 0 && saved_nrt) {
 			RT_LOCK(saved_nrt);
@@ -363,6 +374,11 @@ route_output(struct mbuf *m, struct sock
 
 	case RTM_DELETE:
 		saved_nrt = NULL;
+		if (info.rti_info[RTAX_GATEWAY]->sa_family == AF_LINK) {
+			/* support for new ARP code */
+			arp_rt_output(rtm, &info);
+			break;
+		}
 		error = rtrequest1(RTM_DELETE, &info, &saved_nrt);
 		if (error == 0) {
 			RT_LOCK(saved_nrt);
@@ -1069,6 +1085,7 @@ sysctl_rtsock(SYSCTL_HANDLER_ARGS)
 	int	i, lim, s, error = EINVAL;
 	u_char	af;
 	struct	walkarg w;
+	int found = 0;
 
 	name ++;
 	namelen--;
@@ -1100,8 +1117,17 @@ sysctl_rtsock(SYSCTL_HANDLER_ARGS)
 			    	error = rnh->rnh_walktree(rnh,
 				    sysctl_dumpentry, &w);/* could sleep XXX */
 				/* RADIX_NODE_HEAD_UNLOCK(rnh); */
-			} else if (af != 0)
-				error = EAFNOSUPPORT;
+				if (error)
+					break;
+				found = 1;
+			}
+		/*
+		 * take care of llinfo entries. XXX check AF_INET ?
+		 */
+		if (w.w_op == NET_RT_FLAGS && (RTF_LLINFO & w.w_arg))
+			error = sysctl_dumparp(af, w.w_req);
+		else if (af != 0 && found == 0)
+			error = EAFNOSUPPORT;
 		break;
 
 	case NET_RT_IFLIST:
Index: src/sys/netinet/if_ether.c
===================================================================
RCS file: /home/ncvs/src/sys/netinet/if_ether.c,v
retrieving revision 1.127
diff -u -p -r1.127 if_ether.c
--- src/sys/netinet/if_ether.c	25 Apr 2004 15:00:17 -0000	1.127
+++ src/sys/netinet/if_ether.c	25 Apr 2004 16:14:03 -0000
@@ -27,7 +27,7 @@
  * SUCH DAMAGE.
  *
  *	@(#)if_ether.c	8.1 (Berkeley) 6/10/93
- * $FreeBSD: src/sys/netinet/if_ether.c,v 1.127 2004/04/25 15:00:17 luigi Exp $
+ * $FreeBSD$
  */
 
 /*
@@ -101,7 +101,6 @@ struct llinfo_arp {
 static	LIST_HEAD(, llinfo_arp) llinfo_arp;
 
 static struct	ifqueue arpintrq;
-static int	arp_allocated;
 
 static int	arp_maxtries = 5;
 static int	useloopback = 1; /* use loopback interface for local traffic */
@@ -116,18 +115,303 @@ SYSCTL_INT(_net_link_ether_inet, OID_AUT
 	   &arp_proxyall, 0, "");
 
 static void	arp_init(void);
-static void	arp_rtrequest(int, struct rtentry *, struct rt_addrinfo *);
 static void	arprequest(struct ifnet *,
 			struct in_addr *, struct in_addr *, u_char *);
 static void	arpintr(struct mbuf *);
 static void	arptfree(struct llinfo_arp *);
 static void	arptimer(void *);
-static struct llinfo_arp
-		*arplookup(u_long, int, int);
+struct llentry *arplookup(struct ifnet *ifp, uint32_t addr, uint32_t flags);
 #ifdef INET
 static void	in_arpinput(struct mbuf *);
 #endif
 
+/***
+ ***
+ *** Start of new arp support routines which should go to a separate file.
+ ***
+ ***/
+#define DEB(x)
+#define	DDB(x)	x
+
+struct llentry {
+	struct llentry *lle_next;
+	struct mbuf	*la_hold;
+	uint16_t	flags; /* see values in if_ether.h */
+	uint8_t		la_preempt;
+	uint8_t		la_asked;
+	time_t		expire;
+	struct in_addr	l3_addr;
+	union {
+		uint64_t	mac_aligned;
+		uint16_t	mac16[3];
+	} ll_addr;
+};
+
+MALLOC_DEFINE(M_ARP, "arp", "arp entries");	/* XXX will move to UMA */
+
+int arp_rt_output(struct rt_msghdr *rtm, struct rt_addrinfo *info);
+int sysctl_dumparp(int af, struct sysctl_req *wr);
+void arp_ifscrub(struct ifnet *ifp, uint32_t addr);
+
+/*
+ * called by in_ifscrub to remove entry from the table when
+ * the interface goes away
+ */
+void
+arp_ifscrub(struct ifnet *ifp, uint32_t addr)
+{
+	arplookup(ifp, addr, LLE_DELETE | LLE_IFADDR);
+}
+
+/*
+ * Find an interface address matching the ifp-addr pair.
+ * This may replicate some of the functions of ifa_ifwithnet()
+ */
+static struct ifaddr *
+find_ifa(struct ifnet *ifp, uint32_t addr)
+{
+	struct ifaddr *ifa;
+
+	if (ifp == NULL)
+		return NULL;
+	TAILQ_FOREACH(ifa, &ifp->if_addrhead, ifa_link) {
+		if (ifa->ifa_addr->sa_family != AF_INET)
+			continue;
+		if (ifp->if_flags & IFF_POINTOPOINT)
+			break;
+		if (((addr ^ SIN(ifa->ifa_addr)->sin_addr.s_addr) &  
+		    SIN(ifa->ifa_netmask)->sin_addr.s_addr ) == 0)
+			break; /* found! */
+	}
+	return ifa;
+}
+
+static void
+llentry_free(struct llentry **e)
+{
+	struct llentry *x;
+
+	if (e == 0)
+		panic("llentry_free: null ptr");
+	x = *e;
+	*e = x->lle_next;
+	if (x->la_hold)
+		m_freem(x->la_hold);
+	free(x, M_ARP);
+}
+
+/*
+ * Add a new table at the head of the list for interface ifp  
+ */
+struct lltable *
+lltable_new(struct ifnet *ifp, int af)
+{
+	struct lltable *t;
+
+	t = malloc(sizeof (struct lltable), M_ARP, M_DONTWAIT | M_ZERO);
+	if (t != NULL) {
+		t->llt_next = ifp->lltables;
+		t->llt_af = af;
+		ifp->lltables = t;
+	}
+	return t;
+}
+
+struct lltable **
+lltable_free(struct lltable **t)   
+{
+	struct lltable *x;
+
+	if (t == NULL)
+		panic("lltable_free: null ptr");
+	x = *t;
+	*t = x->llt_next;     
+	free(x, M_ARP);
+	return t;
+}
+
+static void
+newarptimer(__unused void *ignored_arg)
+{
+	struct lltable *t;
+	struct llentry **e;
+	struct ifnet *ifp;
+
+	IFNET_RLOCK();
+	printf("arptimer!\n");
+	TAILQ_FOREACH(ifp, &ifnet, if_link) {
+		for (t = ifp->lltables; t ; t = t->llt_next) {
+			if (t->llt_af != AF_INET)
+				continue;
+			for (e = (struct llentry **)&t->lle_head; *e; ) {
+				int kill;
+
+				if ((*e)->flags & LLE_DELETED)
+					kill = 1;
+				else if ((*e)->flags & LLE_STATIC)
+					kill = 0;
+				else
+					kill = time_second >= (*e)->expire;
+				if (kill)
+					llentry_free(e);
+				else
+					e = &((*e)->lle_next);
+			}
+		}
+	}
+	IFNET_RUNLOCK();
+	callout_reset(&arp_callout, arpt_prune * hz, newarptimer, NULL);
+}
+
+static int
+inet_dumparp(struct ifnet *ifp, void *head, struct sysctl_req *wr)
+{
+	struct llentry *e;
+	int error = 0;   
+
+	for (e = head; e; e = e->lle_next) {
+		struct {
+			struct rt_msghdr        rtm;
+			struct sockaddr_inarp   sin2;
+			struct sockaddr_dl      sdl;
+			//struct sockaddr_inarp addr2;
+		} d;
+
+		DEB(printf("ifp %p index %d flags 0x%x ip %x %s\n",
+		    ifp, ifp->if_index,
+		    e->flags,
+		    ntohl(e->l3_addr.s_addr),
+		    (e->flags & LLA_VALID) ? "valid" : "incomplete");)
+		if (e->flags & LLE_DELETED) /* skip deleted entries */
+			continue;
+		/*
+		 * produce a msg made of:
+		 *  struct rt_msghdr;
+		 *  struct sockaddr_inarp;
+		 *  struct sockaddr_dl;
+		 */
+		bzero(&d, sizeof (d));
+		d.rtm.rtm_msglen = sizeof(d);
+		d.sin2.sin_family = AF_INET;
+		d.sin2.sin_len = sizeof(d.sin2);
+		d.sin2.sin_addr.s_addr = e->l3_addr.s_addr;
+
+		if (e->flags & LLA_VALID) { /* valid MAC */
+			d.sdl.sdl_family = AF_LINK;
+			d.sdl.sdl_len = sizeof(d.sdl);
+			d.sdl.sdl_alen = ifp->if_addrlen;
+			d.sdl.sdl_index = ifp->if_index;
+			d.sdl.sdl_type = ifp->if_type;
+			bcopy(&e->ll_addr, LLADDR(&d.sdl), ifp->if_addrlen);
+		}
+		d.rtm.rtm_rmx.rmx_expire =
+		    e->flags & LLE_STATIC ? 0 : e->expire;
+		d.rtm.rtm_flags = RTF_LLINFO;
+		if (e->flags & LLE_STATIC)
+			d.rtm.rtm_flags |= RTF_STATIC;
+		d.rtm.rtm_index = ifp->if_index;
+		error = SYSCTL_OUT(wr, &d, sizeof(d));
+		if (error)
+			break;
+	}
+	return error;
+}
+
+/*
+ * glue to dump arp tables
+ */
+int
+sysctl_dumparp(int af, struct sysctl_req *wr)
+{
+	struct lltable *t;
+	struct ifnet *ifp;
+	int error = 0;
+
+	IFNET_RLOCK();
+	TAILQ_FOREACH(ifp, &ifnet, if_link) {
+		for (t = ifp->lltables; t ; t = t->llt_next) {
+			if (af != 0 && t->llt_af != af)
+				continue;
+			switch (t->llt_af) {
+			case AF_INET:
+				error = inet_dumparp(ifp, t->lle_head, wr);
+				break;
+			/* other handlers, if any */
+			}
+			if (error)
+				goto done;
+		}
+	 }
+done:
+	IFNET_RUNLOCK();
+	return (error);
+}
+
+/*
+ * Called in route_output when adding/deleting a route to an interface.
+ */
+int
+arp_rt_output(struct rt_msghdr *rtm, struct rt_addrinfo *info)
+{
+	struct sockaddr_dl *dl =
+		(struct sockaddr_dl *)info->rti_info[RTAX_GATEWAY];
+	struct sockaddr_in *dst =
+		(struct sockaddr_in *)info->rti_info[RTAX_DST];
+	struct ifnet *ifp;
+	struct llentry *la;
+	u_int flags;
+
+	printf("arp_rt_output type %d af: gw %d dst %d:%x if_index %d\n",
+	    rtm->rtm_type,
+	    dl ? dl->sdl_family : 0,
+	    dst ? dst->sin_family : 0,
+	    dst && dst->sin_family == AF_INET ?
+	    ntohl(dst->sin_addr.s_addr) : 0,
+	    dl ? dl->sdl_index : 0);
+	if (dl == NULL || dl->sdl_family != AF_LINK) {
+		/* XXX should also check (dl->sdl_index < if_indexlim) */
+		printf("invalid gateway/index\n");
+		return EINVAL;
+	}
+	ifp = ifnet_byindex(dl->sdl_index);
+	if (ifp == NULL) {
+		printf("invalid ifp\n");
+		return EINVAL;
+	}
+
+	switch (rtm->rtm_type) {
+	case RTM_ADD:
+		flags = LLE_CREATE;
+		break;
+
+	case RTM_CHANGE:
+	default:
+		return EINVAL; /* XXX not implemented yet */
+
+	case RTM_DELETE:
+		flags = LLE_DELETE;
+		break;
+	}
+	la = arplookup(ifp, dst->sin_addr.s_addr, flags);
+	if (la != NULL) {
+		bcopy(LLADDR(dl), &la->ll_addr, ifp->if_addrlen);
+		la->flags |= LLA_VALID;
+		if (rtm->rtm_flags & RTF_STATIC)
+			la->flags |= LLE_STATIC;
+		else
+			la->expire = time_second + arpt_keep;
+	}
+	return 0;
+}
+
+
+
+/***
+ ***
+ *** End of new arp support routines which should go to a separate file.
+ ***
+ ***/
+
 /*
  * Timeout routine.  Age arp_tab entries periodically.
  */
@@ -152,6 +436,9 @@ arptimer(ignored_arg)
 	callout_reset(&arp_callout, arpt_prune * hz, arptimer, NULL);
 }
 
+#if 0 /* this is unused */
+static int	arp_allocated;
+
 /*
  * Parallel to llc_rtrequest.
  */
@@ -284,6 +571,7 @@ arp_rtrequest(req, rt, info)
 		Free((caddr_t)la);
 	}
 }
+#endif /* arp_rtrequest unused */
 
 /*
  * Broadcast an ARP request. Caller specifies:
@@ -301,6 +589,28 @@ arprequest(ifp, sip, tip, enaddr)
 	struct arphdr *ah;
 	struct sockaddr sa;
 
+	if (sip == NULL) {
+		/*
+		 * The caller did not supply a source address, try to find
+		 * a compatible one among those assigned to this interface.
+		 */
+		struct ifaddr *ifa;
+
+		TAILQ_FOREACH(ifa, &ifp->if_addrhead, ifa_link) {
+			if (!ifa->ifa_addr ||
+			    ifa->ifa_addr->sa_family != AF_INET)
+				continue;
+			sip = &SIN(ifa->ifa_addr)->sin_addr;
+			if (0 == ((sip->s_addr ^ tip->s_addr) &
+			    SIN(ifa->ifa_netmask)->sin_addr.s_addr) )
+				break;	/* found it. */
+		}
+	}
+	if (sip == NULL) {  
+		printf(" cannot find matching address, no arprequest\n");
+		return;
+	}
+
 	if ((m = m_gethdr(M_DONTWAIT, MT_DATA)) == NULL)
 		return;
 	m->m_len = sizeof(*ah) + 2*sizeof(struct in_addr) +
@@ -344,16 +654,11 @@ int
 arpresolve(struct ifnet *ifp, struct rtentry *rt0, struct mbuf *m,
 	struct sockaddr *dst, u_char *desten)
 {
-	struct llinfo_arp *la = 0;
+	struct llentry *la = 0;
 	struct sockaddr_dl *sdl;
-	int error;
 	struct rtentry *rt;
-
-	error = rt_check(&rt, &rt0, dst);
-	if (error) {
-		m_freem(m);
-		return error;
-	}
+	u_int flags = (ifp->if_flags & (IFF_NOARP | IFF_STATICARP)) ?
+			0 : LLE_CREATE;
 
 	if (m->m_flags & M_BCAST) {	/* broadcast */
 		(void)memcpy(desten, ifp->if_broadcastaddr, ifp->if_addrlen);
@@ -363,51 +668,39 @@ arpresolve(struct ifnet *ifp, struct rte
 		ETHER_MAP_IP_MULTICAST(&SIN(dst)->sin_addr, desten);
 		return (0);
 	}
-	if (rt)
-		la = (struct llinfo_arp *)rt->rt_llinfo;
-	if (la == 0) {
-		la = arplookup(SIN(dst)->sin_addr.s_addr, 1, 0);
-		if (la)
-			rt = la->la_rt;
-	}
-	if (la == 0 || rt == 0) {
-		log(LOG_DEBUG, "arpresolve: can't allocate llinfo for %s%s%s\n",
-			inet_ntoa(SIN(dst)->sin_addr), la ? "la" : "",
-				rt ? "rt" : "");
+	la = arplookup(ifp, SIN(dst)->sin_addr.s_addr, flags);
+	if (la == NULL) {
+		if (flags & LLE_CREATE)
+			log(LOG_DEBUG,
+				"arpresolve: can't allocate llinfo for %s\n",
+				inet_ntoa(SIN(dst)->sin_addr));
 		m_freem(m);
 		return (EINVAL); /* XXX */
 	}
 	sdl = SDL(rt->rt_gateway);
 	/*
-	 * Check the address family and length is valid, the address
-	 * is resolved; otherwise, try to resolve.
+	 * If the entry is valid and not expired, use it.
 	 */
-	if ((rt->rt_expire == 0 || rt->rt_expire > time_second) &&
-	    sdl->sdl_family == AF_LINK && sdl->sdl_alen != 0) {
+	if (la->flags & LLA_VALID &&
+	    (la->flags & LLE_STATIC || la->expire > time_second)) {
+		bcopy(&la->ll_addr, desten, ifp->if_addrlen);
 		/*
 		 * If entry has an expiry time and it is approaching,
 		 * see if we need to send an ARP request within this
 		 * arpt_down interval.
 		 */
-		if ((rt->rt_expire != 0) &&
-		    (time_second + la->la_preempt > rt->rt_expire)) {
-			arprequest(ifp,
-				   &SIN(rt->rt_ifa->ifa_addr)->sin_addr,
-				   &SIN(dst)->sin_addr,
-				   IF_LLADDR(ifp));
+		if (!(la->flags & LLE_STATIC) &&
+		    time_second + la->la_preempt > la->expire) {
+			arprequest(ifp, NULL,
+				&SIN(dst)->sin_addr, IF_LLADDR(ifp));
+
 			la->la_preempt--;
 		} 
-
-		bcopy(LLADDR(sdl), desten, sdl->sdl_alen);
 		return (0);
 	}
-	/*
-	 * If ARP is disabled or static on this interface, stop.
-	 * XXX
-	 * Probably should not allocate empty llinfo struct if we are
-	 * not going to be sending out an arp request.
-	 */
-	if (ifp->if_flags & (IFF_NOARP | IFF_STATICARP)) {
+	if (la->flags & LLE_STATIC) {	/* should not happen! */
+		log(LOG_DEBUG, "arpresolve: ouch, empty static llinfo for %s\n",
+				inet_ntoa(SIN(dst)->sin_addr));
 		m_freem(m);
 		return (EINVAL);
 	}
@@ -419,26 +712,26 @@ arpresolve(struct ifnet *ifp, struct rte
 	if (la->la_hold)
 		m_freem(la->la_hold);
 	la->la_hold = m;
-	if (rt->rt_expire) {
-		RT_LOCK(rt);
-		rt->rt_flags &= ~RTF_REJECT;
-		if (la->la_asked == 0 || rt->rt_expire != time_second) {
-			rt->rt_expire = time_second;
-			if (la->la_asked++ < arp_maxtries) {
-				arprequest(ifp,
-					   &SIN(rt->rt_ifa->ifa_addr)->sin_addr,
-					   &SIN(dst)->sin_addr,
-					   IF_LLADDR(ifp));
-			} else {
-				rt->rt_flags |= RTF_REJECT;
-				rt->rt_expire += arpt_down;
-				la->la_asked = 0;
-				la->la_preempt = arp_maxtries;
-			}
-
+	/*
+	 * Now implement the logic to issue requests -- we can send up
+	 * to arp_maxtries with a 1-sec spacing, followed by a pause
+	 * of arpt_down seconds if no replies are coming back.
+	 * Take the chance to enforce limits on arp_maxtries and arpt_down
+	 */
+	if (la->expire <= time_second) { /* ok, expired */
+		if (arp_maxtries > 100) /* enforce a sane limit */
+			arp_maxtries = 100;
+		else if (arp_maxtries < 3)
+			arp_maxtries = 3;
+		if (la->la_asked++ < arp_maxtries)
+			la->expire = time_second + 1;
+		else {
+			la->la_asked = 0;
+			la->expire = time_second + arpt_down;
+			la->la_preempt = arp_maxtries;
 		}
-		RT_UNLOCK(rt);
-	}
+		arprequest(ifp, NULL, &SIN(dst)->sin_addr, IF_LLADDR(ifp));
+        }
 	return (EWOULDBLOCK);
 }
 
@@ -518,16 +811,12 @@ in_arpinput(m)
 {
 	struct arphdr *ah;
 	struct ifnet *ifp = m->m_pkthdr.rcvif;
-	struct iso88025_header *th = (struct iso88025_header *)0;
-	struct iso88025_sockaddr_dl_data *trld;
-	struct llinfo_arp *la = 0;
-	struct rtentry *rt;
+	struct llentry *la = 0;
 	struct ifaddr *ifa;
 	struct in_ifaddr *ia;
-	struct sockaddr_dl *sdl;
 	struct sockaddr sa;
 	struct in_addr isaddr, itaddr, myaddr;
-	int op, rif_len;
+	int op;
 	int req_len;
 
 	req_len = arphdr_len2(ifp->if_addrlen, sizeof(struct in_addr));
@@ -540,6 +829,19 @@ in_arpinput(m)
 	op = ntohs(ah->ar_op);
 	(void)memcpy(&isaddr, ar_spa(ah), sizeof (isaddr));
 	(void)memcpy(&itaddr, ar_tpa(ah), sizeof (itaddr));
+	/*
+	 * sanity check for the address length.
+	 * XXX this does not work for protocols with variable address
+	 * length. -is
+	 */
+	if (ifp->if_addrlen != ah->ar_hln) {
+		log(LOG_WARNING,
+		    "arp from %*D: addr len: new %d, i/f %d (ignored)",
+		    ifp->if_addrlen, (u_char *) ar_sha(ah), ":",
+		    ah->ar_hln, ifp->if_addrlen);
+		goto drop;
+	}
+
 #ifdef BRIDGE
 #define BRIDGE_TEST (do_bridge)
 #else
@@ -592,62 +894,41 @@ match:
 	}
 	if (ifp->if_flags & IFF_STATICARP)
 		goto reply;
-	la = arplookup(isaddr.s_addr, itaddr.s_addr == myaddr.s_addr, 0);
-	if (la && (rt = la->la_rt) && (sdl = SDL(rt->rt_gateway))) {
-		/* the following is not an error when doing bridging */
-		if (!BRIDGE_TEST && rt->rt_ifp != ifp) {
-			if (log_arp_wrong_iface)
-				log(LOG_ERR, "arp: %s is on %s but got reply from %*D on %s\n",
-				    inet_ntoa(isaddr),
-				    rt->rt_ifp->if_xname,
-				    ifp->if_addrlen, (u_char *)ar_sha(ah), ":",
-				    ifp->if_xname);
-			goto reply;
-		}
-		if (sdl->sdl_alen &&
-		    bcmp(ar_sha(ah), LLADDR(sdl), sdl->sdl_alen)) {
-			if (rt->rt_expire) {
-			    if (log_arp_movements)
-			        log(LOG_INFO, "arp: %s moved from %*D to %*D on %s\n",
-				    inet_ntoa(isaddr),
-				    ifp->if_addrlen, (u_char *)LLADDR(sdl), ":",
-				    ifp->if_addrlen, (u_char *)ar_sha(ah), ":",
-				    ifp->if_xname);
-			} else {
+	/* Look up the source. If I am the target, create an entry for it. */
+	la = arplookup(ifp, isaddr.s_addr,
+	    (itaddr.s_addr == myaddr.s_addr) ? LLE_CREATE : 0);
+	if (la != NULL) {
+		/* We have a valid entry. Check and store the MAC. */
+		if (la->flags & LLA_VALID &&
+		    bcmp(ar_sha(ah), &la->ll_addr, ifp->if_addrlen)) {
+			if (la->flags & LLE_STATIC) {
 			    log(LOG_ERR,
 				"arp: %*D attempts to modify permanent entry for %s on %s\n",
 				ifp->if_addrlen, (u_char *)ar_sha(ah), ":",
 				inet_ntoa(isaddr), ifp->if_xname);
 			    goto reply;
 			}
+			if (log_arp_movements)
+			        log(LOG_INFO, "arp: %s moved from %*D to %*D on %s\n",
+				    inet_ntoa(isaddr),
+				    ifp->if_addrlen, (u_char *)&la->ll_addr, ":",
+				    ifp->if_addrlen, (u_char *)ar_sha(ah), ":",
+				    ifp->if_xname);
 		}
-		/*
-		 * sanity check for the address length.
-		 * XXX this does not work for protocols with variable address
-		 * length. -is
-		 */
-		if (sdl->sdl_alen &&
-		    sdl->sdl_alen != ah->ar_hln) {
-			log(LOG_WARNING,
-			    "arp from %*D: new addr len %d, was %d",
-			    ifp->if_addrlen, (u_char *) ar_sha(ah), ":",
-			    ah->ar_hln, sdl->sdl_alen);
-		}
-		if (ifp->if_addrlen != ah->ar_hln) {
-			log(LOG_WARNING,
-			    "arp from %*D: addr len: new %d, i/f %d (ignored)",
-			    ifp->if_addrlen, (u_char *) ar_sha(ah), ":",
-			    ah->ar_hln, ifp->if_addrlen);
-			goto reply;
-		}
-		(void)memcpy(LLADDR(sdl), ar_sha(ah),
-		    sdl->sdl_alen = ah->ar_hln);
+		bcopy(ar_sha(ah), &la->ll_addr, ifp->if_addrlen);
+		la->flags |= LLA_VALID;
+#if 0 /* XXX this needs to be fixed */
 		/*
 		 * If we receive an arp from a token-ring station over
 		 * a token-ring nic then try to save the source
 		 * routing info.
 		 */
 		if (ifp->if_type == IFT_ISO88025) {
+			struct iso88025_header *th;
+			struct iso88025_sockaddr_dl_data *trld;
+			struct sockaddr_dl *sdl;
+			int rif_len;
+
 			th = (struct iso88025_header *)m->m_pkthdr.header;
 			trld = SDL_ISO88025(sdl);
 			rif_len = TR_RCF_RIFLEN(th->rcf);
@@ -673,15 +954,20 @@ match:
 			m->m_pkthdr.len += 8;
 			th->rcf = trld->trld_rcf;
 		}
-		RT_LOCK(rt);
-		if (rt->rt_expire)
-			rt->rt_expire = time_second + arpt_keep;
-		rt->rt_flags &= ~RTF_REJECT;
-		RT_UNLOCK(rt);
+#endif
+		if (!(la->flags & LLE_STATIC))
+			la->expire = time_second + arpt_keep;
 		la->la_asked = 0;
 		la->la_preempt = arp_maxtries;
 		if (la->la_hold) {
-			(*ifp->if_output)(ifp, la->la_hold, rt_key(rt), rt);
+			struct sockaddr_in sin;
+
+			bzero(&sin, sizeof(sin));
+			sin.sin_len = sizeof(struct sockaddr_in);
+			sin.sin_family = AF_INET;
+			sin.sin_addr.s_addr = la->l3_addr.s_addr;
+			ifp->if_output(ifp, la->la_hold,
+				(struct sockaddr *)&sin, NULL);
 			la->la_hold = 0;
 		}
 	}
@@ -693,9 +979,10 @@ reply:
 		(void)memcpy(ar_tha(ah), ar_sha(ah), ah->ar_hln);
 		(void)memcpy(ar_sha(ah), IF_LLADDR(ifp), ah->ar_hln);
 	} else {
-		la = arplookup(itaddr.s_addr, 0, SIN_PROXY);
+		la = arplookup(ifp, itaddr.s_addr, LLE_PROXY);
 		if (la == NULL) {
 			struct sockaddr_in sin;
+			struct rtentry *rt;
 
 			if (!arp_proxyall)
 				goto drop;
@@ -747,10 +1034,8 @@ reply:
 			       inet_ntoa(itaddr));
 #endif
 		} else {
-			rt = la->la_rt;
 			(void)memcpy(ar_tha(ah), ar_sha(ah), ah->ar_hln);
-			sdl = SDL(rt->rt_gateway);
-			(void)memcpy(ar_sha(ah), LLADDR(sdl), ah->ar_hln);
+			(void)memcpy(ar_sha(ah),  &la->ll_addr, ah->ar_hln);
 		}
 	}
 
@@ -798,66 +1083,77 @@ arptfree(la)
 /*
  * Lookup or enter a new address in arptab.
  */
-static struct llinfo_arp *
-arplookup(addr, create, proxy)
-	u_long addr;
-	int create, proxy;
+struct llentry *
+arplookup(struct ifnet *ifp, uint32_t l3addr, u_int flags)
 {
-	struct rtentry *rt;
-	struct sockaddr_inarp sin;
-	const char *why = 0;
-
-	bzero(&sin, sizeof(sin));
-	sin.sin_len = sizeof(sin);
-	sin.sin_family = AF_INET;
-	sin.sin_addr.s_addr = addr;
-	if (proxy)
-		sin.sin_other = SIN_PROXY;
-	rt = rtalloc1((struct sockaddr *)&sin, create, 0UL);
-	if (rt == 0)
-		return (0);
-
-	if (rt->rt_flags & RTF_GATEWAY)
-		why = "host is not on local network";
-	else if ((rt->rt_flags & RTF_LLINFO) == 0)
-		why = "could not allocate llinfo";
-	else if (rt->rt_gateway->sa_family != AF_LINK)
-		why = "gateway route is not ours";
-
-	if (why) {
-#define	ISDYNCLONE(_rt) \
-	(((_rt)->rt_flags & (RTF_STATIC | RTF_WASCLONED)) == RTF_WASCLONED)
-		if (create)
-			log(LOG_DEBUG, "arplookup %s failed: %s\n",
-			    inet_ntoa(sin.sin_addr), why);
-		/*
-		 * If there are no references to this Layer 2 route,
-		 * and it is a cloned route, and not static, and
-		 * arplookup() is creating the route, then purge
-		 * it from the routing table as it is probably bogus.
-		 */
-		if (rt->rt_refcnt == 1 && ISDYNCLONE(rt))
-			rtexpunge(rt);
-		RTFREE_LOCKED(rt);
-		return (0);
-#undef ISDYNCLONE
-	} else {
-		RT_REMREF(rt);
-		RT_UNLOCK(rt);
-		return ((struct llinfo_arp *)rt->rt_llinfo);
-	}
+	struct llentry *e;
+	struct lltable *t;
+	// uint proxy = flags & LLE_PROXY;
+
+	if (ifp == NULL)
+		return NULL;
+	/* LOCK_IFNET */
+	for (t = ifp->lltables; t && t->llt_af != AF_INET; t = t->llt_next)
+		;
+	if (t == NULL && flags & LLE_CREATE)
+		t = lltable_new(ifp, AF_INET);
+	if (t == NULL) {
+		/* UNLOCK_ALL_TABLES */
+		return NULL;    /* failed! */
+	}
+	/* LOCK_TABLE(t) */
+	/* UNLOCK_ALL_TABLES */
+	for (e = (struct llentry *)t->lle_head; e ; e = e->lle_next) {
+		if (e->flags & LLE_DELETED)
+			continue;
+		if (l3addr == e->l3_addr.s_addr)
+			break;
+        }
+	if (e == NULL) {        /* entry not found */ 
+		if (!(flags & LLE_CREATE))
+			goto done;
+		if (find_ifa(ifp, l3addr) == NULL) {
+			printf("host is not on local network\n");
+			goto done;
+		}
+		e = malloc(sizeof (struct llentry), M_ARP, M_DONTWAIT | M_ZERO);
+		if (e == NULL) {
+			printf("arp malloc failed\n");
+			goto done;
+		}
+		e->expire = time_second; /* mark expired */
+		e->l3_addr.s_addr = l3addr;
+		e->lle_next = t->lle_head;
+		t->lle_head = e;
+	}
+	if (flags & LLE_DELETE &&
+	    (e->flags & LLE_IFADDR) == (flags & LLE_IFADDR))
+		e->flags = LLE_DELETED;
+done:
+	/* UNLOCK(t) */
+	return e;
 }
 
+
 void
 arp_ifinit(ifp, ifa)
 	struct ifnet *ifp;
 	struct ifaddr *ifa;
 {
+	struct llentry *la;
+
+	printf("arp_ifinit ifp %p addr 0x%x\n",
+	    ifp, ntohl(IA_SIN(ifa)->sin_addr.s_addr));
+
 	if (ntohl(IA_SIN(ifa)->sin_addr.s_addr) != INADDR_ANY)
 		arprequest(ifp, &IA_SIN(ifa)->sin_addr,
 				&IA_SIN(ifa)->sin_addr, IF_LLADDR(ifp));
-	ifa->ifa_rtrequest = arp_rtrequest;
-	ifa->ifa_flags |= RTF_CLONING;
+	la = arplookup(ifp, IA_SIN(ifa)->sin_addr.s_addr, LLE_CREATE);
+	if (la) {	/* store our address */
+		bcopy(IF_LLADDR(ifp), &la->ll_addr, ifp->if_addrlen);
+		la->flags |= LLA_VALID | LLE_STATIC | LLE_IFADDR;
+	}
+	ifa->ifa_rtrequest = NULL;
 }
 
 static void
@@ -866,9 +1162,8 @@ arp_init(void)
 
 	arpintrq.ifq_maxlen = 50;
 	mtx_init(&arpintrq.ifq_mtx, "arp_inq", NULL, MTX_DEF);
-	LIST_INIT(&llinfo_arp);
 	callout_init(&arp_callout, CALLOUT_MPSAFE);
 	netisr_register(NETISR_ARP, arpintr, &arpintrq, NETISR_MPSAFE);
-	callout_reset(&arp_callout, hz, arptimer, NULL);
+	callout_reset(&arp_callout, hz, newarptimer, NULL);
 }
 SYSINIT(arp, SI_SUB_PROTO_DOMAIN, SI_ORDER_ANY, arp_init, 0);
Index: src/sys/netinet/if_ether.h
===================================================================
RCS file: /home/ncvs/src/sys/netinet/if_ether.h,v
retrieving revision 1.30
diff -u -p -r1.30 if_ether.h
--- src/sys/netinet/if_ether.h	7 Apr 2004 20:46:13 -0000	1.30
+++ src/sys/netinet/if_ether.h	25 Apr 2004 15:09:46 -0000
@@ -112,6 +112,33 @@ extern u_char	ether_ipmulticast_max[ETHE
 int	arpresolve(struct ifnet *ifp, struct rtentry *rt,
 		struct mbuf *m, struct sockaddr *dst, u_char *desten);
 void	arp_ifinit(struct ifnet *, struct ifaddr *);
+
+/*
+ * Support routines for the new arp table
+ */
+struct lltable *lltable_new(struct ifnet *ifp, int af);
+struct lltable **lltable_free(struct lltable **t);
 #endif
 
+struct lltable {
+	struct lltable	*llt_next;
+	void	*lle_head;	/* pointer to the list of address entries */
+	int	llt_af;		/* address family */
+};
+
+/*
+ * flags to be passed to arplookup.
+ */
+#define LLE_DELETED     0x0001  /* entry must be deleted        */
+#define LLE_STATIC      0x0002  /* entry is static              */
+#define LLE_IFADDR      0x0004  /* entry is interface addr      */
+#define LLA_VALID       0x0008  /* ll_addr is valid             */
+#define LLE_PROXY       0x0010  /* proxy entry ???              */
+#define LLE_PUB         0x0020  /* publish entry ???            */
+#define LLE_CREATE      0x8000  /* create on a lookup miss      */
+#define LLE_DELETE      0x4000  /* delete on a lookup - match LLE_IFADDR */
+
+/*
+ * End of support code for the new arp table
+ */
 #endif
Received on Sun Apr 25 2004 - 07:49:41 UTC
