Dear all,

Just a quick note on what I've been up to for the last week or so.

For the last few years, we've been running with rather inadequate synchronization over certain network interface-related data structures -- global address lists, some portions of network interface setup and teardown, and per-interface address lists. Reports of bugs have been few and far between, because in practice most network configurations simply don't see these data structures change much, and when they do, it's rarely on multiple CPUs at a time. As such, work on network locking has largely focused elsewhere.

However, as core counts have increased (>= 8 cores becoming normal), we have seen a few more reports lately on IPSEC tunnel servers, PPPoE servers, etc, where interfaces are being manipulated dynamically and in quantity. Interestingly, some of these races existed long before the SMPng project, but were only exercisable under heavy load leading to lots of paging/swapping or high memory pressure.

In any case, I've been working on a set of related changes:

- Add missing locking for per-interface address lists (specifically if_addrhead, but also some other similar situations).

- Improve the formalization of our network interface life cycle, and add ifnet refcounts so that syscall-generated operations for monitoring or managing interfaces aren't at risk of having an interface "go away" while they copy data in or out of userspace.

- Add new locks and locking for global protocol address lists (especially IP hash lists and full address lists).

Many of these changes should have no impact on practical performance, as most relate to administrative operations such as adding interfaces or addresses, or relatively rare processing (such as bulk network broadcast processing). However, adding locks for the global protocol address lists and hash chains does touch the fast path.
These changes aren't in the tree yet, and make use of rmlocks in order to avoid touching non-local cache lines across CPUs, but even so it should be possible to measure a very small change in those paths, since critical sections are used for read operations.

If I could ask people doing regular performance testing to keep an eye out for significant changes in performance (better or worse), perhaps above say 1% pps or the like, and let me know if they see it, that would be helpful. My expectation is that the impact will be minor, but as CPU/network performance ratios and workloads vary a great deal, it would be helpful to know about significant changes as early as possible so we can identify the precise source and look for ways to mitigate it.

Most of these changes will be merged to 7-STABLE in time for FreeBSD 7.3.

Robert N M Watson
Computer Laboratory
University of Cambridge

Received on Fri Apr 24 2009 - 08:33:25 UTC