Re: Generic Kernel API

From: Chuck Swiger <cswiger_at_mac.com>
Date: Thu, 10 Nov 2005 01:08:45 -0500
Scott Long wrote:
> Chuck Swiger wrote:
[ ...many lines deleted... ]
>>> Remember that the target audience for much of the Apple documentation is
>>> people who have never programmed in a Unix kernel before, be they coming
>>> from Windows or coming from OS9.  In fact, the Apple docs go out of 
>>> their way to discourage you from writing kernel modules entirely.
>>
>> Sure-- don't you agree that anything which can be done in userland, 
>> generally ought to be done there?  Apple has to contend with 
>> developers who are looking to hook into the vertical blanking handler 
>> for screensavers and clock programs and who knows what else, just like 
>> they did in OS 9.  Discouraging such things from going into the kernel 
>> is a good idea.

Retained for context.

>> Also remember that Mach is closer to being a microkernel than the 
>> other BSD kernels are, and the philosophy is showing in the design.  
>> That doesn't mean it's always the best approach, but Mach feels more 
>> consistent to me. 
> 
> The use of Mach in OSX is a whole lot more limited than you might think.

You're talking about what someone else might think?  Telepathy...?

> The three uses are the BSD+IOkit kernel, the window server, and the 
> security server.  The filesystems are still inside the BSD task, as are
> most drivers.  The exception here is certain kinds of USB peripheral
> drivers.  The USB hardware driver itself is still inside the kernel task.

There are plenty of reasons why normal userland apps might use Mach messaging 
or VM calls directly, particularly things like databases which want to control 
their own paging behavior, or things which value both latency and bandwidth.

It's also rather common for Apple's system daemons to use Mach messaging; a 
quick test suggests that roughly 3 out of 4 do:

[ ...pasted as quoted-text to avoid wrapping, I hope... ]
> % uname -a
> Darwin pan.codefab.com 8.3.0 Darwin Kernel Version 8.3.0: Mon Oct  3 20:04:04 PDT 2005; root:xnu-792.6.22.obj~2/RELEASE_PPC Power Macintosh powerpc
> % ps auxw | grep root | sort -n +1 | head -20
> root         1   0.0  0.1    28356    516  ??  S<s  Fri03PM   0:01.65 /sbin/launchd
> root        23   0.0  0.1    27272    504  ??  Ss   Fri03PM   0:00.01 /sbin/dynamic_pager -F /private/var/vm/swapfile
> root        27   0.0  0.9    30648   4692  ??  Ss   Fri03PM   0:05.85 kextd
> root        54   0.0  0.7    29928   3920  ??  Ss   Fri03PM   0:01.03 /usr/sbin/configd
> root        55   0.0  1.0    29492   5056  ??  Ss   Fri03PM   0:05.51 /usr/sbin/coreaudiod
> root        56   0.0  0.6    27788   3128  ??  Ss   Fri03PM   0:00.72 /usr/sbin/diskarbitrationd
> root        57   0.0  0.3    28328   1644  ??  Ss   Fri03PM   0:00.10 /usr/sbin/memberd -x
> root        58   0.0  0.8    29232   4356  ??  Ss   Fri03PM   0:08.49 /usr/sbin/securityd
> root        60   0.0  0.2    27872    952  ??  Ss   Fri03PM   0:02.66 /usr/sbin/notifyd
> root        61   0.0  1.1    31088   5620  ??  Ss   Fri03PM   0:05.16 /usr/sbin/DirectoryService
> root        62   0.0  0.1    28252    744  ??  Ss   Fri03PM   0:00.01 /usr/sbin/KernelEventAgent
> root        63   0.0  0.5    28088   2792  ??  Ss   Fri03PM   0:30.89 /usr/sbin/mDNSResponder -launchdaemon
> root        64   0.1  0.2    27600   1168  ??  Ss   Fri03PM   0:06.23 /usr/sbin/netinfod -s local
> root        73   0.0  0.4    27680   2060  ??  Ss   Fri03PM   0:00.36 /usr/sbin/distnoted
> root        79   0.0  0.2    27256    836  ??  Ss   Fri03PM   3:50.77 /usr/sbin/update
> root        87   0.0  2.0    40148  10428  ??  Ss   Fri03PM   0:02.09 /System/Library/CoreServices/coreservicesd
> root        90   0.2  0.2    29332   1080  ??  Ss   Fri03PM   0:19.00 /usr/sbin/lookupd
> root       121   0.0  0.2    27516    848  ??  Ss   Fri03PM   0:22.31 ntpd -f /var/run/ntp.drift -p /var/run/ntpd.pid
> root       137   0.0  0.1    27260    672  ??  Ss   Fri03PM   0:00.00 /usr/libexec/crashreporterd
> root       157   0.0  0.1    29316    532  ??  Ss   Fri03PM   0:00.00 nfsiod -n 4

[ ...minor pathname frobbing via awk deleted... ]
> % grep -nl _mach_msg `cat /tmp/file_list`
> /sbin/launchd
> /sbin/dynamic_pager
> /usr/libexec/kextd
> /usr/sbin/configd
> /usr/sbin/coreaudiod
> /usr/sbin/diskarbitrationd
> /usr/sbin/memberd
> /usr/sbin/securityd
> /usr/sbin/notifyd
> /usr/sbin/DirectoryService
> /usr/sbin/KernelEventAgent
> /usr/sbin/mDNSResponder
> /usr/sbin/lookupd
> /usr/libexec/crashreporterd
> % grep -nl _mach_msg `cat /tmp/file_list` | wc -l
>       14

As one might expect, portable BSD programs like ntpd and nfsiod do not use Mach 
messaging.  Neither does update, the equivalent of [syncer], nor would /bin/sh, 
etc.
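
The check above boils down to asking which files contain the byte string
`_mach_msg`. A minimal sketch of the same idea in Python follows. The function
names `files_mentioning` and `fraction_mentioning` are hypothetical, and the
sketch makes no attempt to reproduce the elided awk step that built the file
list. Like `grep -l`, it only shows that the string appears somewhere in the
file, not that the call is ever made at runtime.

```python
def files_mentioning(paths, needle=b"_mach_msg"):
    """Return the paths whose raw bytes contain `needle`, in input order."""
    hits = []
    for path in paths:
        try:
            with open(path, "rb") as fh:
                if needle in fh.read():
                    hits.append(path)
        except OSError:
            # Missing or unreadable files are skipped rather than counted.
            continue
    return hits


def fraction_mentioning(paths, needle=b"_mach_msg"):
    """Return the share of `paths` that contain `needle` (0.0 for no paths)."""
    paths = list(paths)
    if not paths:
        return 0.0
    return len(files_mentioning(paths, needle)) / len(paths)
```

Assuming all 20 processes from the `ps` listing ended up in the file list, 14
hits out of 20 candidates gives a fraction of 0.7.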

>>> There is already a well established and stable API for doing DMA in 
>>> FreeBSD.  Just about every driver in the kernel uses it.  Why change?
>>
>> You mean isa_dmacascade(), isa_dma_acquire(), isa_dmainit() and 
>> bus_dma_*...?
>>
>> Eww.
> 
> Uh, what?

"Eww." as in "Yuck."  :-)

>> The forces of entropy are winning the fight to keep the ISA bus and 
>> DMA bounce buffers which must be less than 64K around forever, even on 
>> hardware which doesn't have such limitations.  :-)
> 
> Until the G5 was introduced, OSX never had to worry about making 32-bit 
> DMA work on >4GB memory configurations, and it certainly never worried
> about ISA DMA.  These are all still realities for i386 and amd64.  There
> are a lot of common I/O controllers out there, including traditional 
> ATA, that can't do 64-bit DMA and thus __require__ bounce buffers.

There is nothing wrong with using bounce buffers when the situation requires 
it.  The converse statement is also true: there *is* something wrong with using 
bounce buffers when it is not necessary.
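
The point above can be phrased as a predicate: a transfer needs a bounce buffer
only when part of it lies somewhere the device cannot reach. The following is a
minimal sketch in Python, assuming a device described by nothing more than the
number of address bits it can drive and an optional aligned boundary it cannot
cross. `needs_bounce` and its parameters are hypothetical names for
illustration; they are not part of busdma or IOKit.

```python
def needs_bounce(phys_addr, length, addr_bits, boundary=None):
    """Return True if a DMA transfer cannot be done in place.

    phys_addr -- physical start address of the buffer
    length    -- transfer size in bytes (must be positive)
    addr_bits -- number of address bits the device can generate
    boundary  -- optional power-of-two boundary the transfer must not cross
    """
    if length <= 0:
        raise ValueError("length must be positive")
    last = phys_addr + length - 1
    # The device can only address bytes below 2**addr_bits.
    if last >= (1 << addr_bits):
        return True
    # Some controllers cannot cross an aligned boundary within one transfer.
    if boundary is not None and phys_addr // boundary != last // boundary:
        return True
    return False
```

For example, `needs_bounce(0x120000000, 4096, 32)` is true, because a page
above 4GB is out of reach of a 32-bit device, while
`needs_bounce(0x1000, 4096, 32)` is false, and the same page never needs
bouncing for a device with 64 address bits. An ISA-style constraint can be
modelled as `addr_bits=24` with `boundary=0x10000`.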

> Sparc64 requires that you program the IOMMU in order to do any DMA.
> Busdma makes all of this transparent.  And as for the G5, it does h0h0
> magic to make 32bit DMA work that is outside the scope of the IOMemory
> classes.

"h0h0 magic"...?  :-)

From what I can tell, both MacOS X and Solaris will happily use much of the 
first 4GB of physical RAM for DMA by wiring the pages down until the DMA 
transfer completes, and map those pages into either a 32-bit or 64-bit virtual 
address space, depending on what the process happens to be.

> So, I'm not sure what you have against the existing APIs, but they work well
> for the FreeBSD environment.

I prefer an event-driven work-loop API model rather than a continuation-based 
model, and I like Mach vm_objects a lot.  There's an interesting discussion of 
the tradeoffs between Mach messaging versus BSD copyin and copyout here:

http://developer.apple.com/documentation/Darwin/Conceptual/KernelProgramming/boundaries/chapter_14_section_3.html

[ ... ]
>> On the other hand, using inheritance for drivers seems to work pretty 
>> well in practice, and the notion of encapsulation seems to help Darwin 
>> avoid running into nearly as many lock-order reversals and layering 
>> violations.
> 
> Again, IOKit doesn't cover pseudo drivers, and it papers over locking by
> providing high level serialization constructs.

If the system-provided API ensures correct serialization, that's better than 
getting serialization wrong, yes...?
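
To make the serialization argument concrete, here is a minimal sketch of the
work-loop idea in Python: every action is queued and run by one dedicated
thread, so handlers never run concurrently and need no locks of their own. This
is an analogy only. `WorkLoop`, `submit`, `stop` and `demo` are hypothetical
names, and nothing here reproduces the actual IOKit work-loop API.

```python
import queue
import threading


class WorkLoop:
    """Runs submitted actions one at a time on a single dedicated thread."""

    def __init__(self):
        self._events = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        while True:
            action = self._events.get()
            if action is None:  # sentinel queued by stop()
                break
            action()

    def submit(self, action):
        """Queue a callable; it runs later on the work-loop thread."""
        self._events.put(action)

    def stop(self):
        """Drain everything queued so far, then shut the thread down."""
        self._events.put(None)
        self._thread.join()


def demo(producers=4, per_producer=1000):
    """Bump a shared counter from several threads without any explicit lock."""
    state = {"count": 0}

    def bump():
        state["count"] += 1  # only ever executed on the work-loop thread

    def produce():
        for _ in range(per_producer):
            loop.submit(bump)

    loop = WorkLoop()
    threads = [threading.Thread(target=produce) for _ in range(producers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    loop.stop()
    return state["count"]
```

With the defaults, `demo()` returns 4000. The cost of the design is also
visible here: all work funnels through one thread, which is the kind of
serialization the SMP question in this thread is about.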

> It would be interesting to write an IOKit driver two different ways, one
> that uses work loops and one that uses mutexes directly, and see if there is
> any performance difference on SMP.  Until then, it's hard to say that work
> loops have a practical advantage in high performance environments.

Indeed.  Well, there appear to be a lot of dual-core machines coming down the 
road, so SMP is going to be a lot more common than it has been previously.  I 
suspect that Solaris is going to do particularly well on quad-proc and higher 
boxes.

> I'm starting to see evidence in FreeBSD that excessive serialization in
> device drivers is not good.  Also, workloops aren't available outside of
> IOKit, and Darwin provides no good tools like WITNESS to detect and
> debug locking problems, so it must be done through trial and error. That
> is really not fun.  As interesting as Darwin is, I still prefer to work in 
> FreeBSD.

Sure.

-- 
-Chuck
Received on Thu Nov 10 2005 - 05:08:50 UTC
