Ryan Stone wrote:
> I find that the best way to profile the kernel is with pmc. You don't
> need to compile anything with a special option (other than including
> the hwpmc hooks in the kernel with the HWPMC_HOOKS option), so you can
> use it at any time on the same code you'll be shipping. pmc does
> statistical profiling; it uses whatever performance monitoring
> counters are provided by the hardware. It has a pretty low overhead,
> especially compared with other profiling techniques. It's really easy
> to use, too:

Thanks for all this.

BTW, I just tried the old kgmon/gprof profiling as a control. It appears
that on amd64 it doesn't work: gprof can't read the file that the kernel
puts out. (Useful!)

>
> 1) If hwpmc is not compiled into your kernel, kldload hwpmc
> 2) Run pmcstat to begin taking samples (make sure that whatever you are
> profiling is busy doing work first!):
>
>     pmcstat -S unhalted-cycles -O /tmp/samples.out
>
> The -S option specifies what event you want to use to trigger
> sampling. The unhalted-cycles event is the best one to use if your
> hardware supports it; pmc will take a sample every 64K non-idle CPU
> cycles, which is basically equivalent to sampling based on time. If
> the unhalted-cycles event is not supported by your hardware then the
> instructions event will probably be the next best choice (although it's
> nowhere near as good, as it will not be able to tell you, for example,
> if a particular function is very expensive because it takes a lot of
> cache misses compared to the rest of your program). One caveat with
> the unhalted-cycles event is that time spent spinning on a spinlock or
> adaptively spinning on a MTX_DEF mutex will not be counted by this
> event, because most of the spinning time is spent executing an hlt
> instruction that idles the CPU for a short period of time.
>
> Modern Intel and AMD CPUs offer a dizzying array of events.
> They're mostly only useful if you suspect that a particular kind of
> event is hurting your performance and you would like to know what is
> causing those events. For example, if you suspect that data cache
> misses are causing you problems you can take samples on cache misses.
> Unfortunately on some of the newer CPUs (namely the Core2 family,
> because that's what I'm doing most of my profiling on nowadays) I find
> it difficult to figure out just what event to use to profile based on
> cache misses. man pmc will give you an overview of pmc, and there are
> manpages for every CPU family supported (e.g. man pmc.core2).
>
> 3) After you've run pmcstat for "long enough" (a proper definition of
> long enough requires a statistician, which I most certainly am not,
> but I find that for a busy system 10 seconds is enough), Control-C it
> to stop it*. You can use pmcstat to post-process the samples into
> human-readable text:
>
>     pmcstat -R /tmp/samples.out -G /tmp/graph.txt
>
> The graph.txt file will show leaf functions on the left and their
> callers beneath them, indented to reflect the callchain. It's not too
> easy to describe and I don't have sample output available right now.
>
> Another interesting tool for post-processing the samples is
> pmcannotate. I've never actually used the tool before but it will
> annotate the program's source to show which lines are the most
> expensive. This of course needs unstripped modules to work. I think
> that it will also work if the GNU "debug link" is in the stripped
> module pointing to the location of the file with symbols.
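
For anyone who wants the three steps in one place, here is a rough sketch of how they might be strung together in a script. It uses only the commands quoted above, plus kldstat to check whether hwpmc is already loaded and the sleep trick from the footnote to bound the sampling time. The names DRY_RUN, DURATION, SAMPLES, GRAPH, run and profile_kernel are made up for this example and are not part of pmc. By default it only prints the commands it would run, so treat it as a starting point rather than a tested tool.

```shell
#!/bin/sh
# Sketch only: wraps the kldload / pmcstat steps from the quoted message.
# All variable and function names here are illustrative assumptions.

DRY_RUN=${DRY_RUN:-1}                  # 1 = print commands instead of running them
DURATION=${DURATION:-60}               # seconds to sample for
SAMPLES=${SAMPLES:-/tmp/samples.out}   # raw sample log written by pmcstat -O
GRAPH=${GRAPH:-/tmp/graph.txt}         # callgraph text written by pmcstat -G

# Either echo the command line (dry run) or execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

profile_kernel() {
    # Step 1: load hwpmc unless it is already present.
    # In dry-run mode the kldstat check is skipped and the kldload is just printed.
    if [ "$DRY_RUN" = 1 ] || ! kldstat -q -m hwpmc; then
        run kldload hwpmc
    fi

    # Step 2: sample on unhalted-cycles for as long as "sleep" runs.
    run pmcstat -S unhalted-cycles -O "$SAMPLES" sleep "$DURATION"

    # Step 3: post-process the samples into a callgraph.
    run pmcstat -R "$SAMPLES" -G "$GRAPH"
}

profile_kernel
```

Running it with DRY_RUN=0 (as root, on a system with hwpmc support) would execute the commands for real. The workload being profiled still has to be busy before the script starts.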
>
> * Here's a tip I picked up from Joseph Koshy's blog: to collect
> samples for a fixed period of time (say 1 minute), have pmcstat run the
> sleep command:
>
>     pmcstat -S unhalted-cycles -O /tmp/samples.out sleep 60

Received on Mon Dec 14 2009 - 18:56:10 UTC