On 19 August 2010 17:34, John Baldwin <jhb_at_freebsd.org> wrote:
> On Thursday, August 19, 2010 5:29:25 am pluknet wrote:
>> On 19 August 2010 00:07, John Baldwin <jhb_at_freebsd.org> wrote:
>> > On Wednesday, August 18, 2010 3:17:56 pm pluknet wrote:
>> >> On 18 August 2010 23:11, pluknet <pluknet_at_gmail.com> wrote:
>> >> > On 18 August 2010 17:46, Kostik Belousov <kostikbel_at_gmail.com> wrote:
>> >> >> On Wed, Aug 18, 2010 at 02:43:19PM +0400, pluknet wrote:
>> >> >>> On 18 August 2010 12:07, pluknet <pluknet_at_gmail.com> wrote:
>> >> >>> > On 17 August 2010 20:04, Kostik Belousov <kostikbel_at_gmail.com> wrote:
>> >> >>> >
>> >> >>> >> Also please take note of John's suggestion to use the taskqueue.
>> >> >>> >
>> >> >>> > I decided to go down this road. Thank you both.
>> >> >>> > Now I'm making sure an nfs buildkernel survives and preparing some
>> >> >>> > benchmark results.
>> >> >>> >
>> >> >>>
>> >> >>> So, I modified the patch to defer kproc_create() with taskqueue(9).
>> >> >>> Below is a `time make -j5 buildkernel WITHOUT_MODULES=yes` perf evaluation,
>> >> >>> done on a 4-way CPU with a clean /usr/obj and with /usr/src & /usr/obj
>> >> >>> both nfs-mounted over a 1Gbit LAN.
>> >> >>>
>> >> >>> clean old
>> >> >>> 1137.985u 239.411s 7:42.15 298.0%  6538+2133k 87+43388io 226pf+0w
>> >> >>>
>> >> >>> clean new
>> >> >>> 1134.755u 240.032s 7:41.25 298.0%  6553+2133k 87+43367io 224pf+0w
>> >> >>>
>> >> >>> The patch needs polishing, though it generally works.
>> >> >>> Not sure if shep_chan (or whatever name it will get) needs locking.
>> >> >> As I said yesterday, if several requests to create an nfsiod come one
>> >> >> after another, you would lose all but the last.
>> >> >>
>> >> >> You should put the requests into a list, probably protected by
>> >> >> nfs_iod_mtx.
>> >> >>
>> >> >
>> >> > How about this patch? Still several things to ask:
>> >> > 1) I used a malloc instance w/ M_NOWAIT, since it's called with
>> >> >    nfs_iod_mtx held.
>> >> > 2) Probably the busy/done gymnastics is a mess. Your help is appreciated.
>> >> > 3) If (1) is fine, is it right to use the fail: logic (i.e. set
>> >> >    NFSIOD_NOT_AVAILABLE) on memory shortage? Not tested.
>> >> >
>> >> > There are debug printf()s left in intentionally to see how the 3 contexts
>> >> > run under load relative to each other. I attached these messages as well,
>> >> > if that makes sense.
>> >> >
>> >>
>> >> Ah, yes. Sorry, forgot about that.
>> >
>> > Your task handler needs to run in a loop until the list of requests is empty.
>> > If multiple threads call taskqueue_enqueue() before your task is scheduled,
>> > they will be consolidated down to a single call of your task handler.
>>
>> Thanks for your suggestions.
>>
>> So I converted nfs_nfsiodnew_tq() into a loop, and along with that I changed
>> the cleanup SLIST_FOREACH() as well, to free a batch of (potentially
>> consolidated) completed requests in one pass. kproc_create() is still outside
>> the SLIST_FOREACH() to avoid calling it with nfs_iod_mtx held.
>>
>> > -       int error, i;
>> > -       int newiod;
>> > +       int i, newiod, error;
>> >
>> > You should sort these alphabetically if you are going to change this. I would
>> > probably just leave it as-is.
>>
>> Err.. that resulted from a series of changes. Thanks for noting it.
>>
>> > Also, I'm not sure how this works, as you don't actually wait for the task
>> > to complete.
>> > If the taskqueue_enqueue() doesn't preempt, then you may read ni_error as
>> > zero before the kproc has actually been created and return success even
>> > though an nfsiod hasn't been created.
>> >
>> I put taskqueue_drain() right after taskqueue_enqueue() to be in sync with
>> the task handler. That was the part to think about, as I didn't find such a
>> usage pattern elsewhere. It seems, though, that tasks are now launched
>> strictly one-by-one, without visible parallelism (as far as I can tell, the
>> debug printf()s don't overlap anymore).
>
> Ah, if it is safe to sleep then I have a couple of suggestions:
>

Cool, credit goes to John :). I just adopted all of your changes (attached).

> - Use M_WAITOK for malloc() so you don't have to worry about the failure case
>   (I see Rick already suggested this).

After all that, it reduces to the following replacement in nfs_nfsiodnew(),
so there should be no regression in theory:

        mtx_unlock(&nfs_iod_mtx);
-       kproc_create(...)
+       malloc(...)
        mtx_lock(&nfs_iod_mtx);

It survived this simple test running for an hour, which forces creation of
many iods with r/w from 300 threads:

nfsserv:/home/svn/freebsd/obj_exp on /usr/obj (nfs)

./randomio /usr/obj/file 100 0.5 0 4096 60 100 &
./randomio /usr/obj/file 100 0.5 0 4096 60 100 &
./randomio /usr/obj/file 100 0.5 0 4096 60 100 &
while [ true ]; do sysctl vfs.nfs.iodmax=2; sysctl vfs.nfs.iodmax=60; sleep 20; done

randomio is from ports/149838.

P.S. It's funny to see this in top:

    0 root       10  44    0     0K   144K deadlk  2  23:16 28.86% kernel

Someone might think the kernel is in deadlock.

-- 
wbr,
  pluknet
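For reference, below is a minimal sketch of the pattern discussed in this
thread, not the patch that was actually attached: requests go onto a list
protected by nfs_iod_mtx, the task handler runs on taskqueue_thread (whose
thread context may sleep) and loops until the list is empty, calling
kproc_create() with the mutex dropped, and the caller allocates its request
with M_WAITOK, enqueues the task, and uses taskqueue_drain() before reading
the result. The names nfs_nfsiodnew_tq(), ni_error and nfssvc_iod() come
from the thread; the layout of struct nfsiod_req, M_NFSIODREQ and
nfs_nfsiodnew_defer() are illustrative placeholders.

#include <sys/param.h>
#include <sys/systm.h>
#include <sys/kernel.h>
#include <sys/lock.h>
#include <sys/mutex.h>
#include <sys/malloc.h>
#include <sys/proc.h>
#include <sys/kthread.h>
#include <sys/queue.h>
#include <sys/taskqueue.h>
#include <sys/unistd.h>

static MALLOC_DEFINE(M_NFSIODREQ, "nfsiodreq", "deferred nfsiod requests");

/* One deferred "create an nfsiod" request, owned by the requesting thread. */
struct nfsiod_req {
        SLIST_ENTRY(nfsiod_req) ni_links;
        int                     ni_iod;         /* nfsiod slot to start */
        int                     ni_error;       /* result of kproc_create() */
};

static SLIST_HEAD(, nfsiod_req) nfsiod_reqs = SLIST_HEAD_INITIALIZER(nfsiod_reqs);
static struct mtx nfs_iod_mtx;  /* stands in for the existing nfsiod mutex */
static struct task nfsiod_task; /* TASK_INIT(&nfsiod_task, 0, nfs_nfsiodnew_tq, NULL) at init */

static void nfssvc_iod(void *arg);      /* nfsiod main loop; argument handling simplified here */

/*
 * Task handler: keep draining the request list until it is empty, so that
 * several taskqueue_enqueue() calls consolidated into one run are all
 * serviced.  kproc_create() is called with nfs_iod_mtx dropped since it
 * may sleep.
 */
static void
nfs_nfsiodnew_tq(void *ctx, int pending)
{
        struct nfsiod_req *req;
        struct proc *p;

        mtx_lock(&nfs_iod_mtx);
        while ((req = SLIST_FIRST(&nfsiod_reqs)) != NULL) {
                SLIST_REMOVE_HEAD(&nfsiod_reqs, ni_links);
                mtx_unlock(&nfs_iod_mtx);
                req->ni_error = kproc_create(nfssvc_iod,
                    (void *)(uintptr_t)req->ni_iod, &p, RFHIGHPID, 0,
                    "nfsiod %d", req->ni_iod);
                mtx_lock(&nfs_iod_mtx);
        }
        mtx_unlock(&nfs_iod_mtx);
}

/*
 * Called from nfs_nfsiodnew() with nfs_iod_mtx held: queue a request and
 * wait for the task handler to service it.  taskqueue_drain() returns only
 * once the task is neither queued nor running, so ni_error is stable by
 * the time it is read here.
 */
static int
nfs_nfsiodnew_defer(int iod)
{
        struct nfsiod_req *req;
        int error;

        mtx_assert(&nfs_iod_mtx, MA_OWNED);
        mtx_unlock(&nfs_iod_mtx);
        req = malloc(sizeof(*req), M_NFSIODREQ, M_WAITOK | M_ZERO);
        req->ni_iod = iod;
        mtx_lock(&nfs_iod_mtx);
        SLIST_INSERT_HEAD(&nfsiod_reqs, req, ni_links);
        mtx_unlock(&nfs_iod_mtx);
        taskqueue_enqueue(taskqueue_thread, &nfsiod_task);
        taskqueue_drain(taskqueue_thread, &nfsiod_task);
        error = req->ni_error;
        free(req, M_NFSIODREQ);
        mtx_lock(&nfs_iod_mtx);
        return (error);
}

With this structure, back-to-back requests are not lost: each sits on the
list, and even if their enqueues collapse into a single run of the handler,
the while loop services all of them, which is the behaviour John and Kostik
asked for above.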