On Mon, Nov 1, 2010 at 8:14 PM, John Baldwin <jhb@freebsd.org> wrote:
> On Monday, November 01, 2010 1:09:22 pm Giovanni Trematerra wrote:
>> On Tue, Oct 19, 2010 at 8:55 AM, Andriy Gapon <avg@freebsd.org> wrote:
>> > on 19/10/2010 00:01 Giovanni Trematerra said the following:
>> >>
>> >> Your patch seems to be just a workaround for the initial slab size
>> >> where the keg is backed.
>> >
>> > Well, setting aside my confusion with the terminology - yes, the patch
>> > is just that, and precisely because I only tried to solve that
>> > particular problem.
>> >
>> >> Having dynamic slab sizes would allow the keg to be backed by a
>> >> larger slab without going OFFPAGE.
>> >
>> > I agree in principle.
>> > But without seeing code that implements that I can't guess whether it
>> > would really be more efficient or more maintainable, i.e. more useful
>> > in general. Still a very good idea.
>> >
>>
>> Here is the patch that I had in mind.
>> It doesn't implement dynamic slab sizes; it just allows a multipage
>> slab to back uma_zone objects.
>> I'm going to work more on the topic of dynamic slab sizes soon.
>> I tested the patch on qemu with -smp 32, and I'm looking for real
>> hardware to test it on.
>>
>> Here is some interesting output:
>>
>> qemulab# vmstat -z | more
>> ITEM                SIZE   LIMIT    USED    FREE     REQ  FAIL  SLEEP
>> UMA Kegs:            208,      0,    149,      4,    149,    0,     0
>> UMA Zones:          4480,      0,    149,      0,    149,    0,     0
>> UMA Slabs:           568,      0,    836,      4,   1187,    0,     0
>> UMA RCntSlabs:       568,      0,    202,      1,    202,    0,     0
>> UMA Hash:            256,      0,      2,     13,      3,    0,     0
>>
>> qemulab# sysctl kern | grep cpu
>> kern.ccpu: 0
>> <cpu count="32" mask="0xffffffff">0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
>> 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27,
>> 28, 29, 30, 31</cpu>
>> kern.smp.cpus: 32
>> kern.smp.maxcpus: 32
>>
>> Any feedback will be appreciated.
>>
>> --
>> Giovanni Trematerra
>>
>> ==============================================================
>> diff -r 36572cbc6817 sys/vm/uma_core.c
>> --- a/sys/vm/uma_core.c Tue Oct 05 04:49:04 2010 -0400
>> +++ b/sys/vm/uma_core.c Mon Nov 01 11:54:38 2010 -0400
>> @@ -930,27 +930,36 @@ startup_alloc(uma_zone_t zone, int bytes
>>  {
>>  	uma_keg_t keg;
>>  	uma_slab_t tmps;
>> -
>> -	keg = zone_first_keg(zone);
>> +	u_int16_t pages;
>> +
>> +	keg = zone_first_keg(zone);
>> +	pages = bytes / PAGE_SIZE;
>> +
>> +	/* Account for remainder */
>> +	if ((pages * PAGE_SIZE) < bytes)
>> +		pages++;
>> +	KASSERT(pages > 0, ("startup_alloc can't reserve 0 pages\n"));

> You can use 'pages = howmany(bytes, PAGE_SIZE)' here.

Thanks for the hint.

> Also, why did you make pages a uint16_t instead of an int?  An int is
> generally more convenient unless you really need a uint16_t (and C99
> spells it without an _ after the leading 'u'.. FYI).

Hmm, just to be consistent with the uk_ppera field of struct uma_keg,
but I think I can simply use an int.
BTW, is new code supposed to use the C99 form even though the rest of
the file uses the u_int* form?
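To be sure I've parsed your suggestion correctly, the computation above
would reduce to something like this; this is only an untested sketch of
what the next revision of the patch might look like, not the patch
itself:

	int pages;

	keg = zone_first_keg(zone);
	/*
	 * howmany() (sys/param.h) rounds up:
	 * (bytes + PAGE_SIZE - 1) / PAGE_SIZE, so the explicit
	 * remainder check is no longer needed.
	 */
	pages = howmany(bytes, PAGE_SIZE);
	KASSERT(pages > 0, ("startup_alloc can't reserve 0 pages\n"));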
>>  	/*
>>  	 * Check our small startup cache to see if it has pages remaining.
>>  	 */
>>  	mtx_lock(&uma_boot_pages_mtx);
>> -	if ((tmps = LIST_FIRST(&uma_boot_pages)) != NULL) {
>> -		LIST_REMOVE(tmps, us_link);
>> +	do {
>> +		if ((tmps = LIST_FIRST(&uma_boot_pages)) != NULL)
>> +			LIST_REMOVE(tmps, us_link);
>> +	} while (--pages && tmps != NULL);
>> +	if (tmps != NULL) {
>>  		mtx_unlock(&uma_boot_pages_mtx);
>>  		*pflag = tmps->us_flags;
>>  		return (tmps->us_data);
>> -	}
>> +	} else if (booted == 0)
>> +		panic("UMA: Increase vm.boot_pages");
>>  	mtx_unlock(&uma_boot_pages_mtx);
>> -	if (booted == 0)
>> -		panic("UMA: Increase vm.boot_pages");

> Probably best to make the pages test here explicit.  Also, is there any
> reason you can't do this as:
>
>	while (--pages > 0) {
>		tmps = LIST_FIRST(&uma_boot_pages);
>		if (tmps != NULL)
>			LIST_REMOVE(tmps, us_link);
>		else if (booted == 0)
>			panic(...);
>	}
>

Well, no, although I'll need to initialize tmps to NULL, otherwise the
compiler will raise a warning; a do {} while () might still be better
than a while () {}.
The bytes parameter will never be zero, so pages will always be at
least one, and the KASSERT will catch any weird behavior.
Anyway, that looks more readable to me, thanks.
I could add an "else break;" for the few cases where "pages" is still
> 0 and tmps == NULL, though that may be useless.

> One question btw, how does this work since if you need to allocate
> more than 1 page it seems that the 'tmps' values for all but the last
> are simply ignored and leaked?

When you extract one item from the list, tmps->us_data points to the
start address of a memory page.
The pages are contiguous, in descending address order (see below), so
when you extract two items the last one points at the start address of
two contiguous pages of memory, which is just what I need to return.

> Is there some unwritten assumption that the free pages are all
> virtually contiguous (and in order), so you can just pull off a run of
> X and use the address from X?
>

sys/vm/vm_page.c:351
@@ vm_page_startup(vm_offset_t vaddr)

	/*
	 * Allocate memory for use when boot strapping the kernel memory
	 * allocator.
	 */
	new_end = end - (boot_pages * UMA_SLAB_SIZE);
	new_end = trunc_page(new_end);
	mapped = pmap_map(&vaddr, new_end, end,
	    VM_PROT_READ | VM_PROT_WRITE);
	bzero((void *)mapped, end - new_end);
	uma_startup((void *)mapped, boot_pages);

--------
sys/vm/uma_core.c:1683
@@ uma_startup(void *bootmem, int boot_pages)

#ifdef UMA_DEBUG
	printf("Filling boot free list.\n");
#endif
	for (i = 0; i < boot_pages; i++) {
		slab = (uma_slab_t)((u_int8_t *)bootmem + (i * UMA_SLAB_SIZE));
		slab->us_data = (u_int8_t *)slab;
		slab->us_flags = UMA_SLAB_BOOT;
		LIST_INSERT_HEAD(&uma_boot_pages, slab, us_link);
	}

So, if I'm not wrong, I'd say that the pages are virtually contiguous.

Just for the record, attilio@ made a light review of this patch in
private.

--
Giovanni Trematerra
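P.S. Putting your comments together, here is roughly how I would rework
the startup cache loop for the next revision. This is only a sketch
reusing the variables from the patch above, not tested; note that if
the cache runs dry partway through, the slabs already popped stay off
the list, which should only matter once booted != 0:

	uma_slab_t tmps = NULL;	/* NULL also quiets the compiler warning */

	mtx_lock(&uma_boot_pages_mtx);
	do {
		tmps = LIST_FIRST(&uma_boot_pages);
		if (tmps != NULL)
			LIST_REMOVE(tmps, us_link);
		else
			break;		/* boot cache ran dry early */
	} while (--pages > 0);
	if (tmps != NULL) {
		/*
		 * uma_startup() inserted the boot slabs at the list head
		 * in ascending address order, so we pop them in descending
		 * order and the last one popped is the start of a run of
		 * 'pages' virtually contiguous pages.
		 */
		mtx_unlock(&uma_boot_pages_mtx);
		*pflag = tmps->us_flags;
		return (tmps->us_data);
	}
	mtx_unlock(&uma_boot_pages_mtx);
	if (booted == 0)
		panic("UMA: Increase vm.boot_pages");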