Re: HTT on current

From: Jens Rehsack <rehsack_at_liwing.de>
Date: Thu, 28 Aug 2003 07:37:25 +0000

Bruce Evans wrote:
> On Tue, 26 Aug 2003, John Baldwin wrote:
> 
> 
>>On 26-Aug-2003 Yamada Ken Takeshi wrote:

[...]

>>One test is not sufficient.  -current is also not the best
>>place to test. :)  When I first implemented HTT in -current
> 
> The above times seem slow enough to be partly the result of
> debugging options in -current.  I get buildworld times ranging
> from 1401 seconds (2002/09/01) to 2427 seconds (2003/05/06)
> on Athlon XP1600 x 1 depending on configuration and tuning (but
> never with any expensive debugging options).  A Xeon 2.8GHz x 2
> should be a bit faster than an old Athlon.

Hm, AFAIK I shouldn't have any debug options enabled, but who
knows - I didn't have time to check more than the kernel and malloc.

By the way, the attached times show that using HTT speeds up
buildworld by about 10% (compare HTT/no-HTT), and this is IMHO
a real improvement. Many people buy a much more expensive
processor to get 10% more speed.

>>... I did 16 trials (first one was
>>throwaway) of back-to-back buildworlds of the same version
>>of -stable using make, make -j2, and make -j4 for the
>>following configurations: UP, HTT, HTT with smp_idle_hlt, and
>>HTT with pause instructions added to stable and smp_idle_hlt.
>>The fastest build time belonged to UP without any -j option.
> 
> All benchmarks using -j are invalid because of a pessimization
> in make(1).  It sleeps for up to SEL_USEC = 100000 usec after
> completion of every job (average 50 msec).  With -j2 this increases
> !SMP buildworld times of approx. 2000 seconds by approx. 15%.  I think
> it has a smaller effect for larger -j values and for SMP but haven't
> benchmarked it.  This is fixed in NetBSD.  I think fixing it was more
> urgent in NetBSD because NetBSD never changed SEL_USEC from its
> 4.4Lite default of 500000.  500000 was large enough to be noticeable
> even in 1997 when it was reduced in FreeBSD.

That would explain the small slowdown from -j8 to -j20. But the
results seem to me to be textbook-like: 4 processes per processor,
as I learned, produces the best results.
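
As a rough plausibility check on the quoted make(1) numbers, the
arithmetic can be sketched in Python. The figures (SEL_USEC = 100000,
approx. 2000 seconds, approx. 15%) come from the quoted text. The
assumption that every sleep adds directly to the wall-clock time, and
all variable names, are mine, so this is only an estimate.

```python
# Back-of-the-envelope estimate of the make(1) sleep overhead.
# Assumption (not from the thread): each sleep is fully serialized,
# i.e. it adds its whole duration to the elapsed build time.

SEL_USEC = 100_000        # max sleep after a finished job (quoted value)
OLD_SEL_USEC = 500_000    # 4.4Lite default mentioned in the quoted text
BASE_SECONDS = 2000       # approx. !SMP buildworld time (quoted)
SLOWDOWN_PERCENT = 15     # approx. slowdown with -j2 (quoted)

# A sleep of "up to SEL_USEC" averages half of that: 50 msec.
avg_sleep_usec = SEL_USEC // 2

# 15% of 2000 seconds is the extra time attributed to the sleeps.
extra_seconds = BASE_SECONDS * SLOWDOWN_PERCENT // 100

# Number of average-length sleeps needed to produce that extra time.
implied_waits = extra_seconds * 1_000_000 // avg_sleep_usec

# The same number of sleeps with the old 500000 usec default.
old_default_extra_seconds = implied_waits * (OLD_SEL_USEC // 2) // 1_000_000

print("average sleep (usec):", avg_sleep_usec)
print("extra seconds at 15%:", extra_seconds)
print("implied number of sleeps:", implied_waits)
print("extra seconds with old default:", old_default_extra_seconds)
```

Under these assumptions the 15% figure corresponds to a few thousand
job completions per buildworld, which seems like a plausible order of
magnitude, but I have not counted the jobs.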

Okay, that's all so far.

Best regards
Jens

1) HTT, PAT, -j4
   - 4736.474u 569.157s 52:19.56 168.9%      4041+2517k 16977+153222io 5761pf+0w
   - 4737.875u 571.889s 51:07.10 173.1%      4039+2516k 1519+153172io 450pf+0w
   - 4734.651u 570.591s 51:51.69 170.4%      4039+2519k 12822+153198io 3264pf+0w
2) HTT, PAT, -j20
   - 4754.903u 604.875s 51:10.69 174.5%      -3981+2500k 3503+153256io 2952pf+0w
   - 4770.237u 613.092s 50:46.67 176.6%      -3948+2501k 3132+153183io 3143pf+0w
   - 4772.232u 614.315s 50:45.60 176.8%      -3942+2501k 2861+153184io 2963pf+0w
3) no-HTT, PAT, -j4
   - 2843.366u 431.189s 57:54.23 94.2%       3981+2475k 1549+153171io 1276pf+0w
   - 2843.791u 430.378s 58:22.65 93.4%       3983+2476k 1293+153166io 450pf+0w
   - 2842.366u 432.233s 57:42.22 94.5%       3981+2473k 1277+153170io 450pf+0w
4) HTT, PAT, -j8
   - 4761.587u 592.745s 50:40.74 176.0%      -3986+2509k 1280+153185io 450pf+0w
   - 4756.575u 593.777s 50:49.25 175.4%      -3991+2509k 1277+153198io 450pf+0w
   - 4766.158u 595.531s 50:36.81 176.5%      -3975+2510k 1284+153182io 450pf+0w
5) Single-User-Mode, HTT, PAT, -j4
   - 4732.941u 578.183s 51:08.25 173.0%      4035+2515k 1274+153164io 450pf+0w
   - 4720.249u 573.533s 51:16.59 172.3%      4034+2515k 1276+153165io 450pf+0w
   - 4737.491u 575.077s 51:07.71 173.1%      4035+2517k 1273+153173io 450pf+0w
6) Single-User-Mode, HTT, PAT, -j8
   - 4756.060u 604.684s 50:33.13 176.7%      -3978+2509k 1284+153173io 450pf+0w
   - 4803.037u 604.068s 52:26.76 171.8%      -3904+2511k 17043+153163io 5763pf+0w
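
The HTT/no-HTT comparison can be recomputed from the times above. A
minimal sketch in Python, assuming the lines are csh-style time output
(user seconds, system seconds, elapsed mm:ss.cc, CPU percentage); the
function names are only for illustration. It uses the two -j4 runs in
multi-user mode, with and without HTT.

```python
import re
from statistics import mean

# Matches "<user>u <sys>s <min>:<sec> ..." at the start of a time line.
TIME_RE = re.compile(r"([\d.]+)u\s+([\d.]+)s\s+(\d+):(\d+\.\d+)\s")

HTT_J4 = [
    "4736.474u 569.157s 52:19.56 168.9%",
    "4737.875u 571.889s 51:07.10 173.1%",
    "4734.651u 570.591s 51:51.69 170.4%",
]
NO_HTT_J4 = [
    "2843.366u 431.189s 57:54.23 94.2%",
    "2843.791u 430.378s 58:22.65 93.4%",
    "2842.366u 432.233s 57:42.22 94.5%",
]

def parse(line):
    """Return (user, system, elapsed) in seconds for one time line."""
    m = TIME_RE.search(line)
    if m is None:
        raise ValueError("not a time line: " + line)
    user, system = float(m.group(1)), float(m.group(2))
    elapsed = int(m.group(3)) * 60 + float(m.group(4))
    return user, system, elapsed

def cpu_percent(line):
    """CPU usage as (user + system) / elapsed, in percent."""
    user, system, elapsed = parse(line)
    return 100.0 * (user + system) / elapsed

def mean_elapsed(lines):
    """Mean elapsed (wall-clock) seconds over several runs."""
    return mean(parse(line)[2] for line in lines)

# Fraction of wall-clock time saved by enabling HTT at -j4.
saving = 1.0 - mean_elapsed(HTT_J4) / mean_elapsed(NO_HTT_J4)

print("mean elapsed, HTT:    %.1f s" % mean_elapsed(HTT_J4))
print("mean elapsed, no-HTT: %.1f s" % mean_elapsed(NO_HTT_J4))
print("time saved with HTT:  %.1f%%" % (100.0 * saving))
```

For these six runs the saving works out to a bit under 11% of the
elapsed time. The recomputed CPU percentage may differ from the printed
one in the last digit because of rounding.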

Copyright (c) 1992-2003 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
	The Regents of the University of California. All rights reserved.
FreeBSD 5.1-CURRENT #0: Tue Aug 26 13:39:29 GMT 2003
    root_at_statler:/usr/obj/usr/src/sys/STATLER
Preloaded elf kernel "/boot/kernel/kernel" at 0xc05b0000.
Preloaded elf module "/boot/kernel/acpi.ko" at 0xc05b021c.
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel(R) Pentium(R) 4 CPU 2.40GHz (2398.86-MHz 686-class CPU)
  Origin = "GenuineIntel"  Id = 0xf29  Stepping = 9
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Hyperthreading: 2 logical CPUs
real memory  = 1072889856 (1023 MB)
avail memory = 1036087296 (988 MB)
Programming 24 pins in IOAPIC #0
IOAPIC #0 intpin 2 -> irq 0
FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
 cpu0 (BSP): apic id:  0, version: 0x00050014, at 0xfee00000
 cpu1 (AP):  apic id:  1, version: 0x00050014, at 0xfee00000
 io0 (APIC): apic id:  2, version: 0x00178020, at 0xfec00000
Pentium Pro MTRR support enabled
VESA: v3.0, 32768k memory, flags:0x1, mode table:0xc00c52cd (c00052cd)
VESA: Matrox Graphics Inc.
npx0: <math processor> on motherboard
npx0: INT 16 interface
acpi0: <A M I  OEMXSDT > on motherboard
pcibios: BIOS version 2.10
Using $PIR table, 14 entries at 0xc00f5410
acpi0: power button is handled as a fixed feature programming model.
Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x808-0x80b on acpi0
acpi_cpu0: <CPU> port 0x530-0x537 on acpi0
acpi_cpu1: <CPU> port 0x530-0x537 on acpi0
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pci0: <ACPI PCI bus> on pcib0
IOAPIC #0 intpin 16 -> irq 2
IOAPIC #0 intpin 19 -> irq 3
IOAPIC #0 intpin 18 -> irq 5
IOAPIC #0 intpin 23 -> irq 7
IOAPIC #0 intpin 17 -> irq 10
agp0: <Intel 82865 host to AGP bridge> mem 0xf8000000-0xfbffffff at device 0.0 on pci0
pcib1: <ACPI PCI-PCI bridge> at device 1.0 on pci0
pcib1: could not get PCI interrupt routing table for \\_SB_.PCI0.P0P1 - AE_NOT_FOUND
pci1: <ACPI PCI bus> on pcib1
drm0: <Matrox G550 (AGP)> mem 0xfe000000-0xfe7fffff,0xfe9fc000-0xfe9fffff,0xf4000000-0xf5ffffff irq 2 at device 0.0 on pci1
info: [drm] AGP at 0xf8000000 64MB
info: [drm] Initialized mga 3.1.0 20021029 on minor 0
uhci0: <Intel 82801EB (ICH5) USB controller USB-A> port 0xef00-0xef1f irq 2 at device 29.0 on pci0
usb0: <Intel 82801EB (ICH5) USB controller USB-A> on uhci0
usb0: USB revision 1.0
uhub0: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
ums0: Logitech USB-PS/2 Optical Mouse, rev 2.00/11.10, addr 2, iclass 3/1
ums0: 3 buttons and Z dir.
uhci1: <Intel 82801EB (ICH5) USB controller USB-B> port 0xef20-0xef3f irq 3 at device 29.1 on pci0
usb1: <Intel 82801EB (ICH5) USB controller USB-B> on uhci1
usb1: USB revision 1.0
uhub1: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub1: 2 ports with 2 removable, self powered
uhci2: <Intel 82801EB (ICH5) USB controller USB-C> port 0xef40-0xef5f irq 5 at device 29.2 on pci0
usb2: <Intel 82801EB (ICH5) USB controller USB-C> on uhci2
usb2: USB revision 1.0
uhub2: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub2: 2 ports with 2 removable, self powered
uhci3: <Intel 82801EB (ICH5) USB controller USB-D> port 0xef80-0xef9f irq 2 at device 29.3 on pci0
usb3: <Intel 82801EB (ICH5) USB controller USB-D> on uhci3
usb3: USB revision 1.0
uhub3: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub3: 2 ports with 2 removable, self powered
ehci0: <EHCI (generic) USB 2.0 controller> mem 0xfebfbc00-0xfebfbfff irq 7 at device 29.7 on pci0
ehci_pci_attach: companion usb0
ehci_pci_attach: companion usb1
ehci_pci_attach: companion usb2
ehci_pci_attach: companion usb3
usb4: EHCI version 1.0
usb4: companion controllers, 2 ports each: usb0 usb1 usb2 usb3
usb4: <EHCI (generic) USB 2.0 controller> on ehci0
usb4: USB revision 2.0
uhub4: (0x8086) EHCI root hub, class 9/0, rev 2.00/1.00, addr 1
uhub4: 8 ports with 8 removable, self powered
pcib2: <ACPI PCI-PCI bridge> at device 30.0 on pci0
pci2: <ACPI PCI bus> on pcib2
IOAPIC #0 intpin 20 -> irq 11
IOAPIC #0 intpin 22 -> irq 16
fwohci0: <VIA VT6306> port 0xdc00-0xdc7f mem 0xfeaff800-0xfeafffff irq 11 at device 3.0 on pci2
fwohci0: OHCI version 1.0 (ROM=1)
fwohci0: No. of Isochronous channel is 4.
fwohci0: EUI64 00:e0:18:00:00:20:ae:b5
fwohci0: Phy 1394a available S400, 2 ports.
fwohci0: Link S400, max_rec 2048 bytes.
firewire0: <IEEE1394(FireWire) bus> on fwohci0
fwohci0: Initiate bus reset
fwohci0: BUS reset
fwohci0: node_id=0xc800ffc0, gen=1, CYCLEMASTER mode
firewire0: 1 nodes, maxhop <= 0, cable IRM = 0 (me)
firewire0: bus manager 0 (me)
pci2: <network, ethernet> at device 5.0 (no driver attached)
xl0: <3Com 3c905C-TX Fast Etherlink XL> port 0xd480-0xd4ff mem 0xfeaff400-0xfeaff47f irq 16 at device 10.0 on pci2
xl0: Ethernet address: 00:01:02:9b:c3:c3
miibus0: <MII bus> on xl0
xlphy0: <3c905C 10/100 internal PHY> on miibus0
xlphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
pcm0: <Creative EMU10K1> port 0xdf40-0xdf5f irq 11 at device 12.0 on pci2
pcm0: <SigmaTel STAC9721/23 AC97 Codec>
isab0: <PCI-ISA bridge> at device 31.0 on pci0
isa0: <ISA bus> on isab0
atapci0: <Intel ICH5 UDMA100 controller> port 0xfc00-0xfc0f,0-0x3,0-0x7,0-0x3,0-0x7 irq 5 at device 31.1 on pci0
ata0: at 0x1f0 irq 14 on atapci0
ata1: at 0x170 irq 15 on atapci0
ichsmb0: <SMBus controller> port 0x400-0x41f irq 10 at device 31.3 on pci0
acpi_button0: <Power Button> on acpi0
atkbdc0: <Keyboard controller (i8042)> port 0x64,0x60 irq 1 on acpi0
atkbd0: <AT Keyboard> flags 0x1 irq 1 on atkbdc0
kbd0 at atkbd0
speaker0 port 0x61 on acpi0
pmtimer0 on isa0
orm0: <Option ROMs> at iomem 0xc9000-0xc97ff,0xc0000-0xc8fff on isa0
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
APIC_IO: Testing 8254 interrupt delivery
APIC_IO: routing 8254 via IOAPIC #0 intpin 2

Timecounters tick every 10.000 msec
IP Filter: v3.4.31 initialized.  Default = block all, Logging = enabled
acpi_cpu: throttling enabled, 8 steps (100% to 12.5%), currently 100.0%
ad0: 78533MB <IC35L090AVV207-0> [159560/16/63] at ata0-master UDMA100
ad1: 78533MB <IC35L090AVV207-0> [159560/16/63] at ata0-slave UDMA100
ad2: DMA limited to UDMA33, non-ATA66 cable or device
ad2: 58644MB <IBM-DTLA-307060> [119150/16/63] at ata1-master UDMA33
ums0: at uhub0 port 2 (addr 2) disconnected
ums0: detached
ums0: Logitech USB-PS/2 Optical Mouse, rev 2.00/11.10, addr 2, iclass 3/1
ums0: 3 buttons and Z dir.
SMP: AP CPU #1 Launched!
Mounting root from ufs:/dev/ad0s1a
IP Filter: already initialized
IP Filter: already initialized
Received on Wed Aug 27 2003 - 22:37:33 UTC
