On 2014-05-12 14:25, Adrian Chadd wrote:
> On 12 May 2014 10:35, Allan Jude <freebsd_at_allanjude.com> wrote:
>> I have this system:
>>
>> hw.model: Intel(R) Xeon(R) CPU E3-1220 v3 @ 3.10GHz
>> hw.ncpu: 4
>>
>> http://ark.intel.com/products/75052
>>
>> dev.cpu.0.%desc: ACPI CPU
>> dev.cpu.0.%driver: cpu
>> dev.cpu.0.%location: handle=\_PR_.CPU0
>> dev.cpu.0.%pnpinfo: _HID=none _UID=0
>> dev.cpu.0.%parent: acpi0
>> dev.cpu.0.freq: 3100
>> dev.cpu.0.freq_levels: 3101/80000 3100/80000 2900/72713 2800/69558
>> 2600/62669 2400/56794 2300/53935 2100/47673 1900/42370 1800/39795
>> 1600/34136 1500/31729 1300/26432 1137/23128 1100/21994 1000/19851
>> 875/17369 800/15113 700/13223 600/11334 500/9445 400/7556 300/5667
>> 200/3778 100/1889
>> dev.cpu.0.cx_supported: C1/1/1 C2/2/148
>> dev.cpu.0.cx_lowest: C8
>> dev.cpu.0.cx_usage: 9.01% 90.98% last 807us
>> dev.cpu.1.%desc: ACPI CPU
>> dev.cpu.1.%driver: cpu
>> dev.cpu.1.%location: handle=\_PR_.CPU1
>> dev.cpu.1.%pnpinfo: _HID=none _UID=0
>> dev.cpu.1.%parent: acpi0
>> dev.cpu.1.cx_supported: C1/1/1 C2/2/148
>> dev.cpu.1.cx_lowest: C8
>> dev.cpu.1.cx_usage: 11.70% 88.29% last 21303us
>> dev.cpu.2.%desc: ACPI CPU
>> dev.cpu.2.%driver: cpu
>> dev.cpu.2.%location: handle=\_PR_.CPU2
>> dev.cpu.2.%pnpinfo: _HID=none _UID=0
>> dev.cpu.2.%parent: acpi0
>> dev.cpu.2.cx_supported: C1/1/1 C2/2/148
>> dev.cpu.2.cx_lowest: C8
>> dev.cpu.2.cx_usage: 15.17% 84.82% last 22987us
>> dev.cpu.3.%desc: ACPI CPU
>> dev.cpu.3.%driver: cpu
>> dev.cpu.3.%location: handle=\_PR_.CPU3
>> dev.cpu.3.%pnpinfo: _HID=none _UID=0
>> dev.cpu.3.%parent: acpi0
>> dev.cpu.3.cx_supported: C1/1/1 C2/2/148
>> dev.cpu.3.cx_lowest: C8
>> dev.cpu.3.cx_usage: 11.74% 88.25% last 6073us
>>
> So ACPI is exposing C1 and C2 only.
>
>> According to the Intel specs (Page 11), this processor supports C1, C1E,
>> C3, C6 and C7
>>
>> The above sysctl dump shows only C1 and C2. I wonder if the C2 is
>> actually C3
>>
>> http://www.intel.com/content/dam/www/public/us/en/documents/datasheets/xeon-e3-1200v3-vol-1-datasheet.pdf
> It'd say C2/3/xxx in that case.
>
> Chances are you'll end up seeing it fall into deeper sleep states. Try
> installing intel-pcm; kldload cpuctl; run pcm.x 1 . See if it's
> entering lower CPU states.
>
>> How is our support for the newer Cx States introduced in Haswell, which
>> can apparently go as high as C10
> I don't know if we get those exposed via ACPI. I know there's a bunch
> of cute things we could be doing with MWAIT that we aren't, but we
> certainly should be drifting into lower sleep states.
>
> Just run intel-pcm and see.
>
> Thanks,
>
>
>
> -a

Stock configuration:

# pcm.x 10

 Intel(r) Performance Counter Monitor V2.6 (2013-11-04 13:43:31 +0100 ID=db05e43)

 Copyright (c) 2009-2013 Intel Corporation

Number of physical cores: 4
Number of logical cores: 4
Threads (logical cores) per physical core: 1
Num sockets: 1
Core PMU (perfmon) version: 3
Number of core PMU generic (programmable) counters: 8
Width of generic (programmable) counters: 48 bits
Number of core PMU fixed counters: 3
Width of fixed counters: 48 bits
Nominal core frequency: 3100000000 Hz
Package thermal spec power: 80 Watt; Package minimum power: 0 Watt; Package maximum power: 0 Watt;

Detected Intel(R) Xeon(R) CPU E3-1220 v3 @ 3.10GHz "Intel(r) microarchitecture codename Haswell"

 EXEC  : instructions per nominal CPU cycle
 IPC   : instructions per CPU cycle
 FREQ  : relation to nominal CPU frequency='unhalted clock ticks'/'invariant timer ticks' (includes Intel Turbo Boost)
 AFREQ : relation to nominal CPU frequency while in active state (not in power-saving C state)='unhalted clock ticks'/'invariant timer ticks while in C0-state' (includes Intel Turbo Boost)
 L3MISS: L3 cache misses
 L2MISS: L2 cache misses (including other core's L2 cache *hits*)
 L3HIT : L3 cache hit ratio (0.00-1.00)
 L2HIT : L2 cache hit ratio (0.00-1.00)
 L3CLK : ratio of CPU cycles lost due to L3 cache misses (0.00-1.00), in some cases could be >1.0 due to a higher memory latency
 L2CLK : ratio of CPU cycles lost due to missing L2 cache but still hitting L3 cache (0.00-1.00)
 READ  : bytes read from memory controller (in GBytes)
 WRITE : bytes written to memory controller (in GBytes)
 TEMP  : Temperature reading in 1 degree Celsius relative to the TjMax temperature (thermal headroom): 0 corresponds to the max temperature

 Core (SKT) | EXEC | IPC  | FREQ | AFREQ | L3MISS | L2MISS | L3HIT | L2HIT | L3CLK | L2CLK | READ | WRITE | TEMP

   0    0     0.00   0.74   0.00   0.76     39 K    120 K    0.67    0.66    0.13    0.09    N/A    N/A     74
   1    0     0.00   0.72   0.00   0.75     17 K     80 K    0.79    0.71    0.07    0.10    N/A    N/A     76
   2    0     0.00   0.62   0.00   0.61     8037     33 K    0.76    0.58    0.08    0.08    N/A    N/A     76
   3    0     0.00   0.72   0.00   0.76     18 K     98 K    0.81    0.70    0.07    0.10    N/A    N/A     76
-------------------------------------------------------------------------------------------------------------------
 SKT    0     0.00   0.72   0.00   0.74     83 K    332 K    0.75    0.68    0.09    0.09   0.38   0.01     74
-------------------------------------------------------------------------------------------------------------------
 TOTAL  *     0.00   0.72   0.00   0.74     83 K    332 K    0.75    0.68    0.09    0.09   0.38   0.01    N/A

 Instructions retired: 118 M ; Active cycles: 165 M ; Time (TSC): 30 Gticks ; C0 (active,non-halted) core residency: 0.18 %

 C1 core residency: 99.82 %; C3 core residency: 0.00 %; C6 core residency: 0.00 %; C7 core residency: 0.00 %;
 C2 package residency: 0.00 %; C3 package residency: 0.00 %; C6 package residency: 0.00 %; C7 package residency: 0.00 %;

 PHYSICAL CORE IPC                 : 0.72 => corresponds to 17.93 % utilization for cores in active state
 Instructions per nominal CPU cycle: 0.00 => corresponds to 0.02 % core utilization over time interval
----------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------
 SKT   0 package consumed 68.42 Joules
----------------------------------------------------------------------------------------------
 TOTAL:                   68.42 Joules

Then just enabling the higher Cx states (no powerd or anything):

# sysctl hw.acpi.cpu.cx_lowest=c8
hw.acpi.cpu.cx_lowest: C1 -> C8

# pcm.x 10

 Intel(r) Performance Counter Monitor V2.6 (2013-11-04 13:43:31 +0100 ID=db05e43)

 Copyright (c) 2009-2013 Intel Corporation

Number of physical cores: 4
Number of logical cores: 4
Threads (logical cores) per physical core: 1
Num sockets: 1
Core PMU (perfmon) version: 3
Number of core PMU generic (programmable) counters: 8
Width of generic (programmable) counters: 48 bits
Number of core PMU fixed counters: 3
Width of fixed counters: 48 bits
Nominal core frequency: 3100000000 Hz
Package thermal spec power: 80 Watt; Package minimum power: 0 Watt; Package maximum power: 0 Watt;

Detected Intel(R) Xeon(R) CPU E3-1220 v3 @ 3.10GHz "Intel(r) microarchitecture codename Haswell"

 EXEC  : instructions per nominal CPU cycle
 IPC   : instructions per CPU cycle
 FREQ  : relation to nominal CPU frequency='unhalted clock ticks'/'invariant timer ticks' (includes Intel Turbo Boost)
 AFREQ : relation to nominal CPU frequency while in active state (not in power-saving C state)='unhalted clock ticks'/'invariant timer ticks while in C0-state' (includes Intel Turbo Boost)
 L3MISS: L3 cache misses
 L2MISS: L2 cache misses (including other core's L2 cache *hits*)
 L3HIT : L3 cache hit ratio (0.00-1.00)
 L2HIT : L2 cache hit ratio (0.00-1.00)
 L3CLK : ratio of CPU cycles lost due to L3 cache misses (0.00-1.00), in some cases could be >1.0 due to a higher memory latency
 L2CLK : ratio of CPU cycles lost due to missing L2 cache but still hitting L3 cache (0.00-1.00)
 READ  : bytes read from memory controller (in GBytes)
 WRITE : bytes written to memory controller (in GBytes)
 TEMP  : Temperature reading in 1 degree Celsius relative to the TjMax temperature (thermal headroom): 0 corresponds to the max temperature

 Core (SKT) | EXEC | IPC  | FREQ | AFREQ | L3MISS | L2MISS | L3HIT | L2HIT | L3CLK | L2CLK | READ | WRITE | TEMP

   0    0     0.00   0.11   0.00   0.99    611 K    629 K    0.03    0.02    1.89    0.01    N/A    N/A     73
   1    0     0.00   0.19   0.00   0.99    152 K    169 K    0.10    0.04    1.36    0.04    N/A    N/A     76
   2    0     0.00   0.20   0.00   0.99    153 K    171 K    0.10    0.04    1.29    0.04    N/A    N/A     77
   3    0     0.00   0.20   0.00   1.00    159 K    180 K    0.12    0.04    1.14    0.03    N/A    N/A     76
-------------------------------------------------------------------------------------------------------------------
 SKT    0     0.00   0.16   0.00   0.99   1077 K   1150 K    0.06    0.03    1.55    0.03   0.14   0.01     72
-------------------------------------------------------------------------------------------------------------------
 TOTAL  *     0.00   0.16   0.00   0.99   1077 K   1150 K    0.06    0.03    1.55    0.03   0.14   0.01    N/A

 Instructions retired: 19 M ; Active cycles: 125 M ; Time (TSC): 31 Gticks ; C0 (active,non-halted) core residency: 0.10 %

 C1 core residency: 1.85 %; C3 core residency: 0.00 %; C6 core residency: 0.00 %; C7 core residency: 98.05 %;
 C2 package residency: 8.10 %; C3 package residency: 7.46 %; C6 package residency: 79.20 %; C7 package residency: 0.00 %;

 PHYSICAL CORE IPC                 : 0.16 => corresponds to 3.96 % utilization for cores in active state
 Instructions per nominal CPU cycle: 0.00 => corresponds to 0.00 % core utilization over time interval
----------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------
 SKT   0 package consumed 22.77 Joules
----------------------------------------------------------------------------------------------
 TOTAL:                   22.77 Joules

This suggests it is spending lots of time in C6 and C7.

Will try to grab results from a few more machines.

Received on Tue May 13 2014 - 00:09:16 UTC
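
For a rough sense of what the two package energy figures mean, here is a quick sketch that turns them into average power. It assumes each `pcm.x 10` sample covers about 10 seconds (the reported 30 and 31 Gticks of TSC at a nominal 3.1 GHz are consistent with that); the function and variable names are made up for illustration and are not part of pcm.

```python
# Rough arithmetic on the package energy figures from the two pcm.x runs.
# Assumption: "pcm.x 10" samples over roughly 10 seconds.
INTERVAL_S = 10.0


def average_watts(joules, seconds=INTERVAL_S):
    """Average power over the sample interval, in watts."""
    return joules / seconds


def percent_reduction(before, after):
    """Relative drop from 'before' to 'after', in percent."""
    return 100.0 * (before - after) / before


stock_j = 68.42  # cx_lowest=C1 (stock configuration)
c8_j = 22.77     # cx_lowest=C8

stock_w = average_watts(stock_j)
c8_w = average_watts(c8_j)
saving = percent_reduction(stock_j, c8_j)

print(f"cx_lowest=C1: {stock_w:.2f} W average package power")
print(f"cx_lowest=C8: {c8_w:.2f} W average package power")
print(f"reduction:    {saving:.1f} %")
```

This only covers the package energy that pcm reports on an idle machine; it says nothing about power at the wall or under load.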