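I previously wrote an article about how many logical CPUs SQL Server can recognize. A few days ago, someone on a forum asked how Windows processor groups are divided.
Two articles were linked in that thread. Let's go through them now.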
Uneven Windows Processor Groups
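This article mainly concerns hardware with more than 64 logical CPUs.
Windows Server 2008 R2 introduced processor groups so that it can support more than 64 logical processors; a single group holds at most 64. The hardware available at the time had physical processors with 8 cores per socket.
With hyper-threading, that means 16 logical CPUs per socket. Each socket forms one or two NUMA nodes, and four or eight of these NUMA nodes (64 logical CPUs) make up one processor group.
Processor groups are assigned when the operating system boots. For this reason, Windows Server 2008 R2 and later versions inspect the physical hardware layout so that the groups line up with the NUMA nodes, and they check memory latency to decide which logical CPU goes into which group. Once the assignment is done, it cannot be changed dynamically.
This division only takes place on hardware with more than 64 logical CPUs. On a typical 8-socket server, CPU and memory resources were usually distributed evenly across the processor groups
(apart from some odd hardware with 96 logical CPUs that appeared on the market in 2009 and 2010).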
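What happens to existing software when it meets processor groups? On a system with more than 64 logical CPUs, how does software end up on particular logical processors?
In practice, Windows assigns one of the processor groups to an application when it starts. An application that is not group-aware sees only a window of at most 64 logical CPUs.
It does, however, see the full memory resources of the machine. A typical application is therefore scheduled on just one of the processor groups.
As long as the groups are evenly sized and the software does not depend on particular NUMA nodes being available, all is well.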
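However, this balance is upset by Intel's latest Xeon E7 processor family, which has 10 cores and 20 logical processors per socket.
Clearly, these core and logical processor counts do not add up neatly to 64. On my blog, I have already described how these processors affect the SQL Server affinity mask settings.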
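To make the mismatch concrete, here is a small arithmetic sketch. It assumes two hardware threads per core and that a socket is never split across processor groups; the function name `socket_fit` is made up for this illustration.

```python
MAX_GROUP_SIZE = 64  # a Windows processor group holds at most 64 logical CPUs


def socket_fit(cores_per_socket, threads_per_core=2):
    """Return (logical CPUs per socket, whole sockets per group, unused slots).

    Assumption for this sketch: a socket is kept whole inside one group.
    """
    logical_per_socket = cores_per_socket * threads_per_core
    whole_sockets = MAX_GROUP_SIZE // logical_per_socket
    unused_slots = MAX_GROUP_SIZE - whole_sockets * logical_per_socket
    return logical_per_socket, whole_sockets, unused_slots


print(socket_fit(8))   # (16, 4, 0): four 8-core sockets fill a group exactly
print(socket_fit(10))  # (20, 3, 4): three 10-core sockets leave 4 slots unused
```

With 8-core sockets a group is filled exactly; with 10-core sockets it cannot be, which is where the uneven groups come from.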
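So far we have not discussed how Windows Server 2008 R2 distributes the 80 logical processors of a 4-socket server, or the 160 logical processors of an 8-socket server.
The original algorithm in Windows Server 2008 R2 creates as few processor groups as possible and keeps each group as large as possible.
Let's see what happens.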
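The "as few groups as possible, each as large as possible" rule can be sketched as a greedy packing of NUMA nodes into groups. This is only an illustration, not the actual Windows implementation. It assumes one NUMA node per socket, 20 logical CPUs per node, and that a node is never split across groups; `pack_nodes_into_groups` is a hypothetical helper name.

```python
MAX_GROUP_SIZE = 64  # upper limit of logical CPUs in one processor group


def pack_nodes_into_groups(node_sizes, max_group_size=MAX_GROUP_SIZE):
    """Greedily fill each group with whole NUMA nodes before opening the next.

    node_sizes: logical CPU count of each NUMA node, in enumeration order.
    Returns the logical CPU count of each resulting group.
    """
    groups = []
    current = 0
    for size in node_sizes:
        if size > max_group_size:
            raise ValueError("a single node does not fit into one group")
        # Open a new group when the next node would overflow the current one.
        if current and current + size > max_group_size:
            groups.append(current)
            current = 0
        current += size
    if current:
        groups.append(current)
    return groups


print(pack_nodes_into_groups([20] * 4))  # [60, 20]
print(pack_nodes_into_groups([20] * 8))  # [60, 60, 40]
print(pack_nodes_into_groups([16] * 8))  # [64, 64]
```

Under these assumptions, the 80 and 160 logical processor machines end up with groups of different sizes, while the older 8-core hardware splits evenly.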
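Checking the current processor group information
To see the exact processor group layout on Windows Server 2008 R2, the hardware normally has to expose more than 64 logical CPUs (threads); otherwise there is only a single group. The tool used for the check is called "coreinfo".
Download: http://technet.microsoft.com/en-us/sysinternals/cc835722.aspx
Download (mirror): https://files.cnblogs.com/lyhabc/Coreinfo.zip
Download coreinfo.exe and run it in a cmd window.
It is best to redirect the coreinfo output to a text file with the following command so that it is easier to analyze: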
coreinfo > structure.txt
Contents of structure.txt:
Intel(R) Pentium(R) CPU G630 @ 2.70GHz
Intel64 Family 6 Model 42 Stepping 7, GenuineIntel
HTT * Hyperthreading enabled
HYPERVISOR - Hypervisor is present
VMX * Supports Intel hardware-assisted virtualization
SVM - Supports AMD hardware-assisted virtualization
EM64T * Supports 64-bit mode
SMX - Supports Intel trusted execution
SKINIT - Supports AMD SKINIT
NX * Supports no-execute page protection
SMEP - Supports Supervisor Mode Execution Prevention
SMAP - Supports Supervisor Mode Access Prevention
PAGE1GB - Supports 1 GB large pages
PAE * Supports > 32-bit physical addresses
PAT * Supports Page Attribute Table
PSE * Supports 4 MB pages
PSE36 * Supports > 32-bit address 4 MB pages
PGE * Supports global bit in page tables
SS * Supports bus snooping for cache operations
VME * Supports Virtual-8086 mode
RDWRFSGSBASE - Supports direct GS/FS base access
FPU * Implements i387 floating point instructions
MMX * Supports MMX instruction set
MMXEXT - Implements AMD MMX extensions
3DNOW - Supports 3DNow! instructions
3DNOWEXT - Supports 3DNow! extension instructions
SSE * Supports Streaming SIMD Extensions
SSE2 * Supports Streaming SIMD Extensions 2
SSE3 * Supports Streaming SIMD Extensions 3
SSSE3 * Supports Supplemental SIMD Extensions 3
SSE4a - Supports Sreaming SIMDR Extensions 4a
SSE4.1 * Supports Streaming SIMD Extensions 4.1
SSE4.2 * Supports Streaming SIMD Extensions 4.2
AES - Supports AES extensions
AVX - Supports AVX intruction extensions
FMA - Supports FMA extensions using YMM state
MSR * Implements RDMSR/WRMSR instructions
MTRR * Supports Memory Type Range Registers
XSAVE * Supports XSAVE/XRSTOR instructions
OSXSAVE * Supports XSETBV/XGETBV instructions
RDRAND - Supports RDRAND instruction
RDSEED - Supports RDSEED instruction
CMOV * Supports CMOVcc instruction
CLFSH * Supports CLFLUSH instruction
CX8 * Supports compare and exchange 8-byte instructions
CX16 * Supports CMPXCHG16B instruction
BMI1 - Supports bit manipulation extensions 1
BMI2 - Supports bit manipulation extensions 2
ADX - Supports ADCX/ADOX instructions
DCA - Supports prefetch from memory-mapped device
F16C - Supports half-precision instruction
FXSR * Supports FXSAVE/FXSTOR instructions
FFXSR - Supports optimized FXSAVE/FSRSTOR instruction
MONITOR * Supports MONITOR and MWAIT instructions
MOVBE - Supports MOVBE instruction
ERMSB - Supports Enhanced REP MOVSB/STOSB
PCLULDQ * Supports PCLMULDQ instruction
POPCNT * Supports POPCNT instruction
LZCNT - Supports LZCNT instruction
SEP * Supports fast system call instructions
LAHF-SAHF * Supports LAHF/SAHF instructions in 64-bit mode
HLE - Supports Hardware Lock Elision instructions
RTM - Supports Restricted Transactional Memory instructions
DE * Supports I/O breakpoints including CR4.DE
DTES64 * Can write history of 64-bit branch addresses
DS * Implements memory-resident debug buffer
DS-CPL * Supports Debug Store feature with CPL
PCID * Supports PCIDs and settable CR4.PCIDE
INVPCID - Supports INVPCID instruction
PDCM * Supports Performance Capabilities MSR
RDTSCP * Supports RDTSCP instruction
TSC * Supports RDTSC instruction
TSC-DEADLINE * Local APIC supports one-shot deadline timer
TSC-INVARIANT * TSC runs at constant rate
xTPR * Supports disabling task priority messages
EIST * Supports Enhanced Intel Speedstep
ACPI * Implements MSR for power management
TM * Implements thermal monitor circuitry
TM2 * Implements Thermal Monitor 2 control
APIC * Implements software-accessible local APIC
x2APIC - Supports x2APIC
CNXT-ID - L1 data cache mode adaptive or BIOS
MCE * Supports Machine Check, INT18 and CR4.MCE
MCA * Implements Machine Check Architecture
PBE * Supports use of FERR#/PBE# pin
PSN - Implements 96-bit processor serial number
PREFETCHW * Supports PREFETCHW instruction

Maximum implemented CPUID leaves: 0000000D (Basic), 80000008 (Extended).

Logical to Physical Processor Map:
*- Physical Processor 0
-* Physical Processor 1

Logical Processor to Socket Map:
** Socket 0

Logical Processor to NUMA Node Map:
** NUMA Node 0

Logical Processor to Cache Map:
*- Data Cache 0, Level 1, 32 KB, Assoc 8, LineSize 64
*- Instruction Cache 0, Level 1, 32 KB, Assoc 8, LineSize 64
*- Unified Cache 0, Level 2, 256 KB, Assoc 8, LineSize 64
-* Data Cache 1, Level 1, 32 KB, Assoc 8, LineSize 64
-* Instruction Cache 1, Level 1, 32 KB, Assoc 8, LineSize 64
-* Unified Cache 1, Level 2, 256 KB, Assoc 8, LineSize 64
** Unified Cache 2, Level 3, 3 MB, Assoc 12, LineSize 64

Logical Processor to Group Map:
** Group 0
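The part of this output that matters for processor groups is the final "Logical Processor to Group Map" section, where each `*` in a mask marks a logical CPU that belongs to the group. A minimal sketch for counting the logical CPUs per group follows. It assumes the saved file keeps coreinfo's usual layout of one map entry per line; the function name `parse_group_map` and the inline sample text are made up for this illustration.

```python
import re

# One map entry: a mask made of '*' and '-' followed by "Group <n>".
GROUP_LINE = re.compile(r"^\s*([*-]+)\s+Group\s+(\d+)\s*$")


def parse_group_map(text):
    """Return {group number: logical CPU count} from coreinfo text output."""
    counts = {}
    in_section = False
    for line in text.splitlines():
        if line.strip().startswith("Logical Processor to Group Map"):
            in_section = True
            continue
        if not in_section:
            continue
        match = GROUP_LINE.match(line)
        if match:
            mask, group = match.groups()
            counts[int(group)] = counts.get(int(group), 0) + mask.count("*")
        elif line.strip():
            break  # a different section started
    return counts


sample = """Logical Processor to NUMA Node Map:
**  NUMA Node 0

Logical Processor to Group Map:
**  Group 0
"""
print(parse_group_map(sample))  # {0: 2}
```

For a real file, the same function could be called as `parse_group_map(open("structure.txt").read())`. On the two-CPU machine above it reports a single group, which is expected because multiple groups only appear with more than 64 logical CPUs.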