mpollice / AmdMsrTweaker

how to set P states 4-7

gfanini opened this issue

This is the default on a Lenovo IdeaPad 320-15ABR
(A10-9620P, family 15h, model 65h):

AmdMsrTweaker v1.1

.:. General
---
  AMD family 0x15, model 0x65 CPU, 4 cores
  Default reference clock: 100 MHz
  Available multipliers: 1 .. 63
  Available voltage IDs: 0 .. 1.55 (0.0125 steps)

.:. Turbo
---
  enabled
  unlocked
  Max multiplier: 36

.:. P-states
---
  8 of 8 enabled (P0 .. P7)
  Turbo P-states: P0 P1 P2
  ---
  P0: 35x at 1.1625V
      NorthBridge in NB_P0
  P1: 30x at 0.9V
      NorthBridge in NB_P0
  P2: 27x at 0.725V
      NorthBridge in NB_P0
  P3: 25x at 0.6V
      NorthBridge in NB_P0
  P4: 21x at 0.4V
      NorthBridge in NB_P0
  P5: 16x at 0.225V
      NorthBridge in NB_P0
  P6: 13x at 0.15V
      NorthBridge in NB_P1
  P7: 8x at 0.05V
      NorthBridge in NB_P1
  ---
  NB_P0: 11x at 0.325V
  NB_P1: 10x at 0.275V
  NB_P2: 10x at 0.275V
  NB_P3: 7x at 0.125V

If I run this :

AmdMsrTweaker.exe Turbo=1 APM=0 P2=36@1.16 P1=36@1.16 P0=36@1.16

bcdedit /set useplatformclock true
bcdedit /set tscsyncpolicy Enhanced
bcdedit /set disabledynamictick yes

it sets P-states P0-P2:

  P0: 36x at 1.1625V
      NorthBridge in NB_P0
  P1: 36x at 1.1625V
      NorthBridge in NB_P0
  P2: 36x at 1.1625V
      NorthBridge in NB_P0

but I cannot set the NorthBridge P-states NB_P0/NB_P1.
Is there any way to set those states?

this is how high I can go without freeze/reboot :

AmdMsrTweaker.exe Turbo=1 APM=0 P7=21@0.2 P6=21@0.2 P5=21@0.4 P4=21@0.4 P3=25@0.7 P2=36@1.16 P1=36@1.16 P0=36@1.16

This is more or less working as intended (mostly by AMD). I never updated the tool to properly support anything newer than Kaveri, but apart from the wrong voltages being displayed it works as well as it does for the older models. As far as I know, there is no solution to the freeze/reboot issue when you change the software P0 state, i.e. the base clock. In this case that is the P3 state: AMD has both hardware and software P-state naming, and this tool doesn't "rename" the P-states according to their boost/normal status, so it always shows the hardware P-state names only.

What you could try is raising the P4 through P7 states to something higher than 21 but below 25 (or at most 25 but that may not work either). It should be possible to change the voltage of the P3 state but not the multiplier.

This seems to be as high as possible for P0-P7:

AmdMsrTweaker.exe Turbo=1 APM=0 P7=25@1.1 P6=25@1.1 P5=25@1.1 P4=25@1.1 P3=25@1.1 P2=36@1.16 P1=36@1.16 P0=36@1.16

Can you explain the logic of changing the NorthBridge states NB_P0-NB_P3? Is it possible to change these multipliers on an A10 CPU?
I can set NB_low=8, which should make all P-states use the highest NorthBridge frequency. Is that so?
This CPU seems to throttle down to about 1.6 GHz under load, ignoring the multipliers. Do you have any idea how to disable this self-protection? Should I use APM=0?
It is perhaps described somewhere in this manual: https://www.amd.com/system/files/TechDocs/42300_15h_Mod_10h-1Fh_BKDG.pdf
The throttling is presumably a function of temperature. Is there any way to act on this?

image

I modified the tool to display the current P-state continuously. It seems to use states P0-P6 on all cores, regardless of the Turbo and APM settings:

AmdMsrTweaker.cpp

#define MAX(x, y) (((x) > (y)) ? (x) : (y))
#define MIN(x, y) (((x) < (y)) ? (x) : (y))
extern void SwitchTo(int logicalCPUIndex);
void WaitForKey()
{
	Info info;
	info.Initialize();
	Worker worker(info);
	cout << endl << "Press any key to exit... ";
	SYSTEM_INFO sysInfo;
	GetSystemInfo(&sysInfo);
	const int numLogicalCPUs = sysInfo.dwNumberOfProcessors;
	int minP=100;
	int maxP=0;
	while (!_kbhit())
	{

		for (int j = 0; j < numLogicalCPUs; j++)
		{
			SwitchTo(j);
			int currentPState = info.GetCurrentPState();
			minP=MIN(minP,currentPState);
			maxP=MAX(maxP,currentPState);
			SwitchTo(j);
			// GetCoreMultiplier returns a double (e.g. 12.5), so keep the fraction.
			double currentMulti=info.GetCoreMultiplier(j);
			printf("cpu %d P %d M %.1f ",j,currentPState,currentMulti);
		}
		printf(" min P %d max P %d                               \r",minP,maxP);
		Sleep(1000);
	}

	cout << endl;
}

Info.cpp
// from open hardware monitor
double Info::GetCoreMultiplier(int cpu) 
{
	//SwitchTo(cpu);

	const QWORD msr = Rdmsr(0xc0010071);
	unsigned int cofvidEax = GetBits(msr, 0, 32);

	
	switch (Family) {
        case 0x10:
        case 0x11: 
        case 0x15: 
        case 0x16: {
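            // 8:6 CpuDid: current core divisor ID
            // 5:0 CpuFid: current core frequency ID
            unsigned int cpuDid = (cofvidEax >> 6) & 7;
            // CpuFid is 6 bits wide (5:0), so mask with 0x3F rather than 0x1F.
            unsigned int cpuFid = cofvidEax & 0x3F;
            // The 0.5 factor yields a multiplier relative to a 200 MHz reference clock.
            return 0.5 * (cpuFid + 0x10) / (1 << (int)cpuDid);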
          }
        case 0x12: {
            // 8:4 CpuFid: current CPU core frequency ID
            // 3:0 CpuDid: current CPU core divisor ID
            unsigned int cpuFid = (cofvidEax >> 4) & 0x1F;
            unsigned int cpuDid = cofvidEax & 0xF;
            double divisor;
            switch (cpuDid) {
              case 0: divisor = 1; break;
              case 1: divisor = 1.5; break;
              case 2: divisor = 2; break;
              case 3: divisor = 3; break;
              case 4: divisor = 4; break;
              case 5: divisor = 6; break;
              case 6: divisor = 8; break;
              case 7: divisor = 12; break;
              case 8: divisor = 16; break;
              default: divisor = 1; break;
            }
            return (cpuFid + 0x10) / divisor;
          }
        case 0x14: {
            // 8:4: current CPU core divisor ID most significant digit
            // 3:0: current CPU core divisor ID least significant digit
            unsigned int divisorIdMSD = (cofvidEax >> 4) & 0x1F;
            unsigned int divisorIdLSD = cofvidEax & 0xF;
            unsigned int value = 0;
           // Ring0.ReadPciConfig(miscellaneousControlAddress,CLOCK_POWER_TIMING_CONTROL_0_REGISTER, out value);
            unsigned int frequencyId = value & 0x1F;
            return (frequencyId + 0x10) /
              (divisorIdMSD + (divisorIdLSD * 0.25) + 1);
          }
        default:
          return 1;
      }
    }

Can you shed any light on why AmdMsrTweaker reports, say, multiplier 25 for P3, whereas this Open Hardware Monitor method displays a multiplier of about 8? Does it have something to do with the reported bus speed of 200 MHz, i.e. 8 x 200 = 1600 MHz? Either way, it does not seem to be using the multipliers set in the P-states by AmdMsrTweaker.

image

The tool "Core Temp" at times shows 100 MHz x multi, and at other times 138 MHz x multi, like here:

image

50742_15h_Models_60h-6Fh_BKDG.pdf

P-state is limited by:

  1. HTC hardware thermal control
  2. D18F3x68[SwPstateLimit]
  3. SBI ?
  4. CPB core performance boost

I tried to set P-state 4 with method 2 above. Could it be that 1 limits the P-state in hardware under load by monitoring the temperature? Would it be possible to disable HTC?

I used this GitHub fork, which specifically targets family 15h: https://github.com/LogioTek/AmdMsrTweaker
That fork doesn't seem to have an issues tab, meaning it is provided "as is", I would presume.

Booting into a Linux CoreFreq ISO, it can sustain a steady 2.5 GHz.

image

I see it shows the C1E state enabled. Is that set in the BIOS? Can it be disabled on Windows?

  1. To be honest I don't know a lot about the NB P-states. This functionality was written by the prior author of this tool. There is some interaction with memory and GPU clocks (i.e. if the GPU is in high clock mode the high NB P-state is supposed to be used; this is explained in some APU BKDG, along with requirements for different memory clocks). Personally I never really used this option much and left it as is. Reading up on it in the BKDGs or some forums may shed additional light.
  2. I'm not sure you can (or really want to) fully disable the thermal protection. You can try APM=0. You can also look for additional MSRs that may have settings regarding this. The HTC-related stuff seems to be read-only. There may be undocumented SMU commands to disable this. If you look up the HTC documentation in the BKDG, it explains that the SMU will set a product-specific P-state. That may be where the 16x is coming from.
  3. The GetCoreMultiplier function is from somewhere else? The 200MHz bus speed is an error resulting from the nonexistent model detection for these APUs; 200MHz is simply the default for Fam15h. Your APU has a 100MHz bus speed. The same goes for the voltages: they are computed with the wrong "big" steps and are therefore displayed way too low, but because of how the conversion math works you can still use the tool to set voltages if you know what you are doing, as you have found. So I strongly doubt the 200MHz bus speed is the reason here.
  4. The fork you referenced only adds a few things regarding the NB P-states, nothing really groundbreaking imho (it won't hurt though).
  5. The Linux stuff is interesting, but are you really sure it is stable at 2.5GHz? Thermals may also depend on the specific load, so it may be stable at 2.5GHz under one load but not another. C1E is not under our control as far as I know; it is defined via ACPI. The only things we may be able to mess around with are the CC6 and PC6 states, but these are mostly additional power-saving features.

Apologies if I make any mistakes. In my fuzzy understanding there are P-states, linked to NB P-states, which are in turn linked to memory states as in that LogioTek fork. Does anyone know of any code to figure out how Windows fiddles with the CPU speed via ACPI, and whether that can somehow be prevented?

You can try to set min and max CPU performance in the power profile to 100%. I doubt the stuff you observe comes from Windows though.