Blacksmith on non-Coffee Lake CPUs

hariv opened this issue 3 years ago · comments

Has anyone tried running Blacksmith on CPUs other than Coffee Lake?

I was able to run it successfully on Kaby Lake, but it didn't work on Comet Lake. It errors out immediately saying it could not find conflicting address sets and asks if the number of banks has been defined correctly (which I checked is correct).

Patrick Jattke · Answer 1 · Thu Nov 18 2021 16:11:01 GMT+0800 (China Standard Time)

Hi @hariv

The DRAM address functions are obtained from an i7-8700K (Coffee Lake). It is very likely that those functions are different on other CPUs. In this case, you would first need to reverse-engineer them (e.g., using DRAMA or TRRespass' DRAMA) and then update the DRAM addressing matrices in DRAMAddr.cpp.

Hari Venugopalan · Answer 2 · Thu Nov 18 2021 17:36:47 GMT+0800 (China Standard Time)

Got it. Thank you @pjattke.

Heechul Yun · Answer 3 · Sat Nov 20 2021 10:10:52 GMT+0800 (China Standard Time)

Hi @pjattke,
I got the following output from TRRespass' DRAMA on a Skylake machine. Can you explain how DRAMAddr.cpp should be modified to reflect this mapping? Thanks.

         Valid Function: 0x4080                  bits: 7 + 14
         Valid Function: 0x88000                 bits: 15 + 19
         Valid Function: 0x110000                bits: 16 + 20
         Valid Function: 0x220000                bits: 17 + 21
         Valid Function: 0x440000                bits: 18 + 22
         Valid Function: 0x4b300                 bits: 8 + 9 + 12 + 13 + 15 + 18
0x4080
0x88000
0x110000
0x220000
0x440000
0x4b300

DominikBucko · Answer 4 · Sat Nov 27 2021 01:44:42 GMT+0800 (China Standard Time)

Could we get a follow-up on this? I used the DRAMA tool as well and obtained the memory functions, but I don't know how to enter them into the code.

Patrick Jattke · Answer 5 · Mon Nov 29 2021 04:21:20 GMT+0800 (China Standard Time)

Dear @heechul and @DominikBucko,

I'm sorry for the late reply; I haven't had time to work on this yet. Within the next few days I'll provide you with a script that generates the addressing matrices that you can input into Blacksmith.

Thanks for your patience and understanding!

Patrick Jattke · Answer 6 · Mon Dec 06 2021 02:16:15 GMT+0800 (China Standard Time)

Dear @heechul and @DominikBucko,

Finally, I managed to find some time to update our DRAM addressing matrices script. Sorry for the delay.

You can find the script mat-gen.py in this gist. You can find a # TODO note showing the section that you need to edit. It should be enough if you fill out dram_fns and row_fn based on the information from DRAMA.

Regards,
Patrick

Heechul Yun · Answer 7 · Mon Dec 06 2021 23:54:38 GMT+0800 (China Standard Time)

Hi @pjattke

Thanks a lot for sharing the script. I have a question.

If I'm not mistaken, the output of the script differs slightly from the default configuration in the Blacksmith repository when dram_fns and row_fn in the script are set to match the known functions in the repository (i.e., dram_fns = [0x2040, 0x24000, 0x48000, 0x90000], row_fn = 0x3ffe0000). For example, for the single_rank configuration, DRAM_MTX[4] through DRAM_MTX[10] in the generated matrix are shifted right by 1 bit compared to the matrix in the code repository. Can you clarify this?
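To compare masks like these against the bit lists that DRAMA prints, it can help to expand each mask into its set bit positions. The following is a minimal sketch; mask_to_bits is a hypothetical helper name and is not part of mat-gen.py or Blacksmith.

```python
def mask_to_bits(mask):
    """Return the indices of the set bits in mask, lowest first."""
    return [i for i in range(mask.bit_length()) if (mask >> i) & 1]

# Values quoted in the thread for the i7-8700K (Coffee Lake).
dram_fns = [0x2040, 0x24000, 0x48000, 0x90000]
row_fn = 0x3ffe0000

for fn in dram_fns:
    print(hex(fn), "->", mask_to_bits(fn))
print("row bits:", mask_to_bits(row_fn))
```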

Thanks

Heechul.

Patrick Jattke · Answer 8 · Tue Dec 14 2021 20:35:19 GMT+0800 (China Standard Time)

Hi @heechul

To summarize what my colleague told me, who has implemented this part of Blacksmith:

It shouldn't matter all that much because changing one bit changes also the other one if you want to stay in the same bank, so it's either/or. If we assume we have the row/col bit overlapping with a bank function (multiple XORed bits), then having that row bit on the higher bit or the lower bit shouldn't matter since you can't change one without changing the other.

This is because the bank/rank functions on our CPU (i7-8700K) consist each of two bits that are combined by XOR. So if you change any of them, you will end up in a different bank.
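The XOR argument can be checked with a few lines of Python. This is only a sketch: the four masks are the ones quoted in this thread, while bank_index and the order in which the function results are packed into an index are assumptions made for illustration.

```python
# Bank/rank functions quoted in the thread for the i7-8700K.
BANK_FNS = [0x2040, 0x24000, 0x48000, 0x90000]

def parity(x):
    """XOR of all bits of x (1 if an odd number of bits is set)."""
    return bin(x).count("1") & 1

def bank_index(addr, fns=BANK_FNS):
    """Pack the result of each XOR function into one integer.
    The packing order is an arbitrary choice for this sketch."""
    idx = 0
    for i, fn in enumerate(fns):
        idx |= parity(addr & fn) << i
    return idx

addr = 0x12345678
one_bit = addr ^ (1 << 14)   # flip only bit 14 of the function 0x24000
both_bits = addr ^ 0x24000   # flip bits 14 and 17 together

print("original  :", bank_index(addr))
print("one bit   :", bank_index(one_bit))    # lands in a different bank
print("both bits :", bank_index(both_bits))  # stays in the same bank
```

Flipping a single bit of a two-bit function toggles that function's output, while flipping both bits leaves it unchanged, which is why it does not matter which of the two bits is treated as the row/column bit.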

However, coming back to your question: I cannot tell why the output is different (shifted by one bit). My colleague told me that he will look into this more once he finds time. Meanwhile, you can just try to use the output generated by mat-gen.py and report back if that worked for you.

In any case, I will try to replace this DRAM addressing part over the next couple of weeks with something that is easier to work with, as I recognize that the current solution is cumbersome.

Heechul Yun · Answer 9 · Thu Dec 16 2021 04:42:22 GMT+0800 (China Standard Time)

Hi @pjattke

Thanks for following this up.

I was able to obtain the same i7-8700K (Coffee Lake) processor and generate bit flips with the original code in the repository. I will see if I can also generate bit flips using the tables generated by mat-gen.py.
Below is my attempt to understand the addr <--> bank|col|row mapping tables. Can you confirm whether it is correct? The addressing part was, and still is, a bit tricky to understand, so it would be great if you could make it easier to follow.

struct MemConfiguration single_rank= {
..
 // bank_rank_functions = std::vector<uint64_t>({0x2040, 0x24000, 0x48000, 0x90000});
  .DRAM_MTX = { /* addr -> bank (4 bits) | col (13 bits) | row (13 bits) */
    0b000000000000000010000001000000, /* 0x02040 bank b3 = addr b6 + b13 */
    0b000000000000100100000000000000, /* 0x24000 bank b2 = addr b14 + b17 */
    0b000000000001001000000000000000, /* 0x48000 bank b1 = addr b15 + b18 */
    0b000000000010010000000000000000, /* 0x90000 bank b0 = addr b16 + b19 */
    0b000000000000000010000000000000, /* col b12 = addr b13 */
    0b000000000000000001000000000000, /* col b11 = addr b12 */
    0b000000000000000000100000000000, /* col b10 = addr b11 */
    0b000000000000000000010000000000, /* col b9 = addr b10 */
    0b000000000000000000001000000000, /* col b8 = addr b9 */
    0b000000000000000000000100000000, /* col b7 = addr b8*/
    0b000000000000000000000010000000, /* col b6 = addr b7 */
    0b000000000000000000000000100000, /* col b5 = addr b5 (not b6)*/
    0b000000000000000000000000010000, /* col b4 = addr b4*/
    0b000000000000000000000000001000, /* col b3 = addr b3 */
    0b000000000000000000000000000100, /* col b2 = addr b2 */
    0b000000000000000000000000000010, /* col b1 = addr b1 */
    0b000000000000000000000000000001, /* col b0 = addr b0*/
    0b100000000000000000000000000000, /* row b12 = addr b29 */
    0b010000000000000000000000000000, /* row b11 = addr b28 */
    0b001000000000000000000000000000, /* row b10 = addr b27 */
    0b000100000000000000000000000000, /* row b9 = addr b26 */
    0b000010000000000000000000000000, /* row b8 = addr b25 */
    0b000001000000000000000000000000, /* row b7 = addr b24 */
    0b000000100000000000000000000000, /* row b6 = addr b23 */
    0b000000010000000000000000000000, /* row b5 = addr b22 */
    0b000000001000000000000000000000, /* row b4 = addr b21 */
    0b000000000100000000000000000000, /* row b3 = addr b20 */
    0b000000000010000000000000000000, /* row b2 = addr b19 */
    0b000000000001000000000000000000, /* row b1 = addr b18 */
    0b000000000000100000000000000000, /* row b0 = addr b17 */
  },
  .ADDR_MTX =  { /* bank | col | row --> addr */
    0b000000000000000001000000000000, /* addr b29 = row b12 */
    0b000000000000000000100000000000, /* addr b28 = row b11 */
    0b000000000000000000010000000000, /* addr b27 = row b10 */
    0b000000000000000000001000000000, /* addr b26 = row b9 */
    0b000000000000000000000100000000, /* addr b25 = row b8 */
    0b000000000000000000000010000000, /* addr b24 = row b7 */
    0b000000000000000000000001000000, /* addr b23 = row b6 */
    0b000000000000000000000000100000, /* addr b22 = row b5 */
    0b000000000000000000000000010000, /* addr b21 = row b4 */
    0b000000000000000000000000001000, /* addr b20 = row b3 */
    0b000000000000000000000000000100, /* addr b19 = row b2 */
    0b000000000000000000000000000010, /* addr b18 = row b1 */
    0b000000000000000000000000000001, /* addr b17 = row b0 */
    0b000100000000000000000000000100, /* addr b16 = bank b0 + row b2 (addr b19) */
    0b001000000000000000000000000010, /* addr b15 = bank b1 + row b1 (addr b18) */
    0b010000000000000000000000000001, /* addr b14 = bank b2 + row b0 (addr b17) */
    0b000010000000000000000000000000, /* addr b13 = col b12 */
    0b000001000000000000000000000000, /* addr b12 = col b11 */
    0b000000100000000000000000000000, /* addr b11 = col b10 */
    0b000000010000000000000000000000, /* addr b10 = col b9 */
    0b000000001000000000000000000000, /* addr b9 = col b8 */
    0b000000000100000000000000000000, /* addr b8 = col b7 */
    0b000000000010000000000000000000, /* addr b7 = col b6 */
    0b100010000000000000000000000000, /* addr b6 = bank b3 + col b12 (addr b13)*/
    0b000000000001000000000000000000, /* addr b5 = col b5 */
    0b000000000000100000000000000000, /* addr b4 = col b4 */
    0b000000000000010000000000000000, /* addr b3 = col b3 */
    0b000000000000001000000000000000, /* addr b2 = col b2 */
    0b000000000000000100000000000000, /* addr b1 = col b1 */
    0b000000000000000010000000000000  /* addr b0 = col b0 */
  }
};

I found that my i5-6500 Skylake processor (the one I originally used before getting the Coffee Lake machine) has the same mapping as the Coffee Lake machine when one DIMM is plugged in.
Lastly, I tried to find a Tiger Lake machine's bank/rank mapping functions using TRRespass' DRAMA but was not successful. Do you have a Tiger Lake machine, and if so, were you able to reverse-engineer its mapping?

Thanks

Patrick Jattke · Answer 10 · Wed Dec 29 2021 20:45:04 GMT+0800 (China Standard Time)

Dear @heechul,

Thanks for your update.

I am glad to hear that you could reproduce bit flips on an i7-8700K (Coffee Lake). Do you already have results for the matrices generated by mat-gen.py? It would be helpful to know that so I can start integrating mat-gen.py more properly.
I can confirm that your reverse-engineered annotations are indeed correct. I added your annotations to the repo's code so that people will find the matrices easier to understand in the future. However, they ideally should not have to change anything in these matrices (except for replacing the whole matrix with the output of mat-gen.py in case they use a CPU with a different microarchitecture). As a first step towards making this easier, I have added a CPU model check in Blacksmith.cpp.
Regarding the Skylake address functions, I am sorry but I cannot help as we don't have any Skylake machines. However, there is some existing work, e.g., DRAMA (Fig. 4c and Table 2b) and work by Barenghi et al. (Fig. 4) that you may want to use to compare with. From what I see, it looks like the bank/rank functions are slightly different than on Coffee Lake.
No, I am sorry we do not have any Tiger Lake system.

Regards,
Patrick

Heechul Yun · Answer 11 · Thu Dec 30 2021 19:53:09 GMT+0800 (China Standard Time)

Hi @pjattke

Thanks for confirming the annotation.
I can report that we got bitflips with the mat-gen.py generated matrices.
Thanks for the pointers regarding Skylake mapping functions.

Happy new year!
Heechul

JKRde · Answer 12 · Fri Feb 11 2022 00:00:16 GMT+0800 (China Standard Time)

Hi Patrik,

For an i3-8350K system I have created a log with DRAMA, see attachment.
Unfortunately I don't know how to get the values for dram_fns and row_fn out of this information.
Could you explain how to determine these?

drama_output.log

Patrick Jattke · Answer 13 · Fri Feb 18 2022 23:44:12 GMT+0800 (China Standard Time)

Hi @JKRde,
Were you able to figure it out in the meantime, or do you need help? Basically, you need to take the bits that DRAMA found to be part of each mask, write each mask in hexadecimal, and then use the mat-gen.py script to translate the masks into the DRAM addressing matrices used by Blacksmith.
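The first two steps can be sketched in Python. The regular expressions below rely only on the "Valid Function:" and "Row mask:" line formats visible in the DRAMA outputs posted in this thread. The names parse_drama and bits_to_mask are hypothetical helpers, not part of DRAMA, TRRespass, or mat-gen.py.

```python
import re

HEX = r"(0x[0-9a-fA-F]+)"

def parse_drama(text):
    """Extract the function masks and the row mask from DRAMA's text output."""
    fns = [int(h, 16) for h in re.findall(r"Valid Function:\s*" + HEX, text)]
    row = re.search(r"Row mask:\s*" + HEX, text)
    return fns, (int(row.group(1), 16) if row else None)

def bits_to_mask(bits):
    """Build a mask from a list of bit positions, e.g. [7, 14] -> 0x4080."""
    mask = 0
    for b in bits:
        mask |= 1 << b
    return mask

# Shortened sample in the format DRAMA printed in this thread.
sample = """\
Valid Function: 0x4000    bits: 14
Valid Function: 0x80000   bits: 19
Valid Function: 0x42000   bits: 13 + 18
[LOG] - Row mask: 0xffff000000
"""

dram_fns, row_fn = parse_drama(sample)
print("dram_fns =", [hex(f) for f in dram_fns])
print("row_fn   =", hex(row_fn))
print(hex(bits_to_mask([13, 18])))
```

The resulting dram_fns and row_fn are the values that go into the # TODO section of mat-gen.py.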

JKRde · Answer 14 · Tue Feb 22 2022 21:19:59 GMT+0800 (China Standard Time)

Hi Patrik,

Unfortunately I have not yet managed to determine the dram_fns and row_fn values with TRRespass' DRAMA tool. Maybe you could give me a step-by-step guide for dummies ;-)

BR
Jens

SilentDawn · Answer 15 · Mon Apr 03 2023 09:47:05 GMT+0800 (China Standard Time)

Hi @pjattke ,
I ran DRAMA from the TRRespass repo and got the following DRAM mapping functions.

~~~~~~~~~~ Found Functions ~~~~~~~~~~
	 Valid Function: 0x8000 		 bits: 15 
	 Valid Function: 0x10000 		 bits: 16 
	 Valid Function: 0x20080 		 bits: 7 + 17 
	 Valid Function: 0x1000040 		 bits: 6 + 24 
	 Valid Function: 0x2200000 		 bits: 21 + 25 
	 Valid Function: 0x4400000 		 bits: 22 + 26 
	 Valid Function: 0x8800000 		 bits: 23 + 27 
	 Valid Function: 0x145140 		 bits: 6 + 8 + 12 + 14 + 18 + 20 
0x8000
0x10000
0x20080
0x1000040
0x2200000
0x4400000
0x8800000
0x145140
~~~~~~~~~~ Looking for row bits ~~~~~~~~~~
[LOG] - Set #0
[LOG] - 184716da80 - 18824693eb	 Time: 273 <== GOTCHA
[LOG] - 184716da80 - 18553ecda1	 Time: 270 <== GOTCHA
[LOG] - 184716da80 - 18160cf469	 Time: 270 <== GOTCHA
[LOG] - 184716da80 - 189352a594	 Time: 267 <== GOTCHA
[LOG] - 184716da80 - 180714b92f	 Time: 264 <== GOTCHA
[LOG] - Set #1
[LOG] - 1833714c40 - 18541b138a	 Time: 273 <== GOTCHA
[LOG] - 1833714c40 - 1899d90349	 Time: 276 <== GOTCHA
[LOG] - 1833714c40 - 18373f65f1	 Time: 279 <== GOTCHA
[LOG] - 1833714c40 - 1808cd7a0f	 Time: 279 <== GOTCHA
[LOG] - 1833714c40 - 1822712c20	 Time: 279 <== GOTCHA
[LOG] - Row mask: 0xffff800000 		 bits: 23 + 24 + 25 + 26 + 27 + 28 + 29 + 30 + 31 + 32 + 33 + 34 + 35 + 36 + 37 + 38 + 39 
0xffff800000

Next, I entered them into the script mat-gen.py as follows.

num_channels = 4
num_dimms = 16
num_ranks = 2
num_banks = 16

dram_fns = [0x8000, 0x10000, 0x20080, 0x1000040, 0x2200000, 0x4400000, 0x8800000, 0x145140]
row_fn = 0xffff800000
col_fn = 8192 - 1

However, mat-gen.py throws an error because https://gist.github.com/pjattke/b56baff62be77f16ad8d33376789be67#file-mat-gen-py-L56 requires a 30x30 square matrix, which my DRAMA result and the values I entered into mat-gen.py do not produce.
Is the 30x30 size enforced? My DRAMA result clearly does not fit it.

Patrick Jattke · Answer 16 · Thu Apr 06 2023 07:20:26 GMT+0800 (China Standard Time)

Hi @TheSilentDawn. Thanks for your interest in Blacksmith. Could you please provide us with some more information:

Which CPU are these functions from?
Do you really have a system equipped with 4 x 16 = 64 dual-rank DIMMs? The timing-based DRAMA cannot figure out the DIMM/channel functions.

The 30x30 constraint comes from the fact that we are using a superpage, and thus cannot control bit 30 or any higher bit. The matrix needs to be square and invertible (i.e., have full rank) so that we can compute the inverse translation matrix using linear algebra.
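As an illustration of why invertibility matters, here is a sketch of inverting such a matrix over GF(2) with Gauss-Jordan elimination. It is not the code from mat-gen.py. The matrix is rebuilt from the annotated single_rank DRAM_MTX earlier in this thread, and indexing the rows by output bit with the lowest bit first is this sketch's own convention (Blacksmith's arrays list the highest output bit first).

```python
def parity(x):
    return bin(x).count("1") & 1

def apply(rows, value):
    """Output bit k is the XOR of the input bits selected by rows[k]."""
    out = 0
    for k, row in enumerate(rows):
        out |= parity(row & value) << k
    return out

def invert_gf2(rows):
    """Invert a square bit matrix (one integer per row) over GF(2)."""
    n = len(rows)
    left = list(rows)
    right = [1 << k for k in range(n)]  # identity; becomes the inverse
    for col in range(n):
        pivot = next((r for r in range(col, n) if (left[r] >> col) & 1), None)
        if pivot is None:
            raise ValueError("matrix is singular over GF(2)")
        left[col], left[pivot] = left[pivot], left[col]
        right[col], right[pivot] = right[pivot], right[col]
        for r in range(n):
            if r != col and (left[r] >> col) & 1:
                left[r] ^= left[col]
                right[r] ^= right[col]
    return right

# Output layout bank (4) | col (13) | row (13), lowest output bit first.
row_bits = [1 << (17 + k) for k in range(13)]                  # row b0..b12
col_bits = [1 << b for b in list(range(6)) + list(range(7, 14))]  # col b0..b12
bank_fns = [0x90000, 0x48000, 0x24000, 0x2040]                 # bank b0..b3
dram_mtx = row_bits + col_bits + bank_fns

addr_mtx = invert_gf2(dram_mtx)
addr = 0x23456789 & ((1 << 30) - 1)
print(apply(addr_mtx, apply(dram_mtx, addr)) == addr)  # round trip
print(format(addr_mtx[6], "030b"))  # addr b6 expressed in bank|col|row bits
```

If any matrix row is a combination of the others, invert_gf2 raises an error, which corresponds to the "not invertible" failure discussed in this thread.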

Best
Patrick

SilentDawn · Answer 17 · Thu Apr 06 2023 11:21:03 GMT+0800 (China Standard Time)

Hi @pjattke ,
Thanks for your prompt reply.

I'm using Intel Xeon E5-2690 v3.
Yes, the server is equipped with 64 dual-rank DIMMs. However, to simplify the process, I unplugged all but one DIMM, whose configuration is below.

num_channels = 1
num_dimms = 1
num_ranks = 2
num_banks = 16

I reran DRAMA from TRRespass. The result is below.

root@ubuntu: ~/trrespass-master/drama/obj#./tester -s 16 -t 460 -o access.csv -v
...
~~~~~~~~~~ Found Functions ~~~~~~~~~~
	 Valid Function: 0x2000 		 bits: 13 
	 Valid Function: 0x200040 		 bits: 6 + 21 
	 Valid Function: 0x440000 		 bits: 18 + 22 
	 Valid Function: 0x880000 		 bits: 19 + 23 
	 Valid Function: 0x1100000 		 bits: 20 + 24 
0x2000
0x200040
0x440000
0x880000
0x1100000
~~~~~~~~~~ Looking for row bits ~~~~~~~~~~
[LOG] - Set #0
[LOG] - 3d6b5b940 - 3bb6a30f0	 Time: 413 <== GOTCHA
[LOG] - 3d6b5b940 - 39cb725cd	 Time: 407 <== GOTCHA
[LOG] - 3d6b5b940 - 3aa59e8af	 Time: 458 <== GOTCHA
[LOG] - 3d6b5b940 - 3e097750e	 Time: 458 <== GOTCHA
[LOG] - 3d6b5b940 - 3961e2b9a	 Time: 458 <== GOTCHA
[LOG] - Set #1
[LOG] - 3c86ac200 - 3e0485651	 Time: 458 <== GOTCHA
[LOG] - 3c86ac200 - 3e8c106cb	 Time: 458 <== GOTCHA
[LOG] - 3c86ac200 - 3b1f18208	 Time: 458 <== GOTCHA
[LOG] - 3c86ac200 - 3f53f1a9f	 Time: 458 <== GOTCHA
[LOG] - 3c86ac200 - 3ddf18e91	 Time: 458 <== GOTCHA
[LOG] - Row mask: 0xffff000000 		 bits: 24 + 25 + 26 + 27 + 28 + 29 + 30 + 31 + 32 + 33 + 34 + 35 + 36 + 37 + 38 + 39 
0xffff000000

As I understand it, dram_fns in mat-gen.py should be set to [0x2000, 0x200040, 0x440000, 0x880000, 0x1100000] and row_fn to 0xffff000000, following the TRRespass DRAMA result. However, I'm unsure what value col_fn should have. To get a 30x30 matrix, it would have to be 524288 - 1, but then mat-gen.py throws an error saying the matrix is not invertible.

SilentDawn · Answer 18 · Thu Apr 06 2023 15:21:29 GMT+0800 (China Standard Time)

Hi @pjattke ,
I also got another result on an Intel(R) Xeon(R) CPU E5-2690 with 1x16G DRAM, part number HMT42GR7BFR4A-PB.
The drama output is below.

xxx:~/trrespass-master/drama # ./obj/tester -s 8 -o ddr3.csv -v
~~~~~~~~~~ Found Functions ~~~~~~~~~~
         Valid Function: 0x4000                  bits: 14
         Valid Function: 0x80000                 bits: 19
         Valid Function: 0x42000                 bits: 13 + 18
0x4000
0x80000
0x42000
~~~~~~~~~~ Looking for row bits ~~~~~~~~~~
[LOG] - Set #0
[LOG] - 2d9cde780 - 2fe09de11    Time: 288 <== GOTCHA
[LOG] - 2d9cde780 - 2de7842b5    Time: 280 <== GOTCHA
[LOG] - 2d9cde780 - 2a4b8df0f    Time: 272 <== GOTCHA
[LOG] - 2d9cde780 - 2e74ac4b4    Time: 280 <== GOTCHA
[LOG] - 2d9cde780 - 2b10de59e    Time: 304 <== GOTCHA
[LOG] - Set #1
[LOG] - 2f0f163c0 - 2b636d892    Time: 280 <== GOTCHA
[LOG] - 2f0f163c0 - 2af73e536    Time: 304 <== GOTCHA
[LOG] - 2f0f163c0 - 29eb7593b    Time: 284 <== GOTCHA
[LOG] - 2f0f163c0 - 2d2d4c2c3    Time: 284 <== GOTCHA
[LOG] - 2f0f163c0 - 2f3074bc3    Time: 276 <== GOTCHA
[LOG] - Row mask: 0xffff000000           bits: 24 + 25 + 26 + 27 + 28 + 29 + 30 + 31 + 32 + 33 + 34 + 35 + 36 + 37 + 38 + 39
0xffff000000

Could you please help to explain what configuration should be in mat-gen.py?

Patrick Jattke · Answer 19 · Wed Apr 12 2023 05:28:04 GMT+0800 (China Standard Time)

Hi @TheSilentDawn. I'm sorry, but I do not have the resources to help with this further in the near future. There is a small chance that one of my students will have time to clean up mat-gen.py over the next weeks, but I cannot promise anything.

You will need to study the mat-gen.py carefully. It's basically just a translation matrix that it computes, so you need to have a square matrix (e.g., 30x30) with full rank (i.e., linearly independent rows). If this is not given, you either have the wrong functions (or row/column masks), or you need to augment it with "dummy" matrix rows (this would correspond to bits not involved in DRAM addressing).
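A quick way to test the full-rank condition before running mat-gen.py is to compute the rank over GF(2). The sketch below assumes the matrix has one row per DRAM function plus one row per set bit of the column and row masks, truncated to 30 bits. That layout mirrors the annotated DRAM_MTX in this thread, but it is an assumption and not taken from mat-gen.py itself. The names gf2_rank and build_rows are hypothetical.

```python
def gf2_rank(rows):
    """Rank of a bit matrix (one integer per row) over GF(2)."""
    rows = [r for r in rows if r]
    rank = 0
    while rows:
        pivot = rows.pop()
        rank += 1
        low = pivot & -pivot  # lowest set bit of the pivot row
        rows = [r ^ pivot if r & low else r for r in rows]
        rows = [r for r in rows if r]
    return rank

def unit_rows(mask, width):
    """One single-bit row per set bit of mask below position width."""
    return [1 << i for i in range(width) if (mask >> i) & 1]

def build_rows(dram_fns, row_fn, col_fn, width=30):
    """Assumed layout: functions, then column bits, then row bits."""
    keep = (1 << width) - 1
    fns = [fn & keep for fn in dram_fns]
    return fns + unit_rows(col_fn, width) + unit_rows(row_fn, width)

# Coffee Lake values quoted in the thread.
coffee = build_rows([0x2040, 0x24000, 0x48000, 0x90000], 0x3ffe0000, 8192 - 1)
# Single-DIMM Xeon E5-2690 v3 values posted in the thread.
xeon = build_rows([0x2000, 0x200040, 0x440000, 0x880000, 0x1100000],
                  0xffff000000, 524288 - 1)

for name, rows in (("coffee lake", coffee), ("xeon", xeon)):
    print(name, "rows:", len(rows), "rank:", gf2_rank(rows))
```

Under these assumptions the Coffee Lake values reach full rank, while the Xeon values come out one short: the function 0x2000 (bit 13) coincides with one of the assumed column bits, so two rows are identical. This points at the column mask as the thing to revisit, but it does not by itself tell you the correct value.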

I'm sorry that I cannot give you a more positive reply. I hope you understand. Good luck!

Luca Wilke · Answer 20 · Wed Apr 12 2023 14:58:59 GMT+0800 (China Standard Time)

@pjattke We have a student who is using Blacksmith in a project. As part of the project, he polished the address function import part of Blacksmith and will post a PR soon.