Returning to SimpleScalar and TLB

- September 29, 2017

The topic of my master's degree thesis is Virtual Memory. More specifically, it is about a low-power Translation Lookaside Buffer design. You can look at my previous posts for an introduction.

We have shown that more than 97% of virtual address reads go to the same page as the previous access. So putting the TLB in low-power mode is well motivated, provided the cost of changing the power mode is not too high.
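To make the "same page" measurement concrete, here is a minimal sketch of how such a ratio could be computed over an address trace. This is not the code used for the 97% figure: the function name `same_page_fraction`, the `PAGE_SHIFT` constant, and the 4 KiB page size are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 4 KiB pages, so the low 12 bits are the page offset. */
#define PAGE_SHIFT 12

/* Hypothetical helper: returns the fraction of accesses that fall on the same
 * virtual page as the access immediately before them. The first access has no
 * predecessor, so it is never counted as a same-page access. */
double same_page_fraction(const uint64_t *addrs, size_t n)
{
    size_t same = 0;

    assert(addrs != NULL || n == 0);
    if (n == 0)
        return 0.0;

    for (size_t i = 1; i < n; ++i) {
        if ((addrs[i] >> PAGE_SHIFT) == (addrs[i - 1] >> PAGE_SHIFT))
            ++same;
    }
    return (double)same / (double)n;
}
```

For a trace such as 0x1000, 0x1004, 0x1008, 0x2000, two of the four accesses stay on the page of their predecessor, so the function returns 0.5.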

The motivation is settled. What I need now is to calculate the latency caused by putting the TLB in low-power mode. After that, I need to add a mini-TLB and simulate that configuration. If the results are not better, I will conclude that the mini-TLB is not required.

Note: VirtualBox -> Settings -> Display -> Enable 3D Acceleration

On every instruction fetch, we check whether the power mode has changed:

    if (itlb) {
        /* Same page as the previous fetch: the I-TLB can be in low-power mode. */
        if (CACHE_TAGSET(itlb, IACOMPRESS(fetch_regs_PC)) == itlb->last_tagset) {
            /* A switch happens only if we were in normal mode before. */
            power_mode_changed = (low_power == 0);
            low_power = 1;
            ++low_power_count;
        } else {
            /* Different page: the I-TLB has to be in normal mode. */
            power_mode_changed = (low_power == 1);
            low_power = 0;
            ++normal_power_count;
        }

        tlb_lat = cache_access(itlb, Read, IACOMPRESS(fetch_regs_PC),
                               NULL, ISCOMPRESS(sizeof(md_inst_t)), sim_cycle, NULL, NULL);

        /* Charge the power-mode switch penalty (1 cycle, following [1]). */
        if (power_mode_changed) {
            tlb_lat += 1;
        }
    }

So if the power mode changes, we add 1 clock cycle to the latency, because according to Zhao et al. the switch costs approximately 1 cycle [1].
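The same logic can be tried outside the simulator. The sketch below is a stand-alone model, not SimpleScalar code: `struct tlb_power_stats`, `simulate_power_mode`, and `PAGE_SHIFT` are hypothetical names, pages are assumed to be 4 KiB, and page numbers are compared directly instead of using `CACHE_TAGSET` and `last_tagset`. It also assumes the TLB starts in normal mode.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 12  /* assumption: 4 KiB pages */

/* Hypothetical counters mirroring the ones in the simulator snippet. */
struct tlb_power_stats {
    uint64_t low_power_count;    /* fetches served in low-power mode */
    uint64_t normal_power_count; /* fetches served in normal mode */
    uint64_t switches;           /* number of power-mode changes */
    uint64_t penalty_cycles;     /* switches * switch_latency */
};

/* Walks a trace of fetch addresses and applies the same rule as above:
 * same page as the previous fetch -> low-power mode, otherwise normal mode.
 * Every change of mode costs switch_latency extra cycles. */
struct tlb_power_stats simulate_power_mode(const uint64_t *pcs, size_t n,
                                           unsigned switch_latency)
{
    struct tlb_power_stats s = {0, 0, 0, 0};
    int low_power = 0;      /* start in normal mode */
    int have_last = 0;      /* no previous page before the first fetch */
    uint64_t last_page = 0;

    assert(pcs != NULL || n == 0);

    for (size_t i = 0; i < n; ++i) {
        uint64_t page = pcs[i] >> PAGE_SHIFT;
        int want_low = have_last && page == last_page;

        if (want_low != low_power) {
            ++s.switches;
            s.penalty_cycles += switch_latency;
        }
        low_power = want_low;

        if (low_power)
            ++s.low_power_count;
        else
            ++s.normal_power_count;

        last_page = page;
        have_last = 1;
    }
    return s;
}
```

Passing a larger `switch_latency` (10, 25, 75, 100) corresponds to replacing the `tlb_lat += 1` penalty with a larger constant. The model only counts penalty cycles; it does not reproduce how the out-of-order pipeline overlaps or amplifies them.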

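After running SPEC2000 programs with and without power-mode changes, I calculated the cost of changing the power mode. The increase in total cycles depends on the latency caused by switching the power mode. In the results below, "normal" is the total cycle count without power-mode changes, "power mode" is the total cycle count with them, and "increment" is the relative increase.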
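The relative increase in total cycles is (power mode - normal) / normal, expressed as a percentage. A small helper for that arithmetic might look like this; the function name `cycle_increase_percent` is made up for this post and is not part of SimpleScalar.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: relative increase in total cycles, in percent.
 * normal_cycles is the run without power-mode changes,
 * power_mode_cycles is the run with the switch penalty applied. */
double cycle_increase_percent(uint64_t normal_cycles, uint64_t power_mode_cycles)
{
    assert(normal_cycles > 0);
    return 100.0 * ((double)power_mode_cycles - (double)normal_cycles)
                 / (double)normal_cycles;
}
```

For example, with normal = 706604 and power mode = 873556 the helper returns roughly 23.6.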

GZIP

Latency: 1 cycle
normal: 580713144
power mode: 626491951
increment: 7.9%

Latency: 10 cycles
normal: 580713144
power mode: 907701677
increment: 56.3%

Latency: 25 cycles
normal: 580713144
power mode: 1388998717
increment: 139.2%

Latency: 75 cycles
normal: 580713144
power mode: 2994729275
increment: 415.7%

Latency: 100 cycles
normal: 580713144
power mode: 3797723125
increment: 554.0%

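CRAFTY

Latency: 0 cycles, total cycles: 800535108
Latency: 1 cycle, total cycles: 864566362
Latency: 10 cycles, total cycles: 1256704284
Latency: 25 cycles, total cycles: 1963406203
Latency: 75 cycles, total cycles: 4325626411
Latency: 100 cycles, total cycles: 5507522361

PARSER

Latency: 0 cycles, total cycles: 651830989
Latency: 1 cycle, total cycles: 682037147
Latency: 10 cycles, total cycles: 964905768
Latency: 25 cycles, total cycles: 1483275240
Latency: 75 cycles, total cycles: 3228328486
Latency: 100 cycles, total cycles: 4101332934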

MESA

Latency: 1 cycle
normal: 706604
power mode: 710210
increment: 0.5%

Latency: 10 cycles
normal: 706604
power mode: 873556
increment: 23%

Latency: 25 cycles
normal: 706604
power mode: 1153888
increment: 63%

Latency: 75 cycles
normal: 706604
power mode: 2092983
increment: 196%

Latency: 100 cycles
normal: 706604
power mode: 2562917
increment: 262%

Latency: 8000 cycles
normal: 706570
power mode: 151051283
increment: 21278% (about 212 times the normal cycle count)

MCF

Latency: 1 cycle
normal: 16270
power mode: 17236
increment: 5%

Latency: 10 cycles
normal: 16270
power mode: 21796
increment: 33%

Latency: 25 cycles
normal: 16270
power mode: 29826
increment: 83%

Latency: 75 cycles
normal: 16270
power mode: 58668
increment: 260%

Latency: 100 cycles
normal: 16270
power mode: 73243
increment: 350%

BZIP2

Latency: 1 cycle
normal: 562851
power mode: 564804
increment: 0.34%

Latency: 10 cycles
normal: 562851
power mode: 573788
increment: 1.9%

Latency: 25 cycles
normal: 562851
power mode: 590123
increment: 4.8%

Latency: 75 cycles
normal: 562851
power mode: 647554
increment: 15%

Latency: 100 cycles
normal: 562851
power mode: 676534
increment: 20%

PARSER

Latency: 1 cycle
normal: 15593
power mode: 16596
increment: 6%

Latency: 10 cycles
normal: 15593
power mode: 20507
increment: 31%

Latency: 25 cycles
normal: 15593
power mode: 27416
increment: 75%

Latency: 75 cycles
normal: 15593
power mode: 52106
increment: 234%

Latency: 100 cycles
normal: 15593
power mode: 64581
increment: 314%

VPR

Latency: 1 cycle
normal: 19343
power mode: 20829
increment: 7%

Latency: 10 cycles
normal: 19343
power mode: 27955
increment: 44%

Latency: 25 cycles
normal: 19343
power mode: 40560
increment: 109%

Latency: 75 cycles
normal: 19343
power mode: 83936
increment: 333%

Latency: 100 cycles
normal: 19343
power mode: 105786
increment: 446%

EQUAKE

Latency: 1 cycle
normal: 22668
power mode: 24636
increment: 8%

Latency: 10 cycles
normal: 22668
power mode: 34961
increment: 54%

Latency: 25 cycles
normal: 22668
power mode: 52939
increment: 133%

Latency: 75 cycles
normal: 22668
power mode: 114399
increment: 404%

Latency: 100 cycles
normal: 22668
power mode: 145374
increment: 541%

[1] Zhao, Lei, et al. "A Leakage Efficient Instruction TLB Design for Embedded Processors." IEICE Transactions on Information and Systems 94.8 (2011): 1565-1574.
