Stabilizing your memory overclock on Core i7 platform - Back-to-Back Cas Delay Investigated

Overclocking/OC-Team.be by massman @ 2009-06-30

In today's short article, we'll have a look at the effect of just one memory timing, the Back-to-Back CAS Delay, which seems to be one of the more important timings both performance- and stability-wise. If you're looking for a solution for your high-frequency capable memory kit, a performance increase with low-frequency memory, or a way to stabilize high BCLK/high memory ... this is something for you.

Findings and data

Data on performance:

Since we already know that this particular timing influences the stability of the memory overclock, we now focus on its effect on performance. As you all know, the higher a memory timing is set, the lower the performance. Why? To keep it simple: the higher the value, the longer you have to wait before the command controlled by that timing can be issued. The lower the value, the shorter the waiting period, and thus the sooner the command is issued.
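
To put rough numbers on that waiting period, here is a minimal Python sketch. It assumes the timing is counted in memory clock cycles and that the clock runs at half the DDR3 data rate (so DDR3-1600 means an 800 MHz clock). The function name is made up for illustration, and the values are plain arithmetic, not measurements from this article.

```python
def timing_delay_ns(data_rate_mts: float, cycles: int) -> float:
    """Convert a timing given in clock cycles to nanoseconds.

    Assumptions (not stated in the article): DDR memory, so the clock
    runs at half the data rate, and the timing is counted in memory
    clock cycles.
    """
    clock_mhz = data_rate_mts / 2.0   # DDR3-1600 -> 800 MHz clock
    cycle_ns = 1000.0 / clock_mhz     # 800 MHz -> 1.25 ns per cycle
    return cycles * cycle_ns


# The B2B values compared in the graphs, at DDR3-1600.
for b2b in (6, 10, 12):
    print(f"DDR3-1600, B2B {b2b}: {timing_delay_ns(1600, b2b):.2f} ns")
```

Under these assumptions, going from 6 to 12 doubles the wait from 7.50 ns to 15.00 ns.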

We used SuperPi 32M and Lavalys Everest to show the performance differences.

Madshrimps (c)


As you can see, a good B2B-setting can mean the difference between a good and a very bad 32M result.

Quite a spectacular decrease in the memory read bandwidth going from 6 to 12 or even 10 to 12.

The memory write bandwidth decrease seems to be more subtle than what we saw in the graph above, but going from 10 to 12 still has quite a big effect on the performance.

Very big decrease in performance, once again!

As you can see, this timing has no effect on the latency of the memory; it only affects the bandwidth.

Findings:

When going over the different graphs, it's more than clear that this timing has a dramatic effect on the performance. More specifically, on the memory bandwidth. Below you'll find an overview of the effect of the different aspects of tuning the memory subsystem on the different benchmarks.

  • Memory frequency: 1600CL7 versus 2000CL7
  • Cas Latency: 2000CL9 versus 2000CL7
  • Back-to-Back Cas Delay: 1600CL7_12 versus 1600CL7_6

    Basically, I calculated the effect of changing one of the three variables of the tests in the section above. Each set of bars represents the effect of the three variables in a certain benchmark. The longer the bar, the more effect a variable has in that specific test.
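
The article does not say exactly how the effect was computed. A percentage difference between the two configurations of each pair is one plausible choice, sketched below in Python. The function name and the example score are hypothetical placeholders, not results from the tests above.

```python
def relative_effect(baseline: float, changed: float) -> float:
    """Percentage change of `changed` relative to `baseline`.

    Assumed metric: the article does not state its formula.
    """
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (changed - baseline) * 100.0 / baseline


# Placeholder scores only, NOT measurements from the article:
# a benchmark scoring 100 with one setting and 110 with the other.
print(f"{relative_effect(100.0, 110.0):+.1f}%")
```

Applied to each pair (1600CL7 versus 2000CL7, 2000CL9 versus 2000CL7, 1600CL7_12 versus 1600CL7_6) per benchmark, this would give one bar per variable.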

    Another way of looking at this would be to find a match of low-clock and high-clock settings in terms of performance:

    The most interesting result is of course the 'LE - copy' result, as 1600CL8 can outperform 2000CL7. Who needs high-frequency memory anyway?
    Comment from Massman @ 2009/07/07
    Seems like some people have problems with certain graphs in the article, here's a quick FAQ.

    Q: "On page 1 it sounds like no matter what setting you use, its not stable."
    A: No, what I say on page 1 is that one of the weird characteristics of the issue is that the instability doesn't scale. So raising the timing by one doesn't by definition give you a more stable system. I tried from 0 up to 10 and every single setting was equally unstable: 2M no problem, 4M crash after 1 or 2 loops. The non-scaling characteristic also shows at the point where it gets stable: at 11 I couldn't do 4M, at 12 I could do anything.

    Q: "So you mean, at some point losening up b2b doesnt improve stability, that memory speed is simply unstable and changing b2b doesnt change that."
    A: That's something I forgot to mention in the article, well, at least forgot to mention explicitly. It's indeed true that the instability is only under load ... and apparently not under all load. As the article states: 2M was completely stable, copying files also ... 4M crashed after 2 loops or so.

    But changing the B2B value definitely helps to get more stability, no doubt about that. The problem is that it doesn't scale like you'd expect. For instance, you can increase tRAS by one and make the system more stable. With B2B, that's not the case: it's either fully stable or half 'n' half.

    E.g.:

    0 - unstable
    4 - unstable
    8 - unstable
    9 - unstable
    10 - stable
    11 - stable

    So, 0 would be equally unstable as 9.
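
Because the behaviour is a step rather than a gradual slope, the useful number is the threshold itself. A minimal Python sketch for finding it from a set of test results, using the example values above (the function name is made up for illustration):

```python
from typing import Dict, Optional


def stability_threshold(results: Dict[int, bool]) -> Optional[int]:
    """Return the lowest tested B2B value from which every higher
    tested value was stable, or None if the highest one was unstable."""
    threshold = None
    # Walk down from the loosest setting and stop at the first failure.
    for value in sorted(results, reverse=True):
        if not results[value]:
            break
        threshold = value
    return threshold


# The example from the FAQ answer above: True means stable.
observed = {0: False, 4: False, 8: False, 9: False, 10: True, 11: True}
print(stability_threshold(observed))  # 10
```

Values between two tested settings are not inferred, so the result is only as fine-grained as the settings you actually tried.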

    Q: "I don't understand the Performance scaling graph on page two"
    A: I have had a colleague ask me about the graph as well, haha. Basically, I calculated the effect of changing each variable in the different tests. The longer the bar, the more effect a certain variable has in that particular test. It's a different representation of the five other performance charts. I thought it would be clearer, but apparently the opposite is true. So, for instance, in the Everest-Copy benchmark, the B2B timing has the most effect.

    Q: "I find the title very misleading, i like the article but its not at all what i expected to find with that title"
    A: Sorry about that
    Comment from leeghoofd @ 2009/08/04
    Felix, the programmer of CPU-Tweaker and MemSet, has added two extra settings to tune the B2B setting and the Idle Cycle limit.

    Grab the program at his website :

    CPU Tweaker Website




    Sadly it can't be used on Asus mobos (MC_CFC_space_locked), as the program has no access to the locked BIOS (but at least you know what value is preset).
    Gigabyte, EVGA, DFI and Biostar mobos allow the tool to change the BIOS setting from Windows. We will have to confirm with MSI, and hopefully Asus will grant access to the BIOS.
    Comment from Kougar @ 2009/08/18
    Well, thanks to a recent BIOS update, CPU-Tweaker does detect and does modify the B2B setting on Gigabyte boards, and now offers the setting in the BIOS

    Gigabyte does not use a "0" setting (0 sets it to "Auto"). Setting a "4" in the BIOS results in a "5" shown by CPU-Tweaker; setting a "3" in the BIOS results in CPU-Tweaker showing a "4", so CPU-Tweaker is always +1.

    CPU-Tweaker does not offer settings below a "4", max is "32". (32 does work) CPU-Tweaker lists the board default setting as "Disabled", however, a setting of 1 or 2 in the BIOS also shows as "Disabled", yet performance tests seem to indicate a 5-7 range so I'm not sure.
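
For anyone converting between the two readings, the +1 offset described above can be written as a small Python helper. This is only a sketch based on the observations in this comment (one Gigabyte board, CPU-Tweaker range 4 to 32); the function name is hypothetical and other boards or program versions may behave differently.

```python
def bios_value_from_cpu_tweaker(shown: int) -> int:
    """Translate a B2B value shown by CPU-Tweaker to the BIOS value,
    assuming the reported 'CPU-Tweaker is always +1' offset."""
    # CPU-Tweaker reportedly offers 4 through 32 only; readings such as
    # "Disabled" are not covered here because their meaning is unclear.
    if not 4 <= shown <= 32:
        raise ValueError("CPU-Tweaker reportedly offers only 4 to 32")
    return shown - 1


print(bios_value_from_cpu_tweaker(5))  # 4
```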

    Might be the board, but HyperPI times were not very precise and differed from setting to setting. Setting a B2B of "2" yielded an outlier result of 13 minutes 25.1 seconds for a 32M HyperPI run; settings of "1" and "3" were in the low 40s range. Everest was even less precise. B2B had no discernible effect on latency at all.

    HyperPI would only lock with settings of 1 or 2. But if I closed and reran HyperPi it would complete the second attempt just fine. (Was how it generated that unusually low time). System never hung despite some brief LinX, SuperPi, and Prime95 runs.

    RAM is OCZ Platinum 6GB 1.66v




    Edit: Looks like I screwed up my testing, appears EIST and C1E were left on. : / Will retest it all later, the results are much more consistent now at least.
    Comment from Kougar @ 2009/08/20
    Okay, had some time to work this out. As I edited my post above, those results were with the power saving features I left on by mistake.


    HyperPI does randomly lock regardless of what B2B setting is in use. So far as long as I start HyperPi, run a quick test, then shut down and restart HyperPI a second time it has stopped locking up. (I do a 16k run, exit, restart it before each 32M test)

    I've tested every setting from 8 down to 1 (the lowest). "2" yielded the best HyperPI 32M time of 13:34.674, "1" SuperPI 32M of 8:44.971

    Now that the results are much more consistent I'm very sure the board defaults to "3", which as I noted above CPU-Tweaker claims is "4".
    Comment from leeghoofd @ 2009/08/20
    Thx for testing. Like we also mentioned, B2B hardly influences latency, it's purely bandwidth related... When I tested the B2B my rig would hardlock, only a hard reset was possible... So I couldn't shut down HyperPI and rerun it... The lockup problem is less noticeable at 1600MHz, but once around the 1900-2000 region, this setting is a MUST...

    Again thx for testing mate
    Comment from Kougar @ 2009/08/20
    I finally did get my first hardlock at "2", so I'm letting the board default to "3". At "2" the board passed HyperPi 32M, two hours of Prime95, but locked 15 minutes into LinX. So far Gigabyte looks to be pretty aggressive with the memory subtimings, I'm pretty happy with it.

    Let me know if there was anything specific you were looking for or interested in checking.

     
