Stabilizing your memory overclock on the Core i7 platform - Back-to-Back Cas Delay Investigated

Overclocking/ by massman @ 2009-06-30

In today's short article, we'll take a look at the effect of just one memory timing, the Back-to-Back Cas Delay, which seems to be one of the more important timings both performance- and stability-wise. If you're looking for a way to stabilize a high-frequency-capable memory kit, a performance increase with low-frequency memory, or a stable high BCLK/high memory frequency combination ... this is something for you.




It has been a while since my last article, for which I blame examinations at school and, of course, the Gigabyte Open Overclocking Competition, which eventually led me to the great city of Taipei. However, since I now have a few weeks of vacation, I'm able to present some of the findings of the Madshrimps team regarding the Core i7 platform.

Maybe you remember the X58 motherboard round-up, published four months ago already, close to the release of the Core i7 platform. Having 7+ X58 motherboards at home (I was sent a couple more after the round-up) gave me and my colleagues the opportunity to dig deeper into the different aspects of overclocking an i7 processor. In fact, we focused on three different aspects:

  • BCLK frequency
  • Memory frequency
  • The combination of BCLK and memory frequency

In this article, we'll spend time on one setting that seems to give us the best solution when running into either stability or performance problems while overclocking the memory. It seems to be one of the key elements to stabilizing the memory at high frequencies.

    Back-to-Back Cas Delay


The setting I'm talking about is a memory timing called "Back-to-Back Cas Delay" (referred to as B2B in this article). To explain why this particular timing is so interesting, I need to tell you the story of how we figured all this out, since it's vital to understanding its impact on stability. First of all, this timing seems to be available on Asus motherboards only. Why? We don't know, but on the Asus motherboards it defaults to "0" when set to Auto in the BIOS. But before we dig in deeper: the background story.

To start with, I have to explain a concept named "low-clock challenges", which is mainly used within the enthusiast community to work out a software/hardware configuration that is as efficient as possible. Basically, you limit the maximum frequency of the CPU and then try to get the best possible score in a certain benchmark, in this case SuperPi 32M. This allows overclockers to compare their tweaking skills and, if necessary, figure out hardware-related performance problems.

Now, when tweaked properly, a 4GHz Core i7 combined with 1GHz CL7 (2000 CL7) memory will give you a time of around 8 minutes 50 seconds in the 32M benchmark. However, certain people seem to be able to run at least 10 seconds faster, which could not be explained by software tweaks alone. To keep the story short: it occurred to me that almost all of the fast configurations were Asus-equipped.

On the Madshrimps forums, I wrote quite a lengthy post regarding my performance issues in the 32M low-clock challenge. At the initial stages, I believed the hardware prefetcher options in the BIOS were causing this, since the Asus motherboards were the only ones that had them available. However, after testing both the Rampage II Gene and the Foxconn Bloodrage (a newer BIOS added the prefetcher option), I know that those two prefetchers are not the cause. The best way to read that post is simply as evidence that there's a significant clock-per-clock performance difference between certain motherboards.
    (Link: 32M and i7, a motherboard's choice?)

The second part of the story is actually located in another thread, but since the information is spread over numerous forums, I'll give you the short version. Basically, Jody (3oh6) and I were testing the Rampage II Gene at the same time, and both our motherboards were incapable of running an Elpida-based memory kit over 930MHz, no matter the voltage or cas latency. After a few hours (okay, 10+) I found a reasonably stable configuration by playing with the Back-to-Back Cas Delay timing. There are a couple of weird issues with this timing, though.

1) The instability caused by this timing doesn't scale as you'd expect. It's equally unstable set at "0" as it is set at "10", whereas you'd expect it to be more unstable at a lower value.
2) The instability itself is not typical memory instability: instead of getting a BSOD or a reboot, the system just locks up.
3) The instability seems to be very particular to certain benchmarks. I'm perfectly capable of running SuperPi 2M as much as I want, and even copying large files doesn't cause the system to lock up, but after 2 seconds of SuperPi 4M the system hangs.
4) In a dual-channel configuration, there's absolutely no issue whatsoever with this timing: a value of "0" is perfectly stable.
5) The issue cannot be solved by increasing the voltage or loosening the timings. The setting "10" is as unstable at 1.8v Vdimm and 1.6v Vqpi/dram as it is at 1.65/1.50v.

A fellow overclocker tested the influence of this timing on 32M performance: changing the value from 0 to 12 makes the benchmark 1 second slower by loop 4. Taking into account that the benchmark has 24 loops in total, that would mean that changing this timing costs you about 6 seconds, maybe 7. And, coincidence or not, that is almost exactly what I'm losing compared to the best scores. This "finding" also occurs on other motherboards; initially I was using the DFI Lanparty DK X58-T3eH6 for testing purposes, and again I had times around 8 minutes 50 seconds. But this board gives me perfect 200/2000 stability out of the box ... so I assume the Back-to-Back Cas Delay timing has been set to 12 by default on this motherboard to maximize stability.
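The extrapolation above is simple arithmetic; here's a quick sketch of it (assuming the per-loop penalty stays roughly constant over the run, which is only an approximation):

```python
# Estimate the total 32M time lost when raising B2B from 0 to 12,
# using the figures from the text: ~1 second slower by loop 4,
# extrapolated over the 24 loops of a full SuperPi 32M run.
loops_total = 24
loops_measured = 4
seconds_lost_so_far = 1.0

per_loop_penalty = seconds_lost_so_far / loops_measured  # 0.25 s per loop
total_loss = per_loop_penalty * loops_total              # ~6 s over the run
print(f"~{total_loss:.0f} seconds lost over {loops_total} loops")
```

Which lands right on the ~6 second figure, close to the ~10 second gap to the best scores.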

In short: the Back-to-Back Cas Delay timing has a significant effect on the stability of your memory overclock, but it seems to have an effect on performance as well. On the next page, we try to find out just how much.
    Comment from Massman @ 2009/07/07
    Seems like some people have problems with certain graphs in the article, here's a quick FAQ.

    Q: "On page 1 it sounds like no matter what setting you use, its not stable."
A: No, what I say on page 1 is that one of the weird characteristics of the issue is that the instability doesn't scale. Raising the timing by one doesn't necessarily give you a more stable system. I tried everything from 0 up to 10 and every single setting was equally unstable: 2M no problem, 4M crash after 1 or 2 loops. The non-scaling characteristic also shows at the point where it becomes stable: at 11 I couldn't do 4M, at 12 I could do anything.

Q: "So you mean, at some point loosening up B2B doesn't improve stability, that memory speed is simply unstable and changing B2B doesn't change that?"
A: That's something I forgot to mention in the article, or at least forgot to mention explicitly. It's indeed true that the instability only appears under load ... and apparently not under all loads. As the article states: 2M was completely stable, copying files too ... 4M crashed after 2 loops or so.

But changing the B2B value definitely helps to get more stability, no doubt about that. The problem is that it doesn't scale like you'd expect. tRAS, for instance, you can increase by one to make the system a bit more stable. With B2B, that's not the case: it's either fully stable or not at all.


    0 - unstable
    4 - unstable
    8 - unstable
    9 - unstable
    10 - stable
    11 - stable

    So, 0 would be equally unstable as 9.
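In other words, the behaviour is a step function rather than a gradual scale. A small sketch to illustrate (the threshold of 10 is taken from the example list above and is purely illustrative; on other boards or memory kits it will differ):

```python
# Illustration of the non-scaling behaviour: below a threshold every
# B2B value is equally unstable, and at or above it everything is
# stable. The threshold here is hypothetical and board/memory dependent.
B2B_STABLE_THRESHOLD = 10

def is_stable(b2b: int) -> bool:
    """Step function: stability does not improve gradually with B2B."""
    return b2b >= B2B_STABLE_THRESHOLD

results = {b2b: is_stable(b2b) for b2b in range(0, 13)}
# 0 is exactly as unstable as 9; 10, 11 and 12 are all fully stable.
```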

    Q: "I don't understand the Performance scaling graph on page two"
A: I have had a colleague ask me about that graph as well, haha. Basically, I calculated the effect of changing each variable in the different tests. The longer the bar, the more effect a certain variable has in that particular test. It's a different representation of the 5 other performance charts. I thought it would be clearer, but apparently the opposite is true. So, for instance, in the Everest-Copy benchmark, the B2B timing has the most effect.

Q: "I find the title very misleading, I like the article but it's not at all what I expected to find with that title"
A: Sorry about that!
    Comment from leeghoofd @ 2009/08/04
Felix, the programmer of CPU Tweaker and MemSet, has added two extra settings to tune the B2B setting and the Idle Cycle Limit.

Grab the program at his website:

    CPU Tweaker Website

Sadly it can't be used on Asus mobo's (MC_CFC_space_locked) as the program has no access to the locked BIOS (but at least you know what value is preset).
Gigabyte, EVGA, DFI and Biostar mobo's allow the tool to change the BIOS setting from Windows. We still have to confirm with MSI, and hopefully Asus will grant access to the BIOS.
    Comment from Kougar @ 2009/08/18
Well, thanks to a recent BIOS update, CPU-Tweaker now detects and modifies the B2B setting on Gigabyte boards, and the setting is now offered in the BIOS as well.

Gigabyte does not use a "0" setting (0 sets it to "Auto"). Setting "4" in the BIOS results in a "5" shown by CPU-Tweaker; setting "3" results in CPU-Tweaker showing "4", so CPU-Tweaker is always +1.
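That off-by-one can be written down as a tiny helper; note that the mapping is inferred purely from the observations above, not from any CPU-Tweaker or Gigabyte documentation:

```python
# Inferred mapping between the Gigabyte BIOS B2B value and the value
# CPU-Tweaker displays: the tool always reports the BIOS value + 1.
def cputweaker_shown(bios_value: int) -> int:
    if bios_value < 1:
        # "0" in the Gigabyte BIOS means Auto, not a real B2B value
        raise ValueError('"0" means Auto, not a real B2B value')
    return bios_value + 1

print(cputweaker_shown(4))  # BIOS "4" -> CPU-Tweaker shows 5
print(cputweaker_shown(3))  # BIOS "3" -> CPU-Tweaker shows 4
```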

CPU-Tweaker does not offer settings below "4"; the max is "32" (32 does work). CPU-Tweaker lists the board's default setting as "Disabled"; however, a setting of 1 or 2 in the BIOS also shows as "Disabled", yet performance tests seem to indicate a 5-7 range, so I'm not sure.

It might be the board, but HyperPI times were not very precise and differed from setting to setting. A B2B of "2" yielded an outlier result of 13 minutes 25.1 seconds for a 32M HyperPi run; settings of "1" and "3" were in the low 13:40s. Everest was even less precise. B2B had no discernible effect on latency at all.

HyperPI would only lock with settings of 1 or 2. But if I closed and reran HyperPi, it would complete the second attempt just fine (which was how it generated that unusually low time). The system never hung despite some brief LinX, SuperPi, and Prime95 runs.

    RAM is OCZ Platinum 6GB 1.66v

Edit: Looks like I screwed up my testing; it appears EIST and C1E were left on. :/ I will retest it all later; at least the results are much more consistent now.
    Comment from Kougar @ 2009/08/20
Okay, I had some time to work this out. As I noted in my edit above, those results were taken with the power-saving features left on by mistake.

HyperPI does randomly lock regardless of which B2B setting is in use. So far, as long as I start HyperPi, run a quick test, then shut it down and restart it a second time, it has stopped locking up. (I do a 16k run, exit, and restart it before each 32M test.)

I've tested every setting from 8 down to 1 (the lowest). "2" yielded the best HyperPI 32M time of 13:34.674, and "1" the best SuperPI 32M of 8:44.971.

    Now that the results are much more consistent I'm very sure the board defaults to "3", which as I noted above CPU-Tweaker claims is "4".
    Comment from leeghoofd @ 2009/08/20
Thx for testing. Like we also mentioned, B2B hardly influences latency; it's purely bandwidth-related. When I tested the B2B, my rig would hardlock and only a hard reset was possible, so I couldn't shut down HyperPi and rerun it. The lockup problem is less noticeable at 1600MHz, but once you're in the 1900-2000MHz region, this setting is a MUST.

    Again thx for testing mate
    Comment from Kougar @ 2009/08/20
I finally did get my first hardlock at "2", so I'm letting the board default to "3". At "2" the board passed HyperPi 32M and two hours of Prime95, but locked 15 minutes into LinX. So far Gigabyte looks to be pretty aggressive with the memory subtimings; I'm pretty happy with it.

    Let me know if there was anything specific you were looking for or interested in checking.