Madshrimps Forum Madness

[MASS] - UCLK: technical limitations or useful marketing tool? (https://www.madshrimps.be/vbulletin/f10/mass-uclk-technical-limitations-useful-marketing-tool-65573/)

jmke 13th November 2009 20:01

edited my post:)

Massman 13th November 2009 20:02

Below is an email I sent to one of my contacts.
Quote:

Basically, my problem originated with a screenshot I saw of a Gulftown running the uncore frequency at 1,5x the memory frequency, which goes against everything I've learned so far about Nehalem. With Bloomfield and Lynnfield, the minimum uncore multipliers make sense because the 192-bit and 128-bit memory buses have to be addressed through a 96-bit uncore-CPU interconnect. So, with Bloomfield limited to 2x uncore per DRAM clock, you can transfer all data in 1 DRAM clock (2 x 96 = 192 bit). With Lynnfield limited to 1,5x uncore per DRAM clock, you can transfer all data in 2 DRAM clocks (so 3 uncore clocks).

With Gulftown, this doesn't make any sense.

There's still triple channel, so 192-bit width, so the uncore-CPU interconnect has to be wider for all data to be transferred with the uncore running at only 1,5x the DRAM frequency. At first, I guessed the interconnect width had been increased to 128-bit, but then there's no way to split the 128 bits into 3 equally sized blocks, one for each of the 3 memory channels.

I'm sure I must be missing something here ... but I don't know what. Actually, right before I sent this message, I came across even weirder ratios: 1,6x and also 1,8x. Well, those two are not really abnormal, as anything higher than the minimum is always possible, hehe.

The only possibilities I see are:

- the interconnect is increased to 144-bit, 48 bits from each channel per uncore clock. But that would mean the Gulftown uncore-DRAM clock ratio could be dropped even lower than 1,5x, namely to 1,33x. Haven't seen a screenshot of that so far ... and it wouldn't make any sense.
- the interconnect is increased to 128-bit, but the bits per channel are arranged differently than on Bloomfield. Instead of an equal ratio of 32/32/32, it could be 32/64/32 or 32/48/48. I'm pretty sure that would give a whole series of alignment issues, as you'd have to 'mark' what data has been transferred and what not. It only fits 1,5x uncore.
- the interconnect is 196-bit ... but that really makes no sense.
There's something I might have missed ... !!
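The ratio arithmetic in the email can be checked with a few lines of Python. This is only a sketch of the reasoning, not a statement about how the hardware actually works: it assumes the minimum uncore:DRAM clock ratio is simply the memory bus width divided by the interconnect width, and the function name `min_uncore_ratio` is made up for illustration.

```python
from fractions import Fraction


def min_uncore_ratio(bus_bits, link_bits):
    """Smallest uncore:DRAM clock ratio that moves bus_bits per DRAM clock
    over a link_bits-wide interconnect (assumed model, see lead-in)."""
    return Fraction(bus_bits, link_bits)


# Bloomfield and Lynnfield, assuming the 96-bit interconnect from the email.
bloomfield = min_uncore_ratio(192, 96)  # 2, matches the 2x minimum
lynnfield = min_uncore_ratio(128, 96)   # 4/3 in theory; the real minimum is 1,5x

# Gulftown candidates from the list above (triple channel, 192-bit bus).
gulftown_128 = min_uncore_ratio(192, 128)  # 3/2, i.e. 1,5x
gulftown_144 = min_uncore_ratio(192, 144)  # 4/3, i.e. 1,33x
```

Using `Fraction` keeps 4/3 exact instead of rounding it to 1,33.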

Massman 13th November 2009 20:03

Quote:

Originally Posted by jmke (Post 248484)
edited my post:)

You're right. This is just internet-info, something anyone can see.

We don't break the NDA by posting this, do we?

jmke 13th November 2009 20:06

you have to sign an agreement in order to break an agreement, agreed? :)

Massman 13th November 2009 20:08

agreed :D

jmke 13th November 2009 20:13

worst-case scenario we might have to remove a few offending pics

thorgal 13th November 2009 20:23

Quote:

Originally Posted by Massman (Post 248485)
Below is an email I sent to one of my contacts.


There's something I might have missed ... !!

Interesting, and something to add to the question list at the briefing?

Massman 13th November 2009 20:34

Okay. A few minutes after I sent the email, I'm pretty sure I figured out what happened with Gulftown. Well, not certain, but at least I have a concept that works in theory.

Quote:

Originally Posted by Massman
the interconnect is increased to 144-bit, 48 bits from each channel per uncore clock. But that would mean the Gulftown uncore-DRAM clock ratio could be dropped even lower than 1,5x, namely to 1,33x. Haven't seen a screenshot of that so far ... and it wouldn't make any sense.

This is an issue we have seen come up with Lynnfield as well, where the 128-bit memory bus width could be transferred to the CPU by having an uncore clock frequency of 1,33x the memory frequency, as "96 x 1,33 = 128". The problem is described below:

Quote:

Originally Posted by Massman (Post 242353)
For dual channel configurations, which are 128 bit wide, it's a more complicated problem since there's no easy fix as with triple channel (just x2). Basically, in a perfect world, increasing the uncore frequency by a factor of 1,33 would do the trick, as instead of 96 bit per clock cycle you would then be able to address 96 x 1,33 = 128 bit in one clock cycle, which is the full dual channel bandwidth. The problem, however, is that this would make the register management quite difficult, as you can see on the graph below:



Basically, the uncore register would have to be aligned at the 1/3 and 2/3 mark. Or, put differently, the system has to make note of where the first register output ends (1/3 of the second 96 bit series) and where the second output ends (2/3 of the third 96 bit series). It's not technically impossible, but far from an elegant (= efficient) solution. Much easier is to increase the frequency by a factor of 1,5: the only alignment is the one at 1/2, which is just splitting up into two pieces.

The same story applies to Gulftown, assuming that the uncore-CPU interconnect width has indeed been increased to 144-bit. In an attempt to make the memory bandwidth transfer as efficient as possible, we could state that the 192-bit memory data can be transferred as 144+48 bit, or that 3x 192-bit can be transferred as [(64+64+16)+(48+64+32)+(32+64+48)+(16+64+64)], which equals 4 blocks of 144-bit. The problem is alignment: as you can see, the blocks have to be split up into a lot of pieces, which means setting marks at all those cut-offs. Again, it would be much easier to sacrifice a bit of uncore to make things less complicated. So, instead of a 4:3 uncore-memory ratio, we go for a 3:2 uncore-memory ratio.

I will make a similar sketch as I did for the Lynnfield. To be honest, this is really good! It fits the whole concept. It's perfect!
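The block splits described in this post can be reproduced with a short Python sketch. It is only an illustration of the arithmetic: the helper name `split_into_blocks` is made up, and it assumes data arrives as equally sized chunks (64 bit per channel transfer for Gulftown, one 128-bit dual channel word for Lynnfield) that are packed back to back into interconnect-wide blocks.

```python
def split_into_blocks(chunk_bits, link_bits, n_chunks):
    """Pack n_chunks chunks of chunk_bits back to back into link_bits-wide
    blocks and return, per block, the sizes of the pieces it contains."""
    blocks, current, room = [], [], link_bits
    for _ in range(n_chunks):
        left = chunk_bits
        while left > 0:
            take = min(left, room)  # as much of this chunk as still fits
            current.append(take)
            left -= take
            room -= take
            if room == 0:           # block is full, start the next one
                blocks.append(current)
                current, room = [], link_bits
    if current:                     # keep a partially filled last block
        blocks.append(current)
    return blocks


# Gulftown at 4:3, assuming a 144-bit interconnect:
# 3 DRAM clocks x 3 channels x 64 bit = 9 chunks.
gulftown = split_into_blocks(64, 144, 9)
# [[64, 64, 16], [48, 64, 32], [32, 64, 48], [16, 64, 64]]

# Lynnfield at 4:3 over the 96-bit interconnect: 3 words of 128 bit.
lynnfield = split_into_blocks(128, 96, 3)
# [[96], [32, 64], [64, 32], [96]]
```

Every piece smaller than a full chunk is a cut-off that has to be marked. The Lynnfield result shows the cuts at 1/3 of the second block (32 of 96) and 2/3 of the third block (64 of 96), and the Gulftown result matches the four 144-bit blocks listed above.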

Massman 13th November 2009 20:37

Quote:

Originally Posted by thorgal (Post 248491)
Interesting, and something to add to the question list at the briefing?

Not sure. I'm fairly confident that Clarkdale is very similar to Lynnfield when it comes to the memory-uncore-CPU design. Only if the lowest uncore multiplier is lower than 15x (for a 2:10 memory multiplier) will I have to ask questions.
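The 15x threshold follows from the 1,5x minimum: with a 10x memory multiplier, 15x uncore is exactly 3:2. A minimal sketch of that check, where the function name and the default minimum are assumptions taken from this thread rather than from any Intel documentation:

```python
from fractions import Fraction


def below_minimum(uncore_mult, memory_mult, minimum=Fraction(3, 2)):
    """True if the uncore:memory ratio is lower than the assumed minimum
    (3:2 by default, the Lynnfield value discussed in this thread)."""
    return Fraction(uncore_mult, memory_mult) < minimum


below_minimum(15, 10)  # False: 15/10 is exactly 1,5x, nothing to ask
below_minimum(14, 10)  # True: 1,4x would be worth a question
```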

Massman 13th November 2009 20:43

1 Attachment(s)


This actually fits on so many more levels. I have a drawing on the whiteboard here that visually explains it completely. This 144-bit solution even fits the 6-core CPU design :D


All times are GMT +1.