AMD Opteron Dual CPU 

The technology behind dual CPU Opterons


Originally written for the Poweroid 9334; some parts also apply to the other dual Xeon and dual Opteron machines we sell.

What makes a good workstation?  Since "time is money", we'd say it is a computer that gets your work done in the least amount of time and with the highest level of stability.  The idea is to build a tool that eliminates performance bottlenecks where possible, and otherwise reduces them as far as current technology allows. The Poweroid 9334 is exactly that kind of machine.

The most obvious contributor to that performance in this machine is the pair of Opteron processors working in concert.  Two heads are always better than one, right?  Unlike Intel's "HyperThreaded" processors, or some future implementations of "dual core" CPUs, this is a full-fledged dual processor implementation. That translates to two full cores, on separate dies, each with its own full speed cache. Thanks to the design of the Opteron, each also has its own separate pipe to memory.

Direct Memory Access

Each processor has an integrated dual channel memory controller connected to its own set of slots.  This is the opposite of a Xeon setup, where both processors must squeeze through the Front Side Bus (FSB) to reach a shared dual channel memory controller.  AMD is able to use a more independent setup thanks to the inclusion of two "HyperTransport" links (not to be confused with HyperThreading) in each processor.  One acts as a "crossbar" for sharing cache information between the CPUs, while the other goes to the AMD 8151 Graphics Tunnel (also called the "north bridge").  This allows the two CPUs to have high bandwidth conversations, as long as the software being used is optimised for dual processor operation.  With Intel's HyperThreading feature becoming more mainstream, more programs are now being recompiled for multithreading, especially in the workstation and content creation areas.
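To put rough numbers on the difference between the two layouts, here is a back-of-envelope sketch. The figures are assumptions rather than measurements from the 9334: dual channel DDR400 on each Opteron (two 64 bit channels at 400 million transfers per second) and a single 64 bit front side bus at 533 million transfers per second shared by a pair of Xeons.

```python
# Theoretical peak bandwidth only. Every figure below is an assumption
# chosen for illustration, not a measurement.

def peak_bandwidth_gb_s(mega_transfers_per_sec, bus_width_bits, channels=1):
    """Peak bandwidth in GB/s, where 1 GB = 10**9 bytes."""
    bytes_per_transfer = bus_width_bits // 8
    return mega_transfers_per_sec * 1e6 * bytes_per_transfer * channels / 1e9

# Assumed: each Opteron drives its own two 64 bit DDR400 channels.
opteron_per_cpu = peak_bandwidth_gb_s(400, 64, channels=2)
opteron_pair = 2 * opteron_per_cpu

# Assumed: both Xeons share one 64 bit front side bus at 533 MT/s.
xeon_pair = peak_bandwidth_gb_s(533, 64)

print(f"Opteron, per CPU:         {opteron_per_cpu:.1f} GB/s")
print(f"Opteron, both CPUs:       {opteron_pair:.1f} GB/s")
print(f"Xeon pair via shared FSB: {xeon_pair:.1f} GB/s")
```

The combined Opteron figure only holds when each CPU is working mostly out of its own local memory; data that lives in the other processor's slots has to cross the HyperTransport link first.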

How does the Opteron differ from the original Athlon MP?

Unlike the previous Athlon MP processors, which Poweroid were the first to introduce to the UK, the Opterons also support x86 instruction set extensions such as SSE2.  This is an important set of SIMD (Single Instruction Multiple Data) instructions, used in graphics work to speed up the processing of large datasets.  While these types of instructions are more clock speed dependent than others (i.e. Xeons gain more from them because of the higher GHz ratings they come in), they do make the work go faster than it otherwise would.
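The "multiple data" part is easiest to see by counting instructions. The sketch below is a simple model, not real SSE2 code: it assumes the standard 128 bit SSE register width (not stated above) and a hypothetical dataset of one million 32 bit values.

```python
import math

SSE_REGISTER_BITS = 128  # assumed width of one SSE register

def lanes(element_bits):
    """How many elements of the given size fit in one SSE register."""
    return SSE_REGISTER_BITS // element_bits

def add_instructions(n_elements, element_bits, packed):
    """Add instructions needed to sum two arrays element by element."""
    per_instruction = lanes(element_bits) if packed else 1
    return math.ceil(n_elements / per_instruction)

values = 1_000_000  # hypothetical dataset of 32 bit values
print("one value at a time:", add_instructions(values, 32, packed=False))
print("packed, 4 per add:  ", add_instructions(values, 32, packed=True))
```

In practice the compiler, or hand-written intrinsics, decides when packed instructions are used; the model only shows why fewer instructions are needed for the same amount of data.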

Also squirrelled away in these Opteron processors are the x86-64 extensions, i.e. support for 64 bit processing.  This is The Big Selling Point in all of AMD's marketing.  At the moment there is a lack of true Windows support, but that doesn't mean support doesn't exist elsewhere.  Linux has been much quicker to capitalise on this feature. What it allows for is the use of more than 4 GB of total memory, and for individual tasks (such as large databases or simulations) to each use more than 2-3 GB of memory (without having to rely on the current "hacked solutions"). That, however, is not the only benefit for customers.  Programs compiled for x86-64 also gain performance from the doubling of available general purpose and SSE registers.  Registers are the smallest storage areas in a processor, and are where data is held before it enters the execution core.
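The 4 GB limit falls straight out of the pointer width, as this short sketch of the arithmetic shows. The 48 bit figure is an assumption about the virtual address width of current x86-64 processors and is not taken from the text above.

```python
GIB = 2 ** 30  # bytes in one binary gigabyte

def addressable_gib(address_bits):
    """Memory reachable with a flat pointer of the given width, in GiB."""
    return 2 ** address_bits / GIB

print("32 bit pointers:", addressable_gib(32), "GiB")
print("48 bit pointers:", addressable_gib(48), "GiB")  # assumed x86-64 width
```

On a 32 bit operating system part of that 4 GB is reserved for the kernel, which is where the 2-3 GB per task limit comes from.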

As you can see from the diagram, not only are the GPRs extended to 64 bits, but there are also a whole lot more of them.  The SSE registers aren't any bigger, but there are more of them to play with. This allows for fewer calls to memory, as the compiler can keep more values tucked away in these registers, ready to operate on as and when necessary.  While the Opteron does possess many more "unnamed" registers which it can use internally for storing values, relying on them is not the most efficient way of executing code.

Another mention of each processor's integrated memory controller is necessary.  While Sun and others have gone this route before, it has not been seen in the x86 world until now.  By moving this typically discrete IC onto the processor die, there's one less bus stop for data to pass through on its way from memory to the CPU core.  Since latency is the hardest aspect of memory performance to improve, this is an invaluable advancement, especially for applications that read from many different parts of memory at random rather than sequentially.  Database apps are a good example of "random" reads and writes.
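The register doubling can be tallied up as follows. The counts used here (8 general purpose and 8 SSE registers in 32 bit mode, 16 of each in 64 bit mode) are the standard architectural figures, supplied as an assumption because the text itself does not spell them out.

```python
# Registers visible to the compiler in each mode: (count, width in bits).
# Counts and widths are assumed architectural figures.
REGISTERS = {
    "x86 (32 bit)": {"GPR": (8, 32), "SSE": (8, 128)},
    "x86-64": {"GPR": (16, 64), "SSE": (16, 128)},
}

def register_bytes(mode):
    """Total bytes of named register storage available in the given mode."""
    return sum(count * width // 8 for count, width in REGISTERS[mode].values())

for mode in REGISTERS:
    print(f"{mode}: {register_bytes(mode)} bytes of named registers")
```

Every value that fits in that extra space is one the compiler does not have to spill to memory and reload later.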

The motherboard used here

Turning to the motherboard that all this processing horsepower is plugged into: Poweroid decided not to skimp on this vital component. Tyan is known for building some of the best server and workstation class motherboards, and the "Thunder K8W" doesn't disappoint.  With four memory slots for each CPU, supporting 16GB of RAM in total, expandability in this area is not an issue.  On the Extended ATX PCB (printed circuit board), there's still room for two independent 64 bit PCI-X buses, with two slots on each, as well as a standard 32 bit PCI slot. That's not to mention the 8x AGP Pro slot for graphics.  What's that "pro" tag for?  It indicates extra power handling capability, required by various high end, energy hungry workstation class video cards.

The gigabit LAN, a standard feature on most boards these days, is hooked in through an unconventional method.  Instead of helping to saturate the older PCI bus, as is done in most cases, it is attached internally to a PCI-X bus.  This keeps it from interfering with other high bandwidth devices on board, like video editing cards. The last feature is one that is not often given enough credit: the separate three phase VRMs (Voltage Regulation Modules) used for each processor.  Cheaper boards might try to get away with a single VRM shared between the two CPUs.  Here, each processor gets its own clean power from its own VRM, located close to the socket (to reduce transmission variances over distance).
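Returning to the gigabit LAN point, a quick calculation shows why it is better off on PCI-X. The bus clocks below are assumptions (33 MHz for standard PCI and 133 MHz for PCI-X; the text gives only the bus widths), and all results are theoretical peaks rather than measured throughput.

```python
def bus_peak_mb_s(width_bits, clock_mhz):
    """Peak parallel bus bandwidth in MB/s: bytes per clock times clock rate."""
    return width_bits / 8 * clock_mhz

def link_peak_mb_s(megabits_per_sec):
    """Peak network link bandwidth in MB/s, one direction."""
    return megabits_per_sec / 8

pci = bus_peak_mb_s(32, 33)     # assumed 33 MHz standard PCI
pcix = bus_peak_mb_s(64, 133)   # assumed 133 MHz PCI-X
gige = link_peak_mb_s(1000)     # gigabit Ethernet

print(f"32 bit PCI:   {pci:.0f} MB/s")
print(f"64 bit PCI-X: {pcix:.0f} MB/s")
print(f"Gigabit LAN:  {gige:.0f} MB/s")
print(f"LAN share of PCI:   {gige / pci:.0%}")
print(f"LAN share of PCI-X: {gige / pcix:.0%}")
```

Under these assumptions a busy gigabit link could eat almost the whole of a shared 32 bit PCI bus, while on PCI-X it leaves most of the bandwidth free for other cards.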

There is more to a workstation, though, than just the sexy bits.  Coming standard with the Poweroid 9334 is a 300GB SATA HD with 16MB of cache and another new feature called NCQ (Native Command Queuing). NCQ, though, is something you won't benefit from in an AMD based solution, as the AMD boards don't currently support it. However, the fast spindle speed and the extra large hard disk cache (which runs at RAM speeds of nanoseconds rather than hard disk speeds of milliseconds) do make a big difference to read/write operations.

Working with digital content creation requires space, of course, but making your work mobile is also important.  The 700MB available on a CD isn't always sufficient, but the 8.5GB available on a dual layer DVD should cover many more cases.  And with 16x write speed, you won't have to wait the hours it would take on a slower drive to get that data onto the disc.  The 9334 also comes with a 16x DVD reader, so you can copy from one DVD to another in a single pass, or load massive amounts of data onto the host computer through both drives at once.
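How much time does 16x actually save? The rough estimate below assumes that 1x DVD speed is about 1.385 MB/s and that the drive holds its rated speed across the whole disc, which real drives do not quite manage. The 4.7GB single layer disc is just an example capacity.

```python
DVD_1X_MB_S = 1.385  # assumed 1x DVD transfer rate in MB/s

def write_minutes(capacity_gb, speed_x):
    """Idealised time to fill a disc at a constant rated speed, in minutes."""
    seconds = capacity_gb * 1000 / (DVD_1X_MB_S * speed_x)
    return seconds / 60

for speed in (1, 4, 16):
    print(f"{speed:>2}x: {write_minutes(4.7, speed):.1f} minutes for 4.7GB")
```

Lead-in and lead-out, verification and speed ramping all add to these figures, so treat them as a lower bound.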

Last, but not least, is the choice of an nVidia Quadro workstation class video card.  What, you've never heard of a "Quadro" before?  Think of a GeForce FX card meant for gaming, then add support for more OpenGL calls in hardware.  Workstation 3D graphics requires things like two sided lighting, clipping planes, logic operations and culling, which are not called for in the gaming arena.  To speed these operations up, cards like the Quadro execute them in the GPU (Graphics Processor), as opposed to passing them through the driver path to the CPU as a standard game/desktop card does.  The hardware path of a Quadro is heavily optimised to complete these in "real time", so that you can rotate and translate 3D models without having to wait for them to render.  Again, it all boils down to getting your work done faster. A workstation graphics card is a key part of accomplishing that goal when you work with 3D graphics.

                                        

 

 

 

© Content on this site copyright Best Price Computers Ltd 1996-2011

Site last updated: June 2010