AMD Quad-Core Barcelona Gets Big Boost From Little-Known Benchmarks
Amid the battering AMD has taken in recent months, ranging from processor bugs to sagging finances, I've discovered what's sure to be some welcome news: a bunch of under-the-radar benchmark tests run by a respected tech guy, which put AMD's quad-core Opteron (aka Barcelona) processor in a great light. Moreover, he gives AMD's new native-quad architecture a rave review, identifying what he believes are the technical reasons for its strong performance.
OK, first things first. I'm not saying somebody came up with a new benchmark. The "little-known" means the person I'm talking about ran a bunch of tests in which Barcelona comes out looking incredibly good, and my point is that you probably haven't heard about his work yet. The guy is Howard Chu, who's one of the lead developers of OpenLDAP, which is the open source implementation of the Lightweight Directory Access Protocol (LDAP).
The benchmarks he put his servers through are thorough, complex, and not subject to an easy, sound-bite assessment. Chu used GCC 4.3, which is the GNU compiler, for the benchmark, running the tests on both Xeon and Opteron servers to measure authentications/sec. This is usually used as a database metric, but in this case Chu was essentially testing to understand OpenLDAP's concurrency performance, and to see what kind of network packet loads the machines could handle.
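To make the authentications/sec metric concrete, here's a minimal sketch of how a rate like that can be measured under concurrent load. To be clear, this is not Chu's harness and it doesn't touch OpenLDAP: fake_authenticate and measure_auth_rate are names I made up, and the "authentication" is a trivial string check standing in for a real LDAP bind.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Tuple


def fake_authenticate(user: str, password: str) -> bool:
    """Hypothetical stand-in for an LDAP bind; not OpenLDAP code."""
    return password == "secret-" + user


def measure_auth_rate(workers: int, requests_per_worker: int) -> Tuple[int, float]:
    """Run concurrent workers and return (successful auths, auths per second)."""

    def worker(worker_id: int) -> int:
        # Each worker issues its own stream of requests and counts successes.
        ok = 0
        for i in range(requests_per_worker):
            user = f"user{worker_id}-{i}"
            if fake_authenticate(user, "secret-" + user):
                ok += 1
        return ok

    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        completed = sum(pool.map(worker, range(workers)))
    elapsed = time.perf_counter() - start
    return completed, completed / elapsed


if __name__ == "__main__":
    done, rate = measure_auth_rate(workers=8, requests_per_worker=1000)
    print(f"{done} authentications, {rate:.0f} per second")
```

The number this prints says nothing about any real server. Python threads don't run CPU-bound work in parallel, so the sketch only shows the shape of the measurement: many clients at once, a count of completed authentications, and a wall-clock interval to divide by.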
The other caveat is that Chu's tests don't necessarily yield apples-to-apples comparisons. For example, Chu found that the new quad-core Opteron beat a slightly older, 65-nm Xeon. (The Xeon is still sold; it's old only in the sense that there are more advanced 45-nm Xeons currently available.) Of course, a more relevant and strictly fair comparison would be putting Opteron up against a new, 45-nm Xeon. (I discuss this further with Chu below. As to why he used older Xeons, it's because he was working with loaner systems and isn't running a big-budget operation.)
Now that the long and boring stuff is out of the way, let's cut to the chase. Chu's results showed that the quad-core Opteron performed like a champ. A 2P system, meaning it has two quad-core Opterons, or eight cores overall, handled over 54,000 authentications/second. (The systems were equipped with the 1.9-GHz quad Opteron 2347.) That may not seem like a big deal to you or me, but trust me, it's huge. (You can read the full results here, on the Connexitor blog.)
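For a rough sense of scale, here's the back-of-the-envelope arithmetic behind that figure, using only the numbers quoted above. The helper name per_core_rate is mine, and since the system handled "over" 54,000 authentications/second, the result is best read as a floor, not an exact measurement.

```python
def per_core_rate(total_rate: float, sockets: int, cores_per_socket: int) -> float:
    """Average throughput per core for a multi-socket system."""
    cores = sockets * cores_per_socket
    return total_rate / cores


# Figures from the article: a 2P system with quad-core Opterons,
# handling a bit more than 54,000 authentications/second.
rate = per_core_rate(54_000, sockets=2, cores_per_socket=4)
print(f"{rate:.0f} authentications/sec per core")  # prints "6750 authentications/sec per core"
```

That's an average across all eight cores. It doesn't tell you how evenly the load was spread, which is exactly the kind of detail the full results are for.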
The Interesting Stuff
Of greater interest to me than the actual results is what they mean. To get the scoop on that, I had an e-mail back-and-forth with Chu last night.
The money quote is Chu's take on Barcelona, which is the server-processor implementation of AMD's new quad-core architecture. (Phenom is the desktop processor.) Here's what Chu told me:
"A lot of people think AMD's 'native quad-core' mantra was just empty hype, meaningless to end users. I disagree. In highly concurrent software, cache coherency plays a big part in the performance picture, and it's pretty clear that AMD's design excels here.
People say, 'Big deal, it makes no difference on the desktop.' That may have been true in 2007, but it will be less so going forward. Everybody acknowledges that games and other desktop apps will be written to take advantage of multiple cores; there's no choice on this issue, that's the only way forward for performance. People thought dual-core was useless at first. Today, you can ask anyone who's switched from a single-core to a dual-core system, and they'll tell you about how smooth and responsive their computing experience has become.
I work on server code, [so] a lot of what I work with right now really does have zero relevance to the desktop. In the server context, it's incontestable that AMD's architecture is the right one. Everyone is talking about [Intel's] Nehalem and how it will have an on-die memory controller and QuickPath point-to-point interconnect. Obviously, AMD picked the right system architecture and now Intel is following, so there's nothing to even debate here."
I asked Chu two more important questions. The first was to point out what I mentioned above, that he tested the latest Opteron against what's essentially the N-1 Xeon. How would Barcelona stack up against an Intel 45-nm Xeon, which is also known by the code name Harpertown?
"[On] Barcelona versus Harpertown, I expect Barcelona will still win on data-intensive workloads and highly concurrent workloads. Harpertown's increase in L2 cache size does little to benefit servers operating on databases with multi-gigabyte working sets, and even with the FSB at 1600MHz there's still too much bus contention when you consider I/O traffic, RAM traffic, and cache coherency traffic. I'll be happy to test that expectation any time a system becomes available."
Finally, I asked Chu if he received any compensation from either AMD or Intel. (His blog post already notes that AMD provided a pair of 8-core servers for him to run his tests on.)
"I just do OpenLDAP. We've benchmarked it on Itaniums at Intel's request in the past. This is just another benchmark for us, and my only motivation is to see my code run as fast as possible on a given platform.
I should point out that all of the code used in my tests is freely available to anyone, and the machine configurations were all provided as well. Anybody can set up the same environments and duplicate the results I obtained. As with all of our previous tests, all of our data and software configs are available for download to anybody who wants them. As much as possible of the software was identical across all machines, at least at the source code level. Yet another nice thing about working with open source -- you can't hide. If we rigged the results, any third party could easily expose whatever trickery.
In hindsight, I probably should have also benchmarked the Intel binary on the AMD machine and vice versa, to see how much impact the compiler options had. I may give that a try later. We also have a Sun 5120 due to arrive in a week or two (Niagara 2 system), which will be interesting."
I hope to take this further in a future blog entry, and perhaps post some perspective from Intel. Again, my purpose here is not to "take sides," but to bring an interesting and relevant test to light.