The most keenly anticipated new CPU since ... well ... since forever really. Not even the 386 was so keenly awaited by the masses, and for once the hype was justified. The Athlon debuted to rave reviews everywhere and Intel's long-held high-performance stranglehold, briefly threatened by the 6x86-200, the K6-233 and the K6-III in previous years, was at last broken decisively.
Unlike AMD's K6-2 and K6-III families, which were evolutionary designs — bigger and faster than the K6 Classic but essentially similar — the Athlon was new from the ground up: the world's first seventh-generation CPU. It featured very deep pipelining and multiple parallel units, including three pipelined floating-point units, plus 128k of on-chip primary cache — twice as much as any previous X86 and four times as much as the Pentium II and III family.
The Athlon was designed to allow for multiple processors too — something which had been theoretically possible with past AMD and Cyrix CPUs but never practical because of patent restrictions and motherboard availability — but in reality, SMP-capable Athlon mainboard chipsets were not to appear for another two years. Such is the difference between hype and reality.
Initial release Athlon clock speeds were 500, 550, 600 and 650MHz and performance was outstanding: comfortably faster than any other X86, and approaching the Alpha, which was still, in those days, regarded as the ultimate in processor performance. The non-Intel CPU makers' traditional weakness, floating point and multi-media performance, was not just improved but turned into a strength. Clock for clock, the Athlon's FPU was around 30% up on the one in a Pentium-II or III.
Like the Pentium II, the Athlon arrived on a massive, black-shrouded card that fitted into a slot — but not Slot 1. Instead it introduced the mechanically identical but electrically quite different Slot A, under license from Digital, and AMD originally intended to make the processor module highly configurable to allow for a wide range of L2 cache sizes. (In reality, clock speeds were to increase so fast that external card-mounted cache soon became impractical, and this idea came to nothing.) The Slot A EV-6 Alpha bus was double-clocked (i.e., it transmitted a signal on both the rising and the falling edge of the clock), which made for an effective 200MHz. Although the slots disappeared quite quickly — replaced by the more practical and cost-effective Socket A — the Athlon bus lived on and migrated to faster speeds from time to time: 266MHz a little later, 333MHz in mid-2002, and 400MHz in 2003.
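The double-pumped bus arithmetic above is simple enough to sketch in a few lines. This is purely illustrative — the function names are ours, and the 100MHz base clock and 64-bit width are the EV-6 figures assumed from the text:

```python
# A minimal sketch of the double-pumped bus arithmetic, under the
# assumptions above: 100 MHz base clock, 64-bit data path, one transfer
# on each clock edge. Function names here are illustrative, not any API.

def effective_rate_mt_s(base_clock_mhz, transfers_per_cycle=2):
    """Effective transfer rate: a double-pumped bus moves data on both
    the rising and the falling clock edge, doubling the base clock."""
    return base_clock_mhz * transfers_per_cycle

def peak_bandwidth_mb_s(base_clock_mhz, width_bits=64, transfers_per_cycle=2):
    """Peak theoretical bandwidth in MB/s (decimal megabytes)."""
    return effective_rate_mt_s(base_clock_mhz, transfers_per_cycle) * (width_bits // 8)

print(effective_rate_mt_s(100))   # 200  -- the "effective 200MHz" Slot A bus
print(peak_bandwidth_mb_s(100))   # 1600 MB/s peak on the original EV-6 bus
print(effective_rate_mt_s(166))   # 332  -- roughly the later "333MHz" grade
```

The same doubling trick reappears below in DDR SDRAM, which is the identical idea applied to memory rather than the CPU bus.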
Form | Design | Manufacture | Introduction | Status |
---|---|---|---|---|
Slot A | AMD | AMD | August 1999 | Legacy

Internal clock | External clock | L1 cache | L2 cache | Transistor count |
---|---|---|---|---|
500 MHz | 200 MHz | 128k at 500 MHz | 512k at 250 MHz | 22 million (plus cache)
The 550 was the least common of the first-generation Athlons, and the first to disappear from the market. The 500 sold in reasonable volume early on, though it was fairly expensive back then, but having overcome the psychological barrier of paying more than the cheapest Athlon cost, buyers tended to bypass the 550 in favour of the 600 or 650. As so often with first-generation parts, the early Athlons did not age well. An Athlon 550 was state of the art and a rather expensive purchase in late 1999, but was already comfortably out-performed by quite modest entry-level parts six or eight months later.
Form | Design | Manufacture | Introduction | Status |
---|---|---|---|---|
Slot A | AMD | AMD | August 1999 | Legacy

Internal clock | External clock | L1 cache | L2 cache | Transistor count |
---|---|---|---|---|
550 MHz | 200 MHz | 128k at 550 MHz | 512k at 275 MHz | 22 million
While the original Pentium-III was a non-event, the heavily revised 'Coppermine' version was vastly improved. It was sufficiently different to belatedly justify the new name, and be a worthy competitor to the Athlon. Like the Celeron and the K6-III, Coppermine had full-speed on-die cache. Although it was only 256k (as opposed to the previous model's 512k half-speed), moving it on-chip boosted performance by almost 10 percent. In technical language, big cache good, fast cache gooder.
Alas, the P-III 550 was all but impossible to get for the first six months or so. Intel's once-unmatched fab plants were unable to produce it in reasonable numbers, and what should have been a marketing winner turned into a sales disaster.
Once Intel eventually got the production problems sorted, Coppermine showed that there was life in the old dog yet. The cache was not just smaller and faster, access to it was faster too: Coppermine used a 256-bit pathway to the secondary cache instead of the usual 64-bit path; it had more buffers, lower cache latency, and better cache management logic. In short, at lower clock speeds Coppermine was a match for the Athlon Classic, and at higher clock speeds more than a match. Given the venerable age of the P6 design, and its rather ordinary performance in its early days, this was a triumph. All Intel had to do from that point on was work out how to produce it in reasonable volumes. This was achieved eventually, but by then the action was up around 800MHz, and the P-III 550 remains a rare part.
The P-III in the illustration, by the way, is mounted on a "slocket", a card to adapt the socket-based Coppermine chip to a Slot 1 BX main board. As a rule, slocket adaptors worked perfectly well but they could be troublesome, particularly if you bought the adaptor and the motherboard from different manufacturers.
Form | Design & Manufacture | Announced | Available from | Status |
---|---|---|---|---|
Slot 1 & Socket 370 | Intel | October 1999 | March 2000 | Legacy

Internal clock | External clock | L1 cache | L2 cache | Transistor count |
---|---|---|---|---|
550 MHz | 100 MHz | 32k at 550 MHz | 256k at 550 MHz | 28.1 million
Assuming the two chips were at the same clockspeed, say 600MHz, which was faster: a Coppermine P III or an Athlon Classic?
There will never be a final and conclusive answer to that hotly-debated question. There are just too many different ways to measure performance, and as many views as there are individuals holding them. In any case, the question was not really relevant: Athlons were always available at higher clockspeeds than the P III, at lower prices, and by the time the Coppermine P III 600 arrived on retail shelves, the Athlon 600 had long since given way to the 700 and the 750.
But for the record, our view is that there is very, very little between them. If pushed, we would probably take the 133MHz version of the P-III 600 first, closely followed by the Athlon 600, with the visibly slower 100MHz-bus P-III as third choice. In reality, we don't know anyone who could pick the difference between the better two chips without very careful measurement.
The Athlon 600 was popular for a high-end part. In its first release form, the Athlon had four speed grades: 500, 550, 600 and 650. As is often the case, the second-top part was considerably less expensive and much more common than the very fastest one. The 600 sold quite briskly to performance buyers in the early days, but Athlons didn't start to move into the mainstream of the market until the second-generation parts hit, like the 700 and 750, by which time the Athlon 600 was already forgotten.
Form | Design | Manufacture | Introduction | Status |
---|---|---|---|---|
Slot A | AMD | AMD | August 1999 | Legacy

Internal clock | External clock | L1 cache | L2 cache | Transistor count |
---|---|---|---|---|
600 MHz | 200 MHz | 128k at 600 MHz | 512k at 300 MHz | 22 million
The Duron was AMD's entry-level replacement for the K6, and stood to the Athlon Thunderbird as the Celeron did to the Pentium-III — or almost: the Celeron had three performance-robbing differences (smaller cache, less cache intelligence, and a lower bus speed) where the Duron had only one, the smaller cache. Otherwise it was identical to its bigger brother.
The Duron's introduction was the best news for PC buyers since the wonderful old K6-III. Priced much cheaper than a Thunderbird, the Duron had very nearly equal performance, and used the same main board.
The cache arrangement was unusual: it retained the Athlon's 128k of full-speed primary cache, and had a tiny 64k full-speed secondary cache — half the size of the primary! However, like the Thunderbird, the Duron's caches were exclusive: in other words, it avoided storing any given data in both caches (which is wasteful and performance-sapping), so as to get the full benefit of the 192k total.
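The exclusive policy described above can be sketched with a toy model. This is our own illustration, not AMD's actual implementation — real caches are set-associative with fixed line sizes, and here everything is simplified to a fully-associative LRU — but it shows why exclusivity makes the full 192k usable:

```python
from collections import OrderedDict

class ExclusiveCache:
    """Toy fully-associative LRU model of an exclusive L1/L2 pair.

    Illustrative only: it simply demonstrates that an exclusive design
    never stores a line in both levels, so L1 + L2 of distinct data fits.
    """
    def __init__(self, l1_lines, l2_lines):
        self.l1_lines, self.l2_lines = l1_lines, l2_lines
        self.l1, self.l2 = OrderedDict(), OrderedDict()  # addr -> True, LRU order

    def _evict_l1_to_l2(self):
        victim, _ = self.l1.popitem(last=False)   # oldest L1 line...
        self.l2[victim] = True                    # ...becomes an L2 (victim) line
        if len(self.l2) > self.l2_lines:
            self.l2.popitem(last=False)           # overflow falls back to memory

    def access(self, addr):
        if addr in self.l1:
            self.l1.move_to_end(addr)             # refresh LRU position
            return "L1 hit"
        if addr in self.l2:
            del self.l2[addr]                     # exclusive: line LEAVES L2...
            result = "L2 hit"
        else:
            result = "miss"
        if len(self.l1) >= self.l1_lines:
            self._evict_l1_to_l2()
        self.l1[addr] = True                      # ...and lands in L1 only
        return result

# With 4 L1 lines and 2 L2 lines, 6 distinct lines fit at once -- the
# analogue of the Duron's 128k + 64k = 192k of usable cache.
c = ExclusiveCache(4, 2)
for addr in range(6):
    c.access(addr)
assert len(c.l1) + len(c.l2) == 6
```

An inclusive design of the same sizes would cap out at the larger level's capacity, which is exactly the waste the Duron's arrangement avoids.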
They arrived with very little fanfare at about the same time as the SSE-enabled Celerons, in August 2000, and proved an immediate performance winner. As a host of reviewers around the world discovered, the Duron was easily faster than a Celeron, and almost exactly equal to an Athlon Classic or Coppermine P-III at the same clock speed.
Form | Design | Manufacture | Introduction | Status |
---|---|---|---|---|
Socket A | AMD | AMD | July 2000 | Legacy

Internal clock | External clock | L1 cache | L2 cache | Transistor count |
---|---|---|---|---|
600 MHz | 200 MHz | 128k at 600 MHz | 64k at 600 MHz | 25 million
Coppermine didn't just introduce better caching: several of the new Pentium-IIIs also ran on a 133MHz bus for faster communication between the CPU, the mainboard and the RAM. Over time, the entire Pentium-III family migrated to 133MHz, though perhaps more slowly than at first expected, as it took quite some time for some major motherboard issues to be sorted out.
Like most Pentium-IIIs, the 600 came in a variety of forms: the "E" indicates a socket rather than a slot package, and the "B" a 133MHz bus capability. Thus the Pentium-III 600EB, for example, was a socketed P III running on a 133MHz board. This particular one was one of the very few P-IIIs that we sold any number of: we were impressed by its speed in general use, far quicker than the P-IIs and old 512k cache P-IIIs it replaced.
Notice that we list both the official release date and the time when they became actually available on the market. Intel's PR machine hyped the parts long before you could actually buy one — in reality the initial "availability" of the Coppermines was little more than limited volume pilot production. When you could finally get one, though, they were excellent.
Form | Design & Manufacture | Announced | Available from | Status |
---|---|---|---|---|
Slot 1 & Socket 370 | Intel | October 1999 | March 2000 | Legacy

Internal clock | External clock | L1 cache | L2 cache | Transistor count |
---|---|---|---|---|
600 MHz | 100 MHz | 32k at 600 MHz | 256k at 600 MHz | 28.1 million
600 MHz | 133 MHz | 32k at 600 MHz | 256k at 600 MHz | 28.1 million
This is a page about CPUs, but it is not really possible to understand the status of the 2000 model year Pentium-III chips without an overview of what was going on at that time with RAM and main board chipsets.
The rise to dominance of the competing Athlon processor at the expense of the Pentium III was powered by three main factors: first, the excellence of the Athlon itself; second, a long and severe shortage of the Intel CPUs; and third — perhaps most important of all — Intel's comprehensive failure to deliver an attractive and reliable mainboard chipset to replace the excellent but elderly BX. At the heart of Intel's chipset problems was a marketing-led decision to favour Rambus RAM over industry-standard SDRAM, regardless of technical merit.
The standard RAM since about mid-1998 had been 100MHz PC-100 SDRAM. The expected industry upgrade path was to 133MHz SDRAM, followed by DDR SDRAM: SDRAM double-clocked (transferring on both the rising and falling edges of the clock) to an effective 200 or 266MHz. At this point, Rambus RDRAM raised its head. RDRAM could be clocked very much faster than SDRAM — up to about 800MHz with the technology of 2000 — and so offered higher bandwidth. At first sight it was an obvious winner. On a closer look, however, a number of disadvantages became apparent.
- The interface to RDRAM was only 16 bits wide — SDRAM was 64-bit — so much of the speed advantage was lost: RDRAM had to clock four times faster just to reach the same raw data transfer rate. (Once upon a time, a 16-bit RAM interface was pretty special — in fact, it was one of the major advantages the 286 had over the 8-bit 8088. The 386 and 486 were 32-bit; all subsequent chips had 64-bit RAM access.)
- RDRAM had very high latency: in other words, it took much longer for an RDRAM chip to start sending the first byte of data.
- RDRAM got slower as you added more RAM, which SDRAM did not.
- RDRAM was proprietary: you couldn't make it unless you (a) paid a royalty to the Rambus company, and (b) signed an agreement promising, among other things, not to say anything critical of the product.
- RDRAM was very difficult to manufacture: in the early days even the best RAM manufacturers struggled to get RDRAM yields up past 10%, which was grossly uneconomic.
- Finally, and unsurprisingly given the foregoing, RDRAM was incredibly expensive: roughly 7 times the price of SDRAM. In mid-2001, when this controversy was at its height, RDRAM was double the price of SDRAM, and even as late as March 2002 it remained about 20% more expensive than DDR or SDRAM.
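The width-versus-clock point in the list above is simple arithmetic, and worth making concrete. The figures here are the ones assumed from the text (PC800 RDRAM: 16 bits wide at an effective 800MHz; PC-100/PC-133 SDRAM: 64 bits wide); latency, RDRAM's bigger problem, doesn't show up in peak numbers at all:

```python
# Peak-bandwidth sketch of the 16-bit vs 64-bit interface arithmetic.
# Assumed figures: PC800 RDRAM = 16 bits at an effective 800 MHz;
# PC-100/PC-133 SDRAM = 64 bits at 100/133 MHz. Peak figures only --
# RDRAM's high latency is invisible in this calculation.

def peak_mb_s(width_bits, effective_mhz):
    return (width_bits // 8) * effective_mhz   # MB/s, decimal megabytes

pc800_rdram = peak_mb_s(16, 800)   # 2 bytes x 800 MT/s = 1600 MB/s
pc100_sdram = peak_mb_s(64, 100)   # 8 bytes x 100 MT/s =  800 MB/s
pc133_sdram = peak_mb_s(64, 133)   # 8 bytes x 133 MT/s = 1064 MB/s

# A 16-bit bus must clock exactly four times faster than a 64-bit one
# just to reach the same raw transfer rate:
assert peak_mb_s(16, 4 * 133) == peak_mb_s(64, 133)
print(pc800_rdram, pc100_sdram, pc133_sdram)
```

So RDRAM's headline clock bought it only a modest raw-bandwidth edge over PC-133, at seven times the price.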
Against that, RDRAM had a single advantage: it had slightly higher maximum bandwidth than DDR SDRAM. Rambus and Intel worked with the manufacturers to reduce the cost. Intel was aiming at a mere 10% premium for RDRAM. Samsung, the world's biggest RAM manufacturer, said this was absurd and that a 50% premium was the absolute minimum. At the time we wrote: "we suspect that eventually Rambus RAM will stabilise at a figure not too far above Intel's estimate, if for no other reason, because demand for it remains very weak". This seems to have been one of our better predictions.
In overall market share terms, Rambus was a tiny player in 2000. Why then, was it so important to the market failure of the Pentium-III?
Simply, because Intel was obsessed with fulfilling an ill-advised and dubiously ethical contract with the Rambus company to turn Rambus RAM into the new industry standard, and make Rambus Inc. a monopoly second only to Microsoft and Intel itself. If Intel had been successful, it stood to gain a huge financial windfall via a share options contract. Alas, as so often happens when engineering decisions are made on non-engineering grounds, Intel's chipset division was constrained by sub-optimal design goals and, not to put too fine a point on it, comprehensively lost the plot (see The Amazing i820 Saga below). Intel's ill-advised commitment to shoving Rambus RAM down the industry's throat, willing or not, had two direct results: it handed leadership in the CPU market to AMD, and the number one spot in the chipset market to VIA.