Saturday, June 6, 2015

SPECIAL REPORT #2 - The Intel-Altera Deal


Zeus fighting the giant Porphyrion, fragment from a frieze in Pergamon, 2nd century B.C. (source: sukurduzgoren.blogspot.com)

Lightnings, that show the vast and foaming deep,
The rending thunders, as they onward roll,
The loud, loud winds, that o'er the billows sweep—
Shake the firm nerve, appal the bravest soul! - Ann Radcliffe, "The Mysteries of Udolpho"


Lord of cloud-ensconced Olympus, Zeus ruled with absolute sovereignty in the Greek pantheon as well as in the Roman, where he was known as Jupiter or Jove. His ascent was preceded by a ferocious struggle against the older mythical races of Titans and Giants whom he slew, enslaved or imprisoned in the underworld.

Yet Zeus, for all his arrogance and abusiveness, was not without mercy; nor was he so unwise as to forego the opportunity of seeking allies and recruiting capable servants to further his goals and ambitions. Among his confederates was a group of giants previously imprisoned by the Titans in gloomy Tartarus. This group included the Cyclopes, who, in gratitude for regaining their liberty, gifted Zeus with the thunderclap and the lightning bolt, with which he reinforced his total dominance of the heavens and the earth.

This week, we have seen such a story repeated in the modern age. In contrast to the 'mouse that roared' story of Avago buying Broadcom to leverage itself up into the big leagues, we see venerable colossus Intel buying much tinier Altera, allegedly to strengthen and expand its position in the datacenter segment.

http://www.eetimes.com/document.asp?doc_id=1326736

This deal is very, very different from last week's. The Avago purchase is brilliant from a strategic technology perspective and will vault the combined firm into the top tier of semiconductor enterprises globally, but its financial aspects are rather ugly and the ultimate success of the venture is unusually problematic from an organizational viewpoint. By contrast, Intel will not immediately gain either significant revenue or competitive advantage from Altera. 

So why did Intel decide to do this? It clearly wasn't for the money, as the financial contrast between the two firms could hardly be more stark:


2014 saw Intel reach $56B in revenues with strong upward momentum during the course of the year, despite continuing weakness in its mainstay PC segment. Altera has been showing significant downward momentum over the last three quarters, finishing 2014 with $1.9B in total revenue. One can readily discern that, side by side, Altera is almost an accounting error on an Intel balance sheet.

Earlier in the year, Intel presented a $15B offer to Altera. As has been typical for programmable logic leaders Xilinx and Altera throughout their histories, the Altera executive management team saw that the offer was an excellent one and promptly turned their backs on it. Shortly thereafter, Altera's major equity investors raised holy hell and forced the executive crew back to the negotiating table, where the final terms grew to $16.7B in an all-cash deal.

For its part, Intel intends to finance the acquisition thru an undisclosed mix of cash and new debt. At the close of 2014, the company had $20.4B in cash and $13.7B in debt obligations. Even if the purchase were 100% debt financed, it is highly unlikely that Intel's credit rating would be harmed. The company routinely builds multi-$B fabs and spent $10B in new PPE (property, plant and equipment) last year. In purely financial terms, this acquisition is, despite its price tag, something Intel can readily take in stride. 

The Datacenter



Source: techxact.com


Though this be madness, yet there is method in 't - Shakespeare, "Hamlet"

By Wall Street standards, this merger is not an especially attractive one. That perspective would only be valid, however, if the accounting numbers were the only relevant criteria for judging the acquisition. What matters much more is a factor that Wall Street does not have the wherewithal to comprehend - the technology implications of the deal.

Intel's public announcement proclaimed that Altera technology would help Intel in its efforts to further build its presence in the datacenter and IoT segments. I think we can safely ignore the suggestion that Altera's products will be of use to Intel's IoT group and attribute it to hype for the benefit of the always-gullible MSM. It's simply not possible that anyone at Intel could believe the cost and power efficiency of Altera's current selection of FPGAs and CPLDs would be broadly attractive to IoT developers. Nobody is that stupid.

The datacenter, though, is a top strategic priority for Intel and is mentioned prominently in its 2014 annual report. Intel provides a deep selection of system-level hardware and software products for this segment, including offerings such as SDI (Software Defined Infrastructure - a collection of tools and utilities for server and network virtualization), further tools for real-time analytics of Big Data, server boards and even RAID controllers. Central to Intel's efforts in this market is the Xeon processor, supporting the extraordinary bandwidth and data integrity requirements of server farms. In its latest version (E7), the Xeon family is well suited for computing demands that involve multiple simultaneous service requests on large data volumes.

Altera devices are a common sight on boards and blades throughout the networking and storage sectors, serving as bridges, interfaces, glue logic, bug fixes and dedicated offload engines. Despite their unit cost, programmable devices are versatile enough to have been a fixture of communications boards for decades. It's quite normal for an engineering team to include one or more FPGAs and/or CPLDs in their BOM even without any specific applications in mind - the programmable device(s) will almost inevitably come in handy.

Making It Work



Source: wikipedia.org

The Devil is in the details, but so is salvation. - Admiral Hyman G. Rickover

The question is how the two technologies - CPU and FPGA/CPLD - can work synergistically to push Intel towards dominance in servers, datacenters and networking. Intel has tried this before by combining an Altera FPGA die into the same package as an Atom CPU.

http://www.eejournal.com/archives/articles/20101123-stellarton/

Unfortunately, the combination went over like a fart in church, and the hoped-for business growth from application breakthroughs never materialized.

However, this does not mean that the principle of using programmable logic and CPUs in tandem is unsound. Novel ideas often need trial and error experimentation before the perfect combination of elements is found to create a compelling new capability. As an example: Thomas Edison and his team of scientists spent months in a New Jersey laboratory trying to develop a commercially viable light bulb. Together they went thru 300 prototypes before finding the right set of materials that made a bulb glow for three full days. Stated differently, they experienced 299 failures before coming up with a working recipe. 

It appears Intel didn't really understand the nature of the beast when it married the Atom to an Altera FPGA in one package. If we scrutinize the two programmable technologies at a more fine-grained level, we find commonalities and differences which are key to understanding how each is applied and how they can best be made to work together. Such an analysis presents us with the following:

CPU
1. Almost all CPUs nowadays are modified Harvard/RISC architecture devices, with separate data and instruction ports along with dedicated L1 cache memories. 
2. MCUs are still mostly 8b and 16b products, with growth in 32b. Silicon-embedded CPUs are almost all 32b, though 64b is growing in higher end applications (mobile and high-performance computing in particular). The heaviest use by far of 64b data and instructions is in the PC market, of course.
3. Despite the use of L1 caches, L2 caches that are usually 4x-5x larger than L1 and the occasional use of small scratchpad/tightly coupled memory (sometimes referred to as L0), CPUs are primarily logic devices.
4. In terms of both instructions and data, CPUs are word-oriented architectures.

FPGA
1. FPGAs have a separate instruction/configuration port and dedicated RAM to capture a configuration pattern. This pattern is imposed on the interconnect, lookup tables and internal logic to personalize the device. Data is received and results transmitted thru device I/O pins, of which there are many.
2. There has been extensive research into creating FPGAs with dynamic reconfigurability - in effect, transforming an FPGA fabric into one that executes on a new configuration every clock cycle, analogous to the typical per-cycle instruction execution of RISC CPUs. Despite a great deal of applied R&D for military applications and a startup dedicated to the idea (the now-defunct Tabula), this idea has not yet really found a home in the market.
3. FPGA lookup tables, in essence, reproduce the truth table for digital functions of up to 6 inputs, and the desired output is extracted by a configurable mux (a minimal software model of this appears after this list). This is why there are occasional debates about whether FPGAs are in essence memory devices or multiplexer devices.
4. Despite gross similarities in organization (where data and 'instructions' are received and handled separately), FPGAs are fundamentally bit-oriented architectures.
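
To make the lookup-table mechanics concrete, here is a minimal C sketch of my own - not taken from any vendor toolchain - that models a 6-input LUT as a 64-bit truth table, with the input bits acting as the mux select:

#include <stdint.h>
#include <stdio.h>

/* A 6-input LUT is just a 64-entry truth table. Here the table is packed
   into a single 64-bit word (standing in for the configuration RAM), and
   the six input bits select which stored bit comes out - i.e. the mux. */
static int lut6_eval(uint64_t config, unsigned inputs)
{
    return (int)((config >> (inputs & 0x3F)) & 1u);
}

int main(void)
{
    /* Illustrative configuration: a 6-input AND gate. Only table entry 63
       (all inputs high) holds a 1. */
    uint64_t and6 = 1ULL << 63;

    printf("AND6(all ones) = %d\n", lut6_eval(and6, 0x3F)); /* prints 1 */
    printf("AND6(one zero) = %d\n", lut6_eval(and6, 0x3E)); /* prints 0 */
    return 0;
}

Reprogramming the device amounts to rewriting these configuration words and the interconnect pattern, which is exactly why the memory-versus-mux debate arises.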

We can now immediately discern a crippling fault in the Stellarton product and its approach to coupling CPU and FPGA processing. The implementation did not pay serious attention to the system level implications of employing the two devices together.

The use of a PCIe interface to act as a bridge between the two was, to phrase it rather inelegantly, just plain dumb. Instead of treating the FPGA as a peripheral, it would have been much more astute to bring the device to the top level of the bus hierarchy and put it on the processor local bus, where it could function as a bit level coprocessor slaved to a CPU acting as ultimate arbiter and administrator.
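
As an illustration of that local-bus approach, here is a hedged C sketch of what a memory-mapped FPGA offload engine might look like from the CPU side. The base address and register map are hypothetical placeholders of my own, not an actual Intel or Altera interface:

#include <stdint.h>

#define FPGA_BASE    0xF0000000u   /* hypothetical local-bus address of the fabric */
#define REG_CTRL     0x00u         /* write 1 to start the offload job             */
#define REG_STATUS   0x04u         /* bit 0 set by the fabric when the job is done */
#define REG_SRC_ADDR 0x08u         /* physical address of the input buffer         */
#define REG_DST_ADDR 0x0Cu         /* physical address of the result buffer        */
#define REG_LENGTH   0x10u         /* job length in bytes                          */

static inline void reg_write(uint32_t off, uint32_t val)
{
    *(volatile uint32_t *)(uintptr_t)(FPGA_BASE + off) = val;
}

static inline uint32_t reg_read(uint32_t off)
{
    return *(volatile uint32_t *)(uintptr_t)(FPGA_BASE + off);
}

/* The CPU stays the arbiter: it sets up the job, kicks the fabric and polls
   for completion, instead of treating the FPGA as a distant PCIe peripheral
   buried behind a driver stack. */
void fpga_offload(uint32_t src, uint32_t dst, uint32_t len)
{
    reg_write(REG_SRC_ADDR, src);
    reg_write(REG_DST_ADDR, dst);
    reg_write(REG_LENGTH,   len);
    reg_write(REG_CTRL,     1u);              /* start */
    while ((reg_read(REG_STATUS) & 1u) == 0u)
        ;                                     /* busy-wait until done */
}

In a real design the polling loop would give way to an interrupt or a cache-coherent mailbox, but the principle is the same: the fabric sits one load/store away from the CPU rather than several bus bridges away.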

Intel will have to task system architects within its datacenter and networking business units with working alongside their Altera counterparts to devise far more sophisticated joint implementations of their respective technologies. The system-level architectures they need to develop will most likely fall into three broad categories:
1. An FPGA die becomes one of several functions within an MCM or 3D-IC megachip, with a Xeon processor as the centerpiece. The combined SoC/SiP design effort will be a fascinating project, as it will have to iron out issues of bus hierarchy, handling large volumes of data, marrying various die in F2F or F2B (face-to-face or face-to-back) configurations, resolving power distribution and heat dissipation issues and so forth.
2. An x86 CPU embedded within an FPGA fabric.
3. A portion of FPGA fabric employed as an embedded core within an x86-centric SoC.

All three will almost certainly require the development of scaled product families. There will also be new applications to be addressed, such as configuring the FPGA to act as an offload engine for anomalous packet processing, configurable QoS monitoring & error correction, programmable memory interfaces or physical interfaces and further applications as yet undreamed of. 

Formulating design and integration methodologies for all of the above will be a thorny problem to resolve. Consider that an FPGA will run at a fraction of the clock frequency of the CPU, so there will be time-domain considerations for system design. Latencies will be introduced into the system every time the FPGA needs to be reconfigured for a new task - a problem which might provoke Intel into demanding an OTF (on-the-fly) reconfigurable fabric from Altera, which will in turn create other system-level issues, including memory support and power consumption. Moreover, reconfiguration of the FPGA fabric will have to be performed within the context of an extant hardware configuration, whichever of the three categories above it belongs to. All of these will force constraints upon any new desired configuration of the fabric in terms of timing, available resources, clock domain boundaries, utilization and pinout.
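
To put rough numbers on that reconfiguration-latency trade-off, here is a back-of-the-envelope C sketch; every figure in it is an illustrative assumption of mine, not a measurement of any Intel or Altera part:

#include <stdio.h>

int main(void)
{
    /* Assumed costs, in milliseconds. */
    double reconfig_ms  = 100.0;  /* full-fabric reconfiguration time          */
    double cpu_task_ms  = 4.0;    /* running the task on the CPU alone         */
    double fpga_task_ms = 0.5;    /* running the task on the configured fabric */

    double saving_ms = cpu_task_ms - fpga_task_ms;

    /* Invocations needed before the reconfiguration cost is amortized. */
    double breakeven = reconfig_ms / saving_ms;
    printf("Offload pays off after roughly %.0f invocations\n", breakeven);
    return 0;
}

Under assumptions like these, a full reload only pays for itself when the offloaded task is invoked dozens of times between reconfigurations - which is precisely why an on-the-fly reconfigurable fabric becomes so tempting, and why it drags the memory and power issues noted above along with it.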

A significant chunk of these methodology problems has already been successfully dealt with - in particular, those where a configurable logic fabric was embedded in a larger chip design. Such conundrums were wrestled to the ground by LSI Logic and IBM when they had their own embedded FPGA programs. Unfortunately, those initiatives were ahead of their time and were subsequently mothballed. Altera and Intel will have to re-learn those lessons and techniques from scratch.

Competitive Implications




"Morning on the Seine in the Rain", Claude Monet, 1898 (source: blistar.net)

"Poca favilla gran fiamma seconda."
A great flame follows a little spark. - Dante Alighieri, "The Divine Comedy: Paradiso"

We can see from the above discussion that the strategic implications of commingling these two programmable technologies are profound. It is likely that over the long term there will even be changes to programmable logic which will resemble the diversification of processors into DSP, GPU, NPU, VPU and other architectures, with innovations bringing forth a diversity of bit-level programmable arrays serving a wide variety of market niches and specialized applications. These innovations will themselves foster the invention of even newer applications and will spread, overlap & interact, like ripples from raindrops striking still water.

Changes to hardware will also inevitably be accompanied by software innovation, both in tools and system stacks. It is likely that there will be repercussions high and low to firmware, middleware, applications, coding languages, modelling, simulation, verification, ESL/HLS, primitives libraries - in short, everything.

"Consuesse enim deos immortales, quo gravius homines ex commutatione rerum doleant, quos pro scelere eorum ulcisci velint, his secundiores interdum res et diuturniorem impunitatem concedere."
The immortal gods are wont to allow those persons whom they wish to punish for their guilt sometimes a greater prosperity and longer impunity, in order that they may suffer the more severely from a reverse of circumstances. - Julius Caesar, "Commentarii De Bello Gallico"

The dreams of Altera and Intel architects, however, will likely turn into nightmares for their rivals. Avago and Broadcom thought last week that together they would have an unbeatable technology portfolio for the datacenter segment. Think again, fellas. 

The merger poses grave danger to other powerhouses as well, including Qualcomm, which has its own wireline networking and datacenter ambitions. One must then consider the position of Xilinx, which has grown accustomed over the last decade to beating Altera to the punch in FPGA feature and function enhancements in the race to exploit silicon process gains at successively deeper submicron nodes.

Intel has made it clear over the last couple of years that it has painted a bullseye on both Broadcom and Qualcomm in their strongest markets, so their situation - already worrisome - has now become decidedly precarious over the long term. Xilinx is in a much more dire predicament - compared to the SoC powerhouses mentioned above, Xilinx is a one-trick pony that stands to get stampeded. Woe be to these three and anyone else caught in the open when the gathering storm releases its torrential downpour and mighty Jove lets fly his thunderbolt.

Meeting the Threat






Source: ducksters.com

Seek out strategic alliances; they are essential to growth and provide resistance to bigger competition. - Richard Branson

If Xilinx, Broadcom-Avago and Qualcomm hope to have any chance of defending themselves effectively against Intel-Altera in the datacenter and networking segments over the next decade, they will need to move quickly, aggressively and decisively. 

The greatest threat is posed by the x86 itself. Wherever Intel decides to compete, it brings along the potential of providing an unrivaled software ecosystem. Intel has already demonstrated that its CPUs can run just about any non-Windows based application at least reasonably well. The reverse, however, is not true, as the ARM-based products of Xilinx, Avago-Broadcom, Qualcomm and so many other semiconductor firms cannot claim solid Windows OS support.

Just 2-3 years ago there was rampant speculation that ARM was preparing a multicore CPU with hyperthreading support that could challenge the x86 on its PC home turf. The fact that no such product has yet been released by ARM is disquieting.

We all remember from the early and mid-1990s that when the Wintel duopoly threatened to coerce all of High Tech into a creative straitjacket defined by conformance to x86 and Windows, industry players began forming consortia to define capabilities and standards that would ensure their continued independence and viability. Obviously another such consortium needs to be formed, centered on either ARM or AMD and assisting the chosen company in developing a competitive offering.

Countering the programmable logic half of the threat is much more problematic. Xilinx historically resists any proposals that it feels might pose even a remote menace to its packaged parts business, including die sales and licensing embedded versions of its fabric. 

Perhaps Lattice, which is increasingly treating its programmable logic technology as just one piece in a puzzle of IP for its future products, would be more amenable. Microsemi (which bought Actel in 2010) is another possibility. Both companies possess the mature toolchains and specialized application support required for the programmable logic sector. Each also has a sophisticated strategic outlook that is not warped by the religious rapture for the "holy quest" of replacing all silicon with FPGAs that still seems to captivate Xilinx. It then becomes just a matter of a consortium approaching either of the two firms with a proposal much like that for AMD or ARM - to assist in the development of embedded cores and unpackaged chip products that could counter the capabilities of a merged Altera and Intel in the datacenter and networking market segments.

Overthrow






"The Course of Empire: Destruction", Thomas Cole, 1836 (source: explorethomascole.org)

Change always involves a dark night when everything falls apart. Yet if this period of dissolution is used to create new meaning, then chaos ends and new order emerges. - Margaret Wheatley

Things rarely get boring in our industry - and when they do, they don't stay that way for very long, do they? ;-)

Just as mighty Zeus dethroned Cronus and the Titans and overcame Porphyrion and the Giants, this recent alteration in the High Tech landscape promises to be highly disruptive. Along with this wave of change there will come anxiety, but more importantly an abundance of opportunity. We should look forward to it and the possibilities it will bring.

I don't think we're done with surprises for the year, though. I'm inclined to suspect that there will be many more mergers and takeovers before 2015 is done, with the 2016 High Tech landscape looking transformed from what it is today. But that is a topic for a future post. ;-)

5 comments:

  1. This is excellent Pete. However, I believe that this Intel attempt is a desperate attempt to delay the inevitable. Xilinx will end up gaining 70% market share - a near total monopoly. This is going to be an epic failure.

  2. I think the FPGA co-processor approach is more likely than the 3D-IC approach. An FPGA SoC solution having an x86 embedded processor is also not a distant guess.

    Thanks for the insight

  3. Fun and provocative...with many bases covered.
    Funny how bad x86 was in handling memory but how it was bandaided to continue the duopoly dominance. Now linux rules along with free app's window and ARM covers the high volume products so the performance niches are shrinking but luckily not all commoditized yet.

  4. How about this simple explanation: Intel did build a lot of 14 nm capacity, was bad at handling Altera as a customer for its foundry business, and had to buy them to "fill the fab" and generate revenue for coming technology nodes.

    To my understanding the FPGA is also a great fab technology driver because it has such a regular structure (yield analysis/improvement etc.).

    Sounds too simplistic?

    Replies
    1. Paying almost $17B to guarantee some extra capacity is quite a price tag, though. ;-)


Feel free to comment or critique!