Last week, Intel executives took the stage in San Francisco to report to an audience of analysts, investors and media that Moore’s Law is alive and well. What does this have to do with our investment process and the Knowledge Effect? Everything.
Moore’s Law and the semiconductor era fuel a knowledge economy that enables companies to innovate, and therefore allow our Knowledge Effect to persist as a market anomaly for investors. Below is our analysis of what this means for investors.
Long-time readers will be familiar with the Knowledge Effect, which forms the basis for our investment strategy and is based on the work of Professor Baruch Lev of the NYU Stern School of Business. Lev first discovered a link between a firm’s knowledge capital and its subsequent stock performance, ultimately identifying a market inefficiency in which highly innovative companies tend to deliver excess returns in the stock market. We have named this market anomaly the Knowledge Effect (Read our white paper on the Knowledge Effect).
For companies that pursue innovation leadership, operating in the era of a general-purpose technology can exponentially accelerate that innovation. Today we live in the semiconductor era. Entire generations of innovation originate from general-purpose technologies. The early industrial revolution was driven by the steam engine, which provided constant rotary power. That rotary motion replaced human and animal energy, and as it diffused throughout the global economy, output, productivity and wealth creation soared for decades. Electricity catalyzed the second industrial revolution, allowing greater distribution of rotary power, again leading to huge increases in output, productivity and wealth.
The semiconductor provides continuous binary logic and is the basis of our modern knowledge economy. In 1965, Gordon Moore observed that the power of semiconductors doubled roughly every 24 months and predicted, in what became known as “Moore’s Law,” that the trend would continue. In 1971 Intel released the 4004, the first commercially available microprocessor.
For years, investors have worried that Moore’s Law, which has often been interpreted as a physical law with finite boundaries, would reach its end. Because semiconductors are the general-purpose technology of our times, it is critically important that innovation in them continues. The creation of new knowledge is a function of combining the existing stock of knowledge in new ways. The semiconductor provides the continuous binary logic that allows for the acceleration of new discoveries, and thus the testing of new combinations that lead to innovation.
Years ago, Stanford economist Paul Romer offered an estimate on the scope for further possible innovations. “To get some sense of how much scope there is for more such discoveries, we can calculate as follows. The periodic table contains about a hundred different types of atoms, which means that the number of combinations made up of four different elements is about 100 × 99 × 98 × 97 = 94,000,000. A list of numbers like 6, 2, 1, 7 can represent the proportions for using the four elements in a recipe. To keep things simple, assume that the numbers in the list must lie between 1 and 10, that no fractions are allowed, and that the smallest number must always be 1. Then there are about 3,500 different sets of proportions for each choice of four elements, and 3,500 × 94,000,000 (or 330 billion) different recipes in total. If laboratories around the world evaluated 1,000 recipes each day, it would take nearly a million years to go through them all.”
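Romer’s arithmetic can be reproduced in a few lines. The sketch below is an illustration rather than Romer’s own calculation: the variable names are ours, and it assumes ordered choices of four distinct elements (as the product 100 × 99 × 98 × 97 implies), proportion lists whose smallest entry is exactly 1, and a 365-day year.

```python
# Ordered choices of four distinct elements out of about 100.
element_choices = 100 * 99 * 98 * 97

# Proportion lists: four whole numbers from 1 to 10 whose smallest entry is 1.
# Count every list, then remove the lists that never use a 1 (entries 2 to 10).
proportion_sets = 10**4 - 9**4

recipes = element_choices * proportion_sets

# Assumption: 1,000 recipes tested per day worldwide, 365 days per year.
years_to_test_all = recipes / 1_000 / 365

print(f"{element_choices:,} element choices")
print(f"{proportion_sets:,} proportion sets per choice")
print(f"{recipes:,} recipes in total")
print(f"{years_to_test_all:,.0f} years to test them all")
```

The exact counts come out slightly below the rounded figures in the quote (3,439 proportion sets rather than 3,500, and about 324 billion recipes rather than 330 billion), which still leaves a little under 900,000 years of testing.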
Semiconductors are the backbone of the knowledge economy and have been experiencing a nearly 36% compound annual growth in the number of transistors for the last 52 years. To gain some perspective on this, starting with $1, a 36% CAGR for fifty-two years yields $8,790,695.47.
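The compounding figure above is straightforward to check. A minimal sketch, using function names of our own choosing, that also backs out the doubling time implied by a 36% compound annual growth rate:

```python
import math


def compound(start, annual_rate, years):
    """Value of `start` after compounding at `annual_rate` for `years` years."""
    return start * (1 + annual_rate) ** years


def doubling_time(annual_rate):
    """Years needed to double at a constant compound annual growth rate."""
    return math.log(2) / math.log(1 + annual_rate)


final_value = compound(1.0, 0.36, 52)
years_to_double = doubling_time(0.36)

print(f"$1 at 36% for 52 years: ${final_value:,.2f}")
print(f"Implied doubling time: {years_to_double:.2f} years")
```

A 36% annual rate corresponds to a doubling roughly every two and a quarter years, which is close to the two-year cadence usually associated with Moore’s Law.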
The continued rapid compound growth in semiconductors is central to the continued rapid growth in the creation of knowledge and hence central to our Knowledge Leaders Strategy. Our confidence in firms that choose to be Knowledge Leaders by deliberately pursuing an innovation strategy is strengthened when we hear that Moore’s Law is still firmly on track and will remain so at least into the next decade.
Last week Intel Corp, the largest semiconductor manufacturer in the world and the most advanced manufacturing company in the world, held a Technology and Manufacturing Day conference at which executives discussed Moore’s Law extensively, offering insights on how it is fundamentally misunderstood and on how Intel is marching boldly ahead anyway. In this report we curate the most impactful observations from these presentations as they relate to our work investing in the world’s most innovative companies. We hope the reader finds this curation from the conference enlightening and encouraging: enlightening for the techniques Intel is employing to stay firmly on the innovation curve described by Moore’s Law, and encouraging because an American manufacturing company is far and away the worldwide leader in the production of semiconductors, a key comparative advantage for the US in a world that seems to get more chaotic by the day.
Stacy Smith, EVP-Manufacturing, Operations & Sales: Our ability to advance Moore’s Law to make products less expensive and more capable year in and year out is really our core competitive advantage. It’s a huge driver of our business, and in truth it’s a huge driver of the worldwide economy. It enables people to connect, it enables people to entertain themselves, to play and to learn. Moore’s Law helps us solve some of the biggest problems on the planet and it improves people’s lives. So I’m going to kick off today by answering just a few of the questions that we get, things like whether or not Moore’s Law is dead and whether we still have technology leadership. Spoiler alert on those: it isn’t, and we do. I’m going to take a second and define Moore’s Law for you, just starting at the highest level here. Gordon’s observation back in 1965 was that the number of transistors per square millimeter was doubling approximately every two years. That simple observation has become the heartbeat of technology; it means that the capability of devices that use semiconductors doubles every two years. It’s what brings us technology from supercomputers to virtual reality to wearables; it’s really the driver of the industry.
But at its core, Moore’s Law is really an economic law. It says that by advancing semiconductor manufacturing capability at a regular cadence, we can bring down the cost of making semiconductors over time. And since it’s a doubling every two years, the cumulative effect of Moore’s Law has been enormous, it has literally changed the way we live our lives.
To illustrate this in a fun way, we like to look at what would happen if other industries saw innovation at the rate of Moore’s Law, a doubling of capabilities every two years, starting at the same timeframe that Gordon Moore penned his law.
If you apply the same metric to car mileage, cars would be so efficient that you could travel the distance between the Earth and the sun on a single gallon of gas; you could feed the entire planet on a single square kilometer of agricultural land; and space travel would have gotten to the point that you could actually travel at 300 times the speed of light.
All right, so let’s shift gears now, to directly answer one of the key questions that we get from time-to-time, which is whether or not Moore’s Law is dead. By the way, I was in the factories in 1990. And that was about the time that we were progressing lithography to the point that the lines that we were scribing on the wafer, were narrower than the wavelength of light. And that was seen as this instrumental technology, and it wasn’t even a blip on Moore’s Law cadence. The reality is that we are always looking out five years.
We have good insight into how we solve the problems in the next five years. We’re going to hear a lot of that today. We do a lot of path finding for the five years behind that and always when it’s 10 years out, it’s a view that we’re not going to be able to solve the problems that exist, but as we get there, we do solve those problems.
So, the short answer, as you look at this chart, is that Moore’s Law is not dead. At least, it’s not for us. These are our curves, and let me take a second and walk you through the chart behind me, which shows that for Intel, Moore’s Law is alive and well.
Starting on the left-hand side, what that curve shows is how scaling or density is improving over time. This is a log scale, and it shows how much we can shrink the transistors every generation. This is what drives Moore’s Law for us. The transistors shrink, so that we can double the number of transistors at every node.
The fact that those dots for 10-nanometer and 7-nanometer are below the historical line is actually quite significant. You’ll find that when the technologists and TMG draw dots, they don’t do that loosely. And what it says is that we are getting a greater-than-normal density benefit for those processes. That’s an important point and I’m going to come back to that in a minute.
The middle graph shows that the cost per square inch of technology goes up; it just gets more expensive for us to make the wafer. That’s not a surprise, that’s been a constant in our industry and you’ll hear more about that from Mark.
And in the way that these curves come together, in every generation the cost per square millimeter to manufacture a wafer goes up, but we shrink the transistors. And at the end, we get a declining cost per transistor. It all comes together in the chart on the right. Our cost per transistor is coming down at a slightly better rate than the historical trend. That says that for us, Moore’s Law is alive and well.
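The way the three charts combine can be written as a single product: cost per transistor equals cost per unit area times area per transistor. The sketch below is a rough illustration only; the function name is ours and the 25% cost increase and 0.5x area factor are hypothetical placeholders, not Intel data.

```python
def cost_per_transistor_change(cost_per_area_growth, area_per_transistor_scaling):
    """Generation-over-generation multiplier on cost per transistor.

    cost_per_area_growth: new cost per unit area divided by the old one.
    area_per_transistor_scaling: new area per transistor divided by the old one.
    """
    return cost_per_area_growth * area_per_transistor_scaling


# Hypothetical generation: wafer cost per unit area rises 25% while each
# transistor occupies half the area it did on the previous node.
change = cost_per_transistor_change(1.25, 0.5)
print(f"Cost per transistor multiplier: {change:.3f}")
```

Any multiplier below 1 means cost per transistor still falls, even though each wafer is more expensive to make.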
Then, we get to decide what to do with the benefit of a shrinking or a declining cost per transistor. We can either keep the die size constant and add performance capabilities and features, or we can decide to shrink the die and reduce the cost of each product. And the reality for us at Intel is we do both because of the breadth of our products. In some cases, we take that benefit as a smaller die size, so lower cost products to go after new markets.
In other cases, we go after more and more performance and features to enable new usage models. The benefit of Moore’s Law is that we can do all of that, we can improve performance, we can add features, and we can reduce costs. In a little while, I’m going to show you some of our actual cost status, so that you can see the impact that these Moore’s Law curves have on reducing our costs. Over the course of the rest of the morning, you’re going to get a lot of insight into the actual technologies that we’re using to continue to enable us.
The chart on the previous page showed cost per transistor coming down from process to process, right? But we know that the time between processes, the time between nodes, has gotten longer. It’s gotten longer for us, and it’s gotten longer for the rest of the industry. So, given that, you might wonder whether or not we’re getting the same annual benefit to Moore’s Law, right? We’re showing that we’re still coming down node to node, process to process, but as that time gets longer, do we still get the same annual benefit?
The short answer to that question is, yes. And I’ll show a little bit more about this in a second. We’re getting the same year-on-year improvement, even with that longer time as we go from 22-nanometer to 14-nanometer, from 14-nanometer to 10-nanometer. And it goes back to that density curve that I showed you a couple of slides ago. We’re getting a larger-than-normal density benefit as we go to 14-nanometer and as we go to 10-nanometer. In essence, we’re taking bigger steps from generation to generation, which is enabling us to stay on the historical trend. And we’re able to do that because of a strategy called hyperscaling. There are several underlying technologies that enable this, but the really important ones are called Self-Aligned Double Patterning and Self-Aligned Quad Patterning. You’re going to hear about that in Ruth’s and Kaizad’s presentations. It gets a little technical, so I’ll give you a spoiler alert for the non-technical people in the room. I want you to know this is really, really cool, and we’re very lucky that these people work for us.
So, it’s taking us longer to go from node to node. And when Gordon penned Moore’s Law, the time between nodes at that time was more like 18 months, right. Over the course of my career, that became two years. Now, it’s more like three years to go from node to node. But we’re able to take bigger steps in terms of density improvement, and this is what’s enabling us and allowing us to stay on that same improvement rate that we’ve achieved in the past.
In addition, we’re taking advantage of the longer life at each process node to introduce process optimizations. We were really clever in the naming convention for these: we called them 14-nanometer+ and 14-nanometer++, and it will likely be 10-nanometer+ and 10-nanometer++, easy for all of us to remember. And those optimizations allow us to bundle together process technology improvements, architectural innovations and new IP blocks to enable an annual cadence of products hitting the market, and Murthy is going to talk a lot about that in his presentation. The key for us is an annual improvement, so we can deliver to the customers something new and fresh and enable new usage models for them.
This chart shows the impact of the hyperscaling that I was just talking about. So, if you look at the 14-nanometer and the 10-nanometer chips that are to the right, what you see is that the shaded dotted line shows what the die size would be. So this is a feature-constant view of the world, and we know that sometimes we invest Moore’s Law in more features, and I’ll come to that in a minute. But in the feature-constant view of the world, the shaded area shows what those die sizes would have been if we were just on the traditional Moore’s Law scaling, which by the way is really good scaling.
But with hyperscaling, we’re getting that bigger benefit, and you can see the impact on this chart. By the time you get to 10-nanometer, because of the cumulative effect of the steps that we’re taking at 14-nanometer and at 10-nanometer, we get about half of the die size at 10-nanometer that we would’ve gotten with normal scaling. That’s again assuming that we apply all of the benefit to die size.
As a recently reformed CFO, I do feel compelled to show you a couple of actual cost curves, and I think this really illustrates the heart of how Moore’s Law works in practice at Intel. So, these are the actual cost curves for the 22-nanometer and 14-nanometer product families. The first product on a process, as you would expect, is a relatively expensive product, right. You’re coming on to a process when the factory is ramping, and when yields are a little bit low. Then the second product typically comes in at the sweet spot of cost, and then carries on to hit the optimal cost.
On the left of this chart, you see the 22-nanometer product family, so that’s Ivy Bridge and Haswell. As you might recall, Haswell on 22-nanometer was one of our company’s lowest cost products ever and one of our highest yielding process technologies ever. On the right hand side, you see our 14-nanometer product family, for the first time. They follow a similar trend, as we go from Broadwell, which was the first product on 14-nanometer, to Skylake and then to Kaby Lake. You see that by the time Skylake gets to the point that it’s ramped, it gets to a similar cost to Haswell at the same time in its life. You see something really interesting with Kaby Lake. If you take out your ruler, and I know some of the financial types in the back will, you know who you are, you’re going to see that Kaby Lake at launch is actually a little bit lower than Haswell is.
So we’re getting a great cost at 14-nanometer, as we get to those later waves of product. That actually shouldn’t be a surprise. I think you’ve seen this kind of information from Intel in the past, but there is something that’s really important here. Kaby Lake has 800 million more transistors than does Haswell. It scores 30% higher on common metrics like 3DMark. Kaby Lake enables entirely new usage models, like immersive virtual reality, and those kinds of things. It’s an entirely different kind of product, entirely different class of product frankly.
So the truth of Moore’s Law resides in those cost curves. We can bring new capabilities to the market at a similar cost to the generation that came before. The prior graph showed actual product cost, but that can be impacted by things other than Moore’s Law, so if you think about it, as we progress our product line from two cores to four cores, the die size can increase, and our unit cost presumably goes up in that model, so it could be impacted by mix. Now, presumably, we get paid for that but still that cost trend is impacted by that mix. This graph is an all-in cost for our PC CPU product, so it isolates that segment of the business, and it shows our actual curves of what we’ve achieved in terms of cost per transistor since 2004, so this is an all-in actual cost for the company for the PC client business.
It’s cost per million transistors, if you think about it like that, so it’s kind of the corollary to Moore’s Law, where it’s showing cost per million transistors as opposed to density. This is also a log scale, and this is the first time that we’ve shown this data publicly. This chart is the realized benefit of Moore’s Law to our cost structure over time, so it’s cost per million transistors, so it normalizes for mix, it’s the entirety of our PC product business.
You see something interesting here, because we put no transistors on here, even with that longer time between nodes, we’re staying on the same realized cost per million transistor curve. This is the benefit of hyperscaling that I was just talking about, and the intra-node optimizations, and while I’m not showing 10-nanometer on this chart, I can say that based on everything that we know about 10-nanometer, we expect that this trend will continue out through the 10-nanometer generation.
Remember I told you earlier, there are two benefits to Moore’s Law. One is that the cost comes down over time, which I just showed you, but the other is that we can improve the capabilities of our products year in and year out. We can create more capable products for data centers, for machine learning. We can create more capable products in the client space, we can go after new markets. And you can see in our financials, over the last several years the impact of having this Moore’s Law leadership has impacted our financials, not just in the cost but also in the competitiveness of our product line, and the best way to view that is in viewing the gross margin of the company and how it has shifted up. It’s a combination of the cost that we’ve been achieving and the capability of our products over time. So, we continue to benefit from Moore’s Law.
End of Stacy J. Smith presentation transcript. Read Stacy J. Smith’s editorial “Moore’s Law: Setting the Record Straight.”
Mark Bohr, Intel Senior Fellow, Technology and Manufacturing Group Director, Process Architecture and Integration: Good morning. Today, I get to talk to you about my favorite topic, Moore’s Law leadership.
All right. Let me remind you what Moore’s Law is: it’s not a law of physics, it’s a law of economics. By scaling transistors you can deliver lower cost per transistor or you can take those lower cost transistors and add more transistors to a product to provide more functionality and higher performance. So again, Moore’s Law can deliver either more cost savings or more performance or some combination of the two.
Historically microprocessor die area scaling has been around 0.62x per generation, more than the transistor density improvements that I showed earlier because a microprocessor is a mixture of several different types of circuits. There are of course the logic circuits which tend to scale the best, but there are also IO circuits and SRAM circuits and maybe some of the analog circuits that typically don’t scale as well as logic circuits. So, that’s where the 0.62x number has come from.
And note, if we had kept following this normal scaling path, and had started with a 100 square millimeter die on the far left, our 10-nanometer die size would be just a little bit less than 15 square millimeters. But with hyperscaling used on 14-nanometer and 10-nanometer, we in reality achieved much better than that 0.62x area scaling factor for a feature-neutral die, so we’re not adding transistors in this case. And the 10-nanometer die would be around 7.6 square millimeters, or about half of what it would have been if we had stuck with the normal scaling pattern.
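The die-size figures in this passage can be reproduced with a short calculation. This is a sketch under stated assumptions, not Intel’s model: the figure of four node transitions between the 100 square millimeter starting die and 10-nanometer is inferred from the arithmetic (100 × 0.62^4 is just under 15), and the constant names are ours.

```python
NORMAL_SCALING = 0.62        # die area factor per generation, from the talk
START_AREA_MM2 = 100.0       # starting die size, from the talk
GENERATIONS_TO_10NM = 4      # assumption: four node transitions to 10-nanometer
HYPERSCALED_10NM_MM2 = 7.6   # 10-nanometer die size with hyperscaling, from the talk

# Die size at 10-nanometer if every generation had scaled by 0.62x.
normal_10nm_mm2 = START_AREA_MM2 * NORMAL_SCALING ** GENERATIONS_TO_10NM

# How the hyperscaled die compares with the normal-scaling die.
ratio_to_normal = HYPERSCALED_10NM_MM2 / normal_10nm_mm2

# Assumption: the first two transitions followed 0.62x and only the last two
# (14-nanometer and 10-nanometer) were hyperscaled. The implied average
# per-generation factor for those two nodes is then a geometric mean.
area_before_hyperscaling = START_AREA_MM2 * NORMAL_SCALING ** 2
implied_hyperscale_factor = (HYPERSCALED_10NM_MM2 / area_before_hyperscaling) ** 0.5

print(f"Normal-scaling 10 nm die: {normal_10nm_mm2:.1f} mm^2")
print(f"Hyperscaled die relative to normal: {ratio_to_normal:.2f}")
print(f"Implied area factor per hyperscaled generation: {implied_hyperscale_factor:.2f}")
```

Under these assumptions the normal path lands at about 14.8 square millimeters, the hyperscaled die is roughly half that, and the two hyperscaled generations average an area factor of about 0.44x each.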
So what does this mean for cost per transistor? I show the three graphs here, starting with the area per transistor on the left. If we had followed the normal 0.62x trend, we would be delivering those open yellow circles at 14-nanometer, 10-nanometer and 7-nanometer. Those are hypothetical points; that’s not what we’re doing.
In the middle graph, there still would be a wafer cost increase to deliver that scaling, maybe a little bit less than with hyperscaling, but there still would’ve been a wafer cost or cost per area increase on those generations. And the result in terms of cost per transistor is the graph on the right: we would be deviating from the historic reduction rate, delivering not-so-good cost per transistor, maybe better than the previous generation, but a curve that would be flattening out.
Here is one more hypothetical scenario to describe to you. So again starting on the left, I’m assuming a normal area scaling of 0.62x per generation, so that’s a hypothetical for 14-nanometer, 10-nanometer and 7-nanometer. And then in the middle graph we could have introduced a wafer size conversion from 300 millimeters to 450 millimeters, and of course, wafer size conversions deliver a lower cost per area. So, that would really benefit 14-nanometer, 10-nanometer and 7-nanometer: it would be a one-time reduction in cost per area on 14-nanometer, but that benefit still applies as you scale forward to 10-nanometer and 7-nanometer. And then the result of that is the graph on the right: a little bit better CPT improvement than on the original slides I showed, and close to the historic trend rate for reducing cost per transistor.
But here is what we actually did on 14-nanometer and now on 10-nanometer. The use of hyperscaling on 14-nanometer and 10-nanometer, on the left-hand graph, shows a much better than normal area scaling. In the middle, wafer cost is still going up, maybe at a slightly faster rate, but the results in terms of the cost per transistor, shown on the right-hand graph, are below the trend line for 14-nanometer and certainly for 10-nanometer, and even 7-nanometer will come in below that long-term trend line.
So I’ve talked quite a bit so far about area scaling, density improvements and cost per transistor, but another important benefit of scaling and of Moore’s Law is improved transistor performance and lower transistor power. The graph on the left shows Intel’s trend for delivering improved transistor performance over different generations. The graph on the right shows our trend for reducing the dynamic capacitance of those transistors at the same time, and of course, capacitance affects active power, so you want a lower capacitance for lower active power.
The quick message here is that Moore’s Law, at least at Intel, continues to deliver higher performance and lower power. But we’re not standing still. We’re developing performance enhancements on our technologies, so on 14-nanometer, we first developed 14-nanometer+ that delivers improved performance over the original version, and now also 14-nanometer++ with an even bigger performance gain over the original technology. That’s the performance enhancement as shown on the left-hand graph; notice on the right-hand graph that the dynamic capacitance is unchanged. So the point there is that we are delivering increased performance without increasing dynamic capacitance and without increasing active power. Another point I want to make is that these changes and these improvements or enhancements in performance can be traded off for power. So if you have a transistor that is inherently faster, then you can choose to operate that circuit at a lower voltage, and you give up some of that performance but you gain in terms of a much lower active power.
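The trade-off between voltage and active power described here follows from the standard first-order model for switching power, in which power is proportional to capacitance times voltage squared times frequency. The sketch below uses that textbook relation with normalized, hypothetical values; it is not taken from Intel’s slides, and the function and parameter names are ours.

```python
def dynamic_power(capacitance, voltage, frequency, activity=1.0):
    """First-order switching power: activity * C * V^2 * f (arbitrary units)."""
    return activity * capacitance * voltage ** 2 * frequency


# Hypothetical normalized operating point.
baseline = dynamic_power(capacitance=1.0, voltage=1.0, frequency=1.0)

# Same capacitance and frequency, supply voltage lowered by 10%.
lower_voltage = dynamic_power(capacitance=1.0, voltage=0.9, frequency=1.0)

power_ratio = lower_voltage / baseline
print(f"Active power at 90% voltage: {power_ratio:.0%} of baseline")
```

Because voltage enters as a square, a modest voltage reduction yields a disproportionately large power saving, which is why a faster transistor run at lower voltage can give back performance in exchange for much lower active power.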
And now on the coming 10-nanometer technology, we also have in place plans for a 10-nanometer+ and 10-nanometer++ technology, delivering in each case improved performance while not sacrificing dynamic capacitance.
Okay. This is my last slide, so let me wrap up, and these are the same key messages I started with. Intel leads the industry in introducing innovations that enable scaling. Hyperscaling on Intel 14-nanometer and 10-nanometer provides better than normal scaling, while continuing to reduce cost per transistor. Intel’s 14-nanometer technology has about a three-year lead over other 10-nanometer technologies with similar logic transistor density. And our 10-nanometer technology provides industry-leading transistor density using a quantitative density metric. Enhanced versions of 14-nanometer and 10-nanometer provide improved performance and extend the life of these technologies. And again, the key message is Moore’s Law is alive and well at Intel.
End of Mark Bohr presentation.
Kaizad Mistry, Intel Corporate Vice President, Technology and Manufacturing Group, Co-Director Logic Technology Department: If you look at that in historical context one more time, traditional Moore’s Law scaling is around 0.49x, and on our 14-nanometer technology, and then again on our 10-nanometer technology, we use these unique hyperscaling innovations to provide better-than-normal 0.37x logic area scaling. And if you translate that to the metric that Mark described earlier today, which is the transistor density, which continues to grow, we provide a 2.7x transistor density improvement over our already leading 14-nanometer technology, really an unprecedented transistor density improvement in our 10-nanometer technology.
And again, although it took us longer than two years to develop this 10-nanometer technology, we took a much bigger step. So, if you look at that in historical perspective, you can see that our 10-nanometer technology, as was the case with our 14-nanometer technology, continues to keep Intel on the Moore’s Law pace of roughly doubling transistor density every two years. So, we continue to maintain the rate of Moore’s Law density scaling.
Now let me switch gears and talk about the enhanced versions of our 10-nanometer technology. So first a little introduction. What I’ve said to you so far is that we are taking bigger steps than the traditional Moore’s Law pace. But we’re doing it at a somewhat longer cadence than the traditional two-year cadence. The net result in terms of density scaling is that we are on the traditional Moore’s Law pace. Hyperscaling is really a technique that allows Intel to continue the economic benefits of Moore’s Law. It features a logic transistor density increase that is significantly more than the traditional 2x Moore’s Law pace, albeit at a longer than two-year cadence. But it does afford the same rate of transistor density increase per year as traditional Moore’s Law scaling. And the same rate of cost per transistor improvement as traditional Moore’s Law scaling. This is really, really important. The economic benefits of Moore’s Law are intact.
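The claim that a bigger step at a longer cadence gives about the same yearly rate can be checked by annualizing both paths. The sketch below uses the 2x and 2.7x density figures quoted in the talks; the three-year cadence is an assumption taken from Stacy Smith’s remark that node transitions now take “more like three years,” and the function name is ours.

```python
def annualized_gain(step_improvement, years_between_nodes):
    """Equivalent per-year density multiplier for one node transition."""
    return step_improvement ** (1 / years_between_nodes)


# Traditional pace: 2x density every two years.
traditional = annualized_gain(2.0, 2)

# Hyperscaling: 2.7x density, assuming a three-year cadence.
hyperscaled = annualized_gain(2.7, 3)

# Cross-check: 0.37x logic area scaling corresponds to about 2.7x density.
density_from_area = 1 / 0.37

print(f"Traditional: {traditional - 1:.1%} per year")
print(f"Hyperscaled: {hyperscaled - 1:.1%} per year")
print(f"Density gain implied by 0.37x area: {density_from_area:.2f}x")
```

Under the three-year assumption the two paths annualize to roughly 41% and 39% per year, which is the sense in which the yearly rate is described as the same.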
So let me come back to this, why do we choose to hyperscale? It’s because the 193-nanometer immersion lithography with a single pass can only get you down to an 80-nanometer pitch. If you want to scale below 80-nanometers, you have to have more than one pass through the lithography and etch tools. When you do that, you add to the cost of fabricating the wafer. If you don’t scale faster, you don’t make up for that cost. And so, to maintain the economic benefits of Moore’s Law, we have to invent these new patterning schemes to extract the full cost per transistor benefit of those multiple passes through the lithography tool that you are going to pay for regardless. So hyperscaling really would not be possible without these innovations. These include self-aligned dual-patterning at the 14-nanometer node and self-aligned quad-patterning at the 10-nanometer node, along with some of the other innovations that I have disclosed for the first time today: contact over active gate, and single-dummy poly with an advanced FinFET transistor.
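To make the patterning argument concrete, the sketch below applies idealized pitch division to the 80-nanometer single-pass limit quoted above. The assumption that double patterning halves the pitch and quad patterning quarters it is the usual textbook idealization; the names are ours, and the pitches actually used on a given process are a design choice that may be looser than these limits.

```python
SINGLE_PASS_PITCH_NM = 80.0  # from the talk: limit of one 193 nm immersion pass

# Assumption: idealized pitch division for each patterning scheme.
PITCH_DIVISORS = {
    "single pass": 1,
    "self-aligned double patterning": 2,
    "self-aligned quad patterning": 4,
}


def min_pitch_nm(scheme):
    """Idealized minimum pitch in nanometers for a patterning scheme."""
    return SINGLE_PASS_PITCH_NM / PITCH_DIVISORS[scheme]


for scheme in PITCH_DIVISORS:
    print(f"{scheme}: {min_pitch_nm(scheme):.0f} nm")
```

Each extra division adds process steps and therefore wafer cost, which is why the talk stresses extracting the full density benefit from passes that have to be paid for regardless.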
So, these hyperscaling innovations really allow Intel to continue the economic benefits of Moore’s Law. And this results in this slide, which you’ve seen now multiple times: Moore’s Law is alive and well at Intel. So, let me conclude. Intel’s 10-nanometer process technology has the world’s tightest transistor and metal pitches, the tightest pitches in the industry, along with unique hyperscaling features, contact over active gate and single dummy poly, that really provide us leadership density. And returning to the theme of hyperscaling, you know Moore’s Law has been an economic reality for us for the last 40 or 50 years of this industry and Moore’s Law is alive and well. Hyperscaling techniques allow us to extract the full value of the multi-pass patterning schemes and allow Intel to continue the economic benefits of Moore’s Law, which has really revolutionized our whole world.
End of Kaizad Mistry presentation.
Article by Steven Vannelli, CFA – Knowledge Leaders Capital