Forget Moore’s Law. Algorithms drive technology forward
Everybody loves to talk about Moore’s law—which states that transistor counts on a microprocessor double about every two years—when talking about speed and productivity improvements in computing. But what about the gains that come from new algorithms? Those gains may exceed the ones from hardware improvements.
Algorithmic improvements make more efficient use of existing resources and allow computers to do a task faster, cheaper, or both. Think of how much easier the smaller MP3 format made music storage and transfer; that compression came from an algorithm. The study cites Google’s PageRank, Uber’s routing algorithm, and Netflix’s video compression as examples.
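To make “faster, cheaper, or both” concrete, here is a small, illustrative Python sketch (ours, not drawn from the study): the same sorted data is searched first with a linear scan and then with binary search. The speedup comes entirely from the algorithm; the hardware is unchanged.

```python
# Illustrative only: same machine, same data, two algorithms.
# Binary search does O(log n) work per lookup instead of the linear scan's O(n).
import bisect
import random
import time

n = 1_000_000
data = list(range(n))                       # already sorted
targets = [random.randrange(n) for _ in range(200)]

start = time.perf_counter()
for t in targets:
    _ = t in data                           # linear scan: O(n) per lookup
linear_time = time.perf_counter() - start

start = time.perf_counter()
for t in targets:
    i = bisect.bisect_left(data, t)         # binary search: O(log n) per lookup
    _ = i < n and data[i] == t
binary_time = time.perf_counter() - start

print(f"linear scan:   {linear_time:.3f}s")
print(f"binary search: {binary_time:.6f}s")
```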
Many of these algorithms come from publicly available sources—research papers and other publications. This “algorithmic commons”—like the digital commons of open source software and publicly available information—helps all programmers produce better software.
But where do these algorithms come from and who provides the research to develop them? New research from MIT sheds light on who’s building our algorithmic commons and why. We dug into the research and then asked the study’s principal researcher, Neil Thompson, research scientist at MIT’s Computer Science and Artificial Intelligence Lab (CSAIL), about what this commons means to the field of computer science.
Algorithms: Made in the USA
The research looked at where the researcher who produced each algorithm was born and where they were working when they produced it. The US topped both lists.
38% of all algorithmic improvements came from researchers born in the United States, but a whopping 64% of improvements were made by researchers working in the United States. That means that, while the US education system has nurtured lots of computer scientists, it’s even better at attracting top talent to research institutions. Outside the US, richer countries generally made more contributions to the commons.
The researchers note that there is room for change and improvement here. A nation’s contribution to the algorithmic commons is not related to population, but to per capita GDP. “This suggests that algorithm development may exhibit the ‘lost Einsteins’ problem seen in the creation of patentable inventions (Bell, Chetty, Jaravel, Petkova, & Van Reenen, 2019), wherein those with natural talent never get to benefit the world because of a lack of opportunity.”
A positive feedback loop?
The vast majority of algorithmic improvements came from public and non-profit institutions like colleges and governments. The top contributors were those universities that ranked highest in computer science: Stanford, MIT, UC Berkeley, and Carnegie Mellon.
Private institutions made contributions, too, including IBM, Bell Labs, and the RAND Corporation. But one wonders: why would a for-profit institution give away its competitive advantage so that competitors could have access to its research?
Q&A with Neil Thompson
Q: Of the algorithms that you looked at, how many were proprietary and how many were published and freely available?
A: To be included in our data, algorithms needed to be made publicly known at some point so they could make their way into textbooks and the academic literature. We don’t know how quickly this happened in cases, like IBM’s, where the algorithms were initially developed privately.
Q: For algorithms developed at public institutions, who funds them? Is there any expectation that the funder will get the benefits of the algorithm?
A: This is something that we are looking into, but we don’t yet know the answer. Since many algorithms are developed by mathematicians and computer science theorists, we would expect them to be funded through a mix of overall university funding that provides academics the time to work on these problems, and organizations like the National Science Foundation.
Q: For proprietary algorithms, do you think there’s a benefit to sharing them publicly for the creators?
A: To answer this question, we can draw on the analogy of open source software, where firms also disclose information that could otherwise have been kept proprietary. In open source, one of the big benefits is having the community coalesce around your topic and contribute to developing it. So, for algorithms, I would expect the biggest benefits of public disclosure to come when a firm wants to recruit outside algorithm designers to further improve upon the firm’s work.
It’s also quite likely that firms would get a reputational boost from making important algorithms public, which might help with recruiting and retaining talented computer scientists.
Q: Why do you think people and firms contribute to the algorithmic commons?
A: There is clearly a mix of motivations for contributing to the algorithm commons. One motivation is pragmatic: there is a problem that the discoverer needs a better algorithm to solve. This is probably the most important motivation for many companies, but it is also common amongst user-innovators across many fields. Another motivation is intrinsic: there is something fun about teasing apart a tough question and finding a clever solution.
Q: When a group of related firms share the same algorithms, what determines which software is better?
A: Having better algorithms is like a chef having better ingredients. It isn’t a guarantee that you’ll get a better meal, but if the chef uses them right, you will. The same is true for algorithms. Give a software designer better algorithms and they have the potential to build something better than the software designer that doesn’t.
Q: What, in your opinion, are the most important algorithms for industry productivity?
A: This is an excellent question, but one we don’t yet know the answer to. Check back with me in a year or so – I hope to have an answer then!
Tags: algorithms, mit, moore's law
26 Comments
> Everybody loves to talk about Moore’s law—which states that transistor counts on a microprocessor double about every two years—when talking about speed and productivity improvements in computing.
It’s really remarkable to write such an article and not state what people really love to talk about these days: Moore’s law is dead!
Also, nothing about this article is new. People realized this over a decade ago.
> Moore’s law is dead!
Moore’s law refers to the doubling of transistor counts, which isn’t dead. See chart here: https://www.electronicsweekly.com/news/business/transistor-counts-still-doubling-every-2-years-says-ic-insights-2020-03
> Also, nothing about this article is new. People realized this over a decade ago.
Neil Thompson’s research got published specifically because it *does* contribute to our understanding of this area. Of course people have previously studied it – and for a lot longer than a decade. I think it’s a useful blog post because improvements in algorithmic efficiency seem to be less talked about. Here’s another good post from OpenAI on this topic: https://openai.com/blog/ai-and-efficiency/
I just want to plainly say here, with no hostility at all, that your comment doesn’t really contribute positively to the discussion. You should try not to make comments on articles unless you have something interesting to say.
The law is not dead — computers really do continue to become “beefier” at about the same pace. Individual CPUs may not increase in *speed*, but the core-counts do go up. And so do the on-die caches. And RAM-sizes — both on the motherboards and on the peripheral cards, such as graphics, network-interfaces, and storage-controllers.
In 1996 I struggled to achieve a sustained I/O rate of 5MB/s — using a striped array of 6 SCSI drives across two EISA controllers. Today I can see 120MB/s on a single (magnetic) drive. 10Gb Ethernet is commonplace, and — for only about $70 — I connected my two main computers with InfiniBand cards (used, off of eBay) for much higher MTUs and 5 times lower latency.
Hardware has improved *tremendously*.
That our “computing experience” did *not* increase 1024-fold over the last 20 years is the fault of the programmers and the admins, who’ve taken the lion’s share of improvements to make *their* jobs easier.
“C is hard, let’s use Python!..” — say the programmers, who are unable to distinguish a millisecond from a microsecond.
“This package wants JRE 1.X.Y and this one — 1.X.Z, so I’m going to install them into **separate virtual machines**” — say the admins, screwing up any possibility to *share* the common files, which using the same JRE would’ve allowed… **Because hardware is cheap**
And so on
A reliable source recently told me that Moore’s Law is not dead at all, and in fact recent advancements in ultraviolet lithography are going to push it forward again over the next few years.
Btw: It was Jim Keller (https://www.youtube.com/watch?v=c01BlUDIlK4)
There is a long-established law regarding algorithms improving performance, called [Proebsting’s law][1]: compiler optimization advances double computing power every 18 years. Algorithmic improvements’ contribution is 1/15 of our computing abilities.
Also, this article is incomplete without a single reference to the “Khazzoom–Brookes postulate” or the “tragedy of the commons”.
[1] https://proebsting.cs.arizona.edu/law.html
I had a similar reaction. I thought that we were fast approaching, or were already at, the point where the main driver for Moore’s law is constrained by fundamental physics. I’m not a hardware guy so I may be stating this wrong, but my understanding was that Moore’s law was primarily driven by the fact that VLSI advances kept making it possible to squeeze more transistors into the same or smaller sized chip. However, you can only pack transistors so tightly before quantum effects or some other law of physics stops you from packing them any tighter. My understanding was that we are essentially there, but improvements in other technologies such as flash memory (and better algorithms) have made it seem like Moore’s law is still working when really it isn’t.
Algorithms are cool. But what about languages?
Alan Kay has an interesting interpretation of Moore’s Law(s): he interprets them as doubling the level of abstraction every two years. And he is arguing that we should also have a Moore’s Law for Software: the level of abstraction should increase exponentially. Which means the amount of code needed to accomplish a certain goal should *decrease* exponentially.
As an example, he gives the Smalltalk system: it has about 60000 lines of code all inclusive, from the microcode in the CPU, the boot loader, device drivers, kernel, VM, interpreter, compiler, garbage collector, graphics libraries, windowing system, desktop, editor, debugger, IDE, browser, office suite, etc. 60000 lines of code in total.
If we *did* have a Moore’s Law of Software, a modern desktop system should be significantly smaller than that, because we have much more powerful languages now than we had back in the 1970s. However, it turns out that the opposite is true: Windows is about 50 *million* lines of code. The Linux kernel *alone* is about 10 million lines of code. 90% of that is drivers, and 90% of what is left is architecture code. That means the very basic core of Linux, which does only memory management, task scheduling, and some namespacing, and provides the framework for drivers, is still roughly 100,000 lines of code, more than an entire desktop system was in the 1970s.
Alan Kay believes this shouldn’t be the case. His study group was working on a new system whose goal was to replicate what Smalltalk did, but in less than 20000 lines of code. This number is not arbitrary: Alan Kay believes that the best documentation of how a computing system works is the source code. However, in order for this to work, the source code of the system must be understandable by a single person whose main job is not programming. The instruction manual for a car, or a complex tool, or a college textbook for a single class about a single subject for a single semester is about 20000 lines. So, this seems to be what is acceptable as documentation for a machine.
Unfortunately, the project ran out of funding before they were finished, but I believe they managed to get down to about 30000 lines. And they did this mostly by designing better languages.
For example, they designed a language which can do exactly one thing: describe parsers for network packets. They designed another language which can do exactly one thing: describe state machines for networking protocols. Using those two languages, they were able to implement TCP in about 30 lines of code, as opposed to the several hundred that are used even in the tiniest, simplest, most optimized network stacks for embedded microcontrollers.
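As a rough illustration of the idea (not the actual languages described above), here is a hedged Python sketch in which a packet layout is described once as data and a small generic routine interprets it; the field names and layout are simplified and hypothetical.

```python
# Hedged sketch of the idea above, not the real thing: the packet layout is
# *described* declaratively, and one generic interpreter does all the parsing.
import struct

# Hypothetical, simplified header description: (field name, struct format code).
# A real TCP header has more fields, options, and variable-length parts.
HEADER_SPEC = [
    ("src_port", "H"),   # 16-bit unsigned
    ("dst_port", "H"),
    ("seq",      "I"),   # 32-bit unsigned
    ("ack",      "I"),
    ("flags",    "B"),   # 8-bit unsigned
]

def parse(spec, data):
    """Generic parser: walks the declarative spec; no per-protocol code."""
    fmt = "!" + "".join(code for _, code in spec)        # network byte order
    values = struct.unpack_from(fmt, data)
    return dict(zip((name for name, _ in spec), values))

packet = struct.pack("!HHIIB", 443, 52100, 1, 0, 0x02)   # example bytes
print(parse(HEADER_SPEC, packet))
```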
Now, you might say: well, but how many lines did it take to implement those two languages? Aren’t they basically cheating?
Here’s the trick: they designed another language which can do exactly one thing: describe interpreters for languages. The implementations of each of those two languages mentioned above are only about 20 lines of code each, if I remember correctly.
Now, you might say: well, but how many lines did it take to implement *that* language? Aren’t they still cheating?
Here comes the really cool trick: that language is implemented in itself, in only about 30 lines.
And they did this in other areas as well: they invented a new text layout algorithm, and they invented a new language for describing text layout algorithms, then implemented that algorithm in that language. The result is almost as good as TeX, but only a fraction of the code. I found a JavaScript (itself already a somewhat expressive language) implementation of the TeX line breaking algorithm in about 560 SLOC. The implementation from Kay’s group was about 10 lines, if I remember correctly. (Oh, and I ignored the fact that the JS code I found depends on jQuery and one other library.)
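For contrast, here is a hedged sketch of the simplest line-breaking approach, first-fit (greedy) breaking. It is deliberately not the Knuth–Plass optimal-fit algorithm that TeX (and, per the comment, the tiny reimplementation) uses; it is only a baseline showing what the naive end of that spectrum looks like.

```python
# First-fit (greedy) line breaking: the naive baseline, not Knuth-Plass.
# It ignores hyphenation, stretchable spaces, and the global badness
# minimization that make TeX's algorithm interesting.
def greedy_break(words, width):
    lines, current, length = [], [], 0
    for w in words:
        extra = len(w) + (1 if current else 0)   # +1 for the separating space
        if current and length + extra > width:
            lines.append(" ".join(current))
            current, length = [w], len(w)
        else:
            current.append(w)
            length += extra
    if current:
        lines.append(" ".join(current))
    return lines

text = "Give a software designer better algorithms and they can build something better".split()
print("\n".join(greedy_break(text, 28)))
```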
So, I would argue that *languages* are just as important as *algorithms*. In order to tackle the problems of the future we need all three: better languages in which to express better algorithms to be run on better hardware.
Do you have links to work by Alan Kay where he talks about what you’ve mentioned in your comment? It’s a fantastic comment, and I’d like to learn more. I’ve done some searching on Google but it’s showing a lot of stuff and nothing specific. Cheers mate.
You might try this first:
http://www.vpri.org/pdf/tr2012001_steps.pdf
There are some other interesting papers on that site too.
Your reply is so cool. Why not make it into your own blog post?
Agree with comment above, fascinating post, please blog about it!
Alan Kay is someone I really respect. When I worked at Accenture we paid him to come and talk to our R&D group for a day. At first I thought it was a complete waste of time and money, but then he started talking and I realized there are some guys who are worth paying just for a few hours of their time. This was back when most consultants were still programming in COBOL and “portable” PCs were what my friend called “luggables” (anyone remember the original Compaqs?).
Alan talked about the Dynabook (which was essentially what is now the iPad), and how computer-generated graphics would eventually dominate movies (this was way before Pixar or any serious use of CGI in movies). He also showed us a video of the “Mother of All Demos” from another visionary, Doug Engelbart. Sorry, I’m reminiscing; I would also like to see the reference to see what he actually said.
I’ve always been skeptical of domain-specific languages. We went through this in the early years of OOP. Many people were advocating domain-specific languages, but others argued that having one good general language and lots of reusable libraries for different domains was the way to go, and I think at least so far that has been mostly true. There are always specific domains, such as VLSI design or writing security protocols that are provably correct, where specialized languages are worth the effort, but for the most part I think the general approach has proven to be the best. The stories about software bloat are IMO kind of irrelevant. It is an unfortunate fact of life that companies love to pack new features into software that often aren’t needed, but it happens, and you can’t compare the size of Windows now with the size it was in the past.
I’d question the assumption that “lines of code =~ complexity”. Software is written in order to be read. If someone needs to learn 15 languages in order to understand the full system, I’d argue it is not a system that can scale to a real team. Think about bringing on a new team member, or having a relatively new team member jump into one part of a system: they’d need to learn an entirely new language before they could even touch what they’re looking at.
“Alan Kay has an interesting interpretation of Moore’s Law(s): he interprets them as doubling the level of abstraction every two years. And he is arguing that we should also have a Moore’s Law for Software: the level of abstraction should increase exponentially. Which means the amount of code needed to accomplish a certain goal should *decrease* exponentially.”
I respect Alan Kay, and I’m sure he’s put more thought into this than I have, but I just don’t see how this can be even remotely true. I’m still programming in C, the language I started with nearly 30 years ago. Other languages are slightly more advanced, but not exponentially. Looking at the Tiobe index, the top 10 most popular languages are all at least 25 years old (even Python).
Maybe Alan Kay is putting that out there as a goal, not as a description of what currently is. But even as a goal, it is unattainable. Projects last for years; you want to build on previous work rather than reinvent the wheel. If every project was written in a different language, maintenance would be a nightmare: good luck finding a programmer who remembers the intricacies of the language for a simple bug fix or new feature. There’s a reason why languages last as long as they do.
Moore’s law isn’t dead: FLOPS/$ has increased 59 times over the last 10 years (doubling every 20 months). https://en.wikipedia.org/wiki/FLOPS#Cost_of_computing
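The implied doubling time checks out arithmetically; a quick sketch of the calculation:

```python
# Back-of-the-envelope check: 59x growth in 10 years implies a doubling
# time of 10 years / log2(59) doublings, i.e. roughly 20 months.
import math

growth, years = 59, 10
months_per_doubling = years * 12 / math.log2(growth)
print(f"{months_per_doubling:.1f} months per doubling")   # ~20.4
```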
Hmmm – An MIT researcher concludes that most new algorithms originate in the US. No danger of bias here, obviously. I wonder how much of the research done in China and Japan he studied? Did he assume that if it’s not published in English, then it doesn’t count?
The US is singularly blinkered when it comes to awareness of what’s going on outside its own borders. This has improved since the advent of the internet, but it remains the case that rediscoveries by Americans of ideas first published elsewhere are often cited in preference to the earlier overseas publication.
In addition, algorithmic ideas are much more likely to be patented in the US because of its lawyer-dominated culture.
One study does not science make. And according to the write-up here, this study seems significantly flawed in some pretty obvious ways. Algorithms do nothing at all. Implementations do the work, and the implementation of an algorithm is often suboptimal for various reasons. And while an improved algorithm’s implementation may offer an order-of-magnitude increase in performance, it will face sparse adoption, whereas a 2x increase in CPU speed with an accompanying drop in power will benefit everyone who buys that CPU architecture, without having to reengineer an application to gain that benefit. And the boost will happen again approximately two years later. And again two years after that.
Take, for example, a new fast binary search algorithm. How long do you think it’ll take Oracle to implement it in their enterprise database? 10 years? Never? This example algorithm has no impact whatsoever outside of academia and a handful of niche open-source databases that are built upon it. Even if it is the best algorithm in the world, it won’t make the slightest bit of difference. And while the algorithm is sitting there unused for 10 years, computing power will have increased 32x.
It is just business: by releasing the last version of their algorithm to the commons, they are raising the floor and thinning the herd. Small operators don’t have the manpower to keep up, so they have to either get new talent that was in the loop in university or get sold. It keeps the churn turning, with the top dogs holding the podium and picking the tunes.
I’m disappointed that the headline editor felt a need to write, “Forget Moore’s Law”. The article doesn’t say Moore’s Law is (or was) unimportant, it just makes the claim that algorithms are _also_ important. So the headline feels more clickbait-y than a site like Stack Overflow should stand for.
And, the survey of algorithms seems to have a huge selection bias. The article starts by talking about “algorithm improvements” in general, then makes the unsupported claim that “Many of these algorithms come from publicly available sources…”, then talks only about those algorithms from publicly available sources. I don’t accept so readily that the algorithm improvements from private sources which made a difference are only a small fraction of the total. I can recall, from my own experience, two classes of proprietary algorithms from which my employers derived great value. The first are those which “make more efficient use of existing resources and allow computers to do a task faster, cheaper, or both”, as the article says. The second are those which describe the details and irregularities of an external, real-world system, using code. This “know-how” can form a considerable barrier to entry, and can remain proprietary forever.
I think this article is an interesting interview about a census of publicly-available algorithms, and their creators — but it is undermined by editorial choices to punch up the headline with clickbait, and to set up a misleading frame of all algorithms rather than publicly-available algorithms.
Jingoistic. Should be 38% of algorithms published and described via the English language.
Some parts of the paper are very worrying indeed.
1. The Contribution function has a country’s GDP in it. Utility comparisons of GDP are very subjective.
The paper does try to account for bias, but the calculations rely on a linear interpretation of GDP vs efficiency. Research money is an outcome of marginal GDP, i.e., the spare money left after all living expenses have been removed.
2. The paper says “We add a dummy variable for the USA reflecting its outsized role in algorithm development”
I cannot see how the dummy variable is used, but this looks like confirmation bias through and through
3. Only papers published in English are considered.
Cutting-edge research published in other languages may go undiscovered, or not be understood, for many years.
4. There may also be a bias towards papers that deliver more direct financial gains.
A pure financial measure is not a good measure of how important the paper is.
Like Facebook’s algorithm, this paper seems to be in its own echo chamber
Moore’s “law” is a rather dated empirical observation, not an enabler. It’s interesting that the author or anyone would take it for granted even in a glib tag line.
Moore’s Law is not really applicable any more due to the issues related to heat and the “power wall”, hence the emergence of multiple cores and parallel computing.
One thing not mentioned is that higher-performing computing is a prerequisite for advanced utilization of data. Storage, compute, and communications together make possible data tasking that was not previously feasible.
Greater hardware capabilities make new things feasible in software, new things for which to create new algorithms.
I work on cognition engineering (not artificial intelligence), an endeavor that hardware capabilities are only just starting to make possible. The enormity and concurrency of engineered cognition cannot be “coded around.” And this isn’t difficult to realize.
It is correct, though, that none of these advances occur because the hardware is better. It is just that until the hardware is capable, there is little to be done. In the early days, we hacked around hardware limitations and got practical results given the resources, thus the artificial “intelligence” moniker.
Of note too: without access, it doesn’t exist. Proprietary paradigms are without access.
If you really want to speak to things that would help algorithms, speak to corrupt intellectual property rights laws that prevent access and distribution. Algorithms are covered by patent law, whereas software is covered by copyright. Neither of these is appropriate for algorithms or code. This needs to be remedied, and that remedy would be a celestial breakthrough in technology. Free and open, not closed, litigious NDA garbage.
I think part of the problem is that people don’t really seem to value software improvements. People will pay $100s more for a processor, graphics card, or SSD that is just 10% or 20% faster than their current one; but they won’t shell out even $10 for some software that is twice as fast. Sure there are altruistic programmers out there willing to work for free, but nothing speaks louder than some real monetary rewards.
For example, most people use file systems that were invented before 1GB hard drives were available so their search and meta-data algorithms are really slow. I invented a new kind of data management system that can manage unstructured data (among other things) like a file system, but can search for things like photos, documents, or software thousands of times faster than current file systems. Yet because file systems are free, too few people want to give it a second look.
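As a generic illustration of why indexed metadata search beats walking the directory tree for every query (this is a hedged sketch, not the commenter’s system): build the index once, and each lookup becomes a dictionary access instead of a full traversal.

```python
# Generic sketch, not the commenter's data management system: index file
# metadata once, then queries are dictionary lookups rather than tree walks.
import os
from collections import defaultdict

def build_index(root):
    index = defaultdict(list)                 # extension -> list of paths
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            ext = os.path.splitext(name)[1].lower()
            index[ext].append(os.path.join(dirpath, name))
    return index

index = build_index(os.path.expanduser("~"))  # one-time traversal
photos = index[".jpg"]                        # per-query cost: one dict lookup
print(f"{len(photos)} .jpg files indexed")
```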