ARCHITECHT Daily: AI research is becoming a victim of its own success

By ARCHITECHT • Issue #94
Apologies in advance if this post is long and kind of unorganized, but I have a lot of thoughts about this issue, which I first started covering in 2013 …
Artificial intelligence is so hot right now that even some researchers in the space think the hype is getting out of control. Case in point: researcher Yoav Goldberg’s recent takedown of a natural-language processing paper published on the popular Arxiv repository. Criticism about the specifics of the research aside, Goldberg raised some big questions about the prolific nature of AI papers on Arxiv (often before, or in lieu of, peer review) and the notion of “flag-planting”: a race by researchers to stake their claim on a particular method or field, the legitimacy of their approach or their results be damned.
Goldberg’s post generated a fair amount of commentary and criticism, including on the Reddit machine learning forum and a post by Facebook AI head and deep learning guru Yann LeCun. Goldberg himself added some clarifications within a day of his original post. There’s some good discussion (including LeCun’s post) about the pros and cons of Arxiv’s publishing model—especially with regard to who gets credit for what ideas, and how anybody is supposed to keep up with the glut of work being published.
I also want to address a related point brought up by some other commenters, which is a concern that well-known institutions like Google/DeepMind or OpenAI also complement their papers with blog posts, thus leading to amplified attention and credit for their work. Of course they do—and, generally speaking, this is fantastic for the field of AI. 
AI needs blog posts because people need AI
There’s a whole world of people who are not steeped in AI research but are nonetheless interested in the field. And how could they not be, with the CEOs of every company under the sun going on about how AI and machine learning will be the foundation of their companies going forward? Or, better yet, with well-known scientists, technologists, and businesspeople going on about how AI might present an existential risk to mankind, and at best will turn our economy on its head?
Blog posts are the vehicle for putting sometimes ridiculously complex math and computer science into a format that’s easier (even if only slightly so) to consume for lay readers. Blog posts are also a great vehicle for getting journalists to pay attention. Blog posts by large companies or well-known institutions are even better. 
Sometimes, as often happens with Google, this is because blog posts are tied to improvements in applications that millions of people use. When we’re talking about conceptually or technically difficult concepts, the only thing better than a good analogy is a good example. “DeepMind just built a machine that can kick your ass at Pong” or “Android phones can now recognize your voice even on the floor of a Slayer concert” (not real headlines, that I know of) are always going to be more interesting, more relatable and, for most people, more explanatory stories than “Human-level control through deep reinforcement learning” (an actual DeepMind paper title).
Are some of these AI results overhyped in blog posts and media stories? Yes. Are some of these stories reported breathlessly? Absolutely. From a public relations standpoint, that can lead to inflated expectations about what’s possible (for example, some stuff is still very much reserved to the lab; AlphaGo is still just a great Go-playing system; and Watson perhaps should have stuck to Jeopardy!) and perhaps an undue assignment of credit for making a breakthrough that hasn’t actually solved any problem.
However, it’s largely because of the work of companies like Google, DeepMind, Facebook and Baidu, and their willingness to talk about it publicly, that there’s so much activity in the AI field right now. Regardless of where they’re publishing, a lot of people* doing AI research right now, especially if it has anything to do with neural networks, can probably thank these companies for that opportunity.
(*By the way, that is a lot of people. This blog post does a good job visualizing some of the trends, including the fact that nearly 2,000 papers were submitted to Arxiv in March 2017 alone. Also, probably not coincidentally, the number of papers really starts to pick up after November 2015, when Google open sourced its TensorFlow framework.)
Keeping a clear head in a sea of hype
All that being said, I do think the AI community (researchers, companies, reporters and investors alike) should take these concerns about peer review and proper credit seriously, mostly because we live in a world where open source software is a dominant force in enterprise IT, and where AI is fast becoming one. Being until relatively recently primarily an academic pursuit, AI is also rooted in concepts of openness, at least in terms of publishing research results.
But a big difference between research and software development is that research results are not necessarily an analog to beta products or features. When a software company announces a new product that won’t ship until six months from now or is available as a pre-GA release, there’s a good chance it doesn’t work as advertised at the moment. But you can also bet its engineers are actually working on delivering what they promised, because vaporware is bad for business.
When researchers publish results, they might sound amazing but not be tied to anything beyond that specific research, which itself might have very little real-world application or be just a minor improvement on a minor improvement. That’s fine when you’re steeped in the field and can parse through the good, the bad and the ugly, but probably less fine when the field is one of the hottest things on the planet. All of a sudden, the pace and scope of research become nigh impossible for any single human to track (Hello, AI model!), and competition among researchers is augmented by reporters and investors ready to jump on any sign of the next big thing.
I don’t think there’s any easy answer to any of this, but it’s a big part of the reason this newsletter exists. I’m trying to cut through the noise and share stuff I think is particularly interesting, well-reasoned and, in the case of research, reasonably likely to have some commercial or societal impact. AI being a feeding frenzy makes this goal both necessary and difficult, but to anybody trying to make a go of a career in AI (even if on the periphery), activity and excitement are probably better than the alternatives.

Sponsor: Cloudera
Listen to the latest ARCHITECHT Show podcast
Chef CTO Adam Jacob on building an open source business, and building software that people want
Artificial intelligence
WIRED has published a couple of good articles lately talking about the interaction between man and machine in autonomous vehicles. One of them, a discussion with Audi’s CEO about self-driving cars, raises some good, if not rose-colored, visions about our AI-powered driving futures. The other, about Boeing’s planned pilot-free airplane, tries to be reassuring but is really just terrifying. I predict fear, regulation and consumer demand will force both cars and airplanes to have humans behind the wheel in some capacity for the foreseeable future.
Sponsor: DigitalOcean
Cloud and infrastructure
All things data
Listen to the ARCHITECHT Show podcast. New episodes every Thursday!
ARCHITECHT delivers the most interesting news and information about the business impacts of cloud computing, artificial intelligence, and other trends reshaping enterprise IT. Curated by Derrick Harris. Check out the Architecht site at