ARCHITECHT
Databricks raises $140 million, and Microsoft unveils its custom AI chips
By ARCHITECHT • Issue #126
These are two pretty big items and I have a lot to say, so let me just dive into each with as little windup as possible. However, I think they’re also more related than it might appear, and I’ll explain why at the end.
Databricks raises $140 million, and has now raised $247 million overall
That’s a lot of money no matter who you are, but it probably seems like even more to folks who follow the big data space and might have spent the last few years questioning the Databricks cloud-only business model. I’ll admit I was a little skeptical, as well, but the bet to more or less avoid becoming a Spark support company appears to have paid off. I could be off-base, but I’ll take this large investment round, the company’s claim of more than 500 customers, and the cloud services now offered by every other big data vendor (and, obviously, cloud providers) as solid proof. 
What I think we’re seeing is a slew of data scientists and software architects looking to build new products or new models, and they’re not constrained by existing infrastructure decisions and policies. There are lots of reasons this might be—including company size/age, departmental infrastructure, settled Hadoop use cases, and a general acceptance of the cloud as a safe, and even preferable, place to run. There’s a TON of competition—including services from Cloudera, Hortonworks, AWS, Microsoft and Google—but Databricks’ reputation as the company behind Spark probably goes a long way (as does a good product).
The other thing that’s really clear if you’ve read the press coverage of Databricks’ funding (especially this Bloomberg piece) is that the company is starting to position itself as a platform for running artificial intelligence workloads. Cloudera, by the way, has done pretty much the same thing, although it talks more about machine learning than about AI. It’s a move that’s both annoying and, probably, necessary. 
I think that people who understand Spark understand where it fits into the picture, and also understand the differences between, for example, classic Spark machine learning jobs and TensorFlow models (which Databricks does support). I also suspect that AI jobs account for a very small, but growing, percentage of revenue across most of the companies pushing their platforms as the best place to run AI jobs (although perhaps a higher percentage at Databricks, given its size, age and product scope relative to others in this space). But if the world has moved on to AI as the next big competitive arena, choosing to stay behind can be a risky marketing decision (even if actual discussions with customers don’t include much AI).
That being said, AI will soon enough be a must-have capability for anybody pushing analytics or data-processing technologies. If you want more details on how Databricks is thinking about the intersection of Spark and AI, you can check out my two recent ARCHITECHT Show podcast interviews with its co-founders:
For good measure (on this and the Microsoft news), also check out the following podcast interviews:
Microsoft unveils Project Brainwave, its custom AI hardware
Forget about the architecture for a moment and focus on the big picture: Brainwave is a direct response to Google’s Tensor Processing Units and a strategic weapon in the fight to own cloud AI workloads. Here’s how Microsoft’s blog post on Brainwave ends:
We are working to bring this powerful, real-time AI system to users in Azure, so that our customers can benefit from Project Brainwave directly, complementing the indirect access through our services such as Bing. In the near future, we’ll detail when our Azure customers will be able to run their most complex deep learning models at record-setting performance. With the Project Brainwave system incorporated at scale and available to our customers, Microsoft Azure will have industry-leading capabilities for real-time AI.
Getting into the speeds and feeds is kind of a pointless exercise because (1) Microsoft acknowledged Brainwave was designed using last-generation FPGAs (and all of these things are always evolving), and (2) actual user performance and experience will depend greatly on how Brainwave is productized as part of Azure and actual customer workloads. But for the sake of comparison, here’s what Google had to say about its second-generation TPUs in May. And here’s the product page.
Also worth noting: Microsoft Brainwave supports multiple AI frameworks, including TensorFlow, while Google’s TPUs support only TensorFlow.
Here are a few other thoughts on the impact of Brainwave and custom AI chips in general:
1. Where is Amazon Web Services in all of this? Perhaps the world’s largest cloud provider doesn’t think it needs custom hardware to make money on AI, and perhaps it would be right to think that. On the other hand, keeping up with the Joneses is kind of the name of the game for AWS, Microsoft and Google. I suspect AWS will announce its own AI-optimized hardware option before the year’s end.
2. How meaningful is custom hardware to Nvidia and Intel sales? Nvidia GPUs presently own the market for running deep learning and other AI workloads, but not inside Google or Microsoft. And, if those cloud providers have their way, probably not for their cloud customers over the long run, either. Yes, all the major cloud providers now offer Nvidia GPU instances and will no doubt make a lot of money renting them, but the point of building and offering your own hardware is to get people using it.
Ultimately, this will boil down to what AI frameworks these chips support, what types of services cloud providers build on top of them, and how they match up with GPUs in terms of price/performance. If we assume the cloud—and that includes Chinese giants like Alibaba and Baidu—will host a good majority of AI workloads over time, then that’s where chip companies are going to look for sales. (Also, bitcoin miners …)
An interesting wrinkle here is that while Intel is often seen as having missed the AI boat, the company actually acquired FPGA maker Altera back in 2015. Microsoft Brainwave is built on Altera chips, and FPGAs, in general, have a lot of potential for AI workloads.
However, Intel isn’t alone: Baidu’s FPGA Cloud runs on Xilinx gear. And, in fact, Baidu also unveiled a new FPGA-based AI accelerator today, called the XPU.
3. This is why companies like Databricks need to double down on AI messaging. Because while chips like Brainwave and TPUs are optimized for AI workloads, they’re not optimized to run Spark—at least not yet. So, if you’re Databricks or Cloudera or anybody else, you need to remind people that open, standard platforms running on GPUs can be really good for AI, too. There’s definitely something to the argument that a single data platform for handling ETL, data science, AI, etc., is the way to go, especially when it can run on top of any cloud.
It will be interesting to see how open Brainwave, TPUs and anything else become in terms of the frameworks they support. From things like Spark up to various AI libraries, what you can run on these hardware platforms could have major effects in terms of how heavily they’re adopted, and how well existing third-party services like Databricks fare against cloud-provider services and new services built with new architectures in mind.

Sponsor: Bonsai
Artificial intelligence
Security is certainly a more targeted application than general-purpose machine learning, but it’s possibly an even more crowded space.
The really interesting thing here is that Ericsson made this investment (Matterport has now raised $66 million in total), which could speak to the opportunity it sees in providing bandwidth for virtual reality.
This is a really interesting story. The short version is that Bitmain, which built its own ASICs for bitcoin mining, is now developing deep learning chips. Now it just needs someone to buy them …
Click on the headline for a story about its new approach to reinforcement learning, which it claims is “less finicky” than current approaches and more broadly applicable than supervised learning. Also, this article examining the creep of AI into Microsoft Office is worth reading.
Which helped it win a contest at this year’s ActivityNet competition, a challenge that focuses on video understanding and classification rather than image classification.
This paper is just what the title suggests. Read it if you’re like, “I’ve heard of this and kind of get it, but would like to know more.” It’s relatively easy reading.
Sponsor: DigitalOcean
Cloud and infrastructure
The headline news is Microsoft teaming up with Halliburton to increase the Azure footprint in oil & gas, which could be meaningful considering the amount of compute and data those companies use. Then there’s this deal with Red Hat, which is largely about OpenShift on Azure and Windows Server in OpenShift.
Storage: It’s not sexy, but it continues to be a very lucrative space, thanks in part to the shift toward cloud computing.
Skytap has been around for a long time, and seems to have found its groove helping companies move legacy apps into the cloud by building full-on replicas of their data center configurations in the cloud.
GoDaddy has made a few attempts at breaking into cloud computing, and Irving even tried (how successfully I do not know) to modernize GoDaddy’s engineering practices. He’s also a really nice guy.
Yeah, that’s a thing. I’m not certain, but I believe there’s some work with Mesos in this space, as well. Microsoft’s acquisition of Cycle Computing last week served as a reminder that HPC is still a big deal, and the folks who do it are looking for more modern and flexible options.
The author argues it will. I argue it will not, because I think companies care about lock-in and community much more than cost or even features.
If you know Dave McCrory, you know he was an early and instrumental contributor to VMware’s Cloud Foundry efforts, and also helped re-platform Warner Music onto Cloud Foundry. He may also have coined the term “data gravity.”
If you enjoy the newsletter, please help spread the word via Twitter, or however else you see fit.
If you’re interested in sponsoring the newsletter and/or the ARCHITECHT Show podcast, please drop me a line.
Use Feedly? Get ARCHITECHT via RSS here:
ARCHITECHT delivers the most interesting news and information about the business impacts of cloud computing, artificial intelligence, and other trends reshaping enterprise IT. Curated by Derrick Harris. Check out the Architecht site at https://architecht.io
Carefully curated by ARCHITECHT with Revue. If you were forwarded this newsletter and you like it, you can subscribe here. If you don't want these updates anymore, please unsubscribe here.