Don't sleep on Baidu as an AI, and even cloud, powerhouse

ARCHITECHT
By ARCHITECHT • Issue #107
Chinese search and general web giant Baidu made a lot of news on Wednesday, ranging from smart-home assistants to driverless cars. All in all, the company often referred to as “the Google of China” is looking a lot more like, well, Google. It is still largely in China, but it is spreading its wings globally as well.
Here’s the news, in case you missed it:
The partnership with Nvidia covers integrations within automobiles (where Nvidia has a dedicated GPU product and Baidu is clearly interested); smart homes (where Nvidia sells its Shield streaming box, and Baidu has developed its Duer digital assistant); artificial intelligence (Baidu is optimizing its PaddlePaddle deep learning framework for the latest Nvidia GPUs); and the cloud (Baidu will offer the latest Nvidia GPUs to its cloud customers).
Probably not coincidentally, Baidu partner Xilinx announced on Monday that Baidu will also start offering its FPGAs to cloud customers who want to accelerate deep learning and other workloads.
While Baidu’s Apollo autonomous car effort is clearly a big deal and the company has long been viewed as a leader in AI (check out the February podcast interview with then-chief-scientist Andrew Ng for more on Baidu’s plans and accomplishments there), I really never considered the scale of Baidu’s cloud computing business before today. It’s certainly not on the scale of AWS, Azure or probably even Alibaba in China, but the fact that it’s offering customized hardware like advanced GPUs and FPGAs suggests this is not some small-time operation.
(Perhaps Baidu’s hiring of Qi Lu, a former Microsoft corporate vice president and Satya Nadella’s right-hand man, as COO in January should have given me a clue, too.)
I would be happy to have readers correct me on this, but my sense is Alibaba, Tencent and Baidu are to Chinese cloud computing roughly what AWS, Microsoft and Google are to U.S. cloud computing. And like Google, Baidu is betting on its expertise in AI and big data (the company runs research labs in both areas) to help propel its cloud business as those areas really begin to take hold within the enterprise. That’s a particularly good position to hold in China, where AI is all the rage right now and where there is a lot of unclaimed cloud business to be had.
It’s also a good position to be in globally, as first cloud computing, and then AI, start to take root in other markets, as well. Alibaba is already establishing quite a broad global footprint, and Baidu could very well do the same. While there might be legitimate geopolitical and legal reasons why U.S. cloud providers never really catch on in China, and vice versa, the rest of the world is a battleground where the competition is likely only getting started.
There’s plenty of good technology, brainpower and competitive pricing to go around, which could make advantages like ubiquitous AI products and a world-class R&D reputation important considerations. If Baidu can beat its competitors to that point, it could start looking a lot more real on the cloud front, too.

Sponsor: DigitalOcean
Highlights from the ARCHITECHT Show podcast
These are highlights from the recent podcast interview with Apache Spark creator Matei Zaharia, who is working with Stanford colleagues—including fellow podcast guest Peter Bailis—on a new project called DAWN, which aims to make it easier for everyone to take advantage of artificial intelligence.
Sponsor: Linux Foundation
Artificial intelligence
According to The Korea Herald, Samsung is delaying the release of its Bixby assistant (think Siri, Alexa, etc.) because of a lack of data (presumably to train models) and issues developing the English version a continent away from the company’s Korean headquarters. Samsung also appears to be behind on the hardware and non-AI features of its upcoming smart speaker product. Frankly, I find all of this kind of surprising considering the number of Samsung phones out there, but perhaps Google is collecting all that speech data from existing Android phones?
Google’s DeepMind division is opening a research center in Edmonton, Alberta, Canada, which will focus on reinforcement learning and be led by University of Alberta professor Rich Sutton. For better or worse, this move spreads Canada’s AI research centers across the vast country.
In other DeepMind news, the company also released the first in a series of reports by an independent commission it hired to analyze its controversial health care partnerships in the United Kingdom. Basically, the commission found in this first report that DeepMind didn’t do anything legally wrong, but that it has to think about some bigger issues around the effects of its Streams product on the hospital workforce, and also address a few small security concerns.
And to complete the trifecta, here’s a blog post breaking down DeepMind’s really interesting relational neural networks paper from last month. If you’re interested in where deep learning research is headed, this is well worth reading.
Including examining a specific molecule that could help decrease the tremendous amount of energy spent producing fertilizer every year. Just one of many areas where quantum computing could have financial and societal impacts even beyond AI.
This is an interesting study performed using virtual reality to figure out how humans react in certain situations, but we’ll probably need to factor in many other things before this is solved. Car manufacturers and AI systems manufacturers won’t want to accept too much liability, which they might have to if algorithms are clear about how to react.
This probably isn’t news if you follow this space, but sentiment analysis on text is not as simple as reading tweets and choosing “happy” or “sad,” and capabilities are also lagging behind demand for them. This is a space I think has very limited useful applications at the moment, with the most promise in areas like mental health.
This is a good discussion of whether or not a chief AI officer is a necessary position right now. I’d come down on the side of “it’s not,” at least until companies get data figured out first.
Sponsor: CircleCI
Cloud and infrastructure
This is one of the reasons all the delay with Samsung’s AI products surprises me. The company has such a broad device footprint that you’d think it would be able to leverage it. On the other hand, various divisions in large conglomerates aren’t always great at working together.
You might have seen this post today from the usually astute Matt Asay. Not only do I doubt Google would bail on Kubernetes anytime soon (being open is critical to its cloud story), but the Kubernetes community is also too big and well funded for the project to die.
DataGravity was the latest venture from Paula Long, who co-founded EqualLogic back in the day. No word yet on who bought it (although someone in the comments section claims HyTrust), but the storage startup appeared to have been struggling to find the right business model and product lineup.
Here’s a roundup of a few of the vendors trying to sell companies on machine learning software to optimize data center operations. You knew this was coming because AI is so frothy right now, but also because Google and Facebook do this, and it has paid off. Of course, Google would tell you that’s a reason to adopt its cloud, not to try replicating it.
Adrian Colyer of The Morning Paper blog is highlighting papers this week from the SIGMOD'17 database conference, which took place in May. The headline is for a paper describing Microsoft’s new company-wide storage system (it’s replacing Cosmos), and here’s an update from Google on how Spanner has evolved since its 2012 paper.
This is kind of inside baseball, but Oracle built the runtime, called Railcar, so that RunC, built by Docker, wouldn’t be the only runtime compatible with Open Container Initiative specs. There’s value in variety, but the whole point of things like RunC is to be a commodity, no-frills building block. You risk wasting energy on competition if you don’t differentiate or don’t have a community to keep up with development.
Researchers built a tool called Fractal that makes it easier to code parallel algorithms (using something like a fraction of the code), while simultaneously providing major performance improvements. It’s worth noting that an Nvidia employee is a co-author, because Nvidia would love the world to use more of its massively parallel GPUs.
It sounds like the 140 jobs in the U.S. and Ireland are part of a broader shift that includes dropping some IoT processor lines. Missing IoT would be a huge loss for Intel, but the company is also working hard on related areas, including AI, that should help move embedded chips.
Sponsor: Bonsai
All things data
I loved having Las Vegas CIO Mike Sherwood on the podcast recently. Here’s more on how he plans to create a smarter city by collecting and analyzing municipal data in real time. 
You hear about a lot of new chip and memory architectures, but this 3-D one seems to have legs beyond the lab. The implications for applications and devices needing real-time data-processing could be very large, indeed.
This is kind of in the same vein as the previous chip research, only it’s a single-node database system. At any rate, the trend is toward doing much more work on devices and single nodes as IoT, AI and other edge applications ramp up.
Most 0.1 releases aren’t noteworthy, but BigDAWG counts Dr. Stonebraker among its researchers. It’s a database system composed of multiple storage engines, currently Postgres, SciDB and Accumulo. Microsoft’s new cloud-based CosmosDB is another multi-model database, so it appears this space is gaining steam.
Sponsor: Cloudera
The most interesting news, analysis, blog posts and research in cloud computing, artificial intelligence and software engineering. Delivered daily to your inbox. Curated by Derrick Harris. Check out the Architecht site at https://architecht.io