ARCHITECHT Daily: Nvidia's AI beast, Microsoft's global database, and China's OSS opportunity

ARCHITECHT
By ARCHITECHT • Issue #74
Fair warning: This is going to be a big, Sunday-newspaper-size issue because it was a very busy week (five big conferences) and I got taken down by a bug (the microbiological kind) that resulted in me not publishing yesterday. Here we go!
Nvidia jacks up the performance
As soon as I published Wednesday’s newsletter asking how long Nvidia will dominate the market for machine learning hardware, I read the press release announcing its Volta-processor-based V100 GPU. According to Nvidia, the thing is a beast when it comes to performance, helped along by 640 “tensor cores” that are optimized for deep learning. According to an Nvidia benchmark, throughput on the Caffe2 deep learning framework increases by 2.5x using the V100 over the previous generation of Nvidia GPU.
Nvidia also made a number of other announcements, including some pricey desktop-sized “supercomputers” packed with GPUs; a partnership with Toyota on autonomous cars; and a game-showy award of $1.5 million in cash, split among six AI startups.
The V100, especially, will probably help cement Nvidia’s status in the machine learning and artificial intelligence space for a while longer as more workloads come online and big GPU buyers like Facebook refresh their gear. But as I’ve been saying for a while, and others are now pointing out, there’s lots of competition from Intel (although this “editorial” is, as the kids say, a little thirsty), cloud providers and startups alike. Fear not: we’re still in for a wild ride.
Microsoft and the movement toward cloud parity
I assume most readers have already gotten their fill of news from Microsoft Build, but let me recap some of the highlights just in case:
If you want an omnibus blog post highlighting more of the cloud- and data-center-focused announcements, check out this one from Azure boss Scott Guthrie. 
If there’s a high-level takeaway from what Microsoft announced, it’s that there probably will be no killer app among the big three cloud providers: all will eventually offer very similar core services around compute, storage, databases, AI, IoT, etc. Decisions on which cloud to go with will be made around the edges, which makes business decisions around things like price, security, risk-acceptance and target market much more important.
That being said, I do think Cosmos DB is pretty important because, well, databases are still a very big deal (as you’ll see if you read the links below) and it serves as a reminder from Microsoft that it’s more than just “the cloud provider that knows the enterprise.” The company is also an innovation factory and has been building webscale systems for a long time.
So while very few companies today are going to choose their cloud provider solely because of a somewhat futuristic database service (whether that’s Cosmos DB, Google’s Cloud Spanner or whatever globally distributed service AWS is no doubt working on) or any other single service, they will take note. In a fight for mindshare, especially developer mindshare, against Google and Amazon, Microsoft cannot afford to look like a technological laggard.
Open source: It’s big in China
Seriously. Watch this OSCON presentation from Ying Xiong of Huawei to get a sense of how big the OpenStack community there is. Or check out this Chinese startup called EasyStack, created by a team of former IBM China engineers, which just released a hybrid OpenStack-Kubernetes platform and in November raised a $50 million funding round.
Five years ago, I spent a couple of weeks in Beijing speaking with startups and some investors, and was surprised to hear how hundreds or even thousands of people would show up for low-profile events around projects like Hadoop or OpenStack. Chinese web companies, including Baidu, are some of the biggest users of technologies such as Spark and Mesos (we’re talking massive clusters, in some instances), even if they don’t get a lot of attention in the United States.
I don’t know that trying to capitalize on the Chinese opportunity is necessarily worth the risk for, say, American open source startups looking to boost revenue, but it’s certainly a topic worth paying more attention to for the open source community at large.

Sponsor: Cloudera
Listen to the latest ARCHITECHT Show episode
Speaking of innovation in cloud computing, the latest episode of the ARCHITECHT Show podcast features Google VP of Infrastructure, and CAP theorem mastermind, Eric Brewer. I haven’t been able to write up the highlights yet, but suffice it to say that he has a lot of interesting things to say about everything from Kubernetes to Spanner, and from Hadoop to TensorFlow.
Artificial intelligence
This seems like a good exit for MindMeld (formerly called Expect Labs), which has been building a conversational AI platform for several years now. Cisco wants to incorporate the tech into collaboration products, which is a diversion from the digital assistant space where MindMeld had been focusing.
This kind of work will be important as smart-home systems proliferate and companies try to advance verification beyond passwords. Of course, there’s always a privacy concern as these platforms are able to learn more about users.
At least according to a survey by ServiceNow. I’m optimistic this is true—more revenue should mean more opportunities to expand businesses—but the quality of jobs will matter.
More than 200 of them, in fact. There’s obviously a place for them, especially when it comes to automating “mind-numbing” administrative tasks. For customer service, though, I still prefer a human if I’m calling.
Deep learning is great for lots of pattern-recognition tasks, but training models can be slow. This research suggests that shallow models work best for predicting bond prices accurately and fast.
Given recent attention on the Canadian AI scene, Fluent.ai’s location in Montreal might be more noteworthy than its technology.
Is it possible that PowerAI is IBM’s superior effort in AI, over Watson? It does focus on Power systems rather than the cloud, but it’s speaking the language that most practitioners care about.
Speaking of IBM, here are some highlights from David Ferrucci, one of Watson’s creators, speaking on a podcast about his new startup called Elemental Cognition.
I remember covering Context Relevant at Gigaom, when commercial machine learning was really taking off. After raising a fair amount of money and going through some internal changes, it’s now a new company.
This is a remarkably mundane but, I must admit, potentially useful application of image recognition. Organizations selling ads might love it or hate it.
When I first saw this, I thought it was another thing to add hats and mustaches to selfies, and was disappointed. This is a little more interesting in that it makes emojis of you.
This whole discussion is disappointing on many levels. Kudos to Facebook for finally trying to address it, but creating a platform that encouraged it in the first place is another issue.
If there’s one area where I’m really skeptical about applying machine learning and AI, it’s to anything involving law enforcement or facial recognition beyond a very limited set of purposes. There are lots of reasons to bring more machines to policework or the courts, which have hardly been perfect over the years, but sometimes algorithms serve to exacerbate existing problems. Here are some more advances in those realms, for better or for worse:
Listen to the ARCHITECHT Show podcast. New episodes every Thursday!
Cloud and infrastructure
Wow! I never would have predicted Stitch Fix makes that much money. If you want to learn about its tech stack, check out this interview I did with its CTO, Cathy Polinsky.
This is a pretty big deal in the world of open source databases, especially viewed in the context of Cloud Spanner and Cosmos DB. In the podcast interview with Eric Brewer (above), he mentions future Google Cloud support for the open source CockroachDB tech.
Liqid is a startup that’s apparently taking some cues from Facebook and the Open Compute Project, as well as technologies like Mesos, by trying to make resource levels customizable and on-demand.
This is the second in a series of blog posts from Kubernetes co-creator and Heptio co-founder Craig McLuckie. He’s understandably biased, but this is the world we’re heading for, and Kubernetes will be a part of it.
This review of some of the public soul-searching at this week’s OpenStack Summit is really interesting. OpenStack is not exactly the slow-motion trainwreck some people paint it as, but something has to give.
It does indeed have legs, even years after I wrote this possibly flawed headline: “Facebook trapped in MySQL ‘fate worse than death’.”
Among developers and the cloud-native/microservices crowd, for sure. Its big challenge would appear to be monetizing its various products, especially as components of broader stacks or ecosystems.
redmonk.com
Building data centers is still a big, important business, especially for companies like Google and Facebook that drive outsized proportions of internet traffic. Data Center Knowledge has details on two new webscale facilities, and one very interesting colo facility:
Media partner: GeekWire
All things data
You won’t be surprised to learn that Kafka is a big part of its infrastructure, but Cloudflare’s choice of the ClickHouse OLAP system from Russian search engine Yandex might turn some heads.
This is a fair, if overly optimistic, assessment of Teradata’s shifting business model and product lines. It still makes more money than the big three Hadoop vendors combined, but open source and the cloud will be a tough combination to overcome.
This is a good interview with Metamarkets founder and CEO Mike Driscoll, who, apart from being the inaugural guest on the ARCHITECHT Show, is also an experienced engineer in the world of big data.
I could have sworn we had laws and regulations in place to standardize how federal data is formatted and released (in fact, we do!), but we’re not there yet. Don’t hold your breath on this getting any better anytime soon.
gcn.com
Here’s some good advice, especially if you’re just getting started deploying these types of systems. There’s lots of info about best practices for reliability engineering now, so there’s really no excuse for not doing it.
svds.com
ARCHITECHT
The most interesting news, analysis, blog posts and research in cloud computing, artificial intelligence and software engineering. Delivered daily to your inbox. Curated by Derrick Harris. Check out the Architecht site at https://architecht.io