ARCHITECHT Daily: Don't forget that Hadoop and Spark still own data workloads

By ARCHITECHT • Issue #91
Just a heads up that I’ll be in Seattle part of this week at the GeekWire Cloud Tech Summit. This will almost certainly affect the time you receive these emails Tuesday through Thursday.
For many companies, the path to adoption of any new big data technologies runs right through Hadoop and Spark, into which many companies, especially large enterprises, have already invested millions of dollars and man-hours. The popularity of these two technologies can be easy to forget amid all the talk about artificial intelligence, IoT and cutting-edge cloud data services, but the reality is that they’re still focal points in so, so many data environments.
Researchers and companies alike are trying their damnedest to create better, faster, easier alternatives, but moving mountains of data and rebuilding pipelines is hard. It’s not like the Hadoop and Spark communities are sitting idle, either. We’ve seen companies in this space embrace the cloud, IoT and deep learning with new products over the past few months (longer, really), and now the respective Apache projects are getting upgrades, as well.
As this handful of items from the past week or so highlights, it’s far too early to move Hadoop and Spark to the dustbin of history:

Sponsor: Cloudera
Listen to the latest ARCHITECHT Show podcast
CoreOS CEO on Kubernetes, containers and coopetition with cloud providers
In this episode of the ARCHITECHT Show podcast, CoreOS co-founder and CEO Alex Polvi talks about why Kubernetes is so popular right now and how microservices can power next-gen applications.
Artificial intelligence
This is the WSJ’s headline, and it speaks to some of what I referred to yesterday. Even with Siri and a new machine learning development tool in CoreML, Apple is playing catch-up in AI. Whether that ultimately matters much remains to be seen.
There are so many good uses for AI around agriculture, economics and health care, but also so many ways for governments in some countries to abuse their power.
The author has some criticism for the TensorFlow team at Google about adding too many features too fast, and not being opinionated enough about which options are preferred in some cases.
Using machine learning to rank and surface human experts seems possibly like overkill, and also pretty ironic.
I continue to believe that AI for HR is not a great idea (I don’t see the value, and it seems ripe for latent discrimination) and, yet, startups continue raising money to do just that:
One marketing stunt after another. Critics love to dig into Watson, and IBM isn’t helping out by focusing on this type of partnership rather than real technological or business breakthroughs.
An AI-for-law company called ROSS Intelligence says it’s moving its R&D work from Silicon Valley to Toronto. That’s a small step toward Canada’s quest to become the world’s AI hub, but it will need a lot more.
This is interesting research into building a system that uses multiple techniques in order to read, analyze and answer questions about images.
Sponsor: DigitalOcean
Cloud and infrastructure
That’s more than $231 million in total, and the company claims its revenue tripled from 2015 to 2016. I’m not so familiar with the product, but that kind of growth and amount of top-tier investment suggests it’s doing something right.
Webscale buyers might be waiting for new Skylake processors, but everyone else is moving to the cloud (check out that ODM line) or at least trying to consolidate using containers.
Speaking of declining server sales, HPE has been hit pretty hard overall, and lately by Microsoft specifically. I’m biased, but I think the work mentioned here around Docker and Mesosphere DC/OS is a good idea, although it does put software at the forefront.
I think it’s really interesting that Google has embraced this Netflix open source project, which is designed to run across many cloud platforms, and run with it. Google’s “open cloud” approach could work out in its favor if it becomes a focal point of the business strategy there.
Asgardia is a cool name for a project aiming to build an independent nation in space. There are many questions about the viability of its plans, but it’s not the first group to talk about doing computing using satellite networks.
Media partner: GeekWire
All things data
I’ve been covering Placed since it launched several years ago, with the mission of using mobile data to track where consumers shop in the physical world. It could be a good fit for Snap in terms of selling ads and verifying that they’re working.
An analysis from the folks at O'Reilly about what job titles like data scientist, data engineer and machine learning engineer mean, and how they relate to each other. I suspect the latter two will grow increasingly close.
The most interesting news, analysis, blog posts and research in cloud computing, artificial intelligence and software engineering. Delivered daily to your inbox. Curated by Derrick Harris. Check out the Architecht site at