ARCHITECHT Daily: All automation is not AI. It's an important distinction.

By ARCHITECHT • Issue #66
The New York Times published an interesting article on Friday, titled “Meet the people who train the robots (to do their own jobs)”. I say it’s interesting because job loss due to automation is a big deal right now, and speaking with people who are training some of these technologies is a good way of approaching the subject. I also think referring to most of the highlighted technologies as AI is misleading and, frankly, detrimental to serious discussions about job automation.
You can read the article and make your own judgments (and I’d love to hear why I’m wrong), but my basic beef is this: The intelligence part of AI is not just the interface, but also (primarily?) the backend. And most of the backend systems in this particular piece don’t seem very intelligent. In fact, they seem to be doing things we’ve been able to do for years via traditional data analytics and scripts. 
That’s all fine and dandy, and companies are free to use them to automate certain aspects of their business processes. But presenting relatively dumb software as AI obscures the fact that researchers are making serious progress on systems that could actually be much more intelligent in roles such as customer service and text analysis. (See, for example, this piece on what Maluuba, now part of Microsoft, is working on.) Focusing on technologies with limited capabilities gives workers and policymakers a false sense of what will eventually be possible. This could result in ill-informed opinions about actual risk, and short-sighted policy decisions.
On the other hand, focusing on the wrong technologies doesn’t help employers accurately gauge how and when they might optimize their operations with AI. Whether that’s ultimately better or worse for employees remains to be seen—there’s an argument for both—but it would be good for everyone to get a real sense of what’s coming. 
A handful of observations from the NYT article:
  • Hotel search is pretty much a solved problem, right?
  • Identifying correlations (for example, that users of App X will spend more, or prefer a certain type of hotel) has been possible at least since the advent of Hadoop. For example, this from 2011.
  • “Shall” is not a vague term in legal documents. It has a very specific meaning: The party does this, or the party has breached the contract or broken the law.
  • The real breakthrough in email automation is not suggesting the right reply, but rather answering specific questions that aren’t binary or don’t lend themselves to a website link for more info.
  • I’ve been on the receiving end of an x.ai digital assistant for scheduling a meeting. It worked well enough (with maybe one extraneous email) but I actually felt a sense of unease in suspecting it was a bot and not knowing how personable my responses should be.

The latest ARCHITECHT Show episode is live
In this episode of the ARCHITECHT Show, Kaggle CEO Anthony Goldbloom discusses Kaggle’s journey and Google acquisition, as well as the rise of deep learning and the evolution of AI.
You can check out the archive of ARCHITECHT Show episodes (featuring the folks behind Kubernetes, Kafka, Cloudera, Google Brain and more) here:
The best guests and biggest news in cloud computing, artificial intelligence and software engineering. The biggest news, best analysis and most insightful interviews in cloud computing, artificial intelligence and software engineering.
Artificial intelligence
I could go for a smarter home-security system. If this works and they price it right, it should be a cash cow.
I want to be a naysayer on Chinese companies being able to attract top talent in the U.S., but there’s no reason they can’t do it if they mirror the Google experience as much as possible.
This discussion, here viewed through the lens of predictive policing, is even more important in the AI era. There are times when correlation is enough (e.g., recommending a song I might like) and times when it might violate civil rights.
And data, too, as a point of control over consumers. Cornering the market on data hasn’t been a historical basis for antitrust regulation. Algorithms colluding without (blatant) human intervention is another issue.
www.bna.com
If you’re looking for an AI program to pitch your startup to, or from which to look for new investments or vendors, this is a good place to start.
At least, if this new partnership with Indiana University is any indication. Better, faster sensors and calculations are great, especially for informing human decision-makers about what’s up in a given situation.
This article does a good job explaining why the company SigOpt exists—to optimize machine learning models in the most efficient manner. It’s a real problem, but perhaps for a very small audience.
Listen to the ARCHITECHT Show podcast. New episodes every Thursday!
Cloud and infrastructure
Every now and then, it’s good to have a refresher on what’s what among storage options at AWS, Microsoft and Google. Because a lot changes. Here’s your latest installment.
TL;DR: It’s more work than you might expect on the configuration side, but much less work once deployed. At this point, it’s a tradeoff that users need to think about when considering the serverless approach.
This post might not even cover them all, but it’s a good start. From Red Hat to Cloudera to Kubernetes, there’s still a lot we need to figure out in order to optimize the open source and commercial aspects of software.
I found this article both fascinating and puzzling. It’s a good look at the way the movie industry has relied on tape drives for archiving, but (unless I’m missing something obvious) I can’t figure out why digital storage isn’t an option.
There’s a geopolitical angle to this, but also an architectural one. Specifically: When it comes to deploying infrastructure for edge computing, companies that know a lot about data centers (e.g., Google, Amazon and Microsoft) have a built-in advantage.
That’s the author’s sentiment, not mine. But he does make a very good point. You should read it if you’re building a mobile app.
This is cool research. Basically, they show how to maximize both cost-savings and availability in spot/preemptible instances by building a model that takes into account factors such as price variability, risk of losing the server, etc.
arxiv.org
All things data
This is a nice chart encapsulating how a collection of companies are using data, and where they’re seeing results. Cost-cutting has been most effective so far, which is not surprising considering it is low-hanging fruit.
hbr.org
This is a good (if brief) interview with Cloudera co-founder Mike Olson. And here’s a longer take on the IPO from Michael Coté. The gist of both, which is true for most situations, is that time will really tell whether the company is a success.  
This article gives a fairly detailed look at the pros of the Look, without going into the cons. Privacy is probably Nos. 1, 2 and 3 on most people’s list of concerns, followed by lack of necessity and the mundanity of a world where we all look like what’s hot on Amazon.
And scraping 40,000 Tinder images and sharing them publicly, via Kaggle, might be that limit. For better or worse (but mostly better), scraping and privacy continue to be limiting factors in data sharing.
The most interesting news, analysis, blog posts and research in cloud computing, artificial intelligence and software engineering. Delivered daily to your inbox. Curated by Derrick Harris. Check out the Architecht site at https://architecht.io