First things first

I don't have a whole lot to say about any news over the past few days, but there are some good links below, so make sure to check them out.

However, if there's one thing I have been thinking about, it's the actual value of personal data compared with the harm that can come from mishandling it. This is obviously related to recent controversies surrounding Facebook, but also involves the Equifax breach, data brokers, our smartphones and devices, and every other avenue through which data is collected. It's also related to the forthcoming GDPR in the EU and, perhaps, similar laws to follow around the world.

Specifically, I've been thinking about the connection between mass data collection -- aka "big data" -- and technological innovation. Today's data systems were born of the web and designed to handle massive quantities of data, much of it personal and much of it not. We owe so many advances in Hadoop, Spark, deep learning, NoSQL and pretty much everything to work done at places like Facebook, Google, Yahoo, LinkedIn and their peers. We probably don't owe too much to companies like Apple.

But if we're actually at a moment of reckoning for data and how much we're willing to give up, or that companies are willing to collect, then what becomes of "big data" and big data systems? Will innovation pick up in earnest around systems and algorithms designed to do more with less data? Or will sensor data from IoT and other devices, or non-personal data such as drone imagery, continue driving demand for big, scalable data systems?

I guess what I'm wondering is what happens if the tech field as a whole decides that bigger is not always better when it comes to data. It might end up being best for everyone involved -- for reasons ranging from privacy to efficiency -- but the biggest challenge might be whether companies can actually wean themselves from the desire to gather more and build ever-larger systems.

For better or worse, big data is as responsible as anything else for the current state of our digital lives. Dialing it back might be harder than we think, but engineering within the constraints that smaller data entails might also bring some serious innovation in its own right.


AI and machine learning

Cloud and infrastructure

Data and analytics