Apache Spark didn’t merely make big data processing faster; it also made it simpler, more powerful, and more convenient. Spark isn’t a single thing; it’s a collection of components under a common umbrella. And each component is a work in progress, with new features and performance improvements constantly rolled in.
Here’s an introduction to each of the major components in the Spark ecosystem — what each piece does, why it matters, how it has evolved, where it might fall short, and where it’s likely to go from here.