Apache Spark 4.0

Name: Apache Spark 4.0
Brand: Independently published
SKU: 51319811
Price: 17.29 EUR
Availability: InStock
Author: Yila Harrison
ISBN: 9798249316587

Build High-Performance Data Engineering Pipelines with Spark SQL, Structured Streaming, and Modern Cluster Architectures

Yila Harrison

Language: English
Binding: Paperback

Publisher: Independently published, February 2026

Apache Spark has become the backbone of modern data engineering - but knowing Spark isn't the same as mastering it in production.

Apache Spark 4.0 is a deeply practical, production-focused guide for data engineers, platform engineers, and analytics professionals who want to build scalable, fault-tolerant, high-performance data pipelines using Spark SQL, Structured Streaming, and modern cluster architectures.

This book goes far beyond surface-level tutorials. It teaches you how Spark actually works under the hood - and how to use that knowledge to design systems that scale.

You won't just learn Spark APIs.
You'll learn how to think like the Spark engine.

What You'll Master

Inside this book, you will learn how to:

Understand Spark's execution model: jobs, stages, tasks, DAGs, Catalyst, and Tungsten
Write high-performance Spark SQL queries and choose efficient join strategies
Design batch, streaming, and hybrid pipelines that scale
Optimize memory, CPU, shuffle behavior, and partitioning
Build real-time pipelines with Structured Streaming
Deploy Spark on Kubernetes and modern cloud architectures
Diagnose slow jobs and production failures with confidence
Apply operational best practices for reliability and fault tolerance
Design complete end-to-end data engineering systems

Each chapter builds progressively - from core fundamentals to advanced architectural decisions - ensuring you develop both tactical skills and strategic judgment.

Built for Real-World Production

This book is not theoretical.

Every concept is explained clearly, then grounded in practical Spark applications. You will learn how to:

Prevent silent data corruption
Handle skewed data and large shuffles
Tune Spark configurations that actually matter
Debug production failures under pressure
Design pipelines that survive real workloads

If you work with large-scale data, this book gives you the mental models and tools needed to operate Spark with confidence.

Who This Book Is For

This book is ideal for:

Data Engineers building batch and streaming pipelines
Analytics Engineers optimizing Spark SQL workloads
Platform Engineers managing Spark clusters
Developers moving from Spark basics to production mastery
Teams adopting Spark 4.0 and modern cluster architectures

If you already know basic Spark and want to move into performance tuning, reliability, and architecture design - this book is for you.

Why Apache Spark 4.0 Matters

Spark 4.0 represents a refinement of Spark's execution engine, adaptive query behavior, and production readiness. This book shows you how to leverage those improvements without guesswork.

Instead of memorizing settings or copying code snippets, you'll understand:

Why Spark behaves the way it does
How execution plans translate into real resource usage
When Spark is the right tool - and when it isn't

That clarity is what separates average Spark users from high-impact data engineers.

Build Systems That Scale

Data systems fail when engineers treat Spark as a black box.

This book removes that black box.

By the end, you will be able to design and deploy robust, high-performance data pipelines - from ingestion to analytics - using Spark SQL, Structured Streaming, and modern cluster architectures.

About the book

Full name: Apache Spark 4.0
Author: Yila Harrison
Language: English
Binding: Paperback
Date of issue: 2026
Number of pages: 172
EAN: 9798249316587
Libristo code: 51319811
Publisher: Independently published
Weight: 311 g
Dimensions: 178 x 254 x 9 mm
