Partitioning and Parallel Step Execution
Scale throughput with multi-threaded steps, partitioning, and remote chunking strategies.
Why Scale Spring Batch?
Single-threaded Spring Batch jobs process one chunk at a time — fine for small datasets, but too slow for millions of records. When batch throughput becomes a bottleneck, Spring Batch offers four scaling strategies:
- Multi-threaded Step — parallel threads within a single JVM step
- Parallel Steps — independent steps run concurrently in a flow
- Partitioning — divide data into partitions, each processed by a worker step
- Remote Chunking — offload chunk processing to remote workers over messaging middleware
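The partitioning strategy in the list above comes down to splitting a key range into disjoint slices, one per worker. What follows is a minimal plain-Java sketch of that idea with no Spring Batch dependency, assuming records are addressed by a contiguous numeric ID. The names `RangeSplitter`, `Range`, `minId`, and `maxId` are hypothetical and chosen only for illustration. In Spring Batch itself, a `Partitioner` implementation would typically place values like these into one `ExecutionContext` per partition.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: not part of Spring Batch.
class RangeSplitter {

    /** Inclusive ID range assigned to one worker. */
    record Range(long minId, long maxId) {}

    /**
     * Splits [minId, maxId] into at most gridSize contiguous, non-overlapping ranges.
     * Any remainder is spread over the first partitions so sizes differ by at most one.
     */
    static Map<String, Range> split(long minId, long maxId, int gridSize) {
        if (gridSize < 1 || maxId < minId) {
            throw new IllegalArgumentException("gridSize must be >= 1 and maxId >= minId");
        }
        Map<String, Range> partitions = new LinkedHashMap<>();
        long total = maxId - minId + 1;
        long base = total / gridSize;
        long remainder = total % gridSize;
        long start = minId;
        for (int i = 0; i < gridSize; i++) {
            long size = base + (i < remainder ? 1 : 0);
            if (size == 0) {
                break; // fewer records than requested partitions
            }
            long end = start + size - 1;
            partitions.put("partition" + i, new Range(start, end));
            start = end + 1;
        }
        return partitions;
    }

    public static void main(String[] args) {
        // Ten IDs split three ways.
        split(1, 10, 3).forEach((name, range) -> System.out.println(name + " -> " + range));
    }
}
```

Splitting IDs 1 to 10 across three partitions gives the ranges 1 to 4, 5 to 7, and 8 to 10. Each worker can then query only its own range, so no two workers touch the same record.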
Each strategy has different complexity/throughput trade-offs. This lesson covers all four, starting with the simplest.
Multi-Threaded Steps with TaskExecutor
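The easiest way to add parallelism is to set a TaskExecutor on your Step. Spring Batch then runs each chunk (read, process, write) on its own thread from that executor, so chunks complete in no guaranteed order. Important: the ItemReader must be thread-safe. Either use a stateless reader, or wrap a stateful one such as FlatFileItemReader in SynchronizedItemStreamReader, which serializes calls to read().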
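To see why the thread-safety requirement matters, here is a plain-Java sketch of the synchronization idea, independent of Spring Batch. The types `SimpleReader`, `ListReader`, and `SynchronizedReader` are hypothetical stand-ins for an item reader and a synchronizing wrapper; they are not the library's classes and only illustrate the pattern under the assumption that `read()` returns `null` at end of input.

```java
import java.util.Iterator;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.stream.IntStream;

// Hypothetical illustration: none of these types belong to Spring Batch.
class SynchronizedReaderSketch {

    /** Returns the next item, or null when the input is exhausted. */
    interface SimpleReader<T> {
        T read();
    }

    /** Stateful and NOT thread-safe: the iterator cursor is shared mutable state. */
    static final class ListReader<T> implements SimpleReader<T> {
        private final Iterator<T> cursor;

        ListReader(List<T> items) {
            this.cursor = items.iterator();
        }

        @Override
        public T read() {
            return cursor.hasNext() ? cursor.next() : null;
        }
    }

    /** Serializes read() so only one thread advances the delegate at a time. */
    static final class SynchronizedReader<T> implements SimpleReader<T> {
        private final SimpleReader<T> delegate;

        SynchronizedReader(SimpleReader<T> delegate) {
            this.delegate = delegate;
        }

        @Override
        public synchronized T read() {
            return delegate.read();
        }
    }

    /** Drains the reader from several threads and returns the distinct items seen. */
    static Set<Integer> drain(SimpleReader<Integer> reader, int threads) {
        Set<Integer> seen = ConcurrentHashMap.newKeySet();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.submit(() -> {
                Integer item;
                while ((item = reader.read()) != null) {
                    seen.add(item);
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return seen;
    }

    public static void main(String[] args) {
        List<Integer> items = IntStream.range(0, 10_000).boxed().toList();
        Set<Integer> seen = drain(new SynchronizedReader<>(new ListReader<>(items)), 4);
        System.out.println("distinct items read: " + seen.size());
    }
}
```

Passing the bare `ListReader` to `drain` instead would let several threads advance the same cursor at once, which can skip or duplicate items or throw. The wrapper trades some read throughput for correctness, which is usually acceptable when processing and writing dominate the cost of a chunk.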
Configure a multi-threaded step like this:
import org.springframework.batch.core.Step;
import org.springframework.batch.core.repository.JobRepository;
import org.springframework.batch.core.step.builder.StepBuilder;
import org.springframework.batch.item.file.FlatFileItemReader;
import org.springframework.batch.item.support.SynchronizedItemStreamReader;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.task.SimpleAsyncTaskExecutor;
import org.springframework.transaction.PlatformTransactionManager;

@Configuration
public class MultiThreadedStepConfig {

    @Bean
    public Step multiThreadedStep(JobRepository jobRepository,
                                  PlatformTransactionManager txManager,
                                  FlatFileItemReader<String> reader) {
        // FlatFileItemReader is not thread-safe, so serialize read() across worker threads.
        SynchronizedItemStreamReader<String> safeReader = new SynchronizedItemStreamReader<>();
        safeReader.setDelegate(reader);

        return new StepBuilder("multiThreadedStep", jobRepository)
                .<String, String>chunk(100, txManager)        // commit every 100 items
                .reader(safeReader)
                .writer(items -> items.forEach(System.out::println))
                .taskExecutor(new SimpleAsyncTaskExecutor())  // one new thread per chunk task
                .throttleLimit(4) // max concurrent chunk tasks; deprecated as of Spring Batch 5.0
                .build();
    }
}