Skipping Faulty Records in Spring Batch – A Complete, Practical, and Modern Guide
Real-world batch systems deal with **imperfect data**—missing fields, malformed CSV rows, incorrect formats, and more. In such cases, failing the whole job because of a few bad records is unacceptable. This is where Spring Batch’s skip feature becomes invaluable.
In this guide, you’ll learn:
- Why skipping is needed
- How SkipPolicy works internally
- Difference between skip and retry
- How to skip in reader, processor, and writer
- How to log and audit skipped items
- SkipListener explained
- Real-world production patterns
- Full working code
📌 Why Skip Faulty Records?
In enterprise batch processing, your input might contain:
- Invalid emails
- Missing mandatory fields
- Unparseable numbers or dates
- Rows that violate business rules
Instead of failing the entire batch, you want the job to continue with good records while collecting details of the bad ones. This is exactly what Spring Batch is designed to do.
📘 How Spring Batch Skip Mechanism Works Internally
// High-level skip flow (ASCII diagram)
Item Reader ----> Item Processor ----> Item Writer
     |                  |                   |
     |        ❌ Exception occurs?          |
     |--------------------------------------|
                        |
        Should we skip this item? (SkipPolicy)
                        |
              Yes → Skip + continue
              No  → Fail the step
The step keeps processing when the SkipPolicy returns true for the exception; otherwise the step fails. Skipping only applies to steps configured with .faultTolerant().
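The decision flow above can be sketched in plain Java, without any Spring Batch classes. This is a simplified model, not Spring Batch's actual implementation: `SkipFlowSketch`, `SkipDecider`, `StepResult`, and `runStep` are hypothetical names invented for illustration, and the sketch ignores chunking and transaction rollback.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

// Simplified model of the skip decision; all names here are hypothetical.
final class SkipFlowSketch {

    // Stand-in for SkipPolicy: decides whether a failure may be skipped.
    interface SkipDecider {
        boolean shouldSkip(Throwable t, long skipCount);
    }

    // Outcome of a simulated step: items that passed and items that were skipped.
    static final class StepResult {
        final List<String> written = new ArrayList<>();
        final List<String> skipped = new ArrayList<>();
    }

    static StepResult runStep(List<String> items,
                              UnaryOperator<String> processor,
                              SkipDecider decider) {
        StepResult result = new StepResult();
        for (String item : items) {
            try {
                result.written.add(processor.apply(item));
            } catch (RuntimeException e) {
                // Ask the decider; if it refuses, the whole step fails.
                if (!decider.shouldSkip(e, result.skipped.size())) {
                    throw new IllegalStateException("Step failed on item: " + item, e);
                }
                result.skipped.add(item);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        StepResult result = runStep(
                List.of("john@example.com", "invalid-email", "jane@example.com"),
                email -> {
                    if (!email.contains("@")) {
                        throw new IllegalArgumentException("Invalid email: " + email);
                    }
                    return email;
                },
                (t, count) -> t instanceof IllegalArgumentException && count < 10);
        System.out.println("Written: " + result.written);
        System.out.println("Skipped: " + result.skipped);
    }
}
```

With the sample data, the two valid addresses end up in `written` and `invalid-email` ends up in `skipped`. If the decider returned false, `runStep` would throw and no further items would be handled, which mirrors a failed step.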
🔧 Maven Dependencies
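<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-batch</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-data-jpa</artifactId>
</dependency>
<!-- Needed for the REST controller that triggers the job -->
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-web</artifactId>
</dependency>
<dependency>
    <groupId>com.h2database</groupId>
    <artifactId>h2</artifactId>
    <scope>runtime</scope>
</dependency>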
⚙️ application.properties
spring.datasource.url=jdbc:h2:mem:testdb
spring.datasource.driver-class-name=org.h2.Driver
spring.datasource.username=sa
spring.datasource.password=
spring.h2.console.enabled=true
spring.batch.job.enabled=false
🧱 Employee Entity
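import jakarta.persistence.*;

@Entity
public class Employee {
    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;
    private String name;
    private String email;
    // The reader's bean mapping needs setters; the processor and listener need getters.
    public Long getId() { return id; }
    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
    public String getEmail() { return email; }
    public void setEmail(String email) { this.email = email; }
}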
📂 Sample CSV File (employees.csv)
name,email
John,john@example.com
InvalidUser,invalid-email
Jane,jane@example.com
🧠 Custom SkipPolicy (Recommended Approach)
Use SkipPolicy when you want reusable, clean, pluggable skip logic.
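import org.springframework.batch.core.step.skip.SkipPolicy;

public class EmailValidationSkipPolicy implements SkipPolicy {
    @Override
    public boolean shouldSkip(Throwable t, long skipCount) { // Spring Batch 5 passes the count as a long
        // Skip only validation failures, and only for the first 10 bad records.
        return t instanceof IllegalArgumentException && skipCount < 10;
    }
}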
🔍 Processor With Business Validation
The processor is the most common place to validate data. Throw exceptions for invalid rows, and let SkipPolicy decide.
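import org.springframework.batch.item.ItemProcessor;

public class EmployeeItemProcessor implements ItemProcessor<Employee, Employee> {
    @Override
    public Employee process(Employee employee) {
        // Reject rows with a missing or obviously malformed email; the SkipPolicy decides what happens next.
        if (employee.getEmail() == null || !employee.getEmail().contains("@")) {
            throw new IllegalArgumentException("Invalid email: " + employee.getEmail());
        }
        return employee; // valid rows pass through unchanged
    }
}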
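The `contains("@")` check accepts values such as `@` or `a@b`. A slightly stricter check could look like the sketch below. `EmailCheck` and `isValid` are hypothetical names, and the pattern is deliberately loose (it is not a full RFC 5322 validator); treat it as one possible starting point.

```java
import java.util.regex.Pattern;

// Hypothetical helper; the pattern only checks the rough shape local@domain.tld.
final class EmailCheck {
    // No whitespace, exactly one '@', and at least one '.' in the domain part.
    private static final Pattern SIMPLE_EMAIL =
            Pattern.compile("^[^@\\s]+@[^@\\s]+\\.[^@\\s]+$");

    private EmailCheck() {
    }

    static boolean isValid(String email) {
        return email != null && SIMPLE_EMAIL.matcher(email).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("john@example.com")); // true
        System.out.println(isValid("invalid-email"));    // false
    }
}
```

Inside the processor, the condition would then become `!EmailCheck.isValid(employee.getEmail())`, with the same `IllegalArgumentException` thrown on failure so the skip logic is unaffected.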
📌 SkipListener – Log & Audit Skipped Records
SkipListener is extremely useful for logging bad rows, writing to a separate file, or inserting into an error table.
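import org.springframework.batch.core.SkipListener;

public class EmployeeSkipListener implements SkipListener<Employee, Employee> {
    // Called for each item that was skipped because the processor threw a skippable exception.
    @Override
    public void onSkipInProcess(Employee item, Throwable t) {
        System.out.println("Skipped Employee: " + item.getName() + ", Reason: " + t.getMessage());
    }
}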
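Skips can also happen in the reader and the writer. The sketch below covers all three callbacks and logs through SLF4J instead of `System.out`. It assumes Spring Batch 5 package names and that SLF4J is on the classpath (it normally is with the Spring Boot starters); `AuditingEmployeeSkipListener` is a hypothetical name. Note that `onSkipInRead` is only invoked if the skip policy also allows read exceptions such as `FlatFileParseException`, which the `EmailValidationSkipPolicy` above does not.

```java
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.batch.core.SkipListener;
import org.springframework.batch.item.file.FlatFileParseException;

// Hypothetical listener that records skips from all three phases.
public class AuditingEmployeeSkipListener implements SkipListener<Employee, Employee> {

    private static final Logger log =
            LoggerFactory.getLogger(AuditingEmployeeSkipListener.class);

    @Override
    public void onSkipInRead(Throwable t) {
        // No item exists yet, so the raw line is the best available evidence.
        if (t instanceof FlatFileParseException e) {
            log.warn("Skipped line {} while reading: {}", e.getLineNumber(), e.getInput());
        } else {
            log.warn("Skipped an item while reading", t);
        }
    }

    @Override
    public void onSkipInProcess(Employee item, Throwable t) {
        log.warn("Skipped {} in processor: {}", item.getName(), t.getMessage());
    }

    @Override
    public void onSkipInWrite(Employee item, Throwable t) {
        log.warn("Skipped {} in writer: {}", item.getName(), t.getMessage());
    }
}
```

For auditing, the log calls could be replaced by writes to an error table or a reject file; the callback structure stays the same.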
⚙️ Spring Batch Configuration
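import jakarta.persistence.EntityManagerFactory;
import org.springframework.batch.core.Job;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.job.builder.JobBuilder;
import org.springframework.batch.core.launch.support.RunIdIncrementer;
import org.springframework.batch.core.repository.JobRepository;
import org.springframework.batch.core.step.builder.StepBuilder;
import org.springframework.batch.item.database.JpaItemWriter;
import org.springframework.batch.item.file.FlatFileItemReader;
import org.springframework.batch.item.file.builder.FlatFileItemReaderBuilder;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.io.ClassPathResource;
import org.springframework.transaction.PlatformTransactionManager;

@Configuration
public class BatchConfig {
    @Autowired private JobRepository jobRepository;
    @Autowired private PlatformTransactionManager transactionManager;
    @Autowired private EntityManagerFactory entityManagerFactory;

    @Bean
    public Job employeeJob() {
        return new JobBuilder("employee-import-job", jobRepository)
                .start(step())
                .incrementer(new RunIdIncrementer())
                .build();
    }

    @Bean
    public Step step() {
        return new StepBuilder("employee-import-step", jobRepository)
                .<Employee, Employee>chunk(5, transactionManager) // commit every 5 items
                .reader(reader())
                .processor(new EmployeeItemProcessor())
                .writer(writer())
                .faultTolerant()                                  // enables skip (and retry) handling
                .skipPolicy(new EmailValidationSkipPolicy())
                .listener(new EmployeeSkipListener())
                .build();
    }

    @Bean
    public FlatFileItemReader<Employee> reader() {
        return new FlatFileItemReaderBuilder<Employee>()
                .name("employee-reader")
                .resource(new ClassPathResource("employees.csv"))
                .delimited()
                .names("name", "email")
                .targetType(Employee.class)
                .linesToSkip(1) // skip the CSV header row
                .build();
    }

    @Bean
    public JpaItemWriter<Employee> writer() {
        JpaItemWriter<Employee> writer = new JpaItemWriter<>();
        writer.setEntityManagerFactory(entityManagerFactory);
        return writer;
    }
}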
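When the rule is simply "skip this exception type up to N times", a custom SkipPolicy is not strictly required. The following is a sketch of the declarative alternative using `.skip(...)` and `.skipLimit(...)`, assuming Spring Batch 5 and the `Employee`, `EmployeeItemProcessor`, and `EmployeeSkipListener` classes from this guide. `DeclarativeSkipConfig`, the bean name, and the step name are hypothetical.

```java
import org.springframework.batch.core.Step;
import org.springframework.batch.core.repository.JobRepository;
import org.springframework.batch.core.step.builder.StepBuilder;
import org.springframework.batch.item.ItemReader;
import org.springframework.batch.item.ItemWriter;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.transaction.PlatformTransactionManager;

// Hypothetical alternative configuration; not part of the original setup.
@Configuration
public class DeclarativeSkipConfig {

    @Bean
    public Step declarativeSkipStep(JobRepository jobRepository,
                                    PlatformTransactionManager transactionManager,
                                    ItemReader<Employee> reader,
                                    ItemWriter<Employee> writer) {
        return new StepBuilder("employee-import-step-declarative", jobRepository)
                .<Employee, Employee>chunk(5, transactionManager)
                .reader(reader)
                .processor(new EmployeeItemProcessor())
                .writer(writer)
                .faultTolerant()
                .skip(IllegalArgumentException.class) // exception type that may be skipped
                .skipLimit(10)                        // allow at most 10 skipped items
                .listener(new EmployeeSkipListener())
                .build();
    }
}
```

The custom policy remains the better fit when the decision depends on more than the exception type, for example on the exception message or on an external threshold.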
🚀 REST Controller to Trigger Job
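import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobParameters;
import org.springframework.batch.core.JobParametersBuilder;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
@RequestMapping("/jobs")
public class JobLauncherController {
    @Autowired private JobLauncher jobLauncher;
    @Autowired private Job job;

    @GetMapping("/run-skip-job")
    public String runJob() throws Exception {
        // A unique parameter makes every request start a new job instance.
        JobParameters params = new JobParametersBuilder()
                .addLong("time", System.currentTimeMillis())
                .toJobParameters();
        // With the default (synchronous) launcher, this call returns after the job has finished.
        jobLauncher.run(job, params);
        return "Job launched!";
    }
}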
🔍 Retry vs Skip (Most Interviewed Topic)
| Feature | Retry | Skip |
|---|---|---|
| Purpose | Try the same item again after a temporary failure | Drop the faulty record and continue |
| Good for | Network timeouts, transient DB issues | Data validation errors |
| Config | `.retry(Exception.class)` with `.retryLimit(n)` | `.skip(Exception.class)` with `.skipLimit(n)` |
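The difference in the table can be illustrated with a plain-Java sketch that retries only transient failures and lets data errors propagate immediately, where a skip decision would then take over. This is a model of the idea, not Spring Batch's retry implementation; `RetryVsSkipSketch`, `TransientException`, and `withRetry` are hypothetical names.

```java
import java.util.function.Supplier;

// Hypothetical illustration of "retry transient errors, do not retry data errors".
final class RetryVsSkipSketch {

    // Marker for a temporary failure such as a network blip or a lock timeout.
    static final class TransientException extends RuntimeException {
        TransientException(String message) {
            super(message);
        }
    }

    // Runs the action, retrying only on TransientException, up to maxAttempts in total.
    static <T> T withRetry(Supplier<T> action, int maxAttempts) {
        for (int attempt = 1; ; attempt++) {
            try {
                return action.get();
            } catch (TransientException e) {
                if (attempt >= maxAttempts) {
                    throw e; // retries exhausted: the caller decides whether to skip or fail
                }
            }
            // Any other exception (e.g. IllegalArgumentException) is not caught and
            // propagates on the first attempt, since retrying bad data cannot help.
        }
    }

    public static void main(String[] args) {
        int[] calls = {0};
        String result = withRetry(() -> {
            if (++calls[0] < 3) {
                throw new TransientException("temporary outage");
            }
            return "saved";
        }, 5);
        System.out.println(result + " after " + calls[0] + " attempts"); // saved after 3 attempts
    }
}
```

In a fault-tolerant step, the two mechanisms can be combined: an item that still fails after its retries are exhausted can be skipped if the skip configuration allows that exception.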
✔️ Summary
- SkipPolicy helps continue processing even when some records are invalid.
- SkipListener is essential for logging, auditing, and debugging.
- Use skip for **data errors**, retry for **transient system errors**.
- Validation is best placed inside the Processor.