Mastering Skipping Faulty Records in Spring Batch – SkipPolicy Explained

Skipping Faulty Records in Spring Batch – A Complete, Practical, and Modern Guide

Real-world batch systems deal with **imperfect data**—missing fields, malformed CSV rows, incorrect formats, and more. In such cases, failing the whole job because of a few bad records is unacceptable. This is where Spring Batch’s skip feature becomes invaluable.

In this guide, you’ll learn:

  • Why skipping is needed
  • How SkipPolicy works internally
  • Difference between skip and retry
  • How to skip in reader, processor, and writer
  • How to log and audit skipped items
  • SkipListener explained
  • Real-world production patterns
  • Full working code

📌 Why Skip Faulty Records?

In enterprise batch processing, your input might contain:

  • Invalid emails
  • Missing mandatory fields
  • Unparseable numbers or dates
  • Rows that violate business rules

Instead of failing the entire batch, you want the job to continue with the good records while collecting details of the bad ones. Spring Batch's fault-tolerant steps are built for exactly this.


📘 How Spring Batch Skip Mechanism Works Internally


// High-level skip flow (ASCII diagram)

Item Reader ----> Item Processor ----> Item Writer
      |                  |                   |
      |        ❌ Exception occurs?          |
      |--------------------------------------|
      |      Should we skip this item? (SkipPolicy)
      |                 |
      |      Yes → Skip + continue
      |      No  → Fail the step

If the SkipPolicy returns true for the exception, the step records the skip and continues with the remaining items. If it returns false, the step fails immediately.
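
To make the decision in the diagram concrete, here is a minimal, framework-free sketch of the same flow. It is a simplified model, not Spring Batch's actual implementation: the class `SkipFlowSketch`, its `run` method, and the `Result` holder are hypothetical names invented for illustration, and chunking and transactions are left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiPredicate;
import java.util.function.UnaryOperator;

// Simplified model of the skip decision; NOT Spring Batch's implementation.
// All names in this class are hypothetical.
class SkipFlowSketch {

  // Holds the items that were "written" and the items that were skipped.
  static class Result {
    final List<String> written = new ArrayList<>();
    final List<String> skipped = new ArrayList<>();
  }

  static Result run(List<String> items,
                    UnaryOperator<String> processor,
                    BiPredicate<Throwable, Integer> skipPolicy) {
    Result result = new Result();
    for (String item : items) {
      try {
        result.written.add(processor.apply(item));
      } catch (RuntimeException e) {
        // Ask the policy: skip this item, or fail the whole step?
        if (skipPolicy.test(e, result.skipped.size())) {
          result.skipped.add(item);   // Yes: skip and continue
        } else {
          throw e;                    // No: fail the step
        }
      }
    }
    return result;
  }

  public static void main(String[] args) {
    List<String> emails =
        List.of("john@example.com", "invalid-email", "jane@example.com");
    Result result = run(
        emails,
        email -> {
          if (!email.contains("@")) {
            throw new IllegalArgumentException("Invalid email: " + email);
          }
          return email;
        },
        // Same rule as the custom policy in this guide: up to 10 skips.
        (t, skipCount) -> t instanceof IllegalArgumentException && skipCount < 10);
    System.out.println("Written: " + result.written);
    System.out.println("Skipped: " + result.skipped);
  }
}
```

Running `main` with the three sample emails writes the two valid addresses and skips `invalid-email`. Replacing the policy with one that always returns false makes the same input fail on the second item.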


🔧 Maven Dependencies

<dependency>
  <groupId>org.springframework.boot</groupId>
  <artifactId>spring-boot-starter-batch</artifactId>
</dependency>

<dependency>
  <groupId>org.springframework.boot</groupId>
  <artifactId>spring-boot-starter-data-jpa</artifactId>
</dependency>

<dependency>
  <groupId>com.h2database</groupId>
  <artifactId>h2</artifactId>
  <scope>runtime</scope>
</dependency>

⚙️ application.properties

spring.datasource.url=jdbc:h2:mem:testdb
spring.datasource.driver-class-name=org.h2.Driver
spring.datasource.username=sa
spring.datasource.password=

spring.h2.console.enabled=true
spring.batch.job.enabled=false

🧱 Employee Entity

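@Entity
public class Employee {
  @Id
  @GeneratedValue(strategy = GenerationType.IDENTITY)
  private Long id;

  private String name;
  private String email;

  // The reader's bean mapping needs the setters; the processor and listener use the getters.
  public Long getId() { return id; }
  public String getName() { return name; }
  public void setName(String name) { this.name = name; }
  public String getEmail() { return email; }
  public void setEmail(String email) { this.email = email; }
}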

📂 Sample CSV File (employees.csv)

name,email
John,john@example.com
InvalidUser,invalid-email
Jane,jane@example.com

🧠 Custom SkipPolicy (Recommended Approach)

Use SkipPolicy when you want reusable, clean, pluggable skip logic.

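import org.springframework.batch.core.step.skip.SkipPolicy;

public class EmailValidationSkipPolicy implements SkipPolicy {

  // Spring Batch 5 passes skipCount as a long (it was an int in 4.x).
  @Override
  public boolean shouldSkip(Throwable t, long skipCount) {
    return t instanceof IllegalArgumentException
           && skipCount < 10; // skip up to 10 bad records
  }
}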

🔍 Processor With Business Validation

The processor is the most common place to validate data. Throw exceptions for invalid rows, and let SkipPolicy decide.

public class EmployeeItemProcessor implements ItemProcessor<Employee, Employee> {

  @Override
  public Employee process(Employee employee) {
    if (employee.getEmail() == null || !employee.getEmail().contains("@")) {
      throw new IllegalArgumentException("Invalid email: " + employee.getEmail());
    }
    return employee;
  }
}
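
The `contains("@")` check above also accepts values such as `@` or `a@`. If you need something slightly stricter, a sketch like the following could be called from the processor instead. The helper class `EmailCheck` and its pattern are assumptions made for illustration; the pattern is deliberately simple and is not a full RFC 5322 validator.

```java
import java.util.regex.Pattern;

// Hypothetical helper; the pattern is a deliberately simple assumption.
final class EmailCheck {

  // Local part without spaces or '@', then '@', then a domain containing a dot.
  private static final Pattern SIMPLE_EMAIL =
      Pattern.compile("^[^@\\s]+@[^@\\s]+\\.[^@\\s]+$");

  private EmailCheck() {}

  // Returns false for null, so the caller needs no separate null check.
  static boolean isValid(String email) {
    return email != null && SIMPLE_EMAIL.matcher(email).matches();
  }

  public static void main(String[] args) {
    System.out.println(isValid("john@example.com")); // true
    System.out.println(isValid("invalid-email"));    // false
    System.out.println(isValid("@"));                // false
  }
}
```

In the processor, the condition would then become `if (!EmailCheck.isValid(employee.getEmail()))`, with the same `IllegalArgumentException` thrown for invalid rows.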

📌 SkipListener – Log & Audit Skipped Records

SkipListener is extremely useful for logging bad rows, writing to a separate file, or inserting into an error table.

public class EmployeeSkipListener implements SkipListener<Employee, Employee> {

  @Override
  public void onSkipInProcess(Employee item, Throwable t) {
    System.out.println("Skipped Employee: " + item.getName() + ", Reason: " + t.getMessage());
  }
}
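
For the "separate file" option mentioned above, one possible approach is a small helper that appends one line per skipped record. This is only a sketch: `SkipAuditLog`, its `record` method, and the three-column layout are hypothetical choices, not part of Spring Batch.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper: appends one CSV line per skipped record to a file.
final class SkipAuditLog {

  private final Path file;

  SkipAuditLog(Path file) {
    this.file = file;
  }

  // Appends "name","email","reason"; synchronized in case several threads log skips.
  synchronized void record(String name, String email, String reason) {
    String line = String.join(",", quote(name), quote(email), quote(reason))
        + System.lineSeparator();
    try {
      Files.writeString(file, line, StandardCharsets.UTF_8,
          StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // Minimal CSV quoting: wrap in quotes and double any embedded quotes.
  private static String quote(String value) {
    String v = value == null ? "" : value;
    return "\"" + v.replace("\"", "\"\"") + "\"";
  }
}
```

Inside `onSkipInProcess`, the listener could call `auditLog.record(item.getName(), item.getEmail(), t.getMessage())` in place of the `System.out.println`, assuming an `auditLog` field is passed in through the listener's constructor.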

⚙️ Spring Batch Configuration

@Configuration
public class BatchConfig {

  @Autowired private JobRepository jobRepository;
  @Autowired private PlatformTransactionManager transactionManager;
  @Autowired private EntityManagerFactory entityManagerFactory;

  @Bean
  public Job employeeJob() {
    return new JobBuilder("employee-import-job", jobRepository)
      .start(step())
      .incrementer(new RunIdIncrementer())
      .build();
  }

  @Bean
  public Step step() {
    return new StepBuilder("employee-import-step", jobRepository)
      .<Employee, Employee>chunk(5, transactionManager)
      .reader(reader())
      .processor(new EmployeeItemProcessor())
      .writer(writer())
      .faultTolerant()
      .skipPolicy(new EmailValidationSkipPolicy())
      .listener(new EmployeeSkipListener())
      .build();
  }

  @Bean
  public FlatFileItemReader<Employee> reader() {
    return new FlatFileItemReaderBuilder<Employee>()
      .name("employee-reader")
      .resource(new ClassPathResource("employees.csv"))
      .delimited()
      .names("name", "email")
      .targetType(Employee.class)
      .linesToSkip(1)
      .build();
  }

  @Bean
  public JpaItemWriter<Employee> writer() {
    JpaItemWriter<Employee> writer = new JpaItemWriter<>();
    writer.setEntityManagerFactory(entityManagerFactory);
    return writer;
  }
}

🚀 REST Controller to Trigger Job

@RestController
@RequestMapping("/jobs")
public class JobLauncherController {

  @Autowired private JobLauncher jobLauncher;
  @Autowired private Job job;

  @GetMapping("/run-skip-job")
  public String runJob() throws Exception {
    JobParameters params = new JobParametersBuilder()
      .addLong("time", System.currentTimeMillis())
      .toJobParameters();

    jobLauncher.run(job, params);
    return "Job launched!";
  }
}

🔍 Retry vs Skip (Most Interviewed Topic)

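| Feature  | Retry                        | Skip                   |
|----------|------------------------------|------------------------|
| Purpose  | Try again if temporary issue | Ignore faulty record   |
| Good for | Network, DB issues           | Data validation errors |
| Config   | .retry(Exception.class)      | .skip(Exception.class) |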
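
The difference between the two is easier to see in plain code. The following framework-free sketch retries a transient failure a limited number of times and skips a data error right away. `RetryThenSkipSketch`, `TransientException`, and the mapping of `IllegalArgumentException` to "data error" are hypothetical and chosen only for illustration. In Spring Batch itself you would configure this with `.retry(...)` and `.skip(...)` on a fault-tolerant step.

```java
import java.util.concurrent.Callable;

// Framework-free sketch: retry transient failures, skip data errors.
// Class names and the exception mapping are hypothetical assumptions.
final class RetryThenSkipSketch {

  // Stand-in for a temporary problem such as a dropped connection.
  static class TransientException extends RuntimeException {
    TransientException(String message) {
      super(message);
    }
  }

  private RetryThenSkipSketch() {}

  // Runs the action up to maxAttempts times; returns null when the item is skipped.
  static <T> T execute(Callable<T> action, int maxAttempts) throws Exception {
    for (int attempt = 1; ; attempt++) {
      try {
        return action.call();
      } catch (TransientException e) {
        if (attempt >= maxAttempts) {
          throw e; // retries exhausted: give up and fail
        }
        // otherwise fall through and try again
      } catch (IllegalArgumentException e) {
        return null; // data error: trying again cannot help, so skip
      }
    }
  }
}
```

A call that hits a timeout twice and then succeeds returns its value on the third attempt. A call that throws `IllegalArgumentException` is attempted once and skipped.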

✔️ Summary

  • SkipPolicy helps continue processing even when some records are invalid.
  • SkipListener is essential for logging, auditing, and debugging.
  • Use skip for **data errors**, retry for **transient system errors**.
  • Validation is best placed inside the Processor.