Understanding ItemProcessor in Spring Batch with Practical, Real-World Examples

Understanding ItemProcessor in Spring Batch (Deep Practical Guide)

ItemProcessor is one of the most powerful—and often misunderstood—components in Spring Batch. It acts as a **bridge** between reading data and writing data, allowing you to **validate**, **transform**, **filter**, or **enrich** each item before it moves forward in the batch pipeline.

Processing Flow:
ItemReader → ItemProcessor → ItemWriter

ItemProcessor decides whether data is valid, transformed, enriched, or discarded.

🔍 What Is ItemProcessor?

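ItemProcessor<I, O> is a functional interface with a single method, where I is the type coming from the reader and O is the type handed to the writer:

O process(I item) throws Exception;

What you do inside process decides what happens to the item:

  • **Return the processed/transformed item** → it continues to the writer
  • **Return null** → the item is filtered out and never reaches the writer
  • **Throw an exception** → the step fails, unless it is fault-tolerant and the exception is configured as skippable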
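
The three outcomes can be sketched in plain Java. To stay runnable without Spring on the classpath, the sketch declares a local `Processor<I, O>` interface that mirrors the shape of `ItemProcessor<I, O>`. The `ProcessContractDemo` class, the `PARSE_INT` processor, and the `run` driver are hypothetical names made up for illustration, and the driver is only a rough approximation of a step without skip configuration.

```java
import java.util.ArrayList;
import java.util.List;

class ProcessContractDemo {

    // Local stand-in with the same shape as Spring Batch's ItemProcessor<I, O> (assumption for illustration)
    interface Processor<I, O> {
        O process(I item) throws Exception;
    }

    // Hypothetical processor: transforms a String into an Integer,
    // filters blank input by returning null, and throws on malformed input.
    static final Processor<String, Integer> PARSE_INT = item -> {
        if (item.isBlank()) {
            return null;                       // filtered: never reaches the writer
        }
        return Integer.parseInt(item.trim());  // NumberFormatException on bad input
    };

    // Rough approximation of a step without skip configuration:
    // non-null results are collected for the writer, null results are dropped,
    // and the first exception aborts the whole run.
    static <I, O> List<O> run(List<I> items, Processor<I, O> processor) throws Exception {
        List<O> toWrite = new ArrayList<>();
        for (I item : items) {
            O result = processor.process(item);
            if (result != null) {
                toWrite.add(result);
            }
        }
        return toWrite;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run(List.of("1", " ", "42"), PARSE_INT)); // prints [1, 42]
        try {
            run(List.of("1", "oops"), PARSE_INT);
        } catch (NumberFormatException e) {
            System.out.println("run aborted: " + e.getMessage());
        }
    }
}
```

Note that the input and output types differ here (String in, Integer out), which is exactly what the two type parameters allow.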

🧱 When to Use an ItemProcessor?

  • Validate fields (email, phone, numbers)
  • Transform names, formats, dates, units
  • Filter invalid or incomplete rows
  • Enrich employee data (e.g., fetch department)
  • Mask sensitive information
  • Apply business logic before persistence
💡 Rule of thumb: Use ItemProcessor for *per-row operations*. For batch-wide operations, use listeners instead.
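
As one illustration of a per-row operation from the list above (masking sensitive information), the following sketch masks an email address before it is written. It is plain Java so it runs without Spring: the `Processor` interface is a local stand-in for `ItemProcessor<I, O>`, and the minimal `Employee` class and the masking rule (keep the first character of the local part) are assumptions made for the example.

```java
class MaskingDemo {

    // Local stand-in mirroring the shape of Spring Batch's ItemProcessor<I, O>
    interface Processor<I, O> {
        O process(I item) throws Exception;
    }

    // Minimal hypothetical Employee; the article's real class is not shown
    static class Employee {
        private String email;
        Employee(String email) { this.email = email; }
        String getEmail() { return email; }
        void setEmail(String email) { this.email = email; }
    }

    // Keeps the first character of the local part and the full domain
    static String maskEmail(String email) {
        int at = email == null ? -1 : email.indexOf('@');
        if (at < 1) {
            return email; // nothing recognisable to mask; leave the value untouched
        }
        return email.charAt(0) + "***" + email.substring(at);
    }

    // Per-row transformation: the same item goes on to the writer with a masked email
    static final Processor<Employee, Employee> MASK_EMAIL = emp -> {
        emp.setEmail(maskEmail(emp.getEmail()));
        return emp;
    };

    public static void main(String[] args) throws Exception {
        Employee emp = MASK_EMAIL.process(new Employee("john.doe@example.com"));
        System.out.println(emp.getEmail()); // prints j***@example.com
    }
}
```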

🧪 Practical Example: Email Validation Processor

📌 EmailValidationProcessor.java

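import java.util.regex.Pattern;
import org.springframework.batch.item.ItemProcessor;
import org.springframework.stereotype.Component;

@Component
public class EmailValidationProcessor implements ItemProcessor<Employee, Employee> {

    // Compiled once; String.matches would recompile the pattern for every item
    private static final Pattern EMAIL = Pattern.compile("^[A-Za-z0-9+_.-]+@[A-Za-z0-9.-]+$");

    @Override
    public Employee process(Employee employee) {

        String email = employee.getEmail();

        if (email != null && EMAIL.matcher(email).matches()) {
            return employee;  // valid → move to writer
        }

        return null; // invalid → filter this item out
    }
}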

Returning null tells Spring Batch to drop the record: it never reaches the writer.

Spring Batch treats a null return as a filtered item, not an error. Filtered items are counted in the step's filter count rather than its skip count.
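
To see what the pattern above actually accepts, here is a small standalone check of the same regular expression. The `EmailPatternCheck` class and the sample addresses are made up for illustration. Note that the pattern does not require a dot in the domain part, so an address like `user@localhost` passes; tighten the expression if that matters for your data.

```java
import java.util.regex.Pattern;

class EmailPatternCheck {

    // Same expression as in EmailValidationProcessor, compiled once
    static final Pattern EMAIL = Pattern.compile("^[A-Za-z0-9+_.-]+@[A-Za-z0-9.-]+$");

    // Mirrors the processor's condition: null or non-matching values are rejected
    static boolean isValid(String email) {
        return email != null && EMAIL.matcher(email).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("john.doe@example.com")); // true
        System.out.println(isValid("user@localhost"));       // true: no dot required after '@'
        System.out.println(isValid("john@"));                // false: empty domain
        System.out.println(isValid("john doe@example.com")); // false: space is not allowed
        System.out.println(isValid(null));                   // false
    }
}
```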

⚙️ Updating Step Configuration to Use Processor

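@Bean
public Step csvStep() {
    // jobRepository, transactionManager and emailValidationProcessor are assumed to be injected into this configuration class
    return new StepBuilder("csv-step", jobRepository)
            .<Employee, Employee>chunk(10, transactionManager) // input and output item types
            .reader(csvReader())
            .processor(emailValidationProcessor)
            .writer(jpaItemWriter())
            .build();
}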

Only one line changes: .processor(emailValidationProcessor)


📌 Advanced ItemProcessor Examples

1️⃣ Transforming Names (Uppercase)

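public class NameFormatterProcessor implements ItemProcessor<Employee, Employee> {

    @Override
    public Employee process(Employee emp) {
        // Guard against a missing name to avoid a NullPointerException
        if (emp.getName() != null) {
            emp.setName(emp.getName().toUpperCase());
        }
        return emp;
    }
}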
---

2️⃣ Filtering Salaries Below Threshold

public class SalaryFilterProcessor implements ItemProcessor<Employee, Employee> {

    @Override
    public Employee process(Employee emp) {
        // Below the threshold → filtered out (null); otherwise passed on unchanged
        return emp.getSalary() < 20000 ? null : emp;
    }
}
---

3️⃣ Enriching Data with External API

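public class DepartmentEnrichmentProcessor implements ItemProcessor<Employee, Employee> {

    private final DeptService deptService;

    // Constructor injection works whether the processor is a Spring bean or created by hand
    public DepartmentEnrichmentProcessor(DeptService deptService) {
        this.deptService = deptService;
    }

    @Override
    public Employee process(Employee emp) {
        String dept = deptService.getDepartment(emp.getEmail());
        emp.setDepartment(dept);
        return emp;
    }
}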
---

4️⃣ Multiple Processors Using CompositeItemProcessor

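@Bean
public CompositeItemProcessor<Employee, Employee> compositeProcessor() {

    // Delegates run in this order; if one returns null, the rest are not called for that item
    List<ItemProcessor<Employee, Employee>> processors = List.of(
        new EmailValidationProcessor(),
        new NameFormatterProcessor(),
        new SalaryFilterProcessor()
    );

    CompositeItemProcessor<Employee, Employee> cip = new CompositeItemProcessor<>();
    cip.setDelegates(processors);
    return cip;
}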
💡 CompositeItemProcessor = chain of processors. Great for complex pipelines.
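
Conceptually, a composite passes each item through its delegates in order and stops as soon as one of them returns null. The sketch below imitates that behaviour in plain Java. `Processor`, `chain`, and `demoPipeline` are hypothetical stand-ins written for this example, not the Spring Batch classes, and three simple string lambdas take the place of real processors.

```java
import java.util.List;

class ChainDemo {

    // Local stand-in mirroring the shape of Spring Batch's ItemProcessor<I, O>
    interface Processor<I, O> {
        O process(I item) throws Exception;
    }

    // Simplified imitation of a composite: run delegates in order, short-circuit on null
    static <T> Processor<T, T> chain(List<Processor<T, T>> delegates) {
        return item -> {
            T current = item;
            for (Processor<T, T> delegate : delegates) {
                current = delegate.process(current);
                if (current == null) {
                    return null; // filtered: later delegates never see the item
                }
            }
            return current;
        };
    }

    // Filter first, then two transformations
    static Processor<String, String> demoPipeline() {
        Processor<String, String> nonBlank = s -> s.isBlank() ? null : s;
        Processor<String, String> trim = String::trim;
        Processor<String, String> upper = String::toUpperCase;
        return chain(List.of(nonBlank, trim, upper));
    }

    public static void main(String[] args) throws Exception {
        Processor<String, String> pipeline = demoPipeline();
        System.out.println(pipeline.process("  john ")); // prints JOHN
        System.out.println(pipeline.process("   "));     // prints null (filtered by the first delegate)
    }
}
```

Putting cheap filters at the front of the chain means expensive delegates are skipped for items that would be discarded anyway.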

🧠 Internal Flow of ItemProcessor

FlatFileItemReader → Employee  
  ↓  
EmailValidationProcessor → null (if invalid)  
  ↓  
NameFormatterProcessor → "JOHN"  
  ↓  
JpaItemWriter → Persist to DB

🚨 Common Mistakes to Avoid

  • Throwing exceptions for records that should simply be filtered, instead of returning null
  • Doing multi-row operations inside the processor (use listeners instead)
  • Using heavy operations (API calls) without caching
  • Not logging skipped records
  • Processing sensitive data without masking
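
For the caching point above, one lightweight option is to memoize lookups so that repeated keys do not trigger repeated remote calls. The following is a minimal sketch in plain Java. `CachingLookup`, the department-style keys, and the fake remote function are all assumptions made for illustration, and a production job might prefer a bounded cache over an ever-growing map.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

class CachingLookup {

    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final Function<String, String> remoteLookup; // stands in for a slow service call

    CachingLookup(Function<String, String> remoteLookup) {
        this.remoteLookup = remoteLookup;
    }

    // Calls the remote function at most once per distinct key
    String departmentFor(String key) {
        return cache.computeIfAbsent(key, remoteLookup);
    }

    public static void main(String[] args) {
        AtomicInteger remoteCalls = new AtomicInteger();
        CachingLookup lookup = new CachingLookup(key -> {
            remoteCalls.incrementAndGet();      // count how often the "service" is hit
            return "DEPT-" + key.toUpperCase();
        });

        for (String key : new String[] {"sales", "hr", "sales", "sales"}) {
            System.out.println(key + " -> " + lookup.departmentFor(key));
        }
        System.out.println("remote calls: " + remoteCalls.get()); // prints remote calls: 2
    }
}
```

A processor would hold one such lookup as a field and call `departmentFor` from `process`, so the cache lives as long as the processor instance.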

🧩 Logging Skipped Records (Best Practice)

@Override
public Employee process(Employee employee) {

    if (!isValid(employee)) {
        log.warn("Skipping invalid record: {}", employee);
        return null;
    }

    return employee;
}

❓ FAQ

1. Is ItemProcessor mandatory?

No. If you don't need filtering or transformation, you can omit it, and items go straight from the reader to the writer.

2. Can ItemProcessor skip bad records?

Yes. Return null and the record is filtered out before it reaches the writer.

3. What if a processor throws an exception?

The step fails, unless it is built with .faultTolerant() and the exception is covered either by skip(...) together with a skipLimit(...) or by a custom SkipPolicy.

4. Can I use multiple processors?

Yes — using CompositeItemProcessor.

5. Is ItemProcessor executed per chunk or per row?

Per row. process is called once for each item; only the writer receives a whole chunk at a time.
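
The difference between per-item processing and per-chunk writing can be made visible with a small simulation. This is plain Java that only mimics the shape of a chunk-oriented step; it is not how Spring Batch is implemented internally. `ChunkLoopDemo`, its counters, and the chunk size of 3 are assumptions for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

class ChunkLoopDemo {

    int processCalls = 0; // incremented once per item
    int writeCalls = 0;   // incremented once per chunk

    // Takes items in groups of chunkSize, processes each item individually,
    // then hands the surviving items of the chunk to the writer in one call.
    <I, O> void run(List<I> input, int chunkSize, Function<I, O> processor, Consumer<List<O>> writer) {
        for (int start = 0; start < input.size(); start += chunkSize) {
            List<I> chunk = input.subList(start, Math.min(start + chunkSize, input.size()));
            List<O> processed = new ArrayList<>();
            for (I item : chunk) {
                processCalls++;
                O result = processor.apply(item);
                if (result != null) {
                    processed.add(result); // null results are filtered out
                }
            }
            writeCalls++;
            writer.accept(processed);
        }
    }

    public static void main(String[] args) {
        ChunkLoopDemo demo = new ChunkLoopDemo();
        Function<Integer, Integer> dropEven = n -> n % 2 == 0 ? null : n * 10; // filter even numbers
        Consumer<List<Integer>> printer = chunk -> System.out.println("write " + chunk);

        demo.run(List.of(1, 2, 3, 4, 5, 6, 7), 3, dropEven, printer);
        // prints: write [10, 30] / write [50] / write [70]
        System.out.println(demo.processCalls + " process calls, " + demo.writeCalls + " write calls");
        // prints: 7 process calls, 3 write calls
    }
}
```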


📝 Summary

  • ItemProcessor is ideal for validation, transformation, filtering, and enrichment.
  • Returning null filters the record out without failing the step.
  • CompositeItemProcessor allows chaining multiple processors.
  • Use processors for business logic, not step-level logic.
  • Always log skipped records in production pipelines.

Mastering ItemProcessor makes your Spring Batch jobs significantly more robust, reusable, and production-ready.