How to Read Multiple CSV Files with MultiResourceItemReader in Spring Batch (Advanced Guide)

Spring Batch is widely used for batch processing tasks like reading CSV files, transforming data, and writing to a database. But what if your application needs to process multiple CSV files at once, whether daily uploads, partitioned data, or feeds from multiple sources?

Spring Batch provides a powerful and elegant solution: MultiResourceItemReader.

This guide explains:
  • How MultiResourceItemReader works internally
  • Full end-to-end implementation (reader → writer, processed in chunks)
  • Best practices for large CSV batches
  • Error handling, parallel reading, and performance tuning
  • REST endpoint to trigger the batch manually

🌐 What is MultiResourceItemReader?

MultiResourceItemReader is used when you need to read multiple files that share the same structure (e.g., multiple CSVs, XMLs, or other flat files). It iterates over the configured resources and delegates the actual reading of each one to a resource-aware delegate reader, typically a FlatFileItemReader.

📌 Processing Flow

File 1 → FlatFileItemReader → Chunk → Writer  
File 2 → FlatFileItemReader → Chunk → Writer  
...
File N → FlatFileItemReader → Chunk → Writer

💡 Tip: MultiResourceItemReader only switches files between reads; it does not merge or interleave data. Records are read sequentially, file by file, in the order given by its resource Comparator (file name by default).

📦 1. Maven Dependencies

<dependency>
  <groupId>org.springframework.boot</groupId>
  <artifactId>spring-boot-starter-batch</artifactId>
</dependency>

<dependency>
  <groupId>org.springframework.boot</groupId>
  <artifactId>spring-boot-starter-data-jpa</artifactId>
</dependency>

<dependency>
  <groupId>org.springframework.boot</groupId>
  <artifactId>spring-boot-starter-web</artifactId>
</dependency>

<dependency>
  <groupId>com.oracle.database.jdbc</groupId>
  <artifactId>ojdbc11</artifactId>
  <scope>runtime</scope>
</dependency>

⚙️ 2. application.properties

# Oracle connection details (replace the placeholders with real values)
spring.datasource.url=DB_URL
spring.datasource.username=DB_USERNAME
spring.datasource.password=DB_PASSWORD
spring.datasource.driver-class-name=oracle.jdbc.OracleDriver

# Don't run jobs automatically at startup; we trigger them via REST (section 9)
spring.batch.job.enabled=false
spring.jpa.hibernate.ddl-auto=update

# Optional: let Boot create the Spring Batch metadata tables if they don't exist
# spring.batch.jdbc.initialize-schema=always

📁 3. Folder Structure for CSV Files

E:/csv-files/
   ├── employees-1.csv
   ├── employees-2.csv
   ├── employees-3.csv

All files must have the same header and structure.


📄 4. CSV File Format

name,email
Alice,alice@example.com
Bob,bob@example.com

🧱 5. Employee Entity

import jakarta.persistence.*;

@Entity
public class Employee {

  @Id
  @GeneratedValue(strategy = GenerationType.SEQUENCE, generator = "emp_id_seq")
  @SequenceGenerator(name = "emp_id_seq", sequenceName = "emp_id_seq", allocationSize = 1)
  private Long id;

  private String name;
  private String email;

  // A no-arg constructor plus getters and setters (omitted for brevity) are
  // required: JPA needs them, and so does the reader's bean mapping of CSV columns.
}

🛠 6. Spring Batch Configuration
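
The snippets in sections 6–8 reference jobRepository, transactionManager, and entityManagerFactory; they are assumed to be fields of a single @Configuration class. A minimal skeleton (the class name is illustrative) could look like this:

import jakarta.persistence.EntityManagerFactory;

import org.springframework.batch.core.repository.JobRepository;
import org.springframework.context.annotation.Configuration;
import org.springframework.transaction.PlatformTransactionManager;

@Configuration
public class MultiCsvBatchConfig {

    private final JobRepository jobRepository;
    private final PlatformTransactionManager transactionManager;
    private final EntityManagerFactory entityManagerFactory;

    public MultiCsvBatchConfig(JobRepository jobRepository,
                               PlatformTransactionManager transactionManager,
                               EntityManagerFactory entityManagerFactory) {
        this.jobRepository = jobRepository;
        this.transactionManager = transactionManager;
        this.entityManagerFactory = entityManagerFactory;
    }

    // The reader, step, job, and writer beans from the sections below go here.
}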

➡️ MultiResourceItemReader + FlatFileItemReader

@Bean
public MultiResourceItemReader<Employee> multiFileReader() {
    MultiResourceItemReader<Employee> reader = new MultiResourceItemReader<>();
    FileSystemResource[] resources = {
        new FileSystemResource("E:/csv-files/employees-1.csv"),
        new FileSystemResource("E:/csv-files/employees-2.csv")
    };
    reader.setResources(resources);
    // The delegate is opened once per file; MultiResourceItemReader injects
    // each resource into it before reading.
    reader.setDelegate(singleFileReader());
    return reader;
}

@Bean
public FlatFileItemReader<Employee> singleFileReader() {
    return new FlatFileItemReaderBuilder<Employee>()
            .name("employeeReader")
            // No .resource(...) here: MultiResourceItemReader supplies it per file.
            .delimited()
            .names("name", "email")
            .linesToSkip(1)   // skip the header row
            .targetType(Employee.class)
            .build();
}

📝 7. Step & Job Configuration

@Bean
public Step csvStep() {
    // jobRepository and transactionManager are the fields injected into the
    // configuration class shown in the skeleton above.
    return new StepBuilder("csv-step", jobRepository)
            .<Employee, Employee>chunk(10, transactionManager)
            .reader(multiFileReader())
            .writer(jpaItemWriter())
            .build();
}

@Bean
public Job csvMultiFileJob() {
    return new JobBuilder("multi-csv-job", jobRepository)
            // Generates fresh parameters on each run so the job can be re-launched.
            .incrementer(new RunIdIncrementer())
            .start(csvStep())
            .build();
}

🖊️ 8. JPA Writer

@Bean
public JpaItemWriter<Employee> jpaItemWriter() {
    JpaItemWriter<Employee> writer = new JpaItemWriter<>();
    // entityManagerFactory is the field injected into the configuration class.
    writer.setEntityManagerFactory(entityManagerFactory);
    return writer;
}

🌐 9. REST API to Trigger the Batch Job

@RestController
@RequestMapping("/job-launcher")
public class JobLauncherController {

    @Autowired
    private JobLauncher jobLauncher;

    @Autowired
    private Job csvMultiFileJob;

    @GetMapping("/run")
    public String runJob() throws Exception {
        JobParameters params = new JobParametersBuilder()
                .addLong("startAt", System.currentTimeMillis()) // unique value, so each call creates a new JobInstance
                .toJobParameters();

        jobLauncher.run(csvMultiFileJob, params);
        return "Multi CSV Job Launched Successfully!";
    }
}
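
With the application running locally (8080 is Spring Boot's default port; adjust if you changed it), the job can be launched with a plain GET request:

curl http://localhost:8080/job-launcher/run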

🔍 10. Advanced Tips & Best Practices

  • Load files dynamically with a wildcard PathMatchingResourcePatternResolver (next section)
  • Validate each file's header before reading it (see the sketch after this list)
  • Use a StepExecutionListener to track which files have been processed
  • Skip bad records with a fault-tolerant step or a custom SkipPolicy (FAQ 3)
  • For large files, use partitioning for parallel reading (FAQ 4)
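
For header validation, FlatFileItemReaderBuilder's skippedLinesCallback hands you every skipped line, which with linesToSkip(1) is exactly the header row. A minimal sketch (the expected header matches the format from section 4):

@Bean
public FlatFileItemReader<Employee> validatingFileReader() {
    return new FlatFileItemReaderBuilder<Employee>()
            .name("validatingEmployeeReader")
            .delimited()
            .names("name", "email")
            .linesToSkip(1)
            // Invoked for each skipped line; fail fast on an unexpected header.
            .skippedLinesCallback(line -> {
                if (!line.trim().equalsIgnoreCase("name,email")) {
                    throw new IllegalStateException("Unexpected CSV header: " + line);
                }
            })
            .targetType(Employee.class)
            .build();
}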

🔧 Loading CSV Files Dynamically (Highly Recommended)

@Bean
public MultiResourceItemReader<Employee> dynamicMultiReader() throws IOException {
    Resource[] resources = new PathMatchingResourcePatternResolver()
            .getResources("file:E:/csv-files/*.csv");

    MultiResourceItemReader<Employee> reader = new MultiResourceItemReader<>();
    reader.setResources(resources);
    reader.setDelegate(singleFileReader());
    // With strict mode off (the default), an empty folder logs a warning
    // instead of failing the step.
    return reader;
}
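
The matched files are processed in file-name order by default. If a different order matters, say oldest upload first, MultiResourceItemReader accepts a custom Comparator. A minimal sketch that could be added to the bean above (the timestamp-based ordering is just an illustration; it needs java.io.IOException and java.io.UncheckedIOException):

// Process the oldest files first instead of the default file-name order.
reader.setComparator((r1, r2) -> {
    try {
        return Long.compare(r1.lastModified(), r2.lastModified());
    } catch (IOException e) {
        throw new UncheckedIOException(e);
    }
});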

❓ FAQ

1. Can MultiResourceItemReader read files with different structures?

No. All files must have the same format.

2. Does it read files in alphabetical order?

By default, yes: MultiResourceItemReader sorts resources by file name. Supply a custom Comparator via setComparator to change the order (see the ordering sketch in the previous section).

3. Can we skip corrupt CSV lines?

Yes, by making the step fault tolerant and configuring skip rules or a custom SkipPolicy; see the sketch below.
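
For example, a fault-tolerant variant of the step from section 7 might look like this (the skip limit and exception type are illustrative choices):

@Bean
public Step faultTolerantCsvStep() {
    return new StepBuilder("csv-step", jobRepository)
            .<Employee, Employee>chunk(10, transactionManager)
            .reader(multiFileReader())
            .writer(jpaItemWriter())
            .faultTolerant()
            // Skip lines that fail to parse, up to 100 across the whole step.
            .skip(FlatFileParseException.class)
            .skipLimit(100)
            .build();
}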

4. Does Spring Batch support parallel reading?

Yes, via partitioning or multi-threaded steps; a partitioning sketch follows.
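
As a sketch of the partitioning approach: Spring Batch ships a MultiResourcePartitioner that creates one partition per file, and each worker step reads a single file through a step-scoped reader. The bean names and concurrency limit below are illustrative, and the beans are assumed to sit in the same configuration class as section 6:

@Bean
public Step partitionedCsvStep() throws IOException {
    // One partition per CSV file found by the wildcard pattern.
    MultiResourcePartitioner partitioner = new MultiResourcePartitioner();
    partitioner.setResources(new PathMatchingResourcePatternResolver()
            .getResources("file:E:/csv-files/*.csv"));

    SimpleAsyncTaskExecutor executor = new SimpleAsyncTaskExecutor();
    executor.setConcurrencyLimit(4); // read at most 4 files in parallel

    return new StepBuilder("partitioned-csv-step", jobRepository)
            .partitioner("csv-worker-step", partitioner)
            .step(csvWorkerStep())
            .taskExecutor(executor)
            .build();
}

@Bean
public Step csvWorkerStep() {
    return new StepBuilder("csv-worker-step", jobRepository)
            .<Employee, Employee>chunk(10, transactionManager)
            .reader(partitionedFileReader(null)) // actual resource injected at runtime
            .writer(jpaItemWriter())
            .build();
}

@Bean
@StepScope
public FlatFileItemReader<Employee> partitionedFileReader(
        @Value("#{stepExecutionContext['fileName']}") Resource file) {
    return new FlatFileItemReaderBuilder<Employee>()
            .name("partitionedEmployeeReader")
            .resource(file) // each worker reads exactly one file
            .delimited()
            .names("name", "email")
            .linesToSkip(1)
            .targetType(Employee.class)
            .build();
}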

5. Can we process thousands of files?

Yes, but use streaming + partitioning + chunk tuning.


🏁 Conclusion

MultiResourceItemReader is a powerful Spring Batch component that makes reading multiple CSV files easy, clean, and scalable. By combining it with chunk processing, JPA writer, and a REST endpoint trigger, you can build production-grade CSV import pipelines in minutes.

Enhance your job further using validation, partitioning, skip logic, and dynamic resource loading.