Spring Boot Resilience4j Retry & CircuitBreaker Tutorial
In a microservices world, failures are normal: network issues, slow downstream services, temporary outages, rate limits… If your Spring Boot services call other APIs or databases, you must design for failure.
In this in-depth guide, we’ll build production-ready fault tolerance using Resilience4j with Spring Boot 3.x – focusing on Retry and CircuitBreaker.
- Resilience4j basics & why it replaced Hystrix
- Spring Boot 3.x setup with Resilience4j (Maven + YAML config)
- Implementing Retry and CircuitBreaker for remote REST calls
- Fallback methods & combining patterns (Retry + CircuitBreaker + TimeLimiter)
- Metrics, monitoring (Actuator), and how to test failures
- Production best practices & common pitfalls to avoid
1. What is Resilience4j and Why Use It?
Resilience4j is a lightweight, modular fault-tolerance library inspired by Netflix Hystrix, but designed for Java 8+ and functional programming.
| Feature | What it gives you |
|---|---|
| Retry | Automatically retry failed operations with backoff |
| CircuitBreaker | Stop hitting a failing service and fail-fast |
| TimeLimiter | Fail calls that exceed a time limit |
| RateLimiter | Throttle calls to external services |
| Bulkhead | Isolate failures and limit concurrent calls |
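All of these patterns share the same lightweight decorator design. As a taste of the functional (non-Spring) API, here is a minimal sketch that decorates a plain `Supplier` with a Retry — `callRemoteService()` is a placeholder, not a real service:

```java
import java.time.Duration;
import java.util.function.Supplier;

import io.github.resilience4j.retry.Retry;
import io.github.resilience4j.retry.RetryConfig;

public class FunctionalStyleDemo {

    public static void main(String[] args) {
        RetryConfig config = RetryConfig.custom()
                .maxAttempts(3)
                .waitDuration(Duration.ofMillis(500))
                .build();

        Retry retry = Retry.of("demo", config);

        // Decorate any Supplier; the decorated call retries up to 3 times
        Supplier<String> decorated =
                Retry.decorateSupplier(retry, FunctionalStyleDemo::callRemoteService);
        System.out.println(decorated.get());
    }

    // Placeholder for a real remote call
    static String callRemoteService() {
        return "OK";
    }
}
```

In the rest of this tutorial we'll use the Spring Boot annotation style instead, but the same registries and configs sit underneath.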
2. Example Scenario – Calling a Remote Pricing Service
Throughout this tutorial, we’ll use a realistic example:
- inventory-service (your main Spring Boot service)
- calls a remote pricing-service over HTTP
- pricing-service sometimes fails or is slow
We’ll protect the remote call with:
- @Retry – auto-retry a few times if it fails
- @CircuitBreaker – open the circuit if the failure rate is high
- Fallback – return cached / default price when downstream is down
3. Project Setup – Spring Boot 3.x + Resilience4j
3.1. Maven Dependencies
```xml
<dependencies>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>
    <!-- Required: the @Retry/@CircuitBreaker annotations are implemented as Spring AOP aspects -->
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-aop</artifactId>
    </dependency>
    <!-- Resilience4j Spring Boot 3 integration (not managed by the Boot BOM, so a version
         is required; check Maven Central for the latest 2.x release) -->
    <dependency>
        <groupId>io.github.resilience4j</groupId>
        <artifactId>resilience4j-spring-boot3</artifactId>
        <version>2.2.0</version>
    </dependency>
    <!-- Optional: Actuator for health & metrics -->
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-actuator</artifactId>
    </dependency>
    <!-- Optional: Micrometer Prometheus registry -->
    <dependency>
        <groupId>io.micrometer</groupId>
        <artifactId>micrometer-registry-prometheus</artifactId>
    </dependency>
</dependencies>
```
4. Basic Resilience4j Configuration (application.yml)
4.1. Enabling Actuator Endpoints (recommended)
```yaml
management:
  endpoints:
    web:
      exposure:
        include: health,info,metrics,prometheus
  endpoint:
    health:
      show-details: always
```
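If you also want circuit-breaker state reflected under /actuator/health, Resilience4j can register health indicators. A minimal sketch using its `register-health-indicator` property, wired into the `default` config we define below:

```yaml
management:
  health:
    circuitbreakers:
      enabled: true

resilience4j:
  circuitbreaker:
    configs:
      default:
        register-health-indicator: true
```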
4.2. Retry & CircuitBreaker Base Configuration
Create or update application.yml:
```yaml
resilience4j:
  retry:
    configs:
      default:
        max-attempts: 3          # 1 initial call + 2 retries
        wait-duration: 500ms     # wait between attempts
        enable-exponential-backoff: true
        exponential-backoff-multiplier: 2
        retry-exceptions:
          - java.io.IOException
          - org.springframework.web.client.ResourceAccessException
        ignore-exceptions:
          - com.example.demo.exception.BusinessException
    instances:
      priceService:
        base-config: default
  circuitbreaker:
    configs:
      default:
        sliding-window-type: COUNT_BASED
        sliding-window-size: 20            # number of calls to measure
        minimum-number-of-calls: 10
        failure-rate-threshold: 50         # percentage
        wait-duration-in-open-state: 10s   # how long circuit stays OPEN
        permitted-number-of-calls-in-half-open-state: 3
        automatic-transition-from-open-to-half-open-enabled: true
        record-exceptions:
          - java.io.IOException
          - org.springframework.web.client.HttpServerErrorException
          - org.springframework.web.client.ResourceAccessException
        ignore-exceptions:
          - com.example.demo.exception.BusinessException
    instances:
      priceService:
        base-config: default
```
4.3. TimeLimiter (Optional but Important for Slow APIs)
```yaml
resilience4j:
  timelimiter:
    configs:
      default:
        timeout-duration: 2s
        cancel-running-future: true
    instances:
      priceService:
        base-config: default
```
5. 🔥 Deep Dive: Resilience4j Configuration Properties Explained
Understanding why each configuration exists helps you tune correctness and performance for production workloads. Below is a full breakdown of the most important properties we used.
5.1. 🔁 Retry Configuration — Meaning of Each Property
| Property | What It Controls | Recommended Range |
|---|---|---|
| `max-attempts` | Total attempts allowed (the initial call plus retries) | 3–5 |
| `wait-duration` | Delay between retry attempts | 200–800ms |
| `enable-exponential-backoff` | Whether to increase the wait time after each retry | `true` in most cases |
| `exponential-backoff-multiplier` | How much the delay grows each retry | 1.5–2 |
| `retry-exceptions` | Only these exception types trigger a retry | Network / I/O errors |
| `ignore-exceptions` | These are treated as business errors → fail immediately | Validation / business exceptions |
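If you ever need the same policy outside Spring's auto-configuration, the YAML maps one-to-one onto the programmatic builder. A minimal sketch mirroring the retry config above (the nested `BusinessException` stands in for `com.example.demo.exception.BusinessException` from the YAML):

```java
import java.io.IOException;
import java.time.Duration;

import org.springframework.web.client.ResourceAccessException;

import io.github.resilience4j.core.IntervalFunction;
import io.github.resilience4j.retry.Retry;
import io.github.resilience4j.retry.RetryConfig;

public class RetryConfigSketch {

    // Stand-in for com.example.demo.exception.BusinessException from the YAML
    static class BusinessException extends RuntimeException {}

    static Retry buildPriceServiceRetry() {
        RetryConfig config = RetryConfig.custom()
                .maxAttempts(3)  // 1 initial call + 2 retries
                // waits 500ms, then 1000ms: exponential backoff with multiplier 2
                .intervalFunction(IntervalFunction.ofExponentialBackoff(Duration.ofMillis(500), 2))
                .retryExceptions(IOException.class, ResourceAccessException.class)
                .ignoreExceptions(BusinessException.class)  // business errors fail fast
                .build();
        return Retry.of("priceService", config);
    }
}
```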
5.2. 🛡 CircuitBreaker Configuration — Meaning of Each Property
| Property | What It Controls | Why It Matters |
|---|---|---|
| `sliding-window-type` | Whether to measure calls by count or time | COUNT_BASED is easier for REST APIs; TIME_BASED for streaming |
| `sliding-window-size` | Number of recent calls to consider | Larger window = more stable decision, slower to react |
| `minimum-number-of-calls` | Min calls before evaluating the failure rate | Prevents the circuit opening on small sample sizes |
| `failure-rate-threshold` | % of failed calls required to open the circuit | 50% is a common starting point |
| `wait-duration-in-open-state` | How long the circuit stays OPEN before trying again | Enough time for the dependency to recover (e.g. 10–30s) |
| `permitted-number-of-calls-in-half-open-state` | Number of trial calls in HALF_OPEN | 3–10 is typical; too high can cause spikes |
| `automatic-transition-from-open-to-half-open-enabled` | Automatically move from OPEN to HALF_OPEN after the wait duration | `true` is usually what you want |
| `record-exceptions` | Which exceptions count as failures for the failure rate | Real downstream failures (5xx, timeouts, I/O) |
| `ignore-exceptions` | Exceptions that should not count as circuit failures | Business rules, validation errors, etc. |
CircuitBreaker protects your system by stopping calls to a failing dependency. The sliding-window size and failure threshold together decide when the breaker trips: too small a window leads to flapping (the circuit opens and closes frequently).
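The same settings can be expressed with the programmatic builder; a sketch of the equivalent of our YAML from section 4.2:

```java
import java.time.Duration;

import io.github.resilience4j.circuitbreaker.CircuitBreaker;
import io.github.resilience4j.circuitbreaker.CircuitBreakerConfig;
import io.github.resilience4j.circuitbreaker.CircuitBreakerConfig.SlidingWindowType;

public class CircuitBreakerConfigSketch {

    static CircuitBreaker buildPriceServiceBreaker() {
        CircuitBreakerConfig config = CircuitBreakerConfig.custom()
                .slidingWindowType(SlidingWindowType.COUNT_BASED)
                .slidingWindowSize(20)                    // evaluate the last 20 calls
                .minimumNumberOfCalls(10)                 // don't judge on tiny samples
                .failureRateThreshold(50)                 // open at >= 50% failures
                .waitDurationInOpenState(Duration.ofSeconds(10))
                .permittedNumberOfCallsInHalfOpenState(3) // trial calls in HALF_OPEN
                .automaticTransitionFromOpenToHalfOpenEnabled(true)
                .build();
        return CircuitBreaker.of("priceService", config);
    }
}
```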
5.3. ⏱ TimeLimiter — Timeout Behavior (Optional but Highly Recommended)
| Property | Description | Typical Value |
|---|---|---|
| `timeout-duration` | Maximum time allowed for a remote call | 1–3 seconds for REST APIs |
| `cancel-running-future` | Whether to cancel the underlying task when the timeout occurs | `true` in most cases |
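For completeness, the programmatic equivalent of these two properties (a sketch, reusing our `priceService` instance name):

```java
import java.time.Duration;

import io.github.resilience4j.timelimiter.TimeLimiter;
import io.github.resilience4j.timelimiter.TimeLimiterConfig;

public class TimeLimiterConfigSketch {

    static TimeLimiter buildPriceServiceTimeLimiter() {
        TimeLimiterConfig config = TimeLimiterConfig.custom()
                .timeoutDuration(Duration.ofSeconds(2))  // fail calls slower than 2s
                .cancelRunningFuture(true)               // cancel the underlying future on timeout
                .build();
        return TimeLimiter.of("priceService", config);
    }
}
```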
5.4. 🎯 Suggested Defaults for Real Projects
- Retries: 3–4 attempts with exponential backoff
- Timeout (TimeLimiter): 2 seconds
- CircuitBreaker sliding window: 20 calls
- Failure threshold: 50%
- Half-open trial calls: 3–5
These defaults work well for REST APIs with P99 latency under ~500ms. You should always tune them later using real production metrics from Prometheus / Grafana.
6. Implementing a Resilient Client with Retry & CircuitBreaker
6.1. DTO for Remote Response
```java
import java.math.BigDecimal;

public record PriceResponse(
        Long productId,
        BigDecimal price,
        String currency
) {}
```
6.2. RestTemplate Bean
```java
import java.time.Duration;

import org.springframework.boot.web.client.RestTemplateBuilder;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.web.client.RestTemplate;

@Configuration
public class HttpClientConfig {

    @Bean
    public RestTemplate restTemplate(RestTemplateBuilder builder) {
        return builder
                // on Spring Boot 3.4+ these builder methods are named connectTimeout(..)/readTimeout(..)
                .setConnectTimeout(Duration.ofSeconds(2))
                .setReadTimeout(Duration.ofSeconds(2))
                .build();
    }
}
```
6.3. PriceClient – Remote Call Wrapped with Resilience4j
```java
import java.math.BigDecimal;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.http.ResponseEntity;
import org.springframework.stereotype.Service;
import org.springframework.web.client.RestTemplate;

import io.github.resilience4j.circuitbreaker.annotation.CircuitBreaker;
import io.github.resilience4j.retry.annotation.Retry;

@Service
public class PriceClient {

    private static final Logger log = LoggerFactory.getLogger(PriceClient.class);

    private final RestTemplate restTemplate;

    @Value("${pricing-service.base-url:http://localhost:8081}")
    private String pricingServiceBaseUrl;

    public PriceClient(RestTemplate restTemplate) {
        this.restTemplate = restTemplate;
    }

    @Retry(name = "priceService", fallbackMethod = "getPriceFallback")
    @CircuitBreaker(name = "priceService", fallbackMethod = "getPriceFallback")
    public PriceResponse getPrice(Long productId) {
        String url = pricingServiceBaseUrl + "/api/prices/" + productId;
        ResponseEntity<PriceResponse> response =
                restTemplate.getForEntity(url, PriceResponse.class);
        return response.getBody();
    }

    /**
     * Fallback method: same parameters as the protected method, plus a
     * Throwable as the last parameter, and the same return type.
     */
    public PriceResponse getPriceFallback(Long productId, Throwable throwable) {
        // Return cached / default / last-known price
        BigDecimal defaultPrice = BigDecimal.ZERO;
        // Log the root cause so fallback usage is visible in production
        log.warn("Fallback triggered for product {} due to: {} - {}",
                productId, throwable.getClass().getSimpleName(), throwable.getMessage());
        return new PriceResponse(productId, defaultPrice, "USD");
    }
}
```
When both annotations sit on the same method, Resilience4j's default aspect order wraps Retry around CircuitBreaker, so every retry attempt passes through the breaker and is recorded in its sliding window.
7. Exposing a REST Endpoint that Uses the Resilient Client
```java
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
@RequestMapping("/api/products")
public class ProductController {

    private final PriceClient priceClient;

    public ProductController(PriceClient priceClient) {
        this.priceClient = priceClient;
    }

    @GetMapping("/{id}/price")
    public PriceResponse getProductPrice(@PathVariable Long id) {
        return priceClient.getPrice(id);
    }
}
```
Now, every call to /api/products/{id}/price is protected by Retry and CircuitBreaker.
8. How Retry Works in Resilience4j (Under the Hood)
With our earlier YAML:
- max-attempts: 3 → 1 initial call + 2 retries.
- wait-duration: 500ms → wait 500ms between attempts.
- enable-exponential-backoff: true → wait times grow (500ms, 1000ms, 2000ms…).
- retry-exceptions → retry only IOException and ResourceAccessException.
- ignore-exceptions → do not retry BusinessException (fail fast).
8.1. Visual Timeline
```text
Initial call ──X (IOException)
      │
      │ wait 500ms
      ▼
Retry #1 ────X (IOException)
      │
      │ wait 1000ms
      ▼
Retry #2 ────X (IOException)
      │
      ▼
All attempts failed → Fallback method invoked
```
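You don't have to guess when retries happen: Resilience4j publishes events per attempt. A small sketch that logs them, assuming the `RetryRegistry` that resilience4j-spring-boot3 auto-configures:

```java
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.stereotype.Component;

import io.github.resilience4j.retry.Retry;
import io.github.resilience4j.retry.RetryRegistry;

@Component
public class RetryEventLogger {

    private static final Logger log = LoggerFactory.getLogger(RetryEventLogger.class);

    public RetryEventLogger(RetryRegistry retryRegistry) {
        Retry retry = retryRegistry.retry("priceService");
        // Fires once per retry attempt (not for the initial call)
        retry.getEventPublisher().onRetry(event ->
                log.info("Retry #{} for '{}' after waiting {}",
                        event.getNumberOfRetryAttempts(),
                        event.getName(),
                        event.getWaitInterval()));
    }
}
```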
8.2. Custom Retry Instance for a Specific API
You can override configuration per instance:
```yaml
resilience4j:
  retry:
    instances:
      priceService:
        max-attempts: 5
        wait-duration: 300ms
        enable-exponential-backoff: true
        exponential-backoff-multiplier: 1.5
```
Once a call fails with one of the configured exceptions, Resilience4j will retry according to this config before giving up and calling the fallback.
9. How CircuitBreaker Works in Resilience4j
9.1. States: CLOSED → OPEN → HALF_OPEN
- CLOSED: normal operation. All calls are allowed and counted.
- OPEN: too many failures → short-circuit; calls fail immediately with CallNotPermittedException.
- HALF_OPEN: a limited number of trial calls are allowed; if they succeed → back to CLOSED, otherwise → back to OPEN.
9.2. Visual State Diagram (Textual)
```text
[CLOSED]
    │ (failure rate > threshold within sliding window)
    ▼
[OPEN]
    │ (after wait-duration-in-open-state)
    ▼
[HALF_OPEN]
    │
    ├──► [CLOSED]   (if trial calls succeed)
    │
    └──► [OPEN]     (if trial calls fail)
```
From our YAML:
- sliding-window-size: 20 → consider last 20 calls.
- failure-rate-threshold: 50 → if 10 out of 20 calls fail → circuit goes OPEN.
- wait-duration-in-open-state: 10s → stay OPEN for 10 seconds before trying HALF_OPEN.
- permitted-number-of-calls-in-half-open-state: 3 → only 3 test calls allowed.
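These transitions can be observed at runtime through the breaker's event publisher. A sketch, again assuming the auto-configured registry:

```java
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.stereotype.Component;

import io.github.resilience4j.circuitbreaker.CircuitBreaker;
import io.github.resilience4j.circuitbreaker.CircuitBreakerRegistry;

@Component
public class CircuitBreakerStateLogger {

    private static final Logger log = LoggerFactory.getLogger(CircuitBreakerStateLogger.class);

    public CircuitBreakerStateLogger(CircuitBreakerRegistry registry) {
        CircuitBreaker breaker = registry.circuitBreaker("priceService");
        // Logs CLOSED -> OPEN, OPEN -> HALF_OPEN, HALF_OPEN -> CLOSED, ...
        breaker.getEventPublisher().onStateTransition(event ->
                log.info("CircuitBreaker '{}' transition: {}",
                        event.getCircuitBreakerName(),
                        event.getStateTransition()));
    }
}
```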
10. Combining Retry + CircuitBreaker + TimeLimiter
Retry and CircuitBreaker address errors. But what about slow responses? That’s where TimeLimiter helps.
10.1. TimeLimiter Configuration Recap
```yaml
resilience4j:
  timelimiter:
    instances:
      priceService:
        timeout-duration: 2s
        cancel-running-future: true
```
10.2. Using TimeLimiter with CompletableFuture
TimeLimiter works with async calls. Example with CompletableFuture:
```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.function.Supplier;

import org.springframework.stereotype.Service;
import org.springframework.web.client.RestTemplate;

import io.github.resilience4j.timelimiter.TimeLimiter;
import io.github.resilience4j.timelimiter.TimeLimiterRegistry;

@Service
public class AsyncPriceClient {

    private final RestTemplate restTemplate;
    private final TimeLimiter timeLimiter;
    // Scheduler the TimeLimiter uses to enforce the timeout
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    public AsyncPriceClient(RestTemplate restTemplate,
                            TimeLimiterRegistry timeLimiterRegistry) {
        this.restTemplate = restTemplate;
        this.timeLimiter = timeLimiterRegistry.timeLimiter("priceService");
    }

    public CompletableFuture<PriceResponse> getPriceAsync(Long productId) {
        Supplier<CompletableFuture<PriceResponse>> supplier =
                () -> CompletableFuture.supplyAsync(() -> {
                    String url = "http://localhost:8081/api/prices/" + productId;
                    return restTemplate.getForObject(url, PriceResponse.class);
                });

        // executeCompletionStage requires a scheduler to schedule the timeout check
        return timeLimiter.executeCompletionStage(scheduler, supplier)
                .toCompletableFuture();
    }
}
```
If the remote call takes longer than the configured timeout-duration, a TimeoutException will be thrown.
You can wrap this with @Retry and @CircuitBreaker as well for a full protection stack.
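With annotations, the full stack on one async method could look like the sketch below (a sketch, not the only layout: note that @TimeLimiter requires a CompletionStage return type, and the fallback must return a CompletableFuture too):

```java
import java.math.BigDecimal;
import java.util.concurrent.CompletableFuture;

import org.springframework.stereotype.Service;
import org.springframework.web.client.RestTemplate;

import io.github.resilience4j.circuitbreaker.annotation.CircuitBreaker;
import io.github.resilience4j.retry.annotation.Retry;
import io.github.resilience4j.timelimiter.annotation.TimeLimiter;

@Service
public class GuardedPriceClient {

    private final RestTemplate restTemplate;

    public GuardedPriceClient(RestTemplate restTemplate) {
        this.restTemplate = restTemplate;
    }

    // Default aspect order: Retry( CircuitBreaker( TimeLimiter( method ) ) )
    @Retry(name = "priceService")
    @CircuitBreaker(name = "priceService", fallbackMethod = "priceFallback")
    @TimeLimiter(name = "priceService")
    public CompletableFuture<PriceResponse> getPriceAsync(Long productId) {
        return CompletableFuture.supplyAsync(() ->
                restTemplate.getForObject(
                        "http://localhost:8081/api/prices/" + productId,
                        PriceResponse.class));
    }

    // Fallback for async methods must also return a CompletableFuture
    public CompletableFuture<PriceResponse> priceFallback(Long productId, Throwable t) {
        return CompletableFuture.completedFuture(
                new PriceResponse(productId, BigDecimal.ZERO, "USD"));
    }
}
```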
11. Observability: Metrics & Monitoring Resilience4j
With spring-boot-starter-actuator and Resilience4j on the classpath, you automatically get metrics:
- resilience4j.circuitbreaker.calls
- resilience4j.retry.calls
- resilience4j.timelimiter.calls

(Prometheus exports these with underscores, e.g. resilience4j_circuitbreaker_calls.)
11.1. Check Metrics via Actuator
GET http://localhost:8080/actuator/metrics/resilience4j.circuitbreaker.calls
GET http://localhost:8080/actuator/metrics/resilience4j.retry.calls
11.2. Prometheus + Grafana (Optional)
If you added micrometer-registry-prometheus, expose metrics:
```yaml
management:
  endpoints:
    web:
      exposure:
        include: health,metrics,prometheus
```
Then you can scrape /actuator/prometheus from Prometheus and build Grafana dashboards:
- Circuit state (open / closed)
- Failure rate over time
- Retry count, timeouts, fallback usage
12. Testing Failure Scenarios (Very Important)
Don’t just rely on happy-path tests. You should simulate failures:
12.1. Simulate Intermittent Failures in Fake Pricing Service
```java
import java.math.BigDecimal;
import java.util.Random;

import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
@RequestMapping("/api/prices")
public class FakePricingController {

    private final Random random = new Random();

    @GetMapping("/{id}")
    public PriceResponse getPrice(@PathVariable Long id) {
        int value = random.nextInt(10);
        if (value < 3) {
            // 30% of the time: simulate a slow response
            try {
                Thread.sleep(3000);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        if (value > 7) {
            // 20% of the time: simulate a failure
            throw new RuntimeException("Downstream pricing service failed");
        }
        return new PriceResponse(id, BigDecimal.valueOf(99.99), "USD");
    }
}
```
Now, hit /api/products/{id}/price multiple times and watch:
- Retries being applied
- CircuitBreaker opening after too many failures
- Fallback being used when circuit is open or all retries fail
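For repeatable checks, you can also assert breaker behavior directly in a test. A minimal sketch, assuming JUnit 5 and the auto-configured CircuitBreakerRegistry (transitionToOpenState forces the state without waiting for real failures):

```java
import static org.junit.jupiter.api.Assertions.assertEquals;

import org.junit.jupiter.api.Test;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;

import io.github.resilience4j.circuitbreaker.CircuitBreaker;
import io.github.resilience4j.circuitbreaker.CircuitBreakerRegistry;

@SpringBootTest
class PriceServiceCircuitBreakerTest {

    @Autowired
    private CircuitBreakerRegistry registry;

    @Test
    void circuitCanBeForcedOpenForTesting() {
        CircuitBreaker breaker = registry.circuitBreaker("priceService");

        // Force the breaker OPEN instead of waiting for 10 real failures
        breaker.transitionToOpenState();
        assertEquals(CircuitBreaker.State.OPEN, breaker.getState());

        // Reset back to CLOSED so other tests are unaffected
        breaker.reset();
        assertEquals(CircuitBreaker.State.CLOSED, breaker.getState());
    }
}
```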
13. Best Practices for Resilience4j in Production
13.1. Use Different Instances per Remote Dependency
Don’t reuse a single priceService config for everything. Instead:
```yaml
resilience4j:
  circuitbreaker:
    instances:
      priceService:   { base-config: default }
      stockService:   { base-config: default }
      paymentService: { base-config: default }
```
This way, one noisy dependency doesn’t affect others.
13.2. Choose the Right Retry Count
- Too many retries → more pressure on an already slow/failing service.
- Too few retries → you won't recover from transient network glitches.
- Common choice: 3–5 attempts with exponential backoff.
13.3. Always Provide Meaningful Fallbacks
- Return cached data or last-known-good values when possible (see the sketch after this list).
- Return a well-formed error response instead of raw exceptions.
- Log fallback usage with enough context for debugging.
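A small sketch of the "last-known-good" idea: cache each successful response and serve it from the fallback. The class and wiring are hypothetical, not part of the earlier PriceClient:

```java
import java.math.BigDecimal;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

import org.springframework.stereotype.Service;

// Hypothetical helper: remembers the last good price per product
@Service
public class LastKnownPriceCache {

    private final Map<Long, PriceResponse> lastKnownPrices = new ConcurrentHashMap<>();

    // Call this after every successful remote fetch
    public void remember(PriceResponse response) {
        lastKnownPrices.put(response.productId(), response);
    }

    // Use this inside the fallback method instead of a hard-coded default
    public PriceResponse lastKnownOrDefault(Long productId) {
        return lastKnownPrices.getOrDefault(productId,
                new PriceResponse(productId, BigDecimal.ZERO, "USD"));
    }
}
```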
13.4. Don’t Abuse CircuitBreaker
- Use it only for remote calls (HTTP, DB, external systems).
- Don’t wrap CPU-heavy local operations with CircuitBreaker.
13.5. Monitor Metrics and Tune Continually
Start with conservative defaults, then adjust:
- failure-rate-threshold (e.g., 50%)
- sliding-window-size
- timeout-duration (TimeLimiter)
Use real production metrics to tune these values over time.
14. Summary & Next Steps
By integrating Resilience4j with Spring Boot 3, you get a powerful, flexible toolkit for building resilient microservices.
- @Retry handles transient failures.
- @CircuitBreaker prevents cascading failures from broken dependencies.
- TimeLimiter ensures slow calls don’t hog resources.
- Fallback strategies keep the user experience graceful.
- Metrics + Actuator provide visibility into real-world behavior.
Combine these patterns with good observability and careful tuning, and your Spring Boot services will stay responsive even when dependencies fail.
15. FAQ: Resilience4j Retry & CircuitBreaker in Spring Boot
Q1. What is Resilience4j used for in Spring Boot?
Resilience4j provides fault-tolerance patterns like Retry, CircuitBreaker, TimeLimiter, RateLimiter and Bulkhead. In Spring Boot it’s commonly used to protect HTTP calls to external APIs, databases and other microservices.
Q2. Can I use Retry and CircuitBreaker together?
Yes. A common pattern is to apply @Retry for transient errors and @CircuitBreaker to stop calls when failure rate is high.
You can annotate the same method with both and share the same named configuration instance.
Q3. When should I use TimeLimiter?
Use TimeLimiter when you want to fail calls that are taking too long (e.g. slow downstream API).
It’s especially useful for async / non-blocking or CompletableFuture-based calls.
Q4. What is a good starting configuration for CircuitBreaker?
For typical REST APIs, a good starting point is:
sliding-window-size = 20, failure-rate-threshold = 50,
wait-duration-in-open-state = 10s, permitted-number-of-calls-in-half-open-state = 3 — then tune using metrics.
Q5. Should I retry all exceptions?
No. Only retry transient errors like I/O issues or timeouts. Never retry business exceptions (validation failures, domain errors) – they will not succeed on retry.