DEV Community

Taras Antoniuk
Taras Antoniuk

Posted on

How JOIN FETCH Reduced Database Load by 94%: A Real-World Case Study

ðŸŽŊ Introduction

The N+1 problem is one of the most common causes of high database load in Spring Boot applications. In this article, I'll show you how to systematically solve this problem using a real-world financial system project.

🔍 What is the N+1 Problem?

The N+1 problem occurs when an ORM generates additional SELECT queries to load related entities.

Example of the Problem

@Entity
public class ExternalExchangeRate {
    @ManyToOne(fetch = FetchType.LAZY)
    private Currency currencyFrom;

    @ManyToOne(fetch = FetchType.LAZY)
    private Currency currencyTo;
}
Enter fullscreen mode Exit fullscreen mode

ðŸ’Ą View ExternalExchangeRate code →

When executing:

List<ExternalExchangeRate> rates = repository.findByExchangeDateAndCurrencyFromId(date, currencyFromId);
for (ExternalExchangeRate rate : rates) {
    System.out.println(rate.getCurrencyFrom().getCode()); // N queries!
    System.out.println(rate.getCurrencyTo().getCode());   // N more queries!
}
Enter fullscreen mode Exit fullscreen mode

Result: 1 + N*2 queries to the database

For example, if we have 15 currency rates for a day:

  • 1 SELECT for rates
  • 15 SELECT for currencyFrom
  • 15 SELECT for currencyTo
  • Total: 31 queries! ðŸ˜ą

🔎 Detecting the N+1 Problem

Configuration for Development Mode

# application-dev.properties
spring.jpa.show-sql=true
spring.jpa.properties.hibernate.format_sql=true
spring.jpa.properties.hibernate.generate_statistics=true
spring.jpa.properties.hibernate.use_sql_comments=true
logging.level.org.hibernate.SQL=DEBUG
Enter fullscreen mode Exit fullscreen mode

ðŸ’Ą View application-dev.properties →

Debug Endpoint for Monitoring

@RestController
@RequestMapping("/api/debug")
public class DebugController {

    @Autowired
    private EntityManagerFactory emf;

    @GetMapping("/hibernate-stats")
    public Map<String, Object> getHibernateStats() {
        Statistics stats = emf.unwrap(SessionFactory.class)
                              .getStatistics();

        return Map.of(
            "queriesExecuted", stats.getQueryExecutionCount(),
            "prepareStatementCount", stats.getPrepareStatementCount(),
            "entitiesLoaded", stats.getEntityLoadCount(),
            "entitiesFetched", stats.getEntityFetchCount()
        );
    }
}
Enter fullscreen mode Exit fullscreen mode

ðŸ’Ą View DebugController code →

Measurements BEFORE Optimization

GET /api/external-exchange-rates/latest?date=2024-11-24&currencyFromId=2

Result:
- Time: 48ms
- prepareStatementCount: 33  ← Real SQL statements executed!
- queriesExecuted: 2  ← HQL/JPQL query types
- entitiesLoaded: 131
- entitiesFetched: 31  ← Additional lazy fetches!
Enter fullscreen mode Exit fullscreen mode

ðŸ’Ą Understanding the metrics:

queriesExecuted = HQL/JPQL query types (2 types of queries)

prepareStatementCount = Actual JDBC statement executions (33 SQL queries!)

entitiesFetched = Lazy entity fetches (31 Currency entities loaded separately)

SQL log showed:

-- 1 query for exchange rates
SELECT * FROM external_exchange_rates 
WHERE exchange_date = '2024-11-24' 
AND currency_from_id = 2;

-- 31 additional queries for currencies!
SELECT * FROM currencies WHERE id = 1;
SELECT * FROM currencies WHERE id = 3;
SELECT * FROM currencies WHERE id = 4;
-- ... 28 more queries
Enter fullscreen mode Exit fullscreen mode

✅ When JOIN FETCH Works Perfectly

Flat Data Structure (@ManyToOne)

JOIN FETCH is ideal for flat data structures where an entity has variables that reference only one entity from another table.

Perfect use case example:

@Entity
public class ExternalExchangeRate {
    @Id
    private Long id;

    // Each rate references ONE currency FROM
    @ManyToOne(fetch = FetchType.LAZY)
    private Currency currencyFrom;

    // Each rate references ONE currency TO
    @ManyToOne(fetch = FetchType.LAZY)
    private Currency currencyTo;

    private BigDecimal rate;
}
Enter fullscreen mode Exit fullscreen mode

Why this works perfectly:

  • 1 ExternalExchangeRate → 1 Currency (from)
  • 1 ExternalExchangeRate → 1 Currency (to)
  • No multiple relationships (@OneToMany)
  • Pagination works correctly ✅
  • COUNT query is accurate ✅

⚠ïļ When JOIN FETCH Is NOT Suitable

Table Parts and Collections (@OneToMany)

JOIN FETCH is NOT an optimal solution when an entity has a variable that references a collection of data (e.g., document line items).

Problematic example:

@Entity
public class Invoice {
    @Id
    private Long id;

    @ManyToOne
    private Customer customer;

    // ⚠ïļ PROBLEM: Collection of items!
    @OneToMany(mappedBy = "invoice")
    private List<InvoiceItem> items;  // Could be 1, 10, 100+ items!
}
Enter fullscreen mode Exit fullscreen mode

The Problem with Pagination

❌ Wrong approach:

@Query("SELECT DISTINCT i FROM Invoice i " +
       "LEFT JOIN FETCH i.items")  // ⚠ïļ PROBLEM!
Page<Invoice> findAll(Pageable pageable);
Enter fullscreen mode Exit fullscreen mode

Why this doesn't work:

  1. Cartesian product - 1 Invoice with 10 items → 10 rows in result
  2. Pagination breaks - Page size 20 might return only 2 invoices!
  3. COUNT incorrect - Counts rows after JOIN, not invoices

⚠ïļ Hibernate Warning

HHH000104: firstResult/maxResults specified with collection fetch; 
applying in memory!
Enter fullscreen mode Exit fullscreen mode

This means Hibernate will load all data and apply pagination in memory - losing all benefits!


🔄 Alternatives for Collections

For collections with @OneToMany:

  1. @EntityGraph - Better Hibernate understanding
  2. Two separate queries - Full control
  3. DTO Projection - When details not needed
  4. Batch Fetching - Optimizes multiple queries

📝 Note: I'll cover @EntityGraph in detail in my next article with a real document + line items example.


📊 Comparison Table

Feature JOIN FETCH (@ManyToOne) JOIN FETCH (@OneToMany)
Pagination ✅ Works perfectly ❌ Breaks
COUNT accuracy ✅ Correct ❌ Counts rows
Number of queries ✅ 1-2 queries ⚠ïļ All in memory
Predictability ✅ Stable ❌ Data dependent
Recommendation ✅ Use it ❌ Avoid it

ðŸ’Ą Solution: JOIN FETCH for Flat Structures

Step 1: Creating Optimized Repository Methods

@Repository
public interface ExternalExchangeRateRepository 
        extends JpaRepository<ExternalExchangeRate, Long> {

    /**
     * Find latest rates by date and currency from with currencies loaded.
     * Uses JOIN FETCH to prevent N+1 queries.
     */
    @Query("SELECT e FROM ExternalExchangeRate e " +
           "LEFT JOIN FETCH e.currencyFrom " +
           "LEFT JOIN FETCH e.currencyTo " +
           "WHERE e.exchangeDate = :date " +
           "AND e.currencyFrom.id = :currencyFromId")
    List<ExternalExchangeRate> findLatestRatesByCurrencyFromWithCurrencies(
            @Param("date") LocalDate date,
            @Param("currencyFromId") Long currencyFromId);

    /**
     * Find all exchange rates with currencies loaded in a single query.
     */
    @Query(value = "SELECT DISTINCT e FROM ExternalExchangeRate e " +
           "LEFT JOIN FETCH e.currencyFrom " +
           "LEFT JOIN FETCH e.currencyTo " +
           "ORDER BY e.exchangeDate DESC, e.id DESC",
           countQuery = "SELECT COUNT(e) FROM ExternalExchangeRate e")
    Page<ExternalExchangeRate> findAllWithCurrencies(Pageable pageable);
}
Enter fullscreen mode Exit fullscreen mode

ðŸ’Ą View Repository code →

🔑 Key Points:

  1. DISTINCT - avoids duplicates with JOIN
  2. LEFT JOIN FETCH - loads related entities in one query
  3. countQuery - separate COUNT for pagination (without JOIN)
  4. WithCurrencies suffix - clear method naming

Step 2: Updating the Service

@Service
@Transactional(readOnly = true)
public class ExternalExchangeRateService {

    public List<ExternalExchangeRateResponseDTO> getLatestRatesByDateAndCurrencyFrom(
            LocalDate date, Long currencyFromId) {

        // Using optimized method
        List<ExternalExchangeRate> rates = repository
                .findLatestRatesByCurrencyFromWithCurrencies(date, currencyFromId);

        // No N+1 here, all data is loaded! ✅
        return exchangeRateMapper.toResponseDTOList(rates);
    }
}
Enter fullscreen mode Exit fullscreen mode

ðŸ’Ą View Service →

Step 3: Updating Tests

@Test
void getLatestRatesByDateAndCurrencyFrom_WhenExists_ShouldReturnRates() {
    // Given
    LocalDate date = LocalDate.of(2024, 11, 24);
    Long currencyFromId = 2L;

    // Updated to new method!
    when(exchangeRateRepository.findLatestRatesByCurrencyFromWithCurrencies(date, currencyFromId))
            .thenReturn(List.of(rate1, rate2));

    // When
    List<ExternalExchangeRateResponseDTO> result =
            exchangeRateService.getLatestRatesByDateAndCurrencyFrom(date, currencyFromId);

    // Then
    assertNotNull(result);
    assertEquals(2, result.size());
}
Enter fullscreen mode Exit fullscreen mode

ðŸ’Ą View Tests →


🚀 Optimization Results

Performance Metrics:

Metric Before After Improvement
Response Time 48ms 37ms 23% faster ⚡
SQL Queries 33 2 94% reduction ðŸŽŊ
Prepare Time ~413ξs ~61ξs 85% faster ⚡
Execution Time ~15ms ~14ms Similar
Scalability O(n*2) O(1) Linear → Constant 🚀

🎉 Key Achievement:

  • ✅ N+1 Problem: SOLVED
  • SQL queries don't depend on record count
  • For 1000 records: 2001 queries → 2 queries

ðŸĪ” Why Only 23% Response Time Improvement?

You might ask: "94% fewer queries but only 23% faster response time?"

The answer:

  1. Small dataset - Only 15 records in our test case
  2. Fast local database - PostgreSQL on localhost responds in ~0.5-1ms per query
  3. Connection pooling - Spring Boot's default HikariCP reuses connections efficiently
  4. Simple queries - SELECT by ID is very fast with proper indexes

Note: While the application has JDBC batching configured (batch_size=1000), it only affects INSERT/UPDATE operations, not SELECT queries.

Real impact shows at scale:

10 records:    11 queries →  2 queries  (similar time)
100 records:  201 queries →  2 queries  (23% faster)
1000 records: 2001 queries → 2 queries  (90%+ faster!) 🚀
Enter fullscreen mode Exit fullscreen mode

ðŸ’Ą Key Insight: The real win is predictable performance regardless of data size! Performance doesn't degrade as data grows.

Measurements AFTER Optimization

GET /api/external-exchange-rates/latest?date=2024-11-24&currencyFromId=2

Result:
- Time: 37ms (23% faster! ⚡)
- prepareStatementCount: 2 (94% less! ðŸŽŊ)
- queriesExecuted: 1 (only 1 query type now!)
- entitiesFetched: 0 (no additional fetches! ✅)
Enter fullscreen mode Exit fullscreen mode

ðŸ’Ą Key metric: prepareStatementCount dropped from 33 → 2 (94% reduction!)

SQL log:

-- Only 1 query with JOIN! 🎉
SELECT DISTINCT 
    eer.id, eer.exchange_date, eer.rate, eer.source,
    cf.id, cf.code, cf.name, cf.symbol,
    ct.id, ct.code, ct.name, ct.symbol
FROM external_exchange_rates eer
LEFT JOIN currencies cf ON cf.id = eer.currency_from_id
LEFT JOIN currencies ct ON ct.id = eer.currency_to_id
WHERE eer.exchange_date = '2024-11-24'
AND eer.currency_from_id = 2;
Enter fullscreen mode Exit fullscreen mode

🔧 Systematic Optimization of the Entire Project

I optimized 7 core modules with similar approach:

1. ExternalExchangeRate (2 relations)

@Query("SELECT e FROM ExternalExchangeRate e " +
       "LEFT JOIN FETCH e.currencyFrom " +
       "LEFT JOIN FETCH e.currencyTo")
Enter fullscreen mode Exit fullscreen mode

2. Bank (2 relations)

@Query("SELECT b FROM Bank b " +
       "LEFT JOIN FETCH b.country " +
       "LEFT JOIN FETCH b.counterparty")
Enter fullscreen mode Exit fullscreen mode

ðŸ’Ą View Bank module →

3. BankAccount (2 relations)

@Query("SELECT ba FROM BankAccount ba " +
       "LEFT JOIN FETCH ba.bank " +
       "LEFT JOIN FETCH ba.currency")
Enter fullscreen mode Exit fullscreen mode

4. Country (1 relation)

@Query("SELECT c FROM Country c " +
       "LEFT JOIN FETCH c.currency")
Enter fullscreen mode Exit fullscreen mode

ðŸ’Ą View Country entity →

Plus: Counterparty, Organization, and AccountingPolicy ✅


ðŸŽŊ Best Practices

1. Naming Convention

// Clearly indicate what the method does
findAllWithCurrencies()
findByIdWithRelations()
findLatestRatesByCurrencyFromWithCurrencies()
Enter fullscreen mode Exit fullscreen mode

2. Separate Methods for Different Scenarios

// For reading - with JOIN FETCH ✅
@Query("SELECT b FROM Bank b LEFT JOIN FETCH b.country")
List<Bank> findAllWithCountry();

// For duplicate checks - WITHOUT JOIN (faster!) ⚡
boolean existsBySwiftCode(String swiftCode);
Enter fullscreen mode Exit fullscreen mode

3. DISTINCT for Multiple JOINs

@Query("SELECT DISTINCT e FROM Entity e " +
       "LEFT JOIN FETCH e.relation1 " +
       "LEFT JOIN FETCH e.relation2")
Enter fullscreen mode Exit fullscreen mode

4. Separate countQuery for Pagination

@Query(
    value = "SELECT e FROM Entity e LEFT JOIN FETCH e.relation",
    countQuery = "SELECT COUNT(e) FROM Entity e"  // Without JOIN!
)
Page<Entity> findAllWithRelation(Pageable pageable);
Enter fullscreen mode Exit fullscreen mode

❌ Common Mistakes

Mistake 1: Forgetting DISTINCT

// WITHOUT DISTINCT - may have duplicates! ⚠ïļ
@Query("SELECT e FROM Entity e LEFT JOIN FETCH e.collection")
Enter fullscreen mode Exit fullscreen mode

Mistake 2: JOIN in countQuery

// Slow COUNT due to JOIN! 🐌
countQuery = "SELECT COUNT(e) FROM Entity e LEFT JOIN e.relation"
Enter fullscreen mode Exit fullscreen mode

Mistake 3: Not Updating Tests

// Old method no longer exists! ❌
when(repository.findAll()).thenReturn(list);
Enter fullscreen mode Exit fullscreen mode

Mistake 4: JOIN FETCH with @OneToMany and Pagination

// Breaks pagination! ðŸ’Ĩ
@Query("SELECT i FROM Invoice i LEFT JOIN FETCH i.items")
Page<Invoice> findAll(Pageable pageable);
Enter fullscreen mode Exit fullscreen mode

ðŸšŦ When NOT to Use JOIN FETCH

  1. ❌ Collections with pagination
  2. ❌ Document table parts
  3. ❌ @OneToMany with many records
  4. ❌ Duplicate checks
  5. ❌ Count queries
  6. ❌ Delete operations
  7. ❌ Exists checks

📝 Conclusions

Results:

✅ 7 modules optimized

✅ 48 methods with JOIN FETCH

✅ 94% reduction in SQL queries (33 → 2)

✅ 23% improvement in response time (48ms → 37ms)

✅ Scalability: O(n*2) → O(1)

When to Use JOIN FETCH:

✅ Perfect for:

  • Flat structures (@ManyToOne)
  • Fixed number of relationships
  • Pagination without collections
  • Predictable data sizes

❌ Avoid for:

  • Collections (@OneToMany) with pagination
  • Document table parts
  • Unpredictable data sizes

Next Steps:

  1. 📊 Monitor production metrics
  2. ðŸ’ū Consider caching for frequent queries
  3. 🔍 Explore @EntityGraph for collections (next article!)
  4. ⚡ Optimize batch operations

🔗 Useful Resources


🚀 About the Project

Financial Accounting System

📖 View Project on GitHub →


Coming Next: "Optimizing N+1 for Collections with @EntityGraph" 🔜


This article was created based on real production experience with 19+ years in enterprise development. ðŸ‘Ļ‍ðŸ’ŧ

💎 Questions? Drop them in the comments!

⭐ Found it helpful? Give the repo a star!

🔔 Want more? Follow for the next article on @EntityGraph!


ðŸ‘Ļ‍ðŸ’ŧ About the Author

Taras Antoniuk

Java Backend Developer | 19+ years IT experience

📧 bronya2004@gmail.com

🔗 LinkedIn

ðŸ’ŧ HackerRank (5⭐ SQL, Java, Collections)

🌐 Production API

Top comments (0)