A basic website needs little more than a web server to receive requests and a database to read and write data. However, this simple setup only scales so far: once you are handling millions of requests per second, you either optimise the database or rethink the overall data strategy, because the database eventually hits its limit on active connections and struggles to keep up with concurrent requests.
One solution to consider when looking to improve a system’s scalability is caching. Caching is a widely used technique that can be found in many different areas, including web applications, databases, media streaming, e-commerce, gaming, cloud computing, and mobile applications.
Before delving deeper into caching, it is essential first to understand its advantages over using a traditional database alone. Caching can offer several benefits that a database alone may not provide:
- Speed: Caching can significantly improve the speed of an application by storing frequently accessed data in memory, which can be accessed much faster than data stored on disk.
- Scalability: Caching can help scale an application by distributing the load across multiple cache servers, each of which handles a share of the requests.
- Reducing the load on the database: Caching can reduce database load by serving frequently accessed data from memory, cutting the number of requests that reach the database (a cache-aside sketch after this list illustrates the idea).
- Improving availability: Caching can help improve an application’s availability by providing a backup of frequently accessed data in case the database becomes unavailable.
- Cost: Caching can reduce costs by reducing the number of requests to the database, which helps reduce the need for expensive database resources like CPU, memory and storage.
- Data Consistency: When combined with a sound invalidation strategy, a cache can keep serving the most recent data, so what the application reads stays consistent with the source of truth.
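To make the speed and load-reduction points concrete, here is a minimal cache-aside sketch in Python. The in-process dictionary and the `fetch_user_from_db` function are stand-ins invented for the example, not part of any particular framework.

```python
# Minimal cache-aside sketch: check the cache first, fall back to the
# database on a miss, then populate the cache for subsequent reads.
cache = {}  # in-memory cache: user_id -> user record

def fetch_user_from_db(user_id):
    # Hypothetical stand-in for an expensive database query.
    return {"id": user_id, "name": f"user-{user_id}"}

def get_user(user_id):
    user = cache.get(user_id)          # 1. try the cache
    if user is None:                   # 2. cache miss: hit the database
        user = fetch_user_from_db(user_id)
        cache[user_id] = user          # 3. store it for the next request
    return user

print(get_user(42))  # first call reads the database
print(get_user(42))  # second call is served from memory
```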
Let’s look at the caching solutions available to us and determine which one best meets our requirements. Caching in a high-scale application can be a challenging task. Here are a few strategies that can be used to cache in a high-scale application:
- In-memory Caching: In-memory caching stores the cache data in the memory, allowing faster access times. This strategy is helpful for high-scale applications that handle a high volume of read operations.
- Cache Sharding: Cache sharding is a technique that divides a large cache into smaller partitions, which can be stored on different servers. This helps distribute the load and reduces the risk of a single point of failure. Each shard is responsible for a specific subset of the data, selected by the key being accessed (see the sketch after this list).
- Distributed caching: Distributed caching allows you to cache data across multiple servers, which can help reduce the load on any one server and improve the system’s overall performance. Distributed caching systems can be further divided into replicated and partitioned caching. Replicated caching stores a copy of the data on multiple servers, while partitioned caching divides the data into smaller chunks and stores them on different servers.
- Content Delivery Networks (CDNs): CDNs are networks of servers distributed across the globe. They cache content and serve it to users from the location geographically closest to them.
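As a rough illustration of cache sharding, the sketch below hashes each key to pick one of a fixed number of shards. The shard count and the dictionary-backed shards are assumptions made for the example.

```python
import hashlib

# Three shards, each of which could live on a different cache server.
NUM_SHARDS = 3
shards = [{} for _ in range(NUM_SHARDS)]

def shard_for(key):
    # Hash the key so the same key always lands on the same shard.
    digest = hashlib.sha256(key.encode()).hexdigest()
    return int(digest, 16) % NUM_SHARDS

def cache_set(key, value):
    shards[shard_for(key)][key] = value

def cache_get(key):
    return shards[shard_for(key)].get(key)

cache_set("user:42", {"name": "Ada"})
print(cache_get("user:42"))  # served by whichever shard owns "user:42"
```

In practice, the plain modulo mapping is usually replaced by consistent hashing so that adding or removing a shard relocates as few keys as possible.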
When it comes to caching, it’s important to note that the choice of strategy will depend on the unique needs and requirements of the application. It’s beneficial to test various scenarios and determine the best fit for the specific use case. It’s also possible that a hybrid approach, such as combining in-memory caching with distributed caching or cache sharding, may be necessary for high-scale applications where data consistency is not a primary concern. Overall, the caching strategy should be tailored to the application’s specific needs.
This article will focus on distributed caching, but the concepts discussed can also be applied to other areas.
Distributed caching is a technique that allows you to cache data across multiple servers rather than on a single server. This can reduce the load on any server and improve the system’s overall performance.
Several key components make up a distributed caching system:
- Cache nodes: These are the servers that store and manage cached data. They can be configured as a cluster or grid and communicate with each other to keep the cached data consistent.
- Cache clients: These are the applications or services that interact with the cache nodes to read and write data.
- Load balancer: This component distributes the load across the cache nodes. It can be configured to use various algorithms such as round-robin, least connections, or IP hash. You can read more about load balancers here.
Together, these components provide high availability, scalability, and good overall system performance.
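To tie these components together, here is a small sketch of a cache client sitting in front of replicated cache nodes: writes go to every node so the replicas stay consistent, and reads are spread round-robin in the spirit of a load balancer. The class names and the dictionary-backed nodes are assumptions for illustration only.

```python
from itertools import cycle

class CacheNode:
    """A stand-in for one cache server, backed by a plain dictionary."""
    def __init__(self, name):
        self.name = name
        self.store = {}

    def set(self, key, value):
        self.store[key] = value

    def get(self, key):
        return self.store.get(key)

class ReplicatedCacheClient:
    """Writes to every node; spreads reads across nodes round-robin."""
    def __init__(self, nodes):
        self.nodes = nodes
        self._read_cycle = cycle(nodes)  # simple round-robin "load balancer"

    def set(self, key, value):
        for node in self.nodes:          # replicate the write to all nodes
            node.set(key, value)

    def get(self, key):
        node = next(self._read_cycle)    # pick the next node for this read
        return node.get(key)

client = ReplicatedCacheClient([CacheNode("node-a"), CacheNode("node-b")])
client.set("session:1", {"user": "Ada"})
print(client.get("session:1"))  # each read may be served by a different node
```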
A wise man said:
> There are only two hard things in Computer Science: cache invalidation and naming things.
>
> – Phil Karlton
Cache invalidation is the process of removing stale or outdated data from a cache. This is an essential aspect of caching because it ensures that the data in the cache is up-to-date and consistent with the source data.
Now that we understand the importance of cache invalidation, let’s focus on one critical metric used in this process: Time-to-live (TTL). Each cached item is assigned a timestamp and a TTL value. Once the TTL expires, the corresponding item is automatically removed from the cache.
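A minimal TTL sketch, assuming a plain in-process dictionary cache: each entry stores its value together with an expiry timestamp, and an expired entry is removed and treated as a miss on the next read.

```python
import time

cache = {}  # key -> (value, expiry_timestamp)

def set_with_ttl(key, value, ttl_seconds):
    # Record when this entry stops being valid.
    cache[key] = (value, time.time() + ttl_seconds)

def get(key):
    entry = cache.get(key)
    if entry is None:
        return None
    value, expires_at = entry
    if time.time() >= expires_at:
        del cache[key]   # lazily evict the stale entry
        return None      # treat it as a cache miss
    return value

set_with_ttl("price:42", 19.99, ttl_seconds=2)
print(get("price:42"))  # 19.99 while the TTL is still valid
time.sleep(2.1)
print(get("price:42"))  # None: the entry has expired and been removed
```

Dedicated caches handle this for you; Redis, for instance, lets you attach an expiry to a key with the EXPIRE command or an expiry option on SET.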
TTL handles expiry, but how is an item actually removed when the cache fills up and room must be made for new entries? Several cache eviction strategies can be used to decide which entry goes, including:
- First In, First Out (FIFO): This strategy removes the oldest entry in the cache when a new one needs to be added.
- Least Recently Used (LRU): This strategy removes the entry that has been in the cache the longest without being accessed (see the sketch after this list).
- Most Recently Used (MRU): This strategy removes the entry accessed most recently.
- Least Frequently Used (LFU): This strategy removes the entry that has been accessed the fewest times.
- Random Replacement (RR): This strategy randomly selects an entry to remove when a new one needs to be added.
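To show one of these policies in action, here is a minimal LRU cache sketch built on Python's OrderedDict; the capacity of three entries is an arbitrary choice for the demonstration.

```python
from collections import OrderedDict

class LRUCache:
    """Evicts the least recently used entry once capacity is reached."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()

    def get(self, key):
        if key not in self.entries:
            return None
        self.entries.move_to_end(key)         # mark as most recently used
        return self.entries[key]

    def set(self, key, value):
        if key in self.entries:
            self.entries.move_to_end(key)
        self.entries[key] = value
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # drop the least recently used

cache = LRUCache(capacity=3)
for k in ("a", "b", "c"):
    cache.set(k, k.upper())
cache.get("a")         # "a" is now the most recently used entry
cache.set("d", "D")    # evicts "b", the least recently used
print(cache.get("b"))  # None: "b" was evicted
```

For everyday Python code, functools.lru_cache provides the same behaviour for function results without the manual bookkeeping.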