August 25, 2020

How to implement a cache in Java

According to Wikipedia, a cache is a hardware or software component that stores data so that future requests for that data can be served faster.


In this post, we’ll go through the steps to create an in-memory cache in Java.

To create a cache, we can simply use a map / dictionary data structure, which gives the expected O(1) time for both get and put operations.
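As a baseline, a plain HashMap already behaves like an unbounded cache; a minimal sketch (the class name is mine):

```java
import java.util.HashMap;
import java.util.Map;

public class MapCacheDemo {
    public static void main(String[] args) {
        // A plain map already gives average-case O(1) put and get,
        // but it never evicts anything.
        Map<String, Integer> cache = new HashMap<>();
        cache.put("answer", 42);
        System.out.println(cache.get("answer")); // 42
        System.out.println(cache.get("missing")); // null
    }
}
```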

But, we can’t store everything in our cache. We have storage and performance limits.

A cache eviction algorithm is a way of deciding which element to evict when the cache is full. There are many algorithms, optimized for different use cases.

There are only two hard things in computer science: cache invalidation and naming things. — Phil Karlton

In our design, we will use a HashMap for O(1) lookups, combined with a doubly linked list to track the eviction order.

Cache implementation strategy
We use a doubly linked list to determine which key to delete; it also lets us add and delete keys in O(1).

Initially, we declare a model to store our key-value pair, a hit count, and references to the previous and next nodes.

public class CacheItem<K, V> {
    private K key;
    private V value;
    private int hitCount = 0; // LFU requires this
    private CacheItem<K, V> prev, next;

    public CacheItem(K key, V value) {
        this.value = value;
        this.key = key;
    }

    public void incrementHitCount() {
        this.hitCount++;
    }
    // getters / setters
}

Now, we can outline our cache class with the basic operations: get, put, delete, and size.

public class Cache<K, V> {
    private Map<K, CacheItem<K, V>> map;
    private CacheItem<K, V> first, last;
    private int size;
    private final int CAPACITY;
    private int hitCount = 0;
    private int missCount = 0;

    public Cache(int capacity) {
        CAPACITY = capacity;
        map = new HashMap<>(CAPACITY);
    }

    public void put(K key, V value) {} // implemented below

    public V get(K key) { return null; } // discussed below

    public void delete(K key) {
        CacheItem<K, V> node = map.get(key);
        if (node != null) {
            deleteNode(node); // deleteNode (not shown) unlinks the node and removes its key from the map
        }
    }

    public int size() {
        return size;
    }

    public int getHitCount() {
        return hitCount;
    }

    public int getMissCount() {
        return missCount;
    }
}

We also track hit and miss counts to measure the performance of our cache.

Hit ratio = total hits / (total hits + total misses)
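For example, 80 hits and 20 misses give a hit ratio of 0.8; a quick check (class and method names are mine):

```java
public class HitRatioDemo {
    // Hit ratio = hits / (hits + misses)
    static double hitRatio(int hits, int misses) {
        return (double) hits / (hits + misses);
    }

    public static void main(String[] args) {
        System.out.println(hitRatio(80, 20)); // 0.8
    }
}
```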

For our put method:

  1. If the key already exists, update its value.
  2. Otherwise, if the size has reached capacity, delete a node chosen by the eviction strategy.
  3. Add the new node to the list.
    public void put(K key, V value) {
        if (map.containsKey(key)) {
            map.get(key).setValue(value); // key exists: just update the value
            return;
        }
        if (size() >= CAPACITY) {
            deleteNode(first); // evict; deleteNode (not shown) also removes the key from the map
        }
        CacheItem<K, V> node = new CacheItem<>(key, value);
        addNodeToLast(node);
        map.put(key, node);
    }

In our delete method, we look up the node for the key and pass it to deleteNode, which unlinks it from the list.

New nodes are added at the end of the list; the last pointer always references the most recently inserted node.

private void addNodeToLast(CacheItem<K, V> node) {
    if (last != null) {
        last.setNext(node);
        node.setPrev(last);
    }
    last = node;

    if (first == null) {
        first = node;
    }
    size++;
}

In our get method, reordering is the key step of this implementation: on every hit, the accessed node is moved within the list. The reorder and delete logic is what changes between the different algorithms.

Least Recently Used (LRU)

The delete candidate is the least recently used entry.

Least Frequently Used (LFU)

The delete candidate is the least frequently accessed entry.

We have to keep the items ordered by how frequently each node is accessed: every time an item is accessed, it moves up the list according to its hit count, so that frequently used nodes drift away from the eviction end.

There is an edge case here: if new items were added at the end of the list as in LRU, adding several of them while the cache is full could evict high-frequency nodes. To solve this, new nodes are added at the first position, since a brand-new node is the least frequently accessed one.

private void addNodeToFirst(CacheItem<K, V> node) {
    if (first != null) {
        node.setNext(first);
        first.setPrev(node);
    }
    first = node;

    if (last == null) {
        last = node;
    }
    size++;
}

First In First Out (FIFO)

The delete candidate is the first (oldest inserted) node.

Last In First Out (LIFO)

The delete candidate is the last (most recently inserted) node.
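Putting the pieces together for the LRU strategy, here is a minimal self-contained sketch. The get body and the deleteNode helper are my reconstruction, since the post does not show them; the node class is inlined for brevity, and in this version deleteNode only unlinks a node, so callers remove the map entry themselves:

```java
import java.util.HashMap;
import java.util.Map;

// Minimal LRU sketch: the least recently used node sits at `first`,
// the most recently used at `last`.
public class LruCache<K, V> {
    private static class Node<K, V> {
        final K key;
        V value;
        Node<K, V> prev, next;
        Node(K key, V value) { this.key = key; this.value = value; }
    }

    private final Map<K, Node<K, V>> map = new HashMap<>();
    private Node<K, V> first, last;
    private final int capacity;

    public LruCache(int capacity) { this.capacity = capacity; }

    public void put(K key, V value) {
        Node<K, V> existing = map.get(key);
        if (existing != null) {          // key present: update and refresh recency
            existing.value = value;
            deleteNode(existing);
            addNodeToLast(existing);
            return;
        }
        if (map.size() >= capacity) {    // full: evict the least recently used
            map.remove(first.key);
            deleteNode(first);
        }
        Node<K, V> node = new Node<>(key, value);
        addNodeToLast(node);
        map.put(key, node);
    }

    public V get(K key) {
        Node<K, V> node = map.get(key);
        if (node == null) return null;   // miss
        deleteNode(node);                // hit: move to the most-recently-used end
        addNodeToLast(node);
        return node.value;
    }

    private void deleteNode(Node<K, V> node) {
        if (node.prev != null) node.prev.next = node.next; else first = node.next;
        if (node.next != null) node.next.prev = node.prev; else last = node.prev;
        node.prev = node.next = null;
    }

    private void addNodeToLast(Node<K, V> node) {
        if (last != null) { last.next = node; node.prev = last; }
        last = node;
        if (first == null) first = node;
    }

    public static void main(String[] args) {
        LruCache<String, Integer> cache = new LruCache<>(2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.get("a");        // "a" becomes most recently used
        cache.put("c", 3);     // evicts "b", the least recently used
        System.out.println(cache.get("b")); // null
        System.out.println(cache.get("a")); // 1
        System.out.println(cache.get("c")); // 3
    }
}
```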

Questions:

  1. We can use an array / ArrayList to solve the same problem with simpler code.

Yes, it would be simpler. The reason for not using it: removing an element from the middle of an array is O(n), since all later elements must be shifted.

  2. We could have used LinkedList instead of ArrayList.

Once again, yes. LinkedList has the benefit of adding and removing nodes in O(1). The reason for not using it: java.util.LinkedList does not expose its internal nodes, so reaching the node to remove still costs O(n).

  3. For LFU, we have a complexity of O(n) in the reordering method.

Yes, unfortunately. This is the best solution I have come up with so far. Please suggest any better idea.

  4. We are using HashMap. It can suffer from hash collisions, and then the performance will not be the O(1) mentioned above.

Absolutely. I’ve written a post with an implementation and explanation: HashMap Implementation for Java.

  5. Is this implementation thread safe?

No. To be thread safe, every operation that touches the map or the linked list must be synchronized, or the cache must be built on concurrent data structures.
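One simple (coarse-grained) option is to synchronize every public method; a hypothetical sketch, not part of the original design (the class name is mine, and the eviction list would need the same locking):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical wrapper: coarse-grained locking via intrinsic locks,
// trading throughput for safety.
public class SynchronizedCache<K, V> {
    private final Map<K, V> map = new HashMap<>();

    public synchronized void put(K key, V value) {
        map.put(key, value);
    }

    public synchronized V get(K key) {
        return map.get(key);
    }

    public synchronized int size() {
        return map.size();
    }

    public static void main(String[] args) {
        SynchronizedCache<String, Integer> cache = new SynchronizedCache<>();
        cache.put("a", 1);
        System.out.println(cache.get("a")); // 1
    }
}
```

For the map alone, Collections.synchronizedMap or ConcurrentHashMap would serve the same purpose, but the linked-list reordering would still need explicit locking.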

  6. How can we add a time-based eviction strategy to auto-delete items?

To implement this, we need to store a timestamp per entry and run a timer thread that determines which items are eligible for eviction and performs the deletion.
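A minimal sketch of the timestamp side of this idea (class and method names are mine, not from the post; the timer thread is omitted, and a real implementation would call evictExpired from it periodically):

```java
import java.util.HashMap;
import java.util.Map;

// Each entry stores an absolute expiry time; get treats expired entries
// as misses, and evictExpired physically removes them.
public class TtlCache<K, V> {
    private static class Entry<V> {
        final V value;
        final long expiresAtMillis;

        Entry(V value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<K, Entry<V>> map = new HashMap<>();

    public void put(K key, V value, long ttlMillis) {
        map.put(key, new Entry<>(value, System.currentTimeMillis() + ttlMillis));
    }

    public V get(K key) {
        Entry<V> e = map.get(key);
        if (e == null || System.currentTimeMillis() > e.expiresAtMillis) {
            return null; // missing or already expired
        }
        return e.value;
    }

    // Called from a background timer thread in a real implementation.
    public void evictExpired() {
        long now = System.currentTimeMillis();
        map.values().removeIf(e -> now > e.expiresAtMillis);
    }

    public int size() {
        return map.size();
    }

    public static void main(String[] args) throws InterruptedException {
        TtlCache<String, Integer> cache = new TtlCache<>();
        cache.put("a", 1, 50); // live for 50 ms
        System.out.println(cache.get("a")); // 1
        Thread.sleep(80); // wait past the TTL
        System.out.println(cache.get("a")); // null
        cache.evictExpired();
        System.out.println(cache.size()); // 0
    }
}
```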

About this Post

This post is written by Mahfuzur Rahman, licensed under CC BY-NC 4.0.

#java #cache