http://javarevisited.blogspot.sg/2011/04/difference-between-concurrenthashmap.html
How to use ConcurrentHashMap in Java - Example Tutorial and Working
ConcurrentHashMap in Java is introduced as an alternative of Hashtable in Java 1.5 as part of Java concurrency package. Prior to Java 1.5 if you need a Map implementation, which can be safely used in a concurrent and multi-threaded Java program, than, you only have Hashtable or synchronized Map because HashMap is not thread-safe. WithConcurrentHashMap, now you have better choice; because, not only it can be safely used in concurrent multi-threaded environment but also provides better performance over Hashtable and synchronizedMap. ConcurrentHashMap performs better than earlier two because it only locks a portion of Map, instead of whole Map, which is the case with Hashtable and synchronized Map. CHM allows concurred read operations and same time, maintains integrity by synchronizing write operations. We have seen basics of ConcurrentHashMap on Top 5 Java Concurrent Collections from JDK 5 and 6 and in this Java tutorial, we will learn :
Ø How ConcurrentHashMap works in Java or how it is implemented in Java.
Ø When to use ConcurrentHashMap in Java
Ø ConcurrentHashMap examples in Java
How ConcurrentHashMap is implemented in Java
ConcurrentHashMap is introduced as an alternative of Hashtable and provided all functions supported by Hashtable with additional feature called "concurrency level", which allows ConcurrentHashMap to partition Map. ConcurrentHashMap allows multiple readers to read concurrently without any blocking. This is achieved by partitioning Map into different parts based on concurrency level and locking only a portion of Map during updates. Default concurrency level is 16, and accordingly Map is divided into 16 part and each part is governed with different lock. This means, 16 thread can operate on Map simultaneously, until they are operating on different part of Map. This makes ConcurrentHashMap high performance despite keeping thread-safety intact. Though, it comes with caveat. Since update operations like put(), remove(),putAll() or clear() is not synchronized, concurrent retrieval may not reflect most recent change on Map.
In case of putAll() or clear(), which operates on whole Map, concurrent read may reflect insertion and removal of only some entries. Another important point to remember is iteration over CHM, Iterator returned by keySet of ConcurrentHashMap are weekly consistent and they only reflect state of ConcurrentHashMap and certain point and may not reflect any recent change. Iterator of ConcurrentHashMap's keySetarea also fail-safe and doesn’t throw ConcurrentModificationExceptoin..
Default concurrency level is 16 and can be changed, by providing a number which make sense and work for you while creating ConcurrentHashMap. Since concurrency level is used for internal sizing and indicate number of concurrent update without contention, so, if you just have few writers or thread to update Map keeping it low is much better. ConcurrentHashMap also uses ReentrantLock to internally lock its segments.
ConcurrentHashMap putifAbsent example in Java
ConcurrentHashMap examples are similar to Hashtable examples, we have seen earlier, but worth knowing is use of putIfAbsent() method. Many times we need to insert entry into Map, if its not present already, and we wrote following kind of code:
synchronized(map){
if (map.get(key) == null){
return map.put(key, value);
} else{
return map.get(key);
}
}
Though this code will work fine in HashMap and Hashtable, This won't work in ConcurrentHashMap; because, during put operation whole map is not locked, and while one thread is putting value, other thread's get() call can still return null which result in one thread overriding value inserted by other thread. Ofcourse, you can wrap whole code in synchronized block and make it thread-safe but that will only make your code single threaded. ConcurrentHashMap provides putIfAbsent(key, value) which does same thing but atomically and thus eliminates above race condition.
When to use ConcurrentHashMap in Java
ConcurrentHashMap is best suited when you have multiple readers and few writers. If writers outnumber reader, or writer is equal to reader, than performance of ConcurrentHashMap effectively reduces to synchronized map or Hashtable. Performance of CHM drops, because you got to lock all portion of Map, and effectively each reader will wait for another writer, operating on that portion of Map. ConcurrentHashMap is a good choice for caches, which can be initialized during application start up and later accessed my many request processing threads. As javadoc states, CHM is also a good replacement of Hashtable and should be used whenever possible, keeping in mind, that CHM provides slightly weeker form of synchronization than Hashtable.
Summary
Now we know What is ConcurrentHashMap in Java and when to use ConcurrentHashMap, it’s time to know and revise some important points about CHM in Java.
1. ConcurrentHashMap allows concurrent read and thread-safe update operation.
2. During update operation, ConcurrentHashMap only lock a portion of Map instead of whole Map.
3. Concurrent update is achieved by internally dividing Map into small portion which is defined by concurrency level.
4. Choose concurrency level carefully as a significant higher number can be waste of time and space and lower number may introduce thread contention in case writers over number concurrency level.
6. Since ConcurrentHashMap implementation doesn't lock whole Map, there is chance of read overlapping with update operations like put() and remove(). In that case result returned by get() method will reflect most recently completed operation from there start.
7. Iterator returned by ConcurrentHashMap is weekly consistent, fail safe and never throw ConcurrentModificationException. In Java.
8. ConcurrentHashMap doesn't allow null as key or value.
9. You can use ConcurrentHashMap in place of Hashtable but with caution as CHM doesn't lock whole Map.
10. During putAll() and clear() operations, concurrent read may only reflect insertion or deletion of some entries.
That’s all on What is ConcurrentHashMap in Java and when to use it. We have also seen little bit about internal working of ConcurrentHashMap and how it achieves it’s thread-safety and better performance over Hashtable and synchronized Map. Use ConcurrentHashMap in Java program, when there will be more reader than writers and it’s a good choice for creating cache in Java as well.
Related Java Concurrency and Collection Tutorials from this blog
Read more: http://javarevisited.blogspot.com/2013/02/concurrenthashmap-in-java-example-tutorial-working.html#ixzz3gj5h1tdv
14 comments :
interviewer was possibly expecting a simple answer, such as:
•if the whole map is synchronized for get/put operations, adding threads won't improve throughput because the bottleneck will be the synchronized blocks. You can then write a piece of code with a synchronizedMap that shows that adding threads does not help
•because the map uses several locks, and assuming you have more than one core on your machine, adding threads will improve throughput
The example below outputs the following:
Synchronized one thread: 30
Synchronized multiple threads: 96
Concurrent one thread: 219
Concurrent multiple threads: 142
So you can see that the synchronized version is more than 3 times slower under high contention (16 threads) whereas the concurrent version is almost twice as fast with multiple threads as with a single thread.
It is also interesting to note that the ConcurrentMap has a non-negligible overhead in a single threaded situation.
This is a very contrived example, with all the possible problems due to micro-benchmarking (first results should be discarded anyway). But it gives a hint at what happens.
public class Test1 {
static final int SIZE = 1000000;
static final int THREADS = 16;
static final ExecutorService executor = Executors.newFixedThreadPool(THREADS);
public static void main(String[] args) throws Exception{
for (int i = 0; i < 10; i++) {
System.out.println("Concurrent one thread");
addSingleThread(new ConcurrentHashMap ());
System.out.println("Concurrent multiple threads");
addMultipleThreads(new ConcurrentHashMap ());
System.out.println("Synchronized one thread");
addSingleThread(Collections.synchronizedMap(new HashMap ()));
System.out.println("Synchronized multiple threads");
addMultipleThreads(Collections.synchronizedMap(new HashMap ()));
}
executor.shutdown();
}
private static void addSingleThread(Map map) {
long start = System.nanoTime();
for (int i = 0; i < SIZE; i++) {
map.put(i, i);
}
System.out.println(map.size()); //use the result
long end = System.nanoTime();
System.out.println("time with single thread: " + (end - start) / 1000000);
}
private static void addMultipleThreads(final Map map) throws Exception {
List runnables = new ArrayList<> ();
for (int i = 0; i < THREADS; i++) {
final int start = i;
runnables.add(new Runnable() {
@Override
public void run() {
//Trying to have one runnable by bucket
for (int j = start; j < SIZE; j += THREADS) {
map.put(j, j);
}
}
});
}
List futures = new ArrayList<> ();
long start = System.nanoTime();
for (Runnable r : runnables) {
futures.add(executor.submit(r));
}
for (Future f : futures) {
f.get();
}
System.out.println(map.size()); //use the result
long end = System.nanoTime();
System.out.println("time with multiple threads: " + (end - start) / 1000000);
}
}
Then how the putIfAbsent will work, the example you gave..if(map.get(key)==null)
That's imprecise and misleading.
From the Javadoc:
"Retrievals reflect the results of the most recently completed update operations" (for put() and remove()).
"For aggregate operations such as putAll and clear, concurrent retrievals may reflect insertion or removal of only some entries"
if updates are not synchronized then how come it has thread safe writes?