Sunday, May 13, 2012

Map Interface


Map Interface: A Map is an object that maps keys to values. Maps cannot contain duplicate keys: each key can map to at most one value.
  • Map is not a child interface of the Collection interface
  • To store values or objects as key-value pairs, we go for the Map interface
  • Both keys and values are objects
  • Duplicate keys are not allowed, but values may be duplicated
  • A key-value pair is called an entry



Methods in Map Interface:

1)Object put(Object key,Object value) // Inserts a key-value pair into the map. If the key already exists, the old value is replaced with the new one and the old value is returned; otherwise it returns null
2)Object get(Object key) // Returns the value associated with the key, otherwise returns null
3)Object remove(Object key) // Removes the entry associated with the key and returns the corresponding value, otherwise returns null
4)boolean containsKey(Object key)
5)boolean containsValue(Object value)
6)int size()
7)boolean isEmpty()
8)void clear()
9)void putAll(Map m)
// The following three methods return collection views of the Map
10)Set keySet()
11)Collection values()
12)Set entrySet()
// Entry is an inner interface present inside Map. It contains
the following three methods, which can be applied to an entry object.
13)Object getKey()
14)Object getValue()
15)Object setValue(Object obj)
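
The return values described above are easy to check. The following is a minimal sketch; the class name MapBasics and the sample keys are made up for this example, and HashMap is used as the implementation.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

public class MapBasics {
    // Collects what put/get/remove hand back so the results can be inspected.
    public static Object[] demo() {
        Map<String, Integer> map = new HashMap<>();
        Integer first = map.put("apple", 1);     // no previous mapping: null
        Integer second = map.put("apple", 2);    // replaces 1 and returns it
        Integer missing = map.get("pear");       // absent key: null
        Integer removed = map.remove("apple");   // returns the removed value
        return new Object[] { first, second, missing, removed, map.isEmpty() };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(demo()));   // [null, 1, null, 2, true]
    }
}
```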

Note:
interface Map {
    interface Entry { // inner interface Entry
        Object getKey();
        Object getValue();
        Object setValue(Object obj);
    }
}
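
To see the Entry methods in use, here is a small sketch that walks the entry set and updates each value in place. The class name EntryDemo, the method doubleValues and the sample data are invented for illustration.

```java
import java.util.HashMap;
import java.util.Map;

public class EntryDemo {
    // Doubles every value through the entry view; setValue writes through to the map.
    public static void doubleValues(Map<String, Integer> prices) {
        for (Map.Entry<String, Integer> entry : prices.entrySet()) {
            entry.setValue(entry.getValue() * 2);
        }
    }

    public static void main(String[] args) {
        Map<String, Integer> prices = new HashMap<>();
        prices.put("orange", 1000);
        prices.put("apple", 2000);
        doubleValues(prices);
        System.out.println(prices.get("orange") + " " + prices.get("apple"));   // 2000 4000
    }
}
```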

Hashtable:
  • The underlying data structure for Hashtable is a hash table.
  • Heterogeneous objects are allowed for both keys and values
  • null is not allowed for either keys or values; inserting a null key or a null value leads to a NullPointerException
  • Almost all methods of Hashtable are synchronized, hence it is thread safe
  • Insertion order is not preserved and the objects are arranged based on the hash code of the keys
  • Duplicate values are allowed but keys cannot be duplicated
Hashtable Constructors:
Hashtable table=new Hashtable();
Hashtable table=new Hashtable(int initialCapacity);
Hashtable table=new Hashtable(int initialCapacity,float loadFactor);
Hashtable table=new Hashtable(Map m);
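
Hashtable's handling of null can be verified with a few lines. This is only a sketch; HashtableNulls and rejects are names chosen for this example.

```java
import java.util.Hashtable;

public class HashtableNulls {
    // Returns true if Hashtable refuses the pair with a NullPointerException.
    public static boolean rejects(Object key, Object value) {
        Hashtable<Object, Object> table = new Hashtable<>();
        try {
            table.put(key, value);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(rejects(null, "x"));   // true: null key
        System.out.println(rejects("k", null));   // true: null value
        System.out.println(rejects("k", "x"));    // false: accepted
    }
}
```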

Hashtable Demo:
import java.util.Hashtable;

public class HashtableEx1 {
    public static void main(String[] args) {
        // int keys are autoboxed to Integer
        Hashtable<Integer, String> table = new Hashtable<Integer, String>();
        table.put(1, "xxx");
        table.put(24, "yyy");
        table.put(3, "zzz");
        table.put(8, "aaa");
        table.put(9, "bbb");
        table.put(2, "sss");
        // Printed in bucket order, not in insertion order
        System.out.println("The Hashtable is " + table);
    }
}

output:
The Hashtable is {9=bbb, 8=aaa, 3=zzz, 2=sss, 24=yyy, 1=xxx}


HashMap:
  • The underlying data structure for HashMap is a hash table
  • Duplicate keys are not allowed but values may be duplicated, i.e. putting a key that is already present replaces the old value with the new one.
e.g :
            HashMap map=new HashMap();
            map.put(4,"Hi");
            map.put(1,"abhi");
            map.put(2,"abhinav");
            map.put(1,"mishra");
            map.put(3,"hello");
            System.out.println("The map: "+map);

     o/p>> The map: {1=mishra, 2=abhinav, 3=hello, 4=Hi}

The no-argument constructor, as it appears in older JDK sources:
    /**
     * Constructs an empty <tt>HashMap</tt> with the default initial capacity
     * (16) and the default load factor (0.75).
     */
    public HashMap() {
        this.loadFactor = DEFAULT_LOAD_FACTOR;
        threshold = (int)(DEFAULT_INITIAL_CAPACITY * DEFAULT_LOAD_FACTOR);
        table = new Entry[DEFAULT_INITIAL_CAPACITY];
        init();
    }
  • Insertion order is not preserved
  • Heterogeneous keys and values are allowed
  • One null key and any number of null values are allowed.
HashMap Constructors:
1) HashMap map=new HashMap();
Creates an empty HashMap with the default initial capacity (16) and the default load factor (0.75).
2) HashMap map=new HashMap(int initialCapacity);
Creates a HashMap with the specified initial capacity and the default load factor.
3) HashMap map=new HashMap(int initialCapacity,float loadFactor);
Creates a HashMap with the specified initial capacity and load factor.
4) HashMap map=new HashMap(Map m);
Creates a HashMap containing the same mappings as the given Map.
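
A short sketch of the null handling and of the constructor that takes a capacity and load factor. The class name HashMapNulls and the values 32 and 0.5 are arbitrary choices for this example.

```java
import java.util.HashMap;
import java.util.Map;

public class HashMapNulls {
    public static Map<String, String> build() {
        // Capacity 32 and load factor 0.5 are example values only.
        Map<String, String> map = new HashMap<>(32, 0.5f);
        map.put(null, "first");
        map.put(null, "second");   // same (null) key: the value is replaced
        map.put("a", null);        // null values may appear any number of times
        map.put("b", null);
        return map;
    }

    public static void main(String[] args) {
        Map<String, String> map = build();
        System.out.println(map.size());             // 3
        System.out.println(map.get(null));          // second
        System.out.println(map.containsKey("a"));   // true, even though the value is null
    }
}
```

Because get() returns null both for a missing key and for a key mapped to null, containsKey() is the way to tell the two cases apart.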

HashMapDemo Program:
import java.util.*;

public class HashMapEx1 {
    public static void main(String... args) {
        HashMap<String, Integer> map = new HashMap<String, Integer>();
        map.put("orange", 1000);
        map.put("apple", 2000);
        map.put("banana", 3000);
        map.put("grapes", 4000);
        System.out.println("The Map " + map);
        // put() returns the value previously associated with the key
        System.out.println(map.put("orange", 1001));
        System.out.println("map " + map);
        Set<String> s = map.keySet();
        System.out.println("The Key Set" + s);
        Collection<Integer> values = map.values();
        System.out.println("The Values Are " + values);
        Set<Map.Entry<String, Integer>> s1 = map.entrySet();
        System.out.println("The Entry Set" + s1);
    }
}


output (the order of the entries is not guaranteed and may differ between JDK versions):
The Map {orange=1000, grapes=4000, apple=2000, banana=3000}
1000
map {orange=1001, grapes=4000, apple=2000, banana=3000}
The Key Set[orange, grapes, apple, banana]
The Values Are [1001, 4000, 2000, 3000]
The Entry Set[orange=1001, grapes=4000, apple=2000, banana=3000]

LinkedHashMap:
It is exactly like HashMap except for the following differences.

LinkedHashMap:
  • Underlying data structures are a hash table and a doubly linked list
  • Insertion order of elements is preserved
  • While iterating elements, the processing order is guaranteed. Hence we can use it for caching.
HashMap:
  • Insertion order is not preserved
  • While iterating, the processing order is not guaranteed. Hence we cannot use it for caches that depend on the order of entries
  • Underlying data structure is a hash table
HashMap is a collection class that is designed to store elements as key-value pairs. Maps provide a way of looking up one thing based on the value of another.

The following code uses the HashMap class to store words along with their frequencies.

String[] data = "Nothing is as easy as it looks".split(" ");
HashMap<String, Integer> hm = new HashMap<String, Integer>();
for (String key : data)
{
  Integer freq = hm.get(key);
  if(freq == null) freq = 1; else freq ++;
  hm.put(key, freq);
}
System.out.println(hm);

This prints something like {as=2, Nothing=1, it=1, easy=1, is=1, looks=1}; the exact order may vary.
HashSet and HashMap will be printed in no particular order. If the order of insertion is important in your application, you should use the LinkedHashSet and/or LinkedHashMap classes. If you want to print data in sorted order, you should use the TreeSet and/or TreeMap classes.
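
As a sketch of that advice, the same word count can be run against a LinkedHashMap and a TreeMap. The class OrderedWordCount and its count method are made up for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

public class OrderedWordCount {
    // Counts the words of text into the supplied map and returns that map.
    public static Map<String, Integer> count(String text, Map<String, Integer> target) {
        for (String word : text.split(" ")) {
            Integer freq = target.get(word);
            target.put(word, freq == null ? 1 : freq + 1);
        }
        return target;
    }

    public static void main(String[] args) {
        String text = "Nothing is as easy as it looks";
        // Insertion order: {Nothing=1, is=1, as=2, easy=1, it=1, looks=1}
        System.out.println(count(text, new LinkedHashMap<String, Integer>()));
        // Sorted order: {Nothing=1, as=2, easy=1, is=1, it=1, looks=1}
        System.out.println(count(text, new TreeMap<String, Integer>()));
    }
}
```

In the sorted output "Nothing" comes first because the natural ordering of String puts uppercase letters before lowercase ones.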


NOTE:-

Ø LinkedHashMap/LinkedHashSet are used when the order of insertion is important in your application. LinkedHashMap keeps an accessOrder flag that selects the iteration order: true for access order and false for insertion order. The default constructor sets it to false:

        public LinkedHashMap() {
            super();
            accessOrder = false;
        }

   LinkedHashSet has no such flag. Its default constructor asks HashSet to back the set with a LinkedHashMap, so it always iterates in insertion order:

        public LinkedHashSet() {
            super(16, .75f, true);
        }


Ø If you want to print data in sorted order, you should use the TreeSet and/or TreeMap classes
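
The access-order mode is what makes LinkedHashMap useful for caching. Below is a minimal sketch of a least-recently-used cache; the class name LruCache and the limit of two entries are assumptions for the example, while the three-argument constructor and removeEldestEntry are part of LinkedHashMap.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    public LruCache(int maxEntries) {
        super(16, 0.75f, true);   // true selects access order instead of insertion order
        this.maxEntries = maxEntries;
    }

    // Called by put(); returning true drops the least recently used entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        LruCache<String, Integer> cache = new LruCache<>(2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.get("a");      // "a" becomes the most recently used entry
        cache.put("c", 3);   // evicts "b", the least recently used entry
        System.out.println(cache.keySet());   // [a, c]
    }
}
```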

The following are some of the HashMap methods:
  • map.get(key) -- returns the value associated with that key. If the map does not associate any value with that key then it returns null. Referring to "map.get(key)" is similar to referring to "A[key]" for an array A.
  • map.put(key,value) -- adds the key-value pair to the map. This is similar to "A[key] = value" for an array A.
  • map.containsKey(key) -- returns true if the map has that key.
  • map.containsValue(value) -- returns true if the map has that value.
  • map.keySet() -- returns a set of all keys
  • map.values() -- returns a collection of all values

IdentityHashMap:
In the case of HashMap, the JVM uses the equals() method to identify duplicate keys.
But if we want to use the == operator to identify duplicates, we go for IdentityHashMap.
In the case of IdentityHashMap, two key references i1 and i2 are equal if and only if both i1 and i2 point to the same object on the heap.

IdentityHashMap Demo Program:
import java.util.IdentityHashMap;

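public class IdentityHashMapEx1 {
    public static void main(String... args) {
        IdentityHashMap<Integer, String> map = new IdentityHashMap<Integer, String>();
        // new Integer(...) is used on purpose: it creates two distinct objects,
        // whereas Integer.valueOf(10) would return the same cached object twice
        Integer i1 = new Integer(10);
        Integer i2 = new Integer(10);
        map.put(i1, "orange");
        map.put(i2, "apple");
        // The map holds two entries because i1 != i2, even though i1.equals(i2)
        System.out.println("The Map" + map);
    }
}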
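
For comparison, here is a sketch that runs the same two puts against a HashMap and an IdentityHashMap. The class name IdentityVsEquals is made up, and String keys created with new String(...) stand in for the Integer objects above.

```java
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;

public class IdentityVsEquals {
    // Puts two equal-but-distinct keys and reports how many entries result.
    public static int sizeAfterTwoPuts(Map<String, String> map) {
        String k1 = new String("key");   // two distinct objects...
        String k2 = new String("key");   // ...that are equal according to equals()
        map.put(k1, "orange");
        map.put(k2, "apple");
        return map.size();
    }

    public static void main(String[] args) {
        System.out.println(sizeAfterTwoPuts(new HashMap<String, String>()));           // 1
        System.out.println(sizeAfterTwoPuts(new IdentityHashMap<String, String>()));   // 2
    }
}
```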


WeakHashMap:
  • WeakHashMap is not a child class of HashMap
  • HashMap dominates the garbage collector: if an object is used as a key in a HashMap, the garbage collector is not allowed to destroy that object even though it has no external references
  • But the garbage collector dominates WeakHashMap: in the case of WeakHashMap, if the key is no longer reachable, the garbage collector is allowed to destroy the whole entry associated with that key
WeakHashMap Demo Program:
import java.util.WeakHashMap;
class Temp {
    public void finalize() {
        System.out.println("finalize() Called");
    }
    public String toString() {
        return "Temp";
    }
}
public class WeakHashMapEx1 {
    public static void main(String[] args) {
        WeakHashMap map=new WeakHashMap();
        Temp t=new Temp();
        map.put(t,"orange");
        t=null;          // drop the only strong reference to the key
        System.gc();     // only a request; collection is not guaranteed
        System.out.println("The Map"+map);
    }
}

output (typical; System.gc() is only a request, so the result is not guaranteed):
finalize() Called
The Map{}


Why ConcurrentHashMap is better than Hashtable and just as good as a HashMap...

ConcurrentHashMap is a largely ignored class. Not many people know about it and not many people care to use it. The class offers a very robust and fast (comparatively; we all know Java concurrency isn't the fastest) method of synchronizing a Map collection.
Hashtable and HashMap cannot really be compared: one offers synchronized methods to access a map while the other offers no synchronization whatsoever. What most of us fail to notice is that while our applications, web applications especially, work fine during the development and testing phase, they usually fall over under heavy (or even moderately heavy) load. This is because we expect our HashMaps to behave a certain way, but under concurrent load an unsynchronized HashMap can misbehave.
Hashtables offer concurrent access to their entries, with a small caveat: the entire map is locked to perform any sort of operation. While this overhead is negligible in a web application under normal load, under heavy load it can lead to delayed response times and overtaxing of your server for no good reason.
This is where ConcurrentHashMap steps in. It offers all the features of Hashtable with performance almost as good as a HashMap. In the implementation described here (JDK 5 to 7; Java 8 replaced it with finer-grained per-bucket locking), ConcurrentHashMap accomplishes this by a very simple mechanism. Instead of a map-wide lock, the collection maintains a list of 16 locks by default, each of which guards one segment (a group of buckets) of the map. This effectively means that 16 threads can modify the collection at the same time, as long as they are all working on different segments. In fact, the common operations never lock the entire map. The concurrency level of the collection, the number of threads that can modify it at the same time without blocking, can be increased. However, a higher number means more overhead for maintaining this list of locks.
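
As a rough illustration of concurrent updates without a map-wide lock, the sketch below lets several threads increment the same counter. The class ConcurrentCounter, the key "page" and the thread counts are invented for the example; merge() is the atomic update method available on ConcurrentHashMap since Java 8.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class ConcurrentCounter {
    // Starts `threads` workers that each increment the same key `perThread` times.
    public static int count(int threads, int perThread) {
        Map<String, Integer> hits = new ConcurrentHashMap<>();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    hits.merge("page", 1, Integer::sum);   // atomic read-modify-write
                }
            });
            workers[i].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return hits.get("page");
    }

    public static void main(String[] args) {
        System.out.println(count(8, 10000));   // 80000
    }
}
```

With a plain HashMap and a get-then-put increment, the same program can lose updates, so the final count would be unpredictable.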
Retrieval operations on a ConcurrentHashMap do not block unless the entry is not found in the bucket or the value of the entry is null. In such a case the map synchronizes on the bucket and then looks for the entry again, in case the entry was put or removed right around the time of the unsynchronized get.
Removal operations do require a bit of overhead. All removal operations require the chain of elements before and after to be cloned and joined without the removed element. Since the value field of the inner Entry class is volatile, a thread already traversing the bucket from which a value is removed sees a null value when it reaches the removed element and knows to ignore it.
Traversal in a ConcurrentHashMap does not synchronize on the entire map either. In fact, traversal does not synchronize at all except under one condition. The internal linked list implementation is aware of changes to the underlying collection. If it detects any such changes during traversal, it synchronizes itself on the bucket it is traversing and then tries to re-read the values. This ensures that the values received are always fresh while locking is kept to a minimum, if there is any at all.
Iteration over a ConcurrentHashMap is a little different from that offered by other collections. The iterators are not fail-fast, in the sense that they do not throw a ConcurrentModificationException. They also do not guarantee that an iterator, once created, will show elements that are added after its creation. The iterators do, however, guarantee that any updates or removals of items will be reflected correctly in their behaviour. They also guarantee that no element will be returned more than once during traversal.
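
The difference from fail-fast iterators can be seen by modifying a map while iterating over it. This is a single-threaded sketch; the class WeakIteration and the method removeOdd are names chosen for this example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class WeakIteration {
    // Removes every entry with an odd value while iterating over the same map.
    public static Map<String, Integer> removeOdd(Map<String, Integer> map) {
        for (String key : map.keySet()) {
            Integer value = map.get(key);
            if (value != null && value % 2 != 0) {
                map.remove(key);   // a HashMap would typically throw ConcurrentModificationException
            }
        }
        return map;
    }

    public static void main(String[] args) {
        Map<String, Integer> map = new ConcurrentHashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        System.out.println(removeOdd(map));   // {b=2}
    }
}
```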
In conclusion, give it a try: replace some Hashtables in your application with ConcurrentHashMap and see how they perform under load. The two are largely interchangeable, so it shouldn't be hard to update your app.

How HashMap works in Java

"Have you used HashMap before?" or "What is HashMap? Why do we use it?"
Almost everybody answers yes, and then the interviewee keeps talking about common facts about HashMap: HashMap accepts null while Hashtable doesn't, HashMap is not synchronized, HashMap is fast, and so on, along with basics such as that it stores key and value pairs.
This shows that the person has used HashMap and is quite familiar with the functionality it offers, but the interview takes a sharp turn from here and the next set of follow-up questions gets more detailed about the fundamentals involved in HashMap. The interviewer hears you out and comes back with questions like

"Do you know how HashMap works in Java?" or
"How does the get() method of HashMap work in Java?"

And then you get answers like "I don't bother, it's a standard Java API", "you'd better look at the Java source code", "I can find it on Google at any time", etc.
But some interviewees definitely answer this and will say: "HashMap works on the principle of hashing. We have put() and get() methods for storing and retrieving data from a HashMap. When we pass an object to the put() method to store it in the HashMap, the HashMap implementation calls the hashCode() method on the key object, and by applying that hash code to its own hashing function it identifies a bucket location for storing the value object." The important part here is that HashMap stores both key and value in the bucket, which is essential to understanding the retrieval logic. If people fail to recognize this and say it only stores the value in the bucket, they will fail to explain how any object stored in a HashMap is retrieved. This answer is very much acceptable and shows that the interviewee has a fair bit of knowledge of how hashing works and how HashMap works in Java.
But this is just the start of the story. The depth increases a little when you put the interviewee in scenarios that every Java developer faces on a day-to-day basis. So the next question is more likely to be about collision detection and collision resolution in Java HashMap, e.g.

"What will happen if two different objects have the same hashcode?"
Now the confusion starts. Sometimes the interviewee will say that since the hash codes are equal, the objects are equal and HashMap will throw an exception or not store the second one, etc. Then you might want to remind them about the equals() and hashCode() contract: two unequal objects in Java can very well have equal hash codes. Some will give up at this point and some will move ahead and say: "Since hashCode() is the same, the bucket location will be the same and a collision occurs in the HashMap. Since HashMap uses a linked list to store entries in a bucket, the value object will be stored in the next node of the linked list." Great, this answer makes sense to me. Though there are other collision resolution methods available, this is the simplest, and HashMap does follow it.
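
A quick way to convince yourself that colliding keys are neither rejected nor overwritten is to force a collision. In this sketch the class CollidingKeys and its nested Key type are hypothetical; Key deliberately returns a constant hash code so every instance lands in the same bucket.

```java
import java.util.HashMap;
import java.util.Map;

public class CollidingKeys {
    // Hypothetical key type whose hashCode is deliberately constant.
    static final class Key {
        final String name;
        Key(String name) { this.name = name; }
        @Override public int hashCode() { return 42; }
        @Override public boolean equals(Object o) {
            return o instanceof Key && ((Key) o).name.equals(name);
        }
    }

    private static Map<Key, String> build() {
        Map<Key, String> map = new HashMap<>();
        map.put(new Key("first"), "one");
        map.put(new Key("second"), "two");   // same hash code, different key
        return map;
    }

    public static int size() {
        return build().size();
    }

    public static String lookup(String name) {
        return build().get(new Key(name));
    }

    public static void main(String[] args) {
        System.out.println(size());             // 2: both entries are stored
        System.out.println(lookup("first"));    // one
        System.out.println(lookup("second"));   // two
    }
}
```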
But the story does not end here, and the final question the interviewer asks is something like

"How will you retrieve the value if two different objects have the same hashCode?"

The interviewee will say we call the get() method, and then HashMap uses the key's hashCode to find the bucket location and retrieves the object. But then you need to remind them that two objects are stored in the same bucket, so they will talk about traversing the linked list until we find the value object. Then you ask how they identify the value object, because they don't have a value object to compare with. So until they know that HashMap stores both key and value in each linked list node, they won't be able to resolve this issue and will try and fail.

Working of get(), as implemented in older JDK sources:

public V get(Object key) {
        if (key == null)
            return getForNullKey();
        int hash = hash(key.hashCode());
        for (Entry<K,V> e = table[indexFor(hash, table.length)];
             e != null;
             e = e.next) {
            Object k;
            if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
                return e.value;
        }
        return null;
    }

But those who remember this key piece of information will say that after finding the bucket location, we call the key's equals() method to identify the correct node in the linked list and return the associated value object for that key. Perfect, this is the correct answer.
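
The lookup logic can be boiled down to a toy chained hash table. The sketch below is not the JDK implementation; TinyHashMap, its fixed table of 16 buckets and its lack of null-key support and resizing are simplifications made for this example.

```java
public class TinyHashMap<K, V> {
    // One node of a bucket's chain: it holds the key as well as the value.
    private static final class Node<K, V> {
        final K key;
        V value;
        Node<K, V> next;
        Node(K key, V value, Node<K, V> next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    @SuppressWarnings("unchecked")
    private final Node<K, V>[] table = (Node<K, V>[]) new Node[16];

    // The hash code only selects the bucket.
    private int indexFor(Object key) {
        return (key.hashCode() & 0x7fffffff) % table.length;
    }

    public V put(K key, V value) {
        int i = indexFor(key);
        for (Node<K, V> n = table[i]; n != null; n = n.next) {
            if (n.key.equals(key)) {   // same key: replace the value
                V old = n.value;
                n.value = value;
                return old;
            }
        }
        table[i] = new Node<>(key, value, table[i]);   // new key: prepend to the chain
        return null;
    }

    public V get(Object key) {
        for (Node<K, V> n = table[indexFor(key)]; n != null; n = n.next) {
            if (n.key.equals(key)) {   // equals() picks the right node in the chain
                return n.value;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        TinyHashMap<String, Integer> map = new TinyHashMap<>();
        map.put("Aa", 1);   // "Aa" and "BB" have the same hashCode (2112)
        map.put("BB", 2);
        System.out.println(map.get("Aa") + " " + map.get("BB"));   // 1 2
    }
}
```

Because each node keeps its key, get() can tell the two colliding entries apart; with only values in the chain there would be nothing to compare against.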

In many cases the interviewee fails at this stage because they get confused between hashCode() and equals(), and between the key and value objects in HashMap. That is understandable, because they have been dealing with hashCode() in all the previous questions, and equals() comes into the picture only when retrieving a value object from the HashMap.
Some good developers point out here that immutable, final objects with proper equals() and hashCode() implementations make perfect HashMap keys and improve the performance of HashMap by reducing collisions. Immutability also allows the hash code of a key to be cached, which makes the overall retrieval process very fast. They suggest that String and the various wrapper classes provided by Java, e.g. Integer, are very good HashMap keys.
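
A minimal sketch of such a key follows. The class ImmutableKeyDemo and its nested Point type are invented for illustration; the point is that the fields are final and equals() and hashCode() are defined over the same fields.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class ImmutableKeyDemo {
    // Hypothetical immutable key: final class, final fields, no setters.
    static final class Point {
        private final int x;
        private final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Point)) return false;
            Point other = (Point) o;
            return x == other.x && y == other.y;
        }

        @Override
        public int hashCode() {
            return Objects.hash(x, y);   // equal points always produce equal hash codes
        }
    }

    // Stores a label under (1, 2) and looks it up with a separately created key.
    public static String lookup(int x, int y) {
        Map<Point, String> labels = new HashMap<>();
        labels.put(new Point(1, 2), "found");
        return labels.get(new Point(x, y));
    }

    public static void main(String[] args) {
        System.out.println(lookup(1, 2));   // found
        System.out.println(lookup(2, 1));   // null
    }
}
```

If the fields could change after the key was inserted, its hash code would change too and the entry could no longer be found in its original bucket.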

Now if you clear all these HashMap interview questions, you will be surprised by this very interesting one: "What happens in a HashMap in Java if the size of the HashMap exceeds a given threshold defined by the load factor?" Unless you know exactly how HashMap works, you won't be able to answer this question.
If the size of the map exceeds the threshold defined by the load factor, the map is resized; e.g. with a load factor of 0.75 it resizes once it is 75% full. HashMap does that by creating a new bucket array twice the size of the previous one and then putting every old element into that new bucket array. This process is called rehashing because it also applies the hash function to find the new bucket location.
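
The arithmetic is easy to follow with a small simulation. This sketch does not touch a real HashMap; ResizeSketch and capacityAfter are made-up names, and it assumes the table doubles whenever the number of entries exceeds capacity * loadFactor.

```java
public class ResizeSketch {
    // Returns the table capacity after inserting `entries` mappings, assuming the
    // table doubles whenever the entry count exceeds capacity * loadFactor.
    public static int capacityAfter(int entries, int initialCapacity, float loadFactor) {
        int capacity = initialCapacity;
        for (int size = 1; size <= entries; size++) {
            if (size > (int) (capacity * loadFactor)) {
                capacity *= 2;   // every existing entry is rehashed into the larger table
            }
        }
        return capacity;
    }

    public static void main(String[] args) {
        System.out.println(capacityAfter(12, 16, 0.75f));    // 16: threshold 12 not exceeded
        System.out.println(capacityAfter(13, 16, 0.75f));    // 32: the 13th entry triggers a resize
        System.out.println(capacityAfter(100, 16, 0.75f));   // 256
    }
}
```

This is also why passing an initial capacity to the constructor helps when the number of entries is known in advance: it avoids the intermediate resizes.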

If you manage to answer this question, you will be greeted by "Do you see any problem with resizing of HashMap in Java?" You might not be able to pick up the context, and then the interviewer will try to give you a hint about multiple threads accessing the HashMap and a potential race condition.

So the answer is yes, there is a potential race condition while resizing a HashMap in Java: two threads may find at the same time that the HashMap needs resizing, and both try to resize it. In the older implementation discussed here, during resizing the elements of a bucket, which are stored in a linked list, get reversed in order as they migrate to the new bucket, because HashMap doesn't append the new element at the tail; it adds it at the head to avoid traversing to the tail. If the race condition happens, you can end up with an infinite loop. At this point, though, you could fairly ask the interviewer what on earth would make you use a HashMap in a multi-threaded environment :)

How HashMap works in Java?
HashMap works on the principle of hashing. We have put() and get() methods for storing and retrieving objects from a HashMap. When we pass both a key and a value to the put() method, it uses the key object's hashCode() method to calculate the hash code, and then by applying hashing to that hash code it identifies the bucket location for storing the entry.
While retrieving, it uses the key object's equals() method to find the correct key-value pair and returns the value object associated with that key. HashMap uses a linked list in case of collision, and the object is stored in the next node of the linked list.
Also, HashMap stores the key and value together in every node of the linked list.

What will happen if two different HashMap key objects have the same hashcode?
They will be stored in the same bucket, but in different nodes of the linked list. The key's equals() method will be used to identify the correct key-value pair in the HashMap.

In terms of usage, HashMap is very versatile, and I have mostly used HashMap as a cache in the electronic trading applications I have worked on. Since the finance domain uses Java heavily and, for performance reasons, we need caching a lot, HashMap comes in very handy there.
