Collection of Java Containers Interview Selection

Collection of Java Containers Interview Selection

1.1 What are the commonly used collection classes for collection containers?

The Java collection framework mainly includes two types of containers, one is a collection, which stores a collection of elements, and the other is a map, which stores key/value pair mappings.

  • There are three sub-interfaces of Collection, Set, List and Queue, the commonly used List interface and Set interface.
  • The main implementation classes of the List interface:
  • The main implementation classes of the Set interface:
  • The main implementation classes of the Map interface:
    as well as

1.2 What is the difference between List, Set and Map? What are the characteristics of each of the three interfaces of List, Map, and Set when accessing elements?

  • List: An ordered ( the order in which the elements are stored in the collection is the same as the order in which they are taken out ) container, elements can be repeated, multiple null elements can be inserted, and the elements have indexes.
  • Set: An unordered (the order of deposit and withdrawal may be inconsistent ) container, can not store duplicate elements, only allow to store a null element, must ensure the uniqueness of the element.
  • Map: It is a collection of key-value pairs that stores the mapping between keys, values, and. Key is unordered and unique; value does not require order, and repetition is allowed.

1.3 What is the difference between ArrayList and LinkedList?

  • Implementation of the underlying data structure :
    The bottom layer uses a dynamic array, and
    The bottom layer uses a doubly linked list. Doubly linked list
  • Random access efficiency :
    High efficiency during random access,
    Interface, and **
    ** It is a linear data storage method, so you need to move the pointer to search from the front to the back.
  • Insert and delete efficiency :
    The bottom layer uses array storage, so the time complexity of inserting and deleting elements is affected by the position of the element. For example: when the add(E e) method is executed,
    By default, the specified element will be appended to the end of this list. In this case, the time complexity is O(1). But if you want to insert and delete elements at the specified position i (add(int index, E element)), the time complexity is O(ni). Because in the above operation, the i-th and (ni) elements after the i-th element in the set must be moved backward/forward by one bit.
    The bottom layer is stored in a linked list, so for the insert of the add(E e) method, the time complexity of deleting an element is not affected by the position of the element, which is approximately O ( 1 ). If you want to insert and delete elements at the specified position i **add( The time complexity of int index, E element) is approximately o(n)** because it needs to be moved to the specified position and then inserted.
  • Memory space occupation :
    It takes up more memory, because in addition to storing data, the LinkedList node also stores two references, one pointing to the previous element, and one pointing to the next element.
  • Thread safety : ArrayList and LinkedList are not synchronized, that is, thread safety is not guaranteed.

Supplementary content: RandomAccess interface

public interface RandomAccess { } Copy code

Looking at the source code, we find that in fact nothing is defined in the RandomAccess interface. Therefore, in my opinion, the RandomAccess interface is nothing more than a logo. What to identify? Identifies that the class that implements this interface has random access capabilities.

In the binarySearch() method, it must determine whether the incoming list is an instance of RamdomAccess, if it is, call the indexedBinarySearch() method, if not, then call the iteratorBinarySearch() method

public static <T> int binarySearch (List<? extends Comparable<? super Tjk list, T key) { if (list instanceof RandomAccess || list.size() <BINARYSEARCH_THRESHOLD) return Collections.indexedBinarySearch(list, key); else } return Collections.iteratorBinarySearch(list, key); Copy code

ArrayList implements the RandomAccess interface, but LinkedList does not. Why? I think it has something to do with the underlying data structure! The bottom layer of ArrayList is an array, while the bottom layer of LinkedList is a linked list. Arrays naturally support random access, and the time complexity is O(1), so it is called fast random access. The linked list needs to be traversed to a specific location to access the elements at a specific location. The time complexity is O(n), so fast random access is not supported. ArrayList implements the RandomAccess interface, which shows that it has a fast random access function. The RandomAccess interface is just an identification, it does not mean that ArrayList implements the RandomAccess interface to have fast random access function!

Let's summarize the selection of traversal methods for list :

  • For lists that implement the RandomAccess interface, ordinary for loops are preferred, and foreach second.
  • For lists that do not implement the RandomAccess interface, iterator traversal is preferred (foreach traversal is also achieved through the iterator at the bottom). For large-size data, do not use ordinary for loops.

1.4 What is the difference between ArrayList and Vector?

  • Thread safety :
    Synchronized is used to achieve thread synchronization, which is thread-safe, while ArrayList is not thread-safe.
  • Performance :
    Better in terms of performance
  • Expansion :
    The capacity will be dynamically adjusted according to actual needs, but the expansion of Vector will double each time, while the ArrayList will only increase by 50%.

In summary:

  • The underlying implementation of ArrayList and Vector is an array, which supports random access, so it is suitable for query operations. However, the method in Vector is modified by synchronized, so Vector is a thread-safe container, but its performance is worse than that of ArrayList .
  • LinkedList uses a doubly linked list to achieve storage. Indexing data by sequence number requires forward or backward traversal, but when inserting data, only the previous and subsequent items of the current item need to be recorded, so the LinkedList insertion speed is faster .

Supplementary explanation: why the elementData of ArrayList is modified with transient

The array in ArrayList is defined as follows:

private transient Object[] elementData; copy code

Look at the definition of ArrayList again:

public class the ArrayList < E > the extends AbstractList < E > the implements List < E >, the RandomAccess , the Cloneable , Java . IO . the Serializable duplicated code

You can see that ArrayList implements the Serializable interface, which means that ArrayList supports serialization. The role of transient is to say that you don't want the elementData array to be serialized, and rewrite the writeObject implementation:

private void writeObject ( s) throws { * //Write out element count, and any hidden stuff* int expectedModCount = modCount; s.defaultWriteObject(); * //Write out array length* s.writeInt(elementData.length); * //Write out all elements in the proper order.* for ( int i = 0 ; i<size; i++) s.writeObject(elementData[i]); if (modCount != expectedModCount) { throw new ConcurrentModificationException(); } Copy code

Each time you serialize, first call the defaultWriteObject() method to serialize the non-transient elements in the ArrayList, and then traverse the elementData, and only serialize the stored elements, which not only speeds up the serialization, but also reduces the serialization File size.

1.5 How does HashSet check for duplicates? How does HashSet ensure that data is not repeatable?

When adding () elements to the HashSet, the basis for judging whether the element exists is not only to compare the hash value, but also to compare with the equals method. The add() method in HashSet will use the put() method of HashMap.

The key of HashMap is unique. It can be seen from the source code that the value added by HashSet is used as the key of HashMap, and if K/V in HashMap is the same, the old V will be overwritten with the new V, and then the old V will be returned . So there will be no repetition (HashMap compares whether the keys are equal first compares the hashcode and then equals).

The following is part of the source code of HashSet:

private static final Object PRESENT = new Object(); private transient HashMap<E,Object> map; public HashSet () { map = new HashMap<>(); } public boolean add (E e) { //Call the put method of HashMap, PRESENT is a virtual value that is the same from beginning to end return map.put(e, PRESENT)== null ; } Copy code

The relevant provisions of hashCode() and equals():

  • If the two objects are equal, the hashcode must also be the same
  • Two objects are equal, return true for two equals methods
  • Two objects have the same hashcode value, they are not necessarily equal
  • In summary, the equals method has been overridden, the hashCode method must also be overridden
  • The default behavior of hashCode() is to generate unique values for objects on the heap. If hashCode() is not overridden, the two objects of the class will not be equal anyway (even if the two objects point to the same data).

For in-depth understanding, please refer to: Answers to questions about hashCode() and equals() in Java

1.6 The difference between HashSet and HashMap

If you have read the HashSet source code, you should know: The bottom layer of HashSet is implemented based on HashMap. (The source code of HashSet is very, very few, because except for clone(), writeObject(), and readObject(), which HashSet has to implement, all other methods are directly calling methods in HashMap.

Implemented the Map interfaceImplement the Set interface
Store key-value pairsStore objects only
Call put() to add elements to the mapCall the add() method to add elements to the Set
HashMap uses Key to calculate HashcodeHashSet uses member objects to calculate the hashcode value. For two objects, the hashcode may be the same, so the equals() method is used to determine the equality of the objects. If the two objects are different, then return false
HashMap is faster than HashSet because it uses unique keys to obtain objectsHashSet is slower than HashMap

1.7 What are the differences between HashMap in JDK1.7 and JDK1.8? The underlying implementation of HashMap

In Java, there are two relatively simple data structures for saving data: arrays and linked lists. ** "The characteristics of arrays are: easy to address, difficult to insert and delete; the characteristics of linked lists are: difficult to address, but easy to insert and delete; * So we combine arrays and linked lists to give play to their respective advantages, A method called **zipper method** can be used to resolve hash conflicts.

Before JDK1.8

Before JDK1.8, the bottom layer of HashMap is the combination of array and linked list, that is, the hash of the linked list uses the zipper method. Zipper method: Combine linked list and array. In other words, create a linked list array, and each cell in the array is a linked list. If you encounter a hash conflict, add the conflicted value to the linked list.

HashMap data structure in jdk1.7

After JDK1.8

Compared with the previous version, jdk1.8 has a big change in resolving hash conflicts. When the length of the linked list is greater than the threshold (the default is 8), the linked list is converted into a red-black tree to reduce the search time.

HashMap data structure in jdk1.8

Comparison of JDK1.7 VS JDK1.8

JDK1.8 mainly solves or optimizes the following problems:

  • resize optimization
  • The red-black tree is introduced to prevent the single linked list from being too long and affecting the query efficiency. Please refer to the red-black tree algorithm
  • Solve the problem of multi-threaded infinite loop, but it is still not thread-safe, and may cause data loss when multi-threaded.

1.8 The specific process of the put method of HashMap?

When we put, first calculate the hash value of the key. Here, the hash method is called. The hash method actually makes the key.hashCode() and key.hashCode()>>>16 perform the exclusive OR operation, and the high 16bit is filled with 0, one The XOR of the number and 0 remains unchanged, so the approximate effect of the hash function is: the high 16bit remains unchanged, and the low 16bit and the high 16bit do an XOR, the purpose is to reduce collisions. According to the function notes, because the bucket array size is a power of 2, calculate the subscript index = (table.length-1) & hash. If you don t do the hash processing, it is equivalent to only a few low bits that the hash takes effect. In order to reduce the amount of hashing After considering the speed, effect, and quality of the column collisions, the designer uses high 16bit and low 16bit XORs to simply handle and reduce collisions, and a tree structure of O(logn) complexity is used in JDK8 to improve the performance under collisions. .

PutVal method execution flowchart

public V put (K key, V value) { return putVal(hash(key), key, value, false , true ); } static final int hash (Object key) { int h; return (key == null )? 0 : (h = key.hashCode()) ^ (h >>> 16 ); } //Implement Map.put and related methods final V putVal ( int hash, K key, V value, boolean onlyIfAbsent, boolean evict) { Node<K,V>[] tab; Node<K,V> p; int n, i; //Step : create if tab is empty //table is not initialized or the length is 0, expand if ((tab = table) == null || (n = tab.length) == 0 ) n = (tab = resize()).length; //Step : Calculate the index and deal with the null //(n-1) & hash Determine which bucket the element is stored in, the bucket is empty, and the newly generated node is put into the bucket (at this time, this node is Put in the array) if ((p = tab[i = (n- 1 ) & hash]) == null ) tab[i] = newNode(hash, key, value, null ); //There are already elements in the bucket else { Node<K,V> e; K k; //Step : The node key exists and directly overwrites the value //The hash value of the first element in the comparison bucket (node in the array) is equal, and the key is equal if (p.hash == hash && ((k = p.key) == key || (key != null && key.equals(k)))) //Assign the first element to e, use e to record e = p; //Step : Determine that the chain is a red-black tree //hash values are not equal, that is, the key is not equal; it is a red-black tree node //If the current element type is TreeNode, it means a red-black tree, putTreeVal returns to be stored node, e may be null else if (p instanceof TreeNode) //Put in the tree e = ((TreeNode<K,V>)p).putTreeVal( this , tab, hash, key, value); //Step : The chain is a linked list //is a linked list node else { //Insert a node at the end of the linked list for ( int binCount = 0 ;; ++binCount) { //Reach the end of the linked list //Determine if the pointer at the end of the linked list is empty if ((e = == null ) { //Insert a new node at the end = newNode(hash, key, value, null ); //Determine whether the length of the linked list reaches the critical value of transforming the red-black tree, the critical value is 8 if (binCount >= TREEIFY_THRESHOLD- 1 ) //-1 for 1st //Linked list structure to tree structure treeifyBin(tab, hash); //Jump out of the loop break ; } //Determine whether the key value of the node in the linked list is equal to the key value of the inserted element if (e.hash == hash && ((k = e.key) == key || (key != null && key.equals(k)))) //equal, break out of the loop ; //used to traverse the linked list in the bucket, and the previous e = combination, you can traverse the linked list p = e; } } //When judging that the current key already exists, when the same hash value and key value are returned, the new value is returned if (e != null ) { //Record the value of e V oldValue = e.value; //onlyIfAbsent is false or the old value is null if (!onlyIfAbsent || oldValue == null ) //Replace the old value with the new value e.value = value; //Callback after access afterNodeAccess(e); //return old value return oldValue; } } //structural modification ++modCount; //Step : Expand the capacity if the maximum capacity is exceeded //Expand the capacity if the actual size is greater than the threshold if (++size> threshold) resize(); //callback after insertion afterNodeInsertion(evict); return null ; } Copy code

. Determine whether the key-value pair array table[i] is empty or null, otherwise execute resize() for expansion;

. Calculate the hash value according to the key value key to get the inserted array index i, if table[i]==null, directly create a new node to add, go to , if table[i] is not empty, go to ;

. Determine whether the first element of table[i] is the same as the key, if the same, directly overwrite the value, otherwise go to , where the same refers to hashCode and equals;

. Determine whether table[i] is a treeNode, that is, whether table[i] is a red-black tree, if it is a red-black tree, insert the key-value pair directly into the tree, otherwise go to ;

. Traverse table[i], judge whether the length of the linked list is greater than 8, if it is greater than 8, convert the linked list to a red-black tree, and perform the insertion operation in the red-black tree, otherwise perform the insertion operation of the linked list; if the key already exists during the traversal process Just overwrite value directly;

. After the insertion is successful, judge whether the actual number of key-value pairs size exceeds the maximum capacity threshold, and if it exceeds, expand the capacity.

1.9 How is the expansion operation of HashMap realized?

. In jdk1.8, the resize method is to call the resize method for expansion when the key-value pair in the hashmap is greater than the threshold or when it is initialized;

. Each time it is expanded, it is expanded twice;

. After the expansion, the position of the Node object is either at the original position or moved to a position twice the original offset.

In putVal(), we see that the resize() method is used twice in this function. The resize() method means that it will be expanded during the first initialization, or when the actual size of the array is greater than its The critical value (12 for the first time). At this time, the elements on the bucket will be redistributed while expanding. This is also an optimization of the JDK1.8 version. In 1.7, it needs to be redistributed after the expansion. Calculate its hash value and distribute it according to the hash value, but in version 1.8, it is judged based on whether the location of the same bucket (e.hash & oldCap) is 0. After re-allocating the hash, the element The position of either stays at the original position, or moves to the original position + the increased array size

final Node<K,V>[] resize() { Node<K,V>[] oldTab = table; //oldTab points to the hash bucket array int oldCap = (oldTab == null )? 0 : oldTab.length; int oldThr = threshold; int newCap, newThr = 0 ; if (oldCap > 0 ) { //If oldCap is not empty, the hash bucket array is not empty if (oldCap >= MAXIMUM_CAPACITY) { //If it is greater than the maximum capacity, assign it to the maximum integer threshold threshold = Integer.MAX_VALUE; return oldTab; //Return ) //If the length of the current hash bucket array is still less than the maximum capacity after expansion and oldCap is greater than the default value of 16 else if ((newCap = oldCap << 1 ) <MAXIMUM_CAPACITY && oldCap >= DEFAULT_INITIAL_CAPACITY) newThr = oldThr << 1 ; //double threshold double expansion threshold threshold } //The old capacity is 0, but the threshold is greater than zero, which means that the parameter structure has a cap passed in, and the threshold has been initialized to the smallest power of 2 to the power of n //directly assign this value to the new capacity else if (oldThr> 0 ) //initial capacity was placed in threshold newCap = oldThr; //Construct a map created without parameters, giving the default capacity and threshold 16, 16*0.75 else { //zero initial threshold signifies using defaults newCap = DEFAULT_INITIAL_CAPACITY; newThr = ( int )(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY); } //new threshold = new cap * 0.75 if (newThr == 0 ) { float ft = ( float )newCap * loadFactor; newThr = (newCap <MAXIMUM_CAPACITY && ft <( float )MAXIMUM_CAPACITY? ( int )ft: Integer.MAX_VALUE); } threshold = newThr; //Calculate the new array length and assign it to the current member variable table @SuppressWarnings({"rawtypes","unchecked"}) Node<K,V>[] newTab = (Node<K,V>[]) new Node [newCap]; //Create a new hash bucket array table = newTab; //Copy the value of the new array to the old hash bucket array //If the original array is not initialized, then the initialization of the resize ends here, otherwise the expansion element Rearrange the logic to make it evenly distributed if (oldTab != null ) { //Traverse all the bucket subscripts of the new array for ( int j = 0 ; j <oldCap; ++j) { Node<K,V> e; if ((e = oldTab[j]) != null ) { //Assign the bucket index of the old array to the temporary variable e, and dereference the old array, otherwise the array cannot be recycled by the GC oldTab[j] = null ; //If, it means that there is only one element in the bucket, and there is no linked list or red-black tree if ( == null ) //Use the same hash mapping algorithm to add the element to the new array newTab [e.hash & (newCap- 1 )] = e; //If e is TreeNode and!=null, then process the rearrangement of the elements in the tree else if (e instanceof TreeNode) ((TreeNode<K,V>)e).split( this , newTab, j, oldCap); //e is the head of the linked list and!=null, then the elements in the linked list are rearranged else { //preserve order //loHead, loTail means no need to change the subscript after expansion, see note 1 Node<K,V> loHead = null , loTail = null ; //hiHead, hiTail means change the subscript after expansion, see note 1 Node<K, V> hiHead = null , hiTail = null ; Node<K,V> next; //Traverse the linked list do { next =; if ((e.hash & oldCap) == 0 ) { if (loTail == null ) //initialize head to point to the current element e of the linked list, e is not necessarily the first element of the linked list, after initialization loHead //represents the subscript The head element of the linked list that remains unchanged loHead = e; else // points to the current e = e; //loTail points to the current element e //After initialization, loTail and loHead point to the same memory, so when points to the next element, //the next reference of the element in the underlying array also changes accordingly, resulting in lowHead. //Follow loTail to synchronize, so that lowHead can be linked to all elements belonging to the linked list. loTail = e; } else { if (hiTail == null ) //Initialize head to point to the current element e of the linked list, after initialization hiHead represents the head element of the linked list whose subscript is changed hiHead = e; else = e; hiTail = e; } } while ((e = next) != null ); //At the end of the traversal, point tail to null, and put the head of the linked list into the corresponding subscript of the new array to form a new mapping. if (loTail != null ) { = null ; newTab[j] = loHead; } if (hiTail != null ) { = null ; newTab[j + oldCap] = hiHead; } } } } } return newTab; } Copy code

1.10 How does HashMap resolve hash conflicts?

What is a hash?

Hash, generally translated as "hash", but also directly transliterated as "hash", which is to transform an input of any length into a fixed-length output through a hash algorithm. The output is the hash value (hash value) ; This conversion is a compression mapping, that is, the space of the hash value is usually much smaller than the space of the input, and different inputs may be hashed into the same output, so it is impossible to uniquely determine the input value from the hash value . Simply put, it is a function that compresses messages of any length to a fixed-length message digest.

All hash functions have the following basic characteristic**: If the hash value calculated according to the same hash function is different, the input value must be different. However, if the hash value calculated according to the same hash function is the same, the input value may not be the same**.

What is a hash collision?

When two different input values calculate the same hash value based on the same hash function, we call it a collision (hash collision) .

In this way, we can organize objects with the same hash value into a linked list and place them under the bucket corresponding to the hash value, but compared to the int type returned by hashCode, the initial capacity of our HashMap DEFAULT_INITIAL_CAPACITY = 1 << 4 (ie 2 to the fourth power 16) is much smaller than the range of the int type, so if we simply use the hashCode to obtain the corresponding bucket, this will greatly increase the probability of hash collisions, and in the worst case, the HashMap will be It becomes a singly linked list, so we still need to optimize the hashCode.

hash() function

The problem mentioned above is mainly because if the hashCode is used to take the remainder, then only the low bits of the hashCode are involved in the operation, and the high bits do not play any role, so our idea is to let the high bits of the hashCode take part in the calculation. , To further reduce the probability of hash collision and make the data distribution more even. We call this operation perturbation. The hash() function in JDK 1.8 is as follows:

static final int hash (Object key) { int h; return (key == null )? 0 : (h = key.hashCode()) ^ (h >>> 16 );// Shifting 16 bits to the right from itself OR operation (exclusive OR of high and low bits) } Copy code

This is more concise than in JDK 1.7 . Compared with 4 bit operations and 5 XOR operations (9 perturbations) in 1.7, in 1.8, only 1 bit operation and 1 XOR operation are performed. (2 disturbances) ;

"Red-black tree added in JDK1.8"

Through the above chain address method (using hash table) and perturbation function, we succeeded in making our data distribution more even and reducing hash collisions, but when there is a large amount of data in our HashMap, adding the corresponding linked list under one of our buckets n elements, then the traversal time complexity is O(n). In order to solve this problem, JDK1.8 added a red-black tree data structure to the HashMap, further reducing the traversal complexity to O(logn); summary Briefly summarize what methods HashMap uses to effectively resolve hash conflicts:

  • Use chain address method (using hash table) to link data with the same hash value;
  • Use 2 perturbation functions (hash functions) to reduce the probability of hash collisions and make the data distribution more even;
  • The introduction of red-black trees further reduces the time complexity of the traversal, making the traversal faster;

1.11 Why does HashMap not directly use the hash value processed by hashCode() as the subscript of the table?

Answer: The hashCode() method returns an int integer type, and its range is -(2 ^ 31) ~ (2 ^ 31-1). There are about 4 billion mapping spaces, and the capacity range of HashMap is 16 (initialization default Value) ~2 ^ 30, HashMap usually does not get the maximum value, and it is difficult to provide so much storage space on the device, resulting in the hash value calculated by hashCode() may not be within the range of the array size. Then it cannot match the storage location;

How to solve it?

  • HashMap implements its own hash() method. Through two disturbances, it makes its own hash value high and low to perform exclusive-or operations on its own, reducing the probability of hash collisions and making the data distribution more even;
  • When ensuring that the length of the array is a power of 2, use the value and operation (&) (array length-1) after the hash() operation to obtain the array subscript for storage, which is more than the remainder operation Effective, secondly, because only when the length of the array is a power of 2, h&(length-1) is equivalent to h%length, and thirdly, it solves the problem of "the hash value does not match the size range of the array"

1.12 Why is the length of HashMap a power of 2

In order to make HashMap access efficient, there are as few collisions as possible, that is, the data should be distributed as evenly as possible, and the length of each linked list/red-black tree is roughly the same. This implementation is the algorithm for storing the data in which linked list/red-black tree.

How should this algorithm be designed?

First of all, we may think of adopting the operation of% taking the remainder to achieve. However, here comes the important point: "In the remainder (%) operation, if the divisor is a power of 2, it is equivalent to the AND (&) operation that subtracts one from the divisor (that is, hash%length==hash&(length-1) The premise is that length is 2 to the power of n;). And using binary bit operation &, relative to% can improve the operation efficiency, which explains why the length of HashMap is a power of 2.

Then why are there two disturbances?

Answer: This is to increase the randomness of the low bits of the hash value to make the distribution more uniform, thereby improving the randomness & uniformity of the storage subscript position of the corresponding array, and ultimately reducing the Hash conflict. Two times are enough, and it has been reached. The purpose of both high and low bits participating in calculations;

1.13 What is the difference between HashMap and HashTable?

  • Thread safety :
    Is not thread-safe,
    It is thread-safe; the methods inside HashTable are basically modified by synchronized. (Multi-threaded can use ConcurrentHashMap!);
  • Efficiency : Because of thread safety issues, HashMap is a bit more efficient than HashTable. In addition, HashTable is basically eliminated, do not use it in the code; support for Null key and Null value: In HashMap, null can be used as a key, there is only one such key, and there can be one or more keys corresponding to null. . However, as long as the key value put in the HashTable has a null, a NullPointerException will be thrown directly.
  • **The difference between the initial capacity and the capacity for each expansion**: If you do not specify the initial value of the capacity when you create it, the default initial size of Hashtable is 11, and after each expansion, the capacity becomes the original 2n+1. The default initial size of HashMap is 16. After each expansion, the capacity is doubled. If the initial value of capacity is given when creating, then Hashtable will directly use the size you give, and HashMap will expand it to the power of 2. In other words, HashMap always uses a power of 2 as the size of the hash table. Underlying data structure: HashMap after JDK1.8 has changed a lot when resolving hash conflicts. When the length of the linked list is greater than the threshold (the default is 8), the linked list is converted into a red-black tree to reduce search time. Hashtable has no such mechanism.

1.14 The difference between HashMap and ConcurrentHashMap

  • ConcurrentHashMap divides the entire bucket array into segments, and then uses locks to protect each segment. Compared with HashTable's synchronized lock, the granularity of the synchronized lock is finer and the concurrency performance is better. HashMap has no locks. The mechanism is not thread-safe. (After JDK1.8, ConcurrentHashMap has enabled a new way of implementation, using the CAS algorithm.)
  • HashMap key-value pairs allow null, but ConCurrentHashMap does not allow it.

1.15 The difference between ConcurrentHashMap and Hashtable?

The difference between ConcurrentHashMap and Hashtable is mainly reflected in the way to achieve thread safety.

  • "Underlying data structure" : The bottom layer of ConcurrentHashMap of JDK1.7 is implemented by segmented array + linked list. JDK1.8 uses the same data structure as HashMap 1.8, array + linked list/red-black binary tree. The underlying data structure of Hashtable and HashMap before JDK1.8 are similar in the form of array + linked list. The array is the main body of the HashMap, and the linked list exists mainly to resolve hash conflicts;
  • "A way to achieve thread safety" (important): In JDK1.7, ConcurrentHashMap (segment lock) divided the entire bucket array into segments (Segment), and each lock only locks part of the data in the container. When threads access data in different data segments in the container, there will be no lock contention, which increases the concurrent access rate. (16 Segments are allocated by default, which is 16 times more efficient than Hashtable.) By the time of JDK1.8, the concept of Segments has been abandoned, but directly implemented with the data structure of Node array + linked list + red-black tree, and concurrency control uses synchronized And CAS to operate. (Many optimizations have been made to synchronized locks after JDK1.6) The whole looks like an optimized and thread-safe HashMap. Although the segment data structure can be seen in JDK1.8, the attributes have been simplified, just for Compatible with the old version; Hashtable (same lock): Use synchronized to ensure thread safety, which is very inefficient. When a thread accesses the synchronization method, other threads also access the synchronization method, and may enter a blocking or polling state. For example, use put to add elements, and another thread cannot use put to add elements or use get. The competition will become increasingly fierce. The lower the efficiency.

"Comparison of the two" :


ConcurrentHashMap of JDK1.7:

ConcurrentHashMap of JDK1.8 (TreeBin: Red and Black Binary Tree Node: Linked List Node):

Answer: ConcurrentHashMap combines the advantages of HashMap and HashTable. HashMap does not consider synchronization, HashTable considers synchronization issues. But HashTable locks the entire structure every time it is executed synchronously. The ConcurrentHashMap lock method is slightly fine-grained.

1.16 Do you know the specific implementation of ConcurrentHashMap? What is the realization principle?


1. divide the data into a segment of storage, and then assign a lock to each segment of data. When a thread occupies the lock to access one of the segments, the data in other segments can also be accessed by other threads.

In JDK1.7, ConcurrentHashMap is implemented in the way of Segment + HashEntry, and the structure is as follows:

A ConcurrentHashMap contains an array of Segments. The structure of Segment is similar to HashMap. It is an array and linked list structure. A Segment contains a HashEntry array. Each HashEntry is an element of a linked list structure. Each Segment guards the elements in a HashEntry array. When making changes, you must first obtain the lock of the corresponding segment.

  • This class contains two static internal classes HashEntry and Segment; the former is used to encapsulate the key-value pairs of the mapping table, and the latter is used to act as a lock;
  • Segment is a reentrant lock ReentrantLock. Each segment guards an element in the HashEntry array. When the data in the HashEntry array is modified, the corresponding segment lock must first be obtained.


In JDK1.8, the bloated segment design is abandoned, and Node + CAS + Synchronized is used instead to ensure concurrency safety. Synchronized only locks the first node of the current linked list or red-black binary tree, so as long as the hash does not conflict, there will be no Concurrency will occur, and the efficiency will be increased by N times.

The structure is as follows:

1.17 What is the difference between Array and ArrayList?

  • Array can store basic data types and objects, and ArrayList can only store objects.
  • Array specifies a fixed size, while the size of ArrayList is automatically expanded.
  • There are not as many built-in methods of Array as ArrayList. For example, addAll, removeAll, iteration and other methods are only available for ArrayList.

For basic types of data, the collection uses automatic boxing to reduce coding workload. However, this method is relatively slow when dealing with fixed-size basic data types.

1.18 How to convert between Array and List?

  • Array to List: Arrays. asList(array);
  • List to Array: the toArray() method of List.

1.19 What is the difference between comparable and comparator?

  • The comparable interface is actually from the java.lang package, it has a compareTo (Object obj) method to sort
  • The comparator interface is actually from the java.util package, it has a compare(Object obj1, Object obj2) method to sort

Generally, when we need to use a custom sort for a collection, we have to rewrite the compareTo method or the compare method. When we need to implement two sorting methods for a certain collection, for example, the song name and singer name in a song object use one. For this sorting method, we can rewrite the compareTo method and use the self-made Comparator method or use two Comparators to achieve song name sorting and star name sorting. The second means that we can only use the two-parameter version of Collections.sort() .

Comparator custom sorting

ArrayList<Integer> arrayList = new ArrayList<Integer>(); arrayList.add(- 1 ); arrayList.add( 3 ); arrayList.add( 3 ); arrayList.add(- 5 ); arrayList.add( 7 ); arrayList.add( 4 ); arrayList.add(- 9 ); arrayList.add(- 7 ); System.out.println( "Original array:" ); System.out.println(arrayList); //void reverse(List list): Reverse Collections.reverse(arrayList); System.out.println("Collections.reverse (arrayList):"); System.out.println(arrayList); //void sort(List list), sort in ascending order of natural sort Collections.sort(arrayList); System.out.println("Collections.sort(arrayList) ):"); System.out.println(arrayList); //Usage of custom sorting Collections.sort(arrayList, new Comparator<Integer>() { @Override public int compare (Integer o1, Integer o2) { return o2. compareTo(o1); } }); (System.out.println "After custom sorting:" ); System.out.println (arrayList); Copy the code
Original array: [- 1 , 3 , 3 , -5 , 7 , 4 , -9 , -7 ] Collections.reverse(arrayList): [- 7 , -9 , 4 , 7 , -5 , 3 , 3 , -1 ] Collections.sort(arrayList): [- 9 , -7 , -5 , -1 , 3 , 3 , 4 , 7 ] After custom sorting: [ 7 , 4 , 3 , 3 - . 1 - . 5 - 7 - . 9 ] duplicated code

Override the compareTo method to sort by age

//The person object does not implement the Comparable interface, so it must be implemented so that there is no error, so that the data in the treemap can be arranged in order. //The String class in the previous example has implemented the Comparable interface by default. You can check the String class for details API documentation, and other //classes like Integer have already implemented the Comparable interface, so there is no need to implement public class Person implements Comparable < Person > { private String name; private int age; public Person (String name, int age) { super (); this .name = name; this .age = age; } public String getName () { return name; } public void setName (String name) { this .name = name; } public int getAge () { return age; } public void setAge ( int age) { this .age = age; } /** * TODO rewrites the compareTo method to sort by age */ @Override public int compareTo (Person o) { if ( this .age> o.getAge()) { return 1 ; } else if ( this .age <o.getAge()) { return - 1 ; } return age; } } Copy code
public static void main (String[] args) { TreeMap<Person, String> pdata = new TreeMap<Person, String>(); pdata.put( new Person( " " , 30 ), "zhangsan" ); pdata.put( new Person( " " , 20 ), "lisi" ); pdata.put( new Person( " " , 10 ), "wangwu" ); pdata.put( new Person( " " , 5 ), "xiaohong" ); //Get the value of the key while getting the value corresponding to the key Set<Person> keys = pdata.keySet(); for (Person key: keys) { System.out.println(key.getAge() + "-" + key.getName());} } Copy code

5-Xiaohong 10-Wang Wu 20-Li Si 30-Zhang San

1.20 What is the difference between Collection and Collections?

  • java.util.Collection is a collection interface (a top-level interface of the collection class). It provides general interface methods for basic operations on collection objects. The Collection interface has many concrete implementations in the Java class library. The meaning of the Collection interface is to provide a maximized unified operation method for various specific collections, and its direct inheritance interfaces include List and Set.
  • Collections is a tool class/helper class of the collection class, which provides a series of static methods for sorting, searching, and thread-safe operations of the elements in the collection.

1.21 How do TreeMap and TreeSet compare elements when sorting? How does the sort() method in the Collections utility class compare elements?

TreeSet requires that the class to which the stored object belongs must implement the Comparable interface, which provides a compareTo() method for comparing elements, which will be called back to compare the size of the element when an element is inserted. TreeMap requires that the keys stored in the key-value pair mapping must implement the Comparable interface to sort elements according to the key.

The sort method of the Collections utility class has two overloaded forms,

The first requires that the objects stored in the incoming container to be sorted implement the Comparable interface to achieve element comparison;

The second non-mandatory requirement that the elements in the container must be comparable, but requires that the second parameter be passed in. The parameter is a subtype of the Comparator interface (the compare method needs to be rewritten to implement element comparison), which is equivalent to a temporarily defined Sorting rules are actually injecting algorithms for comparing element sizes through interfaces, as well as the application of callback mode (the support for functional programming in Java).

Article reference:

Re-understanding HashMap in Java 8 series

Java collection container interview questions

Today s sharing is over when you see it here. If you think this article is not bad, let s share it, like it, and watch Sanlian, so that more people can see it too~

Welcome to follow the personal public account "JavaClub" and share some interview dry goods with you regularly.

This article is published by OpenWrite , an operating tool platform such as blog group posting and multi- posting.