Data structure and algorithm

Data structure and algorithm

1. Introduction to data structure

The data structure is to realize the various data organization forms for the effective use of computer data, and serve all kinds of computer operations. Different data structures have their own corresponding applicable scenarios, which are designed to reduce the time and space complexity of various algorithm calculations and achieve the best task execution efficiency.

As shown in the figure below, common data structures can be divided into "linear data structure" and "non-linear data structure", specifically: "array", "linked list", "stack", "queue", "tree", "graph" ", "hash table", "heap".

2. Array: Array

An array is a data structure that stores elements of the same type in a continuous memory space, and its length is immutable.

As shown in the figure below, to construct this array, you need to specify the length during initialization and assign a value to each index element of the array. The code is as follows:

= Array [ 2 , . 3 , . 1 , 0 , 2 ]; duplicated code

"Variable array" is a frequently used data structure, which is implemented based on arrays and expansion mechanisms, and is more flexible than ordinary arrays. Common operations include: accessing elements, adding elements, and deleting elements.

# Initialize variable array array = [] # Add elements to the end array.append( 2 ) array.append( 3 ) array.append( 1 ) array.append( 0 ) array.append ( 2 ) copying the code

2.1 Array exercise 1

Given a binary array, calculate the maximum number of consecutive 1s . Example: Input: [ 1 , 1 , 0 , 1 , 1 , 1 ] Output: 3 Explanation: The first two digits and the last three digits are consecutive 1 , so the maximum number of consecutive 1s is 3.   prompt: The input array only contains  0 and 1 . The length of the input array is a positive integer, not more than 10 , 000 . class Solution : def findMaxConsecutiveOnes ( self, nums: List [ int ] ) -> int : count,result = 0 , 0 # Counter for i in range ( len (nums)): if nums[i]== 1 : # Traverse List, when the value is 1, add 1 to the value of count count+= 1 else : = Result max (COUNT, Result) # 1 when the value is not obtaining maximum value COUNT = 0 # 0 assigned to the counter, counting the new return max (Result, COUNT) copy the code

2.2 Array exercise 2

Given an array nums, write a function to move all zeros to the end of the array while maintaining the relative order of non-zero elements. Example: Input: [ 0 , 1 , 0 , 3 , 12 ] Output: [ 1 , 3 , 12 , 0 , 0 ] Description: You must operate on the original array, and cannot copy additional arrays. Minimize the number of operations. class Solution : def moveZeroes ( self, nums: List [ int ] ) -> None : index = 0 for i in range ( len (nums)): if nums[i] != 0 : # When num[i] is not 0 nums[index]=nums[i] # num[index] is assigned to num[i] index += 1 for j in range ( len (nums)): #traverse the list if j>=index: # When the value of j Greater than the value of index, num[j] is assigned a value of 0 the nums [J] = 0 return the nums duplicated code

3. Linked List: Linked List

The linked list takes the node as the unit, and each element is an independent object, and the storage in the memory space is non-continuous. The node object of the linked list has two member variables: "value val" and "subsequent node reference next".

class ListNode : DEF the __init__ ( Self, Val, Next = None ): self.val = Val # node values . Self Next = Next # references successor nodes replicate the code

As shown in the figure below, to establish this linked list, each node needs to be instantiated and the reference point of each node is constructed.

# Instantiate node n1 = ListNode( 4 ) # Node head n2 = ListNode( 5 ) n3 = ListNode( 1 ) # Build a reference to n1. next = n2 N2. Next = N3 duplicated code

3.1 Linked List Exercise 1: Combine two ordered linked lists

Combine the two ascending linked lists into a new ascending linked list and return. The new linked list is composed by splicing all the nodes of the given two linked lists. Example 1 : Input: l1 = [ 1 , 2 , 4 ], l2 = [ 1 , 3 , 4 ] Output: [ 1 , 1 , 2 , 3 , 4 , 4 ] Example 2 : Input: l1 = [], l2 = [] Output: [] Example 3 : Input: l1 = [], l2 = [ 0 ] Output: [ 0 ] class ListNode : def __init__ ( self, val = 0 , next = None ): self.val = val self. next = next class Solution : def mergeTwoLists ( self, L1: ListNode, L2: ListNode ) -> ListNode: limt=ListNode(- 1 ) prev=limt while L1 and L2: if L1.val<=L2.val: prev. next ,L1=L1,L1. next else : prev. next ,L2=L2,L2. next prev=prev. next prev. next =L1 if L1 is not None else L2 return limt. the Next copy the code

3.2 Linked list exercise 2: Reversing a singly linked list

Reverse a singly linked list. Example: Input: 1 -> 2 -> 3 -> 4 -> 5 ->NULL Output: 5 -> 4 -> 3 -> 2 -> 1 ->NULL Advanced: You can reverse the linked list iteratively or recursively. Can you solve this problem in two ways? class ListNode : def __init__ ( self, val = 0 , next = None ): self.val = val self. next = next class Solution : def reverseList ( self, head: ListNode ) -> ListNode: n1,n2=head, None while n1: . TEMP = N1 Next # TEMP temporal pointer to the next value assigned head . N1 Next = N2 # n1.next N2 = N2 = N1 # N2 = N1 N1 = TEMP # N1 = TEMP TEMP: [2,3] return N2 replication Code

3.3 Linked List Exercise 3: Double Pointer

Topic: Enter two linked lists in ascending order, merge the two linked lists and make the nodes in the new linked list still in ascending order. Example 1 : Input: . 1 -> 2 -> 4 , . 1 -> . 3 -> 4 Output: . 1 -> . 1 -> 2 -> . 3 -> 4 -> 4 copy the code
class ListNode : def __init__ ( self, x ): self.val = x self. next = None class Solution : def mergeTwoLists ( self, L1: ListNode, L2: ListNode ) -> ListNode: cur=tep=ListNode( 0 ) while L1 and L2: if L1.val<L2.val: cur. next ,L1=L1,L1. next else : cur. next ,L2=L2,L2. next cur=cur. next cur. next = L1 if L1 else L2 # Python ternary expression A if x else B, which means that when x = True, execute A, otherwise execute B return TEP. the Next copy the code

4. Stack: Stack

The stack is an abstract data structure with the characteristics of "first in, then out", which can be implemented using an array or a linked list.

= Stack [] # listing the Python may be used as a stack copy the code

As shown in the figure below, through the common operations "push()" and "pop()", the stack's first-in-last-out feature is displayed.

stack.append ( 1 ) # 1 element stack stack.append ( 2 ) # 2 stack elements stack.pop () # pop -> element 2 stack.pop () # pop -> 1 element copy the code

5. Queue: Queue

Queue is an abstract data structure with the characteristics of "first in, first out", which can be implemented using linked lists.

# Python usually uses deque collections.deque from collections import deque queue = deque() Copy code

As shown in the figure below, the common operations "enqueue push()" and "dequeue pop()" demonstrate the first-in, first-out characteristics of the queue.

queue.append ( 1 ) # 1 enqueued element queue.append ( 2 ) # 2 enqueued element queue.popleft () # dequeued -> element 1 queue.popleft () # dequeued -> 2 element copy the code

6. Tree: Tree

A tree is a non-linear data structure, which can be divided into "binary tree" and "multiple tree" according to the number of child nodes. The topmost node is called the "root node root". Taking a binary tree as an example, each node contains three member variables: "value val", "left child node left", and "right child node right".

class the TreeNode : DEF the __init__ ( Self, X ): self.val = X # node value self.left = None # left child node self.right = None # right child node duplicated code

As shown in the figure below, to establish this binary tree, each node needs to be instantiated and the reference point of each node is constructed.

# Initialize node n1 = TreeNode( 3 ) # Root node root n2 = TreeNode( 4 ) n3 = TreeNode( 5 ) n4 = TreeNode( 1 ) n5 = TreeNode( 2 ) # Build a reference point n1.left = n2 n1.right = n3 n2.left = n4 n2.right = n5 Copy code

6.1 Preorder traversal

Preorder traversal of binary tree: first visit the root node, then traverse the left subtree, and finally traverse the right subtree.

# Definition for a binary tree node. # class TreeNode: # def __init__(self, val=0, left=None, right=None): # self.val = val # self.left = left # self.right = right class Solution : def preorderTraversal ( self, root: TreeNode ) -> List [ int ]: res = [] def preordef ( root ): if not root: return res.append(root.val) preordef(root.left) preordef(root.right) preordef(root) return resCopy code

6.2 In-order traversal

In-order traversal: first traverse the left subtree, then visit the root node, and then traverse the right subtree.

# Definition for a binary tree node. # class TreeNode: # def __init__(self, val=0, left=None, right=None): # self.val = val # self.left = left # self.right = right class Solution : def inorderTraversal ( self, root: TreeNode ) -> List [ int ]: res = [] def preordef ( root ): if not root: return preordef(root.left) res.append(root.val) preordef(root.right) preordef(root) return resCopy code

6.3 Post-order traversal

Post-order traversal: first traverse the left subtree, then traverse the right subtree, and finally visit the root node of the tree.

# Definition for a binary tree node. # class TreeNode: # def __init__(self, val=0, left=None, right=None): # self.val = val # self.left = left # self.right = right class Solution : def inorderTraversal ( self, root: TreeNode ) -> List [ int ]: res = [] def preordef ( root ): if not root: return preordef(root.left) preordef(root.right) res.append(root.val) preordef(root) return resCopy code

6.4 Sequence traversal

The algorithm starts from a root node and first visits the node itself. Then it traverses its neighboring nodes, then traverses its second-level neighbors, third-level neighbors, and so on. When we perform a breadth-first search in the tree, the order of the nodes we visit is in the order of traversal.

7. Figure: Graph

A graph is a non-linear data structure consisting of a "node (vertex) vertex" and an "edge". Each edge connects a pair of vertices. According to the direction of edges, graphs can be divided into "directed graphs" and "undirected graphs". This article takes the undirected graph as an example to introduce.

As shown in the figure below, the sets of vertices and edges of this undirected graph are:

Vertex set: vertices = {1, 2, 3, 4, 5} Edge set: edges = {(1, 2), (1, 3), (1, 4), (1, 5), (2, 4 ), (3, 5), (4, 5))

There are usually two ways to represent a graph:

  1. Adjacency matrix: Use the array vertices to store vertices, and the adjacency matrix edges to store edges; edges[i][j] represents whether there is an edge between node i + 1 and node j + 1.
vertices = [ 1 , 2 , 3 , 4 , 5 ] edges = [[ 0 , 1 , 1 , 1 , 1 ], [ 1 , 0 , 0 , 1 , 0 ], [ 1 , 0 , 0 , 0 , 1 ], [ 1 , 1 , 0 , 0 , 1 ], [ 1 , 0 , 1 , 1 , 0 ]] duplicated code
  1. Adjacency table: Use the array vertices to store vertices, and the adjacency table edges to store edges. edges is a two-dimensional container, the first dimension i represents the vertex index, and the second dimension edges[i] stores the sum of the edge sets corresponding to this vertex; for example, edges[0] = [1, 2, 3, 4] represents vertices[0 The edge set of] is [1, 2, 3, 4].
vertices = [ 1 , 2 , 3 , 4 , 5 ] edges = [[ 1 , 2 , 3 , 4 ], [ 0 , 3 ], [ 0 , 4 ], [ 0 , 1 , 4 ], [ 0 , 2 , 3 ]] Copy code

Adjacency matrix VS adjacency list:

The size of the adjacency matrix is only related to the number of nodes, that is, N^2, where N is the number of nodes. Therefore, when the number of edges is significantly less than the number of nodes, using the adjacency matrix to store the graph will cause a large waste of memory. Therefore, the adjacency table is suitable for storing sparse graphs (more vertices, fewer edges); the adjacency matrix is suitable for storing dense graphs (less vertices, more edges).

8. Hash table: Hash table

The hash table is a non-linear data structure, which uses the Hash function to map the specified "key" to the corresponding "value" to achieve efficient element search.

Imagine a simple scenario: the student IDs of Xiaoli, Xiaote, and Xiaokou are 10001, 10002, and 10003 respectively. Now you need to find the "student ID" from the "name". Copy code

This requirement can be achieved by creating a hash table whose name is key and student ID is value. The code is as follows:

# Initialize the hash table dic = {} # Add key -> value key-value pair dic[ " " ] = 10001 dic[ " " ] = 10002 dic[ " " ] = 10003 Find Student ID # Name from dic [ "small force" ] # -> 10001 dic [ "little special" ] # -> 10002 dic [ "little button" ] # -> 10003 copy the code

Design your own Hash function:

Assuming requirement: Find "Name" from "Student ID". Copy code

Store the names of the three persons in the following array, and the index of each name in the array is 0, 1, 2 respectively.

names = [ "Xiao Li" , "Xiao Te" , "Xiao Kou" ] Copy code

At this point, we construct a simple Hash function (% is the remainder symbol), the formula and wrapper function are as follows:

hash (key) = (key- 1)% 10000 duplicated code
def hash(id): index = (id-1)% 10000 return index Copy code

Then we build a hash table with the student ID as the key and the array index corresponding to the name as the value. Using this Hash function, the corresponding name can be found through the student ID under O(1) time complexity, namely:

names[ hash ( 10001 )]//Xiaoli names[ hash ( 10002 )]//small special names [ hash ( 10003 )]//small buckle copy the code

The above design is only applicable to this example, the actual Hash function needs to ensure low collision rate, high robustness, etc., to be suitable for various data and scenarios.

9. Heap: Heap

Heap is a data structure based on "complete binary tree", which can be implemented using arrays. The sorting algorithm based on the heap principle is called "heap sorting", and the data structure based on the heap is called "priority queue". The heap is divided into "big top heap" and "small top heap". Big (small) top heap: The value of any node is not greater than (less than) the value of its parent node.

Definition of a complete binary tree: Let the depth of the binary tree be k, if the nodes of each layer (layer 11 to k-1) of the binary tree except for the kth layer reach The largest number, and the nodes in the kth layer are continuously concentrated on the leftmost side, then this binary tree is called a complete binary tree. Copy code

As shown in the figure below, it is a small top pile containing 1, 4, 2, 6, 8 elements. The nodes in the heap (complete binary tree) are numbered by layer, and then they can be mapped to the array storage form on the right.

By using the "push()" and "pop()" operations of the "priority queue", the heap sort can be completed. The implementation code is as follows:

from heapq import heappush, heappop # Initialize the small top heap heap = [] # Elements into the heap heappush(heap, 1 ) heappush(heap, 4 ) heappush(heap, 2 ) heappush(heap, 6 ) heappush(heap, 8 ) # Elements of the stack (small to large) heappop (heap) # ->. 1 heappop (heap) # -> 2 heappop (heap) # ->. 4 heappop (heap) # ->. 6 heappop (heap) # ->. 8 Copy Code
# Implementation of "Maximum Heap" import sys class MaxHeap : def __init__ ( self, heapSize ): # heapSize is used for the size of the array, because at least the number of elements in the array needs to be specified when the array is created self.heapSize = heapSize # Use an array to create a complete binary tree structure, and then use the binary tree to build a "heap" self.maxheap = [ 0 ]*(heapSize + 1 ) # realSize is used to record the number of elements in the "heap" self.realSize = 0 # Add element function def add ( self, element ): self.realSize += 1 # If the number of elements in the "heap" is greater than the number of arrays set at the beginning, return "Add too many elements" if self. realSize> self.heapSize: print ( "Add too many elements!" ) self.realSize -= 1 return # Add the added element to the array self.maxheap[self.realSize] = element # Index position of the new element index = self.realSize # Index position of the parent node of the new element # Note that if an array is used to represent a complete binary tree, and the root node is stored at the index 1 position of the array, the index position of the parent node of any node is "the index position of the node/2", the index position of the left child node of any node is "the index position of the node*2", and the index position of the right child node of any node is "the index position of the node*2+1" parent = index//2 # When the added element is larger than the parent node, the value of the parent node and the value of the new element need to be exchanged while (self.maxheap[index]> self.maxheap[parent] and index> 1 ): self.maxheap[parent], self.maxheap[index] = self.maxheap[index], self.maxheap[parent] index = parent parent = index//2 # Get heap top element function def peek ( self ): return self.maxheap[ 1 ] # Delete top element function def pop ( self ): # If the number of elements in the current "heap" is 0, then return "Don't have any element" if self.realSize < 1 : print ( "Don't have any element!" ) return sys.maxsize else : # The current "heap" contains elements # self.realSize >= 1 removeElement = self.maxheap[ 1 ] # Assign the last element in the "heap" to the top element self. maxheap[ 1 ] = self.maxheap[self.realSize] self.realSize -= 1 index = 1 # When the deleted element is not a child node while (index <self.realSize and index <= self.realSize//2 ): # The left child node of the deleted node left = index * 2 # The right child node of the deleted node right = (index * 2 ) + 1 # When the element of the deleted node is smaller than the left child node or the right child node, it means that the value of the element is small, and the element needs to be combined with the left and right children The largest value in the node is exchanged if (self.maxheap[index] <self.maxheap[left] or self.maxheap[index] <self.maxheap[right]): if self.maxheap[left]> self.maxheap[ right]: self.maxheap[left], self.maxheap[index] = self.maxheap[index], self.maxheap[left] index = left else : self.maxheap[right], self.maxheap[index] = self.maxheap[index], self.maxheap[right] index = right else : break return removeElement # Return the number of elements in the ``heap'' def size ( self ): return self.realSize def toString ( self ): print (self.maxheap[ 1 : self.realSize+ 1 ]) if __name__ == "__main__" : # Test case maxHeap = MaxHeap( 5 ) maxHeap.add( 1 ) maxHeap.add( 2 ) maxHeap.add( 3 ) # [3,1,2] maxHeap.toString() # 3 print (maxHeap.peek()) # 3 print (maxHeap.pop()) # 2 print (maxHeap.pop()) # 1 print (maxHeap.pop()) maxHeap.add( 4 ) maxHeap.add( 5 ) # [5,4] maxHeap.toString() Copy code
# Implementation of "Minimal Heap" import sys class MinHeap : def __init__ ( self, heapSize ): # heapSize is used for the size of the array, because at least the number of elements in the array needs to be specified when the array is created self.heapSize = heapSize # Use an array to create a complete binary tree structure, and then use the binary tree to build a "heap" self.minheap = [ 0 ]*(heapSize + 1 ) # realSize is used to record the number of elements in the "heap" self.realSize = 0 # Add element function def add ( self, element ): self.realSize += 1 # If the number of elements in the "heap" is greater than the number of arrays set at the beginning, return "Add too many elements" if self. realSize> self.heapSize: print ( "Add too many elements!" ) self.realSize -= 1 return # Add the added element to the array self.minheap[self.realSize] = element # Index position of the new element index = self.realSize # Index position of the parent node of the new element # Note that if an array is used to represent a complete binary tree, and the root node is stored at the index 1 position of the array, the index position of the parent node of any node is "the index position of the node/2", the index position of the left child node of any node is "the index position of the node*2", and the index position of the right child node of any node is "the index position of the node*2+1" parent = index//2 # When the added element is smaller than the parent node, the value of the parent node and the value of the new element need to be exchanged while (self.minheap[index] <self.minheap[parent] and index> 1 ): self.minheap[parent], self.minheap[index] = self.minheap[index], self.minheap[parent] index = parent parent = index//2 # Get heap top element function def peek ( self ): return self.minheap[ 1 ] # Delete top element function def pop ( self ): # If the number of elements in the current "heap" is 0, then return "Don't have any element" if self.realSize < 1 : print ( "Don't have any element!" ) return sys.maxsize else : # The current "heap" contains elements # self.realSize >= 1 removeElement = self.minheap[ 1 ] # Assign the last element in the "heap" to the top element self. minheap[ 1 ] = self.minheap[self.realSize] self.realSize -= 1 index = 1 # When the deleted element is not a child node while (index <self.realSize and index <= self.realSize//2 ): # The left child node of the deleted node left = index * 2 # The right child node of the deleted node right = (index * 2 ) + 1 # When the element of the deleted node is greater than the left child node or the right child node, it means that the value of the element is large. At this time, the element needs to be combined with the left and right children The smallest value in the node is exchanged if (self.minheap[index]> self.minheap[left] or self.minheap[index]> self.minheap[right]): if self.minheap[left] <self.minheap[ right]: self.minheap[left], self.minheap[index] = self.minheap[index], self.minheap[left] index = left else : self.minheap[right], self.minheap[index] = self.minheap[index], self.minheap[right] index = right else : break return removeElement # Return the number of elements in the ``heap'' def size ( self ): return self.realSize def toString ( self ): print (self.minheap[ 1 : self.realSize+ 1 ]) if __name__ == "__main__" : # Test case minHeap = MinHeap( 5 ) minHeap.add( 3 ) minHeap.add( 1 ) minHeap.add( 2 ) # [1,3,2] minHeap.toString() # 1 print (minHeap.peek()) # 1 print (minHeap.pop()) # 2 print (minHeap.pop()) # 3 print (minHeap.pop()) minHeap.add( 4 ) minHeap.add( 5 ) # [4,5] minHeap.toString() Copy code