# Summary of BFS types

## Description

In order to highlight the difference between various types of questions, the following sample questions are selected relatively simple questions. Will not involve some very clever techniques, get rid of some details, so as to pay more attention to the topic type itself

## Connected block problem

Title description

Farmer John has a slice $N M$ rectangular land.

Recently, due to rainfall, part of the land was submerged by water.

Now use a character matrix to represent his land.

In each cell, if it contains rainwater, it is represented by "W", and if it does not contain rainwater, it is represented by ".".

Now John wants to know how many ponds have formed in his land.

Each set of connected water cells can be regarded as a pond.

Each cell is considered to be connected to its eight neighboring cells at the top, bottom, left, right, top left, top right, bottom left, and bottom right.

Please output the total number of ponds, that is, how many connected "W" blocks are in the matrix.

Problem-solving ideas

Traverse all grids, if it is "W" and it has not been marked, then start searching from this grid, mark all reachable "Ws" around it and count

Problem-solving code (C++)

#include <iostream>
#include <algorithm>
#include <cstring>
#include <cstdio>

using namespace std;

typedef pair<int, int> PII;

const int N = 1010, M = N * N;

int n, m;
char g[N][N];
bool st[N][N];
PII q[M];
int dx[] = {-1, -1, -1, 0, 0, 1, 1, 1};
int dy[] = {-1, 0, 1, -1, 1, -1, 0, 1};

void dfs(int sx, int sy)
{
int hh = 0, tt = -1;
q[++ tt] = {sx, sy};
st[sx][sy] = true;

while (tt >= hh)
{
PII t = q[hh ++];
int xx = t.first, yy = t.second;

for (int i = 0; i <8; ++ i)
{
int x = xx + dx[i], y = yy + dy[i];
if (x <0 || x >= n || y <0 || y >= m || g[x][y] =='.' || st[x][y]) continue;

q[++ tt] = {x, y};
st[x][y] = true;
}
}
}
int main()
{
cin >> n >> m;
for (int i = 0; i <n; ++ i) cin >> g[i];

int res = 0;
for (int i = 0; i <n; ++ i)
for (int j = 0; j <m; ++ j)
if (g[i][j] =='W' && !st[i][j])
{
++ res;
dfs(i, j);
}

cout << res << endl;

return 0;
}
Copy code

## Shortest path problem

BFS is the same as the shortest path algorithm . It is used to solve the shortest path problem on the graph. The difference is that BFS has some stricter requirements on the graph.

The shortest path algorithm can solve the single source shortest path and the multi-source shortest path. The path length only has the distinction between whether there is a negative weight edge and a negative weight loop.

But BFS can only solve the shortest path of a single source, and the path length can only be all equal or there are only two values (there are two categories as follows)

### The path lengths are all equal

Title description

Given a $n n$The two-dimensional array of is as follows:

int maze = {
0, 1, 0, 0, 0,
0, 1, 0, 1, 0,
0, 0, 0, 0, 0,
0, 1, 1, 1, 0,
0, 0, 0, 1, 0,
};
Copy code

It represents a maze, where 1 represents a wall, and 0 represents a path that can be walked. You can only walk horizontally or vertically, not diagonally. It requires programming to find the shortest route from the upper left corner to the lower right corner.

The data guarantees that there is at least one path from the upper left corner to the lower right corner.

Problem-solving ideas

Use queues to achieve layer sequence traversal. All points will only enter the team once, and the corresponding path when leaving the team is the shortest path

Prove that the distance obtained when leaving the team must be the shortest distance:

• Method 1-With the help of Dijkstra's algorithm

1. The first need to have proof by mathematical induction cohort traversing the layers BFS two properties and monotonicity

E.g. sequence $|x, x, x, | x+1, x+1, x+1|$ , its two-stage nature means that the datacanbe divided into two parts atmost, that is, x and x+1 here

First select the element x of the head of the line to expand. Assuming that the distances of the edges are all 1, then the distances extended by it are all x+1. Put it at the end of the line. Obviously, the sequence still satisfies both ends and monotonicity

2. On the premise of satisfying the two-stage and monotonicity, consider the heap-optimized version of Dijkstra's algorithm. Every time an element is taken from the opposite end of the priority queue, and all adjacent points are updated, it is also bipartite and monotonic for the traversal sequence, so it is consistent with our current approach. Since Dijkstra's algorithm is correct, the current approach is correct

• Method 2-direct proof

Using the method of proof by contradiction, suppose that the leader of the team at this time$s$ is not the shortest distance, that is, it exists in the subsequent state$t$ , meet$dis[s]> dis[t] + w$ . It has been proved in method 1 that the sequence satisfies the monotonicity, namely$dis[s] <= dis[t]$ , contradicts the hypothesis, so the hypothesis does not hold, that is, the element s at the head of the line must be the shortest distance.

Problem-solving code (C++)

#include <iostream>
#include <algorithm>
#include <cstring>
#include <cstdio>

using namespace std;

typedef pair<int, int> PII;

const int N = 1010, M = N * N;

int n;
int m[N][N];
bool st[N][N];
int dx[] = {-1, 0, 1, 0};
int dy[] = {0, 1, 0, -1};

PII q[M];
PII path[N][N];
PII nextLoca[N][N];

void bfs(int sx, int sy)
{
int hh = 0, tt = -1;
q[++ tt] = {sx, sy};
st[sx][sy] = true;

while (tt >= hh)
{
PII t = q[hh ++];
int xx = t.first, yy = t.second;
for (int i = 0; i <4; ++ i)
{
int x = xx + dx[i], y = yy + dy[i];
if (x <0 || x >= n || y <0 || y >= n || st[x][y] || m[x][y]) continue;

st[x][y] = true;
q[++ tt] = {x, y};
nextLoca[x][y] = {xx, yy};
}
}
}
int main()
{
cin >> n;
for (int i = 0; i <n; ++ i)
for (int j = 0;j <n; ++ j)
cin >> m[i][j];

nextLoca[n-1][n-1] = {n-1, n-1};
bfs(n-1, n-1);

PII t = {0, 0};
while (1)
{
int x = t.first, y = t.second;
cout << x << '' << y << endl;
if (x == n-1 && y == n-1) break;
t = nextLoca[x][y];
}

return 0;
}
Copy code

### Contains two path lengths

Title description

Given a length of $n$ 's$3$ runway roads, one runway includes$n + 1$ point, numbered$0$ to$n$ . A frog from the second runway$0$ at point and end at any point on the runway$n$ places. There may be some obstacles on the runway (guarantee point$0$ and points$n$Any runway at is barrier-free) When facing obstacles ahead, the frog can choose to side jump to the same point on other runways (the two runways may not be adjacent), and ask the frog the least number of side jumps to reach the end point

Example:

The minimum number of side jumps for the frog in the picture below is 2

Topic analysis

Set the cost of moving left and right to 0 and the cost of moving up and down to 1. Obviously, this problem can be abstracted as a single-source shortest path problem on a graph with edge weights only containing 0 and 1. The shortest path algorithm can naturally be solved, but it needs to be built. And BFS does not need to build a map, you can search directly. But the problem with BFS is that it can only solve graphs with equal edge weights, so a double-ended queue is introduced to solve this defect.

• Problem-solving strategy:

Still in accordance with the general BFS search order, take a point from the head of the deque for expansion. The difference is that when the point with edge 0 is placed at the head of the line, the point with edge 1 is placed at the end of the line. Get the shortest path to the end after all states are expanded

• Prove the correctness of BFS under the blessing of deque:

It is only necessary to prove that the sequence at this time still satisfies the bipartite and monotonicity . Suppose the sequence at this time is$|x, x, x, | x+1, x+1, x+1|$ , remove the element from the head of the line$x$ , if the extended point edge is 0, then$x+0$ placed at the head of the team, if the side is 1, then$x+1$ Put it at the end of the queue. Obviously, the changed sequence still satisfies thebipartiteandmonotonicity, that is, the deque BFS is correct.

Problem-solving code (C++)

//partial implementation of leetcode
const int N = 500010;

class Solution {
public:
int bfs(vector<vector<int>> &g, int n)
{
vector<vector<bool>> st(3, vector<bool>(N, false));
vector<vector<int>> dis(3, vector<int>(N, 0x3f3f3f3f));

deque<pair<int, int>> q;
int dx[] = {-2, -1, 1, 2, 0, 0}, dy[] = {0, 0, 0, 0, 1, -1};

q.push_front({1, 0});
dis = 0;

while (q.size())
{
auto t = q.front(); q.pop_front();
int x = t.first, y = t.second;

if (y == n-1) return dis[x][y];

if (st[x][y]) continue;
st[x][y] = true;

for (int i = 0; i <6; ++ i) {
int a = x + dx[i], b = y + dy[i];
if (a <0 || a >= 3 || b <0 || b >= n) continue;
if (g[a][b] == 1) continue;//Can t walk if there are obstacles

if (dis[a][b]> dis[x][y] + (i == 0 || i == 1 || i == 2 || i == 3))
{
dis[a][b] = dis[x][y] + (i == 0 || i == 1 || i == 2 || i == 3);
if (i == 0 || i == 1 || i == 2 || i == 3) q.push_back({a, b));
else q.push_front({a, b});
}
}
}

return -1;
}
int minSideJumps(vector<int>& obstacles) {

vector<vector<int>> g(3, vector<int>(N, 0));
int n = obstacles.size();

for (int i = 0; i <n; ++ i)
if (obstacles[i]) g[obstacles[i]-1][i] = 1;

return bfs(g, n);
}
};

//all achieved
#include <iostream>
#include <algorithm>
#include <cstring>
#include <cstdio>
#include <deque>

using namespace std;

const int N = 500010;

int n;
int obstacles[N], g[N];
int st[N];
int dis[N];

int bfs()
{
memset(dis, 0x3f, sizeof dis);

deque<pair<int, int>> q;
int dx[] = {-2, -1, 1, 2, 0, 0}, dy[] = {0, 0, 0, 0, 1, -1};

q.push_front({1, 0});
dis = 0;

while (q.size())
{
auto t = q.front(); q.pop_front();
int x = t.first, y = t.second;

if (y == n-1) return dis[x][y];

if (st[x][y]) continue;
st[x][y] = true;

for (int i = 0; i <6; ++ i) {
int a = x + dx[i], b = y + dy[i];
if (a <0 || a >= 3 || b <0 || b >= n) continue;
if (g[a][b] == 1) continue;//Can t walk if there are obstacles

if (dis[a][b]> dis[x][y] + (i == 0 || i == 1 || i == 2 || i == 3))
{
dis[a][b] = dis[x][y] + (i == 0 || i == 1 || i == 2 || i == 3);
if (i == 0 || i == 1 || i == 2 || i == 3) q.push_back({a, b));
else q.push_front({a, b});
}
}
}

return -1;
}

int main()
{
cin >> n;
for (int i = 0; i <n; ++ i) cin >> obstacles[i];

for (int i = 0; i <n; ++ i)
if (obstacles[i]) g[obstacles[i]-1][i] = 1;

cout << bfs() << endl;

return 0;
}
Copy code

## The minimum number of steps

In the connected block problem and the shortest path problem, the searched points are all points with coordinates on the graph. In the minimum number of steps problem, the searched point becomes a state containing various information. However, after understanding the related issues, you can find that these states can actually be seen as points on the graph.

### General practice

Title description

A board contains 8 grids, and each grid has a color. Starting from the upper left corner of the board, take out integers in a clockwise direction to form a color sequence. The color sequence can be used to indicate the status of this board.

For example: you can use sequence

12345678
Indicates the following board (at the same time this state is used as the basic state)

1 2 3 4
8 7 6 5
Copy code

The topic provides three basic operations (demonstrated by the basic state):

• A: swap the upper and lower lines

8 7 6 5
1 2 3 4
Copy code
• B: Insert the rightmost column to the leftmost

4 1 2 3
5 8 7 6
Copy code
• C: Rotate the 4 numbers in the center of the magic board clockwise

1 7 2 4
8 6 3 5
Copy code

Given the target state, query the sequence of operations from the basic state to the target state

Topic analysis

From the basic state to the target state, each state is a color sequence. The "points" on the searched graph have also become these sequences.

There is only one step operation between the two "points", so the weights of the upper edges of the graph are equal, and the general BFS can be used.

The current focus is to understand that "the searched point is actually a state". As for the hash required for code implementation, the mutual mapping between one-dimensional space and two-dimensional space is realized through coordinate transformation, so I won t go into details here.

Code

/**
* Coordinate mapping
* A: String flip
* B: a moves to a, a moves to a
* c:a->a, a->a, a->a, a->a
*
* Since this question has only two lines, a two-dimensional array can also be used to directly simulate a two-dimensional space, without the need for one-dimensional to two-dimensional coordinate mapping
*/
#include <iostream>
#include <algorithm>
#include <string>
#include <map>
#include <unordered_map>

using namespace std;

typedef pair<char, string> PII;

const int N = 40500;

unordered_map<string, int> dis;
unordered_map<string, PII> path;
string q[N];
char stk[N];

int bfs(string end)
{
string start = "12345678";
int hh = 0, tt = -1;
dis[start] = 0;
q[++ tt] = start;

while (tt >= hh)
{
string t = q[hh ++];
if (t == end) return dis[t];

string a(t);
reverse(a.begin(), a.end());
if (!dis.count(a))
{
dis[a] = dis[t] + 1;
path[a].first ='A', path[a].second = t;
q[++ tt] = a;
}

string b(t);
char u(b), v(b);
for (int i = 3; i> 0; - i) b[i] = b[i-1];
b = u;
for (int i = 4; i <7; ++ i) b[i] = b[i + 1];
b = v;
if (!dis.count(b))
{
dis[b] = dis[t] + 1;
path[b].first ='B', path[b].second = t;
q[++ tt] = b;
}

string c(t);
u = c;
c = c;
c = c;
c = c;
c = u;
if (!dis.count(c))
{
dis[c] = dis[t] + 1;
path[c].first ='C', path[c].second = t;
q[++ tt] = c;
}
}

return -1;
}
int main()
{
string end("");

for (int i = 0; i <8; ++ i)
{
char c;
cin >> c;
end += c;
}

cout << bfs(end) << endl;

int tt = -1;
for (string s = end; s != "12345678"; s = path[s].second) stk[++ tt] = path[s].first;
while (tt >= 0) cout << stk[tt --];

return 0;
}
Copy code

### Optimization-two-way BFS

Think first $a^b$ and$2 a^{\frac{b}{2}}$ The size relationship.

$\frac{a^b}{2 a^{\frac{b}{2}}} =/frac{a^{\frac{b}{2}}}{2}$, That is, the relationship between the two depends on $\frac{b}{2}$ with $2$ size relationship

The vast majority of real problems are$\frac{b}{2}> 2$ , namely$a^b> 2 a^{\frac{b}{2}}$

$a^b> 2 a^{\frac{b}{2}}$It clearly explained the relationship between general BFS and two-way BFS. Generally, in the one-way search used by BFS, the increase in the number of states is not linear, but exponential. It is similar to the figure below. When the end point is searched, the opening is already very large, which means that there are many invalid searches.

Two-way BFS refers to searching from the starting point and ending point to the middle at the same time (note that it does not mean synchronization at the same time). For each part, it can avoid the large increase in the number of states with the increase of the exponential level, thereby speeding up the search.

Title description

Given two strings A, B and a set of string transformation rules, ask for at least several transformations from A to B

String transformation rules

abc->xu
Refers to: in the string
abc
Can be replaced with
xu

Topic analysis

If you follow the general BFS of one-way search, this question is exactly the same as the question solution mentioned in the general practice.

If you follow the two-way BFS, the search direction should be selected from the start point to the end point and the end point to the start point. The number of the current search queues is less to reduce the number of search states. The rest of the process is the same as the one-way search.

Problem-solving code (C++)

One-way BFS:

/**
* The idea of  searching is not difficult to think of, each state is a string
* Expansion is to use every rule to replace
* The difficulty lies in how to use each rule
*
* After reading the explanation, I found that I want to be complicated. The use of rules is the general processing of strings. There is no complicated string algorithm involved. I feel that complexity is completely fearful, just like the previous problem of trees and graphs.
*
* The core test point of this question is obviously not how to use the conversion rules. Due to the excessive number of states, the general bfs will time out, and it should search from the start and end to the middle at the same time
* If it is a one-way search, the increase in the number of states is not at a linear level, but at an exponential level. The difference between a one-way search and a search from both ends to the middle is similar to a^b and 2*a^(b/2) difference
* It will greatly reduce the number of search states. For a clearer understanding, first implement one-way bfs
*
* After writing the code, I really found that it is not my illusion that the string problem is more difficult. It is really different from other problems. It is easy to make mistakes or not understand on some problems.
*/

//One-way bfs
#include <iostream>
#include <algorithm>
#include <queue>
#include <unordered_map>

using namespace std;

const int N = 10;

int n;
string A, B;
string a[N], b[N];
queue<string> q;
unordered_map<string, int> dis;

/**
* The first time I encountered a problem, when I replaced a with b, I only replaced the first a and returned directly.
* I did think of this problem before writing, what if there are multiple matching a
* Actually, only one replacement is currently available. After putting it in the queue, it will naturally continue to replace if it encounters this one, replacing one, two, and so on. . . Can be searched for
* But if there are multiple positions that match, although only one is replaced, all the replacements of each position should be put into the queue instead of the first matching position
*/
void extend(string t, string a, string b)//replace the part of a in t with b
{
for (int i = 0; i <t.size(); ++ i)
if (t.substr(i, a.size()) == a)
{
string ex = t.substr(0, i) + b + t.substr(i + a.size());

if (dis.count(ex)) continue;

q.push(ex);
dis[ex] = dis[t] + 1;
}
}
int bfs()
{
q.push(A);
dis[A] = 0;

while (q.size())
{
string t = q.front();
q.pop();

if (t == B) return dis[t];

for (int i = 0; i <n; ++ i)
extend(t, a[i], b[i]);
}

return 11;
}
int main()
{
cin >> A >> B;

while (cin >> a[n] >> b[n]) ++ n;

int t = bfs();

if (t> 10) cout << "NO ANSWER!" << endl;
else cout << t << endl;

return 0;
}
Copy code

Two-way BFS:

/**
* Two-way bfs as an optimization of bfs
* Mainly used in the minimum number of steps model, that is, the problem of a large number of states to be searched
* The path problem in the chessboard does not need to use two-way bfs because the number of grid points will not be large, and the search time will be less
*
* Bidirectional bfs is to search from the starting point and the end point to the middle at the same time, which can greatly reduce the number of search states from a mathematical point of view
* Take elements from the queue with a small number of elements each time
*/
#include <iostream>
#include <algorithm>
#include <cstring>
#include <cstring>
#include <queue>
#include <unordered_map>

using namespace std;

const int N = 10;

int n;
string A, B;
string a[N], b[N];
queue<string> qa, qb;
unordered_map<string, int> da, db;

/**
* Consider the q queue, the distance of the elements inside is da, and the distance of the elements in the other queue is db, replace a with b
* Da, db must be added here, otherwise we don t know whether the distance of the element t of the q queue is da[t] or db[t]
*/
int extend(queue<string> &q, unordered_map<string, int> &da, unordered_map<string, int> &db, string a[], string b[])
{
//What we have to do is to expand all the elements in the current q, but writing in this way will also traverse the expanded points
//while (q.size())
for (int k = 0, qs = q.size(); k <qs; ++ k)
{
string t = q.front();
q.pop();

for (int i = 0; i <t.size(); ++ i)
for (int j = 0; j <n; ++ j)
if (t.substr(i, a[j].size()) == a[j])
{
string state = t.substr(0, i) + b[j] + t.substr(i + a[j].size());

if (da.count(state)) continue;
if (db.count(state)) return da[t] + 1 + db[state];//Once the intersection has been found, the distance can be returned, da[t] + 1 is da[state], because at this time da[ state] has not been updated yet so use da[t]

da[state] = da[t] + 1;
q.push(state);
}
}

return 11;//q did not find the intersection when expanding
}
int bfs()
{
qa.push(A); da[A] = 0;
qb.push(B); db[B] = 0;

/**
* The solvable condition is that walking from the start point to the end point and walking from the end point to the start point should be able to walk to the same state
* If you have already traveled in a certain direction and have not met, it means there is no solution
* Why is the above conclusion correct?
* For the convenience of description, we define the direction a to go from the starting point to the end point, and the direction b to go from the end point to the starting point.
* I consider a situation as if b has searched all the conditions, a has not searched, and the intersection has not been encountered at this time (the intersection is a state where both a and b have traveled, obviously the starting point and the end point are both Intersection point), is it possible that the intersection point is in b, but a has not reached the intersection point yet
* This idea is wrong, because if the problem is solvable, then from the direction b alone, it must eventually reach the starting point.
* Even if there is no movement at all in the a direction, it must be possible to walk to the starting point from the b direction alone, which becomes a one-way bfs
* In all cases of b, the intersection is not found after the search is completed (this includes the starting point), indicating that there is no path between the starting point and the end point, that is, there is no solution
*
* So the condition of bfs is that the two queues are not empty. During this period, if an answer is found, it means there is a solution, and a queue is already empty and no answer is found.
*/
while (qa.size() && qb.size())
{
int t;
if (qa.size() <= qb.size()) t = extend(qa, da, db, a, b);//For qa, replace a with b
else t = extend(qb, db, da, b, a);

if (t <= 10) return t;
}

return 11;
}
int main()
{
cin >> A >> B;

while (cin >> a[n] >> b[n]) ++ n;

int t = bfs();

if (t> 10) cout << "NO ANSWER!" << endl;
else cout << t << endl;

return 0;
}
Copy code

### Optimization-A*

How to understand A *

As an optimization, A* must be produced for a certain shortcoming of general BFS. In the minimum number of steps problem, the search space may be so large that using bidirectional BFS will still time out. This is because the optimization of two-way BFS only reduces the size of the search space from the mathematical dimension. Compared with the general BFS, the search logic has not changed. The logic they use is to first search for the state closer to the starting point, but these states reach the end point. The distance may be very large, and they are not the final answer. Searching for these states first wastes a lot of time. A* is to change this search strategy that only looks at local factors to look at global factors, not only referring to the length of the previous road but also the length of the back path, that is, the priority is to search for the path length + distance from the starting point The state where the distance from the end point is smaller, these states are obviously easier to reach the end point.

• From the point of view of code implementation, A* changes the general BFS queue to a priority queue. The queue can guarantee the bipartiteness and monotonicity of the sequence only when the path lengths are equal, that is, to ensure the correctness of the search. However, the length of the path from the starting point + the length of the distance from the end is obviously not all equal, and the search has priority, so you need to use the priority queue
• From a logical point of view, A* introduces an evaluation function, which changes the search order of BFS. From the starting point a to the state b, the distance from b to a is known, but the distance to the end is unknown, but we need to use this value to determine the priority of the search, so an evaluation function is introduced as a state to arrive Estimated distance to the end point . The estimated distance is not calculated on the answer, it just changes the search order

Valuation function

From the above analysis, we can know that the key to A* is to find the valuation function. The selection of the valuation function has the following two rules and properties

1. The smaller the deviation between the estimated value and the true value, the better the optimization effect

It s not difficult to understand, when valuation$==$ True value, the search path is the best path, go directly to the end

2. $Estimated value true value$

Consider there are two states$a$ and$b$ from$a$ and$b$ can reach the end,$The estimated value of the two are respectively> their own true value$ , and there is$Valuation a> Valuation b$ ,$True value a , according to the estimated value, the first search state B, and to search for the end point, the end of the search process. Obviously the result obtained is wrong. A shorter distance can be obtained from the state a to the end point, but because$The estimated value of a>the estimated value of b$ , so choose the wrong search order. So guarantee$Estimated value true value$ meaning that when$the true value of a , it is possible to ensure that$Estimated value of a ; when$the true value of a> the true value of b$ , i.e.$The true value of b when, to ensure$Estimated value of b , both in the above cases, can be guaranteed according to the size of the selected search path estimated value must be correct.

It should be noted that$2$ the discussion of , the size relationship of the estimated value can be obtained according to the size relationship of the real value, but the opposite is not true. In the above proof, we considered$the true value of a and$the true value of a> the true value of b$ two cases, both of which already contains all possible, and$Estimated value true value$ under the conditions can be determined in both cases can get the right answer, it proved to be correct. However, the actual execution process is to determine the order of search according to the size relationship of the estimated value, but the real value of the state with a small estimated value may be greater, but the correct answer can still be obtained in the end. Consider the two states a and b, satisfying$Estimated value a ,$True value a> true value b$ , from the beginning of a search, although walking the wrong path, but before the end of the team is not a state c (the state may be the end, when a and b are directly connected to the end, here refers to the state The estimated value of the end point) will be> the estimated value of b ($Estimated value true value$ ,$True value b , it is possible to obtain$Estimated value b ), so that once again will be from b to complete the correction process.

Issues that need attention

A* has higher search efficiency only when it is guaranteed to have a solution, and it is not even as efficient as the general BFS when there is no solution. This is because when there is no solution, A* has to search the searchable state like general BFS, but BFS uses a queue, and both read and write$O(1)$ , but A* uses a priority queue, the write operation in the heap is$O(logn)$ , so the efficiency will be worse. This explains why in the following A* code, it is necessary to first determine whether there is a solution before starting the search.

Of course, in actual problems, it is difficult to judge whether there is a solution before the problem is solved, and the use of A* can only be solved by general methods, holding a try-and-see attitude

Title description

in a $3 3$ the grid of ,$1 8$ this$8$ numbers and one

X
It just happens to be distributed here $3 3$ in the grid.

E.g:

1 2 3
X 4 6
7 5 8
Copy code

Can put

X
Exchange with numbers in one of the four directions up, down, left, and right (if it exists)

The purpose is to make the grid into the following arrangement (called the correct arrangement) through exchange:

1 2 3
4 5 6
7 8 X
Copy code

Put

X
The action record of digital exchange with the up, down, left, and right directions is
u
,
d
,
l
,
r

Give you an initial grid, ask at least a few moves to get the correct arrangement. If there is a solution, output the complete action record with the correct arrangement. If the answer is not unique, output any legal solution. If there is no solution, output

unsolvable

Topic analysis

The reason why A* is used instead of general BFS has been mentioned in the introduction of A* above, so I won t repeat it here, and consider the implementation of A* directly.

The key is to determine the valuation function to ensure$Estimated value true value$ for a state, is converted into the correct number of each alignment needs to move to the correct position, only a few, the mobile moves to a minimum number of target position is the current position of the Manhattan distance from a target position, so we consider The sum of the Manhattan distances of all numbers from their respective target locations is used as the estimated value, which will surely be able to guarantee$Estimated value true value$ , since only a minimum number of times for mobile Manhattan distance, but taking into account the number of the plurality may require some additional operations,$The true value is/geqManhattan distance and$ the.

For the eight-digit problem, the necessary and sufficient condition that there must be a solution is that the two-dimensional array is read from left to right in line-first order. The number of reverse pairs of the sequence obtained must be an even number

• Proof of necessity (the problem has a solution, indicating that the number of reversed pairs must be even): because intra-row movement does not change the number of reversed pairs, and the change of the number of reversed pairs between rows is even, so any transformation will not change the parity of the number of reversed pairs. , The number of reversed pairs of the target sequence is 0, that is, an even number, so only when the number of reversed pairs of the initial state sequence is an even number, it is possible to turn into the target state

• Proof of sufficiency (the problem must be solved if the number of initial inverse pairs is even): it is difficult to prove

Problem-solving code (C++)

General BFS:

#include <iostream>
#include <algorithm>
#include <queue>
#include <unordered_map>

using namespace std;

string bfs(string start)
{
string end = "12345678x";
queue<string> q;
unordered_map<string, int> dis;
unordered_map<string, pair<string, char>> prev;
char op[] = {'u','r','d','l'};
int dx[] = {-1, 0, 1, 0}, dy[] = {0, 1, 0, -1};

q.push(start);
dis[start] = 0;

while (q.size())
{
string t = q.front(); q.pop();
int k = t.find('x');
int x = k/3, y = k% 3;

if (t == end)
{
string res("");
while (end != start)
{
res += prev[end].second;
end = prev[end].first;
}
reverse(res.begin(), res.end());
return res;
}
for (int i = 0; i <4; ++ i)
{
int a = x + dx[i], b = y + dy[i];
string str(t);
swap(str[k], str[a * 3 + b]);
string state =str;

if (a <0 || a >= 3 || b <0 || b >= 3) continue;
if (dis.count(state)) continue;

dis[state] = dis[t] + 1;
prev[state] = {t, op[i]};
q.push(state);
}
}

return "unsolvable";
}
int main()
{
string start("");

char c;
while (cin >> c) start += c;

string t = bfs(start);

cout << t << endl;

return 0;
}
Copy code

A*optimization

#include <iostream>
#include <algorithm>
#include <queue>
#include <unordered_map>

using namespace std;

int f(string state)//Valuation function-get the estimated value of state
{
int res = 0;
for (int i = 0; i <state.size(); ++ i)
if (state[i] !='x')
{
int t = state[i]-'1';
res += abs(i/3-t/3) + abs(i% 3-t% 3);
}
return res;
}
string bfs(string start)
{
string end = "12345678x";
char op[] = {'u','r','d','l'};
int dx[] = {-1, 0, 1, 0}, dy[] = {0, 1, 0, -1};

unordered_map<string, int> dis;
unordered_map<string, pair<string, char>> prev;
priority_queue<pair<int, string>, vector<pair<int, string>>, greater<pair<int, string>>> heap;

dis[start] = 0;
heap.push({f(start), start});

while (heap.size())
{
auto t = heap.top(); heap.pop();
string state = t.second;
int k = state.find('x');
int x = k/3, y = k% 3;

if (state == end) break;
for (int i = 0; i <4; ++ i)
{
int a = x + dx[i], b = y + dy[i];
if (a <0 || a >= 3 || b <0 || b >= 3) continue;

string tmp(state);
swap(tmp[k], tmp[a * 3 + b]);

if (!dis.count(tmp) || dis[tmp]> dis[state] + 1)//Previously, except for dijkstra, dis[tmp] was directly updated without judgment, and dijkstra was implemented using arrays Initialization is carried out. There is no initialization here. Although the default value is 0, it is still necessary to judge whether it does not exist from a rigorous point of view.
{
dis[tmp] = dis[state] + 1;
prev[tmp] = {state, op[i]};
heap.push({dis[tmp] + f(tmp), tmp});//Note that what we store should be the distance from the start point to the state + the estimated distance from the state to the end point, that is, the final answer is considered, follow The size of the final answer is the optimal search order
}
}
}

string res("");
while (end != start)
{
res += prev[end].second;
end = prev[end].first;
}
reverse(res.begin(), res.end());//remember to reverse
return res;
}
int main()
{
char c;
string start, seq;

while (cin >> c)
{
start += c;
if (c !='x') seq += c;
}

int cnt = 0;
for (int i = 0; i <seq.size(); ++ i)
for (int j = i + 1; j <seq.size(); ++ j)
if (seq[i]> seq[j]) ++ cnt;

if (cnt & 1) cout << "unsolvable" << endl;
else cout << bfs(start) << endl;

return 0;
}
Copy code

## Multi-source BFS

Title description

Given a $N$ row$M$ column$01$ matrix$A$ ,$A[i][j]$ and$A[k][l]$The Manhattan distance between is defined as:

$dist(A[i][j],A[k][l])=|i k|+|j l|$

Output one $N$ row$M$ column integer matrix$B$ , where:

$B[i][j]=min_{1 x N,1 y M,A[x][y]=1}dist(A[i][j],A[x][y])$

Topic analysis

In layman's terms, the topic is to find the minimum Manhattan distance between all 0 and 1 in the matrix.

The forward thinking is to traverse the matrix, starting from all 0s, BFS finds the shortest distance from 1, and the result is no error but it will be TLE

As the saying goes, "If the forward solution is difficult, try the reverse solution"

The reverse idea is obviously to start BFS from all 1s, and update the answers of each 0 in the process. The correctness of this proof is more difficult to think. It is very clever to introduce the concept of a virtual source point , which is the distance from all 1s. All 0. For all 0s, the closest distance to 1 is the closest distance to the virtual source point. So the problem has been transformed into a single source shortest path problem starting from the virtual source point, that is, Dijkstra. Dijkstra is correct, So obviously this approach is also correct.

Problem-solving code (C++)

Forward solution:

#include <iostream>
#include <algorithm>
#include <cstring>

using namespace std;

typedef pair<int, int> PII;

const int N = 1010, M = N * N;

int n, m;
int g[N][N];
char ch[N][N];
PII q[M];
bool st[N][N];
int dis[N][N], step;

void bfs(int sx, int sy)
{
int hh = 0, tt = -1;
int dx[] = {-1, 0, 1, 0};
int dy[] = {0, 1, 0, -1};

memset(st, 0, sizeof st);
st[sx][sy] = true;
q[++ tt] = {sx, sy};
step = 0;

while (tt >= hh)
{
int size = tt-hh + 1;
while (size --)
{
PII t = q[hh ++];
int a = t.first, b = t.second;

if (g[a][b])
{
dis[sx][sy] = step;
return;
}
for (int i = 0; i <4; ++ i)
{
int x = a + dx[i], y = b + dy[i];
if (x <0 || x >= n || y <0 || y >= m || st[x][y]) continue;

st[x][y] = true;
q[++ tt] = {x, y};
}
}
++ step;
}
}
int main()
{
cin >> n >> m;

for (int i = 0; i <n; ++ i) cin >> ch[i];

for (int i = 0; i <n; ++ i)
for (int j = 0; j <m; ++ j)
g[i][j] = ch[i][j]-'0';

for (int i = 0; i <n; ++ i)
for (int j = 0; j <m; ++ j)
if (!g[i][j]) bfs(i, j);

for (int i = 0; i <n; ++ i)
{
for (int j = 0; j <m; ++ j)
cout << dis[i][j] << '';
cout << endl;
}

return 0;
}
Copy code

Reverse solution:

#include <iostream>
#include <algorithm>
#include <cstring>

using namespace std;

typedef pair<int, int> PII;

const int N = 1010, M = N * N;

int n, m;
char g[N][N];
int dis[N][N];
PII q[M];

void bfs()
{
int dx[] = {-1, 0, 1, 0}, dy[] = {0, 1, 0, -1};
int hh = 0, tt = -1;
memset(dis, -1, sizeof dis);

for (int i = 0; i <n; ++ i)
for (int j = 0; j <m; ++ j)
if (g[i][j] == '1')
{
dis[i][j] = 0;
q[++ tt] = {i, j};
}

while (tt >= hh)
{
PII t = q[hh ++];
int a = t.first, b = t.second;

for (int i = 0; i <4; ++ i)
{
int x = a + dx[i], y = b + dy[i];
if (x <0 || x >= n || y <0 || y >= m || dis[x][y] != -1) continue;

dis[x][y] = dis[a][b] + 1;
q[++ tt] = {x, y};
}
}

}
int main()
{
cin >> n >> m;
for (int i = 0; i <n; ++ i) cin >> g[i];

bfs();

for (int i = 0; i <n; ++ i)
{
for (int j = 0; j <m; ++ j) cout << dis[i][j] << '';
cout << endl;
}

return 0;
}
Copy code

## Difference and connection

Here we mainly discuss the difference and connection between Dijkstra, general BFS, double-ended queue BFS and A*

contact

No matter which solution is used, it is essentially the single-source shortest path problem, so it is essentially Dijkstra s idea

the difference

algorithmfeature
Dijkstra (heap optimized version)Enter the team multiple times, the shortest path when each point leaves the team for the first time
General BFSEnter the team once, the shortest path is the first time each point leaves the team
Deque BFSEnter the team multiple times, the shortest path when each point leaves the team for the first time
A*Entering the team many times, and leaving the team at other points except the end point is not necessarily the shortest path. The shortest path is the first time out of the end point

Although the idea of Dijkstra algorithm is used in essence, the actual execution process is different due to different conditions. Next, the general BFS, double-ended queue BFS and A* will be discussed separately.

A certain point can be put into the queue to indicate that the distance value of the point is updated

• General BFS

The reason for entering the queue once is that the path length is unique (assumed to be 1). There are two points a and b in the queue, and the distance between them and point c is 1. If$d_a == d_b$, Obviously$d_a + 1 = d_b + 1$ Since the extended distances are equal, point c only needs to join the team once, if$d_a , Obviously$d_a + 1 The distance does not need to be updated for the second time after joining the team for the first time, so it is still only one time for joining the team.

In the shortest path problem, it has been proved that the corresponding path when each point leaves the queue is the shortest path.

• Deque BFS

The reason for entering the queue multiple times is that the path length is not unique (assuming 0 and 1), and there are two points in the queue$d_a == d_b$, The two distances to point c are 1 and 0 respectively. Obviously$d_a + 1> d_b + 0$ , so point c will join the team twice.

Like general BFS, it has two-stage and monotonicity, so it can be proved that each point is the shortest path when it leaves the queue for the first time (the proof method is the same as that of general BFS)

• A*

The reason for multiple entries is that the path length is not unique

Since our search order is always based on the estimated distance from the end point, the priority is to search for points with a small estimated distance from the end point, and the estimated distance considers the entire section rather than just the distance from the start point, so we can only Ensure that the shortest distance is obtained when the end point is right, and other points cannot be guaranteed