A good scenario-based Git best practice manual (multilingual): github.com/k88hudson/g...
Git version control and storage model
What is Git and why use Git
Imagine a scenario: you have written a file whose version is 1.0. Now you need to modify and optimize the content. You are afraid of losing the original file, so you copy the 1.0 version of the file and make changes on this copy.
Two files are easy enough to manage. In real study and work, however, a project usually contains many files, each of which is modified many times. In a team, everyone also has to keep their progress in sync and make sure that each person's updates are communicated without error. Doing all of this by hand makes version management very difficult and costs a great deal of mental effort and labor.
Git was born to solve this problem. Git is a version control system that records every update you commit, so nobody has to create copies, merge updates, or manage version history by hand. In team collaboration, no manager needs to distribute history and new versions to the other members. Project members can clone or pull a copy of the same history, create their own working branches, and push their updates back. After each member finishes a task and pushes it, the manager reviews these branches through Git, merges them, and completes the version update of the whole project.
Distributed system, localized storage
Git is designed as a distributed version control system. Every machine participating in the project holds a complete repository, and commits and other changes to the version history are made locally. On your own machine you can therefore switch versions and branches quickly, without fetching version information over the network.
The network is needed only when you interact with the version history on other machines, for example when cloning, pushing to a remote, or collecting and merging everyone's work.
As for the so-called "central server", it exists only for convenience: a server holding a clean, complete version history makes it easy to publish, manage, and standardize the project. It does not mean that Git is not a distributed system.
Storing snapshots over time
Every time you commit, that is, save the state of the project, Git takes a snapshot of all the files at that moment and stores a reference to it. Files that have not changed are not stored again; Git simply keeps a link to the identical file it has already stored. When you switch between versions, Git therefore only has to follow a different reference instead of recomputing differences, which is why switching versions in Git is fast.
Git divides the files under version control into three areas: the working directory, the staging area, and the repository.
- Working directory: the files we are currently editing, which have not yet been added to the staging area
- Staging area: git add saves the current state of files in the working directory here as a candidate for the next commit; you can run it repeatedly, and each run overwrites the staged copy
- Repository (the Git repository): git commit saves the contents of the staging area as a permanent version record. Records in the repository cannot be modified casually (a version rollback only moves a pointer back and does not actually delete content).
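To make the three areas concrete, here is a minimal sketch in a throwaway repository. The file name notes.txt, the temporary directory, and the placeholder identity are assumptions made for illustration only. It shows that a commit records what was staged, not whatever happens to be in the working directory at commit time.

```shell
set -e
repo=$(mktemp -d)                      # throwaway directory for the demo
cd "$repo"
git init -q
git config user.name "Example User"    # placeholder identity, not a real account
git config user.email "user@example.com"

echo "first draft" > notes.txt         # exists only in the working directory
git add notes.txt                      # the staging area now holds "first draft"
echo "second draft" > notes.txt        # edit again, but do not stage the new edit
git commit -q -m "Add notes"           # records the staged snapshot

committed=$(git show HEAD:notes.txt)   # what the repository recorded
working=$(cat notes.txt)               # what is still only in the working directory
echo "committed: $committed"
echo "working:   $working"
```

Running git add notes.txt again before committing would have overwritten the staged copy with the second draft.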
Configure user information after installation
Set your user name and email address. This setting is required: every Git commit records who made it.
git config --global user.name "John Doe"
git config --global user.email "email@example.com"
You can check the configuration with the following command; other configuration parameters are described in the official documentation.
git config --list
git help config
git add -h
The commands above show the help for a specific command in the console.
It is also convenient to look up commands on the official Git website, in technical forums, or with a search engine.
Get the Git repository
There are two ways to create a Git repository locally, either by initializing it locally or by cloning someone else's repository:
Initialize a repository in an existing project directory
Enter your project directory and enter the following command to let Git take over the files in the current directory and its subdirectories.
git init
This step creates a .git subdirectory, which holds all of the repository's data.
Clone a repository from another server
git clone https://github.com/libgit2/libgit2 mylibgit
The following figure shows the state change cycle of the file:
- Untracked: the file is not tracked. This usually happens when a file is newly created, or after it has been removed from both the staging area and the repository's records
- Unmodified: the file is tracked and has not changed since the last commit
- Modified: the file has been changed in the working directory, but the change has not yet been added to the staging area
- Staged: the change has been added to the staging area but not yet committed to the repository
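The states above can be observed with the short form of git status. This is only a sketch: the file names and the placeholder identity are assumptions. In the short format the two leading columns describe the staging area and the working directory respectively.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"          # placeholder identity
git config user.email "user@example.com"

echo "a" > tracked.txt
git add tracked.txt
git commit -q -m "Add tracked.txt"           # tracked.txt is now unmodified

echo "b" > new.txt                           # a brand-new file: untracked
untracked=$(git status --short new.txt)

git add new.txt                              # now staged
staged=$(git status --short new.txt)

echo "changed" >> tracked.txt                # modified, but not staged
modified=$(git status --short tracked.txt)

printf '%s\n' "$untracked" "$staged" "$modified"
```

Here ?? marks an untracked file, A in the first column a newly staged file, and M in the second column a modification that has not been staged.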
View file status
git status
On branch master
Your branch is up-to-date with 'origin/master'.
nothing to commit, working directory clean
The sample output indicates that the working directory is "clean": there are no new or modified files. Note the word master: it is the branch we are currently on, and also the default main branch. Branches are explained later.
File tracking/adding to staging area
git add *.c
git add LICENSE
If a file is not yet tracked, git add starts tracking it and places it in the staging area. To stage all changes in the working directory at once, use:
git add --all
Commit to the repository
git commit -m 'initial project version'
To commit directly from the working directory in one step, use git commit -a, which stages every modified tracked file before committing.
Ignoring files with .gitignore
Some files, such as temporary files, toolkits used during development, and log files, generally do not need to be managed by Git. In that case, create a file named .gitignore in the project root directory and list the patterns to ignore:
*.[oa]
*~
# ignore all .a files
*.a
# but keep tracking lib.a, even though .a files are ignored above
!lib.a
# ignore only the TODO file in the current directory, not subdir/TODO
/TODO
# ignore everything in any directory named build
build/
# ignore doc/notes.txt, but not doc/server/arch.txt
doc/*.txt
# ignore .pdf files in the doc/ directory and all of its subdirectories
doc/**/*.pdf
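One way to check how such patterns behave is to list the untracked files that Git does not ignore. The sketch below uses a subset of the patterns above; the sample file names are assumptions made up for the demonstration.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q

# A subset of the patterns discussed above.
printf '%s\n' '*.a' '!lib.a' '/TODO' 'build/' 'doc/*.txt' > .gitignore

# Hypothetical files that exercise each pattern.
mkdir -p subdir build doc/server
touch foo.a lib.a TODO subdir/TODO build/out.o doc/notes.txt doc/server/arch.txt

# Untracked files that are NOT ignored, sorted bytewise for a stable order.
visible=$(git ls-files --others --exclude-standard | LC_ALL=C sort)
echo "$visible"
```

Only .gitignore, doc/server/arch.txt, lib.a, and subdir/TODO remain visible; foo.a, the top-level TODO, everything under build/, and doc/notes.txt are ignored.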
git diff
The command above compares the working directory with the snapshot in the staging area, that is, it shows changes that have not yet been staged. To compare the staging area with the last commit instead, use:
git diff --staged
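The difference between the two comparisons is easiest to see when a file has both a staged and an unstaged change. A small sketch follows; the file name and its contents are assumptions.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"          # placeholder identity
git config user.email "user@example.com"

echo "line 1" > file.txt
git add file.txt
git commit -q -m "Add file.txt"

echo "line 2" >> file.txt
git add file.txt                             # "line 2" is staged
echo "line 3" >> file.txt                    # "line 3" is not staged

# Keep only the added content lines of each diff.
unstaged=$(git diff | grep '^+line')            # working directory vs. staging area
staged=$(git diff --staged | grep '^+line')     # staging area vs. last commit
echo "unstaged: $unstaged"
echo "staged:   $staged"
```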
git rm filename
The command above deletes the file from both the working directory and the staging area.
If the file has been modified or already added to the staging area, add the -f option to force the removal.
If you want to remove a file from Git's management but keep it in the working directory, add the --cached option.
Git does not explicitly track file moves. If you simply rename a file, Git sees a deleted file and a new file. To rename a file in Git, use:
git mv README.md README
This is equivalent to the following three commands:
mv README.md README
git rm README.md
git add README
git status
On branch master
Your branch is up-to-date with 'origin/master'.
Changes to be committed:
  (use "git reset HEAD <file>..." to unstage)

    renamed:    README.md -> README
git log
commit 085bb3bcb608e1e8451d4b2432f8ecbe6306e7e7
Author: Scott Chacon <firstname.lastname@example.org>
Date:   Sat Mar 15 16:40:33 2008 -0700

    removed unnecessary test

commit a11bef06a3f659402fe7563abf99ad00de2209e6
Author: Scott Chacon <email@example.com>
Date:   Sat Mar 15 10:31:28 2008 -0700

    first commit
Commits are listed in reverse chronological order, newest first, each with its hash checksum, author information, and commit message.
To display each commit on one line, use --pretty=oneline:
git log --pretty=oneline
15027957951b64cf874c3557a0f3547bd83b3ff6 Merge branch 'experiment'
a6b4c97498bd301d84096da251c98a07c7723e65 beginning write support
0d52aaab4479697da7686c15f77a3d64d9165190 one more thing
A more common approach is to combine a custom format with --graph:
git log --pretty=format:"%h %s" --graph
* 2d3acf9 ignore errors from SIGCHLD on trap
*   5e3ee11 Merge branch 'master' of git://github.com/dustin/grit
|\
| * 420eac9 Added a method for getting the current branch.
* | 30e367c timeout code and tests
* | 5a09431 add timeout protection to grit
* | e1193f8 support for heads with slashes in them
|/
* d6016bc require time for xmlschema
*   11d191e Merge branch 'defunkt' into local
For specific parameters, please check the official document: git-scm.com/book/en/v2/...
If you find after committing that some files contain mistakes, or that you forgot to stage some files, make the fix, add it to the staging area, and then run:
git commit --amend
For example:
git commit -m 'initial commit'
git add forgotten_file
git commit --amend
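The following sketch shows the effect of amending: the history still contains a single commit, which now includes the forgotten file. The file names and the placeholder identity are assumptions, and --no-edit is added here only so that no editor opens while the example runs.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"          # placeholder identity
git config user.email "user@example.com"

echo "main" > main.c
git add main.c
git commit -q -m "initial commit"

echo "forgotten" > forgotten_file
git add forgotten_file
git commit -q --amend --no-edit              # replace the last commit, keep its message

count=$(git rev-list --count HEAD)           # number of commits in the history
files=$(git ls-tree --name-only HEAD)        # files recorded in the amended commit
echo "commits: $count"
echo "$files"
```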
Suppose you have added README.md to the staging area with git add and want to take it out again (unstage it):
git reset HEAD README.md
Undo the modification of the workspace
If you want to discard the changes made to a file in the working directory, you can overwrite it with the most recently committed version, which resets the modification:
git checkout -- README.md
If we made a wrong commit, or abandoned an approach and need to return to an earlier version and continue from there, we perform a version rollback:
git reset --hard HEAD^
Here HEAD^ refers to the parent of the current commit. To go back to a specific version, give its commit hash instead:
git reset --hard a721c9
Note that the hash does not have to be 6 characters long: Git finds the commit whose hash begins with the characters we enter. If the prefix is unique, even the first 4 characters are enough; if it is ambiguous, enter a few more.
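As a rough illustration of rolling back with an abbreviated hash, the sketch below resets to the first of two commits and then checks that the abandoned commit object still exists, in line with the earlier point that a rollback moves a pointer rather than deleting content. The file name and the placeholder identity are assumptions.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"          # placeholder identity
git config user.email "user@example.com"

echo "version 1" > file.txt
git add file.txt
git commit -q -m "Version 1"
first=$(git rev-parse --short HEAD)          # abbreviated hash of the first commit

echo "version 2" > file.txt
git commit -q -am "Version 2"
second=$(git rev-parse HEAD)                 # full hash of the commit we will leave

git reset -q --hard "$first"                 # roll back using only the short hash

content=$(cat file.txt)                      # the file is back at version 1
commits=$(git rev-list --count HEAD)         # the branch history has one commit again
still_there=$(git cat-file -t "$second")     # the old commit object still exists
echo "$content / $commits commit(s) / abandoned object type: $still_there"
```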
Tagging commits
What is a tag?
Generally speaking, a tag marks a particular commit with a label, name, or version number, such as v1.0 or v2.0-beta.
Listing tags
It is very simple; for example:
git tag
v0.1
v1.0
v1.1
Git has two types of tags, lightweight and annotated.
The former is just a reference to a commit, while the latter carries more information, such as the tagger's name, email address, and the date and time. The official website recommends creating annotated tags because they carry more traceable information.
- An example of creating an annotated tag:
git tag -a v1.4 -m "my version 1.4"
The command above uses -a to create an annotated tag and -m to specify the tag message.
- An example of creating a lightweight tag:
git tag v1.4
A tag created without any options is a lightweight tag.
- An example of tagging a past commit:
git tag -a v1.2 9fceb02
Just append the hash of the commit.
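The difference between the two kinds of tag can be inspected with git cat-file -t, which prints the type of the object a name refers to. In this sketch the tag name v1.3 and the file are assumptions added for illustration.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"          # placeholder identity
git config user.email "user@example.com"

echo "release" > VERSION
git add VERSION
git commit -q -m "Prepare release"

git tag v1.3                                 # lightweight: just a name for the commit
git tag -a v1.4 -m "my version 1.4"          # annotated: a separate tag object

light=$(git cat-file -t v1.3)                # type of what v1.3 refers to
annotated=$(git cat-file -t v1.4)            # type of what v1.4 refers to
echo "v1.3 -> $light, v1.4 -> $annotated"
```

The lightweight tag resolves directly to a commit, while the annotated tag is an object of its own that stores the tagger, date, and message.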
By default, tags are not transferred to the remote repository server, so you need to push them explicitly:
git push origin v1.4
The command above pushes the v1.4 tag to the remote repository origin (origin is explained later in the section on remote repositories).
To push many tags at once, use:
git push origin --tags
The command above pushes all tags that are not yet on the remote server.
Delete a local tag:
git tag -d v1.4
Delete a remote tag:
git push origin --delete v1.4
Branching is the core Git feature to master.
In the theoretical part at the beginning of this document, we mentioned that Git creates snapshots for the different versions of files and uses pointers to those snapshots to switch versions quickly.
So the concept of a [branch] is actually very simple. Suppose we have the original version of a file, called V0, and the default pointer master points to it. When we commit an update, V1, the two versions form an ordered chain V0 -> V1 (V1 records V0 as its parent), and master moves to V1. Checking out master now shows the V1 version of the file.
Now another member joins. The two of you decide to each develop one part and then merge, so each of you clones the project from the central server. Suppose you update the V1 file to V2-1, represented by the pointer dev1, and your colleague updates it to V2-2, represented by the pointer dev2. History now forks after V1: from where master points there are two paths forward, but each path holds only part of the work. How do we merge them into the real V2 and then make master point to V2?
The two adopt the following scheme: create a new pointer called pre (for pre-release) that points to the same version as master, namely V1. Both then merge their parts into pre, which now becomes V2. Let us review how all the pointers have changed:
master: V0 -> V1
dev1: master -> V2-1
dev2: master -> V2-2
pre: master -> V2 (merges the file contents of dev1 and dev2)
The way the versions of the file fan out like this is why it is called a [branch], a very vivid name. After checking that everything is fine, we let the master pointer follow the path to pre, and master now points to V2! To continue updating, pull master again and repeat the process.
The above content involves operations such as branch creation, switching, and merging.
HEAD is a pointer that can be understood intuitively as always pointing to the latest commit of the current branch.
In the operations above we switched between different branches, and under each branch we [see] a different version of the files. To [see] master, dev1, or dev2 we need an [eye], and that [eye] is the HEAD pointer. It plays the same role as the "current" pointer we often use when writing algorithms over data structures such as trees and linked lists.
For example, when HEAD points to dev1, what we see is the latest file snapshot of dev1, and when it points to master, what we see is the file snapshot of master.
git branch dev1
The command above creates a branch named dev1 that points to the commit HEAD currently points to.
git branch
  dev1
* master
git switch dev1
If you want to create a branch and switch to it in one step, you can use git switch -c dev1.
When you commit a new version on the dev1 branch, both the dev1 and HEAD pointers move forward, but the master branch stays where it was.
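The pointer movement described above can be verified with git rev-parse, which prints the commit a name points to. This is a sketch under a few assumptions: the file name is made up, the initial branch is forced to be called master, and git checkout is used so that the example also runs on Git versions older than 2.23 (git switch dev1 would do the same).

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master      # make sure the first branch is named master
git config user.name "Example User"          # placeholder identity
git config user.email "user@example.com"

echo "V0" > file.txt
git add file.txt
git commit -q -m "V0"

git branch dev1                              # new pointer at the current commit
git checkout -q dev1                         # HEAD now follows dev1
echo "V1" > file.txt
git commit -q -am "V1 on dev1"

master_tip=$(git rev-parse master)           # did not move
dev1_tip=$(git rev-parse dev1)               # moved forward with the new commit
head_tip=$(git rev-parse HEAD)               # moved together with dev1
dev1_parent=$(git rev-parse dev1^)           # the commit dev1 started from
echo "master=$master_tip dev1=$dev1_tip HEAD=$head_tip"
```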
Branch merge without divergence
Now suppose development on the dev1 branch has been completed and committed, and we need to work on a second requirement starting from master. First we switch back to the master branch:
git switch master
HEAD now points to master again, and the working directory is restored to the state of master. We create a dev2 branch, make changes, and commit them, following the same steps as for dev1.
Now we merge the new content of dev2 into master:
git switch master
git merge dev2
Updating f42c576..3a0874c
Fast-forward
 index.html | 2 ++
 1 file changed, 2 insertions(+)
dev2 is no longer needed, so we delete it.
git branch -d dev2
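A fast-forward merge creates no new commit; master simply moves to the commit dev2 already points to. The sketch below checks exactly that. The file contents and the placeholder identity are assumptions, and git checkout is used in place of git switch only for compatibility with older Git versions.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master      # make sure the first branch is named master
git config user.name "Example User"          # placeholder identity
git config user.email "user@example.com"

echo "<h1>home</h1>" > index.html
git add index.html
git commit -q -m "Base page"

git checkout -q -b dev2                      # branch off master
echo "<p>new section</p>" >> index.html
git commit -q -am "Add a section"
dev2_tip=$(git rev-parse dev2)

git checkout -q master
git merge -q dev2                            # master has no commits of its own: fast-forward
master_tip=$(git rev-parse master)

git branch -q -d dev2                        # safe to delete, its work is in master
leftover=$(git branch --list dev2)           # empty once the branch is gone
echo "master now at $master_tip"
```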
Divergent branch merge
Now we want to bring the work from dev1 into master. This is clearly different from merging dev2: dev1 branched off from an earlier point in history and does not contain the changes made in dev2. If we simply moved the master pointer to dev1, the work from dev2 would be lost. Two situations can arise here:
Changes from the two branches conflict
For example, the two branches change the same part of the same file. This is very common: say dev1 adds a function at the top of a file and dev2 also adds a function at the top. What we want is to keep both functions.
git switch master
git merge dev1
Auto-merging Test1.txt
CONFLICT (content): Merge conflict in Test1.txt
Automatic merge failed; fix conflicts and then commit the result.
These messages indicate that the merge produced a conflict. If we open the file at this point, we find the following:
Git has automatically marked up the file so that it contains the changes from both dev1 and dev2, separated by conflict markers.
At this point we resolve the two parts by hand, for example by deleting the <<<<<<<, =======, and >>>>>>> marker lines and keeping both functions, and then stage the file and commit to complete the merge.
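The conflict scenario can be reproduced in a few lines. The sketch below is an illustration only: the function names written into Test1.txt and the placeholder identity are assumptions, master stands in for the branch that already contains the dev2 work, and the resolution simply keeps both added lines.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master      # make sure the first branch is named master
git config user.name "Example User"          # placeholder identity
git config user.email "user@example.com"

printf 'shared()\n' > Test1.txt
git add Test1.txt
git commit -q -m "Base version"

git checkout -q -b dev1
printf 'from_dev1()\nshared()\n' > Test1.txt
git commit -q -am "dev1: add a function at the top"

git checkout -q master
printf 'from_dev2()\nshared()\n' > Test1.txt
git commit -q -am "master: add another function at the top"

# Both branches inserted a different line at the same place, so the merge stops.
if git merge dev1 >/dev/null 2>&1; then conflict=no; else conflict=yes; fi
markers=$(grep -c '^<<<<<<< ' Test1.txt)     # conflict hunks Git wrote into the file

# Resolve by hand: keep both functions, drop the marker lines.
printf 'from_dev1()\nfrom_dev2()\nshared()\n' > Test1.txt
git add Test1.txt
git commit -q --no-edit                      # concludes the merge with the default message
second_parent=$(git rev-parse HEAD^2)        # a merge commit has a second parent
echo "conflict: $conflict, hunks: $markers"
```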
Changes from the two branches do not conflict
This situation is also very common. When several people write code, the work is usually divided by file or module, and different people edit different files. The merge then proceeds normally: Git performs a three-way merge of the two branch snapshots and their nearest common ancestor, and records the result as a new commit.
For example, suppose the change in dev2 was to create a new file rather than modify an existing one. We repeat the previous process and merge dev1 into master, which already contains dev2:
git switch master
git merge dev1
Merge made by the 'recursive' strategy.
 Test1.txt | 1 +
 1 file changed, 1 insertion(+)
Although master and dev1 are on different lines of history, no conflict occurs this time; the two sets of changes are simply combined into a new commit.
Branch management with rebase
As mentioned before, the usual way to integrate branches is git merge. Git offers a second way: rebase.
So what does rebase mean?
A common rebase example
According to the picture above, instead of performing a direct merge we can take the difference between C4 and C2, record it as a set of changes, and apply those changes on top of C3 to form a new commit that experiment then points to (rebase literally means "change the base"). After confirming that the new commit is correct, we can move master forward. This is what rebase does:
git checkout experiment
git rebase master
First, rewinding head to replay your work on top of it...
Applying: added staged command
experiment has now moved from the original C4 to C4', whose file contents are equivalent to the original C5. We then fast-forward master with git merge:
git checkout master
git merge experiment
Some people may ask what such an operation is good for. In terms of the final result, rebase and merge produce the same content; the difference is that rebase yields a linear commit history.
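The sketch below replays the simple rebase scenario and checks two things: the rebased commit gets a new hash, and the resulting history on master contains no merge commit. The commit messages reuse the labels C2, C3, and C4 from the discussion; the file names and the placeholder identity are assumptions.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master      # make sure the first branch is named master
git config user.name "Example User"          # placeholder identity
git config user.email "user@example.com"

echo "base" > base.txt
git add base.txt
git commit -q -m "C2"

git checkout -q -b experiment
echo "experiment" > exp.txt
git add exp.txt
git commit -q -m "C4"
old_tip=$(git rev-parse experiment)          # C4 before the rebase

git checkout -q master
echo "main" > main.txt
git add main.txt
git commit -q -m "C3"

git checkout -q experiment
git rebase -q master                         # replay C4 on top of C3, producing C4'
new_tip=$(git rev-parse experiment)

git checkout -q master
git merge -q experiment                      # fast-forward master to C4'
merges=$(git rev-list --count --merges master)   # merge commits in the history
total=$(git rev-list --count master)             # all commits in the history
echo "commits: $total, merge commits: $merges"
```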
A more complex rebase example
The example in the figure can be read as follows: at C2 the project splits into two parts. One is the underlying backbone code on master; the other is application development, which starts at C3 and in turn splits into two parts, the server branch and the client branch.
Suppose the client is finished and should go into master, but the server is not yet complete, so the client cannot be merged together with the server work and the shared commit C3. In this case we can use the --onto option of git rebase.
The specific code is as follows:
git rebase --onto master server client
This command takes the point where client and server diverged (C3) as the base, compares it with client to obtain the changes in C8 and C9, and replays them on top of master to form a new line of commits. The client pointer then points to the final result.
Then just move the master pointer as before:
git checkout master
git merge client
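Here is a scaled-down sketch of the same situation. The helper make_commit is hypothetical and creates one new file per commit, so that the replayed commits do not depend on each other; the commit labels follow the discussion above, with only some of the figure's commits recreated.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master      # make sure the first branch is named master
git config user.name "Example User"          # placeholder identity
git config user.email "user@example.com"

make_commit() {                              # hypothetical helper: one new file per commit
  echo "$1" > "$1.txt"
  git add "$1.txt"
  git commit -q -m "$1"
}

make_commit C1
make_commit C2
git checkout -q -b server                    # server branches off at C2
make_commit C3
git checkout -q -b client                    # client branches off server at C3
make_commit C8
make_commit C9
git checkout -q server
make_commit C4                               # server keeps going
git checkout -q master
make_commit C5                               # master keeps going too

# Replay only the commits that are on client but not on server (C8, C9) onto master.
git rebase -q --onto master server client
git checkout -q master
git merge -q client                          # fast-forward master

total=$(git rev-list --count master)                 # C1, C2, C5, C8', C9'
server_file=$(git ls-tree --name-only master C3.txt) # empty: C3 was left behind
server_commits=$(git rev-list --count server)        # server itself is untouched
echo "master has $total commits; server still has $server_commits"
```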
The new submission history is as follows:
Now that the server has also been developed, we need to rebase its changes onto master. Previously we first switched to the branch to be rebased, such as client, and then named the target branch. With the fuller syntax git rebase <basebranch> <topicbranch> we can rebase a branch without switching to it first:
git rebase master server
The command above replays the changes of the server branch on top of the master branch. The commit history becomes the following:
Then fast-forward master and delete the server and client branches:
git checkout master
git merge server
git branch -d client
git branch -d server
The commit history ends up as a clean straight line, amazing!
Don't use rebase arbitrarily
The drawback of rebase is just as obvious. If you look closely, you will notice that rebase destroys the previous branch history, such as the original paths of client and server in the example above. So if other people may be developing on top of a branch, do not rebase it and destroy that history!
If you follow this golden rule, nothing will go wrong. Otherwise, people will hate you, and your friends and family will scorn you.
In fact, destroying the client and server branches in the example above is unwise. In a real project, client and server are relatively independent topic branches that are usually kept under version control in the central Git repository. Front-end and back-end developers work on client and server respectively while the project keeps iterating, so these topic branches should be preserved.
For more puzzling and painful cautionary examples, see the Rebasing chapter on the official Git website.
Rebase vs. merge: best practices
Before we get to the answer, we need to reconsider what merge and rebase mean for the commit history.
A merge is itself a record of history, from which the changes on every branch can be traced. Even if the commit history looks complicated and confusing, being able to trace what actually happened has lasting value.
Rebase erases the process of change and leaves only the version judged fit to "publish officially"; the intermediate "drafts" are discarded. The advantage is that, as readers of the history, we can focus on what matters most without spending time and energy elsewhere.
The official Git website gives a principle that I think is sound:
The general principle is to only perform rebasing operations to clean up the history of local modifications that have not been pushed or shared with others, and never perform rebasing operations on commits that have been pushed elsewhere. In this way, you can enjoy the convenience brought by the two methods.
A commit that has been pushed elsewhere is one that others will build on and study, which is exactly why rebase should not be used on it.
Remote Git repository
We have now mastered most of the common local Git operations, enough to handle everyday version control and manage our own projects. The last step is learning how to collaborate with others and how to use remote repositories for development.
You can also host a remote Git repository yourself, but in most cases developers use an existing hosting service, so self-hosting is not covered here. If you need it, please refer to the official documentation.
The first step in network communication is choosing a protocol. We will not go into the underlying principles here. We use the two most common protocols, SSH and HTTPS, as examples:
git clone git@github.com:xxxx/xxxx.git
git clone https://github.com/xxxx/xxxx.git
The two commands above are typical examples of cloning a remote project hosted on GitHub. The first uses an SSH URL and the second an HTTPS URL. What they clone is the same; only the protocol differs. Other Git hosting services such as GitLab are similar: the URL format may differ, but the essence is the same.
Generate SSH public key
For security, many Git servers use SSH keys for authentication, in order to identify, authenticate, and trace specific development machines, control permissions, and so on. Taking GitHub as an example, you need to do two things:
- Generate an SSH key pair on your machine, usually by running a system command in the console
- Find the local key file and add the public key of the pair to your GitHub account's list of trusted keys
After completing these steps, you can download and upload projects normally. For specific operations, please refer to the SSH key guide on GitHub: help.github.com/articles/ge...
Basic operation of remote interaction
Associating with a remote repository
There are two cases of [association] here. In the first, there is no existing local project and the association is created by git clone. In the second, a local repository already exists and must be linked to the remote repository manually:
git remote add origin git@github.com:yourname/learngit.git
We can read the command above like this: git remote add registers a remote repository, origin is the short name we give it, and the last argument is its URL.
To change the URL of the remote repository, run:
git remote set-url origin git@github.com:yourname/learngit.git
To view the remote repository information, use:
git remote -v
origin  git@github.com:yourname/learngit.git (fetch)
origin  git@github.com:yourname/learngit.git (push)
To remove the association with the remote repository, use git remote rm:
git remote rm origin
View/track other remote branches
git branch -a
git checkout -b mydev origin/mydev
The first command lists all local and remote-tracking branches; the second creates a local branch mydev that tracks the remote branch origin/mydev.
To view the remote branch corresponding to each local branch, use:
git branch -vv
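Pushing to a remote repository
git push -u origin master
The command above pushes the local master branch to the remote repository origin. The -u option also sets origin/master as the upstream of the local branch, so that later git push and git pull can be run without arguments.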
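The effect of -u can be tried without any hosting service by letting a local bare repository stand in for the remote. Everything about the setup below (the paths under a temporary directory, the file, the placeholder identity) is an assumption for the sake of the demonstration.

```shell
set -e
work=$(mktemp -d)
git init -q --bare "$work/remote.git"        # local bare repository standing in for a remote
git init -q "$work/local"
cd "$work/local"
git symbolic-ref HEAD refs/heads/master      # make sure the first branch is named master
git config user.name "Example User"          # placeholder identity
git config user.email "user@example.com"

echo "hello" > README
git add README
git commit -q -m "Initial commit"

git remote add origin "$work/remote.git"
git push -q -u origin master                 # push and record origin/master as the upstream

upstream=$(git rev-parse --abbrev-ref 'master@{upstream}')       # the configured upstream
local_tip=$(git rev-parse master)
remote_tip=$(git --git-dir="$work/remote.git" rev-parse master)  # what the remote now has
echo "master tracks $upstream"
```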
If you are developing on another remote branch as described earlier, follow the steps below:
git checkout -b dev origin/dev
This creates a local dev branch associated with the remote dev branch, so that you can develop on it and then push:
git push origin dev
Pulling remote branch updates
Real development is a multi-person collaboration. While you develop locally, the remote repository may receive updates, so you need to merge the latest content first, sort it out, and then commit and push.
git pull
The command above fetches the corresponding branch from the remote repository origin and merges it into the current branch. If git pull fails because the local branch has no tracking information, link the local branch to the remote branch first:
git branch --set-upstream-to=origin/dev dev
As mentioned earlier, a merge can produce conflicts. If git pull reports one, resolve it by hand as before, then commit and push:
git commit -m "fix env conflict"
git push origin dev
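The whole round trip can be rehearsed locally. In the sketch below a bare repository plays the remote, and two working copies named alice and bob (hypothetical names, as are the files and the placeholder identity) play two collaborators on the dev branch. No conflict occurs here, so the pull is a plain fast-forward.

```shell
set -e
work=$(mktemp -d)
# Placeholder identity shared by both working copies.
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

git init -q --bare "$work/remote.git"        # stands in for the hosted repository

# First collaborator creates dev and publishes it.
git init -q "$work/alice"
cd "$work/alice"
git symbolic-ref HEAD refs/heads/dev         # start directly on a branch named dev
git remote add origin "$work/remote.git"
echo "v1" > app.txt
git add app.txt
git commit -q -m "Start dev"
git push -q -u origin dev                    # publish dev and set its upstream

# Second collaborator clones dev, commits, and pushes.
git clone -q -b dev "$work/remote.git" "$work/bob"
cd "$work/bob"
echo "env fix" > env.txt
git add env.txt
git commit -q -m "fix env conflict"
git push -q origin dev

# First collaborator pulls the update.
cd "$work/alice"
git pull -q                                  # fetch origin/dev and merge it into dev
pulled=$(git log -1 --format=%s)             # subject of the newest commit after the pull
echo "latest commit: $pulled"
```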