Towards a higher level: the networking fundamentals every good Android programmer must know

1. Introduction

Network communication has always been an important module in Android projects, and the Android open source community has produced many excellent network frameworks over the years. At first there were just a few utility classes that thinly wrapped HttpClient and HttpURLConnection; later Google open-sourced the more complete and feature-rich Volley; and nowadays the most popular choices are OkHttp and Retrofit.

To understand the similarities and differences between them (or, more to the point, to gain a deeper grasp of network communication technology in Android development), you must know the basics of networking and the basic principles behind Android network frameworks. Only then can you, at the critical moment, find the network communication practice best suited to your app.

Experience shows that you will frequently run into this knowledge in daily Android development and source code reading. Mastering these networking fundamentals is also one of the basic technical qualities an Android programmer needs to truly move toward a higher level.

With that in mind, this article introduces some computer networking fundamentals, along with how they come up in Android development, the problems encountered, and their solutions.

This article is mainly divided into the following parts:

  • 1) Computer network architecture;
  • 2) HTTP related;
  • 3) TCP related;
  • 4) Socket.


2. About the author

Shu Dafei: Android development engineer. Author's blog: ...
Note: When this article was collected by the instant messaging network, the content was revised in more detail to make it easier to understand.

3. Computer network architecture

Computer network architecture is the layered structure of computer network systems that you often see. It is necessary to be clear about it, to avoid getting tangled up between HTTP and TCP, two protocols that are not at the same layer at all. Depending on the reference model, there are several versions of the layered structure, such as the OSI model and the TCP/IP model.

The following uses the more frequently seen five-layer structure as an example:
(For a clearer and more complete picture, please see " Computer Network Communication Protocol Relation Diagram (Chinese Collector's Edition) [Attachment Download]  ")

As shown in the figure above, the five layers, working from top to bottom, together realize end-to-end data transmission and communication. What is each layer responsible for, and how is end-to-end communication finally achieved?

1) Application layer: protocols such as HTTP define how data is packaged and parsed. If the application layer protocol is HTTP, data is packaged according to the protocol's rules, for example into a request line, request headers, and a request body. Once packaged, the data is handed down to the transport layer.

2) Transport layer: this layer has two protocols, TCP and UDP, corresponding to reliable and unreliable transport respectively. Because TCP must provide reliable transmission, it has to solve how to establish a connection, how to guarantee transmission is reliable without losing data, and how to perform flow control and congestion control. This is also the layer we usually deal with through Socket: a set of encapsulated programming interfaces through which we can operate TCP or UDP, establish connections, and so on. When we use Socket to establish a connection we generally have to specify a port number, so this layer specifies which port number the data is sent to.

3) Network layer: this layer contains the IP protocol and some routing protocols, so it specifies which IP address the data is to be transmitted to. Finding optimal routes, routing algorithms, and so on are involved along the way.

4) Data link layer: the most memorable protocol here is ARP, which resolves an IP address into a MAC address, i.e. the hardware address, so that the corresponding unique machine can be found.

5) Physical layer: the bottom layer, which provides binary bit-stream transmission; that is, data actually starts moving across the transmission medium (wired or wireless).

So with each of the five layers above performing its duties (physical transmission medium, MAC address, IP address, port number, and finally parsing the received data according to the application layer protocol), network communication and data transmission are realized.

The following focuses on HTTP and TCP. As for the other layers, I have forgotten a lot in the years since graduation. If you want a more detailed and concrete understanding of the lower three layers, such as routing algorithms, ARP addressing, and the physical layer, I recommend revisiting "TCP/IP Illustrated, Volume 1: The Protocols"~

4. HTTP related

This section mainly talks about some basic knowledge about Http, as well as some practical applications in Android and the problems and solutions encountered.

Due to space limitations, this article only provides a brief overview of some knowledge points. If you want to master the HTTP protocol in a comprehensive and in-depth manner, please read the following articles:


4.1 Correctly understand the "connectionless" and "stateless" of HTTP

HTTP is connectionless and stateless.

Connectionless does not mean that no connection is needed. HTTP is only an application layer protocol; in the end it still relies on the services of the transport layer, such as TCP, to establish connections.

Connectionless means that HTTP stipulated that each connection handles only one request, and the connection is closed once that request completes. This mainly relieves pressure on the server and reduces the server resources occupied by connections. My understanding is that establishing a connection is really the transport layer's business; to the application layer, HTTP is connectionless, because the upper layer has no perception of the lower layer.

Stateless means that each request is independent: there is no ability to remember previous request transactions. Hence things like cookies, which are used to save some state.

4.2 Request message and response message

Here is a brief introduction to the basic knowledge of the format of the HTTP request message and response message.

Request message:

Response message:

Regarding GET and POST, we are all familiar with the following differences between them:

  • 1) GET splices all request parameters after the URL, which ends up displayed in the address bar, while POST puts the request parameter data in the request body and does not show it in the address bar;
  • 2) There is a length limit on the parameters passed.


  • Regarding point 1), exposing private data in the browser's address bar is indeed inappropriate, but in app development there is no address bar at all, so is this really a constraint when choosing between POST and GET?
  • Regarding point 2), the length limit is imposed by the browser and has nothing to do with GET itself, so in app development can this point be ignored?
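To make the difference concrete, here is a minimal Java sketch (class and method names are my own, not from any framework) of where the parameters end up in each case: a GET splices the encoded parameters onto the URL, while a POST carries the same encoded string in the request body.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.Map;

public class RequestDemo {
    // Encode parameters as application/x-www-form-urlencoded, e.g. "user=a&pwd=b"
    static String encode(Map<String, String> params) throws UnsupportedEncodingException {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (sb.length() > 0) sb.append('&');
            sb.append(URLEncoder.encode(e.getKey(), "UTF-8"))
              .append('=')
              .append(URLEncoder.encode(e.getValue(), "UTF-8"));
        }
        return sb.toString();
    }

    // GET: the parameters are spliced onto the URL after '?', visible to anyone who sees the URL
    static String buildGetUrl(String base, Map<String, String> params) throws UnsupportedEncodingException {
        return base + "?" + encode(params);
    }

    // POST: the URL stays clean; the same encoded string travels in the request body instead
    static String buildPostBody(Map<String, String> params) throws UnsupportedEncodingException {
        return encode(params);
    }
}
```

Either way the bytes on the wire are just text; the only real difference at this level is which part of the HTTP message carries them.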


4.3 HTTP caching mechanism

The reason for introducing the HTTP caching mechanism here is that OkHttp uses it for network request caching, instead of writing its own client-side caching strategy the way frameworks like Volley do.

HTTP caching is mainly controlled by two header fields, Cache-Control and ETag, which are introduced separately below.

1) Cache-Control mainly contains the following values:

  • private: only the client may cache;
  • public: both the client and proxy servers may cache;
  • max-age: how long the cached copy stays fresh;
  • no-cache: the comparison cache must be used to validate cached data;
  • no-store: nothing is cached at all.

What is set up here is in fact a caching policy, which the server sends to the client in the headers of the first response. To spell it out:

  • max-age: the time after which the cache expires; on a later request, if the expiry time has not passed, the cache can be used directly;
  • no-cache: indicates that the comparison cache must be used to validate cached data. With this flag set, even if the max-age cache has not expired, a request must still be made to the server to confirm whether the resource has been updated and needs to be fetched again. How the comparison cache works is the role of the ETag discussed below: if the server confirms the resource has not been updated, it returns 304 and the local cache is used; if there is an update, it returns the latest resource;
  • no-store: with this flag set, nothing is cached and no cache is ever used.

2) ETag: used for comparison caching. An ETag is an identifier for a resource on the server.

When the client makes its first request, the server sends down the ETag of the requested resource. On the next request the client carries this ETag in the If-None-Match header, and the server compares it with the latest ETag of the resource. If they are equal, the resource has not been updated, and 304 is returned.
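The server side of this comparison can be sketched like so (a simplified illustration; real servers also handle weak validators, If-Modified-Since, and so on):

```java
public class EtagCheck {
    // Server side of comparison caching: if the client's If-None-Match matches
    // the current resource ETag, answer 304 (keep your cached copy); otherwise 200.
    static int respond(String ifNoneMatch, String currentEtag) {
        if (ifNoneMatch != null && ifNoneMatch.equals(currentEtag)) {
            return 304; // Not Modified: body omitted, client serves its local cache
        }
        return 200;     // resource changed (or first request): send the full body
    }
}
```

A 304 response saves the body bytes entirely, which is the whole point of comparison caching.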

3) Summary:

The HTTP caching mechanism is thus realized through the cooperation of Cache-Control and ETag. For more on HTTP caching, I did a detailed read-through of the relevant chapters of "Introduction to Network Programming for Dummies (3): Some Must-Know Knowledge About the HTTP Protocol", which you can consult.

4.4 HTTP cookies

As said above, HTTP is stateless, and cookies are used to remember some state locally. A cookie generally contains attributes such as domain, path, and Expires (expiration time). The server can write state into the client's cookies via the Set-Cookie response header, and the client carries the cookies with it on its next request.

Problems encountered in Android development and their solutions:

Speaking of cookies: if you only do app development you will not run into them often, but as soon as WebView requirements are involved, you may.

Here is a WebView cookie headache I ran into in a project. The requirement was this: the H5 page loaded in the WebView needs a logged-in state, so we manually write the ticket from the native login page into the WebView's cookies; the H5 page loaded in the WebView then carries the ticket in its cookies to the server for validation.

But I hit a problem: debugging the WebView through Chrome inspect showed that the manually written cookie really was written in, yet when the request was initiated the cookie was not sent, so validation failed. Investigation showed the cause was a WebView property that is off by default; it can be turned on with the following settings:

```java
CookieManager cookieManager = CookieManager.getInstance();
if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.LOLLIPOP) {
    cookieManager.setAcceptThirdPartyCookies(mWebView, true);
} else {
    cookieManager.setAcceptCookie(true);
}
```



4.5 HTTPS

We all know that HTTPS guarantees the security of our data transmission: HTTPS = HTTP + SSL. The key to its security is the use of an asymmetric encryption algorithm. The commonly used symmetric encryption algorithms are insecure here because both parties encrypt and decrypt with the same key; as soon as either party leaks the key, anyone else can use it to decrypt the data.

The core of how an asymmetric encryption algorithm achieves secure transmission is this: information encrypted with the public key can only be unlocked with the private key, and information encrypted with the private key can only be unlocked with the public key.

1) Briefly, why the asymmetric encryption algorithm is safe:

The server applies for a certificate issued by a CA and thereby obtains the certificate's public and private keys. The private key is known only to the server, while the public key can be handed out to others, for example passed to the client. The client encrypts the data it transmits with the public key it received from the server, and the server decrypts it with the private key. Since only the private key can decrypt data encrypted with the public key, and the private key exists only on the server, the data transmission is safe.

The above is only a brief account of how asymmetric encryption keeps data secure. The actual HTTPS workflow is much more complicated (space does not allow the details here; there are many related articles online):

  • First, the client must also verify the legitimacy and validity of the CA certificate sent by the server, because there is a risk of the certificate being swapped in transit; this raises the question of how the client verifies the server's certificate to ensure both parties' identities are legitimate;
  • Second, although the asymmetric algorithm guarantees data security, it is much less efficient than a symmetric algorithm, so there is the question of how to optimize it to both guarantee security and improve efficiency.

2) How does the client verify the validity of the certificate?

First, a CA certificate generally includes the following:

  • The issuing authority and version of the certificate;
  • The user of the certificate;
  • The public key of the certificate;
  • The valid time of the certificate;
  • the certificate's digital signature hash value and the signature hash algorithm (this digital signature hash value is encrypted with the issuer's private key);
  • and so on.

The client verifies the legitimacy of the certificate passed from the server like this: first it uses the obtained public key to decrypt the digital signature in the certificate, yielding hash value 1 (since the signature was encrypted with the private key); then it uses the signature hash algorithm declared in the certificate to compute hash value 2 over the certificate contents. If the two values are equal, the certificate is legitimate and the server can be trusted.
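Conceptually, the comparison boils down to recomputing a hash and checking it against the one recovered from the signature. A simplified Java sketch (real verification goes through java.security.Signature with the CA's public key; this only illustrates the final hash comparison, with SHA-256 as an assumed algorithm):

```java
import java.security.MessageDigest;

public class CertCheckSketch {
    // Helper: SHA-256 digest, wrapping the checked exception for brevity
    static byte[] sha256(byte[] data) {
        try {
            return MessageDigest.getInstance("SHA-256").digest(data);
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    // hashFromSignature plays the role of "hash value 1", recovered from the
    // signature with the issuer's public key; we recompute "hash value 2"
    // locally from the certificate contents and compare the two.
    static boolean signatureMatches(byte[] hashFromSignature, byte[] certContents) {
        return MessageDigest.isEqual(hashFromSignature, sha256(certContents)); // constant-time compare
    }
}
```

If the two hashes differ, either the contents were tampered with or the signature was not produced by the claimed issuer.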

3) Problems encountered in Android development and their solutions:

A side note on an issue from project development: the web page certificate on the company's test server had expired, so a WebView loading pages from that server showed a white screen and the page could not load.

The solution is to temporarily ignore the SSL error in the test environment so the page can load. Of course, do not do this in production: for one thing it creates security problems, and for another Google Play review will not pass it.

The way to do this is to override onReceivedSslError() of WebViewClient:

```java
@Override
public void onReceivedSslError(WebView view, SslErrorHandler handler, SslError error) {
    if (ContextHolder.sDebug) { // only ignore SSL errors in debug/test builds
        handler.proceed();
        return;
    }
    super.onReceivedSslError(view, handler, error);
}
```

Finally: for more detailed and comprehensive HTTPS knowledge, please read "Instant Messaging Security (7): If You Understand HTTPS Principles This Way, One Article Is Enough".

4.6 HTTP 2.0

OkHttp supports configuring and using the HTTP 2.0 protocol. HTTP 2.0 is a huge improvement over HTTP 1.x, mainly in the following ways.

1) Binary format: HTTP 1.x is a text protocol, while HTTP 2.0 is a binary protocol whose basic unit is the frame. Besides the data, each frame carries a frame identifier, the Stream Identifier, which marks which request the frame belongs to; this makes network transmission very flexible.

2) Multiplexing: a great improvement. The original HTTP 1.x model of one request per connection was quite limiting and caused many problems, such as the cost and inefficiency of establishing multiple connections.

To work around the efficiency problem, HTTP 1.x clients fire off as many concurrent requests as possible to load resources. But browsers limit concurrent requests to the same domain, so the usual optimization was to spread the requested resources across different domains to get past the limit.

The multiplexing supported by HTTP 2.0 solves this well: multiple requests share one TCP connection and can be in flight on it concurrently. This removes both the cost of establishing multiple TCP connections and the efficiency problem.

So what makes it possible for multiple requests to be concurrent on one TCP connection? The basis is the binary framing above: because each frame carries an identifier, frames from different requests can be sent out concurrently and out of order, and the server sorts them back into the corresponding requests.

3) Header compression: the header is compressed to reduce request size, cut traffic consumption, and improve efficiency. Previously, every request had to carry the headers, and the data in those headers is usually unchanged from request to request.

4) Support for server push.
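The frame-plus-Stream-Identifier idea behind points 1) and 2) can be illustrated with a toy demultiplexer (my own simplified model, not the real HTTP/2 framing): frames from different requests arrive interleaved on one connection, and the receiver reassembles them per stream by their identifier.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class Http2Demux {
    static class Frame {
        final int streamId;   // Stream Identifier: which request this frame belongs to
        final String payload; // data carried by the frame
        Frame(int streamId, String payload) { this.streamId = streamId; this.payload = payload; }
    }

    // Frames from multiple requests may arrive interleaved and out of order with
    // respect to each other on one TCP connection; reassemble them per stream.
    static Map<Integer, String> reassemble(List<Frame> wire) {
        Map<Integer, String> streams = new LinkedHashMap<>();
        for (Frame f : wire) {
            streams.merge(f.streamId, f.payload, String::concat);
        }
        return streams;
    }
}
```

Because each frame is self-identifying, neither side needs a dedicated connection per request, which is exactly what makes multiplexing work.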

For more knowledge about HTTP2, please read " From HTTP/0.9 to HTTP/2: An article to understand the historical evolution and design ideas of the HTTP protocol ".

5. TCP related

TCP is connection-oriented and provides reliable data transmission. At this layer we usually operate TCP through the Socket API, to establish connections and so on.

5.1 Three-way handshake to establish a connection


First handshake: the client sends SYN=1, meaning this segment requests to establish a connection, with seq set to a random number x generated by the client.

Second handshake: the server sends SYN=1, ACK=1, meaning this is a reply to the connection request, with ack = the client's seq + 1 (so the client can confirm this really is the server it wants to connect to), and the server also generates its own random number seq=y and sends it to the client.

Third handshake: ACK=1, seq = the client's random number + 1, ack = the server's random number + 1 (so the server knows this is the client from just now).
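The sequence-number bookkeeping of these three handshakes can be written down as two small checks (an illustrative sketch, not a real TCP implementation):

```java
public class HandshakeSketch {
    // Check the server's SYN+ACK against the SYN the client sent: the ack number
    // must equal the client's initial sequence number x plus one.
    static boolean serverReplyValid(long clientSeqX, boolean syn, boolean ack, long ackNum) {
        return syn && ack && ackNum == clientSeqX + 1;
    }

    // Check the client's final ACK: it must acknowledge the server's initial
    // sequence number y plus one.
    static boolean clientAckValid(long serverSeqY, boolean ack, long ackNum) {
        return ack && ackNum == serverSeqY + 1;
    }
}
```

Each side proves it actually received the other's random starting number by echoing it back incremented by one.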

Why does it require a three-way handshake to establish a connection?

First of all, it is clear that two handshakes are the bare minimum. In the first handshake, the client sends a connection request segment to the server; after receiving it, the server knows it can connect with the client successfully, but the client does not yet know whether the server received the message. So the server must respond, and only after receiving that reply can the client be sure it can connect with the server. That is the second handshake.

Only after confirming it can connect with the server does the client start sending data. So two handshakes are certainly the minimum.

So why is the third handshake necessary? Suppose there were no third handshake, and we considered the connection established after two. What could happen?

The third handshake prevents an invalid, stale connection request segment from suddenly arriving at the server and causing errors.

The specific situation is:

A connection request sent earlier by the client gets stuck at some network node for some reason and is delayed, not reaching the server until some point after the connection has been released. It is a long-invalid message, but the server still takes it as a fresh first handshake of a connection request from the client, so it replies, completing a second handshake.

If there were only two handshakes, the connection would now be considered established; but the client has no data to send, and the server would wait foolishly, wasting resources heavily. A third handshake is therefore required: only if the client responds again is this situation avoided.

To deeply understand the TCP three-way handshake, please don't miss the following articles:


5.2 Four waves to disconnect


Having worked through the connection-establishment diagram above, this diagram should not be hard to understand.

The main question here: why is there one more wave than when establishing the connection?

Notice that the server's ACK (reply to the client) and FIN (terminate) are not sent at the same time: the ACK comes first, then the FIN. This is easy to understand: when the client requests disconnection, the server may still have unsent data, so it ACKs first, waits for the data to finish sending, and then sends its FIN. That is how it becomes four waves.

The above covered how TCP establishes and tears down connections. TCP's main feature is reliable transmission, so how does it guarantee that data transmission is reliable? That is the sliding window protocol discussed below.

For related knowledge, please read in depth:


5.3 Sliding Window Protocol

The sliding window protocol is what guarantees TCP's reliable transmission: the sending window moves forward to send further frames only after acknowledgment frames are received.

Here is an example, with a sending window of 3 frames: if the sending window covers the first 3 frames [1,2,3], then those 3 frames can be sent and the later ones cannot yet. Once frame [1] has been sent out and the receiver's acknowledgment arrives, the sending window can slide forward by one frame, reaching [2,3,4]. As before, only frames inside the sending window can be sent, and so on.

After receiving a frame, the receiving window puts it into the corresponding position and then moves forward. The receiving window is the same size as the sending window; for instance, if the receiving window is 5 frames, frames that fall outside it are discarded.

Different settings of the sending and receiving window sizes give rise to different protocols:

Stop-and-wait protocol: after each frame, wait for the acknowledgment before sending the next frame. Disadvantage: poor efficiency.

Go-Back-N protocol: uses cumulative acknowledgment. After correctly receiving N frames, the receiver sends a cumulative acknowledgment back, confirming that the N frames were received correctly. If the sender receives no acknowledgment within the specified time, it assumes a timeout or data loss and resends all frames from the unacknowledged one onward. Disadvantage: PDUs after the erroneous sequence number were already sent but must still be resent, which is wasteful.

Selective repeat protocol: if an error occurs, only the PDUs involved in the error are retransmitted, which improves transmission efficiency and reduces unnecessary retransmissions.
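The sending-window behavior of Go-Back-N can be sketched as a small state machine (an illustrative model of my own, not real TCP code; TCP actually counts bytes, not frames):

```java
public class GoBackNSender {
    final int windowSize;
    int base = 0;      // oldest unacknowledged frame
    int nextSeq = 0;   // next frame to send

    GoBackNSender(int windowSize) { this.windowSize = windowSize; }

    // A frame may only be sent while it falls inside the window [base, base + windowSize)
    boolean canSend() { return nextSeq < base + windowSize; }

    void send() {
        if (!canSend()) throw new IllegalStateException("window full");
        nextSeq++;
    }

    // Cumulative ACK: everything up to and including ackNo is confirmed,
    // so the window slides forward past it.
    void onAck(int ackNo) { base = Math.max(base, ackNo + 1); }

    // On timeout, Go-Back-N resends everything after the last confirmed frame.
    void onTimeout() { nextSeq = base; }
}
```

With a window of 3, the sender stalls after three unacknowledged frames; the first ACK slides the window and frees a slot, and a timeout rewinds nextSeq to the unacknowledged base, which is exactly the wasteful resend the text describes.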

One last problem remains: because the sending and receiving rates of the two windows may not match, congestion can occur. To solve this, TCP has a set of flow control and congestion control mechanisms.

5.4 Flow control and congestion control

1) Flow control:

Flow control governs the traffic on one communication path: the sender dynamically adjusts its sending rate based on feedback from the receiver, with the goal that the sender's sending speed never exceeds the receiver's receiving speed.

2) Congestion control:

Congestion control governs the traffic of the entire communication subnet; it is a global form of control.

Slow start + congestion avoidance

Let's first look at a classic picture:

Slow start is used at the beginning: the congestion window is set to 1 and then grows exponentially up to the slow start threshold (ssthresh = 16), at which point it switches to congestion avoidance, i.e. additive growth. When growth eventually causes network congestion, the congestion window drops back to 1, slow start restarts, and the new slow start threshold is adjusted down to 12, and so on.
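That growth pattern is easy to reproduce (a toy model of window growth per round trip; real TCP measures the window in bytes and is more subtle):

```java
import java.util.ArrayList;
import java.util.List;

public class CongestionSketch {
    // Evolution of the congestion window per round trip: exponential growth
    // below ssthresh (slow start), then additive growth (congestion avoidance).
    static List<Integer> grow(int ssthresh, int rounds) {
        List<Integer> history = new ArrayList<>();
        int cwnd = 1; // slow start begins with a window of 1
        for (int i = 0; i < rounds; i++) {
            history.add(cwnd);
            cwnd = cwnd < ssthresh ? Math.min(cwnd * 2, ssthresh) : cwnd + 1;
        }
        return history;
    }
}
```

With ssthresh = 16 this reproduces the classic curve: 1, 2, 4, 8, 16, then linear 17, 18, ...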

Fast retransmission + fast recovery

Fast retransmit: the retransmission mechanism mentioned above waits until the timeout expires without a reply from the receiver before retransmitting. The idea of fast retransmit is: if the sender receives 3 duplicate ACKs from the receiver, it can conclude that a segment was lost and retransmit the lost segment immediately, without waiting for the set timeout period to expire, which makes retransmission more efficient.

Fast recovery: the congestion control above drops the congestion window to 1 when the network is congested and starts over slowly. The problem is that the network cannot quickly return to a normal state. Fast recovery optimizes this: with it, when congestion occurs the congestion window is reduced only to the new slow start threshold (i.e. 12) rather than to 1, and then enters congestion avoidance (additive growth) directly, as shown in the figure below:
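The 3-duplicate-ACK trigger and the halved window can be sketched like this (again a toy model of my own; with a window of 24, halving gives the 12 from the figure's example):

```java
public class FastRecoverySketch {
    int cwnd;
    int ssthresh;
    int dupAcks = 0;

    FastRecoverySketch(int cwnd, int ssthresh) { this.cwnd = cwnd; this.ssthresh = ssthresh; }

    // Three duplicate ACKs signal a lost segment: retransmit it at once and,
    // with fast recovery, halve the window instead of collapsing it to 1.
    // Returns true when the caller should retransmit the missing segment now.
    boolean onDuplicateAck() {
        if (++dupAcks == 3) {
            ssthresh = Math.max(cwnd / 2, 2);
            cwnd = ssthresh;   // resume directly in congestion avoidance, not slow start
            dupAcks = 0;
            return true;
        }
        return false;
    }
}
```

The key contrast with plain congestion control is that cwnd lands at ssthresh rather than 1, so throughput recovers in a few round trips instead of starting over.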

Fast retransmission and fast recovery are further improvements to congestion control.

For a deeper understanding of the issues in this section, please read: "TCP/IP Illustrated - Chapter 21: TCP Timeout and Retransmission" and "Easy to Understand - Deep Understanding of the TCP Protocol (Part 2): RTT, Sliding Window, Congestion Handling".

6. About Socket

Socket is a set of APIs for operating TCP/UDP. Frameworks that involve relatively low-level request sending, such as HttpURLConnection and OkHttp, ultimately establish connections and send network requests through Socket, while Volley and Retrofit are higher-level encapsulations that in the end rely on HttpURLConnection or OkHttp for the actual connection establishment and request sending.

Basic Socket usage is straightforward: each end creates a Socket (on the server side, a ServerSocket), and then a connection is established.
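Here is a minimal runnable example of both ends (plain Java SE over loopback; on Android you would additionally have to do this off the main thread):

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.ServerSocket;
import java.net.Socket;

public class EchoDemo {
    // Start a ServerSocket on an ephemeral port, connect a client Socket to it,
    // and echo one line back: a minimal sketch of both ends of a TCP connection.
    static String echoOnce(String message) {
        try (ServerSocket server = new ServerSocket(0)) { // port 0: pick any free port
            Thread serverThread = new Thread(() -> {
                try (Socket peer = server.accept();
                     BufferedReader in = new BufferedReader(new InputStreamReader(peer.getInputStream()));
                     PrintWriter out = new PrintWriter(peer.getOutputStream(), true)) {
                    out.println(in.readLine()); // echo the first line back to the client
                } catch (Exception ignored) {
                }
            });
            serverThread.start();
            try (Socket client = new Socket("127.0.0.1", server.getLocalPort());
                 PrintWriter out = new PrintWriter(client.getOutputStream(), true);
                 BufferedReader in = new BufferedReader(new InputStreamReader(client.getInputStream()))) {
                out.println(message);
                String reply = in.readLine();
                serverThread.join();
                return reply;
            }
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}
```

Underneath, `new Socket(host, port)` is exactly where the TCP three-way handshake described earlier takes place.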

For related information, please read:


7. Summary of this article

Of course, the content above is just the part of computer networking fundamentals that I know and consider very important; there is still a lot of basic networking knowledge for me to understand and explore. Having written this much, it is really a sorting-out of my own networking foundation, and there may be mistakes, so please treat it only as a reference, and feel free to point out corrections.