Interview: Chapter 2: Various frameworks and middleware and cache databases

Talk about the working principle and common annotations of SpringMVC

1. The user sends a request to the server; the request is intercepted by DispatcherServlet, the front controller of SpringMVC.
2. DispatcherServlet parses the request URL (Uniform Resource Locator) to obtain the URI (Uniform Resource Identifier), then calls HandlerMapping, which uses configuration or annotations to look up everything related to that Handler: the Handler object itself and the interceptors that apply to it. These objects are wrapped in a HandlerExecutionChain object and returned to DispatcherServlet.
3. Based on the Handler obtained, the front controller asks a suitable HandlerAdapter to handle it, and the adapter invokes the Handler's actual request-processing method.
4. The model data is extracted from the request and the Handler (Controller) starts executing.
5. After the Handler finishes, it returns a ModelAndView object to DispatcherServlet.
6. Based on the returned ModelAndView object, DispatcherServlet asks a ViewResolver (view resolver) to resolve the logical view into a real view and return it to the front controller.
7. The view is rendered, turning the model data into a response.
8. The response is returned to the client.

What is the difference between a URL and a URI?

URI covers two categories: URL and URN. By analogy, a person's ID number is a URN and a person's home address is a URL: the URN uniquely identifies the person, while the URL tells the postman how to deliver goods to them.

Component annotations:

@Component: add this annotation before a class definition and the class will be recognized by the Spring container and turned into a bean.
@Repository: annotates DAO implementation classes (a specialization of @Component).
@Service: annotates the business logic layer (a specialization of @Component).
@Controller: annotates the control layer (a specialization of @Component).

Request and parameter type annotations:

@RequestMapping: maps request URLs; can be applied to classes and methods.
@RequestParam: binds a request parameter to a method argument.
@PathVariable: binds a path variable from the URL template to a method argument.
@ResponseBody: applied to a method; the return value is written directly into the response body in a given format, such as JSON or XML.
@CookieValue: reads a cookie value from the request.
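
A minimal controller sketch showing these annotations together (the class, URL, and User type are made up for illustration):

import org.springframework.stereotype.Controller;
import org.springframework.web.bind.annotation.*;

@Controller
@RequestMapping("/users")
public class UserController {

    // handles e.g. GET /users/42?verbose=true
    @RequestMapping("/{id}")
    @ResponseBody // the returned User is serialized into the response body (e.g. JSON)
    public User find(@PathVariable("id") Long id,
                     @RequestParam(value = "verbose", required = false) Boolean verbose,
                     @CookieValue(value = "sessionId", required = false) String sessionId) {
        return new User(id); // look the user up and return it
    }
}

class User {
    private final Long id;
    User(Long id) { this.id = id; }
    public Long getId() { return id; }
}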

Talk about Spring's IOC (DI) and AOP dynamic proxy

Traditional program development (without IoC): take a simple example, how do we find a girlfriend? The usual way is to look around everywhere for a pretty, suitable girl, inquire about her hobbies, QQ number, phone number, and so on, find ways to get to know her, and give her what she likes. This process is complicated and laborious: we must design and handle every step ourselves.

With IoC:

How does IoC do it? It is a bit like finding a girlfriend through a matchmaking agency. A third party is introduced between me and my girlfriend: the agency. The agency manages information on many men and women. I give the agency a list of requirements for the girlfriend I want, for example: looks like Michele Reis, figure like Kelly Lin, sings like Jay Chou, fast as Roberto Carlos, skilled as Zidane. The agency then provides a candidate according to our requirements; we only need to fall in love with her and get married.

To summarize inversion of control: all classes are registered in the Spring container; you tell Spring what you are and what you need, and while the system runs Spring actively hands you what you want and also hands you to whatever needs you. The creation and destruction of all classes is controlled by Spring; in other words, the lifecycle of an object is no longer controlled by the objects that reference it, but by Spring. A specific object used to control other objects, but now all objects are controlled by Spring, which is why this is called inversion of control.

The key to understanding DI is: "who depends on whom, why it depends, who injects whom, and what is injected."
Who depends on whom: the application depends on the IoC container.
Why it depends: the application needs the IoC container to provide the external resources its objects require.
Who injects whom: the IoC container injects into the application's objects the objects they depend on.
What is injected: external resources (objects, resources, constant data).

What is the relationship between IoC and DI?

DI (Dependency Injection) is in fact another term for IoC; they are two views of the same concept.
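
A minimal DI sketch under that idea (class names are illustrative): the container creates OrderService and injects the OrderRepository it depends on, instead of the service constructing its dependency itself.

import org.springframework.stereotype.Repository;
import org.springframework.stereotype.Service;

@Repository
class OrderRepository {
    // data-access logic would live here
}

@Service
class OrderService {
    private final OrderRepository repository;

    // Spring resolves and injects the dependency through the constructor
    OrderService(OrderRepository repository) {
        this.repository = repository;
    }
}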

 

Various implementations of AOP

AOP is aspect-oriented programming. We can implement AOP at the following levels:

  • Modify the source code at compile time
  • Modify the bytecode before it is loaded at runtime
  • Dynamically generate the bytecode of a proxy class at runtime, after the target bytecode has been loaded

Comparison of various implementation mechanisms of AOP

The following is a comparison of various implementation mechanisms:

| Category | Mechanism | Principle | Advantage | Disadvantage |
| --- | --- | --- | --- | --- |
| Static AOP | Static weaving | At compile time, the aspect is compiled directly into the target bytecode file | No performance impact on the system | Not flexible enough |
| Dynamic AOP | Dynamic proxy (JDK) | At runtime, after the target class is loaded, a proxy class is dynamically generated for its interface and the aspect is woven into the proxy class | More flexible than static AOP | The woven class must implement an interface; slight performance impact on the system |
| Dynamic bytecode generation | CGLIB | At runtime, after the target class is loaded, bytecode is dynamically constructed for a subclass of the target class and the aspect logic is added to the subclass | Can weave classes without an interface | Cannot weave when the method to extend is final |
| Custom class loader | | At runtime, before the target class is loaded, the aspect logic is added to the target bytecode | Can weave most classes | Classes loaded by other class loaders will not be woven |
| Bytecode conversion | | At runtime, bytecode is intercepted before any class loader loads it | Can weave all classes | |

Core concepts in AOP

  • Joinpoint: an interception point, such as a business method
  • Pointcut: an expression over Joinpoints that says which methods are intercepted; one Pointcut matches multiple Joinpoints
  • Advice: the logic to be woven in
  • Before Advice: runs before the method
  • After Advice: runs after the method completes, whether it returned normally or threw an exception
  • After Returning Advice: runs after the method returns normally; skipped if an exception is thrown
  • After Throwing Advice: runs when the method throws an exception
  • Around Advice: runs before and after method execution and can interrupt or skip the original invocation

Spring AOP uses dynamic proxies. "Dynamic proxy" means the AOP framework does not modify the existing bytecode; instead it generates an AOP proxy object in memory at runtime. This proxy object contains all the methods of the target object, is enhanced at the configured pointcuts, and calls back into the original object's methods.

There are two main dynamic-proxy mechanisms in Spring AOP: JDK dynamic proxies and CGLIB dynamic proxies. A JDK dynamic proxy builds the proxy through reflection and requires the proxied class to implement an interface; its core pieces are the InvocationHandler interface and the Proxy class. If the target class does not implement any interface, Spring AOP falls back to CGLIB to proxy the target class. CGLIB (Code Generation Library) is a code-generation library that can dynamically generate a subclass of a given class at runtime. Note that CGLIB implements the proxy through inheritance, so if a class is marked final, it cannot be proxied by CGLIB.
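
A minimal JDK dynamic proxy sketch of the InvocationHandler/Proxy pair described above (the interface and the logging "advice" are made up for illustration):

import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

interface Greeter {
    String greet(String name);
}

class GreeterImpl implements Greeter {
    public String greet(String name) { return "hello " + name; }
}

public class ProxyDemo {
    public static void main(String[] args) {
        Greeter target = new GreeterImpl();
        Greeter proxy = (Greeter) Proxy.newProxyInstance(
                target.getClass().getClassLoader(),
                target.getClass().getInterfaces(),
                new InvocationHandler() {
                    public Object invoke(Object p, Method method, Object[] a) throws Throwable {
                        System.out.println("before " + method.getName()); // woven-in logic
                        Object result = method.invoke(target, a);          // call back into the target
                        System.out.println("after " + method.getName());
                        return result;
                    }
                });
        System.out.println(proxy.greet("world"));
    }
}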

For transaction management, Spring uses AOP to provide declarative transactions in two forms: annotations and XML. This is convenient during development: in most cases you configure a transaction manager in the Spring configuration file, enable transaction annotations, and add @Transactional to a business class or business method to get transaction control.
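
A small sketch of the annotation form (the service and the DAO calls are illustrative assumptions); the method either commits as a whole or rolls back:

import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;

@Service
public class TransferService {

    @Transactional // both updates commit together or roll back together
    public void transfer(long fromId, long toId, long amount) {
        // debit(fromId, amount);  -- hypothetical DAO call
        // credit(toId, amount);   -- hypothetical DAO call
        // a RuntimeException thrown here rolls the whole transaction back
    }
}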

 

Talk about the MyBatis framework

(1) MyBatis is a Java persistence-layer framework that encapsulates JDBC, so you no longer spend effort on loading drivers, creating connections, and so on, eliminating a great deal of redundant JDBC code.
(2) MyBatis configures the statements to execute through XML or annotations, maps the dynamic parameters from Java objects into the statement to produce the final SQL, then executes the SQL and maps the result back to Java objects.
(3) MyBatis supports custom SQL, stored procedures, and advanced mappings. It avoids almost all JDBC code, manual parameter setting, and result-set extraction; simple XML or annotations configure the mapping between interfaces/Java POJOs and database records.
(4) It provides many third-party plugins (pagination plugin, reverse-engineering generator).
(5) It integrates well with Spring.
(6) MyBatis is quite flexible: SQL is written in XML, completely separated from program code, which removes the coupling between SQL and code, makes unified management easy, and supports writing dynamic SQL statements.
(7) It provides mapping tags that support ORM field mapping between objects and database columns.
(8) Downside: the SQL statements depend on the specific database, so portability is poor and the database cannot be replaced at will.
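
A minimal annotated-mapper sketch of point (2) and (3) (table, POJO, and column names are assumptions); MyBatis generates the implementation and maps the result row to the POJO:

import org.apache.ibatis.annotations.Param;
import org.apache.ibatis.annotations.Select;

public interface UserMapper {
    // MyBatis substitutes #{id} into the SQL and maps the row onto User
    @Select("SELECT id, name FROM user WHERE id = #{id}")
    User findById(@Param("id") long id);
}

class User {
    private long id;
    private String name; // getters/setters omitted for brevity
}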

 

Talk about the characteristics of SpringBoot

Spring Boot is used to simplify the initial setup and development of Spring applications; it is configured in a conventional way (properties or yml files).
You can create a standalone Spring application that runs from a main method.
Spring Boot embeds Tomcat, so there is no need to deploy war files.
It simplifies Maven configuration.
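
A minimal sketch of such a standalone application (class name is illustrative):

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;

@SpringBootApplication
public class DemoApplication {
    public static void main(String[] args) {
        // starts the Spring context and the embedded Tomcat; no war deployment needed
        SpringApplication.run(DemoApplication.class, args);
    }
}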

Talk about the creation of threads and the difference between several ways to implement threads

1. Inherit the Thread class; 2. Implement the Runnable interface; 3. Implement the Callable interface; 4. Use a thread pool.

Inherit the Thread class and override its run method:

class A extends Thread {
    @Override
    public void run() {
        for (int i = 1; i <= 100; i++) {
            System.out.println("----------------- " + i);
        }
    }
}
A a = new A();
a.start();

Implement the Runnable interface and implement its run method:

class B implements Runnable {
    @Override
    public void run() {
        for (int i = 1; i <= 100; i++) {
            System.out.println("----------------- " + i);
        }
    }
}
B b = new B();
Thread t = new Thread(b);
t.start();

Implement Callable:

class A implements Callable<String> {
    @Override
    public String call() throws Exception {
        // do some work and return a result
        return "done";
    }
}
FutureTask<String> ft = new FutureTask<>(new A());
new Thread(ft).start();
String result = ft.get(); // blocks until call() finishes

Thread pool:

ExecutorService es = Executors.newFixedThreadPool(10);
es.submit(new Runnable() {
    public void run() {
        // task logic
    }
});
// ... submit more tasks
es.shutdown();

 

What is the difference between implementing Runnable and implementing Callable?

A Callable task can return a value; a Runnable cannot.
Callable lets you specify a generic result type; Runnable does not.
Callable's call method can declare exceptions; Runnable's run cannot.

What is the difference between Runnable and Thread?

Implementing the Runnable interface is better suited to situations where threads share resources.
Implementing the Runnable interface avoids the limitation of Java's single inheritance.

 

Java custom class loader and parent delegation model

Bootstrap ClassLoader: implemented in C++
Extension ClassLoader: implemented in Java
Application ClassLoader (AppClassLoader): implemented in Java

How the parent delegation model works: when a class loader receives a class-loading request, it does not try to load the class itself first; it delegates the request to its parent loader, and every level of class loader does the same. Only when the parent loader reports that it cannot find the class within its search scope (a ClassNotFoundException) does the child loader attempt to load the class itself.
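
A minimal custom-loader sketch under that model (the base directory is an assumption): overriding findClass, rather than loadClass, keeps parent delegation intact, because findClass is only reached after the parents have failed.

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

public class MyClassLoader extends ClassLoader {
    private final String baseDir; // custom search root, e.g. "/opt/plugins"

    public MyClassLoader(String baseDir) {
        this.baseDir = baseDir;
    }

    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException {
        // only reached after the parent loaders could not find the class
        try {
            byte[] bytes = Files.readAllBytes(
                    Paths.get(baseDir, name.replace('.', '/') + ".class"));
            return defineClass(name, bytes, 0, bytes.length);
        } catch (IOException e) {
            throw new ClassNotFoundException(name, e);
        }
    }
}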

Talk about the composition and tuning of jvm, memory model, GC, tomcat tuning

tomcat tuning:

Increase JVM heap memory size
Fix JRE memory leak
Thread pool setting
Compression
Database performance tuning
Tomcat native library

JVM tuning:

-Xms: the initial heap size; the default is 1/64 of physical memory
-Xmx: the maximum heap size; the default is 1/4 of physical memory
-XX:+PrintGCDetails: output detailed GC logs
These configuration changes take effect after you restart the Tomcat server.

 

Talk about how to achieve high availability of data and services, load-balancing strategies and their differences, distributed systems (and transactions), clusters, high concurrency, and the problems encountered and their solutions

Distributed:

Distributed architecture: the system is split into multiple subsystems by module, and the subsystems are deployed on different networked machines that cooperate to complete the business process, which requires communication between the systems.
Advantages:
Modules are split and communicate through interfaces, reducing coupling between modules.
The project is split into several sub-projects, with different teams responsible for different sub-projects.
When adding a feature, you only need to add another sub-project that calls the interfaces of the other systems.
Deployment can be scaled out flexibly.
Disadvantages:
1. Interaction between systems requires remote communication, and interface development increases the workload.
2. Some common business logic in each module cannot be shared.

SOA-based architecture

SOA: service-oriented architecture. The project is split into two parts, a service layer and a presentation layer. The service layer contains the business logic and only needs to expose services; the presentation layer handles page interaction and implements business logic by calling the services of the service layer.

What is the difference between distributed architecture and SOA architecture?

SOA splits the project into a service layer and a presentation layer mainly from the perspective of services.
Distributed architecture classifies applications mainly from the perspective of deployment and access pressure; the main goal is to make full use of server resources and avoid uneven resource allocation.

Cluster:

A cluster is a group of loosely coupled servers that together form a virtual server and provide a unified service to client users. A client accessing the cluster normally does not know which specific server is serving it. The purpose of clustering is to achieve load balancing, fault tolerance, and disaster recovery, in order to meet availability and scalability requirements. A cluster system should have properties such as high availability, scalability, load balancing, failure recovery, and maintainability. Typically the same project is deployed on multiple servers.
Common examples: Tomcat clusters, Redis clusters, Zookeeper clusters, database clusters.

The difference between distributed and cluster:

Distributed means distributing different businesses in different places; a cluster means several servers combined to carry out the same business. In one sentence: a distributed system splits one piece of work across nodes, while a cluster runs many copies of the same work.

Every node in a distributed system can itself be a cluster, while a cluster is not necessarily distributed.
Example: take a site like Sina.com. If more people visit, it can set up a cluster: a response server at the front and several servers behind it, all completing the same business. When a request comes in, the response server checks which server is lightly loaded and hands the work to that one.
Distributed, in the narrow sense, is similar to a cluster, but organized more loosely. In a cluster there is an organizer: when one server fails, the others can take over. In a distributed system each node completes a different business, so if one node goes down, that business becomes inaccessible.

A distributed system improves efficiency by shortening the execution time of a single task, while a cluster improves efficiency by increasing the number of tasks executed per unit of time.
Example: if a task consists of 10 subtasks and each subtask takes 1 hour on its own, the task takes 10 hours on a single server. With a distributed solution of 10 servers, each server handles one subtask (ignoring dependencies between subtasks), and the task finishes in 1 hour. (A typical representative of this working model is Hadoop's Map/Reduce distributed computing model.)
With a cluster solution of 10 servers, each server can handle the whole task independently. If 10 tasks arrive at the same time, the 10 servers work simultaneously and all 10 tasks are completed after 1 hour, so as a whole one task is completed within 1 hour.

High concurrency:

What are the common ways to deal with high concurrency?
1) Data layer

Database clusters and hash-based table sharding
Table and database partitioning
Indexing
Enabling caches
Table design optimization
SQL statement optimization
Cache servers (improve query efficiency, reduce database pressure)
Search servers (improve query efficiency, reduce database pressure)
Separate image servers

2) Project layer

Adopt a service-oriented distributed architecture (share server pressure, improve concurrency).
Pay attention to concurrency details: render static pages / static HTML, e.g. with FreeMarker.
Use page caching.
Use ActiveMQ to further decouple business and improve business processing capacity.
Use a distributed file system to store massive numbers of files.
3) Application layer

    Nginx server for load balancing,
Lvs for layer two load
mirroring

High availability:


Purpose: ensure the service remains available, and the data remains saved and accessible, even when server hardware fails.
Highly available services
1. Hierarchical management: core applications and services get higher priority; for example, processing user payments in time is more important than evaluating products.
2. Timeout settings: set a timeout for service calls; once it expires, the communication framework throws an exception, and the application retries or routes the request to another server according to its service scheduling policy.
3. Asynchronous calls: use message queues and other asynchronous mechanisms to avoid one failed service causing the whole application request to fail. Not all services can be called asynchronously: for calls such as fetching user information, an asynchronous call lengthens the response time and costs more than it saves, and applications that must confirm the call succeeded before proceeding to the next step are also unsuited to asynchronous calls.
4. Service degradation: during peak visiting periods, degrade services to keep core applications running normally. There are two forms of degradation: denial of service, rejecting calls from lower-priority applications to reduce the number of concurrent calls and protect core applications; and switching off features, turning off unimportant services, or unimportant functions inside services, to save system overhead and free resources for core application services.
5. Idempotent design: ensure that calling a service repeatedly gives the same result as calling it once.

Highly available data
There are two main means of ensuring high data availability: data backup and a failover mechanism.
Data backup: divided into cold backup and hot backup. Cold backup is periodic copying and cannot guarantee data availability. Hot backup is divided into asynchronous and synchronous hot backup: asynchronous means the multiple data copies are written asynchronously, while synchronous means all copies are written at the same time.
Failover: if any server in the data-server cluster goes down, all reads and writes the application targets at that server are rerouted to other servers, so that data access never fails.

Website operation monitoring
"Do not allow an unmonitored system to go live."
(1) Monitoring data collection
User behavior log collection: server-side and client-side log collection; many websites are now building log statistics and analysis tools on the real-time computing framework Storm.
Server performance monitoring: collect server performance metrics such as system load, memory usage, and disk IO, to judge trouble early and take preventive measures.
Operating data reports: after collection and aggregation, display the data in one place; the application needs code to handle the collection logic.
(2) Monitoring management
System alarms: configure alarm thresholds and on-call contact information, so that when the system raises an alarm, an engineer even thousands of miles away is notified in time.
Failover: when the monitoring system finds a fault, it proactively notifies the application to fail over.
Automatic graceful degradation: to cope with peak traffic, proactively close some functions and release system resources to guarantee that core application services run normally; this is the ideal state of a flexible website architecture.

Load balancing:

What is load balancing?
When the performance of a single server reaches its limit, we can use a server cluster to improve the overall performance of the website. In the cluster, one server acts as a dispatcher: all requests arrive at it first, and it assigns each request to some back-end server according to each server's load.
(1) HTTP redirect load balancing.
Principle: when a user request arrives, the dispatcher intercepts it first; it selects a server according to some allocation policy, puts the chosen server's address in the Location header field of the HTTP response, sets the response status code to 302, and returns the response to the browser. The browser parses the Location field and issues a new request to that URL; the designated server processes the user's request and returns the result.

Advantages: relatively simple.
Disadvantages: the dispatcher only participates in the client's first request to the website; after it returns the response, all subsequent operations use the new URL (the back-end server directly) and the browser no longer deals with the dispatcher. Each visit therefore costs the browser two requests to two servers, and performance is poor. In addition, the dispatcher cannot know how much load the current user will put on the server; it merely spreads the number of requests evenly across servers.

(2) DNS domain name resolution load balancing
Principle: for easy memorization we access websites by domain name, and before accessing a website the domain name must be resolved to an IP address; this work is done by the DNS server. Our request is not sent directly to the website; it goes first to the DNS server, which resolves the domain name to an IP and returns it, and only then do we send the request to that IP. If a domain name points to multiple IP addresses, the DNS simply picks one IP for each resolution and returns it to the user, which load-balances the server cluster.

Scheduling strategies: DNS providers usually offer several strategies to choose from, such as random allocation, round-robin, and allocating the nearest server according to the requester's region.
Random allocation: for each user request, the DNS randomly decides which back-end server's IP to return.
Round-robin (RR): the DNS keeps track of the back-end server IP returned last time, and for each new request returns the next server's IP in turn.

Advantages: simple configuration; the load-balancing work is handed to DNS, saving network-management trouble.
Disadvantages: scheduling control is handed to the DNS provider, so we cannot control the scheduler as we like or customize the scheduling strategy; DNS also cannot see the actual load of each server and can only spread requests evenly across the back end. When a back-end server fails, even if we remove it from DNS immediately, DNS servers cache IPs for a while, during which some users cannot visit the website normally. However, dynamic DNS lets a program modify the DNS records, so when our monitoring finds a server down it can notify DNS to delete it at once.

(3) Reverse proxy load balancing.
Principle: a reverse proxy server sits in front of the actual servers, and all requests to the site pass through it first. The proxy either returns a result to the user directly or hands the request to a back-end server for processing and then relays the response to the user. Acting as the dispatcher of the server cluster, it forwards each request to a suitable server based on the current back-end load and returns the processing result to the user.

Advantages:
1. Simple to deploy.
2. Hides the back-end servers: compared with HTTP redirection, a reverse proxy conceals the back end; browsers never interact with back-end servers directly, which keeps the dispatcher in control and improves the overall performance of the cluster.
3. Failover: compared with DNS load balancing, a reverse proxy removes failed nodes faster; when monitoring finds a back-end server failing, it can notify the reverse proxy in time to delete it immediately.
4. Reasonable task allocation: neither HTTP redirection nor DNS load balancing achieves real load balancing, since the scheduler cannot allocate tasks by the actual load of the back-end servers. A reverse proxy supports manually setting a weight for each back-end server; we can set different weights according to each server's configuration, and different weights lead to different probabilities of being chosen by the scheduler.
Disadvantages:
1. Heavy pressure on the dispatcher: since all requests are processed by the reverse proxy first, once the request volume exceeds the proxy's maximum load, the drop in the proxy's throughput directly reduces the overall performance of the cluster.
2. Limited scalability: when the back end cannot handle the throughput you can add back-end servers, but not indefinitely, because the scheduling server's maximum throughput becomes the ceiling.
3. Sticky sessions: a reverse proxy can cause a problem. If a back-end server handled a user's request and saved the user's session or cached data, there is no guarantee that the user's next request lands on the same server; if another server handles it, the earlier session or cache cannot be found.
Solution 1: change the reverse proxy's task-allocation strategy to key on the user's IP, so the same user IP is always handled by the same back-end server, avoiding the sticky-session problem.
Solution 2: record the handling server's ID in a Cookie; when the request is submitted again, the dispatcher assigns it to the server marked in the Cookie.

(4) IP load balancing.
1. Load balancing through NAT: response messages are generally large and every packet needs NAT translation, so the scheduler becomes a bottleneck under heavy traffic.
2. Load balancing through direct routing.
3. VS/TUN: virtual server via IP tunneling.
Advantages: IP load balancing distributes data inside the kernel, so its processing performance is better than reverse-proxy balancing.
Disadvantages: the NIC bandwidth of the load-balancing server becomes the bottleneck. Scenario: an application that normally runs at around 500M can exceed 1G at the evening peak, while mainstream server NICs are gigabit, so traffic beyond 1G clearly causes packet loss, and you cannot stop the business just to replace the network card.

(5) Load balancing at the data link layer.
On Linux, the data-link-layer solution is NIC bonding: multiple network cards are bound into one logical network card that serves jointly, preventing the bandwidth of the load-balancing server's NIC from becoming the bottleneck. It is currently the most widely used load-balancing method for large websites.
Linux bonding has 7 modes (mode=0~6): balance-rr (round-robin), active-backup, balance-xor, broadcast, 802.3ad (dynamic link aggregation), balance-tlb (adaptive transmit load balancing), and balance-alb (adaptive load balancing).

Talk about how you optimize the database (sql, table design) and what are the restrictions on the use of indexes (index failure)

a. Choose the most suitable field types: when creating tables, set the width of the columns as small as possible for better performance; another way to improve efficiency is to declare columns NOT NULL where possible.
b. Use JOINs instead of sub-queries.
c. Use UNION instead of manually created temporary tables.
d. Transactions:
a) Either every statement in the block succeeds or every statement fails, which maintains the consistency and integrity of the data. A transaction starts with the BEGIN keyword and ends with COMMIT; if a SQL operation fails in between, the ROLLBACK command restores the database to the state before BEGIN.
b) When multiple users use the same data source concurrently, transactions can use locking to give each user safe access and ensure that a user's operations are not interfered with by other users.
e. Reduce table joins; add redundant fields where justified.
f. Use foreign keys: locking tables maintains data integrity, but it cannot guarantee referential integrity between tables; foreign keys can.
g. Use indexes.
h. Optimize query statements.
i. Clustering.
j. Read-write separation.
k. Master-slave replication.
l. Table sharding.
m. Database sharding.
o. Use stored procedures where appropriate.

Restrictions (when indexes fail): prefer full-value matches; follow the leftmost-prefix rule, i.e. the query must start from the leftmost column of the index and must not skip columns in the index; do no computation on indexed columns; a range condition invalidates the index for every column to its right; != and IS (NOT) NULL or OR conditions can prevent index use; a LIKE pattern that starts with the wildcard % degrades into a full table scan; a string written without single quotes causes implicit type conversion and invalidates the index.

Talk about Redis cache, its data type, the difference between it and other caches, and its solutions for persistence, cache penetration and avalanche

Redis is an in-memory data-structure store: a key-value, non-relational, persistable database. Compared with on-disk databases (whose data lives mainly on the hard drive), it is high-performance, so we generally use Redis as a cache; and because Redis supports rich data types, it easily solves a variety of problems, so it can also serve as a registry, database, cache, or message middleware. A Redis value supports 5 data types: string, hash, list, set, and zset (sorted set).

String type: one key corresponds to one value.
Hash type: the key is a string and the value is a map (field-value pairs), suitable for storing objects.
List type: a linked list of strings (doubly linked) in insertion order; the main commands are LPUSH and RPUSH, and it supports search and traversal from either end.
Set type: an unordered collection of strings whose members are unique, with no duplicate data; the underlying implementation is essentially a hashmap whose values are always null.
Zset type: basically the same as set, but each element is associated with a double score by which the members are sorted, so insertion keeps order.
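
A small sketch of the five value types using the Jedis client (assuming a local Redis on the default port; the key names are made up):

import redis.clients.jedis.Jedis;

public class RedisTypesDemo {
    public static void main(String[] args) {
        try (Jedis jedis = new Jedis("localhost", 6379)) {
            jedis.set("user:1:name", "tom");          // string: one key, one value
            jedis.hset("user:1", "age", "20");        // hash: object-like fields
            jedis.lpush("news:latest", "a", "b");     // list: insertion order kept
            jedis.sadd("tags", "java", "redis");      // set: unique, unordered
            jedis.zadd("rank", 99.5, "tom");          // zset: sorted by score
        }
    }
}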

The difference between Memcache and redis:

Data types supported: Redis supports not only simple k/v data but also list, set, zset, hash, and other data structures; Memcache supports only simple k/v data, where both key and value are strings.
Reliability: Memcache does not support persistence, so data disappears after a power failure or restart, though its stability is guaranteed; Redis supports data persistence and recovery and tolerates single points of failure, but pays a price in performance.
Performance: for storing big data, Memcache's performance is higher than Redis's.

Application scenarios:

Memcache: suitable for read-heavy, write-light workloads with large data volumes (e.g. official-site article content).
Redis: suitable for systems that demand high read/write efficiency, complex data processing, or high security requirements.
Case: distributed systems have the problem of session sharing; when implementing single sign-on, we use Redis to simulate session sharing, storing user information so that different systems share the session.

There are two ways to persist Redis:

RDB (semi-persistent mode): according to the configuration, the in-memory data is asynchronously persisted to disk as a snapshot, producing a dump.rdb file (a binary file); this is Redis's default persistence mode, configured in redis.conf.
Advantages: only one file is involved, and transferring a single file to other storage media is practical for backup and disaster recovery.
Disadvantages: if the system goes down before the persistence point, the data not yet persisted is lost.

AOF (full persistence mode): on every data change, the executed command is appended to the appendonly.aof file via the write() function. Redis does not enable this mode by default; change appendonly no to appendonly yes in the configuration file (redis.conf).

Advantages: high data safety; the log file is written in append mode, so even a crash during a write does not destroy the content already in the log file.
Disadvantages: for the same dataset, AOF files are usually larger than RDB files, so RDB is faster than AOF when recovering large datasets.

AOF persistence configuration:
The Redis configuration file offers three sync strategies:
appendfsync always: call fsync to flush to the aof file on every data modification; very slow, but safe.
appendfsync everysec: call fsync once per second; very fast, but up to one second of data can be lost; recommended, balancing speed and safety.
appendfsync no: never sync proactively and rely on the OS to flush; fastest, but the least safe.

The difference between the two persistence modes:
AOF is usually slower than RDB in operating efficiency; the per-second sync strategy is relatively efficient, and with syncing disabled AOF is about as efficient as RDB.
If cache-data safety matters (such as a shopping cart in a project), use AOF persistence; if you need high efficiency on large datasets, use the default (RDB). The two persistence modes can also be used at the same time.

Redis-cluster uses a center-less structure: every node stores data and the state of the whole cluster, and every node is connected to all the other nodes. In production we use a redis-cluster cluster, built by the company's operations team.

In our project there are 6 Redis nodes: 3 masters (to guarantee Redis's voting mechanism) and 3 slaves (for high availability); each master server has a slave server as a backup machine. All nodes are connected to each other through the PING-PONG mechanism; a client only needs to connect to any one node in the cluster. Redis-cluster has 16384 built-in hash slots, maps all physical nodes onto slots 0-16383, and is responsible for maintaining that mapping.


Redis has transactions. A Redis transaction is a collection of commands that are either all executed or none executed, guaranteeing that the commands in a transaction run in sequence without other commands being interleaved. Redis transactions do not support rollback. A Redis transaction is delimited by the MULTI (begin transaction) and EXEC (end transaction) commands.

Cache penetration: a cache lookup generally uses a key to find a value; on a miss it falls through to the database. If the value for that key does not exist in the database either, and there are large concurrent requests for the key, great pressure lands on the database. This is called cache penetration.

Solutions:
1. Store all parameters that could possibly be queried in hashed form, check them at the control layer first, and discard requests that do not match.
2. Hash all possible data into a sufficiently large bitmap; data that cannot exist is intercepted by the bitmap, avoiding query pressure on the underlying storage system.
3. If a query returns empty (whether the data does not exist or the system failed), still cache the empty result, but with a very short expiration time, no more than five minutes.
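
A sketch of option 3 with Jedis (the key naming, TTLs, and the DAO are assumptions; the 5-minute cap is from the text):

import redis.clients.jedis.Jedis;

public class CacheNullDemo {
    // hypothetical DAO: returns null when the row does not exist
    static String loadFromDb(String id) { return null; }

    static String get(Jedis jedis, String id) {
        String key = "user:" + id;
        String cached = jedis.get(key);
        if (cached != null) {
            return cached.isEmpty() ? null : cached; // "" marks a cached miss
        }
        String value = loadFromDb(id);
        if (value == null) {
            jedis.setex(key, 300, "");   // cache the empty result briefly (<= 5 minutes)
            return null;
        }
        jedis.setex(key, 3600, value);    // normal result with a longer TTL
        return value;
    }
}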

Cache avalanche: when the cache server restarts or a large number of cache entries expire within the same period, massive cache misses occur at once, so the database is hit hard at the moment of failure and all queries land on it. There is no perfect solution, but you can analyze user behavior and try to spread expiration times evenly; most system designers also use locks or queues to guarantee single-threaded (single-process) cache writes, so that a failure does not let a flood of concurrent requests fall onto the underlying storage system.

Solutions:
1. After a cache entry expires, control the number of threads that read the database and write the cache through locking or queueing; for example, allow only one thread to query the data and write the cache for a given key while other threads wait.
2. Update the cache in advance through a cache-reload mechanism, manually triggering the load before a large wave of concurrent access is expected.
3. Set different expiration times for different keys to spread the cache-expiration moments as evenly as possible.
4. Use a second-level (double) cache strategy: A1 is the original cache and A2 the copy; when A1 expires, A2 is consulted; A1's expiration time is set short and A2's long.

Redis security mechanism (how does your company consider the security of redis?)

Vulnerability: Redis binds to 0.0.0.0:6379 by default, exposing the service to the public network; if authentication is not enabled, any user who can reach the target server can access Redis and read its data without authorization. Attackers have used this unauthorized access to write their SSH public key onto the Redis server and then log in to the target host directly with the corresponding private key.

Solutions:
1. Disable some high-risk commands: modify redis.conf to forbid remotely modifying the DB file location.
2. Run the Redis service with low privileges: create a separate user and home directory for it, and configure that account to disallow login.
3. Add password authentication: modify redis.conf and add requirepass mypassword.
4. Forbid external access: modify redis.conf, adding or changing bind 127.0.0.1 so Redis serves only the local host.
5. Monitor logs to detect attacks in time.

Redis sentinel mechanism (introduced in Redis 2.6)
The sentinel provides:
Monitoring: checks whether the master and slave databases are operating normally.
Notification: when a monitored Redis instance has a problem, the sentinel notifies administrators or other applications through an API.
Automatic failover: when the master database fails, a slave database is automatically promoted to master.
If the master has a password set, remember to configure the access password in the sentinel configuration file (sentinel.conf).

Key expiration in Redis
Redis can set a time-to-live on a key with the EXPIRE command; when the time is up, Redis deletes the key automatically.
Application scenarios:
time-limited promotional information;
data that must be refreshed in time, such as leaderboards;
phone verification codes;
limiting the visit frequency of website visitors.

Talk about the difference between ActiveMQ and other messaging middleware

The principle of activemq

Principle: producers produce messages and send them to ActiveMQ. ActiveMQ receives each message, checks how many consumers there are, and forwards the message to the consumers; the producer plays no part in this process. How a consumer processes a received message has nothing to do with the producer.

Compare RabbitMQ

RabbitMQ's protocol is AMQP, while ActiveMQ uses the JMS protocol. As the name implies, JMS is a messaging standard for the Java ecosystem: both ends of the queue must be JVMs. So if the development environment is Java, ActiveMQ is recommended, and you can transfer Java objects such as Map, Blob (binary big data), and Stream. AMQP, by contrast, is general-purpose and often used in non-Java environments, where the transmitted content is plain strings. RabbitMQ is more troublesome to install; ActiveMQ can be used straight after unpacking, with no installation at all.

Compare Kafka

Kafka's performance exceeds traditional MQ tools such as ActiveMQ, and its clusters scale well.
Its disadvantages are:
1. Messages may be duplicated during transmission.
2. Send order is not guaranteed (across partitions).
3. Some traditional MQ features are missing, such as message transactions. So Kafka is usually used for big-data log processing.

Compare Redis

Redis itself can implement a message queue with its List type, but the features are few and performance drops sharply when the queue gets large. It is usable for scenarios with small data volumes and simple business.

How to solve the problem of message duplication

Message duplication means a consumer receives the same message more than once. Generally, handling this comes down to two points:

. The message is not lost (handled above)
. The message is not executed repeatedly.
Usually we add a table on the business side to record whether a message has been processed successfully. After each business transaction commits, the server is notified that the message has been handled, so even if the message is resent, it is not processed twice.

The general flow: the business-side table records the ids of processed messages; each time a message comes in, we check whether its id has already been executed. If it has, the message is discarded; if not, the message is executed and its id is recorded.
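
A sketch of that dedup-table idea (the table name and JDBC wiring are assumptions, and INSERT IGNORE is MySQL syntax); a unique key on the message id makes the check-and-insert safe under concurrency:

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class IdempotentConsumer {

    // returns true only the first time this message id is seen
    static boolean markProcessed(Connection conn, String messageId) throws SQLException {
        // processed_message(id) has a primary/unique key on id
        try (PreparedStatement ps = conn.prepareStatement(
                "INSERT IGNORE INTO processed_message (id) VALUES (?)")) {
            ps.setString(1, messageId);
            return ps.executeUpdate() == 1; // 0 rows inserted means duplicate
        }
    }

    static void onMessage(Connection conn, String messageId, String body) throws SQLException {
        if (!markProcessed(conn, messageId)) {
            return; // duplicate delivery: discard
        }
        // execute the business logic in the same transaction as the insert above
    }
}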

Talk about the solution to the asynchronous communication problem of distributed transactions

Problem: after a message is sent, the sender does not wait in place for the receiver, whatever the outcome; only when the receiving end pushes back a receipt message does the sender learn the result. But it is also possible that after the message is sent there is simply no news. A mechanism is needed to compensate for this uncertainty.

Solution:
You have many pen pals and letters usually go back and forth, but occasionally a letter gets no reply. For this occasional situation you can choose between two strategies. One is to set an alarm when you send a letter, scheduled for one day later, to ask whether the other party received it. The other is to set aside a time every night to check all letters that were sent but have had no reply for a day, and then call each person to ask.

The first strategy is implemented as a delay queue; the second as periodic polling and scanning.

The difference between them is that the delay queue is more precise, but if the period is long, tasks stay in the delay queue for a very long time and bloat the queue, for example a to-do reminder several days away or a birthday reminder. For such long-period events, which do not need to be precise to the minute or second, periodic scanning is the better fit, especially for the large, performance-hungry scans, which can be scheduled to run at night.

Talk about single sign-on and distributed session cross-domain issues

Single sign-on means that after logging in to one module among mutually trusting system modules, the other modules pass authentication without logging in again.
We use the CAS single sign-on framework. CAS has two parts: client and server. The server is a web project deployed in Tomcat where the user authentication is performed; every time a system module is accessed, it goes to CAS to get a ticket, and access continues once verification passes. From the CAS server's point of view, the application modules we access are CAS clients.

  What is cross-domain?

In an asynchronous request, if any of the protocol, IP address, or port of the requested address differs from the current site, cross-domain access is involved. When do cross-domain issues arise? Only with front-end asynchronous requests.
Solutions:
1. jQuery provides a JSONP implementation.
2. The W3C standard provides the CORS (cross-origin resource sharing) solution.

With CAS, every application in the project that requires login configures a filter in web.xml that forwards requests to CAS. After the CAS login completes, CAS sends a ticket to the browser, which caches it in a cookie. When logging in to another application, the browser's ticket is forwarded to CAS, and CAS judges from the ticket whether the user is logged in.

 

Talk about the functions of linux commands awk, cat, sort, cut, grep, uniq, wc, top, find, sed, etc.

awk: in contrast to sed, which usually processes a whole line, awk prefers to split a line into several "fields" to process, so awk is well suited to small-scale data processing.
cat: mainly used to view file contents, create files, concatenate files, and append file content.
sort: sorts text; by default it sorts whole lines.
cut: extracts columns from a text file or text stream.
grep: a powerful text-search tool that can search text with regular expressions and print the matching lines.
uniq: removes duplicate lines; it only de-duplicates adjacent lines.
wc: counts a file's lines, words, and bytes/characters.
top: monitors the state of the Linux system, such as CPU and memory usage.
find: searches the file directory hierarchy.
sed: a stream editor; it processes content one line at a time.

Talk about what deadlock is and how to solve it; table-level and row-level locks; pessimistic and optimistic locks; and thread synchronization locks

Deadlock, by way of a joke: at an interview the interviewer says, "Tell me what a deadlock is and I will let you join the company." You answer, "Let me join the company and I will tell you what a deadlock is."

Mutual exclusion: a resource cannot be shared and can only be used by one process at a time.
Hold and wait: a process has already obtained some resources, and while blocked requesting other resources, it keeps holding what it has.
No preemption: some system resources are not preemptible; once a process has obtained such a resource, the system cannot forcibly reclaim it, and only the process itself releases it when done.
Circular wait: several processes form a circular chain in which each holds a resource that the next one in the chain is requesting.

(1) Deadlock prevention: breaking any one of the necessary conditions for deadlock prevents it. For example, requiring users to request all the resources they need at once breaks the hold-and-wait condition; layering resources, so that resources of the next layer can only be requested after obtaining those of the upper layer, breaks the circular-wait condition. Prevention usually reduces system efficiency.
(2) Deadlock avoidance: on each resource request, the process judges whether granting it is safe, for example with the banker's algorithm. Running a deadlock-avoidance algorithm adds system overhead.
(3) Deadlock detection: prevention and avoidance are both up-front measures, while detection determines whether the system is currently in a deadlocked state and, if so, executes a release strategy.
(4) Deadlock release: used together with detection; the method is deprivation, i.e. forcibly reclaiming the resources owned by some process and allocating them to other processes.

Table-level locks: low overhead, fast locking; no deadlocks (because MyISAM acquires all the locks a SQL statement needs at once); the largest lock granularity, the highest probability of lock conflict, and the lowest concurrency.
Row-level locks: high overhead, slower locking; deadlocks can occur; the smallest lock granularity, the lowest probability of lock conflict, and the highest concurrency.

Pessimistic lock: always assume the worst case. Every time you fetch the data, you assume someone else will modify it, so you lock it each time; anyone else wanting the data blocks until they get the lock. Traditional relational databases use many such locking mechanisms, such as row locks, table locks, read locks, and write locks, all acquired before the operation. Java's synchronized primitive is also implemented as a pessimistic lock. In SQL it is realized with SELECT ... FOR UPDATE.

Optimistic lock: as the name suggests, very optimistic. Every time you fetch the data you assume others will not modify it, so you do not lock; when updating, you check whether anyone else updated the data in the meantime, using a mechanism such as a version number. Optimistic locking suits read-heavy applications and can improve throughput. Database mechanisms like write_condition are in fact optimistic locks. In Java, the atomic classes under java.util.concurrent.atomic are implemented using CAS, an implementation of optimistic locking; in a database it can be realized with a version field.
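
A tiny sketch of the CAS-style optimistic update with AtomicInteger: read the current value, attempt the swap, and retry on interference instead of blocking.

import java.util.concurrent.atomic.AtomicInteger;

public class CasDemo {
    public static void main(String[] args) {
        AtomicInteger stock = new AtomicInteger(10);
        int current;
        do {
            current = stock.get();   // optimistic read, no lock taken
            if (current == 0) break; // nothing left to decrement
            // compareAndSet fails if another thread changed the value meanwhile
        } while (!stock.compareAndSet(current, current - 1));
        System.out.println("stock = " + stock.get());
    }
}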

Synchronization lock:
Scenario: in development, when we meet time-consuming operations, we put the time-consuming logic into child threads to avoid freezing the application. Two threads execute two tasks separately and at the same time: parsing files simultaneously and inserting the parsed data into the database simultaneously. Since several tables are inserted concurrently, insertion bugs are easy to trigger.

Using synchronized:
Declare the method as a synchronized method: if one thread is executing it and another call arrives, the caller waits until the method finishes. wait() releases the held object lock and puts the thread into the waiting pool.

The difference:
synchronized is implemented at the JVM level, so the system monitors lock release for you, while ReentrantLock is implemented in code; the system cannot release it automatically, and you must explicitly release the lock in a finally clause with lock.unlock().
Under relatively low concurrency, synchronized is a good choice; under high concurrency its performance degrades severely, and ReentrantLock becomes the better solution.
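
A minimal ReentrantLock sketch matching the point above: the unlock must sit in a finally block because the JVM will not release this lock for you.

import java.util.concurrent.locks.ReentrantLock;

public class Counter {
    private final ReentrantLock lock = new ReentrantLock();
    private int count = 0;

    public void increment() {
        lock.lock();
        try {
            count++;           // critical section
        } finally {
            lock.unlock();     // must release explicitly, even on exception
        }
    }
}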

Talk about how to speed up access and how to optimize program performance

Speed up access:
Hardware: increase network bandwidth and server memory.
Code: static pages, caching, SQL optimization, index creation, and so on.

System performance is two things:
Throughput: the number of requests or tasks that can be processed per second.
Latency: the system's delay in processing one request or task.
The better the latency, the higher the throughput that can be supported, because short latency means fast processing and therefore more requests handled.

Improve throughput: distributed clusters, decoupled modules, design patterns.
Improve latency: asynchronous communication.

 

Talk about cache design and optimization, cache and database consistency synchronization solutions

1. Reduce back-end load: cache the results of high-cost SQL (joined result sets, grouped statistics).
2. Speed up request response.
3. Merge large numbers of writes into batch writes: e.g., a counter accumulates in Redis first and is then written to the DB in batches.
4. Timeout-based eviction: e.g., expire.
5. Active update: the application controls the life cycle (eventual consistency, with a relatively short time window).
6. Cache empty objects.
7. Bloom filter interception.
8. Efficiency of the commands themselves: e.g., SQL optimization, command optimization.
9. Network round trips: reduce the number of communications.
10. Reduce access cost: long connections/connection pools, NIO, etc.
11. IO access merging.
Purpose: reduce the number of cache rebuilds, keep the data as consistent as possible, and reduce potential risks.
Solution:
1. Mutex lock (set with nx and ex, i.e., setnx + expire):
If set(nx and ex) returns true, no other thread is rebuilding the cache at this moment, so the current thread executes the cache-building logic.
If set(nx and ex) returns false, another thread is already rebuilding the cache; the current thread then sleeps for a specified time (e.g., 50 ms, depending on how fast the cache can be built) and re-executes the lookup until it obtains the data.
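A minimal sketch of this mutex approach using the Jedis client (the key names, the 30 s lock TTL, the 50 ms sleep, and the loadFromDb helper are illustrative assumptions; the exact set(...) signature varies slightly across Jedis versions):

    import redis.clients.jedis.Jedis;
    import redis.clients.jedis.params.SetParams;

    public class MutexCache {
        public String get(Jedis jedis, String key) throws InterruptedException {
            while (true) {
                String value = jedis.get(key);
                if (value != null) {
                    return value;                          // cache hit
                }
                String lockKey = "mutex:" + key;
                // NX: set only if absent; EX: the lock expires on its own after 30 s
                if ("OK".equals(jedis.set(lockKey, "1",
                        SetParams.setParams().nx().ex(30)))) {
                    try {
                        value = loadFromDb(key);           // rebuild the cache entry
                        jedis.setex(key, 300, value);      // write back with a TTL
                        return value;
                    } finally {
                        jedis.del(lockKey);                // release the mutex
                    }
                }
                Thread.sleep(50);   // another thread is rebuilding; wait and retry
            }
        }

        private String loadFromDb(String key) { return "value-for-" + key; }
    }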

2. Never expire (logical expiration):
A hot key is simply a key with very high concurrent access whose cache takes a long time to rebuild. If a real expiration time is set on it, then the moment it expires the huge volume of requests crushes the database. So instead, add a logical expiration time field to the hot key's value. On each concurrent access, check whether this logical field is earlier than the current time; if so, the cache needs updating. All threads are still allowed to read the old cache (no physical expiration is set on it), while a single extra thread is started to rebuild the cache. Once the rebuild succeeds, i.e., after the Redis set operation is performed, all threads read the new content from the rebuilt cache.

From the caching point of view, no expiration time is actually set, so none of the problems of an expiring hot key occur; the key is "physically" non-expiring.
From the functional point of view, each value carries a logical expiration time; when it is found to be exceeded, a separate thread is used to rebuild the cache.
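A minimal in-process sketch of the logical-expiration idea (the CachedValue wrapper, the single global rebuilding flag, and loadFromDb are illustrative simplifications; a production version would track the flag per key):

    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.atomic.AtomicBoolean;

    public class LogicalExpireCache {
        // Value plus its logical expiration timestamp; no physical TTL anywhere.
        static class CachedValue {
            final String data;
            final long logicalExpireAt; // epoch millis
            CachedValue(String data, long logicalExpireAt) {
                this.data = data;
                this.logicalExpireAt = logicalExpireAt;
            }
        }

        private final ConcurrentHashMap<String, CachedValue> cache = new ConcurrentHashMap<>();
        private final AtomicBoolean rebuilding = new AtomicBoolean(false);

        public String get(String key) {
            CachedValue v = cache.get(key);
            if (v != null && v.logicalExpireAt < System.currentTimeMillis()
                    && rebuilding.compareAndSet(false, true)) {
                // Logically expired: one thread rebuilds; everyone else keeps
                // serving the stale value in the meantime.
                new Thread(() -> {
                    try {
                        String fresh = loadFromDb(key); // hypothetical loader
                        cache.put(key, new CachedValue(fresh,
                                System.currentTimeMillis() + 60_000));
                    } finally {
                        rebuilding.set(false);
                    }
                }).start();
            }
            return v == null ? null : v.data; // may be stale, never blocks
        }

        private String loadFromDb(String key) { return "value-for-" + key; }
    }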

Consistency issues:
1. Delete the cache first, then update the database. If deleting the cache fails, do not update the database. If the cache is deleted successfully but the database update fails, a subsequent query simply reads the old data from the database, so the database and cache remain consistent.
2. Check whether the data is in the cache; if not, check the queue to see whether an update of the same data is already in progress. If one is found, do not enqueue a new update operation; instead poll the cache in a loop (for about 200 ms), re-send the request to the queue if necessary, and then wait synchronously until the cache update completes.
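A minimal sketch of option 1, delete the cache first and then update the database (jedis and updateDatabase are illustrative stand-ins):

    import redis.clients.jedis.Jedis;

    public class CacheConsistency {
        public void updateUser(Jedis jedis, long userId, String newName) {
            try {
                jedis.del("user:" + userId);     // step 1: delete the cached copy
            } catch (Exception e) {
                return;                          // deletion failed: skip the DB update
            }
            // Step 2: update the DB. If this fails, the next read simply reloads
            // the old row from the DB, so cache and DB still agree.
            updateDatabase(userId, newName);
        }

        private void updateDatabase(long userId, String newName) { /* DAO call */ }
    }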

Talk about the message queue and how to deal with repeated consumption of messages, and what to do if consumers cannot receive messages

What is a message queue?
It is a container that holds messages while they are in transit.

What problem does the message queue solve?
Asynchronous, parallel, decoupling, queuing

Messaging modes?
Publish/subscribe, point-to-point

1. Repeated consumption: a queue supports multiple consumers, but each individual message can be consumed by only one consumer.
2. Message loss:
1. Use persistent messages.
2. Process non-persistent messages promptly so they do not accumulate.
3. Use transactions: after a transaction is started, the commit() method waits for the server to respond, so the connection is not closed prematurely and the message is not lost.
3. Message redelivery: a message is redelivered to the client when:
1. A transacted session is used and rollback() is called.
2. A transacted session is closed before commit() is called.
3. The session uses the CLIENT_ACKNOWLEDGE acknowledgment mode and Session.recover() is called (see the sketch after this list).
4. The client connection times out (perhaps the code being executed takes longer than the configured timeout).
4. Not consumed: look in ActiveMQ.DLQ.
What is ActiveMQ.DLQ?
1. Once a message's redelivery attempts exceed the maximum number configured in the redelivery policy, a "poison ack" is sent back to the broker to let it know the message is considered a poison pill. The broker then takes the message and sends it to the dead letter queue so it can be analyzed later.
2. The dead letter queue in ActiveMQ is called ActiveMQ.DLQ; by default all undeliverable messages are sent to this single queue, which makes them difficult to manage.
3. Therefore, you can set an individual dead-letter strategy (individualDeadLetterStrategy) in the destination policy map of the activemq.xml configuration file, which lets you specify a specific dead letter queue prefix per queue or topic.
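A minimal JMS sketch of redelivery case 3 above, CLIENT_ACKNOWLEDGE mode with Session.recover() (the broker URL and queue name are assumptions):

    import javax.jms.*;
    import org.apache.activemq.ActiveMQConnectionFactory;

    public class RecoverDemo {
        public static void main(String[] args) throws JMSException {
            ConnectionFactory cf = new ActiveMQConnectionFactory("tcp://localhost:61616");
            Connection conn = cf.createConnection();
            conn.start();
            // Non-transacted session in which the client acknowledges explicitly.
            Session session = conn.createSession(false, Session.CLIENT_ACKNOWLEDGE);
            MessageConsumer consumer = session.createConsumer(session.createQueue("demo.queue"));
            Message msg = consumer.receive(5000);
            if (msg != null) {
                try {
                    // ... process the message ...
                    msg.acknowledge();   // success: tell the broker it is consumed
                } catch (Exception e) {
                    session.recover();   // failure: unacknowledged messages are redelivered
                }
            }
            conn.close();
        }
    }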

There are two situations in which an MQ consumer fails to receive messages:
1. "Processing failure" here means a RuntimeException thrown in the onMessage method of the MessageListener.
2. There are two related fields in the message header: Redelivered (false by default) and redeliveryCounter (0 by default).
3. The message is first sent by the broker to the consumer, and the consumer invokes the listener. If processing fails, the local redeliveryCounter is incremented and a specific response is given to the broker; the broker increments the redeliveryCounter in its copy of the message and redelivers it after a short delay (1 s by default). After more than 6 failures, another specific response is given to the broker, and the broker sends the message straight to the DLQ.
4. If it fails twice and the consumer restarts, the broker pushes the message again with redeliveryCounter = 2, so locally it can only be retried 4 more times before entering the DLQ.
5. The retry response is sent to the broker, which sets the message's Redelivered flag to true in memory and increments redeliveryCounter, but neither field is persisted, i.e., the stored message record is not modified. Both fields are therefore reset to their defaults when the broker restarts (see the redelivery-policy sketch below).
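A hedged sketch of tuning these limits on the client with ActiveMQ's RedeliveryPolicy (the values shown just mirror the defaults mentioned above):

    import org.apache.activemq.ActiveMQConnectionFactory;
    import org.apache.activemq.RedeliveryPolicy;

    public class RedeliveryConfig {
        public static ActiveMQConnectionFactory factory() {
            ActiveMQConnectionFactory cf =
                    new ActiveMQConnectionFactory("tcp://localhost:61616");
            RedeliveryPolicy policy = cf.getRedeliveryPolicy();
            policy.setInitialRedeliveryDelay(1000); // 1 s delay, as noted above
            policy.setMaximumRedeliveries(6);       // beyond this, off to the DLQ
            return cf;
        }
    }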

 

Talk about the difference between SOA and distributed, what should I do if the zookeeper or activeMQ service hangs

The difference between SOA and distributed?
SOA splits the project into two projects: the service layer and the presentation layer. The service layer contains business logic and only needs to provide external services. The presentation layer only needs to
process the interaction with the page, and the business logic is implemented by calling the services of the service layer.
Distributed is viewed mainly from the deployment perspective: applications are grouped and deployed according to access pressure. The main goal is to make full use of server resources and avoid uneven resource allocation.

What should I do if the activeMQ service is down?
1. Under normal circumstances, non-persistent messages are stored in memory and persistent messages in files; their maximum limits are configured in the <systemUsage> node of the configuration file. When non-persistent messages accumulate to the point where memory runs low, ActiveMQ writes the in-memory non-persistent messages to a temporary file to free memory. Although both are then kept in files, the difference from persistent messages is that persistent messages are restored from the file after a restart, while the temporary files of non-persistent messages are simply deleted.
2. For high availability, build an ActiveMQ cluster.

 
What if the zookeeper service is down?
The registry is a peer-to-peer cluster; when any node goes down, clients automatically switch to another one.
If the whole registry goes down, service providers and consumers can still communicate through their local caches.
Service providers are stateless; any single one going down does not affect usage.
If all service providers go down, service consumers cannot use the service and will keep reconnecting, waiting for a provider to recover.

 

Talk about the auxiliary classes of JUC

ReentrantReadWriteLock: read-write lock
CountDownLatch: countdown latch (threads wait until a count reaches zero)
CyclicBarrier: cyclic barrier (threads wait for one another at a barrier point, reusable)
Semaphore: semaphore (limits concurrent access through permits)
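
For example, a minimal CountDownLatch sketch (the worker count is arbitrary): the main thread blocks in await() until every worker has called countDown().

    import java.util.concurrent.CountDownLatch;

    public class LatchDemo {
        public static void main(String[] args) throws InterruptedException {
            int workers = 3;
            CountDownLatch latch = new CountDownLatch(workers);
            for (int i = 0; i < workers; i++) {
                final int id = i;
                new Thread(() -> {
                    System.out.println("worker " + id + " done");
                    latch.countDown();        // decrease the count by one
                }).start();
            }
            latch.await();                    // block until the count hits zero
            System.out.println("all workers finished");
        }
    }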