An article to understand SuperEdge distributed health check (cloud)

Du Yanghao, a senior engineer at Tencent Cloud, is passionate about open source, containers and Kubernetes. He currently works mainly on image registries, Kubernetes cluster high availability, backup and restore, and edge computing.

Preface

The SuperEdge distributed health check consists of edge-health-daemon on the edge side and edge-health-admission on the cloud side:

  • edge-health-daemon: performs distributed health checks among the edge nodes in the same region and reports the voting result on each node's health to the apiserver (as a node annotation)
  • edge-health-admission: continuously adjusts, based on the node's edge-health annotation, the node taints set by kube-controller-manager (removing the NoExecute taint) and the endpoints (moving pods on disconnected nodes from the endpoint subsets' notReadyAddresses to addresses), so that the cloud and the edge jointly determine node status

The overall structure is as follows:

The edge-health-admission cloud component exists because, when the cloud and the edge are disconnected, kube-controller-manager does the following:

  • Sets the Ready condition of the disconnected node to ConditionUnknown and adds the NoSchedule and NoExecute taints
  • Removes the pods on the disconnected node from the Endpoints list of their Service

When edge-health-daemon decides from the edge-side health check that a node is healthy, it updates the node to remove the NoExecute taint. Even after that update succeeds, kube-controller-manager reverts it by adding the NoExecute taint again. A Kubernetes mutating admission webhook, edge-health-admission, is therefore needed to intercept and adjust the changes that kube-controller-manager makes to the node API resource, which is what finally makes the distributed health check effective.

Before diving into the source code, let's introduce Kubernetes Admission Controllers

An admission controller is a piece of code that intercepts requests to the Kubernetes API server prior to persistence of the object, but after the request is authenticated and authorized. The controllers consist of the list below, are compiled into the kube-apiserver binary, and may only be configured by the cluster administrator. In that list, there are two special controllers: MutatingAdmissionWebhook and ValidatingAdmissionWebhook. These execute the mutating and validating (respectively) admission control webhooks which are configured in the API.

Kubernetes admission controllers are part of how kube-apiserver processes API requests. They are invoked after a request has been authenticated and authorized and before the object is persisted, in order to validate the request, modify it, or both.

Kubernetes includes many admission controllers, most of which are compiled into kube-apiserver. MutatingAdmissionWebhook and ValidatingAdmissionWebhook are special: they call externally deployed mutating admission webhooks and validating admission webhooks, respectively.

Admission webhooks are HTTP callbacks that receive admission requests and do something with them. You can define two types of admission webhooks, validating admission webhook and mutating admission webhook. Mutating admission webhooks are invoked first, and can modify objects sent to the API server to enforce custom defaults. After all object modifications are complete, and after the incoming object is validated by the API server, validating admission webhooks are invoked and can reject requests to enforce custom policies.

Admission webhooks are HTTP callback services that receive AdmissionReview requests and process them. They fall into two classes according to how they process a request:

  • Validating admission webhook: configured through a ValidatingWebhookConfiguration; it performs an admission check on the API request but cannot modify the request object
  • Mutating admission webhook: configured through a MutatingWebhookConfiguration; it performs an admission check on the API request and may modify the request object

Both types of webhook define the following fields, most of which control which requests are matched:

  • admissionReviewVersions: lists the AdmissionReview API versions the webhook accepts (the API server sends the first AdmissionReview version in the admissionReviewVersions list that it supports)
  • name: webhook name (if multiple webhooks are defined in a WebhookConfiguration, the uniqueness of the name needs to be guaranteed)
  • clientConfig: defines the access address (url or service) of the webhook server and the CA bundle (optionally include a custom CA bundle to use to verify the TLS connection)
  • namespaceSelector: a labelSelector that restricts matching by the labels of the namespace of the requested resource
  • objectSelector: a labelSelector that restricts matching by the labels of the requested resource itself
  • rules: restricts matching by the operations, apiGroups, apiVersions, resources and resource scope of the request, as follows:
    • operations: specifies a list of requested operations (Can be "CREATE", "UPDATE", "DELETE", "CONNECT", or "*" to match all.)
    • apiGroups: specifies the list of API groups requesting resources ("" is the core API group. "*" matches all API groups.)
    • apiVersions: specifies the list of API versions of the requested resource ("*" matches all API versions.)
    • resources: specifies the list of requested resource types (nodes, deployments, etc.)
    • scope: specifies the scope of the requested resource (Cluster, Namespaced or *)
  • timeoutSeconds: specifies the timeout period for webhook response. If the timeout expires, it will be processed according to the failurePolicy
  • failurePolicy: specifies the apiserver's processing strategy for failure of admission webhook requests:
    • Ignore: means that an error calling the webhook is ignored and the API request is allowed to continue.
    • Fail: means that an error calling the webhook causes the admission to fail and the API request to be rejected.
  • matchPolicy: specifies how the rules are used to match incoming API requests, as follows:
    • Exact: a request matches only if it exactly matches an entry in the rules list
    • Equivalent: a request matches if it modifies a resource listed in rules, even through a different API group or version; the apiserver can convert objects between versions, so the request is converted to a version the rules cover and is then sent to the admission webhook
  • reinvocationPolicy: In v1.15+, to allow mutating admission plugins to observe changes made by other plugins, built-in mutating admission plugins are re-run if a mutating webhook modifies an object, and mutating webhooks can specify a reinvocationPolicy to control whether they are reinvoked as well.
    • Never: the webhook must not be called more than once in a single admission evaluation
    • IfNeeded: the webhook may be called again as part of the admission evaluation if the object being admitted is modified by other admission plugins after the initial webhook call.
  • sideEffects: besides modifying the content of the AdmissionReview, some webhooks also change other resources ("side effects"). sideEffects declares whether the webhook has such side effects; the values are as follows:
    • None: calling the webhook will have no side effects.
    • NoneOnDryRun: calling the webhook will possibly have side effects, but if a request with dryRun: true is sent to the webhook, the webhook will suppress the side effects (the webhook is dryRun-aware).

Here is the MutatingWebhookConfiguration corresponding to edge-health-admission as a reference example:

apiVersion: admissionregistration.k8s.io/v1
kind: MutatingWebhookConfiguration
metadata:
  name: edge-health-admission
webhooks:
- admissionReviewVersions:
  - v1
  clientConfig:
    caBundle: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUNwRENDQVl3Q0NRQ2RaL0w2akZSSkdqQU5CZ2txaGtpRzl3MEJBUXNGQURBVU1SSXdFQVlEVlFRRERBbFgKYVhObE1tTWdRMEV3SGhjTk1qQXdOekU0TURRek9ERTNXaGNOTkRjeE1qQTBNRFF6T0RFM1dqQVVNUkl3RUFZRApWUVFEREFsWGFYTmxNbU1nUTBFd2dnRWlNQTBHQ1NxR1NJYjNEUUVCQVFVQUE0SUJEd0F3Z2dFS0FvSUJBUUNSCnhHT2hrODlvVkRHZklyVDBrYVkwajdJQVJGZ2NlVVFmVldSZVhVcjh5eEVOQkF6ZnJNVVZyOWlCNmEwR2VFL3cKZzdVdW8vQWtwUEgrbzNQNjFxdWYrTkg1UDBEWHBUd1pmWU56VWtyaUVja3FOSkYzL2liV0o1WGpFZUZSZWpidgpST1V1VEZabmNWOVRaeTJISVF2UzhTRzRBTWJHVmptQXlDMStLODBKdDI3QUl4YmdndmVVTW8xWFNHYnRxOXlJCmM3Zk1QTXJMSHhaOUl5aTZla3BwMnJrNVdpeU5YbXZhSVA4SmZMaEdnTU56YlJaS1RtL0ZKdDdyV0dhQ1orNXgKV0kxRGJYQ2MyWWhmbThqU1BqZ3NNQTlaNURONDU5ellJSkVhSTFHeFI3MlhaUVFMTm8zdE5jd3IzVlQxVlpiTgo1cmhHQlVaTFlrMERtd25vWTBCekFnTUJBQUV3RFFZSktvWklodmNOQVFFTEJRQURnZ0VCQUhuUDJibnJBcWlWCjYzWkpMVzM0UWFDMnRreVFScTNVSUtWR3RVZHFobWRVQ0I1SXRoSUlleUdVRVdqVExpc3BDQzVZRHh4YVdrQjUKTUxTYTlUY0s3SkNOdkdJQUdQSDlILzRaeXRIRW10aFhiR1hJQ3FEVUVmSUVwVy9ObUgvcnBPQUxhYlRvSUVzeQpVNWZPUy9PVVZUM3ZoSldlRjdPblpIOWpnYk1SZG9zVElhaHdQdTEzZEtZMi8zcEtxRW1Cd1JkbXBvTExGbW9MCmVTUFQ4SjREZExGRkh2QWJKalFVbjhKQTZjOHUrMzZJZDIrWE1sTGRZYTdnTnhvZTExQTl6eFJQczRXdlpiMnQKUXZpbHZTbkFWb0ZUSVozSlpjRXVWQXllNFNRY1dKc3FLMlM0UER1VkNFdlg0SmRCRlA2NFhvU08zM3pXaWhtLworMXg3OXZHMUpFcz0KLS0tLS1FTkQgQ0VSVElGSUNBVEUtLS0tLQo=
    service:
      namespace: kube-system
      name: edge-health-admission
      path: /node-taint
  failurePolicy: Ignore
  matchPolicy: Exact
  name: node-taint.k8s.io
  namespaceSelector: {}
  objectSelector: {}
  reinvocationPolicy: Never
  rules:
  - apiGroups:
    - '*'
    apiVersions:
    - '*'
    operations:
    - UPDATE
    resources:
    - nodes
    scope: '*'
  sideEffects: None
  timeoutSeconds: 5
- admissionReviewVersions:
  - v1
  clientConfig:
    caBundle: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUNwRENDQVl3Q0NRQ2RaL0w2akZSSkdqQU5CZ2txaGtpRzl3MEJBUXNGQURBVU1SSXdFQVlEVlFRRERBbFgKYVhObE1tTWdRMEV3SGhjTk1qQXdOekU0TURRek9ERTNXaGNOTkRjeE1qQTBNRFF6T0RFM1dqQVVNUkl3RUFZRApWUVFEREFsWGFYTmxNbU1nUTBFd2dnRWlNQTBHQ1NxR1NJYjNEUUVCQVFVQUE0SUJEd0F3Z2dFS0FvSUJBUUNSCnhHT2hrODlvVkRHZklyVDBrYVkwajdJQVJGZ2NlVVFmVldSZVhVcjh5eEVOQkF6ZnJNVVZyOWlCNmEwR2VFL3cKZzdVdW8vQWtwUEgrbzNQNjFxdWYrTkg1UDBEWHBUd1pmWU56VWtyaUVja3FOSkYzL2liV0o1WGpFZUZSZWpidgpST1V1VEZabmNWOVRaeTJISVF2UzhTRzRBTWJHVmptQXlDMStLODBKdDI3QUl4YmdndmVVTW8xWFNHYnRxOXlJCmM3Zk1QTXJMSHhaOUl5aTZla3BwMnJrNVdpeU5YbXZhSVA4SmZMaEdnTU56YlJaS1RtL0ZKdDdyV0dhQ1orNXgKV0kxRGJYQ2MyWWhmbThqU1BqZ3NNQTlaNURONDU5ellJSkVhSTFHeFI3MlhaUVFMTm8zdE5jd3IzVlQxVlpiTgo1cmhHQlVaTFlrMERtd25vWTBCekFnTUJBQUV3RFFZSktvWklodmNOQVFFTEJRQURnZ0VCQUhuUDJibnJBcWlWCjYzWkpMVzM0UWFDMnRreVFScTNVSUtWR3RVZHFobWRVQ0I1SXRoSUlleUdVRVdqVExpc3BDQzVZRHh4YVdrQjUKTUxTYTlUY0s3SkNOdkdJQUdQSDlILzRaeXRIRW10aFhiR1hJQ3FEVUVmSUVwVy9ObUgvcnBPQUxhYlRvSUVzeQpVNWZPUy9PVVZUM3ZoSldlRjdPblpIOWpnYk1SZG9zVElhaHdQdTEzZEtZMi8zcEtxRW1Cd1JkbXBvTExGbW9MCmVTUFQ4SjREZExGRkh2QWJKalFVbjhKQTZjOHUrMzZJZDIrWE1sTGRZYTdnTnhvZTExQTl6eFJQczRXdlpiMnQKUXZpbHZTbkFWb0ZUSVozSlpjRXVWQXllNFNRY1dKc3FLMlM0UER1VkNFdlg0SmRCRlA2NFhvU08zM3pXaWhtLworMXg3OXZHMUpFcz0KLS0tLS1FTkQgQ0VSVElGSUNBVEUtLS0tLQo=
    service:
      namespace: kube-system
      name: edge-health-admission
      path: /endpoint
  failurePolicy: Ignore
  matchPolicy: Exact
  name: endpoint.k8s.io
  namespaceSelector: {}
  objectSelector: {}
  reinvocationPolicy: Never
  rules:
  - apiGroups:
    - '*'
    apiVersions:
    - '*'
    operations:
    - UPDATE
    resources:
    - endpoints
    scope: '*'
  sideEffects: None
  timeoutSeconds: 5
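To make the rules in the configuration above more concrete, here is a rough sketch of how a single rule could be matched against an incoming request. The Rule and Request types and the matches helper are hypothetical simplifications written for this illustration; they are not the apiserver's implementation, and scope, selectors and subresources are ignored.

```go
package main

import "fmt"

// Rule is a simplified, hypothetical stand-in for one entry of a webhook's rules list.
type Rule struct {
	Operations  []string
	APIGroups   []string
	APIVersions []string
	Resources   []string
}

// Request carries the attributes of an incoming API request that a rule is matched against.
type Request struct {
	Operation, Group, Version, Resource string
}

// contains reports whether list holds want or the wildcard "*".
func contains(list []string, want string) bool {
	for _, v := range list {
		if v == "*" || v == want {
			return true
		}
	}
	return false
}

// matches reports whether every dimension of the rule accepts the request.
func matches(r Rule, req Request) bool {
	return contains(r.Operations, req.Operation) &&
		contains(r.APIGroups, req.Group) &&
		contains(r.APIVersions, req.Version) &&
		contains(r.Resources, req.Resource)
}

func main() {
	// Same shape as the node-taint.k8s.io rule: UPDATE on nodes in any group and version.
	rule := Rule{
		Operations:  []string{"UPDATE"},
		APIGroups:   []string{"*"},
		APIVersions: []string{"*"},
		Resources:   []string{"nodes"},
	}
	fmt.Println("UPDATE nodes:", matches(rule, Request{"UPDATE", "", "v1", "nodes"}))
	fmt.Println("CREATE nodes:", matches(rule, Request{"CREATE", "", "v1", "nodes"}))
	fmt.Println("UPDATE pods: ", matches(rule, Request{"UPDATE", "", "v1", "pods"}))
}
```

Under this reading, only node UPDATE requests reach the /node-taint path and only endpoints UPDATE requests reach /endpoint; everything else bypasses the webhook.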

kube-apiserver sends an AdmissionReview (API group: admission.k8s.io, version: v1 or v1beta1) to the webhooks, encoded as JSON. An example follows:

# This example shows the data contained in an AdmissionReview object for a request to update the scale subresource of an apps/v1 Deployment
{
  "apiVersion": "admission.k8s.io/v1",
  "kind": "AdmissionReview",
  "request": {
    # Random uid uniquely identifying this admission call
    "uid": "705ab4f5-6393-11e8-b7cc-42010a800002",

    # Fully-qualified group/version/kind of the incoming object
    "kind": {"group":"autoscaling","version":"v1","kind":"Scale"},
    # Fully-qualified group/version/kind of the resource being modified
    "resource": {"group":"apps","version":"v1","resource":"deployments"},
    # subresource, if the request is to a subresource
    "subResource": "scale",

    # Fully-qualified group/version/kind of the incoming object in the original request to the API server.
    # This only differs from `kind` if the webhook specified `matchPolicy: Equivalent` and the
    # original request to the API server was converted to a version the webhook registered for.
    "requestKind": {"group":"autoscaling","version":"v1","kind":"Scale"},
    # Fully-qualified group/version/kind of the resource being modified in the original request to the API server.
    # This only differs from `resource` if the webhook specified `matchPolicy: Equivalent` and the
    # original request to the API server was converted to a version the webhook registered for.
    "requestResource": {"group":"apps","version":"v1","resource":"deployments"},
    # subresource, if the request is to a subresource
    # This only differs from `subResource` if the webhook specified `matchPolicy: Equivalent` and the
    # original request to the API server was converted to a version the webhook registered for.
    "requestSubResource": "scale",

    # Name of the resource being modified
    "name": "my-deployment",
    # Namespace of the resource being modified, if the resource is namespaced (or is a Namespace object)
    "namespace": "my-namespace",

    # operation can be CREATE, UPDATE, DELETE, or CONNECT
    "operation": "UPDATE",

    "userInfo": {
      # Username of the authenticated user making the request to the API server
      "username": "admin",
      # UID of the authenticated user making the request to the API server
      "uid": "014fbff9a07c",
      # Group memberships of the authenticated user making the request to the API server
      "groups": ["system:authenticated","my-admin-group"],
      # Arbitrary extra info associated with the user making the request to the API server.
      # This is populated by the API server authentication layer and should be included
      # if any SubjectAccessReview checks are performed by the webhook.
      "extra": {
        "some-key":["some-value1", "some-value2"]
      }
    },

    # object is the new object being admitted.
    # It is null for DELETE operations.
    "object": {"apiVersion":"autoscaling/v1","kind":"Scale",...},
    # oldObject is the existing object.
    # It is null for CREATE and CONNECT operations.
    "oldObject": {"apiVersion":"autoscaling/v1","kind":"Scale",...},
    # options contains the options for the operation being admitted, like meta.k8s.io/v1 CreateOptions, UpdateOptions, or DeleteOptions.
    # It is null for CONNECT operations.
    "options": {"apiVersion":"meta.k8s.io/v1","kind":"UpdateOptions",...},

    # dryRun indicates the API request is running in dry run mode and will not be persisted.
    # Webhooks with side effects should avoid actuating those side effects when dryRun is true.
    # See http://k8s.io/docs/reference/using-api/api-concepts/#make-a-dry-run-request for more details.
    "dryRun": false
  }
}

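As a minimal sketch of how a webhook might read such a request, the program below decodes a trimmed AdmissionReview body with the standard library only. The struct types are simplified local stand-ins that cover a subset of the fields (a real webhook would normally use the types from k8s.io/api/admission/v1), and parseReview is a hypothetical helper name.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// GroupVersionResource identifies the resource a request targets.
type GroupVersionResource struct {
	Group    string `json:"group"`
	Version  string `json:"version"`
	Resource string `json:"resource"`
}

// AdmissionRequest holds a subset of the request fields of an AdmissionReview.
type AdmissionRequest struct {
	UID       string               `json:"uid"`
	Resource  GroupVersionResource `json:"resource"`
	Name      string               `json:"name"`
	Namespace string               `json:"namespace"`
	Operation string               `json:"operation"`
	Object    json.RawMessage      `json:"object"` // left undecoded until the resource type is known
}

// AdmissionReview is the envelope sent by kube-apiserver (simplified).
type AdmissionReview struct {
	APIVersion string            `json:"apiVersion"`
	Kind       string            `json:"kind"`
	Request    *AdmissionRequest `json:"request,omitempty"`
}

// parseReview decodes a JSON body and rejects reviews that carry no request.
func parseReview(body []byte) (*AdmissionReview, error) {
	var ar AdmissionReview
	if err := json.Unmarshal(body, &ar); err != nil {
		return nil, err
	}
	if ar.Request == nil {
		return nil, fmt.Errorf("admission review has no request")
	}
	return &ar, nil
}

func main() {
	body := []byte(`{
	  "apiVersion": "admission.k8s.io/v1",
	  "kind": "AdmissionReview",
	  "request": {
	    "uid": "705ab4f5-6393-11e8-b7cc-42010a800002",
	    "resource": {"group": "", "version": "v1", "resource": "nodes"},
	    "name": "edge-node-1",
	    "operation": "UPDATE",
	    "object": {"apiVersion": "v1", "kind": "Node"}
	  }
	}`)
	ar, err := parseReview(body)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Println("uid:", ar.Request.UID)
	fmt.Println("operation:", ar.Request.Operation, "resource:", ar.Request.Resource.Resource)
}
```

Keeping object as json.RawMessage mirrors the two-step decoding used later in this article, where Request.Object.Raw is decoded into a node or endpoints object only after the resource has been checked.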
The webhook must respond to kube-apiserver with an AdmissionReview of the same version, also encoded as JSON, containing the following key fields:

  • uid: copied from the request.uid field of the AdmissionReview sent to the webhook
  • allowed: true means the request is admitted; false means it is rejected
  • status: when the request is rejected, the reason can be given through status (HTTP code and message)
  • patch: a base64-encoded list of JSON patch operations that the mutating admission webhook applies to the request object
  • patchType: currently only JSONPatch is supported

An example follows:

# A webhook response that admits the request and patches it
# (the patch decodes to a single JSON patch operation that sets /spec/replicas to 3):
{
  "apiVersion": "admission.k8s.io/v1",
  "kind": "AdmissionReview",
  "response": {
    "uid": "<value from request.uid>",
    "allowed": true,
    "patchType": "JSONPatch",
    "patch": "W3sib3AiOiAiYWRkIiwgInBhdGgiOiAiL3NwZWMvcmVwbGljYXMiLCAidmFsdWUiOiAzfV0="
  }
}
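The response fields can be illustrated with a small standard-library sketch. The PatchOp and AdmissionResponse types and the buildResponse helper are hypothetical simplifications rather than the Kubernetes API types. The one detail worth noticing is that encoding/json serializes a []byte field as a base64 string, which is how the patch ends up base64-encoded on the wire without any explicit encoding step.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// PatchOp is one JSON Patch (RFC 6902) operation.
type PatchOp struct {
	Op    string      `json:"op"`
	Path  string      `json:"path"`
	Value interface{} `json:"value,omitempty"`
}

// AdmissionResponse is a simplified stand-in for the response part of an AdmissionReview.
type AdmissionResponse struct {
	UID       string  `json:"uid"`
	Allowed   bool    `json:"allowed"`
	PatchType *string `json:"patchType,omitempty"`
	Patch     []byte  `json:"patch,omitempty"` // encoding/json emits []byte as base64
}

// buildResponse admits the request and attaches the patch operations, if any.
func buildResponse(uid string, ops []PatchOp) (*AdmissionResponse, error) {
	resp := &AdmissionResponse{UID: uid, Allowed: true}
	if len(ops) == 0 {
		return resp, nil // nothing to change: admit the object as submitted
	}
	raw, err := json.Marshal(ops)
	if err != nil {
		return nil, err
	}
	pt := "JSONPatch"
	resp.PatchType = &pt
	resp.Patch = raw
	return resp, nil
}

func main() {
	resp, err := buildResponse("705ab4f5-6393-11e8-b7cc-42010a800002",
		[]PatchOp{{Op: "add", Path: "/spec/replicas", Value: 3}})
	if err != nil {
		fmt.Println("build failed:", err)
		return
	}
	out, err := json.MarshalIndent(resp, "", "  ")
	if err != nil {
		fmt.Println("marshal failed:", err)
		return
	}
	fmt.Println(string(out))
}
```

The base64 string printed here can differ from the one in the example response because json.Marshal emits the patch without spaces; both decode to the same operation.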

edge-health-admission is in fact a mutating admission webhook that selectively modifies UPDATE requests for endpoints and nodes. Its principle is analyzed in detail below.

edge-health-admission source code analysis

edge-health-admission is written by closely following the official admission webhook example. The following is its listening entry point:

func (eha *EdgeHealthAdmission) Run(stopCh <-chan struct{}) {
	if !cache.WaitForNamedCacheSync("edge-health-admission", stopCh, eha.cfg.NodeInformer.Informer().HasSynced) {
		return
	}

	http.HandleFunc("/node-taint", eha.serveNodeTaint)
	http.HandleFunc("/endpoint", eha.serveEndpoint)

	server := &http.Server{
		Addr: eha.cfg.Addr,
	}

	go func() {
		if err := server.ListenAndServeTLS(eha.cfg.CertFile, eha.cfg.KeyFile); err != http.ErrServerClosed {
			klog.Fatalf("ListenAndServeTLS err %+v", err)
		}
	}()

	for {
		select {
		case <-stopCh:
			ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
			defer cancel()
			if err := server.Shutdown(ctx); err != nil {
				klog.Errorf("Server: program exit, server exit error %+v", err)
			}
			return
		default:
		}
	}
}

Two route handlers are registered here:

  • node-taint: handled by serveNodeTaint, which is responsible for modifying node UPDATE requests
  • endpoint: handled by serveEndpoint, which is responsible for modifying endpoints UPDATE requests

Both handlers call the serve function, which is as follows:

// serve handles the http portion of a request prior to handing to an admit function
func serve(w http.ResponseWriter, r *http.Request, admit admitFunc) {
	var body []byte
	if r.Body != nil {
		if data, err := ioutil.ReadAll(r.Body); err == nil {
			body = data
		}
	}

	// verify the content type is accurate
	contentType := r.Header.Get("Content-Type")
	if contentType != "application/json" {
		klog.Errorf("contentType=%s, expect application/json", contentType)
		return
	}

	klog.V(4).Info(fmt.Sprintf("handling request: %s", body))

	// The AdmissionReview that was sent to the webhook
	requestedAdmissionReview := admissionv1.AdmissionReview{}

	// The AdmissionReview that will be returned
	responseAdmissionReview := admissionv1.AdmissionReview{}

	deserializer := codecs.UniversalDeserializer()
	if _, _, err := deserializer.Decode(body, nil, &requestedAdmissionReview); err != nil {
		klog.Error(err)
		responseAdmissionReview.Response = toAdmissionResponse(err)
	} else {
		// pass to admitFunc
		responseAdmissionReview.Response = admit(requestedAdmissionReview)
	}

	// Return the same UID
	responseAdmissionReview.Response.UID = requestedAdmissionReview.Request.UID

	klog.V(4).Info(fmt.Sprintf("sending response: %+v", responseAdmissionReview.Response))

	respBytes, err := json.Marshal(responseAdmissionReview)
	if err != nil {
		klog.Error(err)
	}
	if _, err := w.Write(respBytes); err != nil {
		klog.Error(err)
	}
}

The serve logic is as follows:

  • Parse the request body into an AdmissionReview object and assign it to requestedAdmissionReview
  • Call the admit function on that AdmissionReview and assign the result to responseAdmissionReview.Response
  • Set responseAdmissionReview.Response.UID to requestedAdmissionReview.Request.UID
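The three steps above can be sketched with the standard library alone. The Review, Request and Response types are simplified local stand-ins, and callServe is a hypothetical helper that drives the handler in memory. Unlike the listing, this sketch answers malformed input with an HTTP error and falls back to an allow-all response when the admit function returns nil (which the admit functions shown below do on a resource mismatch); that is one possible design choice, not SuperEdge's behavior.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

type Request struct {
	UID string `json:"uid"`
}

type Response struct {
	UID     string `json:"uid"`
	Allowed bool   `json:"allowed"`
}

// Review is a simplified stand-in for AdmissionReview.
type Review struct {
	APIVersion string    `json:"apiVersion"`
	Kind       string    `json:"kind"`
	Request    *Request  `json:"request,omitempty"`
	Response   *Response `json:"response,omitempty"`
}

type admitFunc func(Review) *Response

// serve decodes the review, runs admit and echoes the request UID in the response.
func serve(w http.ResponseWriter, r *http.Request, admit admitFunc) {
	if r.Header.Get("Content-Type") != "application/json" {
		http.Error(w, "expected application/json", http.StatusUnsupportedMediaType)
		return
	}
	body, err := io.ReadAll(r.Body)
	if err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}
	var in Review
	if err := json.Unmarshal(body, &in); err != nil || in.Request == nil {
		http.Error(w, "malformed AdmissionReview", http.StatusBadRequest)
		return
	}
	resp := admit(in)
	if resp == nil {
		resp = &Response{Allowed: true} // assumption: admit unchanged when admit has no opinion
	}
	resp.UID = in.Request.UID
	w.Header().Set("Content-Type", "application/json")
	out := Review{APIVersion: in.APIVersion, Kind: in.Kind, Response: resp}
	if err := json.NewEncoder(w).Encode(out); err != nil {
		fmt.Println("write failed:", err)
	}
}

// callServe drives serve with an in-memory request and returns status code and decoded reply.
func callServe(contentType, body string, admit admitFunc) (int, Review) {
	req := httptest.NewRequest(http.MethodPost, "/node-taint", strings.NewReader(body))
	req.Header.Set("Content-Type", contentType)
	rec := httptest.NewRecorder()
	serve(rec, req, admit)
	var out Review
	_ = json.Unmarshal(rec.Body.Bytes(), &out) // stays empty for plain-text error bodies
	return rec.Code, out
}

func main() {
	body := `{"apiVersion":"admission.k8s.io/v1","kind":"AdmissionReview","request":{"uid":"abc"}}`
	code, out := callServe("application/json", body, func(Review) *Response { return nil })
	fmt.Println(code, out.Response.UID, out.Response.Allowed)
}
```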

The admit functions for serveNodeTaint and serveEndpoint are mutateNodeTaint and mutateEndpoint respectively. They are analyzed in turn below:

1. mutateNodeTaint

mutateNodeTaint will modify the node UPDATE request according to the results of the distributed health check:

func (eha *EdgeHealthAdmission) mutateNodeTaint(ar admissionv1.AdmissionReview) *admissionv1.AdmissionResponse {
	klog.V(4).Info("mutating node taint")
	nodeResource := metav1.GroupVersionResource{Group: "", Version: "v1", Resource: "nodes"}
	if ar.Request.Resource != nodeResource {
		klog.Errorf("expect resource to be %s", nodeResource)
		return nil
	}

	var node corev1.Node
	deserializer := codecs.UniversalDeserializer()
	if _, _, err := deserializer.Decode(ar.Request.Object.Raw, nil, &node); err != nil {
		klog.Error(err)
		return toAdmissionResponse(err)
	}

	reviewResponse := admissionv1.AdmissionResponse{}
	reviewResponse.Allowed = true

	if index, condition := util.GetNodeCondition(&node.Status, v1.NodeReady); index != -1 && condition.Status == v1.ConditionUnknown {
		if node.Annotations != nil {
			var patches []*patch
			if healthy, existed := node.Annotations[common.NodeHealthAnnotation]; existed && healthy == common.NodeHealthAnnotationPros {
				if index, existed := util.TaintExistsPosition(node.Spec.Taints, common.UnreachableNoExecuteTaint); existed {
					patches = append(patches, &patch{
						OP:   "remove",
						Path: fmt.Sprintf("/spec/taints/%d", index),
					})
					klog.V(4).Infof("UnreachableNoExecuteTaint: remove %d taints %s", index, node.Spec.Taints[index])
				}
			}
			if len(patches) > 0 {
				patchBytes, _ := json.Marshal(patches)
				reviewResponse.Patch = patchBytes
				pt := admissionv1.PatchTypeJSONPatch
				reviewResponse.PatchType = &pt
			}
		}
	}

	return &reviewResponse
}

The main logic is as follows:

  • Check that AdmissionReview.Request.Resource is the group/version/resource of the node resource
  • Decode AdmissionReview.Request.Object.Raw into a node object
  • Set AdmissionReview.Response.Allowed to true, meaning the request is always admitted
  • Apply the core logic that backs up the edge-side health check: if the node's Ready condition is ConditionUnknown and the distributed health check reports the node as healthy, remove the NoExecute (node.kubernetes.io/unreachable) taint if the node has it

In short, mutateNodeTaint continuously corrects the node status written by kube-controller-manager by removing the NoExecute (node.kubernetes.io/unreachable) taint, so that the pods on the node are not evicted.
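The decision at the heart of mutateNodeTaint can be isolated into a small, dependency-free sketch. The Node and Taint types are simplified stand-ins for the Kubernetes objects, and the annotation key and value constants are placeholders invented for this example: the real values live in common.NodeHealthAnnotation and common.NodeHealthAnnotationPros, which the listing does not show.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Hypothetical placeholders; the actual annotation key and value are defined in SuperEdge's common package.
const (
	healthAnnotation = "example.io/edge-health"
	healthyValue     = "true"
	unreachableKey   = "node.kubernetes.io/unreachable"
)

type Taint struct {
	Key    string
	Effect string
}

// Node is a simplified stand-in carrying only what the decision needs.
type Node struct {
	Annotations map[string]string
	ReadyStatus string // status of the Ready condition: "True", "False" or "Unknown"
	Taints      []Taint
}

type PatchOp struct {
	Op   string `json:"op"`
	Path string `json:"path"`
}

// taintPatches returns a JSON Patch removing the unreachable NoExecute taint, but only
// when the cloud sees the node as Unknown while the edge voted it healthy.
func taintPatches(n Node) []PatchOp {
	if n.ReadyStatus != "Unknown" || n.Annotations[healthAnnotation] != healthyValue {
		return nil
	}
	for i, t := range n.Taints {
		if t.Key == unreachableKey && t.Effect == "NoExecute" {
			return []PatchOp{{Op: "remove", Path: fmt.Sprintf("/spec/taints/%d", i)}}
		}
	}
	return nil
}

func main() {
	node := Node{
		Annotations: map[string]string{healthAnnotation: healthyValue},
		ReadyStatus: "Unknown",
		Taints: []Taint{
			{Key: unreachableKey, Effect: "NoSchedule"},
			{Key: unreachableKey, Effect: "NoExecute"},
		},
	}
	raw, err := json.Marshal(taintPatches(node))
	if err != nil {
		fmt.Println("marshal failed:", err)
		return
	}
	fmt.Println(string(raw))
}
```

Only the NoExecute taint is removed; the NoSchedule taint stays, so no new pods are scheduled onto the disconnected node while the existing ones keep running.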

2. mutateEndpoint

mutateEndpoint will modify the endpoints UPDATE request according to the distributed health check results:

func (eha *EdgeHealthAdmission) mutateEndpoint(ar admissionv1.AdmissionReview) *admissionv1.AdmissionResponse {
	klog.V(4).Info("mutating endpoint")
	endpointResource := metav1.GroupVersionResource{Group: "", Version: "v1", Resource: "endpoints"}
	if ar.Request.Resource != endpointResource {
		klog.Errorf("expect resource to be %s", endpointResource)
		return nil
	}

	var endpoint corev1.Endpoints
	deserializer := codecs.UniversalDeserializer()
	if _, _, err := deserializer.Decode(ar.Request.Object.Raw, nil, &endpoint); err != nil {
		klog.Error(err)
		return toAdmissionResponse(err)
	}

	reviewResponse := admissionv1.AdmissionResponse{}
	reviewResponse.Allowed = true

	for epSubsetIndex, epSubset := range endpoint.Subsets {
		for notReadyAddrIndex, EndpointAddress := range epSubset.NotReadyAddresses {
			if node, err := eha.nodeLister.Get(*EndpointAddress.NodeName); err == nil {
				if index, condition := util.GetNodeCondition(&node.Status, v1.NodeReady); index != -1 && condition.Status == v1.ConditionUnknown {
					if node.Annotations != nil {
						var patches []*patch
						if healthy, existed := node.Annotations[common.NodeHealthAnnotation]; existed && healthy == common.NodeHealthAnnotationPros {
							// TODO: handle readiness probes failure
							// Remove address on node from endpoint notReadyAddresses
							patches = append(patches, &patch{
								OP:   "remove",
								Path: fmt.Sprintf("/subsets/%d/notReadyAddresses/%d", epSubsetIndex, notReadyAddrIndex),
							})

							// Add address on node to endpoint readyAddresses
							TargetRef := map[string]interface{}{}
							TargetRef["kind"] = EndpointAddress.TargetRef.Kind
							TargetRef["namespace"] = EndpointAddress.TargetRef.Namespace
							TargetRef["name"] = EndpointAddress.TargetRef.Name
							TargetRef["uid"] = EndpointAddress.TargetRef.UID
							TargetRef["apiVersion"] = EndpointAddress.TargetRef.APIVersion
							TargetRef["resourceVersion"] = EndpointAddress.TargetRef.ResourceVersion
							TargetRef["fieldPath"] = EndpointAddress.TargetRef.FieldPath

							patches = append(patches, &patch{
								OP:   "add",
								Path: fmt.Sprintf("/subsets/%d/addresses/0", epSubsetIndex),
								Value: map[string]interface{}{
									"ip":        EndpointAddress.IP,
									"hostname":  EndpointAddress.Hostname,
									"nodeName":  EndpointAddress.NodeName,
									"targetRef": TargetRef,
								},
							})

							if len(patches) != 0 {
								patchBytes, _ := json.Marshal(patches)
								reviewResponse.Patch = patchBytes
								pt := admissionv1.PatchTypeJSONPatch
								reviewResponse.PatchType = &pt
							}
						}
					}
				}
			} else {
				klog.Errorf("Get pod's node err %+v", err)
			}
		}
	}

	return &reviewResponse
}

The main logic is as follows:

  • Check that AdmissionReview.Request.Resource is the group/version/resource of the endpoints resource
  • Decode AdmissionReview.Request.Object.Raw into an endpoints object
  • Set AdmissionReview.Response.Allowed to true, meaning the request is always admitted
  • Traverse endpoints.Subset.NotReadyAddresses: if the node hosting an EndpointAddress has a Ready condition of ConditionUnknown and the distributed health check reports the node as healthy, move that EndpointAddress from endpoints.Subset.NotReadyAddresses to endpoints.Subset.Addresses

In short, mutateEndpoint continuously corrects the endpoints status written by kube-controller-manager by moving the workloads on nodes that the distributed health check considers healthy from endpoints.Subset.NotReadyAddresses to endpoints.Subset.Addresses, so that the service remains available.
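The move from notReadyAddresses to addresses can be sketched as a list of JSON Patch operations. The types below are simplified stand-ins, the healthy callback stands for the combined check "Ready condition is Unknown and the health annotation is positive", and movePatches is a hypothetical helper. This sketch accumulates the operations for all matching addresses into one patch and walks each notReadyAddresses list from the end, because a JSON Patch is applied sequentially and removing an element shifts the indices of the elements after it. It also assumes the addresses array already exists in the object being patched.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type Address struct {
	IP       string `json:"ip"`
	NodeName string `json:"nodeName"`
}

// Subset is a simplified stand-in for an endpoints subset.
type Subset struct {
	Addresses         []Address
	NotReadyAddresses []Address
}

type PatchOp struct {
	Op    string      `json:"op"`
	Path  string      `json:"path"`
	Value interface{} `json:"value,omitempty"`
}

// movePatches builds operations that move every not-ready address hosted on a
// healthy node into the ready list of the same subset.
func movePatches(subsets []Subset, healthy func(nodeName string) bool) []PatchOp {
	var ops []PatchOp
	for si, s := range subsets {
		// Walk backwards so that earlier removals never shift the index of a later one.
		for ai := len(s.NotReadyAddresses) - 1; ai >= 0; ai-- {
			addr := s.NotReadyAddresses[ai]
			if !healthy(addr.NodeName) {
				continue
			}
			ops = append(ops,
				PatchOp{Op: "remove", Path: fmt.Sprintf("/subsets/%d/notReadyAddresses/%d", si, ai)},
				PatchOp{Op: "add", Path: fmt.Sprintf("/subsets/%d/addresses/0", si), Value: addr},
			)
		}
	}
	return ops
}

func main() {
	subsets := []Subset{{
		Addresses: []Address{{IP: "10.0.0.9", NodeName: "cloud-node"}},
		NotReadyAddresses: []Address{
			{IP: "10.0.1.1", NodeName: "edge-1"},
			{IP: "10.0.1.2", NodeName: "edge-2"},
			{IP: "10.0.1.3", NodeName: "edge-1"},
		},
	}}
	// Assume only edge-1 was voted healthy by its peers.
	ops := movePatches(subsets, func(n string) bool { return n == "edge-1" })
	raw, err := json.MarshalIndent(ops, "", "  ")
	if err != nil {
		fmt.Println("marshal failed:", err)
		return
	}
	fmt.Println(string(raw))
}
```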

Summary

  • The SuperEdge distributed health check consists of edge-health-daemon on the edge side and edge-health-admission on the cloud side:
    • edge-health-daemon: performs distributed health checks among the edge nodes in the same region and reports the voting result on each node's health to the apiserver (as a node annotation)
    • edge-health-admission: continuously adjusts, based on the node's edge-health annotation, the node taints set by kube-controller-manager (removing the NoExecute taint) and the endpoints (moving pods on disconnected nodes from the endpoint subsets' notReadyAddresses to addresses), so that the cloud and the edge jointly determine node status
  • The edge-health-admission cloud component exists because, when the cloud and the edge are disconnected, kube-controller-manager sets the Ready condition of the disconnected node to ConditionUnknown and adds the NoSchedule and NoExecute taints; it also removes the pods on the disconnected node from the Endpoints list of their Service. When edge-health-daemon decides from the edge-side health check that a node is healthy, it updates the node to remove the NoExecute taint. Even after that update succeeds, kube-controller-manager reverts it by adding the NoExecute taint again. A Kubernetes mutating admission webhook, edge-health-admission, is therefore needed to intercept and adjust the changes that kube-controller-manager makes to the node API resource, which is what finally makes the distributed health check effective
  • Kubernetes admission controllers are part of how kube-apiserver processes API requests. They are invoked after a request has been authenticated and authorized and before the object is persisted, in order to validate the request, modify it, or both. There are many admission controllers, most of which are compiled into kube-apiserver. MutatingAdmissionWebhook and ValidatingAdmissionWebhook are special: they call externally deployed mutating admission webhooks and validating admission webhooks, respectively
  • Admission webhooks are HTTP callback services that receive AdmissionReview requests and process them. They fall into two classes according to how they process a request:
    • Validating admission webhook: configured through a ValidatingWebhookConfiguration; it performs an admission check on the API request but cannot modify the request object
    • Mutating admission webhook: configured through a MutatingWebhookConfiguration; it performs an admission check on the API request and may modify the request object
  • kube-apiserver sends an AdmissionReview (API group: admission.k8s.io, version: v1 or v1beta1) to the webhooks, encoded as JSON. The webhook must respond to kube-apiserver with an AdmissionReview of the same version, also encoded as JSON, containing the following key fields:
    • uid: copied from the request.uid field of the AdmissionReview sent to the webhook
    • allowed: true means the request is admitted; false means it is rejected
    • status: when the request is rejected, the reason can be given through status (HTTP code and message)
    • patch: a base64-encoded list of JSON patch operations that the mutating admission webhook applies to the request object
    • patchType: currently only JSONPatch is supported
  • edge-health-admission is in fact a mutating admission webhook that selectively modifies UPDATE requests for endpoints and nodes, with the following processing logic:
    • mutateNodeTaint: continuously corrects the node status written by kube-controller-manager by removing the NoExecute (node.kubernetes.io/unreachable) taint, so that the pods on the node are not evicted
    • mutateEndpoint: continuously corrects the endpoints status written by kube-controller-manager by moving the workloads on nodes that the distributed health check considers healthy from endpoints.Subset.NotReadyAddresses to endpoints.Subset.Addresses, so that the service remains available