How Angie Ingress Controller Works#

This document describes the operation of the Angie Ingress Controller, also known as ANIC, built on the capabilities of the Angie web server.

We assume that the reader is familiar with the basic concepts of Kubernetes, such as Pod, Deployment, Service, and Endpoints.

What is an Ingress Controller#

An Ingress Controller is a component in a Kubernetes cluster that configures an HTTP load balancer according to the Ingress resources created by the cluster user.

To learn more about Ingress resources, refer to the official Kubernetes documentation.

Angie Ingress Controller at a High Level#

Let’s start with a general overview of ANIC. Consider an example of how ANIC provides internet clients access to two web applications running in a Kubernetes cluster.

The diagram involves:

A Kubernetes cluster.
Cluster users: administrator, user A, and user B, who use the cluster via the Kubernetes API.
Clients A and clients B, who connect to applications A and applications B, deployed by the respective users.
ANIC, deployed by the administrator in a Pod in the angie-ingress namespace and configured using the angie-ingress ConfigMap resource. For simplicity, only one ANIC Pod is shown; however, the administrator typically deploys at least two Pods to ensure redundancy. ANIC uses the Kubernetes API to retrieve the latest Ingress resources created in the cluster and then configures Angie according to these resources.
Application A with two Pods deployed in the namespace A by user A. To provide access to the application for its clients (clients A) through the host a.example.com, user A creates Ingress resource A.
Application B with one Pod deployed in the namespace B by user B. To provide access to the application for its clients (clients B) through the host b.example.com, user B creates VirtualServer resource B.
A public endpoint that is in front of the ANIC Pods. This is usually a TCP load balancer (cloud, software, or hardware) or a combination of such a load balancer with a NodePort service. Clients A and clients B connect to their applications through the public endpoint.

For simplicity, the diagram omits many necessary Kubernetes resources, such as Deployment and Service, which the administrator and users also need to create.

Next, we will explore the ANIC Pod.

ANIC Pod#

The ANIC Pod consists of a single container, which, in turn, includes the following:

The ANIC process, which configures Angie according to Ingress and other resources created in the cluster.
The main Angie process, which manages Angie worker processes.
The Angie worker processes, which handle client traffic and balance the load on server applications.

The table below describes each connection along with its type:

No.	Connection Type	Description
1	HTTP	Prometheus fetches metrics from ANIC and Angie via the HTTP endpoint provided by ANIC. Note: Prometheus is not required for ANIC, and this endpoint can be disabled.
2	HTTPS	ANIC reads from the Kubernetes API to obtain the latest versions of resources in the cluster and writes to the API to update the statuses of processed resources and emit events.
3	HTTP	Kubelet checks the ANIC readiness endpoint (`:8081/angie-ready` by default) to determine the readiness of the ANIC Pod.
4	File I/O	When ANIC starts, it reads from the filesystem the configuration templates necessary for generating configuration. The templates are located in the `/` directory of the container and have a `.tmpl` extension.
5	File I/O	ANIC writes logs to its stdout and stderr streams, which are collected by the container runtime.
6	File I/O	ANIC generates the Angie configuration based on the resources created in the cluster and writes it to the filesystem in the folder `/etc/angie`. The configuration files have a `.conf` extension.
7	File I/O	ANIC writes TLS certificates and keys from all TLS secrets referenced by Ingress resources and other resources to the filesystem.
8	HTTP	ANIC fetches Angie metrics via the UNIX socket `unix:/var/lib/angie/angie-status.socket` and transforms them into the Prometheus format used in connection 1.
9	HTTP	To ensure the successful reloading of the configuration, ANIC checks that at least one Angie worker process has the new configuration. To do this, ANIC checks a specific endpoint via the UNIX socket `unix:/var/lib/angie/angie-config-version.sock`.
10	N/A	To start Angie, ANIC runs the command `angie`, which starts the main Angie process.
11	Signal	To reload Angie, ANIC executes the command `angie -s reload`, which checks the configuration and sends a reload signal to the main Angie process.
12	Signal	To shut down Angie, ANIC executes the command `angie -s quit`, which sends a graceful shutdown signal to the main Angie process.
13	File I/O	The main Angie process sends log output to its stdout and stderr streams, which are collected by the container runtime.
14	File I/O	The main Angie process reads the TLS certificates and keys specified in the configuration upon startup or reload.
15	File I/O	The main Angie process reads configuration files during startup or reload.
16	Signal	The main Angie process manages the lifecycle of Angie worker processes. It creates worker processes with the new configuration and shuts down worker processes with the old one.
17	File I/O	An Angie worker process sends log output to its stdout and stderr streams, which are collected by the container runtime.
18	UDP	An Angie worker process sends HTTP upstream response latency logs in Syslog format via the UNIX socket `/var/lib/angie/angie-syslog.sock` to ANIC. In turn, ANIC analyzes and transforms these logs into Prometheus metrics.
19	HTTP, HTTPS, TCP, UDP	A client sends traffic to and receives traffic from any of the Angie worker processes through ports 80 and 443 and any additional ports opened in the GlobalConfiguration resource.
20	HTTP, HTTPS, TCP, UDP	An Angie worker process sends traffic to proxied servers and receives traffic from them.
21	HTTP	An administrator can connect to Angie stub_status, using port 8080, through an Angie worker process. Note: By default, Angie allows connections only from `localhost`.

The ANIC Process#

This section discusses the architecture of the ANIC process, including the following topics:

How ANIC handles a new Ingress resource created by the user.
A brief description of how ANIC operates and how it relates to Kubernetes controllers.
Various components of the ANIC process.

Handling a New Ingress Resource#

Below is an explanation of how ANIC handles a new Ingress resource. For simplicity, we present the main and worker processes of Angie as a single block called Angie. Also, note that VirtualServer and VirtualServerRoute resources are handled similarly.

Handling a new Ingress resource involves the following steps:

A user creates a new Ingress resource.
The ANIC process maintains a cache of resources in the cluster. The cache contains only those resources that are of interest to ANIC, such as Ingress. The cache is synchronized with the Kubernetes API, tracking changes in resources.
As soon as a new Ingress resource appears in the cache, the cache notifies the control loop about the changed resource.
The control loop retrieves the latest version of the Ingress resource from the cache. Since the Ingress resource references other resources, such as TLS secrets, the control loop also fetches the latest versions of any resources that such references point to.
The control loop generates TLS certificates and keys from the TLS secrets and writes them to the filesystem.
The control loop generates configuration files for Angie that correspond to the Ingress resource and writes them to the filesystem.
The control loop reloads Angie and waits for a successful reload of Angie. During the reload:
- Angie reads the TLS certificates and keys.
- Angie reads the configuration files.
The control loop generates an event for the Ingress resource and updates its status. If the reload fails, the event will contain an error message.

ANIC is a Kubernetes Controller#

Based on the example from the previous section, we can summarize how ANIC operates:

ANIC continuously processes both new resources and changes to existing resources in the cluster. As a result, the Angie configuration stays up to date with the cluster resources.

ANIC is an example of a Kubernetes controller: ANIC runs a control loop that ensures that Angie is always configured according to the desired state (which includes Ingress resources and other resources).

The desired state is focused on the following built-in and custom (CR) Kubernetes resources:

Layer 7 load balancing configuration:
- Ingress resources
- VirtualServer resources (CR)
- VirtualServerRoute resources (CR)
Layer 7 policies:
- Policy resources (CR)
Layer 4 load balancing configuration:
- TransportServer resources (CR)
Service discovery:
- Service resources
- Endpoints resources
- Pod resources
Secret configuration:
- Secret resources
Global configuration:
- ConfigMap resource (only one resource)
- GlobalConfiguration resource (CR, only one resource)

In the next section, we will discuss the various components of the ANIC process.

Components of the ANIC Process#

In this section, we will describe the components of the ANIC process and how they interact, including the following topics:

How ANIC watches for changes in resources.
The main components of the ANIC control loop.
How these components handle resource changes.
Several additional components that are critical for handling changes.

ANIC is written in Go and largely depends on the Go client for Kubernetes.

Resource Caches#

In the section Handling a New Ingress Resource, we mentioned that ANIC maintains a cache of resources in the cluster that is synchronized with the Kubernetes API, tracking changes in resources. We also noted that once the cache is updated, it notifies the control loop about the changed resource.

The cache actually consists of a set of informers. Below is an explanation of how changes in resources are handled by ANIC.

For each type of resource that ANIC watches, an informer is created. An informer includes a store that holds resources of that type. To synchronize this store with the latest versions of resources in the cluster, the informer uses the Kubernetes watch and list API for that type of resource.
When a change occurs in the cluster (for example, a new resource is created), the informer updates its store and calls handlers for that informer.
ANIC registers handlers for each informer. In most cases, the handler creates an entry for the affected resource in the work queue, where the work queue item includes the resource type, its namespace, and its name.
The work queue always tries to drain itself: if there is an item at its front, the queue removes it and passes it to the controller through a callback function.
The controller is the main component in ANIC that implements the control loop. For a description of the components, see the section Control Loop. For now, it is enough to know that to handle a work queue item, the controller retrieves the latest version of the resource from the store, reconfigures Angie according to the resource, updates the resource status, and sends an event via the Kubernetes API.
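
The flow from informer to controller can be illustrated with a small conceptual sketch. It is written in Python for brevity and is not ANIC's actual code (ANIC is written in Go and relies on the Go client for Kubernetes). The class names `Informer`, `WorkQueue`, and `QueueItem` and the `sync` callback are hypothetical, and changes are injected by hand rather than received from the watch and list API.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class QueueItem:
    """Work queue item: resource type, namespace, and name (no payload)."""
    kind: str
    namespace: str
    name: str


class WorkQueue:
    """FIFO queue that hands each item to the controller callback."""

    def __init__(self, callback):
        self._items = deque()
        self._callback = callback

    def add(self, item):
        self._items.append(item)

    def drain(self):
        # Keep removing the front item until the queue is empty.
        while self._items:
            self._callback(self._items.popleft())


class Informer:
    """Holds a store for one resource type and calls handlers on changes."""

    def __init__(self, kind):
        self.kind = kind
        self.store = {}  # (namespace, name) -> latest version of the resource
        self._handlers = []

    def add_handler(self, handler):
        self._handlers.append(handler)

    def on_change(self, namespace, name, resource):
        # In a real controller this is driven by the watch and list API.
        self.store[(namespace, name)] = resource
        for handler in self._handlers:
            handler(self.kind, namespace, name)


processed = []
informer = Informer("Ingress")


def sync(item):
    # The controller reads the latest version from the store, not from the item.
    latest = informer.store[(item.namespace, item.name)]
    processed.append((item, latest))


queue = WorkQueue(sync)
informer.add_handler(lambda kind, ns, name: queue.add(QueueItem(kind, ns, name)))

# Two quick changes to the same resource, then the queue is drained.
informer.on_change("a", "ingress-a", {"host": "a.example.com", "rev": 1})
informer.on_change("a", "ingress-a", {"host": "a.example.com", "rev": 2})
queue.drain()
```

Because a queue item carries only the type, namespace, and name, both items in this run are processed against revision 2, the version that is in the store when the queue is drained.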

Control Loop#

This section discusses the main components of ANIC that make up the control loop:

Controller:
- Runs the ANIC control loop.
- Creates instances of informers, handlers, work queue, and additional helper components.
- Includes a synchronization method (see the next section) that the work queue calls to process a changed resource.
- Passes changed resources to the configurator for reconfiguring Angie.
Configurator:
- Generates Angie configuration files, TLS keys, and certificates based on the Kubernetes resource.
- Uses the manager to write the generated files and reload Angie.
Manager:
- Manages the lifecycle of Angie (start, reload, shutdown).
- Manages configuration files, TLS keys, and certificates.
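
The division of responsibilities listed above can be sketched as three small classes. This is a conceptual Python illustration, not ANIC's Go implementation: the method names, the one-line configuration template, and the file name pattern under `/etc/angie` are assumptions, and the manager only records actions instead of touching the filesystem or Angie.

```python
class Manager:
    """Writes files and reloads Angie (faked here: it only records actions)."""

    def __init__(self, reload_ok=True):
        self.files = {}
        self.reloads = 0
        self._reload_ok = reload_ok

    def write_file(self, path, content):
        self.files[path] = content

    def reload(self):
        self.reloads += 1
        if not self._reload_ok:
            raise RuntimeError("reload failed")


class Configurator:
    """Generates configuration from a resource and applies it via the manager."""

    def __init__(self, manager):
        self._manager = manager

    def add_or_update_ingress(self, ingress):
        # Hypothetical file name and a deliberately tiny template.
        path = f"/etc/angie/{ingress['namespace']}-{ingress['name']}.conf"
        config = f"server {{ server_name {ingress['host']}; }}\n"
        self._manager.write_file(path, config)
        self._manager.reload()


class Controller:
    """Passes changed resources to the configurator and reports the outcome."""

    def __init__(self, configurator):
        self._configurator = configurator
        self.events = []

    def sync(self, ingress):
        try:
            self._configurator.add_or_update_ingress(ingress)
            self.events.append(("Normal", "configuration applied"))
        except RuntimeError as err:
            # A failed reload surfaces as an event carrying the error message.
            self.events.append(("Warning", str(err)))


manager = Manager()
controller = Controller(Configurator(manager))
controller.sync({"namespace": "a", "name": "ingress-a", "host": "a.example.com"})
```

Keeping file handling and reloads behind the manager means the controller and configurator can be exercised without a running Angie, as in this example.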

Helper Components#

There are two additional helper components that are critical for handling changes: Configuration and Local Secret Store.

Configuration#

The Configuration contains the latest valid state of the ANIC load balancing configuration resources, such as Ingress resources, VirtualServer, VirtualServerRoute, TransportServer, and GlobalConfiguration.

Configuration supports add (for adding or updating) and delete operations on resources. When you add, update, or delete a resource in the configuration, it does the following:

Validates the object (in case of adding or updating).
Computes changes in the affected resources that need to be passed to Angie’s configuration, returning the changes to the caller.

For example, when you add a new Ingress resource, the configuration returns a change that requires ANIC to add configuration for that resource to Angie’s configuration files. Another example: if you invalidate an existing Ingress resource, the configuration returns a change that requires ANIC to remove configuration for that resource from Angie’s configuration files.

Additionally, the configuration ensures that only one Ingress, VirtualServer, or TransportServer resource (TLS Passthrough) contains a specific host (for example, mysite.example.com), and only one TransportServer resource (TCP, UDP) contains a specific listener (for example, port 53 for UDP). This ensures that there are no collisions between hosts and listeners in Angie’s configuration.

Ultimately, ANIC ensures that the Angie configuration in the filesystem reflects the state of the objects in the configuration at any given time.
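
A minimal Python sketch of the host collision check follows. It is illustrative only and makes several assumptions that the text above does not state: each resource has exactly one host, validation is reduced to the collision check, and the resource that claimed a host first keeps it. The change names (`add_or_update`, `remove`, `rejected`) are hypothetical.

```python
class Configuration:
    """Tracks which resource owns each host and returns the resulting changes."""

    def __init__(self):
        self._owner = {}  # host -> resource key ("namespace/name")

    def _release(self, key):
        released = [host for host, owner in self._owner.items() if owner == key]
        for host in released:
            del self._owner[host]
        return released

    def add(self, key, host):
        owner = self._owner.get(host)
        if owner not in (None, key):
            # Assumption: the resource that claimed the host first keeps it.
            return [("rejected", key, host)]
        self._release(key)  # an update may move the resource to another host
        self._owner[host] = key
        return [("add_or_update", key, host)]

    def delete(self, key):
        return [("remove", key, host) for host in self._release(key)]


cfg = Configuration()
first = cfg.add("a/ingress-a", "mysite.example.com")
second = cfg.add("b/virtualserver-b", "mysite.example.com")
```

Here `first` is an `add_or_update` change, while `second` is `rejected` because the host is already owned by `a/ingress-a`. After `cfg.delete("a/ingress-a")`, the second resource could claim the host.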

Local Secret Store#

The Local Secret Store contains valid secret resources and synchronizes the corresponding files in the filesystem. Secrets are used to store TLS certificates and keys (type kubernetes.io/tls), certificate authorities, as well as client secrets for OIDC providers.

When the controller processes a change in a configuration resource, such as Ingress, it creates an extended version of the resource that includes the dependencies, such as secrets, required to generate Angie’s configuration. The Local Secret Store allows the controller to look up where a secret is stored on the filesystem by its secret key (namespace/name).
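
The lookup by secret key can be sketched as follows. This Python example is a simplified illustration rather than ANIC's implementation: it handles only `kubernetes.io/tls` secrets, assumes the secret data is already decoded text, and uses a hypothetical naming scheme of one `.pem` file per secret.

```python
import os
import tempfile


class LocalSecretStore:
    """Keeps valid TLS secrets on disk and maps "namespace/name" to a path."""

    def __init__(self, directory):
        self._directory = directory
        self._paths = {}  # "namespace/name" -> file path

    def add_or_update(self, key, secret):
        if secret.get("type") != "kubernetes.io/tls":
            raise ValueError(f"unsupported secret type for {key}")
        data = secret.get("data", {})
        if not data.get("tls.crt") or not data.get("tls.key"):
            raise ValueError(f"secret {key} is missing tls.crt or tls.key")
        # Hypothetical naming: one PEM file per secret, certificate then key.
        path = os.path.join(self._directory, key.replace("/", "-") + ".pem")
        with open(path, "w") as f:
            f.write(data["tls.crt"] + "\n" + data["tls.key"] + "\n")
        self._paths[key] = path

    def delete(self, key):
        path = self._paths.pop(key, None)
        if path is not None:
            os.remove(path)

    def path_for(self, key):
        # Returns None when the secret is unknown or was rejected as invalid.
        return self._paths.get(key)


store = LocalSecretStore(tempfile.mkdtemp())
store.add_or_update(
    "a/tls-secret",
    {"type": "kubernetes.io/tls", "data": {"tls.crt": "CERT", "tls.key": "KEY"}},
)
pem_path = store.path_for("a/tls-secret")
```

The configuration generator then only needs `pem_path` to reference the certificate, and an invalid secret never reaches the filesystem.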

Reloading Angie#

This section discusses reloading Angie in general and how ANIC implements it.

Reloading in General#

Reloading Angie is necessary to apply a new configuration and involves the following actions:

The administrator sends a HUP (hang up) signal to the main Angie process to initiate a reload.
The main process shuts down worker processes with the old configuration and starts worker processes with the new configuration.
The administrator checks that the reload has been successfully completed.

Note

Refer to the Angie documentation for more detailed information on reloading.

How to Perform a Reload#

The Angie binary (angie) supports a reload operation with the -s reload parameter. When you use this option:

The process checks the new Angie configuration and terminates if it is invalid, outputting error messages to stderr.
It sends a HUP signal to the main Angie process and terminates.

Alternatively, you can send a HUP signal directly to the main Angie process.
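
Both ways of triggering a reload can be sketched in Python. The helper names are hypothetical, and the first helper assumes that `angie -s reload` exits with a non-zero status when the configuration check fails. The `run` and `kill` parameters exist only so the helpers can be tried without a running Angie.

```python
import os
import signal
import subprocess


def reload_with_binary(run=subprocess.run):
    """Run `angie -s reload`; raise if the command reports a failure."""
    result = run(["angie", "-s", "reload"], capture_output=True, text=True)
    if result.returncode != 0:
        # Assumption: an invalid configuration yields a non-zero exit status.
        raise RuntimeError(result.stderr.strip())


def reload_with_signal(main_pid, kill=os.kill):
    """Send HUP directly to the main Angie process."""
    kill(main_pid, signal.SIGHUP)
```

Neither helper waits for the reload to finish; confirming that new worker processes are running is a separate step.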

How to Ensure Reload Success#

The command angie -s reload does not wait for Angie’s reload to complete. As a result, it is the administrator's responsibility to confirm its success. There are several options:

Check whether the main process has created new worker processes, for example, by running ps or reading the /proc filesystem.
Send an HTTP request to Angie, and if a new worker process responds, you will know that Angie has successfully reloaded.

Note

This requires additional configuration of Angie.

Reloading takes some time, usually no less than 200 ms. This time depends on the size of the configuration, the number of TLS certificates and keys, the included modules, configuration details, and available CPU resources.
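
The HTTP-based check can be sketched as a polling loop. In this Python example, `fetch_version` is a hypothetical callable that asks Angie which configuration version the responding worker process has loaded; how it does so depends on the additional Angie configuration mentioned above. The timeout and polling interval are illustrative values, not documented defaults.

```python
import time


def wait_for_version(fetch_version, expected, timeout=4.0, interval=0.025,
                     clock=time.monotonic, sleep=time.sleep):
    """Poll until a worker reports the expected version or the time runs out."""
    deadline = clock() + timeout
    while True:
        try:
            if fetch_version() == expected:
                return True
        except OSError:
            pass  # the endpoint may be briefly unavailable during a reload
        if clock() >= deadline:
            return False
        sleep(interval)


# Example with a stand-in for the real request: the third reply is the new version.
replies = iter([1, 1, 2])
reloaded = wait_for_version(lambda: next(replies), expected=2, sleep=lambda s: None)
```

A `False` result means that no worker process reported the new version in time, which the caller can treat as a failed reload.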

Potential Issues#

In most cases, if the command angie -s reload completes successfully, the reload will also be successful. In rare cases, the reload may fail, and the main Angie process will log an error message. For example:

2022/07/09 00:56:42 [emerg] 1353#1353: limit_req "one" uses the "$remote_addr" key while previously it used the "$binary_remote_addr" key

Reloading is graceful: it does not cause traffic loss in Angie. However, frequent reloads may increase memory usage and can cause Angie to be stopped with an OOM (out of memory) error, resulting in traffic loss. This can happen if you 1) proxy traffic that uses long-lived connections (such as WebSocket or gRPC) and 2) reload the configuration frequently. In this case, multiple generations of terminating Angie worker processes may accumulate: old worker processes do not terminate until all their connections are closed by clients or proxied servers, unless you configure worker_shutdown_timeout <https://wbsrv.ru/angie-pro/docs/core/#worker-shutdown-timeout>, which forces old worker processes to terminate after a timeout. Eventually, all these worker processes may exhaust the available system memory.

Since old and new Angie worker processes coexist during a reload, memory usage can spike to as much as double. If too little memory is available, the main Angie process may be unable to create new worker processes.
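
As a rough worked example of this effect, the Python sketch below estimates an upper bound on worker memory. It assumes that every worker process uses about the same amount of memory and that each lingering generation has as many workers as the current one; the function name and the numbers are illustrative only.

```python
def peak_worker_memory_mib(workers, mib_per_worker, lingering_generations=1):
    """Rough upper bound: the current generation plus each lingering old one."""
    return workers * mib_per_worker * (1 + lingering_generations)


# One old generation during a single reload: memory usage doubles.
single_reload = peak_worker_memory_mib(workers=4, mib_per_worker=100)

# Three old generations kept alive by long-lived connections.
frequent_reloads = peak_worker_memory_mib(4, 100, lingering_generations=3)
```

With 4 workers at 100 MiB each, a single reload peaks at 800 MiB, while three lingering generations raise the bound to 1600 MiB.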