This article describes how Anodot Cost works with Kubernetes usage and waste, and includes:
- Information collected on Kubernetes clusters and how Anodot Cost organizes it.
- The sources of information: where does Anodot Cost collect it from?
- The logic used to calculate waste.
- What are weights used for?
See the video for an overview:
Information collected on Kubernetes clusters and how Anodot Cost organizes it
The process of collecting Kubernetes data consists of two main parts: Mapping Data and Cost Data.
The Mapping Data process retrieves basic information from the cloud provider and stores it in the Anodot Cost database. The information collected includes:
- Static information (settings-derived metrics):
  - Limits (cpu-limit, memory-limit)
  - Requests (cpu-request, memory-request)
  - Entities: clusters, nodes, pods, namespaces, node-groups, etc.
  - The relationships between them
- Dynamic information, which is the actual usage for:
  - Nodes and pods
  - Multiple metrics (CPU, memory, storage, data transfer, network, etc.)
Processing Stage
The processing stage:
- Assigns each of the K8s entities its proportional part of the total K8s costs (combining the data from both the Cost and Mapping phases).
- Distributes the costs in a manner that reflects the usage.
The yellow arrows demonstrate the flow of the process, where Anodot Cost correlates the data between the mapping and cost.
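The correlation step can be illustrated with a minimal sketch. It assumes that each pod is charged the share of its node's cost that matches its share of the node's CPU capacity; the record shape (`PodUsage`), the function name, and all figures are hypothetical and are not taken from Anodot Cost.

```python
from dataclasses import dataclass


@dataclass
class PodUsage:
    """One mapping-data record (hypothetical shape, for illustration only)."""
    name: str
    namespace: str
    node: str
    cpu_cores_used: float


def cost_per_namespace(node_cost, node_cpu_capacity, pods):
    """Join cost data (per node) with mapping data (per pod).

    Assumption: a pod's cost is the node cost multiplied by the pod's
    share of the node's CPU capacity.
    """
    costs = {}
    for pod in pods:
        share = pod.cpu_cores_used / node_cpu_capacity[pod.node]
        costs[pod.namespace] = costs.get(pod.namespace, 0.0) + node_cost[pod.node] * share
    return costs


# Hypothetical inputs: one node costing $80 with 8 CPU cores.
node_cost = {"node-1": 80.0}
node_cpu_capacity = {"node-1": 8.0}
pods = [
    PodUsage("web-1", "frontend", "node-1", 2.0),
    PodUsage("api-1", "backend", "node-1", 4.0),
]

namespace_costs = cost_per_namespace(node_cost, node_cpu_capacity, pods)
# Whatever is not attributed to a pod remains as unused node cost.
unattributed = sum(node_cost.values()) - sum(namespace_costs.values())
```

In this sketch the two unused cores leave part of the node cost unattributed to any namespace.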
Sources
Anodot Cost collects data from several sources, as listed in the table below:
Cloud provider | Source | Used for |
---|---|---|
AWS | CloudWatch agent | K8s |
GCP | BigQuery | Cost Metrics + K8s |
GCP | GCP Monitoring Client | K8s + Recommendations |
Azure | Azure log client | K8s |
AWS, GCP, Azure | Prometheus agent | K8s |
The connection to the Cost Data
The following image illustrates how Anodot Cost correlates the usage with the cost.
The logic used to calculate Waste
All resources that are not in use (CPU, memory, etc.) are considered waste. Waste occurs at the pod level (when usage is lower than the requested amount) and at the node level (when the total usage of the pods is lower than the node capacity).
By using namespaces, you can organize clusters into virtual sub-clusters, which is especially helpful when different teams or projects share a Kubernetes cluster. A cluster supports any number of namespaces, each logically separated from the others but able to communicate with them.
Waste at the node level that is not associated with any namespace is shown in the Kubernetes Cost & Usage Explorer as “not allocated”.
The following three examples show the different waste scenarios.
Example: No waste at the pod level
As the actual usage is higher than requested, there is no waste.
Example: Waste at the pod level
As the actual usage is lower than the requested usage, waste occurs as the pod is underutilized.
Example: Waste at the node level
The example below shows a node with 32GB of RAM that hosts 3 pods using a total of 30GB, leaving 2GB of memory unused.
- For pod-level waste, Anodot Cost applies the calculation to any resource for which it has both the request (reservation) setting and actual usage data.
- For node-level waste, Anodot Cost applies the calculation to any resource for which it has both capacity data and actual usage data (which can be aggregated from pods).
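The two calculations can be sketched as follows. This is a minimal illustration of the rules described above, not Anodot Cost's implementation: the function names are hypothetical, and the per-pod split of the 30GB (12, 10, and 8) is an assumed breakdown of the node example.

```python
def pod_waste(requested: float, used: float) -> float:
    """Pod-level waste: the part of the request that is not used.

    When usage meets or exceeds the request, there is no waste.
    """
    return max(requested - used, 0.0)


def node_waste(capacity: float, pod_usages: list[float]) -> float:
    """Node-level waste: node capacity not used by any pod."""
    return max(capacity - sum(pod_usages), 0.0)


# Pod level (hypothetical figures, in GB of memory).
no_waste = pod_waste(requested=4.0, used=5.0)    # usage above the request
some_waste = pod_waste(requested=4.0, used=2.5)  # usage below the request

# Node level: a 32GB node whose three pods use 30GB in total.
unused_memory_gb = node_waste(capacity=32.0, pod_usages=[12.0, 10.0, 8.0])
```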
Note that Anodot Cost currently shows only memory and CPU waste at the pod and node levels. Pod-level waste is still measured as part of the pod usage, so its cost is attributed to the pod’s namespace.
Using the Allocate Waste Cost option (available in the Kubernetes Explorer, as shown below), you can distribute the node-level unused cost among the pods of each node, in proportion to each pod’s share of the node’s usage. This ensures each cost is associated with an actual namespace (for accounting purposes).
For example:
In the Kubernetes Cost & Usage Explorer, when grouping by namespace, you can see:
- Namespaces which are “Not allocated” - highlighted in the Legend below the graph - representing the cost of nodes without a namespace.
- A specific namespace - called “devops” with cost of $597.46.
In the next screenshot, when “Allocate waste cost” is selected, the following occurs:
- The “Not allocated” namespaces disappear and their cost is distributed proportionally across the existing namespaces.
- The “devops” namespace cost increases to $724, which includes its share of the not allocated cost, proportional to its cost relative to the other namespaces.
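The effect of the option can be sketched as a proportional redistribution. The function name and all figures below are hypothetical round numbers chosen for readability; they do not reproduce the values in the screenshots, and the sketch assumes the not allocated cost is split in proportion to each namespace's existing cost.

```python
def allocate_waste_cost(namespace_costs: dict[str, float], not_allocated: float) -> dict[str, float]:
    """Spread the not allocated cost across namespaces by their share of the allocated cost."""
    allocated_total = sum(namespace_costs.values())
    if allocated_total == 0:
        # Nothing to base the proportions on; leave the costs unchanged.
        return dict(namespace_costs)
    return {
        name: cost + not_allocated * cost / allocated_total
        for name, cost in namespace_costs.items()
    }


# Hypothetical costs before allocation, in dollars.
before = {"devops": 600.0, "data": 300.0, "web": 100.0}
after = allocate_waste_cost(before, not_allocated=200.0)
```

Here "devops" holds 60% of the allocated cost, so it receives 60% of the not allocated amount. The total across namespaces after allocation equals the allocated total plus the not allocated cost.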
What are Weights used for?
Note that this feature is only relevant for AWS Cloud accounts.
Weights address cases where Anodot Cost cannot definitively determine the cost of a pod. In these cases, Anodot Cost lets the user assign weights per instance type, and the costs are shown accordingly. If no weights are specified, Anodot Cost assigns a default ratio of 1:1 between the resources.
The examples below demonstrate how weights can help solve a number of issues.
In AWS, each node is backed by an EC2 instance with a fixed price based on its type. This makes it challenging to divide the cost among the pods hosted by the instance.
In a simplified example, where each pod used 25% of the node’s capacity, dividing the cost is easy. The waste in this example is obviously $50.
However, in reality, the ratio of pod usage to node capacity differs for each resource. Anodot Cost can bound the cost of each pod to a range, but it cannot provide a single definitive cost value for an individual pod. There’s no “correct” answer.
The value of each isolated resource (RAM, CPU, network) is subjective. The only objective value is the total cost for the entire instance.
To overcome this, Anodot Cost lets the user assign weights per instance type and splits the costs accordingly.
Note that if no weights are specified, Anodot Cost assigns a default ratio of 1:1 between the resources. The weight options can be found under the Kubernetes > Preferences option, as shown below.
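One way to picture how weights turn per-resource usage into a single pod cost is the sketch below. The formula is an assumption made for illustration (each resource contributes its normalized weight multiplied by the pod's share of that resource's capacity); it is not confirmed to be the calculation Anodot Cost performs, and the function name and figures are hypothetical.

```python
def weighted_pod_cost(instance_price, weights, pod_usage, node_capacity):
    """Assumed formula: price * sum(normalized weight * usage share) over resources."""
    total_weight = sum(weights.values())
    return instance_price * sum(
        (weight / total_weight) * (pod_usage[resource] / node_capacity[resource])
        for resource, weight in weights.items()
    )


# Hypothetical instance: $100, 8 CPU cores, 32GB of RAM.
capacity = {"cpu": 8.0, "memory": 32.0}
# Hypothetical pod: uses 50% of the CPU but only 25% of the memory.
usage = {"cpu": 4.0, "memory": 8.0}

default_cost = weighted_pod_cost(100.0, {"cpu": 1.0, "memory": 1.0}, usage, capacity)
cpu_heavy_cost = weighted_pod_cost(100.0, {"cpu": 3.0, "memory": 1.0}, usage, capacity)

# The extremes bound the pod's cost: all weight on memory vs. all weight on CPU.
lower_bound = weighted_pod_cost(100.0, {"cpu": 0.0, "memory": 1.0}, usage, capacity)
upper_bound = weighted_pod_cost(100.0, {"cpu": 1.0, "memory": 0.0}, usage, capacity)
```

Under this assumption, the same pod costs between $25 and $50 depending on the weights, which is why no single value is objectively correct; the 1:1 default lands in the middle of the range.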
Network usage and cost
Currently, Anodot Cost also treats “Network” usage as a compute resource. “Network” refers to any data transfer activity, regardless of whether it incurs an actual cost.
As a compute resource, “Network” works the same way as RAM and CPU when setting “cost weights”, but by default its weight is set to zero (so no cost originating from the EC2 price is allocated to the pod’s network usage).