Volume Manager Controller
Architecture
The Volume Manager Controller is responsible for receiving information from the Observability Collectors and Observability Processors. It then performs a set of operations to contextualize and analyze this information. Along with the user-provided policies, the Controller generates insights and conducts automation operations to control and optimize the volume of observability data.
The volume manager is built around pipelines of tasks that are executed asynchronously. Each task performs a different function, and the pipelines are configurable to allow flexibility and support for various use cases.
Tasks can be optimized and parallelized to enhance scalability and efficiency. Each task is classified based on its type, which determines its specific functionality and characteristics. The input and output data structures of each task differ from one another. The controller manages and arranges the tasks in Directed Acyclic Graphs (DAGs), and performs operations on them periodically and on-demand. The data ingestion into the controller is batched and processed between the various task operations.
The controller requires data persistence to efficiently analyze observability data over time. Data persistence is decoupled from computing tasks, allowing flexibility for specific use cases.
The controller's behavior is governed by a collection of high-level semantic policies that are visible to the user. This enables the controller to be managed based on intentions. The policies are analyzed by the controller and compared with the observed data to provide insights and configurations for handling the volume of observability data.
Task flows
Basic
For basic use cases, the volume manager controller tasks can be configured as follows:
flowchart LR
direction TB
subgraph Ingress[" "]
direction TB
Ingress-Node["Ingest\n(prometheus metrics)"]
end
subgraph User[" "]
direction TB
User-Policy{{"User\nPolicy"}}
User-Policy:::policyclass
User-Node["Rules\nEngine"]
User-Policy-.->User-Node
end
subgraph Insights[" "]
direction TB
Insights-Node["Volume reduction\nInsights"]
end
subgraph Automation[" "]
direction TB
Automation-Policy{{"Automation\nPolicy"}}
Automation-Policy:::policyclass
Automation-Node["Configuration\nGenerator"]
Automation-Policy-.->Automation-Node
end
subgraph Egress[" "]
direction TB
Egress-Node["Egress\n(Processor\nConfiguration)"]
end
classDef policyclass fill:lightblue
Ingress --> User --> Insights --> Automation --> Egress
Note: In this MVP controller pipeline. The configuration of the controller uses a subset of the available tasks.
Advanced
In advanced use cases, the task pipeline can be extended to provide additional capabilities. For example:
flowchart LR
direction TB
subgraph MetricsIngress[" "]
direction TB
MetricsIngress-Node["Ingest\n(prometheus signals)"]
end
subgraph LogsIngress[" "]
direction TB
LogsIngress-Node["Ingest\n(logs signals)"]
end
subgraph Grouping[" "]
direction TB
Grouping-Policy{{"Grouping\nPolicy"}}
Grouping-Policy:::policyclass
Grouping-Node["Signal-Grouping\n(labels)"]
Grouping-Policy-.->Grouping-Node
end subgraph Features[" "]
direction TB
Features-Node["Features-Extraction\n(labels)"]
end
subgraph ObserFeatures[" "]
direction TB
ObserFeatures-Node["Observability-Domain\nFeatures\n(Extra labels)"]
end
subgraph User[" "]
direction TB
User-Policy{{"User\nPolicy"}}
User-Policy:::policyclass
User-Node["Rules\nEngine"]
User-Policy-.->User-Node
end
subgraph Insights[" "]
direction TB
Insights-Node["Volume reduction\nInsights"]
end
subgraph Automation[" "]
direction TB
Automation-Policy{{"Automation\nPolicy"}}
Automation-Policy:::policyclass
Automation-Node["Configuration\nGenerator"]
Automation-Policy-.->Automation-Node
end
subgraph Egress[" "]
direction TB
Egress-Node["Egress\n(Processor\nConfiguration)"]
end
classDef policyclass fill:lightblue
MetricsIngress --> Grouping --> Features --> ObserFeatures --> User --> Insights --> Automation --> Egress
LogsIngress --> Grouping
Task types
Following are basic explanations for each of the task types:
Task Type | Description |
---|---|
Ingest | Ingest tasks are responsible for ingesting information into the controller. The objects ingested into the controller are “signals”. Signals can be each of the basic common observability data sources: metrics, logs, traces, etc. Signals can be ingested into the controller both synchronously and asynchronously. |
Grouping | Grouping tasks use data from the signals to cluster, group, and partition multiple signals into a signal group . The grouping rules are policy-driven and rely on meta-data and data from the signals. The clustering is based on raw labels provided as part of the signals. |
Feature-Extraction | Feature Extraction tasks are responsible for the basic statistical analysis of the signals. They analyze the signal behaviors and produce a basic set of understandings of the signals. |
Observability-Analysis | Observability analysis tasks are responsible for domain-specific analysis of signals generating observability-level understandings of signals |
User-Policy-Analyzer | The policy enforcer will intersect the user-provided policies with the observability signals information and analysis gathered by the controller to generate policy driver information. |
System-Policy-Analyzer | system policy analyzer tasks are responsible for the analysis of system behavior and the correlation of system risk analysis with the observability data. The tasks will annotate the signals with relevant information to identify the dynamic applicability of the signals according to the policies |
Signal-Insight | Insight tasks are responsible for the generation of volume management insights. Those insights are user-facing outputs of the pipelines and according to policies can be consumed by the users or pushed. The insights are tangible, environment-specific, and dynamic. They provide insights into volume management behaviors and action recommendations. |
Automation/configuration generator | Automation tasks are responsible for the generation of per-processor configuration based on the action recommendation of the insights tasks. The configurations are sent using a customer-provided control plane to the processors to enforce the volume management reductions automatically. |
TBD:
Configurations the configuration tasks are
Feature extraction feature
Persistence:
Objects to consider: