Choosing your ITRS Analytics deployment

This guide helps you choose how to deploy ITRS Analytics based on what matters most to your organization, including where the platform is hosted, who operates it, resiliency, and backup capabilities. It also explains the trade-offs associated with each option. Your choice of deployment model directly affects high availability, operational continuity, and your ability to meet uptime and compliance requirements.

Use the steps below to align your deployment with your organization’s requirements.

Identify your business requirements

Before choosing a deployment model, first define your organization’s requirements. Consider the following:

Data residency — Where must your data be stored? Is an AWS region acceptable, or must it remain on-premises or within a private cloud?
Operational ownership — Who will operate the platform? Should ITRS manage the infrastructure and operations, or will your team be responsible for running it?
Availability and resilience — Do you require production-grade high availability or backup and restore capabilities, or is a simpler setup sufficient for evaluation purposes?
Kubernetes environment — Do you already have a Kubernetes platform or team in place (for example, EKS, AKS, GKE, OpenShift, or a self-hosted Kubernetes cluster)?
Compliance requirements — Are there compliance, audit, or data retention requirements that influence where or how the platform must run?
Cost model preference — Do you prefer a predictable managed bundle (software and hosting) or a software-only model using your own infrastructure?

Choose between SaaS and Self-hosted

Your first decision is whether ITRS manages the platform for you (SaaS) or your organization operates it on your own infrastructure (self-hosted). This choice determines who is responsible for running, maintaining, and securing the platform, as well as where the system is hosted.

Feature	SaaS	Self-hosted
Hosting location	AWS Public Cloud, managed by ITRS	On-premises, private cloud, or public cloud via Bring Your Own (Kubernetes) Cluster or Embedded Cluster
Platform management	ITRS Cloud Operations teams	Internal Kubernetes DevOps team
Data residency	Client’s choice of AWS Region (costs may vary by region)	Client’s full choice of location and environment
Security	TLS, mTLS, Equinix Cloud Connect	TLS, mTLS, service mesh
High availability	Deployment spans two availability zones within a single region	Requires an HA Kubernetes design: minimum three controller nodes for Embedded Cluster HA, workload replicas distributed across failure domains, and a customer-managed load balancer; see resource and hardware requirements
Data backup	Daily automated backups	Daily backups available for BYOC deployments
Upgrade management	Planned upgrades, client-approved, no more than two weeks after a release	Client’s responsibility for image mirroring and upgrade execution
Application customization	Full identity, role, and access management available with SSO	Full identity, role, and access management available with SSO
Cost model	Software costs plus hosting costs (Small, Medium, Large, Extra-Large sizes); optional 2 TB additional storage	Software costs only; infrastructure costs are the client’s responsibility
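The three-controller minimum listed for Embedded Cluster HA is consistent with majority-quorum arithmetic. The sketch below assumes that the control plane keeps its state in a majority-quorum datastore such as etcd; this guide does not state that, so treat it as an assumption. The function names are illustrative only.

```python
def quorum_size(controllers: int) -> int:
    """Smallest majority of `controllers` voting members."""
    if controllers < 1:
        raise ValueError("at least one controller is required")
    return controllers // 2 + 1


def tolerated_failures(controllers: int) -> int:
    """Controllers that can fail while a majority still remains."""
    return controllers - quorum_size(controllers)


if __name__ == "__main__":
    # Assumption: majority-quorum control plane (not confirmed by this guide).
    for n in (1, 2, 3, 5):
        print(f"{n} controllers: quorum={quorum_size(n)}, "
              f"tolerated failures={tolerated_failures(n)}")
```

Under this assumption, one or two controllers tolerate no failures, while three controllers tolerate one, which is why three is the smallest count that adds fault tolerance.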

When to choose SaaS

Choose SaaS when:

You want ITRS to manage platform operations, upgrades, and infrastructure health.
Your team does not have dedicated Kubernetes or cloud operations expertise.
You require a predictable, fully managed cost model that bundles software and hosting.
Data residency requirements can be met by an available AWS region.
You need enterprise-grade high availability across two availability zones without managing the underlying infrastructure yourself.

When to choose Self-hosted

Choose self-hosted when:

Your organization requires data to remain within a specific on-premises environment or private cloud that is not covered by SaaS regional options.
You have an existing Kubernetes operations team capable of managing platform lifecycle.
Your organization’s procurement, security, or compliance policies require full infrastructure ownership.
You need to integrate ITRS Analytics into an existing internal platform ecosystem (networking, security tooling, storage, observability pipelines).

Note

With self-hosted deployments, your internal teams are responsible for patching, upgrades, backup management, and maintaining infrastructure resiliency. Ensure you have the appropriate platform expertise before selecting this model.

If your requirements point to self-hosted, the next step is to decide how you will run Kubernetes.

Key resiliency concepts

When planning your ITRS Analytics deployment, these fundamental concepts work together to define the platform’s operational characteristics.

High availability (HA)

High availability ensures that your observability platform continues to operate without interruption, even if individual components fail. This is achieved by deploying redundant services, load balancers, and failover mechanisms so that if one workload becomes unavailable, another seamlessly takes over.

Key characteristics:

Both self-hosted options, Bring Your Own (Kubernetes) Cluster (BYOC) and Embedded Cluster, support full high availability.
In multi-node deployments, no node should have a round-trip time (RTT) greater than 10 ms to any other node in the Kubernetes cluster.
The key architectural difference is storage:
- BYOC typically uses network-attached persistent volumes, so workloads may be rescheduled to surviving nodes after a failure, provided sufficient spare capacity exists.
- Embedded Cluster uses node-local persistent volumes, so when a node fails, workloads that depend on that storage cannot be rescheduled elsewhere.
As a result, an Embedded Cluster continues running in a degraded state until the affected node returns, with fewer replicas, reduced capacity, and lower fault tolerance.
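The 10 ms RTT limit above can be checked against measurements before installation. This is a minimal sketch, assuming you have already collected pairwise round-trip times (for example with ping). The node names, sample values, and function names are hypothetical and are not part of ITRS Analytics.

```python
from itertools import combinations

MAX_RTT_MS = 10.0  # limit stated in this guide for multi-node deployments


def rtt_violations(
    rtt_ms: dict[tuple[str, str], float], limit_ms: float = MAX_RTT_MS
) -> list[tuple[str, str, float]]:
    """Return (node_a, node_b, rtt) for every measured pair above the limit."""
    return sorted((a, b, rtt) for (a, b), rtt in rtt_ms.items() if rtt > limit_ms)


def missing_pairs(
    nodes: list[str], rtt_ms: dict[tuple[str, str], float]
) -> list[tuple[str, str]]:
    """Return node pairs that have no measurement in either direction."""
    measured = {frozenset(pair) for pair in rtt_ms}
    return [
        pair
        for pair in combinations(sorted(nodes), 2)
        if frozenset(pair) not in measured
    ]


if __name__ == "__main__":
    # Hypothetical measurements in milliseconds.
    sample = {
        ("node-a", "node-b"): 0.8,
        ("node-a", "node-c"): 12.5,
    }
    print("violations:", rtt_violations(sample))
    print("unmeasured:", missing_pairs(["node-a", "node-b", "node-c"], sample))
```

Reporting unmeasured pairs separately matters because the requirement applies to every pair of nodes, not only to the pairs that happened to be tested.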

Continuous operations

Continuous operations means that the platform keeps running during localized failures (pod or node outages) within a single cluster.

Important
ITRS Analytics does not provide built-in cross-site or cross-region disaster recovery. For protection against data center or regional failures, you must run multiple independent deployments and implement your own DR strategy (sync, failover, runbooks).

Deploy Kubernetes in a Self-hosted environment

ITRS Analytics is built on a Kubernetes-native architecture, designed for continuous high availability, scalable deployments, and resilient operations. If you select self-hosted, you must then choose how Kubernetes is provisioned: either you supply the cluster yourself (Bring Your Own (Kubernetes) Cluster, or BYOC), or you use ITRS’s bundled Kubernetes distribution (Embedded Cluster).

Note

Bring Your Own (Kubernetes) Cluster (BYOC) is the recommended deployment model for production. It typically uses network-attached persistent volumes, which means workloads can be rescheduled to surviving nodes after a failure if sufficient spare capacity exists and the volumes remain accessible over the network.

Embedded Cluster also supports high availability, but it uses node-local persistent volumes. If a node fails, workloads that depend on that storage cannot be rescheduled elsewhere, so the cluster continues in a degraded state until the affected node returns. Choose Embedded Cluster for production environments where Kubernetes expertise is not available.

Designing a resilient ITRS Analytics deployment

Select BYOC when enterprise high availability, backup and restore, standard security tooling integration, and predictable scaling are required, and when your Kubernetes platform can reschedule workloads onto surviving nodes with sufficient spare capacity.
Select Embedded Cluster for production environments where Kubernetes expertise is unavailable; note that a failed node causes the cluster to run in a degraded state with fewer replicas and reduced capacity until that node returns, because workloads cannot be rescheduled to surviving nodes.

Feature	Bring Your Own Cluster (BYOC)	Embedded Cluster
Platform ownership	Customer-managed Kubernetes (EKS, AKS, GKE, OpenShift, self-hosted)	Kubernetes bundled and managed through ITRS-packaged K0s
High availability	Full HA with workload rescheduling across surviving nodes when a node fails, provided sufficient spare capacity exists and network-attached storage remains accessible	Full HA through application-level replication; a node failure causes degraded operation with fewer replicas and reduced capacity until the node returns, because workloads cannot be rescheduled due to node-local storage
Storage architecture	Persistent volumes are decoupled through storage classes; supports dynamic expansion	Storage is tied to local node disks; data loss risk if a node fails without HA configured
Backup and restore	Supported using platform tooling such as Velero	Infrastructure-level backup should be used; Velero does not support the node-local filesystem storage classes used by Embedded Cluster, so use VM snapshots from the hypervisor or storage-level snapshots from the underlying storage platform
Load balancing and networking	Supports native cloud and enterprise load balancers with DNS integration, such as AWS NLB, Azure Load Balancer, GCP Load Balancer, or F5	No built-in load balancer; customers must supply and manage their own, such as HAProxy, keepalived, F5, AWS NLB, Azure Load Balancer, or GCP Load Balancer. A bundled software load balancer is not included because these solutions depend on environment-specific external network cooperation such as ARP/GARP or BGP, which is commonly restricted or unsupported across cloud and many on-premises networks
Security	Kubernetes-native security model integrates cleanly with platform operations and customer-controlled controls such as network policies, admission controllers, and pod security standards	Host-level security controls such as antivirus, EDR agents, SSL/TLS inspection, and host firewalls must be validated and exclusions configured before installation; these controls can block image pulls, container runtime activity, or inter-node communication
Operational responsibility	Clearly divided across infrastructure, platform, and application teams; cluster issues resolved at the appropriate layer	Cluster issues must be escalated to ITRS because customers do not have direct access to the bundled K0s Kubernetes layer or its diagnostic tools
Maintenance and patching	Integrates with existing customer patching and lifecycle processes for Kubernetes, OS, storage, and networking	Increased coordination risk; patching and upgrades may require downtime and careful change management
Disaster recovery	Not built-in; deploy multiple independent ITRS Analytics instances for DR	Not built-in; deploy multiple independent ITRS Analytics instances for DR

The following scenarios illustrate how the choice between BYOC and Embedded Cluster plays out in practice across key operational areas: load balancing, storage scalability, recovery behavior, security, and team responsibilities.

Ensuring resilient access with load balancers

Scenario: Your organization runs multiple ITRS Analytics ingestion services and UIs that must remain accessible even during high traffic spikes.

In a Bring Your Own Cluster environment, especially in cloud-based setups, a load balancer is typically readily available and integrates seamlessly with Kubernetes. It distributes traffic across multiple service replicas and often integrates with DNS services, helping maintain stable URLs and endpoints during scaling events or network changes.

In Embedded Cluster deployments, a load balancer is still required but is not provided as part of the deployment. Customers must supply and manage their own load balancer, which can be hardware-based or software-based. This usually requires additional planning and coordination with the network or infrastructure team.

Scaling storage dynamically with decoupled storage classes

Scenario: Your ClickHouse workload grows steadily from 500 GB to several terabytes of data over time.

In a Bring Your Own (Kubernetes) Cluster (BYOC) environment, storage is decoupled from individual nodes. Persistent volumes remain on network-attached backing storage, so rescheduled pods can mount the same volumes from surviving nodes that can reach the storage network, and extensible storage classes allow volumes to grow seamlessly as data increases.

In Embedded Cluster (EC) deployments, storage is tied to local node disks. If a node becomes unavailable, the workloads depending on that node cannot be rescheduled elsewhere, and the system may operate in a degraded state until the node comes back online.

Understanding degraded operation after node failure

Scenario: A node in your ITRS Analytics deployment fails unexpectedly during peak monitoring hours.

In a Bring Your Own (Kubernetes) Cluster (BYOC) environment, Kubernetes may reschedule affected workloads onto surviving nodes if sufficient spare capacity exists and network-attached persistent volumes remain accessible. Services can remain available with limited disruption, but this depends on the resilience and capacity of the underlying Kubernetes platform.

In an Embedded Cluster (EC) deployment, surviving replicas continue running automatically, but workloads tied to the failed node’s local storage cannot be rescheduled elsewhere. The cluster therefore runs in a degraded state with fewer replicas, reduced capacity, and lower fault tolerance until the node comes back online. If another node fails while the cluster is already degraded, the risk of service disruption or data unavailability increases because there is less remaining redundancy.
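The two behaviors described in this scenario can be sketched numerically. This is a simplified model under stated assumptions: the BYOC check compares aggregate CPU only and ignores per-pod bin-packing, memory, and scheduling constraints, and the Embedded Cluster function only counts replicas. The node names, workload names, and figures are hypothetical.

```python
def has_spare_capacity(
    node_cpu: dict[str, float], requested_cpu: float, failed: str
) -> bool:
    """BYOC view: can the surviving nodes hold the total CPU requested?

    Assumption: an aggregate comparison stands in for real scheduling.
    """
    remaining = sum(cpu for node, cpu in node_cpu.items() if node != failed)
    return requested_cpu <= remaining


def surviving_replicas(
    placement: dict[str, list[str]], failed: str
) -> dict[str, int]:
    """Embedded Cluster view: replicas left per workload when one node is down.

    Replicas on the failed node are not rescheduled because their storage
    is local to that node.
    """
    return {
        workload: sum(1 for node in nodes if node != failed)
        for workload, nodes in placement.items()
    }


if __name__ == "__main__":
    # Hypothetical three-node cluster with 16 CPU cores per node.
    nodes = {"n1": 16.0, "n2": 16.0, "n3": 16.0}
    print(has_spare_capacity(nodes, requested_cpu=30.0, failed="n3"))
    print(has_spare_capacity(nodes, requested_cpu=40.0, failed="n3"))

    # Hypothetical replica placement: workload -> nodes hosting a replica.
    placement = {"clickhouse": ["n1", "n2", "n3"], "ingestion": ["n1", "n2"]}
    print(surviving_replicas(placement, failed="n3"))
```

In the Embedded Cluster case, a workload whose count drops below its original replica count is running degraded, and a count of zero would indicate that the workload is unavailable until the node returns.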

Meeting security requirements

Scenario: Your IT security team requires visibility and control over Kubernetes-native security policies.

In a Bring Your Own Cluster environment, customers can apply and manage Kubernetes-native controls such as network policies, admission controllers, and pod security standards as part of standard cluster governance.

In an Embedded Cluster deployment, this is less of a differentiator: customers have already accepted a more opaque, black-box operating model for the Kubernetes layer that runs on the provided virtual machines.

Scenario: Your organization secures all servers with antivirus, EDR agents, SSL/TLS inspection, and host firewalls.

For Embedded Cluster, these host-level controls must be validated before installation. Customers should configure exclusions or bypass rules so that security tooling does not block installation steps, image pulls, container runtime activity, service-to-service TLS, or inter-node communication. In BYOC environments, these controls are typically handled through the organization’s existing Kubernetes platform hardening and node management practices.

Streamlined support across teams

Scenario: Your organization has separate teams for infrastructure, platform, and application operations.

In a Bring Your Own Cluster setup, responsibilities are clearly divided: infrastructure teams manage nodes, platform teams administer Kubernetes, and application teams deploy and manage ITRS Analytics. Issues can be addressed at the appropriate layer.

With Embedded Cluster, cluster-level issues must be escalated to ITRS because customers do not have direct access to the bundled K0s Kubernetes layer or the usual platform-level diagnostic tools needed to investigate those problems themselves.

Deployment scenarios

The following sections describe various deployment scenarios, each with specific benefits and trade-offs. Understanding these helps you select the right configuration for your requirements.

Non-HA single or multi-node (BYOC)

This configuration is suitable for proof-of-concept deployments and smaller production environments where high availability is not a strict requirement.

Common use cases:

SaaS proofs of concept
Small SaaS Geneos or Opsview observability deployments
Development and testing environments

Characteristics:

Lower infrastructure costs
Backup and restore available with 24-hour recovery time objective
Suitable for environments with flexible uptime requirements
Managed by ITRS cloud operations teams for SaaS deployments

Note
Proof-of-concept deployments come with no guarantee of high availability for stored data due to their exploratory nature.

Non-HA single or multi-node (Embedded Cluster)

This configuration is similar to the Bring Your Own Cluster non-HA configuration, but it is deployed on-premises using Embedded Cluster. This option has additional limitations around data protection.

Common use cases:

On-premises proof-of-concept deployments
Small production use cases with relaxed uptime requirements

Important considerations:

Lower infrastructure costs
Velero-based backup and restore is not supported for node-local filesystem storage classes; use infrastructure-level backups such as hypervisor VM snapshots or storage-platform snapshots instead
Risk of complete data loss if a node fails catastrophically
Requires complete rebuild if storage is lost

Warning
Before using Embedded Cluster in production, plan and validate an alternative backup strategy such as hypervisor VM snapshots or storage-level snapshots. Without a tested backup strategy, a node failure that causes disk loss can result in permanent data loss with no recovery path.

Plan your rollout

Use the following summary to confirm your deployment choice before proceeding to installation.

Requirement	Recommended model
ITRS manages all infrastructure	SaaS
Data must stay in a specific AWS region	SaaS
Data must stay on-premises or in a private cloud	Self-hosted
Full control over Kubernetes platform	Self-hosted and BYOC
No in-house Kubernetes expertise; evaluation or constrained production requirements	Self-hosted and Embedded Cluster
No in-house Kubernetes expertise; production deployment with sufficient replicas on all nodes and a validated external backup strategy	Self-hosted and Embedded Cluster
Production-grade high availability required	SaaS or Self-hosted (BYOC or Embedded Cluster with sufficient replicas and accepted degraded-operation trade-offs)
Backup and restore required	SaaS or Self-hosted and BYOC
Security team must enforce Kubernetes-native controls such as network policies, admission controllers, or image and pod-layer scanning	Self-hosted and BYOC
Existing cloud Kubernetes service (EKS, AKS, GKE)	Self-hosted and BYOC
Deployment on VMs or bare metal without Kubernetes	Self-hosted and Embedded Cluster
Strict compliance or audit data retention requirements	SaaS or Self-hosted and BYOC
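The summary table can also be read as a short decision procedure. The sketch below encodes a simplified subset of its rows. The field names, the rule order, and the fallback to SaaS when nothing forces self-hosting are assumptions made for illustration; it is not an official selection tool.

```python
from dataclasses import dataclass


@dataclass
class Requirements:
    """Hypothetical flags derived from rows of the summary table."""
    itrs_manages_infrastructure: bool = False
    data_on_prem_or_private_cloud: bool = False
    kubernetes_native_controls: bool = False
    has_kubernetes_platform: bool = False


def recommend(req: Requirements) -> str:
    """Return a recommended model using the table's own wording."""
    needs_self_hosted = (
        req.data_on_prem_or_private_cloud or req.kubernetes_native_controls
    )
    if req.itrs_manages_infrastructure and needs_self_hosted:
        # ITRS-managed hosting and self-hosting cannot both be satisfied.
        return "Conflict: revisit requirements"
    if req.itrs_manages_infrastructure:
        return "SaaS"
    if req.has_kubernetes_platform or req.kubernetes_native_controls:
        return "Self-hosted and BYOC"
    if req.data_on_prem_or_private_cloud:
        return "Self-hosted and Embedded Cluster"
    # Assumption: with no Kubernetes platform and no self-hosting driver,
    # the managed option is the default.
    return "SaaS"


if __name__ == "__main__":
    print(recommend(Requirements(itrs_manages_infrastructure=True)))
    print(recommend(Requirements(data_on_prem_or_private_cloud=True)))
    print(recommend(Requirements(data_on_prem_or_private_cloud=True,
                                 has_kubernetes_platform=True)))
```

Rows that involve trade-offs, such as production high availability on Embedded Cluster, are deliberately left out because they need judgment rather than a boolean flag.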

To continue planning your deployment, refer to the following resources:

For backup and restore procedures, see the Backup and restore documentation.
For infrastructure sizing and resource requirements, see ITRS Analytics Sizer.
