Choosing your ITRS Analytics deployment

This guide helps you choose how to deploy ITRS Analytics based on what matters most to your organization, including where the platform is hosted, who operates it, resiliency, and backup capabilities. It also explains the trade-offs associated with each option. Your choice of deployment model directly affects high availability, operational continuity, and your ability to meet uptime and compliance requirements.

Use the steps below to align your deployment with your organization’s requirements.

Identify your business requirements

Before choosing a deployment model, first define your organization’s requirements. Consider the following:

Choose between SaaS and Self-hosted

Your first decision is whether ITRS manages the platform for you (SaaS) or your organization operates it on your own infrastructure (self-hosted). This choice determines who is responsible for running, maintaining, and securing the platform, as well as where the system is hosted.

| Feature | SaaS | Self-hosted |
| --- | --- | --- |
| Hosting location | AWS Public Cloud, managed by ITRS | On-premises, private cloud, or public cloud via Bring Your Own (Kubernetes) Cluster or Embedded Cluster |
| Platform management | ITRS Cloud Operations teams | Internal Kubernetes DevOps team |
| Data residency | Client’s choice of AWS Region (costs may vary by region) | Client’s full choice of location and environment |
| Security | TLS, mTLS, Equinix Cloud Connect | TLS, mTLS, service mesh |
| High availability | Deployment spans two availability zones within a single region | Requires an HA Kubernetes design: minimum three controller nodes for Embedded Cluster HA, workload replicas distributed across failure domains, and a customer-managed load balancer; see resource and hardware requirements |
| Data backup | Daily automated backups | Daily backups available for BYOC deployments |
| Upgrade management | Planned upgrades, client-approved, no more than two weeks after a release | Client’s responsibility for image mirroring and upgrade execution |
| Application customization | Full identity, role, and access management available with SSO | Full identity, role, and access management available with SSO |
| Cost model | Software costs plus hosting costs (Small, Medium, Large, Extra-Large sizes); optional 2 TB additional storage | Software costs only; infrastructure costs are the client’s responsibility |
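
The three-controller minimum listed for Embedded Cluster HA is consistent with majority-quorum arithmetic. The sketch below assumes the control plane depends on a majority-quorum datastore (etcd is a common example); this guide does not state which datastore is used, and the function names are illustrative only.

```python
def quorum_size(controllers: int) -> int:
    """Smallest majority of a controller group (assumes majority quorum)."""
    if controllers < 1:
        raise ValueError("at least one controller is required")
    return controllers // 2 + 1


def tolerated_failures(controllers: int) -> int:
    """Controllers that can fail while a majority is still reachable."""
    return controllers - quorum_size(controllers)


if __name__ == "__main__":
    # Compare control-plane sizes: count, quorum, tolerated failures.
    for count in (1, 2, 3, 5):
        print(count, quorum_size(count), tolerated_failures(count))
```

Under this assumption, two controllers tolerate no more failures than one, which is why three is the smallest count that survives the loss of a single controller.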

When to choose SaaS

Choose SaaS when:

When to choose Self-hosted

Choose self-hosted when:

Note

With self-hosted deployments, your internal teams are responsible for patching, upgrades, backup management, and maintaining infrastructure resiliency. Ensure you have the appropriate platform expertise before selecting this model.

If your requirements point to self-hosted, the next step is to decide how you run Kubernetes.

Key resiliency concepts

When planning your ITRS Analytics deployment, these fundamental concepts work together to define the platform’s operational characteristics.

High availability (HA)

High availability ensures that your observability platform continues to operate without interruption, even if individual components fail. This is achieved by deploying redundant services, load balancers, and failover mechanisms so that if one workload becomes unavailable, another seamlessly takes over.

Key characteristics:

Continuous operations

Continuous operations means that the platform keeps running during localized failures (pod or node outages) within a single cluster.

Important

ITRS Analytics does not provide built-in cross-site or cross-region disaster recovery. For protection against data center or regional failures, you must run multiple independent deployments and implement your own DR strategy, including data synchronization, failover, and runbooks.

Deploy Kubernetes in a Self-hosted environment

ITRS Analytics is built on a Kubernetes-native architecture, designed for continuous high availability, scalable deployments, and resilient operations. If you select self-hosted, you must then choose how Kubernetes is provisioned: either by Bring Your Own (Kubernetes) Cluster (BYOC) or by using ITRS’s bundled Kubernetes distribution (Embedded Cluster).

Note

Bring Your Own (Kubernetes) Cluster (BYOC) is the recommended deployment model for production. It typically uses network-attached persistent volumes, which means workloads can be rescheduled to surviving nodes after a failure if sufficient spare capacity exists and the volumes remain accessible over the network.

Embedded Cluster also supports high availability, but it uses node-local persistent volumes. If a node fails, workloads that depend on that storage cannot be rescheduled elsewhere, so the cluster continues in a degraded state until the affected node returns. Choose Embedded Cluster for production environments where Kubernetes expertise is not available.

Designing a resilient ITRS Analytics deployment

| Feature | Bring Your Own Cluster (BYOC) | Embedded Cluster |
| --- | --- | --- |
| Platform ownership | Customer-managed Kubernetes (EKS, AKS, GKE, OpenShift, self-hosted) | Kubernetes bundled and managed through ITRS-packaged K0s |
| High availability | Full HA with workload rescheduling across surviving nodes when a node fails, provided sufficient spare capacity exists and network-attached storage remains accessible | Full HA through application-level replication; a node failure causes degraded operation with fewer replicas and reduced capacity until the node returns, because workloads cannot be rescheduled due to node-local storage |
| Storage architecture | Persistent volumes are decoupled through storage classes; supports dynamic expansion | Storage is tied to local node disks; data loss risk if a node fails without HA configured |
| Backup and restore | Supported using platform tooling such as Velero | Infrastructure-level backup should be used; Velero does not support the node-local filesystem storage classes used by Embedded Cluster, so use VM snapshots from the hypervisor or storage-level snapshots from the underlying storage platform |
| Load balancing and networking | Supports native cloud and enterprise load balancers with DNS integration, such as AWS NLB, Azure Load Balancer, GCP Load Balancer, or F5 | No built-in load balancer; customers must supply and manage their own, such as HAProxy, keepalived, F5, AWS NLB, Azure Load Balancer, or GCP Load Balancer. A bundled software load balancer is not included because these solutions depend on environment-specific external network cooperation such as ARP/GARP or BGP, which is commonly restricted or unsupported across cloud and many on-premises networks |
| Security | Kubernetes-native security model integrates cleanly with platform operations and customer-controlled controls such as network policies, admission controllers, and pod security standards | Host-level security controls such as antivirus, EDR agents, SSL/TLS inspection, and host firewalls must be validated and exclusions configured before installation; these controls can block image pulls, container runtime activity, or inter-node communication |
| Operational responsibility | Clearly divided across infrastructure, platform, and application teams; cluster issues resolved at the appropriate layer | Cluster issues must be escalated to ITRS because customers do not have direct access to the bundled K0s Kubernetes layer or its diagnostic tools |
| Maintenance and patching | Integrates with existing customer patching and lifecycle processes for Kubernetes, OS, storage, and networking | Increased coordination risk; patching and upgrades may require downtime and careful change management |
| Disaster recovery | Not built-in; deploy multiple independent ITRS Analytics instances for DR | Not built-in; deploy multiple independent ITRS Analytics instances for DR |

The following scenarios illustrate how the choice between BYOC and Embedded Cluster plays out in practice across key operational areas: load balancing, storage scalability, recovery behavior, security, and team responsibilities.

Ensuring resilient access with load balancers

Scenario: Your organization runs multiple ITRS Analytics ingestion services and UIs that must remain accessible even during high traffic spikes.

In a Bring Your Own Cluster environment, especially in cloud-based setups, a load balancer is usually available and integrates directly with Kubernetes. It distributes traffic across multiple service replicas and often integrates with DNS services, helping maintain stable URLs and endpoints during scaling events or network changes.

In Embedded Cluster deployments, a load balancer is still required but is not provided as part of the deployment. Customers must supply and manage their own load balancer, which can be hardware-based or software-based. This usually requires additional planning and coordination with the network or infrastructure team.
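
As a rough illustration of what distributing traffic across service replicas means, the sketch below assigns requests round-robin to healthy replicas only. It is a conceptual model, not how any of the load balancers named in this guide is implemented; the replica names and the `route_requests` function are hypothetical.

```python
from itertools import cycle
from typing import Iterable, List, Set


def route_requests(replicas: Iterable[str], healthy: Set[str], request_count: int) -> List[str]:
    """Assign each request to the next healthy replica in round-robin order."""
    # Keep the declared replica order, but skip anything failing health checks.
    targets = [replica for replica in replicas if replica in healthy]
    if not targets:
        raise RuntimeError("no healthy replicas available")
    picker = cycle(targets)
    return [next(picker) for _ in range(request_count)]


if __name__ == "__main__":
    # Hypothetical UI replicas; "ui-1" is treated as unhealthy.
    print(route_requests(["ui-0", "ui-1", "ui-2"], {"ui-0", "ui-2"}, 4))
```

Whichever load balancer you supply, the same two properties matter for resilient access: it must know which replicas are healthy, and it must spread requests across all of them.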

Scaling storage dynamically with decoupled storage classes

Scenario: Your ClickHouse workload grows steadily from 500 GB to several terabytes of data over time.

In a Bring Your Own (Kubernetes) Cluster (BYOC) environment, storage is decoupled from individual nodes. Persistent volumes remain on network-attached backing storage, so rescheduled pods can mount the same volumes from surviving nodes that can reach the storage network, and extensible storage classes allow volumes to grow seamlessly as data increases.

In Embedded Cluster (EC) deployments, storage is tied to local node disks. If a node becomes unavailable, the workloads depending on that node cannot be rescheduled elsewhere, and the system may operate in a degraded state until the node comes back online.

Understanding degraded operation after node failure

Scenario: A node in your ITRS Analytics deployment fails unexpectedly during peak monitoring hours.

In a Bring Your Own (Kubernetes) Cluster (BYOC) environment, Kubernetes may reschedule affected workloads onto surviving nodes if sufficient spare capacity exists and network-attached persistent volumes remain accessible. Services can remain available with limited disruption, but this depends on the resilience and capacity of the underlying Kubernetes platform.

In an Embedded Cluster (EC) deployment, surviving replicas continue running automatically, but workloads tied to the failed node’s local storage cannot be rescheduled elsewhere. The cluster therefore runs in a degraded state with fewer replicas, reduced capacity, and lower fault tolerance until the node comes back online. If another node fails while the cluster is already degraded, the risk of service disruption or data unavailability increases because there is less remaining redundancy.
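
The difference between the two recovery behaviors can be modeled with a small simulation. This is a simplified sketch under stated assumptions: the replica and node names are hypothetical, capacity is reduced to a single count of spare replica slots on surviving nodes, and network-attached volumes are assumed to remain reachable.

```python
from typing import Dict, List


def replicas_running_after_failure(
    placement: Dict[str, str], failed_node: str, storage: str, spare_slots: int
) -> List[str]:
    """Return the replicas still running after failed_node goes down.

    placement maps replica name to node name.
    storage is "network" (BYOC-style volumes) or "local" (Embedded Cluster-style).
    spare_slots is the spare replica capacity on surviving nodes (hypothetical unit).
    """
    if storage not in ("network", "local"):
        raise ValueError("storage must be 'network' or 'local'")
    if spare_slots < 0:
        raise ValueError("spare_slots cannot be negative")

    running = [name for name, node in placement.items() if node != failed_node]
    displaced = [name for name, node in placement.items() if node == failed_node]

    if storage == "network":
        # Displaced replicas can be rescheduled, limited by spare capacity.
        running += displaced[:spare_slots]
    # With node-local storage, displaced replicas stay down until the node returns.
    return running


if __name__ == "__main__":
    placement = {"store-0": "node-a", "store-1": "node-b", "store-2": "node-c"}
    print(replicas_running_after_failure(placement, "node-b", "network", 1))
    print(replicas_running_after_failure(placement, "node-b", "local", 1))
```

In this model, network-attached storage with no spare capacity behaves the same as node-local storage, which is why the BYOC outcome depends on capacity planning as well as on the storage architecture.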

Meeting security requirements

Scenario: Your IT security team requires visibility and control over Kubernetes-native security policies.

In a Bring Your Own Cluster environment, customers can apply and manage Kubernetes-native controls such as network policies, admission controllers, and pod security standards as part of standard cluster governance.

In an Embedded Cluster deployment, this is a less meaningful differentiator because customers have already accepted a more black-box operating model for the Kubernetes layer running on the provided virtual machines.

Scenario: Your organization secures all servers with antivirus, EDR agents, SSL/TLS inspection, and host firewalls.

For Embedded Cluster, these host-level controls must be validated before installation. Customers should configure exclusions or bypass rules so that security tooling does not block installation steps, image pulls, container runtime activity, service-to-service TLS, or inter-node communication. In BYOC environments, these controls are typically handled through the organization’s existing Kubernetes platform hardening and node management practices.

Streamlined support across teams

Scenario: Your organization has separate teams for infrastructure, platform, and application operations.

In a Bring Your Own Cluster setup, responsibilities are clearly divided: infrastructure teams manage nodes, platform teams administer Kubernetes, and application teams deploy and manage ITRS Analytics. Issues can be addressed at the appropriate layer.

With Embedded Cluster, cluster-level issues must be escalated to ITRS because customers do not have direct access to the bundled K0s Kubernetes layer or the usual platform-level diagnostic tools needed to investigate those problems themselves.

Deployment scenarios

The following sections describe various deployment scenarios, each with specific benefits and trade-offs. Understanding these helps you select the right configuration for your requirements.

Non-HA single or multi-node (BYOC)

This configuration is suitable for proof-of-concept deployments and smaller production environments where high availability is not a strict requirement.

Common use cases:

Characteristics:

Note

Because proof-of-concept deployments are exploratory, they do not guarantee high availability for stored data.

Non-HA single or multi-node (Embedded Cluster)

This configuration is similar to the Bring Your Own Cluster non-HA configuration, but it is deployed on-premises using Embedded Cluster. This option has additional limitations around data protection.

Common use cases:

Important considerations:

Warning

Before using Embedded Cluster in production, plan and validate an alternative backup strategy such as hypervisor VM snapshots or storage-level snapshots. Without a tested backup strategy, a node failure that causes disk loss can result in permanent data loss with no recovery path.

Plan your rollout

Use the following summary to confirm your deployment choice before proceeding to installation.

| Requirement | Recommended model |
| --- | --- |
| ITRS manages all infrastructure | SaaS |
| Data must stay in a specific AWS region | SaaS |
| Data must stay on-premises or in a private cloud | Self-hosted |
| Full control over Kubernetes platform | Self-hosted and BYOC |
| No in-house Kubernetes expertise; evaluation or constrained production requirements | Self-hosted and Embedded Cluster |
| No in-house Kubernetes expertise; production deployment with sufficient replicas on all nodes and a validated external backup strategy | Self-hosted and Embedded Cluster |
| Production-grade high availability required | SaaS or Self-hosted (BYOC or Embedded Cluster with sufficient replicas and accepted degraded-operation trade-offs) |
| Backup and restore required | SaaS or Self-hosted and BYOC |
| Security team must enforce Kubernetes-native controls such as network policies, admission controllers, or image and pod-layer scanning | Self-hosted and BYOC |
| Existing cloud Kubernetes service (EKS, AKS, GKE) | Self-hosted and BYOC |
| Deployment on VMs or bare metal without Kubernetes | Self-hosted and Embedded Cluster |
| Strict compliance or audit data retention requirements | SaaS or Self-hosted and BYOC |
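
When several requirements apply at once, the summary can be read as a set intersection: each requirement narrows the models that remain. The sketch below encodes the summary this way. The requirement keys and the `candidate_models` function are hypothetical names chosen for illustration, and the Embedded Cluster entry for high availability still carries the replica and degraded-operation caveats noted in the summary.

```python
from typing import Dict, Iterable, List, Set

SAAS = "SaaS"
BYOC = "Self-hosted (BYOC)"
EC = "Self-hosted (Embedded Cluster)"

# One entry per row of the summary; keys are illustrative, not product settings.
RECOMMENDED: Dict[str, Set[str]] = {
    "itrs_manages_infrastructure": {SAAS},
    "data_in_specific_aws_region": {SAAS},
    "on_premises_or_private_cloud": {BYOC, EC},
    "full_kubernetes_control": {BYOC},
    "no_kubernetes_expertise": {EC},
    "production_high_availability": {SAAS, BYOC, EC},  # EC with stated caveats
    "backup_and_restore": {SAAS, BYOC},
    "kubernetes_native_security_controls": {BYOC},
    "existing_cloud_kubernetes_service": {BYOC},
    "vms_or_bare_metal_without_kubernetes": {EC},
    "compliance_or_audit_retention": {SAAS, BYOC},
}


def candidate_models(requirements: Iterable[str]) -> List[str]:
    """Intersect the recommended models for every stated requirement."""
    candidates = {SAAS, BYOC, EC}
    for requirement in requirements:
        candidates &= RECOMMENDED[requirement]  # KeyError flags an unknown key
    return sorted(candidates)


if __name__ == "__main__":
    print(candidate_models(["on_premises_or_private_cloud", "backup_and_restore"]))
```

An empty result means the stated requirements conflict under this summary and at least one of them needs to be relaxed or revisited before installation.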

To continue planning your deployment, refer to the following resources:
