Gateway Hub

Hardware requirements

Overview

A Gateway Hub installation consists of a number of individual servers called nodes. A set of nodes collectively forms a cluster, and all nodes in a cluster should be co-located.

You will also need to consider the number of clusters that are required and where these should be physically located. For a global organisation, you should deploy one cluster in each geographical region (for example, Americas, EMEA, and APAC) to reduce network latency.

For more information about deploying multiple clusters, see Hardware requirements.

Cluster size and consensus

Gateway Hub requires a quorum of available nodes to achieve distributed consensus. If an insufficient number of nodes are available, then distributed consensus cannot be achieved and Gateway Hub will not operate correctly.

For a cluster with n nodes, quorum is given by floor(n/2) + 1. As a result, an odd-sized cluster tolerates the same number of failures as the next larger even-sized cluster, but with fewer nodes: for example, both a three-node and a four-node cluster tolerate a single failure. Fault tolerance scales with cluster size as follows:

Node count   Nodes required for consensus   Node failures tolerated
1            1                              0
2            2                              0
3            2                              1
4            3                              1
5            3                              2
6            4                              2
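The table values can be reproduced with a short calculation. The following is a minimal Python sketch of the quorum arithmetic, for illustration only; it is not part of Gateway Hub:

```python
def quorum(n: int) -> int:
    """Nodes required for consensus in an n-node cluster: floor(n/2) + 1."""
    return n // 2 + 1

def failures_tolerated(n: int) -> int:
    """Node failures the cluster can absorb while retaining quorum."""
    return n - quorum(n)  # equivalently (n - 1) // 2

for n in range(1, 7):
    print(f"{n} nodes: quorum {quorum(n)}, tolerates {failures_tolerated(n)} failure(s)")
```

Note that failures_tolerated(3) == failures_tolerated(4) == 1, which is why odd-sized clusters are preferred: the extra even node adds cost without adding fault tolerance.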

Note: The recommended number of nodes in production deployments is three. However, for testing or proof of concept deployments, a single node can be used.

Hardware guidelines

Example scenarios

The specific requirements of your deployment will depend on the expected workload.

The following table shows hardware specifications for a range of indicative production scenarios using the default 3 days of Kafka storage and 90 days of metric storage.

Note: The test environments for all scenarios were created without the use of virtual machines. This is because virtual machines, by definition, execute on shared hardware and as a result it is difficult to make assumptions about CPU usage and disk contention. Running Gateway Hub on virtual machines may have different requirements to those listed here. Additionally, virtual machine environments may not be appropriate for extremely large estates.

Scenario                   CPU cores  Memory (GiB)  Storage per node (GiB)  Disk IO (MB/s)  Network IO (MB/s)  Number of nodes  Disk configuration  Equivalent AWS instance  Equivalent AWS EBS volume  Test environment
10 probes with 1 user      8          16            24                      102             0.02               1                Single disk         c5a.2xlarge              standard                   AWS
100 probes with 3 users    16         16            186                     107             0.05               1                Single disk         c5a.2xlarge              standard                   AWS
1000 probes with 5 users   16         16            1854                    203             0.28               1                Dedicated disks     c5a.2xlarge              standard                   AWS
3000 probes with 50 users  32         128           3972                    305             1.46               3                Dedicated disks     m5a.8xlarge              gp2                        Bare metal
6000 probes with 50 users  32         128           7842                    312             2.32               3                Dedicated disks     m5a.8xlarge              gp2                        Bare metal
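As a rough illustration of how to read this table, the sketch below picks the smallest tested scenario that covers a planned estate. The scenario data is copied from the table above; the helper itself is hypothetical and is not an ITRS sizing tool:

```python
# (probes, users, cpu_cores, memory_gib, storage_gib_per_node, nodes)
# Figures copied from the scenarios table above.
SCENARIOS = [
    (10, 1, 8, 16, 24, 1),
    (100, 3, 16, 16, 186, 1),
    (1000, 5, 16, 16, 1854, 1),
    (3000, 50, 32, 128, 3972, 3),
    (6000, 50, 32, 128, 7842, 3),
]

def smallest_covering_scenario(probes: int, users: int):
    """Return the first tested scenario that covers the planned estate."""
    for row in SCENARIOS:
        if probes <= row[0] and users <= row[1]:
            return row
    raise ValueError("Estate exceeds the largest tested scenario")

# A 500-probe, 5-user estate falls under the 1000-probe scenario:
print(smallest_covering_scenario(500, 5))  # -> (1000, 5, 16, 16, 1854, 1)
```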

Disks

Under normal operating conditions, Gateway Hub has extremely high disk IO requirements.

For optimal performance under a realistic load, each of the following should be assigned a separate disk or disks (a verification sketch follows the list):

  • Operating system, swap, application binaries, and application logs
  • ZooKeeper and etcd
  • Kafka (may require multiple disks)
  • PostgreSQL data
  • PostgreSQL write-ahead log (WAL)
  • Gateway Hub temporary storage
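One simple way to sanity-check such a layout is to confirm that each component path resolves to a distinct device. The Python sketch below is illustrative only: the mount points are assumptions to be adjusted to your layout, and st_dev identifies a filesystem device, which may not map one-to-one to physical disks under partitioning or LVM:

```python
import os

# Hypothetical mount points for each component; adjust to match your layout.
COMPONENT_PATHS = {
    "OS / swap / binaries / logs": "/",
    "ZooKeeper and etcd": "/data/coordination",
    "Kafka": "/data/kafka",
    "PostgreSQL data": "/data/pgdata",
    "PostgreSQL WAL": "/data/pgwal",
    "Gateway Hub temporary storage": "/data/hubtmp",
}

def check_separate_devices(paths: dict) -> None:
    """Warn when two components resolve to the same underlying device."""
    seen = {}
    for name, path in paths.items():
        if not os.path.exists(path):
            print(f"skipping {name}: {path} does not exist")
            continue
        dev = os.stat(path).st_dev  # device ID of the filesystem holding path
        if dev in seen:
            print(f"WARNING: {name} shares a device with {seen[dev]}")
        else:
            seen[dev] = name

check_separate_devices(COMPONENT_PATHS)
```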

In addition, when installing Gateway Hub, consider the following:

  • SAS drives can provide lower I/O latency than SATA drives; SSDs provide lower latency still.
  • Match aggregate drive throughput to network throughput. As a rule of thumb, a 10GbE link is saturated by roughly 10-12 drives (see the worked example after this list).
  • Use raw disks rather than Logical Volume Manager (LVM) volumes.
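A quick back-of-the-envelope check of the 10GbE rule of thumb above. The per-drive figure is an assumed typical sustained sequential throughput for a spinning SAS drive, not a number from this documentation:

```python
# How many drives does it take to match a 10GbE network link?
network_mb_per_sec = 10_000 / 8   # 10 Gbit/s is 1250 megabytes/sec
drive_mb_per_sec = 115            # assumed sustained throughput per drive
drives_needed = network_mb_per_sec / drive_mb_per_sec
print(f"~{round(drives_needed)} drives to match a 10GbE link")  # ~11, within the 10-12 guideline
```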

The number and performance of disks required for different Gateway Hub components will ultimately depend on the specific details of your deployment.

Memory

Gateway Hub performance, and specifically metric query response times, can be improved with additional RAM. This is particularly important when working with large datasets.