Conversions API Gateway and Signals Gateway: AWS Architecture

The Conversions API Gateway and Signals Gateway products (also referred to collectively as the Gateway Products) work with Amazon Web Services (AWS) and Google Cloud Platform. Below are a diagram and a list of the main AWS resources and services used by the solution, the number of instances created per resource or service type, and, where applicable, their purpose.

The diagram and the list contain only the most important AWS resources and services used by the Gateway Products. Other AWS resources and services not listed here will be used by your instance.

The diagram below shows the main resources instantiated and how they interact.

EC2 and Related Services

The Amazon Elastic Compute Cloud (Amazon EC2) is a web service that provides secure, resizable compute capacity in the cloud.

  • The number of EC2 instances instantiated by the Gateway Products (as EKS worker nodes) can be configured in the Gateway Products UI under Settings > Auto scaling limits in the left side menu. The default value of Maximum Nodes per Availability Zone is 1.

  • The Gateway Products will instantiate a t2.micro EC2 instance that serves as the Bastion host.

  • Virtual Private Cloud (VPC) - a virtual network, logically isolated from other virtual networks in the AWS Cloud, where AWS resources can be launched. The Gateway Product creates one VPC that spans two availability zones; each availability zone has one public and one private subnet.

  • Availability Zone (AZ) - an isolated location within a region; a region can have multiple availability zones. A service instance is always launched in a region, a VPC, and a subnet, which can be either selected from one of the availability zones or chosen by AWS. If service instances are distributed across multiple availability zones and one instance fails, a properly designed application can have an instance in another availability zone handle the requests to that service. Elastic IP addresses can be used to mask the failure of an instance in one availability zone by rapidly remapping the address to an instance in another availability zone.
    The Elastic Kubernetes Service (which is used by the Gateway Products) requires at least 2 availability zones. Multiple availability zones provide resilience against an accidental zone outage. If one zone is down, services move to the other zone automatically. Each availability zone has a public subnet and a private subnet.

  • Subnet - a range of IP addresses in a VPC. AWS resources are launched into a specified subnet. The Gateway Products instantiate four subnets, two public and two private.

  • Public subnet - a subnet that can reach the internet through an internet gateway or an egress-only gateway. In the Gateway Product, one public subnet is created per availability zone (2 public subnets in total). The bastion host is created in one of the public subnets.

  • Private subnet - a subnet that cannot reach the internet directly. In the Gateway Product, one private subnet is created per availability zone (2 private subnets in total). EKS worker nodes are placed in the private subnets.

  • Internet gateway - a VPC component that allows communication between a VPC and the internet. It supports IPv4 and IPv6 traffic. The Gateway Product instantiates one Internet gateway.

  • EgressOnly gateway - allows outbound communication over IPv6 (only) from instances in the VPC to the internet, and prevents the internet from initiating an IPv6 connection with the instances. The Gateway Product instantiates one EgressOnly gateway.

  • Load Balancer - Provides access from the internet to the Gateway Products’ service endpoints. A Gateway Product that supports multiple accounts instantiates one Application load balancer.

  • Security group - controls the traffic that is allowed to reach and leave the resources that it is associated with. The Gateway Product creates 4 Security groups, on top of the default ones:

    • A Security group for the communication between all nodes in an EKS cluster.
    • A Security group created by EKS that is attached to EKS Control Plane master nodes, as well as any managed workloads.
    • A Security group for the communication between the control plane and worker node groups.
    • A Security group for the EKS bastion host, to limit access to the instance.
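
The network layout described in this list (one VPC, two availability zones, one public and one private subnet per zone) can be sketched with Python's ipaddress module. The VPC range 10.0.0.0/16, the zone names, and the resulting subnet sizes are assumptions chosen for illustration; the ranges actually used by the Gateway Products are not stated in this document.

```python
import ipaddress

# Hypothetical values for illustration only.
VPC_CIDR = "10.0.0.0/16"
AVAILABILITY_ZONES = ["us-east-2a", "us-east-2b"]


def plan_subnets(vpc_cidr, zones):
    """Split a VPC range into one public and one private subnet per zone."""
    vpc = ipaddress.ip_network(vpc_cidr)
    # Two subnets per zone; find how many extra prefix bits that requires.
    needed = 2 * len(zones)
    extra_bits = (needed - 1).bit_length()
    blocks = list(vpc.subnets(prefixlen_diff=extra_bits))
    plan = []
    for i, zone in enumerate(zones):
        plan.append({"zone": zone, "kind": "public", "cidr": blocks[2 * i]})
        plan.append({"zone": zone, "kind": "private", "cidr": blocks[2 * i + 1]})
    return plan


layout = plan_subnets(VPC_CIDR, AVAILABILITY_ZONES)
for subnet in layout:
    print(f"{subnet['zone']:<12} {subnet['kind']:<8} {subnet['cidr']}")
```

With two zones the sketch yields four non-overlapping subnets, which matches the count given above: two public and two private.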

EKS and Related Services

What is EKS?

Amazon EKS is a managed Kubernetes service to run Kubernetes in the AWS cloud and on-premises data centers. Kubernetes is a portable, extensible, open source platform for managing containerized workloads and services that facilitates both declarative configuration and automation.

In the cloud, Amazon EKS automatically manages the availability and scalability of the Kubernetes control plane nodes responsible for scheduling containers, managing application availability, storing cluster data, and other key tasks. With Amazon EKS, we can take advantage of all the performance, scale, reliability, and availability of an AWS infrastructure, as well as integrations with AWS networking and security services.

Supported EKS Versions

Currently, the latest version of the Gateway Products uses EKS 1.25 by default. However, if you installed the Gateway Product earlier, it may be using an older version of EKS. Each version of EKS has its own support lifecycle; when it ends, you can follow the instructions for upgrading your EKS cluster to move to a newer version. If you don't, AWS will automatically upgrade the cluster after your current version of EKS goes out of support.
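
When checking whether an existing installation runs an older EKS version than the current default, note that Kubernetes versions must be compared numerically, not as strings ("1.9" is older than "1.25"). The helper below is a minimal sketch; the function names are hypothetical, and the cluster version string is assumed to come from your own tooling, for example the output of `aws eks describe-cluster`.

```python
def parse_k8s_version(version):
    """Turn a 'major.minor' string such as '1.25' into a comparable tuple."""
    major, minor = version.strip().split(".")[:2]
    return int(major), int(minor)


def is_older_than(cluster_version, reference_version):
    """Return True when the cluster runs an older release than the reference."""
    return parse_k8s_version(cluster_version) < parse_k8s_version(reference_version)


# Example: compare a cluster against the default mentioned above.
print(is_older_than("1.23", "1.25"))
```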

How Does the Gateway Product Use EKS?

  • EKS Cluster - an Amazon EKS cluster consists of two primary components: (1) the Amazon EKS control plane, and (2) Amazon EKS nodes that are registered with the control plane. The Gateway Product creates one EKS cluster.

  • Kubernetes node - a machine that runs containerized applications.

  • Node group - Amazon EKS managed node groups automate the provisioning and lifecycle management of nodes (Amazon EC2 instances) for Amazon EKS Kubernetes clusters.

  • Bastion Host - a free tier t2.micro EC2 instance that creates and destroys the EKS cluster. t3.nano is used in regions where t2.micro is not available.

  • Auto Scaling Group for the Bastion Host - allows creating the bastion host dynamically in any availability zone, based on the availability of t2.micro EC2 instances.

  • EKS Worker Node - an m5.large instance where the Gateway Product service pods are running.

  • Auto Scaling Group for the EKS Worker Nodes - scales up and down the EKS worker nodes based on the workload of the Gateway. The Gateway Product instantiates two auto scaling groups, one in each private subnet.

CloudWatch

CloudWatch - logging service provided by AWS. The Gateway Product stores installation and application logs in the CloudWatch service. Installation logs are minimal and will not incur any cost in most cases. Application logs, on the other hand, can incur cost depending on (1) the verbosity of the application logs and (2) how long a Gateway instance runs: the longer an instance runs, the more logs it generates.

IAM

AWS Identity and Access Management (IAM) - an AWS service that helps securely control access to AWS resources. IAM is used to control who is authenticated (signed in) and authorized (has permissions) to use resources.

Kinesis

Amazon Kinesis - a service for collecting, processing, and analyzing real-time streaming data to get timely insights and react quickly to new information.

S3

Amazon Simple Storage Service (Amazon S3) - an object storage service offering industry-leading scalability, data availability, security, and performance. The Gateway Product uses Amazon S3 mainly for the automatic backup feature.

Simple Queue Service (SQS)

The Amazon Simple Queue Service (SQS) allows sending, storing, and receiving messages between software components at any volume, without losing messages or requiring other services to be available. In the Gateway Products, the data pipeline uses it to ensure that (1) large volumes of events can be handled without memory exhaustion, and (2) received events persist across system restarts and updates.
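
The two properties named above (events are buffered outside process memory, and they survive a restart) can be illustrated with a small local stand-in. The sketch below uses an on-disk SQLite table instead of SQS; it does not use the SQS API and is not how the Gateway Products are implemented. The class name, table name, and event payloads are all hypothetical.

```python
import os
import sqlite3
import tempfile


class DurableQueue:
    """Tiny on-disk FIFO queue; a local stand-in, not the SQS API."""

    def __init__(self, path):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS events "
            "(id INTEGER PRIMARY KEY AUTOINCREMENT, body TEXT NOT NULL)"
        )
        self.conn.commit()

    def send(self, body):
        # Committing to disk is what lets the event outlive a restart.
        self.conn.execute("INSERT INTO events (body) VALUES (?)", (body,))
        self.conn.commit()

    def receive(self):
        """Pop the oldest event, or return None when the queue is empty."""
        row = self.conn.execute(
            "SELECT id, body FROM events ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        self.conn.execute("DELETE FROM events WHERE id = ?", (row[0],))
        self.conn.commit()
        return row[1]

    def close(self):
        self.conn.close()


def demo():
    """Enqueue two events, simulate a restart, then drain the queue."""
    path = os.path.join(tempfile.mkdtemp(), "queue.db")
    queue = DurableQueue(path)
    queue.send("event-1")
    queue.send("event-2")
    queue.close()               # simulate a restart of the pipeline
    queue = DurableQueue(path)  # reopen: buffered events are still on disk
    received = (queue.receive(), queue.receive(), queue.receive())
    queue.close()
    return received
```

Because the consumer reads one event at a time from storage, the number of waiting events does not affect the memory used by the process, which is the point of placing a queue between the receiver and the rest of the pipeline.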

Domain Management for the AWS Architecture

Cloudflare

A load balancer is a component in the Gateway Product’s infrastructure that helps distribute incoming network traffic across multiple servers. Using a load balancer can improve the application availability and responsiveness even during periods of high traffic.

In addition to load balancing, SSL/TLS certificates are used to encrypt data in transit between clients and the Gateway Products. A certificate is issued whenever an advertiser sets up a first-party domain. An advertiser that has not set up a first-party domain uses the agency’s domain by default. The advantage of a first-party domain is that pixel events are routed through a domain owned and controlled by the advertiser; otherwise, the advertiser’s pixel events flow through the agency domain. Currently, the Gateway Product allows 22 certificates to be attached to the load balancer by default. This number can be increased to 97.

By using Cloudflare, we can overcome this limitation. One of the key benefits of using Cloudflare is its ability to issue up to 5000 SSL/TLS certificates for advertiser domains that point to an agency domain managed or owned by Cloudflare, that is, it allows up to 5000 advertisers per Gateway Product instance to use a first-party domain.

AWS Certificate Manager

AWS Certificate Manager is an alternative to Cloudflare, with the limitation that it can only support 22 certificates by default. This number can be increased to 97.
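
The limits quoted in this section can be turned into a simple decision helper. This is a sketch under the assumption that each first-party domain needs exactly one certificate; the function name and the returned labels are hypothetical.

```python
# Limits quoted in this section.
ACM_DEFAULT_CERTS = 22
ACM_RAISED_CERTS = 97
CLOUDFLARE_CERTS = 5000


def certificate_option(first_party_domains):
    """Suggest which option covers a given number of first-party domains."""
    if first_party_domains < 0:
        raise ValueError("number of domains cannot be negative")
    if first_party_domains <= ACM_DEFAULT_CERTS:
        return "AWS Certificate Manager (default limit)"
    if first_party_domains <= ACM_RAISED_CERTS:
        return "AWS Certificate Manager (raised limit)"
    if first_party_domains <= CLOUDFLARE_CERTS:
        return "Cloudflare"
    raise ValueError("exceeds the limits quoted in this section")


for count in (10, 60, 800):
    print(count, "->", certificate_option(count))
```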

Data Flow and Storage

The Gateway Product treats four types of data:

  • Action (event) data sent by the client and only transiting the Gateway.
  • Data sent by the client and only transiting the Gateway.
  • Gateway configuration data stored in the Gateway.
  • Gateway logs stored in the Gateway.

Gateway Configuration Data

The Gateway Product configuration data, detailed below, is stored in the EKS cluster using MariaDB and Apache ZooKeeper.

Host related data

  • Host users (email, password, permissions)
  • Host SMTP configuration
  • Accounts and respective access rights

Host related data is stored in the Gateway for as long as the instance exists. It is only accessible by Host users.

Account related data

  • Account users (email, password, permissions)
  • Connected client IDs and respective configuration details (activation status, publishing status)
  • Event names, volumes, and publishing status
  • Website domains where the clients fire, domain allow list, and domain block list
  • Data routing configuration

Account related data is stored in the Gateway for as long as an account exists in the instance. Each account's data is accessible by the users of that account and specific Host users in case of a managed account.

The Host does not have access to event data, unless granted by the account, because the Gateway Products do not log event data. Logging of event data has been programmatically disabled and can't be overridden.

Gateway Products Logs

The Gateway Product uses the AWS CloudWatch service to log installation and application running information.

Instance Installation logs

Installation logs are written during the installation of an instance to a dedicated log stream that is part of the /cloud-init-output.log log group. These logs are retained for 2 weeks and occupy around (size), which will probably not impact AWS cost. Installation logs can be helpful for debugging issues during installation.

Application running logs

Application logs are written for as long as the Gateway Products software and resources are running. Application running logs include:

  • User actions on the Gateway Products UI.
  • Software and resource usage logs.

These logs are written inside a dedicated log group named [Gateway Products host domain]/conversions-api-gateway and organized in several log streams. Application logs have an indefinite retention, up to ten years or the AWS limit.

The AWS CloudWatch service does not log any event or contact information. How the Agency/Partner/Reseller treats this data if and after an Advertiser leaves depends on the terms and conditions of their relationship.

Cost

The cost of the Gateway Product depends on the cost of the service and resource instances used. AWS provides a tool to estimate the cost of a certain implementation.

The cost information provided in this section consists of estimates obtained using the AWS pricing calculator for a specific region, us-east-2 (Ohio), and should serve only as a reference or guidance. The actual cost of your instance may vary.

Fixed/Base Costs

The estimated monthly cost for a Gateway Product instance with one EC2 m5.large instance per availability zone (two EC2 m5.large instances in total) in the us-east-2 or US East (Ohio) region might look like the following:

| Resource Type | Number of Resource Instances | Estimated monthly cost @ us-east-2 (Ohio) |
|---|---|---|
| Application load balancer | 1 | $19.80 |
| Internet gateway | 1 | 0 |
| EgressOnly internet gateway | 1 | Depends on data transfer. See overview of data transfer costs |
| Data transfer | - | See overview of data transfer costs |
| EKS cluster | 1 | $73 |
| EC2 t2.micro | 1 | $16.47 |
| Total fixed costs | - | $180.48 |
| EC2 m5.large | 2 | $156.16 ($78.08 per EC2 m5.large) |
| Minimum cost with m5.large | - | $336 |
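
The minimum cost in the last row is the "Total fixed costs" figure plus the m5.large charge for each worker node. The sketch below reproduces that arithmetic using only the two figures from the table; the function and constant names are hypothetical, and the amounts are the us-east-2 estimates quoted above, not live prices.

```python
from decimal import Decimal

FIXED_MONTHLY = Decimal("180.48")    # "Total fixed costs" row
M5_LARGE_MONTHLY = Decimal("78.08")  # per EC2 m5.large instance
AVAILABILITY_ZONES = 2


def monthly_estimate(nodes_per_az):
    """Fixed costs plus one m5.large charge per worker node."""
    workers = nodes_per_az * AVAILABILITY_ZONES
    return FIXED_MONTHLY + workers * M5_LARGE_MONTHLY


print(monthly_estimate(1))  # one node per availability zone
```

For one node per availability zone this gives 336.64, which the table shows rounded to $336. The same function gives 492.80 and 648.96 for two and three nodes per zone.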


Estimate the cost of your Gateway Product instance using this estimation template. Make sure you appropriately configure each service through the modify link:

  • Choose the appropriate region for each service.
  • For the Amazon EC2 service, choose the number (it must be an even number since one EC2 instance will be created per availability zone) and the size (m5.large) of the EC2 instances.

Total Cost of the Instance Based on Capacity

The table below shows the estimated monthly cost in the us-east-2 or US East (Ohio) region for a Gateway Product instance, depending on the number of EC2 instances.

| Minimum Instance Count Per Availability Zone | Maximum Instance Count Per Availability Zone | EC2 Instance Type | EC2 Instances Created | Estimated Total Cost (Min/Max) | Recommended Maximum Capacity |
|---|---|---|---|---|---|
| 1 | 1 | m5.large | 2 m5.large (one for each availability zone) and 1 t2.micro | ~$336/month | 1200 QPS |
| 1 | 2 | m5.large | Up to 4 m5.large (up to two for each availability zone) and 1 t2.micro | ~$493/month | 2400 QPS |
| 1 | 3 | m5.large | Up to 6 m5.large (up to three for each availability zone) and 1 t2.micro | ~$649/month | 3600 QPS |
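
In the table, recommended capacity grows by 1200 QPS for each additional node per availability zone. Assuming that linear relationship holds (the table only covers values 1 to 3, so anything beyond that is an extrapolation), the setting needed for a given peak load can be sketched as follows. The function name is hypothetical.

```python
import math

# Recommended capacity listed for one node per availability zone.
QPS_PER_STEP = 1200


def max_nodes_per_az_for(peak_qps):
    """Smallest 'Maximum Nodes per Availability Zone' covering peak_qps."""
    if peak_qps <= 0:
        raise ValueError("peak_qps must be positive")
    return math.ceil(peak_qps / QPS_PER_STEP)


for load in (900, 2000, 3600):
    print(load, "QPS ->", max_nodes_per_az_for(load), "node(s) per zone")
```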

Network and Security

Allowed network traffic

The Gateway Products require the following inbound and outbound network traffic to work as documented. The default configuration only allows the required traffic.

| Source | Destination | Protocol/Port | Description |
|---|---|---|---|
| Gateway Product instance | 0.0.0.0/0 | All | Allow outbound connections to the internet from the Gateway Products to pass events to Meta; to download packages from external repositories (for example, microk8s from Canonical and software in Docker containers from ECR); and, if opted in to telemetry data transmission, to periodically send telemetry data about your business’ use/operation of its Gateway Product installation to Meta for monitoring and troubleshooting problems. |
| 0.0.0.0/0 | Gateway Product instance | TCP/80 | Allow inbound HTTP connections to the Gateway Product. This port is automatically redirected to TCP/443. |
| 0.0.0.0/0 | Gateway Product instance | TCP/443 | Allow inbound HTTPS connections to the Gateway Product. Used by browsers to send events through HTTPS. |
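
The inbound half of the table can be expressed as data and checked programmatically, which is useful when reviewing whether a given connection should be accepted. This is a minimal sketch of the rule logic only; it is not the security group configuration the Gateway Products create, and the names used are hypothetical.

```python
import ipaddress

# Inbound rules from the table: (source CIDR, protocol, port).
INBOUND_RULES = [
    ("0.0.0.0/0", "tcp", 80),   # HTTP, redirected to TCP/443
    ("0.0.0.0/0", "tcp", 443),  # HTTPS
]


def inbound_allowed(source_ip, protocol, port, rules=INBOUND_RULES):
    """Return True when some rule matches the source, protocol, and port."""
    address = ipaddress.ip_address(source_ip)
    for cidr, rule_protocol, rule_port in rules:
        if (
            address in ipaddress.ip_network(cidr)
            and protocol.lower() == rule_protocol
            and port == rule_port
        ):
            return True
    return False  # anything not listed is denied


print(inbound_allowed("203.0.113.7", "tcp", 443))
print(inbound_allowed("203.0.113.7", "tcp", 22))
```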

Endpoints and In-Transit Data

Endpoints are secured via TLS, and in-transit data is encrypted. The Gateway Product exposes two internet-facing endpoints:

  • HTTPS endpoint for receiving events from browsers
  • HTTPS admin front end for administering the server

These endpoints are secured through TLS (TLS 1.2 and 1.3 are supported, with the default cipher list) using an SSL/TLS certificate generated automatically during server provisioning. The default certificate has a 90-day lifetime and renews itself regularly.
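
To confirm from a client that an endpoint negotiates TLS 1.2 or newer and that its certificate is not close to expiry, something like the following standard-library sketch can be used. The function names are hypothetical, and the hostname in the usage note is a placeholder to replace with your own Gateway Product domain.

```python
import socket
import ssl
import time


def tls_context():
    """Client context that refuses anything older than TLS 1.2."""
    context = ssl.create_default_context()
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


def inspect_endpoint(host, port=443, timeout=10.0):
    """Return the negotiated TLS version and days until the certificate expires."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with tls_context().wrap_socket(sock, server_hostname=host) as tls:
            expires = ssl.cert_time_to_seconds(tls.getpeercert()["notAfter"])
            days_left = (expires - time.time()) / 86400
            return tls.version(), days_left
```

For example, `inspect_endpoint("gateway.example.com")` returns a pair such as the protocol name and the remaining certificate lifetime in days; with a 90-day certificate that renews regularly, the second value should stay well above zero.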

Additional Security Protections

To help reinforce the protections of Gateway Product endpoints, businesses can use their preferred cloud-based security solutions (Web Application Firewall, anti-DDoS) from AWS or other third-party providers. Such protections are configured by proxying the Gateway Product traffic through the corresponding service provider and allowing inbound traffic only from this service provider.
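
Allowing inbound traffic only from the security provider amounts to replacing the open 0.0.0.0/0 source with the provider's published address ranges. The sketch below generates such a rule set as plain data. The ranges shown are reserved documentation blocks used as placeholders, not any real provider's addresses, and the rule format is hypothetical rather than an AWS API payload.

```python
import ipaddress

# Placeholder ranges (reserved documentation blocks), not a real provider's.
PROVIDER_RANGES = ["198.51.100.0/24", "203.0.113.0/24"]


def restricted_inbound_rules(provider_ranges, ports=(80, 443)):
    """Build one inbound TCP rule per provider range and port."""
    rules = []
    for cidr in provider_ranges:
        network = ipaddress.ip_network(cidr)  # raises ValueError if malformed
        for port in ports:
            rules.append(
                {"source": str(network), "protocol": "tcp", "port": port}
            )
    return rules


for rule in restricted_inbound_rules(PROVIDER_RANGES):
    print(rule)
```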