Traffic Director by Example: Part 1

John Tucker · Published in codeburst · Mar 13, 2021


An introduction to Google’s managed service mesh offering.

Assuming that we have a theoretical understanding of service meshes, we can then ask: what is Traffic Director?

Traffic Director is Google Cloud’s fully managed traffic control plane for service mesh. Traffic Director works out of the box for both VMs and containers. It uses the open source xDS APIs to communicate with the service proxies in the data plane, ensuring that you’re never locked into a proprietary interface.

— Google Cloud — Google Cloud networking in-depth: How Traffic Director provides global load balancing for open service mesh

In reading this, we might question the claim of not being locked into a proprietary interface. As we will shortly observe, Traffic Director operates by utilizing service proxies to intercept traffic from our workloads; the workloads have no awareness of the service mesh’s functionality. Thus, swapping out the service mesh implementation requires no changes in the workloads themselves. Moreover, Traffic Director utilizes the commonly used open-source Envoy service proxy.

Envoy is an L7 proxy and communication bus designed for large modern service oriented architectures. The project was born out of the belief that:

The network should be transparent to applications. When network and application problems do occur it should be easy to determine the source of the problem.

— Envoy — What is Envoy

Another thought we might have is, why use and pay for a fully managed service mesh when we can operate our own, e.g., using the open-source Istio service mesh?

At a high level, Istio helps reduce the complexity of these deployments, and eases the strain on your development teams. It is a completely open source service mesh that layers transparently onto existing distributed applications. It is also a platform, including APIs that let it integrate into any logging platform, or telemetry or policy system. Istio’s diverse feature set lets you successfully, and efficiently, run a distributed microservice architecture, and provides a uniform way to secure, connect, and monitor microservices.

— Istio — What is Istio?

While it is relatively easy to stand up a basic Istio installation, a real-world deployment requires a substantial amount of expertise and infrastructure.

When configuring a production deployment of Istio, you need to answer a number of questions. Will the mesh be confined to a single cluster or distributed across multiple clusters? Will all the services be located in a single fully connected network, or will gateways be required to connect services across multiple networks? Is there a single control plane, potentially shared across clusters, or are there multiple control planes deployed to ensure high availability (HA)? Are all clusters going to be connected into a single multicluster service mesh or will they be federated into a multi-mesh deployment?

— Istio — Deployment Models

On the other hand, Traffic Director provides an opinionated solution running on highly available and scalable infrastructure at a relatively low cost.

Traffic Director billing is based on service endpoints. Each service endpoint costs $0.0006945 per service endpoint per hour, which is approximately equal to $0.50 per endpoint per month.

— Google Cloud — Traffic Director Pricing

Prerequisites

In order to follow along, one only needs to have access to a Google Cloud project with sufficient permissions, i.e., have the project owner or editor role. For this article, I created a new project.

Please note: While all of the activities in this article can be executed using the gcloud CLI, we will perform all the activities using the Google Cloud Console GUI to better emphasize the concepts.

Project Setup

We first need to enable a number of APIs for our project (a gcloud equivalent is sketched after the list):

  • Compute Engine API by navigating to Compute Engine > VM Instances
  • Kubernetes Engine API by navigating to Kubernetes Engine > Clusters
  • Cloud DNS API by navigating to Network services > Cloud DNS
  • Traffic Director API by navigating to APIs & Services > Library
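
For reference, the same APIs can also be enabled with the gcloud CLI; a minimal sketch, assuming the project is already set as the active gcloud configuration:

$ gcloud services enable \
    compute.googleapis.com \
    container.googleapis.com \
    dns.googleapis.com \
    trafficdirector.googleapis.com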

To keep things simple for this article, we will use the default Compute Engine service account for both our Google Compute Engine (GCE) virtual machines (VMs) and Google Kubernetes Engine (GKE) pods, i.e., we will not be enabling GKE Workload Identity.

To enable the service proxies to connect to Traffic Director, we navigate to IAM & Admin > IAM, edit the Compute Engine default service account, and add the Compute Network Viewer (compute.networkViewer) role.
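
The gcloud equivalent is a single IAM policy binding; a sketch, where PROJECT_ID and PROJECT_NUMBER are placeholders for our project’s ID and number:

$ gcloud projects add-iam-policy-binding PROJECT_ID \
    --member=serviceAccount:PROJECT_NUMBER-compute@developer.gserviceaccount.com \
    --role=roles/compute.networkViewer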

As we will be using the GCE Health Checks feature, we also need to allow health check traffic to reach all GCE VMs and GKE pods in our project by following the instructions at Configure firewall rules.
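
A sketch of such a firewall rule with gcloud; the rule name is arbitrary, 35.191.0.0/16 and 130.211.0.0/22 are Google’s documented health check source ranges, and TCP port 80 matches the services used in this article:

$ gcloud compute firewall-rules create fw-allow-health-checks \
    --network=default \
    --direction=INGRESS \
    --action=ALLOW \
    --rules=tcp:80 \
    --source-ranges=35.191.0.0/16,130.211.0.0/22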

Cloud DNS Setup

While this will make more sense later in this article, we will be using virtual IP addresses (VIPs) as service identifiers.

A virtual IP address (VIP or VIPA) is an IP address that doesn’t correspond to an actual physical network interface.

— Wikipedia — Virtual IP address

We will be using IP addresses that are not in use by the project’s default virtual private cloud (VPC) network, which uses the CIDR block 10.128.0.0/9. In particular, we will use addresses from the CIDR block 10.0.0.0/16, giving us 65,534 VIPs to work with.

At the same time, IP addresses are not terribly user-friendly. As such, we will follow the instructions to create a private DNS zone using Cloud DNS.

As we will be delivering two services, one backed by GCE VMs and the other backed by GKE pods, we preemptively create two A records in our private DNS zone (a gcloud sketch follows the list).

  • hello-world-gce and 10.0.0.1
  • hello-world-gke and 10.0.0.2
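
A gcloud sketch of the zone and records; the zone name example-zone is a placeholder, while the DNS name example.private matches the records used later in the article:

$ gcloud dns managed-zones create example-zone \
    --description="Private zone for Traffic Director VIPs" \
    --dns-name=example.private. \
    --visibility=private \
    --networks=default

$ gcloud dns record-sets transaction start --zone=example-zone
$ gcloud dns record-sets transaction add --zone=example-zone \
    --name=hello-world-gce.example.private. --ttl=300 --type=A "10.0.0.1"
$ gcloud dns record-sets transaction add --zone=example-zone \
    --name=hello-world-gke.example.private. --ttl=300 --type=A "10.0.0.2"
$ gcloud dns record-sets transaction execute --zone=example-zone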

Service Backed by GCE VMs

Here we will deploy a Traffic Director service backed by GCE VMs.

Please note: These activities loosely mirror the Google Cloud provided documentation, Traffic Director setup for Compute Engine VMs with manual Envoy deployment.

We first create a GCE Instance Template by following the instructions in Creating the instance template for the Hello World test service; we do not, however, need to set the td-http-server network tag, as we have already allowed health check traffic to all GCE VMs.

Next, we create a GCE Instance Group by following the instructions in Creating the managed instance group for the Hello World service.
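
For reference, a gcloud sketch of these two steps; the template name, machine type, and zone are placeholders (the managed instance group name matches the hostnames seen later in the article), and the startup script simply installs Apache and serves the VM’s hostname:

$ gcloud compute instance-templates create td-demo-hello-world-template \
    --machine-type=n1-standard-1 \
    --image-family=debian-9 \
    --image-project=debian-cloud \
    --metadata=startup-script='#! /bin/bash
apt-get update -y && apt-get install -y apache2
echo "<!doctype html><html><body><h1>$(hostname)</h1></body></html>" > /var/www/html/index.html'

$ gcloud compute instance-groups managed create td-demo-hello-world-mig \
    --zone=us-central1-a \
    --size=2 \
    --template=td-demo-hello-world-template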

At this point, we have two GCE VMs that serve up a web page on the root, /, path using HTTP on port 80 through their internal IP addresses.

As the Traffic Director service needs to determine whether a backing GCE VM is healthy before directing traffic to it, we need to set up a GCE Health Check as per Creating the health check.
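
A gcloud sketch; the health check name is a placeholder:

$ gcloud compute health-checks create http td-vm-health-check \
    --port=80 \
    --request-path=/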

Next, we create the Traffic Director service backed by the GCE VMs as documented in Creating the backend service; the Port is 80 and, for consistency with the later GKE-pod-backed service, we choose a Balancing mode of Rate at 5 RPS.
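
In gcloud terms, a Traffic Director backend service uses the INTERNAL_SELF_MANAGED load balancing scheme; a sketch using the names above (the zone is again a placeholder):

$ gcloud compute backend-services create td-vm-service \
    --global \
    --load-balancing-scheme=INTERNAL_SELF_MANAGED \
    --protocol=HTTP \
    --health-checks=td-vm-health-check

$ gcloud compute backend-services add-backend td-vm-service \
    --global \
    --instance-group=td-demo-hello-world-mig \
    --instance-group-zone=us-central1-a \
    --balancing-mode=RATE \
    --max-rate-per-instance=5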

While we have created the Traffic Director service, we have not specified the logic, a Routing Rule Map, that directs traffic to it. In this case, we want to direct any traffic to the 10.0.0.1 VIP on port 80 to the service.

From the Traffic Director page, we select the Routing rule maps tab and press the Create Routing Rule Map button.

We name the Routing Rule Map td-vm-routing-rule-map.

We then add a Forwarding Rule using the Add forwarding rule button, providing Name: td-vm-forwarding-rule, Custom IP: 10.0.0.1, and Port: 80 (default).

For the provided Host and path rules, we select td-vm-service, the service we created earlier.

Save the Routing Rule Map.
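
Under the hood, a Routing Rule Map amounts to a URL map, a target HTTP proxy, and a global forwarding rule; a gcloud sketch, with the URL map and proxy names as placeholders and the forwarding rule name matching the one above:

$ gcloud compute url-maps create td-vm-url-map \
    --default-service=td-vm-service

$ gcloud compute target-http-proxies create td-vm-proxy \
    --url-map=td-vm-url-map

$ gcloud compute forwarding-rules create td-vm-forwarding-rule \
    --global \
    --load-balancing-scheme=INTERNAL_SELF_MANAGED \
    --address=10.0.0.1 \
    --target-http-proxy=td-vm-proxy \
    --ports=80 \
    --network=default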

At this point our Traffic Director service is fully configured.

Client on a GCE VM

Now we need to create a client running on a GCE VM that can access the Traffic Director service, i.e., one that can reach the backing GCE VMs via the IP address 10.0.0.1 on port 80 or, better yet, via the DNS name hello-world-gce.example.private, which resolves to 10.0.0.1.

Before we go through the details of creating the GCE VM, we need to understand an important behavior of the service proxy.

Make sure that you have set up traffic interception only for the IP-addresses of services that are configured in Traffic Director. If all traffic is intercepted, then connections to the services not configured in Traffic Director are silently discarded by the sidecar proxy.

— Google Cloud — Troubleshooting Traffic Director Deployments

In our particular example, we only want the service proxy to intercept traffic destined for addresses in the VIP CIDR block, 10.0.0.0/16. The service proxy supports such a configuration.

In the sidecar.env file, set the value of SERVICE_CIDR to this range. Traffic to these IP addresses is redirected by netfilter to a sidecar proxy and load-balanced according to the configuration provided by Traffic Director.

— Google Cloud — Advanced Traffic Director Configuration Options

Please note: This configuration option took a while to find; without it, the GCE VM cannot reach the Internet, e.g., to install packages.
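
Concretely, the relevant lines in sidecar.env end up looking roughly like the following; the project number is a placeholder, and the file’s other variables are left at their defaults:

# sidecar.env (excerpt)
GCP_PROJECT_NUMBER='123456789012'   # replace with your project number
VPC_NETWORK_NAME='default'
SERVICE_CIDR='10.0.0.0/16'          # only traffic to these VIPs is intercepted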

From the Compute Engine > VM instances menu we select the Create Instance button. We supply:

  • Name: td-demo-vm-client
  • Boot disk: Debian GNU / Linux 9 (stretch)
  • Access scopes: Allow full access to all Cloud APIs

Clicking the Management, Security, Disks, Networking, Sole Tenancy link and selecting the Management tab, we copy our startup script, adapted from the Google-provided example as described below, into the Startup script field (updating GCP_PROJECT_NUMBER to our project’s number).

Things to observe:

  • This script is based on the example provided in the instructions Traffic Director setup for Compute Engine VMs with manual Envoy deployment
  • While it seems improbable, this Google-supplied script does not work with the default Debian GNU / Linux (buster) Boot disk option
  • In addition to setting the SERVICE_CIDR value to 10.0.0.0/16, we supplied values for GCP_PROJECT_NUMBER and VPC_NETWORK_NAME
  • We also removed the lines involved in installing and configuring the apache2 service; the client does not need to run the Apache server itself

We then press the Create button to create the GCE VM instance.
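
For reference, the same VM can be created with gcloud; a sketch, assuming the modified startup script has been saved locally as client-startup.sh (the file name and zone are placeholders):

$ gcloud compute instances create td-demo-vm-client \
    --zone=us-central1-a \
    --image-family=debian-9 \
    --image-project=debian-cloud \
    --scopes=https://www.googleapis.com/auth/cloud-platform \
    --metadata-from-file=startup-script=client-startup.sh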

Verifying the Configuration

To verify the configuration we log in to the client using the SSH > Open in browser window option in the td-demo-vm-client GCE VM instance entry.

Please note: It can take two to three minutes for the startup script to complete.

We confirm that the client’s traffic to the VIP 10.0.0.1 on port 80 is intercepted and sent to one of the GCE VMs backing the Traffic Director service.

$ curl http://10.0.0.1
<!doctype html><html><body><h1>td-demo-hello-world-mig-l2c6</h1></body></html>

We can also use the DNS name hello-world-gce.example.private that we set up earlier.

$ curl http://hello-world-gce.example.private
<!doctype html><html><body><h1>td-demo-hello-world-mig-10cw</h1></body></html>
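
As a quick sanity check, independent of the service proxy, we can confirm that the private DNS record resolves to the expected VIP; the following should report 10.0.0.1:

$ getent hosts hello-world-gce.example.private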

We can also observe that traffic not destined to a VIP in the service CIDR block, 10.0.0.0/16, bypasses the service proxy.

$ curl http://www.google.com
<!doctype html>
[omitted]

Next Steps

In the next article, Traffic Director by Example: Part 2, we will explore a client running in a GKE pod and a service backed by a GKE service and its pods.
