Our Review: Apigee Hybrid on Azure

Bringing the best of SaaS and Private Cloud together for a brilliant API Program.

We recently took Apigee’s new Hybrid solution for a test drive on Microsoft Azure, and in this review we give you a rundown of what we found on the journey and what to expect from running Apigee Hybrid.

The High Level

Positives

  • Runtime secured on your own managed IaaS
  • Great separation of private and public APIs
  • More location options for where your APIs are running
  • Lower infrastructure skillset required than on-premise solutions
  • Out-of-the-box configuration and set-up
  • Automation friendly
  • Extensive features
  • Great for medium to large businesses

Negatives

  • More operational overhead than a SaaS solution
  • Some faults may take longer to diagnose
  • Little online community support
  • Installation instructions lacking
  • Not ideal for smaller businesses (go with Apigee SaaS)

Would you like a demonstration?

If you’d like a demonstration of our Apigee Hybrid on Azure solution, we’d love to show you around. Reach out to us at info@sonrai.com.au

Apigee Hybrid

With Apigee Hybrid, customers configure their API Management solution from the cloud UI (or via the management API); however, all the runtime APIs and associated configuration live on the customer’s privately configured infrastructure, whether that’s Azure, Google Compute, or AWS. This means all the API traffic runs through customer-managed infrastructure, giving businesses more control and choice whilst still retaining the benefit of managing their APIs through the cloud.

This takes away a lot of the complexity of running a completely private API Management solution (e.g. Apigee OPDK), but still keeps much of the stack within the customer’s control. Hybrid removes the pain of managing VMs and individual applications as it is installed on Kubernetes - anywhere you can run Kubernetes, you can pretty much run Apigee Hybrid (cloud or on-premise infrastructure). At a high level, the solution’s complexity sits somewhere between a SaaS-based solution and a fully customer-managed solution.

Test Drive

We’ve had a lot of experience with both Apigee OPDK and Apigee SaaS across a number of clients, so we know that Apigee products have a great pedigree, and we were therefore very keen to see how Hybrid would stack up and what differences we would see between the three offerings from Apigee.

Rather than looking at the box and making up our minds, we decided to take it for a test drive…


Our Design

For the test drive, we wanted to deploy a solution that was a realistic customer deployment of Hybrid so that we could get a feel for the maturity of the product and whether it would meet the needs of our customers.

For our demonstration environment design, we looked at deploying:

  • Apigee Hybrid v1.3 onto Microsoft Azure in Australia - which, from the feedback we hear, seems to be the cloud du jour for businesses wanting to deploy Hybrid.

  • Across 2 Azure regions in Australia; Australia-east and Australia-southeast - With Australia-east as our primary location, as it has 3 availability zones, and Australia-southeast as our secondary (with 1 AZ).

  • Route 53 from AWS - To manage DNS and active-passive failover from primary to secondary Hybrid runtime environments (as this part wasn’t necessary for the test drive of Hybrid we only deployed a very simplistic DR-based-on-DNS-availability failover solution).

The Build

We followed the build instructions on Apigee’s website for version 1.3. These instructions are a detailed end-to-end manual for installing Hybrid on either GKE, EKS or AKS. It is easily missed at first glance, but all Hybrid deployments require 3 distinct environment elements as part of the solution:

  1. Gcloud environment - The first thing to be aware of is the need for a customer-managed GCP Project on top of the Runtime environment. This Google project sits between the Management and Runtime planes, is used to manage the runtime plane, and holds all the user identities for managing the solution.

  2. Apigee Edge UI - The management control plane, or user interface, which sits on GCP as a black box and is fully managed by Google.

  3. Runtime environment - A Kubernetes cluster (or many clusters) sitting on either GCP, AWS or Azure that is the runtime API gateway. This is where all the action happens and all API traffic flows through.

Prerequisites

Before starting the build of our Hybrid platform we made sure we had the following;

  • Apigee Hybrid license.

  • Google Cloud account and project.

  • Azure account and PAYG subscription.

  • A quota on the Azure subscription of at least 16 Standard_DS3_v2 Microsoft.Compute resources.

  • The correct access control permissions for engineers configuring the solution in Google Cloud and Azure.

  • Windows PowerShell with Azure CLI, kubectl and GCloud SDK cmdlets.

  • AWS account for the configuration of Route 53.
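To confirm the quota prerequisite, you can query your regional vCPU usage with the Azure CLI. This is a hedged sketch: it assumes you are logged in with `az login`, and the region and family filter shown are examples to adjust for your subscription.

```shell
# List compute quota usage for the target region and filter for the
# DSv2 family (Standard_DS3_v2 nodes draw from this family's vCPU quota).
az vm list-usage --location australiaeast --output table | grep -i "DSv2"
```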

Part 1

Part 1 of the build is very straightforward: configuring your Google Cloud account and project, enabling the APIs on Google (required to interface between the Hybrid Edge UI and the Runtime Plane), creating your initial organization, and adding an environment.

As we were using Azure, we skipped adding a static IP and DNS on Google Cloud and used AWS and Azure instead for that step, which was straightforward.

NOTE: Before beginning part 1, we made sure we had a valid Apigee Hybrid licence to work with, and that we could log into the Hybrid Edge UI. It’s important to note that the Hybrid Edge UI (https://apigee.google.com/edge) is not the same as the Apigee SaaS Edge UI (https://apigee.com/edge), they have different URLs and there are also subtle differences in how they operate.
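As a rough sketch of Part 1 from the GCloud CLI (the project ID is a placeholder; the exact API list should be checked against the install guide for your version):

```shell
# Point gcloud at the customer-managed GCP project that sits between
# the management and runtime planes.
gcloud config set project my-hybrid-project

# Enable the Google APIs the Hybrid management plane uses to talk to
# the runtime plane.
gcloud services enable \
  apigee.googleapis.com \
  apigeeconnect.googleapis.com \
  cloudresourcemanager.googleapis.com
```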

Part 2

Part 2 of the build is where most of the work takes place, and also where we spent most of our time troubleshooting the build if anything went awry. This part is also quite complex when you first run through it, and if you’re building your own proof-of-concept environment, we recommend giving yourself at least 1 week to get used to deploying and configuring Hybrid on Kubernetes and getting it fully operational.

For most of the build, we used a mixture of Windows PowerShell (in Administrator mode) and the GCloud CLI (which is required for some of the later steps that cannot be done through PowerShell). Even a novice with these tools will find the build process and commands used fairly straightforward.

Google provides relatively good instructions on the build; however, as version 1.3 uses Google’s Anthos (for “connecting” the AKS cluster into GCloud for management), from Step 2 onwards there are many offshoot linked instructions for Istio and Anthos, which makes it difficult to follow along and be sure that Anthos and Istio (which are critical) are set up correctly. It is easy to get lost between all the steps, especially if Anthos and Istio are new to you. This is why we built our own end-to-end installation and re-installation instructions and scripts, to make sure that we could repeatably create Runtime planes.

At Step 3 and beyond, we found that the installation is best run from the GCloud CLI, especially as the apigeectl tool is designed to be run from a Linux environment (in fact, if you’re looking at scripting a whole installation, we would recommend defaulting to the GCloud CLI for as much of the install as you can).

Step 4 is where you create your overrides file, which is used to set configuration parameters and to set up virtual hosts, certificates and environments. We found the use of YAML files a great idea as it helps with future DevOps and automation of the platform, and it aligns with how Kubernetes is configured, meaning Kube-savvy engineers will pick this up quite quickly. YAML can be particular about syntax, but once you are familiar with the structure of the overrides file you get to understand how it can be used to manage and scale your Hybrid deployment in an automated and release-management-friendly way.
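The shape of an overrides file can be sketched via a shell heredoc. This is an illustrative fragment only - the property names follow the Hybrid 1.3 configuration reference, but every value (project, org, cluster, environment and file paths) is a placeholder to replace, and you should verify the properties against the reference for your version:

```shell
# Write a minimal, illustrative overrides.yaml for the Hybrid runtime.
cat > overrides.yaml <<'EOF'
gcp:
  projectID: my-hybrid-project      # customer-managed GCP project
org: my-hybrid-org                  # Apigee organization from Part 1
k8sCluster:
  name: aks-hybrid-prod
  region: australiaeast
envs:
  - name: test                      # environment added in Part 1
    serviceAccountPaths:
      synchronizer: ./service-accounts/synchronizer.json
      udca: ./service-accounts/udca.json
virtualhosts:
  - name: default
    sslCertPath: ./certs/mydomain.pem
    sslKeyPath: ./certs/mydomain.key
EOF
```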

Step 5 then uses the apigeectl tool to configure much of the Apigee Hybrid runtime environment via the overrides file. This is an important step and the one we found most prone to error (e.g. your certificate has issues, or your environment doesn’t match what was configured in Part 1). We also needed to enable Synchronizer access to the Runtime, and lastly enable Apigee Connect.
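The apigeectl workflow we followed looks roughly like this - a sketch assuming apigeectl is on your PATH, your kubectl context points at the AKS cluster, and overrides.yaml is in the current directory:

```shell
# Install the cluster-level Apigee components.
apigeectl init -f overrides.yaml

# Wait until the init pods report ready before continuing.
apigeectl check-ready -f overrides.yaml

# Deploy the runtime components described by the overrides file.
apigeectl apply -f overrides.yaml
```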

With that all completed, we then went back into the Apigee Hybrid Edge UI to start deploying APIs, apps and developers to test that the solution works in a single region. When deploying APIs to an environment, it can take a minute or so for them to be deployed and for the Edge UI to report that the deployment is successful.

Adding a second region

The above steps covered installing Hybrid in one region, but one of the great features of Hybrid is a multi-region deployment. Although this isn’t new to private and even SaaS API Management platforms, with Hybrid it is more seamless and much easier to keep all regions completely in sync, as this is taken care of by the Edge UI. This means you don’t have to worry about different versions of APIs (and other items) in separate regions, giving engineers a “deploy once” capability where all regions are kept up-to-date.

For our 2-region install, we started from scratch again as part of testing our installation steps, and this time configured the second region after step 1 (creating the cluster) and before step 2 (cert-manager and Anthos configuration). At a high level, we:

  1. Set-up a new virtual network in Australia-southeast.

  2. Peered the networks together.

  3. Copied the cert-manager configuration across to the new region.

  4. Seeded Cassandra in the new region to expand the ring.

It took us some time to work through all the steps and get the configuration correct through trial-and-error.
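Steps 1, 2 and 4 above can be sketched as follows. All resource names are placeholders, peering must also be created in the reverse direction, and the Cassandra property should be verified against the configuration reference for your Hybrid version:

```shell
# Step 2: peer the primary and secondary VNets (one direction shown;
# repeat with the names swapped for southeast-to-east).
az network vnet peering create \
  --resource-group rg-hybrid \
  --name east-to-southeast \
  --vnet-name vnet-australiaeast \
  --remote-vnet vnet-australiasoutheast \
  --allow-vnet-access

# Step 4: in the second region's overrides file, point Cassandra at a
# seed node in the first region so the new datacenter joins the ring:
#
#   cassandra:
#     multiRegionSeedHost: <IP of a Cassandra node in region 1>
```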

Unfortunately, not only are the instructions on how to set up a multi-region deployment hard to find (they are somewhat hidden under the Administer > Cluster management menu), they are also not a step-by-step guide to setting up a second region. As a 2-region deployment is quite complex and involves lots of moving parts, this area would benefit from more detailed steps and from moving the instructions under the Install section of the online guide.

With the second region up and running, we tested our API deployments and made sure they were available across both regions (by hitting the individual Istio FQDNs). We also queried the Cassandra database in each region to ensure it was propagating records across the ring.
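To check the ring, we ran nodetool inside a Cassandra pod via kubectl. A sketch - the pod name and namespace follow the Hybrid defaults but may differ in your deployment:

```shell
# With both regions healthy, nodetool should list two datacenters,
# with every node in state "UN" (Up/Normal).
kubectl exec apigee-cassandra-0 -n apigee -- nodetool status
```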


Overall Key Take-outs

As expected, the build and running of Apigee Hybrid is in between the complexity of Apigee OPDK (a fully on-premise and self-contained system) and Apigee SaaS. Below we give some of our key take-outs from running Apigee Hybrid.

Skillsets

Apart from your usual Apigee skillsets, your engineering and operations team will need the following skills to ensure the environment is well maintained and reliable:

  • Linux - having a Linux background is foundational knowledge that will ensure that your operational team is well-versed in the usual commands, scripting and logic that Linux brings. Although you’ll spend your time between Azure and GCloud CLIs a grounding in Linux will help you quickly pick up command syntax.

  • Kubernetes - This is the core infrastructure technology and having container-based experience is critical to keeping the Hybrid environment scaled, up-to-date and available.

  • Cassandra - Although you don’t have to be an expert, having fundamental Cassandra knowledge is crucial, as this is the only component in Hybrid that holds stateful data. The Cassandra database holds all the configuration of your APIs, keys, and other data, so it must be running smoothly for your API platform to be available.

  • Networking - There is complexity in the network, especially across multi-region deployments and where private and public APIs are served from separate networks. Also, connections with GCloud must be factored in. Understanding networking in Azure, GCloud and general network experience will be critical to ensure that the APIs are always available and back-end systems always remain connected.

  • YAML - If you’ve already got Kubernetes experience you’ll already be familiar with YAML, but as this is used quite heavily in the build and maintenance of Hybrid platform it’s worth pointing out that YAML experience is important.

Costs

During our testing and building, we spent around AU$1,000 on Azure resources over the month. For a simple 4-node cluster running in 2 regions (or 2 clusters in 1 region), expect to pay around AU$2,000 per month for Azure resources. You’ll find the vast majority of costs (80-90%) is in the virtual machines that make up the nodes of your Kubernetes clusters. Of course, our solution was on a simple PAYG plan; costs will be heavily dependent on your business’s Azure agreement and how heavily you’ll use the gateway (which impacts the number of nodes).

[Image: apigee-hybrid-costs.png]

You’ll need to factor in Apigee license costs and operational/engineering costs on top of this. If your operational teams are already familiar with Azure, Cassandra and Kubernetes, the running of the Hybrid runtime is not a major additional overhead as long as your operational teams keep up with the latest patches and upgrade the runtime regularly to the latest level. By integrating deployments and maintenance with DevOps tools you can reduce your maintenance effort even further through proper release management and automation.

One very important feature is that Kubernetes provides the ability to scale up and scale down (either manually or automatically). This can help save costs, especially if you have non-production environments that you could schedule to scale down overnight and scale up during the day (for example).
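As a hedged example of the manual case, a non-production AKS cluster could be scaled down overnight with the Azure CLI (resource group, cluster name and node counts are placeholders):

```shell
# Scale the dev cluster down to a single node overnight...
az aks scale --resource-group rg-hybrid --name aks-hybrid-dev --node-count 1

# ...and back up for the working day.
az aks scale --resource-group rg-hybrid --name aks-hybrid-dev --node-count 4
```

These commands can be run on a schedule from a DevOps pipeline to automate the saving.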

An additional trick we employed was scripting our builds, so we could tear-down our clusters completely and then stand them back up again within 30 minutes. With scripting and automation, you could employ this idea to sandbox-like Hybrid environments that only get spun up when needed.

Things to watch for

  • 3 environments - As noted earlier, it’s not abundantly clear until you start building, but there are actually 3 “clouds” involved in a Hybrid on Azure solution: the Google-managed Hybrid Edge UI, GCloud (connecting your Runtime and managing IAM), and Azure for your Runtime. You will therefore have to be familiar with Google Cloud, as a Google Cloud Project is also required.

  • Identity management - All user identities are managed through the Google Cloud Project environment. This will be the identity store for Hybrid users and developers.

  • Cost management - Set budgets and cost alerts for your Hybrid platform, as it is easy to quickly rack up costs, or accidentally leave your platform with more nodes than it needs.

  • Upgrades - Apigee tends to support N-2 versions of its software, so it’s important to keep in the habit of regularly upgrading your Hybrid environment.

  • Automate - Everything is highly configurable and the infrastructure is run as code. Given the regular upgrades, it is worth investing in automating as much of the platform maintenance as possible, and regularly testing DR failover and scaling up/down via DevOps tools.


Pros

  • Runtime secured on your own managed IaaS - You can ensure that all API traffic and the configuration state of your APIs are stored locally on your own IaaS. This helps those customers who require API traffic to remain on their own IaaS, and for configuration data to also remain on their own IaaS.

  • Better separation of private and public/partner APIs - Like OPDK, you can separate private and public APIs easily through different environments, clusters and network layouts ensuring that private APIs never reach the public. This can also be a benefit when developing APIs that may have access to sensitive data during development, meaning you can secure your dev and preproduction environments so that they are not publicly visible.

  • More options for physical locations of where your APIs are running - As all the major clouds are covered, you can enjoy the Apigee SaaS-like experience from almost any location where there is a major cloud provider presence. At this time, Hybrid is only available on GKE on-premise for fully-private stacks; however, we are sure there will be more options in the future. This allows businesses to set up DR and regional deployments that are much closer to their back-ends, where latency is a factor.

  • Lower infrastructure skillset required - Although there is a much higher operational overhead of managing infrastructure than Apigee SaaS, if a business needs to deploy its own API Gateway, Hybrid is the best option for limiting the infrastructure overhead that impacts operational teams. Given that much of the configuration can be automated and most of the Runtime components don’t need to be touched, most operational time can be spent on ensuring great API outcomes.

  • Relatively out-of-the-box configuration and set-up - Although some of the instructions need expansion, we found that the deployment to Azure was mostly out-of-the-box, and certainly much easier than a fully-on-premise deployment (like OPDK).

  • Automation friendly - Due to the infrastructure-as-code approach it offers great scripting and automation scope. This will significantly benefit operational and engineering teams to keep costs minimized, their environments scaled to meet demand and up-to-date.


Cons

  • More operational management overhead than a SaaS solution - It goes without saying that this solution is much more complex than Apigee SaaS. However, for many customers who were looking at OPDK, this would be the best choice. If your team has only been on Apigee SaaS and is then moving to Hybrid, there will be a steep learning curve unless your operational teams are already familiar with running systems on Kubernetes clusters in the cloud.

  • Some faults may take longer to diagnose - As Hybrid is quite new, and some components are a black box to the operator, some of the errors that appear do not point to exactly where the problem is, and therefore troubleshooting issues took us a little longer than we would have liked. Some of this is related to experience with Kubernetes and how all the pieces fit together; once that experience builds, it is easier to troubleshoot issues as they occur.

  • Little online community support - Again, this is largely because Hybrid is relatively new. One of the great things about Apigee is the online community support and the richness of the community articles, which are a significant help to engineers and operators. The lack of community information is of course a temporal issue, which will resolve itself as more customers start using Hybrid.

  • Installation instructions - This was probably our biggest bugbear. The installation instructions are somewhat lacking and are disjointed when it comes to Anthos and multi-region deployments. An engineer should be able to take the instructions and follow them step-by-step, with worked examples, to a working system. We can see improvements with each release; however, much time is spent deciphering which step you’re up to and whether you missed a critical config item, and this could be minimized with more comprehensive and better-flowing instructions.


Recommendation

Even though Hybrid is a relatively new product, overall the solution is beautifully designed, and we can see much engineering thought has gone into ensuring it is a robust, scalable and easy-to-manage-as-code solution for businesses. It is great for those businesses needing to run their API Gateway on their own cloud-based IaaS. If a business has been looking to deploy Apigee OPDK to a cloud, Hybrid should give them serious pause; it should be the default choice for any customer wanting a robust, reliable and best-in-class API Management solution on their own Kubernetes clusters.

For those businesses that must have DR within a specific country, or don’t have a GCP region in close proximity to their latency-sensitive systems, Hybrid is a great choice that provides a balance between the simplicity of SaaS and the complexity of on-premise API gateway solutions.

The complexity of Hybrid is much higher than that of Apigee SaaS; still, Apigee Hybrid takes away much of the headache of running fully-private API Management solutions, which over time tend to become a mess of technology debt and out-of-date software that operational teams struggle to keep running effectively. Further, the trade-off of Hybrid is understood upfront, as you’re running it on your own Kubernetes cluster and private cloud.

Overall for those wanting a best-of-breed private API Management solution with minimal overhead, Apigee has developed just that, and this should be a serious consideration if looking at private-cloud API Management solutions.

Would you like a demonstration?

If you’d like a demonstration of our Apigee Hybrid on Azure solution, we’d love to show you around. Reach out to us at info@sonrai.com.au