This is a blog post by Ronak Kothari, of the BloomReach technical staff (email@example.com)
We are excited to announce our new open source initiative, BloomGateway, which is designed to tackle the challenges of entry point service requirements in a distributed environment. It provides application-level security, a way to throttle traffic, and a way to re-route requests to either a caching layer or a different data center during transient issues in the current serving layer.
At BloomReach, our serving layer is composed of a number of independent microservices. These services are exposed to each other via AWS Elastic Load Balancer. This simple design has helped us horizontally scale our serving layer and allowed us to manage the different services independently. However, we still needed a gateway that lets us manage requests to these services at the entry point. In particular, we needed a solution that:
- Improved the application-level security.
- Throttled traffic of certain types of requests.
- Isolated internal and external requests.
- Defined and applied a multi-region data-center policy for fallback scenarios.
- Provided a way of bucketing requests for A/B testing, release workflow testing, and grouping of requests to a set of upstream servers.
We investigated a number of existing solutions, from open source projects (KONG, Zuul, Tyk, Vulcand) to full API management services (Amazon API Gateway and Apigee Edge). We evaluated them against the following parameters:
- Latency/performance overhead.
- Ease of adding new services in the serving layer.
- Ease of management and deployment.
- Flexibility to build a custom solution that meets our requirements.
It turned out that none of these solutions met all of our requirements. That is why we built BloomGateway, a solution that, compared to the existing ones, is lightweight and high-performance, and comes with little to no deployment or management overhead.
Design / Architecture
To meet our primary requirements, BloomGateway uses OpenResty as its underlying engine, which is essentially Nginx bundled with Lua. On top of that, it has its own core engine with a pluggable module architecture, where each module is independent and may have its own configs. The core engine controls only the initialization and execution workflow of the modules, so each module is free to implement whatever logic its requirements call for.
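The split between a core engine and independent modules can be sketched as follows. This is a minimal illustration in Python rather than BloomGateway's actual Lua code; the names `Module`, `CoreEngine`, `DenyListModule`, and `TagModule`, the config keys, and the convention of returning an HTTP status to stop processing are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Request:
    uri: str
    ip: str
    headers: Dict[str, str] = field(default_factory=dict)


class Module:
    """Hypothetical base class: each module owns its config and its logic."""
    name = "module"

    def init(self, config: dict) -> None:
        self.config = config

    def execute(self, request: Request) -> Optional[int]:
        """Return an HTTP status to stop processing, or None to continue."""
        return None


class DenyListModule(Module):
    """Illustrative module: rejects requests from configured IPs."""
    name = "deny_list"

    def execute(self, request: Request) -> Optional[int]:
        if request.ip in self.config.get("blocked_ips", []):
            return 403
        return None


class TagModule(Module):
    """Illustrative module: annotates the request with a header."""
    name = "tag"

    def execute(self, request: Request) -> Optional[int]:
        request.headers["X-Tag"] = self.config.get("tag", "default")
        return None


class CoreEngine:
    """Only drives initialization and execution order; knows no module logic."""

    def __init__(self, modules: List[Module], configs: Dict[str, dict]):
        self.modules = modules
        for module in modules:
            module.init(configs.get(module.name, {}))

    def handle(self, request: Request) -> int:
        for module in self.modules:
            status = module.execute(request)
            if status is not None:
                return status  # a module short-circuited the request
        return 200  # every module let the request through


engine = CoreEngine(
    [DenyListModule(), TagModule()],
    {"deny_list": {"blocked_ips": ["10.0.0.9"]}, "tag": {"tag": "beta"}},
)
```

Adding a new behavior in this sketch means writing another `Module` subclass and registering it with the engine; the engine itself does not change.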
To support changing behavior at runtime, BloomGateway supports both PULL and PUSH models for updating and applying a module's configs, as well as for changing low-level nginx configurations. The following figure helps illustrate the architecture:
Currently, BloomGateway provides four modules:
- Rate Limiter Module
  - Allows throttling traffic to a service or API by URI, any header or query string param, or IP address.
- Access Control Module
  - Allows easy control over which requests can access the service, based on the request’s header, query string param or IP address.
- Fallback Module
  - Supports registering ordered fallback end-points for error scenarios.
- Router Module
  - Helps in bucketing requests to support A/B testing.
  - Helps in routing requests to a dedicated set of upstream servers.
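To make the rate limiter's idea concrete, here is a minimal sketch of throttling keyed on a request attribute such as an IP address or header value. The post does not say which algorithm BloomGateway uses, so the fixed-window counter below, the class name `FixedWindowRateLimiter`, and the key format are assumptions chosen for simplicity.

```python
import time
from typing import Callable, Dict, Tuple


class FixedWindowRateLimiter:
    """Allows at most `limit` requests per key in each fixed time window."""

    def __init__(self, limit: int, window_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self.limit = limit
        self.window = window_seconds
        self.clock = clock  # injectable so the limiter can be tested
        # key -> (window index, requests counted in that window)
        self._counters: Dict[str, Tuple[int, int]] = {}

    def allow(self, key: str) -> bool:
        window_index = int(self.clock() // self.window)
        index, count = self._counters.get(key, (window_index, 0))
        if index != window_index:
            # A new window has started: reset the counter for this key.
            index, count = window_index, 0
        if count >= self.limit:
            self._counters[key] = (index, count)
            return False
        self._counters[key] = (index, count + 1)
        return True
```

A caller would build the key from whichever attribute a rule targets, for example `limiter.allow("ip:" + client_ip)` or `limiter.allow("uri:" + uri)`, and reject the request when `allow` returns `False`.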
The BloomGateway service can be deployed as an independent edge layer or alongside the upstream service. It can receive runtime configs in push mode, where updates are sent through a POST API, or in pull mode, where it periodically fetches them.
As shown below, at BloomReach we have deployed the service alongside the upstream service, using the pull model for runtime configuration changes.
Using a command-line interface, all the configs are stored and updated in a centralized S3 bucket under a unique id called the clusterid. In pull mode, the BloomGateway service is started with the clusterid and the location of the centralized S3 bucket, which lets it join the cluster. In the above example, the BloomGateway services managing the underlying upstream service on all five hosts are started with “realm1” as the clusterid. In that way, they form a logical cluster.
By updating the configurations at the centralized S3 location for “realm1”, we can manage the underlying service running on all five hosts. Because BloomGateway forms only a soft cluster, it is very easy to add a host running the same upstream service or move it from one cluster to another. It also makes the service easy to manage even when it runs on a number of hosts that span multiple regions.
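The pull model described above can be sketched roughly as follows. This is an illustration only: the `ConfigPuller` class, the `<clusterid>/config.json` key layout, and the in-memory dictionary standing in for the S3 bucket are all hypothetical, and a real deployment would call `poll_once` on a timer.

```python
import json
from typing import Callable, Optional


class ConfigPuller:
    """Fetches the config stored for one clusterid and applies it on change."""

    def __init__(self, cluster_id: str,
                 fetch: Callable[[str], str],
                 apply: Callable[[dict], None]):
        self.cluster_id = cluster_id
        self.fetch = fetch    # stands in for reading an object from the bucket
        self.apply = apply    # stands in for reloading module configs
        self._last_raw: Optional[str] = None

    def poll_once(self) -> bool:
        """Returns True if a new config was applied, False if nothing changed."""
        raw = self.fetch(f"{self.cluster_id}/config.json")  # hypothetical key layout
        if raw == self._last_raw:
            return False
        config = json.loads(raw)
        self.apply(config)
        self._last_raw = raw
        return True


# An in-memory dict stands in for the centralized bucket.
bucket = {"realm1/config.json": json.dumps({"rate_limit": 100})}

# Two hosts started with the same clusterid end up with the same config.
host_a: dict = {}
host_b: dict = {}
puller_a = ConfigPuller("realm1", bucket.__getitem__, host_a.update)
puller_b = ConfigPuller("realm1", bucket.__getitem__, host_b.update)
puller_a.poll_once()
puller_b.poll_once()
```

Because the only thing the hosts share is the clusterid they poll, moving a host to another cluster in this sketch amounts to starting its puller with a different id.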
With a simple design, BloomGateway provides flexibility for updating runtime configs, a pluggable framework for extensibility, and low latency overhead with no extra deployment or management cost. At BloomReach, we use this service at the entry point to improve our security and to increase availability by handling our multi-region fallback scenarios. In our performance benchmark, we observed a maximum latency overhead of ~2ms with 100 rules configured for all the modules.
Special thanks to all the BloomReach contributors and its management:
- Navneet Gupta
- Amit Agrawal
- Ashok Sathyanarayan
- Mohit Jain
- Ashok Raja R
If you find this useful, please leave us feedback, file issues, and submit pull requests. The code is available on GitHub at https://github.com/bloomreach/bloomgateway.