AWS Lambda – Akshay Surve

This blog post was cross-posted from DeltaX Engineering Blog - {recursion} where it was published first.

Using CDNs (Content Delivery Networks) for static content is a long-established best practice and something we use across our platform and ad-server. I want to share a special use case where we use a CDN (AWS CloudFront) to serve dynamic requests on our ad-server and achieve subsecond response times.

CDN for Static Content

CDNs employ a network of nodes across the globe, called edge nodes, to get closer to the user (client browser) and so reduce latency and round-trip delay. Add a cache policy at the edge nodes and you are able to serve content globally with acceptable latencies.

Here is what it looks like (figure: How CDNs Work).

CDNs also come in handy because browsers limit the number of concurrent HTTP connections to the same domain: anywhere between 2-4 for older browsers and 6-10 for modern ones. Spreading requests across multiple CDN sub-domains helps avoid queuing requests on the browser side.
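As a rough illustration of spreading requests across sub-domains, here is a minimal Python sketch. The hostnames and the `shard_url` helper are hypothetical and not taken from our setup; the only idea it shows is mapping each asset path to one sub-domain deterministically.

```python
import zlib

# Hypothetical shard hostnames; substitute your own CDN sub-domains.
SHARDS = ["cdn1.example.com", "cdn2.example.com", "cdn3.example.com"]


def shard_url(path: str, shards=SHARDS) -> str:
    """Map an asset path to one shard, always the same one for a given path."""
    # crc32 is stable across processes, unlike Python's built-in hash() for strings.
    index = zlib.crc32(path.encode("utf-8")) % len(shards)
    return f"https://{shards[index]}/{path.lstrip('/')}"
```

Keeping the mapping stable matters: if the same asset were served from a different hostname on each page view, the browser would treat each URL as a separate cache entry.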

CDN for Dynamic Content

Using a CDN for dynamic content, where the server’s response is supposed to be different for every user request, is counterintuitive. For an ad-server the response is not only unique per user but also time-sensitive, so caching the dynamic requests of the ad-serving engine is not recommended. CDNs that support dynamic content let you specify this in the distribution settings, or read it from the response headers of the origin servers.
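To make the header-based option concrete, here is a small sketch of what an origin might send. The `cache_headers` helper and the one-day default are assumptions for illustration; the directives themselves are standard HTTP `Cache-Control` values, and how a given CDN treats them should be checked against its documentation.

```python
def cache_headers(dynamic: bool, max_age_seconds: int = 86400) -> dict:
    """Return response headers telling shared caches what they may store."""
    if dynamic:
        # Per-user, time-sensitive responses: ask caches not to store them.
        return {"Cache-Control": "private, no-store"}
    # Static assets: allow shared caches to keep them for max_age_seconds.
    return {"Cache-Control": f"public, max-age={max_age_seconds}"}
```

An ad-serving response would use `cache_headers(dynamic=True)`, while images and scripts would use the static branch.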

Before we get deeper, there is another important consideration: all ad-serving requests are now mandated to go through HTTPS. HTTPS (SSL/TLS) is recommended to protect the security, privacy and integrity of the data, but it’s not known to be the fastest off the block. I’m referring to the SSL/TLS handshake, which comes on top of TCP’s 3-way handshake: it delivers the expected promise of SSL/TLS but adds significant latency while establishing the initial connection. This initial latency can be substantial considering ad-serving performance is measured in subseconds.

Terminating SSL at the edge node of a CDN, also called SSL offloading, can speed up initial requests (see real-world results below).
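A back-of-the-envelope model shows why the edge helps. The sketch below assumes a full TLS 1.2 handshake without session resumption, ignores server processing time, and assumes the edge node already holds a warm connection to the origin. The round-trip times in the usage lines are made-up illustrative numbers, not our measurements.

```python
TCP_HANDSHAKE_RTTS = 1  # SYN / SYN-ACK before the client can send data
TLS_HANDSHAKE_RTTS = 2  # full TLS 1.2 handshake, no session resumption
REQUEST_RTTS = 1        # HTTP request out, first byte of the response back

CLIENT_ROUND_TRIPS = TCP_HANDSHAKE_RTTS + TLS_HANDSHAKE_RTTS + REQUEST_RTTS


def direct_first_request_ms(rtt_client_origin_ms):
    """First request when the client connects straight to the origin."""
    return CLIENT_ROUND_TRIPS * rtt_client_origin_ms


def edge_first_request_ms(rtt_client_edge_ms, rtt_edge_origin_ms):
    """First request when SSL terminates at a nearby edge node.

    The handshakes only travel the short client-to-edge hop; the request
    itself adds one edge-to-origin round trip over an existing connection.
    """
    return CLIENT_ROUND_TRIPS * rtt_client_edge_ms + rtt_edge_origin_ms


print(direct_first_request_ms(200))    # 800
print(edge_first_request_ms(20, 190))  # 270
```

The further the user is from the origin relative to the edge, the larger the gap, because every handshake round trip is paid at the shorter distance.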

Here is what it looks like:

CDN for Dynamic Content (real-world results)

Theoretically, using a CDN for dynamic content purely for SSL offloading may look like a minor boost, but here is how the real-world results stack up.

This is close to a 900% boost in real-world performance for the first request. Results will vary with the latencies between your user, your origin server and the nearest edge location.

Additional Pointers

We use AWS CloudFront as our CDN, and here are some features we are able to leverage for subsecond ad deliveries:

Vast coverage - 98 Edge locations (87 Points of Presence and 11 Regional Edge Caches) in 50 cities across 23 countries
HTTP/2 support - which takes advantage of multiplexing (multiple request and response messages between client and server on the same connection instead of multiple connections). This helps especially for use cases where multiple assets are required, like rich media ads; the real-world benchmarks were unbelievable to me and Amrith (possibly a future blog post).
SSL Session Ticket - to reduce the back and forth of the SSL handshake for subsequent requests.
Support for gzip compression.
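To illustrate the gzip point in the list above, here is a minimal sketch of the content negotiation involved. It does not show how CloudFront is configured; the `maybe_gzip` helper is hypothetical and only demonstrates compressing a response when the client advertises support through `Accept-Encoding`.

```python
import gzip
from typing import Dict, Tuple


def maybe_gzip(body: bytes, accept_encoding: str) -> Tuple[bytes, Dict[str, str]]:
    """Compress the body only when the client advertises gzip support."""
    # Simplification: quality values such as "gzip;q=0" are ignored here.
    offered = {part.split(";")[0].strip().lower() for part in accept_encoding.split(",")}
    # Vary tells caches that the response depends on Accept-Encoding.
    headers = {"Vary": "Accept-Encoding"}
    if "gzip" in offered:
        headers["Content-Encoding"] = "gzip"
        return gzip.compress(body), headers
    return body, headers
```

Repetitive markup, which ad tags and rich media payloads often are, tends to compress well, so fewer bytes cross the wire per request.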

CDNs have become a commodity with the ease and flexibility offered by public cloud providers like AWS & Microsoft. I feel that with the recent launch of AWS Lambda@Edge, the promise of the on-demand nature of the cloud and serverless architecture will finally culminate into something bigger.

This blog post was cross-posted from DeltaX Engineering Blog - {recursion} where it was published first.

Advancements by cloud-based IAAS providers (Amazon Web Services, Google Cloud and Azure) have made on-demand scale and flexibility a reality. Today, as a startup you don’t need to worry about over-provisioning infrastructure, forecasting growth or going over long-term infrastructure contracts to meet your demands. Interestingly, a new suite of cloud services is questioning the very existence of a core aspect of common application architectures - the ‘server’ - and these services have been dubbed serverless.

What is the ‘server’ in `serverless`?

Let’s say you wanted to run a service on the cloud - for this, you would need to do the following:

1. Decide the type of computing resources you need: instance type, cores, memory and storage space.
2. Choose an OS / machine image to install on the instance.
3. Set up / deploy your service.

Steps 1 & 2 above constitute the ‘server’ in the serverless paradigm and in effect, these are the steps you wouldn’t have to worry about. All you need to do is to choose your execution environment and submit your code.
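To give a feel for what "submit your code" means, here is a minimal sketch of a function in the shape AWS Lambda expects for Python: a handler taking `event` and `context`. The greeting logic and the `name` field are invented for illustration.

```python
import json


def handler(event, context):
    """Entry point the platform calls on each invocation."""
    # `event` carries the trigger's payload; `context` carries runtime metadata.
    name = (event or {}).get("name", "world")
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }
```

There is no instance type, OS image or process manager anywhere in this code; the platform supplies all of that around the function.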

Available Options

When it comes to the serverless paradigm, each of the major cloud IAAS providers has launched its own option. Here is a quick summary of the options available:

IAAS	Serverless Paradigm	Supported Environments
Amazon Web Services	AWS Lambda	Node.js, Java, Python, C# (.NET Core)
Microsoft Azure	Azure Functions	Node.js, C#, F#, Python, PHP, and shell
Google Cloud	Cloud Functions	Node.js

Ref: A detailed comparison is available on Stack Overflow.

There are slight differences in the extent of support and capabilities but the process to initiate works as follows:

Select a development environment
Choose the amount of memory, execution timeout etc.
Set up a trigger for launch

Proof of Concept

In part, to test drive the paradigm and at the same time build something useful, I worked on two POCs.

Azure Function: Cachewarmer Function

When it comes to our web application, we use Entity Framework as the ORM. Considering the multi-tenant nature of the application and the volume of tables - context initialization takes an unexpectedly long time. It’s for this exact reason we had to build a mechanism to warm the context cache to initialize it and keep it ready for external requests.

Trigger: CRON

Dev Environment: shell

Description: I put together a sequence of cURL requests to ping a special endpoint on the web application which initiates a context load. Considering we have over 500 tenants, we had to batch the requests, and to avoid hitting the max execution time I had to split this into two separate functions.
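The actual function was a shell script of cURL calls; the sketch below only restates the batching idea in Python. The endpoint URL, the `tenant` query parameter and all helper names are hypothetical.

```python
from typing import Iterator, List, Sequence, Tuple
from urllib.parse import quote

# Hypothetical warm-up endpoint; the real one is not named in this post.
WARMUP_URL = "https://app.example.com/warmup"


def batches(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def warmup_urls(tenants: Sequence[str], base_url: str = WARMUP_URL) -> List[str]:
    """Build one ping URL per tenant, escaping the tenant name."""
    return [f"{base_url}?tenant={quote(tenant, safe='')}" for tenant in tenants]


def split_in_two(tenants: Sequence[str]) -> Tuple[List[str], List[str]]:
    """Divide the tenant list between two functions to stay under the time limit."""
    middle = (len(tenants) + 1) // 2
    return list(tenants[:middle]), list(tenants[middle:])
```

Each of the two functions would take one half from `split_in_two`, walk it with `batches`, and request every URL from `warmup_urls` in turn.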

Honestly, this was really a trivial function, but it is exactly why having a serverless architecture was justified. Not to forget, we were up and running within 20 mins.

AWS Lambda: Slackbot dxdb

This was, in retrospect, a solid use case. Let me take a deep dive into this one:

Purpose: As noted earlier, we have over 500 tenant databases. Querying them is pretty cumbersome when you have to connect to each one individually using SSMS and then run individual queries. For small queries to check data, it is far more useful to simply fire the query in the Slack channel and see the results. An unexpected benefit of using Slack is that one can also fire the query from the Slack mobile application and see the results on the go.

Features Supported:

Detect the DB to connect with intelligently from the schema
Support delayed responses. Some queries take longer to execute, while Slack only allows a window of 3 seconds for an immediate response.
Formatting output to the extent possible
Minimal error notifications

How does it work? Slack command: dxdb

Every invocation of the command makes a POST request to the AWS API Gateway with the command and the request text; in our case the query.
The AWS API Gateway invokes the AWS Lambda function dxdbExecuteSQL and passes the request params. Tip: The AWS API Gateway is probably the most underrated yet one of the most powerful and flexible services AWS has launched. Will explore this in the future.
The dxdbExecuteSQL function authenticates the request, does minimal checks on the kind of query (in our case only read-only queries are allowed) and then does two things. First, it formats the intermediate response in the form of an MSSQL prompt to be sent back to Slack through the API Gateway. Next, it invokes the dxdbDelayedSlackResponse Lambda function.
The dxdbDelayedSlackResponse Lambda function parses the query, identifies the tenant, fires the query, reads the results, formats the response and makes a POST request back to Slack.
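The first half of the flow above can be sketched as follows. This is a simplified Python illustration, not the project's code: the token check stands in for whatever authentication the real function performs, the prefix test is a deliberately naive read-only check, and the prompt format is a guess. The `token`, `text` and `response_url` fields are the ones Slack sends with a slash command.

```python
from urllib.parse import parse_qs

READ_ONLY_PREFIXES = ("select",)


def is_read_only(query: str) -> bool:
    """Naive check: accept only statements that begin with SELECT."""
    return query.strip().lower().startswith(READ_ONLY_PREFIXES)


def execute_sql_handler(body: str, expected_token: str, invoke_delayed) -> dict:
    """Acknowledge the command within Slack's window and hand off the real work.

    `invoke_delayed` is injected so the sketch runs without AWS; in Lambda it
    could wrap an asynchronous invocation of the second function.
    """
    # Slack posts slash commands as a form-encoded body.
    params = {key: values[0] for key, values in parse_qs(body).items()}
    if params.get("token") != expected_token:
        return {"response_type": "ephemeral", "text": "Unauthorized request."}
    query = params.get("text", "")
    if not is_read_only(query):
        return {"response_type": "ephemeral", "text": "Only read-only queries are allowed."}
    # Hand the slow part to the delayed responder, which posts to response_url.
    invoke_delayed({"query": query, "response_url": params.get("response_url")})
    return {"response_type": "in_channel", "text": f"1> {query}\n2> GO"}
```

Splitting the work this way is what keeps the immediate reply inside the 3-second window: the first function never waits on the database.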

Although the setup is complex and layered, I only had to focus on the workflow and the business logic; picking an instance, setting it up and keeping it running was not something I had to worry about. Another interesting thing about this setup is that the function is not running all the time: it is only executed on invocation, and the icing on the cake is that you are only billed for the time it executes, in increments of 100ms.
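As a quick worked example of billing in 100ms increments, the sketch below rounds an execution time up to the next increment. The `billed_ms` helper is hypothetical and says nothing about actual prices.

```python
import math


def billed_ms(duration_ms: float, increment_ms: int = 100) -> int:
    """Round an execution time up to the next billing increment."""
    return math.ceil(duration_ms / increment_ms) * increment_ms


print(billed_ms(230))  # 300
print(billed_ms(100))  # 100
```

A query that runs for 230ms is billed as 300ms, and nothing is billed while the function sits idle between invocations.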

Code: The project is available on GitHub.

Follow-up Thoughts

Going serverless is an extension of adopting the cloud but demands a change in the thought process of layering your architecture. The recent trend around microservices-based architecture also fits well with the serverless paradigm.

Interestingly, each of the cloud services offers a minimal code editor. I can see how in the future you could probably have a full-fledged IDE available at your disposal. Looking at the pace of innovation, we are another step closer to not just programming for the cloud but literally in the cloud.
