This document describes in more detail how the Instart platform generally functions and how its components are structured.
This document assumes that the Instart CDN is being used. Some of our features can be used with other CDNs.
First, it will help to describe at a high level what happens when an end user requests a URL from a site that is using the Instart service.
After that, we'll discuss the components of the service and what their functions are.
Following a browser request through the service
Let's trace what happens when someone sits down at their computer, opens a browser and sends a request to a URL that is routed through the Instart service.
The user's browser sends the request for a web page. The request first goes to DNS to resolve the domain name to an IP address. Using Global Load Balancing, DNS returns the IP address of the closest Instart server rather than that of the customer's actual origin server. Armed with this information, the browser sends its request to that IP address.
The request arrives at the Instart server. Here, the request is examined, and the domain name is used to select and apply the proper configuration for the Instart customer (which specifies their domains, which optimizations they have enabled or disabled, etc.). The server then checks whether this request has been seen before and might therefore be in the cache.
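The configuration lookup described above can be sketched as follows. This is a minimal illustration, not the service's actual implementation: the PropertyConfig class, its field names, and both helper functions are hypothetical, and the sketch assumes the domain is taken from the request's Host header.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class PropertyConfig:
    """Hypothetical per-customer settings; the field names are illustrative."""
    customer: str
    domains: frozenset
    image_transcoding: bool = False
    html_streaming: bool = False


def build_domain_index(configs: Iterable[PropertyConfig]) -> Dict[str, PropertyConfig]:
    """Map each configured domain (lowercased) to its customer configuration."""
    index = {}
    for cfg in configs:
        for domain in cfg.domains:
            index[domain.lower()] = cfg
    return index


def config_for_host(index: Dict[str, PropertyConfig], host_header: str) -> Optional[PropertyConfig]:
    """Select the configuration for a request from its Host header value."""
    # Drop an optional port and normalize case (hostnames are case-insensitive).
    # IPv6 literals are not handled in this sketch.
    host = host_header.split(":", 1)[0].strip().lower()
    return index.get(host)
```

A request whose host matches no configured domain returns None here; how the real service treats such requests is not covered in this document.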
The service then performs any applicable processing. Let's say the response to the user's request includes dynamically-generated HTML and JPG images, and that Image Transcoding and HTML Streaming are both enabled for this domain. As it sends the HTML and the image files on to the requesting user, the service also directs copies of the HTML and the JPG files to the HTML processing service and the image processing service, respectively.
The image processing service transcodes the image data to reduce its size. Using SmartVision, it automatically determines the appropriate amount of quality reduction to apply to each image. Most images can be reduced in quality enough to cut the number of bytes significantly without noticeably affecting their appearance.
Meanwhile, the HTML processing service externalizes and caches the HTTP headers and the contents of the HTML head element.
Now let's consider what happens when a subsequent user requests this same page (after the HTML Streaming service has seen enough requests to learn how to externalize cacheable content – by default, after the fifth request).
Again, the service examines the request. This time, it finds that the page's elements are in the cache. It looks at the cache control headers and determines that the cached content is still fresh. It now does the following:
- sends the first part of the HTML which is non-unique (such as the HEAD section)
- sends the Nanovisor
- sends the transcoded image
In the user's browser, the non-unique HTML arrives first, warming up the browser. The Nanovisor is then loaded and executed, which starts to download the remaining non-unique parts of the page.
The result is that the page and its images are displayed faster in the browser, and the page appears and is ready for interaction almost instantaneously.
Instart service network architecture
The following sections describe the major building blocks of the Instart service. First, we'll discuss the cloud part of the client-cloud architecture. This will be followed by a description of the client part, the Nanovisor.
At each of Instart's PoPs, the service, running on a cluster of servers, directly handles the requests from users' browsers. It consists of several major components:
High-performance proxy web server
Instart uses the open-source, high-performance HTTP server and reverse proxy Nginx. Unlike traditional servers, Nginx doesn't rely on threads to handle requests. Instead it uses a much more scalable event-driven (asynchronous) architecture. This architecture uses small, but more importantly, predictable amounts of memory under load. Nginx has an active community of developers and powers a large number of high-traffic sites, such as Netflix, Hulu, Pinterest, and many others.
Web application partitioning and streaming services
Partitioning is the process of converting a web application and its component parts into many smaller parts without changing the object. The service then arranges the parts into an optimal order to allow applications to display them and to become interactive after only a partial download of data.
A series of services run to handle the data coming from the customer origin servers and apply optimizations to those parts of the data that are enabled for that particular customer:
- HTML Streaming service
- Image processing service – transcoding and adaptation
The streaming services work with the cache to collect fragments that have already been processed and stored. For assets that are not currently in the cache or cannot be cached, such as dynamic data, they pass the requests to the customer origin servers and fetch the content from there.
When requests come in from browsers, the page elements are identified, tracked and stored in full form in the distributed cache. Additionally, as they are processed by the fragmenting and streaming services, the fragmented parts are stored there as well. The system generates a unique ID to use for cache lookup. The ID is based on the request URL. By default this includes the protocol and the query string as a part of the URL, unless your property configuration has been explicitly set to ignore either or both.
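As an illustration of the lookup ID described above, the following sketch derives a cache key from the request URL. The function name, the ignore_protocol and ignore_query parameters, and the use of a SHA-256 digest are all assumptions made for the example; the document does not specify how the service actually computes the ID.

```python
import hashlib
from urllib.parse import urlsplit


def cache_key(url: str, ignore_protocol: bool = False, ignore_query: bool = False) -> str:
    """Build a lookup ID from a request URL (hypothetical scheme)."""
    parts = urlsplit(url)
    # By default the protocol and the query string are part of the key;
    # either can be left out to mimic a property configured to ignore it.
    scheme = "" if ignore_protocol else parts.scheme.lower()
    query = "" if ignore_query else parts.query
    canonical = f"{scheme}://{parts.netloc.lower()}{parts.path}?{query}"
    # Hashing gives a fixed-length ID; the choice of SHA-256 is an assumption.
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

With the defaults, http://example.com/a.jpg and https://example.com/a.jpg produce different keys and are cached separately. Passing ignore_protocol=True makes them share one cache entry.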
For cached file invalidation/expiration, we treat the customer's origin headers as authoritative and act on their HTTP response headers. The system can also override this behavior for certain types of files or paths as needed. (For details on how the service handles caching, see the document How the Instart Service Handles Caching.)
When an HTTP request arrives and the local cache has a copy of the requested URL, the cache needs to ensure the copy (the one sent in the last response from the origin server) is still fresh. We do this as follows:
First we check whether the last response contains a Cache-Control: max-age header. The current age is taken as now minus the time in the Date header. If the current age is less than the max-age value, the copy is fresh; if not, the copy is stale.
If we don't find the max-age header, we then look for the Expires header. If now is earlier than the date from the Expires header, the copy is fresh; if not, the copy is stale.
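The two freshness checks above can be sketched as follows. This is a simplified illustration under stated assumptions, not the service's code: the is_fresh function and its dictionary-of-headers input are hypothetical, and a response with neither header (or with an unparseable date) is treated as stale, which the text above does not specify.

```python
import re
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def _parse_http_date(value):
    """Parse an HTTP date string; return None if it is not a valid date."""
    try:
        parsed = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None
    # Treat a date without zone information as UTC.
    return parsed if parsed.tzinfo else parsed.replace(tzinfo=timezone.utc)


def is_fresh(headers: dict, now: datetime) -> bool:
    """Apply the max-age check first, then fall back to Expires."""
    match = re.search(r"\bmax-age\s*=\s*(\d+)", headers.get("Cache-Control", ""))
    date = _parse_http_date(headers.get("Date", ""))
    if match and date is not None:
        # Current age = now minus the time in the Date header.
        current_age = (now - date).total_seconds()
        return current_age < int(match.group(1))
    expires = _parse_http_date(headers.get("Expires", ""))
    if expires is not None:
        return now < expires
    # Assumption: with no usable freshness information, treat the copy as stale.
    return False
```

For example, a response dated 30 minutes ago with Cache-Control: max-age=3600 is fresh, while the same response with max-age=60 is stale.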
You can also manually purge the cache. There are two ways: through the customer portal web interface, or through a call to the service's Cache Management API. Using the portal is described in the document Purging Your Cache in the Portal. Using the API is described in the Cache Purge API Guide.
The caching service can also do object revalidation, which allows our service to do a lightweight check with your origin web server when a static object cached in our system expires. If the object is still the same, we update the expiry time for the existing object in our system without needing to re-download the object. Previously our system would remove and re-request objects once they expired. In the case of images, this required us to re-run image processing operations such as transcoding on the object. This feature reduces the load on your back-end origin servers and on our service.
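The revalidation flow can be sketched as follows. The document does not say which mechanism the lightweight check uses; this example assumes a standard HTTP conditional request keyed on an ETag, where the origin answers 304 when the object is unchanged. The CachedObject class, the revalidate function, and the fetch callable are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class CachedObject:
    body: bytes
    etag: str
    expires_at: float  # seconds since the epoch


def revalidate(
    cached: CachedObject,
    fetch: Callable[[str], Tuple[int, bytes, str]],
    now: float,
    ttl: float,
) -> CachedObject:
    """Revalidate an expired object.

    fetch(if_none_match) stands in for a conditional request to the origin
    and returns (status, body, etag).
    """
    status, body, etag = fetch(cached.etag)
    if status == 304:
        # Unchanged at the origin: extend the expiry and keep the stored body,
        # so nothing is re-downloaded or re-processed.
        cached.expires_at = now + ttl
        return cached
    # Changed at the origin: replace the stored object with the new response.
    return CachedObject(body=body, etag=etag, expires_at=now + ttl)
```

In the unchanged case the same object is returned with a later expiry, which is what lets previously transcoded images be reused.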
The Cache-Control headers (together with the property configuration) enable the proxy to determine how fresh a cached object is. Any request for a non-stale cached object results in the cached object being sent back to the client.
Information about the traffic passing through the service is monitored to provide statistics and customer billing data. These statistics are displayed in the customer portal web interface. Monitoring services are also used to check and report on the health and performance of the service.
The service works with a centralized configuration management and command/control system. The configuration values are a combination of customer settings and internal Instart configuration values. The configuration services system has extensive validation mechanisms to ensure that only safe and correct settings are deployed across the service.
For older, non-HTML5-compliant browsers such as Internet Explorer 8, the Nanovisor is not sent to the client. In this case the system automatically falls back to delivering full objects from the cache and providing best-in-class CDN levels of performance.