nik codes

Squeezing the Most Into the New W3C Beacon API

The Setup

Many websites build a signaling mechanism that, without user action, sends analytics or diagnostics information back to a server for further analysis. I’ve created one at least half a dozen times to capture all sorts of information: JavaScript errors, browser and device capabilities, client-side click paths; the list goes on and on. In fact, the list is getting longer, with the W3C’s Web Performance Working Group cranking out lots of great Real User Metrics (RUM) specifications for in-browser performance diagnostics like Navigation Timing, Resource Timing, User Timing and the forthcoming Navigation Error Logging and Frame Timing.

The signaling code, often called a beacon, has traditionally been implemented in many different ways:

  • A JavaScript based timer which regularly and repeatedly fires AJAX requests to submit the latest data gathered.
  • Special cookies that get attached to the next “natural” request the browser makes, paired with special server-side processing code.
  • Synchronous requests made during unload. (Browsers usually ignore asynchronous requests made during unload, so they can’t be trusted.)
  • Tracking “pixels”; small non-AJAX requests with information encoded into URL parameters.
  • 3rd party solutions, like Google Analytics, which internally leverage one of the options listed above.

Unfortunately, each of these techniques has downsides. Either the amount of data that can be transferred is severely limited, or the act of sending it has negative effects on performance. We need a better way, and that’s where the W3C’s new Beacon API comes into play.

The Solution

With the new Beacon API, data can be posted to the server during the browser’s unload event without blocking the browser. The code is rather simple and works as expected:

window.addEventListener('unload', function () {
      var rum = {
              navigation: performance.timing,
              resources: performance.getEntriesByType('resource'),
              marks: performance.getEntriesByType('mark'),
              measures: performance.getEntriesByType('measure')
      };
      // reduce() is not defined in this post; presumably it applies the
      // payload size reductions discussed below
      rum = reduce(rum);
      navigator.sendBeacon('/rum/submit', JSON.stringify(rum));
}, false);

The Catch

Unfortunately, as of this writing, the Beacon API is not as widely supported as you’d hope. Chrome 39+, Firefox 31+ and Opera 26+ currently support the API. It isn’t supported in Safari and the Internet Explorer team has it listed as “Under Consideration”.

The other catch, and this is the biggie to me, stems from this note about navigator.sendBeacon() in the spec:

If the User Agent limits the amount of data that can be queued to be sent using this API and the size of data causes that limit to be exceeded, this method returns false.

The specification allows the browser to refuse to send the beacon data (thus returning false) if it deems you’re trying to send too much. At this point, Chrome is the only browser that limits the amount of data that can be sent. Its limit is set at 64 KB (65,536 bytes exactly).
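Because sendBeacon() can refuse a payload, it is worth checking its return value rather than assuming the data was queued. The following is a minimal sketch, not a definitive implementation: trySendBeacon, utf8ByteLength and MAX_BEACON_BYTES are hypothetical names, and the 65,536 byte ceiling is simply the Chrome figure quoted above, so other browsers may behave differently.

```javascript
// Hypothetical ceiling based on the Chrome figure quoted in this post;
// other browsers may use a different limit or none at all.
var MAX_BEACON_BYTES = 65536;

// Size of a string once encoded as UTF-8.
function utf8ByteLength(text) {
    return new TextEncoder().encode(text).length;
}

// Returns true only if the browser accepted the beacon for delivery.
// `nav` defaults to the global navigator; it is a parameter so the
// function can also be exercised outside a browser.
function trySendBeacon(url, data, nav) {
    nav = nav || (typeof navigator !== 'undefined' ? navigator : undefined);
    if (!nav || typeof nav.sendBeacon !== 'function') {
        return false; // Beacon API unavailable
    }
    var body = JSON.stringify(data);
    if (utf8ByteLength(body) > MAX_BEACON_BYTES) {
        return false; // assumed to be too large, so don't try to queue it
    }
    return nav.sendBeacon(url, body);
}
```

A false result is the cue to fall back to one of the older techniques or to shrink the payload further before trying again.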

A Workaround?

To be fair, 64KB sure seems like a lot of data, and it is, but I’ve found myself in the situation where I was unable to beacon back diagnostics information on heavy pages because they had just too much Resource Timing data to send. Being unable to send diagnostics data on the worst performing pages really misses the point of the working group’s charter. Further, this problem will only get worse as more diagnostics information becomes available via all the RUM specifications I mentioned at the top of this post. That said, I’ve implemented several ways to reduce a beacon’s payload size without actually losing or giving up any data:

1. Use DOMString over FormData

The Beacon API allows you to submit four data types: ArrayBufferView, Blob, DOMString or FormData. Given that we want to submit RUM data, FormData and DOMString are the only two we can use. (ArrayBufferView and Blob are for working with arrays of typed numeric data and raw file-like objects.)

FormData seems like a natural way to go, particularly because model binding engines in frameworks like ASP.NET MVC and Rails work directly with it. However, you’ll save a few bytes by using a DOMString and accessing the request body directly on the server.

For simplicity in both encoding and parsing, I encode the data via JSON. (Though you could try a more exotic format for larger gains.) On the server, with JSON.NET you can parse the request body directly like this:

var serializer = new JsonSerializer();
Rum rum;
using (var sr = new StreamReader(Request.InputStream))
using (var tr = new JsonTextReader(sr))
     rum = serializer.Deserialize<Rum>(tr);

2. Make Fewer HTTP Requests

My beacon payload size issues arose on pages that had lots of resources (images, scripts, stylesheets, etc.) to download, which yielded very large arrays of Resource Timing objects. Reducing the number of HTTP requests the page makes (by combining scripts and stylesheets and using image sprites) not only helps with page performance, but also reduces the amount of data provided by the Resource Timing API, which in turn reduces beacon payload sizes.

3. Use Positional Values

As mentioned above, the Resource Timing API yields an array of objects. The User Timing API does the same thing. The problem with JSON encoding arrays of objects is that the keys of every key/value pair are repeated over and over again for each array item. This repetition adds up quite quickly.

Instead, I use a simpler array of arrays structure in which individual values are referenced by position. Here’s the JavaScript to convert from a User Timing API array of objects to an array of arrays:

// convert to [name, duration, startTime]
rum.marks = rum.marks.map(function (e) {
     return [e.name, e.duration, e.startTime];
});
// convert to [name, duration, startTime]
rum.measures = rum.measures.map(function (e) {
     return [e.name, e.duration, e.startTime];
});

On the server I use a custom JSON.NET converter to parse the positional values:

using System;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;

public class UserTimingConverter : JsonConverter
{
     public override object ReadJson(JsonReader reader, 
                                     Type objectType, 
                                     object existingValue, 
                                     JsonSerializer serializer)
     {
         // Each entry arrives as [name, duration, startTime]
         var array = JArray.Load(reader);
         return new UserTiming
         {
             Name = array[0].ToString(),
             Duration = array[1].ToObject<double>(),
             StartTime = array[2].ToObject<double>()
         };
     }
     // ...
}
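The same positional idea carries over to Resource Timing entries, which are usually the bulk of the payload. Here is a rough sketch under stated assumptions: toPositional and RESOURCE_FIELDS are hypothetical names, the field order is arbitrary (client and server just have to agree on it), and the sample entries are invented values shaped like Resource Timing objects.

```javascript
// Assumed field order; the server must read values back in the same order.
var RESOURCE_FIELDS = ['name', 'initiatorType', 'startTime', 'duration'];

// Turns an array of objects into an array of arrays, one value per field.
function toPositional(entries, fields) {
    return entries.map(function (entry) {
        return fields.map(function (field) {
            return entry[field];
        });
    });
}

// Invented sample data, shaped like entries from
// performance.getEntriesByType('resource').
var sample = [
    { name: '/content/style.css', initiatorType: 'link', startTime: 12.5, duration: 30.25 },
    { name: '/content/images/logo.png', initiatorType: 'img', startTime: 40, duration: 18.75 }
];

var asObjects = JSON.stringify(sample);
var asArrays = JSON.stringify(toPositional(sample, RESOURCE_FIELDS));

// Compare the two encodings; the positional form drops every repeated key.
console.log(asObjects.length, asArrays.length);
```

The savings grow with the number of entries, since each additional object would otherwise repeat every key name.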

4. Derive Data on Client

Depending on the requirements, it may be feasible to send fewer values by making some simple derivations on the client. Why send both domainLookupEnd and domainLookupStart if all that’s required is subtracting one from the other to get the domainLookupTime? The more that’s derived on the client, the less raw data there is to send across the wire.
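As an illustration, a derivation step might look something like the sketch below. The input property names come from the Navigation Timing API; the function name and the output property names other than domainLookupTime are assumptions made up for this example, and the sample timestamps are invented.

```javascript
// Collapses pairs of Navigation Timing timestamps into durations (ms).
// `t` is expected to look like performance.timing.
function deriveNavigationTimes(t) {
    return {
        domainLookupTime: t.domainLookupEnd - t.domainLookupStart,
        connectTime: t.connectEnd - t.connectStart,
        responseTime: t.responseEnd - t.responseStart,
        loadTime: t.loadEventEnd - t.navigationStart
    };
}

// Invented timestamps standing in for performance.timing.
var sampleTiming = {
    navigationStart: 1000,
    domainLookupStart: 1005,
    domainLookupEnd: 1025,
    connectStart: 1025,
    connectEnd: 1060,
    responseStart: 1100,
    responseEnd: 1180,
    loadEventEnd: 2500
};

var derived = deriveNavigationTimes(sampleTiming);
console.log(JSON.stringify(derived));
```

Eight raw timestamps become four small numbers, at the cost of the server no longer seeing the absolute values.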

5. Shorten URLs

Resource Timing data, in particular, contains a lot of often redundant URL strings. There are many strategies to reduce URL redundancy:

  1. If all the data is being served from the same host, strip the scheme and domain from the URL entirely. (Basically, make it a relative URL.) For example, the absolute URL of the logo image becomes /content/images/logo.png
  2. Shorten common segments into “macros” of limited characters that can be re-expanded later, e.g. /content/images/logo.png becomes /{ci}/logo.png
  3. The folks at Akamai, who gather tons of Resource Timing data, leverage a tree-like structure to reduce redundancy even more. They structure their payload like this:
         {
             "http://": {
                 "": {
                     "content/style.css": [ /* array of values */ ],
                     "content/images/": {
                         "logo.png": [ /* array of values */ ],
                         "sprite.png": [ /* array of values */ ]
                     }
                 }
             }
         }
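The first two strategies can be sketched in a few lines. This is only an illustration under stated assumptions: the MACROS table, the function names and the www.example.com origin are all made up for the example, and it does not attempt the tree structure from the third strategy.

```javascript
// Hypothetical macro table: common path prefixes and their short tokens.
var MACROS = [
    ['/content/images/', '/{ci}/'],
    ['/content/scripts/', '/{cs}/']
];

// Drops scheme and host when the resource lives on the page's own origin.
// Note that any #fragment is dropped as well.
function toRelative(url, pageOrigin) {
    var parsed = new URL(url, pageOrigin);
    if (parsed.origin !== pageOrigin) {
        return url; // third-party resource: keep the absolute URL
    }
    return parsed.pathname + parsed.search;
}

// Makes the URL relative, then swaps a known prefix for its macro.
function shorten(url, pageOrigin) {
    var result = toRelative(url, pageOrigin);
    for (var i = 0; i < MACROS.length; i++) {
        if (result.indexOf(MACROS[i][0]) === 0) {
            return MACROS[i][1] + result.slice(MACROS[i][0].length);
        }
    }
    return result;
}

// Reverses the macro substitution (for example, in a Node-based consumer).
function expand(shortUrl) {
    for (var i = 0; i < MACROS.length; i++) {
        if (shortUrl.indexOf(MACROS[i][1]) === 0) {
            return MACROS[i][0] + shortUrl.slice(MACROS[i][1].length);
        }
    }
    return shortUrl;
}

var origin = 'https://www.example.com'; // invented origin for the example
console.log(shorten(origin + '/content/images/logo.png', origin));
```

The macro tokens use braces on the assumption that real paths on the site never start with a segment like {ci}; if they can, a different token format would be needed to keep expansion unambiguous.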

6. Leverage HTTP Headers

Not all data needs to be included in the beacon payload itself. The server can still gather some diagnostics information from the standard HTTP headers from the beacon’s request. These include things like:

  • Referer (the referring page)
  • User-Agent for browser and device information
  • Application specific user data from cookies
  • Environment specific data via X-Forwarded-For and other similar headers
  • IP Address, and thus approximate geographical location (not technically a header)
  • Date and Time (also not a header, but calculated easily on server)
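To make the idea concrete, here is a small sketch of pulling that context out of the beacon request on the server. It is written in JavaScript against a request object shaped like Node’s http.IncomingMessage (lower-cased header names, a socket with remoteAddress), which is an assumption of this example rather than the ASP.NET stack used elsewhere in this post; contextFromRequest and its output property names are hypothetical.

```javascript
// Gathers diagnostics context from the request itself, so none of it
// has to travel in the beacon payload.
function contextFromRequest(req) {
    var headers = req.headers || {};
    var forwarded = headers['x-forwarded-for'];
    return {
        referrer: headers['referer'] || null,
        userAgent: headers['user-agent'] || null,
        // The first X-Forwarded-For entry is the client as reported by
        // proxies; it can be spoofed, so treat it as a hint only.
        clientIp: forwarded
            ? forwarded.split(',')[0].trim()
            : (req.socket && req.socket.remoteAddress) || null,
        // Timestamp taken on the server at the moment of processing.
        receivedAt: new Date().toISOString()
    };
}
```

The result can be merged with the parsed beacon body before it is stored.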

With this collection of techniques, you should be able to squeeze a little more out of the Beacon API. If you’ve found another way to shave off a few bytes, let me know in the comments.

6 thoughts on “Squeezing the Most Into the New W3C Beacon API”

  2. Nik, nice writeup. A couple of quick comments…

    Chrome limit is 64kb, not 16. Also, the “unload” use case is *a use case*, but it is probably one of the least interesting (and has some gotchas, see below): you can report some overall stats like time on site or other metrics. More likely, you’d want to use Beacon as part of a click handler: capture the navigation target and Beacon it to your server for analytics while allowing the browser to proceed with the navigation (the unload event fires right after..). Also, you can (read, should) use beacon for any/all reporting.. e.g. you can enable a flag in your Google Analytics config to log all analytics via Beacon; the API is not limited to the “I’m navigating away” case.

    Finally, one gotcha with “onunload” case: registering an unload handler will invalidate your back-button cache [1], which is probably not something that you want for most sites.


    • nikmd23 said:

      Ilya, thanks so much for the comment!

      Do you know if the beacon payload data limit changed in Chrome 39? My original testing was against an earlier version where I had to enable the #enable-experimental-web-platform-features flag. I just tested again in 39, and you’re right – 64KB all the way.

      Thanks for the other tips as well. I might work these into a follow up blog post I am thinking about with other Beacon tips.

  4. salman said:

    Is this blog entry copy-righted? I would like to add it to the wikipedia entry for web beacons as it is currently completely outdated and needs re-writing
