nik codes

Archive for the month “December, 2014”

Squeezing the Most Into the New W3C Beacon API

Pluralsight If you like my blog, you’ll love my Pluralsight courses:
Tracking Real World Web Performance
WebPageTest Deep Dive

The Setup

It’s common for many websites to build a signaling mechanism that, without user action, sends analytics or diagnostics information back to a server for further analysis. I’ve created one at least a half a dozen times to capture all sorts of information: JavaScript errors, browser and device capabilities, client side click paths, the list goes on and on. In fact, the list is actually getting longer with the W3C’s Web Performance Working Group cranking out lots of great Real User Metrics (RUM) specifications for in-browser performance diagnostics like Navigation Timing, Resource Timing, User Timing and the forthcoming Navigation Error Logging and Frame Timing.

The signaling code, often called a beacon, has traditionally been implemented in many different ways:

  • A JavaScript based timer which regularly and repeatedly fires AJAX requests to submit the latest data gathered.
  • Writing special cookies that become attached to the next “natural” request the browser makes and special server side processing code. 
  • Synchronous requests made during unload. (Browsers usually ignore asynchronous requests made during unload, so they can’t be trusted.)
  • Tracking “pixels”; small non-AJAX requests with information encoded into URL parameters.
  • 3rd party solutions, like Google Analytics, which internally leverage one of the options listed above.

Unfortunately, each of these techniques have downsides. Either the amount of data that can be transferred is severely limited, or the act of sending it has negative affects on performance. We need a better way, and that’s where the W3C’s new Beacon API comes into play.

The Solution

With the new Beacon API, data can be posted to the server during the browsers unload event, without blocking the browser, in a performant manner. The code is rather simple and works as expected:

window.addEventListener('unload', function () {
      var rum = {
              navigation: performance.timing,
              resources: performance.getEntriesByType('resource'),
              marks: performance.getEntriesByType('mark'),
              measures: performance.getEntriesByType('measure')
          };
      rum = reduce(rum);
      navigator.sendBeacon('/rum/submit', JSON.stringify(rum));
}, false);

The Catch

Unfortunately, as of this writing, the Beacon API is not as widely supported as you’d hope. Chrome 39+, Firefox 31+ and Opera 26+ currently support the API. It isn’t supported in Safari and the Internet Explorer team has it listed as “Under Consideration”.

The other catch, and this is the biggie to me, stems from this note about navigator.sendBeacon() in the spec:

If the User Agent limits the amount of data that can be queued to be sent using this API and the size of data causes that limit to be exceeded, this method returns false.

The specification allows the browser to refuse to send the beacon data (thus returning false) if it deems you’re trying to send too much. At this point, Chrome is the only browser that limits the amount of data that can be sent. Its limit is set at right around 16KB 64 KB (65,536 bytes exactly).

A Workaround?

To be fair, 16KB 64KB sure seems like a lot of data, and it is, but I’ve found myself in the situation where I was unable to beacon back diagnostics information on heavy pages because they had just too much Resource Timing data to send. Being unable to send diagnostics data on the worst performing pages really misses the point of the working group’s charter. Further, this problem will only get worse as more diagnostics information becomes available via all the RUM specifications I mentioned at the top of this post. That said, I’ve implemented several ways to reduce a beacon’s payload size without actually losing or giving up any data:

1. Use DOMString over FormData

The Beacon API allows you to submit four data types: ArrayBufferView, Blob, DOMString or FormData. Given that we want to submit RUM data, FormData and DOMString are the only two we can use. (ArrayBufferView and Blob are for working with arrays of typed numeric data and raw file-like objects.)

FormData seems like a natural way to go, particularly because model binding engines in frameworks like ASP.NET MVC and Rails work directly with them. However, you’ll save a few bytes by using a DOMString and accessing the request body directly on the server.

For simplicity in both encoding and parsing, I encode the data via JSON. (Though you could try a more exotic format for larger gains.) On the server, with JSON.NET you can parse the request body directly like this:

var serializer = new JsonSerializer();
Rum rum;
using (var sr = new StreamReader(Request.InputStream))
using (var tr = new JsonTextReader(sr))
{
     rum = serializer.Deserialize<Rum>(tr);
}

2. Make Fewer HTTP Requests

My beacon payload size issues arose on pages that had lots of resources (images, scripts, stylesheets, etc) to download, which yielded very large arrays of Resource Timing objects. Reducing the number of HTTP requests that the page was making (by combing scripts and stylesheets and using image sprites) not only helps with page performance, but also reduced the amount of data provided by the Resource Timing API which in turn reduces beacon payload sizes.

3. Use Positional Values

As mentioned above, The Resource Timing API yields an array of objects. The User Timing API does the same thing. The problem with JSON encoding arrays of objects is that all the keys for each key/value pair is repeated over and over again for each array item. This repetition adds up quite quickly.

Instead, I use a simpler array of arrays structure in which individual values are referenced by position. Here’s the JavaScript to convert from a User Timing API array of objects to an array of arrays:

// convert to [name, duration, startTime]
rum.marks = rum.marks.map(function (e) { 
     return [e.name, e.duration, e.startTime]; 
});
 
// convert to [name, duration, startTime] 
rum.measures = rum.measures.map(function (e) { 
     return [e.name, e.duration, e.startTime]; 
});

On the server I use a custom JSON.NET converter to parse the positional values:

public class UserTimingConverter : JsonConverter
{
     public override object ReadJson(JsonReader reader, 
                                     Type objectType, 
                                     object existingValue, 
                                     JsonSerializer serializer)
     {
         var array = JArray.Load(reader);
         return new UserTiming
         {
             Name = array[0].ToString(),
             Duration = array[1].ToObject<double>(),
             StartTime = array[2].ToObject<double>()
         };
     }
     // ...
}

4. Derive Data on Client

Depending on the requirements, it may be feasible to send fewer values by making some simple derivations on the client. Why send both domainLookupEnd and domainLookupStart if all that’s required is subtracting one from the other to get the domainLookupTime? The more that’s derived on client, the less raw data to send across the wire.

5. Shorten URL’s

Resource Timing data, in particular, contains a lot of often redundant URL strings. There’s many strategies to reduce URL redundancy:

  1. If all the data is being served from the same host, strip the domain and scheme from the URL entirely. (Basically make it a relative URL.) For example: http://domain.com/content/images/logo.png becomes /content/images/logo.png
  2. Shorten common segments into “macros” of limited characters that can be re-expand later. e.g.: /content/images/logo.png becomes /{ci}/logo.png
  3. The folks at Akami, who gather tons of Resource Timing data, leverage a tree like structure to reduce redundancy even more. They structure their payload like this:
    {
         "http://": {
             "domain.com/": {
                 "content/style.css": [ /* array of values */ ],
                 "content/images/": {
                     "logo.png": [ /* array of values */ ],
                     "sprite.png": [ /* array of values */ ]
                 }
             }
         }
    }

6. Leverage HTTP Headers

Not all data needs to be included in the beacon payload itself. The server can still gather some diagnostics information from the standard HTTP headers from the beacon’s request. These include things like:

  • Referrer
  • UserAgent for browser and device information
  • Application specific user data from cookies
  • Environment specific data via X-Forwarded-For and other similar headers
  • IP Address, and thus approximate geographical location (not technically a header)
  • Date and Time (also not a header, but calculated easily on server)

With this collection of techniques, you should be able to squeeze a little more out of the Beacon API. If you’ve found another way to shave off a few bytes, let me know in the comments.

Advertisements

Developer Arcade

During the holiday season I find myself reflecting over the previous year and celebrating various traditions with family and friends alike.

One of my fondest Christmas memories was back in ‘96 when my brother and I got a Nintendo 64. I remember the two of us playing Wave Race 64 together in his room all night long.

Because of this memory I tend to think about video games, which I don’t typically play, in the lead up to Christmas. I often have a desire to unwind playing them, just a little bit, in an attempt to relive those simpler days.

arcade

So this year, I began looking for games that I could play without any guilt. I looked for games that would make me a better software developer, and I found a ton! So if you find yourself bored for a few hours over the holiday break, why not try one of these games out and improve your skills?

Txt Based Games

  • Typing.io: Improve your touch typing skills with this simple little game made specifically for programmers who have to type a lot more curly braces and angle brackets than the standard Mavis Beacon player.
  • VIM Adventures: Keep your fingers moving in this Zelda-like overhead scroller. You’ll need to learn VIM commands, motions and operators to control your character to victory.
  • Terminus: Learn to navigate your way around the Unix shell with common commands in this Zork like text adventure.

Design Eye

  • What The Color?: How well do you understand the #hex, rbg() and hsl() color space used in CSS? This game from Lea Verou times how long it takes for you to guess a particular color. Hint: HSL is much easier to navigate once you get used to it!
  • RGB Challenge: Basically What The Color in reverse. You are given an RGB value and three colors to choose from. Selecting the proper color is more difficult that it should be. My high score is 7.
  • What px?: Very similar to What The Color, What px? challenges you with CSS length units by guessing the width of various elements.
  • Pixact.ly: Another CSS lengths game, Pixact.ly asks you to draw boxes of various sizes.
  • The Bezier Game: Learn how to use the Pen tool, common in SVG/vector editing software, in this advanced connect-the-dots style game.
  • Hex Invaders: (Added June 11, 15) A “more fun” version of RGB Challenge with gameplay mechanics similar to Space Invaders.

Web Development

  • HTML5 Element Quiz: Quite possibly my favorite of the games, this quiz challenges you to name as many HTML5 elements as you can in five minutes. As a life long web dev, I was surprised how many I wasn’t able to remember – but even better, I learned about many new ones I’ve never used before.
  • CSS Diner: I’m also a big fan of this game. CSS Diner teaches you CSS3 query selector syntax by graphically having you select items from a bento box that have been arranged on a dining room table.
  • XSS: Learn to think like a hacker with this security game from Google. In each level you are presented with an increasing “secure” web page, as well as tips to figure out how to exploit the page with some cross site scripting. This game is really eye opening!
  • Flexbox Froggy: This fun little game teaches you the ins and outs of the CSS layout system called Flexbox by challenging you to re-layout a screen so that a frog sits on his lilly pad.

Languages

  • CodeCombat: A very complete video game, with music, sound effects, animations and graphical styling’s, CodeCombat will teach you Python, JavaScript, CoffeeScript, Clojure or Lua by getting you to write simple little programs that lead your character through each level. Think of it as Logo with modern day bells and whistles. This one seems perfect for the kids in your life.
  • Regex Golf: Presented with both a whitelist and a blacklist of strings, can you write an expression that matches all of the whitelist, and none of the blacklist? Each level teaches you a little bit more about Regex operators.
  • Learn Some SQL: Created by a few co-workers of mine at Red Gate, Learn Some SQL shows you a table of data and you have to write the SQL statement that would select the same dataset.
  • Elevator Saga: Put your algorithm skills to the test by programming a bank of elevators to be as efficient as possible.

Miscellaneous

  • Learn Git Branching: Very similar in style to Learn Some SQL, this game shows you a diagram of a Git repository, and you have to branch, merge, rebase and commit your repository into the matching shape. I’ve shown this game to people learning Git and it really helped them.

I hope you enjoy one or two of these games as well as your holiday time. Post your high scores in the comments below – along with any games that are missing from the list.

Happy Holidays!

Post Navigation