
April 09, 2009

Iframes: thinking outside the box

Iframes have their uses, but they are not easy to deal with.

I added some text advertisements to our product this week. The standard technique for including advertising is to use an iframe. This works well for banner ads which come in well-known sizes.

I immediately ran into a problem with text ads in an iframe: there's no easy way to apply CSS to the contents of the iframe. Styles do not cascade through the iframe barrier. Normally, this is what you want, a self-contained unit on the page. It's fine for a banner ad, which requires no styling, but Times Roman text is jarring in a page of Arial.

It's difficult, perhaps outright impossible, to inject styles into an iframe coming from another domain.

Another problem is knowing how big to make the iframe. They don't autosize and the ad text could be one or more lines long.

I needed another way.

Making an Ajax call to fetch just the raw data that I cared about (title, body copy, link) was the obvious answer. A little wad of JSON would be much easier to deal with than trying to style an iframe. Unfortunately, the XMLHttpRequest object cannot make cross-domain calls. But I read up on JSONP last week, so I knew that I could inject a script tag into my HTML DOM and set its src attribute to the ad server's URL.

jQuery makes this easy: jQuery.getScript injects the script tag and removes it after the script has loaded.

We uploaded a custom template to the adserver:

setAdText({
    "adTitle": "%%TITLE%%",
    "bodyCopy": "%%BODYCOPY%%",
    "clickUrl": "%%CLICKURL%%"
});

I put this call to invoke the adserver's template in my page:

$.getScript(adServerUrl + cache_busting_random_token());

And the setAdText handler in my HTML page looks like this:

function setAdText(data)
{
    console.log("setAdText: adTitle=[%s], bodyCopy=[%s], clickUrl=%s.",
        data.adTitle, data.bodyCopy, data.clickUrl);
    // add the ad to the DOM
}

Problem solved.
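
One caveat about a template like the one above: if the ad server substitutes the values verbatim between the double quotes, a title containing a quote or a backslash would yield invalid JavaScript. Where the server side is under your control, building the payload with a JSON encoder sidesteps that. Here is a minimal Python sketch of producing a JSONP response; render_jsonp is a hypothetical helper, not something the ad server provides.

```python
import json


def render_jsonp(callback, payload):
    """Wrap a JSON-serializable payload in a call to a JavaScript callback."""
    # Only accept a plain identifier so the callback name cannot smuggle in code.
    # (str.isidentifier follows Python's rules, which are close to JavaScript's.)
    if not callback.isidentifier():
        raise ValueError("callback must be a plain identifier")
    return "%s(%s);" % (callback, json.dumps(payload))


body = render_jsonp("setAdText", {
    "adTitle": 'Say "hello"',
    "bodyCopy": "Two lines\nof copy",
    "clickUrl": "http://example.com/click",
})
print(body)
```

The script that comes back still calls setAdText with a single object, so the handler in the page does not change.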

March 30, 2009

Augmenting Python's strftime


The strftime function is the prescribed way to format dates and times in Python (and other languages). It has limitations, such as forcing a leading zero on days of the month, 01-31, and on 12-hour clock hours, 01-12.

I noticed that we were repeatedly writing expressions like these

d.strftime('%A, %B ') + str(d.day)
t.strftime("%I:%M").lstrip('0') + ('a' if t.hour < 12 else 'p')

and realized that there had to be a better way.

Here's a straightforward way to augment the directives: preprocess the format string, replacing new directives with their values, then let the underlying strftime implementation take care of the rest.

import re

_re_aux_format = re.compile("%([DiP])")

def strftime_aux(d, format):
    """
    Augmented strftime that handles additional directives.

    %D  Day of the month as a decimal number [1,31] (no leading zero)
    %i  Hour (12-hour clock) as a decimal number [1,12] (no leading zero)
    %P  'a' for AM, 'p' for PM

    >>> import datetime
    >>> d = datetime.datetime(2009, 4, 1, 9+12, 37)
    >>> strftime_aux(d, '%A, %B %d, %I:%M %p')
    'Wednesday, April 01, 09:37 PM'
    >>> strftime_aux(d, '%A, %B %D, %i:%M%P')
    'Wednesday, April 1, 9:37p'
    """

    # Precompute the values of the augmented directives
    directive_map = {
        'D': str(d.day),
        'i': '12' if d.hour in (0, 12) else str(d.hour % 12),
        'P': 'a' if d.hour < 12 else 'p',
    }
    # Substitute those values into the format string
    new_format = _re_aux_format.sub(
        lambda match: directive_map.get( match.group(1), ''),
        format)
    # Let the stock implementation of strftime handle everything else
    return d.strftime(new_format)

if __name__ == "__main__":
    import doctest
    doctest.testmod()
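
One edge case worth knowing about: the pattern %([DiP]) also matches the tail of an escaped percent sign, so a format such as '%%D' is rewritten before strftime ever sees the '%%'. The variant below is a sketch of one way around that, not a drop-in replacement: it matches every two-character directive, substitutes only the augmented ones, and passes the rest (including '%%') through untouched. The name strftime_aux_safe is made up for this example.

```python
import datetime
import re

# Match any "%x" pair so that "%%" is consumed as a unit and never split.
_re_any_directive = re.compile(r"%(.)", re.DOTALL)


def strftime_aux_safe(d, fmt):
    """Like strftime_aux, but leaves '%%' and the stock directives alone."""
    extra = {
        "D": str(d.day),                   # day of month, no leading zero
        "i": str(d.hour % 12 or 12),       # 12-hour clock, no leading zero
        "P": "a" if d.hour < 12 else "p",  # am/pm marker
    }

    def replace(match):
        code = match.group(1)
        # Substitute our own directives; hand everything else to strftime.
        return extra.get(code, match.group(0))

    return d.strftime(_re_any_directive.sub(replace, fmt))


d = datetime.datetime(2009, 4, 1, 21, 37)
print(strftime_aux_safe(d, "%D %i:%M%P"))
print(strftime_aux_safe(d, "%%D"))
```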

December 02, 2008

Bourne Shell-style Command Parsing in Erlang

At Cozi, our family-oriented software depends on several web services.  We have our own deployment tool, called Artemis, to deploy our web services.  Artemis has a simple web interface that processes commands using the same quoting rules as Unix's Bourne shell.

Had we written Artemis in Python, we could have simply used Python's built-in shlex module.  However, as we chose Erlang for its robust and intelligent built-in support for concurrency and network communication, I found there to be no built-in or easily available version, so I decided to write one myself (if any reader does know of such a thing, please add a comment).

The goals (and non-goals) of the Erlang version of shlex are as follows:

  • mimic the behavior of shlex.split, minus the posix and infile options
  • performance is not a major concern, as this code is called infrequently
  • try to use an Erlang-y coding style
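
For reference, this is roughly the behavior being mimicked, shown with Python's own shlex.split. The split_or_error wrapper is a hypothetical helper added here only to echo the {ok, ...} / {error, ...} tuples that the Erlang version returns; the error messages come from Python's shlex, not from the Erlang code.

```python
import shlex


def split_or_error(text):
    """Return ("ok", tokens) or ("error", message), echoing the Erlang tuples."""
    try:
        return ("ok", shlex.split(text))
    except ValueError as err:
        # shlex raises ValueError for an unterminated quote or a trailing backslash
        return ("error", str(err))


for text in ["a '' b", 'a " " b', '\\"ab\\"', "margle margle", "a '", "a \\"]:
    print(repr(text), "->", split_or_error(text))
```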

Originally, this code was about half the present length, but it failed many of the unit tests, in particular the ones involving using quoting at the end of the string.  Probably the most interesting thing about the code (especially for someone, such as myself, with more experience using C-like languages), is the lack of explicit control structures (i.e., there are no if statements).  Instead, the "function head" style is used.

%%% See the documentation for Python's built-in shlex module for what
%%% this does: http://docs.python.org/lib/module-shlex.html

-module(shlex).
-include_lib("eunit/include/eunit.hrl").

-export([split/1]).

%% Must be macros so we can use them in guard clauses.
-define(IS_WHITESPACE(Char), 
        Char =:= $\s; Char =:= $\t; Char =:= $\r; Char =:= $\n).
-define(IS_QUOTE(Char), Char =:= $\'; Char =:= $\").

%% Rough equivalent of Python's shlex.split(). We don't support its
%% optional 'comment' argument, though.
split(String) ->
 split(String, _Word = "", _Line = "", _ActiveQuote=none, _Escape=false).

%% Explanation of these lines: Each clause handles a different case. The
%% first argument is the input (which is processed recursively). Next
%% is the word currently being built up. Next is the line: the list of
%% words completed so far. Next is whether or not quoting is currently
%% active (and what the quote char actually is), and lastly is whether
%% escaping is currently active (e.g., whether we're just after a
%% backslash).
split([], _Word, _Line, _QuoteChar, _Escape=true) ->
 {error, trailing_backslash};
split([], _Word, _Line, $\", _Escape=false) ->
 {error, unterminated_double_quote};
split([], _Word, _Line, $\', _Escape=false) ->
 {error, unterminated_single_quote};
split([], [], Line, _QuoteChar=none, _Escape=false) ->
 {ok, Line};
split([], Word, Line, _QuoteChar=none, _Escape=false) ->
 {ok, Line ++ [Word]};
split([AnyChar|Rest], Word, Line, QuoteChar, _Escape = true) ->
 split(Rest, Word ++ [AnyChar], Line, QuoteChar, false);
split([$\\|Rest], Word, Line, QuoteChar, _Escape = false) ->
 split(Rest, Word, Line, QuoteChar, true);
split([Whitespace|Rest], [], Line, none, false) 
  when ?IS_WHITESPACE(Whitespace) ->
 split(Rest, [], Line, none, false);
split([Whitespace|Rest], Word, Line, none, false) 
  when ?IS_WHITESPACE(Whitespace) ->
 split(Rest, [], Line ++ [Word], none, false);
split([QuoteChar|Rest], Word, Line, none, false) 
  when ?IS_QUOTE(QuoteChar) ->
 split(Rest, Word, Line, QuoteChar, false);
split([Char | Rest], Word, Line, none, false) ->
 split(Rest, Word ++ [Char], Line, none, false);
split([QuoteChar], Word, Line = [_Head|_Rest], QuoteChar, false) ->
 %% Special case: "a ''" -> ["a", []]. Don't want this to fire
 %% when Line is empty b/c that would mess up "''" -> [].
 split([], [], Line ++ [Word], none, false);
split([QuoteChar, Whitespace | Rest], Word, Line, QuoteChar, false) 
  when ?IS_WHITESPACE(Whitespace) ->
 %% Special case: "a '' b" -> ["a", [], "b"]
 split(Rest, [], Line ++ [Word], none, false);
split([QuoteChar|Rest], Word, Line, QuoteChar, false) ->
 split(Rest, Word, Line, none, false);
split([NonQuoteChar|Rest], Word, Line, QuoteChar, false) ->
 split(Rest, Word ++ [NonQuoteChar], Line, QuoteChar, false).

%%% Tests

test_happy([]) ->
 ok;
test_happy([[Input|Output]|Rest]) ->
 ?debugFmt("testing ~p -> ~p", [Input, Output]),
 {ok, Output} = split(Input),
 test_happy(Rest).

test_sad([]) ->
 ok;
test_sad([[Input, ErrorAtom]|Rest]) ->
 ?debugFmt("testing ~p -> {error, ~p}", [Input, ErrorAtom]),
 {error, ErrorAtom} = split(Input),
 test_sad(Rest).


big_test() ->
 HappyCases =
 [[""],
 [" "],
 [" "],
 [" \t\n\r\t "],
 ["abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ", 
  "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"],
 ["0123456789", "0123456789"],
 ["a ", "a"],
 ["a", "a"],
 [" a", "a"],
 ["\"a\"", "a"],
 ["'a'", "a"],
 ["a''", "a"],
 ["a\"\"", "a"],
 ["\"\"a", "a"],
 ["''a", "a"],
 ["''"],
 [" ''"],
 ["a '' b", "a", "", "b"],
 ["a '' ", "a", ""],
 ["a \"\"", "a", ""],
 ["a ''", "a", ""],
 ["a ' '", "a", " "],
 ["a \" \"", "a", " "],
 ["a \" \" b", "a", " ", "b"],
 ["a \" \" b ' ' c", "a", " ", "b", " ", "c"],
 ["a \"\" b", "a", "", "b"],
 ["12 \"\" 34", "12", "", "34"],
 ["a", "a"],
 ["ab", "ab"],
 ["\\\"ab\\\"", "\"ab\""],
 ["a a", "a", "a"],
 ["a a a", "a", "a", "a"],
 ["a ", "a"],
 ["a ", "a"],
 ["a ", "a"],
 ["a b", "a", "b"],
 ["xy", "xy"],
 ["xyb", "xyb"],
 ["xy xy", "xy", "xy"],
 ["xy xy xy", "xy", "xy", "xy"],
 ["xy ", "xy"],
 ["xy ", "xy"],
 ["xy ", "xy"],
 ["xy b", "xy", "b"],
 ["margle", "margle"],
 ["margle margle", "margle", "margle"],
 ["margle margle margle", "margle", "margle", "margle"],
 ["margle ", "margle"],
 ["margle ", "margle"],
 ["margle ", "margle"],
 ["margle b", "margle", "b"],
 ["\"\""]],
 test_happy(HappyCases),
 SadCases =
 [
 ["\"", unterminated_double_quote],
 ["'", unterminated_single_quote],
 ["\\", trailing_backslash],
 ["\\\\\\", trailing_backslash],
 ["\"\"\"", unterminated_double_quote],
 ["'''", unterminated_single_quote],
 ["a \"", unterminated_double_quote],
 ["a '", unterminated_single_quote],
 ["a \\", trailing_backslash],
 ["a \\\\\\", trailing_backslash],
 ["a \"\"\"", unterminated_double_quote],
 ["a '''", unterminated_single_quote],
 ["a '''a", unterminated_single_quote],
 ["a '''a b", unterminated_single_quote],
 ["a '''a b c", unterminated_single_quote]
 ],
 test_sad(SadCases).

September 03, 2008

Cheetah Tips

[Posted by George V. Reilly]


We're writing our new web services in Python (a story for another day). I thought I'd write up a few hard-won tips on using the Cheetah Template library.

Among other things, I'm using Cheetah to generate JSON. I wanted strict control over the layout of the result, including the order of dictionary members. Many of the properties that I'm serializing need custom handling; for example, datetimes require a non-standard format string. It might have been possible to use simplejson, but I didn't think that I could get the control that I wanted.

An explicit transformation via a template is much easier to control than an implicit one, such as simplejson's. The main downside is that if additional properties are added to the underlying data structures, I need to remember to update the template. If I were using implicit serialization à la simplejson, it would just work.
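
For comparison, here is a sketch of what the implicit route can look like with the json module in today's standard library (the descendant of simplejson). It assumes Python 3.7 or later, where dictionaries keep insertion order; the datetime format string and the sample event are invented for illustration.

```python
import datetime
import json


def encode_extra(obj):
    """Called by json.dumps for any object it cannot serialize itself."""
    if isinstance(obj, datetime.datetime):
        return obj.strftime("%Y-%m-%d %H:%M")  # hypothetical custom format
    raise TypeError("cannot serialize %r" % (obj,))


event = {
    "Title": "Dinner",
    "Start": datetime.datetime(2008, 9, 3, 18, 30),
}
# indent=2 pretty-prints; members appear in insertion order.
text = json.dumps(event, indent=2, default=encode_extra)
print(text)
```

This handles ordering and custom datetime formatting, but per-property tweaks such as HTML-escaping only some strings are still easier to express in a template.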

One of my requirements was that lists (arrays) and dictionaries should be prettyprinted:

[
  {
    "foo": "bar"
  },
  {
    "foo": "baz"
  },
  {
    "foo": "quux"
  }
]

Note that every } except the last is followed by a , separator: trailing commas in JavaScript initializers cause script errors in Internet Explorer, and are disallowed by the JSON spec. Getting both the comma and the indentation right proved tricky.

 1  """
 2    "Attendees": [
 3      #set sep = ''
 4      #for $attendee in $attendees
 5  $sep#slurp
 6        {
 7            "Name": "$attendee.display_name"
 8        }#slurp
 9      #set sep = ',\\n'
10      #end for
11
12    ]
13  """

The #slurp directive discards everything to the end of the line, including the newline. Line 5, $sep#slurp, produces no output on the first iteration; it must be left-aligned. Line 8, }#slurp, produces a properly indented }, but no newline. On subsequent iterations, line 5 generates a comma and a newline, immediately after the } from the previous iteration. Line 11, the blank line between #end for and the closing ], ensures that the ] ends up on a new line; without it, the }#slurp would cause the last line to be }].
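
The separator trick is the template-language version of joining items with ",\n". As a point of comparison only, here is the same layout produced in plain Python; render_attendees is a hypothetical helper, and unlike the template above it JSON-encodes each name.

```python
import json


def render_attendees(names):
    """Render the "Attendees" member with a comma between items, none after the last."""
    items = [
        '      {\n          "Name": %s\n      }' % json.dumps(name)
        for name in names
    ]
    # join() puts the separator between items only, which is the job that
    # $sep does in the template.
    return '  "Attendees": [\n' + ",\n".join(items) + "\n  ]"


fragment = render_attendees(["Fred Flintstone", "Barney Rubble"])
print(fragment)
```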

Here's a more complex example.

It's not safe, of course, to send arbitrary strings without encoding them as JSON. One additional requirement was that some strings also needed to be HTML escaped. Cheetah has no built-in function for JSON encoding, but it's easy to import a Python function. Here, I've defined a Cheetah function, $jh(), in terms of json_html() from the surrounding Python module, cheetah_demo.py.

#!/usr/bin/env python

import Cheetah.Template
from simplejson import JSONEncoder
import cgi

def json(s):
    return JSONEncoder().encode(s)
def json_html(s):
    """
    Convert a string to JSON notation with &, <, and > replaced by HTML entities.
    """
    return cgi.escape( json(s) )

attendees_json_template_def="""#encoding UTF-8
{
#import cheetah_demo
#def jh($t): $cheetah_demo.json_html($t)
#if len($attendees) == 0
"Attendees": []
#else
"Attendees": [
  #set sep = ''
  #for $attendee in $attendees
$sep#slurp
   {
    "Name": $jh($attendee.display_name),
    "IColor": "$attendee.color"
   }#slurp
  #set sep = ',\\n'
  #end for

]
#end if
}"""

def cheetah_attendees():
    attendees = [
        {"display_name": "Fred Flintstone", "color": "#ffabcd"},
        {"display_name": "Barney Rubble", "color": "#123456"},
        {"display_name": "Caspar the Ghost", "color": "#eee"},
    ]
    t = Cheetah.Template.Template(attendees_json_template_def, searchList=[{
        'attendees': attendees
    }])
    print t

if __name__ == '__main__':
    cheetah_attendees()
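
The listing above is Python 2 code. cgi.escape has since been removed from the standard library (in Python 3.8), so on a current interpreter json_html could be sketched as follows, using html.escape with quote=False to match cgi.escape's default of leaving quotation marks alone.

```python
import html
import json


def json_html(s):
    """
    Convert a string to JSON notation with &, <, and > replaced by HTML entities.
    """
    # quote=False keeps the JSON string delimiters and \" escapes intact.
    return html.escape(json.dumps(s), quote=False)


print(json_html('Tom & "Jerry" <b>'))
```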

One other tip: I found out the hard way that, despite its superficial similarities to the C preprocessor, Cheetah is picky about spaces after #: the innocuous-looking # for yields a cryptic error that was tedious to track down.

May 30, 2008

Preload Ajax Data as JSON

[Posted by George V. Reilly]


Preloading Ajax data as JSON has helped improve the load time and perceived performance of our family software application. Most of the pages in our Web client are dynamically generated in the browser from a complex set of JavaScript and CSS, so we're always looking out for ways to make them appear more quickly.

Our Combiner Control has been a big win: it coalesces a large number of small files together, reducing the latency of loading the page.

A few months ago, our Home Page would make eight Ajax calls as it loaded, to fetch data to populate different parts of the page, such as the family calendar and shopping list panes. That behavior fell out of the modular design, but the browser's limit of two simultaneous connections per host forced these calls to be serialized, magnifying the latency of the page. Panes were initially rendered blank, with an all-too noticeable pause before the data appeared.

My first change was to aggregate the eight calls into one ‘mondo’ call. That helped.

// pseudo code
function BuildHomePage()
{
    CreateCalendarPane();
    CreateShoppingPane();
    // other panes
    GetAjaxData( "/Ajax/GetMondoData.ashx", RenderMondoData );
}

// Callback function, executed several hundred milliseconds later
function RenderMondoData(ajaxData)
{
    CalendarPane.Fill( ajaxData.calendarData );
    ShoppingPane.Fill( ajaxData.shoppingData );
    // other panes
}

Later, I had the insight that I should send all that Ajax data down in the initial payload of the page. Now, the page can be rendered immediately: we've eliminated the latency of a roundtrip. Not only does this reduce the overall load time for the page, it significantly improves the perceived performance of the page, as panes are rendered sooner and with full data.

It's simple to make this work in ASP.NET:

<script runat="server">
static string HomePageDataJson()
{
    DateTime startDate = DateTime.Now;
    DateTime endDate = startDate.AddDays(9);
    return AjaxPro.JavaScriptSerializer.Serialize(
        new HomePageData( startDate, endDate ) );
}
</script>

<script type="text/javascript">
var homepagedata = <%= HomePageDataJson() %>;
var homepage = new HomePage( );
RenderMondoData( homepagedata );
</script>

The HomePageDataJson() function in the server script block creates the same object that the Ajax endpoint would have, then serializes it into a JSON string, using AjaxPro. This function could also be declared in a CodeBehind .cs page.

In the client script block, this JSON string is used to initialize a var with a JavaScript literal, which is then passed down into the JavaScript code.
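
As an illustration of adapting the idea to another stack, here is a minimal Python sketch that renders the same kind of script block. The function names and sample data are made up. One detail is easy to miss whatever the server: data containing the text </script> would end the script block early, so the sketch escapes every < inside the JSON.

```python
import json

PAGE_TEMPLATE = """<script type="text/javascript">
var homepagedata = %s;
RenderMondoData( homepagedata );
</script>"""


def to_script_json(data):
    """Serialize data as JSON that is safe to inline in a script block."""
    # In JSON output "<" can only occur inside strings, where \u003c is an
    # equivalent escape, so the decoded data is unchanged.
    return json.dumps(data).replace("<", "\\u003c")


def render_preloaded_script(data):
    return PAGE_TEMPLATE % to_script_json(data)


sample = {"shoppingData": ["milk", "</script><b>"], "calendarData": []}
print(render_preloaded_script(sample))
```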

In principle, this could make the client-side JavaScript a little simpler as it no longer needs to break the processing into asynchronous steps. In practice, we still need those handlers when the user navigates through the dataset.

This JSON preloading technique could easily be adapted to other Ajax libraries and other Web servers.

May 29, 2008

A Way To Unit Test ASP.NET IHttpHandler Implementations

[Posted by Pavel Repin]

If you find yourself writing simple HTTP handler code that produces and consumes structured data (for instance, some RESTful application), you may wonder how to test it without fiddling with IIS or configuration files. Here's a trick to write pure unit tests that verify your IHttpHandler implementation does what you expect. By "pure unit tests", I mean test code that:

  • works without configuration files (like web.config);
  • needs no servers (like Cassini, IIS, or your own mock HTTP server with full ASP.NET pipeline that you were about to write and debug just before you stumbled upon this article);
  • doesn't access the file system (like ashx files);
  • avoids globals (like HttpContext.Current).

The Unit Tests In Action

Pretend we have an IHttpHandler implementation, TagHandler, that lets browsers retrieve a list of tags (using GET requests) and post new tags (using POST requests). The handler responds to GET requests with an array of tag objects serialized as JSON. It also parses submitted data (JSON) in POST requests and creates tags based on that.

Here are the unit tests that verify the POST behavior. Note the use of a helper class, HttpHandlerTest, that does all the tricky bits, such as setting up an HttpContext instance, creating a fake request body and the session state, and more.

[TestFixture]
public class TagHandlerTestFixture {
    private Dictionary<String, Tag> tags;

    [SetUp] public void SetUp() {
        // the tags object is always empty at the start of each test case!
        tags = new Dictionary<String, Tag>();
    }

    [Test, ExpectedException(typeof (ArgumentException))]
    public void PostTagWithInvalidRequestContentType() {
        HttpHandlerTest http = HttpHandlerTest.ImitatePost(
            "tags.ashx", "{}", Encoding.UTF8);
        // Won't like text/xml, it wants application/json!
        http.Context.Request.ContentType = "text/xml";
        http.Execute(new TagHandler(tags));

    }

    [Test] public void PostTag() {
        HttpHandlerTest http = HttpHandlerTest.ImitatePost("tags.ashx",
            "{\"name\":\"foo\",\"description\":\"a a a\"}", Encoding.UTF8);

        http.Context.Request.ContentType = "application/json";
        http.Execute(new TagHandler(tags));

        Assert.IsTrue(tags.ContainsKey("foo"));
        Assert.AreEqual("a a a", tags["foo"].Description);
    }
   
    // ... more tests
}

You start by setting up a fake GET or POST request with the HttpHandlerTest.ImitateGet() or ImitatePost() method, then set up any additional preconditions on the initialized HttpContext object, which is exposed as a property on the HttpHandlerTest instance. Next, you pass the IHttpHandler instance to the Execute() method, and your handler does its work. Finally, you verify your post-conditions using NUnit assertions.

Here are some more tests from the same fixture that exercise the GET behavior:

    [Test] public void GetTagsResponseIsApplicationJson() {
        HttpHandlerTest http = HttpHandlerTest.ImitateGet("tags.ashx");
        http.Execute(new TagHandler(tags));
        Assert.AreEqual("application/json",
                        http.Context.Response.ContentType);
    }

    [Test] public void GetNoTags() {
        HttpHandlerTest http = HttpHandlerTest.ImitateGet("tags.ashx");
        http.Execute(new TagHandler(tags));
        Assert.AreEqual("{}", http.Output);
        Assert.AreEqual(Encoding.UTF8.GetType(),
            http.Context.Response.ContentEncoding.GetType());
    }

HttpHandlerTest Internals

Under the hood, HttpHandlerTest creates an instance of HttpContext initialized with a custom version of SimpleWorkerRequest. ImitateGet gives you an instance of HttpHandlerTest geared for testing GET requests, which do not have any content in the request body. ImitatePost is for simulating POST requests that do have content in the request. It should be easy to add support for other HTTP methods like DELETE, HEAD, and PUT.

public class HttpHandlerTest
{
    // Methods
    protected HttpHandlerTest(string requestMethod, string requestBody,
        Encoding requestEncoding, string page, string query);
    public virtual void Execute(IHttpHandler testSubject);
    public static HttpHandlerTest ImitateGet(string virtualPath);
    public static HttpHandlerTest ImitatePost(string virtualPath,
        string requestBody, Encoding requestEncoding);

    // Properties
    public virtual HttpContext Context { get; }
    public virtual string Output { get; }
}
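
The same idea carries over to other stacks. As a rough Python analogue (not part of HttpHandlerTest, and with every name invented for the example), a WSGI handler can be exercised by calling it directly with a fabricated request environment, with no server, sockets, or configuration involved:

```python
import io
import json
from wsgiref.util import setup_testing_defaults


def make_tag_app(tags):
    """Return a WSGI app that lists tags on GET and stores them on POST."""
    def app(environ, start_response):
        if environ["REQUEST_METHOD"] == "POST":
            if environ.get("CONTENT_TYPE") != "application/json":
                start_response("415 Unsupported Media Type", [])
                return [b""]
            length = int(environ.get("CONTENT_LENGTH") or 0)
            tag = json.loads(environ["wsgi.input"].read(length).decode("utf-8"))
            tags[tag["name"]] = tag
            start_response("201 Created", [])
            return [b""]
        start_response("200 OK", [("Content-Type", "application/json")])
        return [json.dumps(list(tags.values())).encode("utf-8")]
    return app


def imitate(app, method, path, body=b"", content_type=""):
    """Call the app with a fake request and capture status, headers, output."""
    environ = {
        "REQUEST_METHOD": method,
        "PATH_INFO": path,
        "CONTENT_TYPE": content_type,
        "CONTENT_LENGTH": str(len(body)),
        "wsgi.input": io.BytesIO(body),
    }
    setup_testing_defaults(environ)  # fills in the remaining required keys
    captured = {}

    def start_response(status, headers):
        captured["status"] = status
        captured["headers"] = dict(headers)

    output = b"".join(app(environ, start_response))
    return captured["status"], captured["headers"], output


tags = {}
status, headers, output = imitate(
    make_tag_app(tags), "POST", "/tags",
    b'{"name": "foo", "description": "a a a"}', "application/json")
print(status, tags)
```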

Links

  1. Full source code for HttpHandlerTest with the example web app. Note: the example web app uses AjaxPro.JSON.2.dll and nunit.framework.dll which are bundled in the "lib" directory. You can get everything by: svn co http://codebackpack.googlecode.com/svn/tags/how-to-test-httphandlers
  2. Of course, once I had checked that nobody else had found a solution as easy as the one presented here, and had written this post, I was immediately shown (thank you, GeorgeR) that there was already a post about this by Phil Haack, and my solution looks very similar :(  Oh well, at least this proves that it is not such a bad idea after all.
  3. Lutz Roeder's Reflector has proven very valuable in figuring out which methods of SimpleWorkerRequest to override.

April 23, 2008

JavaScript Error Tracking: Why window.onerror Is Not Enough

[Posted by Joanna Power]

Cozi builds family organization software, including a free online calendar, shopping list and family journal. In order to provide a richer experience for the users of our web application, over the past year we have been transitioning more and more of our code from C# running in an ASP.NET application to JavaScript running in the browser. We are using the jQuery JavaScript library for event handling, DOM manipulation, effects, styling and AJAX calls. In the process of this migration, we lost our ability to easily track errors and exceptions using an error log on the application server. Not knowing what errors our users might be encountering when they use our product made us nervous, so we set out to regain complete error tracking capabilities.

Our first attempt to solve the problem was to attach a custom JavaScript handler to window.onerror. This handler called back to the application server to log an error; an example handler might look like this:

window.onerror = function(msg, url, lineNo) 
{
  trackError(msg, url, lineNo);
}

This attempt largely worked, but as we started paying close attention during development, we noticed that not all JavaScript errors were successfully reported to the application to be logged. Specifically, we were missing some Firefox errors. Investigating further, we discovered that an error thrown once the page was loaded and the browser event loop took over was reported and tracked for IE(6/7), but it was not reported and tracked for Firefox(2/3). You can see evidence of this in the example page. This page throws two errors when it loads: one during the load itself and the other in the jQuery document.ready handler. A third error is thrown when the button is clicked. The custom window.onerror handler shows an alert with the error message and any other details that are available. Try it in Firefox and again in IE. You'll see only one alert in Firefox, but you'll see all three in IE.

I googled and googled, but I found nothing that suggested what the problem might be. It was either one of those situations where you had to know the answer to craft the query that would find the answer, or else no one else had struggled with this problem. Since we use jQuery as the glue in our event-driven application, I sought help from the jQuery Development group. It took some time, but help arrived. (Thanks, dhtmlkitchen!) As it turns out, in Firefox, an error thrown by code in an event handler attached using addEventListener does not make it to the window.onerror handler. There is an existing tracking issue for this bug at Mozilla: issue #312448. The jQuery bind function uses addEventListener in Mozilla and attachEvent in IE.

That explained it. All of our application behavior once the page loads is triggered by user interaction, and we use jQuery bind to attach handlers to the appropriate JavaScript DOM events. In short, most of our application's interesting code, i.e. the code most likely to cause errors, runs as a result of an event handler being triggered. These errors were arriving at window.onerror for IE, but due to the Mozilla bug, not for Firefox.

On then to the fix. It was clear that wrapping code with try/catch blocks was necessary, but there was no way we were going to successfully do that in every handler function. It took some fiddling, but overriding jQuery bind with a version that wraps the provided handler function in try/catch blocks does the trick:

// override jQuery.fn.bind to wrap every provided function in try/catch
var jQueryBind = jQuery.fn.bind;
jQuery.fn.bind = function( type, data, fn ) {
  // bind( type, fn ): the handler arrives in the data slot
  if ( !fn && data && typeof data == 'function' )
  {
    fn = data;
    data = null;
  }
  if ( fn )
  {
    var origFn = fn;
    var wrappedFn = function() {
      try
      {
        // pass the return value through so that "return false" still works
        return origFn.apply( this, arguments );
      }
      catch ( ex )
      {
        trackError( ex );
        // re-throw ex iff error should propagate
        // throw ex;
      }
    };
    fn = wrappedFn;
  }
  return jQueryBind.call( this, type, data, fn );
};

The bind override is inelegant in that it needs to know too much about how the original jQuery function shuffles parameters around, but if and when it breaks we'll just need to fix it in one place. Since the bind function is used under the covers by shortcut functions like click, hover, etc., this one-time fix takes care of tracking almost all application errors. Note that to be completely thorough, we should also override the jQuery one function.
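
The underlying pattern, wrapping every handler at the single point where handlers are registered, is not specific to jQuery. Here is a small Python sketch of the same idea; the Events class, tracked decorator, and track_error function are all invented for illustration and stand in for jQuery's bind and our trackError.

```python
import functools

tracked_errors = []


def track_error(ex):
    """Stand-in for reporting the error to the application server."""
    tracked_errors.append(ex)


def tracked(handler):
    """Wrap a handler so that any exception it raises is reported."""
    @functools.wraps(handler)
    def wrapper(*args, **kwargs):
        try:
            # Pass the handler's return value through to the caller.
            return handler(*args, **kwargs)
        except Exception as ex:
            track_error(ex)
            # re-raise here if the error should propagate
    return wrapper


class Events:
    """Tiny event registry: bind() is the one place handlers get wrapped."""

    def __init__(self):
        self._handlers = {}

    def bind(self, event_type, handler):
        self._handlers.setdefault(event_type, []).append(tracked(handler))

    def trigger(self, event_type, *args):
        return [handler(*args) for handler in self._handlers.get(event_type, [])]


events = Events()
events.bind("click", lambda: 1 / 0)   # a buggy handler
events.bind("click", lambda: "ok")    # a healthy one
results = events.trigger("click")
print(results, tracked_errors)
```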

To catch errors that are thrown during page initialization, we also wrap document.ready functionality in try/catch blocks:

$(document).ready( function() {
  try
  {
    initializeEverything();
  }
  catch ( ex )
  {
    trackError( ex );
    // re-throw ex iff error should propagate
    // throw ex;
  }
} );

Try out the fixed version of the example page. Like the first example page, this page throws two errors when it loads: one during the load itself and the other in the jQuery document.ready handler. A third error is thrown when the button is clicked. The window.onerror handler shows an alert with the error message and any other details that are available. Try it in Firefox and again in IE. Now you'll see all three alerts in both Firefox and IE.

What About Safari?

Safari does not yet support window.onerror, which means that without the steps described above, we wouldn't be able to track any errors from Safari. The try/catch additions to document.ready and jQuery bind will allow us to track most, but not all, errors arising in Safari.

Further Reading

If you want to learn more about the extremely cool jQuery JavaScript library, check out jquery.com.
There is a good explanation of how to implement JavaScript error tracking using window.onerror on Matt Snider's blog, though it does not mention the addEventListener bug.
If you want a refresher on the fundamental differences between event handler attachment in Mozilla and IE (addEventListener versus attachEvent), there is a good explanation at QuirksMode.

April 08, 2008

Combining JavaScript and CSS files for Improved Performance

[Posted by Mark Atherton]

Summary

Cozi has developed an ASP.NET control to combine multiple JavaScript or CSS files into a single reference, thus improving performance. The control handles problems of cache expiry dates and versioning, is easy to use, and can be switched on and off for debugging and testing.

Background

We’d like to organize our JavaScript and CSS files logically, without having to worry about performance. For instance, as we write more and more object-oriented JavaScript, we might want to put each JavaScript class into its own file. However, doing so would increase the number of round trips between the browser and the Web server.

This performance penalty is especially bad for JavaScript files since each one may contain a document.write which could change the sense of everything after it—so the browser can only download one at a time.

Unfortunately each extra JavaScript file adds about 200ms to page load time—the exact time varies by a dizzying array of factors (DSL vs Cable, how far it is from the browser to the server, etc, etc).

CSS files are not quite as bad as page parsing does not have to stop whilst they are being downloaded; however, the page cannot be correctly rendered until they have all been received.

Caching can help mitigate this delay, although it has its own problems.

Caching

To get maximum end-user performance, a Web site needs to set expiration headers on as many files as possible, especially static web content like images, CSS, JavaScript, etc. These headers tell the browser that it does not need to check back with the server for an updated version of a given file for some defined period.

Once that period has expired (or if no expiration is set), the browser will typically send a 'conditional get' giving the 'last-modified-time' of the file. If the file has not changed, the Web server can respond with a '304 Not Modified' response and not send down the data again.

Unfortunately the browser still needs to wait for the response before continuing. Therefore, it is best if the browser has a copy of the file AND it knows that it can be used without referring back to the Web server.
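
To make the expiration headers concrete, here is a small Python sketch of the two response headers involved. The helper name and the seven-day figure are just for illustration; how the headers actually get attached depends on your Web server.

```python
import time
from email.utils import formatdate


def far_future_headers(days, now=None):
    """Build headers telling the browser it may reuse a file for `days` days."""
    now = time.time() if now is None else now
    max_age = days * 24 * 60 * 60  # lifetime in seconds
    return {
        "Cache-Control": "public, max-age=%d" % max_age,
        # HTTP dates are always expressed in GMT
        "Expires": formatdate(now + max_age, usegmt=True),
    }


print(far_future_headers(7))
```

Until that period runs out, the browser can use its cached copy without contacting the server at all, not even for a conditional get.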

Versioning

Caching can also interfere with file versioning.

The easiest way to enable caching is to tell your Web server that a whole directory of files can be cached on the client for a week (or day, or whatever). Unfortunately when you edit one of the cached files and deploy the new version, you have a problem because existing clients will run with their locally cached copy of the file until they reach the end of the expiration period.

How we solved these problems at Cozi

At Cozi we are developing an increasingly rich Web-based tool for making life simpler for families, including a shared family calendar, shopping list management, and more.

Our engineers need to have quite a few JavaScript and CSS files so that several of us can work on the site without trampling on each other’s code. However, this created the performance problems described above.

Part of our solution was to combine CSS and JS files into groups that get downloaded as one blob, saving many round trips.

Our requirements for the combining tool were:

1. It had to be easy to use.

2. You must be able to switch it on and off for testing and debugging scenarios (on a page-view-by-page-view basis).

3. It must support long cache expiration dates on these combined files.

4. It must solve the versioning problem.

To implement the solution, we wrote an ASP.NET control that wraps around our <link> and <script> HTML statements.

For example, here’s a simplified version of what the CSS references for our home page look like:

<client:CombinerControl runat="server">

<link rel="stylesheet" href="/styles/sIFR-screen.css" type="text/css"/>

<link rel="stylesheet" href="/styles/Cozi.css" type="text/css"/>

<link rel="stylesheet" href="/styles/FooterToolbar.css" type="text/css" />

...

</client:CombinerControl>

The combiner outputs a reference to an Http Handler which will serve the combined file (I've put in some line breaks to make this a little more readable):

<link rel="stylesheet" type="text/css"  ←
href="../Combiner/Combiner.ashx?ext=css ←
&ver=9b0d6e4e ←
&type=text%2fcss ←
&files=!styles*sIFR-screen*Cozi*FooterToolbar*"
/>

As far as the browser is concerned, this is just a reference to a CSS file. It neither knows nor cares that it is programmatically generated.

A few parameters are passed to the handler (Combiner.ashx). They are:

ext: The file extension of the files. This saves having to pass the file extension for each file separately.

ver: This is how we handle versioning the CSS files. The combiner calculates the ‘version’ of the combined file by mushing together the last-modified-time of all the constituent files. If any one of the files changes, this parameter will change and the browser will request a new combined file.

type: This is the Content-Type the handler should return.

files: This is the list of files to be combined. It is somewhat compressed to reduce the length of the URL generated:

  ! means the following token is a new directory that applies until the next change of directory is seen.

  * means a file to be combined.

  ' replaces / for legibility, as / would have to be encoded as %2f.
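As a rough illustration of the ver parameter, here is one way last-modified times could be "mushed together" into a short token. The post does not say which algorithm the combiner uses; the FNV-1a hash and the versionToken name below are assumptions made for this sketch.

```cpp
#include <cstdint>
#include <cstdio>
#include <ctime>
#include <string>
#include <vector>

// Fold the last-modified times of all constituent files into an
// 8-hex-digit token. FNV-1a is an assumed choice, not the real combiner's.
std::string versionToken(const std::vector<std::time_t>& lastModifiedTimes) {
    std::uint32_t hash = 2166136261u;  // FNV-1a 32-bit offset basis
    for (std::time_t t : lastModifiedTimes) {
        std::uint64_t value = static_cast<std::uint64_t>(t);
        for (int i = 0; i < 8; ++i) {
            hash ^= static_cast<std::uint32_t>((value >> (8 * i)) & 0xFFu);
            hash *= 16777619u;         // FNV-1a 32-bit prime
        }
    }
    char buffer[9];
    std::snprintf(buffer, sizeof buffer, "%08x", static_cast<unsigned>(hash));
    return std::string(buffer);
}
```

The token stays the same for as long as every file is untouched and is expected to change as soon as any one timestamp changes, which is what forces the browser to fetch a fresh combined file.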

Here’s a similar code snippet combining JavaScript code:

<WebClientCode:CombinerControl ID="CombineScript" runat="server">

<script src="script/third-party/jquery.js" type="text/javascript"></script>

<script src="script/third-party/sifr.js" type="text/javascript"></script>

<script src="script/third-party/soundmanager.js" type="text/javascript"></script>

<script src="script/cozi_date.js" type="text/javascript"></script>

</WebClientCode:CombinerControl>

The combiner outputs a reference to an Http Handler which will serve the combined file:

<script src="../Combiner/Combiner.ashx?ext=js ←
&ver=59169b00 ←
&type=text%2fjavascript ←
&files=!script'third-party*jquery*sifr*soundmanager*!script*cozi_date*" ←
type="text/javascript"></script>
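The compressed files parameter can be reproduced with a few lines of string handling. This sketch is inferred purely from the two example URLs in this post, where each directory token and each file name is followed by a *; the encodeFiles function is a hypothetical name, and the real control may differ in details such as how it treats the file extension.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Build the value of the "files" parameter from a list of paths.
// Format inferred from the examples: "!dir*" on each directory change,
// then "name*" for each file, with ' standing in for / inside dir.
std::string encodeFiles(const std::vector<std::string>& paths) {
    std::string out;
    std::string currentDir;
    bool haveDir = false;

    for (const std::string& raw : paths) {
        std::string path = raw;
        if (!path.empty() && path[0] == '/') {
            path.erase(0, 1);                      // drop a leading slash
        }
        std::size_t slash = path.find_last_of('/');
        std::string dir = (slash == std::string::npos) ? "" : path.substr(0, slash);
        std::string name = (slash == std::string::npos) ? path : path.substr(slash + 1);

        std::size_t dot = name.find_last_of('.');
        if (dot != std::string::npos) {
            name.erase(dot);                       // extension travels in "ext"
        }

        if (!haveDir || dir != currentDir) {
            currentDir = dir;
            haveDir = true;
            std::string token = dir;
            std::replace(token.begin(), token.end(), '/', '\'');
            out += "!" + token + "*";
        }
        out += name + "*";
    }
    return out;
}
```

Feeding it the four script paths from the example above yields the same files value that appears in the generated script tag.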

The combiner control meets our requirements thus:

1. It had to be easy to use

It is: you just wrap the files you want in <client:CombinerControl runat="server">.

2. You must be able to switch it on and off for testing and debugging scenarios

By passing a parameter on the Query String you can turn this behaviour off for testing and the control will merely output the list of files as though the control itself didn’t exist.

3. It must support setting long cache expiration dates on these combined files

The handler that serves the combined file automatically sets a very long expiration on it; no configuration is needed.

4. It must solve the versioning problem

Because browsers cache based on the full, case-sensitive URL of the resource, you can set an ‘infinite’ expiration: if any file changes, the ver parameter changes, so a different URL is written out and the browser requests a new file.
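For illustration, the response headers for such a long-lived file might be generated along these lines. The post does not list the headers the handler actually sends; the farFutureHeaders function, the choice of Cache-Control plus Expires, and the one-year lifetime in the usage note are all assumptions.

```cpp
#include <ctime>
#include <string>
#include <vector>

// Produce far-future caching headers for a response generated at `now`.
// Illustrative only; the real handler's headers are not shown in the post.
std::vector<std::string> farFutureHeaders(std::time_t now, long maxAgeSeconds) {
    std::time_t expires = now + maxAgeSeconds;

    // HTTP dates are always expressed in GMT.
    char date[64];
    std::tm* utc = std::gmtime(&expires);
    std::strftime(date, sizeof date, "%a, %d %b %Y %H:%M:%S GMT", utc);

    std::vector<std::string> headers;
    headers.push_back("Cache-Control: public, max-age=" + std::to_string(maxAgeSeconds));
    headers.push_back(std::string("Expires: ") + date);
    return headers;
}
```

Calling farFutureHeaders(std::time(nullptr), 31536000L) would ask clients to keep the file for a year. That is safe here only because a changed file is always served under a new URL.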

Gotchas

As always, we ran into a few gotchas:

  1. You can’t combine CSS files with different media tags; all the CSS files must have the same media type.
  2. You must preserve the order of the files; both for script and CSS files, order can matter.
  3. You must remove UTF-8 byte order marks (BOMs). A BOM at the very start of the output would be OK, but many of our JS files started with a BOM. When we combined them, these BOMs ended up in the middle of the combined output, where browsers interpreted them as strange characters.
  4. It’s very hard to combine script files that take query string parameters; luckily it is quite unusual for static script files to take parameters so this wasn’t much of a limitation. 
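Gotchas 2 and 3 can be sketched together: concatenate the files in their original order and strip a leading UTF-8 BOM (the bytes EF BB BF) from each one first. The function names are hypothetical, and the newline added after each file is an extra precaution chosen for this sketch rather than something the post describes.

```cpp
#include <string>
#include <vector>

// Remove a leading UTF-8 byte order mark, if present.
std::string stripUtf8Bom(const std::string& text) {
    static const std::string bom = "\xEF\xBB\xBF";
    if (text.compare(0, bom.size(), bom) == 0) {
        return text.substr(bom.size());
    }
    return text;
}

// Concatenate file contents in the given order (order can matter).
std::string combineContents(const std::vector<std::string>& contents) {
    std::string combined;
    for (const std::string& content : contents) {
        combined += stripUtf8Bom(content);
        combined += "\n";  // assumed separator, so files cannot run together
    }
    return combined;
}
```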

Further work

Our combiner also manipulates the contents of the CSS and JS files for other reasons, unrelated to the main work of combining the files. For example, we might blog in another post about how we use the combiner to load balance and edge-cache our images.

Related web links

http://www.thinkvitamin.com/features/webapps/serving-javascript-fast
Great article on combiner-type issues, albeit rather PHP- and Apache-centric.

http://developer.yahoo.com/performance/index.html#rules
Rules for high performance Web sites.

April 01, 2008

Multiple Firefox Profiles: Run Firefox 2 and 3 Side-By-Side, and More

[Posted by George V. Reilly]

I find it useful to have multiple Firefox profiles for developing and testing. A clean profile for testing lets you replicate the environment of most users, who don't install extensions. Doing development in a separate profile lets you restart the browser without messing with your default environment. You can also run Firefox 2 and Firefox 3 side-by-side in separate profiles.

I currently have the following profiles:

  • default: I browse the web in this profile. Outlook opens links in this profile.

  • dev: I do most of my development here. Firebug gets a heavy workout.

  • test: A clean profile. I only install the ResizeIT extension here.

  • Firefox3: The latest beta of Firefox 3. The other profiles are Firefox 2-based.

Creating a new profile

Follow the instructions here to create test and dev profiles. Essentially, run:

 "%ProgramFiles%\Mozilla Firefox\firefox.exe" -profilemanager -no-remote

and use Create Profile to create profiles called dev and test.

Your profiles.ini should now look something like this:

 [General]
StartWithLastProfile=1

[Profile0]
Name=default
IsRelative=1
Path=Profiles/y1mds144.default
Default=1

[Profile1]
Name=dev
IsRelative=1
Path=Profiles/k034naef.dev

[Profile2]
Name=test
IsRelative=1
Path=Profiles/f2akbryu.test

Running the new profiles

If you want to run two profiles side-by-side, you need to run firefox.exe with the -no-remote and -P <profile> options. This launches each profile in a separate process.

Either create a batch file that looks like this:

 @start "Firefox Test" "%ProgramFiles%\Mozilla Firefox\firefox.exe" -no-remote -P "test"

Or create a Desktop Shortcut where the Location looks like:

 "C:\Program Files\Mozilla Firefox\firefox.exe" -no-remote -P "dev"

More information: Complete list of Firefox command-line options.

Running Firefox 2 and Firefox 3 side by side

By default, Firefox 2 installs into %ProgramFiles%\Mozilla Firefox, while Firefox 3 Beta 4 installs into %ProgramFiles%\Mozilla Firefox 3 Beta 4.

They can be installed on the same computer, but you must take care to use different profiles. Not all Firefox extensions have been updated yet for Firefox 3.

When you install Firefox 3, make sure that you do not launch it automatically at the end of the installation. Instead, run:

 "%ProgramFiles%\Mozilla Firefox 3 Beta 4\firefox.exe" -profilemanager -no-remote

and use Create Profile to create a profile called Firefox3. Then, follow the instructions above to create a batch file or a desktop shortcut, adjusting the path to firefox.exe.

March 24, 2008

Porting C# code to C++

[Posted by Mark Atherton]

Cozi has a downloadable PC client written in .NET which includes an awesome photo collage screen saver.  We've had many requests to make the screen saver available as a separate download; however the .NET 2.0 dependency makes it a large download for many people, plus we'd like to port it to other platforms in the future. So we decided to convert the screen saver to C++ (using WTL/ATL as the class library). 

Before we started, it seemed like quite a big task as the layout engine in our photo collage screen saver is pretty sophisticated. However, so far it's turned out to be essentially a mechanical translation.

This was helped by a few factors:

  1. .NET uses GDI+ under the hood to provide nearly all the System.Drawing functionality.  GDI+ is also available in the unmanaged world and (luckily) .NET had changed very little of the fundamental interfaces.
  2. GDI+ has a nice class-based model so you're spared all the COM tedium of QueryInterface/Release.
  3. ATL provides a reasonable set of collection classes that mapped quite well onto the C# collection classes.
  4. We just didn't bother with CPP files; there's only one in the whole project. This clearly helps when translating C#, as otherwise you'd spend hours declaring members and then defining them separately, which is definitely the most tedious bit about C++. Of course, there is some point at which it would get too inefficient to do that for all classes, but our project seems to compile pretty quickly so far.
  5. Expectations have changed. C# has dimmed the appetite for micro-optimizations, and there are many cases in the converted code where classes are being allocated and copied unnecessarily. In the past, we might have optimized these away but now we're happy to leave it and fix any problems that show up during performance testing.  Similarly we use collection classes where before we'd have converted them to arrays.

Class library

As we have a certain amount of UI within the screen saver (for the configuration dialogs) we wanted a class library to help with that. The main choices were WTL/ATL and MFC; we chose WTL as it had a lower download size and we didn't need the sophistication of MFC. This then dictated using ATL for strings (CString), smart pointers (CAutoPtr) and collections.  We could have used STL or another library too, but I know ATL so I preferred to stick with the library I knew.

Mechanical transformations

I ported the code over class by class and the first things I did were just mechanical changes.

  1. Replace Rectangle with Rect (for some reason .NET renamed this class and we use it a lot).
  2. Add a ; at the end of the class definition (this caught me almost every time).
  3. Replace "String" and "string" with "CString".
  4. Remove the "public" from the front of the class definition.
  5. Remove IDisposable if defined.
  6. Move any code from Dispose() to the destructor.
  7. Change static (and enum) references from Class.Member to Class::Member.
  8. Add a : after public/private/protected declarations of members.
  9. Change Debug.Assert to ATLASSERT.
  10. Map collection classes; usually this was just changing List<blah> to CAtlArray<blah>.
  11. CAutoPtr<> any allocations.
  12. CAutoPtr<> any "usings", or just declare an object and let its destructor sort it out.
  13. Change properties into GetXXX and SetXXX methods.
  14. Pass classes either as references or as pointers.
  15. Remove "this." (you could change to this-> to lower the risk of unexpected consequences but it looks ugly).
  16. Change "const int"  class members to enum {};
  17. Replace null with NULL (or #define null NULL).
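To make a few of these steps concrete, here is a small made-up class after the same treatment. The Album class is hypothetical and not part of the screen saver. So that the sketch compiles anywhere, std::string, std::vector and assert stand in for CString, CAtlArray and ATLASSERT.

```cpp
// C# original (hypothetical):
//   public class Album {
//     private const int c_maxPhotos = 3;
//     List<string> m_photos = new List<string>();
//     public string Title { get; set; }
//     public int Count { get { return this.m_photos.Count; } }
//     public void Add(string path) {
//       Debug.Assert(path.Length > 0);
//       if (this.m_photos.Count < c_maxPhotos) this.m_photos.Add(path);
//     }
//   }
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

class Album                                  // step 4: no "public" here
{
  enum { c_maxPhotos = 3 };                  // step 16: const int -> enum

  std::vector<std::string> m_photos;         // step 10 (stand-in for CAtlArray)
  std::string m_title;                       // step 3 (stand-in for CString)

public:                                      // step 8: colon after public
  // step 13: the Title property becomes a getter and a setter
  const std::string& GetTitle() const { return m_title; }
  void SetTitle(const std::string& title) { m_title = title; }

  std::size_t GetCount() const { return m_photos.size(); }

  void Add(const std::string& path)          // step 14: pass by reference
  {
    assert(!path.empty());                   // step 9 (stand-in for ATLASSERT)
    if (m_photos.size() < static_cast<std::size_t>(c_maxPhotos))
    {
      m_photos.push_back(path);              // step 15: "this." removed
    }
  }
};                                           // step 2: the trailing semicolon
```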

Notes

Doing this conversion has given me some perspective on the strengths and weaknesses of the two languages.

  1. Pro for C++: Destructors are good. For code that needs deterministic cleanup (and as it happens this code needs a lot because of all the images), C++ is way better. IDisposable is OK, but look at the amount of code it takes compared to C++.
  2. Pro for C#: Properties are good. C# properties are a vast improvement over getter and setter methods.
  3. Pro for C#: Field initialization is great. Being able to initialize fields right at the point of declaration is far superior to C++, where you have to declare in one place and initialize in the constructor.  This extends to consts, which for some strange reason you can't declare within the class in C++.
  4. Pro for C#: foreach is great. I'm sure I could use STL or similar to get some of the same benefits, but foreach seems so natural.

Another thing I noticed is that garbage collection is not helping as much as I'd imagined. Whilst it's great not to worry about memory, so far, as long as I'm religious about using CAutoPtr and related auto-cleanup classes, memory leaks don't seem to be the problem they once were.
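The point about deterministic cleanup can be shown with a tiny sketch. Here std::unique_ptr stands in for CAutoPtr so the code builds without ATL, and FakeBitmap and CollageSlot are hypothetical classes that simply count live objects in place of real GDI+ images.

```cpp
#include <memory>

// Stand-in for an expensive image; counts how many are alive.
struct FakeBitmap
{
  static int liveCount;
  FakeBitmap() { ++liveCount; }
  ~FakeBitmap() { --liveCount; }
};
int FakeBitmap::liveCount = 0;

class CollageSlot
{
  std::unique_ptr<FakeBitmap> m_bmp;   // plays the role of CAutoPtr<Bitmap>

public:
  CollageSlot() : m_bmp(new FakeBitmap) {}

  // Swap in a new image; the old one is destroyed immediately.
  void Replace() { m_bmp.reset(new FakeBitmap); }

  // No destructor and no Dispose(): m_bmp cleans up when the slot dies.
};

// Returns the number of live bitmaps once a slot has gone out of scope.
int liveAfterScope()
{
  {
    CollageSlot slot;
    slot.Replace();
  }                                    // slot and its bitmap are freed here
  return FakeBitmap::liveCount;
}
```

The cleanup happens at the closing brace, with none of the Dispose plumbing that the C# version in the example below needs.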

Example

(Comments and error handling have been removed.)

namespace Cozi.ScreenSaver
{
  using System;
  using System.Drawing;
  using System.Diagnostics;
  using System.IO;

  public class CollageImage : IDisposable
  {
    private const int c_resizeImagesOverMegaPixels = 9;

    Bitmap m_bmp;
    bool m_disposed = false;

    public CollageImage( string filePath )
    {
      if ( File.Exists( filePath ) )
      {
        this.m_bmp = new Bitmap( filePath );
      }
      else
      {   
        // Leave the bitmap null if the path doesn't exist
        // (e.g. network path disconnected)
        this.m_bmp = null;
        return;
      }

      int megaPixels = (this.m_bmp.Width * this.m_bmp.Height) / 1000000;

      if (megaPixels > c_resizeImagesOverMegaPixels)
      {
        int width = System.Windows.Forms.Screen.PrimaryScreen.Bounds.Width / 2;
        int height = width * this.m_bmp.Height / this.m_bmp.Width;
        Image resized = new Bitmap(this.m_bmp, new Size(width, height));
        this.m_bmp.Dispose();  // release the full-size original
        this.m_bmp = (Bitmap)resized;
      }
    }

#region IDisposable Members
    public void Dispose()
    {
      Dispose(true);
      GC.SuppressFinalize(this);
    }

    ~CollageImage()
    {
      Debug.WriteLine("LEAK: Failed to Dispose of CollageImage");
    }

    protected virtual void Dispose(bool disposing)
    {
      if (!this.m_disposed)
      {
        if (disposing)
        {
          if (this.m_bmp != null)
          {
            this.m_bmp.Dispose();
            this.m_bmp = null;
          }
        }

        this.m_disposed = true;
      }
    }
#endregion


    public Size Size
    {
      get
      {
        if (this.m_bmp == null)
        {
          return new Size(1, 1);
        }

        return this.m_bmp.Size;
      }
    }

    public void Draw(Graphics g, Rectangle bounds)
    {
      if (this.m_bmp == null)
      {
        return;
      }

      g.DrawImage(this.m_bmp, bounds);
    }
  }
}

Becomes:

#pragma once
#include "stdafx.h"

class CollageImage
{
  CAutoPtr<Bitmap> m_bmp;

  enum
  {
    c_resizeImagesOverMegaPixels = 9
  };

  public: CollageImage( const CString& filePath )
  {
    if (::GetFileAttributes(filePath) != INVALID_FILE_ATTRIBUTES)
    {
      m_bmp.Attach(Bitmap::FromFile(filePath));
    }
    else
    {   
      return;
    }

    int megaPixels = (m_bmp->GetWidth() * m_bmp->GetHeight()) / 1000000;

    if (megaPixels > c_resizeImagesOverMegaPixels)
    {
      int width = GetSystemMetrics(SM_CXSCREEN) / 2;
      int height = width * m_bmp->GetHeight() / m_bmp->GetWidth();

      // Draw the original into a smaller bitmap
      Image* resized = new Bitmap(width, height);
      Graphics g(resized);
      g.DrawImage(m_bmp, Rect(0, 0, width, height));

      m_bmp.Free();
      m_bmp.Attach((Bitmap*)resized);
    }
  }

  public: Size GetSize()
  {
    if (m_bmp == null)
    {
      return Size(1, 1);
    }

    return Size(m_bmp->GetWidth(), m_bmp->GetHeight());
  }

  public: void Draw(Graphics* g, Rect bounds)
  {
    if (m_bmp == null)
    {
      return;
    }

    g->DrawImage(m_bmp, bounds);
  }
};
