AJAX Analysis

An Overview and Critique of Using XMLHTTPRequest in Client-Side Development

SOAP, Flash, AJAX — one senses a bathroom-cleanser trend these days in web-related nomenclature, akin to the fruity allusions that computer manufacturers have made over time, such as Apple, Apricot, Blackberry, Pearcom, Pineapple 6502, and ...um Olive-tti.

Admittedly, the last one strains the point a little; nevertheless, I am currently cooking up a new approach to web applications called VIM: 'Validated Interface Management' (or something). This technology snoops users' browser-profiles on the sly, but guarantees a nice clean bathtub as compensation. The enterprise edition removes unsightly lime-scale from your boardroom and is called JIF, but I am thinking of changing this to CIF because nobody outside the UK seems to get the joke.

On a more restrained note, few software issues in the last ten years have generated quite so much interest, quite as quickly as AJAX — a moniker for the use of the XMLHTTPRequest class in script-based client-server communication — and this article examines the nature, implications and challenges of this surprise development of 2005.

This article appeared in issue 9 (February 2006) of Objective View, Ratio Group's journal for software development professionals (sadly, Ratio is now defunct).

AJAX Defined

Asynchronous JavaScript and XML, or AJAX — an iconic and memorable acronym — possesses a marketable spirit of élan and dynamism, but is also rather non-representative. First, there is no such thing as asynchronous JavaScript per se; this term refers to the asynchronous nature of most HTTP-based communication. Secondly, the technology behind AJAX is not limited to JavaScript, but is available from within VBScript as well.

Moreover, AJAX has nothing to do with XML intrinsically, as data may be exchanged in any format from raw character sequences, through proprietary formats, to the serious players such as JSON and XML (although scope for processing large binary-datasets is limited in client-side scripting). In fact, AJAX signifies fully bi-directional client-server transaction from within client-side scripts, using HTTP directly, and conducted through instances of the XMLHTTPRequest class. Critically, this is an improvement on other mechanisms.

Connection Spectrum

To clarify this, consider the (somewhat figurative) diagram. This lays out the range of client-server communication techniques available from within a browser-based script execution-context.

At the coarsest level, one can set the location attribute of the window object, which has the same effect as entering a literal URL in the address field of the browser. This embodies client-server communication, but a page refresh is implicit, as is the consequent re-initialisation of the script execution-context. Only by recording the values of a previous script's variables, and retrieving them subsequently, can the developer maintain continuity between programs pertaining to different pages. This requires, for example, that one save data in one or more cookies on the client.

The intermediate approaches, such as setting the src attribute of an image object, or using frames, preserve continuity of state, and can allow transaction of useful data. Moreover, they can be forced into service as a means of implementing relatively transparent (for users) communication with the server, the 'image cookie' technique being an example[1]. The problem is that this foists awkward and inelegant design on the programmer, with all the concomitant development, testing, and cross-browser issues that such dissonance carries.

An alternative is the use of proprietary technologies such as Java applets and Flash. While these too can preserve program state between client-server transactions, and allow great flexibility, they are disjoint in that they require languages, platforms and development environments that are quite distinct from the native calling-context.

This can introduce intra-team impedance, causing design fragmentation, longer development times, bigger test regimens, and so on systemically. Then there is the problem of download times: waiting minutes for some unknown Flash-binary to arrive does not endear the technology to users with limited bandwidth [this article was written prior to the proliferation of broadband].

Native Techniques

This leaves the truly native approaches. These comprise setting the href attribute of a style link from within a page's script — whereupon the property-set is downloaded automatically — or setting the src attribute of a script element, which causes the browser to fetch the relevant script file ('on-demand' JavaScript).

In the first case, the choice (and the tradeoffs therein) is clear: a style-sheet can be downloaded according to the user agent in question, as opposed to the more algorithmic approach, where style properties are changed piece-meal.

In the second case, a script can download browser/functionality-specific scripts, with similar tradeoffs to the style-sheet technique. These methods preserve execution state, yet as with the others, they work well when the server talks to the client, but are impoverished bi-directionally.


 <link id    = "StyleTag"
       rel   = "stylesheet"
       type  = "text/css"
       href  = ""
       media = "all" />

 <script type = "text/javascript">

 if (navigator.userAgent.indexOf ("MSIE") != -1)      LoadStyle (".../IE_Style.css");
 else
    {
    if (navigator.appName.indexOf ("Netscape") != -1) LoadStyle (".../NS_Style.css");

    // etc...

    }

 function LoadStyle (StyleSheetName)
    {
    document.getElementById ("StyleTag").href = StyleSheetName;
    }

 </script>
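
The listing above covers the style-sheet case only. A comparable sketch for 'on-demand' JavaScript might run as follows. LoadScript and its Doc parameter are hypothetical names introduced here, and the document object is passed in as a parameter merely so that the function can be exercised outside a browser; in a page one would pass the global document.

```javascript
// Hypothetical helper (not part of the original listing): requests a script
// file on demand by adding a new script element to the head of the page.
function LoadScript (Doc, ScriptURL)
   {
   var Elem = Doc.createElement ("script");

   Elem.type = "text/javascript";
   Elem.src  = ScriptURL;

   // The browser fetches and runs the file once the element is in the page
   Doc.getElementsByTagName ("head")[0].appendChild (Elem);

   return Elem;
   }
```

As with the style-sheet technique, the current execution state is preserved: the new script simply adds its definitions to those already loaded.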
      

XMLHTTPRequest

In contrast, the XMLHTTPRequest class is a wrapper for an underlying HTTP connection (although, unlike the on-demand technique, you can communicate only with the domain from which the page originated). The commonly supported interface for the XMLHTTPRequest class is depicted in the next diagram.

Using this class entails instantiating it, assigning a function reference to the onreadystatechange member, and then calling open, followed by a call to send. The parameters in the open call allow one to stipulate the HTTP method, the URL to send the request to, and optional parameters, such as a flag denoting a synchronous or asynchronous request (synchronous meaning the client waits until the response returns). Any data that must be transmitted is passed as a string during the call to send.

The reference assigned to onreadystatechange must point to a function that is called a number of times during the dialogue between server and client. This function should test the readyState and status members of the XMLHTTPRequest object and, assuming success, can process the data returned (if any) from there. If data has been returned, it can be accessed through the raw character string represented by the responseText attribute. However, if it is well-formed X(HT)ML then the XMLHTTPRequest object will have parsed the data on receipt (you have no choice; see below), resulting in an XML DOM-object hierarchy that is available through the responseXML attribute.

However, browser incompatibilities are the programmer's bane in much of client-side development, and XMLHTTPRequest is a challenge here too. All the major browsers, except IE, support the creation of XMLHTTPRequest objects as instances of native JavaScript classes, but Microsoft implement XMLHTTPRequest on the back of ActiveX, which complicates matters; moreover, two versions are possible.

The listing depicts typical code for issuing a call, and for processing the response. Do note, however, that this is not a general purpose implementation, due to various bugs and inconsistencies exhibited on various user agents (see the end note for more on this).


 function Handler ()
    {
    if (RequestObj.readyState == 4)
       {
       if (RequestObj.status == 200)
          {
          // Do something here with RequestObj.responseText
          // or RequestObj.responseXML
          }
       else throw ("Error");

       }

    }

 var RequestObj;

 if      (window.XMLHttpRequest) RequestObj = new XMLHttpRequest ();
 else if (window.ActiveXObject)
    {
    try { RequestObj = new ActiveXObject ("Msxml2.XMLHTTP"); }
    catch (e)
       {
       try       { RequestObj = new ActiveXObject ("Microsoft.XMLHTTP"); }
       catch (e) { throw e; }
       }

    }

 RequestObj.open (Method, URL, SyncFlag, ... );

 RequestObj.onreadystatechange = Handler;

 RequestObj.send (Data);
      

JSON

As pointed out above, XML is not mandatory, nor need it be de rigueur, and what is particularly exciting is the use of XMLHTTPRequest in conjunction with JSON, or JavaScript Object Notation. In an article in Objective View in 2000[2], I commented that a C-related format would have been preferable to the tag-based grammar and notation of XML; the argument being that syntactic and semantic parallels with the C-family of languages would make this easier to learn and more human-readable than XML, and would therefore represent true standards-unification.

JSON is that idea incarnate: being a subset of JavaScript's object-literal syntax, it relates implicitly to run-time objects, and thus removes all impedance between the communication format and object-representation in client-side code.

In addition, and in contrast with XML's verbose syntax, JSON's parsimony gives an improved 'content to markup' ratio, which makes it considerably more resource-efficient and human-readable than XML ever could be. The listing shows the same data set encoded in XML and JSON.


 <!-- An invoice element in XML -->

 <Invoice InvNum  = "43508"
          InvDate = "23/07/2005"
          PONum   = "53098">

    <Item Desc = "Bolts"   Num = "20" Price = "0.12" />
    <Item Desc = "Washers" Num = "10" Price = "0.13" />
    <Item Desc = "Screws"  Num = "15" Price = "0.05" />

 </Invoice>


 // The same data encoded in JavaScript Object-Literal Notation

 Invoice =
    {
    InvNum  : "43508",
    InvDate : "23/07/2005",
    PONum   : "53098",
    Items   : [ { Desc : "Bolts",   Num : "20", Price : "0.12" },
                { Desc : "Washers", Num : "10", Price : "0.13" },
                { Desc : "Screws",  Num : "15", Price : "0.05" } ]
    }
      

A principal advantage of JSON is that no special parsing technology is required. JavaScript's eval function (part of the underlying ECMAScript standard) allows direct invocation of the interpreter, and when passed a JSON string it will return a fully-fledged object that can be manipulated using standard notation. The following code fragment illustrates this:

var InvObj = eval("( { InvNum : '43508', InvDate : '23/07/2005' } )");

alert (InvObj.InvDate); // Displays 23/07/2005

It follows that object-literal strings transmitted by the server can be passed to eval on the client, yielding data in native form. The reverse process, 'stringifying' an object, requires only a modest function[3] before data can be passed back to the server. Contrast the transparency of this approach with XML, which frequently requires complex parsing technologies and DOM operations, with all the inefficiencies therein.
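
Bear in mind that eval executes whatever text it is given, so it suits only data from a trusted source. Browsers that postdate this article provide a native JSON object (standardised in ECMAScript 5), and the sketch below uses it for the round trip in both directions. The invoice values come from the fragment above; the variable names and the Paid member are illustrative only. Note that strict JSON, unlike the object-literal form, requires double quotes around member names and strings.

```javascript
// Strict JSON text, as a server might send it
var InvText = '{ "InvNum" : "43508", "InvDate" : "23/07/2005" }';

// Parsing yields a native object without running the text as code
var InvObj = JSON.parse (InvText);

// The reverse process: 'stringifying' an object for return to the server
var Outgoing = JSON.stringify ({ InvNum : InvObj.InvNum, Paid : true });

// Text that is not valid JSON is rejected rather than executed
var Rejected = false;

try       { JSON.parse ("alert ('executed')"); }
catch (e) { Rejected = true; }
```

The parsed object is used exactly as before, so InvObj.InvDate still yields the invoice date.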

First-Order Implications

The core implications of XMLHTTPRequest are clear. Form data can be validated on the fly, and without using awkward techniques, because the application can send the content of each input element to the server transparently as the user moves from field to field. For users, the page remains on screen, while their attention is drawn to any illegal values by means of DHTML techniques.

It also implies dynamic page-construction: X(HT)ML data can be requested, and when received the resulting object-hierarchy can be manipulated, as with any DOM tree, and/or included in part or whole into the structure of the page. Alternatively, text can be requested, and then placed as content into page elements upon receipt.

The listing illustrates this, and shows the code required for displaying (on a rolling basis) the number of users logged on to a system.

Further to this, and while this is not part of JSON per se, JavaScript Object-Literal definitions can contain functions, meaning that the server can send procedures as and when they are needed. It follows that a page can change its functionality on the fly; implying pages that can be downloaded in a minimal form initially, and which develop their functionality contingently and dynamically from there. In the main, this is a more fine-grained alternative to the on-demand approach, but what is so attractive is that both techniques mean you need only pay for what you use. Clearly, this offers positive bandwidth and storage-requirements tradeoffs.

These points imply a major phase over the next few years of radical site-redesign in preference for user interfaces that resemble conventional desktop applications. Visitors to, for example, a railway-timetable site will be able to enter places, dates and times of departure, while the appropriate service-details appear automatically in adjacent parts of the page. Moreover, and beyond site re-design, XMLHTTPRequest implies applications whose implementation (conventionally) would prove challenging if not completely impracticable.

However, there are also negative implications: the price of freedom is responsibility, and XMLHTTPRequest gives the irresponsible developer the freedom to implement user interfaces that are clever in principle but frustrating in practice. Imagine reading a block of content, only to find it disappears from view as the script receives some data from the server and then updates the page dynamically, thus causing the browser to scroll the user's point of interest out of sight.


 <script type = "text/javascript">

 var NumUsers_RequestObj = null;
 var NumUsers_Elem       = null;

 function Init ()
    {
    NumUsers_Elem = document.getElementById ('NumUsers');
    GetNumUsers ();
    }

 function GetNumUsers ()
    {
    try       { NumUsers_RequestObj = IssueXHR ("POST",
                                                ".../NumUsers.php",
                                                "",
                                                true,
                                                DisplayNumUsers); }

    catch (e) { alert ("Error sending request - "
                      + e.name
                      + ": "
                      + e.message); }

    setTimeout (GetNumUsers, 5000);  // Triggers every 5 seconds

    }

 function DisplayNumUsers ()
    {
    if (NumUsers_RequestObj.readyState == 4)
       {
       if (NumUsers_RequestObj.status != 200) throw ("Error");

       NumUsers_Elem.innerText = NumUsers_RequestObj.responseText;

       }

    }

 function IssueXHR (Method, URL, Data, ASync, Handler)
    {
    // Code for creating XMLHTTPRequest, then opening and sending etc.
    }

 </script>

 <body onload = "Init ()">
    Users Logged on Currently = <span id = "NumUsers"></span>
 </body>
      

Concurrency Concerns

An additional concern lies with concurrency, because asynchronous communication with the server means the potential for race conditions. In the diagram, method M must execute before the request returns, because some of the objects it manipulates contain data that can be changed by the response from the server.

All is well in the left-hand scenario, because the response is sufficiently tardy. In the right-hand case, however, the client receives the response (and thus executes its response handler) before M has completed, thereby invoking a bug that may prove difficult to resolve.

To compound this, multiple concurrent XMLHTTPRequests increase the potential for such problems exponentially with each communication thread that is added to the execution context. The solution is to implement some form of locking, although multiple concurrent threads then introduce the potential for deadlock. These points are not unique to XMLHTTPRequest, as they apply to Java applets etc, but as with those, and with traditional application development, the only realistic way to manage such problems is good design.
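
One simple design measure, short of full locking, is to tag each request with a sequence number and to discard any response that a later request has superseded. The sketch below assumes that the application cares only about the most recent request; MakeLatestOnly and the names within it are hypothetical, and the responses are delivered by hand so that the ordering is explicit.

```javascript
// Hypothetical guard: only the response to the most recent request is applied
function MakeLatestOnly (Apply)
   {
   var Latest = 0;                       // Sequence number of the newest request

   return function IssueTicket ()
      {
      var Ticket = ++Latest;             // Captured per request by the closure

      return function OnResponse (Data)
         {
         if (Ticket !== Latest) return false;   // Superseded: ignore stale data

         Apply (Data);
         return true;
         };
      };
   }

// Two requests are issued; the responses arrive in the opposite order
var Shown       = [];
var IssueTicket = MakeLatestOnly (function (Data) { Shown.push (Data); });

var First  = IssueTicket ();
var Second = IssueTicket ();

var SecondApplied = Second ("Timetable B");   // Newest request: applied
var FirstApplied  = First  ("Timetable A");   // Older request: discarded
```

This does not remove the need for good design, but it does make one common ordering problem harmless: a tardy response can no longer overwrite newer data.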

The Ethical Dimension

There are, however, other implications, some of which hold out exciting possibilities, and some that are less desirable. One example is the potential to capture users' site-navigation patterns through mouse and keystroke events, and return these transparently to the server. By such means, it would be possible to build a statistical picture of the way that users interact with a site, thus allowing remodelling and refinement to reflect this, and thereby provide a better user-experience. This is clearly of benefit to all.

However, other potential applications of XMLHTTPRequest are more cynical. It is entirely possible to capture browsing patterns very precisely, which suggests that certain types of application could allow estimation of a given user's demographic profile in real time. In principle, this would allow adverts that were targeted at that particular user-type to be placed within a page dynamically (indeed, before the user's very eyes).

Heat or Light

Understandably, XMLHTTPRequest also has its critics, and one area of debate (that seems to generate more heat than light) is over issues of URL linearity, site indexing and the so-called 'broken back-button'. Some have used these points to decry the viability of AJAX-style techniques (on-demand JavaScript is in much the same position), yet the fact is that you cannot have your static browsing-cake and eat it. By definition, to introduce a dynamic component into web-based systems is to depart from the fixed-page model upon which these concepts rely.

In the case of the back-button, this has never been more than a browser control that unwinds the URL-visitation history for a given window. It was never intended to support arbitrary undo-mechanisms; therefore one cannot really bemoan the loss of functionality that was never present originally. In the case of site indexing, it is obvious that the data-sets that traditional desktop-applications manipulate (web pages, for example) can be catalogued, as can binary executables themselves (this really does happen[4]). Yet the concept of a running program embodies the very notion of ever-changing state, and in this respect executable code constitutes an index into a particular state-space; one that search engines can never be suited to cataloguing.

Conflation

A lasting criticism, however, is that XMLHTTPRequest was designed rather poorly. Class names with numerous syllables often reflect conflation of abstraction, and from that functionality, and XMLHTTPRequest (nine syllables — tedious to rattle-off repeatedly when teaching AJAX courses) is a good example. As the fact of JSON shows, XML need not be the preferred medium, yet XMLHTTPRequest unifies XML formatting with HTTP connectivity, which are two different issues entirely — XML-processing capability comes along for the ride anyway, whether one desires it or not.

More subtly, one may actually wish to transact X(HT)ML data, but wish to avoid automatic parsing, for reasons of resource management when striking a balance between lazy and eager evaluation. Alternatively, developers may wish to implement some form of proprietary XML parsing that is better suited to the application — a DOM parser is a relatively heavyweight affair, and mobile devices put a squeeze on resources. To stir politics into the mix, IT-related but non-technical colleagues (management) are likely to assume instantly that XMLHTTPRequest is about XML intrinsically, making it all the harder to convince them that XML is not all that it is cracked up to be, and that better alternatives exist.

A preferable approach, therefore, to implementing the spirit of XMLHTTPRequest would be a simple HTTP-connection class. This would implement client-server communication independently from the format in which data was transferred, and would have no fixed association with XML, thus leaving client-code response handlers free to process the data returned as they saw fit.
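
Such a class might look something like the following sketch. HTTPConnection, its Send method, and the CreateTransport parameter are hypothetical names, not part of any browser API; the transport is supplied by the caller and is assumed to expose the open, send, readyState, status and responseText members described earlier.

```javascript
// Hypothetical format-neutral connection class: it hands back raw text only,
// leaving response handlers to treat it as XML, JSON or anything else.
function HTTPConnection (CreateTransport)
   {
   this.CreateTransport = CreateTransport;
   }

HTTPConnection.prototype.Send = function (Method, URL, Data, OnText)
   {
   var Transport = this.CreateTransport ();

   Transport.open (Method, URL, true);

   Transport.onreadystatechange = function ()
      {
      // Error reporting is omitted here for brevity
      if (Transport.readyState == 4 && Transport.status == 200)
         {
         OnText (Transport.responseText);
         }
      };

   Transport.send (Data);
   };
```

In a browser, the transport could simply be an XMLHTTPRequest object with its responseXML member ignored, so the sketch separates the two concerns in client code only; it cannot stop the underlying object from parsing XML on receipt.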

Performance and Redundancy

Another criticism is that the exception-handling technique that is required to work around browser incompatibilities can be considered an abuse of the exception-handling mechanism. However, the real problem with this is that it is slow. Obviously, ultra-performant code may not be the prime mover in client-side programming, but the inherent inelegance of this approach does tend to set one's teeth on edge.

Further to this, the response handler must check the readyState and status members of the XMLHTTPRequest object, and only proceed with processing the response if the transaction has both completed, and completed successfully. It seems that no meaningful processing can be done before satisfaction of both these conditions, which suggests that this checking should be the responsibility of the XMLHTTPRequest object, not client-code, and which would result in simplified response-handlers were the class implemented this way.

Polymorphism and Closure

Happily, a polymorphic solution to the performance problem is possible, where the correct creation-statement is determined when a script loads (using the exception-handling approach, in part), and is then called through a function reference whenever a connection object is required[5].
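
A minimal sketch of this idea follows. ChooseXHRFactory and the Env parameter are hypothetical names (in a page, Env would be window). Each candidate creation-statement is probed once, and the first that succeeds is returned for all later use, so that the cost of the try/catch is paid only when the script loads.

```javascript
// Hypothetical helper: selects a working creation function once, at load time
function ChooseXHRFactory (Env)
   {
   var Candidates =
      [
      function () { return new Env.XMLHttpRequest (); },
      function () { return new Env.ActiveXObject ("Msxml2.XMLHTTP"); },
      function () { return new Env.ActiveXObject ("Microsoft.XMLHTTP"); }
      ];

   for (var i = 0; i < Candidates.length; i++)
      {
      try       { Candidates[i] (); return Candidates[i]; }
      catch (e) { /* Not available here: try the next candidate */ }
      }

   throw new Error ("No XMLHTTPRequest implementation available");
   }
```

Client code would then call ChooseXHRFactory once, keep the function reference it returns, and invoke that reference whenever a connection object is required.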

In the case of the redundant checking-logic in the response handler: this can be solved using the rather more exotic technique of JavaScript closures, and the listing illustrates this. Here a 'base' response handler is defined as an anonymous inner-function within IssueXHR, and every XMLHTTPRequest object created refers to that function through its onreadystatechange member.

As before, the XMLHTTPRequest returned by IssueXHR is not garbage collected because it is referred to at global scope. However, the XMLHTTPRequest object's reference to the base response-handler also ensures preservation of the scope chain stretching from that function back to the global execution-context. In other words, the parameters passed into a given call to IssueXHR, along with its local reference to the XMLHTTPRequest object that it returns, persist as long as the XMLHTTPRequest object does — this is a closure.

The key factor is that each invocation of IssueXHR creates a distinct scope-chain. This means that when the base response-handler is executed, the XMLHTTPRequest object in whose context it is being called is visible, as is the reference to the client's response handler. This allows the base response-handler to check the readyState and status members of the XMLHTTPRequest object, and then call the correct client response-handler (passing the XMLHTTPRequest object) on successful completion of the transaction.

Gratifyingly, this means the client's response handler can be reduced to just the code for manipulating the data returned by the server, and if client code is not interested in the data, or if no data is returned, then the handler can be a simple empty function (which is an instance of the Null Object design-pattern).


 var NumUsers_RequestObj = IssueXHR ("POST",
                                     ".../NumUsers.php",
                                     "",
                                     true,
                                     DisplayNumUsers);

 function IssueXHR (Method, URL, Data, ASync, ResponseHandler)
    {
    var XHRObj = ...   // Code for creating XMLHTTPRequest object

    XHRObj.open (Method, URL, ASync);

    XHRObj.onreadystatechange = function ()
       {
       if (XHRObj.readyState == 4)
          {
          if (XHRObj.status == 200) ResponseHandler (XHRObj);
          else                      throw ("Transaction complete but unsuccessful");
          }

       };

    XHRObj.send (Data);

    return XHRObj;

    }

 function DisplayNumUsers (XHRObj)
    {
    NumUsers_Elem.innerText = XHRObj.responseText;
    }
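
To see the distinct scope-chains at work outside a browser, the sketch below repeats the pattern with a stand-in for the request object. FakeXHR, its Complete method, the CreateXHR parameter and the two URLs are hypothetical, introduced only so that responses can be delivered by hand; the rest follows the listing above.

```javascript
// Stand-in for XMLHTTPRequest (an assumption, for demonstration purposes only)
function FakeXHR () { this.readyState = 0; this.status = 0; this.responseText = ""; }

FakeXHR.prototype.open     = function (Method, URL, ASync) { this.URL = URL; };
FakeXHR.prototype.send     = function (Data) { };
FakeXHR.prototype.Complete = function (Text)      // Simulates the server's reply
   {
   this.readyState   = 4;
   this.status       = 200;
   this.responseText = Text;
   this.onreadystatechange ();
   };

function IssueXHR (CreateXHR, Method, URL, Data, ASync, ResponseHandler)
   {
   var XHRObj = CreateXHR ();

   XHRObj.open (Method, URL, ASync);

   XHRObj.onreadystatechange = function ()
      {
      // XHRObj and ResponseHandler belong to the scope of this particular call
      if (XHRObj.readyState == 4 && XHRObj.status == 200) ResponseHandler (XHRObj);
      };

   XHRObj.send (Data);

   return XHRObj;
   }

var Log    = [];
var Create = function () { return new FakeXHR (); };

var A = IssueXHR (Create, "GET", "/a", "", true, function (X) { Log.push ("A:" + X.responseText); });
var B = IssueXHR (Create, "GET", "/b", "", true, function (X) { Log.push ("B:" + X.responseText); });

B.Complete ("second");   // The replies arrive in reverse order...
A.Complete ("first");    // ...yet each reaches its own handler
```

Each call to IssueXHR leaves behind its own XHRObj and ResponseHandler, which is why the two transactions cannot become confused, whatever the order in which they complete.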
      

On Balance

The reality of AJAX is that XMLHTTPRequest — given JSON, on-demand JavaScript, and dynamic style-sheet loading — is but one (albeit powerful) resource in web-based client-server development. It follows that it should be viewed in combination and on balance with these other techniques. Moreover, there are problems that developers will attack using XMLHTTPRequest that may be far more soluble using one or more of the related approaches.

Finally, and for non-UK readers who may be perplexed by the introduction to this piece: Ajax is one particular brand of bathroom/kitchen cleanser, as is the case with Flash, Cif (née Jif but changed to Cif in 2004 because the name was meaningless outside the UK), and Vim. Validated... virtual... I'll think of it soon.

References

[1] Image-Cookie technique
www.ashleyit.com/rs/rslite/

[2] A Profile of XML
Richard Vaughan
Objective View, Issue 4 (February 2000)
This document is available here on this site

[3] Principal JSON site
www.json.org

[4] Finding Binary Clones with Opstrings & Function Digests
Andrew Schulman
Dr. Dobb's Journal
July, August & September 2005
www.ddj.com

Copyright © Richard Vaughan 2006