Using JS code libraries
Getting the code into the code from outside the code
We have our object code all nice and neat now; see Args() and Query catching for fun and profit. It’s so useful that we can’t wait to use it in all our service scripts. Pasting all that code over and over into every little thing we try to write. Joy!
Reusability changes everything
We have run into a veritable tragedy of JavaScript. It has no way to import code into code.
To use new Query() and new Args() to create our objects, we have to include all the code for them and we have to put it at the top of any service script we write. That means we just pushed our real program code down a couple hundred lines. Worse, we’ve broken good coding standards by cutting and pasting the identical code into a couple dozen places.
Besides the fact that they totally clutter up the scripts they are in, the different copies of Query() and Args() code will now either drift apart or become a maintenance nightmare.
In theory this problem is easily surmounted, but we give up some of the gains we’ve made. While JavaScript can’t import JavaScript directly, HTML has no trouble importing all you want. So we have to go back to multiple scripts to import libraries.
As a sidenote, you don’t see this kind of thing getting much treatment in some of the better books like JavaScript: The Definitive Guide or JavaScript & DHTML Cookbook. The reason is, without being able to separate classes out of code, they’re just a pain to reuse. Maybe we have a way of abstracting out that difficulty.
Lost gains
<script type="text/javascript" src="http://elektrum.org/js/QueryObj.js"> </script>
<script type="text/javascript" src="http://elektrum.org/js/ArgsObj.js"> </script>
<script type="text/javascript" src="http://elektrum.org/js/fakeService.js?some=args;go=here"> </script>
fakeService.js can now use the objects defined in ArgsObj.js and QueryObj.js. The problems are obvious. We went from a single script back to multiple scripts, and not even just two anymore but three. This also counts on the client not misusing the scripts or leaving one out.
To account for the potential misuse we would have to wrap all our service code in try and catch blocks in case it’s going to throw exceptions from missing library code. Lame and not what we want if there’s any way to avoid it.
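For the record, here is a rough sketch of what that defensive wrapping might look like. The helper names missingLibraries() and startService() are made up for illustration; only the Query and Args constructor names come from this article. The scope is passed in as an argument so the sketch can run outside a browser; in a page it would be window.

```javascript
// Hypothetical helper: list the constructors that are absent from a scope.
// In a browser the scope would be `window`; a plain object works for testing.
function missingLibraries(scope, names) {
  var missing = [];
  for (var i = 0; i < names.length; i++) {
    if (typeof scope[names[i]] !== 'function') missing.push(names[i]);
  }
  return missing;
}

// Hypothetical service entry point: report the problem instead of throwing.
function startService(scope) {
  var missing = missingLibraries(scope, ['Query', 'Args']);
  if (missing.length > 0) {
    return 'not started; missing: ' + missing.join(', ');
  }
  try {
    var query = new scope.Query();
    var args = new scope.Args();
    // ...the real service code would use query and args here
    return 'started';
  } catch (e) {
    // a library was present but blew up while constructing
    return 'not started; library error: ' + e.message;
  }
}
```

Every service script would need some version of this boilerplate, which is exactly the clutter we are trying to get away from.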
Saved by the D to the HTML
Fortunately, we can squeak by again with a little ingenuity. As we already discussed, we can’t import code directly into code. We can import it into HTML. We can also write our HTML dynamically. Therefore, we can write the scripts we need into the HTML from a single master script.
It adds a layer of complexity for the service code but it keeps the client code simple. It’s back to a single script call. And the increased implementation complexity isn’t monstrous. We gain much more by being able to keep our library code, ArgsObj.js and QueryObj.js, in its own reusable files.
Now, instead of requiring the client to do this.
<script type="text/javascript" src="http://elektrum.org/js/QueryObj.js"> </script>
<script type="text/javascript" src="http://elektrum.org/js/ArgsObj.js"> </script>
<script type="text/javascript" src="http://elektrum.org/js/fakeService.js?some=args;go=here"> </script>
We can go back to this.
<script src="http://elektrum.org/js/fakeService.js?arg1=It%20works;arg2=Yay!" type="text/javascript"></script>
We just need an intermediary to write them all out. We’ll call our <script> writing script, fakeService.js, and move the original service code to realFakeService.js. That way the client only sees and uses the name fakeService.js.
fakeService.js
document.write('<script type="text/javascript" '
               + 'src="http://elektrum.org/js/QueryObj.js"></script>');
document.write('<script type="text/javascript" '
               + 'src="http://elektrum.org/js/ArgsObj.js"></script>');
document.write('<script type="text/javascript" '
               + 'src="http://elektrum.org/js/realFakeService.js"></script>');
And the script it calls with the third line of its output starts something like this.
realFakeService.js
// everything is imported already so we're good to go with our custom
// object classes
var query = new Query();
var args = new Args();
// ...and roll from there
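The three document.write() calls in fakeService.js could also be driven from a list, so that adding or removing a library is a one-line change. This is only a sketch: scriptTag() and writeScripts() are names invented here, and the write function is passed in so the code can be exercised without a browser.

```javascript
// Base URL and file names copied from the fakeService.js example.
var BASE = 'http://elektrum.org/js/';
var FILES = ['QueryObj.js', 'ArgsObj.js', 'realFakeService.js'];

// Build one <script> tag. The closing tag is split in two so this code
// stays safe if it is ever pasted inline into an HTML page.
function scriptTag(src) {
  return '<script type="text/javascript" src="' + src + '"><' + '/script>';
}

// `write` is whatever emits markup. Returns how many tags were written.
function writeScripts(write, base, files) {
  for (var i = 0; i < files.length; i++) {
    write(scriptTag(base + files[i]));
  }
  return files.length;
}
```

In a page, the call would look something like writeScripts(function (html) { document.write(html); }, BASE, FILES); the order of FILES matters because the service has to come after the libraries it uses.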
It’s never easy
Every solution spawns a new concern. Our trick for getting the currently executing script just went out the window because we’ve buried it in a stack of two or more scripts. In this specific case it’s four scripts deep.
1. The single script called by the client, fakeService.js, which writes out the <script> tags for the libraries and the actual service script, realFakeService.js. But it doesn’t have access to the arguments yet because ArgsObj.js isn’t loaded until fakeService.js is done executing.
2. The first library <script>, for QueryObj.js, written by #1.
3. The second library <script>, for ArgsObj.js, written by #1.
4. The service, realFakeService.js, written by #1, which now needs to get at the query string arguments left behind.
That means that the self-seeking code
var scripts = document.getElementsByTagName('script');
var myScript = scripts[ scripts.length - 1 ];
returns script #4 instead of #1 which is where the query string arguments are. This does get a little involved but isn’t too hard to fix. Here’s one way.
Find by depth
function findSelf ( depth ) {
  // default to the most recent script when no usable depth is given
  if ( ! ( depth > 0 ) ) depth = 1;
  var scripts = document.getElementsByTagName('script');
  var index = scripts.length - depth;
  var myScript = scripts[index];
  return myScript;
}
Which we’d then use like so.
Inside realFakeService.js
var myScript = findSelf(4);
That doesn’t really feel satisfactory. We’re spreading complexity across script boundaries which isn’t good. It also means you must have a count of how many scripts you’re writing out inside fakeService.js, and use that knowledge inside realFakeService.js. This is prone to error and update disconnect if a library script is added or removed but the depth setting isn’t changed.
There are always options
Find by recursion
function findSelf () {
  var scripts = document.getElementsByTagName('script');
  // start at the last script; scripts[scripts.length] would be undefined
  for ( var i = scripts.length - 1; i >= 0; --i ) {
    if ( scripts[i].src.match(/^[^\?]+\?/) ) return scripts[i];
  }
  // none has a query string, so default to most recently seen
  return scripts[ scripts.length - 1 ];
}
And that’s pretty close to the sweet spot. We could use it just the way we used our original and we can use it in an arbitrarily deep stack of imported scripts too. It does what we want regardless of the context.
It does have two assumptions built in.
1. There will be either one or no src with a query string or, if there are multiple invocations, each one has exactly one query string; if one didn’t and a previous one did, it would pick up the previous one, which is bad.
2. There will be no intervening, unrelated scripts with a query string in their src.
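To make the first assumption concrete, here is a small simulation that runs the same backwards scan over a plain array instead of the DOM. The names scanBackwards() and invocation() are made up for this sketch; each invocation() stands for the four <script> elements that one call to fakeService.js leaves in the page.

```javascript
// The backwards scan, but over a plain array of { src } objects so it can
// run outside a browser.
function scanBackwards(scripts) {
  for (var i = scripts.length - 1; i >= 0; i--) {
    if (/^[^?]+\?/.test(scripts[i].src)) return scripts[i];
  }
  // none has a query string, so default to the most recent one
  return scripts[scripts.length - 1];
}

var base = 'http://elektrum.org/js/';

// The four scripts one service call leaves behind, in document order.
function invocation(query) {
  return [
    { src: base + 'fakeService.js' + query },
    { src: base + 'QueryObj.js' },
    { src: base + 'ArgsObj.js' },
    { src: base + 'realFakeService.js' }
  ];
}

// The first call on the page has arguments; the second call has none.
var page = invocation('?arg1=first').concat(invocation(''));

// Run from the second call's realFakeService.js, the scan walks straight
// past its own argument-less caller and lands on the first call's script.
var found = scanBackwards(page);
```

Here found.src is the first call's URL, so the second call would quietly inherit arguments that were never meant for it.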
#2 is a safe assumption. We’re using this technique specifically so we never split our own scripts up so there will never be intervening scripts.
#1 is not a safe assumption. To fix #1, we have to move to a technique along these lines which lets us cache a note as to what we’ve already seen as we move along.
Recursive check with caching
function findSelf () {
  var scripts = document.getElementsByTagName('script');
  for ( var i = scripts.length - 1; i >= 0; i-- ) {
    if ( scripts[i].src.match(/^[^\?]+\?/)
         && scripts[i].innerHTML !== '//seen' ) {
      scripts[i].innerHTML = '//seen';
      return scripts[i];
    }
    scripts[i].innerHTML = '//seen';
  }
  // none has a query string, so default to most recently seen
  return scripts[ scripts.length - 1 ];
}
In this case, we mark those that have already been looked at as //seen so we won’t look at them again. We use innerHTML because we know it’s a real property (making up your own or using those outside the standard can blow up on you) and because scripts with a src ignore their innerHTML to start with. It doesn’t change anything if we mess with it or reset it.
The problem with the technique is the one that so often derails obvious or elegant JavaScript solutions to problems: browser compliance. About 50% of major browsers support setting the scripts[i].innerHTML. That’s obviously not good enough.
The algorithm is the right one, though, and we can use the same
approach with a different implementation.
findSelf(), take 4; caching + site checking
var _SeenScriptCache;
if ( ! _SeenScriptCache ) _SeenScriptCache = new Array();

function findSelf () {
  var scripts = document.getElementsByTagName('script');
  for ( var i = scripts.length - 1; i >= 0; i-- ) {
    // skip scripts served from other sites
    if ( ! scripts[i].src.match(/^http:\/\/elektrum\.org\//) ) continue;
    if ( scripts[i].src.match(/^[^\?]+\?/) && ! _SeenScriptCache[i] ) {
      _SeenScriptCache[i] = 1;
      return scripts[i];
    }
    _SeenScriptCache[i] = 1;
  }
  // none has a query string, so default to most recently seen
  return scripts[ scripts.length - 1 ];
}
We use a global cache in the array _SeenScriptCache to keep track of what scripts have already been checked. The reason this isn’t as good as the previous approach is it’s back in a global. We use the leading underscore in an attempt to protect the variable’s privacy. It’s much more elegant and robust to keep the cache in the objects themselves where the privacy of the scheme would be assured. Alas.
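To watch the cache do its job without a browser, here is a sketch that runs the same logic over plain arrays. Everything named here is a stand-in: scanWithCache() mirrors findSelf(), the seen array plays the part of _SeenScriptCache, and example.com is a made-up third-party host.

```javascript
// Cached backwards scan over a plain array of { src } objects.
function scanWithCache(scripts, seen, sitePrefix) {
  for (var i = scripts.length - 1; i >= 0; i--) {
    var src = scripts[i].src;
    if (src.indexOf(sitePrefix) !== 0) continue;  // not one of ours
    if (/^[^?]+\?/.test(src) && !seen[i]) {
      seen[i] = 1;
      return scripts[i];
    }
    seen[i] = 1;                                   // mark it seen anyway
  }
  // none has an unseen query string, so default to the most recent script
  return scripts[scripts.length - 1];
}

var site = 'http://elektrum.org/';

// The four scripts one service call leaves behind, in document order.
function invocation(query) {
  return [
    { src: site + 'js/fakeService.js' + query },
    { src: site + 'js/QueryObj.js' },
    { src: site + 'js/ArgsObj.js' },
    { src: site + 'js/realFakeService.js' }
  ];
}

var seen = [];
var page = invocation('?arg1=first');
var first = scanWithCache(page, seen, site);   // finds its own caller

// Later in the page: someone else's script, then a second call with no args.
page = page.concat([{ src: 'http://example.com/other.js?x=1' }]);
page = page.concat(invocation(''));
var second = scanWithCache(page, seen, site);  // falls through to the default
```

The second call skips the foreign script and the already-seen first caller, ends up at its own realFakeService.js, and correctly finds no arguments.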
We also added a sanity check.
if ( ! scripts[i].src.match(/^http:\/\/elektrum\.org\//) ) continue;
Without that, we might intercept incorrect arguments from services provided by other sites if they have a query string and our call doesn’t. Remember, the query string is never intrinsically tied to the script like it is with a regular HTTP request. We’re finding it in the DOM and if we’re not careful we’ll examine the wrong <script>.
Under Handling client PEBKAC in Pangrams in action, we might want to cross #3 off the list for good. Without arguments we should fail silently or with some kind of error feedback.
The src check has a hidden benefit as well. You don’t want bandwidth leeches breaking your service interface and skipping directly to your libraries. The libraries have to be open to any referrer to be able to work, just like the service. With the src check, the argument handling will fail for anyone else trying to use the library in scripts outside your own server.
This won’t work for every kind of library, but it’s a nice bit of gravy for this one. To protect others you would probably have to catch the misuse in your weblogs (e.g., libraries called by a page without the services they’re for) and ban the offending IPs.
The sweet spot
This function is really meant to be a method of the Args() object we developed. Once we’re back in a class, we have the ability to make variables private in a much better way. We’ll make the cache a class variable, available to all the Args() objects we create.
Args._findCaller(), a private class method
// a cache for use inside _findCaller()
if ( ! Args._SeenScriptCache ) Args._SeenScriptCache = new Array();
// -----------------------------------
Args.prototype._findCaller = function () {
  var scripts = document.getElementsByTagName('script');
  var rx = /^http:\/\/elektrum\.org/i;
  for ( var i = scripts.length - 1; i >= 0; i-- ) {
    var src = scripts[i].src;
    if ( ! src.match(rx) ) continue; // ignore other sites' scripts
    if ( src.match(/^[^\?]+\?/) && ! Args._SeenScriptCache[i] ) {
      Args._SeenScriptCache[i] = 1;
      return scripts[i];
    }
    else {
      Args._SeenScriptCache[i] = 1; // mark it seen anyway
    }
  }
  // none has a query string, so default to most recently seen
  return scripts[ scripts.length - 1 ];
};
See two full versions of the Args() class in the Appendix: Code for the Args() classes.
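For a feel of what happens once the right <script> is found, here is a rough sketch of pulling arguments out of its src. This is not the Args() class from the Appendix; parseSrcArgs() is a made-up name, and it assumes pairs are separated by ; (as in the example URLs in this article) or by &.

```javascript
// Hypothetical stand-in for argument parsing; not the Appendix's Args().
function parseSrcArgs(src) {
  var args = {};
  var q = src.indexOf('?');
  if (q === -1) return args;                  // no query string, no arguments
  var pairs = src.substring(q + 1).split(/[;&]/);
  for (var i = 0; i < pairs.length; i++) {
    if (pairs[i] === '') continue;            // tolerate stray separators
    var eq = pairs[i].indexOf('=');
    var key = eq === -1 ? pairs[i] : pairs[i].substring(0, eq);
    var value = eq === -1 ? '' : pairs[i].substring(eq + 1);
    // decodeURIComponent throws a URIError on malformed escapes
    args[decodeURIComponent(key)] = decodeURIComponent(value);
  }
  return args;
}

var parsed = parseSrcArgs(
  'http://elektrum.org/js/fakeService.js?arg1=It%20works;arg2=Yay!'
);
```

With the client URL from earlier, parsed.arg1 comes out as "It works" and parsed.arg2 as "Yay!". A src with no query string gives back an empty object.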
There is one apparent failure which is actually okay. If there are no arguments in the calling script, _findCaller() will seem to find the wrong script; namely realFakeService.js. Since there are no arguments in this script (again, the service writer’s responsibility), it’s irrelevant. The script was meant to get no arguments and though it looks in the wrong place ultimately, it still finds the correct number: none.
There is one assumption left for this to be reliable and it’s the service provider’s responsibility again. All services must use this same handler (or technique) or risk bleeding over into each other. They all need to go through our Args() class. The whole point of abstracting this class out into its own file was to make sure it was the only code used, so we are in the clear at last.
Is that clear, class?
Here’s where we stand. We have scripts writing scripts to import scripts that will be used by the ultimately written script and dig backwards through the DOM’s scripts to find the arguments in the original calling script’s src. If you’re not confused, you’re Don Knuth.
Don’t panic. This is really an exceptionally valuable technique. It’s worth picking up and spreading around. So, let’s take the time to walk through a new and code-complete example of how it can be done: Using JS code libraries, part 2.