Asynchronous Web Programming Pitfalls

For the past couple of years we have been actively involved in building web application security testing tools written primarily in web technologies. Given the complex nature of some of these applications, we've learned a few things along the way about asynchronous programming and about what you should and should not do, especially when designing complex HTML5 apps such as games or, as in our case, web application security scanners.

Why Asynchronous Programming

The rise of modern Web technologies forced the development community to change old habits and embrace new programming models and paradigms. Today on the Web we talk about asynchronicity - a programming model deeply integrated into the fabric of dynamic web applications. You may be familiar with events and the browser event model, and it is probably not news that virtually every programmable component the browser exposes is meant to be event-driven. You can easily check this by creating a loop such as while (1); and seeing what the browser does. This is a blocking loop: your tab or browser will hang upon execution and, in some cases, the whole machine may become unresponsive. Obviously your browser is not designed to handle this.

The reason for this behaviour is that the browser makes web applications look and feel single-threaded from the programmer's point of view. Everything is executed sequentially, and the more time a block of code needs to execute, the more your app will lag and the longer it won't be able to respond to other events. So you need threads?!
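To make the single-threaded behaviour concrete, here is a minimal sketch. It uses a bounded 50 ms busy-wait rather than while (1); so that it actually terminates; the variable names are ours and purely illustrative. A timer scheduled for "as soon as possible" still cannot fire until the loop lets go of the thread.

```javascript
// Schedule a callback to run as soon as the event loop is free.
let timerFired = false;
setTimeout(function () {
  timerFired = true;
}, 0);

// Block the thread for roughly 50 ms (a bounded stand-in for while (1);).
const start = Date.now();
while (Date.now() - start < 50) {
  // busy-wait: no event handler or timer can run while we are in here
}

// The timer was due long ago, but its callback has not had a chance to run.
const firedDuringLoop = timerFired;
console.log("timer fired during the loop:", firedDuringLoop);
// prints: timer fired during the loop: false
```

The callback only runs once the current script has finished, which is exactly why long-running code makes the page feel frozen.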

The only thing that closely resembles a thread, as we know it from more general-purpose programming languages, is the Worker object. While web workers are great, they are sometimes not the right solution. You see, Workers are almost like a backwards-compatibility mechanism for people who are used to writing applications in synchronous, blocking mode. I am not saying that workers are bad and you should not use them. On the contrary, they have their own strengths and are very useful, but just like any other tool you need to know where and how to use them.

Going back to the subject of asynchronous programming, there is really one way to write modern web applications and that is through events. Simply put, you have a lot of events, they all fire at unspecified times in the future, and you need to handle them appropriately.

Getting To Know Asynchronous Programming

Unfortunately, dealing with complex async systems is... well, complex. You cannot just hook functions on some events and expect them to be performant enough for the requirements of your application. What I want to draw your attention to is a collection of techniques and principles we've used across our web application security toolkit to make it as performant as it is today. Bear in mind that we are not using Workers.

The rest of the post is about what to do or not to do in order to create performant and complex async applications. Take all of these recommendations with a pinch of salt because while they work for us they may not be applicable to you. Here they are.

Use Arrays With Care

Don't use arrays. OK, use arrays, but only when you know their exact size and type. C and Java arrays always have a predefined size and are type-bound, which is what makes them efficient. In JavaScript you don't have such constraints and arrays are often abused. Simply put, you can create an array that grows without limit and is completely mutable. This is pretty much a recipe for disaster. Not only can you quickly run into memory problems but, if you leak, you will also end up filling memory with useless data. On top of that, arrays are most of the time used with loops, which is another thing to keep in mind and which I will discuss later.

Use arrays only when you know the number of items and the type of values they will store. For example, you can create an array of 10 items when you know that there are exactly 10 items in there and that's all. More importantly, those 10 items must be of one specific type. Don't mix strings with numbers, booleans with objects, etc. Not only will your code be harder to understand, it will also be more error-prone.

If you like, you can create some kind of pseudo-array object that mimics the traditional array structure while enforcing some kind of usage policy, e.g. new MyArray(20);.
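One way such a pseudo-array could look is sketched below. Only the name MyArray comes from the text above; the second constructor argument (a typeof string) and the get/set/checkIndex methods are our own assumptions for illustration, not the API of our toolkit.

```javascript
// Hypothetical fixed-size, single-type container.
// `type` is a typeof string such as "number" or "string" (our assumption).
function MyArray(size, type) {
  if (!Number.isInteger(size) || size < 0) {
    throw new RangeError("size must be a non-negative integer");
  }
  this.size = size;
  this.type = type;
  this.items = new Array(size); // never grows after construction
}

// Reject any index outside 0..size-1 instead of silently growing.
MyArray.prototype.checkIndex = function (index) {
  if (!Number.isInteger(index) || index < 0 || index >= this.size) {
    throw new RangeError("index out of bounds: " + index);
  }
};

// Store a value only if it has the declared type.
MyArray.prototype.set = function (index, value) {
  this.checkIndex(index);
  if (typeof value !== this.type) {
    throw new TypeError("expected " + this.type + ", got " + typeof value);
  }
  this.items[index] = value;
};

MyArray.prototype.get = function (index) {
  this.checkIndex(index);
  return this.items[index];
};

var portList = new MyArray(3, "number");
portList.set(0, 80);
portList.set(1, 443);
// portList.set(3, 8080)  -> throws RangeError (no silent growth)
// portList.set(2, "ssh") -> throws TypeError (no mixed types)
```

For purely numeric data, the built-in typed arrays such as Uint8Array or Float64Array already have a fixed length and a fixed element type, so a wrapper like this is mostly useful for strings and objects.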

Minimize The Number of Lists

This is related to the topic of arrays. JavaScript arrays are really mutable lists. Using lists is OK, but don't overdo it. In our projects we tend to have many fixed-size arrays and a few global lists, which may contain a lot of entries. We find that this approach is satisfactory and efficient.

You cannot really program without arrays and lists, but use them with care. For example, use fixed-size arrays when you want to store a small number of items of a specific type, and use lists for storing a large number of items of a specific type, but use lists as little as possible because they take up memory.

Use Loops with Care

We are going back to the fixed-size principle. Simply put, the more iterations you need to perform, the longer it will take for a loop to complete - a no-brainer. Given that there is no threading model, there is nothing to interrupt the loop in the middle in order to perform another task. Therefore, your loops need to be size-bound, and that size needs to be reasonable with respect to the operation that you have to perform inside the loop.

In our code we tend to use Iterators a lot, which we find more manageable because, unlike loops, they can also be used in an event-driven mode. JavaScript has generators (via the yield keyword), which sort of convert traditional loops into Iterators, but I haven't seen this being used much in the wild. I am sure that will change in the future.

Again, use loops with great care and only when you know the maximum number of iterations and, based on that, you know that the loop won't block the browser for too long. If possible, do not use loops at all and just use Iterators in event-driven mode, e.g. onSomething: function () { this.iterator.next(); /* etc. */ }.
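A minimal sketch of the onSomething idea, using a generator as the iterator. The names pathIterator and task, and the sample paths, are hypothetical. Each event advances the work by exactly one step and then returns control to the browser.

```javascript
// Generator that hands out one unit of work per call to next().
function* pathIterator(paths) {
  for (const path of paths) {
    yield path;
  }
}

const task = {
  iterator: pathIterator(["/login", "/search", "/admin"]),
  processed: [],
  finished: false,

  // Event handler: performs a single step, never a whole loop.
  onSomething: function () {
    const step = this.iterator.next();
    if (step.done) {
      this.finished = true;
      return;
    }
    this.processed.push(step.value);
  }
};

// In a real page these calls would come from event handlers.
// Here four events are simulated by calling the handler directly.
task.onSomething();
task.onSomething();
task.onSomething();
task.onSomething(); // iterator is exhausted, so this marks the task finished
```

Between any two of those calls the browser is free to handle other events, which is the point of trading the loop for an iterator.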

Mutable Objects Will Fail You

You cannot go without mutable objects, but in an environment where everything is mutable things are likely to go very badly. Immutable objects are usually used to fight threading problems, and you may be wondering why you should use them in an environment where there are no threads. Well, while there are no threads to mess with your memory, there are plenty of events that can put an object out of sync. There are also browser bugs, which we have seen plenty of. Also, events can come in all kinds of order and sometimes in ways we have never anticipated. Worse, these bugs are very complex to trace and sometimes almost impossible to kill.

Immutable objects give you a safety net against race conditions and other scenarios that put your data structures out of sync.
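As a rough sketch of what this can look like in plain JavaScript, the standard Object.freeze call can be used to produce snapshots that late or out-of-order event handlers cannot modify. The function names and fields below are hypothetical.

```javascript
// Build an immutable snapshot of some state.
function makeScanState(url, status) {
  return Object.freeze({ url: url, status: status });
}

// "Changing" the state means creating a new snapshot.
function withStatus(state, status) {
  return makeScanState(state.url, status);
}

const pendingState = makeScanState("/login", "pending");
const doneState = withStatus(pendingState, "done");

// A stray handler that tries to modify the old snapshot has no effect:
// the write is ignored in sloppy mode and throws TypeError in strict mode.
try {
  pendingState.status = "corrupted";
} catch (error) {
  // strict mode only; the snapshot is still intact
}

console.log(pendingState.status, doneState.status);
// prints: pending done
```

Note that Object.freeze is shallow: nested objects need to be frozen as well if they must not change.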

Handle Exceptions Appropriately

First of all, throw exceptions as soon as you can. Do not trap exceptions just so that you don't have to write another try/catch block. Don't be a lazy developer. Trap exceptions only when the code block you are writing should not fail under any circumstances, for example some kind of loop that checks object states and where you want to make sure that you filter out everything that is bad. Most of the time you want the exception to escalate. This makes debugging much easier.

If you have to trap the exception, always log it. If you expect an error to occur but you don't need to write an exception handling block, log it. You will be surprised how many bugs you will be able to catch when you have good error logging practices. Some events will never raise exceptions high enough for you to see them.
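The sketch below illustrates both rules under some assumptions of ours: parsePort fails fast by throwing immediately, while the filtering loop in validPorts is the kind of place where trapping is reasonable, and every trapped exception is logged. The function names, the in-memory errorLog, and the sample input are hypothetical.

```javascript
// Keep a record of every trapped exception in addition to printing it.
const errorLog = [];
function logError(context, error) {
  errorLog.push({ context: context, message: error.message });
  console.error("[" + context + "]", error.message);
}

// Fail fast: throw as soon as the input is known to be bad.
function parsePort(value) {
  const port = Number(value);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new RangeError("invalid port: " + value);
  }
  return port;
}

// A filtering loop must not fail as a whole, so it traps per item,
// but it never swallows an exception silently.
function validPorts(values) {
  const result = [];
  for (const value of values) {
    try {
      result.push(parsePort(value));
    } catch (error) {
      logError("validPorts", error);
    }
  }
  return result;
}

const openPorts = validPorts(["80", "abc", "443", "70000"]);
// openPorts is [80, 443]; errorLog holds two entries
```

Everywhere else, parsePort is called without a try/catch so that the exception escalates to where it can be noticed.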

Don't Use Plain Objects

Plain objects look like this: var myObj = {};. Don't use them. Use proper objects instead, e.g. function MyObject() {} var myObj = new MyObject();. Not only will this make your code more readable and manageable, but in our experience it also helps prevent memory leaks that you never anticipated. This works extremely well with immutable objects in order to make your code very efficient.

Object encapsulation is very important, especially in big and complex systems. Asynchronous programming feels like juggling ten balls, all of which are in the air at the same time. Plain objects may be easier and quicker, but we find them to be error-prone and they often lead to complex problems.
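A small sketch of what a "proper object" buys you over {}: the constructor is the single place where the shape of the object is defined and checked. The Finding type, its fields, and the severity list are hypothetical examples, not taken from our tools.

```javascript
// Constructor: the only way to create a Finding, so every instance
// is guaranteed to have the same, validated shape.
function Finding(url, severity) {
  if (typeof url !== "string") {
    throw new TypeError("url must be a string");
  }
  if (Finding.SEVERITIES.indexOf(severity) === -1) {
    throw new RangeError("unknown severity: " + severity);
  }
  this.url = url;
  this.severity = severity;
  Object.freeze(this); // combine with immutability, as discussed earlier
}

Finding.SEVERITIES = ["low", "medium", "high"];

// Behaviour lives on the prototype, next to the data it works on.
Finding.prototype.isHigh = function () {
  return this.severity === "high";
};

const adminFinding = new Finding("/admin", "high");
console.log(adminFinding instanceof Finding, adminFinding.isHigh());
// prints: true true
```

With a plain object, a typo such as { url: "/admin", severty: "high" } would go unnoticed until much later; here the mistake surfaces at construction time.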

Use the Observer Pattern

The observer pattern is awesome because it gives you the ability to chain multiple operations while still keeping your code clean and encapsulated. In other words, it is a bit like a blocking event system. Because of this, use the observer pattern with great care. For example, parts of our web application security testing engine contain lists of 2-3 observers whose execution is chained. We really need this, and that is why we have it. However, the most important observer implementations are used with one observer object only: for example, objectA -> setObserver(objectB) instead of objectA -> addObserver(objectB). Yes, it is difficult to think with so many constraints in mind, but it is worth it because you will avoid a lot of complex bugs.
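The single-observer variant can be sketched as follows. Only setObserver comes from the text above; the Scanner and Reporter types and the onFinding callback name are hypothetical.

```javascript
// Subject with exactly one observer slot instead of a growing list.
function Scanner() {
  this.observer = null;
}

// Setting an observer replaces the previous one.
Scanner.prototype.setObserver = function (observer) {
  this.observer = observer;
};

// Notify the observer, if there is one.
Scanner.prototype.report = function (finding) {
  if (this.observer !== null) {
    this.observer.onFinding(finding);
  }
};

function Reporter() {
  this.findings = [];
}

Reporter.prototype.onFinding = function (finding) {
  this.findings.push(finding);
};

const scanner = new Scanner();
const firstReporter = new Reporter();
const secondReporter = new Reporter();

scanner.setObserver(firstReporter);
scanner.setObserver(secondReporter); // replaces firstReporter
scanner.report("XSS at /search");
// only secondReporter receives the finding
```

Because there is at most one observer, there is no notification order to reason about and no forgotten listener keeping an object alive.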

Find An Event Spinner

Since you cannot write long-lasting loops, you need to find an event mechanism to spin the code in your main application engine. Let me explain. Most GUI frameworks start with what is known as the event loop, which handles all events that occur during the application's execution. In web applications you do not need to write one because there is already an event loop in place. What you need instead is an event source that fires frequently enough to make your code execute in real time, but not so frequently that it slows down the application. The last thing you need is for your event queue to fill up with unanswered events, because you won't be able to cancel them.

In our security testing engines we rely on the events fired by the XMLHttpRequest object. We have a complex scheduling system, which spins the rest of the application when an XMLHttpRequest event is received. In other words, our apps are anchored to the events of the XMLHttpRequest, and the rest of the operations are broken down to execute in tiny fractions when events are received. This has the effect of making our applications look and feel real-time without using any Workers, timers, etc.

So, find your event spinner and design the rest of your application around it.
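A heavily simplified sketch of the idea follows; it is not our actual scheduler. The Scheduler type, its tasksPerSpin limit, and the sample tasks are assumptions made for illustration. Work is queued as small tasks, and every incoming event runs only a bounded slice of them.

```javascript
// Queue of small tasks; each spin runs at most `tasksPerSpin` of them.
function Scheduler(tasksPerSpin) {
  this.queue = [];
  this.tasksPerSpin = tasksPerSpin;
}

Scheduler.prototype.add = function (task) {
  this.queue.push(task);
};

// Run a small, bounded slice of pending work and return immediately.
Scheduler.prototype.spin = function () {
  for (let i = 0; i < this.tasksPerSpin && this.queue.length > 0; i++) {
    const task = this.queue.shift();
    task();
  }
};

const scheduler = new Scheduler(2);
const squares = [];
for (let n = 1; n <= 5; n++) {
  scheduler.add(function () {
    squares.push(n * n);
  });
}

// In a browser the spinner would be driven by real events, for example:
//   xhr.addEventListener("readystatechange", function () { scheduler.spin(); });
// Here two events are simulated by calling spin() directly.
scheduler.spin();
scheduler.spin();
// squares is now [1, 4, 9, 16]; one task is still waiting for the next event
```

The tasksPerSpin limit is what keeps each event handler short; tuning it is the trade-off between throughput and responsiveness described above.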

Don't Use JavaScript

The hint is in the title. JavaScript is a scripting language and it is fine for scripting things. "Scripting" is the important keyword. Don't use any other scripting language either.

The very first version of our web application security scanner was written from scratch entirely in JavaScript. There were bugs that we could not trace. Exception handling was sometimes really bad, and if you use APIs that you are not familiar with, you will end up with strange bugs and problems you have never anticipated. Unit testing helped a little, but we were still facing some complex issues.

So, we ended up creating our own static compiler, which works very well for async programming. It took us 3 months to develop it but today we use it to compile our code for any platform and this is what makes our technology unique and really efficient. The static compiler takes care of a lot of the problems I have highlighted above.

You are probably asking right now if we have plans to release this compiler. You see, this compiler is not very useful for anything other than writing security tools, so it would be useless to you. However, it is really not that difficult to make your own compiler, and the tools to help you are already out there. You just need to find your own way to solve this problem.

Program With Constraints In Mind

I cannot stress this enough, so I will keep it simple. Find an automatic way to eliminate all bad coding practices from your application. We use our own compiler for this, but you can probably come up with something better.

Summary

I hope that you find this article helpful, and I am willing to explore some of these topics further on this blog. I think that we've got a lot to contribute on this subject, so it will be good karma to share our experience with the rest of the world.

Let us know if this is something that would be interesting to you.