
Understanding JavaScript Async by Hacking Dropbox


When online articles explain JavaScript async behaviour, most immediately gravitate towards the concepts of delayed evaluation, Promises, and async functions. While this offers value to the pragmatic programmer, it fails to explain just how a single-threaded programming language deals with immediacy and delayed execution. In fact, I only recently had the opportunity to truly delve into how most JavaScript runtimes execute asynchronous behaviour.

(Not all JavaScript implementations deal with asynchronous behaviour the same way; this becomes important for how we expect our code to execute.)

I spoke about async before at Lighthouse Labs, and learning the ins and outs proved a challenging task -- even for a self-taught, intermediate developer such as myself.

The Dropbox Example

I had a problem recently while using rclone: my data and all its corresponding folders accidentally got deleted when I ran:

rclone sync backups dropbox_main:

This rclone command effectively wipes all folders and files in my Dropbox, replacing them with the contents of my local backups folder (not good!) 😞.

So now I had all the previous contents of my Dropbox folder in the trash.


To fix this, I hopped on over to Dropbox's trash page, where I found a randomly-ordered list of all those files and folders I accidentally deleted.

A First Petty Solution

My first attempt involved a petty jQuery solution:

function clickBoxes() {
    jQuery(".mc-checkbox.mc-checkbox-unchecked").trigger("click");
}

I soon found myself repeating the menial task of pressing CTRL and UP. It only took a matter of minutes to realize just how many files I had managed to accidentally delete. I couldn't continue to perform this menial task. On to the first naive solution to mitigate that problem:

for (let i = 0; i < 16; i++) {
    clickBoxes();
}

But it didn't make a difference! Why?

The JavaScript Event Loop

In short: the JavaScript event loop.

Every time we click all the checkboxes, Dropbox's JavaScript client makes an AJAX request to fetch the next items in the infinite list of deleted files and folders. So even though we search for and click all the checkboxes 16 times, we don't give the Dropbox UI enough time to fetch the next files and folders in the list.
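To see why the loop cannot help, here is a minimal sketch that runs outside the browser. The names fakeClickBoxes and fakeFetchNextPage are hypothetical stand-ins for the checkbox click and for Dropbox's AJAX request, which is simulated here with a zero-delay timer. It suggests that every click in a synchronous loop happens before a single response can be processed:

```javascript
// Hypothetical stand-ins: no real Dropbox code or network request is involved.
const log = [];
let pagesLoaded = 0;

// Simulates the AJAX response; its callback can only run via the event loop.
function fakeFetchNextPage() {
    setTimeout(() => {
        pagesLoaded += 1;
        log.push("page loaded");
    }, 0);
}

// Simulates clicking the checkboxes, which triggers the next fetch.
function fakeClickBoxes() {
    log.push(`click (pages loaded so far: ${pagesLoaded})`);
    fakeFetchNextPage();
}

for (let i = 0; i < 3; i++) {
    fakeClickBoxes();
}

// The loop has finished, yet no response callback has had a chance to run.
console.log(log);
```

Every entry logged at this point is a click that saw zero loaded pages; the simulated responses only arrive once the synchronous code is done.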

Instead, we really want to wait for Dropbox's UI to load the next files and folders before we execute clickBoxes again. How do we do that?

setTimeout

setTimeout allows us to execute code later... kind of.

Let's take a look at the JavaScript event loop diagram.

[Figure: JavaScript event loop diagram]

By default, setTimeout tells the runtime to queue the call to clickBoxes until after all synchronous code has executed. This means that Dropbox's UI should theoretically have an opportunity to fetch the next folders and files in our trash. Since the delay defaults to 0, we can shorten setTimeout(clickBoxes, 0) to setTimeout(clickBoxes), omitting the second argument. Our new solution might look something like this:

for (let i = 0; i < 16; i++) {
    setTimeout(clickBoxes);
}

Running this in the console uncovers a strange behaviour of the JavaScript runtime: why don't the async requests execute as we would expect, now that we have moved clickBoxes to the callback queue?

To answer this question, we must look to the diagram again.

[Figure: event loop diagram]

Notice how the same delay I showcased earlier in the console doesn't happen. Because we have moved the calls for clickBoxes to the callback queue, Dropbox's subsequent AJAX request does happen, but not in time for any subsequent executions of clickBoxes. As a result, in the same way as before, all executions of clickBoxes still happen before we get a chance to load subsequent folders and files.
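The same ordering can be reproduced in a small sketch, under the assumption that a simulated response takes 20 ms (an arbitrary figure; real latency will differ). Here fakeClickBoxes is a hypothetical stand-in for clickBoxes. The zero-delay callbacks are expected to all fire before the first response arrives:

```javascript
// Hypothetical simulation: fakeClickBoxes stands in for clickBoxes, and a
// 20 ms timer stands in for the latency of Dropbox's AJAX response.
const events = [];

function fakeClickBoxes() {
    events.push("click");
    setTimeout(() => events.push("response"), 20);
}

// Queue three zero-delay callbacks, like setTimeout(clickBoxes) in the loop.
for (let i = 0; i < 3; i++) {
    setTimeout(fakeClickBoxes);
}

// Inspect the recorded order once everything has settled.
setTimeout(() => console.log(events.join(", ")), 100);
```

Moving the calls to the callback queue changes when they run, but not how closely together they run.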

So how do we give the UI a chance to load subsequent files and folders? In short, we need to add a delay long enough for the next page to load before each click. Turns out we'll have to use that second argument of setTimeout after all:

for (let i = 0; i < 16; i++) {
    setTimeout(clickBoxes, i * 1000); // 1000 = 1 second
}

At last, it works!
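The staggering can be observed in isolation with a rough sketch. It assumes 10 ms steps in place of the 1000 ms steps above, purely so that it finishes quickly; STEP_MS and firedAt are illustrative names. All the timers are scheduled up front by the loop, and each delay is measured from that moment rather than from the previous callback:

```javascript
// Hypothetical sketch: 10 ms steps stand in for the article's 1000 ms steps.
const STEP_MS = 10;
const start = Date.now();
const firedAt = [];

for (let i = 0; i < 4; i++) {
    // Each callback is scheduled roughly i * STEP_MS after the loop runs, so
    // the calls are spread out instead of firing back to back.
    setTimeout(() => firedAt.push(Date.now() - start), i * STEP_MS);
}

// Print the elapsed time (in ms) at which each callback fired.
setTimeout(() => console.log(firedAt), 100);
```

Timer delays are minimums rather than guarantees, so the printed values are approximate and will vary between runs.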

But now we only execute clickBoxes 16 times. That surely won't suffice for millions of files. To start with, a naive solution might involve increasing the number of times we execute clickBoxes:

for (let i = 0; i < 9000; i++) { // it's over 9000!
    setTimeout(clickBoxes, i * 1000); // 1000 = 1 second
}

However, what we really want is to keep running clickBoxes indefinitely.

To do that, we will need our only other friend: setInterval.

setInterval

setInterval allows us to execute a callback function at a specified interval. Thus, we can execute clickBoxes indefinitely. The second parameter specifies the interval, in milliseconds, at which the function should execute.

setInterval(clickBoxes, 1000); // every second, run `clickBoxes`

And voila! It works.

clearInterval

Just how long will we really want to run this, though? When we try to click on the restore button, the clickBoxes function still executes. We need a way to stop the clickBoxes function from executing -- temporarily.

Luckily for us, the clearInterval method allows us to stop a setInterval callback from being called.

We do this by passing in the ID (a positive integer returned from setInterval):

let repeatingClickID = setInterval(clickBoxes, 1000); // e.g. 59

Once we're ready to click that "restore" button, we can halt the clickBoxes from executing every second:

clearInterval(repeatingClickID);
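As a self-contained illustration of the pair, the following sketch uses a counter as a hypothetical stand-in for clickBoxes and a 10 ms interval in place of 1000 ms. It stops itself after three runs instead of waiting for us to type clearInterval by hand:

```javascript
// Hypothetical sketch: a counter stands in for clickBoxes, and 10 ms stands
// in for the article's 1000 ms interval.
let ticks = 0;

const intervalId = setInterval(() => {
    ticks += 1;
    if (ticks === 3) {
        // The ID returned by setInterval is all clearInterval needs.
        clearInterval(intervalId);
        console.log(`stopped after ${ticks} ticks`);
    }
}, 10);
```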

Automating File Restoration

While the solution described above works well, we can still automate the steps further so that we don't have to click on the restore button ourselves. How can we restore files in batches of 300? Also, it gets annoying to have to type each of these commands into our console.

To solve this problem, let's start with a function that lets us configure how many files we want to restore:

const restoreDropboxFiles = function(fileCount) {
    // place logic here
};

Leveraging Promises

With our newly-defined function we can now leverage Promises, incorporating our previous setInterval solution:

const selectFilesToRestore = function(fileCount) {
    const clickedBoxesPromise = new Promise((resolve, reject) => {
        const $fileCheckboxes = jQuery(".mc-checkbox.mc-checkbox-unchecked").trigger("click");
        return $fileCheckboxes.length > 0 ? resolve($fileCheckboxes) : reject(`No more files to recycle!  Got ${$fileCheckboxes.length} files!`);
    });
    return clickedBoxesPromise;
}

The above code creates a new Promise which resolves only when the page has files available to select (if at least 1 unchecked checkbox exists, we can resolve the Promise); otherwise it rejects.

We can check that this works by running it once (without setInterval) in our console like so:

selectFilesToRestore(10).then(() => alert("files restored!"));
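The resolve-or-reject pattern can also be tried outside the browser. In this sketch, selectItems is a hypothetical helper and a plain array stands in for the jQuery collection of unchecked checkboxes:

```javascript
// Hypothetical stand-in: an array of names replaces the jQuery collection of
// unchecked checkboxes.
function selectItems(items) {
    return new Promise((resolve, reject) => {
        if (items.length > 0) {
            resolve(items);
        } else {
            reject(new Error(`No more files to select! Got ${items.length} files.`));
        }
    });
}

// At least one item: the promise resolves with the selection.
selectItems(["a.txt", "b.txt"])
    .then((selected) => console.log(`selected ${selected.length} files`));

// No items: the promise rejects, so we handle the failure with catch.
selectItems([])
    .catch((error) => console.log(`rejected: ${error.message}`));
```

Rejecting with an Error object rather than a bare string is a design choice made for this sketch; it keeps a stack trace around for debugging.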

One problem, however: the function will indiscriminately click on every file checkbox on the page; we still haven't used fileCount. We need to have some way to check when we've restored the set amount of files.

We can do this with Array.prototype.slice; it allows us to generate a sub-array based on the length we provide (if the provided array is empty, and we try to access non-existent indices, we still get an empty array []).
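A quick sketch of that behaviour on plain arrays (the file names are made up for illustration):

```javascript
const files = ["a.txt", "b.txt", "c.txt"];

console.log(files.slice(0, 2));  // the first two entries
console.log(files.slice(0, 10)); // fewer than 10 exist, so all three come back
console.log([].slice(0, 10));    // nothing to take, so we get an empty array
console.log(files.length);       // slice returns a copy; the original is untouched
```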

Let's refactor our previous Promise-based solution of selectFilesToRestore to limit the selection with slice (jQuery collections provide their own .slice() method, which works the same way):

const selectFilesToRestore = function(fileCount) {
    const clickedBoxesPromise = new Promise((resolve, reject) => {
        const $fileCheckboxes = jQuery(".mc-checkbox.mc-checkbox-unchecked").slice(0, fileCount).trigger("click");
        return $fileCheckboxes.length > 0 ? resolve($fileCheckboxes) : reject(`No more files to recycle!  Got ${$fileCheckboxes.length} files!`);
    });
    return clickedBoxesPromise;
}

Great! Now we can provide a number of files to check, but we still have to restore them. Let's create another Promise-based function to do this:

const restoreButtonSelector = ".restore-button";
const restoreButtonClicked = function() {
    return jQuery(restoreButtonSelector).trigger("click").promise();
}

The promise() method provided by jQuery returns a jQuery Promise object (a read-only view of jQuery's Deferred, similar to the native Promise) that resolves once any actions queued on the matched elements, such as animations, have finished. Assuming nothing is queued on the restore button, the promise resolves right after we trigger the click.

Now, we can use our restoreButtonClicked in conjunction with selectFilesToRestore to generate our final solution:

const restoreButtonSelector = ".restore-button";
const restoreButtonClicked = function() {
    return jQuery(restoreButtonSelector).trigger("click").promise();
}

const selectFilesToRestore = function(fileCount) {
    const clickedBoxesPromise = new Promise((resolve, reject) => {
        const $fileCheckboxes = jQuery(".mc-checkbox.mc-checkbox-unchecked").slice(0, fileCount).trigger("click");
        return $fileCheckboxes.length > 0 ? resolve($fileCheckboxes) : reject(`No more files to recycle!  Got ${$fileCheckboxes.length} files!`);
    });
    return clickedBoxesPromise.then(() => restoreButtonClicked());
}

We now have a working solution to restore files!

However, this solution only works for each page. How can we implement this solution to restore files on multiple pages?

To do that, we'll need to combine everything we've learned so far about async.

Combining setInterval with Promises

How can we chain Promises within a setInterval? Our first attempt might look something like this:

const restoreDropboxFiles = function(fileCount) {
    let filesDeletedSoFar = 0;
    const selectFilesInterval = setInterval(function() {
        const selectFilesToRestorePromise = selectFilesToRestore(fileCount).then((checkedFiles) => {
            filesDeletedSoFar += checkedFiles.length;
            if (filesDeletedSoFar >= fileCount) {
                // Stop the interval by its ID once the restore button has been clicked.
                return restoreButtonClicked().then(() => clearInterval(selectFilesInterval));
            }
        });
    }, 1000);
};

Success! Each second, we select more files to be restored, finally clicking on the restore button once we have selected enough. After that button gets clicked, we call clearInterval(). Note that you can call clearInterval from within a setInterval callback.
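For experimenting without the Dropbox page, here is a rough, DOM-free sketch of the same idea. Everything in it is a hypothetical stand-in: fakeSelectFiles takes items from an in-memory list instead of clicking checkboxes, fakeRestore replaces the restore button, and the 10 ms interval replaces the 1000 ms one. Unlike the version above, it clears the interval before restoring and also stops when a selection is rejected; those are choices made for the sketch, not something the original code does:

```javascript
// Hypothetical stand-ins for the jQuery-based helpers.
const pendingFiles = ["a", "b", "c", "d", "e"];

function fakeSelectFiles(batchSize) {
    const batch = pendingFiles.splice(0, batchSize);
    return batch.length > 0
        ? Promise.resolve(batch)
        : Promise.reject(new Error("No more files to select!"));
}

function fakeRestore() {
    return Promise.resolve("restored");
}

function restoreFiles(fileCount, batchSize, intervalMs) {
    return new Promise((resolve, reject) => {
        let selectedSoFar = 0;
        const intervalId = setInterval(() => {
            fakeSelectFiles(batchSize)
                .then((batch) => {
                    selectedSoFar += batch.length;
                    if (selectedSoFar >= fileCount) {
                        // Enough files selected: stop polling, then restore.
                        clearInterval(intervalId);
                        return fakeRestore().then(() => resolve(selectedSoFar));
                    }
                })
                .catch((error) => {
                    // Stop polling on failure instead of leaving the interval
                    // running with an unhandled rejection.
                    clearInterval(intervalId);
                    reject(error);
                });
        }, intervalMs);
    });
}

restoreFiles(4, 2, 10).then((count) => console.log(`restored ${count} files`));
```

Wrapping the interval in a Promise lets the caller know when the whole run has finished, which the console version has no way of reporting.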

Introducing async, await

We can further tidy the solution by reducing the amount of .then chaining, placing our code in-line so that we can reason about it as if it were synchronous. We can do this through JavaScript's async and await keywords:

const restoreDropboxFiles = function(fileCount) {
    let filesDeletedSoFar = 0;
    const selectFilesInterval = setInterval(async function() {
        // await unwraps the promise, giving us the selected checkboxes directly
        const selectedFiles = await selectFilesToRestore(fileCount);
        filesDeletedSoFar += selectedFiles.length;
        if (filesDeletedSoFar >= fileCount) {
            await restoreButtonClicked();
            // stop the interval by the ID that setInterval returned
            clearInterval(selectFilesInterval);
        }
    }, 1000);
};
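One detail worth sketching: inside an async function, a rejected promise surfaces as a thrown error at the await, so try/catch can handle the "no more files" case. The example below is a DOM-free approximation with hypothetical stand-ins (fakeSelectFiles, an in-memory queue, a 10 ms interval), not the Dropbox code itself:

```javascript
// Hypothetical stand-ins: an in-memory list replaces the page's checkboxes.
const queue = ["a", "b", "c"];

async function fakeSelectFiles(batchSize) {
    const batch = queue.splice(0, batchSize);
    if (batch.length === 0) {
        // Throwing inside an async function rejects the returned promise.
        throw new Error("No more files to select!");
    }
    return batch;
}

const outcome = [];
let selectedSoFar = 0;

const intervalId = setInterval(async () => {
    try {
        const batch = await fakeSelectFiles(2);
        selectedSoFar += batch.length;
        outcome.push(`selected ${selectedSoFar}`);
    } catch (error) {
        // The rejection arrives here as an ordinary exception.
        clearInterval(intervalId);
        outcome.push(`stopped: ${error.message}`);
        console.log(outcome.join("; "));
    }
}, 10);
```

Without the try/catch, the rejection would go unhandled and the interval would keep firing.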

Conclusion

All in all, asynchronous behaviour in JavaScript is predictable because the language operates on a single thread. Though implementations vary from browser to browser, we can take confidence in understanding how our program flow gets interpreted by the JavaScript engine (V8, SpiderMonkey, etc.). Understanding how this works not only allows us to write concise code, but also gives us assurance as to how our code will execute, helping us to tidy up loose ends and to remove potentially unreachable code.

In many ways, our adventures in JavaScript async have closely resembled what we may have done while writing acceptance or integration tests (a la QUnit, Protractor, etc.). While the examples above do not specifically test code, we can use the same techniques with async, await, and setInterval to improve our code quality and readability.