Tuesday, February 21, 2017

Difference between "var f = function" and "function f" in Javascript

Can you tell me what is the difference between these to pieces of code?
var f=function() {
  alert('f u!');

function f() {
  alert('f u!');

It's only one I can think of and with good code hygiene it is one that should never matter. It is related to 'hoisting', or the idea that in Javascript a variable declaration is hoisted to the top of the function or scope before execution. In other words, I can see there is a function f before the declarations above. And now the difference: 'var f = function' will have its declaration hoisted, but not its definition. f will exist at the beginning of the scope, but it will be undefined; the 'function f' format will have both declaration and definition hoisted, so that at the beginning of the scope you will have available the function for execution.

Sunday, February 19, 2017

Promises in Javascript


I am not the authoritative person to go to for Javascript Promises, but I've used them extensively (pardon the pun) in Bookmark Explorer, my Chrome extension. In short, they are a way to handle methods that return asynchronously or that for whatever reason need to be chained. There are other ways to do that, for example using Reactive Extensions, which are newer and in my opinion better, but Rx's scope is larger and that is another story altogether.

Learning by examples

Let's take a classic and very used example: AJAX calls. The Javascript could look like this:
function get(url, success, error) {
    var xhttp = new XMLHttpRequest();
    xhttp.onreadystatechange = function () {
        if (this.readyState == 4) {
            if (this.status == 200) {
            } else {
    xhttp.open("GET", url, true);

get('/users',function(users) {
    //do something with user data
},function(status) {
    alert('error getting users: '+status);

It's a stupid example and not by far complete, but it shows a way to encapsulate an AJAX call into a function that receives an URL, a handler for success and one for error. Now let's complicate things a bit. I want to get the users, then for each active user I want to get the document list, then return the ones that contain a string. For simplicity's sake, let's assume I have methods that do all that and receive a success and an error handler:
var search = 'some text';
var result = [];
getUsers(function (users) {
    .filter(function (user) {
        return user.isActive();
    .forEach(function (user) {
        getDocuments(user, function (documents) {
            result = result.concat(documents.filter(function (doc) {
                        return doc.text.includes(text);
        }, function (error) {
            alert('Error getting documents for user ' + user + ': ' + error);
}, function (error) {
    alert('Error getting users: ' + error);

It's already looking wonky. Ignore the arrow anti pattern, there are worse issues. One is that you never know when the result is complete. Each call for user documents takes an indeterminate amount of time. This async programming is killing us, isn't it? We would have much better liked to do something like this instead:
var result = [];
var search = 'some text';
var users = getUsers();
.filter(function (user) {
    return user.isActive();
.forEach(function (user) {
    var documents = getDocuments(user);
    result = result.concat(documents.filter(function (doc) {
                return doc.text.includes(text);

First of all, no async, everything is deterministic and if the function for getting users or documents fails, well, it can handle itself. When this piece of code ends, the result variable holds all information you wanted. But it would have been slow, linear and simply impossible in Javascript, which doesn't even have a Pause/Sleep option to wait for stuff.

Now I will write the same code with methods that use Promises, then proceed on explaining how that would work.
var result = [];
var search = 'some text';
var userPromise = getUsers();
userPromise.then(function (users) {
    var documentPromises = users
        .filter(function (user) {
            return user.isActive();
        .map(function (user) {
            return getDocuments(user);
    var documentPromise = Promise.all(documentPromises);
    .then(function (documentsArray) {
        documentsArray.forEach(function (documents) {
            result = result.concat(documents.filter(function (doc) {
                        return doc.text.includes(search);
        // here the result is complete
    .catch (function (reason) {
        alert('Error getting documents:' + reason);
Looks more complicated, but that's mostly because I added some extra variables for clarity.

The first thing to note is that the format of the functions doesn't look like the async callback version, but like the synchronous version: var userPromise=getUsers();. It doesn't return users, though, it returns the promise of users. It's like a politician function. This Promise object encapsulates the responsibility of announcing when the result is actually available (the .then(function) method) or when the operation failed (the .catch(function) method). Now you can pass that object around and still use its result (successful or not) when available at whatever level of the code you want it.

At the end I used Promise.all which handles all that uncertainty we were annoyed about. Not only does it publish an array of all the document getting operations, but the order of the items in the array is the same as the order of the original promises array, regardless of the time it took to execute any of them. Even more, if any of the operations fails, this aggregate Promise will immediately exit with the failure reason.

To exemplify the advantages of using such a pattern, let's assume that getting the users sometimes fails due to network errors. The Internet is not the best in the world where the user of the program may be, so you want to not fail immediately, instead retry the operation a few times before you do. Here is how a getUsersWithRetry would look:
function getUserWithRetry(times, spacing) {
    var promise = new Promise(function (resolve, reject) {
            var f = function () {
                .catch (function (reason) {
                    if (times <= 0) {
                    } else {
                        setTimeout(f, spacing);
    return promise;

What happens here? First of all, like all the get* methods we used so far, we need to return a Promise object. To construct one we give it as a parameter a function that receives two other functions: resolve and reject. Resolve will be used when we have the value, reject will be used when we fail getting it. We then create a function so that it can call itself and in it, we call getUsers. Now, if the operation succeeds, we will just call resolve with the value we received. The operation would function exactly like getUsers. However, when it fails, it checks the number of times it must retry and only fails (calling reject) when that number is zero. If it still has retries, it calls the f function we defined, with a timeout defined in the parameters. We finally call the f function just once.

Here is another example, something like the original get function, but that returns a Promise:
function get(url) {
  // Return a new promise.
  return new Promise(function(resolve, reject) {
    // Do the usual XHR stuff
    var req = new XMLHttpRequest();
    req.open('GET', url);

    req.onload = function() {
      // This is called even on 404 etc
      // so check the status
      if (req.status == 200) {
        // Resolve the promise with the response text
      else {
        // Otherwise reject with the status text
        // which will hopefully be a meaningful error

    // Handle network errors
    req.onerror = function() {
      reject(Error("Network Error"));

    // Make the request
Copied it like a lazy programmer from JavaScript Promises: an Introduction.


An interesting thing to remember is that the .then() method also returns a Promise, so one can do stuff like get('/users').then(JSON.parse).then(function(users) { ... }). If the function called by .then() is returning a Promise, then that is what .then() will return, allowing for stuff like someOperation().then(someOtherOperation).catch(errorForFirstOperation).then(handlerForSecondOperation). There is a lot more about promises in the Introduction in the link above and I won't copy/paste it here.

The nice thing about Promises is that they have been around in the Javascript world since forever in various libraries, but only recently as native Javascript objects. They have reached a maturity that was tested through the various implementations that led to the one accepted today by the major browsers.

Promises solve some of the problems in constructing a flow of asynchronous operations. They make the code cleaner and get rid of so many ugly extra callback parameters that many frameworks used us to. This flexibility is possible because in Javascript functions are first class members, meaning they can be passed around and manipulated just like any other parameter type.

The disadvantages are more subtle. Some parts of your code will be synchronous in nature. You will have stuff like a=sum(b,c); in functions that return values. Suddenly, functions don't return actual values, but promised values. The time they take to execute is also unclear. Everything is in flux.


I hope this has opened your eyes to the possibilities of writing your code in a way that is both more readable and easy to encapsulate. Promises are not limited to Javascript, many other languages have their own implementations. As I was saying in the intro, I feel like Promises are a simple subcase of Reactive Extensions streams in the sense that they act like data providers, but are limited to only one possible result. However, this simplicity may be more easy to implement and understand when this is the only scenario that needs to be handled.

Design patterns revisited

A few years ago I wrote a post titled Software Patterns are Useless in which I was drawing attention to the overhyping of the concept of software design patterns. In short, they are either too simple or too complex to bundle together into a concept to be studied, much less used as a lingua franca for coders.

I still feel this way, but in this post I want to explore what I think the differences are between software design patterns and building architecture design patterns. In the beginning, as you no doubt know from the maniacal introduction in design patternese by any of its enthusiasts, it was a construction architect that noticed common themes in solving the problems required to building something. This happened around the time I was born, so 40 years ago. In about ten years, software people started to import the concept and created this collection of general solutions to software problems. It was a great idea, don't get me wrong.

However, things have definitely changed in software since then. While we still meet some of the problems we invented software design patterns for, the problems or at least their scope has changed immensely. To paraphrase a well known joke, design patterns would be similar in construction and software if you would first construct a giant robot to build your house by itself. But it doesn't happen like this, instead the robot is actually the company itself, its purpose being to take your house specifications and build it. So how useful would a "pattern" be for a construction company saying "if you need a house, hire a construction company"?

Look at MVC, a software design pattern that is implemented everywhere in the web world, at different levels of abstractions. You use ASP.Net MVC, for example, to separate the logic from the rendering on the server, but then you use Angular to separate logic from rendering on the client. We are very clever young men, but it's MVCs all the way down! How is that a pattern anymore? "If you want to separate rendering from logic, use MVC" "How do I implement it?" "Oh, just use a framework that was built for you". At the other end of the spectrum there are things that are so simple and basically embedded in programming languages that thinking of them as patterns is pointless. Iterator is one of them. Now even Javascript has a .forEach construct, not to mention generators and iterators. Or Abstract Factory - isn't that just a factory for factories? Should I invent a new name for the pattern that creates a factory of factories for factories?

But surely there are patterns that are very useful and appropriate for our work today. Like the factory pattern, or the builder, the singleton or inversion of control and so many others. Of course they are, I am not debating that at all. I am going to present inversion of control soon to a group of developers. I even find it useful to split these patterns into categories like creational, structural, behavioral, concurrency, etc. Classification in general is a good thing. It's the particulars that bother me.

Take the seminal book Design Patterns: Elements of Reusable Object-Oriented Software (Gamma et al. 1994). And by seminal I mean so influential that the Wikipedia URL is Design_Patterns. Not (book) or the complete title or anything. If you followed the software design patterns link from higher up you noticed that design patterns are still organized around those 23 original ones from the book written by "the Gang of Four" more than twenty years ago.

I want to tell you that this is wrong, but I've seen it in so many other fields. There is one work that is so ground breaking that it becomes the new ground. Everything else seems to build upon it from then on. You cannot be taken seriously unless you know about it and what it contains. Any personal contribution of yours must needs reference the great work. All the turtles are stacked upon it. And it is dangerous, encouraging stagnation and trapping you in this cage that you can only get out of if you break it.

Does that mean that I am writing my own book that will break ground and save us all? No. The patterns, as described and explained in so many ways around the 'net are here to stay. However note the little changes that erode the staleness of a rigid classification, like "fluent interfaces" being used instead of "builder pattern", or the fact that no one in their right mind will try to show off because they know what an iterator is, or the new patterns that emerge not from evaluating multiple cases and extracting a common theme, but actually inventing a pattern to solve the problems that are yet to come, like MVVM.

I needed to write this post, which very much mirrors the old one from a few years ago, because I believe calling a piece of software that solves a problem a pattern increasingly sounds more like patent. Like biologists walking the Earth in the hope they will find a new species to name for themselves, developers are sometimes overdefining and overusing patterns. Remember it is all flexible, that a pattern is supposed to be a common theme in written code, not a rigid way of writing your code from a point on. If software patterns are supposed to be the basis of a common language between developers, remember that all languages are dynamic, alive, that no one speaks the way the dictionaries and manuals tell you to speak. Let your mind decide how to write and next time someone scoffs at you asking "do you know what the X pattern is?" you respond with "let me google that for you". Let patterns be guides for the next thing to come, not anchors to hold you back.

Saturday, February 18, 2017

Javascript Proxy in ES6! Great feature I knew nothing about.

I watched this Beau teaches JavaScript video and I realized how powerful Proxy, this new feature of EcmaScript version 6, is. In short, you call new Proxy(someObject, handler) and you get an object that behaves just like the original object, but has your code intercept most of the access to it, like when you get/set a property or method or when you ask if the object has a member by name. It is great because I feel I can work with normal Javascript, then just insert my own logging, validation or some other checking code. It's like doing AOP and metaprogramming in Javascript.

Let's explore this a little bit. The video in the link above already shows some way to do validation, so I am going to create a function that takes an object and returns a proxy that is aware of any modification to the original object, adding the isDirty property and clearDirt() method.

function dirtify(obj) {
    return new Proxy(obj,{
        isDirty : false,
        get : function(target, property, receiver) {
            if (property==='isDirty') return this.isDirty;
            if (property==='clearDirt') {
                var self=this;
                var f = function() {
                return f.bind(target);
            console.log('Getting '+property);
            return target[property];
        has : function(target, property) {
            if (property==='isDirty'||property==='clearDirt') return true;
            console.log('Has '+property+'?');
            return property in target;
        set : function(target, property, value, receiver) {
            if (property==='isDirty'||property==='clearDirt') return false;
            if (target[property]!==value) this.isDirty=true;
            console.log('Setting '+property+' to '+JSON.stringify(value));
            return true;
        deleteProperty : function(target, property) {
            if (property==='isDirty'||property==='clearDirt') return false;
            console.log('Delete '+property);
            if (target[property]!=undefined) this.isDirty=true;
            delete target[property];
            return true;
var obj={
var proxy=dirtify(obj);
console.log('x' in proxy); //true
console.log(proxy.hasOwnProperty('x')); //true
console.log('isDirty' in proxy); //true
console.log(proxy.x); //1
console.log(proxy.hasOwnProperty('isDirty')); //false
console.log(proxy.isDirty); //false

console.log(proxy.x); //2
console.log(proxy.isDirty); //true

console.log(proxy.isDirty); //false

console.log(proxy.isDirty); //false
delete proxy.isDirty;
console.log(proxy.isDirty); //false

delete proxy.x;
console.log(proxy.x); //undefined
console.log(proxy.isDirty); //true

console.log(proxy.isDirty); //true

console.log(obj.y); //3
console.log(proxy.y); //3
console.log(proxy.isDirty); //false

So, here I am returning a proxy that logs any access to members to the console. It also simulates the existence of isDirty and clearDirt members, without actually setting them on the object. You see that when setting the isDirty property to true, it still reads false. Any setting of a property to a different value or deleting an existing property is setting the internal isDirty property to true and the clearDirt method is setting it back to false. To make it more interesting, I am returning true for the 'in' operator, but not for the hasOwnProperty, when querying if the attached members exist. Also note that this is a real proxy, if you change a value in the original object, the proxy will also reflect it, but without intercepting the change.

Imagine the possibilities!

More info:
ES6 Proxy in Depth
Metaprogramming with proxies
Metaprogramming in ES6: Part 3 - Proxies

Thursday, February 16, 2017

Caching as an antipattern

As part of a self imposed challenge to write a blog post about code on average every day for about 100 days, I am discussing a link that I've previously shared on Facebook, but that I liked so much I feel the need to contemplate some more: The Caching Antipattern

Basically, what is says is that caching is done wrong in any of these cases:
  • Caching at startup - thus admitting that your dependencies are too slow to begin with
  • Caching too early in development - thus hiding the performance of the service you are developing
  • Integrated cache - cache is embedded and integral in the service code, thus breaking the single responsibility principle
  • Caching everything - resulting in an opaque service architecture and even recaching. Also caching things you think will be used, but might never be
  • Recaching - caches of caches and the nightmare of untangling and invalidating them in cascade
  • Unflushable cache - no method to invalidate the cache except restarting services

The alternatives are not to cache and instead know your data and service performance and tweak them accordingly. Try to take advantage of client caches such as the HTTP If-Modified-Since header and 304-not-modified return code whenever possible.

I've had the opportunity to work on a project that did almost everything in the list above. The performance bottleneck was the database, so the developers embedded an in code memory cache for resources that were not likely to change. Eventually other anti-patterns started to emerge, like having a filtered call (let's say a call asking for a user by id) getting all users and then selecting the one that had that id. Since it was in memory anyway, "getting" the entire list of records was just getting a reference to an already existing data structure. However, if I ever wanted to get rid of the cache or move it in an external service, I would have had to manually look for all cases of such presumptions in code and change them. If I wanted to optimize a query in the database I had no idea how to measure the performance as actually used in the product, in fact it led to a lack of interest in the way the database was used and a strong coupling of data provider implementation to the actual service code. Why use memory tables for often queried data? We have it cached anyway.

My take on this is that caching should always be separate from the service code. Have the code work, as slow as the data provider and external services allow it, then measure the performance and add caching where actually needed, as a separate layer of indirection. These days there are so many ways to do caching, from in memory tables in SQL Server, to distributed memory caches provided as a service by most cloud providers - such as Memcached or Redis in AWS, to content caches like Akamai, to html output caches like Varnish and to client caches, controlled by response and request headers like in the suggestion from the original article. Adding your own version is simply wasteful and error prone. Just like the data provider should be used through a thin interface that allows you to replace it at will, the caching layer should also be plug and play, thus allowing your application to remain unchanged, but able to upgrade any of its core features when better alternatives arrive.

There is also something to be said about reaching the limit of your resources. Let's say you cache everything your clients ask for in memory, when they ask for it. At one time or another you might reach the upper limit of your memory. At this time the cache should not fail, instead it should clear the data least used or the oldest inserted or something like that. A cache is not something that is supposed to hold all your data, only the part of it that is most efficient, performance wise, and it should never ever bring more problems like memory overflow crashes. Eek!

Now, I need to find a suitable name for my caching layer invalidation manager ;)

Other interesting resources about caching:
Cache (computing)
Caching Best Practices
Cache me if you can Powerpoint presentation
Caching guidance
Caching Techniques

Tuesday, February 14, 2017

Using Visual Studio key bindings in Eclipse when editing Java

The first thing that strikes anyone starting to use another IDE than the one they are used to is that all the key bindings are wrong. So they immediately google something like "X key bindings for Y" and they usually get an answer, since developers switching IDEs and preferring one in particular is quite a common situation. Not so with Eclipse. You have to install software, remove settings then still modify stuff. I am going to give you the complete answer here on how to switch Eclipse key bindings to the ones you are used to in Visual Studio.

Step 1
First follow the instructions in this Stack Overflow answer: How to Install Visual Studio Key Bindings in Eclipse (Helios onwards)
Short version: Go to Help → Install New Software, select your version in the Work with box, wait until the list populates, check the box next to Programming Languages → C/C++ Development Tools and install (with restart). After that go to Window → Preferences → General → Keys and change the Scheme in a dropdown to Microsoft Visual Studio.

Step 2
When Eclipse starts it shows you a Welcome screen. Disable the welcome screen by checking the box from the bottom-right corner and restart Eclipse. This is to avoid Ctrl-arrows not working in the editor as explained in this StackOverflow answer.

Step 3
While some stuff does work, others do not. It is time to go to Window → Preferences → General → Keys and start changing key bindings. It is a daunting task at first, since you have to find the command, set the shortcut in the zillion contexts that are available and so on. The strategy I found works best is this:
  • Right click on whatever item you want to affect with the keyboard shortcut
  • Find in the context menu whatever command you want to do
  • Remember the keyboard shortcut
  • Go to the key preferences and replace that shortcut everywhere (the text filter in the key bindings dialog allows searching for keyboard shortcuts)

You might want to share or at least backup your keyboard settings. No, the Export CSV option in the key bindings dialog gives you a file you can't import. The solution, as detailed here is to go to File → Export or Import → General → Preferences and work with .epf files. And if you think it gives you a nice list of key bindings that you can edit with a file editor, think again. The format holds the key binding scheme name, then only the custom changes, in a file that is what .ini and .xml would have if they decided on having children.

Now, the real decent thing would be to not go through Step 1 and instead just start from the default bindings and change them according to Visual Studio (2016, not 2005!!) and then export the .epf file in order for all people to enjoy a simple and efficient method of reaching their goal. I leave this as an exercise for the reader.

A short list of shortcuts that I found missing from the Visual Studio schema: rename variable on F2, go to declaration on F12, Ctrl-Shift-F for search across files, Ctrl-Minus to navigate backward ... and more to come I am sure.

Regular expressions in Java

Ugh! I will probably be working in Java for a while. Or I will kill myself, one of the two. So far I hate Eclipse, I can't even write code and I have to blog simple stuff like how to do regular expressions. Well, take it as a learning experience! Exiting my comfort zone! 2017! Ha, ha haaaaa [sob! sob!]

In Java you use Pattern to do regex:
Pattern p = Pattern.compile("a*b");
Matcher m = p.matcher("aaaaab");
boolean b = m.matches();

So let's do some equivalent code:

var regDate=new Regex(@"^(?<year>(?:19|20)?\d{2})-(?<month>\d{1,2})-(?<day>\d{1,2})$",RegexOptions.IgnoreCase);
var match=regDate.Match("2017-02-14");
if (match.Success) {
  var year=match.Groups["year"];
  var month=match.Groups["month"];
  var day=match.Groups["day"];
  // do something with year, month, day

Pattern regDate=Pattern.compile("^(?<year>(?:19|20)?\\d{2})-(?<month>\\d{1,2})-(?<day>\\d{1,2})$", Pattern.CASE_INSENSITIVE);
Matcher matcher=regDate.matcher("2017-02-14");
if (matcher.find()) {
  String year=matcher.group("year");
  String month=matcher.group("month");
  String day=matcher.group("day");
  // do something with year, month, day


The first thing to note is that there is no verbatim literal support in Java (the @"string" format from .NET) and there is no "var" (one always has to specify the type of the variable, even if it's fucking obvious). Second, the regular expression object Pattern doesn't do things directly, instead it creates a Matcher object that then does operations. The two bits of code above are not completely equivalent, as the Success property in a .NET Match object holds the success of the already performed operation, while .find() in the Java Matcher object actually performs the match.

Interestingly, it seems that Pattern is automatically compiling the regular expression, something that .NET must be directed to do. I don't know if the same term means the same thing for the two frameworks, though.

Another important thing is that it is more efficient to reuse matchers rather than recreate them. So when you want to use the matcher on another string, use matcher.reset("newstring").

And lastly, the string class itself has quick and dirty regular expression methods like .matches, replaceFirst and .replaceAll. The matches method only returns a bool if the string is a perfect match (equivalent to a Pattern match with ^ at the beginning and $ at the end).