Functional Programming With ECMAScript 6 Generators

The web is abuzz right now with ECMAScript 6 on the horizon. If you grab Node 0.11, you can use it server side already. One of the big features I’m excited about is generators.

I’ve blogged about them previously. Alas, I’ve found little out there on the web that covers anything beyond basic instantiation and invocation. With just that to go on, it was hard for me to initially see the hype. A few days ago, I had an epiphany.

Generators are first class objects. Like functions, they can be composed from smaller parts. Therefore, much of what we know about functions can be applied to generators!

On that note, I will lay the groundwork for understanding how to really USE generators.

It all starts with a bind

If you’ve worked with JavaScript for any length of time, you should be familiar with bind.

  var bind = function(fn, ctx) {
    return function() {
      return fn.apply(ctx, arguments);
    };
  };

A function has two different modes: literal and called.

  • Literal: the function itself; it’s not being run.
var something = function() {
  console.log('do something');
}

  • Called: running the function, which gives us its return value along with any side effects.
something();

Bind is implemented by taking a literal function and calling it within another literal function, passing along the context and any arguments using .apply().
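To make the mechanics concrete, here is the bind above in action. The counter object and increment function are purely illustrative:

```javascript
var bind = function(fn, ctx) {
  return function() {
    return fn.apply(ctx, arguments);
  };
};

var counter = { count: 0 };
var increment = function(by) {
  this.count += by; // 'this' is whatever context we bound
  return this.count;
};

var boundIncrement = bind(increment, counter);
boundIncrement(5);  // => 5
boundIncrement(2);  // => 7; counter.count is now 7
```

No matter how boundIncrement is invoked, `this` inside increment stays pinned to counter.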

A generator has three states:

  • Literal: a literal generator function
var Gen = function *() {
  var value = yield asyncTask();
  return value;
};

  • Instantiated: a runnable instance is created by calling the generator function
var gen = Gen();
  • Run: You can then iterate through the generator by calling next()
gen.next();

Unlike a function, we are going to compose generators to be run inside of a co() function. co will run a generator until it comes across a yield. Whatever is on the right side of the yield will be passed into co, and the generator will be frozen until the value can be resolved. This includes thunks, promises, and even other generators!

A generator equivalent of bind looks like this:

var bind = function(genFunc, ctx) {
  return function *() {
    return yield genFunc.apply(ctx, arguments);
  };
};

Like functions, generator functions also have call() and apply() methods which can be used to invoke them with an explicit context. The yield is there because when we run this new function inside co(), the instantiated generator will be run and its return value will be spat out to be returned by this generator.
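To see what that buys us without pulling in co, we can drive the bound generator by hand. The greeter object and Gen function here are purely illustrative:

```javascript
var bind = function(genFunc, ctx) {
  return function *() {
    return yield genFunc.apply(ctx, arguments);
  };
};

var greeter = { name: 'world' };
var Gen = function *(greeting) {
  return greeting + ' ' + this.name; // relies on the bound context
};

var bound = bind(Gen, greeter);
var outer = bound('hello');       // instantiate the outer generator
var inner = outer.next().value;   // the yield hands us the inner generator instance
var result = inner.next().value;  // => 'hello world'

// feed the inner result back in; the outer generator returns it
var finalState = outer.next(result); // { value: 'hello world', done: true }
```

This is exactly the zigzag co automates: it resolves whatever the yield hands out, then pushes the result back in with next().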

With that in mind, how about a function that takes two generators and runs one inside the other? How would we implement that?

var join = function(gen1, gen2) {
  return function *() {
    return yield gen1.call(this, gen2.call(this));
  };
};

With this function in place, we can now run two generators back to back inside a single coroutine function.

var co = require('co');
co(join(function *(next) {
  var foo = yield next;
  console.log(foo); // => 'hello world';
  return foo;
}, function *() {
  return 'hello world';
})).call(this);

If you are interested in exploring this subject further, I’m working on a JavaScript library for composing generators called Shen. It’s a toolkit for putting together generators to run inside co like Lego pieces.

To give you a taste of its power: instead of join, Shen implements cascade, which allows you to merge one or more generators.

var shen = require('shen');
co(shen.cascade(
  function *(next) {
    //each yield freezes the generator until the next returns
    console.log(1);
    yield next;
    console.log(1);
    return;
  },
  function *(next) {
    console.log(2);
    yield next;
    console.log(2);
    return;
  },
  shen.cascade(
    function *(next) {
      //that's right, you can nest them too!
      console.log(3);
      return yield next;
    },
    function *() {
      return;
    })))();

/* Outputs
    1
    2
    3
    2
    1
  */

Shen functions all compose with each other, allowing you to put together generators as easily as you would curry a function.

In addition to cascade and bind, Shen also currently includes…

  • branch and dispatch for conditional logic
  • delay for… delaying?
  • parallel: run several generators and get back an array of the return values
  • oscillator: run a generator at a specific interval and get back an immediate event emitter that fires with the latest returned value of the generator

Current use-cases off the top of my head include using oscillator and parallel to run several network requests at the same time. You’d get an event emitter with all the returned values in one place. One thing to note is that you can’t completely escape callbacks, but you can create areas in your code where callbacks are invisible. The generator takes care of the hard stuff.

It’s still very much a work in progress and only works on Node 0.11, but I invite you all to try it out. If you want to help, I’m always open to new ideas for pieces to add to the ecosystem. Contributing a few tests or implementations of ideas would be great too!

Here’s the GitHub repo for the project.

Cheers.

Koa: Zero to Todo List

Note: you need to run Node 0.11 with --harmony to run the code.

From the creators of Express comes a brand new framework for Node powered by the new ECMAScript 6 generator syntax. Koa is an interesting reimagining of how we will be able to build web applications in JavaScript.

The old paradigm

In the standard Node library, the ‘http’ module is used to create servers.

var http = require('http');

var server = http.createServer(function(req, res) {
  res.writeHead(200, {'Content-Type': 'text/plain'});
  //server logic goes here
  res.end('');
});

server.listen(3000, '127.0.0.1');
console.log('listening on port 3000');

Express exposes a function that we can give to http.createServer as a callback. Express middleware is a set of functions that take three arguments: req, res, and next. The middleware performs some operations, modifies either the request or response object, and passes control down to the next middleware in the stack by calling next(). It’s akin to a water-flow model where the response ends somewhere near the end of the middleware stack.
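The shape of such a middleware function looks something like this. The logger example and the fake req/res objects are purely illustrative:

```javascript
// A classic connect/express-style middleware: (req, res, next)
var logger = function(req, res, next) {
  console.log(req.method + ' ' + req.url); // do some work
  next(); // pass control to the next middleware in the stack
};

// simulating how the framework would invoke it:
var called = false;
logger({ method: 'GET', url: '/todos' }, {}, function() {
  called = true; // the next middleware in the stack would run here
});
```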

Enter Koa, the generator-based framework

Like Express, Koa works internally by generating a callback that can be passed to http.createServer(). Unlike Express, it uses generators to provide a much more fine-grained model of control flow.

A very basic Koa app looks like this; let’s make it serve up the contents of a file.

var koa          = require('koa');
var Promise      = require('bluebird');

//creates promise yielding versions of fs
var fs = Promise.promisifyAll(require('fs'));
//create the koa instance
var app = koa();

app.use(function *(next) {
  //here's an example middleware that logs to the console
  console.log('timestamp: before request => ', Date.now());
  yield next;
  console.log('timestamp: after request  => ', Date.now());
});

app.use(function *() {
  this.body = yield fs.readFileAsync('./app.js', 'utf8');
});

app.listen(3000);
console.log('now listening on port 3000');

Unlike Express, the middleware is written using generators, and control flows downstream and then back upstream. Middleware yields downstream by explicitly calling ‘yield next’. When the downstream middleware returns, control flows back up to the point where the upstream middleware yielded.

Where Express passes the native Node req and res objects through to each function, Koa manages a context that encapsulates them behind an interface. They are still available through the ‘this’ keyword as this.req and this.res. However, the docs recommend against working with these native objects. One could imagine calling this.res.end('') would throw a monkey wrench into the control flow. Instead, you are supposed to work through this.response and this.request. Most of their methods are aliased directly onto this: ‘this.body’, for example, is aliased to this.response.body.

There does not yet appear to be a direct way to get to the request body. The co-body parser accesses req.body directly, so while the docs say not to, Koa is still young and you may have to get your hands dirty.

A todo app in Koa

Now that we’ve covered the basics, let’s try something a little more complex. A todo list seems like a good thing no one has ever tried to make before in any technology ever! To simplify assumptions, let’s just store the todos in memory.

By itself, Koa is very minimalistic. It does not provide body parsing, sessions, or routing in the core. Unfortunately, Koa is still young, so there just aren’t that many npm modules for it yet. A quick search on the Koa website shows that we do have the necessary modules for a basic todo app:

  1. koa-route: for routing
  2. co-body: for parsing the body of post requests
  3. koa-static: for serving up static assets

Here’s the basic server-side API:

//jshint esnext
var koa          = require('koa');
var staticServer = require('koa-static');

//this allows us to parse the native req object to get the body
var parse        = require('co-body');

var router       = require('koa-route');
var _            = require('underscore');

var Promise      = require('bluebird');
var path         = require('path');

var fs = Promise.promisifyAll(require('fs'));
var app = koa();
//our very basic data store
var todos = [];

//gets us unique ids
var counter = (function() {
  var count = 0;
  return function() {
    count++;
    return count;
  };
})();

//serve up the public directory where we have all the assets
app.use(staticServer(path.join(__dirname, 'public')));

app.use(router.post('/todos', function *() {
  /*
    yield lets us pass asynchronous functions that return promises or thunks
    It will freeze the middleware till its resolved and pass it back in.
  */
  var todo = (yield parse.json(this));

  todo.id = counter();
  todos.push(todo);
  this.body = JSON.stringify(todos);
}));


app.use(router.get('/todos', function *() {
  this.body = JSON.stringify(todos);
}));

app.use(router.delete('/todos/:id', function *(id) {
  todos = _(todos).reject(function(todo) {
    return todo.id === parseInt(id, 10);
  });
  this.body = JSON.stringify(todos.sort(function(a, b) { return a.id - b.id; }));
}));



app.listen(3000);
console.log('listening on port 3000');

Download the code on GitHub. The GitHub version includes the frontend code.

A few things of note:

The ‘yield’ keyword can do some interesting things. If we pass it an asynchronous function that returns a thunk or promise, it will stop execution of the middleware and wait till it resolves. It then resumes the generator, returning the value of the promise or thunk from the yield. This is a hell of a lot easier to read than nested callbacks.

A word of caution

The ‘yield’ keyword lets us do some safe blocking, but it isn’t always the ideal solution. While the event loop itself isn’t blocked, yield does block any operations that occur after it in the generator until the yielded value resolves.

For example, if we run three asynchronous operations top to bottom that do not depend on each other, like the following…

app.use(function *() {
  var a = yield async1();
  var b = yield async2();
  var c = yield async3();
});

This completely defeats the purpose of Node’s (almost automatic) concurrency. When we yield on async1, async2 doesn’t even start until async1 resolves. This is suboptimal. It would be better to start all three operations, collect their promises, and yield on a merged promise.

app.use(function *() {
  var a = async1();
  var b = async2();
  var c = async3();
  var result = yield Promise.all([a, b, c]);
});

I’m excited to cut my teeth on some bigger problems. As the framework matures, it’s going to allow more fine-grained control over how we write the next generation of web applications.

Javascript Generators: First Impressions

ECMAScript 6 (Harmony) is coming out soon, and one of the most exciting features it offers is generators. Generators are a minimalist flow-control mechanism that gives a much finer-grained level of control than we were afforded until now.

Note: the code in this blog will only run in node v0.11.x when run with --harmony.

Like a function, a generator is an object that declares some behavior. It’s first class just like JavaScript functions: you can pass generators around as values and return them from other functions.

A generator is declared like a function, only with a ‘*’ before the parens. We then create an instance of the generator by calling it.

Here’s a basic example.

var myGenerator = function *() {
  var foo = yield 5;
  console.log("this doesn't get written until the second call to next()");
};

var gen = myGenerator();
var state = gen.next();
console.log(state.value); //=> 5

When we run gen.next(), the code executes until we get to the yield. The generator then stops, which is why the console.log() does not get called. The state of the generator is returned by next(), which gives us two things.

  1. state.value: the value on the right side of the yield; in this case 5.
  2. state.done: a boolean that returns true if there are no more yields in the generator.
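Those two fields are enough to drain a generator with a plain loop. A small sketch:

```javascript
var numbers = function *() {
  yield 1;
  yield 2;
  yield 3;
};

var gen = numbers();
var values = [];
var state = gen.next();
while (!state.done) {       // keep going until the generator is exhausted
  values.push(state.value); // collect each yielded value
  state = gen.next();
}
console.log(values); // => [1, 2, 3]
```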

We’ve called gen.next() once. The second time we call gen.next(), we have the option of passing in an argument that will be returned by the yield inside the generator.

Here’s a more advanced example of bidirectional passing.

var myGenerator = function *() {
  var firstWord = yield 5;
  console.log(firstWord); //=> "hello"
  var secondWord = yield 10;
  console.log(secondWord); //=> "world"
};

var gen = myGenerator();
var state = gen.next();
console.log(state.value); //=> 5
state = gen.next("hello");
//=> 'hello' gets printed to the screen from inside the generator
console.log(state.value); //=> 10
console.log(state.done); //=> false
state = gen.next('world');
//=> 'world' gets printed
console.log(state.value); //=> undefined, there is nothing left to yield
console.log(state.done);  //=> true

One of the biggest growing pains of JavaScript development is wrapping one’s head around async programming and functions that run asynchronously. Promises give us a value that represents the eventual result of an asynchronous function.

Native promises are promised in ECMAScript 6, but they aren’t available when I run node 0.11 with --harmony yet, so I use Bluebird.

var Promise = require('bluebird');

//extends node's fileSystem module with versions of the async functions that return promises
//the promisified versions are the original name + 'Async'
var fs = Promise.promisifyAll(require('fs'));

fs.readFileAsync('.gitignore',  'utf8').then(function(contents) {
  console.log(contents); // prints the contents of the .gitignore file.
});

console.log('this runs before the callback passed to "then", which is counterintuitive.');

This code works because calling the function creates a closure that doesn’t get garbage collected: the callback passed to the promise retains a reference to that scope, so when it’s eventually called it can operate on variables in the containing scope. However, we cannot return to the original function call. Unless you are used to thinking about promises, it’s a bit unintuitive that the console.log on the following line runs before the callback passed to the promise’s then() handler.

Generators, on the other hand, let us FREEZE the execution context until the file read resolves.

There’s an excellent library called co from the creator of Express that allows us to create coroutines using generators. Thus we could write the previous code using generators.

var Promise = require('bluebird');
var fs = Promise.promisifyAll(require('fs'));
var co = require('co');

co(function *(){
  var a = yield fs.readFileAsync('.gitignore',  'utf8');
  console.log(a); //this doesn't run until the previous function resolves.
  var c = yield fs.readFileAsync('package.json', 'utf8');
  console.log(a);
  console.log(c);
  return;
})();

This is pretty exciting. This asynchronous code looks downright synchronous. It’s also running in its own context, so it doesn’t block the event loop. Within the generator, we can write much finer-grained flow control for asynchronous functions.

That said, how does co work? The basic premise is that we use yield to pass the promise back into co, where it waits till the promise resolves. co then calls next() on the generator, passing in the value from the resolved promise.

co itself is extremely flexible, allowing you to yield thunks or Promises/A+ promises. Here I’ll demonstrate a simplified version that accepts only promises.

var co = function(fngen) {
  /*
  next takes an instantiated generator and the state object
  returned from calling next() on it.
  gen is an instance of a generator;
  yieldable is the value returned from calling gen.next()
  */
  var next = function(gen, yieldable) {
    if (!yieldable.done) {
      //we assume yieldable.value is a promise, so we call then() to get the value
      yieldable.value.then(function(val) {
        /*
        We call gen.next(val) to inject the resolved value back into our
        coroutine, where it gets returned by the yield inside the generator.

        gen then resumes execution, and gen.next() returns when it hits
        the next yield keyword, handing back the next yielded value.
        */
        next(gen, gen.next(val));
      });
    }
  };

  return function() {
    //instantiate the generator
    var gensym = fngen();
    //get the first yieldable
    var yieldable = gensym.next();
    if (!yieldable.done) {
      next(gensym, yieldable);
    }
  };
};

The concept is pretty simple: yield passes the value on its right out to the gen.next() call, which returns it. The value we pass into the next call to gen.next() becomes the value returned by yield inside the generator. Sorta like a zigzag, or a needle stitching.

I’m excited to see some of the new projects that will take advantage of this new ECMAScript 6 feature. One big example coming to mind is the new Koa framework. Unlike its predecessors Express and Connect, Koa is a set of pluggable middleware components that utilize generators heavily for flow control.


Let’s Build a Backbone-Based Framework!!

I’ve been building large-scale applications in Backbone for about 8 months now. In that time I’ve used Thorax as well as built custom solutions in Backbone.

Last night, I live-coded the creation of a demo Backbone framework similar in features to Thorax. I’m going to walk you through how I’ve solved architectural problems in constructing a large-scale Backbone framework.

Fork or watch it here

When building a large-scale Backbone app, there are several considerations:

  • How are we going to load templates? Inline HTML declarations in the views do not scale.
  • How do we handle render()?
  • How do we handle child views?

By default, Backbone.View’s render() is a no-op. It’s meant to be overridden. Backbone.Events gives us a nice set of methods that allow Backbone objects to listenTo() events on each other. However, the actual bindings and callbacks are left to the implementer.

Loading Templates

The first thing is to figure out how we want the templates to load. If using a tool like require.js or Browserify, there are plugins for automatically compiling templates and sending the client just a compiled template function we can plug into our view. Since I’m trying to keep it simple, the easiest way to get templates is to wrap them in a script tag. To prevent the browser from executing it as JavaScript, we give it a type that is not ‘text/javascript’.

I like to use “text/template.”

  <script type="text/template" data-name="templateName">
    <!-- template contents goes here -->
  </script>

Transforming these into working templates is fairly straightforward. For each script tag with type="text/template", run the template compiler over the HTML inside and store the result somewhere. In this case I add a data-name attribute, which will be the key under which I store the compiled template in a hash.

/public/javascripts/init.js

  window.templates = {};
  var $sources = $('script[type="text/template"]');
  $sources.each(function(index, el) {
    var $el = $(el);
    templates[$el.data('name')] = _.template($el.html());
  });

In this version, I chose to use Express + Jade. Of course, writing Underscore templates in Jade seemed a little odd, so I delegated those to Jade’s include statement. I know there are projects like jade-browserify, so maybe I’ll eventually update this to use just Jade.

/views/layout.jade

doctype html
html
  head
    title= title
    link( rel="stylesheet", href='/components/bootstrap/dist/css/bootstrap.css')
    link( rel="stylesheet", href='/stylesheets/style.css')
    script( src="/components/jquery/jquery.js")
    script( src="/components/bootstrap/dist/js/bootstrap.js")
    script( src="/components/underscore/underscore.js")
    script( src="/components/backbone/backbone.js")
    script( src="/javascripts/base.js")
    script( src="/javascripts/application.js")
  body
    block content
    script(type="text/template", data-name="main")
      include templates/main.html
    script(type="text/template", data-name="header")
      include templates/header.html
    script(type="text/template", data-name="menuLinks")
      include templates/menulink.html
    script( src="/javascripts/init.js")

The actual contents of the Underscore templates in the *.html files will now be compiled and loaded into templates[data-name], where I can access them later.

Creating our own Base Backbone Objects.

The first thing we want to do is determine the shared behavior of Backbone objects in our app. For special cases we can override them later. However, for the sake of time, we want some declarative way of telling the object what template to use, and a default render() method that we can use in most situations.

  //first we declare a namespace to store these new Objects
  application = {};
  application.BaseView = Backbone.View.extend({
    initialize: function(options) {
      Backbone.View.prototype.initialize.apply(this, arguments);
      if (this.model) {
        this.listenTo(this.model, 'change', this.render);
      }
      /*
        if the BaseView is extended with a tpl string, we want to
        have that be available but we also want to be able to load
        it at runtime and have it override the base tpl string
      */
      this.tpl = options.tpl || this.tpl;
      this.loadTemplate(this.tpl);
    },
    loadTemplate: function(tpl) {
      /*
        if tpl is a function, we assign it directly,
        otherwise, if its a string, we look it up
        in our templates map.
      */
      if (_.isFunction(tpl)) {
        this.template = tpl;
      } else if (_.isString(tpl)) {
        this.template = templates[tpl];
      } else {
        throw new Error('tpl must be a function or a string!');
      }
    },
    render: function() {
      //see below
    }
  });

Next up, we need to create a render function. It’s arbitrary, based on the particular needs of your application. For most purposes, however, it’s sufficient to make the view’s model’s attributes available inside the template.

  //from the render in the preceding example
  render: function() {
    this.$el.html('');//empty the view's node
    var context = {}; //create a context object
    if (this.model) {
      _.extend(context, this.model.attributes);
    }
    this.$el.html(this.template(context));
    return this;
  }

This is a great first start. Now any view that extends BaseView will have a render(). The only interface we must follow is that the view has a model and a tpl, which tells the view how to resolve this.template.

Of course, there’s a major component missing. Any good Backbone framework has a way of embedding subviews. Turns out it’s sort of tricky.

The lifecycle of a Backbone view in this case is to listen for changes on the model and call render(), which updates the view’s $el property. Thus it’s impossible to proceed until this.$el is completely demystified.

What is $el?

I’ve heard a lot of confusion about the nature of ‘this.$el’. Typically, a Backbone view is a controller for a node: a structure in the document object model.

You may be familiar with DOM objects if you’ve used jQuery.

  var $body = $('body'); // => returns a jquery-wrapped object
  $body[0] //returns the underlying dom node for the body element.

In much the same way, a backbone view is a controller for a dom object.

You may have seen the kickstart for a Backbone app that involves doing a jQuery lookup on a node and setting its HTML to your view’s $el property.

  var $container =  $('.container');
  $container.html(view.render().$el);

Assuming the render() method of the view returns ‘this’, calling it before accessing the $el property guarantees that the DOM object managed by the view is updated before being placed in the DOM. When render() is called again, it updates the contents of that $el. The $container holds a reference to $el, ensuring that calling render() will change the webpage to reflect the latest state of the model.

Extending our render() to support subviews

With the aforementioned definition of $el in mind, the best way to embed a subview (with string-based template engines; you do something else with DOM-based ones) is to do an initial render of the DOM node, somehow marking off places to embed subviews. Afterwards, go back through the $el and systematically embed the $el of each subview in its respective place.

  //updated render()
  this.$el.html('');
  var context = {};
  context._subView = function(viewName) {
    return '<div class="subView view-' + viewName + '"></div>';
  };
  if (this.model) {
    _.extend(context, this.model.attributes);
  }
  //pop it in the dom
  this.$el.html(this.template(context));
  //notice we do this AFTER we rerender the new this.$el
  var subViews = this.subViews;
  this.$el.find('.subView').each(_.bind(function(index, el) {
    var $el = $(el);
    //pull the view name out of the 'view-' prefixed class
    var subView = _(Array.prototype.slice.call(el.classList)).chain()
      .filter(function(className) {
        return (className.match(/^view-/));
      })
      .map(function(className) {
        return className.slice(5);
      })
      .value()[0]; //grab the first item
    $el.html(subViews[subView].render().$el);
  }, this));

  return this;

There are a few new things we’ve added in this version.

  1. There is now a ‘_subView()’ helper passed into the context; it generates the placeholder markup we will look for when embedding a subview.
  2. After the $el is rendered, we search for DOM elements with the class subView.
  3. For each one, we find the name of the subview to embed, given as view-[viewName], and look it up in the view’s subViews key/value lookup object, a property of the parent view.
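The class-name lookup in step 3 boils down to this bit of plain JavaScript, extracted here without the DOM for clarity:

```javascript
// Given an element's class list, find the 'view-' prefixed class
// and strip the prefix to recover the subview's name.
var subViewName = function(classList) {
  var match = classList.filter(function(className) {
    return /^view-/.test(className);
  })[0];
  return match && match.slice(5); // drop the leading 'view-'
};

subViewName(['subView', 'view-child']); // => 'child'
```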

With this, we now have an easy way of supporting subviews.

  var childView = new application.BaseView({
    tpl: 'child'
  });

  var parentView = new application.BaseView({
    tpl: 'parent',
    subViews: {
      child: childView
    }
  });

  $('div.container').html(parentView.render().$el);

In parentView’s template, it’s as simple as using our new _subView helper:

  <script type="text/template" data-name="parent">
    <p>
      <%= _subView('child') %>
    </p>
  </script>

  <script type="text/template" data-name="child">
    <span> hi I'm the child view!! </span>
  </script>

In the next installment in this series, we’ll build a base ControllerView class which we can use to render generic collections. Yes, they’ll be embeddable as subviews in our BaseView.


How to Lexically Scope Like a Boss

Lexical scoping (scoping according to location in source code) is probably one of the most powerful features a programming language can have. Today, I’m going to show you how to use lexical scoping to create powerful abstractions. I will begin by introducing the concept of scope. From there I will introduce a few rules for figuring out the scope of your variables in JavaScript, and then introduce the concept of a closure and how it will make you positively badass at JavaScript.

What is a scope?

I’m assuming you have written code before. Hell, I’d dare assume you’ve written a var.

var variableName = "variable value";

What does it mean to use the var keyword? As you may know, good JavaScript has you declare all your variables with var. Most will also advocate wrapping all your code in an immediately invoked function.

//this is being declared in the global scope.
//it's functionally equivalent to window.foo or just plain foo
var foo = "super bar";

(function() {
  var foo = "inner bar";
  console.log('this foo refers to the foo inside this function ', foo);
})();
/* I don't always create inline anonymous functions, but when I do, I
immediately invoke them. Live dynamically my friends <3 */

console.log('this one will print the top level: ', foo);

Let’s start with this relatively simple example to begin growing some understanding. Internally, the interpreter/compiler represents scopes with a tree (in simple cases, just a linked list). At each node in the tree is a dictionary of values that we refer to with the variable names from your source code. When we refer to foo, the interpreter (or the compiler at compile time) will look for it in the most immediate scope. If it doesn’t find a value there, it will look up the scope tree until it finds a node with a matching key/value and uses it for the current instruction.

Look at that previous code sample. If there wasn’t a var foo at the top of the function, the interpreter would look for scope[foo] and find that it isn’t there. It would then look for an entry named foo in the current scope’s parent. That is the global scope, where the variable foo exists and refers to the value “super bar.”

Creating scopes and determining the current scope’s parent

The rules for creating scopes differ between languages. In Javascript, we start new scopes with a function.

var foo = "global scope";
var goo = "ber";
(function() {
  var foo = "inner scope";
  (function() {
    var foo = "super inner scope bro!";
  })();
})();

This creates a tree with three nodes, which we can represent visually in pseudo-CoffeeScript.

 scope_chain = global:
                  variables:
                    foo: "global scope"
                    goo: "ber"
                  children:
                    inner_scope:
                      variables:
                        foo: "inner scope"
                      children:
                        super_inner_scope:
                          variables:
                            foo: "super inner scope bro!"

As you can see, we have three nodes and there are variables that are available at each node. If you try to access a variable, the interpreter will look up the nodes until it finds an entry that matches.

Let’s go deeper. How about we don’t even bother immediately invoking the function? How would that work?

The simple answer is that it changes nothing. The scope chain rules are consistent. The only difference is that we now have two ways variables can enter the scope.

  1. Through the scope tree: variables in parent scopes are available.
  2. Through function arguments: variables are injected in from whatever scope the function is invoked in.

We delay the instantiation of the scopes until they are invoked later in the program.

  //scope A
  var binder = function(st) {
    //scope B
    var state = st;
    return {
      set: function(newState) {
        //scope C
        state = newState;
        return state;
      },
      get: function() {
        //scope E
        return state;
      }
    };
  };

Scope order is created lexically. Each new function definition in the source (ie: the text, hence “lexical”) is read by the interpreter as a place to mark a new node in the scope chain on invocation. I mentioned that the scope data structure resembles a tree because Scope C and Scope E both belong to B as siblings. B, in turn, belongs to A.

None of these functions are being invoked, but the scope lookup rules are locked to A -> B, B -> C, B -> E. Things get hairier when we invoke them.

  //scope A
  var b1 = binder(5);
  b1.get();   // => 5
  b1.set(2); // => 2
  b1.get(); // => 2

How on earth is this function maintaining state? binder, on invocation, takes the argument 5 and creates a scope according to the rules set forward in its source code.

Now, 5 is in Scope B, assigned to the variable “state” we declared in scope B. We then return an object back.

The object we get back from calling binder() has two functions and thus two child scopes of B (C and E), which we return back to Scope A via the return.

Now it gets weird. When we call b1.get(), we invoke, from scope A, a function whose internal scope E has scope B as its parent. It thus returns a value from scope B back into scope A. Think of it like a closed loop of scopes. ie: a closure. ;)

This is pretty powerful. b1.set() does something similar, only it’s injecting a variable from scope A containing 2 into scope C, where it is assigned to a variable in scope B and subsequently returned. b1.get(), on its next invocation, returns the value stored in that same scope B variable back into scope A, the top level context.

We can use this to create objects with completely encapsulated states. This gives us a leg up when trying to wrangle asynchronous processes.

Here’s an example of a function that runs a function passed as an argument after a delay. While we wait for that function to fire, we can register callbacks to be invoked once it resolves.

  //scope A
  var delay = function(fn, args, timeout) {
    //Scope B
    var status = "pending"; //how we track the internal state
    var result; //where we store the result of calling fn
    var deps = []; //where we store registered callbacks

    setTimeout(function() {
      //Scope C
      /* the scope chain from here is A -> B -> C.
      this simply calls the function and allows us to pass in the
      arguments as an array
      */
      result = fn.apply(this, args); //call the function
      status = "done";
      //we run through all the functions in deps and run them one
      //by one.
      deps.forEach(function(dep) {
        //Scope D
        dep.call(this, result);
      });
    }, timeout);

    //the object being returned has access to this scope,
    //which allows us to invoke it to affect this internal
    //environment.
    return {
      done: function(func) {
        //Scope E
        //A -> B -> E
        if (status === "pending") {
          deps.push(func);
        }
        if (status === "done") {
          func(result);
        }
      }
    };
  };


  var promise = delay(function(msg) {
    //Scope F
    return msg + " world";
  }, ["hello"], 3000);

  //by passing this function to done, you guarantee that
  //when the function passed into delay is called, it will pass
  //the result into the function passed in here.
  promise.done(function(result) {
    //Scope G
    console.log(result);
  });

  //3000 milliseconds later
  //> "hello world"

Yeah, it’s a wee bit contrived, but it illustrates a point. By using a closure to wrap a bunch of data into this enclosed space, we can hide away some pretty sophisticated machinery to invert the responsibility of control. Now the object maintains the state of the async call and we register functions that it will call when the async call is resolved. This removes a lot of overhead for managing async functions.

In Node.js, we pass a callback as the final parameter to the standard library’s async functions.

What if we could apply the same principle and get an object we could pass along functions to and know it will take care of running them when the file is available?

var fileObject = function(fileName, encoding) {
  var fs = require('fs');
  var status = "pending";
  var result_data;
  var result_error;
  var deps_success = [];
  var deps_fail = [];
  var deps_always = [];
  var ret = {};
  encoding = encoding || 'utf8';

  fs.readFile(fileName, encoding, function(err, data) {
    result_data = data;
    result_error = err;
    if (!result_error) {
      status = "success";
      deps_success.forEach(function(fn) {
        fn(result_data);
      });
    } else {
      status = "error";
      deps_fail.forEach(function(fn) {
        fn(result_error);
      });
    }
    deps_always.forEach(function(fn) {
      fn();
    });
  });
  var queueFunction = function(list, fn) {
    list.push(fn);
  };
  var done = function(fn) {
    if (status === "success") {
      fn(result_data);
    }
    if (status === "pending") {
      queueFunction(deps_success, fn);
    }
    return ret;
  };
  var fail = function(fn) {
    if (status === "error") {
      fn(result_error);
    }
    if (status === "pending") {
      queueFunction(deps_fail, fn);
    }
    return ret;
  };
  var always = function(fn) {
    if (status === "pending") {
      queueFunction(deps_always, fn);
    } else {
      fn();
    }
    return ret;
  };
  ret.done = done;
  ret.fail = fail;
  ret.always = always;


  //this exposes the status as a readonly property
  Object.defineProperty(ret, 'status', {
    get: function() {
      return status;
    },
    enumerable: true
  });

  return ret;
};

Hey, that wasn’t so bad. We return an object with three functions, and it takes care of running the appropriate callbacks based on the status of the object’s internal async operation when it’s resolved.

var newFile = fileObject('./hello.txt');

newFile.done(function(data) {
  console.log('two ' + data);
});

newFile.done(function(data) {
  console.log('one ' + data);
});

newFile.fail(function(err) {
  console.log('the file errored out. returned: ' + err);
});

newFile.always(function() {
  console.log('this always gets called');
});

In short, lexical scoping and closures are pretty damn awesome.


Thorax Is Awesome

I like Backbone.js a lot. It’s a fantastic utility library for building your own best of breed javascript applications. It is much more configurable than Angular or Ember - both of which make numerous assumptions about the nature of your app, which may or may not lead to frustration when working with a legacy codebase. While you certainly won’t be cranking out apps as quickly as you would in more magical frameworks, Backbone gives you enough to get started. It stays slim in features to avoid stepping on your toes. Backbone is the fixed gear bike of frontend javascript frameworks.

I recommend anyone new to Backbone build a few projects to learn it. In practice, I don’t recommend using Backbone alone for building out a full mvc system. It’s missing a few really important features that you would have to implement yourself.

Enter Thorax

In the world of Backbone extensions, there are three major players: Marionette, Thorax, and Chaplin. I haven’t looked at Chaplin, and I found Marionette heavy for my needs. Thorax, on the other hand, was a breath of fresh air. I especially loved the handlebars.js template integration.

In the following posts, I will outline some of the major benefits of using Thorax.

  • Out of the box render() that works
  • Child view management
  • Layout Views
  • The MVC in Backbone’s MVC

Out of the box render() that works

Backbone.View’s default render method is a noop (ie: function() {}). Backbone’s authors intend for you to write your own render function, which sets the instance’s el property to your view’s generated representation of the model’s data. Backbone avoids adding in this feature to stay agnostic about your templating system.
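For reference, the render you end up hand-writing with plain Backbone usually looks something like this sketch. This is not Backbone’s API, just a common shape: this.template is assumed to be any compiled template function, and this.model any Backbone-like model with toJSON().

```javascript
// a sketch of a typical hand-written Backbone render method;
// this.template is assumed to be a compiled template function
// (underscore, Handlebars, etc.)
var render = function() {
  var html = this.template(this.model.toJSON());
  this.$el.html(html); // replace the view element's contents
  return this;         // returning `this` allows chaining
};
```

Returning this from render is a Backbone convention that lets callers write view.render().el.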

Note: Handlebars

Handlebars is an html templating language for javascript.

<h1>{{ value }}</h1>
<p>
  {{#if foo}}{{foobar}}{{/if}}
</p>


This handlebars code is fed into Handlebars.compile(), which returns a template function. This function is then passed a “context” object. Example:


    {
      value: "hello there",
      foo: true,
      foobar: "hello Douglas"
    }

and generates a string of html that you can inject into the dom.

  <h1>hello there</h1>
  <p>hello Douglas</p>

Since Thorax makes the decision of using Handlebars, it can give us a default render() method that’s usable rather than a noop. Thorax.View.render() constructs a context object containing all of the properties of the view instance and the attributes of the model, which is passed into the Handlebars template function so that it’s available in the rendered view.

//render function; this is what a Thorax render does,
// *this* resolves to the thorax view instance

var render = function() {
  //copy the view instance's properties
  var viewProperties = {};
  for (var property in this) {
    viewProperties[property] = this[property];
  }
  //underscore.js's extend
  if (this.model) {
    viewProperties = _.extend(viewProperties, this.model.attributes);
  }
  this.$el.html(this.template(viewProperties));
  return this;
};

By default, the properties of the view instance get passed to the template, but not the functions. What if we want to override this?

In the past, I would write view helpers in order to render attributes that required computation. There’s a more elegant way using the defineProperty function.

  var peter = new Thorax.Model({
    firstName: 'Peter',
    lastName:  'de Croos',
    githubHandle: 'Cultofmetatron'
  });

  var PersonView = Thorax.View.extend({
    template:
      Handlebars.compile("we can call " +
               "{{firstName}}, " +
               "{{lastName}} and " +
               "{{fullName}}"),
    initialize: function() {
      Object.defineProperty(this, 'fullName', {
        get: function() {
          return this.model.get('firstName') + ' ' +
            this.model.get('lastName');
        },
        enumerable: true // see note below
      });
    }
  });

  var peterView = new PersonView({
    model: peter
  });

Technical note: By default, properties defined by defineProperty are not enumerable.

IE: they won’t show up when you do for (property in this) {}. Because that’s how Thorax’s render gets at the attributes, the property won’t show up in the template either. Luckily, that is easy enough to fix by specifying the option enumerable: true.
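A quick standalone demonstration of that behavior, with no Thorax involved:

```javascript
var obj = { plain: 1 };

// enumerable defaults to false, so this property is
// invisible to for...in (though still readable directly)
Object.defineProperty(obj, 'hidden', {
  get: function() { return 2; }
});

// opting in with enumerable: true makes it show up
Object.defineProperty(obj, 'visible', {
  get: function() { return 3; },
  enumerable: true
});

var seen = [];
for (var key in obj) { seen.push(key); }
// seen contains 'plain' and 'visible', but never 'hidden'
```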

Child view management

Backbone does nothing to address embedded views. It expects you to write your own logic for handling child views in the render function. As you can imagine, it is kind of a pain. It sounds simple enough in theory, but in practice it means you have to write code that checks if a parent view is being removed and recursively descends into all its child views to undelegate their events from the dom. This is a major potential source of javascript memory leaks. Thorax provides some great helper functions for this. It just wins.

Thorax.Views can be embedded in another view simply by adding the subview as a property.


  var blogModel = new Thorax.Model({
    name: 'console.log(this.thought)',
    post: new Thorax.Model({
      title: "Thorax is awesome",
      content: "maximum callstack exceeded"
    })
  });

  var PostView = Thorax.View.extend({
    template: Handlebars.compile('<h2>{{title}}</h2><p>{{content}}</p>')
  });

  var blogView = new Thorax.View({
    template: Handlebars.compile("<h1>{{name}}</h1>{{view postView}}"),
    model: blogModel,
    postView: new PostView({
      model: blogModel.get('post')
    })
  });

As you can see, it was as easy as embedding {{view postView}} inside the template; Thorax takes care of yielding control of that region to the child view.

Layout Views

Let me just say that layout views are really awesome. A layout view is a view whose sole purpose is to be a container that can hold whatever view/model combo fits the current context.
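The core idea can be sketched in a few lines of plain javascript. This is a toy stand-in, not Thorax’s implementation: one container that owns at most one child view and tears down the old one on every swap.

```javascript
// a toy layout container sketch (not Thorax's code):
// holds one child view, cleaning up the old one on swap
var makeLayout = function() {
  var current = null;
  return {
    setView: function(view) {
      if (current && current.remove) {
        current.remove(); // undelegate events, release DOM references
      }
      current = view;
      return view;
    },
    getView: function() {
      return current;
    }
  };
};
```

That remove-before-swap step is exactly the cleanup that prevents the memory leaks described in the previous section.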

The MVC in Backbone’s MVC

There’s a Backbone.Model and a Backbone.View but where is Backbone.Controller? (hint: there isn’t one)

Currently, Marionette augments Backbone with a Controller object, but Thorax does nothing of the sort. I guess the real question is: where does a controller fit?

Philosophically, MVC stands for a separation of concerns between

  1. Models: the data containing the application logic we are modelling on the computer
  2. Views: Objects that manage the presentation of the models. This includes binding event handlers and rerendering when the underlying model(s) change.
  3. Controllers: Objects that take care of managing what view and model are relevant.

In the Rails world, the architectural philosophy revolves around the concept of skinny controller/fat model. The brunt of the code for manipulating the models should live in instance methods on the models themselves, and the controller interacts with them through an api.

When we really get down to it, controllers perform a few functions.

  1. Watch out for some event that should trigger a change in view and/or model.
  2. Manage the creation of new view instances to represent new or altered models.
  3. Render html and insert it into the DOM.

Our controller will handle only the details of getting the proper model and binding it to a view so that it can be rendered.

Let’s create a function, postController, that does nothing more than look up a model by its id and create a post view.

var postController = function(postId) {
  //for info on Thorax.Views, Thorax.Models and Thorax.Collections,
  //see the note below
  var postView = new Thorax.Views.FullPagePostView({
    model: Thorax.Collections.posts.get(postId)
  });
  return postView;
}

Note About Thorax Registries

Thorax provides a Registry: a series of hashes where you can store the Constructor Functions for your extended models and views.

For more information visit the Thorax docs

Now we have a basic controller that handles cases 2 and 3 of our list. We still need to take care of figuring out how we want to determine when this controller gets called. In the projects I have worked on, I’ve solved this problem using the router.

Thorax does nothing special with the router so we will use Backbone.Router as is. It will manage one Layout view, a special Thorax view made for holding other views.

//router.js
var layoutContainer;
var callController = function(controller) {
  return function() {
    //arguments is used to shift responsibility for knowing the
    //number of parameters onto the controller
    var view = controller.apply(this, arguments);
    layoutContainer.setView(view);
  };
};

The function callController takes a controller, passes along the arguments given to it by the router, and takes care of the boilerplate of creating a view and setting it into the container. layoutContainer.setView() takes care of undelegating the events of all elements attached to the previous view (if there was one) as it swaps in the html of the new view.

//continued from above
var Router = Backbone.Router.extend({
  initialize: function() {
    layoutContainer = new Thorax.LayoutView({
      template: '#AppLayout'
    });
    /*
      this is the point in the html where our container will be 
      injected in. from here on out, the router and controller
      system will take care of swapping out application views 
      within this container.
    */
    layoutContainer.appendTo('#entry-point');
    $(document).trigger('popstate');
  },
  routes: {
    'posts/:postId': 'postRoute'
  },
  postRoute: callController(postController)
});

This is a pretty straightforward setup with one possibly new thing. You probably noticed the $(document).trigger(‘popstate’). In standard backbone, the application’s Router instance has a method navigate(), usually called by

  router.navigate('path/to/route', { trigger: true });

The trigger: true is needed for the router to trigger any actions associated with the route. The browser fires the popstate event when the active history entry changes: when you enter a page, or hit the back or forward buttons. When you click back, the new url will load and the router, which is listening for the popstate event on document, will perform the behavior associated with that url.

By having the $(document).trigger(‘popstate’) at the end of the initialization, we guarantee that once the router is finished initializing, it will read the url and trigger the appropriate context for the app. This is great if we want multiple url entry points through which the user can enter the app. The user gets the same javascript no matter what url of the website they visit; the app takes care of loading the right content based on the url.

Finally, the template option that was passed into the instantiation of layoutView is optional. By default, Thorax wraps the layout in a div tag. We can customize the layoutView by giving it a template and using the layout-element helper in that template.

<script id="AppLayout" type="handlebars/template">
   {{layout-element tag="div" id="currentContext" class="container"}}
</script>

Backbone, a Short Primer - Part 1: Models and Events

It’s been a few months since I started using backbone on my personal projects. It’s a great library for rolling out MVC structure in your application, but the learning curve is pretty brutal.

The first step to really understanding the backbone.js framework:

There is no framework

Seriously, get it out of your head entirely that backbone is a framework. It’s far too minimal. Backbone is more like a toolkit library for constructing MVC frameworks.

Backbone provides a handful of building blocks: Model, Collection, View, Router, and Events.

The real key is how we use them. Like chess, the moves can be learned very quickly; the problem is that you also need to understand the tactics and strategies. The official docs leave a lot to the imagination, and the examples are lackluster at best, if they exist at all.

Backbone.Model

Backbone.Model is a storage container where we can add and read attributes via the set and get methods.

  var newModel = new Backbone.Model();
  newModel.set('foo', 'bar');
  newModel.get('foo'); // => 'bar'

You may be wondering why you don’t just use newModel.foo = ‘bar’. The real power in backbone is in the events that models can fire. By accessing the model’s attributes via set and get, you ensure that the associated callbacks get fired every time an attribute is changed.

For example, in a View containing a Model, when we change an attribute on the model via the setter, the model automatically emits a ‘change’ event, which we can have the view listen to in order to trigger a rerender.
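The mechanism is easy to see in miniature. Here is a toy setter (a sketch, not Backbone’s implementation) showing why routing writes through set() is what lets change listeners fire:

```javascript
// a toy model sketch: the setter is the hook where change
// notifications happen (this is not Backbone's actual code)
var makeModel = function() {
  var attrs = {};
  var listeners = [];
  return {
    get: function(key) { return attrs[key]; },
    set: function(key, value) {
      attrs[key] = value;
      // every change goes through here, so every listener hears it
      listeners.forEach(function(fn) { fn(key, value); });
    },
    onChange: function(fn) { listeners.push(fn); }
  };
};
```

A view would register its rerender via onChange; writing to the attributes object directly would bypass the hook entirely, which is exactly why Backbone funnels everything through set and get.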

Extending Backbone.Model

Backbone.Model is inherited from using extend. This is also where we add methods specific to our particular subclass of Backbone.Model. For instance, let’s create a Comment model which may contain a username, content, rating, and timestamp.

  var Comment = Backbone.Model.extend({
    //defaults are self explanatory
    defaults: {
      rating: 0
    },
    /*
     * initialize runs whenever new is called; it takes care of
     * setting up __super__ in the inheritance chain.
     * there is also 'constructor', which overrides the constructor
     * entirely, leaving you to manually implement the prototype chaining.
     * ie: only use constructor if you know what you are doing.
    */
    initialize: function() {
      this.set('owner', getCurrentUser());
      this.set('timestamp', new Date());
    },
    upvote: function() {
      this.set('rating', this.get('rating') + 1);
      return this;
    }
  });

  /* now we instantiate the model, passing in values in an object. */
  var newComment = new Comment({
    content: "backbone needs a library for reactive databindings"
  });

  newComment.upvote().get('rating'); // => 1

Here we’ve set up a basic backbone model representing a comment. In defaults, we put an object of default attributes. You may be wondering why I set up the owner and timestamp attributes in initialize.

The defaults object and anything inside it is evaluated into a static object inside the extend function call. Thus, if you were to try this

 var Comment = Backbone.Model.extend({
    //defaults are self explanatory
    defaults: {
      rating    : 0,
      owner     : getCurrentUser(),
      timestamp : (function() { return new Date(); })()
    }
    ...
 })

you would find that every instance has the same timestamp. To get around this, I run the logic for those defaults in initialize.
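The evaluate-once problem is easy to demonstrate in plain javascript, outside Backbone entirely:

```javascript
// the object literal is built once, so every caller
// shares the very same Date object
var makeWithLiteral = (function() {
  var defaults = { timestamp: new Date() }; // evaluated here, once
  return function() { return { timestamp: defaults.timestamp }; };
})();

// building the object inside the function gives each
// call its own fresh Date
var makeWithFunction = function() {
  return { timestamp: new Date() };
};
```

Backbone also accepts defaults as a function, which it invokes per instance; either that or the initialize approach above sidesteps the shared object.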

Backbone.Model events

Backbone models should only broadcast events in order to notify things that contain it. A model itself can contain another model as an attribute.

Backbone.Model’s ‘change’ event fires when it detects a change in the value of the attribute. If the value is a model, then a change in that model’s attributes won’t change the reference to the model. Unlike the dom, we have to explicitly set up our own event bubbling.

To set up your own event delegation, you have to set up a listener in the parent model to listen for events in the attribute model.

  var CommentHolder = Backbone.Model.extend({
    initialize: function() {
      /* We set up event listeners here.
        the first argument is the event being listened for,
        the second is the function to be invoked,
        the third is the context for the function to be called in
      */
      this.get('comment').on('all', this.bubble, this);
    },
    bubble: function() {
      // apply is used to propagate all possible arguments
      // that can come from multiple events.
      this.trigger.apply(this, arguments); //trigger is a backbone function
    }
  });

‘all’ is a backbone-defined catch-all event that sends along the name of the event as the first argument. The model will listen to its comment and broadcast any messages it receives to any who would hear.
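To make the ‘all’ plumbing concrete, here is the same bubble pattern in a toy emitter (a sketch, not Backbone’s event system):

```javascript
// a minimal emitter sketch with a Backbone-style 'all' channel
var makeEmitter = function() {
  var handlers = {};
  return {
    on: function(name, fn) {
      (handlers[name] = handlers[name] || []).push(fn);
    },
    trigger: function(name) {
      var args = Array.prototype.slice.call(arguments, 1);
      (handlers[name] || []).forEach(function(fn) { fn.apply(null, args); });
      // 'all' handlers receive the event name as the first argument
      (handlers.all || []).forEach(function(fn) {
        fn.apply(null, [name].concat(args));
      });
    }
  };
};

var comment = makeEmitter();
var holder  = makeEmitter();

// bubble: whatever the comment fires, the holder refires verbatim
comment.on('all', function() {
  holder.trigger.apply(null, arguments);
});
```

Anything listening on the holder now hears the comment’s events, which is exactly the hand-rolled bubbling the CommentHolder above performs.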

Processes in Node Part 2: Pid to Kill

Internally within node, every object has an id that V8 uses to keep track of these objects. In the same way, when we ask the operating system to create a process, it loads the program into memory and assigns it an id which can be used to track the running program throughout its life cycle.

The id this process is assigned is called a pid, and it can be accessed from within node via the global process object

  var pid = process.pid;
  console.log(pid);

Fun fact: if you open another terminal and type “kill -9 (process pid),” you will have effectively terminated the process in the other window by having the operating system deliver a kill signal directly to your running node instance.

The process module is global, so we don’t need to require it. It gives us a lot of information about the running node instance that we can use in our programs.

Example operations include

  • exiting the program
  • changing directories/ getting directory information
  • getting the name of the directory of the start script
  • retrieving arguments passed into the command line invocation of the script
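The operations above can be seen with a few one-liners; process is global, so no require is needed:

```javascript
// informational helpers on the global process object
console.log(process.pid);    // the pid, as discussed in this post
console.log(process.cwd());  // current working directory
console.log(process.argv);   // command line: ['node', '/path/to/script', ...args]

// process.exit(0) would terminate the program; 0 conventionally
// signals success, any nonzero code signals failure
```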

process events

Specific to node.js is the ability to bind handlers to process events.

  var runOnExit = function() {
   console.log('...exiting');
  };

  process.on('exit', function() {
    /* its important that you do not invoke
    * any asynchronous functions in here
    */
    runOnExit();
  });

In the case of exit, the event loop terminates. The documentation specifically mentions events scheduled with setTimeout. It’s important to note that this effectively applies to all asynchronous operations, since their callbacks are dispatched via the event loop.

Consequently, if you need to write some pertinent information to disk, be sure to use writeFileSync!!

process.kill

TLDR: Processes can send signals to each other and receive signals from the kernel

Signals can be compared to events and event listeners in javascript. kill()’s name is unfortunate, since kill is really a function for sending messages to other processes as OS signals. Most of the time, you will use this function to tell other processes to go kill themselves.

This isn’t limited to node processes speaking amongst themselves. Ever run node and then exited it using ctrl-c? That was the terminal sending your node process a SIGINT signal.

For a great list of standard unix signals, check out this Awesome Chart

We can organize them according to their default behaviors.

  • kill requests/reports
  • error reporting
  • device access notification
  • coordinating io

We can override the default behaviors for some of these signals.

  console.log("to delete => kill -9 " + process.pid);
  console.log("\n\n this is a reinforced program, it will not easily die");

  /* SIGINT is sent to your process when you type CTRL-C.
     the operating system intercepts this command and relays
     it to your node instance. node's default response is to exit.
   */
  process.on('SIGINT', function() {
    console.log('you tried to press CTRL-C, go ahead, try again');
  });

  process.on('SIGTSTP', function() {
    console.log('LOL, nope - CTRL-Z won\'t save you either!!');
  });
  //this keeps the program from exiting
  var mainLoop = function() { setTimeout(mainLoop, 0); };
  mainLoop();

A more realistic scenario would be a situation in which you need to write some files to the disk before kicking off the exit procedures.

  process.on('SIGINT', function() {
    console.log('writing to disk');
    saveData();
    return process.exit(0);
  });

You cannot override the kill -9 (SIGKILL) signal. This is why certain programs can become corrupted when you kill -9 them: they are not given a chance to save state.
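Despite the name, process.kill is just signal delivery. Signal 0 makes this easy to see, because it delivers nothing at all:

```javascript
// signal 0 is a liveness probe: no signal is actually delivered,
// but an error is raised if the target pid does not exist
process.kill(process.pid, 0); // succeeds: we certainly exist

// a polite termination request to some other process would look
// like this (otherPid is hypothetical):
// process.kill(otherPid, 'SIGTERM');
```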

For more information about signals, I found these two guides to be excellent.

Signals are a primitive pipeline for relaying messages. If you are looking to build some form of complex communication pipeline between individual processes, you are better off doing so over TCP. There are frameworks that facilitate this, such as zeromq, which has an excellent node binding available.


Processes in Node Part 1: Introduction to Processes

As a language made to exist within the browser, javascript did not originally come with a way to do process related tasks. Until Chrome, browser based javascript ran within the same process as the browser.

In Chrome, every tab runs in its own process. Consequently, your javascript on any given page runs in its own process. Now that node has brought javascript onto the server and far from the restrictive confines of the browser, we can do interesting things like run other programs from node.

What is a process?

Much like the relationship between a constructor function plus its prototype and the object created by invoking the new keyword, a process is an operating system level invocation of some program.

Process vs Program - a very high level overview

Imagine the hard drive that powers your computer. Stored on it is a long range of numbers. For all practical purposes, we can think of it as a long serial stream of bits with information encoded on them. At some magical location is the boot sector. When the computer first starts running, it starts loading binary data from the hard drive into RAM and then loads this data into the cpu. At this point we load the master process, which we call the operating system kernel. The kernel then takes care of managing the other processes in the system.

At this point, we get to your program. If we were doing this in C, the compiler would compile your source code into binary code that follows an executable format. The actual encoding of this binary instruction stream varies across cpu architectures and operating systems, but they do share some common characteristics. Most of this information is located in the header of the file: information that tells the operating system that this code is executable, along with how it is to be run.

Entry Point Address

At some point in the binary stream of bytes that is your program, there exists the first instruction that needs to be loaded into the cpu. The operating system loads this address into memory and queues it up for running on the cpu.

Data

Constants in the process are stored in the data section, an area where they can be accessed by the process as it runs. This includes mathematical constants such as Math.PI and string error messages.

Symbol and Lookup Tables.

To properly explain symbol and lookup tables, I would need to elaborate on what is going on at the instruction level of your computer. Basically, they store the locations, in memory, of all the variables and function entry points.

Processes versus threads

The simple answer is that processes have their own copy of all the data and symbol information, while in some languages, multiple threads exist within a process and all share the same data. More importantly, you cannot create threads in node.

Processes and node

For us nodesters, the picture gets a bit more complicated. The computer loads up the program instructions we affectionately know as node into memory. From there, the node program, a static set of bits on the hard drive, becomes an almost living thing that loads your javascript into memory and runs it!

To clarify, the process running when you run your node program is an instance of the node program. You can have several node instances running in memory. They can even all serve http requests as long as they are not trying to bind to the same port.

Understanding processes is incredibly useful. In Ruby, process forking is used heavily in the design of unicorn. Services like nodejitsu and heroku employ smart people who understand how processes work to architect systems that run and manage your code in the cloud. More importantly, node code can only run on one processor at any given time, but by using features such as fork, you can set up a master node process that delegates tasks to subprocesses it spawns. Since node processes are so lightweight, you could conceivably run hundreds on your system at the same time.


On This, Prototypes, and Dunderheads

Of all the stranger-than-fiction things I have seen, I can’t say I have experienced as blatant a miscarriage of conceptual understanding as I have with javascript’s object system. (Ok, I might be exaggerating just a little…)

Javascript is an interesting language. It is quite powerful when paired with appropriate libraries like underscore. Its prototypal inheritance model is a mind-blowingly powerful reductionist critique of classical inheritance. In the right hands, functions become lego pieces that can be glued onto objects in staggeringly flexible ways.

On “this”

Let’s take an object:

  var myObject = {};

This is a basic object. It is also a hash.

  //We can access it via dot notation or hash notation
  myObject['jackson'] = 5;
  5 + myObject['jackson']; // => 10;
  5 + myObject.jackson;    // => 10;

  //we can store strings, numbers, other objects and very importantly,
  // other functions!
  myObject.title = function() {
    return "this is the Jackson " + this.jackson;
  }

  myObject.title() // => "this is the Jackson 5";

If you are currently a javascript programmer, you’ve probably seen code like this. In languages like Java or C, functions are simply routines that operate on data; they are static. Once they exist, they exist in only one context. The Java version of ‘this’ refers to the instance of the instantiated object.

Functions in javascript are “first class.” This is a really important distinction to make because it enables all of the most powerful code reuse techniques. A function can exist beyond an object. It can be made an instance function of several different objects. In practice, this means we can assign functions to variables and pass them into other functions as arguments. We can even return functions from other functions.

Put more succinctly…

We can create a function which performs operations on an abstract “this” attribute without knowing what “this” is going to be until we hook a context up to it at call time.
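To make the “first class” idea concrete, here is a small sketch (the names here are made up for illustration):

```javascript
// assigned to a variable
var shout = function(msg) { return msg.toUpperCase() + '!'; };

// passed into another function as an argument
var applyTwice = function(fn, value) { return fn(fn(value)); };
applyTwice(shout, 'hey'); // => 'HEY!!'

// returned from another function
var makeGreeter = function(greeting) {
  return function(name) { return greeting + ', ' + name; };
};
var hello = makeGreeter('hello');
hello('jackson'); // => 'hello, jackson'
```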

It is important to discuss the notion of call time. Languages where functions are not first class do not have a call time.

Let’s take a canonical example in Java.

  public class ExampleClass {

    public int foo;

    public static void main(String[] args) {
        System.out.println("Clojure is way more fun");
    }

    public ExampleClass(int foo) {
        // initialize code goes here
        this.foo = foo;
    }
    public int getFoo() {
        return this.foo;
    }
  }

Java functions are not first class. The ‘this’ refers only to the instantiated instance of the class. If I instantiate this class and call the method, I will get the instance variable “foo” of that instance.

  ExampleClass exampleClass = new ExampleClass(5); // Java is ridiculous!!!
  exampleClass.getFoo(); // => 5

   // this results in an error because there is no public variable getFoo
  exampleClass.getFoo;

Because Java’s functions are not first class, there is no notion of a function being referred to unless it is specifically called. Javascript’s functions are different. I can declare a javascript function and bind it to several objects.

  var getFoo = function() {
    return this.foo;
  };

  var first = { foo:1 };
  first.getFoo = getFoo;

  var second = { foo: 2 };
  second.getFoo = getFoo;

  var third = { foo: 3 };
  third.theNameIsIrrelevant = getFoo;

  first.getFoo(); // => 1
  second.getFoo(); // => 2
  third.theNameIsIrrelevant(); // => 3

This code in javascript works because javascript functions are values that can be assigned and passed around like a company pen. (That’s about as work-safe a metaphor as I could come up with. Sorry…) The only large difference between functions and other javascript types is that functions can be called.

Call time is the point where the javascript function is executed. If it contains a ‘this’ inside it, that keyword resolves to whatever object the function is being called upon. If there is no object, ‘this’ resolves to the global object. In browsers, the global object is window.

  window.foo = "ramalamadingdong";

  getFoo(); // =>  "ramalamadingdong"

On __proto__ and prototype

new is a feature of javascript that changes what ‘this’ refers to inside a constructor function. When a constructor function is called with new, ‘this’ becomes a fresh object that is modified and eventually returned. That object delegates attribute lookups it cannot answer to whatever is set as the function’s prototype attribute. (Yes, functions in javascript are objects, so they can have attributes.)

  var dundar = {
    gummy: 'bear'
  };

  var ConstructorFunction = function() {
    // at this point this == {};
    this.foo = "bar";

    // "return this" is implicit
  };

  ConstructorFunction.prototype = dundar;

  var newObject = new ConstructorFunction();

  newObject.gummy; // => 'bear'
  newObject.foo; // => 'bar'

The __proto__ Attribute

Javascript’s prototype delegation is set up such that if an attribute is accessed on an object and the object does not have that attribute, it defers the request to the object set as its __proto__. The __proto__ is determined by the constructor function’s prototype attribute.

Which raises the question: why doesn’t this work?

  var dundar = {
    gummy: 'bear'
  };

  var newObject = { foo: 'bar' };
  newObject.__proto__ = dundar;

Actually it DOES, but only in engines that expose __proto__ for assignment, such as V8 (ie: chrome and node.js).

(As an interesting aside, this technique is used a lot in the express source code. Don’t try this at home, unless you do server side node at home of course!)

Sorry to burst your bubble, but I must advise you not to do this in the browser, because __proto__ is a protected attribute in some browsers and assigning it will not work everywhere. That’s right, the song and dance above using new is the only surefire way to set up the __proto__ attribute.
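One hedge worth adding: if you can rely on ES5, Object.create gives you a standard way to make an object with a chosen prototype, no constructor function or __proto__ assignment required. A quick sketch:

```javascript
var dundar = {
  gummy: 'bear'
};

// Object.create(proto) returns a fresh object whose prototype is proto
var newObject = Object.create(dundar);
newObject.foo = 'bar';

newObject.gummy; // => 'bear', delegated to dundar
newObject.foo;   // => 'bar', its own attribute
```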

call() and apply()

To recap

Unless you call the function with new, ‘this’ resolves within the function to whatever is on the left side of the ‘.’.

  // if 'this' is in the definition of getFoo, it will resolve to myObject.
  myObject.getFoo()

call() and apply() can be called on any function and allow you to explicitly set what ‘this’ will resolve to. The only difference is that call takes the function’s arguments directly, right after the new meaning of ‘this’, while apply takes the arguments as an array.

  var myObject1 = {
    foo: 'bar',
    num: 1
  };

  var myObject2 = {
    foo: 'manshu',
    getFoo: function() {
      return this.foo;
    },
    inc: function(num) {
      this.num += num;
      return this.num;
    }
  };

  // throws a TypeError because myObject1 has no getFoo() method
  myObject1.getFoo();

  myObject2.getFoo(); // => 'manshu'

  // here's where it gets interesting
  myObject2.getFoo.call(myObject1); // => 'bar'

  // myObject2 has no num attribute, so this returns NaN
  myObject2.inc(1);

  myObject2.inc.call(myObject1, 1); // sets myObject1.num to 2

  // apply does the same with the arguments in an array
  myObject2.inc.apply(myObject1, [1]); // myObject1.num is now 3!!

Now we’re ready for how to create inheritance chains in javascript! (That article is coming soon.)
