Processes in Node Part 2: Pid to Kill

Internally within node, every object has an id that V8 uses to keep track of that object. In the same way, when we ask the operating system to create a process, the operating system loads the program into memory and assigns it an id which can be used to track the running program throughout its life cycle.

The id this process is assigned is called a pid, and it can be accessed from within node via process.pid:

  var pid = process.pid;
  console.log(pid);

Fun fact: if you open another terminal and type “kill -9 <pid>” (substituting the pid printed above), you will have effectively terminated the process in the other window by having the operating system deliver a kill signal directly to your running node instance.

The process module is global so we don’t need to require it. It gives us a lot of information about the running node instance that we can use in our programs.

Example operations include

  • exiting the program
  • changing directories / getting directory information
  • getting the name of the directory of the start script
  • retrieving arguments passed into the command line invocation of the script
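
A few of the operations listed above can be sketched in a handful of lines. This is only an illustration: describeProcess is a helper name I made up for the example, while process.pid, process.cwd() and process.argv are the real pieces of the process module.

```javascript
// Sketch: collect a few facts about the running node instance.
// `describeProcess` is a hypothetical helper, not part of node.
function describeProcess() {
  return {
    pid: process.pid,             // id assigned by the operating system
    cwd: process.cwd(),           // current working directory
    script: process.argv[1],      // path of the start script, if there is one
    args: process.argv.slice(2)   // arguments that follow "node script.js"
  };
}

const info = describeProcess();
console.log(info);

// process.chdir(someDirectory) changes the working directory and
// process.exit(code) ends the program. Neither is called here so that
// the sketch has no side effects.
```

Running it as “node script.js one two” should list one and two under args.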

process events

Specific to node.js is the ability to bind handlers to events that occur on the process.

  var runOnExit = function() {
   console.log('...exiting');
  };

  process.on('exit', function() {
    /* it's important that you do not invoke
     * any asynchronous functions in here
     */
    runOnExit();
  });

In the case of exit, the event loop has already stopped. The documentation specifically mentions callbacks scheduled with setTimeout, but it’s important to note that this applies to all asynchronous work: timers, I/O callbacks and anything else that needs another turn of the event loop will never run.

Consequently, if you need to write some pertinent information to the disk, be sure to use writeFileSync!!
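
Here is a minimal sketch of that advice. The file location, the shape of the state object and the saveStateSync helper are all assumptions made up for the example; fs.writeFileSync and the 'exit' event are the real parts.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical location and state, chosen only for this sketch.
const stateFile = path.join(os.tmpdir(), 'node-exit-state-' + process.pid + '.json');
const state = { startedAt: Date.now(), jobsDone: 0 };

// Hypothetical helper: a synchronous write returns only after the
// write call has completed, so it is safe inside an 'exit' handler.
function saveStateSync(file, data) {
  fs.writeFileSync(file, JSON.stringify(data));
}

process.on('exit', function () {
  // An asynchronous fs.writeFile here would never get to run its callback.
  saveStateSync(stateFile, state);
});
```

The point of the design is simply that everything inside the handler finishes before the handler returns.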

process.kill

TLDR: Processes can send signals to each other and receive signals from the kernel.

Signals can be compared to events and event listeners in javascript. kill()’s name is unfortunate, since kill is really a function for sending messages to other processes as OS signals. Most of the time, though, you will use this function to tell other processes to go kill themselves.

This isn’t limited to node processes speaking amongst themselves. Ever run node and then exited it using Ctrl-C? That keystroke reaches your node process as a signal too.
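
To see that kill is a messenger rather than an executioner, here is a small sketch that uses signal 0, which asks the operating system whether a process exists without delivering anything to it. isRunning is a hypothetical helper name; process.kill(pid, signal) is the real call.

```javascript
// Signal 0 is a special case: nothing is delivered, the OS only checks
// whether the target process exists. `isRunning` is a hypothetical helper.
function isRunning(pid) {
  try {
    process.kill(pid, 0);
    return true;
  } catch (err) {
    // EPERM means the process exists but we may not signal it.
    return err.code === 'EPERM';
  }
}

console.log(isRunning(process.pid)); // true, since we are obviously running

// A real request to shut down would look like
// process.kill(somePid, 'SIGTERM'), which the target may handle or ignore.
```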

For a great list of standard unix signals, check out this Awesome Chart

We can organize them according to their default behaviors.

  • kill requests/reports
  • error reporting
  • device access notification
  • coordinating io

We can override the default behaviors for some of these signals.

  console.log("to delete => kill -9 " + process.pid);
  console.log("\n\n this is a reinforced program, it will not easily die");

  /* SIGINT is sent to your process when you type CTRL-C.
     the operating system intercepts this command and relays
     it to your node instance. node's default response is to exit.
   */
  process.on('SIGINT', function() {
    console.log('you tried to press CTRL-C, go ahead, try again');
  });

  process.on('SIGTSTP', function() {
    console.log('LOL, nope - CTRL-Z won\'t save you either!!');
  });

  // this keeps the program from exiting: re-arming a timer once a
  // second keeps the event loop alive without spinning the cpu
  var mainLoop = function() { setTimeout(mainLoop, 1000); };
  mainLoop();

A more realistic scenario would be a situation in which you need to write some files to the disk before kicking off the exit procedures.

  process.on('SIGINT', function() {
    console.log('writing to disk');
    saveData();
    return process.exit(0);
  });

You cannot override the kill -9 signal (SIGKILL). This is why certain programs can become corrupted when you kill -9 them: they are not given a chance to save state.

For more information about signals, I found these two guides to be excellent.

Signals are a primitive mechanism for relaying messages. If you are looking to build some form of complex communication pipeline between individual processes, you are better off doing so over TCP. There are frameworks that facilitate this, such as zeromq, which has an excellent node binding available.


Processes in Node Part 1: Introduction to Processes

As a language made to exist within the browser, javascript did not originally come with a way to do process related tasks. Until Chrome, browser based javascript ran within the same process as the browser itself.

In Chrome, every tab runs in its own process. Consequently, your javascript on any given page runs in its own process. Now that node has brought javascript onto the server and far from the restrictive confines of the browser, we can do interesting things like run other programs from node.

What is a process?

Much like the relationship between a constructor function plus its prototype and the object created by invoking it with the new keyword, a process is an operating system level invocation of some program.

Process vs Program - a very high level overview

Imagine the hard drive that powers your computer. Stored on it is a long range of numbers. For all practical purposes, we can think of it as a long serial stream of bits laid out with information encoded on them. At some magical location is the boot sector. When the computer first starts running, it starts loading binary data from the hard drive into the RAM and then feeds this data to the cpu. At this point we load the master process, which we call the operating system kernel. The kernel then takes care of managing other processes in the system.

At this point, we get to your program. If we were doing this in C, the compiler would compile your source code into binary code which follows an executable format. The actual encoding of this binary instruction stream varies across cpu architectures and operating systems, but the formats do happen to share some common characteristics. Most of the shared information is located in the header of the file: it tells the operating system that this code is executable and how it is to be run.

Entry Point Address

At some point in the binary stream of bytes that is your program, there exists the first instruction that needs to be loaded into the cpu. The operating system loads the program into memory and queues up this address as the place where the cpu starts running.

Data

Constants in the process are stored in the data stream in some area where they can be accessed by the Process as it runs. This includes mathematical constants such as Math.PI or string error messages.

Symbol and Lookup Tables.

To properly explain symbol and lookup tables, I would need to elaborate on what is going on at the instruction level on your computer. Basically, they store the locations in memory of all the variables and function entry points.

Processes versus threads

The simple $2 answer is that each process has its own copy of all the data and symbol information. In some languages, multiple threads exist within a process and all share the same data. More importantly, node runs your javascript on a single thread, and at the time of writing it gives you no way to create threads of your own.

Processes and node

For us nodesters, the picture gets a bit more complicated. The computer takes the program instructions we affectionately know as node and loads them into memory. From there, the node program, until then a static set of bits on the hard drive, becomes an almost living thing that loads your javascript into memory and runs it!

To clarify, the process running when you run your node program is an instance of the node program. You can have several node instances running in memory. They can even all serve http requests as long as they are not trying to bind to the same port.

Understanding processes is incredibly useful. In Ruby, process forking is used heavily in the design of unicorn. Services like nodejitsu and heroku employ smart people who understand how processes work to architect systems that run and manage your code in the cloud. More importantly, node code can only run on one processor at any given time, but by using features such as fork, you can set up a master node process that delegates tasks to subprocesses it spawns itself. Since node processes are so lightweight, you could conceivably run hundreds on your system at the same time.
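
As a rough sketch of one node process starting another, the example below uses child_process.spawnSync rather than fork, purely because a synchronous call keeps the example short. The child is a second node instance that prints its own pid, which the parent then compares with its own.

```javascript
const { spawnSync } = require('child_process');

// Run a second node instance. process.execPath is the path of the node
// binary that is running this script, and -e evaluates a one-line program.
const child = spawnSync(process.execPath, ['-e', 'console.log(process.pid)'], {
  encoding: 'utf8'
});

const childPid = Number(child.stdout.trim());

console.log('parent pid:', process.pid);
console.log('child pid: ', childPid);
```

The two pids differ because the operating system treats the child as a separate process with its own memory, even though both are instances of the same node program.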


On This, Prototypes, and Dunderheads

Of all the stranger than fiction things I have seen, I can’t say I have experienced as blatant a miscarriage of conceptual understanding as I have with javascript’s object system. (Ok, I might be exaggerating just a little…)

Javascript is an interesting language. It is quite powerful when paired with appropriate libraries like underscore. Its prototypal inheritance model is a mind-blowingly powerful reductionist critique of classical inheritance. In the right hands, functions become lego pieces that can be glued onto objects in staggeringly flexible ways.

On “this”

Let’s take an object

  var myObject = {};

This is a basic object. It is also a hash.

  //We can access it via dot notation or hash notation
  myObject['jackson'] = 5;
  5 + myObject['jackson']; // => 10;
  5 + myObject.jackson;    // => 10;

  //we can store strings, numbers, other objects and very importantly,
  // other functions!
  myObject.title = function() {
    return "this is the Jackson " + this.jackson;
  }

  myObject.title() // => "this is the Jackson 5";

If you are currently a javascript programmer, you’ve probably seen code like this. In languages like Java or C, functions are simply routines that operate on data. They are static: once they exist, they exist in only one context. The Java version of ‘this’ refers to the instance of the instantiated object.

Functions in javascript are “First Class.” This is a really important distinction to make because it enables all of the most powerful code reuse techniques. A function can exist beyond an object. It can be made an instance function of several different objects. In practice, it means we can assign functions to variables and pass them into functions as arguments. We can even return functions from other functions.
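
The three abilities just mentioned fit in a few lines. The names shout, applyTwice and makeAdder are invented for this sketch.

```javascript
// 1. a function assigned to a variable
const shout = function (text) { return text.toUpperCase() + '!'; };

// 2. a function passed into another function as an argument
function applyTwice(fn, value) { return fn(fn(value)); }

// 3. a function returned from another function
function makeAdder(amount) {
  return function (n) { return n + amount; };
}

const addFive = makeAdder(5);

console.log(applyTwice(shout, 'hi')); // "HI!!"
console.log(applyTwice(addFive, 0));  // 10
```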

Put more succinctly…

We can create a function which performs operations related to an abstract “this” attribute without knowing what “this” is going to be till we hook up a context to it when it’s called.

It is important to discuss the notion of call time. Languages where functions are not first class do not have a call time.

Let’s take a canonical example in Java.

  public class ExampleClass {

    public int foo;

    public static void main(String[] args) {
        System.out.println("Clojure is way more fun");
    }

    public ExampleClass(int foo) {
        // initialize code goes here
        this.foo = foo;
    }
    public int getFoo() {
        return this.foo;
    }
  }

Java functions are not first class. The ‘this’ refers only to the instantiated instance of this class. If I instantiate this class and call the function, I will get the instance variable “foo” of that instance.

  ExampleClass exampleClass = new ExampleClass(5); // Java is ridiculous!!!
  exampleClass.getFoo(); // => 5

   // this results in an error because there is no public variable getFoo
  exampleClass.getFoo;

Because Java’s functions are not first class, there is no notion of this function being referred to unless it’s being specifically called. Javascript’s functions are different. I can declare a javascript function and bind it to several objects.

  var getFoo = function() {
    return this.foo;
  };

  var first = { foo:1 };
  first.getFoo = getFoo;

  var second = { foo: 2 };
  second.getFoo = getFoo;

  var third = {foo:3};
  third.theNameIsIrrelevant= getFoo;

  first.getFoo() // => 1
  second.getFoo() // => 2
  third.theNameIsIrrelevant() // => 3

This code in javascript works because javascript functions are variables that can be assigned and passed around like a company pen. (That’s about as work safe a metaphor as I could come up with. Sorry…) The only large difference between functions and other javascript types is that functions can be called.

Call time is the point where the javascript function is executed. If it contains a ‘this’, that keyword resolves to whatever object the function is being called upon. If there is no object, ‘this’ resolves to the global object (in non-strict code; in strict mode it is undefined). In browsers, the global object is window.

  window.foo = "ramalamadingdong";

  getFoo(); // =>  "ramalamadingdong"
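
The same function can therefore give different answers depending purely on how it is called. In this sketch the function opts into strict mode, so a missing object shows up as undefined rather than the global object; holder and detached are names made up for the example.

```javascript
function getFoo() {
  'use strict'; // in strict mode a missing context is undefined, not the global object
  return this === undefined ? 'no context' : this.foo;
}

const holder = { foo: 1, getFoo: getFoo };
const detached = holder.getFoo; // same function, but no object attached

console.log(holder.getFoo()); // 1
console.log(detached());      // "no context"
```

Nothing about getFoo changed between the two calls; only what sat to the left of the dot at call time.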

On proto and prototype

new is a feature of javascript that changes the context of ‘this’ in a constructor function. When a constructor function is called with new, ‘this’ becomes a fresh object that is modified and eventually returned. That object will delegate any lookups it cannot respond to itself to whatever the function’s prototype attribute is. (A function can carry a prototype attribute because functions in javascript are objects.)

  var dundar = {
    gummy: 'bear'
  };

  var ConstructorFunction = function() {
    // at this point 'this' is a fresh, empty object
    // that delegates to ConstructorFunction.prototype
    this.foo = "bar";

    // "return this" is implicit
  };

  ConstructorFunction.prototype = dundar;

  var newObject = new ConstructorFunction();

  newObject.gummy; // => 'bear'
  newObject.foo; // => 'bar'
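
To make the steps that new performs more concrete, here is a rough imitation written as an ordinary function. simulateNew and Candy are hypothetical names for this sketch, and the sketch ignores finer details of the real operator (it does not work with ES2015 classes, for instance). It leans on apply, which is covered further down.

```javascript
// `simulateNew` is a hypothetical helper that imitates `new Ctor(...args)`.
function simulateNew(Ctor, ...args) {
  // 1. create an object that delegates to the constructor's prototype
  const obj = Object.create(Ctor.prototype);
  // 2. run the constructor with 'this' set to that object
  const result = Ctor.apply(obj, args);
  // 3. return the object, unless the constructor returned an object of its own
  const isObject = result !== null &&
    (typeof result === 'object' || typeof result === 'function');
  return isObject ? result : obj;
}

function Candy(flavor) { this.flavor = flavor; }
Candy.prototype = { gummy: 'bear' };

const viaNew = new Candy('cola');
const viaSketch = simulateNew(Candy, 'cola');

console.log(viaSketch.flavor, viaSketch.gummy); // cola bear
```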

The __proto__ Attribute

Javascript’s prototype delegation is set up such that if an attribute is accessed on an object and the object does not have that attribute, it will defer that request to the object set as its __proto__. The __proto__ is determined by the constructor function’s prototype attribute.

This raises the question: why doesn’t this work?

  var dundar = {
    gummy:'bear'
  }

  newObject = { foo: 'bar' };
  newObject.__proto__ = dundar;

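Actually it DOES, but __proto__ started life as a non-standard extension, so whether it works depends on the engine. V8 (ie: chrome and node.js) supports it.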

(As an interesting aside, this technique is used a lot in the express source code. Don’t try this at home, unless you do server side node at home of course!)

Sorry to burst your bubble, but I must advise you not to rely on this in the browser, because __proto__ began as a non-standard attribute that not every engine lets you assign. That’s right, the song and dance above using new is the way that works everywhere, with ES5’s Object.create as the standard alternative that skips the constructor.
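
For completeness, here is a small sketch of the standard way to get the same delegation without touching __proto__. Object.create and Object.getPrototypeOf are both part of ES5; the object names mirror the earlier examples.

```javascript
const dundar = { gummy: 'bear' };

// Object.create makes a new object whose prototype is `dundar`.
const newObject = Object.create(dundar);
newObject.foo = 'bar';

console.log(newObject.gummy);                             // "bear", found on the prototype
console.log(newObject.foo);                               // "bar", found on the object itself
console.log(Object.getPrototypeOf(newObject) === dundar); // true
console.log(newObject.hasOwnProperty('gummy'));           // false
```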

call() and apply()

To recap

Unless you call the function with new, ‘this’ becomes resolved within the function to whatever is on the left side of the ‘.’.

  // if 'this' is in the definition of getFoo, it will resolve to myObject.
  myObject.getFoo()

call() and apply() can be called on any function and allow you to explicitly set what ‘this’ will resolve to. The only difference is that call takes the arguments of the function one by one right after the new meaning of ‘this’, while apply takes the arguments as an array.

  var myObject1 = {
    foo: 'bar',
    num: 1
  };

  var myObject2 = {
    foo: 'manshu',
    getFoo: function() {
      return this.foo;
    },
    inc: function(num) {
      this.num += num;
      return this.num;
    }
  };

  // this would throw a TypeError because myObject1 has no getFoo() method
  // myObject1.getFoo();

  myObject2.getFoo(); // => 'manshu'

  // here's where it gets interesting
  myObject2.getFoo.call(myObject1); // => 'bar'

  // this one misbehaves because myObject2 does not have a num attribute:
  // undefined + undefined is NaN
  myObject2.inc(); // => NaN, bad

  myObject2.inc.call(myObject1, 1); // sets myObject1.num to 2

  // apply does the same with the arguments in an array
  myObject2.inc.apply(myObject1, [1]); // myObject1.num is now 3!!

Now we’re ready for how to create inheritance chains in javascript! (That article is coming soon.)


Quicksort Algorithm in Javascript

Here’s a basic Quicksort algorithm. You can call this code from Node.js:

var Sort = require('./quicksort.js');
var list = [3, 4, 5, 2, 5];
var sorted = Sort(list, function(a, b) {
  if (a > b) {
      return 1;
  } else if (a === b) {
      return 0;
  } else {
      return -1;
  }
});

Have fun, and I hope this proves useful to somebody!

//quicksort.js
module.exports = function(list1, compareFunction) {
    var list = list1;
    var cmp = compareFunction;

    var swap = function(i, j) {
        /* swaps the entries list[i] and list[j] */
        var _i = list[i];
        var _j = list[j];
        list[j] = _i;
        list[i] = _j;
    }

    var partition = function(p, r) {
        /* Partitions list[p..r] around the pivot list[r].
         * Afterwards every entry left of the returned index compares
         * <= the pivot and every entry right of it compares greater.
         */
        var i = p; // on the initial pass, p is 0

        // for j = p to j = r-1:
        // if list[j] <= list[r], exchange list[i] and list[j] and advance i
        for (var j = p; j < r; j++) {
            if (cmp(list[j], list[r]) <= 0) {
                // swap them!! and then increment i
                swap(i, j);
                i++;
            }
        }

        // move the pivot into its final position
        swap(i, r);
        return i;
    }

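    var quickSort = function(p, r) {
        if (p < r) {
            // q must be local: as an implicit global it would be
            // overwritten by the recursive calls below
            var q = partition(p, r);
            quickSort(p, q - 1);
            quickSort(q + 1, r);
        }
    }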

    var initialize = function() {
        quickSort(0, list1.length-1);

    }
    //compute the sorted value
    initialize();

    return list1;

}

Sorting Algorithms in Python - Part 1

I’m a couple of weeks into Robert Sedgewick’s class on algorithms currently running on Coursera. To round off the weekend I decided to reimplement some of the classic sorting algorithms in Python.

The compare function

  def cmp(a, b):
      if (a == b):
          return 0
      elif (a > b):
          return 1
      else:
          return -1

  def swap(i, j, array):
      swapv = array[i]
      array[i] = array[j]
      array[j] = swapv
      return array

  def biggest(array):
      i = 0
      biggest_val = array[0]
      biggest_index = i
      while(i < len(array)):
          if (array[i] > biggest_val):
              biggest_val = array[i]
              biggest_index = i
          i = i + 1
      return (biggest_index, biggest_val)

  def minv(array, start_index):
      i = start_index
      min_val = array[start_index]
      min_index = i
      while(i < len(array)):
          if (array[i] < min_val):
              min_val = array[i]
              min_index = i
          i = i + 1
      return (min_index, min_val)

  def min(array):
      return minv(array, 0)

Selection Sort

  def selectionSort(array):
      i = 0
      array1 = array
      while (i < len(array1)):
          array1 = swap(i, minv(array1, i)[0], array1)
          i = i + 1
      return array1

Insertion sort is just a specialized case of shellsort so I created a base composite function that encapsulates the core of both algorithms.

  def gapSort(array, gap):
      """helper function to aid insertion sort and shell sort"""
      i = gap
      array1 = array
      while (i < len(array1)):
          j = i
          # while the object at j is less than the one gap places before it
          while ((j >= gap) and (cmp(array1[j], array1[j-gap]) < 0)):
              swap(j, j-gap, array1)
              j = j - gap
          i = i + 1
      return array1

  def insertionSort(array):
      return gapSort(array, 1)

  def shellSort(array):
      """sorts using the shellsort algorithm"""
      # gaps 1, 4, 7, ... in descending order; max(1, ...) guarantees the
      # final gap-1 pass even for very short arrays
      vals = [3*h+1 for h in range(max(1, len(array)//3))][::-1]
      for val in vals:
          array = gapSort(array, val)
      return array

Finally, merge sort. Running in N log(N), it is the only algorithm in this post, other than Quicksort, worth using on large datasets.

  def merge(array, p, q, r):
      """Merge the sorted runs array[p..q] and array[q+1..r] in place."""
      left = array[p:q+1]
      right = array[q+1:r+1]
      i = 0
      j = 0
      for k in range(p, r+1):
          # take from left when right is used up, or when left still has
          # items and its head is not larger than the head of right
          if (j >= len(right)) or (i < len(left) and left[i] <= right[j]):
              array[k] = left[i]
              i = i + 1
          else:
              array[k] = right[j]
              j = j + 1


  def mergeSort(array):
      def sort(p, r):
          if p < r:
              # integer division keeps q usable as an index
              q = (p + r) // 2
              sort(p, q)
              sort(q + 1, r)
              merge(array, p, q, r)
      sort(0, len(array) - 1)
      return array

Quick sort and its probabilistic guarantee of fast enough run time strike me as the most mathematically perverse form of black magic. Beautiful in the inherent underlying fabric of its utility. I’ll cover that when I get to part 2.
