Debugging MapReduce in JavaScript

Cross post from my employer's development blog: http://rootinc.github.io/2016/02/03/debug-mapreduce-js/

With single-line arrow functions in ES6, it’s getting easier and cleaner to use JS’s native Array.prototype.map() and Array.prototype.reduce() functions for molding our data. I recently decided to go the MapReduce route for a project to filter some data up front. Inevitably, I ran into some issues with my logic and wasn’t getting the results I expected. The thing is, there doesn’t seem to be a simple way to add debugging to a chain of map() and reduce() calls, particularly if you’re using single-line arrow functions.

     let behaviors = track.metrics
       .reduce((list1, m) => list1.concat(m.behaviors), [])
       .reduce((list2, b) =>
         ((list2.indexOf(b) === -1 && list2.push(b)) || 1) && list2
       , [])
       .map(bId => json.behaviors[bId]);

Here I need to take multiple arrays of IDs, flatten them, remove duplicates, then map each ID to its respective object. I had an issue with the final value at first (note: the code above is correct) and wanted to inspect the values between each method call in the chain.
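To make the three steps concrete, here is the same chain run against a small made-up data set. The shapes of `track` and `json` below are assumptions inferred from the snippet above; the real objects presumably carry more fields.

```javascript
// Hypothetical sample data; only the fields the chain touches are modeled.
const track = {
  metrics: [
    { behaviors: [140, 141, 87] },
    { behaviors: [141, 100] },
  ],
};
const json = {
  behaviors: {
    87: { name: 'b87' },
    100: { name: 'b100' },
    140: { name: 'b140' },
    141: { name: 'b141' },
  },
};

const behaviors = track.metrics
  // 1. Flatten: [[140, 141, 87], [141, 100]] -> [140, 141, 87, 141, 100]
  .reduce((list1, m) => list1.concat(m.behaviors), [])
  // 2. Dedupe: push b only if it is not in list2 yet, always return list2
  .reduce((list2, b) =>
    ((list2.indexOf(b) === -1 && list2.push(b)) || 1) && list2
  , [])
  // 3. Look up the object for each remaining id
  .map(bId => json.behaviors[bId]);

console.log(behaviors.map(b => b.name)); // [ 'b140', 'b141', 'b87', 'b100' ]
```

The `|| 1` in the dedupe step only exists to keep the expression truthy when the id is already present, so that `&& list2` always hands the accumulator back.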

How do I do this? Well, a very rough approach would be to switch to bracketed arrow functions, which allow me to run more than one statement in the function.

     let behaviors = track.metrics
       .reduce((list1, m) => {
         console.log(list1);
         return list1.concat(m.behaviors);
       }, [])
       .reduce((list2, b) => {
         console.log(list2);
         return ((list2.indexOf(b) === -1 && list2.push(b)) || 1) && list2;
       }, [])
       .map(bId => json.behaviors[bId]);

This works but suffers from a couple of drawbacks. First, you lose the syntactic conciseness of the implicitly returning, bracket-free arrow functions, resulting in what looks like a borderline ES5 mess. Second, I get way too many calls to console.log, since the functions run for every item in the array, so I would have to add additional logic to each function to prevent that. I’m not even going to bother typing that all out here.
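For the curious, that extra logic could look roughly like the sketch below for the first reduce() alone. The `metrics` array is a made-up stand-in for `track.metrics`, and logging on the last index is just one way to guard the call.

```javascript
// Hypothetical input standing in for track.metrics.
const metrics = [{ behaviors: [1, 2] }, { behaviors: [2, 3] }];

const flat = metrics.reduce((list1, m, i, arr) => {
  const next = list1.concat(m.behaviors);
  // Guard: log only once, after the last item has been folded in.
  if (i === arr.length - 1) {
    console.log('flattened', next);
  }
  return next;
}, []);
```

Every function in the chain would need the same treatment, which is exactly the clutter I wanted to avoid.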

So, I wrote up a rather simple factory function to use in this case.

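     // Factory: returns a reduce() handler that logs the array once, with labels.
     function MRDbg(...txt) {
       return function (prev, cur, i, arr) {
         // Without an initial value, the first callback runs at index 1.
         if (i === 1) {
           console.log(...txt, arr);
         }
         // Return the array itself so the chain continues unchanged.
         return arr;
       };
     }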

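It expects to be passed as a reduce() handler with no initial value, and it lets you label each debug call. Without an initial value, reduce() starts its callback at index 1, so the i === 1 check logs the array exactly once, and returning arr every time passes the array through to the next call in the chain untouched. So it fits nicely into my current MapReduce chain, and, here’s a big one, I don’t have to modify my existing functions and then worry about changing them back later!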

     let behaviors = track.metrics
       .reduce((list1, m) => list1.concat(m.behaviors), [])
       .reduce(MRDbg('first'))
       .reduce((list2, b) =>
         ((list2.indexOf(b) === -1 && list2.push(b)) || 1) && list2
       , [])
       .reduce(MRDbg('second'))
       .map(bId => json.behaviors[bId]);

Much cleaner! We then get some very simple console output.

first [140, 141, 142, 143, 144, 132, 145, 146, 147, 148, 149, 150, 151, 152, 153, 87, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 100, 178, 89, 179, 101, 102, 98, 176, 177, 180, 100, 178, 89, 179, 101, 181, 102, 182, 140, 141, 142, 144, 145, 146, 147, 148, 149, 183, 150, 152, 184, 87, 154, 155, 158, 159, 160, 110, 185, 163, 112, 164, 166, 167, 168, 169, 170, 171, 173, 186, 174, 175]
main.js:115
second [140, 141, 142, 143, 144, 132, 145, 146, 147, 148, 149, 150, 151, 152, 153, 87, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 100, 178, 89, 179, 101, 102, 98, 180, 181, 182, 183, 184, 110, 185, 112, 186]
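One caveat with the reduce()-without-initial-value trick that `MRDbg` relies on: it only passes the array through when there are at least two elements. The sketch below copies `MRDbg` so it runs on its own and shows the two edge cases; the sample arrays are made up.

```javascript
function MRDbg(...txt) {
  return function (prev, cur, i, arr) {
    if (i === 1) {
      console.log(...txt, arr);
    }
    return arr;
  };
}

// Normal case: logs "demo [ 1, 2, 3 ]" once and hands the same array back.
const input = [1, 2, 3];
const passed = input.reduce(MRDbg('demo'));
console.log(passed === input); // true

// Edge case: with a single element, reduce() never calls the handler,
// so nothing is logged and the element itself comes back, not the array.
const single = [42].reduce(MRDbg('single'));
console.log(single); // 42

// Edge case: an empty array with no initial value throws a TypeError.
let emptyError = null;
try {
  [].reduce(MRDbg('empty'));
} catch (err) {
  emptyError = err;
}
console.log(emptyError instanceof TypeError); // true
```

For throwaway debugging on data you know is non-trivial this is probably fine, but it is worth remembering before leaving a `MRDbg` call in a chain that might see short or empty arrays.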

You can find a gist of the function here, along with an ES5-compatible version.
