D3 Tips and Tricks v3.x
Malcolm Maclean
leanpub.com

Annotations by color

1 yellow note • 32 green notes • 3 blue notes

Created by Ahmed Ouda   – Last synced 2023

All your annotations

36 notes/highlights

Starting with a basic graph


Setting up the margins and the graph area. The part of the code responsible for defining the canvas (or the area where the graph and associated bits and pieces are placed) is this part.

    var margin = {top: 30, right: 20, bottom: 30, left: 50},
        width = 600 - margin.left - margin.right,
        height = 270 - margin.top - margin.bottom;

This is really (really) well explained on Mike Bostock’s page on margin conventions here http://bl.ocks.org/3019563, but at the risk of confusing you here’s my crude take on it. The first line defines the four margins which surround the block where the graph (as an object) is positioned.

    var margin = {top: 30, right: 20, bottom: 30, left: 50},
202336
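To make the arithmetic concrete, here is a minimal sketch in plain JavaScript (no D3 needed) of what those numbers work out to. The names totalWidth and totalHeight are my own labels for the 600 and 270 in the snippet; they do not appear in the book.

```javascript
// Margins around the drawing area, as in the highlighted snippet.
var margin = {top: 30, right: 20, bottom: 30, left: 50};

// Hypothetical names for the overall size of the SVG element (600 x 270).
var totalWidth = 600,
    totalHeight = 270;

// The inner drawing area is whatever is left once the margins are removed.
var width = totalWidth - margin.left - margin.right,    // 600 - 50 - 20
    height = totalHeight - margin.top - margin.bottom;  // 270 - 30 - 30

console.log(width, height); // 530 210
```

So the graph itself is drawn in a 530 by 210 pixel block, and the remaining space is reserved for the axes and labels.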

The line in the JavaScript that parses the time is the following;

    var parseDate = d3.time.format("%d-%b-%y").parse;
202344
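As a rough illustration of what a "%d-%b-%y" format describes (day, abbreviated month name, two-digit year), here is a hand-written parser in plain JavaScript. It is only a sketch and not D3's implementation: the function name parseDayMonthYear, the sample string and the rule for expanding two-digit years are all assumptions made for this example.

```javascript
// Month abbreviations matched by the %b directive (English names assumed).
var MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
              "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"];

// Hypothetical stand-in for a "%d-%b-%y" parser.
// Returns a Date, or null when the text does not match the pattern.
function parseDayMonthYear(text) {
  var match = /^(\d{1,2})-([A-Za-z]{3})-(\d{2})$/.exec(text);
  if (!match) return null;

  var month = MONTHS.indexOf(match[2]);
  if (month === -1) return null;

  // Assumption for this sketch: two-digit years belong to the 2000s.
  var year = 2000 + Number(match[3]);
  return new Date(year, month, Number(match[1]));
}

// Hypothetical sample value in day-month-year form.
var parsed = parseDayMonthYear("1-May-12");
console.log(parsed.getFullYear(), parsed.getMonth(), parsed.getDate()); // 2012 4 1
```

Note that JavaScript months are zero-based, so May is reported as 4.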

The “Ah Ha!” moment for me in understanding ranges and scales was after reading Jerome Cukier’s great page on ‘d3:scales and color’
202345

    var x = d3.time.scale().range([0, width]);
    var y = d3.scale.linear().range([height, 0]);

The purpose of these portions of the script is to ensure that the data we ingest fits onto our graph correctly. Since we have two different types of data (date/time and numeric values) they need to be treated separately (but they do essentially the same job).
202345

    x.domain(d3.extent(data, function(d) { return d.date; }));
    y.domain([0, d3.max(data, function(d) { return d.close; })]);

The idea of scaling is to take the values of data that we have and to fit them into the space we have available.
202346
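The relationship between a domain (the data values) and a range (the pixels available) can be sketched without D3 at all. The helper below, makeLinearScale, is a hypothetical stand-in written for this note rather than D3's own scale, and the domain of 0 to 100 is made up for the example.

```javascript
// Hypothetical helper: returns a function that maps a value from
// the domain [d0, d1] onto the range [r0, r1] by linear interpolation.
function makeLinearScale(domain, range) {
  var d0 = domain[0], d1 = domain[1],
      r0 = range[0], r1 = range[1];
  return function (value) {
    var t = (value - d0) / (d1 - d0); // 0 at the start of the domain, 1 at the end
    return r0 + t * (r1 - r0);
  };
}

var height = 210;

// The range runs from height down to 0 because SVG y coordinates grow downwards,
// so the smallest data value has to land at the bottom of the graph.
var y = makeLinearScale([0, 100], [height, 0]);

console.log(y(0), y(50), y(100)); // 210 105 0
```

This is why the y range in the highlighted code is written as [height, 0] rather than [0, height].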

    var x = d3.time.scale().range([0, width]);

Here we set our variable that will tell D3 where to draw something on the x axis. By using the d3.time.scale() function we make sure that D3 knows to treat the values as date / time entities (with all their ingrained peculiarities).
202346

    var xAxis = d3.svg.axis().scale(x)
        .orient("bottom").ticks(5);

    var yAxis = d3.svg.axis().scale(y)
        .orient("left").ticks(5);

I’ve included both the x and y axes because they carry out the formatting in very similar ways. It’s worth noting that this is not the point where the axes get drawn. That occurs later in the piece where the data.csv file has been loaded as ‘data’. D3 has its own axis component that aims to take the fuss out of setting up and displaying the axes, so it includes a number of configurable options. Looking first at the x axis;

    var xAxis = d3.svg.axis().scale(x)
        .orient("bottom").ticks(5);

The axis function is called with d3.svg.axis(). Then the scale is set using the x values that we set up in the scales, ranges and domains section using .scale(x).
202351

The next part ( .ticks(5) ) sets the number of ticks on the axis.
202352

    var svg = d3.select("body")
        .append("svg")
            .attr("width", width + margin.left + margin.right)
            .attr("height", height + margin.top + margin.bottom)
        .append("g")
            .attr("transform", "translate(" + margin.left + "," + margin.top + ")");

So what exactly does that all mean? Well, D3 needs to be able to have a space defined for it to draw things. When you define the space it’s going to use, you can also give that space an identifying name and attributes. In the example we’re using here, we are ‘appending’ an SVG element (a canvas that we are going to draw things on) to the <body> element of the HTML page.
202356
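It can help to see the actual values that end up in those attributes. The sketch below is plain JavaScript that only builds the numbers and the transform string; it does not touch the page. The variable names svgWidth, svgHeight and transform are my own.

```javascript
var margin = {top: 30, right: 20, bottom: 30, left: 50},
    width = 600 - margin.left - margin.right,
    height = 270 - margin.top - margin.bottom;

// Size of the outer <svg> element: the drawing area plus its margins.
var svgWidth = width + margin.left + margin.right,
    svgHeight = height + margin.top + margin.bottom;

// The <g> group is shifted right by the left margin and down by the top margin,
// so that (0, 0) inside the group is the top left corner of the drawing area.
var transform = "translate(" + margin.left + "," + margin.top + ")";

console.log(svgWidth, svgHeight, transform); // 600 270 translate(50,30)
```

Adding the margins back on simply recovers the original 600 by 270 size for the SVG element.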

Things you can do with the basic graph


The following line has the effect of shifting the text slightly to the right.

    .attr("dy", "1em")
202366

The “dy” attribute is another coordinate adjustment move, but this time a relative adjustment and the “1em” is a unit of measure that equals exactly one unit of the currently specified text point size. So what ends up happening is that the ‘Value’ label gets shifted to the right by exactly the height of the text, which neatly places it exactly on the edge of the canvas.
202366

    var valueline = d3.svg.line()
        .interpolate("basis")           // <=== THERE IT IS!
        .x(function(d) { return x(d.date); })
        .y(function(d) { return y(d.close); });

So is that it? Nooooo…….. There’s more! This is one form of interpolation effect that can be applied to your data, but there is a range and depending on your data you can select the one that is appropriate. Here’s the list of available options and for more about them head on over to the D3 wiki and look for ‘line.interpolate’.

linear – Normal line (jagged).
step-before – a stepping graph alternating between vertical and horizontal segments.
step-after – a stepping graph alternating between horizontal and vertical segments.
basis – a B-spline, with control point duplication on the ends (that’s the one above).
basis-open – an open B-spline; may not intersect the start or end.
basis-closed – a closed B-spline, with the start and the end closed in a loop.
bundle – equivalent to basis, except a separate tension parameter is used to straighten the spline. This could be really cool with varying tension.
cardinal – a Cardinal spline, with control point duplication on the ends. It looks slightly more ‘jagged’ than basis.
cardinal-open – an open Cardinal spline; may not intersect the start or end, but will intersect other control points. So kind of shorter than ‘cardinal’.
cardinal-closed – a closed Cardinal spline, looped back on itself.
monotone – cubic interpolation that makes the graph only slightly smoother.
202370
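To give a feel for what the simplest of these options mean, here is a small plain JavaScript sketch that turns a list of points into SVG path strings in a ‘linear’ and a ‘step-after’ style. The functions linearPath and stepAfterPath and the sample points are hypothetical and written for this note; they illustrate the idea and are not D3’s own code.

```javascript
// points is an array of [x, y] pixel coordinates.

// 'linear' style: straight segments joining each point to the next.
function linearPath(points) {
  return points.map(function (p, i) {
    return (i === 0 ? "M" : "L") + p[0] + "," + p[1];
  }).join("");
}

// 'step-after' style: move horizontally to the next x first,
// then vertically to the next y.
function stepAfterPath(points) {
  if (points.length === 0) return "";
  var path = "M" + points[0][0] + "," + points[0][1];
  for (var i = 1; i < points.length; i++) {
    path += "H" + points[i][0] + "V" + points[i][1];
  }
  return path;
}

var points = [[0, 10], [20, 30], [40, 5]]; // made-up sample points

console.log(linearPath(points));    // M0,10L20,30L40,5
console.log(stepAfterPath(points)); // M0,10H20V30H40V5
```

Both strings could be used as the "d" attribute of a path element; only the commands between the points differ.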

    axis.tickSize([major[[, minor], end]])

That tells us that you get to specify the size of the ticks on the axes, by the major ticks, the minor ticks and the end ticks (that is to say the lines on the very end of the graph).
202380

The last thing that is included in the code to draw the grid lines is the instruction to suppress printing any label for the ticks;

    .tickFormat("")
202380

So let’s imagine that we want to make the line on our simple graph dashed. All we have to do is insert the following line in our JavaScript code here;

    svg.append("path")
        .attr("class", "line")
        .style("stroke-dasharray", ("3, 3"))  // <== This line here!!
        .attr("d", valueline(data));
202381

    y.domain([0, d3.max(data, function(d) { return d.close; })]);

So that only considers d.close when establishing the domain. With d.open exceeding our domain, it just keeps drawing off the graph! The good news is that ‘Bill’ has provided a solution for just this problem here; All you need to replace the y.domain line with is this;

    y.domain([0, d3.max(data, function(d) { return Math.max(d.close, d.open); })]);

It does much the same thing, but this time it returns the maximum of d.close and d.open (whichever is largest).
202392
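The difference between the two versions is easy to check with a handful of rows. This is a plain JavaScript sketch: the maxOf helper stands in for the way d3.max is used above, and the sample values are invented so that one ‘open’ value is higher than every ‘close’ value.

```javascript
// Hypothetical sample rows; in the book these values come from the csv file.
var data = [
  {close: 58.13, open: 60.00},
  {close: 53.98, open: 52.10},
  {close: 67.00, open: 71.55}
];

// Hypothetical helper: largest value returned by the accessor over all rows.
function maxOf(rows, accessor) {
  return rows.reduce(function (best, row) {
    return Math.max(best, accessor(row));
  }, -Infinity);
}

// Domain top when only d.close is considered.
var closeOnly = maxOf(data, function (d) { return d.close; });

// Domain top when the larger of d.close and d.open is considered for each row.
var closeOrOpen = maxOf(data, function (d) { return Math.max(d.close, d.open); });

console.log(closeOnly, closeOrOpen); // 67 71.55
```

With the first version the open value of 71.55 sits above the top of the domain, which is why the second line would run off the graph.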

    // Select the section we want to apply our changes to
    var svg = d3.select("body").transition();
2023105

I will just be brushing the surface of the subject of transitions in d3.js, and I will certainly not do the topic the justice it deserves for in depth animations. I heartily recommend that you take an opportunity to read Mike Bostock’s “Path Transitions” (http://bost.ocks.org/mike/path/), bar chart tutorial (http://mbostock.github.com/d3/tutorial/bar-2.html) and Jerome Cukier’s “Creating Animations and Transitions with D3” (http://blog.visual.ly/creating-animations-and-transitions-with-d3-js/). Of course, one of the main resources for information on transitions is also the D3 wiki (https://github.com/mbostock/d3/wiki/Transitions).
2023107

Elements, Attributes and Styles


Clipped Path (AKA clipPath)

A clipPath is the path of an SVG shape that can be used in combination with another shape to remove any parts of the combined shape that don’t fall within the clipPath.
2023122

text-anchor

The text-anchor attribute determines the justification of a text element. Text can have one of three text-anchor types;

start where the text is left justified.
middle where the text is centre justified.
end where the text is right justified.
2023141

lengthAdjust

The lengthAdjust attribute allows the textLength attribute to have the spacing of a text element controlled to be either spacing or spacingAndGlyphs;

spacing: In this option the letters remain the same size, but the spacing between the letters and words is adjusted.
spacingAndGlyphs: In this option the text is stretched or squeezed to fit.
2023144

fill-opacity The fill-opacity style changes the transparency of the fill of an element.
2023149

stroke-opacity The stroke-opacity style changes the transparency of the stroke (line) of an element
2023149

stroke-linejoin

The stroke-linejoin style specifies the shape of the join of two lines. This would be used on path, polyline and polygon elements (and possibly more). There are three line join options;

miter where the join is squared off as would be expected at the join of two lines.
round where the outside portion of the join is rounded in proportion to its width.
bevel where the join has a straight edged outer portion clipped off to provide a slightly more contoured effect while still being angular.
2023153

writing-mode The writing-mode style changes the orientation of the text so that it prints out top to bottom. It has a single option “tb” that accomplishes this. It is relatively limited in scope compared to the equivalent for CSS, but for the purposes of generating some text it has a definite use
2023154

glyph-orientation-vertical The glyph-orientation-vertical style changes the rotation of the individual glyphs (characters) in text and if used in conjunction with the writing-mode style (and set to 0) will allow the text to be displayed vertically with the letters orientated vertically as well.
2023155

Assorted Tips and Tricks


    .on("mouseover", function(d) {
        div.transition()
            .duration(200)
            .style("opacity", .9);
        div.html(formatTime(d.date) + "" + d.close)
            .style("left", (d3.event.pageX) + "px")
            .style("top", (d3.event.pageY - 28) + "px");
        })
    .on("mouseout", function(d) {
        div.transition()
            .duration(500)
            .style("opacity", 0);
    });
2023165

    var table = d3.select("body").append("table")
            .attr("style", "margin-left: 250px"),
        thead = table.append("thead"),
        tbody = table.append("tbody");
2023206

The border-collapse style tells the table to overlap each cell’s borders, rather than treat them as discrete entities
2023214

Plunker is awesome. So what can it do for you? Well, in short, this gives you a place to put your graphs on the web without the hassle of needing a web server as well as allowing others to view and collaborate! There are some limitations to hosting graphs in this environment, but there’s no denying that for ease of use and visibility to the outside world, it’s outstanding
2023224

Force Layout Diagrams


What is a Force Layout Diagram? This is not a distinct type of diagram per se. Instead, it’s a way of representing data so that individual data points share relationships to other data points via forces. Those forces can then act in different ways to provide a natural structure to the data. The end result can be a wide variety of representations of connectedness and groupings. Mike Bostock gave a great talk which focussed on force layout techniques in 2011 at Trulia for the Data Visualization meetup group. Check video of the presentation here: http://vimeo.com/29458354 and the slides here: http://mbostock.github.com/d3/talk/20110921/#0. The most memorable quote I recall from the talk describes force layout diagrams as an “ Implicit way to do position encoding ”.
2023256

D3.js Examples Explained


data vs datum

One small line gets its own section. That line is;

    svg.datum(data);

A casual d3.js user could be forgiven for thinking that this doesn’t seem too fearsome a line, but it has hidden depths. As Mike Bostock explains here, if we want to bind data to elements as a group we would use *.data, but if we want to bind that data to individual elements, we should use *.datum. It’s a function of how the data is stored. If there is an expectation that the data will be dynamic then data is the way to go since it has the feature of preparing enter and exit selections. If the data is static (it won’t be changing) then datum is the way to go.
2023364

Crossfilter, dc.js and d3.js for Data Discovery


The ability to interact with visual data is the third step on the road to data nirvana in my humble opinion.

Step 1: Raw data
Step 2: Visualize data
Step 3: Interact with data

But I think that there might be a 4th step where data is a more fluid construct. Where the influences of interaction have a more profound impact on how information is presented and perceived. I think that the visualization tools that we’re going to explore in this chapter take that 4th step.

Step 4: Data immersion

The tools we’re going to use are not the only way that we can achieve the effect of immersion, but they are simple enough for me to use and they incorporate d3.js at their core.

Introduction to Crossfilter

Crossfilter is a JavaScript library for exploring large datasets that include many variables in the browser. It supports extremely fast interactions with concurrent views and was built to power analytics for Square Register so that online merchants can slice and dice their payment history fluidly. It was developed for Square by (amongst other people) the ever tireless Mike Bostock and was released under the Apache Licence. Crossfilter provides a map-reduce function to data using ‘dimensions’ and ‘groups’. Map-reduce is an interesting concept itself and it’s useful to understand it in a basic form to understand crossfilter better.

Map-reduce

Wikipedia tells us that “MapReduce is a programming model for processing large data sets with a parallel, distributed algorithm on a cluster”. Loosely translated into language I can understand, I think of a large data set having one dimension ‘mapped’ or loaded into memory ready to be worked on. In practical terms, this could be an individual column of data from a larger group of information. This column of data has ‘key’ values which we can define as being distinct.
2023371
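The loose description of map-reduce above can be sketched in a few lines of plain JavaScript. To be clear, this is not crossfilter’s API: the records, the field names and the countByKey helper are all made up for this note. It only shows the idea of picking one column (the ‘map’ step) and collapsing it into a value per distinct key (the ‘reduce’ step).

```javascript
// Hypothetical records with two columns.
var records = [
  {origin: "AKL", delay: 5},
  {origin: "WLG", delay: 12},
  {origin: "AKL", delay: 0},
  {origin: "CHC", delay: 7},
  {origin: "AKL", delay: 3}
];

// Hypothetical helper: 'map' each row to a key, then 'reduce' to a count per key.
function countByKey(rows, keyOf) {
  var counts = {};
  rows.forEach(function (row) {
    var key = keyOf(row);                  // map: pull out the column of interest
    counts[key] = (counts[key] || 0) + 1;  // reduce: one running count per distinct key
  });
  return counts;
}

var flightsPerOrigin = countByKey(records, function (d) { return d.origin; });

console.log(flightsPerOrigin); // { AKL: 3, WLG: 1, CHC: 1 }
```

In crossfilter terms the chosen column plays roughly the role of a ‘dimension’ and the per-key result the role of a ‘group’, although the library’s own functions are not used here.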

What can crossfilter do?

The best way to get a feel for the capabilities of crossfilter is to visit the demo page for crossfilter and to play with their example.

Figure: Crossfilter Demo Page

Here we are presented with five separate views of a data set that represents flight records demonstrating airline on-time performance. There are 231,083 flight records in the database being used, so getting that rendered in a web page is no small feat in itself. The bottom view is a table showing data for individual flights. The top, left view is of the number of flights that occur at a specific hour of the day.
2023372

Introduction to dc.js

Why, if we’ve just explored the benefits of crossfilter, are we now introducing a completely different JavaScript library (dc.js)? Well, crossfilter isn’t a library that’s designed to draw graphs. It’s designed to manipulate data. D3.js is a library that’s designed to manipulate graphical objects (and more) on a web page. The two of them will work really well together, but the barrier to getting data onto a web page can be slightly daunting because the combination of two non-trivial technologies can be difficult to achieve. This is where dc.js comes in. It was developed by Nick Qi Zhu and the first version was released on the 7th of July 2012. Dc.js is designed to be an enabler for both libraries, taking the power of crossfilter’s data manipulation capabilities and integrating the graphical capabilities of d3.js.
2023374

However, there is a slight twist… Observant readers will notice that while we have a function that resolves a date/time that is formatted with year, month, day, hour, minute and second values, I don’t include an allowance for the fractions of seconds that appear in the csv file. Well spotted. The reason for this is that in spite of initially including this formatting, I found it caused some behaviour that I couldn’t explain, so I reverted to cheating and you will note that in the next section when I format the values from the csv file, I truncate the date/time value to the first 19 characters (d.origintime.substr(0,19)). This solved my problem by chopping off the fractions of a second (admittedly without actually solving the underlying issue) and I moved on with my life.
2023381
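Here is a quick plain JavaScript check of what keeping the first 19 characters does. The timestamp below is a made-up value in a year-month-day, hour:minute:second layout with fractional seconds; the real layout of the csv file is not shown in this note, so treat it as an assumption.

```javascript
// Hypothetical date/time value with fractions of a second on the end.
var origintime = "2013-09-04T20:56:10.660Z";

// Keep only the first 19 characters: the date, the separator and hh:mm:ss.
var truncated = origintime.substr(0, 19);

console.log(truncated);        // 2013-09-04T20:56:10
console.log(truncated.length); // 19
```

Since substr is regarded as a legacy method in current JavaScript, origintime.slice(0, 19) is an equivalent alternative for this particular call.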
