April | 2013 | A HopStat and Jump Away

Thanks to Aaron F. for telling me about Typinator (a nice text expander for Mac), which lets me write more readable LaTeX by using Unicode (think μ(θ) vs. \mu(\theta)). It has been helpful.

It took a little getting used to, and it was a bit awkward at first because I set up my keystrokes with colons instead of the \ used in LaTeX, so :mu for μ. It’s helpful to see these symbols in LaTeX, but it’s invaluable when trying to annotate PDF notes from professors. One of my goals this year was to go almost completely digital with respect to notes. I have done a lot of it and find that it’s sometimes faster (think of copying and pasting whole equations while people are writing on a board), but it has caused a lot of hand cramps.

Also, if the characters show up on your system, Unicode can easily be used in blogging (although I would recommend MathJax or some other LaTeX-embedding JavaScript system if you’re getting math heavy), and it can be used on Twitter:

RT @statfact with “hats”: if 𝜃̂ is the MLE of θ, f(𝜃̂) is mle of f(θ) for any function f. But which unicode can you rely on? @johndcook

— John Muschelli (@StrictlyStat) April 16, 2013

Disclaimer: there are other text expanders out there (AutoHotKey for Windows), and Typinator is paid software; without a license it shows pop-ups somewhat infrequently once you have used a set number of characters.

Recently, @simplystats released the healthvis package http://t.co/mW0yZEs8wy. As such, I finally got up my courage to take a shot at adapting the iris scatterplot matrix brushing example, which is essentially a scatterplot matrix (what R’s pairs function draws), in healthvis. I am also “a part” of the group that’s doing this, so I thought it was necessary to show people that it’s possible.

Step 1) Read http://healthvis.org/develop/ completely and follow it step by step. I had an old version of Python (2.6) and a new one (2.7) installed in a custom directory, so that was somewhat fun. I installed Python fresh again and also put a symbolic link to python2.7 in /usr/bin (as this was where Google App Engine was looking). For Windows users this isn’t a problem; on Unix, a symbolic link is the equivalent of an alias/shortcut (sudo ln -s /Library/Frameworks/Python.framework/Versions/2.7/bin/python /usr/bin/python2.7). Also, I believe you have to source AllClasses.R and then healthvisMethods.R, in that order; otherwise you may get an error.

Step 2) Find your favorite d3 graphic. The d3 gallery has a bunch of examples: https://github.com/mbostock/d3/wiki/Gallery

Step 3) Passing data into d3. Disclaimer: I don’t know much about d3.

So d3.csv (and I think d3.json) passes each row of a “dataset” as one element of an array, with the column names as the property names of the object in that element. If that’s confusing, think of a dataset with 3 columns (x, y, z) and 100 rows. When d3 reads that data in, from what I can tell, it builds an array of 100 elements, where each element has 3 components, one labeled x, one y, one z, holding that row’s values. Also, it seems as though numerics are all passed as strings (at least in my example with d3.js).

Now to do this in R (see https://gist.github.com/muschellij2/5310615 for code to copy):
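To make that row layout concrete, here is a minimal sketch in plain JavaScript. It does not call d3 at all: parseCsvRows is a hypothetical stand-in that only mimics the shape described above (one object per row, keyed by column name, values left as strings), and the three-column data is made up for illustration.

```javascript
// Hypothetical stand-in for the row layout described above (NOT d3 itself):
// turns CSV text into an array with one object per row, keyed by column name.
function parseCsvRows(text) {
  const lines = text.trim().split("\n");
  const header = lines[0].split(",");
  return lines.slice(1).map(function (line) {
    const cells = line.split(",");
    const row = {};
    header.forEach(function (name, i) {
      row[name] = cells[i]; // values stay strings, as observed in the post
    });
    return row;
  });
}

// Made-up data with the x, y, z columns from the example in the text.
const csvText = "x,y,z\n1.5,2,a\n3,4.25,b";
const rows = parseCsvRows(csvText);

// Coerce the columns you know to be numeric before plotting;
// unary + converts a numeric string to a number.
const numericRows = rows.map(function (d) {
  return { x: +d.x, y: +d.y, z: d.z };
});

console.log(rows[0]);        // one object per row, all values strings
console.log(numericRows[0]); // x and y are now numbers
```

The coercion step matters because string values compare and sort as text, which can quietly distort scales and axes.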

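library(RJSONIO) ## for toJSON

nr <- nrow(data) # number of rows in the data set

ll <- vector("list", length = nr) # one list element per row

for (irow in seq_len(nr)) ll[[irow]] <- data[irow, ] # each element is a one-row data frame

js <- toJSON(ll, pretty = TRUE) # character string holding the JSON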

So overall, I simply made a list from the data set by looping over rows. This is not the most efficient way, but it worked at the time. Also, my strategy is: get it working, get it working efficiently, and then get it working prettily. So now I have a JSON object (named js) that holds all the data in JSON format, as a character string.

I then pass this data into the d3params file (again, see the development page above).

Now in JavaScript, to get the data, we have:

this.json = d3Params.json; // the JSON string built in R

this.json = JSON.parse(this.json); // now an array with one object per row

Now this.json is in the same format as the data you would get from d3.csv: an array with one object per row. Wherever the original script uses “data”, you put “this.json”. You should then be ready to go.
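As a rough illustration of that hand-off, the sketch below parses a JSON string of the kind the R step might produce and inspects the result. The payload is an assumption (two made-up iris-like rows), and the d3Params object here is a plain local stand-in, not the object healthvis supplies.

```javascript
// Assumed payload: a string shaped like the list-of-rows JSON built in R.
// This d3Params is a local stand-in for illustration only.
const d3Params = {
  json: '[{"Sepal.Length": 5.1, "Sepal.Width": 3.5, "Species": "setosa"},' +
        ' {"Sepal.Length": 7.0, "Sepal.Width": 3.2, "Species": "versicolor"}]'
};

// The two assignments from the post, collapsed into one step.
const data = JSON.parse(d3Params.json);

// Column names can be read off the first row.
const columns = Object.keys(data[0]);

// Unquoted JSON numbers come back as JavaScript numbers, so numeric
// columns can be picked out by type.
const numericColumns = columns.filter(function (name) {
  return typeof data[0][name] === "number";
});

console.log(data.length);    // number of rows
console.log(numericColumns); // columns usable in the scatterplot matrix
```

One difference from the d3.csv case is worth checking in your own data: if the JSON string stores numbers unquoted, as assumed here, JSON.parse yields numbers rather than strings, so no coercion is needed.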

The example output has a drop-down box that changes the colors of the data points according to different discrete factors. Also, if you highlight a few points in one plot, it shows where those points fall in the other scatterplots.

Step 4) Tweak the original d3 script. Now you have your data and it’s all fun and games from here, right? Partly. Anything you define in this.init or this.initialize is treated as local to that function. If you want to use things across this.init, this.visualize, or this.update, define them globally, outside these functions. That said, d3 already attaches the data and such, so it should probably only be reusable functions that you need in several of the this.* functions. (I’m not sure whether defining the function as this.function gets around this; I believe it would, but I didn’t investigate.)

Step 5) Update! I passed in a variable called “dropouts” (to denote which columns aren’t numeric and should therefore drop out of the scatterplot matrix); these are the discrete factors by which I want to color the plot. Now you need to add some things to this.update for your transitions/interactions. For example, here I passed the names of the dropout variables in my varList for healthvis and retrieve the selected one in this.update using formdata[0].value. That then lets me select the circles in d3 and color them by the value of the variable chosen in the drop-down.
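The two steps above can be sketched together in plain JavaScript. Everything here is illustrative rather than the actual healthvis or d3 code: Renderer, colorFor, the circles array, and the palette are hypothetical, and the shape of formdata (an array of objects with name and value fields) is an assumption based only on the formdata[0].value access mentioned above.

```javascript
// Shared helper defined outside the renderer's methods, so that init,
// visualize and update could all call it.
function colorFor(value, levels, palette) {
  const i = levels.indexOf(value);
  return palette[i % palette.length];
}

// Hypothetical renderer: the method names follow the post, but the bodies
// and the `circles` array (a stand-in for the drawn points) are made up.
function Renderer() {
  this.palette = ["#1f77b4", "#ff7f0e", "#2ca02c"];

  this.init = function (d3Params) {
    // State stored on `this` remains available to the other methods.
    this.json = JSON.parse(d3Params.json);
    this.circles = this.json.map(function (d) {
      return { datum: d, fill: "#000" };
    });
  };

  this.update = function (formdata) {
    const variable = formdata[0].value; // name chosen in the drop-down
    const levels = [];
    this.json.forEach(function (d) {
      if (levels.indexOf(d[variable]) === -1) levels.push(d[variable]);
    });
    const palette = this.palette;
    this.circles.forEach(function (c) {
      c.fill = colorFor(c.datum[variable], levels, palette);
    });
  };
}

// Made-up data and a made-up form submission.
const renderer = new Renderer();
renderer.init({
  json: '[{"x":1,"Species":"setosa"},{"x":2,"Species":"virginica"},{"x":3,"Species":"setosa"}]'
});
renderer.update([{ name: "dropouts", value: "Species" }]);
const fills = renderer.circles.map(function (c) { return c.fill; });
console.log(fills);
```

In a real d3 script the recoloring would more likely be a selection call along the lines of svg.selectAll("circle").style("fill", ...) with a function of the bound datum; the plain array here just keeps the sketch runnable without d3.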

Code hosted at https://github.com/muschellij2/matplotVis.

Comments: it sounds like a lot. But let me clarify this: it was released April 2nd, and I finished the app yesterday. It was one day, and I had classes and other stuff. So by “a lot” I mean that I don’t know d3 well enough to be fluent and needed a lot of console.log("blank = ", blank) statements to figure out what was going on. So if you know R and have seen JavaScript, you can make some cool visualizations. Take 20 minutes to set up your system and clone the repo. Take 20 minutes finding some application that would be relevant for your work (that hopefully someone else has implemented in d3). Then take 2 hours trying to get it to work (again: get it to work, then efficient, then pretty). If you can’t get it done by then, take a break, and I bet that in the next 2 days (if you have time) I’ll see your healthvis app all over the place. Or just email healthvis and hope they implement it for you.


Trying to at least Doggie Paddle through the Sea of Data, Contributor to http://bmorebiostat.com


Unicoding with Typinator and LaTeXing notes

Knitr and TeXmaker #rstats

Adapting d3 iris matplot for healthVis() Package