Using Vagrant to test Apache Spark applications

Apache Spark is fast becoming the established platform for developing big data applications, both for batch processing and, more recently, for processing real-time data with Spark Streaming. For me, Apache Spark really shines in that it allows you to write applications to run on a YARN Hadoop cluster and there is little to […]

R XML Package

I’ve spent a number of years programming in Java so, during my MSc in Bioinformatics, it took me a while to become acquainted with the nuances and idioms of writing code in R. These have been discussed extensively elsewhere, nowhere better than in John Cook’s lecture R: The Good, The Bad and The Ugly. While […]