DATA

the motivation...

Reproducibility is all the rage these days (and rightfully so). In my opinion, R is really on the bleeding edge on this front because of the amazing RMarkdown. RMarkdown uses Pandoc's Markdown to produce high-quality reports, seamlessly integrated with R code chunks. Pandoc can also read many other markup syntaxes, such as HTML and LaTeX, so you get those for free when you use RMarkdown. Finally, you can choose to "knit" your RMarkdown to an HTML, a PDF or a Word document. When it comes to making a report about your data, I don't think it gets more reproducible than this.
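To give a rough idea of how the output format is chosen, here is a minimal sketch of the YAML header that sits at the top of an RMarkdown file. It is a hypothetical example, not the header of either report below, and the title is just a placeholder.

```yaml
---
title: "Example report"    # placeholder title, not from the reports below
output: html_document      # swap for pdf_document or word_document
---
```

Changing the `output` line is usually all it takes to knit the same file to a different format, although PDF output also needs a LaTeX installation.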

the goods...

Here are two reports produced with RMarkdown. The one on the left is meant for a class of beginner data scientists, walking through a ranked-choice voting election dataset. The report explores and tidies the data, visualizes it in a couple of different ways and also maps it. You can take a look at (or download) the original RMarkdown file here and the referenced R functions here.

The report on the right is a Computational Linear Algebra assignment, which embeds code chunks, graphs, pictures and plenty of mathematical notation (courtesy of LaTeX). A couple of highlights: Problem 3 uses Markov chains to predict gambling wins at a casino and find the best strategies; Problem 4 looks at the accessibility of medieval Russian towns, as measured by the Gould index; and Problem 7 studies image compression using singular value decomposition. You can find the original RMarkdown file here.