Should data mining newcomers have to learn programming at the same time? Here is a contrarian view, which advocates a GUI (“drag and drop”) environment even though the popularity of R (and, recently, Python) is increasing.
I have certainly considered this in the Big Data course I teach. All my coding is in R, but I don’t have the time for, or see enough value in, teaching “programming.” Over time, as better tools have come out of the R community, I have found ways to teach only a minimal amount of R. I start students off with a menu-driven system called Rattle. It handles data import, data manipulation, descriptive statistics, and more. It also has modules for a number of standard mining algorithms. Finally, it generates R code as it goes, so students can edit the methods if they need something a little different.
I have considered taking the next step and using an IBM or Microsoft web-based platform for mining. But there is little or no material for students on these platforms, which is a deterrent. By contrast, there are at least five good books on data mining in R.
So for now, I’m happy with using a minimal subset of R for teaching. But I continue to look at alternatives.
For additional discussion of environments for data mining, see http://www.datasciencecentral.com/profiles/blogs/what-are-the-big-guys-using
And for further alternatives, here is a new cheat sheet on data mining using Stata (which all economists at UCSD use, and most of my students therefore learn).