Some U.S. police departments dump body-camera programs amid high costs – The Washington Post

Posted by Roger Bohn

Smaller departments that struggle with the cost of equipment and storage of data are ending or suspending programs aimed at transparency and accountability.

Source: Some U.S. police departments dump body-camera programs amid high costs – The Washington Post

My comment: this was predictable. Video data gets big very quickly. See my discussion 3 years ago.

Should we teach data mining without using a programming language?

Posted by Roger Bohn

Should data mining newcomers have to learn programming at the same time? Here is a contrarian view, which advocates a GUI (“drag and drop”) environment, even though the popularity of R (and, recently, Python) is increasing.


Police body cams will cost $1000s per cop per year!

Posted by Roger Bohn

Police body cams sound great, but it will take years to work out all the ramifications, rules for using them, etc. One concern is cost. It’s likely that the initial cost of the cameras is a small fraction of the total cost.

One issue is the cost of storing the video recorded by cams. According to my rough calculations, this could be thousands of dollars per user per year. That will put a hole in any department’s budget.
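To make the shape of such a rough calculation concrete, here is a minimal sketch in Python. Every input below (hours recorded per shift, shifts per year, gigabytes per hour of video, retention period, and price per gigabyte-month) is a hypothetical placeholder chosen for illustration, not a figure from my own estimate or from any vendor. The function name `annual_storage_cost` is likewise invented for this example.

```python
def annual_storage_cost(hours_per_shift, shifts_per_year, gb_per_hour,
                        retention_years, dollars_per_gb_month):
    """Steady-state yearly storage bill for one camera user, in dollars."""
    # Video recorded by one user in a year.
    gb_recorded_per_year = hours_per_shift * shifts_per_year * gb_per_hour
    # Once the archive has filled up, retention_years' worth of video
    # is sitting in storage at any given time.
    gb_stored = gb_recorded_per_year * retention_years
    # Storage is billed per gigabyte per month.
    return gb_stored * dollars_per_gb_month * 12


# All five inputs are ASSUMED placeholder values, not measured data.
cost = annual_storage_cost(
    hours_per_shift=4,          # assumed hours of video kept per shift
    shifts_per_year=230,        # assumed shifts worked per year
    gb_per_hour=1.0,            # assumed video size
    retention_years=2,          # assumed retention policy
    dollars_per_gb_month=0.10,  # assumed price of managed storage
)
print(f"${cost:,.0f} per user per year")
```

With these placeholder inputs the result is a little over two thousand dollars per user per year. The point of the sketch is the structure: the bill scales with recording hours, video quality, retention period, and storage price multiplied together, so modest changes in policy move the total a lot.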


Web site: Data mining with R for MBA level students.

Posted by Roger Bohn

I just completed teaching a 10-week course on data mining for MS-level professional degree students. Most of the material is on a web site (https://irgn452.wordpress.com/chron/). The course assumes good knowledge of OLS regression, but other than that is self-contained.
Software is R, with a heavy dose of Rattle for the first few weeks. (Rattle is a front end for R.) The main algorithms I emphasize are Random Forests and LASSO, for both classification and regression. I emphasize creating new variables that correspond to the physical/economic characteristics of the problem under study. The course requires a major project; some students scrape or mash their own data. Because we have only 10 weeks, I provide a timetable and a lot of milestones for the projects, and frequent one-on-one meetings.
The web site is not designed for public consumption, and is at best in “early beta” status. I am making it available in case anyone wants to mine it for problem sets, discussions of applied issues not covered in most books, etc. Essentially, it is a crude draft of a text for MBAs on data mining using R. This was about the fifth time I taught the course.

By the way, a lot of the lecture notes are modestly modified versions of the excellent lecture material from Matt Taddy. His emphasis is more theoretical than mine, but his explanations and diagrams are great. Readings were generally short sections from either ISLR by James et al. or Data Mining with Rattle and R. Both are available as ebooks at many universities. My TA was Hyeonsu Kang.

Good data mining reference books

Posted by Roger Bohn

The students in my Big Data Analytics course asked for a list of books on the subject they should have in their library. UCSD has an excellent library, including digital versions of many technical books, so my list is entirely books that can be downloaded on our campus. Many are from Springer. There are several other books that I have purchased, generally from O’Reilly, that are not listed here because they are not available on campus.

These are intended as reference books for people who have taken one course in R and data mining. Some of them are “cookbooks” for R. Others discuss various machine learning techniques. The list: BDA16 reference book suggestions

If you have other suggestions, please add them in the comments with a brief description of what is covered.

Using data mining to ban trolls on League of Legends

Posted by Roger Bohn

Something I just found for my Big Data class.

Riot rolls out automated, instant bans for League of Legends trolls

Machine learning system aims to remove problem players “within 15 minutes.”

An interesting thread of player comments has a good discussion of potential problems with automated bans. Only time will tell how well the company develops the system to get around these issues.

This company also took an experimental approach to banning players, and hired 3 PhDs in Cognitive Science to develop it. (Just to be clear, their experiments did not appear to be automated A/B style experiments.) After the jump is a screen shot from that system.

But I’m not tempted to play League of Legends to study player behavior and experiment with getting banned! (I don’t think I’ve ever tried an MMO beyond some prototypes 15 years ago.) If any players want to post their observations here, great.

Chartjunk: Second-worst graphic of the month!

Posted by Roger Bohn

A bad graphic from a pro-solar group is perhaps not surprising. (See previous post.) Here is one from Bloomberg that verges on incomprehensible. Bloomberg as a source is surprising.

Which way is up? (Answer: down is up)

Looking closer, it appears that Skill Desirability increases from left to right, and Skill Frequency increases from top to bottom?! Graphs should be drawn so that UP means higher. In any case, it should not take prolonged inspection to deduce which variable is on the X axis.

The graphic also manages to make as many schools as possible look good at something. In Financial Services, the top 3 schools for Communication Skills are listed as Tuck, McCombs, and Kellogg. But in Technology, the top 3 schools change to Fuqua, Haas, and Kellogg. And for Consulting, the top 3 are London, Harvard, and Ivey. Since “Communication Skills” is the most desired skill of all according to the graph, eight schools can say they are in the Top 3 for teaching the most sought-after skill.

Art and Science in Technology – Roger Bohn's Blog

Technologies are mixtures of Craft and Science

Big Data
