Monday, December 9, 2013

Tuesday, December 3, 2013

Monday, November 11, 2013

Booz Allen Field Guide to Data Science

I haven't read this, so I can't vouch for it, but it's been getting a lot of coverage in my Twitter stream.

Monday, November 4, 2013

ggplot in python

One of the best features of the R programming language is its fantastic ggplot2 visualization library. It looks like there is now a Python clone, complete with a nice functional interface. I don't think all the wrinkles have been ironed out, but it's still pretty cool.

pandasql: manipulating pandas dataframes with SQL

Pretty cool. I haven't really put this through its paces, but it looks neat on first impression.
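To give a feel for what "SQL on a dataframe" means, here is a minimal sketch. It does not use pandasql itself: it copies a DataFrame into an in-memory SQLite table with plain pandas and the standard-library sqlite3 module, then runs a query against it, which is the general idea as I understand it. The `purchases` data and the `query_frame` helper are made up for illustration.

```python
import sqlite3

import pandas as pd

# Hypothetical toy data, invented for this example.
purchases = pd.DataFrame({
    "customer": ["ann", "bob", "ann", "cat"],
    "amount": [10.0, 5.0, 7.5, 3.0],
})


def query_frame(df, table_name, sql):
    """Copy df into an in-memory SQLite table and return the query result as a DataFrame.

    This is an illustrative helper, not part of pandas or pandasql.
    """
    conn = sqlite3.connect(":memory:")
    try:
        # Write the frame as a SQL table, leaving out the pandas index.
        df.to_sql(table_name, conn, index=False)
        # Run the SQL and read the result back into a new DataFrame.
        return pd.read_sql_query(sql, conn)
    finally:
        conn.close()


totals = query_frame(
    purchases,
    "purchases",
    "SELECT customer, SUM(amount) AS total "
    "FROM purchases GROUP BY customer ORDER BY customer",
)
print(totals)
```

The same aggregation in plain pandas would be a `groupby` followed by `sum`; the SQL route is mostly a convenience if you already think in SQL.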

Wednesday, October 2, 2013

Tuesday, October 1, 2013

The Target Story

This was a fairly major story about the use (and abuse) of applied data science in retail. Target was able to discover that a young woman was pregnant before her father knew, by looking at her purchase patterns.

NY Times
Forbes


Python the Hard Way

Just a reminder to those looking for the free online version of Python the Hard Way.

Monday, September 30, 2013

Updated Syllabus

The syllabus has recently been updated to include course office hours.

  1. Josh: Wednesdays, 2-3pm, KMC 8-171 and by appointment
  2. Kumar: Mondays, 2-4pm, KMC 7-100

Sunday, September 29, 2013

NYC Data Hack Week

A great opportunity to get involved with NYC's data community and network with people who have a variety of data skills. From the site:

Passionate about data? Come to the DataWeek Challenge! An event on November 2nd that will challenge the city's hackers, data scientists, and data newbies to test their skills in the ultimate data hacking competition.
Held on November 2nd at the conclusion of this year's NYC DataWeek, the DataWeek Challenge is a full day of hacking challenges that brings together developers with NYC Open Data, as well as hand-picked data from a number of companies, to demonstrate its potential to address civic challenges in a fun, friendly competition.

Data Challenges will be announced soon! Some of the core areas you can look forward to are Data Visualization, Data Scraping, Analysis, and Data-Driven Application Development, and we'll also have breakout talks about the most important issues facing the data world these days.

Things to look forward to:
  • Meeting other amazing data enthusiasts
  • A friendly and fun data-driven competition
  • Prizes for completing the best challenges
  • Sharing of hacks for sourcing, scraping, cleaning, and visualizing data
  • A wide set of data to hack on from government and enterprise

Saturday, September 28, 2013

Datasets released by Google

Might be a good course project in here somewhere.

Guides on Git and GitHub

GitHub is a great way to store and share code you've written. However, if you've never used it before, it can seem a bit confusing! Here are some links that should prove helpful. Sign up for a free account and give 'em a shot.



Think Stats

Think Stats, one of the texts that we'll be using for this course, is available for free online from the author. If you prefer a hard copy, it's only about $20 on Amazon.

New Room

It's strange that we would even need to think of this. Unfortunately, the classroom we were initially assigned had only a single power outlet. Not enough for a programming-intensive class! Fortunately, we've located a better option. From now on, we'll be meeting in KMC 4-80. See you all there!

Wednesday, September 25, 2013

Course Objectives Posted

I've added an outline of some course objectives. This should give you a rough estimate of what you can expect to learn throughout the semester!

Monday, September 23, 2013

Saturday, September 21, 2013

Homework 0 posted

The purpose of this initial homework assignment is to ensure that you're set up with the tools that we'll be using throughout this course, namely the Unix terminal and the Python programming language. Don't worry if these concepts are foreign at first; we'll be making sure that everyone gets up to speed. Find homework 0 here.

Friday, September 13, 2013

Course Website

Here, hosted on GitHub. A repository for some key material and links to course resources.

Syllabus: Practical Data Science 2013



Here. Still somewhat in flux. Versioning on the course GitHub here.