Learning Python for Data Science

Just Starting Out

Your First Python Video Course

Intro to Python for Data Science from DataCamp (Time: ~4 hours; Format: videos and exercises)

Course

  • Python basics
  • Lists
  • Functions and packages
  • NumPy

An Introductory Written Tutorial

A nice course from Software Carpentry on the basics of Python programming, with a data analysis twist.

Course (Time: ~8 hours; Format: written tutorial and exercises)

  1. Analyzing Patient Data
  2. Repeating Actions with Loops
  3. Storing Multiple Values in Lists
  4. Analyzing Data from Multiple Files
  5. Making Choices
  6. Creating Functions
  7. Errors and Exceptions
  8. Defensive Programming
  9. Debugging
  10. Command-Line Programs
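
To give a flavor of what lessons like these build toward, here is a minimal sketch that combines lists, loops, functions, and a defensive check. The readings and the function name `daily_mean` are made up for illustration and are not taken from the course materials.

```python
# Hypothetical data: one row of daily readings per patient.
readings = [
    [0, 1, 2, 1],
    [0, 2, 3, 2],
    [1, 1, 4, 3],
]


def daily_mean(rows, day):
    """Return the mean reading across all patients for one day (0-based)."""
    # Defensive programming: fail early with a clear message on bad input.
    assert rows, "need at least one row of readings"
    assert 0 <= day < len(rows[0]), "day index out of range"
    total = 0
    for row in rows:  # repeat the same action for every patient
        total += row[day]
    return total / len(rows)


# Compute the mean for each day, then find the day with the highest mean.
means = [daily_mean(readings, day) for day in range(4)]
peak_day = means.index(max(means))
print("daily means:", means)
print("peak on day", peak_day)
```

The assertions are one simple way to practice defensive programming; a real analysis would more likely load the readings from files than hard-code them.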

Your Second Python Video Course

An entertaining set of videos for the new Python developer, with live coding you can follow along with (Time: ~7 hours; Format: video and live coding)

Videos for Python Programming Tutorial for the Absolute Beginner (6 Videos)

All New Data Scientists Should Learn about Jupyter

A nice, short video tour of Jupyter Notebooks (Format: video)

Course

  • EXERCISE: Go to Azure Notebooks to check out Jupyter notebooks live and try to follow along with the video.

Programming and Plotting with Python

Python basics with a plotting theme throughout and nice exercises, from Software Carpentry.

Course (Time: ~8 hours; Format: written tutorial and exercises)

  1. Running and Quitting
  2. Variables and Assignment
  3. Data Types and Type Conversion
  4. Built-in Functions and Help
  5. Libraries
  6. Reading Tabular Data into DataFrames
  7. Pandas DataFrames
  8. Plotting
  9. Lists
  10. For Loops
  11. Looping Over Data Sets
  12. Writing Functions
  13. Variable Scope
  14. Conditionals
  15. Programming Style

Intermediate

The Python Data Science Handbook

By Jake VanderPlas, this handbook covers what you need to get started in data science with Python, with plenty of cool examples and applications.

Book

Python intro and data sciencey tools - go through in order or skip around

Python for Data Science and Intro to Jupyter Notebooks, also available as Jupyter Notebooks on Azure (Time: ~15 hours; Format: written tutorial and exercises)

Jupyter Notebooks. Note: solutions to the exercises are in the last notebook.

  • Basics
  • Data Structures
  • Functional Programming
  • Sorting and Pattern Matching
  • Object Oriented Programming
  • Basic Difference from 2 to 3
  • Numerical Computing
  • Data Analysis with pandas I
  • Data Analysis with pandas II
  • Machine Learning I - ML Basics and Data Exploration
  • Machine Learning II - Supervised and Unsupervised Learning
  • Machine Learning III - Parameter Tuning and Model Evaluation
  • Visualization

A different take on Python and data science (either of these should cover your Python needs)

For a more in-depth Python course, this one from UC San Diego on edX is a good choice: Python for Data Science (Time: 10 weeks at 8-10 hours per week)

  • Basic process of data science
  • Python and Jupyter notebooks
  • An applied understanding of how to manipulate and analyze uncurated datasets
  • Basic statistical analysis and machine learning methods
  • How to effectively visualize results

Numerical Python, a.k.a. the NumPy package, is essential for the data scientist

See my python/numpy.html article for a detailed list of NumPy resources.
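
As a rough illustration of why NumPy matters, the sketch below shows vectorized arithmetic and boolean masking on a small array. The temperature values and variable names are made up for this example.

```python
import numpy as np

# Made-up temperatures in degrees Celsius.
celsius = np.array([12.5, 18.0, 21.3, 9.8])

# Vectorized arithmetic: one expression applies to every element, no loop.
fahrenheit = celsius * 9 / 5 + 32

# A boolean mask selects only the elements that satisfy a condition.
warm = celsius[celsius > 15]

print(fahrenheit)
print(warm.mean())
```

The same operations written with plain Python lists would need explicit loops or comprehensions, which is part of what makes NumPy convenient for numerical work.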

Advanced

Some books really worth checking out

For a great dive into Python in the context of ML, check out this book by Sebastian Raschka (you'll get to write algorithms from scratch in pure Python!): Python Machine Learning (2nd Ed.)

Not sure if this book is out yet, but Sebastian Raschka is writing a sequel with more deep learning in Python with TensorFlow: Introduction to Artificial Neural Networks and Deep Learning.

"This book is concerned with the nuts and bolts of manipulating, processing, cleaning, and crunching data in Python. My goal is to offer a guide to the parts of the Python programming language and its data-oriented library ecosystem and tools that will equip you to become an effective data analyst. While 'data analysis' is in the title of the book, the focus is specifically on Python programming, libraries, and tools as opposed to data analysis methodology. This is the Python programming you need for data analysis." Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython by Wes McKinney

"Perhaps you would like to give your homemade robot a brain of its own? Make it recognize faces? Or learn to walk around? Or maybe your company has tons of data (user logs, financial data, production data, machine sensor data, hotline stats, HR reports, etc.), and more than likely you could unearth some hidden gems if you just knew where to look." Hands-On Machine Learning with Scikit-Learn and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems by Aurélien Géron