What you'll learn:
- Pick up programming even if you have NO programming experience at all
- Write Python programs of moderate complexity
- Perform complicated text processing - splitting articles into sentences and words and doing things with them
- Work with files, including creating Excel spreadsheets and working with zip files
- Apply simple machine learning and natural language processing concepts such as classification, clustering and summarization
- Understand Object-Oriented Programming in a Python context
A Note on Python versions 2 and 3: The code-alongs in this class all use Python 2.7. Source code (with copious amounts of comments) is attached as a resource with all the code-alongs. The source code has been provided for both Python 2 and Python 3 wherever possible.
What's Covered:
- Introductory Python: Functional language constructs; Python syntax; lists, dictionaries, functions and function objects; lambda functions; iterators, exceptions and file handling
- Database operations: Just as much database knowledge as you need to do data manipulation in Python
- Auto-generating spreadsheets: Kill the drudgery of reporting tasks with xlsxwriter; automated reports that combine database operations with spreadsheet auto-generation
- Text processing and NLP: Python’s powerful tools for text processing, including NLTK and others
- Website scraping using Beautiful Soup: Scrapers for the New York Times and Washington Post
- Machine Learning: Use scikit-learn to apply machine learning techniques like k-means clustering
- Hundreds of lines of code with hundreds of lines of comments
- Drill #1: Download a zip file from the National Stock Exchange of India; unzip and process to find the 3 most actively traded securities for the day
- Drill #2: Store stock-exchange time-series data for 3 years in a database. On demand, generate a report with a time series for a given stock ticker
- Drill #3: Scrape a news article URL and auto-summarize into 3 sentences
- Drill #4: Scrape newspapers and a blog, then apply machine learning techniques such as classification and clustering to the scraped articles
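To give a flavor of the drills, here is a minimal sketch of the idea behind Drill #1, written for Python 3 using only the standard library. It is not the course's own solution. The function names, the column names `SYMBOL` and `TOTTRDQTY`, and the sample data are all assumptions made for illustration, and the zip file is built in memory rather than downloaded from the exchange.

```python
import csv
import io
import zipfile


def top_traded(zip_bytes, n=3, symbol_col="SYMBOL", volume_col="TOTTRDQTY"):
    """Return the n (symbol, volume) pairs with the highest traded volume.

    The column names are assumptions; adjust them to match the real file.
    """
    totals = {}
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in archive.namelist():
            if not name.lower().endswith(".csv"):
                continue  # skip anything that is not a CSV file
            with archive.open(name) as raw:
                # Zip members are opened as bytes; wrap them to read text rows.
                text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
                for row in csv.DictReader(text):
                    symbol = row[symbol_col]
                    totals[symbol] = totals.get(symbol, 0) + int(row[volume_col])
    # Sort by volume, largest first, and keep the first n entries.
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)[:n]


def make_sample_zip():
    """Build a small in-memory zip with made-up data (hypothetical tickers)."""
    rows = "SYMBOL,TOTTRDQTY\nAAA,500\nBBB,1200\nCCC,300\nDDD,900\n"
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("sample.csv", rows)
    return buffer.getvalue()


top3 = top_traded(make_sample_zip())
print(top3)  # [('BBB', 1200), ('DDD', 900), ('AAA', 500)]
```

In the actual drill, the bytes would come from a downloaded file instead of `make_sample_zip()`, and the real column layout may differ from the one assumed here.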