Module 12 of Modules 11-15: Skill Refreshment & Foundational Tooling

  • Tasks:

    • Day 12 (Python ecosystem): Explore the core libraries of the Python scientific ecosystem:
      a) NumPy, the library for numerical computing. It provides large, multi-dimensional arrays and matrices, along with functions for mathematical operations on them.
      b) pandas, the library for data manipulation and analysis. It provides data structures for tabular data, time series data, and more, along with functions for data cleaning, merging, and reshaping.
      c) SciPy, the library for scientific computing, including tools for optimization, integration, interpolation, and more.
      d) statsmodels, the library for statistical modeling. It provides tools for regression analysis, time series analysis, and more.
      e) scikit-learn, the library for machine learning. It provides tools for supervised and unsupervised learning, as well as for data preprocessing and model selection.
      f) Matplotlib, the library for creating visualizations, including line plots, scatter plots, histograms, and more.
      g) seaborn, the library for creating statistical visualizations, including heatmaps, scatter plots, and more.
  • Deliverable: New mdBook chapters documenting refreshed mathematical knowledge, written primarily in Python, possibly with notes on the path to similar investigations in Mathematica using Wolfram notebooks; a reusable Docker image for ML projects; and demonstrated proficiency in advanced Git workflows.
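
One possible starting exercise for Day 12 is to solve the same small problem with several of the libraries listed above and check that they agree. The sketch below is illustrative only: it generates synthetic data around an assumed line y = 2x + 1, stores it in a pandas DataFrame, and fits a straight line with NumPy, SciPy, and scikit-learn. The function names `make_data` and `fit_three_ways`, the sample size, the seed, and the noise level are all hypothetical choices made for this example. statsmodels, Matplotlib, and seaborn are left out to keep the sketch short.

```python
import numpy as np
import pandas as pd
from scipy import optimize
from sklearn.linear_model import LinearRegression


def make_data(n=200, seed=0):
    """Return a DataFrame of noisy points around y = 2x + 1 (assumed toy model)."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(0.0, 10.0, size=n)
    y = 2.0 * x + 1.0 + rng.normal(0.0, 0.5, size=n)
    return pd.DataFrame({"x": x, "y": y})


def fit_three_ways(df):
    """Fit y = slope * x + intercept with NumPy, SciPy, and scikit-learn."""
    x = df["x"].to_numpy()
    y = df["y"].to_numpy()

    # NumPy: degree-1 least-squares polynomial fit, returns [slope, intercept].
    slope_np, intercept_np = np.polyfit(x, y, deg=1)

    # SciPy: minimize the sum of squared residuals with a general-purpose optimizer.
    def sum_squared_error(params):
        slope, intercept = params
        return np.sum((y - (slope * x + intercept)) ** 2)

    result = optimize.minimize(sum_squared_error, x0=[0.0, 0.0])
    slope_sp, intercept_sp = result.x

    # scikit-learn: ordinary least squares; features must be a 2-D array.
    model = LinearRegression().fit(x.reshape(-1, 1), y)
    slope_sk, intercept_sk = model.coef_[0], model.intercept_

    # pandas: collect the three estimates in one table for comparison.
    return pd.DataFrame(
        {
            "slope": [slope_np, slope_sp, slope_sk],
            "intercept": [intercept_np, intercept_sp, intercept_sk],
        },
        index=["numpy", "scipy", "sklearn"],
    )


if __name__ == "__main__":
    data = make_data()
    print(data.describe())
    print(fit_three_ways(data))
```

All three rows of the resulting table should be nearly identical, since each approach solves the same least-squares problem. A natural next step for the mdBook chapter would be to repeat the fit with statsmodels to get standard errors and to plot the data and fitted line with Matplotlib or seaborn.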