Certificate Curriculum: 8 Credits
Applied Machine Learning Using Health Data
This course teaches popular machine learning (ML) models using Python and their applications to health data and beyond. The topics include (1) Python programming basics (coding with Python and essential Python modules such as NumPy, Pandas, Matplotlib, and Scikit-learn); (2) Classification ML models; (3) Regression ML models; (4) ML model training and validation; (5) Support vector machines and decision trees; (6) Ensemble methods; (7) Dimensionality reduction; and (8) Unsupervised learning techniques. Students who complete this course will: (1) Understand the mathematical/statistical algorithms and computer programming routines for ML models widely adopted in health and social sciences; (2) Proficiently apply ML models to analyze real-world data; and (3) Appraise the pros and cons of alternative ML models in the context of problem solving. The course will build a solid foundation for enthusiastic students who want to learn deep learning models, a subdomain of ML based on artificial neural networks with representation learning.
(3 credits)
Applied Deep Learning Using Health Data
This course teaches a wide range of deep learning (DL) models using Python and their applications to health data and beyond. The topics include (1) Introduction to deep learning, Python, and NumPy; (2) Introduction to PyTorch and neural networks; (3) Computer vision (image classification, object detection, image segmentation, keypoint detection, audio classification, and video classification); (4) Natural language processing (text preprocessing, text classification, text generation, text summarization, and text question answering); (5) Time series forecasting; (6) Recommender systems; (7) Generative adversarial networks; and (8) Synthetic data generation. Students who complete this course will: (1) Gain a deep understanding of the key concepts and elements of AI, ML, and DL; (2) Familiarize themselves with a vast pool of popular, state-of-the-art DL models and their applications in health and beyond; (3) Understand the strengths, limitations, and tradeoffs of different DL models and best practices in implementing them; and (4) Use Python in conjunction with popular APIs and cloud platforms (e.g., PyTorch, PyTorch Lightning/Flash, fastai2, IceVision, Hugging Face, spaCy, Haystack, Synthetic Data Vault, Google Colab, and Kaggle) to implement DL models (e.g., convolutional neural networks, recurrent neural networks, transformers) on various data types (e.g., text, image, video, audio, tabular).
(3 credits)
Skill Lab: Data & Algorithmic Bias
This skill lab focuses on thinking critically about data practices that have the potential to amplify rather than reduce the racial, economic, gender, age, and other biases found in society today. Students will build the ethical reasoning skills needed to evaluate and recommend standards of right and wrong conduct, with emphasis on the transparency and defensibility of AI-driven actions and decisions concerning data in general and personal data in particular.
(1 credit)
Skill Lab: Introduction to Python for Public Health Data Analysis
This skill lab will introduce students to the fundamentals of the Python language, common Python modules for data manipulation and analysis, and the Jupyter notebook environment. The skill lab will begin with acquiring data from publicly available sources and databases, cleaning and transforming data, and creating descriptive statistics and graphics. The skill lab will also introduce Python’s natural language processing and machine learning modules for basic data classification and predictive modeling applications.
(1 credit)