Technologies you will use
- Python via Anaconda
- Data science libraries in Python, including NumPy, Pandas, Matplotlib, scikit-learn, AI Fairness 360, ML Inspect, Fairlearn, DataSynthesizer, and SHAP
- Jupyter notebooks
- Structured Query Language (SQL)
- Amazon Web Services (AWS), including Elastic Compute Cloud (EC2), cluster setup and management, and Elastic MapReduce (EMR)
- Machine learning with TensorFlow
- Data visualization with Matplotlib, ggplot, Google Data Studio, and Tableau
- Big data with Hadoop and MapReduce, PySpark, and Apache Pig
Professional skills for presentation, writing, and networking activities
- Meet data science professionals and learn about the practices within different companies and industries.
- Communicate components of the data science life cycle to technical and non-technical audiences.
- Translate your findings into actionable recommendations.
- Explain the value and limitations of your methodology and analysis.
- Engage with data sets from various disciplines to understand the contextual importance of your work.
Technical skills you will develop
- Use statistics to analyze and solve data science problems
- Simulate, model, and work with experimental data sets
- Use parallel computing to run big data workflows
- Employ UNIX commands for file system navigation and management
- Acquire and manage data using the Extract-Transform-Load (ETL) paradigm
- Design and utilize a relational database
- Configure and utilize big data platforms to manage large datasets
- Analyze ethical issues around a data science product, with an eye towards fairness, privacy, transparency, and interpretability
- Integrate skills in a multi-faceted data science project
Traditional Track – Full time
Students are encouraged to enroll in the summer and winter online course offerings for the above requirements when available.
Fieldwork Experiences, which may count towards an elective, is also frequently offered during the summer session.