Virtual Training for Collaboration in Environmental Data Science
This work is licensed under a Creative Commons Attribution 4.0 International License.
Welcome to the Environmental Data Science Innovation & Inclusion Lab (ESIIL)! These training sessions are intended to give attendees of the ESIIL Virtual Hackathon - Environmental MosAIc the technical background to help foster innovation and collaboration during the hackathon. Through this training, attendees will learn how to access and use the cloud-based communication and computational tools that the hackathon relies on, fundamental Environmental Data Science (EDS) skills using R and Python, and the fundamentals of AI. These sessions are being recorded and published on our YouTube channel for anyone who is unable to attend or who would like to revisit these lessons. Finally, we will have super fun virtual help desks set up during the hackathon to answer questions as they come up and to help anyone who would like assistance troubleshooting.
Agenda
Sessions 1 and 2 will run from 10am - 12pm Mountain Daylight Time and will meet virtually over Zoom.
Session 3 will run from 9am - 12pm Mountain Daylight Time and will meet virtually over Zoom.
Week 1 -- Collaborating in the Cloud (Thursday October 26th)
Watch this training or read the transcript
- Opening remarks: Jennifer Balch, ESIIL Director
- Session overview: Nate Quarderer, ESIIL Education Director
Introduction to GitHub and the CyVerse Discovery Environment
- Instructors: Culler, Verleye
- GitHub for Collaboration 🤝
- Why GitHub?
- A brief overview of GitHub collaboration features
- Pair programming, or How we suggest you use GitHub for the Hackathon
- Navigating the Constellation of Cyberinfrastructure
Test drive
- Instructors: Culler, Verleye, Tuff, Swetnam
- Activity:
- Split into groups of 2 (randomly assigned). Each group will go to a breakout room and have 10 minutes to:
- Have ONLY one person accept the GitHub Classroom assignment by clicking this link. This creates a GitHub repository containing two notebooks for the team to collaborate on: one creates a file with annual LA temperature data, and the other uses that file to fit a linear model and plot the data.
- Both team members clone the GitHub repository into CyVerse using either JupyterLab or RStudio. If you have trouble with this, don't worry - we'll go over it later. Please make sure that both team members have the URL to their GitHub repository!
- Choose which person will work on which notebook.
- Plan what your intermediate data file (annual temperatures in LA) will look like. There is a spot in both notebooks for writing down the plan.
- Everyone comes back to the main room
- We split into two breakout rooms by notebook - there should be one member from each group in each room. Each room will walk through completing the notebook and pushing your results up to GitHub.
- Return to your group of 2, and see if you can get your code to work together!
- When you are in a room with your group of 2, you can always get someone to come look at your code using the Ask For Help feature.
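The hand-off between the two notebooks in the activity above depends on both partners agreeing on the layout of the intermediate file. The sketch below is one way that agreement might look in Python. It is not the notebooks' actual code: the column names `year` and `mean_temp_c`, the helper functions, and the temperature values are all hypothetical placeholders. It reads annual temperatures from a CSV layout and fits a straight line using only the standard library.

```python
import csv
import io

# Hypothetical intermediate file agreed on by both partners:
# one row per year, with columns "year" and "mean_temp_c".
# The values below are made-up placeholders, not real LA observations.
PLACEHOLDER_CSV = """year,mean_temp_c
2000,18.0
2001,18.5
2002,19.0
2003,19.5
"""


def read_annual_temps(handle):
    """Read (year, temperature) pairs from the agreed CSV layout."""
    reader = csv.DictReader(handle)
    return [(int(row["year"]), float(row["mean_temp_c"])) for row in reader]


def fit_line(points):
    """Ordinary least-squares fit of temperature against year.

    Returns (slope, intercept).
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


points = read_annual_temps(io.StringIO(PLACEHOLDER_CSV))
slope, intercept = fit_line(points)
print(f"Trend: {slope:.3f} degrees C per year")
```

Writing down the column names and units before splitting up means the partner who writes the file and the partner who reads it can work independently and still have their code fit together when the group reconvenes.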
Additional Readings from our Earth Data Science Textbook!
Week 2 -- Environmental Data Science and Environmental MosAIc's Data Library (Thursday November 2nd)
Watch this training or read the transcript
- Session overview: Nate Quarderer, ESIIL Education Director
CyVerse Discovery Environment
- Instructors: Culler, Verleye
- Review
Data Library
- Instructors: Tuff, Amaral
- The Data Library
- Accessing data with an API
- Saving your own data cube
Recommended Readings
Week 3 -- Environmental MosAIc Data Library & Artificial Intelligence (Thursday November 9th)
Watch this training or read the transcript
- Session overview: Nate Quarderer, ESIIL Education Director
Environmental MosAIc Data Library
- Instructor: Tuff
- The Data Library
- Accessing data with an API
- Saving your own data cube
Artificial Intelligence
- Instructor: Ghriss
- AI and Machine Learning algorithms
- Examples of Machine Learning techniques
- Interacting with a machine learning model
- Fairness and ethical use of AI
Links
GitHub repository: https://github.com/CU-ESIIL/hackathon2023_datacube
CyVerse User Portal: https://user.cyverse.org
Open Earth Data Science Textbook: https://www.earthdatascience.org/
ESIIL Data Library: https://data-library.esiil.org/