Scott Brandenberg, UCLA
A Hands-On Tutorial
Wednesday, July 24
NHERI Computational Academy
Data scientists spend most of their time wrangling and cleaning data, and comparatively little time training algorithms to learn from the data. Dealing with data is not as exciting as training a neural network or random forest, or building an LLM, but it provides the foundation on which all of that work rests. Many natural hazards engineers are not familiar with relational databases, structured query language (SQL), and application programming interfaces (APIs), preferring instead to work with Excel files or comma-separated value (CSV) files.
The hands-on presentation will first focus on APIs, following a presentation by Charlie Dey on July 23 involving real-time traffic data in Austin, Texas. This part will use the Python requests library, JSON, pandas, and Matplotlib. The presentation will then turn to relational databases, using SQLite as the database engine. Participants will create a database, populate it with data, and query the data using SELECT and JOIN statements.
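As a rough preview of the database portion, the workflow of creating, populating, and querying a SQLite database might look like the minimal sketch below. It uses Python's built-in sqlite3 module. The table names (station, reading), the column names, and the values are hypothetical placeholders chosen for illustration; they are not the tutorial's actual schema or data.

```python
import sqlite3

# An in-memory database keeps the sketch self-contained (no file is written).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical schema: stations, and the readings recorded at each station.
cur.execute("CREATE TABLE station (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
cur.execute(
    "CREATE TABLE reading ("
    "id INTEGER PRIMARY KEY, "
    "station_id INTEGER NOT NULL REFERENCES station(id), "
    "speed_mph REAL NOT NULL)"
)

# Populate both tables with made-up rows, using parameterized inserts.
cur.executemany(
    "INSERT INTO station (id, name) VALUES (?, ?)",
    [(1, "Station A"), (2, "Station B")],
)
cur.executemany(
    "INSERT INTO reading (station_id, speed_mph) VALUES (?, ?)",
    [(1, 42.0), (1, 38.0), (2, 55.0)],
)
conn.commit()

# SELECT with a JOIN: average speed per station, matched on the station id.
rows = cur.execute(
    "SELECT s.name, AVG(r.speed_mph) "
    "FROM reading AS r "
    "JOIN station AS s ON s.id = r.station_id "
    "GROUP BY s.name "
    "ORDER BY s.name"
).fetchall()

for name, avg_speed in rows:
    print(name, avg_speed)

conn.close()
```

Splitting the data into two tables linked by station_id is what makes the JOIN necessary, and it is the main structural difference from keeping everything in a single CSV or Excel sheet.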