We're going to look at our first dataset today. Specifically, this is the dataset that undergirds the public display of a specific library's collections.
This will let us build a conceptual model of the link between what people see when they explore an archive and the structured data that must be in place for them to find what they're looking for.
Let's look at how the Library of Congress presents its collections in human and machine-readable formats.
- Public-facing webpages
  - Open up the Library of Congress collections portal
  - What do we see here, structurally and descriptively?
  - Click through to a specific collection
  - What do we see here?
- API -- the same thing in a different format! Just add `?fo=json&at=results` to the URL
  - What do we see in the main collections view?
  - What do we see in the view of the specific collection?
Log into PythonAnywhere.
Open up a console: click the "New console" button and choose Python 3.10.
In the Python console, enter the following, one line at a time:
```python
import requests

r = requests.get('https://www.loc.gov/collections/?fo=json&at=results')
r.status_code  # 200 means the request succeeded

import json

j = json.loads(r.text)
print(json.dumps(j, indent=2))  # pretty-print the whole response
print(j.keys())                 # just the top-level keys
```
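Once the request succeeds, the same session can be extended to pull a few fields out of the response. A minimal sketch, assuming the `at=results` response contains a `results` list whose items carry a `title` field (check the pretty-printed output above to confirm):

```python
import requests

# Same request as above, but with the query string passed as a
# params dict instead of pasted into the URL.
r = requests.get("https://www.loc.gov/collections/",
                 params={"fo": "json", "at": "results"})
data = r.json()

# Print the first few collection titles to get a feel for the data.
for item in data["results"][:5]:
    print(item.get("title"))
```

Passing `params=` lets Requests handle the URL encoding for you, which matters once query values contain spaces or punctuation.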
What do we see here?
We'll continue our exercise solo.
Create a markdown file named with your username (as in the last round).
Pick one of the sub-collections, like "10th-16th Century Liturgical Chants".
Helpful references:
- The Python `requests` module
- The Python `json` module
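One convenience worth knowing from the Requests documentation: a response object can parse its own JSON body, so `r.json()` does the same work as `json.loads(r.text)` when the server returns valid JSON:

```python
import json
import requests

r = requests.get("https://www.loc.gov/collections/",
                 params={"fo": "json", "at": "results"})

# Response.json() parses the body for you -- equivalent to
# json.loads(r.text) for a well-formed JSON response.
assert r.json() == json.loads(r.text)
print(sorted(r.json().keys()))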
Poke around in your collection in each of:
- The JSON format
- The normal webpage view
- The command-line view
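To poke around from the command line, the same pattern works for a single sub-collection. The slug below is a guess at the URL form for the liturgical-chants collection, not a verified path -- copy the slug from your own browser's address bar instead:

```python
import requests

# NOTE: this slug is a hypothetical example; replace it with the
# actual path segment from your chosen collection's URL.
url = "https://www.loc.gov/collections/10th-16th-century-liturgical-chants/"
r = requests.get(url, params={"fo": "json", "at": "results"})

if r.status_code == 200:
    results = r.json().get("results", [])
    print(len(results), "items on the first page")
    for item in results[:3]:
        print("-", item.get("title"))
else:
    print("Request failed:", r.status_code)
```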
What is interesting about these collections and these views? What is difficult, or doesn't make sense? How might you want to make use of this data?