Add OBIS EOV notebook & update env #220

Merged
merged 17 commits into from
Sep 20, 2024

Conversation

MathewBiddle
Contributor

Just starting #219 over again

@MathewBiddle
Contributor Author

I tested with the obisindicators package and h3 and was able to generate a heat map of occurrences for mangroves at H3 resolution 2.

image

and resolution 1:

image

@MathewBiddle
Contributor Author

Sorry, @laurabrenskelle I totally messed up #219. I think this looks like it has all your contributions in it though.

closes #218

@MathewBiddle
Contributor Author

Here is the code I used to generate those maps: https://gist.github.com/MathewBiddle/d502fe089dfc867161c21ad6fe835b3f

@laurabrenskelle
Contributor

It took forever, but I used the code you shared to run it for marine mammals.
marineMammalheatmap

@ocefpaf
Member

ocefpaf commented Sep 17, 2024

It took me a while to realize that darker is less and lighter/brighter is more. The max values approach white and that can be confusing with 0/NA. Does it make sense to invert the colorbar?

@MathewBiddle is that lone data point near the Arctic Circle real? That is one cold mangrove!

@MathewBiddle
Contributor Author

@ocefpaf looks like it is: https://mapper.obis.org/?taxonid=235091 Not sure if it's a 'valid' observation or not. I would assume a latitude got messed up somewhere along the line. FYI @sformel-usgs

@sformel-usgs
Contributor

@MathewBiddle yeah, that's a chilly mangrove indeed. It's supposed to be in Palau. I think a latitude of 7 was somehow misinterpreted as ~70.

There are weird inconsistencies between the data in OBIS and GBIF that I want to explore, and I know the Bishop Museum can be slow to respond. For now, just exclude `occurrenceID == "9ac63d44-af8a-4e84-8491-e655b917ded2"`
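
A minimal sketch of that exclusion in base R, assuming the occurrences sit in a data frame with an `occurrenceID` column. The data frame `occ`, its other two IDs, and the latitudes are hypothetical placeholders, not real OBIS records:

```r
# Hypothetical occurrence table; in the notebook this would come from OBIS.
occ <- data.frame(
  occurrenceID = c("9ac63d44-af8a-4e84-8491-e655b917ded2",
                   "example-id-2",
                   "example-id-3"),
  decimalLatitude = c(70.5, 7.5, 7.4),  # made-up values for illustration
  stringsAsFactors = FALSE
)

# ID of the record flagged above as likely misplaced.
bad_id <- "9ac63d44-af8a-4e84-8491-e655b917ded2"

# Keep every row except the flagged one.
occ_clean <- occ[occ$occurrenceID != bad_id, ]
nrow(occ_clean)
```

Filtering on the ID rather than on latitude leaves any other high-latitude records untouched.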

@MathewBiddle
Contributor Author

Thanks @sformel-usgs. I think we should just leave it and continue on. I don't want to start adding caveats to our processing.

@laurabrenskelle
Contributor

Agreed, I think if we make a story map with these with @mimidiorio we can just make a disclaimer about how a minimal number of records in each EOV may contain errors (in location or taxa, it's almost inevitable).

@laurabrenskelle
Contributor

It took me a while to realize that darker is less and lighter/brighter is more. The max values approach white and that can be confusing with 0/NA. Does it make sense to invert the colorbar?

I think white is NA, not the max.

@mimidiorio

@laurabrenskelle I am ready to help with the story development! I will put some time on our calendars to plan!

@sformel-usgs
Contributor

sformel-usgs commented Sep 18, 2024

Ok, pulled an issue about it: gbif/portal-feedback#5480

@MathewBiddle one thing you can do to make your code simpler:

The readr package can read some files, such as CSVs, directly from a URL, so there is no need for a local download. Here is a rewrite to get the AphiaID in one lengthy pipe. I realize not everyone finds marathon pipes as fun as I do :-)

# Get mangrove AphiaID identifiers from the IOOS GitHub repo
library(dplyr)   # %>%, bind_rows(), filter(), pull(), mutate()
library(purrr)   # map()
library(stringr) # str_detect(), str_split_i()
library(readr)   # read_csv()

mangroveIdentifiers <- gh::gh("GET /repos/:owner/:repo/contents/:path",
                              owner = "ioos",
                              repo = "marine_life_data_network",
                              path = "eov_taxonomy"
                              ) %>%
  map(.f = as.data.frame) %>%  # turn all lists into data frames
  bind_rows() %>% # bind into a single data frame
  filter(str_detect(download_url, # filter for mangrove
                    pattern = "mangrove")) %>%
  pull(download_url) %>% # turn url into vector
  read_csv() %>% # read in csv from url
  mutate(
    acceptedTaxonId = stringr::str_split_i(acceptedTaxonId, # extract AphiaID
                                           pattern = ":",
                                           i = 5)) %>%
  pull(acceptedTaxonId) # pull AphiaIDs as vector

@laurabrenskelle
Contributor

I think this PR can be merged and closed, but I accidentally put a duplicate, old version of the code in my branch and it doesn't seem to let me delete it? The one we want in the repo is titled 2024-09-13-OBIS_EOVs.ipynb. The one that should be deleted is titled OBIS_EOVs.ipynb. I'm not sure if I can't delete it because it is part of this PR? If anyone can fix that, I'd appreciate it.

Matt and I discussed it and this notebook will just show how to use the aphiaIDs in https://github.com/ioos/marine_life_data_network/tree/main/eov_taxonomy to query OBIS and map occurrences. The notebook includes a reference to NOAA-GIS4Ocean where we will publish a notebook for generating geojson files for the EOVs for use in that project.

@ocefpaf
Member

ocefpaf commented Sep 18, 2024

I think white is NA, not the max.

Yep, exactly. But the max values go lighter and blend with white, making it unintuitive, as the next color in the scale is 0 and not a greater value. Maybe that is just my eyes, but reversing would make:

  • 0/NA → white
  • lighter color → smaller values
  • darker colors → bigger values
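
A small base R sketch of that reversed mapping. The palette used in the heat maps is not stated in this thread, so viridis is an assumption here, and the counts are hypothetical:

```r
counts <- c(NA, 1, 10, 100, 1000)  # hypothetical per-cell occurrence counts

# Default viridis runs dark (low) to light (high); rev = TRUE flips it,
# so small values get light colours and large values get dark ones.
pal <- hcl.colors(4, palette = "viridis", rev = TRUE)

# Bin counts on a log scale and look up a colour; NA cells stay white.
bins <- cut(log10(counts), breaks = 4, labels = FALSE)
cols <- ifelse(is.na(bins), "#FFFFFF", pal[bins])
cols
```

With this ordering, white sits next to the lightest colours at the low end of the scale instead of next to the maximum.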

- r-dt
- r-finch
- r-ggfortify
- r-ggplot2
- r-gh
Contributor Author

Don't need this anymore as you are grabbing the csv right from the repo.

removing call for gh library
Removing gh library, adding htmlwidgets & readr
@laurabrenskelle
Contributor

Good call @MathewBiddle. I deleted the call for the gh library from the environment file and notebook.

@MathewBiddle
Contributor Author

For some reason, when I run the notebook, the map doesn't appear.

image

@ocefpaf is there a trick to get R leaflet maps to appear in jupyter notebooks?

@MathewBiddle
Contributor Author

@laurabrenskelle I love how simple and effective this notebook is! Great job!

Some comments (take 'em or leave 'em):

  • I think adding a little bit more context at the beginning would provide a user with more of an understanding as to why this is important. What are the GOOS EOVs? and why would I care that we can now map the 'observed' data from OBIS?
  • Lastly, I like to add a conclusion markdown box at the end to wrap up the notebook. Essentially this summarizes what the notebook taught the user (or highlighted) and what they could do with this new knowledge. You already got at the second part by pointing to the NOAA GIS for the ocean, but summarizing the gist of the notebook would be helpful.
  • Don't be afraid to use more links. For example, robis could link to the package page. This helps a user understand where to go for more help.

@laurabrenskelle
Contributor

@MathewBiddle Thanks! Sure, I have no problem adding more content to the markdown to make the notebook more informative for users. I'll work on that.

@ocefpaf
Member

ocefpaf commented Sep 19, 2024

@ocefpaf is there a trick to get R leaflet maps to appear in jupyter notebooks?

We probably need to wrap that into an HTML object. Let me try something here and I'll post the result ASAP.

@ocefpaf
Member

ocefpaf commented Sep 19, 2024

@MathewBiddle in the past we needed to wrap it into an iframe like:

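htmlwidgets::saveWidget(m, "mangroveMap.html", selfcontained = TRUE)  # write the map widget to a standalone HTML file
IRdisplay::display_html('<iframe src="mangroveMap.html"></iframe>')  # embed that file in the notebook output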

However, it did work for me. Maybe you need to mark that notebook as "trusted" so it can run JS.
Screenshot from 2024-09-19 19-36-12

@MathewBiddle
Contributor Author

Cool! If that will work in the jupyterbook, that's fine by me.

@ocefpaf
Member

ocefpaf commented Sep 19, 2024

@laurabrenskelle I love how simple and effective this notebook is! Great job!

Agreed. I love short notebooks that have a concise message. Just remember to commit the notebook with the outputs when you are done. While we do run them in the CIs to check if anything is broken, we do not run them to publish b/c some of the notebooks here can take too long to run and crash the CIs.

@laurabrenskelle
Contributor

I added some more contextual information to the notebook. Let me know if I need to add the iframe bit to make the map show up, or if it is okay as is.

@MathewBiddle
Contributor Author

This looks great! In the last markdown cell, where you state

Do you see the dot on the mangrove map in the Arctic?

Let's add a link to gbif/portal-feedback#5480 to highlight how we can provide feedback to make a better product.

After that, I'm good with merging.

@laurabrenskelle
Contributor

Done. 👍🌴🥶

@MathewBiddle
Contributor Author

LGTM! Thanks!

@ocefpaf if you agree, I can merge this one in.

Member

@ocefpaf ocefpaf left a comment

LGTM. I'll create the gallery entry in another PR; it is easier than editing this one.

@ocefpaf ocefpaf merged commit 40ccbde into ioos:main Sep 20, 2024
10 of 11 checks passed
@ocefpaf
Member

ocefpaf commented Sep 20, 2024

@MathewBiddle now that I'm trying to render the HTML for publication, I realize that the problem you were having with the map is b/c it is huge. There are 132,901 points with position precision up to 13 decimal places in some cases. Note that GeoJSON recommends 6, which is ~10 cm. However, rounding the position values is not enough, as each point is also annotated with the scientificName value.
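
As a rough check on those precision figures, the sketch below converts decimal places to ground distance and rounds two coordinates to 6 places. It assumes about 111.32 km per degree (the equatorial value, used here as an approximation), and the coordinates are hypothetical:

```r
# Approximate metres per degree (assumed constant; equatorial value).
m_per_degree <- 111320

# Ground distance represented by the last digit at each precision.
decimals <- c(6, 13)
resolution_m <- m_per_degree * 10^(-decimals)
resolution_m  # roughly 0.11 m at 6 decimals; far below a micrometre at 13

# Rounding hypothetical coordinates to 6 decimal places.
lat <- c(7.5000123456789, -12.3456789012345)
lat_rounded <- round(lat, 6)
lat_rounded
```

Rounding shortens the coordinate strings in the saved HTML, but as noted above it does not remove the per-point scientificName annotations.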

Were you able to load mangroveMap.html map and browse it? Zoom in/out? My laptop cannot handle it :-/

We need to figure out another way to show those results. Here is what I did for a similar notebook:

https://ioos.github.io/ioos_code_lab/content/code_gallery/data_access_notebooks/2018-02-20-obis.html

I aggregated the species and merged the points within a certain distance. That would be easier to navigate, interpret, and plot. Note that we have 64 unique entries in scientificName and 23 in genus. Maybe aggregating by the latter would simplify this a bit more. What do you all think?
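
One way to sketch that idea in base R is to snap points to a regular grid and count records per genus and cell. This grid binning is a simple stand-in for "merging the points within a certain distance" and is not necessarily what the linked notebook does. The data frame, its coordinates, and the 0.5 degree cell size are all hypothetical:

```r
# Hypothetical occurrences; the real table has far more rows.
occ <- data.frame(
  genus = c("Rhizophora", "Rhizophora", "Rhizophora", "Avicennia"),
  decimalLongitude = c(134.51, 134.58, -80.12, 134.55),
  decimalLatitude  = c(7.42, 7.47, 25.31, 7.44)
)

cell <- 0.5  # grid size in degrees (an assumed value, tune as needed)

# Snap each point to the centre of its grid cell.
occ$lon_bin <- floor(occ$decimalLongitude / cell) * cell + cell / 2
occ$lat_bin <- floor(occ$decimalLatitude / cell) * cell + cell / 2

# One row per genus and cell, with the number of records merged into it.
occ$n <- 1
agg <- aggregate(n ~ genus + lon_bin + lat_bin, data = occ, FUN = sum)
agg
```

Plotting `agg` instead of the raw points means one marker per genus and cell, with `n` available for sizing or colouring the marker.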

@sformel-usgs
Contributor

Very interested in what you figure out, since I ran away from this challenge a while back. Can the python package lonboard do any good here (https://github.com/developmentseed/lonboard)? I know @jdpye is a fan of this package, but I don't know much about how it works.

@jdpye

jdpye commented Sep 21, 2024

https://github.com/jdpye/lonsnapper/ has a very, very rough example of how to identify individual tracks within an EventCore/Occurrence Extension-style of archive (which I make and publish lots of!)

I am delinquent in cleaning up and expanding on this example. lonboard is good at setting up deck.gl-style animation, and it's something I'm interested in doing with these data sources, but I am terrible at finding time for this!

@ocefpaf
Member

ocefpaf commented Sep 23, 2024

Thanks for sharing lonboard. I was not aware of that project. I see it uses deck.gl bindings. I'm not sure there are any for R, though, which is the base language for that notebook. In Python-land, we also have clusters as an option, where points are rendered as one zooms in. Another option is to use the holoviz/datashader ecosystem, where plotting millions of points is a breeze, but then again, Python only :-/

I may be wrong, but many points are for individual collections of mangrove species, right? When aggregating by species, we kind of get a rough mangrove area polygon of sorts. That is good enough for figuring out where those trees are and even gives an idea of how they spread. Maybe we can modify this notebook to show these data that way?

Development

Successfully merging this pull request may close these issues.

[New Notebook]: Use OBIS data for the bio/eco EOVs and make occurrence maps
6 participants