[AUTO] Update data #85

Merged
merged 11 commits into from
Nov 9, 2023

damianooldoni-bot
Contributor

Brief description

This is an automatically generated PR.
The following steps are all automatically performed:

  • Fetch raw data
  • Map raw data to DwC standard and save the output in ./data/processed
  • Get an overview of the changes
  • Run some tests, e.g. check the uniqueness of occurrenceID, check that all occurrences have an eventID and scientificName, ...
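
The checks named in the last step could look roughly like the sketch below. This is a Python illustration only: the repository's own tests live in its R Markdown and workflow files and may differ. The function name `check_occurrences`, the choice to treat the string `NA` as missing, and the sample rows are all assumptions made for this example; only the column names occurrenceID, eventID and scientificName come from the description above.

```python
import csv
import io
from collections import Counter


def check_occurrences(rows):
    """Return a list of human-readable problems found in occurrence rows.

    `rows` is an iterable of dicts, e.g. from csv.DictReader.
    """
    rows = list(rows)
    problems = []

    # occurrenceID must be unique across the whole file.
    counts = Counter(row.get("occurrenceID", "") for row in rows)
    for occurrence_id, n in counts.items():
        if n > 1:
            problems.append(f"duplicate occurrenceID {occurrence_id!r} ({n} rows)")

    # Every occurrence needs a non-empty eventID and scientificName.
    # Assumption: an R-style "NA" string also counts as missing.
    for line_no, row in enumerate(rows, start=2):  # line 1 is the CSV header
        for field in ("eventID", "scientificName"):
            value = (row.get(field) or "").strip()
            if value in ("", "NA"):
                problems.append(f"line {line_no}: missing {field}")

    return problems


# Hypothetical sample data, not taken from the real dataset.
SAMPLE = """occurrenceID,eventID,scientificName
occ:1,ev:1,Ondatra zibethicus
occ:1,ev:2,Vespa velutina
occ:3,,Castor fiber
"""

if __name__ == "__main__":
    for problem in check_occurrences(csv.DictReader(io.StringIO(SAMPLE))):
        print(problem)
```

Returning a list of problems instead of raising on the first one lets a reviewer see every failing record in a single run.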

Note to the reviewer: the workflow automation is still in a development phase. Please check the output thoroughly before merging to main. If needed, improve the data fetching (fetch_data.Rmd) or the mapping (dwc_mapping.Rmd), both in ./src, or the GitHub workflows fetch-data.yaml and mapping_and_testing.yaml in ./.github/workflows.

Files changed:
M	data/raw/rato_data.csv
damianooldoni-bot and others added 3 commits October 8, 2023 04:16
Files changed:
M	data/processed/occurrence.csv
Files changed:
M	data/processed/occurrence.csv
@PietrH
Member

PietrH commented Oct 24, 2023

New species! Zizania latifolia (Griseb.) Stapf

To be investigated!

@PietrH
Member

PietrH commented Oct 24, 2023

| Dossier_ID | OBJECTID | Dossier_Status | Domein | Soort | Waarneming | Actie | Materiaal_Vast | Opmerkingen_admin | Opmerkingen | Melder_Naam | Melder_Klant | Planning_Datum | X | Y | Gemeente | Aard_Locatie | GBIF_Code | Dossier_Link | Dossier_Link_ID | Hoofddossier_ID | Aangemaakt_Datum | Laatst_Bewerkt_Datum | Datum_Van | Geometrie_Type | Shape |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 460271028 | 589775 | Opvolging | Plant | Mantsjoerese wilde rijst | NA | NA | NA | NA | NA | NA | Andere | NA | 95383.03 | 189125.1 | Deinze | Publiek | 7901745 | 0 | NA | -1 | 2023-10-09 15:19:50 | 2023-10-09 15:20:11 | 2023-10-09 15:19:50 | Point | POINT (95383.02510000 189125.06350000) |

@PietrH
Member

PietrH commented Oct 24, 2023

> New species! Zizania latifolia (Griseb.) Stapf
>
> To be investigated!

Should be Zizania latifolia (Griseb.) Hance ex F.Muell. according to LIFE RIPARIAS target list: https://alert.riparias.be/about-data

See also #24

@PietrH
Member

PietrH commented Oct 26, 2023

@LienReyserhove eventID has changed in the RATO data again (see #79).

This time, all records have had zeros added to the end of their eventID:

image

@PietrH
Member

PietrH commented Oct 26, 2023

I've sent an email to Karel to see if he might know more

@PietrH
Member

PietrH commented Oct 27, 2023

This is blocking the update to the GBIF dataset

@PietrH PietrH added the blocker and question (Further information is requested) labels Oct 27, 2023
PietrH and others added 2 commits October 27, 2023 14:38
Files changed:
M	data/processed/occurrence.csv
@PietrH
Member

PietrH commented Nov 8, 2023

The new identifier consists of the old identifier (a time-based unique number) followed by 3 digits unique to the user who created the case. Where no user was linked to the case (all the old records), they used 3 zeros.

This change came about because users were creating cases at practically the same time, which resulted in collisions between their eventID/Dossier_ID values, since these are time based. To keep all identifiers the same length, they padded the old records with zeros.

They should also have deleted a number of duplicates/resolved identifier collisions on the database end. I did not check for this in this PR.
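
To make the identifier change concrete, here is a small Python sketch of how old and new Dossier_ID values could be matched. It assumes only what is described above: a new identifier is the old one followed by three digits, and records without a linked user received `000`. The function names and the example identifiers are hypothetical, and identifiers are handled as strings so that no digits are lost.

```python
USER_SUFFIX_LENGTH = 3  # digits appended for the user, per the description above
LEGACY_SUFFIX = "0" * USER_SUFFIX_LENGTH  # used when no user was linked


def upgrade_legacy_id(old_id: str) -> str:
    """Map an old, time-based identifier to its zero-padded new form."""
    return old_id + LEGACY_SUFFIX


def split_new_id(new_id: str) -> tuple:
    """Split a new identifier into (time-based part, user suffix)."""
    if len(new_id) <= USER_SUFFIX_LENGTH or not new_id.isdigit():
        raise ValueError(f"unexpected identifier format: {new_id!r}")
    return new_id[:-USER_SUFFIX_LENGTH], new_id[-USER_SUFFIX_LENGTH:]


def missing_after_upgrade(old_ids, new_ids):
    """Return old identifiers whose padded form is absent from the new data."""
    new_ids = set(new_ids)
    return sorted(i for i in set(old_ids) if upgrade_legacy_id(i) not in new_ids)


if __name__ == "__main__":
    # Made-up identifiers for illustration only.
    old = ["1001", "1002", "1003"]
    new = ["1001000", "1002000", "1004017"]
    print(missing_after_upgrade(old, new))  # ['1003']
    print(split_new_id("1004017"))          # ('1004', '017')
```

A comparison along these lines could help spot records that disappeared during deduplication on the provider's side, which is the part not verified in this PR.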

@PietrH
Member

PietrH commented Nov 8, 2023

I can't find any removed occurrenceIDs at the moment.

@PietrH
Member

PietrH commented Nov 8, 2023

680 new records, no deleted records. No new species.

15 species:

| scientificName | n |
| --- | --- |
| Ondatra zibethicus | 485 |
| Vespa velutina | 112 |
| Trachemys scripta | 18 |
| Heracleum mantegazzianum | 16 |
| Ludwigia grandiflora | 13 |
| Martes foina | 8 |
| Fallopia japonica | 6 |
| Gallus gallus domesticus | 6 |
| Myriophyllum aquaticum | 5 |
| Castor fiber | 4 |
| Hydrocotyle ranunculoides | 3 |
| Anatidae | 1 |
| Oryctolagus cuniculus | 1 |
| Psittacula krameri | 1 |
| Zizania latifolia | 1 |

On 31 days:

| date | n |
| --- | --- |
| 2023-10-12 | 60 |
| 2023-09-19 | 59 |
| 2023-09-26 | 47 |
| 2023-09-28 | 45 |
| 2023-10-10 | 43 |
| 2023-10-03 | 39 |
| 2023-09-25 | 38 |
| 2023-09-22 | 36 |
| 2023-10-02 | 36 |
| 2023-10-04 | 34 |
| 2023-09-27 | 33 |
| 2023-10-11 | 30 |
| 2023-10-05 | 26 |
| 2023-10-06 | 25 |
| 2023-09-20 | 23 |
| 2023-09-29 | 23 |
| 2023-10-09 | 22 |
| 2023-10-13 | 22 |
| 2023-09-18 | 14 |
| 2022-05-19 | 8 |
| 2023-09-21 | 7 |
| 2022-04-21 | 1 |
| 2022-05-02 | 1 |
| 2022-05-23 | 1 |
| 2022-05-24 | 1 |
| 2022-05-30 | 1 |
| 2022-08-03 | 1 |
| 2022-09-12 | 1 |
| 2023-05-19 | 1 |
| 2023-08-24 | 1 |
| 2023-09-09 | 1 |
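
A summary like the one above can be derived by comparing the previous and the updated occurrence file. The Python sketch below shows one way to do that; it is not the workflow's actual implementation. The function name `summarise_changes`, the use of `eventDate` as the date column, and the sample rows are assumptions made for illustration.

```python
from collections import Counter


def summarise_changes(old_rows, new_rows, date_field="eventDate"):
    """Compare two lists of occurrence dicts, keyed on occurrenceID."""
    old_ids = {row["occurrenceID"] for row in old_rows}
    new_ids = {row["occurrenceID"] for row in new_rows}
    added = [row for row in new_rows if row["occurrenceID"] not in old_ids]

    return {
        "n_new": len(added),
        "n_deleted": len(old_ids - new_ids),
        # most_common() orders entries by descending count.
        "per_species": Counter(r["scientificName"] for r in added).most_common(),
        "per_date": Counter(r[date_field] for r in added).most_common(),
    }


if __name__ == "__main__":
    # Made-up rows for illustration only.
    old = [
        {"occurrenceID": "a", "scientificName": "Castor fiber", "eventDate": "2023-09-18"},
    ]
    new = old + [
        {"occurrenceID": "b", "scientificName": "Vespa velutina", "eventDate": "2023-10-12"},
        {"occurrenceID": "c", "scientificName": "Vespa velutina", "eventDate": "2023-10-12"},
        {"occurrenceID": "d", "scientificName": "Castor fiber", "eventDate": "2023-10-13"},
    ]
    print(summarise_changes(old, new))
```

Note that a comparison keyed on occurrenceID only gives meaningful counts if the identifiers are stable between the two versions of the file.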

Files changed:
M	data/processed/occurrence.csv
@PietrH PietrH merged commit 6f56833 into main Nov 9, 2023
@PietrH PietrH deleted the automatic-update-2023-10-08T03-45-09Z branch November 9, 2023 08:50
Labels
automated workflow blocker question Further information is requested