Problem with processing of iNaturalist taxonomic entries #347

Open
Mesibov opened this issue Oct 9, 2019 · 3 comments
@Mesibov

Mesibov commented Oct 9, 2019

I downloaded the 638655-record iNaturalist dataset and found 3 fields with full scientific names:
"scientificName" (here called 1)
"scientificName" (here called 2)
"species" (here called 3)
If I can believe the headings.csv that came with the records, the first is the raw scientificName, the second is the ALA-processed scientificName and the third is "The species the Atlas has matched this record to in the NSL".

As expected there are a few records in (1) with (2) and (3) blank. Unexpectedly, there are also many records with (1) blank but with (2) and (3) filled. Looking at just one of these, the taxonomy gets weird:
https://biocache.ala.org.au/occurrences/9e4c4726-068c-4860-b142-4ff43ba9fa57

The "original vs processed" table says that iNaturalist actually did supply a raw ID, namely

Plantae|Tracheophyta|Magnoliopsida|Fabales|Fabaceae|Acacia|Acacia ampliata

Nothing wrong there, but the ALA-processed classification replaces "Tracheophyta" with "Charophyta" (green algae), and "Magnoliopsida" with "Equisetopsida" (horsetails).

I haven't looked at any more of the records with (1) blank but (2) and (3) filled. Is this a processing failure? What explains the phylum and class errors?
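For anyone wanting to repeat the blank/filled check on the three name fields, here is a minimal sketch in Python. It is not the method used for the report above. The column positions and the three-row sample are made up for illustration; the real positions would have to be read from the download's header row. Fields are addressed by position rather than by name because two of them share the heading "scientificName".

```python
import csv
import io
from collections import Counter


def blank_pattern_counts(rows, columns):
    """Tally rows by which of the given column positions are filled.

    Keys look like "1/2/3" or "-/2/3", where "-" marks a blank field
    and the digits follow the (1), (2), (3) numbering used above.
    """
    counts = Counter()
    for row in rows:
        parts = []
        for n, col in enumerate(columns, start=1):
            # Treat a missing trailing field the same as an empty one.
            value = row[col] if col < len(row) else ""
            parts.append(str(n) if value.strip() else "-")
        counts["/".join(parts)] += 1
    return counts


# Made-up sample standing in for the real download (assumption).
sample = io.StringIO(
    "scientificName,scientificName,species\n"
    "Acacia ampliata,Acacia ampliata,Acacia ampliata\n"
    ",Acacia ampliata,Acacia ampliata\n"
    "Some name,,\n"
)
reader = csv.reader(sample)
header = next(reader)  # skip the header row
counts = blank_pattern_counts(reader, columns=[0, 1, 2])
for pattern, n in counts.items():
    print(pattern, n)
```

On the full dataset the same function could be fed `csv.reader(open(path, newline=""))`, with `columns` set to wherever the three fields actually sit.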

@djtfmartin djtfmartin added the bug label Oct 9, 2019
@djtfmartin
Member

Just on the difference in higher classification, ALA's source for Acacia ampliata is here:

https://biodiversity.org.au/nsl/services/rest/node/apni/2919087

which has:

Plantae / Charophyta / Equisetopsida / Magnoliidae / Rosanae / Fabales / Fabaceae / Acacia / Acacia ampliata
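To see exactly where the two lineages part ways, a rough Python sketch can compare the names in each path. The two strings are the ones quoted in this thread; the helper names `split_lineage` and `lineage_diff` are made up for illustration. The comparison is by name only, since the NSL path carries no rank labels here.

```python
def split_lineage(text, sep):
    """Split a delimited lineage string into a list of clean taxon names."""
    return [part.strip() for part in text.split(sep) if part.strip()]


def lineage_diff(raw, processed):
    """Return (names only in raw, names only in processed), order preserved."""
    only_raw = [name for name in raw if name not in processed]
    only_processed = [name for name in processed if name not in raw]
    return only_raw, only_processed


# Raw classification supplied by iNaturalist, as quoted above.
raw = split_lineage(
    "Plantae|Tracheophyta|Magnoliopsida|Fabales|Fabaceae|Acacia|Acacia ampliata",
    "|",
)
# NSL/APNI path for the same species, as quoted above.
nsl = split_lineage(
    "Plantae / Charophyta / Equisetopsida / Magnoliidae / Rosanae / "
    "Fabales / Fabaceae / Acacia / Acacia ampliata",
    "/",
)

only_raw, only_nsl = lineage_diff(raw, nsl)
print(only_raw)  # ['Tracheophyta', 'Magnoliopsida']
print(only_nsl)  # ['Charophyta', 'Equisetopsida', 'Magnoliidae', 'Rosanae']
```

The two paths agree from Fabales downwards; the disagreement is confined to the levels between kingdom and order.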

@Mesibov
Author

Mesibov commented Oct 9, 2019

That explains the errors. Will ALA be querying APC about this? Their hierarchy is defective.
My download was https://doi.org/10.26197/5d9c2f72356e8

@djtfmartin
Member

Thanks @Mesibov. I've passed that question to the NSL.
