Testing DQ checks: taxonomy - Invalid scientific name #100

Open
Mesibov opened this issue Dec 10, 2015 · 0 comments

Tested on 457075 beetle records downloaded 2 December 2015. ALA says about this test: 'Supplied scientific name cannot be recognised' and 'Catches things like rank values or values such as "UNKNOWN"'. Testing turned up a few possible false positives (matchable scientific names flagged as invalid) and a large number of false negatives (invalid scientific names that passed this test).

452014 false
5061 true

All the 141 'true' flags (5061 records) contain 'un-' words: unassigned, undetermined, unknown, unplaced. Among these, however, are names uploaded with 'unplaced' marking the subgenus. Note that the name parser has correctly recognised the scientific name in the second and third of the 'invalid' names below, but not in the first. Epuraea californica (Gillogly, 1946) is not in AFD but is in CoL.

Epuraea (Unplaced) californica (Gillogly) Matched Scientific Name = Epuraea
Philonthus (Unplaced) cupreotinctus Lea, 1925 Matched Scientific Name = Philonthus cupreotinctus
Tachyporus ((unplaced)) rarus Lea, 1899 Matched Scientific Name = Tachyporus rarus
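
One way to keep names like these from being flagged might be to remove a parenthesised 'unplaced' standing in the subgenus position before looking for 'un-' words. The following is only a rough sketch, not ALA's actual check; the function names and regular expressions are my own assumptions, and the word list is just the four words mentioned above.

```python
import re

# The four 'un-' words seen in the flagged names (taken from the issue text).
PLACEHOLDER = re.compile(
    r"\b(unassigned|undetermined|unknown|unplaced)\b", re.IGNORECASE
)
# "(Unplaced)" or "((unplaced))" sitting where a subgenus would normally go.
SUBGENUS_PLACEHOLDER = re.compile(r"\s*\(+\s*unplaced\s*\)+\s*", re.IGNORECASE)


def strip_subgenus_placeholder(name):
    """Remove a parenthesised 'unplaced' and tidy the spacing (hypothetical helper)."""
    return SUBGENUS_PLACEHOLDER.sub(" ", name).strip()


def has_placeholder_word(name):
    """True if the name still contains one of the 'un-' words."""
    return bool(PLACEHOLDER.search(name))


for raw in [
    "Epuraea (Unplaced) californica (Gillogly)",
    "Tachyporus ((unplaced)) rarus Lea, 1899",
    "Cryptophagidae unplaced TFIC sp 07",
]:
    cleaned = strip_subgenus_placeholder(raw)
    print(raw, "->", cleaned, "| still has un- word:", has_placeholder_word(cleaned))
```

With this approach the two subgenus cases would no longer contain an 'un-' word, while 'Cryptophagidae unplaced TFIC sp 07' would still be caught.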

However, there are many 'false'-flagged records (scientific name not invalid) with 'un-' names:

undet Aspidimerini Matched Scientific Name = COCCINELLIDAE
undet Cerambycid. Matched Scientific Name = CERAMBYCIDAE
undet Coccinellid Matched Scientific Name = COCCINELLIDAE
undet Dynastinae Matched Scientific Name = Dynastinae
undet Prioninae Matched Scientific Name = PRIONINAE
undet staphylinid Matched Scientific Name = STAPHYLINIDAE
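
For consistency, the abbreviated 'undet' prefix could be treated the same way as the full 'un-' words. A minimal sketch of how that prefix might be detected and split off is below; the pattern and the function name `split_undet` are assumptions for illustration, not part of the ALA code.

```python
import re

# Matches a leading "undet", "undet." or "undetermined" followed by whitespace.
UNDET_PREFIX = re.compile(r"^\s*undet(?:ermined)?\.?\s+", re.IGNORECASE)


def split_undet(name):
    """Return (has_prefix, remainder) for a verbatim name (hypothetical helper)."""
    match = UNDET_PREFIX.match(name)
    if match is None:
        return False, name.strip()
    return True, name[match.end():].strip()


for raw in ["undet Aspidimerini", "undet Cerambycid.", "Dynastinae"]:
    print(raw, "->", split_undet(raw))
```

The remainder could still be passed to the name matcher so the higher taxon is recorded, while the flag notes that the supplied name was not a plain scientific name.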

Many of the 'true'-flagged records are of the form 'Cryptophagidae unplaced TFIC sp 07'. Without the 'un-' word, records of this kind were passed as valid scientific names, e.g.

Adelium TMIC sp 01

and were inconsistently matched:

Aedriodes sp. fc1597 Matched Scientific Name = Aedriodes
Aedriodes sp. fc3290 Matched Scientific Name = Aedriodes
Aedroides sp. fc3366 Matched Scientific Name = CURCULIONIDAE
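
Note that the third name in the list above is spelled 'Aedroides' rather than 'Aedriodes', which may be why it fell through to the family. A near-match on the first word could catch cases like this. The sketch below uses Python's standard `difflib`; the genus list, the 0.8 cutoff and the function name are assumptions for illustration only, and this is not how the ALA matcher works.

```python
import difflib


def nearest_genus(verbatim_name, known_genera, cutoff=0.8):
    """Return the known genus closest to the first word of the name, or None."""
    tokens = verbatim_name.split()
    if not tokens:
        return None
    matches = difflib.get_close_matches(tokens[0], known_genera, n=1, cutoff=cutoff)
    return matches[0] if matches else None


# Hypothetical reference list; a real check would use the full name index.
genera = ["Aedriodes", "Adelium"]

for raw in ["Aedriodes sp. fc1597", "Aedroides sp. fc3366", "(Oliver) RTU 106"]:
    print(raw, "->", nearest_genus(raw, genera))
```

A near-match like this would probably be better reported as a suggestion for review than applied silently, since similar spellings can belong to different genera.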

Many more of the 'false'-flagged records are clearly invalid scientific names but were parsed anyway and referred to higher taxa, e.g.

(Oliver) RTU 106 Matched Scientific Name = ZOPHERIDAE
(York) RTU 116 Matched Scientific Name = STAPHYLINIDAE
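
A simple shape check might catch many of these false negatives before any matching is attempted. The following is a rough sketch only: it flags names that do not begin with a capitalised word, or that contain a voucher-style code such as 'RTU 106', 'sp 01' or 'sp. fc1597'. The patterns and the function name are my own assumptions based on the examples in this issue, and would certainly need extending for other collections.

```python
import re

# The name should start with a capitalised, letters-only word (genus or higher taxon).
LEADING_TAXON = re.compile(r"^[A-Z][A-Za-z]+\b")
# Informal codes seen in this issue: "RTU 106", "sp 01", "sp. fc1597".
INFORMAL_CODE = re.compile(r"\b(?:RTU|sp\.?)\s+[A-Za-z]*\d+\b")


def looks_informal(name):
    """True if the verbatim name does not look like a plain scientific name."""
    name = name.strip()
    if LEADING_TAXON.match(name) is None:
        return True
    return INFORMAL_CODE.search(name) is not None


for raw in [
    "(Oliver) RTU 106",
    "Adelium TMIC sp 01",
    "Aedriodes sp. fc1597",
    "Philonthus cupreotinctus Lea, 1925",
]:
    print(raw, "->", looks_informal(raw))
```

The year in an authorship string such as 'Lea, 1925' is not treated as a code here, because the pattern requires the digits to follow 'RTU' or 'sp'.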
