Normalize and merge copyright lines #23

Open
kellnerd opened this issue Jun 9, 2024 · 2 comments
Labels
harmonizer (Harmonized data representation and processing), provider (Metadata provider)

Comments

@kellnerd
Owner

kellnerd commented Jun 9, 2024

Continuing the discussion from #22 (comment)

We can factor out the copyright normalization logic and reuse it for other providers, e.g. as suggested for Tidal in https://community.metabrainz.org/t/harmony-music-metadata-aggregator-and-musicbrainz-importer/698641/15
But I'd do this after merging this PR. It also needs some further research. I know Tidal includes the copyright text both with and without the © symbol. What I'm unsure about is whether this strictly contains copyright © info, or whether it can sometimes also contain phonographic copyright ℗ info. Spotify has those separated, which makes it easier.

I fully agree, this is enough for its own PR and it needs more research.
Tidal also has a copyright property at the track level, by the way. This should also be considered if it differs from the release-level copyright. So far they were identical for the releases I have checked; maybe a compilation has different values there.

For starters, I have a commit in the dev branch which displays the alternative copyright values.
Once we have more examples, we can decide how the release merge algorithm should handle these. One possibility would be to keep all of them and deduplicate.
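
The "keep all and deduplicate" option could look roughly like the following. This is only a sketch under assumptions: `mergeCopyrightValues` is a hypothetical name, not an existing Harmony function, and treating values as equal when they differ only in case or whitespace is one possible choice among several.

```typescript
// Hypothetical helper, not part of Harmony's codebase.
// Keeps every distinct copyright value (release level, track level, other
// providers) and drops repeats, preserving the order of first occurrence.
function mergeCopyrightValues(values: string[]): string[] {
  const seen = new Set<string>();
  const merged: string[] = [];
  for (const value of values) {
    const trimmed = value.trim();
    if (trimmed.length === 0) continue; // skip empty values
    // Assumption: values differing only in case or whitespace are duplicates.
    const key = trimmed.replace(/\s+/g, " ").toLowerCase();
    if (!seen.has(key)) {
      seen.add(key);
      merged.push(trimmed);
    }
  }
  return merged;
}

// Usage: a release-level value plus two track-level values.
const merged = mergeCopyrightValues([
  "© 2020 Example Label",
  "© 2020 EXAMPLE LABEL",
  "℗ 2020 Example Label",
]);
console.log(merged);
```

The first spelling encountered wins, so the order in which providers and levels are passed in decides which variant is displayed.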

kellnerd added the provider (Metadata provider) and harmonizer (Harmonized data representation and processing) labels Jun 9, 2024
@phw
Collaborator

phw commented Jun 10, 2024

For Tidal it is more complicated: the copyright field can contain both (P) and (C) entries, with or without a symbol (either the ASCII form or the proper character).

So I'm not sure whether we can or should add © or ℗ (or maybe "© + ℗") to the string if the symbol is missing. What we could do is convert the ASCII variants into symbols.

EDIT: Tidal's API docs give this as an example value for copyright: "(p)(c) 2017 S. CARTER ENTERPRISES, LLC. MARKETED BY ROC NATION & DISTRIBUTED BY ROC NATION/UMG RECORDINGS INC."

So even this example makes it clear that the field is not strictly for either © or ℗, but rather free-form text that labels / artists can use as they like.
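
Converting the ASCII variants into symbols, without guessing a symbol when none is present, might be sketched as below. `normalizeCopyrightSymbols` is a hypothetical name and not an existing Harmony function; the sketch only assumes that `(c)` and `(p)` in any letter case are the ASCII forms to handle.

```typescript
// Hypothetical helper, not part of Harmony's codebase.
// Replaces ASCII copyright markers with the proper characters.
// It never adds a symbol to text that has none.
function normalizeCopyrightSymbols(text: string): string {
  return text
    .replace(/\(c\)/gi, "©") // (c) or (C) becomes ©
    .replace(/\(p\)/gi, "℗"); // (p) or (P) becomes ℗
}

// Usage with the start of the example value from Tidal's API docs:
console.log(normalizeCopyrightSymbols("(p)(c) 2017 S. CARTER ENTERPRISES, LLC."));
// Text without any marker is returned unchanged:
console.log(normalizeCopyrightSymbols("2017 S. CARTER ENTERPRISES, LLC."));
```

Leaving symbol-less text untouched follows the concern above: the field is free-form, so there is no reliable way to tell which of © or ℗ was meant.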

@phw
Collaborator

phw commented Jun 12, 2024

I wonder whether "copyright" should be a list of strings per provider instead of a single string. Currently I think only Spotify offers explicit, distinct values for (P) and (C), but having those as actual separate values would make de-duplication easier. Right now the provider just separates the two with a newline.
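
A list-per-provider representation could be sketched like this. The `ProviderCopyright` interface and the `splitCopyright` function are hypothetical names for illustration and do not reflect Harmony's actual types; the only thing taken from the discussion is that a provider currently joins its values with a newline.

```typescript
// Hypothetical shape, not Harmony's actual data model.
interface ProviderCopyright {
  provider: string;
  lines: string[]; // one entry per distinct copyright statement
}

// Turns the current newline-separated string into a list of values.
function splitCopyright(provider: string, copyright?: string): ProviderCopyright {
  const lines = (copyright ?? "")
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
  return { provider, lines };
}

// Usage: a value in which (C) and (P) are joined by a newline.
const spotify = splitCopyright("Spotify", "© 2020 Example Label\n℗ 2020 Example Label");
console.log(spotify.lines);
```

With separate entries, values from different providers can be compared one by one instead of as whole multi-line strings.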
