Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add more languages #684

Merged
merged 110 commits into from
Feb 23, 2024
Merged

Conversation

StefanVukovic99
Copy link
Member

On top of #584, adds more languages and their text preprocessors. Minor changes to the pattern set there with the types/organization.

@djahandarie
Copy link
Collaborator

Don't know if we need to do it as part of this PR, but we will probably eventually want to add to CODEOWNERS separate lines for each per-language folder, as individual language concerns will likely need to be reviewed by someone who knows something about that language.

@toasted-nutbread
Copy link

So I haven't gotten into this a whole lot yet, but one thing stand out to me: the amount of tiny files added for language descriptors. Would it be better if we just put these into a single file? IMO the huge bloat of the copyright in each file, followed by like 4 lines of actual code, seems like a big waste. And reducing ~30 files into 1 may just be better for the way that they are imported as well. It's already impossible to dynamically import them anyways, so I don't think we would be losing anything.

What do you and others think?

@StefanVukovic99
Copy link
Member Author

I've had the thought that some files were way too big, never too small lol. Having everything language-specific in its own folder seems tidy. I'm quite incredulous that the copyright is needed in every file, and not sure that dynamic loading is absolutely impossible.

That said, I don't really mind either way, I guess we can do tiny languages in one file (directly in languages.js?), and separate them out when they get to a reasonable size.

@toasted-nutbread
Copy link

With regard to the header, I mean it's probably not strictly needed if we decide we want to change it, so long as it exists somewhere. It could probably be turned into a minimal stub if there's desire to keep a header that references the full license externally. During the Yomichan era, this thought never really came to mind because all the files were significantly larger than the header itself. Obviously that has been changing.

With regard to files that are too large, I also agree with this on a lot of things and I am doing some work to address this where possible. Unfortunately, there are a lot of complex classes that, at least presently, can't really be broken down easily. Backend is a huge cause of issues on that front.

With regard to dynamic loading of files, I agree there are probably some clever ways to handle this, but given the small size and repetitive content of all of the existing and new language descriptors, dynamic loading would probably just result in more overhead at the end of the day.

My two cents is that grouping them all into a unified file would actually make it easier to read through all of the languages with no difference in existing functionality. I can try to mock this up more formally to see how I/we feel about it rather than speculating.

@djahandarie
Copy link
Collaborator

Hmm, I feel it's more tidy to have all language-specific stuff in its own folder which can basically be owned by some expert in that language (so I don't need to really review it beyond a ceremonial merge).

The PR as-is doesn't have that many differences per language which makes each file look small and redundant, but I think in theory they could each potentially grow in complexity quite a bit as each language gets more complex functionality added by experts, no?

@toasted-nutbread
Copy link

Language-specific things can still be located in the folders, I just think the final composition of all the descriptors is simplified if it's all declared in a single config file. #704 shows this, while still language-specific things like japanese-text-preprocessors.js in the specific language folder.

The reason why I think the single location hosting all the config is useful is because it gives you an overview of all of the languages at a glance without having to open 30+ different files if we need to add a new field to a LanguageDescriptor. I'm thinking of it similar to how eslintrc and similar config files can also be written directly in JavaScript.

(At the very least, #704 tidies some of the types, even if we decide we want to have language-specific folders for things.)

@djahandarie
Copy link
Collaborator

@StefanVukovic99 I think this is good to merge whenever you have time to resolve comments and rebase on top of toasted-nutbread's new file layout

@StefanVukovic99
Copy link
Member Author

I think @toasted-nutbread will want at least one more pass at this.

Copy link

@toasted-nutbread toasted-nutbread left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

@djahandarie djahandarie added this pull request to the merge queue Feb 23, 2024
Merged via the queue into themoeway:master with commit 62ac615 Feb 23, 2024
8 checks passed
@djahandarie djahandarie added the kind/enhancement The issue or PR is a new feature or request label Feb 25, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/enhancement The issue or PR is a new feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants