Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AVRO-2385: generate camelCase method names for UPPER_SNAKE fields #735

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

ecopoesis
Copy link

Changes the implementation of Avro record name to Java method name
converter. Though this breaks no existing tests, it is a breaking change
for records named in UPPER_SNAKE style.

Make sure you have checked all steps below.

Jira

Tests

  • My PR adds the following unit tests OR does not need testing for this extremely good reason:

Adds:

  • org.apache.avro.compiler.specific.TestSpecificCompiler#testCamelize

Modifies:

  • org.apache.avro.compiler.specific.TestSpecificCompiler#generateGetMethod
  • org.apache.avro.compiler.specific.TestSpecificCompiler#generateSetMethod

Commits

  • My commits all reference Jira issues in their subject lines. In addition, my commits follow the guidelines from "How to write a good git commit message":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters (not including Jira issue reference)
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

Documentation

  • In case of new functionality, my PR adds documentation that describes how to use it.
    • All the public functions and the classes in the PR contain Javadoc that explain what it does

@probot-autolabeler probot-autolabeler bot added the Java Pull Requests for Java binding label Dec 5, 2019
@zeshuai007
Copy link
Member

If you use mvn spotless: apply, it can solve the problem of the code format built by CI

Copy link
Member

@zeshuai007 zeshuai007 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quick check, the test case is great, but I think that instead of using toString, returning to StringBuilder directly can reduce the amount of code.

// use underscores as hints where the capitals go
StringBuilder builder = new StringBuilder();
for (String part : StringUtils.split(s, '_')) {
builder.append(StringUtils.capitalize(StringUtils.lowerCase(part)));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'StringUtils.lowerCase ()' will have better performance if placed outside the loop.

@ecopoesis ecopoesis force-pushed the feature/avro-2385-upper-getset branch from 8770fc2 to 1424de1 Compare December 6, 2019 22:04
@ecopoesis
Copy link
Author

Not all the branches of camelize use StringBuilders, so always returning a StringBuilder would sometimes require creating an additional object. Additionally we need to have a string at the end when we decide on the leading capital.

camelize could be embedded generateMethodName like before, but I think it's a useful utility to have. The same casing problems exist for generated field and class names so we may want to use it there too.

Changes the implementation of Avro record name to Java method name
converter. Though this breaks no existing tests, it is a breaking change
for records named in UPPER_SNAKE style.
@adelamodwala
Copy link

Hi @ecopoesis - is there a reason this PR is still open? Would love to use this feature.

@ecopoesis
Copy link
Author

Hi @ecopoesis - is there a reason this PR is still open? Would love to use this feature.

I am not a committer for Avro so I can’t merge this branch.

@adelamodwala
Copy link

@zeshuai007 Looks like you're one of the contributors. Would it be possible to have this PR merged?

@adityagoel
Copy link

Trying to get the attention back for this to be merged if the contribution looks good.

assertEquals("SnakeUuid", SpecificCompiler.camelize("snake_UUID", true));
assertEquals("Lower", SpecificCompiler.camelize("lower", true));
assertEquals("Capitalized", SpecificCompiler.camelize("Capitalized", true));
assertEquals("CAPS", SpecificCompiler.camelize("CAPS", true));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this intentional ?
At https://github.com/apache/avro/pull/735/files#diff-bdda25ef6cc04f1ca556df625d9d4ecdd65951cfcd8b4e38da5916765bff75dbR946 UUID has been transformed to Uuid but here CAPS remains as CAPS. I'd expect Caps instead.

assertEquals("CAPS", SpecificCompiler.camelize("CAPS", true));
assertEquals("CamelCased", SpecificCompiler.camelize("camelCased", true));
assertEquals("CamelCased", SpecificCompiler.camelize("CamelCased", true));
assertEquals("SomeUUID", SpecificCompiler.camelize("someUUID", true));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment as form CAPS above.

assertEquals("CamelCased", SpecificCompiler.camelize("camelCased", true));
assertEquals("CamelCased", SpecificCompiler.camelize("CamelCased", true));
assertEquals("SomeUUID", SpecificCompiler.camelize("someUUID", true));
assertEquals("UUIDFirst", SpecificCompiler.camelize("UUIDFirst", true));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same

return StringUtils.uncapitalize(s);
}
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And what about MIXED_CamelCasedTYPED ? :)
Shouldn't it be better to split at start, then ignore part that contains only '_'

StringUtils.splitByCharacterTypeCamelCase("MIXED_CamelCasedTYPED") =>
["MIXED3, "_", "Camel", "Cased", "TYPED"]

I suggest

static String camelize(String s, boolean upper) {
   String[] parts = StringUtils.splitByCharacterTypeCamelCase(s);
   int index = 0;
   while (index < parts.length && "_".equals(parts[index])) {
       index++;
   }
   // subsections should retain their casing if all upper
  ...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Java Pull Requests for Java binding
Projects
None yet
7 participants