Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consolidate fileutil functions in a package #1457

Merged
merged 1 commit into from
Jun 29, 2020

Conversation

manish-sethi
Copy link
Contributor

Signed-off-by: manish [email protected]

Type of change

  • Improvement (improvement to code, performance, etc)
  • Test update

Description

Consolidate fileutil functions in a package

Additional details

There were some file util functions in a common/ledger/util package and some more functions were added to both common/ledger and core/ledger. This PR consolidates all these functions in common/ledger/fileutil package.

@manish-sethi manish-sethi requested a review from a team as a code owner June 25, 2020 03:46
@cendhu
Copy link
Contributor

cendhu commented Jun 25, 2020

@manish-sethi I know that we had agreed to common/ledger/fileutil earlier but sorry that I have a change of mind on that for the following reasons:

  1. [FAB-17960] Ch.Part.API: Implement filerepo #1441 orderer/common/filerepo can utilize this package to ensure that data is written to the disk.
  2. The chaincode install also does not call fsync. Hence, core/chaincode/persistence can also use this package.
    if err := tmpFile.Close(); err != nil {
    return errors.Wrapf(err, "error closing temp file '%s'", tmpFile.Name())
    }
    if err := os.Rename(tmpFile.Name(), filepath.Join(path, name)); err != nil {
    return errors.Wrapf(err, "error renaming temp file '%s'", tmpFile.Name())
    }
    return nil

Can we please move /common/ledger/fileutil to /pkg/fileutil?

There are 42 famous golang projects (https://github.com/golang-standards/project-layout/tree/master/pkg) which heavily use /pkg except for Fabric :-)

Copy link
Contributor

@ale-linux ale-linux left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Manish, a couple of minor nits

common/ledger/util/fileutil/fileutil.go Outdated Show resolved Hide resolved
common/ledger/util/fileutil/fileutil.go Outdated Show resolved Hide resolved
@manish-sethi
Copy link
Contributor Author

I would surely prefer having these kind of util stuff at a common place. As of now, we have it all scattered. /pkg sounds good home for these kind of stuff to me. There are more than one reference projects that have it like this.

However, I would like to hear if @sykesm is in agreement with this or his preference is to keep in /internal.
Also, any preferences between fileutil and util/fileutil (mine is latter).

@cendhu
Copy link
Contributor

cendhu commented Jun 25, 2020

@manish-sethi As per Golang-standard repo, only 6 famous projects use /internal. There are 42 famous projects that use /pkg. If you look here https://github.com/golang-standards/project-layout/tree/master/internal, you can find This is the code you don't want others importing in their applications or libraries. Fabric being the open-source project, I don't mind keeping this fileutil in the /pkg so that others can also use it (similar to how we use a lot of other open source libraries) 😄

On the name, I prefer pkg/fileutil.

@sykesm
Copy link
Contributor

sykesm commented Jun 28, 2020

Packages with an internal qualifier are not accessible to packages outside of the immediate parent of the tree that contains it. It is used to hold code that is shared across packages within a project but that should not be consumed by others.

As per Golang-standard repo, only 6 famous projects use /internal.

True or not, it's not particularly relevant. The location of the code should reflect the scope and the intent.

Fabric being the open-source project, I don't mind keeping this fileutil in the /pkg so that others can also use it

This is not how we should be thinking of the Fabric repository. If the value of Fabric is that we offer file management utilities, or logging utilities, or certificate management utilities, we've failed. We also don't do proper versioning or dependency management of our code; we barely manage that for the packages we actually pulled out to be consumed as libraries.

However, I would like to hear if @sykesm is in agreement with this or his preference is to keep in /internal.

As to the question: I don't believe 'internal' is the right location for this. Moving it to a /pkg is preferred to something like common.

@cendhu
Copy link
Contributor

cendhu commented Jun 29, 2020

Packages with an internal qualifier are not accessible to packages outside of the immediate parent of the tree that contains it. It is used to hold code that is shared across packages within a project but that should not be consumed by others.

I am aware of this. While it make sense to restrict say some ledger sub-packages being imported into other layer (for e.g., we import rwsetutil pkg in the vscc layer), i am not sure about a project level /internal.

Keeping code in /pkg makes things clear and simple. Otherwise, every time, we need to spend time discussing on where to place the pkg -- /internal or /pkg. As you mentioned many times, just because we can, it doesn't mean we should use it.

Why only 6 projects use /internal -- it is relavant here. Most of the time we feel what we think is correct. Hence, it is necessary to look into stats and reason about it -- whether I am thinking wrong or everyone else is thinking wrong. One reason could be that it is a recent feature and hence not widely used.

Another simple approach is to keep everything inside /internal and use /pkg only to export things to sdk or other fabric repo. With this, fileutil gets into /internal/pkg.

@manish-sethi
Copy link
Contributor Author

The purpose of /pkg and /internal is quite well understood. My intent is to get an agreement between /pkg vs /internal/pkg in our context.

@sykesm
Copy link
Contributor

sykesm commented Jun 29, 2020

I'd really like to avoid getting dragged into a debate on this. I spent a great deal of time and energy in 2018 trying to get others to detangle the code base into proper packages that were easier to reason about and more loosely coupled. While I'm happy to see some of that happening now, I've lost the passion.

Based on convention, things in /pkg are generally assumed to be consumable by others outside the project. We have very little of that in Fabric so I'm not an advocate for a /pkg directory as the default home. Things that get moved to /pkg should be moved there intentionally and for a reason.

If we think that fileutil is something that we produced for people consuming Fabric, that they should be using it, and we commit to making backward only compatible changes, then it belongs in /pkg.

If we think it's a shared bit of code to consolidate what we do, it belongs in /internal. (I don't see the need for a pkg qualifier under internal as that directory should really only contain packages but someone has already created it so...)

Moving something out of /internal will never impact an external consumer. To me, putting code in /internal provides maximum flexibility for the project without impacting consumers. This is why the feature exists and why a bunch of Fabric code was moved there. The effort was only stopped when the impact to existing users of plugins was understood.

So, you wrote the package. Is this something internal or something for others? Which consumer did you have in mind? Are we committing to make changes that are strictly backward compatible? Those questions should inform your answer.

@cendhu
Copy link
Contributor

cendhu commented Jun 29, 2020

Thanks for the detailed explanation @sykesm

I stand corrected. Understood the reasoning. Make sense.

I agree that keeping stuff in /internal provides more flexibility while keeping stuff in /pkg adds an additional burden on backward compatibility. You have nicely listed the steps to follow while deciding where to place a pkg.

Going forward, we will use this rule for all new packages and old packages present within the common.

@manish-sethi
Copy link
Contributor Author

Thanks @sykesm. I agree on /internal in our context. I'll move this there. I remember your push on this and if you recall I had advised that a better way to implement would be for all squads to take a break for while from the feature push and re-org the code first.

@manish-sethi manish-sethi force-pushed the files_util branch 2 times, most recently from ac0bf1f to ea7c2cd Compare June 29, 2020 15:35
@cendhu
Copy link
Contributor

cendhu commented Jun 29, 2020

Thanks @sykesm. I agree on /internal in our context. I'll move this there. I remember your push on this and if you recall I had advised that a better way to implement would be for all squads to take a break for while from the feature push and re-org the code first.

+1 It would be really good if we do this. Otherwise, we are continuously adding things to the existing bad pattern.

Copy link
Contributor

@cendhu cendhu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Just a minor comment.

Comment on lines +28 to +31
if err := os.Rename(tempFilePath, finalFilePath); err != nil {
return err
}
return nil
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we do syncDir(dir) here after the rename? -- https://www.slideshare.net/nan1nan1/eat-my-data

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left it to the user to call this explicitly. As I do here.

If we do implicit here then it would be better to do in this function as well, in order to maintain a uniform behavior but the latter function may sometime not need it based on the context. So, I decided to leave up to the consumer to explicitly invoke it - So, does the function name convey. Atomicity and durability are two different dimensions. syncDir is more for the durability.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, Manish, for the offline discussion. https://man7.org/linux/man-pages/man2/fsync.2.html clarified some details for me. So both APIs ensures that the data is written to the disk and sync of parent directory is left to the caller.

if err := os.MkdirAll(dirPath, 0755); err != nil {
return false, errors.Wrapf(err, "error while creating dir: %s", dirPath)
}
if err := SyncParentDir(dirPath); err != nil {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of curiosity, do we only need to sync the direct parent of the directory created? What if the MkdirAll on line 78 needs to create multiple folders in the path?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes @caod123 - you are right, this should be on all new folders in the path. I copied this function from existing one where assumption was that up to parent it exists. But better to fix this here. I'll fix this in a separate PR after this.

@cendhu cendhu merged commit 957a726 into hyperledger:master Jun 29, 2020
@manish-sethi manish-sethi deleted the files_util branch November 4, 2020 03:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants