Skip to content
This repository has been archived by the owner on Mar 30, 2018. It is now read-only.

Fix issue 2120 #2213

Merged
merged 4 commits into from
Jul 19, 2016
Merged

Fix issue 2120 #2213

merged 4 commits into from
Jul 19, 2016

Conversation

jeroiraz
Copy link
Contributor

Member services fails under load with "database is locked" error

"Database is locked" error could be easily reproduced when no sync mechanism is present. See original gist https://gist.github.com/mrnugget/0eda3b2b53a70fa4a894 and its fork: https://gist.github.com/jeronimoirazabal/cbe014eb22333cc3b6b56aa66379e933

Description

Mutex variable has been changed to RWMutex type in order to allow simultaneous reads and exclusive writes to sqlite3 db. Synchronization has been taken into account in all db access within the scope of Membership Services.

Motivation and Context

Fixes #2120

How Has This Been Tested?

This PR does not require additional test cases as no new functionality was introduced.

Unit tests have been run without errors.
Behave tests have been run without errors.
Node.js tests have been run without errors. Even under heavy load (i.e. 1000 users, 1000 iterations)

Checklist:

  • I have added a Signed-off-by.
  • This change requires no new documentation.
  • This change requires no new tests.
  • I have run golint and have fixed valid warnings in code I have added or modified.

Signed-off-by: Jeronimo Irazabal [email protected] / [email protected]

@dco-bot
Copy link

dco-bot commented Jul 14, 2016

Hi jeronimoirazabal,

Thanks for submitting this pull request!

Please add a comment with a DCO1.1 Signed-off-by statement in order to allow us to process your pull request.

dco-bot

@srderson
Copy link
Contributor

srderson commented Jul 14, 2016

@jeronimoirazabal There's a merge conflict with this PR, possibly as a result of merging PR #2205. Could you please rebase on master? Thanks

@jeroiraz
Copy link
Contributor Author

@srderson done!

@jeroiraz
Copy link
Contributor Author

hi @srderson, @christo4ferris, could you take a look at this changes? Many thanks

@srderson
Copy link
Contributor

Thanks, LGTM

@srderson srderson merged commit a511daa into hyperledger-archives:master Jul 19, 2016
@GrapeBaBa
Copy link
Contributor

sorry for jump in, but i am confused why need use lock for db operation . From my previous experience in java , no ORM framework will explicit use lock in memory, they all use the isolation level which database should already supportted.

@jeroiraz
Copy link
Contributor Author

jeroiraz commented Jul 20, 2016

Thanks @GrapeBaBa. I agree with you, but it seems such error (“database is locked error”) needs to be handled when using sqlite3.

Setting a greater 'busy timeout' value might help but that wouldn’t be enough and handling “database is locked error” is required: mattn/go-sqlite3#274

Did you figure out an alternative way to address this issue?

Many thanks

@GrapeBaBa
Copy link
Contributor

@jeronimoirazabal No, I have no idea about this issue. I think we need do more deeper investigation for this.

@JonathanLevi
Copy link
Contributor

[Good morning] My 2 cents:

  1. I agree that the proposed (and merged) locking strategy seems pretty aggressive. It is suspicious (as sqlite can handle multiple readers easily, just not multiple writers).
  2. I think that having a deterministic, reproducible database is locked error with a Fabric use-case examples/tests...would go a long way. It will both highlight the need for such locks AND will allow us to also try out alternative solutions.
  3. Yes, very happy to join and investigate it further too. I think that we can (and should!) come up when we add a few stress tests and benchmarks. It's actually not so difficult to do in Go - and I believe that such deeper investigation (and tests!) will make this clearer to both of us.

Practically: In my opinion, we can probably do away with not reverting this PR/change at this point: Even if these multiple locks prove to be over defensive - at this point and stage of both Fabric and Membership Services (i.e., Incubation) the priority/preference is to get things working/stabilizing the various components. So we can improve this as an 'optimization' once we have (more/any) related stress/performance tests in.

That said, I'm pretty sure we will revisit these.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Development

Successfully merging this pull request may close these issues.

Member services fails under load with "database is locked" error
5 participants