This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

[MXNET-312] Added Matthew's Correlation Coefficient to metrics #10524

Merged
merged 26 commits into from
Jun 28, 2018

Conversation

dabraude
Contributor

@dabraude dabraude commented Apr 12, 2018

Description

Added the Matthews Correlation Coefficient (MCC) to metrics

Checklist

Essentials

Please feel free to remove inapplicable items for your PR.

  • The PR title starts with [MXNET-$JIRA_ID], where $JIRA_ID refers to the relevant JIRA issue created (except PRs with tiny changes)
  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage:
  • Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
  • Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore)
  • Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL)
  • Code is well-documented:
  • For user-facing API changes, API doc string has been updated.
  • For new C++ functions in header files, their functionalities and arguments are documented.
  • For new examples, README.md is added to explain what the example does, the source of the dataset, expected performance on test set and reference to the original paper if applicable
  • Check the API doc at http://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
  • To the best of my knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

Changes

  • Added MCC binary classification metric (and when applicable, API doc)

Comments

  • In documentation gave example where it shows the difference with F1
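
The contrast with F1 mentioned above can be illustrated with a small sketch. This is not the code added in the PR; the helper names `f1_score` and `mcc_score` and the confusion-matrix counts are hypothetical, and returning 0.0 for an empty denominator is an assumed convention.

```python
import math

def f1_score(tp, fp, fn):
    """F1 from confusion-matrix counts; true negatives never enter the formula."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def mcc_score(tp, tn, fp, fn):
    """Matthews correlation coefficient; 0.0 when a row or column total is zero (assumed convention)."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Hypothetical, heavily imbalanced data: 95 positives and 5 negatives.
# The classifier finds most positives but only one of the negatives.
tp, fn, tn, fp = 90, 5, 1, 4

f1 = f1_score(tp, fp, fn)
mcc = mcc_score(tp, tn, fp, fn)
print(f"F1  = {f1:.3f}")
print(f"MCC = {mcc:.3f}")
```

With these counts F1 comes out high (about 0.95) while MCC stays low (about 0.14), because MCC also accounts for how poorly the negative class is handled.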

cjolivier01 and others added 18 commits April 12, 2018 11:58
* send as char

* fix bug on pull response, and rowsparse on worker side

* three modes

* default to mode 0 and add support for row sparse

* refactor sparse

* rowsparse numbytes fixes

* WIP tests

* update test sync

* remove prints

* refactoring

* Revert "refactoring"

This reverts commit 05ffa1b.

* undo refactoring to keep PR simple

* add wait to stored in pull default

* lint fixes

* undo static cast for recvblob

* lint fixes

* mode 1 changes

* sparse bug fix dtype

* mshadow default

* remove unused var

* remove debug statements

* clearer variables, reduced multiplication, const vars

* add const for more vars, comments

* comment syntax, code watcher, test default val

* remove unnecessary print in test

* trigger ci

* multi precision mode (debugging race condition)

* working rsp pushes

* finish multiprecision for row sparse

* rename num-bytes

* fix bug due to rename of numbytes, and remove debug logs

* address comments

* add integration test

* trigger ci

* integration test

* integration test

* fix path of script

* update mshadow

* disable f16c for amalgamation

* fix amalgamation build

* trigger ci

* disable f16c for jetson
* changed url references from dmlc to apache/incubator-mxnet

* prepping scala landing pages

* infer api info added
* Fix infer_storage_type

* Add test

* Fix lint

* Trigger CI
* add slice_like and doc

* pass unittest and lint
* initial update on setting up scala ide with mxnet

* moving images to web-data project

* updated links to images; added readme for root folder

* scala hello world feature added

* workaround for make transitive error

* fixed systempath

* minor updates

* table fix

* added some spacing

* more spacing
@dabraude dabraude requested a review from szha as a code owner April 12, 2018 11:20
@dabraude
Contributor Author

@marcoabreu can you check the CI?

@marcoabreu
Contributor

What's the matter?

@dabraude
Contributor Author

I meant can you check the test I added is ok

Contributor

@marcoabreu marcoabreu left a comment

Lgtm

    def __init__(self, name='mcc',
                 output_names=None, label_names=None, average="macro"):
        self.average = average
        self.metrics = _BinaryClassificationMetrics()
Contributor

_average and _metrics

Contributor Author

done

While slower to compute the MCC can give insight that F1 or Accuracy cannot.
For instance, if the network always predicts the same result
then the MCC will immeadiately show this. The MCC is also symetric with respect
to positive and negative catagorisation, however, there needs to be both
Contributor

categorisation*

Contributor Author

Thanks fixed (and changed to US spelling)
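
The two claims in the docstring excerpt reviewed in this thread (a constant predictor is exposed by the MCC, and the MCC is symmetric in the choice of positive class) can be checked with a short sketch. The functions and data below are hypothetical illustrations, not the PR's implementation; returning 0.0 when the denominator is zero is an assumed convention.

```python
import math

def confusion_counts(labels, preds, positive=1):
    """Return (tp, tn, fp, fn), treating `positive` as the positive class."""
    tp = tn = fp = fn = 0
    for y, p in zip(labels, preds):
        if p == positive:
            if y == positive:
                tp += 1
            else:
                fp += 1
        elif y == positive:
            fn += 1
        else:
            tn += 1
    return tp, tn, fp, fn

def mcc(labels, preds, positive=1):
    tp, tn, fp, fn = confusion_counts(labels, preds, positive)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Assumed convention: report 0.0 when a row or column of the matrix is empty.
    return (tp * tn - fp * fn) / denom if denom else 0.0

def f1(labels, preds, positive=1):
    tp, _, fp, fn = confusion_counts(labels, preds, positive)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

# A model that always predicts class 1 on mostly-positive data.
labels = [1] * 9 + [0]
always_one = [1] * 10
constant_f1 = f1(labels, always_one)    # stays high despite the useless model
constant_mcc = mcc(labels, always_one)  # drops to 0.0

# Swapping which class counts as "positive" changes F1 but not MCC.
mixed_labels = [1, 1, 1, 1, 1, 1, 0, 0, 0, 1]
mixed_preds = [1, 1, 1, 1, 1, 0, 0, 0, 1, 1]
print(mcc(mixed_labels, mixed_preds, positive=1), mcc(mixed_labels, mixed_preds, positive=0))
print(f1(mixed_labels, mixed_preds, positive=1), f1(mixed_labels, mixed_preds, positive=0))
```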

@dabraude
Contributor Author

@piiswrong Would you mind updating your review?

@dabraude
Contributor Author

dabraude commented Jun 6, 2018

Hey is there anything I can fix on this?

@@ -122,6 +123,54 @@ def test_f1():
    np.testing.assert_almost_equal(microF1.get()[1], fscore_total)
    np.testing.assert_almost_equal(macroF1.get()[1], (fscore1 + fscore2) / 2.)

def test_mcc():
Member

@szha szha Jun 28, 2018

could you add a consistency test with Matthew's Correlation Coefficient in numpy?

Member

never mind. I didn't realize that numpy doesn't have it.

Member

I tested that this implementation of micro mcc is consistent with http://scikit-learn.org/stable/modules/generated/sklearn.metrics.matthews_corrcoef.html
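
For readers unfamiliar with the `average` option, here is a rough sketch of how micro and macro averaging might differ for MCC. It assumes, by analogy with the `microF1`/`macroF1` assertions in the test diff above, that micro pools the confusion counts over all batches and macro averages the per-batch scores. The batch counts and the helper name `mcc_from_counts` are hypothetical.

```python
import math

def mcc_from_counts(tp, tn, fp, fn):
    """MCC from confusion-matrix counts; 0.0 when the denominator is zero (assumed convention)."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Hypothetical per-batch confusion counts as (tp, tn, fp, fn).
batches = [(8, 1, 1, 0), (2, 6, 1, 1)]

# Macro (assumed): score each batch separately, then take the mean.
macro_mcc = sum(mcc_from_counts(*b) for b in batches) / len(batches)

# Micro (assumed): add up the counts first, then score once.
pooled = tuple(sum(column) for column in zip(*batches))
micro_mcc = mcc_from_counts(*pooled)

print(f"macro MCC = {macro_mcc:.4f}")
print(f"micro MCC = {micro_mcc:.4f}")
```

Under this reading, the pooled (micro) value is the one comparable to a single call over the whole dataset, such as scikit-learn's `matthews_corrcoef`.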

@piiswrong piiswrong merged commit 044afa3 into apache:master Jun 28, 2018
@dabraude dabraude deleted the matthews branch June 29, 2018 05:10
XinYao1994 pushed a commit to XinYao1994/incubator-mxnet that referenced this pull request Aug 29, 2018
…e#10524)

* Remove Fermi from cmake (apache#10486)

* updated R docs (apache#10473)

* [MXNET-120] Float16 support for distributed training (apache#10183)

* send as char

* fix bug on pull response, and rowsparse on worker side

* three modes

* default to mode 0 and add support for row sparse

* refactor sparse

* rowsparse numbytes fixes

* WIP tests

* update test sync

* remove prints

* refactoring

* Revert "refactoring"

This reverts commit 05ffa1b.

* undo refactoring to keep PR simple

* add wait to stored in pull default

* lint fixes

* undo static cast for recvblob

* lint fixes

* mode 1 changes

* sparse bug fix dtype

* mshadow default

* remove unused var

* remove debug statements

* clearer variables, reduced multiplication, const vars

* add const for more vars, comments

* comment syntax, code watcher, test default val

* remove unnecessary print in test

* trigger ci

* multi precision mode (debugging race condition)

* working rsp pushes

* finish multiprecision for row sparse

* rename num-bytes

* fix bug due to rename of numbytes, and remove debug logs

* address comments

* add integration test

* trigger ci

* integration test

* integration test

* fix path of script

* update mshadow

* disable f16c for amalgamation

* fix amalgamation build

* trigger ci

* disable f16c for jetson

* Fix rat excludes (apache#10499)

* MXNET-308 added missing license (apache#10497)

* refactored example (apache#10484)

* [MXNET-298] Scala Infer API docs landing page (apache#10474)

* changed url references from dmlc to apache/incubator-mxnet

* prepping scala landing pages

* infer api info added

* Fix infer storage type (apache#10507)

* Fix infer_storage_type

* Add test

* Fix lint

* Trigger CI

* [MXNET-306] Add slice_like operator (apache#10491)

* add slice_like and doc

* pass unittest and lint

* Minor simplifications in ci/build.py (apache#10496)

* [MXNET-305] Scala tutorial table fix (apache#10488)

* initial update on setting up scala ide with mxnet

* moving images to web-data project

* updated links to images; added readme for root folder

* scala hello world feature added

* workaround for make transitive error

* fixed systempath

* minor updates

* table fix

* added some spacing

* more spacing

* added ability to set search path for Accelerate library

* [MXNET-311] change test needs a docker with sudo, hence image changed (apache#10510)

* added new metric

* changed back from branch to upstream

* lint changed

* Fixed typo

* Clarified interpretation

* Changes for variable names

* fixed variable names

* replay the unit tests

* changed comment