Initial Neural Net - DO NOT MERGE #349

Open. Wants to merge 38 commits into base: master

Conversation

Aylr
Contributor

@Aylr Aylr commented Aug 29, 2017

This requires substantial review and discussion before merging.

Shufang Ci and others added 30 commits July 17, 2017 13:32
…ataframes to numpy arrays to make keras happy

* added a self.columns attribute to save the dataframe column names
KerasClassifier is not a BaseEstimator. This raises an error in neural
network.
Don't need it now.
# Conflicts:
#	healthcareai/advanced_supvervised_model_trainer.py
This dataset is used for multi class classification
Changed `roc_auc` to `accuracy` to accommodate multi class
classification
Add a `binary` check in `calculate_binary_classification_metrics()`, and
changed the name to `calculate_classification_metrics()`. Choose
performance metrics based on the number of classes.
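A minimal sketch of the metric choice these commits describe, assuming the decision is driven only by the number of distinct labels. The helper name `choose_scoring_metric` is hypothetical and not taken from this PR; `roc_auc` and `accuracy` are scikit-learn scoring names.

```python
def choose_scoring_metric(y):
    """Return a scoring name based on how many distinct classes appear in y."""
    n_classes = len(set(y))
    if n_classes < 2:
        raise ValueError("Need at least two classes to train a classifier.")
    # ROC AUC is kept for the binary case; accuracy covers multiclass targets.
    return "roc_auc" if n_classes == 2 else "accuracy"
```

The returned string could then be passed wherever a scikit-learn `scoring` argument is expected.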
@mxlei01
Contributor

mxlei01 commented Aug 29, 2017

Although you removed the ctg dataset in the end, I can still see it at ab5b130. Dunno if it matters though.

create_nn:

  • Maybe give the user an option to create a deeper network? It looks like a network with a fixed number of layers.
  • Last layer activation: Maybe have a choice for sigmoid for multi-label classification?
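Both suggestions could be sketched roughly as follows. This is not the PR's actual `create_nn`; the signature, the `hidden_units` and `multi_label` parameters, and the `plan_layers` helper are assumptions made for illustration. Keras is imported lazily so the layer plan can be inspected without it installed.

```python
def plan_layers(n_classes, hidden_units=(64, 32), multi_label=False):
    """Return (units, activation) pairs for the hidden and output layers."""
    layers = [(units, "relu") for units in hidden_units]
    # Softmax picks exactly one class; sigmoid scores each label independently.
    layers.append((n_classes, "sigmoid" if multi_label else "softmax"))
    return layers


def create_nn(n_features, n_classes, hidden_units=(64, 32), multi_label=False):
    """Build and compile a Keras model from the layer plan (hypothetical signature)."""
    from tensorflow import keras  # lazy import: plan_layers works without Keras

    model = keras.Sequential()
    model.add(keras.Input(shape=(n_features,)))
    for units, activation in plan_layers(n_classes, hidden_units, multi_label):
        model.add(keras.layers.Dense(units, activation=activation))
    # categorical_crossentropy assumes one-hot encoded labels.
    loss = "binary_crossentropy" if multi_label else "categorical_crossentropy"
    model.compile(optimizer="adam", loss=loss, metrics=["accuracy"])
    return model
```

Passing a longer tuple such as `hidden_units=(128, 64, 32)` gives a deeper network, and `multi_label=True` switches the last layer to sigmoid.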

Just my two cents.

@Aylr
Contributor Author

Aylr commented Sep 4, 2017

@mxlei01 I agree with your comments. Thank you for checking it out! It is important to note that this is only the first step toward getting neural nets into healthcare.ai. I may pull pieces of this in slowly (for example, the multiclass support) as we decide how we want to handle nets.

@mxlei01
Contributor

mxlei01 commented Sep 6, 2017

@Aylr Regarding the neural network that Healthcare-AI would use: would you rather use TensorFlow or a high-level tool like Keras? I have researched Keras vs. TensorFlow a little. I'm not sure what your deep learning training flow looks like, but the TensorFlow approach used to be multi-threading plus queues, and the recommended way now is to use Datasets. With Keras batch training driven by a Python generator, you would get only a portion of the throughput of pure TensorFlow, plus the overhead of switching between the underlying C++ code and Python. Keras with the MXNet backend seems to be a good alternative with high training throughput, although I'm not sure how its performance compares with pure TensorFlow.

Good GPUs are expensive and training times are long, so we might want every bit of performance we can squeeze out of a GPU.

With deep learning, we would also want to batch data for training, but right now we actually read in the whole dataset for training.
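As a rough illustration of batching instead of loading everything at once, here is a minimal sketch of a lazy mini-batch generator. The name `iter_batches` is hypothetical and not part of healthcareai; it works on any iterable, such as a `csv.reader` over an open file.

```python
from itertools import islice


def iter_batches(rows, batch_size):
    """Yield lists of at most batch_size rows from any iterable, lazily."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(rows)
    while True:
        # Pull only the next batch_size rows; the rest stay unread.
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch
```

Only one batch is held in memory at a time, at the cost of the Python-level overhead mentioned above.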

Would we want to somehow make a scalable version of our data pipeline? We wouldn't need to replace the whole thing; it could be selected through a user setting, for example pipeline='TensorFlow'.
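One way such a setting might be wired up is a small lookup table keyed by the `pipeline` value. Everything here is an assumption for illustration: the function names, the `"in_memory"` default, and the registry are not part of healthcareai. The TensorFlow branch uses `tf.data.Dataset.from_tensor_slices` and is imported lazily so the default path does not need TensorFlow installed.

```python
def in_memory_batches(features, labels, batch_size):
    """Current behaviour: hand back the whole dataset as a single batch."""
    yield list(features), list(labels)


def tensorflow_batches(features, labels, batch_size):
    """Stream mini-batches through tf.data (requires TensorFlow)."""
    import tensorflow as tf  # lazy import: the default path has no TF dependency

    dataset = tf.data.Dataset.from_tensor_slices((features, labels))
    return dataset.batch(batch_size)


# Hypothetical registry mapping the user-facing setting to a batch source.
PIPELINES = {"in_memory": in_memory_batches, "TensorFlow": tensorflow_batches}


def get_pipeline(pipeline="in_memory"):
    """Look up a batch source by the user-facing `pipeline` setting."""
    try:
        return PIPELINES[pipeline]
    except KeyError:
        raise ValueError("Unknown pipeline: %r" % (pipeline,))
```

Existing users would keep the in-memory behaviour by default, and only those who opt in would take on the TensorFlow dependency.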

I could invest some time playing with TensorFlow to see how we could integrate it into Healthcareai-py.

Finally, I might be wrong, so I'm throwing this out there to see if anyone corrects my statements.

# Conflicts:
#	healthcareai/advanced_supvervised_model_trainer.py
2 participants