Predict the tags for each book summary.
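The goal of this assignment is to predict a book's genres (tags) from its summary. Each book can carry several tags, so this is a multi-label classification task. The assignment is also meant to help students become familiar with applying machine learning to NLP.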
https://inclass.kaggle.com/c/ml2017-hw5
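To make the multi-label setup concrete, here is a minimal standard-library sketch of turning tag strings into multi-hot target vectors and scoring predictions with a per-sample F1. The space-separated tag format, the sample tags, and the helper names (`build_label_index`, `to_multi_hot`, `sample_f1`) are assumptions made for illustration, not the assignment's actual data format or official metric. In practice the scikit-learn `MultiLabelBinarizer` and `f1_score` listed below would likely do the same job.

```python
def build_label_index(tag_lists):
    """Map each distinct tag to a column index, in sorted order."""
    labels = sorted({tag for tags in tag_lists for tag in tags})
    return {tag: i for i, tag in enumerate(labels)}


def to_multi_hot(tags, label_index):
    """Encode one sample's tags as a 0/1 vector with one slot per known tag."""
    row = [0] * len(label_index)
    for tag in tags:
        if tag in label_index:  # unknown tags are silently ignored
            row[label_index[tag]] = 1
    return row


def sample_f1(y_true, y_pred):
    """Mean of per-sample F1 scores over multi-hot rows.

    Per sample, F1 = 2 * TP / (|true tags| + |predicted tags|).
    A sample with no true and no predicted tags scores 0.0 here
    (a convention chosen for this sketch).
    """
    scores = []
    for true_row, pred_row in zip(y_true, y_pred):
        tp = sum(1 for t, p in zip(true_row, pred_row) if t == 1 and p == 1)
        total = sum(true_row) + sum(pred_row)
        scores.append(2 * tp / total if total else 0.0)
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical raw labels: one space-separated tag string per book.
raw_tags = ["Fiction Fantasy", "Novel", "Fiction Novel"]
tag_lists = [s.split() for s in raw_tags]

label_index = build_label_index(tag_lists)
y_true = [to_multi_hot(tags, label_index) for tags in tag_lists]

# Hypothetical predictions, one tag list per book.
predicted = [["Fiction"], ["Novel"], ["Fantasy"]]
y_pred = [to_multi_hot(tags, label_index) for tags in predicted]

score = sample_f1(y_true, y_pred)
```

Each row of `y_true` has one column per tag, so a model can treat every tag as its own binary decision, which is what separates multi-label from single-label multi-class classification.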
numpy
scipy
pandas
Keras
tensorflow
scikit-learn
h5py
Cython
nltk
https://scikit-learn.org/stable/modules/generated/sklearn.feature_extraction.text.TfidfVectorizer.html
https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.MultiLabelBinarizer.html
https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.train_test_split.html
https://scikit-learn.org/stable/modules/generated/sklearn.metrics.f1_score.html
https://scikit-learn.org/stable/modules/generated/sklearn.svm.LinearSVC.html
https://scikit-learn.org/stable/modules/generated/sklearn.multiclass.OneVsRestClassifier.html
https://github.com/fchollet/keras/blob/master/examples/imdb_lstm.py
https://github.com/fchollet/keras/blob/master/tests/keras/preprocessing/text_test.py
keras-team/keras#741
https://machinelearningmastery.com/sequence-classification-lstm-recurrent-neural-networks-python-keras/
https://blog.keras.io/using-pre-trained-word-embeddings-in-a-keras-model.html
keras-team/keras#2607
https://github.com/fchollet/keras/blob/53e541f7bf55de036f4f5641bd2947b96dd8c4c3/keras/metrics.py
keras-team/keras#3977
https://keras.io/preprocessing/text/#tokenizer
https://keras.io/preprocessing/sequence/#pad_sequences
https://keras.io/layers/embeddings/#embedding
https://keras.io/layers/recurrent/#lstm
https://stackoverflow.com/questions/7961363/removing-duplicates-in-lists
https://www.quora.com/What-is-bagging-in-machine-learning
https://www.tutorialspoint.com/python/string_split.htm