yyyukeqi/Text-Classification-Model-on-Stack-Overflow

Text-Classification-Model-on-Stack-Overflow

I used a SQL query to extract the data from the public Stack Overflow dataset on Google BigQuery, which contains an archive of Stack Overflow content, including posts, votes, tags, and badges.

This project aims to predict the tags of Stack Overflow questions.

The focus is on questions carrying at least one of 5 ML-related tags: tensorflow, keras, matplotlib, pandas, scikit-learn. The approach consists of:
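The extraction step could look roughly like the sketch below. The original query is not shown here, so the table name, the selected columns, and the assumption that tags are stored as one pipe-delimited string are all guesses made for illustration; `build_query` and `keep_target_tags` are hypothetical helper names.

```python
# The five target tags named in the text.
TARGET_TAGS = ["tensorflow", "keras", "matplotlib", "pandas", "scikit-learn"]

# Assumed table name; the README only says "the public Stack Overflow dataset".
ASSUMED_TABLE = "bigquery-public-data.stackoverflow.posts_questions"


def build_query(limit=1000):
    """Return a SQL string selecting questions with at least one target tag."""
    alternatives = "|".join(TARGET_TAGS)
    # Match a target tag as a whole entry of an (assumed) pipe-delimited string.
    pattern = rf"(?:^|\|)(?:{alternatives})(?:\||$)"
    return (
        "SELECT title, body, tags\n"
        f"FROM `{ASSUMED_TABLE}`\n"
        f"WHERE REGEXP_CONTAINS(tags, r'{pattern}')\n"
        f"LIMIT {int(limit)}"
    )


def keep_target_tags(tag_string):
    """Split a pipe-delimited tag string and keep only the target tags."""
    return [tag for tag in tag_string.split("|") if tag in TARGET_TAGS]


if __name__ == "__main__":
    print(build_query(limit=10))
    print(keep_target_tags("python|pandas|keras"))
```

Dropping the non-target tags after extraction keeps the label space at exactly five classes, which is what the later encoding step relies on.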

  • Converting free-form text input into matrices using the Bag of Words model

  • Encoding tags as multi-hot arrays using Scikit-learn’s MultiLabelBinarizer

  • Building the classifier as a Keras Sequential model
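The two encoding steps in the list above can be illustrated with a minimal pure-Python sketch. It is a stand-in for the library routines the project uses, not the project's code: the whitespace tokenizer, the binary (presence/absence) variant of bag of words, the vocabulary size, and the helper names `build_vocab`, `bag_of_words`, and `multi_hot` are all assumptions.

```python
from collections import Counter

# Label classes in alphabetical order (an assumed, fixed column order).
TAGS = ["keras", "matplotlib", "pandas", "scikit-learn", "tensorflow"]


def build_vocab(texts, vocab_size):
    """Map the vocab_size most frequent words to column indices."""
    counts = Counter(word for text in texts for word in text.lower().split())
    # Sort by descending count, then alphabetically, so ties are deterministic.
    ranked = sorted(counts, key=lambda w: (-counts[w], w))
    return {word: i for i, word in enumerate(ranked[:vocab_size])}


def bag_of_words(text, vocab):
    """Return a 0/1 row marking which vocabulary words occur in the text."""
    row = [0] * len(vocab)
    for word in text.lower().split():
        if word in vocab:
            row[vocab[word]] = 1
    return row


def multi_hot(tags, classes=TAGS):
    """Return a 0/1 row with a 1 for every class present in tags."""
    return [1 if c in tags else 0 for c in classes]


# Toy data, invented for illustration only.
questions = [
    "how to plot a pandas dataframe",
    "keras model on tensorflow backend",
]
labels = [["pandas", "matplotlib"], ["keras", "tensorflow"]]

vocab = build_vocab(questions, vocab_size=6)
X = [bag_of_words(q, vocab) for q in questions]  # one row per question
Y = [multi_hot(t) for t in labels]               # one row per question

print(Y)  # [[0, 1, 1, 0, 0], [1, 0, 0, 0, 1]]
```

Each row of `X` has one column per vocabulary word and each row of `Y` has one column per tag, so a question can carry several tags at once. Matrices of this shape are what a multi-label classifier such as the Keras model would take as inputs and targets.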