Commit

fit train & test
orbxball committed May 21, 2017
1 parent 8da2fc4 commit fc66dec
Showing 1 changed file with 4 additions and 1 deletion.
hw5/tfidf_linearSVC.py: 5 changes (4 additions, 1 deletion)
@@ -62,7 +62,10 @@ def main():

### Tokenize
vectorizer = TfidfVectorizer(stop_words='english')
sequences = vectorizer.fit_transform(texts)
# vectorizer = TfidfVectorizer(stop_words='english', ngram_range=(1, 3), max_features=40000)
all_corpus = texts + test_texts
vectorizer.fit(all_corpus)
sequences = vectorizer.transform(texts)
test_data = vectorizer.transform(test_texts)

(x_train, y_train),(x_valid, y_valid) = validate(sequences, tags, valid_size)
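
The effect of fitting on `texts + test_texts` rather than on `texts` alone can be sketched on toy data. This is an illustration only, not code from the repository: the two small lists below are hypothetical stand-ins for `texts` and `test_texts`, and the names `train_only`, `combined`, and `unseen` are made up for the example. It assumes scikit-learn is installed.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical toy data standing in for `texts` and `test_texts`.
texts = ["the cat sat on the mat", "dogs chase cats"]
test_texts = ["a zebra sat quietly"]

# Previous behaviour: vocabulary and IDF weights come from the training texts only.
train_only = TfidfVectorizer(stop_words='english')
train_only.fit(texts)

# Behaviour after this commit: fit on train + test, then transform each part.
combined = TfidfVectorizer(stop_words='english')
combined.fit(texts + test_texts)
x_train = combined.transform(texts)
x_test = combined.transform(test_texts)

# Terms that occur only in the test texts now get their own columns.
unseen = set(combined.vocabulary_) - set(train_only.vocabulary_)
print(sorted(unseen))
print(x_train.shape, x_test.shape)
```

Both matrices share the same columns, so a classifier trained on `x_train` can be applied to `x_test` directly. Terms that appear only in the test texts still have all-zero columns in `x_train`, so a linear model learns no weight for them from the training rows. The combined fit mainly changes the IDF weights of shared terms and the L2 normalisation of test rows. It also means the features depend on the unlabeled test texts, which may or may not be acceptable depending on how the model is evaluated.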
