Transforms tokenized text into numeric feature vectors that machine learning models can work with
Two categories: sparse and dense
Sparse: return feature vectors that are mostly zeros (only the non-zero values need to be stored)
Dense: return feature vectors that are mostly non-zero values
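A minimal sketch of the difference, using only NumPy and SciPy (neither tool is named in these notes): a bag-of-words count vector is mostly zeros and can be stored sparsely, while an embedding is dense.

```python
# Sketch only: illustrates sparse vs. dense feature vectors, not any specific featurizer.
import numpy as np
from scipy.sparse import csr_matrix

# Sparse: bag-of-words counts over a vocabulary of 10 words -> mostly zeros.
bow = np.array([0, 2, 0, 0, 1, 0, 0, 0, 0, 1])
sparse_features = csr_matrix(bow)   # only the 3 non-zero entries are stored
print(sparse_features.nnz, "non-zero values out of", bow.size)

# Dense: a made-up 5-dimensional embedding -> mostly non-zero values.
dense_features = np.array([0.12, -0.87, 0.44, 0.05, -0.31])
print(dense_features)
```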
Requires: MitieNLP
Dense featurizer
Only the SklearnIntentClassifier can use these features
Can be used to pre-train your own word vectors (needs a huge corpus up front)
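A hedged sketch of pulling dense MITIE word vectors in Python, assuming the mitie bindings are installed and a pre-trained total_word_feature_extractor.dat file exists (the file path is a placeholder, and this is not the actual featurizer implementation):

```python
# Sketch only: assumes the mitie Python bindings and a pre-trained
# total_word_feature_extractor.dat file (the path below is a placeholder).
import numpy as np
import mitie

extractor = mitie.total_word_feature_extractor("total_word_feature_extractor.dat")

tokens = ["book", "a", "flight"]
# One dense vector per token; averaging gives a single message-level vector.
vectors = [np.array(extractor.get_feature_vector(t)) for t in tokens]
message_vector = np.mean(vectors, axis=0)
print(extractor.num_dimensions, message_vector.shape)
```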
Requires: SpacyNLP
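For the spaCy-backed case, a hedged sketch of the dense vectors spaCy itself provides, assuming the en_core_web_md model is installed (the model name is an assumption, not from these notes):

```python
# Sketch only: shows spaCy's dense word/message vectors, assuming the
# en_core_web_md model was downloaded (python -m spacy download en_core_web_md).
import spacy

nlp = spacy.load("en_core_web_md")
doc = nlp("book a flight to Berlin")

print(doc.vector.shape)                  # one dense vector for the whole message
for token in doc:
    print(token.text, token.vector[:3])  # per-token dense vectors (first 3 dims shown)
```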
Requires: ConveRTTokenizer
Short training time
Does not fine-tune the pre-trained model's parameters
For components that need intent AND response features:
e.g. EmbeddingIntentClassifier
e.g. ResponseSelector
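A hedged sketch of how these pieces would be ordered, written as plain Python data rather than a real configuration file; the featurizer name ConveRTFeaturizer is an assumption, while the other component names come from the notes above:

```python
# Sketch only: component ordering as plain Python data; the real configuration
# format and per-component options are not covered here.
pipeline = [
    {"name": "ConveRTTokenizer"},           # tokenizer required by the featurizer
    {"name": "ConveRTFeaturizer"},          # assumed name of the dense featurizer
    {"name": "EmbeddingIntentClassifier"},  # consumes intent features
    {"name": "ResponseSelector"},           # consumes response features
]

for component in pipeline:
    print(component["name"])
```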
No requirements
Only supported by the CRFEntityExtractor
Sparse featurizer
Bag-of-words representations (intent and response)
To fine-tune: see sklearn.feature_extraction
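Since the notes point at sklearn.feature_extraction, a small sketch of the sparse bag-of-words features a CountVectorizer produces (the example messages are made up):

```python
# Sketch only: sparse bag-of-words features via scikit-learn's CountVectorizer.
from sklearn.feature_extraction.text import CountVectorizer

messages = ["book a flight", "book a table for two"]

vectorizer = CountVectorizer()
features = vectorizer.fit_transform(messages)   # SciPy sparse matrix

print(vectorizer.get_feature_names_out())       # learned vocabulary
print(features.toarray())                       # mostly zeros -> sparse
```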