Improving NLU Model
Pipelines
1. supervised_embeddings
2. pretrained_embeddings_convert
3. pretrained_embeddings_spacy
supervised_embeddings
Uses whitespace for tokenization.
Default Components:
language: "en"
pipeline:
- name: "WhitespaceTokenizer"
- name: "RegexFeaturizer"
- name: "CRFEntityExtractor"
- name: "EntitySynonymMapper"
- name: "CountVectorsFeaturizer"
- name: "CountVectorsFeaturizer"
  analyzer: "char_wb"
  min_ngram: 1
  max_ngram: 4
- name: "EmbeddingIntentClassifier"
E.g. if the chosen language is not whitespace-tokenized, replace WhitespaceTokenizer with your own tokenizer.
Note: this pipeline uses two CountVectorsFeaturizer components.
1st one: featurizes text based on words.
2nd one: featurizes text based on character n-grams, preserving word boundaries.
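To make the difference between the two featurizers concrete, here is a minimal pure-Python sketch. It is not Rasa's implementation: the function names are made up for illustration, and the behaviour (lowercasing, padding each word with a space before slicing) is an assumption modelled on how scikit-learn's "char_wb" analyzer is documented to work.

```python
from collections import Counter


def word_counts(text):
    """Bag-of-words counts over whitespace tokens (the word-level view)."""
    return Counter(text.lower().split())


def char_wb_ngrams(text, min_ngram=1, max_ngram=4):
    """Character n-gram counts taken inside word boundaries only.

    Each word is padded with one space on both sides, so n-grams can
    touch a boundary but never span two different words.
    """
    grams = []
    for word in text.lower().split():
        padded = f" {word} "  # the padding marks the word boundary
        for n in range(min_ngram, max_ngram + 1):
            # Slide a window of width n over the padded word.
            for start in range(len(padded) - n + 1):
                grams.append(padded[start:start + n])
    return Counter(grams)


print(word_counts("book a table"))
print(sorted(char_wb_ngrams("hi", min_ngram=2, max_ngram=3)))
# → [' h', ' hi', 'hi', 'hi ', 'i ']
```

The character-level view is what lets the classifier cope with typos and unseen word forms: "tabel" and "table" share most of their n-grams even though they are different words.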
pretrained_embeddings_convert
Uses the pretrained sentence encoding model ConveRT to extract a vector representation of the complete user utterance as a whole.
language: "en"
pipeline:
- name: "WhitespaceTokenizer"
- name: "ConveRTFeaturizer"
- name: "EmbeddingIntentClassifier"
pretrained_embeddings_spacy
Uses pre-trained word vectors from either GloVe or fastText.
language: "en"
pipeline:
- name: "SpacyNLP"
- name: "SpacyTokenizer"
- name: "SpacyFeaturizer"
- name: "RegexFeaturizer"
- name: "CRFEntityExtractor"
- name: "EntitySynonymMapper"
- name: "SklearnIntentClassifier"
MITIE
Requires word vectors trained from your own corpus.
language: "en"
pipeline:
- name: "MitieNLP"
  model: "data/total_word_feature_extractor.dat"
- name: "MitieTokenizer"
- name: "MitieEntityExtractor"
- name: "EntitySynonymMapper"
- name: "RegexFeaturizer"
- name: "MitieFeaturizer"
- name: "SklearnIntentClassifier"
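Each pipeline above is an ordered list of components, so the advice given for supervised_embeddings (replace WhitespaceTokenizer when the language is not whitespace-tokenized) amounts to swapping one entry in that list. The sketch below shows this on plain Python data. The helper replace_component and the component name "MyTokenizer" are hypothetical placeholders, not part of Rasa.

```python
import copy


def replace_component(pipeline, old_name, new_name):
    """Return a copy of the pipeline with one component name swapped."""
    swapped = copy.deepcopy(pipeline)  # leave the original list untouched
    for component in swapped:
        if component["name"] == old_name:
            component["name"] = new_name
    return swapped


# Shortened version of the supervised_embeddings pipeline.
supervised = [
    {"name": "WhitespaceTokenizer"},
    {"name": "CountVectorsFeaturizer"},
    {"name": "CountVectorsFeaturizer", "analyzer": "char_wb",
     "min_ngram": 1, "max_ngram": 4},
    {"name": "EmbeddingIntentClassifier"},
]

# "MyTokenizer" is a hypothetical stand-in for your own tokenizer.
custom = replace_component(supervised, "WhitespaceTokenizer", "MyTokenizer")
print([component["name"] for component in custom])
```

Only the tokenizer entry changes. The featurizers and the classifier stay in the same order, which matters because each component consumes what the previous ones produced.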