Optimizing Entity Recognition
SpaCy (SpacyEntityExtractor) - ner_spacy
Pretrained entity extractors
Statistical BILOU transition model
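To make the BILOU scheme concrete, here is a minimal sketch of how token-level tags could be derived from entity spans (B = begin, I = inside, L = last, O = outside, U = unit-length entity). The function name `bilou_tags`, the span format, and the example sentence are assumptions made for illustration; this is not spaCy's internal code.

```python
from typing import List, Tuple


def bilou_tags(tokens: List[str], entities: List[Tuple[int, int, str]]) -> List[str]:
    """Return one BILOU tag per token.

    `entities` holds (start_token, end_token_exclusive, label) triples.
    This span format is an assumption made for the sketch.
    """
    tags = ["O"] * len(tokens)  # every token starts as "outside"
    for start, end, label in entities:
        if end - start == 1:
            # Single-token entity gets the "unit" tag
            tags[start] = f"U-{label}"
        else:
            tags[start] = f"B-{label}"        # first token of the entity
            for i in range(start + 1, end - 1):
                tags[i] = f"I-{label}"        # tokens strictly inside
            tags[end - 1] = f"L-{label}"      # last token of the entity
    return tags


tokens = ["Book", "a", "flight", "to", "New", "York", "City", "from", "Berlin"]
entities = [(4, 7, "GPE"), (8, 9, "GPE")]
for token, tag in zip(tokens, bilou_tags(tokens, entities)):
    print(f"{token}\t{tag}")
```

A transition model then predicts these tags token by token, with the scheme ruling out invalid sequences such as an `I-` tag directly after `O`.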
Duckling
Number-related information (dates, distances, durations)
Run the Duckling server from a Docker image (ner_duckling)
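Once a Duckling server is running, it can be queried over HTTP. The sketch below assumes the server listens on `localhost:8000` and exposes a `/parse` endpoint that accepts form-encoded `locale` and `text` fields (the defaults of the open-source Duckling server, as far as I know). The helper names `build_duckling_request` and `extract_with_duckling` are hypothetical; adjust the URL to your own setup.

```python
import json
from urllib import parse, request

# Assumed address of a locally running Duckling server
DUCKLING_URL = "http://localhost:8000/parse"


def build_duckling_request(text: str, locale: str = "en_US",
                           url: str = DUCKLING_URL) -> request.Request:
    """Build (but do not send) a form-encoded POST request for Duckling."""
    payload = parse.urlencode({"locale": locale, "text": text}).encode("utf-8")
    return request.Request(url, data=payload, method="POST")


def extract_with_duckling(text: str, locale: str = "en_US",
                          url: str = DUCKLING_URL):
    """Send the request and decode the JSON response.

    Requires a reachable Duckling server; not called in this sketch.
    """
    with request.urlopen(build_duckling_request(text, locale, url), timeout=5) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Inspect the request without touching the network
req = build_duckling_request("remind me tomorrow at 8pm")
print(req.get_method(), req.full_url)
print(req.data)
```

Calling `extract_with_duckling("remind me tomorrow at 8pm")` against a live server should return the parsed entities as JSON.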
NER_CRF (CRFEntityExtractor)
Trained from scratch
Need to annotate training data yourself
Annotate training examples EVERYWHERE in training data (even if entity is not relevant for intent)
Use of lookup tables makes ner_crf prone to overfitting
If the training data always matches a regex or lookup table, ner_crf learns to ignore its other features, so an entity in a message that is not matched by the regex or lookup table will not be detected
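The "annotate everywhere" advice above can be checked mechanically. The following sketch assumes training examples written in the markdown-style `[value](entity)` annotation format; it collects every annotated value and flags examples where the same value appears without an annotation. The helper names `parse_annotated` and `find_missing_annotations` and the sample sentences are hypothetical, and this is not part of Rasa itself.

```python
import re
from typing import Dict, List, Tuple

# Matches markdown-style annotations such as [Berlin](city)
ANNOTATION = re.compile(r"\[(?P<value>[^\]]+)\]\((?P<entity>[^)]+)\)")


def parse_annotated(example: str) -> Tuple[str, List[Dict[str, object]]]:
    """Strip annotations and return (plain_text, entity_spans)."""
    parts: List[str] = []
    entities: List[Dict[str, object]] = []
    cursor = 0  # position in the annotated string
    offset = 0  # length of the plain text built so far
    for m in ANNOTATION.finditer(example):
        before = example[cursor:m.start()]
        parts.append(before)
        offset += len(before)
        value = m.group("value")
        entities.append({"start": offset, "end": offset + len(value),
                         "value": value, "entity": m.group("entity")})
        parts.append(value)
        offset += len(value)
        cursor = m.end()
    parts.append(example[cursor:])
    return "".join(parts), entities


def find_missing_annotations(examples: List[str]) -> List[Tuple[str, str, str]]:
    """Return (text, value, entity) for known values left unannotated."""
    parsed = [parse_annotated(e) for e in examples]
    known = {str(ent["value"]).lower(): str(ent["entity"])
             for _, ents in parsed for ent in ents}
    missing = []
    for text, ents in parsed:
        covered = [(ent["start"], ent["end"]) for ent in ents]
        for value, entity in known.items():
            pattern = r"\b" + re.escape(value) + r"\b"
            for m in re.finditer(pattern, text, flags=re.IGNORECASE):
                # Report the match only if no annotated span covers it
                if not any(s <= m.start() and m.end() <= e for s, e in covered):
                    missing.append((text, value, entity))
    return missing


examples = [
    "book a table in [Berlin](city)",
    "hello from Berlin",               # entity present but not annotated
    "is it sunny in [Paris](city)",
]
print(find_missing_annotations(examples))
```

Exact string matching only catches values already annotated somewhere else, so this is a sanity check rather than a replacement for reviewing the data.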
Common Problems
Unseen entity values are not recognized
Could be:
Lack of training data
Overfitting of ner_crf (try training the model without regex features or lookup tables)
Map extracted entity to different value
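Mapping an extracted entity to a different value usually means normalizing surface forms to one canonical value after extraction. Here is a minimal sketch of that idea, assuming entities are dictionaries with a `value` key; the `SYNONYMS` table and the function `map_synonyms` are hypothetical and independent of any Rasa component.

```python
from typing import Dict, List

# Hypothetical synonym table: lowercase surface form -> canonical value
SYNONYMS: Dict[str, str] = {
    "nyc": "New York City",
    "new york": "New York City",
    "the big apple": "New York City",
}


def map_synonyms(entities: List[Dict[str, object]],
                 synonyms: Dict[str, str]) -> List[Dict[str, object]]:
    """Return copies of the entities with values replaced by canonical forms."""
    mapped = []
    for ent in entities:
        key = str(ent["value"]).lower()
        # Keep the original value when no synonym is registered
        canonical = synonyms.get(key, ent["value"])
        mapped.append({**ent, "value": canonical})
    return mapped


extracted = [
    {"entity": "city", "value": "NYC"},
    {"entity": "city", "value": "Berlin"},
]
print(map_synonyms(extracted, SYNONYMS))
```

The lookup is case-insensitive on the extracted value, and unknown values pass through unchanged, so the mapping can be applied to every extracted entity.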