Language Processing Pipelines
When you call nlp on a text:
Pipeline:
spaCy first tokenizes the text and produces a Doc object, which is then passed through each of the next steps in turn:
Tagger
Assigns POS labels
Parser
Assigns dependency labels
NER (entity recognizer)
Detects and labels named entities
The final Doc object is returned once every component has processed it
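To make the flow above concrete, here is a minimal pure-Python sketch of the idea: a tokenizer creates a document object, and each component annotates it and hands it on. This is not spaCy's implementation. The `Doc` class, the rule-based `tagger`, `parser`, and `ner` functions, and the `PIPELINE` list are all hypothetical stand-ins for the statistical components.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Toy stand-in for a Doc: the tokens plus annotations added by each step."""
    text: str
    tokens: list
    annotations: dict = field(default_factory=dict)


def tokenizer(text):
    # Whitespace splitting only; a real tokenizer also handles punctuation etc.
    return Doc(text=text, tokens=text.split())


def tagger(doc):
    # Hypothetical rule: capitalized tokens are "PROPN", everything else "X".
    doc.annotations["pos"] = ["PROPN" if t[:1].isupper() else "X" for t in doc.tokens]
    return doc


def parser(doc):
    # Placeholder labels: first token is "ROOT", the rest are "dep".
    doc.annotations["dep"] = ["ROOT" if i == 0 else "dep" for i in range(len(doc.tokens))]
    return doc


def ner(doc):
    # Toy rule that reuses the tagger's output: every PROPN token is an entity.
    pairs = zip(doc.tokens, doc.annotations["pos"])
    doc.annotations["ents"] = [tok for tok, pos in pairs if pos == "PROPN"]
    return doc


# Components run in order; each one receives the Doc returned by the previous one.
PIPELINE = [("tagger", tagger), ("parser", parser), ("ner", ner)]


def nlp(text):
    doc = tokenizer(text)
    for _name, component in PIPELINE:
        doc = component(doc)
    return doc


doc = nlp("Apple is looking at buying a startup in London")
print(doc.annotations["ents"])  # ['Apple', 'London']
```

The order matters in this sketch because `ner` reads what `tagger` wrote. In spaCy itself, the component names of a loaded pipeline can be inspected with `nlp.pipe_names`.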
Training
Gather training data and evaluation data, i.e. examples of text together with their labels (could be POS tags, named entities, etc.)
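The model (a statistical model) is shown the unlabelled text and makes a prediction
Since we know the correct answer, we can compute a loss function that measures how far the prediction is from the label, and use it as feedback to update the model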
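The training steps above can be sketched as a small loop: predict from the text alone, compare with the known label through a loss, then nudge the model. This is a toy bag-of-words logistic regression in plain Python, not how spaCy trains its components. The data, the `predict`/`loss`/`train` helpers, and the learning rate are all assumptions made up for illustration.

```python
import math

# Hypothetical labelled examples: 1 = text is about travel, 0 = it is not.
TRAIN = [
    ("flight to London", 1),
    ("train to Paris", 1),
    ("eat an apple", 0),
    ("read a book", 0),
]
# Held-out evaluation data, never used for updates.
EVAL = [("bus to London", 1), ("eat a book", 0)]

weights = {}  # one weight per word; unseen words count as 0.0


def predict(text):
    # The model only sees the text, never the label.
    score = sum(weights.get(word, 0.0) for word in text.split())
    return 1.0 / (1.0 + math.exp(-score))  # probability of label 1


def loss(prob, label):
    # Cross-entropy: small when the predicted probability matches the label.
    eps = 1e-12
    return -(label * math.log(prob + eps) + (1 - label) * math.log(1 - prob + eps))


def train(epochs=50, lr=0.5):
    history = []
    for _ in range(epochs):
        total = 0.0
        for text, label in TRAIN:
            prob = predict(text)          # 1. make a prediction
            total += loss(prob, label)    # 2. compare with the known answer
            for word in text.split():     # 3. gradient step on each word weight
                weights[word] = weights.get(word, 0.0) - lr * (prob - label)
        history.append(total)
    return history


history = train()
accuracy = sum((predict(t) >= 0.5) == bool(y) for t, y in EVAL) / len(EVAL)
print(f"loss: {history[0]:.3f} -> {history[-1]:.3f}, eval accuracy: {accuracy}")
```

The evaluation examples are kept separate so that accuracy reflects how the model handles text it was not updated on, which is the reason for gathering both kinds of data.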