Below is a comprehensive list of libraries, functions, methods, and techniques commonly used in text and natural language processing (NLP):
I. Libraries
1. NLTK (Natural Language Toolkit):
- nltk.word_tokenize(): Tokenizes a text into words.
- nltk.sent_tokenize(): Tokenizes a text into sentences.
- nltk.pos_tag(): Tags parts of speech for a list of tokens (for example, a tokenized sentence).
- nltk.FreqDist(): Computes the frequency distribution of words.
2. spaCy:
- Tokenization: doc = nlp(text), where nlp is a loaded spaCy pipeline and doc is the processed text, a sequence of tokens.
- Part-of-speech tagging: token.pos_ provides the part of speech of a token.
- Named Entity Recognition (NER): ent.text provides the text of a named entity, for each ent in doc.ents.
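The three spaCy attributes above can be read from a single processed text. The sketch below assumes the small English model en_core_web_sm is installed. If it is not, the code falls back to a blank English pipeline, which still tokenizes but leaves the part-of-speech tags empty and finds no entities. The sample sentence is made up.

```python
import spacy

# Assumption: the small English model "en_core_web_sm" is installed
# (python -m spacy download en_core_web_sm). If it is missing, fall back to a
# blank English pipeline, which tokenizes but assigns no tags or entities.
try:
    nlp = spacy.load("en_core_web_sm")
except OSError:
    nlp = spacy.blank("en")

# Made-up sample text for illustration.
text = "Apple is looking at buying a startup in London."
doc = nlp(text)

tokens = [token.text for token in doc]                   # tokenization
pos_tags = [(token.text, token.pos_) for token in doc]   # part-of-speech tags
entities = [(ent.text, ent.label_) for ent in doc.ents]  # named entities

print(tokens)
print(pos_tags)
print(entities)
```

The tags and entities printed depend on which model is loaded, so treat the output as model-specific rather than fixed.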
3. TextBlob:
- TextBlob(text): Creates a TextBlob object for text processing.
- blob.words, blob.sentences: Accesses words and sentences in a TextBlob.
- blob.noun_phrases: Extracts noun phrases.
- blob.tags: Tags parts of speech.