pycantonese.CHAT.search

CHAT.search(*, onset=None, nucleus=None, coda=None, tone=None, initial=None, final=None, jyutping=None, character=None, pos=None, word_range=(0, 0), utterance_range=(0, 0), by_token=True, by_utterance=False, by_file=False)

Search the data for the given criteria.

Parameters:

onset (str, optional): Onset to search for. A regex is supported.
nucleus (str, optional): Nucleus to search for. A regex is supported.
coda (str, optional): Coda to search for. A regex is supported.
tone (str, optional): Tone to search for. A regex is supported.
initial (str, optional): Initial to search for. A regex is supported.
final (str, optional): Final to search for.
jyutping (str, optional): Jyutping romanization of one Cantonese character to search for.
character (str, optional): One or more Cantonese characters to search for.
pos (str, optional): A part-of-speech tag to search for. A regex is supported.
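word_range (tuple[int, int], optional): Number of words to the left and right of a matching word to include in the output. Default is (0, 0), which returns only the match.
utterance_range (tuple[int, int], optional): Number of utterances before and after the utterance containing a match to include in the output. Default is (0, 0), which adds no surrounding utterances.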
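by_token (bool, optional): If True (the default), return Token objects. Otherwise, return words as plain strings.
by_utterance (bool, optional): If True, return the full utterances containing matches rather than only the matching words. Default is False.
by_file (bool, optional): If True, return the results organized by file. Default is False.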

Returns:

list
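
The interaction between a regex criterion such as pos and a context window such as word_range can be illustrated with a small standalone sketch. This is not the library's implementation: toy_search and the sample utterance are hypothetical names and data invented for illustration, and the sketch assumes that a regex must match the whole tag and that word_range means (words to the left, words to the right).

```python
import re

# Hypothetical toy data: one utterance as (word, jyutping, pos) triples.
# This is illustrative only and is not real corpus output.
utterance = [
    ("我", "ngo5", "PRON"),
    ("想", "soeng2", "VERB"),
    ("食", "sik6", "VERB"),
    ("飯", "faan6", "NOUN"),
]


def toy_search(tokens, pos=None, word_range=(0, 0)):
    """Return one list of words per match, widened by word_range.

    Assumptions (not confirmed by the documentation above):
    - `pos` is a regex that must match the entire tag (re.fullmatch).
    - `word_range` is (words to the left, words to the right).
    """
    left, right = word_range
    results = []
    for i, (_word, _jyutping, tag) in enumerate(tokens):
        # Skip tokens whose tag does not match the requested pattern.
        if pos is not None and not re.fullmatch(pos, tag):
            continue
        # Clamp the window to the utterance boundaries.
        start = max(0, i - left)
        end = min(len(tokens), i + right + 1)
        results.append([word for word, _, _ in tokens[start:end]])
    return results


# Each verb together with one word of context on either side.
verbs_in_context = toy_search(utterance, pos="VERB", word_range=(1, 1))

# A regex criterion: any tag starting with "N", with no extra context.
nouns_only = toy_search(utterance, pos="N.*")
```

With the default word_range of (0, 0) the window collapses to the matching word itself, which mirrors the documented default of returning only the match.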