pycantonese.CHAT.search

CHAT.search(*, onset=None, nucleus=None, coda=None, tone=None, initial=None, final=None, jyutping=None, character=None, pos=None, word_range=(0, 0), utterance_range=(0, 0), by_token=True, by_utterance=False, by_file=False)

Search the data for the given criteria.

Parameters:

onset (str, optional) – Onset to search for. A regex is supported.
nucleus (str, optional) – Nucleus to search for. A regex is supported.
coda (str, optional) – Coda to search for. A regex is supported.
tone (str, optional) – Tone to search for. A regex is supported.
initial (str, optional) – Initial to search for. A regex is supported. An initial is the equivalent of an onset.
final (str, optional) – Final to search for. A final is the equivalent of a nucleus plus a coda.
jyutping (str, optional) – Jyutping romanization of one Cantonese character to search for.
character (str, optional) – One or more Cantonese characters to search for.
pos (str, optional) – A part-of-speech tag to search for. A regex is supported.
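word_range (tuple[int, int], optional) – Number of words to the left and right of a matching word to include in the output. The default, (0, 0), includes no surrounding words.
utterance_range (tuple[int, int], optional) – Number of utterances before and after the utterance containing a match to include in the output. The default, (0, 0), includes no surrounding utterances.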
by_token (bool, optional) – If True (the default), return matching words as Token objects. Otherwise, return them as plain word strings.
by_utterance (bool, optional) – If True (the default is False), return the full utterances containing matches.
by_file (bool, optional) – If True (the default is False), return the results organized by file.
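
To make the regex criteria and word_range concrete, here is a minimal pure-Python sketch. It is not the library's implementation: the helper search_onset, the toy (word, onset) pairs, the nested-list output shape, and the use of full-string matching via re.fullmatch are all assumptions made for illustration.

```python
import re


def search_onset(tagged_words, onset, word_range=(0, 0)):
    """Return each word whose onset fully matches the regex, with context.

    tagged_words: list of (word, onset) pairs (toy data, hypothetical).
    word_range: (left, right) number of neighbouring words to include.
    """
    left, right = word_range
    words = [word for word, _ in tagged_words]
    results = []
    for i, (_, word_onset) in enumerate(tagged_words):
        if re.fullmatch(onset, word_onset):
            # Clamp the window so it never runs past the utterance edges.
            start = max(0, i - left)
            end = min(len(words), i + right + 1)
            results.append(words[start:end])
    return results


# Toy utterance: 你 (nei5), 食 (sik6), 咗 (zo2), 飯 (faan6), 未 (mei6)
utterance = [("你", "n"), ("食", "s"), ("咗", "z"), ("飯", "f"), ("未", "m")]

print(search_onset(utterance, "n|m"))                    # [['你'], ['未']]
print(search_onset(utterance, "f", word_range=(1, 1)))   # [['咗', '飯', '未']]
```

With the default word_range of (0, 0), only the matching word itself is kept; (1, 1) adds one word on each side.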

Returns:

list
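
A typical call can be sketched as follows. The helper name words_with_onset is hypothetical, and corpus is assumed to be a CHAT reader object exposing the search method documented above; the keyword names follow the signature on this page. Because of the leading * in the signature, every criterion must be passed by keyword.

```python
def words_with_onset(corpus, onset_regex, context=(0, 0)):
    """Search a CHAT reader for words whose onset matches a regex.

    corpus: assumed to be a CHAT reader with the search() method above.
    onset_regex: regex for the onset, e.g. "n|l".
    context: (left, right) word span passed through as word_range.
    """
    return corpus.search(
        onset=onset_regex,
        word_range=context,
        by_token=False,  # ask for plain word strings instead of Token objects
    )
```

Several criteria can be combined in one call in the same way, for example by passing both tone and pos.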