parakeet.frontend package
Subpackages
Submodules
parakeet.frontend.phonectic module
- class parakeet.frontend.phonectic.Chinese
Bases: parakeet.frontend.phonectic.Phonetics
Normalize a Chinese text sequence and convert it into ids.
- numericalize(phonemes)
Convert a pronunciation sequence into a pronunciation id sequence.
- Parameters
- phonemes: List[str]
The pronunciation sequence.
- Returns
- List[int]
The pronunciation id sequence.
- phoneticize(sentence)
Normalize the input text sequence and convert it into a pronunciation sequence.
- Parameters
- sentence: str
The input text sequence.
- Returns
- List[str]
The pronunciation sequence.
- reverse(ids)
Convert a pronunciation id sequence back into a pronunciation sequence.
- Parameters
- ids: List[int]
The pronunciation id sequence.
- Returns
- List[str]
The pronunciation sequence.
- property vocab_size
The vocabulary size.
- class parakeet.frontend.phonectic.English
Bases: parakeet.frontend.phonectic.Phonetics
Normalize the input text sequence and convert it into a pronunciation id sequence.
- numericalize(phonemes)
Convert a pronunciation sequence into a pronunciation id sequence.
- Parameters
- phonemes: List[str]
The pronunciation sequence.
- Returns
- List[int]
The pronunciation id sequence.
- phoneticize(sentence)
Normalize the input text sequence and convert it into a pronunciation sequence.
- Parameters
- sentence: str
The input text sequence.
- Returns
- List[str]
The pronunciation sequence.
- reverse(ids)
Convert a pronunciation id sequence back into a pronunciation sequence.
- Parameters
- ids: List[int]
The pronunciation id sequence.
- Returns
- List[str]
The pronunciation sequence.
- property vocab_size
The vocabulary size.
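The three methods above form a round trip: text to pronunciations, pronunciations to ids, and ids back to pronunciations. The following is a minimal sketch of that interface only. `ToyPhonemeFrontend`, its hand-written `LEXICON`, and its symbol table are hypothetical stand-ins invented for illustration; they are not Parakeet's implementation, which presumably relies on a real grapheme-to-phoneme step.

```python
from typing import Dict, List


class ToyPhonemeFrontend:
    """Hypothetical stand-in that mimics the documented method names."""

    # Assumption: a tiny hand-written lexicon replaces a real G2P model.
    LEXICON: Dict[str, List[str]] = {
        "hello": ["HH", "AH", "L", "OW"],
        "world": ["W", "ER", "L", "D"],
    }

    def __init__(self) -> None:
        phonemes = sorted({p for ps in self.LEXICON.values() for p in ps})
        # Assumption: special symbols and the word separator come first.
        self.symbols: List[str] = ["<pad>", "<unk>", " "] + phonemes
        self.stoi: Dict[str, int] = {s: i for i, s in enumerate(self.symbols)}

    def phoneticize(self, sentence: str) -> List[str]:
        """Normalize the text (lowercase, split on whitespace) and look up phonemes."""
        result: List[str] = []
        for position, word in enumerate(sentence.lower().split()):
            if position > 0:
                result.append(" ")  # keep word boundaries in the sequence
            result.extend(self.LEXICON.get(word, ["<unk>"]))
        return result

    def numericalize(self, phonemes: List[str]) -> List[int]:
        """Map each phoneme to its id, falling back to the <unk> id."""
        unk_id = self.stoi["<unk>"]
        return [self.stoi.get(p, unk_id) for p in phonemes]

    def reverse(self, ids: List[int]) -> List[str]:
        """Map each id back to its phoneme."""
        return [self.symbols[i] for i in ids]

    @property
    def vocab_size(self) -> int:
        return len(self.symbols)


frontend = ToyPhonemeFrontend()
phonemes = frontend.phoneticize("Hello world")
ids = frontend.numericalize(phonemes)
restored = frontend.reverse(ids)
```

In this sketch `restored` equals `phonemes`, since every phoneme produced by the lexicon is in the symbol table. A word missing from the lexicon would come back as `<unk>` rather than as its original spelling.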
- class parakeet.frontend.phonectic.EnglishCharacter
Bases: parakeet.frontend.phonectic.Phonetics
Normalize the input text sequence and convert it into a character id sequence.
- numericalize(sentence)
Convert a text sequence into a character id sequence.
- Parameters
- sentence: str
The input text sequence.
- Returns
- List[int]
The character id sequence.
- phoneticize(sentence)
Normalize the input text sequence.
- Parameters
- sentence: str
The input text sequence.
- Returns
- str
The normalized text sequence.
- reverse(ids)
Convert a character id sequence back into text.
- Parameters
- ids: List[int]
The character id sequence.
- Returns
- str
The text sequence recovered from the ids.
- property vocab_size
The vocabulary size.
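Unlike the phoneme-based classes, the character-level front end keeps text as a `str`: `phoneticize` only normalizes, and `numericalize` works on characters. A minimal sketch of that behaviour follows. `ToyCharacterFrontend`, its character set, and its normalization rules (lowercasing and whitespace collapsing) are assumptions made for illustration and may differ from what Parakeet actually does.

```python
import re
import string
from typing import Dict, List


class ToyCharacterFrontend:
    """Hypothetical stand-in for a character-level text front end."""

    def __init__(self) -> None:
        # Assumption: lowercase letters, space, and a little punctuation.
        characters = list(string.ascii_lowercase + " .,!?'")
        self.symbols: List[str] = ["<pad>", "<unk>"] + characters
        self.stoi: Dict[str, int] = {s: i for i, s in enumerate(self.symbols)}

    def phoneticize(self, sentence: str) -> str:
        """Normalize only: trim, lowercase, and collapse runs of whitespace."""
        return re.sub(r"\s+", " ", sentence.strip().lower())

    def numericalize(self, sentence: str) -> List[int]:
        """Map each character to its id, falling back to the <unk> id."""
        unk_id = self.stoi["<unk>"]
        return [self.stoi.get(ch, unk_id) for ch in sentence]

    def reverse(self, ids: List[int]) -> str:
        """Join the symbols for the given ids back into a string."""
        return "".join(self.symbols[i] for i in ids)

    @property
    def vocab_size(self) -> int:
        return len(self.symbols)


char_frontend = ToyCharacterFrontend()
normalized = char_frontend.phoneticize("  Hello,   World! ")
char_ids = char_frontend.numericalize(normalized)
text = char_frontend.reverse(char_ids)
```

Here `reverse` recovers the normalized text, not the raw input: the original casing and spacing are lost during normalization.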
parakeet.frontend.vocab module
- class parakeet.frontend.vocab.Vocab(symbols: Iterable[str], padding_symbol='<pad>', unk_symbol='<unk>', start_symbol='<s>', end_symbol='</s>')
Bases: object
Vocabulary.
- Parameters
- symbols: Iterable[str]
Common symbols.
- padding_symbol: str, optional
Symbol for padding. Defaults to "<pad>".
- unk_symbol: str, optional
Symbol for unknown tokens. Defaults to "<unk>".
- start_symbol: str, optional
Symbol for the start of a sequence. Defaults to "<s>".
- end_symbol: str, optional
Symbol for the end of a sequence. Defaults to "</s>".
- property end_index
The index of the end symbol.
- property num_specials
The number of special symbols.
- property padding_index
The index of the padding symbol.
- property start_index
The index of the start symbol.
- property unk_index
The index of the unknown symbol.
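To make the index properties concrete, here is a small sketch of a vocabulary with the same constructor arguments and property names. `ToyVocab` and its `lookup` helper are hypothetical. In particular, placing the special symbols at the lowest indices in the order padding, unknown, start, end is an assumption for illustration, not something this page confirms about `Vocab`.

```python
from typing import Dict, Iterable, List


class ToyVocab:
    """Hypothetical stand-in for a vocabulary with special symbols."""

    def __init__(
        self,
        symbols: Iterable[str],
        padding_symbol: str = "<pad>",
        unk_symbol: str = "<unk>",
        start_symbol: str = "<s>",
        end_symbol: str = "</s>",
    ) -> None:
        self.padding_symbol = padding_symbol
        self.unk_symbol = unk_symbol
        self.start_symbol = start_symbol
        self.end_symbol = end_symbol
        # Assumption: special symbols are indexed first, in this order.
        self.specials: List[str] = [
            padding_symbol, unk_symbol, start_symbol, end_symbol
        ]
        self.stoi: Dict[str, int] = {}
        for symbol in self.specials + list(symbols):
            if symbol not in self.stoi:  # skip duplicates, keep first index
                self.stoi[symbol] = len(self.stoi)

    @property
    def num_specials(self) -> int:
        return len(self.specials)

    @property
    def padding_index(self) -> int:
        return self.stoi[self.padding_symbol]

    @property
    def unk_index(self) -> int:
        return self.stoi[self.unk_symbol]

    @property
    def start_index(self) -> int:
        return self.stoi[self.start_symbol]

    @property
    def end_index(self) -> int:
        return self.stoi[self.end_symbol]

    def lookup(self, symbol: str) -> int:
        """Hypothetical helper: return the symbol's index or the unknown index."""
        return self.stoi.get(symbol, self.unk_index)


vocab = ToyVocab(["a", "b", "c"])
```

Under these assumptions the common symbols start right after the specials, so `vocab.lookup("a")` is `vocab.num_specials`, and any symbol outside the vocabulary maps to `vocab.unk_index`.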