paddlespeech.t2s.datasets.preprocess_utils module
- paddlespeech.t2s.datasets.preprocess_utils.compare_duration_and_mel_length(sentences, utt, mel)
Check the durations of an utterance against its mel length. If they disagree, correct sentences[utt] where possible; otherwise pop sentences[utt].
Args:
    sentences (Dict): sentences[utt] = [phones_list, durations_list]
    utt (str): utt_id
    mel (np.ndarray): features (num_frames, n_mels)
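The check described above can be sketched roughly as follows. This is a minimal illustration, not the library's implementation: the helper name `reconcile_durations` and the correction policy (absorb the mismatch in the last duration, drop the utterance if that would make it non-positive) are assumptions. A nested list stands in for the `(num_frames, n_mels)` array, since `len()` returns the number of frames for both.

```python
def reconcile_durations(sentences, utt, mel):
    """Hypothetical helper: make sum(durations) equal the number of mel frames.

    Assumed policy (not taken from the library): push the whole mismatch into
    the last duration; if that would leave it non-positive, drop the utterance.
    """
    if utt not in sentences:
        return
    durations = sentences[utt][1]
    # len(mel) is num_frames for a (num_frames, n_mels) array or nested list.
    diff = len(mel) - sum(durations)
    if diff == 0:
        return
    if durations[-1] + diff > 0:
        durations[-1] += diff
    else:
        sentences.pop(utt)


sentences = {
    "utt_a": [["sil", "n", "i", "sil"], [3, 4, 5, 2]],  # durations sum to 14
    "utt_b": [["sil", "a"], [10, 1]],                   # durations sum to 11
}

# 16 frames: two more than the durations, so the last duration grows by 2.
mel_a = [[0.0] * 80 for _ in range(16)]
# 5 frames: six fewer than the durations, which the last duration cannot absorb.
mel_b = [[0.0] * 80 for _ in range(5)]

reconcile_durations(sentences, "utt_a", mel_a)
reconcile_durations(sentences, "utt_b", mel_b)
```

After these calls, `utt_a` has durations that sum to 16 and `utt_b` has been removed from `sentences`.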
- paddlespeech.t2s.datasets.preprocess_utils.get_input_token(sentence, output_path, dataset='baker')
Get the phone set from the training data and save it.
Args:
    sentence (Dict): {'utt': ([char], [int])}
    output_path (str or Path): path to save phone_id_map
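As a rough illustration of what building a phone-to-id map from such a dictionary can look like, the sketch below collects the unique phones and writes one `phone id` pair per line. The helper name `write_phone_id_map`, the reserved `<pad>` and `<unk>` tokens, and the on-disk layout are all assumptions made for this example; they are not confirmed by the documentation above.

```python
import tempfile
from pathlib import Path


def write_phone_id_map(sentence, output_path):
    """Hypothetical helper: collect unique phones and save a phone -> id map."""
    phones = set()
    for utt in sentence:
        phones.update(sentence[utt][0])
    # Assumption: two reserved tokens come first, then phones in sorted order.
    tokens = ["<pad>", "<unk>"] + sorted(phones)
    with open(output_path, "w", encoding="utf-8") as f:
        for idx, token in enumerate(tokens):
            f.write(f"{token} {idx}\n")
    return tokens


sentence = {
    "utt_a": (["sil", "n", "i"], [3, 4, 5]),
    "utt_b": (["sil", "h", "ao"], [2, 6, 7]),
}

with tempfile.TemporaryDirectory() as tmp:
    map_path = Path(tmp) / "phone_id_map.txt"
    tokens = write_phone_id_map(sentence, map_path)
    written = map_path.read_text(encoding="utf-8").splitlines()
```

Sorting the phones keeps the ids stable across runs, which matters when a trained model is later loaded with the same map.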
- paddlespeech.t2s.datasets.preprocess_utils.get_phn_dur(file_name)
Read the MFA duration.txt file.
Args:
    file_name (str or Path): path of gen_duration_from_textgrid.py's result
Returns:
    Dict: sentence: {'utt': ([char], [int])}
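To make the returned structure concrete, here is a small sketch that parses duration lines into the `{'utt': ([char], [int])}` shape. The line layout assumed here (an utterance id, a `|` separator, then alternating phone and frame count) is purely illustrative and may not match what gen_duration_from_textgrid.py actually writes; `parse_durations` is a hypothetical name.

```python
def parse_durations(lines):
    """Hypothetical parser for lines shaped like 'utt_id|phone dur phone dur ...'.

    The line layout is an assumption for illustration, not the documented format.
    """
    sentence = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        utt, body = line.split("|", 1)
        fields = body.split()
        phones = fields[::2]                        # even positions: phone labels
        durations = [int(d) for d in fields[1::2]]  # odd positions: frame counts
        if len(phones) != len(durations):
            raise ValueError(f"unpaired phone/duration in utterance {utt!r}")
        sentence[utt] = (phones, durations)
    return sentence


example_lines = [
    "utt_a|sil 3 n 4 i 5",
    "utt_b|sil 2 h 6 ao 7",
]
sentence = parse_durations(example_lines)
```

Each value pairs a list of phones with a list of integer durations of the same length, which is the shape the other functions in this module take as input.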
- paddlespeech.t2s.datasets.preprocess_utils.get_phones_tones(sentence, phones_output_path, tones_output_path, dataset='baker')
Get the phone set and tone set from the training data and save them.
Args:
    sentence (Dict): {'utt': ([char], [int])}
    phones_output_path (str or Path): path to save phone_id_map
    tones_output_path (str or Path): path to save tone_id_map
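One way to derive separate phone and tone sets is sketched below. It assumes that a tone is encoded as a single trailing digit (0 to 5) on the phone label and that labels without such a digit count as tone "0". Both conventions, and the helper name `split_phones_and_tones`, are assumptions for illustration and are not stated in the documentation above.

```python
import re

# Assumption: a toned label is word characters followed by one digit in 0-5.
TONED_LABEL = re.compile(r"^(\w+)([0-5])$")


def split_phones_and_tones(sentence):
    """Hypothetical helper: return (sorted phones, sorted tones) over all utterances."""
    phones, tones = set(), set()
    for utt in sentence:
        for label in sentence[utt][0]:
            match = TONED_LABEL.match(label)
            if match:
                phones.add(match.group(1))
                tones.add(match.group(2))
            else:
                # Assumption: labels without a trailing digit carry tone "0".
                phones.add(label)
                tones.add("0")
    return sorted(phones), sorted(tones)


sentence = {
    "utt_a": (["n", "i3", "h", "ao3", "sil"], [4, 5, 6, 7, 2]),
}
phones, tones = split_phones_and_tones(sentence)
```

The two sorted lists could then be written to phones_output_path and tones_output_path as id maps, one entry per line.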