paddlespeech.vector.io.batch module

paddlespeech.vector.io.batch.batch_feature_normalize(batch, mean_norm: bool = True, std_norm: bool = True)[source]

Normalize the utterance features in a batch.

Args:

batch (list): the batch features from the dataloader.
mean_norm (bool, optional): mean normalization flag. Defaults to True.
std_norm (bool, optional): std normalization flag. Defaults to True.

Returns:

dict: the normalized batch features
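The following is a minimal numpy sketch of what batch normalization of utterance features could look like. It is not the library's implementation. It assumes that each batch item is a dict with a `feat` entry of shape `(n_features, n_frames)`, that statistics are taken along the last axis, and that shorter utterances are zero-padded on the right. The function name `batch_normalize_sketch` and the output keys `feats` and `lengths` are hypothetical.

```python
import numpy as np


def batch_normalize_sketch(batch, mean_norm=True, std_norm=True):
    """Hypothetical stand-in: normalize each utterance, then pad and stack."""
    normalized = []
    for item in batch:
        # Assumed item layout: item["feat"] has shape (n_features, n_frames).
        feat = np.asarray(item["feat"], dtype=np.float64)
        mean = feat.mean(axis=-1, keepdims=True) if mean_norm else 0.0
        std = feat.std(axis=-1, keepdims=True) if std_norm else 1.0
        normalized.append((feat - mean) / std)

    lengths = [feat.shape[-1] for feat in normalized]
    max_len = max(lengths)
    # Zero-pad the last (time) axis on the right so all utterances share one shape.
    padded = [
        np.pad(feat, ((0, 0), (0, max_len - feat.shape[-1])), mode="constant")
        for feat in normalized
    ]
    return {"feats": np.stack(padded), "lengths": np.asarray(lengths)}


batch = [
    {"feat": np.array([[1.0, 2.0, 3.0], [2.0, 4.0, 6.0]])},
    {"feat": np.array([[0.0, 10.0], [5.0, 7.0]])},
]
out = batch_normalize_sketch(batch)
print(out["feats"].shape)  # (2, 2, 3)
print(out["lengths"])      # [3 2]
```

Keeping the original lengths next to the padded features lets later stages ignore the padded frames.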

paddlespeech.vector.io.batch.batch_pad_right(arrays, mode='constant', value=0)[source]

Given a list of numpy arrays, batch them together by padding each dimension on the right so that all arrays have the same shape.

Args:

arrays : list. List of arrays we wish to pad together.
mode : str. Padding mode, see the numpy.pad documentation.
value : float. Padding value, see the numpy.pad documentation.

Returns:

array : numpy.array. Padded array.
valid_vals : list. List containing proportion for each dimension of original, non-padded values.
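The behavior described above can be illustrated with a small numpy-only sketch. It is a stand-in written for this page, not the library's code: the name `batch_pad_right_sketch` is hypothetical, and the exact layout of `valid_vals` returned by the real function may differ from the per-array, per-dimension lists used here.

```python
import numpy as np


def batch_pad_right_sketch(arrays, mode="constant", value=0):
    """Hypothetical stand-in: right-pad arrays to a common shape and stack them."""
    ndim = arrays[0].ndim
    if any(a.ndim != ndim for a in arrays):
        raise ValueError("all arrays must have the same number of dimensions")

    # The target shape is the per-dimension maximum over the whole list.
    target_shape = [max(a.shape[d] for a in arrays) for d in range(ndim)]
    # numpy.pad only accepts constant_values in constant mode.
    kwargs = {"constant_values": value} if mode == "constant" else {}

    padded, valid_vals = [], []
    for a in arrays:
        # (0, n) means: add nothing on the left, n entries on the right.
        pad_width = [(0, t - s) for s, t in zip(a.shape, target_shape)]
        padded.append(np.pad(a, pad_width, mode=mode, **kwargs))
        # Proportion of original (non-padded) values along each dimension.
        valid_vals.append([s / t for s, t in zip(a.shape, target_shape)])
    return np.stack(padded), valid_vals


arrays = [np.array([1, 2, 3]), np.array([4, 5])]
stacked, valid_vals = batch_pad_right_sketch(arrays)
print(stacked.tolist())  # [[1, 2, 3], [4, 5, 0]]
```

Here the second array is two thirds valid along its only dimension, which is the kind of proportion `valid_vals` is meant to carry.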

paddlespeech.vector.io.batch.feature_normalize(feats: Tensor, mean_norm: bool = True, std_norm: bool = True, convert_to_numpy: bool = False)[source]

Normalize the features of a single utterance.

Args:

feats (paddle.Tensor): the original utterance features, such as fbank or mfcc.
mean_norm (bool, optional): mean norm flag. Defaults to True.
std_norm (bool, optional): std norm flag. Defaults to True.
convert_to_numpy (bool, optional): convert the paddle.Tensor to numpy and do the feature normalization with numpy. Defaults to False.

Returns:

paddle.Tensor : the normalized feats
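As an illustration of the two flags, here is a numpy sketch of mean and standard-deviation normalization. It assumes the statistics are computed along the last axis of the feature matrix, which this page does not state, and it works on numpy arrays rather than `paddle.Tensor`. The name `feature_normalize_sketch` is hypothetical.

```python
import numpy as np


def feature_normalize_sketch(feats, mean_norm=True, std_norm=True):
    """Hypothetical stand-in: normalize along the last axis (an assumption)."""
    # With a flag disabled, the neutral value (0 or 1) leaves that step a no-op.
    mean = feats.mean(axis=-1, keepdims=True) if mean_norm else 0.0
    std = feats.std(axis=-1, keepdims=True) if std_norm else 1.0
    return (feats - mean) / std


feats = np.array([[1.0, 2.0, 3.0], [10.0, 20.0, 30.0]])

normed = feature_normalize_sketch(feats)
print(normed.mean(axis=-1))  # close to 0 for every row
print(normed.std(axis=-1))   # close to 1 for every row

centered = feature_normalize_sketch(feats, std_norm=False)
print(centered.tolist())  # [[-1.0, 0.0, 1.0], [-10.0, 0.0, 10.0]]
```

Note that a row with constant values has zero standard deviation, so a real implementation may need to guard against division by zero.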

paddlespeech.vector.io.batch.pad_right_2d(x, target_length, axis=-1, mode='constant', **kwargs)[source]

paddlespeech.vector.io.batch.pad_right_to(array, target_shape, mode='constant', value=0)[source]

This function takes a numpy array of arbitrary shape and pads it to the target shape by appending values on the right.

Args:

array : numpy.array. Input array whose dimensions we need to pad.
target_shape : (list, tuple). Target shape for the padded array. Its length must be equal to array.ndim.
mode : str. Pad mode, please refer to the numpy.pad documentation.
value : float. Pad value, please refer to the numpy.pad documentation.

Returns:

array : numpy.array. Padded array.
valid_vals : list. List containing proportion for each dimension of original, non-padded values.
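A minimal sketch of right-padding to a target shape, built directly on `numpy.pad`, may make the return values easier to picture. It is an illustration only: the name `pad_right_to_sketch` is hypothetical, and the error handling shown is an assumption rather than the library's behavior.

```python
import numpy as np


def pad_right_to_sketch(array, target_shape, mode="constant", value=0):
    """Hypothetical stand-in: pad every dimension on the right up to target_shape."""
    if len(target_shape) != array.ndim:
        raise ValueError("target_shape must have one entry per array dimension")
    if any(t < s for s, t in zip(array.shape, target_shape)):
        raise ValueError("target_shape must not be smaller than array.shape")

    # (0, n) means: add nothing on the left, n entries on the right.
    pad_width = [(0, t - s) for s, t in zip(array.shape, target_shape)]
    # numpy.pad only accepts constant_values in constant mode.
    kwargs = {"constant_values": value} if mode == "constant" else {}
    padded = np.pad(array, pad_width, mode=mode, **kwargs)

    # Proportion of original (non-padded) values along each dimension.
    valid_vals = [s / t for s, t in zip(array.shape, target_shape)]
    return padded, valid_vals


x = np.ones((2, 3))
padded, valid_vals = pad_right_to_sketch(x, (4, 6))
print(padded.shape)  # (4, 6)
print(valid_vals)    # [0.5, 0.5]
```

The original values stay in the top-left corner of the result, and each entry of `valid_vals` is the original size divided by the target size for that dimension.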

paddlespeech.vector.io.batch.waveform_collate_fn(batch)[source]

Wrap the waveforms into a batch.

Args:
batch (list): the list of waveform items from the dataloader. Each item contains several fields:
feat: the utterance waveform data
label: the utterance label encoding data

Returns:

dict: the batch data returned to the dataloader
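A collate function of this kind can be sketched with `numpy.stack`. The sketch below assumes that all waveforms in the batch already have the same length, and the output keys `waveforms` and `labels` are hypothetical, since this page does not name the keys of the returned dict. Only the `feat` and `label` item fields come from the description above.

```python
import numpy as np


def waveform_collate_sketch(batch):
    """Hypothetical stand-in: stack per-item fields into batch arrays."""
    # numpy.stack requires equal shapes, so equal-length waveforms are assumed.
    waveforms = np.stack([item["feat"] for item in batch])
    labels = np.stack([item["label"] for item in batch])
    # The output key names are illustrative only.
    return {"waveforms": waveforms, "labels": labels}


batch = [
    {"feat": np.zeros(16000, dtype=np.float32), "label": np.int64(3)},
    {"feat": np.ones(16000, dtype=np.float32), "label": np.int64(7)},
]
out = waveform_collate_sketch(batch)
print(out["waveforms"].shape)  # (2, 16000)
print(out["labels"])           # [3 7]
```

A function with this shape is typically passed to a dataloader as its collate function, so that each list of items is turned into one dict of batched arrays.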