Augmenting audio WAV files

Asked: 2018-01-02 16:29:48

Tags: audio deep-learning wav librosa

I build x_train from a directory of WAV files like this:

import os
from scipy import signal
from scipy.io import wavfile

new_sample_rate = 8000
y_train = []
x_train = []

for label, fname in zip(labels, fnames):
    sample_rate, samples = wavfile.read(os.path.join(train_data_path, label, fname))
    # zero-pad clips shorter than 16000 samples
    samples = pad_audio(samples)
    if len(samples) > 16000:
        # cut long recordings into random 16000-sample chunks
        n_samples = chop_audio(samples)
    else:
        n_samples = [samples]
    sampleglobal = samples
    for chunk in n_samples:
        # resample to new_sample_rate before computing the log-spectrogram
        resampled = signal.resample(chunk, int(new_sample_rate / sample_rate * chunk.shape[0]))
        _, _, specgram = log_specgram(resampled, sample_rate=new_sample_rate)
        y_train.append(label)
        x_train.append(specgram)

The resulting shape of x_train is

(64524, 99, 81, 1)

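Now I want to augment the waveform data to enlarge the training set:

import librosa

# y: waveform, sr: sample rate, wn: white-noise array with the same length as y_fast
y_shifted = librosa.effects.pitch_shift(y, sr=sr, n_steps=-1)
y_fast = librosa.effects.time_stretch(y, rate=0.34)
y_fast_wn = y_fast + 0.005 * wn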


def pad_audio(samples, L=16000):
    # left-pad with zeros so that every clip has at least L samples
    if len(samples) >= L:
        return samples
    return np.pad(samples, pad_width=(L - len(samples), 0), mode='constant', constant_values=(0, 0))

def chop_audio(samples, L=16000, num=20):
    for i in range(num):
        beg = np.random.randint(0, len(samples) - L)
        yield samples[beg: beg + L]

def label_transform(labels):
    nlabels = []
    for label in labels:
        if label == '_background_noise_':
            nlabels.append('silence')
        elif label not in legal_labels:
            nlabels.append('unknown')
        else:
            nlabels.append(label)
    return pd.get_dummies(pd.Series(nlabels))

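The problem is that when I modify the WAV data with pitch_shift or time_stretch,
the shape of x_train[0] changes as well.
Mine shrank to

(71, 81, 1)
  1. Why does it shrink to this size?
  2. If I want to keep the (99, 81, 1) shape, what should I do?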
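
A minimal sketch of where the two shapes may come from, and one way to keep the shape constant. It assumes that log_specgram (not shown in the question) computes a spectrogram with a 20 ms window and a 10 ms step at 8 kHz, that is 160-sample windows with 80 samples of overlap; this is an assumption, although it is consistent with the 81 frequency bins and 99 frames reported above. Under that assumption the number of frames depends only on the clip length, so a time-stretched clip that is no longer 16000 samples long yields fewer frames. The helper names `n_spectrogram_frames` and `fix_length`, and the stretched length 11520, are hypothetical and chosen for illustration.

```python
import numpy as np

def n_spectrogram_frames(n_samples, nperseg=160, noverlap=80):
    """Frame count of a spectrogram without boundary padding (assumed window settings)."""
    return (n_samples - noverlap) // (nperseg - noverlap)

def fix_length(samples, target_len=16000):
    """Truncate, or zero-pad at the front like pad_audio, to exactly target_len samples."""
    if len(samples) >= target_len:
        return samples[:target_len]
    return np.pad(samples, (target_len - len(samples), 0), mode='constant')

# Hypothetical clip that time stretching shortened from 16000 to 11520 samples
stretched = np.zeros(11520, dtype=np.float32)
fixed = fix_length(stretched)

# Resampling 16 kHz -> 8 kHz halves the number of samples
print(n_spectrogram_frames(len(stretched) // 2))  # 71
print(n_spectrogram_frames(len(fixed) // 2))      # 99
```

Applying such a length fix to the augmented waveform before resampling and calling log_specgram should keep every spectrogram at (99, 81, 1). librosa also ships `librosa.util.fix_length`, which pads or trims an array to a requested size. Note that pitch_shift returns a signal of the same length as its input, so time_stretch is the more likely cause of the change in shape.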

0 answers:

No answers yet