Question

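I am receiving a raw PCM stream from Google's WebRTC C++ reference implementation (via a hook inserted into VoEBaseImpl::GetPlayoutData). The audio appears to be linear PCM, signed int16, but when I record it with an AssetWriter it is saved as a heavily distorted, higher-pitched audio file.

I assume the error is somewhere in the input parameters, most likely in the conversion of the stereo int16 data to an AudioBufferList and then to a CMSampleBuffer. Is there anything wrong with the code below?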

void RecorderImpl::RenderAudioFrame(void* audio_data, size_t number_of_frames, int sample_rate, int64_t elapsed_time_ms, int64_t ntp_time_ms) {
    OSStatus status;

    AudioChannelLayout acl;
    bzero(&acl, sizeof(acl));
    acl.mChannelLayoutTag = kAudioChannelLayoutTag_Stereo;

    AudioStreamBasicDescription audioFormat;
    audioFormat.mSampleRate = sample_rate;
    audioFormat.mFormatID = kAudioFormatLinearPCM;
    audioFormat.mFormatFlags = kLinearPCMFormatFlagIsSignedInteger | kLinearPCMFormatFlagIsPacked;
    audioFormat.mFramesPerPacket = 1;
    audioFormat.mChannelsPerFrame = 2;
    audioFormat.mBitsPerChannel = 16;
    audioFormat.mBytesPerPacket = audioFormat.mFramesPerPacket * audioFormat.mChannelsPerFrame * audioFormat.mBitsPerChannel / 8;
    audioFormat.mBytesPerFrame = audioFormat.mBytesPerPacket / audioFormat.mFramesPerPacket;

    CMSampleTimingInfo timing = { CMTimeMake(1, sample_rate), CMTimeMake(elapsed_time_ms, 1000), kCMTimeInvalid };

    CMFormatDescriptionRef format = NULL;
    status = CMAudioFormatDescriptionCreate(kCFAllocatorDefault, &audioFormat, sizeof(acl), &acl, 0, NULL, NULL, &format);
    if(status != 0) {
        NSLog(@"Failed to create audio format description");
        return;
    }

    CMSampleBufferRef buffer;
    status = CMSampleBufferCreate(kCFAllocatorDefault, NULL, false, NULL, NULL, format, (CMItemCount)number_of_frames, 1, &timing, 0, NULL, &buffer);
    if(status != 0) {
        NSLog(@"Failed to allocate sample buffer");
        return;
    }

    AudioBufferList bufferList;
    bufferList.mNumberBuffers = 1;
    bufferList.mBuffers[0].mNumberChannels = audioFormat.mChannelsPerFrame;
    bufferList.mBuffers[0].mDataByteSize = (UInt32)(number_of_frames * audioFormat.mBytesPerFrame);
    bufferList.mBuffers[0].mData = audio_data;
    status = CMSampleBufferSetDataBufferFromAudioBufferList(buffer, kCFAllocatorDefault, kCFAllocatorDefault, 0, &bufferList);
    if(status != 0) {
        NSLog(@"Failed to convert audio buffer list into sample buffer");
        return;
    }

    [recorder writeAudioFrames:buffer];

    CFRelease(buffer);
}

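For reference, on an iPhone 6S+ running iOS 9.2 the sample rate I receive from WebRTC is 48 kHz, with 480 samples per invocation of this hook, and the hook is called every 10 ms.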

Answer 1

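First of all, congratulations on creating an audio CMSampleBuffer from scratch. For most people they are neither created nor destroyed, but handed down immaculate and mysterious from CoreMedia and AVFoundation.

The presentationTimeStamp in your timing info is in whole milliseconds, which cannot represent the time positions of your 48 kHz samples.

Instead of CMTimeMake(elapsed_time_ms, 1000), try CMTimeMake(elapsed_frames, sample_rate), where elapsed_frames is the number of frames you have previously written.

That would explain the distortion, but not the pitch, so make sure the AudioStreamBasicDescription matches your AVAssetWriterInput settings. It is hard to say more without seeing your AVAssetWriter code.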
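The frame-based timestamp suggested here can be sketched in plain C++ as a running frame counter. This is only an illustration: FrameTime and FrameClock are hypothetical names that do not appear in the question, and FrameTime merely stands in for the value/timescale pair that would be passed to CMTimeMake.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the (value, timescale) pair given to CMTimeMake.
struct FrameTime {
    int64_t value;      // frames written before this buffer
    int32_t timescale;  // sample rate in Hz
};

// Hypothetical helper: counts frames already handed to the writer so that
// each buffer's timestamp is expressed in frames, not milliseconds.
class FrameClock {
public:
    explicit FrameClock(int32_t sample_rate) : sample_rate_(sample_rate) {
        assert(sample_rate > 0);
    }

    // Returns the timestamp for the next buffer, then advances the counter
    // by that buffer's length in frames.
    FrameTime Next(size_t number_of_frames) {
        FrameTime pts{elapsed_frames_, sample_rate_};
        elapsed_frames_ += static_cast<int64_t>(number_of_frames);
        return pts;
    }

private:
    int32_t sample_rate_;
    int64_t elapsed_frames_ = 0;
};
```

Assuming one FrameClock is kept per recording, RenderAudioFrame could call Next(number_of_frames) once per buffer and pass the result as CMTimeMake(pts.value, pts.timescale).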

P.S. Be careful with writeAudioFrames: if it is asynchronous, you will have problems with the ownership of audio_data.

P.P.S. It looks like you are leaking the CMFormatDescriptionRef.
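One way to sidestep the ownership concern, if the consumer really does hold on to the raw pointer after the callback returns, is to copy the samples into a buffer the recorder owns. The sketch below is an assumption-laden illustration: CopyPcm is a hypothetical helper, and it assumes interleaved int16 samples as described in the question.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: copies interleaved int16 PCM out of the caller's
// buffer so the copy stays valid after the callback returns.
std::vector<int16_t> CopyPcm(const void* audio_data,
                             size_t number_of_frames,
                             size_t channels) {
    assert(audio_data != nullptr);
    const int16_t* samples = static_cast<const int16_t*>(audio_data);
    // One int16 sample per channel per frame.
    return std::vector<int16_t>(samples, samples + number_of_frames * channels);
}
```

The returned vector can then be moved into whatever asynchronous task does the writing, so the lifetime of the samples no longer depends on WebRTC's buffer.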

Answer 2

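I ended up opening the generated audio file in Audacity and saw that half of every frame was missing, as the rather odd-looking waveform showed.

Changing acl.mChannelLayoutTag to kAudioChannelLayoutTag_Mono and audioFormat.mChannelsPerFrame to 1 solved the problem, and the audio quality is now perfect. Hooray!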
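The arithmetic behind this fix can be checked with a small sketch. It assumes, based on the fix rather than on anything WebRTC documents here, that the hook delivers 480 mono int16 samples per call; BytesForFrames is a hypothetical helper that mirrors the mBytesPerFrame calculation in the question.

```cpp
#include <cassert>
#include <cstddef>

// Bytes needed for a run of interleaved linear PCM frames
// (the same arithmetic as mBytesPerFrame * number_of_frames in the question).
constexpr size_t BytesForFrames(size_t frames,
                                size_t channels,
                                size_t bits_per_channel) {
    return frames * channels * bits_per_channel / 8;
}

// Declared as stereo, 480 frames claim 1920 bytes per callback...
static_assert(BytesForFrames(480, 2, 16) == 1920, "stereo size");
// ...but 480 mono int16 samples occupy only 960 bytes.
static_assert(BytesForFrames(480, 1, 16) == 960, "mono size");
```

Under that assumption, the stereo description told the writer to read twice as many bytes as the buffer actually held, which is consistent with half of each frame being missing in Audacity.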
