Question

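I am running into some technical obstacles in an app I want to build. The app has two audio tracks. The user imports a backing track as the first track. The second track is the user's voice, recorded in time with the first track while it plays through the device speaker (or headphones). This is where latency shows up. After recording, when both tracks are played back in the app, the user hears them out of sync because of microphone and speaker latency.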

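First, I tried to detect the delay by filtering the input sound. I use Android's AudioRecord class and its read() method, which fills my short array with audio data. I found that the initial values in this array are zeros, so I decided to strip them before writing to the output stream. I treat these zeros as the microphone's "warm-up" latency. Is this approach correct? It gave some results, but it did not solve the problem, and at this stage I am still far from a solution.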
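The zero-stripping step described above could look roughly like the following sketch. The class and method names are hypothetical, and the `threshold` parameter is an addition for illustration: a value of 0 reproduces the exact-zero check from the question, while a small positive value would also skip near-silent noise. Whether the number of skipped samples really equals the input latency is not established in this thread.

```java
import java.util.Arrays;

// Hypothetical helper: trims the leading silent samples from a buffer
// that was filled by AudioRecord.read(short[], int, int).
final class LeadingSilence {

    /** Index of the first sample whose magnitude exceeds threshold, or length if there is none. */
    static int firstAudibleIndex(short[] samples, int length, int threshold) {
        for (int i = 0; i < length; i++) {
            // Math.abs works on the int-promoted value, so -32768 is handled correctly.
            if (Math.abs(samples[i]) > threshold) {
                return i;
            }
        }
        return length;
    }

    /** Copy of samples[0..length) with the leading silent part removed. */
    static short[] trimLeadingSilence(short[] samples, int length, int threshold) {
        int start = firstAudibleIndex(samples, length, threshold);
        return Arrays.copyOfRange(samples, start, length);
    }
}
```

Here `length` would be the value returned by `read()`, since the array may be only partly filled.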

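The worst part, though, is the latency between starting the speaker and the music actually playing. I cannot filter or detect this latency. I tried to build a calibration function that calculates it. I play a "beep" through the speaker and start a timer at the moment I start playback. Then I start recording and listen for that sound on the microphone. When the app recognizes the sound, I stop the timer. I repeat this several times, and the final value is the average of the results. That is how I try to measure the device latency. Once I have this value, I can simply shift the second track backward to synchronize the two recordings (I lose the first few milliseconds of the recording, but I am ignoring that for now; there are ways to work around it). I thought this approach would solve the problem, but it turned out not to be as simple as I imagined. I found two issues:
1. There is a delay when playing two tracks at the same time.
2. The device's audio latency is random.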
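The calibration and shift described above can be sketched as plain arithmetic. All names here are hypothetical, the recording is assumed to be mono 16-bit PCM, and the beep detection itself is left out. This only shows the averaging and the shift, not a way around the two issues listed.

```java
import java.util.Arrays;

// Hypothetical helper for the calibration idea in the question.
final class LatencyCalibration {

    /** Average of the repeated beep-to-detection measurements, in milliseconds. */
    static double averageMillis(long[] measurementsMillis) {
        if (measurementsMillis.length == 0) {
            throw new IllegalArgumentException("no measurements");
        }
        long sum = 0;
        for (long m : measurementsMillis) {
            sum += m;
        }
        return (double) sum / measurementsMillis.length;
    }

    /** Converts a latency in milliseconds to a whole number of frames. */
    static int millisToFrames(double latencyMillis, int sampleRateHz) {
        return (int) Math.round(latencyMillis * sampleRateHz / 1000.0);
    }

    /** Drops the first latencyFrames frames so the voice lines up with the backing track. */
    static short[] shiftEarlier(short[] recording, int latencyFrames) {
        int skip = Math.max(0, Math.min(latencyFrames, recording.length));
        return Arrays.copyOfRange(recording, skip, recording.length);
    }
}
```

For example, three measurements of 180, 200 and 220 ms average to 200 ms, which at 44100 Hz corresponds to 8820 frames to drop from the start of the recording.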

The first issue: I use the AudioTrack class to play the two tracks, and I call play() like this:

val firstTrack = //creating a track
val secondTrack = //creating a track

firstTrack.play()
secondTrack.play()

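This code already introduces a delay at the playback stage. At this point I don't even need to think about recording latency; I cannot play two tracks at the same time without a delay between them. I tested this with an external audio file (not recorded in my app): I started the same audio file twice using the code above and could hear the delay. I also tried the MediaPlayer class, with the same result. In that case I even tried starting the tracks from the OnPreparedListener callback: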

val firstTrack = // MediaPlayer
val secondTrack = // MediaPlayer

secondTrack.setOnPreparedListener {
  firstTrack.start()
  secondTrack.start()
}

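It did not help. I know Android also provides a class called SoundPool. According to the documentation it may be better at playing tracks simultaneously, but I can't use it because it only supports small audio files, and I cannot accept that limitation. How can I solve this? How can I start playing two tracks at exactly the same time?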

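The second issue: audio latency is not deterministic. Sometimes it is smaller, sometimes larger, and it is out of my hands. So measuring the device latency may help, but it cannot solve the problem.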

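To sum up: is there any solution that gives me the exact latency per device (or per app session?), or some other trigger that detects the actual latency, so that I can get the best possible synchronization when playing the two tracks together?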

Thanks in advance!

Answer 1

Synchronizing audio for a karaoke app is hard. The main problem you seem to be facing is variable latency in the output stream.

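This is almost certainly caused by "warm-up" latency: the time between hitting "play" on your backing track and the first frame of audio data being rendered by the audio device (e.g. headphones). It can vary a lot and is hard to measure.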

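The first (and easiest) thing to try is to use MODE_STREAM when constructing your AudioTrack, and to prime it with bufferSizeInBytes of data before calling play(). This should result in a lower, more consistent "warm-up" latency.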

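A better approach is to use the Android NDK to keep an audio stream running continuously, outputting silence until the moment you hit play, and then start sending audio frames immediately. The only latency left is then the continuous output latency.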
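The "silence until play" idea can be illustrated with a small gate that an always-running audio callback would pull from. This is only a sketch of the logic in plain Java: the answer suggests doing this through the NDK, and the `GatedSource` class, its methods, and the mono 16-bit buffer layout are assumptions made for illustration.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical source for a continuously running output stream:
// it emits silence until start() is called, then the track's samples.
final class GatedSource {
    private final short[] track;
    private final AtomicBoolean playing = new AtomicBoolean(false);
    private int position = 0; // only touched by the thread that calls render()

    GatedSource(short[] track) {
        this.track = track;
    }

    /** Called from the UI thread when the user hits play. */
    void start() {
        playing.set(true);
    }

    /** Called for every output buffer; always fills the whole buffer. */
    void render(short[] out) {
        for (int i = 0; i < out.length; i++) {
            if (playing.get() && position < track.length) {
                out[i] = track[position++];
            } else {
                out[i] = 0; // silence keeps the stream running and warm
            }
        }
    }
}
```

Because the stream never stops, hitting play only flips a flag; the warm-up cost is paid once when the stream is opened rather than at every playback start.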

If you decide to go down this route, I recommend taking a look at the Oboe library (full disclosure: I am one of the authors).

To answer one of your specific questions...

Is there a way to programmatically calculate the latency of the audio output stream?

Yes. The easiest way to explain it is with a code sample (this is C++ for the AAudio API, but the principle is the same with the Java AudioTrack):

// Get the index and time that a known audio frame was presented for playing
int64_t existingFrameIndex;
int64_t existingFramePresentationTime;
AAudioStream_getTimestamp(stream, CLOCK_MONOTONIC, &existingFrameIndex, &existingFramePresentationTime);

// Get the write index for the next audio frame
int64_t writeIndex = AAudioStream_getFramesWritten(stream);

// Calculate the number of frames between our known frame and the write index
int64_t frameIndexDelta = writeIndex - existingFrameIndex;

// Calculate the time which the next frame will be presented
int64_t frameTimeDelta = (frameIndexDelta * NANOS_PER_SECOND) / sampleRate_;
int64_t nextFramePresentationTime = existingFramePresentationTime + frameTimeDelta;

// Assume that the next frame will be written into the stream at the current time
int64_t nextFrameWriteTime = get_time_nanoseconds(CLOCK_MONOTONIC);

// Calculate the latency
*latencyMillis = (double) (nextFramePresentationTime - nextFrameWriteTime) / NANOS_PER_MILLISECOND;

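Caveat: this method relies on the audio hardware reporting accurate timestamps. I know this works on Google Pixel devices, but I have heard reports that it is not so accurate on other devices, so YMMV.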
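To make the arithmetic in the sample above concrete, here it is as a pure function with the clock and stream values passed in as parameters. The class name, parameter names, and the numbers in the example are illustrative assumptions, not values from a real device.

```java
// Hypothetical, side-effect-free version of the latency calculation,
// so the numbers can be checked by hand.
final class OutputLatency {
    static final long NANOS_PER_SECOND = 1_000_000_000L;
    static final double NANOS_PER_MILLISECOND = 1_000_000.0;

    static double latencyMillis(long timestampFrame,   // frame index from the timestamp
                                long timestampNanos,   // when that frame was presented
                                long framesWritten,    // write index of the next frame
                                int sampleRateHz,
                                long nowNanos) {       // same clock as timestampNanos
        // Frames queued between the known frame and the next frame to be written
        long frameDelta = framesWritten - timestampFrame;
        // Time those frames take to play out
        long timeDelta = frameDelta * NANOS_PER_SECOND / sampleRateHz;
        // When the next written frame is expected to be presented
        long nextPresentationNanos = timestampNanos + timeDelta;
        return (nextPresentationNanos - nowNanos) / NANOS_PER_MILLISECOND;
    }
}
```

As a worked example at 48000 Hz: if frame 48000 was presented at 1,000,000,000 ns and 52800 frames have been written, the next frame is 4800 frames (100 ms) behind the known one, so it is presented at 1,100,000,000 ns. If the current time is 1,020,000,000 ns, the estimated latency is 80 ms.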

Answer 2

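The Android MediaPlayer class is very slow to start audio playback. In an app I built, the delay before an audio clip started playing was more than a second. I solved it by switching to ExoPlayer, which brought the start of playback to within 100 ms. I have also read that ffmpeg starts audio even faster than ExoPlayer, but I haven't used it, so I can't make any promises.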

Answer 3

Following donturner's answer, here is a Java version (it also falls back to other methods depending on the SDK version):

/** The audio latency has not been estimated yet */
private static final long AUDIO_LATENCY_NOT_ESTIMATED = Long.MIN_VALUE + 1;

/** The audio latency default value if we cannot estimate it */
private static final long DEFAULT_AUDIO_LATENCY = 100L * 1000L * 1000L; // 100 ms, in nanoseconds

/**
 * Estimate the audio latency
 *
 * Not accurate at all, depends on SDK version, etc. But that's the best
 * we can do.
 */

private static long estimateAudioLatency(AudioTrack track, long audioFramesWritten) {

    long estimatedAudioLatency = AUDIO_LATENCY_NOT_ESTIMATED;

    // First method. SDK >= 19.
    if (Build.VERSION.SDK_INT >= 19 && track != null) {

        AudioTimestamp audioTimestamp = new AudioTimestamp();
        if (track.getTimestamp(audioTimestamp)) {

            // Calculate the number of frames between our known frame and the write index
            long frameIndexDelta = audioFramesWritten - audioTimestamp.framePosition;

            // Calculate the time which the next frame will be presented
            long frameTimeDelta = _framesToNanoSeconds(frameIndexDelta);
            long nextFramePresentationTime = audioTimestamp.nanoTime + frameTimeDelta;

            // Assume that the next frame will be written at the current time
            long nextFrameWriteTime = System.nanoTime();

            // Calculate the latency
            estimatedAudioLatency = nextFramePresentationTime - nextFrameWriteTime;

        }
    }

    // Second method. SDK >= 18.
    if (estimatedAudioLatency == AUDIO_LATENCY_NOT_ESTIMATED && Build.VERSION.SDK_INT >= 18) {
        Method getLatencyMethod;
        try {
            getLatencyMethod = AudioTrack.class.getMethod("getLatency", (Class<?>[]) null);
            estimatedAudioLatency = (Integer) getLatencyMethod.invoke(track, (Object[]) null) * 1000000L;
        } catch (Exception ignored) {}
    }

    // If no method has successfully given us a value, let's try a third method
    if (estimatedAudioLatency == AUDIO_LATENCY_NOT_ESTIMATED) {
        AudioManager audioManager = (AudioManager) CRT.getInstance().getSystemService(Context.AUDIO_SERVICE);
        try {
            Method getOutputLatencyMethod = audioManager.getClass().getMethod("getOutputLatency", int.class);
            estimatedAudioLatency = (Integer) getOutputLatencyMethod.invoke(audioManager, AudioManager.STREAM_MUSIC) * 1000000L;
        } catch (Exception ignored) {}
    }

    // No method gave us a value. Let's use a default value. Better than nothing.
    if (estimatedAudioLatency == AUDIO_LATENCY_NOT_ESTIMATED) {
        estimatedAudioLatency = DEFAULT_AUDIO_LATENCY;
    }

    return estimatedAudioLatency;
}

private static long _framesToNanoSeconds(long frames) {
    return frames * 1000000000L / SAMPLE_RATE;
}

Audio latency problem