Increasing lag when transferring audio between streams with PulseAudio

【Question title】: Increasing lag when transferring audio between streams with PulseAudio
【Posted】: 2021-10-05 08:28:25
【Question】:

I've been working on a personal project in C++ that uses the PulseAudio library, and I've noticed some odd behavior whose cause I can't pin down.

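So far my setup is fairly simple:

I create two audio streams (one record stream and one playback stream)
Audio is read from the record stream and written to a buffer in the read callback
It is then read from that buffer and written to the playback stream in the write callback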

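This setup does work (I can hear the sound just fine), but I've noticed that the buffer size seems to creep up over time, so the latency eventually grows as well and produces a noticeable audio "lag".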

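The problem can be reproduced with some fairly simple code (ignore the fact that the memory allocated for the buffer keeps growing; I'm only concerned about buffer_length increasing):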

#include <iostream>
#include <pulse/pulseaudio.h>
#include <stdio.h>
#include <ctype.h>
#include <stdlib.h>
#include <unistd.h>
#include <cassert>  // assert()
#include <cstdint>  // int32_t, uint32_t, uint8_t
#include <cstring>  // memcpy()

void contextStateChanged(pa_context *ctx, void *userdata);
void sinkCreated(pa_context *context, uint32_t idx, void *userdata);
void writeToStream(pa_stream *stream, size_t nbytes, void *userdata);
void readFromStream(pa_stream *stream, size_t nbytes, void *userdata);
void streamStateChanged(pa_stream *p, void *userdata);

pa_context *context;

void *buffer;
size_t buffer_index, buffer_length;

int32_t bytesRead = 0;
int32_t bytesWritten = 0;

pa_mainloop *mainloop;

pa_sample_spec spec = {
    .format = PA_SAMPLE_S16BE,
    .rate = 48000,
    .channels = 2
};

int main(int argc, char **argv) {
    mainloop = pa_mainloop_new();
    assert(mainloop);

    pa_mainloop_api *mainloopAPI = pa_mainloop_get_api(mainloop);
    assert(mainloopAPI);

    pa_proplist *props = pa_proplist_new();
    pa_proplist_sets(props, PA_PROP_APPLICATION_NAME, "PulseTest");
    pa_proplist_sets(props, PA_PROP_APPLICATION_ID, "me.mrletsplay.pulsetest");
    pa_proplist_sets(props, PA_PROP_APPLICATION_VERSION, "1.0");
    pa_proplist_sets(props, PA_PROP_APPLICATION_ICON_NAME, "audio-card");

    context = pa_context_new_with_proplist(mainloopAPI, "PulseTest", props);
    assert(context);

    pa_context_set_state_callback(context, contextStateChanged, NULL);

    pa_context_connect(context, NULL, (pa_context_flags_t) 0, NULL);

    std::cout << "Waiting for Pulseaudio" << std::endl;

    return pa_mainloop_run(mainloop, 0);
}

void initStreams() {
    pa_stream *stream = pa_stream_new(context, "playback", &spec, NULL);

    pa_buffer_attr bufferAttr;
    bufferAttr.maxlength = (uint32_t) 4096;
    bufferAttr.tlength = (uint32_t) 256;
    bufferAttr.prebuf = (uint32_t) -1;
    bufferAttr.minreq = (uint32_t) 64;
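    // Keep the call outside assert() so it still runs when NDEBUG is defined.
    int playbackResult = pa_stream_connect_playback(stream, NULL, &bufferAttr, PA_STREAM_ADJUST_LATENCY, NULL, NULL);
    assert(playbackResult == 0);
    (void) playbackResult;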
    pa_stream_set_state_callback(stream, streamStateChanged, NULL);
    pa_stream_set_write_callback(stream, writeToStream, NULL);

    pa_stream *in = pa_stream_new(context, "record", &spec, NULL);

    pa_buffer_attr inBuffer;
    inBuffer.maxlength = (uint32_t) 1024;
    inBuffer.fragsize = (uint32_t) 512;
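    // Keep the call outside assert() so it still runs when NDEBUG is defined.
    int recordResult = pa_stream_connect_record(in, NULL, &inBuffer, PA_STREAM_ADJUST_LATENCY);
    assert(recordResult == 0);
    (void) recordResult;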
    pa_stream_set_state_callback(in, streamStateChanged, NULL);
    pa_stream_set_read_callback(in, readFromStream, NULL);
}

void contextStateChanged(pa_context *ctx, void *userdata) {
    if(pa_context_get_state(ctx) == PA_CONTEXT_READY) {
        std::cout << "Connected to Pulseaudio" << std::endl;
        initStreams();
    }
}

void writeToStream(pa_stream *stream, size_t nbytes, void *userdata) {
    bytesWritten += nbytes;

    // Output the difference between how many bytes we've read and how many bytes we've written
    std::cout << (bytesRead - bytesWritten) << std::endl;

    size_t write = nbytes;
    if(write > buffer_length) {
        write = buffer_length;
    }

    void *data;
    if(pa_stream_begin_write(stream, &data, &nbytes) < 0) {
        std::cout << "ERROR writing data: " << pa_strerror(pa_context_errno(context)) << std::endl;
        exit(1);
        return;
    }

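    // pa_stream_begin_write() may have lowered nbytes, so clamp the copy to it.
    if(write > nbytes) {
        write = nbytes;
    }
    if(write > 0) {
        memcpy(data, (uint8_t *) buffer + buffer_index, write);
    }
    // Fill whatever the capture buffer could not cover with silence instead of uninitialized memory.
    memset((uint8_t *) data + write, 0, nbytes - write);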
    buffer_length -= write;
    buffer_index += write;

    if(pa_stream_write(stream, data, nbytes, NULL, 0, PA_SEEK_RELATIVE) < 0) {
        std::cout << "ERROR writing data: " << pa_strerror(pa_context_errno(context)) << std::endl;
        exit(1);
        return;
    }
}

void readFromStream(pa_stream *stream, size_t nbytes, void *userdata) {
    bytesRead += nbytes;

    const void *data;
    if(pa_stream_peek(stream, &data, &nbytes) < 0) {
        std::cout << "ERROR reading data: " << pa_strerror(pa_context_errno(context)) << std::endl;
        exit(1);
        return;
    }

    // pa_stream_peek() reports an empty buffer as nbytes == 0 and a hole as data == NULL.
    if(nbytes == 0) {
        return;
    }
    if(!data) {
        pa_stream_drop(stream);
        return;
    }

    if(buffer) {
        buffer = pa_xrealloc(buffer, buffer_index + buffer_length + nbytes);
        memcpy((uint8_t *) buffer + buffer_index + buffer_length, data, nbytes);
        buffer_length += nbytes;
    }else {
        buffer = pa_xmalloc(nbytes);
        memcpy(buffer, data, nbytes);
        buffer_length = nbytes;
        buffer_index = 0;
    }

    pa_stream_drop(stream);
}

void streamStateChanged(pa_stream *p, void *userdata) {
    std::cout << "State changed for stream: " << pa_stream_get_state(p) << std::endl;

    if(pa_stream_get_state(p) == PA_STREAM_READY) {
        std::cout << "Stream is ready" << std::endl;
    }
}

In the code, I log how many bytes I have read and how many I have written. bytesRead seems to grow faster than bytesWritten, which makes the buffer grow over time.
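To judge how serious a given backlog is, it can help to convert bytes into milliseconds of latency and into a relative rate mismatch. The following is only a minimal sketch: PcmFormat, backlogToMs and driftPpm are hypothetical helper names, the defaults mirror the sample spec from the question (16-bit samples, 48000 Hz, 2 channels), and the drift estimate assumes the growth really comes from a steady rate mismatch between capture and playback, which the question does not establish.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical description of the PCM layout; the defaults mirror the
// question's sample spec (16-bit samples, 48000 Hz, 2 channels).
struct PcmFormat {
    uint32_t rate = 48000;
    uint32_t channels = 2;
    uint32_t bytesPerSample = 2;
};

// Bytes of audio consumed per second of playback.
inline uint64_t bytesPerSecond(const PcmFormat &fmt) {
    return static_cast<uint64_t>(fmt.rate) * fmt.channels * fmt.bytesPerSample;
}

// Converts a backlog in bytes (for example buffer_length) into milliseconds
// of added latency.
inline double backlogToMs(size_t backlogBytes, const PcmFormat &fmt) {
    assert(bytesPerSecond(fmt) > 0);
    return 1000.0 * static_cast<double>(backlogBytes)
                  / static_cast<double>(bytesPerSecond(fmt));
}

// Estimates the relative rate mismatch in parts per million from how much
// the backlog grew during elapsedSeconds. Assumes the growth is steady.
inline double driftPpm(size_t backlogGrowthBytes, double elapsedSeconds,
                       const PcmFormat &fmt) {
    assert(elapsedSeconds > 0.0);
    double expectedBytes = static_cast<double>(bytesPerSecond(fmt)) * elapsedSeconds;
    return 1e6 * static_cast<double>(backlogGrowthBytes) / expectedBytes;
}
```

With this spec one second of audio is 192000 bytes, so a backlog of 19200 bytes corresponds to 100 ms of extra latency.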

I tried writing more bytes than PulseAudio requested, but that just seemed to make PulseAudio hang and not play any audio at all.

You can see the problem quite clearly in this chart, which was generated from about 10 minutes of the program's output: Program output chart

【Comments】:

Tags: c++ pulseaudio

【Solution 1】:

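This is a common problem for sound applications (with PulseAudio, DirectSound, and so on) that input and output the same data simultaneously. The lag can come from any of the following: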

different devices for input and output;
USB sound devices;
a processing thread that occasionally lags;
... and so on.

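The underlying problem is that the output runs out of data, and the sound device then (occasionally) has to fill the gap with a crackle, silence, or some other sound.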

The usual way to solve it is:

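Read how much data has already been played (the playback position of the output device can be queried).
Count how much data you have sent to the buffer.
Track the "tail": how much sound data you have sent that has not been played yet.
Feed sound data to the output according to the tail: if the tail is short, add more data; if the tail is long, remove some data.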

Of course, this means you sometimes have to generate fake sound data and sometimes have to drop surplus data.

A reasonable tail length is about 100 to 300 ms.
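The tail control described above could be sketched roughly as follows. This is an illustration under stated assumptions, not a definitive implementation: controlTail, TailAction and TailDecision are hypothetical names, bytesSent is assumed to be a running total the application keeps itself, and bytesPlayed is assumed to be derived from the output device's playback position (with PulseAudio, stream timing functions such as pa_stream_get_time or pa_stream_get_latency are one possible source, provided timing updates are enabled on the stream). Steering back to the middle of the 100 to 300 ms window is a design choice of this sketch, not something the answer prescribes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

enum class TailAction { Keep, PadSilence, DropData };

struct TailDecision {
    TailAction action;
    size_t bytes;  // number of bytes to pad or drop; 0 for Keep
};

// Hypothetical helper. The tail is the data already sent to the output but
// not yet played. If it leaves the [minTailMs, maxTailMs] window, the
// decision steers it back to the middle of that window.
inline TailDecision controlTail(uint64_t bytesSent, uint64_t bytesPlayed,
                                uint32_t bytesPerSecond, uint32_t frameSize,
                                uint32_t minTailMs = 100,
                                uint32_t maxTailMs = 300) {
    assert(bytesPerSecond > 0 && frameSize > 0 && minTailMs <= maxTailMs);

    uint64_t tail = bytesSent > bytesPlayed ? bytesSent - bytesPlayed : 0;
    uint64_t minTail = static_cast<uint64_t>(bytesPerSecond) * minTailMs / 1000;
    uint64_t maxTail = static_cast<uint64_t>(bytesPerSecond) * maxTailMs / 1000;
    uint64_t target = (minTail + maxTail) / 2;

    TailAction action = TailAction::Keep;
    uint64_t delta = 0;
    if (tail < minTail) {
        action = TailAction::PadSilence;  // tail too short: add fake data
        delta = target - tail;
    } else if (tail > maxTail) {
        action = TailAction::DropData;    // tail too long: discard surplus
        delta = tail - target;
    }

    // Round down to whole frames so the channels never get misaligned.
    delta -= delta % frameSize;
    return {action, static_cast<size_t>(delta)};
}
```

For 16-bit stereo at 48000 Hz (192000 bytes per second, 4 bytes per frame) the window is 19200 to 57600 bytes, so a tail of 76800 bytes would yield a decision to drop 38400 bytes.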

【Discussion】:

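Wouldn't this cause periodic audio crackling, since the "surplus" data keeps growing steadily (rather than spiking suddenly because of a lag somewhere)?
Generally, yes. But it depends on ... the kind of sound. If you play a pure sine wave, then yes, you will clearly hear the crackling. If you play a human voice, you can add a single sample from time to time and it will still sound fine.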
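Adding or removing a single sample per channel from time to time, as the last comment suggests, might look like the sketch below. The function names dropOneFrame and repeatOneFrame are hypothetical, and the code assumes interleaved 16-bit samples held in a std::vector. It only moves whole samples around, so it works regardless of byte order. Picking the middle of the block is an arbitrary choice for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Removes one frame (one sample per channel) from the middle of an
// interleaved block, shortening the audio by a single sample period.
inline void dropOneFrame(std::vector<int16_t> &samples, size_t channels) {
    assert(channels > 0 && samples.size() % channels == 0);
    size_t frames = samples.size() / channels;
    if (frames < 2) {
        return;  // nothing sensible to drop
    }
    size_t mid = (frames / 2) * channels;
    samples.erase(samples.begin() + mid, samples.begin() + mid + channels);
}

// Duplicates one frame in the middle of an interleaved block, lengthening
// the audio by a single sample period.
inline void repeatOneFrame(std::vector<int16_t> &samples, size_t channels) {
    assert(channels > 0 && samples.size() % channels == 0);
    size_t frames = samples.size() / channels;
    if (frames == 0) {
        return;
    }
    size_t mid = (frames / 2) * channels;
    // Copy the frame first: inserting a vector's own elements into itself
    // is not allowed.
    std::vector<int16_t> frame(samples.begin() + mid,
                               samples.begin() + mid + channels);
    samples.insert(samples.begin() + mid, frame.begin(), frame.end());
}
```

Calling one of these once every so many blocks spreads the correction out, which is likely to be less audible on speech than dropping or padding a large chunk at once.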