修复从mediasoup发送的RTP / RTCP的A / V同步问题

时间:2017-12-09 18:29:11

标签: ffmpeg webrtc rtp rtcp

一点背景:我正在尝试记录通过mediasoup v2 SFU进行的webrtc调用。我正在使用mediasoup的room.createRtpStreamer()方法生成一个流,它将RTP / RTCP镜像到ffmpeg。在彼此约30ms内为音频和视频创建两个拖缆并开始广播。然后FFmpeg旋转并开始接受。非常确定RTCP正在工作,因为尽管在流媒体开始广播之后启动了ffmpeg,但始终以关键帧开始。

问题是我遇到了看似随机偏移的音频/视频去同步。我目前的理论是这个偏移是基于RTCP请求启动流的最后一个关键帧的年龄。请参阅下面的ffmpeg配置和输出,但我的问题是:我可以使用哪些ffmpeg参数来调整视频帧时间戳以匹配音频帧时间戳,反之亦然?我和-map 0:0,0:1 -map 0:1,0:1搞砸了,但它似乎没有找到我正在寻找的东西。

ffmpeg flags:

'-y',
'-loglevel',
'debug',
'-dump',
'-protocol_whitelist',
'file,crypto,udp,rtp,data',
'-analyzeduration',
'20M',
'-probesize',
'20M',
'-i',
`data:text/plain;base64,${sdp.toString('base64')}`,
'-fflags',
'+genpts',
'-vcodec',
'copy',
'-acodec',
'aac',
'-bsf:v',
'h264_mp4toannexb',
'-start_number',
'0',
'-hls_list_size',
'2147480000',
'-hls_wrap',
'0',
'-hls_time',
'10',

SDP用于输入(模板):

v=0
o=- 0 0 IN IP4 <%=ip %>
s=title
c=IN IP4 <%=ip %>
m=audio <%=audioPort %> RTP/AVPF <%=audioPayload %>
a=sendrecv
a=rtcp-mux
a=rtpmap:<%=audioPayload %> opus/48000/2
a=fmtp:<%=audioPayload %> minptime=10; useinbandfec=1
m=video <%=videoPort %> RTP/AVPF <%=videoPayload %>
a=sendrecv
a=rtcp-mux
a=rtpmap:<%=videoPayload %> H264/90000
a=rtcp-fb:<%=videoPayload %> ccm fir
a=rtcp-fb:<%=videoPayload %> nack
a=rtcp-fb:<%=videoPayload %> nack pli
a=rtcp-fb:<%=videoPayload %> goog-remb
a=rtcp-fb:<%=videoPayload %> transport-cc
a=fmtp:<%=videoPayload %> level-asymmetry-allowed=1;packetization-mode=1

ffmpeg输出 - 带有一些时间戳的乱码

1512775954585 - stderr: ffmpeg version 3.4 Copyright (c) 2000-2017 the FFmpeg developers
  built with Apple LLVM version 9.0.0 (clang-900.0.38)
  configuration: --prefix=/usr/local/Cellar/ffmpeg/3.4 --enable-shared --enable-pthreads --enable-version3 --enable-hardcoded-tables --enable-avresample --cc=clang --host-cflags= --host-ldflags= --enable-gpl --enable-ffplay --enable-libmp3lame --enable-libvpx --enable-libx264 --enable-libxvid --enable-opencl --enable-videotoolbox --disable-lzma
1512775954587 - stderr:   libavutil      55. 78.100 / 55. 78.100
  libavcodec     57.107.100 / 57.107.100
  libavformat    57. 83.100 / 57. 83.100
  libavdevice    57. 10.100 / 57. 10.100
1512775954587 - stderr:   libavfilter     6.107.100 /  6.107.100
  libavresample   3.  7.  0 /  3.  7.  0
  libswscale      4.  8.100 /  4.  8.100
  libswresample   2.  9.100 /  2.  9.100
  libpostproc    54.  7.100 / 54.  7.100
Splitting the commandline.
Reading option '-y' ... matched as option 'y' (overwrite output files) with argument '1'.
Reading option '-loglevel' ... matched as option 'loglevel' (set logging level) with argument 'debug'.
Reading option '-dump' ... matched as option 'dump' (dump each input packet) with argument '1'.
Reading option '-protocol_whitelist' ...1512775954589 - stderr:  matched as AVOption 'protocol_whitelist' with argument 'file,crypto,udp,rtp,data'.
Reading option '-analyzeduration' ...1512775954590 - stderr:  matched as AVOption 'analyzeduration' with argument '20M'.
Reading option '-probesize' ...1512775954590 - stderr:  matched as AVOption 'probesize' with argument '20M'.
Reading option '-i' ... matched as input url with argument 'data:text/plain;base64,dj0wCm89LSAwIDAgSU4gSVA0IDEyNy4wLjAuMQpzPWUwYzkyZmEwLWRhZDUtMTFlNy04Njg3LTA5MGRkYTk1YjFhNCBmb29ib2FyCmM9SU4gSVA0IDEyNy4wLjAuMQptPWF1ZGlvIDIwMDAwIFJUUC9BVlBGIDEwMAphPXNlbmRyZWN2CmE9cnRjcC1tdXgKYT1ydHBtYXA6MTAwIG9wdXMvNDgwMDAvMgphPWZtdHA6MTAwIG1pbnB0aW1lPTEwOyB1c2VpbmJhbmRmZWM9MQptPXZpZGVvIDIwMDAyIFJUUC9BVlBGIDEwMQphPXNlbmRyZWN2CmE9cnRjcC1tdXgKYT1ydHBtYXA6MTAxIEgyNjQvOTAwMDAKYT1ydGNwLWZiOjEwMSBjY20gZmlyCmE9cnRjcC1mYjoxMDEgbmFjawphPXJ0Y3AtZmI6MTAxIG5hY2sgcGxpCmE9cnRjcC1mYjoxMDEgZ29vZy1yZW1iCmE9cnRjcC1mYjoxMDEgdHJhbnNwb3J0LWNjCmE9Zm10cDoxMDEgbGV2ZWwtYXN5bW1ldHJ5LWFsbG93ZWQ9MTtwYWNrZXRpemF0aW9uLW1vZGU9MTtwcm9maWxlLWxldmVsLWlkPTQyZTAxZgo='.
Reading option '-fflags' ...1512775954591 - stderr:  matched as AVOption 'fflags' with argument '+genpts'.
Reading option '-vcodec' ... matched as option 'vcodec' (force video codec ('copy' to copy stream)) with argument 'copy'.
Reading option '-acodec' ... matched as option 'acodec' (force audio codec ('copy' to copy stream)) with argument 'aac'.
Reading option '-vsync' ... matched as option 'vsync' (video sync method) with argument '0'.
Reading option '-map' ... matched as option 'map' (set input stream mapping) with argument '0:0,0:1'.
Reading option '-map' ... matched as option 'map' (set input stream mapping) with argument '0:1,0:1'.
Reading option '-bsf:v' ... matched as option 'bsf' (A comma-separated list of bitstream filters) with argument 'h264_mp4toannexb'.
Reading option '-start_number' ...1512775954591 - stderr:  matched as AVOption 'start_number' with argument '0'.
Reading option '-hls_list_size' ...1512775954591 - stderr:  matched as AVOption 'hls_list_size' with argument '2147480000'.
Reading option '-hls_wrap' ... matched as AVOption 'hls_wrap' with argument '0'.
Reading option '-hls_time' ...1512775954591 - stderr:  matched as AVOption 'hls_time' with argument '10'.
Reading option '/tmp/archive/e0c92fa0-dad5-11e7-8687-090dda95b1a4_10e1c990-dc70-11e7-888d-9f39ca0c79bc/1512775954465.m3u8' ... matched as output url.
Finished splitting the commandline.
Parsing a group of options: global .
Applying option y (overwrite output files) with argument 1.
1512775954592 - stderr: Applying option loglevel (set logging level) with argument debug.
Applying option dump (dump each input packet) with argument 1.
Applying option vsync (video sync method) with argument 0.
Successfully parsed a group of options.
Parsing a group of options: input url data:text/plain;base64,dj0wCm89LSAwIDAgSU4gSVA0IDEyNy4wLjAuMQpzPWUwYzkyZmEwLWRhZDUtMTFlNy04Njg3LTA5MGRkYTk1YjFhNCBmb29ib2FyCmM9SU4gSVA0IDEyNy4wLjAuMQptPWF1ZGlvIDIwMDAwIFJUUC9BVlBGIDEwMAphPXNlbmRyZWN2CmE9cnRjcC1tdXgKYT1ydHBtYXA6MTAwIG9wdXMvNDgwMDAvMgphPWZtdHA6MTAwIG1pbnB0aW1lPTEwOyB1c2VpbmJhbmRmZWM9MQptPXZpZGVvIDIwMDAyIFJUUC9BVlBGIDEwMQphPXNlbmRyZWN2CmE9cnRjcC1tdXgKYT1ydHBtYXA6MTAxIEgyNjQvOTAwMDAKYT1ydGNwLWZiOjEwMSBjY20gZmlyCmE9cnRjcC1mYjoxMDEgbmFjawphPXJ0Y3AtZmI6MTAxIG5hY2sgcGxpCmE9cnRjcC1mYjoxMDEgZ29vZy1yZW1iCmE9cnRjcC1mYjoxMDEgdHJhbnNwb3J0LWNjCmE9Zm10cDoxMDEgbGV2ZWwtYXN5bW1ldHJ5LWFsbG93ZWQ9MTtwYWNrZXRpemF0aW9uLW1vZGU9MTtwcm9maWxlLWxldmVsLWlkPTQyZTAxZgo=.
Successfully parsed a group of options.
Opening an input file: data:text/plain;base64,dj0wCm89LSAwIDAgSU4gSVA0IDEyNy4wLjAuMQpzPWUwYzkyZmEwLWRhZDUtMTFlNy04Njg3LTA5MGRkYTk1YjFhNCBmb29ib2FyCmM9SU4gSVA0IDEyNy4wLjAuMQptPWF1ZGlvIDIwMDAwIFJUUC9BVlBGIDEwMAphPXNlbmRyZWN2CmE9cnRjcC1tdXgKYT1ydHBtYXA6MTAwIG9wdXMvNDgwMDAvMgphPWZtdHA6MTAwIG1pbnB0aW1lPTEwOyB1c2VpbmJhbmRmZWM9MQptPXZpZGVvIDIwMDAyIFJUUC9BVlBGIDEwMQphPXNlbmRyZWN2CmE9cnRjcC1tdXgKYT1ydHBtYXA6MTAxIEgyNjQvOTAwMDAKYT1ydGNwLWZiOjEwMSBjY20gZmlyCmE9cnRjcC1mYjoxMDEgbmFjawphPXJ0Y3AtZmI6MTAxIG5hY2sgcGxpCmE9cnRjcC1mYjoxMDEgZ29vZy1yZW1iCmE9cnRjcC1mYjoxMDEgdHJhbnNwb3J0LWNjCmE9Zm10cDoxMDEgbGV2ZWwtYXN5bW1ldHJ5LWFsbG93ZWQ9MTtwYWNrZXRpemF0aW9uLW1vZGU9MTtwcm9maWxlLWxldmVsLWlkPTQyZTAxZgo=.
1512775954592 - stderr: [NULL @ 0x7f81fd000000] Opening 'data:text/plain;base64,dj0wCm89LSAwIDAgSU4gSVA0IDEyNy4wLjAuMQpzPWUwYzkyZmEwLWRhZDUtMTFlNy04Njg3LTA5MGRkYTk1YjFhNCBmb29ib2FyCmM9SU4gSVA0IDEyNy4wLjAuMQptPWF1ZGlvIDIwMDAwIFJUUC9BVlBGIDEwMAphPXNlbmRyZWN2CmE9cnRjcC1tdXgKYT1ydHBtYXA6MTAwIG9wdXMvNDgwMDAvMgphPWZtdHA6MTAwIG1pbnB0aW1lPTEwOyB1c2VpbmJhbmRmZWM9MQptPXZpZGVvIDIwMDAyIFJUUC9BVlBGIDEwMQphPXNlbmRyZWN2CmE9cnRjcC1tdXgKYT1ydHBtYXA6MTAxIEgyNjQvOTAwMDAKYT1ydGNwLWZiOjEwMSBjY20gZmlyCmE9cnRjcC1mYjoxMDEgbmFjawphPXJ0Y3AtZmI6MTAxIG5hY2sgcGxpCmE9cnRjcC1mYjoxMDEgZ29vZy1yZW1iCmE9cnRjcC1mYjoxMDEgdHJhbnNwb3J0LWNjCmE9Zm10cDoxMDEgbGV2ZWwtYXN5bW1ldHJ5LWFsbG93ZWQ9MTtwYWNrZXRpemF0aW9uLW1vZGU9MTtwcm9maWxlLWxldmVsLWlkPTQyZTAxZgo=' for reading
1512775954593 - stderr: [data @ 0x7f81fca001a0] Content-type: text/plain
{"level":"info","time":"Dec 8, 2017 11:32 PM","message":"ffmpeg started"}
1512775954595 - stderr: [sdp @ 0x7f81fd000000] Format sdp probed with size=2048 and score=50
1512775954598 - stderr: [sdp @ 0x7f81fd000000] audio codec set to: opus
[sdp @ 0x7f81fd000000] audio samplerate set to: 48000
[sdp @ 0x7f81fd000000] audio channels set to: 2
1512775954639 - stderr: [sdp @ 0x7f81fd000000] video codec set to: h264
[sdp @ 0x7f81fd000000] RTP Packetization Mode: 1
[sdp @ 0x7f81fd000000] RTP Profile IDC: 42 Profile IOP: e0 Level: 1f
[udp @ 0x7f81fcb007e0] end receive buffer size reported is 65536
[udp @ 0x7f81fbe00180] end receive buffer size reported is 65536
[sdp @ 0x7f81fd000000] setting jitter buffer size to 500
[udp @ 0x7f81fbe00680] end receive buffer size reported is 65536
[udp @ 0x7f81fbe00740] end receive buffer size reported is 65536
[sdp @ 0x7f81fd000000] setting jitter buffer size to 500
[sdp @ 0x7f81fd000000] Before avformat_find_stream_info() pos: 479 bytes read:479 seeks:0 nb_streams:2
{"level":"info","time":"Dec 8, 2017 11:32 PM","message":"new active speaker","activePeer":"9f05d96a-9641-4c63-8f0e-486b98e48eb5"}
1512775954773 - stderr: [AVBSFContext @ 0x7f81fc8018e0] nal_unit_type: 7, nal_ref_idc: 3
[AVBSFContext @ 0x7f81fc8018e0] nal_unit_type: 8, nal_ref_idc: 3
[AVBSFContext @ 0x7f81fc8018e0] nal_unit_type: 5, nal_ref_idc: 3
1512775954774 - stderr: [h264 @ 0x7f8200000c00] nal_unit_type: 7, nal_ref_idc: 3
1512775954774 - stderr: [h264 @ 0x7f8200000c00] nal_unit_type: 8, nal_ref_idc: 3
1512775954774 - stderr: [h264 @ 0x7f8200000c00] nal_unit_type: 5, nal_ref_idc: 3
1512775954774 - stderr: [h264 @ 0x7f8200000c00] Reinit context to 640x480, pix_fmt: yuv420p
1512775954801 - stderr: [h264 @ 0x7f8200000c00] nal_unit_type: 1, nal_ref_idc: 3
1512775954939 - stderr: [h264 @ 0x7f8200000c00] nal_unit_type: 7, nal_ref_idc: 3
1512775954939 - stderr: [h264 @ 0x7f8200000c00] nal_unit_type: 8, nal_ref_idc: 3
1512775954940 - stderr: [h264 @ 0x7f8200000c00] nal_unit_type: 5, nal_ref_idc: 3
1512775955003 - stderr: [h264 @ 0x7f8200000c00] nal_unit_type: 1, nal_ref_idc: 3
1512775955999 - stderr:     Last message repeated 3 times
[sdp @ 0x7f81fd000000] All info found
1512775955999 - stderr: [sdp @ 0x7f81fd000000] rfps: 29.750000 0.019566
[sdp @ 0x7f81fd000000] rfps: 29.833333 0.015263
[sdp @ 0x7f81fd000000] rfps: 29.916667 0.011503
1512775955999 - stderr: [sdp @ 0x7f81fd000000] rfps: 30.000000 0.008285
[sdp @ 0x7f81fd000000] rfps: 31.000000 0.011990
    Last message repeated 1 times
[sdp @ 0x7f81fd000000] rfps: 29.970030 0.009380
[sdp @ 0x7f81fd000000] After avformat_find_stream_info() pos: 479 bytes read:479 seeks:0 frames:98
1512775956000 - stderr: Input #0, sdp, from 'data:text/plain;base64,dj0wCm89LSAwIDAgSU4gSVA0IDEyNy4wLjAuMQpzPWUwYzkyZmEwLWRhZDUtMTFlNy04Njg3LTA5MGRkYTk1YjFhNCBmb29ib2FyCmM9SU4gSVA0IDEyNy4wLjAuMQptPWF1ZGlvIDIwMDAwIFJUUC9BVlBGIDEwMAphPXNlbmRyZWN2CmE9cnRjcC1tdXgKYT1ydHBtYXA6MTAwIG9wdXMvNDgwMDAvMgphPWZtdHA6MTAwIG1pbnB0aW1lPTEwOyB1c2VpbmJhbmRmZWM9MQptPXZpZGVvIDIwMDAyIFJUUC9BVlBGIDEwMQphPXNlbmRyZWN2CmE9cnRjcC1tdXgKYT1ydHBtYXA6MTAxIEgyNjQvOTAwMDAKYT1ydGNwLWZiOjEwMSBjY20gZmlyCmE9cnRjcC1mYjoxMDEgbmFjawphPXJ0Y3AtZmI6MTAxIG5hY2sgcGxpCmE9cnRjcC1mYjoxMDEgZ29vZy1yZW1iCmE9cnRjcC1mYjoxMDEgdHJhbnNwb3J0LWNjCmE9Zm10cDoxMDEgbGV2ZWwtYXN5bW1ldHJ5LWFsbG93ZWQ9MTtwYWNrZXRpemF0aW9uLW1vZGU9MTtwcm9maWxlLWxldmVsLWlkPTQyZTAxZgo=':
1512775956000 - stderr:   Metadata:
    title           : e0c92fa0-dad5-11e7-8687-090dda95b1a4 fooboar
  Duration: N/A, start: 0.000000, bitrate: N/A
    Stream #0:0, 70, 1/48000: Audio: opus, 48000 Hz, stereo, fltp1512775956000 - stderr:
    Stream #0:1, 28, 1/90000: Video: h264 (Constrained Baseline), 1 reference frame, yuv420p(progressive, left), 640x480, 0/1, 30 tbr, 90k tbn, 180k tbc
1512775956000 - stderr: Successfully opened the file.
Parsing a group of options: output url /tmp/archive/e0c92fa0-dad5-11e7-8687-090dda95b1a4_10e1c990-dc70-11e7-888d-9f39ca0c79bc/1512775954465.m3u8.
Applying option vcodec (force video codec ('copy' to copy stream)) with argument copy.
Applying option acodec (force audio codec ('copy' to copy stream)) with argument aac.
Applying option map (set input stream mapping) with argument 0:0,0:1.
Applying option map (set input stream mapping) with argument 0:1,0:1.
Applying option bsf:v (A comma-separated list of bitstream filters) with argument h264_mp4toannexb.
Successfully parsed a group of options.
1512775956000 - stderr: Opening an output file: /tmp/archive/e0c92fa0-dad5-11e7-8687-090dda95b1a4_10e1c990-dc70-11e7-888d-9f39ca0c79bc/1512775954465.m3u8.
1512775956000 - stderr: Successfully opened the file.
1512775956001 - stderr: [AVBSFContext @ 0x7f81fcb020a0] The input looks like it is Annex B already
Stream mapping:
1512775956001 - stderr:   Stream #0:0 -> #0:0 [sync #0:1] (opus (native) -> aac (native))
  Stream #0:1 -> #0:1 (copy)
Press [q] to stop, [?] for help
cur_dts is invalid (this is harmless if it occurs once at the start per stream)
stream #0:
  keyframe=1
  duration=0.000
  dts=0.000  pts=0.000
  size=82
1512775956001 - stderr: [SWR @ 0x7f820001e600] Using fltp internally between filters
1512775956002 - stderr: detected 8 logical cores
1512775956003 - stderr: [graph_0_in_0_0 @ 0x7f81fc90dfc0] Setting 'time_base' to value '1/48000'
[graph_0_in_0_0 @ 0x7f81fc90dfc0] Setting 'sample_rate' to value '48000'
1512775956003 - stderr: [graph_0_in_0_0 @ 0x7f81fc90dfc0] Setting 'sample_fmt' to value 'fltp'
[graph_0_in_0_0 @ 0x7f81fc90dfc0] Setting 'channel_layout' to value '0x3'
[graph_0_in_0_0 @ 0x7f81fc90dfc0] tb:1/48000 samplefmt:fltp samplerate:48000 chlayout:0x3
[format_out_0_0 @ 0x7f81fc914da0] Setting 'sample_fmts' to value 'fltp'
[format_out_0_0 @ 0x7f81fc914da0] Setting 'sample_rates' to value '96000|88200|64000|48000|44100|32000|24000|22050|16000|12000|11025|8000|7350'
1512775956004 - stderr: [AVFilterGraph @ 0x7f81fbd02200] query_formats: 4 queried, 9 merged, 0 already done, 0 delayed
1512775956007 - stderr: [hls @ 0x7f81fd80f000] Opening '/tmp/archive/e0c92fa0-dad5-11e7-8687-090dda95b1a4_10e1c990-dc70-11e7-888d-9f39ca0c79bc/15127759544650.ts' for writing
[file @ 0x7f81fbf01ee0] Setting default whitelist 'file,crypto'
1512775956007 - stderr: [mpegts @ 0x7f81fd877800] muxrate VBR, pcr every 9000 pkts, sdt every 2147483647, pat/pmt every 2147483647 pkts
Output #0, hls, to '/tmp/archive/e0c92fa0-dad5-11e7-8687-090dda95b1a4_10e1c990-dc70-11e7-888d-9f39ca0c79bc/1512775954465.m3u8':
  Metadata:
    title           : e0c92fa0-dad5-11e7-8687-090dda95b1a4 fooboar
    encoder         : Lavf57.83.100
1512775956007 - stderr:     Stream #0:0, 0, 1/90000: Audio: aac (LC), 48000 Hz, stereo, fltp, delay 1024, 128 kb/s
    Metadata:
      encoder         : Lavc57.107.100 aac
    Stream #0:1, 0, 1/90000: Video: h264 (Constrained Baseline), 1 reference frame, yuv420p(progressive, left), 640x480 (0x0), 0/1, q=2-31, 30 tbr, 90k tbn, 90k tbc
cur_dts is invalid (this is harmless if it occurs once at the start per stream)
1512775956007 - stderr:     Last message repeated 1 times
stream #0:
  keyframe=1
  duration=0.000
  dts=0.020  pts=0.020
  size=79
1512775956007 - stderr: cur_dts is invalid (this is harmless if it occurs once at the start per stream)
    Last message repeated 1 times
stream #0:
  keyframe=1
  duration=0.000
1512775956007 - stderr:   dts=0.040  pts=0.040
  size=75
1512775956014 - stderr: cur_dts is invalid (this is harmless if it occurs once at the start per stream)
stream #0:
  keyframe=1
  duration=0.000
  dts=0.0601512775956014 - stderr:   pts=0.060
  size=81
1512775956017 - stderr: cur_dts is invalid (this is harmless if it occurs once at the start per stream)
stream #0:
  keyframe=1
  duration=0.000
  dts=0.080  pts=0.080
  size=76
1512775956022 - stderr: cur_dts is invalid (this is harmless if it occurs once at the start per stream)
stream #0:
  keyframe=1
  duration=0.000
  dts=0.1001512775956022 - stderr:   pts=0.100
  size=79
1512775956023 - stderr: cur_dts is invalid (this is harmless if it occurs once at the start per stream)
stream #0:
  keyframe=1
1512775956023 - stderr:   duration=0.000
  dts=0.120  pts=0.120
  size=95
1512775956024 - stderr: cur_dts is invalid (this is harmless if it occurs once at the start per stream)
1512775956024 - stderr: stream #0:
  keyframe=1
  duration=0.000
  dts=0.140  pts=0.140
  size=93
1512775956025 - stderr: cur_dts is invalid (this is harmless if it occurs once at the start per stream)
stream #0:
  keyframe=1
1512775956025 - stderr:   duration=0.000
  dts=0.160  pts=0.160
  size=94
1512775956026 - stderr: cur_dts is invalid (this is harmless if it occurs once at the start per stream)
stream #1:
  keyframe=1
1512775956026 - stderr:   duration=0.000
  dts=N/A  pts=N/A
  size=992
[hls @ 0x7f81fd80f000] Timestamps are unset in a packet for stream 1. This is deprecated and will stop working in the future. Fix your code to set the timestamps properly
1512775956026 - stderr: stream #0:
  keyframe=1
  duration=0.000
1512775956026 - stderr:   dts=0.180  pts=0.180
  size=108
1512775956027 - stderr: stream #1:
  keyframe=0
  duration=0.000
1512775956027 - stderr:   dts=0.002  pts=0.002
  size=3047
[hls @ 0x7f81fd80f000] pkt->duration = 0, maybe the hls segment duration will not precise
stream #0:
  keyframe=1
  duration=0.000
  dts=0.200  pts=0.200
  size=89
1512775956060 - stderr: stream #0:
  keyframe=1
  duration=0.000
  dts=0.220  pts=0.220
  size=73
stream #0:
  keyframe=1
  duration=0.000
  dts=0.240  pts=0.240
  size=78
stream #0:
  keyframe=1
  duration=0.000
  dts=0.260  pts=0.260

注意流#1(视频)的第一帧是如何在多个音频帧之后开始的?具体来说,它从流#0 dts / pts 0.18开始。在这种情况下,a / v同步问题几乎无法察觉,但是对于一堆重复,我已经确定a / v同步偏移始终是在第一个视频帧之前发送长音频帧的持续时间(有时秒)。我一直只开始几十毫秒的RTP流,所以我无法控制输入端的这种差异。

在初始音频帧进入后,第一个视频帧的dts / pts大约为0.我将使用什么ffmpeg设置来相应地调整时间戳?我不关心丢失没有视频的起始音频,因此任何可以调整时间戳的解决方案都可以。

1 个答案:

答案 0 :(得分:1)

我对ffmpeg了解不多,我不确定它是否能够同步单独的音频和视频RTP输入流。当浏览器接收到那些音频/视频RTP流时,根本没有同步问题。实际上,mediasoup正确设置了RTP时间戳和RTCP ntp字段。

我建议你在ffmpeg邮件列表中询问。