Encoding two full HD video streams on the GPU under Linux with Intel HD4000 / VA API / FFMPEG / OpenGL

Time: 2016-03-28 10:30:27

Tags: ffmpeg intel xorg vaapi

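I want to encode and stream two full HD video streams in real time from a laptop running Linux/Xorg to a remote location.

I have been using VA API for this, but the performance is very poor at 5.59 fps (see the paste below).

With ffmpeg and CPU encoding I get about 200 fps, but all cores of my Intel(R) Core(TM) i7-3520M CPU @ 2.90GHz are busy and the fan comes on.

Besides VA API there also seems to be QuickSync, but I have not tried it yet because it is not packaged on NixOS.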


h264encode -w 1920 -h 1080 --profile MP
Source frame is 1920x1080 and will code clip to 1920x1088 with crop

INPUT:Try to encode H264...
INPUT: Resolution   : 1920x1080, 60 frames
INPUT: FrameRate    : 30
INPUT: Bitrate      : 14929920
INPUT: Slieces      : 1
INPUT: IntraPeriod  : 30
INPUT: IDRPeriod    : 60
INPUT: IpPeriod     : 1
INPUT: Initial QP   : 26
INPUT: Min QP       : 0
INPUT: Source YUV   : AUTO generated
INPUT: Coded Clip   : /tmp/test.264
INPUT: Rec   Clip   : Not save reconstructed frame

libva info: VA-API version 0.38.1
libva info: va_getDriverName() returns 0
libva info: Trying to open /run/opengl-driver/lib/dri/i965_drv_video.so
libva info: Found init function __vaDriverInit_0_38
libva info: va_openDriver() returns 0
Use profile VAProfileH264Main
Support rate control mode (0x12):CBR CQP 
RateControl mode: CQP
Support VAConfigAttribEncPackedHeaders
Support packed sequence headers
Support packed picture headers
Support packed slice headers
Support packed misc headers
Support 1 RefPicList0 and 1 RefPicList1
Loading data into surface 15.....Complete surface loading
      \00000059(054456 bytes coded)

PERFORMANCE:   Frame Rate           : 5.59 fps (60 frames, 10730 ms (178.83 ms per frame))
PERFORMANCE:   Compression ratio    : 51:1
PERFORMANCE:     UploadPicture      : 10467 ms (174.45, 97.55% percent)
PERFORMANCE:     vaBeginPicture     : 0 ms (0.00, 0.00% percent)
PERFORMANCE:     vaRenderHeader     : 1 ms (0.02, 0.01% percent)
PERFORMANCE:     vaEndPicture       : 42 ms (0.70, 0.39% percent)
PERFORMANCE:     vaSyncSurface      : 244 ms (4.07, 2.27% percent)
PERFORMANCE:     SavePicture        : 7 ms (0.12, 0.07% percent)
PERFORMANCE:     Others             : -31 ms (71582787.75, 40027653.91% percent)
(Multithread enabled, the timing is only for reference)
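
The headline numbers in that paste can be re-derived from the raw totals, which makes it easier to see that nearly all the time goes into `UploadPicture` rather than into the encode calls. A minimal sketch; `perf_summary` is a helper name made up for this illustration, and the three inputs are copied from the PERFORMANCE lines above:

```shell
# perf_summary FRAMES TOTAL_MS UPLOAD_MS
# Hypothetical helper: recomputes fps, ms per frame and the upload share
# from the totals that h264encode prints.
perf_summary() {
  awk -v f="$1" -v t="$2" -v u="$3" 'BEGIN {
    printf "fps: %.2f\n", f * 1000 / t
    printf "ms per frame: %.2f\n", t / f
    printf "upload share: %.2f%%\n", u * 100 / t
  }'
}

# 60 frames, 10730 ms in total, 10467 ms of that in UploadPicture
perf_summary 60 10730 10467
```

Read this way, the figures point at getting frames into VA surfaces as the slow step, not at the encoder itself.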


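I have written a write-up that will enable you to deploy and use VAAPI-based hardware encoding with both ffmpeg and libav and, as a bonus, gives you a reference for any encoder limitations you may run into depending on your hardware.


Building a VAAPI-enabled FFmpeg binary on a Skylake validation testbed, with hardware acceleration for VP8/9 decode and encode:

Build platform: Ubuntu 16.04 LTS.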



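  1. cmrt: This is the C for Media Runtime GPU Kernel Manager for Intel G45 & HD Graphics family. It is a prerequisite for building the intel-hybrid-driver package.

    git clone https://github.com/01org/cmrt
    cd cmrt
    ./autogen.sh
    ./configure
    time make -j$(nproc) VERBOSE=1
    sudo make -j$(nproc) install
    sudo ldconfig -vvvv

  2. intel-hybrid-driver: This package provides support for the WebM project VPx codecs. GPU acceleration is provided via media kernels executed on Intel GEN GPUs. The hybrid driver provides the CPU-bound entropy (e.g., CPBAC) decode and manages the GEN GPU media kernel parameters and buffers.

    git clone https://github.com/01org/intel-hybrid-driver
    cd intel-hybrid-driver
    ./autogen.sh
    ./configure
    time make -j$(nproc) VERBOSE=1
    sudo make -j$(nproc) install
    sudo ldconfig -vvv

  3. intel-vaapi-driver: This package provides the VA-API (Video Acceleration API) user-mode driver for Intel GEN Graphics family SKUs. The current video driver back-end provides a bridge to the GEN GPUs by packaging buffers and commands to be sent to the i915 driver, which exercises both hardware and shader functionality for video decode, encode and processing. It also provides a wrapper to the intel-hybrid-driver when called upon to handle VP8/9 hybrid decode tasks on supported hardware (it must be configured with the --enable-hybrid-codec option).

    git clone https://github.com/01org/intel-vaapi-driver
    cd intel-vaapi-driver
    ./autogen.sh
    ./configure --enable-hybrid-codec
    time make -j$(nproc) VERBOSE=1
    sudo make -j$(nproc) install
    sudo ldconfig -vvvv

  4. libva: Libva is an implementation of VA-API (Video Acceleration API).

    git clone https://github.com/01org/libva
    cd libva
    ./autogen.sh
    ./configure
    time make -j$(nproc) VERBOSE=1
    sudo make -j$(nproc) install
    sudo ldconfig -vvvv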




          libva info: VA-API version 0.40.0
          libva info: va_getDriverName() returns 0
          libva info: Trying to open /usr/local/lib/dri/i965_drv_video.so
          libva info: Found init function __vaDriverInit_0_40
          libva info: va_openDriver() returns 0
          vainfo: VA-API version: 0.40 (libva 1.7.3)
          vainfo: Driver version: Intel i965 driver for Intel(R) Skylake - 1.8.3.pre1 (glk-alpha-58-g5a984ae)
          vainfo: Supported profile and entrypoints
                VAProfileMPEG2Simple            : VAEntrypointVLD
                VAProfileMPEG2Simple            : VAEntrypointEncSlice
                VAProfileMPEG2Main              : VAEntrypointVLD
                VAProfileMPEG2Main              : VAEntrypointEncSlice
                VAProfileH264ConstrainedBaseline: VAEntrypointVLD
                VAProfileH264ConstrainedBaseline: VAEntrypointEncSlice
                VAProfileH264ConstrainedBaseline: VAEntrypointEncSliceLP
                VAProfileH264Main               : VAEntrypointVLD
                VAProfileH264Main               : VAEntrypointEncSlice
                VAProfileH264Main               : VAEntrypointEncSliceLP
                VAProfileH264High               : VAEntrypointVLD
                VAProfileH264High               : VAEntrypointEncSlice
                VAProfileH264High               : VAEntrypointEncSliceLP
                VAProfileH264MultiviewHigh      : VAEntrypointVLD
                VAProfileH264MultiviewHigh      : VAEntrypointEncSlice
                VAProfileH264StereoHigh         : VAEntrypointVLD
                VAProfileH264StereoHigh         : VAEntrypointEncSlice
                VAProfileVC1Simple              : VAEntrypointVLD
                VAProfileVC1Main                : VAEntrypointVLD
                VAProfileVC1Advanced            : VAEntrypointVLD
                VAProfileNone                   : VAEntrypointVideoProc
                VAProfileJPEGBaseline           : VAEntrypointVLD
                VAProfileJPEGBaseline           : VAEntrypointEncPicture
                VAProfileVP8Version0_3          : VAEntrypointVLD
                VAProfileVP8Version0_3          : VAEntrypointEncSlice
                VAProfileHEVCMain               : VAEntrypointVLD
                VAProfileHEVCMain               : VAEntrypointEncSlice
                VAProfileVP9Profile0            : VAEntrypointVLD
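
To see at a glance what this driver can encode (as opposed to decode), the listing can be filtered for entrypoints that start with `VAEntrypointEnc`. A minimal sketch; `list_encode_profiles` is a helper name invented here and is not part of libva, and the sample lines are copied from the output above:

```shell
# list_encode_profiles: hypothetical helper. Reads vainfo output on stdin and
# prints "profile entrypoint" for every encode-capable entrypoint.
list_encode_profiles() {
  awk -F':' '/VAEntrypointEnc/ {
    gsub(/[ \t]/, "", $1)   # strip padding around the profile name
    gsub(/[ \t]/, "", $2)   # strip padding around the entrypoint name
    print $1, $2
  }'
}

# Demo on three lines taken from the listing; on a live system you would
# pipe the real tool instead:  vainfo 2>/dev/null | list_encode_profiles
list_encode_profiles <<'EOF'
      VAProfileHEVCMain               : VAEntrypointVLD
      VAProfileHEVCMain               : VAEntrypointEncSlice
      VAProfileVP9Profile0            : VAEntrypointVLD
EOF
```

Only the `VAProfileHEVCMain` encode line survives the filter, since the other two sample lines are decode-only (`VAEntrypointVLD`).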




          sudo mkdir -p /apps/ffmpeg/dyn
          sudo chown -Rc $USER:$USER /apps/ffmpeg/dyn
          mkdir -p ~/ffmpeg_sources


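(a). Build and deploy nasm: Nasm is an assembler for x86 optimizations used by x264 and FFmpeg. It is highly recommended; without it the resulting build may be very slow.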


          cd ~/ffmpeg_sources
          wget http://www.nasm.us/pub/nasm/releasebuilds/2.14rc0/nasm-2.14rc0.tar.gz
          tar xzvf nasm-2.14rc0.tar.gz
          cd nasm-2.14rc0
          ./configure --prefix="/apps/ffmpeg/dyn" --bindir="/apps/ffmpeg/dyn/bin"
          make -j$(nproc) VERBOSE=1
          make -j$(nproc) install
          make -j$(nproc) distclean

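(b). Build and deploy libx264 statically: This library provides an H.264 video encoder. See the H.264 Encoding Guide for more information and usage examples. It requires ffmpeg to be configured with --enable-gpl --enable-libx264.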

          cd ~/ffmpeg_sources
          wget http://download.videolan.org/pub/x264/snapshots/last_x264.tar.bz2
          tar xjvf last_x264.tar.bz2
          cd x264-snapshot*
          PATH="/apps/ffmpeg/dyn/bin:$PATH" ./configure --prefix="/apps/ffmpeg/dyn" --bindir="/apps/ffmpeg/dyn/bin" --enable-static --disable-opencl
          PATH="/apps/ffmpeg/dyn/bin:$PATH" make -j$(nproc) VERBOSE=1
          make -j$(nproc) install VERBOSE=1
          make -j$(nproc) distclean

(c). Build and configure libx265: This library provides an H.265/HEVC video encoder. See the H.265 Encoding Guide for more information and usage examples.

          sudo apt-get install cmake mercurial
          cd ~/ffmpeg_sources
          hg clone https://bitbucket.org/multicoreware/x265
          cd ~/ffmpeg_sources/x265/build/linux
          PATH="/apps/ffmpeg/dyn/bin:$PATH" cmake -G "Unix Makefiles" -DCMAKE_INSTALL_PREFIX="/apps/ffmpeg/dyn" -DENABLE_SHARED:bool=off ../../source
          make -j$(nproc) VERBOSE=1
          make -j$(nproc) install VERBOSE=1
          make -j$(nproc) clean VERBOSE=1

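(d). Build and deploy the libfdk-aac library: This provides an AAC audio encoder. See the AAC Audio Encoding Guide for more information and usage examples. It requires ffmpeg to be configured with --enable-libfdk-aac (and --enable-nonfree if you also included --enable-gpl).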

          cd ~/ffmpeg_sources
          wget -O fdk-aac.tar.gz https://github.com/mstorsjo/fdk-aac/tarball/master
          tar xzvf fdk-aac.tar.gz
          cd mstorsjo-fdk-aac*
          autoreconf -fiv
          ./configure --prefix="/apps/ffmpeg/dyn" --disable-shared
          make -j$(nproc)
          make -j$(nproc) install
          make -j$(nproc) distclean


             cd ~/ffmpeg_sources
             git clone https://github.com/webmproject/libvpx/
             cd libvpx
             ./configure --prefix="/apps/ffmpeg/dyn" --enable-runtime-cpu-detect --enable-vp9 --enable-vp8 \
             --enable-postproc --enable-vp9-postproc --enable-multi-res-encoding --enable-webm-io --enable-vp9-highbitdepth --enable-onthefly-bitpacking --enable-realtime-only \
             --cpu=native --as=yasm
             time make -j$(nproc)
             time make -j$(nproc) install
             time make clean -j$(nproc)
             time make distclean


             cd ~/ffmpeg_sources
             wget -c -v http://downloads.xiph.org/releases/vorbis/libvorbis-1.3.5.tar.xz
             tar -xvf libvorbis-1.3.5.tar.xz
             cd libvorbis-1.3.5
             ./configure --enable-static --prefix="/apps/ffmpeg/dyn"
             time make -j$(nproc)
             time make -j$(nproc) install
             time make clean -j$(nproc)
             time make distclean


          cd ~/ffmpeg_sources
          git clone https://github.com/FFmpeg/FFmpeg -b master
          cd FFmpeg
          PATH="/apps/ffmpeg/dyn/bin:$PATH" PKG_CONFIG_PATH="/apps/ffmpeg/dyn/lib/pkgconfig" ./configure \
            --pkg-config-flags="--static" \
            --prefix="/apps/ffmpeg/dyn" \
            --extra-cflags="-I/apps/ffmpeg/dyn/include" \
            --extra-ldflags="-L/apps/ffmpeg/dyn/lib" \
            --bindir="/apps/ffmpeg/dyn/bin" \
            --enable-debug=3 \
            --enable-vaapi \
            --enable-libvorbis \
            --enable-libvpx \
            --enable-gpl \
            --cpu=native \
            --enable-opengl \
            --enable-libfdk-aac \
            --enable-libx264 \
            --enable-libx265 \
            --enable-nonfree
          PATH="/apps/ffmpeg/dyn/bin:$PATH" make -j$(nproc)
          make -j$(nproc) install 
          make -j$(nproc) distclean 
          hash -r




          less /usr/share/modules/modulefiles/ffmpeg/vaapi
          ## ffmpeg media transcoder modulefile
          ## By Dennis Mungai <dmngaie@gmail.com>
          ## February, 2016
          # for Tcl script use only
          set     appname         ffmpeg
          set     version         dyn
          set     prefix          /apps/${appname}/${version}
          set     exec_prefix     ${prefix}/bin
          conflict        ffmpeg/git
          prepend-path    PATH            ${exec_prefix}
          prepend-path    LD_LIBRARY_PATH ${prefix}/lib


          module load ffmpeg/vaapi


          which ffmpeg





          ffmpeg  -hide_banner -encoders | grep vaapi 
           V..... h264_vaapi           H.264/AVC (VAAPI) (codec h264)
           V..... hevc_vaapi           H.265/HEVC (VAAPI) (codec hevc)
           V..... mjpeg_vaapi          MJPEG (VAAPI) (codec mjpeg)
           V..... mpeg2_vaapi          MPEG-2 (VAAPI) (codec mpeg2video)
           V..... vp8_vaapi            VP8 (VAAPI) (codec vp8)


          ffmpeg -hide_banner -h encoder='encoder name'
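
Rather than typing each encoder name by hand, the names can be pulled out of the `-encoders` listing and fed back into `-h encoder=`. A minimal sketch; `vaapi_encoder_names` is a helper name made up for this example, and it assumes the two-column layout shown in the listing above:

```shell
# vaapi_encoder_names: hypothetical helper. Reads `ffmpeg -encoders` output
# on stdin and prints the second column of every row ending in _vaapi.
vaapi_encoder_names() {
  awk '$2 ~ /_vaapi$/ { print $2 }'
}

# Guarded so the snippet does nothing where ffmpeg is not on the PATH.
if command -v ffmpeg >/dev/null 2>&1; then
  for enc in $(ffmpeg -hide_banner -encoders 2>/dev/null | vaapi_encoder_names); do
    ffmpeg -hide_banner -h encoder="$enc"
  done
fi
```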


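Using GNU parallel, we will encode some mp4 files (4K H.264 test samples, 40 minutes each, with AAC 6-channel audio) located under ~/src on my system to VP8 and HEVC respectively, using the examples below. Note that I have tuned the encoders to suit my use case and enabled rescaling to 1080p. Adjust as needed.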



          parallel -j 10 --verbose '/apps/ffmpeg/dyn/bin/ffmpeg -loglevel debug -threads 4 -hwaccel vaapi -i "{}"  -vaapi_device /dev/dri/renderD129 -c:v vp8_vaapi -loop_filter_level:v 63 -loop_filter_sharpness:v 15 -b:v 4500k -maxrate:v 7500k -vf 'format=nv12,hwupload,scale_vaapi=w=1920:h=1080' -c:a libvorbis -b:a 384k -ac 6 -f webm "{.}.webm"' ::: $(find . -type f -name '*.mp4')

HEVC with GNU Parallel:

To HEVC Main Profile, running encode jobs in parallel (the command below sets -j 4; the VP8 command above sets -j 10):

          parallel -j 4 --verbose '/apps/ffmpeg/dyn/bin/ffmpeg -loglevel debug -threads 4 -hwaccel vaapi -i "{}"  -vaapi_device /dev/dri/renderD129 -c:v hevc_vaapi -qp:v 19 -b:v 2100k -maxrate:v 3500k -vf 'format=nv12,hwupload,scale_vaapi=w=1920:h=1080' -c:a libvorbis -b:a 384k -ac 6 -f matroska "{.}.mkv"' ::: $(find . -type f -name '*.mp4')
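
One caveat with both commands: the unquoted `$(find ...)` is split on whitespace by the shell, so an input such as `my clip.mp4` would arrive as two arguments. Before launching long encodes it can help to do a dry run that only prints the planned input and output paths. A minimal sketch; `out_name` and `plan_encodes` are names invented for this example, and `out_name` mirrors what GNU parallel's `{.}` does (drop the last extension):

```shell
# out_name PATH EXT: replace the last extension of PATH with EXT.
out_name() {
  printf '%s.%s\n' "${1%.*}" "$2"
}

# plan_encodes DIR EXT: dry run, prints "input -> output" for every .mp4
# under DIR without invoking ffmpeg. Reading line by line keeps spaces in
# file names intact (file names containing newlines are not handled).
plan_encodes() {
  find "$1" -type f -name '*.mp4' | while IFS= read -r src; do
    printf '%s -> %s\n' "$src" "$(out_name "$src" "$2")"
  done
}

plan_encodes . webm
```

With GNU parallel itself, piping `find ... -print0` into `parallel -0` instead of using `::: $(find ...)` avoids the same splitting.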


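1. Intel's QuickSync is very power efficient, as shown by the power utilization traces and average system load recorded with 10 encodes running simultaneously.
2. Skylake's HEVC encoder is very slow, and I suspect that on my hardware it may be slower than the software-based x265 and kvazaar HEVC encoders. However, when well tuned, its quality is noticeably better than that of other hardware-based encoders, such as the Nvidia NVENC HEVC encoder on Maxwell GM200-series SKUs. The NVENC encoder on Pascal, however, is both faster than and superior to Intel's Skylake HEVC encoder implementation.
3. Unlike Nvidia's NVENC, there is no limit on simultaneous encodes on consumer SKUs. I was able to run 10 encode sessions simultaneously with VAAPI, whereas with NVENC I was limited to a maximum of two simultaneous encodes on the GeForce GTX series GPUs on my testbed. Good work, Intel.
4. As of today, VP9 hardware-accelerated encoding is available in FFmpeg. However, you need an Intel Kaby Lake-based integrated GPU to take advantage of it.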



            ffmpeg -h encoder=vp9_vaapi


            Encoder vp9_vaapi [VP9 (VAAPI)]:
                General capabilities: delay 
                Threading capabilities: none
                Supported pixel formats: vaapi_vld
            vp9_vaapi AVOptions:
              -loop_filter_level <int>        E..V.... Loop filter level (from 0 to 63) (default 16)
              -loop_filter_sharpness <int>        E..V.... Loop filter sharpness (from 0 to 15) (default 4)



            [Parsed_format_0 @ 0x42cb500] compat: called with args=[nv12]
            [Parsed_format_0 @ 0x42cb500] Setting 'pix_fmts' to value 'nv12'
            [Parsed_scale_vaapi_2 @ 0x42cc300] Setting 'w' to value '1920'
            [Parsed_scale_vaapi_2 @ 0x42cc300] Setting 'h' to value '1080'
            [graph 0 input from stream 0:0 @ 0x42cce00] Setting 'video_size' to value '3840x2026'
            [graph 0 input from stream 0:0 @ 0x42cce00] Setting 'pix_fmt' to value '0'
            [graph 0 input from stream 0:0 @ 0x42cce00] Setting 'time_base' to value '1/1000'
            [graph 0 input from stream 0:0 @ 0x42cce00] Setting 'pixel_aspect' to value '1/1'
            [graph 0 input from stream 0:0 @ 0x42cce00] Setting 'sws_param' to value 'flags=2'
            [graph 0 input from stream 0:0 @ 0x42cce00] Setting 'frame_rate' to value '24000/1001'
            [graph 0 input from stream 0:0 @ 0x42cce00] w:3840 h:2026 pixfmt:yuv420p tb:1/1000 fr:24000/1001 sar:1/1 sws_param:flags=2
            [format @ 0x42cba40] compat: called with args=[vaapi_vld]
            [format @ 0x42cba40] Setting 'pix_fmts' to value 'vaapi_vld'
            [auto_scaler_0 @ 0x42cd580] Setting 'flags' to value 'bicubic'
            [auto_scaler_0 @ 0x42cd580] w:iw h:ih flags:'bicubic' interl:0
            [Parsed_format_0 @ 0x42cb500] auto-inserting filter 'auto_scaler_0' between the filter 'graph 0 input from stream 0:0' and the filter 'Parsed_format_0'
            [AVFilterGraph @ 0x42ca360] query_formats: 6 queried, 4 merged, 1 already done, 0 delayed
            [auto_scaler_0 @ 0x42cd580] w:3840 h:2026 fmt:yuv420p sar:1/1 -> w:3840 h:2026 fmt:nv12 sar:1/1 flags:0x4
            [hwupload @ 0x42cbcc0] Surface format is nv12.
            [AVHWFramesContext @ 0x42ccbc0] Created surface 0x4000000.
            [AVHWFramesContext @ 0x42ccbc0] Direct mapping possible.
            [AVHWFramesContext @ 0x42c3e40] Created surface 0x4000001.
            [AVHWFramesContext @ 0x42c3e40] Direct mapping possible.
            [AVHWFramesContext @ 0x42c3e40] Created surface 0x4000002.
            [AVHWFramesContext @ 0x42c3e40] Created surface 0x4000003.
            [AVHWFramesContext @ 0x42c3e40] Created surface 0x4000004.
            [AVHWFramesContext @ 0x42c3e40] Created surface 0x4000005.
            [AVHWFramesContext @ 0x42c3e40] Created surface 0x4000006.
            [AVHWFramesContext @ 0x42c3e40] Created surface 0x4000007.
            [AVHWFramesContext @ 0x42c3e40] Created surface 0x4000008.
            [AVHWFramesContext @ 0x42c3e40] Created surface 0x4000009.
            [AVHWFramesContext @ 0x42c3e40] Created surface 0x400000a.
            [vp9_vaapi @ 0x409da40] Encoding entrypoint not found (19 / 6).
            Error initializing output stream 0:0 -- Error while opening encoder for output stream #0:0 - maybe incorrect parameters such as bit_rate, rate, width or height
            [AVIOContext @ 0x40fdac0] Statistics: 0 seeks, 0 writeouts
            [aac @ 0x40fcb00] Qavg: -nan
            [AVIOContext @ 0x409f820] Statistics: 32768 bytes read, 0 seeks
            Conversion failed!


            libva info: VA-API version 0.40.0
            libva info: va_getDriverName() returns 0
            libva info: Trying to open /usr/local/lib/dri/i965_drv_video.so
            libva info: Found init function __vaDriverInit_0_40
            libva info: va_openDriver() returns 0
            vainfo: VA-API version: 0.40 (libva 1.7.3)
            vainfo: Driver version: Intel i965 driver for Intel(R) Skylake - 1.8.4.pre1 (glk-alpha-71-gc3110dc)
            vainfo: Supported profile and entrypoints
                  VAProfileMPEG2Simple            : VAEntrypointVLD
                  VAProfileMPEG2Simple            : VAEntrypointEncSlice
                  VAProfileMPEG2Main              : VAEntrypointVLD
                  VAProfileMPEG2Main              : VAEntrypointEncSlice
                  VAProfileH264ConstrainedBaseline: VAEntrypointVLD
                  VAProfileH264ConstrainedBaseline: VAEntrypointEncSlice
                  VAProfileH264ConstrainedBaseline: VAEntrypointEncSliceLP
                  VAProfileH264Main               : VAEntrypointVLD
                  VAProfileH264Main               : VAEntrypointEncSlice
                  VAProfileH264Main               : VAEntrypointEncSliceLP
                  VAProfileH264High               : VAEntrypointVLD
                  VAProfileH264High               : VAEntrypointEncSlice
                  VAProfileH264High               : VAEntrypointEncSliceLP
                  VAProfileH264MultiviewHigh      : VAEntrypointVLD
                  VAProfileH264MultiviewHigh      : VAEntrypointEncSlice
                  VAProfileH264StereoHigh         : VAEntrypointVLD
                  VAProfileH264StereoHigh         : VAEntrypointEncSlice
                  VAProfileVC1Simple              : VAEntrypointVLD
                  VAProfileVC1Main                : VAEntrypointVLD
                  VAProfileVC1Advanced            : VAEntrypointVLD
                  VAProfileNone                   : VAEntrypointVideoProc
                  VAProfileJPEGBaseline           : VAEntrypointVLD
                  VAProfileJPEGBaseline           : VAEntrypointEncPicture
                  VAProfileVP8Version0_3          : VAEntrypointVLD
                  VAProfileVP8Version0_3          : VAEntrypointEncSlice
                  VAProfileHEVCMain               : VAEntrypointVLD
                  VAProfileHEVCMain               : VAEntrypointEncSlice
                  VAProfileVP9Profile0            : VAEntrypointVLD
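
The "Encoding entrypoint not found" failure in the log lines up with this listing: `VAProfileVP9Profile0` appears only with `VAEntrypointVLD` (decode), with no encode entrypoint. A minimal sketch of checking that up front before starting an encode; `has_encode_entrypoint` is a helper name made up here and is not part of libva or FFmpeg:

```shell
# has_encode_entrypoint PROFILE: hypothetical helper. Reads vainfo output on
# stdin and succeeds (exit 0) only if PROFILE is listed with an entrypoint
# whose name contains VAEntrypointEnc.
has_encode_entrypoint() {
  awk -v p="$1" -F':' '
    { name = $1; gsub(/[ \t]/, "", name) }
    name == p && $2 ~ /VAEntrypointEnc/ { found = 1 }
    END { exit !found }'
}

# Demo on lines copied from the listing above; on a live system pipe
# `vainfo 2>/dev/null` into the function instead.
sample='      VAProfileHEVCMain               : VAEntrypointEncSlice
      VAProfileVP9Profile0            : VAEntrypointVLD'

if printf '%s\n' "$sample" | has_encode_entrypoint VAProfileVP9Profile0; then
  echo "VP9 encode available"
else
  echo "VP9 encode not available on this driver/GPU"
fi
```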


If you have a Kaby Lake testbed, run these encode tests and report back :-)
