In this example the device is exposed to the system as /dev/dri/renderD128; adjust the path if your render node differs. The profile string, the output pixel format of the filter chain, and the H.264 level must exactly match what the VAAPI context (the device pipeline) supports. This example was run on an Intel Core i5-6200U (mobile) with the following capabilities (the VLD entries are not implemented on this CPU and are left as stubs):
vainfo: VA-API version: 1.20 (libva 2.12.0)
vainfo: Driver version: Intel iHD driver for Intel(R) Gen Graphics - 24.1.0 ()
vainfo: Supported profile and entrypoints
VAProfileMPEG2Simple : VAEntrypointVLD
VAProfileMPEG2Main : VAEntrypointVLD
VAProfileH264Main : VAEntrypointVLD
VAProfileH264Main : VAEntrypointEncSliceLP
VAProfileH264High : VAEntrypointVLD
VAProfileH264High : VAEntrypointEncSliceLP
VAProfileJPEGBaseline : VAEntrypointVLD
VAProfileJPEGBaseline : VAEntrypointEncPicture
VAProfileH264ConstrainedBaseline: VAEntrypointVLD
VAProfileH264ConstrainedBaseline: VAEntrypointEncSliceLP
VAProfileVP8Version0_3 : VAEntrypointVLD
VAProfileHEVCMain : VAEntrypointVLD
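Only the lines with an encode entrypoint (VAEntrypointEnc...) matter when choosing the encoder profile; the VLD lines describe decoding. The following is a minimal sketch for filtering the listing down to the encode-capable profiles, assuming a POSIX shell with grep, sed and sort. The function name list_encode_profiles is hypothetical and is not part of vainfo or ffmpeg; it reads vainfo text from standard input.

```shell
#!/bin/sh
# list_encode_profiles: hypothetical helper (an assumption for illustration,
# not part of vainfo or ffmpeg).
# Reads vainfo output on stdin and prints, one per line and without duplicates,
# the profile name of every line whose entrypoint starts with VAEntrypointEnc.
list_encode_profiles() {
    grep 'VAEntrypointEnc' \
        | sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*:.*$//' \
        | sort -u
}
```

A possible invocation is `vainfo 2>/dev/null | list_encode_profiles`. For the listing above, this would leave the three H.264 profiles and VAProfileJPEGBaseline; the encode command below uses the main profile from that set.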
ffmpeg -init_hw_device vaapi=vadev:/dev/dri/renderD128 \
-hwaccel vaapi \
-hwaccel_device vadev \
-i <input_file> \
-filter_hw_device vadev \
-vf "format=nv12,hwupload" \
-c:v h264_vaapi \
-profile:v main \
-level:v 4.1 \
-q 4 \
<output_file>
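The value passed to -profile:v has to correspond to one of the H.264 profiles that vainfo reports with an encode entrypoint. As a hedged sketch, the mapping from the VA-API profile names in the listing to the profile names accepted by h264_vaapi could be written as a small shell function. The function name va_profile_to_ffmpeg is hypothetical, and only the three H.264 profiles from the listing above are covered.

```shell
#!/bin/sh
# va_profile_to_ffmpeg: hypothetical helper (an assumption for illustration).
# Maps a VA-API H.264 profile name, as printed by vainfo, to the matching
# value for the h264_vaapi -profile:v option. Any other name prints an error
# on stderr and returns a non-zero status.
va_profile_to_ffmpeg() {
    case "$1" in
        VAProfileH264ConstrainedBaseline) echo constrained_baseline ;;
        VAProfileH264Main)                echo main ;;
        VAProfileH264High)                echo high ;;
        *) echo "unsupported profile: $1" >&2; return 1 ;;
    esac
}
```

For instance, `va_profile_to_ffmpeg VAProfileH264High` prints `high`, which could replace `main` in the command above on hardware that reports an encode entrypoint for that profile.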