
Encoding Settings

Advanced configuration options for the libx264 video codec

x264 ffmpeg mapping and options guide

This guide maps most of x264's options to FFmpeg's options along with detailed descriptions by x264 developer Dark_Shikari.

FFmpeg developer superdump has implemented x264 presets to FFmpeg. You can find his guide here.

Frame-type options:

g <integer>
Keyframe interval, also known as GOP length. This determines the maximum distance between I-frames. Very high GOP lengths will result in slightly more efficient compression, but will make seeking in the video somewhat more difficult. Recommended default: 250


    keyint_min <integer>
    Minimum GOP length, the minimum distance between I-frames. Recommended default: 25


    sc_threshold <integer>
    Adjusts the sensitivity of x264's scenecut detection. Rarely needs to be adjusted. Recommended default: 40


bf <integer>
B-frames are a core element of H.264 and are more efficient in H.264 than in any previous standard. Some specific targets, such as HD-DVD and Blu-ray, limit the number of consecutive B-frames. Most do not; as a result, there is rarely any negative effect to setting this to the maximum (16), since x264 will, if B-adapt is used, automatically choose the best number of B-frames anyway. This parameter simply limits the maximum number of B-frames. Recommended default: 16

Notes:

- Baseline Profile, such as that used by iPods, does not support B-frames.

- If you want to generate H.264 Baseline for older phones such as the iPhone 3, do not use advanced parameters like B-frames, or the output will be generated as Main Profile.


    b_strategy <integer>
    x264, by default, adaptively decides through a low-resolution lookahead the best number of B-frames to use. It is possible to disable this adaptivity; this is not recommended. Recommended default: 1

0: Very fast, but not recommended. Does not work with pre-scenecut (scenecut must be off to force off b-adapt).

1: Fast, default mode in x264. A good balance between speed and quality.

2: A much slower but more accurate B-frame decision mode that correctly detects fades and generally gives considerably better quality. It gets considerably slower at high bframes values, so it's recommended to keep bframes relatively low (perhaps around 3) when using this option. It may also slow down the first pass of x264 in threaded mode.


    bframebias 
    Make x264 more likely to choose higher numbers of B-frames during the adaptive lookahead. Not generally recommended. Recommended default: 0


    flags2 +bpyramid
    Allows B-frames to be kept as references. The name is technically misleading, as x264 does not actually use pyramid coding; it simply adds B-references to the normal reference list. B-references get a quantizer halfway between that of a B-frame and P-frame. This setting is generally beneficial, but it increases the DPB (decoding picture buffer) size required for playback, so when encoding for hardware, disabling it may help compatibility.


    coder
    CABAC is the default entropy encoder used by x264. Though somewhat slower on both the decoding and encoding end, it offers 10-15% improved compression on live-action sources and considerably higher improvements on animated sources, especially at low bitrates. It is also required for the use of trellis quantization. Disabling CABAC may somewhat improve decoding performance, especially at high bitrates. CABAC is not allowed in Baseline Profile. Recommended default: -coder 1 (CABAC enabled)


    refs <integer>
    One of H.264's most useful features is the ability to reference frames other than the one immediately prior to the current frame. This parameter specifies how many references can be used, up to a maximum of 16. Increasing the number of refs increases the DPB (Decoded Picture Buffer) requirement, which means hardware playback devices will often have strict limits on the number of refs they can handle. In live-action sources, more references have limited use beyond 4-8, but in cartoon sources up to the maximum value of 16 is often useful. More reference frames require more processing power because every frame is searched by the motion search (except when an early skip decision is made). The slowdown is especially apparent with slower motion estimation methods. Recommended default: -refs 6


    flags +loop
    Enables the in-loop deblocking filter; -flags -loop disables it. Recommended default: -flags +loop (Enabled)

    deblockalpha <integer>
    deblockbeta <integer>
    One of H.264's main features is the in-loop deblocker, which avoids the problem of blocking artifacts disrupting motion estimation. This requires a small amount of decoding CPU, but considerably increases quality in nearly all cases. Its strength may be raised to avoid more artifacts or lowered to keep more detail. Deblock has two parameters: alpha (strength) and beta (threshold). Recommended defaults: -deblockalpha 0 -deblockbeta 0 (requires '-flags +loop')

Ratecontrol:


    cqp <integer>
    Constant quantizer mode. Not strictly constant: B-frames and I-frames have different quantizers from P-frames. Generally should not be used, since CRF gives better quality at the same bitrate.


    b <integer>
    Enables target bitrate mode. Attempts to reach a specific bitrate. Should be used in 2-pass mode whenever possible; 1-pass bitrate mode is generally the worst ratecontrol mode x264 has.


    crf <float>
    Constant quality mode (also known as constant ratefactor). Bitrate corresponds approximately to that of constant quantizer, but gives better quality overall at little speed cost. The best one-pass option in x264.


    maxrate <integer>
    Specifies the maximum bitrate at any point in the video. Requires the VBV buffersize to be set. This option is generally used when encoding for a piece of hardware with bitrate limitations.


    bufsize <integer>
    VBV buffer size. The appropriate value depends on the profile level of the video being encoded. Set only if you're encoding for a hardware device.


    rc_init_occupancy <float>
    Initial VBV buffer occupancy. Note: Don't mess with this.


    qmin <integer>
    Minimum quantizer. Doesn't need to be changed. Recommended default: -qmin 10


    qmax <integer>
    Maximum quantizer. Doesn't need to be changed. Recommended default: -qmax 51


    qdiff <integer>
    Set max QP step. Recommended default: -qdiff 4


    bt <float>
    Allowed variance of average bitrate


    i_qfactor <float>
    Qscale difference between I-frames and P-frames. Note: -i_qfactor is handled a little differently than --ipratio. Recommended: -i_qfactor 0.71

    b_qfactor <float>
    Qscale difference between P-frames and B-frames.

    chromaoffset <integer>
    QP difference between chroma and luma.


    pass <1,2,3>
    Used with --bitrate. Pass 1 writes the stats file, pass 2 reads it, and 3 both reads and writes it. If you want to use three pass, this means you will have to use --pass 1 for the first pass, --pass 3 for the second, and --pass 2 or 3 for the third.
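
The pass numbering above can be confusing for more than two passes, so here is a minimal sketch that lists the pass value for each successive encode. The helper name `pass_sequence` is hypothetical, and the final pass is shown as 2 (the text notes that 3 would also work there).

```python
from typing import List


def pass_sequence(total_passes: int) -> List[int]:
    """Return the pass value to use on each successive encode."""
    if total_passes < 2:
        raise ValueError("multi-pass encoding needs at least two passes")
    # Pass 1 only writes the stats file, every middle pass reads and
    # rewrites it (3), and the final pass only needs to read it (2).
    return [1] + [3] * (total_passes - 2) + [2]


print(pass_sequence(2))  # [1, 2]
print(pass_sequence(3))  # [1, 3, 2]
```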

    rc_eq <string>
    Ratecontrol equation. Recommended default: -rc_eq 'blurCplx^(1-qComp)'

    qcomp <float>
    QP curve compression: 0.0 => CBR, 1.0 => CQP. Recommended default: -qcomp 0.60
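
To see why qcomp runs from CBR to CQP, it can help to evaluate the default rc_eq expression for a few complexity values. This is only a sketch of the formula quoted above, not x264's ratecontrol code; the function name and sample complexities are made up, and reading the result as a relative quantizer scale is an assumption.

```python
def relative_qscale(blur_cplx: float, qcomp: float) -> float:
    """Evaluate blurCplx^(1-qComp) for a single frame."""
    return blur_cplx ** (1.0 - qcomp)


# Hypothetical per-frame complexities: easy, medium, hard.
complexities = [1.0, 4.0, 16.0]

for qcomp in (0.0, 0.6, 1.0):
    values = [round(relative_qscale(c, qcomp), 2) for c in complexities]
    print(f"qcomp={qcomp}: {values}")
```

With qcomp = 1.0 the exponent is zero, so every frame gets the same value (constant quantizer). With qcomp = 0.0 the value scales directly with complexity, which pushes each frame toward the same number of bits (constant bitrate).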


    complexityblur <float>
    Reduce fluctuations in QP (before curve compression). Default: 20.0


    qblur <float>
    Reduce fluctuations in QP (after curve compression). Default: 0.5


    partitions <string>
    One of H.264's most useful features is the ability to choose among many combinations of inter and intra partitions. P-macroblocks can be subdivided into 16x8, 8x16, 8x8, 4x8, 8x4, and 4x4 partitions. B-macroblocks can be divided into 16x8, 8x16, and 8x8 partitions. I-macroblocks can be divided into 4x4 or 8x8 partitions. Analyzing more partition options improves quality at the cost of speed. The default is to analyze all partitions except p4x4 (p8x8, i8x8, i4x4, b8x8), since p4x4 is not particularly useful except at high bitrates and lower resolutions. Note that i8x8 requires 8x8dct, and is therefore a High Profile-only partition. p8x8 is the most costly, speed-wise, of the partitions, but also gives the most benefit. Generally, whenever possible, all partition types except p4x4 should be used.

p8x8 (x264) / +partp8x8 (FFmpeg)

p4x4 (x264) / +partp4x4 (FFmpeg)

b8x8 (x264) / +partb8x8 (FFmpeg)

i8x8 (x264) / +parti8x8 (FFmpeg)

i4x4 (x264) / +parti4x4 (FFmpeg)
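
As a small illustration of the x264-to-FFmpeg name mapping for partitions, the sketch below translates a list of x264 partition names into the corresponding FFmpeg flag names listed in this guide. Joining the flags into one string is an assumption about how the value is passed to -partitions; check it against your FFmpeg version.

```python
# Mapping taken from the partition list in this guide.
X264_TO_FFMPEG = {
    "p8x8": "+partp8x8",
    "p4x4": "+partp4x4",
    "b8x8": "+partb8x8",
    "i8x8": "+parti8x8",
    "i4x4": "+parti4x4",
}


def ffmpeg_partitions(x264_names):
    """Translate x264 partition names into one FFmpeg flag string."""
    return "".join(X264_TO_FFMPEG[name] for name in x264_names)


# The default set described above: everything except p4x4.
default_set = ["p8x8", "i8x8", "i4x4", "b8x8"]
print(ffmpeg_partitions(default_set))  # +partp8x8+parti8x8+parti4x4+partb8x8
```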


    directpred <integer>
    B-frames in H.264 can choose between spatial and temporal prediction mode. Auto allows x264 to pick the best of these; the heuristic used is whichever mode allows more skip macroblocks. Auto should generally be used.

    flags2 +wpred
    This allows B-frames to use weighted prediction options other than the default. There is no real speed cost for this, so it should always be enabled.


    me_method <epzs,hex,umh,full>
    One of the most important settings for x264, both speed and quality-wise.

dia (x264) / epzs (FFmpeg) is the simplest search. It starts at the best predictor, checks the motion vectors one pixel up, left, down, and right, picks the best, and repeats the process until it no longer finds a better motion vector.

hex (x264) / hex (FFmpeg) uses a similar strategy, except that it performs a range-2 search of 6 surrounding points, hence the name. It is considerably more efficient than DIA and hardly any slower, and therefore makes a good choice for general-use encoding.

umh (x264) / umh (FFmpeg) is considerably slower than HEX, but searches a complex multi-hexagon pattern in order to avoid missing harder-to-find motion vectors. Unlike HEX and DIA, the merange parameter directly controls UMH's search radius, allowing one to increase or decrease the size of the wide search.

esa (x264) / full (FFmpeg) is a highly optimized intelligent search of the entire motion search space within merange of the best predictor. It is mathematically equivalent to the brute-force method of searching every single motion vector in that area, though faster. However, it is still considerably slower than UMH, with not much benefit, so it is not particularly useful for everyday encoding.
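
The dia search described in this section is essentially a greedy local search. The sketch below shows that idea on a toy problem; it is not x264's implementation. The cost function and the "true" motion vector are hypothetical stand-ins for a real block-matching cost.

```python
from typing import Callable, Tuple

Vector = Tuple[int, int]


def diamond_search(cost: Callable[[Vector], float], start: Vector) -> Vector:
    """Step to the cheapest of the 4 one-pixel neighbours until none improves."""
    best, best_cost = start, cost(start)
    while True:
        x, y = best
        neighbours = [(x, y - 1), (x - 1, y), (x, y + 1), (x + 1, y)]
        candidate = min(neighbours, key=cost)
        candidate_cost = cost(candidate)
        if candidate_cost >= best_cost:
            return best  # no neighbour is better: stop here
        best, best_cost = candidate, candidate_cost


def toy_cost(mv: Vector) -> float:
    """Hypothetical cost: squared distance to a 'true' vector of (3, -2)."""
    return (mv[0] - 3) ** 2 + (mv[1] + 2) ** 2


print(diamond_search(toy_cost, (0, 0)))  # (3, -2)
```

On a smooth cost like this one the search always reaches the minimum. On real footage it can stop at a local minimum, which is the weakness the wider hex and umh patterns are meant to reduce.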


    me_range <integer>
    MErange controls the max range of the motion search. For HEX and DIA, this is clamped to between 4 and 16, with a default of 16. For UMH and ESA, it can be increased beyond the default 16 to allow for a wider-range motion search, which is useful on HD footage and for high-motion footage. Note that for UMH and ESA, increasing MErange will significantly slow down encoding.
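
A minimal sketch of the clamping rule just described, using the x264 method names. The helper name is hypothetical, and the function only mirrors the text above rather than reading any value from x264 itself.

```python
def effective_me_range(method: str, requested: int) -> int:
    """Return the search range that would actually apply for a method."""
    if method in ("dia", "hex"):
        # For DIA and HEX the range is clamped to between 4 and 16.
        return max(4, min(16, requested))
    # UMH and ESA honour larger values, at a significant speed cost.
    return requested


print(effective_me_range("hex", 32))  # 16
print(effective_me_range("dia", 2))   # 4
print(effective_me_range("umh", 32))  # 32
```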


    subq <integer>
    An extremely important encoding parameter which determines what algorithms are used for both subpixel motion searching and partition decision.

1: Fastest, but extremely low quality. Should be avoided except on first pass encoding.

2-5: Progressively better and slower; 5 serves as a good medium for higher speed encoding.

6-7: 6 is the default. Activates rate-distortion optimization for partition decision. This can considerably improve efficiency, though it has a notable speed cost. 6 activates it in I/P frames, and 7 activates it in B-frames.

8-9: Activates rate-distortion refinement, which uses RDO to refine both motion vectors and intra prediction modes. Slower than subme 6, but again, more efficient.

    flags2 +mixed_refs
    H.264 allows p8x8 blocks to select different references for each p8x8 block. This option allows this analysis to be done, and boosts quality with little speed impact. It should generally be used, though it obviously has no effect with only one reference frame.


    flags2 +dct8x8
    Gives a notable quality boost by allowing x264 to choose between 8x8 and 4x4 frequency transform size. Required for i8x8 partitions. Speed cost for this option is near-zero both for encoding and decoding; the only reason to disable it is when one needs support on a device not compatible with High Profile.


    trellis <0,1,2>
    The main decision made in quantization is which coefficients to round up and which to round down. Trellis chooses the optimal rounding choices for the maximum rate-distortion score, to maximize PSNR relative to bitrate. This generally increases quality relative to bitrate by about 5% for a somewhat small speed cost. It should generally be enabled. Note that trellis requires CABAC.

0: disabled

1: enabled only on the final encode of a MB

2: enabled on all mode decisions


flags2 -fastpskip
By default, x264 will skip macroblocks in P-frames that don't appear to have changed enough between two frames to justify encoding the difference. This considerably speeds up encoding. However, for a slight quality boost, fast P-skip can be disabled. In this case, the full analysis will be done on all P-blocks, and the only skips in the output stream will be the blocks whose motion vectors happen to match that of the skip vector and which have no residual. The speed cost of enabling no-fast-pskip is relatively high, especially with many reference frames. There is a similar B-skip internal to x264, which is why B-frames generally encode much faster than P-frames, but it cannot be disabled on the command line.


What is the difference between CBR and VBR encoding?

Constant bit rate (CBR) encoding holds the data rate at the value you set over the whole video clip. Use CBR only if your clip contains a similar motion level across its entire duration. CBR is most commonly used for streaming video content using the Flash Media Server (rtmp).

Variable bit rate (VBR) encoding adjusts the data rate up and down, within the upper limit you set, based on the data required by the compressor. VBR takes longer to encode but produces the most favorable results. VBR is most commonly used for http delivery of video content (http progressive).

We recommend you do not use CBR unless you have a specific need for playback on a device that only supports CBR. Our default VBR mode will produce higher quality at competitive bitrates.

Sign up for a free 1GB of encoding each month and test the difference between CBR and VBR on your own. When using Encoding.com, you can choose CBR or VBR for each encoding job in the API, web interface, or watch folder.

In the web interface or watch folder, after you specify your source video location, click the "Customize" button, expand the "Video Settings" menu, and set CBR to "no" or "yes".

When using the Encoding.com API, set <cbr>no</cbr> or <cbr>yes</cbr> in the <format> section of your API calls.

How to combine multiple video files to a single one

This feature allows you to combine several video files into one file.

 You can use it via the user interface or via the API.

 User Interface

 To use this feature via the user interface, add several video sources in the "Add Media" section. They will automatically be combined into one file (in sequential order) during the encoding process.

 API

 To combine several video sources into one file, use several <source> elements in your XML request.

 For example:

<?xml version="1.0"?>
<query>
<!-- Main fields -->
    <userid>[UserID]</userid>
    <userkey>[UserKey]</userkey>
    <action>[Action]</action>
    <mediaid>[MediaID]</mediaid>
    <source>[SourceFile]</source>
    <source>[SourceFile1]</source> <!-- if multiple SourceFile added, they will be concatenated -->
    ...
    <source>[SourceFileN]</source>
    <format>
      [FormatFields]
    </format>
</query>
SourceFile1..SourceFileN: you can specify several source files. All of them will be combined into one file (in sequential order) during the encoding process. You can use different source URLs (HTTP, FTP, S3, CloudFiles) with different media properties (format, frame size, bitrate, codec, etc.).
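
If you generate these requests programmatically, a sketch along the following lines builds the same structure with one <source> element per input, in order. It uses only Python's standard library; the function name, the example.com URLs, and the bracketed credentials are placeholders, and the optional <mediaid> field is left out for brevity.

```python
import xml.etree.ElementTree as ET


def build_concat_query(user_id, user_key, action, sources, format_fields):
    """Build a <query> document with one <source> per input file."""
    query = ET.Element("query")
    ET.SubElement(query, "userid").text = user_id
    ET.SubElement(query, "userkey").text = user_key
    ET.SubElement(query, "action").text = action
    # Sources are emitted in list order, which is the concatenation order.
    for url in sources:
        ET.SubElement(query, "source").text = url
    fmt = ET.SubElement(query, "format")
    for name, value in format_fields.items():
        ET.SubElement(fmt, name).text = value
    return ET.tostring(query, encoding="unicode")


# Placeholder values only; substitute your own credentials and URLs.
print(build_concat_query(
    "[UserID]", "[UserKey]", "[Action]",
    ["http://example.com/part1.mp4", "http://example.com/part2.mp4"],
    {"cbr": "no"},
))
```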

Understanding bitrates in video files

We often field questions from customers about how bitrate relates to both quality and the total size of the file. This can be confusing to people new to encoding, so I'll try to cover the key points here.

Generally, the higher the bitrate, the higher the image quality of the video output. Modern codecs like H.264 will look noticeably better at the same bitrate than older codecs like H.263, and variable bitrate (VBR) will look better than constant bitrate (CBR) in most applications.

Keep in mind that there are 8 bits in a byte, so 1 megabyte per second is 8 megabits per second (Mbps). For reference, HD Blu-ray video is generally around 20 Mbps, standard-definition DVD around 6 Mbps, high-quality web video about 2 Mbps, and video for phones in the kilobit range (kbps).

Here is the math from testing VP6 output for a video with a duration of 93 seconds:

On2 Flix VP6 = 2,080 kbytes x 8 = 16,640 kbits / 93 secs = 179 kbits/sec
ffmpeg VP6 = 3,051 kbytes x 8 = 24,408 kbits / 93 secs = 262 kbits/sec

For everyday use, there are a few different tools for detecting bitrate and codecs:

MediaInfo is a nice basic tool for quickly seeing all the stats on a video file. http://mediainfo.sourceforge.net/en

For MacOS, you can use the Inspector window in QuickTime Player. I strongly recommend having the Perian codec pack to read non-native codecs. http://www.perian.org/

For analyzing Blu-ray Discs, see BDInfo for Windows. http://www.cinemasquid.com/blu-ray/tools/bdinfo/
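
The VP6 arithmetic above is easy to reproduce for your own files. Here is a minimal sketch, with a hypothetical helper name, that turns a file size in kilobytes and a duration in seconds into an average bitrate:

```python
def average_bitrate_kbps(size_kbytes: float, duration_s: float) -> float:
    """Average bitrate in kbits/sec: kilobytes x 8 bits, divided by seconds."""
    return size_kbytes * 8 / duration_s


print(round(average_bitrate_kbps(2080, 93)))  # 179 (On2 Flix VP6)
print(round(average_bitrate_kbps(3051, 93)))  # 262 (ffmpeg VP6)
```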


Suggestions for improving quality with H.264 settings

Since users often will be uploading a wide variety of videos, I generally like to break them down into two types:

Static/Low Action - stable tripod shots, very little background movement, actors standing still 
Active/High Action - panning/jerky camera, lots of action, sports-like movement

A good starting point is to choose a variable bitrate setting equal to the width of the video. So, for example: 640x480 SD at 640 kilobits per sec (kbps), or 1280x720 HD at 1280 kbps. Higher action video may require a slightly higher bitrate to prevent blocking artifacts.
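
As a rough sketch of this rule of thumb, the snippet below derives the starting bitrate from the frame width and estimates the resulting file size for a given duration. Both helper names are hypothetical, and the estimate covers the video stream only (no audio or container overhead).

```python
def starting_bitrate_kbps(width: int) -> int:
    """Rule of thumb from the text: starting VBR bitrate equals the width."""
    return width


def estimated_size_kbytes(bitrate_kbps: float, duration_s: float) -> float:
    """Approximate video size: kbits/sec x seconds, divided by 8 bits."""
    return bitrate_kbps * duration_s / 8


for width in (640, 1280):
    kbps = starting_bitrate_kbps(width)
    # Estimated size of a 60 second clip at the starting bitrate.
    print(width, kbps, estimated_size_kbytes(kbps, 60))
```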

Noise Reduction

Applying some noise reduction is useful to save bits for high detail regions, but be careful not to overdo it. I've seen video samples where whole areas of ocean and grassy fields disappear to achieve a lower bitrate. This, of course, falls under artistic preference, but generally I'd rather see a smaller framesize and more detail. At low bitrates, it is increasingly important to improve the quality of noisy video sources, such as film containing lots of grain or video shot in low light. The 3D noise reduction in ffmpeg allows control over both luma and chroma values for fine-tuning your output image quality.

	<noise_reduction>4:3:6</noise_reduction> 

luma_spatial – Spatial Luma Strength. Allowed values: [0,255]
chroma_spatial – Spatial Chroma Strength. Allowed values: [0,255]
luma_temp – Temporal Luma Strength. Allowed values: [0,255]

General recommended starting value is 4:3:6. [luma_spatial:chroma_spatial:luma_temp] Noise reduction is also available via our web interface as the High Quality 3D Denoiser option.
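
Before sending a noise_reduction value to the API, it can be worth checking it locally. The following is a sketch of such a check based only on the format and ranges listed above; the function name is hypothetical and this is not part of any official client.

```python
from typing import Tuple


def parse_noise_reduction(value: str) -> Tuple[int, int, int]:
    """Parse 'luma_spatial:chroma_spatial:luma_temp' and check each range."""
    parts = value.split(":")
    if len(parts) != 3:
        raise ValueError("expected luma_spatial:chroma_spatial:luma_temp")
    strengths = tuple(int(part) for part in parts)
    names = ("luma_spatial", "chroma_spatial", "luma_temp")
    for name, strength in zip(names, strengths):
        # Each strength is documented as an integer in [0,255].
        if not 0 <= strength <= 255:
            raise ValueError(f"{name} must be in [0,255], got {strength}")
    return strengths


print(parse_noise_reduction("4:3:6"))  # (4, 3, 6)
```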

 

Single Pass vs. Two Pass

For most purposes, 2-pass encoding achieves very good results. It's a tradeoff of diminishing returns: 2-pass gains perhaps 10% quality bit-for-bit but doubles the encoding time. Do not lower qcomp; CBR is horrible on quality. I'd experiment with values between 0.60 and 0.80 if you want more VBR.

If qcomp = 1.00, the quantizer is constant for the second pass: real variable bitrate with constant quality.
If qcomp = 0.00, the bitrate is constant for the second pass: real constant bitrate with variable quality.

<two_pass>yes</two_pass>

I'd recommend having two sample videos, asking your users to choose a Low or High Action content setting, experimenting a bit with your B-frames, then defining two "baseline" settings for each bitrate. For web video, it's best to narrow your targets to four different bitrates at most, especially if you are going to be processing uploads from thousands of users. Most folks have a slow (up to 240 kbps), good (~700 kbps), or fast (2 Mbps and higher) connection. I'd say H.264 over 2 Mbps is generally overkill for website content. For general purposes, I'd recommend 2-pass, and we push a 10 second keyframe interval (300 frames at 30 fps), which may not be appropriate for "high-action" source video.

For more detail on H.264 controls for scenecut thresholds, B-frames, and more, please refer to:

Advanced H.264 Guide http://sites.google.com/site/linuxencoding/x264-ffmpeg-mapping

H.264 parameters for our API http://www.encoding.com/help/article/advanced_configuration_options_for_the_libx264_video_codec

An excellent collection of HD videos at 2mbits/sec can be found at http://californiaisaplace.com/cali/ 

 

16x16 Macroblocks

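H.264/AVC does a much more efficient job when the horizontal and vertical framesize dimensions are multiples of 16. Commonly used frame sizes include:

SD (4:3) aspect ratios: 320x240, 432x320, 480x360, 544x400, 640x480, 768x576
HD (16:9) aspect ratios: 432x240, 576x320, 640x360, 720x400, 848x480, 1024x576, 1280x720, 1920x1080

Note that 360 and 1080 are not exact multiples of 16 (the encoder pads them to 368 and 1088 lines and crops on playback), and a few of the sizes above only approximate their nominal aspect ratio.

In 4:2:0 H.264/AVC coding, every 2x2 group of pixels has 4 luminance samples (Y) and shares 1 blue-difference sample (Cb) and 1 red-difference sample (Cr), so a 16x16 macroblock carries a 16x16 luma block plus two 8x8 chroma blocks. Modern video decoding chips (GPUs) are optimized for playback of 16x16 macroblocks.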
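
A quick way to check a frame size against the multiple-of-16 guideline is sketched below. The helper names are hypothetical; `pad_to_mod16` simply rounds a dimension up to the next multiple of 16, which is the size the encoder has to code in whole macroblocks.

```python
def is_mod16(width: int, height: int) -> bool:
    """True when both dimensions are exact multiples of 16."""
    return width % 16 == 0 and height % 16 == 0


def pad_to_mod16(dimension: int) -> int:
    """Round a dimension up to the next multiple of 16."""
    return (dimension + 15) // 16 * 16


print(is_mod16(1280, 720))   # True
print(is_mod16(1920, 1080))  # False
print(pad_to_mod16(1080))    # 1088
print(pad_to_mod16(360))     # 368
```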

 

Keyframes and GOPs

Low action scenes generally handle more bidirectional frames (B-frames) better since they don't have to track interframe motion as aggressively. Higher action content will require more keyframes (I-frames) to keep the picture from breaking apart. Longer GOPs with more B-frames also require more buffering by the playback GPU to recursively track the motion for each macroblock. Fortunately, x264 offers very good scene detection, which is why, for most applications, we set the keyframe interval to 300.

H.264/AVC sample for modern mobiles (30 fps with a 10 second GOP)

	<framerate>30</framerate>
	<keyframe>300</keyframe>

For older computers, and early generations of iPod and Blackberry phones, the chips might not have enough processing power and memory to successfully buffer longer GOPs. Keep your bitrates low, try lower framerates, and shorter GOPs.

H.264/AVC sample for older mobiles (15 fps with a 4 second GOP)

	<framerate>15</framerate>
	<keyframe>60</keyframe>
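
Both samples above follow the same arithmetic: the keyframe interval in frames is the framerate multiplied by the GOP length in seconds. A minimal sketch, with hypothetical helper names, that produces the two settings:

```python
def keyframe_interval(framerate: float, gop_seconds: float) -> int:
    """Keyframe interval in frames for a GOP of the given duration."""
    return round(framerate * gop_seconds)


def gop_settings_xml(framerate: int, gop_seconds: float) -> str:
    """Format the two API fields used in the samples above."""
    interval = keyframe_interval(framerate, gop_seconds)
    return (f"<framerate>{framerate}</framerate>\n"
            f"<keyframe>{interval}</keyframe>")


print(gop_settings_xml(30, 10))  # modern mobiles: keyframe 300
print(gop_settings_xml(15, 4))   # older mobiles: keyframe 60
```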

More information about GOPs available on wiki http://en.wikipedia.org/wiki/Group_of_pictures

 

Turbo Mode

NOTE: For bigger or longer HD encoding jobs, turbo mode is absolutely recommended since you will see speed gains in the neighborhood of 3x faster vs. normal mode. Please be aware turbo is running on more powerful encoders, so it costs an extra $1 per gigabyte.

<turbo>yes</turbo>

 
