h.264

H.263

Upload: ronny72

Post on 17-May-2015


TRANSCRIPT

Page 1: H.264

H.263

Page 2: H.264

Having read through the H.263 specification, the proposal is that support for this format can be summarised very simply as:

vcodec_h263_{Profile}={Level Number}

The {Profile} indicates the decoding capability of the handset. Details can be seen on page 3.

The {Level Number} removes the need to specify maximum video width and height, bitrate and frames per second separately. For a device to support a profile at a given level it should meet the standard for that level. If it doesn't, my suggestion would be to say that the device doesn't support H.263.

H.263 levels are backwards compatible: support of a level implies support of all lower levels. The exception is level 45, which only implies support of level 10.
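As a minimal sketch of how such a capability could be read and expanded, the following parses a `vcodec_h263_{Profile}={Level Number}` string and lists the levels that the stated level implies. The function names and the assumption that the profile is written as a number (0, 3, 4, ...) are mine and are not part of WURFL.

```python
import re

# Levels defined in H.263 Annex X, in ascending numeric order.
H263_LEVELS = [10, 20, 30, 40, 45, 50, 60, 70]

# Assumed spelling of the proposed capability, e.g. "vcodec_h263_0=30".
CAPABILITY = re.compile(r"^vcodec_h263_(\d+)=(\d+)$")


def parse_capability(text):
    """Return (profile, level) from a 'vcodec_h263_{Profile}={Level Number}' string."""
    match = CAPABILITY.match(text.strip())
    if match is None:
        raise ValueError(f"not an H.263 capability: {text!r}")
    profile, level = int(match.group(1)), int(match.group(2))
    if level not in H263_LEVELS:
        raise ValueError(f"unknown H.263 level: {level}")
    return profile, level


def implied_levels(level):
    """Levels a decoder must handle if it claims support for `level`."""
    if level == 45:
        # Level 45 is the exception: it only implies level 10.
        return [10, 45]
    # Any other level implies all lower levels.
    return [lvl for lvl in H263_LEVELS if lvl <= level]
```

With this, a device declaring `vcodec_h263_0=30` would be treated as supporting levels 10, 20 and 30 of Profile 0 without listing each one.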

Page 3: H.264

H.263 Version AKAs

H.263v1: H.263, H.263_1995

H.263v2: H.263+, H.263_1998

H.263v3: H.263++, H.263_2000

http://www.itu.int/rec/T-REC-H.263-200501-I/en

Page 4: H.264

http://www.itu.int/rec/T-REC-H.263-200501-I/en

Page 5: H.264

http://www.itu.int/rec/T-REC-H.263-200501-I/en

Page 6: H.264

X.1 Scope

With the variety of optional modes available in this Recommendation, it is crucial that several preferred mode combinations for operation be defined, so that option-enhanced terminals will have a high probability of connecting to each other using some syntax better than the "baseline". This annex contains a list of preferred feature combinations, which are structured into "profiles" of support. It also defines some groupings of maximum performance parameters as "levels" of support for these profiles. The primary objectives of this annex are:

1) to provide a simple means of describing or negotiating the capabilities of a decoder (by specifying profile and level parameters);

2) to encourage common enhancement features to be supported in decoders for achieving maximal interoperability; and

3) to describe feature sets chosen as particularly appropriate for addressing certain key applications.

The profiles and levels are defined in the following clauses and in Tables X.1 and X.2. The minimum picture interval as specified in Table X.2 is the minimum difference in time between the decoding of consecutive pictures in the bitstream. Support of any level other than level 45 implies support of all lower levels. Support of level 45 implies support of level 10.

Page 7: H.264

X.2.1 The Baseline Profile (Profile 0)

The Baseline Profile, designated as Profile 0, is defined herein to provide a profile designation for the minimal "baseline" capability of this Recommendation. "Baseline" refers to the syntax of this Recommendation with no optional modes of operation. This profile of support is composed of only the baseline design.

X.2.4 Version 2 Interactive and Streaming Wireless Profile (Profile 3)

The Version 2 Interactive and Streaming Wireless Profile, designated as Profile 3, is defined herein to provide enhanced coding efficiency performance and enhanced error resilience for delivery to wireless devices within the feature set available in the second version of this Recommendation (which did not include Annexes U, V, and W). This profile of support is composed of the baseline design plus the following modes:

1) Advanced INTRA Coding (Annex I) − See X.2.2 item 1.

2) Deblocking Filter (Annex J) − See X.2.2 item 2.

3) Slice Structured Mode (Annex K) − The Slice Structured mode is included here due to its enhanced ability to provide resynchronization points within the video bitstream for recovery from erroneous or lost data. Support for the Arbitrary Slice Ordering (ASO) and Rectangular Slice (RS) submodes of the Slice Structured mode are not included in this profile, in order to limit the complexity requirements of the decoder. The additional computational burden imposed by the Slice Structured mode is minimal, limited primarily to bitstream generation and parsing.

4) Modified Quantization (Annex T) − See X.2.2 item 4.

X.2.5 Version 3 Interactive and Streaming Wireless Profile (Profile 4)

The Version 3 Interactive and Streaming Wireless Profile, designated as Profile 4, is defined herein to provide enhanced coding efficiency performance and enhanced error resilience for delivery to wireless devices, while taking advantage of the enhanced features of the third version of this Recommendation. This profile of support is composed of the baseline design, plus the following additional features as follows:

1) Profile 3 − This feature set provides several enhancements useful for support of wireless video transmission.

2) Data Partitioned Slice Mode (Annex V) − This feature enhances error resilience performance by separating motion vector data from DCT coefficient data within slices, and protects the motion vector information (the most important part of the detailed macroblock data) by using reversible variable-length coding. Support of the Arbitrary Slice Ordering (ASO) and Rectangular Slice (RS) submodes are not included in this profile, in order to limit the complexity requirements of the decoder.

3) Previous Picture Header Repetition Supplemental Enhancement Information (Annex W, clause W.6.3.8) − This feature allows the decoder to receive and recover the header information from a previous picture in case of data loss or corruption.

Page 8: H.264

X.4 Levels of performance capability

Eight levels of performance capability are defined for decoder implementation. The Hypothetical Reference Decoder has the minimal size specified in Table X.1 for all levels of Profiles 0 through 4.

In Profiles 5 through 8 the Hypothetical Reference Decoder has an increased size and Enhanced Reference Picture Selection is supported with multiple reference pictures. Table X.2 defines the detailed performance parameters of each of these levels:

1) Level 10 − Support of QCIF and sub-QCIF resolution decoding, capable of operation with a bit rate up to 64 000 bits per second with a picture decoding rate up to (15 000)/1001 pictures per second.

2) Level 20 − Support of CIF, QCIF and sub-QCIF resolution decoding, capable of operation with a bit rate up to 2·(64 000) = 128 000 bits per second with a picture decoding rate up to (15 000)/1001 pictures per second for CIF pictures and (30 000)/1001 pictures per second for QCIF and sub-QCIF pictures.

3) Level 30 − Support of CIF, QCIF and sub-QCIF resolution decoding, capable of operation with a bit rate up to 6·(64 000) = 384 000 bits per second with a picture decoding rate up to (30 000)/1001 pictures per second.

4) Level 40 − Support of CIF, QCIF and sub-QCIF resolution decoding, capable of operation with a bit rate up to 32·(64 000) = 2 048 000 bits per second with a picture decoding rate up to (30 000)/1001 pictures per second.

4.5) Level 45 − Support of QCIF and sub-QCIF resolution decoding, capable of operation with a bit rate up to 2·(64 000) = 128 000 bits per second with a picture decoding rate up to (15 000)/1001 pictures per second. Additionally, in profiles other than profiles 0 and 2, support of custom picture formats of size QCIF and smaller.

5) Level 50 − Support of custom and standard picture formats of size CIF and smaller, capable of operation with a bit rate up to 64·(64 000) = 4 096 000 bits per second with a picture decoding rate up to 50 pictures per second for CIF or smaller picture formats and up to (60 000)/1001 pictures per second for 352 × 240 and smaller picture formats.

6) Level 60 − Support of custom and standard picture formats of size 720 × 288 and smaller, capable of operation with a bit rate up to 128·(64 000) = 8 192 000 bits per second with a picture decoding rate up to 50 pictures per second for 720 × 288 or smaller picture formats and up to (60 000)/1001 pictures per second for 720 × 240 and smaller picture formats.

7) Level 70 − Support of custom and standard picture formats of size 720 × 576 and smaller, capable of operation with a bit rate up to 256·(64 000) = 16 384 000 bits per second with a picture decoding rate up to 50 pictures per second for 720 × 576 or smaller picture formats and up to (60 000)/1001 pictures per second for 720 × 480 and smaller picture formats.

The bit rate at which a particular profile and level are used in a system shall never exceed that specified in this annex. However, particular systems may include other means to signal further limits on the bit rate. Other aspects of profile and level capabilities may also be subject to additional capability restrictions when used in particular systems, but the capabilities required for decoding any bitstream for a particular profile and level defined herein shall never exceed those specified in this annex.

Source: http://www.itu.int/rec/T-REC-H.263-200501-I/en
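The bit rate limits quoted above are all multiples of 64 000 bit/s, so they can be held in a small lookup. The sketch below is only an illustration of that arithmetic; the names `LEVEL_MULTIPLIER`, `max_bit_rate` and `fits_level` are hypothetical, and the check covers bit rate only (picture size and picture rate are not validated).

```python
from fractions import Fraction

BASE_BIT_RATE = 64_000  # bits per second, the unit used in the level definitions

# Multiplier of 64 000 bit/s for each level, as quoted in the level list.
LEVEL_MULTIPLIER = {10: 1, 20: 2, 30: 6, 40: 32, 45: 2, 50: 64, 60: 128, 70: 256}

# Picture rates are given as exact fractions, e.g. (30 000)/1001 pictures per second.
RATE_15 = Fraction(15_000, 1001)
RATE_30 = Fraction(30_000, 1001)


def max_bit_rate(level):
    """Maximum bit rate in bits per second for an H.263 level."""
    return LEVEL_MULTIPLIER[level] * BASE_BIT_RATE


def fits_level(level, bit_rate):
    """True if a stream's bit rate is within the limit of the given level."""
    return bit_rate <= max_bit_rate(level)
```

For example, `max_bit_rate(30)` gives 384 000 bit/s, and (30 000)/1001 works out to roughly 29.97 pictures per second.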

Page 9: H.264

Mpeg4

Page 10: H.264

Having read through the MPEG-4 specifications, the proposal is that support for this format can be summarised very simply as:

vcodec_mpeg4_{Profile}={Level Number}

The {Profile} is likely to be SP, which represents the MPEG-4 Part 2* Visual Simple Profile. We may want to consider adding ASP at a later date (unlikely, due to H.264).

The level number will reduce the need to specify maximum video width, height and bitrate.

MPEG-4 does not define a standard frame rate, however I would recommend that we use the "typical" fps table.

For a device to support a profile and level it should meet the standard, including the typical frame rate. If it doesn't, my suggestion would be to choose a lower level or to state that the device doesn't support MPEG-4.

* MPEG-4 consists of several standards, termed "parts". Part 2 (Visual) is a compression codec for visual data (video, still textures, synthetic images, etc.). The next part that concerns video is MPEG-4 Part 10 (H.264).

Page 11: H.264

http://www.iis.fraunhofer.de/Images/IISMpeg4VideoSoftware_v21-1_tcm97-114945.pdf

Page 12: H.264

H.264

Mpeg-4 Part 2

Page 13: H.264

1. The Simple Profile only accepts objects of type Simple, and was created with low complexity applications in mind. The first usage is mobile use of (audio)visual services, and the second is putting very low complexity video on the Internet. Also small camera devices recording moving video to, e.g., disk or memory chips, can make good use of this profile. It supports up to four objects in the scene with, at the lowest level, a maximum total surface of a QCIF picture. There are 3 levels for the Simple Profile with bitrates from 64 to 384 kbit/s. The levels also define the maximum total surface for the objects and the number of macroblocks per second that the decoder needs to be able to decode. Further, they define the size of various (hypothetical) buffers needed for decoding. While the maximum total object size is defined, the aspect ratio is not prescribed. This gives maximum creative freedom. It could be used for instance in a personal computer screen, where a very wide or a very tall object could be created, or several smaller objects in various places on the screen, not confined to a typical QCIF area. The same level philosophy is followed for restricting the complexity of the natural video objects in all the visual profiles.

http://www.chiariglione.org/mpeg/faq/mp4-vid/mp4-vid.htm

Page 14: H.264

Simple Profile (SP)

Parameter                         L0      L0b     L1       L2                  L3
Max. Bitrate (kbit/s)             64      128     64       128                 384
Max. Buffer (kbit)                160     320     160      640                 640
Max. Delay @ max. Bitrate (sec)   2.5     2.5     2.5      5                   1.66
Max. VP Size (bit)                2048    2048    2048     4096                8192
Max. VOP Size (MB)                99      99      99       396                 396
Max. Decoder Rate (MB/s)          1485    1485    1485     5940                11880
Max. Framesize @ 30Hz             -       -       128x96   256x192             CIF
Max. Framesize @ 25Hz             -       -       144x96   304x192 / 288x208   CIF
Max. Framesize @ 24Hz             -       -       160x96   304x208             CIF
Max. Framesize @ 15Hz             QCIF    QCIF    QCIF     CIF                 CIF
Max. Framesize @ 12.5Hz           QCIF    QCIF    QCIF     CIF                 CIF

Advanced Simple Profile (ASP)

Parameter                         L0      L1      L2                  L3      L3b     L4                  L5
Max. Bitrate (kbit/s)             128     128     384                 768     1500    3000                8000
Max. Buffer (kbit)                160     160     640                 640     1040    1280                1792
Max. Delay @ max. Bitrate (sec)   1.25    1.25    1.66                0.86    0.69    0.43                0.22
Max. VP Size (bit)                2048    2048    4096                4096    4096    8192                16384
Max. VOP Size (MB)                99      99      396                 396     396     792                 1620
Max. Decoder Rate (MB/s)          2970    2970    5940                11880   11880   23760               48600
Max. Framesize @ 30Hz             QCIF    QCIF    256x192             CIF     CIF     352x576 / 704x288   720x576
Max. Framesize @ 25Hz             QCIF    QCIF    304x192 / 288x208   CIF     CIF     352x576 / 704x288   720x576
Max. Framesize @ 24Hz             QCIF    QCIF    304x208             CIF     CIF     352x576 / 704x288   720x576
Max. Framesize @ 15Hz             QCIF    QCIF    CIF                 CIF     CIF     352x576 / 704x288   720x576
Max. Framesize @ 12.5Hz           QCIF    QCIF    CIF                 CIF     CIF     352x576 / 704x288   720x576

http://www.hthoma.de/video/mpeg4_video_tut/index.html
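To show how the level table can stand in for separate width, height and bitrate capabilities, here is a rough sketch that picks the first Simple Profile level whose limits cover a given stream. It uses only the SP bitrate, VOP size and decoder rate figures from the table and the fact that a macroblock is 16x16 pixels. The function names are hypothetical, and buffer and VP size limits are ignored.

```python
import math

# (level, max bitrate in kbit/s, max VOP size in macroblocks, max decoder rate in MB/s)
# Values copied from the Simple Profile columns of the table, in table order.
SP_LEVELS = [
    ("L0", 64, 99, 1485),
    ("L0b", 128, 99, 1485),
    ("L1", 64, 99, 1485),
    ("L2", 128, 396, 5940),
    ("L3", 384, 396, 11880),
]


def macroblocks(width, height):
    """Number of 16x16 macroblocks needed to cover a frame."""
    return math.ceil(width / 16) * math.ceil(height / 16)


def lowest_sp_level(width, height, fps, bitrate_kbps):
    """First SP level (in table order) that covers the stream, or None."""
    mbs = macroblocks(width, height)
    for name, max_kbps, max_vop, max_rate in SP_LEVELS:
        if bitrate_kbps <= max_kbps and mbs <= max_vop and mbs * fps <= max_rate:
            return name
    return None
```

A QCIF frame (176x144) is 11 x 9 = 99 macroblocks, and at 15 fps that is 1485 macroblocks per second, which is exactly the SP L0 limit. That matches the QCIF entry in the 15Hz row.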

Page 15: H.264

http://www.m4if.org/resources/profiles/index.php#MPEG01A

Page 16: H.264

Source: http://docs.real.com/docs/rn/whitepapers/CodecSupport.pdf

http://en.wikipedia.org/wiki/MPEG-4

Part, Number, Title: Description

Part 1, ISO/IEC 14496-1, Systems: Describes synchronization and multiplexing of video and audio. For example Transport stream.

Part 2, ISO/IEC 14496-2, Visual: A compression codec for visual data (video, still textures, synthetic images, etc.). One of the many "profiles" in Part 2 is the Advanced Simple Profile (ASP).

Part 3, ISO/IEC 14496-3, Audio: A set of compression codecs for perceptual coding of audio signals, including some variations of Advanced Audio Coding (AAC) as well as other audio/speech coding tools.

Part 4, ISO/IEC 14496-4, Conformance: Describes procedures for testing conformance to other parts of the standard.

Part 5, ISO/IEC 14496-5, Reference Software: Provides software for demonstrating and clarifying the other parts of the standard.

Part 6, ISO/IEC 14496-6, Delivery Multimedia Integration Framework (DMIF).

Part 7, ISO/IEC 14496-7, Optimized Reference Software: Provides examples of how to make improved implementations (e.g., in relation to Part 5).

Part 8, ISO/IEC 14496-8, Carriage on IP networks: Specifies a method to carry MPEG-4 content on IP networks.

Part 9, ISO/IEC 14496-9, Reference Hardware: Provides hardware designs for demonstrating how to implement the other parts of the standard.

Part 10, ISO/IEC 14496-10, Advanced Video Coding (AVC): A codec for video signals which is technically identical to the ITU-T H.264 standard.

Page 17: H.264

H.264

(AKA - mpeg4 part 10 or Mpeg4 AVC)

Page 18: H.264

Having read through the MPEG-4 Part 10 (H.264) specifications, the proposal is that support for this format can be summarised very simply as:

vcodec_H264_{Profile}={Level Number}

The {Profile} is likely to be BP, which represents the H.264 Baseline Profile. It is unlikely that any other profiles will be required.

The level number will reduce the need to specify maximum video width, height, bitrate and frame rate.

See the table on the next page for profile and level details.

Page 19: H.264

Columns: Level number; Max video bit rate (VCL) for Baseline, Extended and Main Profiles; Examples for high resolution @ frame rate (max stored frames) in Level

1     64 kbit/s    [email protected] (8), [email protected] (4)

1b    128 kbit/s   [email protected] (8), [email protected] (4)

1.1   192 kbit/s   [email protected] (9), [email protected] (3), [email protected] (2)

1.2   384 kbit/s   [email protected] (7), [email protected] (6)

1.3   768 kbit/s   [email protected] (7), [email protected] (6)

2     2 Mbit/s     [email protected] (7), [email protected] (6)

2.1   4 Mbit/s     [email protected] (7), [email protected] (6)

2.2   4 Mbit/s     [email protected] (10), [email protected] (7), [email protected] (6), [email protected] (5)

3     10 Mbit/s    [email protected] (12), [email protected] (10), [email protected] (6), [email protected] (5)

H.264 Levels (AKA Mpeg4 part 10, Mpeg4 AVC)

• source: http://en.wikipedia.org/wiki/H.264/MPEG-4_AVC
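As a sketch only, the bit rate column of the level table can be turned into a lookup that suggests the lowest level for a stream and formats the proposed capability string. The names below are hypothetical. In practice the level also depends on resolution and frame rate, which this example does not check.

```python
# (level, max video bit rate in kbit/s) for Baseline, Extended and Main Profiles,
# copied from the level table above (2 Mbit/s written as 2000 kbit/s, and so on).
H264_MAX_KBPS = [
    ("1", 64),
    ("1b", 128),
    ("1.1", 192),
    ("1.2", 384),
    ("1.3", 768),
    ("2", 2000),
    ("2.1", 4000),
    ("2.2", 4000),
    ("3", 10000),
]


def lowest_h264_level(bitrate_kbps):
    """Lowest level whose bit rate limit covers the stream, or None."""
    for level, max_kbps in H264_MAX_KBPS:
        if bitrate_kbps <= max_kbps:
            return level
    return None


def capability(profile, level):
    """Format the proposed 'vcodec_H264_{Profile}={Level Number}' capability."""
    return f"vcodec_H264_{profile}={level}"
```

For example, a 300 kbit/s Baseline stream would map to `vcodec_H264_BP=1.2` on bit rate alone.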

Profile Description

Baseline Profile (BP) Primarily for lower-cost applications with limited computing resources, this profile is used widely in videoconferencing and mobile applications.

Main Profile (MP) Originally intended as the mainstream consumer profile for broadcast and storage applications, the importance of this profile faded when the High profile was developed for those applications.

Extended Profile (XP) Intended as the streaming video profile, this profile has relatively high compression capability and some extra tricks for robustness to data losses and server stream switching.

Page 20: H.264
Page 21: H.264
Page 22: H.264
Page 23: H.264

WURFL Discussion and Questions Summaries

Do we need to distinguish local and streaming values?

Runar would like confirmation of the requirement to separate local playback from streaming for video support. Rod suggested this was needed for the iPhone (which does progressive download and does not handle streaming). Rod also suggests that streaming has different fps and bitrates.

[Comment] I wonder if this is due to the connection speed of the device. It would be good to know whether these devices support different levels for streaming and local playback. If so, my theories are holding true. I do think we need to specify which formats video streaming supports, e.g.:

Streaming_RTSP = {True/false}
Streaming_SDP = {True/false}

http://tech.groups.yahoo.com/group/wurflvideo/message/56 - Runar
http://tech.groups.yahoo.com/group/wurflvideo/message/74 - Rod

Do we need all those video sizing values?

Miha would like to have the values

streaming_video_max_width = x
streaming_video_max_height = y

which pretty much everyone agrees on and which have been in the process of being decommissioned since 2004.

http://tech.groups.yahoo.com/group/wurflvideo/message/64
http://tech.groups.yahoo.com/group/wmlprogramming/message/19558

Page 24: H.264

WURFL Discussion and Questions Summaries

I do know that frame rates can vary depending on codec and delivery method. H.264 streaming is typically 10 fps, where MPEG4 streaming is 15 fps. MPEG4 playback sometimes supports up to 25 fps. This means we need streaming_video_max_frame_rate separated by codec, at least between MPEG4 and H.264.

Rod Monsees (http://tech.groups.yahoo.com/group/wurflvideo/message/58)

[Comment] Not if you agree with my profiles and level theory.

-----------------------

Does anyone know how to detect whether a WAP browser supports RTSP links? Which WURFL fields?

Thanks,

Jim Handy

[Comment] Good question, and to my knowledge this is not available in WURFL. I propose that we need the following, unless RTSP is implied by supporting streaming:

Streaming_RTSP = {True/false}
Streaming_SDP = {True/false}

It would also be good to have:

Streaming_PSS6 = {True/false}

These values can all be found in the User Agent Profile. PSS6: this field tells us whether the streaming server and standard handset video clients can switch automatically between 100 kbps and 50 kbps.

Page 25: H.264

• 2) Video/streaming processing

New capabilities to introduce:

- video_vcodec_[codec]_bit_rate
- video_vcodec_[codec]_frame_rate
- video_acodec_[codec]_sampling_rate
- video_acodec_[codec]_sampling_resolution
- video_acodec_[codec]_bit_rate

- streaming_video_vcodec_[codec]_bit_rate
- streaming_video_vcodec_[codec]_frame_rate
- streaming_video_acodec_[codec]_sampling_rate
- streaming_video_acodec_[codec]_sampling_resolution
- streaming_video_acodec_[codec]_bit_rate

Possible [codec] names for the visual part (those prefixed with vcodec), with proposed default bit rates and frame rates in brackets, are:

h263_0 (10547, 10)
h263_3 (10547, 10)
h264 (10547, 10)
mpeg4 (10547, 15)

Possible [codec] names for the audio track (those prefixed with acodec), with proposed default bit rates and sampling rates in brackets, are:

amr (10200, 8000)
awb (18250, 16000)
aac (96000, 96000)
aac_ltp (-, 96000)
qcelp (-, 96000)

Examples:
video_vcodec_h263_0_bit_rate=10547
streaming_video_vcodec_mpeg4_frame_rate=15

video_acodec_amr_bit_rate=10200
streaming_video_acodec_qcelp_sampling_rate=96000
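The naming scheme above is regular enough to generate. The following is a small sketch, assuming the proposed video defaults, that builds the local and streaming capability names for a video codec. The dictionary and function names are illustrative only.

```python
# Proposed defaults from the list above: codec -> (bit rate, frame rate).
VIDEO_CODEC_DEFAULTS = {
    "h263_0": (10547, 10),
    "h263_3": (10547, 10),
    "h264": (10547, 10),
    "mpeg4": (10547, 15),
}


def video_capabilities(codec, streaming=False):
    """Build the vcodec capability names and default values for one codec."""
    bit_rate, frame_rate = VIDEO_CODEC_DEFAULTS[codec]
    # Streaming capabilities share the same names with a "streaming_" prefix.
    prefix = "streaming_video" if streaming else "video"
    return {
        f"{prefix}_vcodec_{codec}_bit_rate": bit_rate,
        f"{prefix}_vcodec_{codec}_frame_rate": frame_rate,
    }
```

Calling `video_capabilities("mpeg4", streaming=True)` yields `streaming_video_vcodec_mpeg4_bit_rate` and `streaming_video_vcodec_mpeg4_frame_rate` with the proposed defaults.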

3) Audio processing

New capabilities to introduce:

- [codec]_bit_rate
- [codec]_sampling_rate
- [codec]_sampling_resolution

Possible [codec] names (with proposed default bit rates, sampling rates and sampling resolutions in brackets) are:

mp3 (96000, 48000, -)
aac (96000, 96000, -)
awb (18250, 16000, -)
amr (10200, 8000, -)
au (8000, 8000, 8)
wav (64000, 44100, 16)
mmf_ma2 (8000, 8000, -)
mmf_ma3 (46000, 48000, -)
mmf_ma5 (12000, 12000, -)
mmf_ma7 (24000, 24000, -)
qcelp (-, 96000, -)
evrc (-, -, -)
nokia_ringtone (-, -, -)
imelody (-, -, -)
digiplug (-, -, -)
compactmidi (-, -, -)
xmf (-, -, -)
rmf (-, -, -)
sp_midi (-, -, -)
midi_polyphonic (-, -, -)
midi_monophonic (-, -, -)
mld (-, -, -)
smf (-, -, -)

Examples:
mp3_bit_rate=96000
au_sampling_rate=8000
wav_sampling_resolution=16

Jakub did a great job of summarising and proposing the capabilities listed here (http://tech.groups.yahoo.com/group/wurflvideo/message/127). It would be great to get his opinion on the video profile/level proposal.

Page 26: H.264

4) Maximum transfer data for streaming

New capabilities to introduce:

- streaming_max_data_rate

Defaults to max_data_rate, which identifies the maximum speed at which a device can transfer data. Additionally we might assume:

streaming_max_data_rate <= max_data_rate
streaming_max_data_rate < GENERAL_DEVICE_LIMIT, which is probably around 400 kbps, depending on the codec.

NOTE: Network speeds are now far higher than the rate at which devices can decode video streams. An example is the Sony Ericsson K850i, which can decode at most 200 kbps but supports HSDPA up to 3.6 Mbit/s.
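The defaulting rule can be written down in a few lines. This is a sketch under the assumptions stated above (the streaming value falls back to max_data_rate and never exceeds it). The function name is hypothetical and all rates are in kbps.

```python
def effective_streaming_rate(max_data_rate, streaming_max_data_rate=None):
    """Resolve streaming_max_data_rate, in kbps, for a device.

    Falls back to max_data_rate when no streaming value is recorded, and
    never returns more than max_data_rate.
    """
    if streaming_max_data_rate is None:
        return max_data_rate
    return min(streaming_max_data_rate, max_data_rate)
```

Using the K850i figures from the note, `effective_streaming_rate(3600, 200)` gives 200 kbps, so the decoder limit rather than the HSDPA speed decides what can be streamed.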

----------------------------------

Details on some particular devices posted at the wurflvideo forum.

Samsung D500:

video_vcodec_mpeg4_bit_rate=196k (with aac) - common for both audio & video?
video_vcodec_h263_bit_rate=128k (with amr) - common for both audio & video?

Samsung SGH-E720:

video_vcodec_mpeg4_bit_rate=196k
video_vcodec_mpeg4_frame_rate=15
video_acodec_aac_bit_rate=128k
video_acodec_aac_sampling_rate=44.1

video_vcodec_h263_bit_rate=128k
video_vcodec_h263_frame_rate=15
video_acodec_amr_bit_rate=12.2k
video_acodec_amr_sampling_rate=8

Samsung SGH-D600e:

video_vcodec_mpeg4_bit_rate=256k
video_vcodec_mpeg4_frame_rate=30
video_acodec_aac_bit_rate=128k
video_acodec_aac_sampling_rate=48

video_vcodec_h263_bit_rate=196k
video_vcodec_h263_frame_rate=30
video_acodec_amr_bit_rate=12.2k
video_acodec_amr_sampling_rate=8

Page 27: H.264

This needs to be re-polled, as the question is not a yes-or-no question.

Are people saying yes to having different bitrates for different codecs, or are they saying yes to the "not"?

This is most disputed by some documentation, Miha, myself, Jakub Danilewicz and Robn.

"Yes, I think we will need a different max video bit rate by codec. Perhaps even a max audio bit rate by codec. We may also need a max frame rate by video codec. I know of at least one phone (Motorola Q9h, look at the Media Guide) which has a 10 fps max for H.264, but a 15 fps max for MPEG4 and H.263." - Rod

Page 28: H.264

User Agent profile help

• http://motorola.handango.com/phoneconfig/razrv3xx/Profile/razrv3xx.rdf

<rdf:Description rdf:ID="Streaming">
  <rdf:type rdf:resource="http://www.3gpp.org/profiles/PSS/ccppschema-PSS5#Streaming"/>
  <pss5:AudioChannels>Stereo</pss5:AudioChannels>
  <pss5:VideoPreDecoderBufferSize>1</pss5:VideoPreDecoderBufferSize>
  <pss5:VideoInitialPostDecoderBufferingPeriod>0</pss5:VideoInitialPostDecoderBufferingPeriod>
  <pss5:VideoDecodingByteRate>16000</pss5:VideoDecodingByteRate>
  <pss5:RenderingScreenSize>176x144</pss5:RenderingScreenSize>
  <pss5:PssAccept>
    <rdf:Bag>
      <rdf:li>application/sdp</rdf:li>
      <rdf:li>application/x-sdp</rdf:li>
      <rdf:li>application/x-rtsp</rdf:li>
      <rdf:li>audio/AMR</rdf:li>
      <rdf:li>audio/AMR-WB</rdf:li>
      <rdf:li>audio/MP4A-LATM</rdf:li>
      <rdf:li>video/MP4V-ES</rdf:li>
      <rdf:li>video/H263-2000</rdf:li>
      <rdf:li>audio/x-pn-realaudio</rdf:li>
      <rdf:li>video/x-pn-realvideo</rdf:li>
      <rdf:li>x-asf-pf</rdf:li>
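As a rough illustration of how the proposed Streaming_RTSP and Streaming_SDP values might be derived from a profile like this one, the sketch below works on the list of MIME types taken from the PssAccept bag. The mapping from MIME types to capability flags is my assumption, not something defined by WURFL or by the profile itself, and parsing of the RDF is left out.

```python
# MIME types listed in the PssAccept bag of the profile excerpt above.
PSS_ACCEPT = [
    "application/sdp",
    "application/x-sdp",
    "application/x-rtsp",
    "audio/AMR",
    "audio/AMR-WB",
    "audio/MP4A-LATM",
    "video/MP4V-ES",
    "video/H263-2000",
    "audio/x-pn-realaudio",
    "video/x-pn-realvideo",
    "x-asf-pf",
]

# Assumed mapping from accepted MIME types to the proposed capability flags.
SDP_TYPES = {"application/sdp", "application/x-sdp"}
RTSP_TYPES = {"application/x-rtsp"}


def streaming_flags(accepted):
    """Derive the proposed streaming flags from a list of accepted MIME types."""
    types = {item.strip().lower() for item in accepted}
    return {
        "Streaming_RTSP": bool(types & RTSP_TYPES),
        "Streaming_SDP": bool(types & SDP_TYPES),
    }
```

For the profile above, both flags come out as true. A device whose bag lists only media types would get false for both.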

Page 29: H.264

This is quite strange, as Profile 0 at level 10 shouldn't be able to encode to this specification or speed. I wonder whether the document is wrong, or whether backwards compatibility results in an increased bitrate load for lower profiles.

Sony Ericsson Video Documentation

Page 30: H.264

http://developer.motorola.com/docstools/mediaguides/archive/
http://developer.motorola.com/docstools/mediaguides/
http://www.forum.nokia.com/main/resources/technologies/audiovideo/av_features/FN_vid_table.html
http://developer.sonyericsson.com/getDocument.do?docId=65158
http://developer.sonyericsson.com/getDocument.do?docId=84942
http://sw.nokia.com/id/b652e8f2-81d3-435c-a409-725cac7fbb18/Video_And_Streaming_In_Nokia_Devices_v3_0_en.pdf
Multimedia_Framework_Architecture_in_S60_Devices_v1_0_en.pdf

Manufacturer video profile documents links

Page 31: H.264

WURFL contributors to video discussion*

David Tolnem/Johansson
Anders Magnus Andersen
Runar Solberg
Miha Valencic
Rodney Monsees
Luca Passani
Andrew Davidson
Jakub Danilewicz
Sabatini Stefano
Andrea Trasatti

Contributors to the 2004 discussions

Cha Yong
Andrea Trasatti
Zev Blut
Soon Yee

* Apologies to anyone I have missed.