I did some analysis of the streams from the Sanyo Xacti HD1000 and from FCP’s H.264 encoder. (See previous post for description of how the streams were captured/edited/encoded.) Here are some observations about the coding of those streams.
HD1000 stream
I had seen a rumor on the internet that the HD1000 uses Ambarella’s encoder silicon. Based on parsing the stream, I have to say that someone is wrong on the internet.
Baseline profile, level 3.2. Thus, no B frames. Just IDRs and Ps.
IDRs every 30 frames, or approximately twice per second.
CAVLC rather than CABAC. (Another constraint of baseline profile.)
Inter prediction mostly 16×16 with some 8×16 and 16×8. Didn’t see any smaller partitions. Seems to be just one ref.
Intra uses 16×16 and 4×4.
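For anyone who wants to check the profile, level, and IDR spacing without a commercial analyzer, here is a minimal sketch in Python. It assumes the video has already been extracted to a raw Annex B elementary stream (start-code delimited), which is not how the camera's file stores it, and the function names are mine, not from any tool mentioned here. It reads only the first few SPS bytes, so it ignores details such as level 1b signalling.

```python
import re

# profile_idc values defined by the H.264 spec for the common profiles
PROFILE_NAMES = {66: "Baseline", 77: "Main", 100: "High"}

def split_nal_units(data: bytes):
    """Yield NAL units from an Annex B byte stream (start-code delimited)."""
    # A start code is 0x000001; the 4-byte form just has one extra leading zero.
    starts = [m.end() for m in re.finditer(b"\x00\x00\x01", data)]
    for i, begin in enumerate(starts):
        end = starts[i + 1] - 3 if i + 1 < len(starts) else len(data)
        # Drop zero bytes that belong to the next (4-byte) start code.
        nal = data[begin:end].rstrip(b"\x00")
        if nal:
            yield nal

def summarize(data: bytes):
    """Report profile/level from the SPS and count IDR vs. non-IDR slices."""
    summary = {"profile": None, "level": None, "idr": 0, "non_idr": 0}
    for nal in split_nal_units(data):
        nal_type = nal[0] & 0x1F          # low 5 bits of the NAL header
        if nal_type == 7 and len(nal) >= 4:   # sequence parameter set
            profile_idc, level_idc = nal[1], nal[3]
            summary["profile"] = PROFILE_NAMES.get(profile_idc, str(profile_idc))
            summary["level"] = level_idc / 10  # e.g. 32 -> 3.2
        elif nal_type == 5:               # slice of an IDR picture
            summary["idr"] += 1
        elif nal_type == 1:               # slice of a non-IDR picture
            summary["non_idr"] += 1
    return summary
```

Note that this counts slices, not pictures, so the IDR spacing only falls out directly if each picture is coded as a single slice.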
The “trimmed” version output by Final Cut Pro prepends frames back to the first IDR prior to the “in” point, and seems to have lost one frame off the end (but that could have been pilot error). The net effect is that FCP doesn’t have to dive into the video elementary stream to re-code the head and/or tail. Given the simple GOP and frequent IDRs, this approach is “good enough” for many applications. (When viewing the clip in the QuickTime player, the clip starts at my “in” point, but when using the analyzer, it starts at the IDR prior to the “in” point. The QuickTime player must be applying some kind of offset indexing, using a mechanism that I’m unfamiliar with.)
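The trimming behavior described above amounts to snapping the “in” point back to the nearest earlier IDR. A rough sketch of that arithmetic, assuming IDRs every 30 frames as observed in the HD1000 stream; the function name and the in-point value of 100 are made up for illustration, and this is not a claim about how FCP actually implements it:

```python
from bisect import bisect_right

def snap_to_prior_idr(in_frame, idr_frames):
    """Return the frame index of the last IDR at or before in_frame.

    idr_frames must be a sorted list of frame indices that are IDRs.
    """
    pos = bisect_right(idr_frames, in_frame)
    if pos == 0:
        raise ValueError("no IDR at or before the in point")
    return idr_frames[pos - 1]

# IDR every 30 frames, as seen in the HD1000 stream
idr_frames = list(range(0, 300, 30))

in_frame = 100                      # hypothetical "in" point chosen by the editor
start = snap_to_prior_idr(in_frame, idr_frames)
extra = in_frame - start            # frames kept in the file ahead of the in point
print(start, extra)
```

With an IDR every 30 frames, the file never carries more than 29 extra leading frames, which is why this shortcut is cheap for this kind of stream. The player then only has to skip those leading frames at playback.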
FCP H.264 Quarter HD stream
Richer use of H.264 syntax than the HD1000.
Main profile, level 3.1.
CAVLC, not CABAC.
IDR-B-P GOP. Non-stored Bs.
Bs use two refs: the adjacent stored pictures (P or IDR). Ps use one ref: the previous P (or IDR).
“Keyframes” in FCP jargon are IDRs.
Intra uses 16×16 and 4×4.
Inter uses 16×16, 16×8, 8×16, 8×8.
Bs use inter, skip, direct, and bi-predictive modes.
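Because each B references the stored pictures on either side of it, the decoder has to receive the following P (or IDR) before the B that is displayed ahead of it. Here is a small sketch of that reordering, assuming a simple pattern with Bs sitting between stored pictures; the function name is hypothetical, and the real stream signals the ordering through its own syntax rather than through anything like this:

```python
def decode_order(display_types):
    """Reorder pictures from display order to decode order.

    display_types is a list like ["I", "B", "P", "B", "P"].
    Returns (display_index, type) tuples in the order a decoder needs them:
    each stored picture (I or P) first, then the Bs displayed before it.
    """
    out = []
    pending_b = []
    for idx, pic_type in enumerate(display_types):
        if pic_type == "B":
            # Hold the B until its forward reference has been emitted.
            pending_b.append((idx, pic_type))
        else:
            out.append((idx, pic_type))
            out.extend(pending_b)
            pending_b = []
    # Trailing Bs with no following stored picture are appended as-is.
    out.extend(pending_b)
    return out

order = decode_order(["I", "B", "P", "B", "P"])
print(order)
```

Since the Bs are non-stored, nothing ever references them, so a decoder can drop each one as soon as it has been displayed.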
I had hoped to find something in the HD1000 stream that would explain why dual-core machines have so much trouble decoding it in real time. But baseline profile should be pretty easy to decode, relatively speaking. One thing I learned from a brief visit to Fry’s is that QuickTime (surprise, surprise) runs faster on Intel processors running OS X than on Intel processors running XP. An iMac with a dual-core 2.4 GHz processor ran the stream just fine. As I mentioned in the previous post, the VLC player behaved much better in XP than some other players. So there are some software issues associated with real-time decode of HD AVC. I shudder to think what will be needed to do 1080p60 AVC decode, especially with CABAC.
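To put a rough number on the 1080p60 worry, one simple proxy for decode load is macroblocks per second. The sketch below compares 720p60 and 1080p60; those resolutions and frame rates are illustrative assumptions rather than measurements of the clips discussed here, and macroblock rate ignores entropy-coding cost, which is where CABAC would hurt.

```python
import math

def macroblocks_per_second(width, height, fps):
    """Macroblock throughput for a given picture size and frame rate.

    H.264 codes pictures in 16x16 macroblocks, so dimensions round up
    to a multiple of 16 (1080 lines are coded as 1088).
    """
    mbs_wide = math.ceil(width / 16)
    mbs_high = math.ceil(height / 16)
    return mbs_wide * mbs_high * fps

mb_720p60 = macroblocks_per_second(1280, 720, 60)
mb_1080p60 = macroblocks_per_second(1920, 1080, 60)
ratio = mb_1080p60 / mb_720p60

print(mb_720p60, mb_1080p60, round(ratio, 2))
```

By this measure 1080p60 is a bit more than twice the work of 720p60 before any difference in entropy coding is counted.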
