Posted in: Internet TV Software & Tools, Mobile Video, News, Video Distribution, Video Sharing & Video Clips by Rich Hall on November 13, 2007

Architecting A Video Transcoding System | Associated Problems & Solutions

After reading the well-written and thought-provoking article Architecting a Video Transcoding System by Brian Peebles, I thought I’d talk a bit about the problems and consequent solutions in this field.

I’ve been involved in building audio/video and graphics systems for over 20 years, and architecting these systems for performance, function, cost, etc. has been a daunting task, with large and long-lasting consequences.

I like to tell the story of when I was a junior engineer building mainframe graphics systems for CAD/CAM applications. We had a director who, in every product meeting, would remind us of our top 3 product goals: 1) performance, 2) performance, and 3) ah – performance.

He would then go on with a wry smile and say there is actually a 4th goal, but I don’t think I have to mention what that was.

Performance Is Not The Only Requirement

As Brian pointed out in his article, performance is not the only requirement. These systems must also be architected for functionality, quality, serviceability, upgradeability, as well as other less thought of but equally important factors such as space and power requirements. 

Architecting video conferencing systems for 6 years, including going through 3 different architectures, has further heightened my awareness of how critical it is to choose the right architecture now, and for that architecture to remain viable as your business grows.

Video transcoding systems share very similar requirements with those earlier graphics systems and the more recent video conferencing systems.

Hardware Selection Is Critical

Given the technology requirements of audio and video transcoding, the choice of architecture and hardware is critical.  Demand for audio and video processing is increasing rapidly, from social networking sites, to the deployment of video content to mobile phones and laptops, to IPTV, etc.

The plethora of audio and video algorithms (e.g. AMR, AAC, MP3, AC3, H.264, WMV, Flash, MPEG), along with all of their various profiles, as well as the almost infinite number of resolutions, frame rates, and bit rates has generated the need for multimedia processing that is hard to comprehend.  
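
To get a rough feel for how quickly these combinations multiply, here is a minimal Python sketch. The codec names come from the list above, but the resolutions, frame rates, and bit rates are small hypothetical samples chosen only for illustration, and the sketch ignores profiles entirely. Real deployments would have far more values in each list.

```python
from itertools import product

# Codec names are taken from the article; every other list is a
# hypothetical, deliberately small sample used only for illustration.
video_codecs = ["H.264", "WMV", "Flash", "MPEG"]
audio_codecs = ["AMR", "AAC", "MP3", "AC3"]
resolutions = [(176, 144), (320, 240), (640, 480), (1280, 720)]
frame_rates = [15, 24, 30]
bit_rates_kbps = [128, 384, 768, 1500]


def count_output_variants(*dimensions):
    """Return the number of distinct combinations across all dimensions."""
    total = 1
    for values in dimensions:
        total *= len(values)
    return total


dimensions = (video_codecs, audio_codecs, resolutions, frame_rates, bit_rates_kbps)
total_variants = count_output_variants(*dimensions)

# Enumerating the combinations explicitly gives the same count.
all_variants = list(product(*dimensions))

if __name__ == "__main__":
    print(f"{total_variants} possible output variants")
```

Even with only three or four entries per list, the count runs into the hundreds, and adding one more dimension (such as codec profiles) multiplies it again.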

Add to this all of the required image and audio processing (e.g. scaling, de-interlacing, sample rate conversion, audio gain and normalization), and we effectively need the horsepower of a top fuel dragster with the function, reliability, maintainability, and earth friendly features of a Toyota Prius.

So, what does all of this mean?  Well, it means we need ultra fast (for the most part parallel) mathematical processing, extremely efficient data movement, and the flexibility and programming model to react quickly to customers’ changing demands and the ever changing algorithmic standards.

ASIC Has Its Problems

Unfortunately, these goals have traditionally been at odds with each other.  Typically, the fastest hardware for video type processing has been an ASIC.  But ASICs have their inherent problems.

They typically support only one or maybe a few codec standards, won’t support new codec standards, and would very likely not support a new profile or appendix of an existing standard.

And in case anyone gets the notion that ‘this standard and its profiles are the last’, take a look at the relatively new H.264, where the ever growing number of profiles is becoming staggering.

Add to that the fact that it is usually not cost effective for a company to build its own ASIC, which means relying on a 3rd party vendor, and an ASIC solution is usually a risky, non-extensible one.

And As For GPPs

At the other end is flexibility.  GPPs traditionally have been the most flexible platform.  But it is quite a universal feeling that GPPs just are not designed for the heavy mathematical processing required for video compression and image processing.  

Throwing multiple cores at the problem helps somewhat, but diminishing returns quickly kick in with regard to data movement and power consumption.
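
One simple way to picture those diminishing returns is Amdahl's law. This is a textbook model I am substituting here for illustration, not something measured on any particular GPP. The sketch below assumes that a fixed fraction of the work (for example, data movement that cannot be spread across cores) stays serial. The 90% parallel fraction is a hypothetical figure.

```python
def amdahl_speedup(cores: int, parallel_fraction: float) -> float:
    """Speedup predicted by Amdahl's law for a given number of cores.

    parallel_fraction is the share of the work that can be spread
    across cores; the remainder is assumed to stay serial.
    """
    if cores < 1:
        raise ValueError("cores must be at least 1")
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    # Hypothetical workload: 90% of the transcode parallelizes cleanly.
    for cores in (1, 2, 4, 8, 16):
        print(f"{cores:2d} cores -> {amdahl_speedup(cores, 0.9):.2f}x")
```

Under that assumption, 16 cores deliver roughly a 6.4x speedup, and no number of cores can exceed 10x. The model says nothing about power consumption, which only makes the trade-off worse.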

Other Solutions

Other solutions are starting to gain some traction, such as FPGAs, and the new class of parallel programmable processors (e.g. Stream Processors, IBM Cell).  FPGAs have the advantage that they’re quite fast, and retain a level of programmability.  

However, they suffer from the drawback that the engineers with the FPGA programming competence are typically not video algorithm engineers, resulting in either suboptimal implementations or difficult project collaboration.

FPGAs also typically require some form of GPP for general control and system interface, so you end up with a multi-architecture solution with sometimes significant data movement.

The new class of parallel programmable processors is certainly an interesting piece of technology worth watching.  The claims from a couple of years ago are quite interesting – ASIC level speed with the ease and programmability of a GPP or DSP.

I think the jury is still out on these claims, and as we’re starting to see some of these technologies coming to fruition, we’ll start to get some visibility into the actual performance and practical programmability.

What About DSPs?

Ok, you noticed I left DSPs for last.  DSPs have traditionally been the choice for many multimedia architectures, but have always left architects wishing for more.  They were never quite fast enough, and it was never quite easy enough to extract the maximum performance from them.

Many companies have switched DSPs and architectures in search of the ‘holy grail’.  Well, I think we may finally be getting there with the next class of traditional DSPs that include GPP cores and hardware acceleration for the mathematical functions required by video compression.

The result is the potential to have it all – the ease of GPP programmability for control software and system interfacing, the speed and programmability of a DSP core for traditional video and audio processing, and the raw speed of an ASIC for the heavy processing power required by a video compression algorithm like H.264 for functions like motion estimation and de-blocking.  

The result is a single core that can be used to build a transcoding product that is flexible, sustainable, eco friendly, and, reflecting back to that director of mine many years ago, just plain ol’ fast.
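
The division of labor described above can be sketched as a simple lookup table. This is only an illustration: the task names are taken from the paragraph on hybrid DSPs, while the `Unit` enum, the `TASK_TO_UNIT` table, and the `tasks_for` helper are hypothetical names, not part of any vendor's SDK.

```python
from enum import Enum


class Unit(Enum):
    """Processing units on a hypothetical hybrid DSP part."""
    GPP = "general-purpose core"
    DSP = "DSP core"
    ACCEL = "hardware accelerator"


# Task names follow the article; the assignment mirrors its description.
TASK_TO_UNIT = {
    "control software": Unit.GPP,
    "system interfacing": Unit.GPP,
    "audio processing": Unit.DSP,
    "video processing": Unit.DSP,
    "motion estimation": Unit.ACCEL,
    "de-blocking": Unit.ACCEL,
}


def tasks_for(unit: Unit) -> list:
    """Return the tasks assigned to the given unit, in table order."""
    return [task for task, assigned in TASK_TO_UNIT.items() if assigned is unit]


if __name__ == "__main__":
    for unit in Unit:
        print(f"{unit.value}: {', '.join(tasks_for(unit))}")
```

The point of the table is that each kind of work lands on the unit best suited to it, while everything still lives on one part rather than being spread across separate chips.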

Originally written by Rich Hall of the RipCode Blog. RipCode offers on-demand video transcoding solutions to ease the process of re-purposing video into multiple viewing formats.

