Thursday, August 7, 2008

How to implement the renderers? Draft 1.

Actually this is more like a brainstorm, but bear with me :)

So far, we have been able to make a workable implementation of VideoOutputDevice. It has the following members:


class VideoOutputDevice : public syAborter {
  public:
    VideoOutputDevice(); // Constructor
    bool Init(); // Initializes the output device
    bool IsOk(); // Is the device OK?
    bool IsPlaying(); // Is the device currently being sent data?
    void ShutDown(); // Can only be called from the main thread!
    VideoColorFormat GetColorFormat(); // Color format used by the device
    unsigned int GetWidth(); // Current output width, in pixels
    unsigned int GetHeight(); // Current output height, in pixels
    bool ChangeSize(unsigned int newwidth, unsigned int newheight);
    // Can only be called from the main thread!

    void LoadVideoData(syBitmap* bitmap); // Sends a frame to the device
    virtual bool MustAbort(); // Should the current operation be aborted?
    virtual ~VideoOutputDevice(); // Destructor
  protected:
    // ...
  private:
    // ...
};

The renderer must invoke VideoOutputDevice::Init on playback start and VideoOutputDevice::ShutDown
on playback end; the same goes for AudioOutputDevice::Init and AudioOutputDevice::ShutDown.
Additionally, it must call VideoOutputDevice::LoadVideoData at regular intervals (in the case of playback) or once
for every frame (in the case of encoding). Therefore, it needs a way to know the input's framerate, and it also
needs to know the input's audio frequency.
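
To make the timing requirement concrete, here's a rough sketch of what the renderer's playback loop could look like. Everything prefixed with m_, plus GetFrameAtTime and syMilliSleep, are made-up names for illustration; only LoadVideoData and MustAbort come from the interface above.

// Hypothetical worker-thread loop; assumes Init() was already called and
// m_FrameRate was obtained from the project.
void Renderer::PlaybackLoop() {
    unsigned long frametime = (unsigned long)(1000.0 / m_FrameRate + 0.5); // ms per frame
    unsigned long curpos = 0; // current playback position, in milliseconds
    while (m_IsPlaying && !m_VideoOut->MustAbort()) {
        syBitmap* bitmap = m_VideoIn->GetFrameAtTime(curpos); // hypothetical input accessor
        m_VideoOut->LoadVideoData(bitmap); // hand the decoded frame to the output device
        curpos += frametime;
        syMilliSleep(frametime); // hypothetical sleep helper; a real clock would drift-correct
    }
}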

The renderer also needs to be multithreaded, so that the framerate doesn't suffer whenever the main thread's GUI
gets blocked.

Let's assume that it's VidProject which tells the renderer what the framerate is.

So we have:

void Renderer::Init(VideoInputDevice* videoin, AudioInputDevice* audioin,
                    VideoOutputDevice* videoout, AudioOutputDevice* audioout);


This means we're going to need new classes for the input side: VideoInputDevice and AudioInputDevice.

bool Renderer::SetVideoFramerate(float framerate);
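
As a usage sketch (the VidProject members below are invented; only Renderer::Init and Renderer::SetVideoFramerate come from the declarations above), the wiring could look like this:

// Hypothetical setup done by VidProject before playback starts.
void VidProject::PreparePlayback(Renderer* renderer) {
    renderer->Init(m_VideoIn, m_AudioIn, m_VideoOut, m_AudioOut); // the four devices
    renderer->SetVideoFramerate(m_FrameRate); // VidProject knows the project's framerate
}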


And now, onto the playback functions:

void Renderer::Play(float speed = 1.0,bool muted = false);
void Renderer::Pause();
void Renderer::Stop();
void Renderer::Seek(unsigned long time); // Time in milliseconds to seek to
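
For example, a preview session driven from the GUI might go roughly like this (just a usage sketch):

// Hypothetical GUI-side sequence using the proposed playback functions.
void PreviewSession(Renderer* renderer) {
    renderer->Seek(15000);      // jump to 00:00:15
    renderer->Play(1.0, false); // normal speed, with audio
    // ... user presses pause ...
    renderer->Pause();
    renderer->Play(2.0, true);  // resume at double speed, muted
    renderer->Stop();           // done with the preview
}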


All that's fine, but what happens when we want to display a still frame? We don't know what kind of video output
device we have - a player or an encoder - so there must be some way to send a single still frame to the video device.

void Renderer::PlayFrame();
// (Note that this should either be a protected function or only be enabled
// while the video is paused; otherwise we could desync video and audio)


Now that I think of it, sending still frames is exactly what video playback does: every N milliseconds, we send a
frame to the output buffer. So there must be separate seeks for video and audio.

void Renderer::SeekVideo(unsigned long time);
void Renderer::SeekAudio(unsigned long time);
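
In other words, the video side of playback could reduce to something like this (a sketch; the tick callback and GetVideoClock are assumptions, not part of the interface yet):

// Hypothetical timer callback fired every N milliseconds:
// playback is just "seek, then send a still frame".
void Renderer::OnVideoTick() {
    unsigned long now = GetVideoClock(); // assumed playback clock, shared with the audio side
    SeekVideo(now);                      // position the video input at 'now'
    PlayFrame();                         // push the current frame to the output device
}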


And if we're seeking, there must be a way to tell whether we've gone past the clip's duration.

bool Renderer::IsVideoEof();
bool Renderer::IsAudioEof();
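
For encoding, these EOF checks are what terminate the loop; a rough sketch (m_FrameRate is the same made-up member as before):

// Hypothetical encoding loop: no real-time pacing, just every output frame until EOF.
void Renderer::EncodeVideo() {
    unsigned long frametime = (unsigned long)(1000.0 / m_FrameRate + 0.5); // ms per output frame
    unsigned long curpos = 0;
    while (!IsVideoEof()) {
        SeekVideo(curpos); // position the input at the next output frame's time
        PlayFrame();       // send that frame to the encoder output device
        curpos += frametime;
    }
}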


And it seems we'll need separate video and audio functions for everything (edit: NOT!)

void Renderer::PlayVideo(float speed = 1.0);
void Renderer::PlayAudio(float speed = 1.0);
void Renderer::PauseVideo();
void Renderer::PauseAudio();
void Renderer::StopVideo();
void Renderer::StopAudio();

But I wonder if having separate stop functions is a good idea at all, because of sync issues. I mean, if we don't
want the audio or video to be shown, we simply don't decode it. It's a matter of seeking, decoding, and sending.
So PlayVideo and PlayAudio will just enable or disable video and/or audio, and we'll only need a single Pause and Stop.


void Renderer::PauseVideo(); SCRAPPED
void Renderer::PauseAudio(); SCRAPPED
void Renderer::StopVideo(); SCRAPPED
void Renderer::StopAudio(); SCRAPPED
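
With that decision, PlayVideo and PlayAudio could reduce to toggling two flags on top of the shared transport; a sketch of the idea (the m_ members are made up):

// Hypothetical: a single transport state plus two "enabled" flags.
// Disabling a stream just means skipping its decode-and-send step.
void Renderer::PlayVideo(float speed) {
    m_VideoEnabled = true;
    m_AudioEnabled = false; // don't decode or send audio
    Play(speed);            // Pause() and Stop() remain shared by both streams
}

void Renderer::PlayAudio(float speed) {
    m_AudioEnabled = true;
    m_VideoEnabled = false; // don't decode or send video
    Play(speed);
}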

I think that with this info we'll be able to design a good rendering / playback framework.
Stay tuned.

4 comments:

JeCh said...

Just some thoughts about audio/video synchronization:

Normally it shouldn't cause any problems if you treat the audio and video separately. But there are two things which should be taken into account.

1) VFR - Variable Framerate. It is allowed for MKV, MP4 and FLV. Many YouTube videos are VFR. These files should be supported too. I think that in the editor each video frame should have its timestamp. Then you should probably move along the timeline in steps according to the project settings' framerate. If the input and output framerates don't match (CFR videos will cause the same problem as VFR here), you have two choices: duplicate/drop frames or blend two frames together. This should be done according to the user's choice.

2) I heard from many people that they have problems with MPG videos (usually from TV cards) where, after demuxing, audio and video go out of sync. I never came across such a file, so I can only guess. But I do believe this is caused by damaged frames, which are dropped. I'm not sure if there is a workaround for it, but I would ignore this for now.

So basically I think that audio and video should be treated independently. If you visualize it correctly, it should be clear whether the audio and video start/end points match or not. Some video files have a delayed audio track; don't forget about that.

Well, I have quite a lot of ideas about what is wrong (or missing) in most of today's video editors. Many times I have thought about creating a video editor myself. Unfortunately, I don't have enough programming skill. But I have plenty of ideas and hopefully also some solutions.

I write this mostly to warn you about some potential problems you may come across later. It is easier to think about them now, while you are designing the core, than to try to implement workarounds after finishing it.

Btw., do you know the MediaInfo library? It is very useful for identifying the parameters of video files.

rick_777 said...

JeCh: Again, thanks for answering.

1) Variable framerates will be handled by the codec module, which doesn't concern us for now. The codec module must be able to obtain the frame that corresponds to a given time in milliseconds (perhaps even nanoseconds once I get the double long integers to work). In other words, the video codec module is a black box that lets me seek to any instant in time.

I'll say it again: Seeking is done in time, *not* frames. This is a result of our philosophy of making Saya framerate independent and resolution/aspect-ratio independent. You can thank the Canopus Edius team for advertising this feature of their expensive video editor ;-)

2) About dropped damaged frames, I really don't know what could be done about it. That's part of the codec module. If the codec library doesn't support fixing desyncing due to dropped frames, there's nothing we can do about it.

3) Ah, thanks for the delayed audio tip. I'll take it into account.

4) The MediaInfo library is Windows-only, and there is no GCC-compatible source code for some of the dependencies (http://mediainfo.sourceforge.net/en/Support/Build_From_Sources/ThirdParty). Sorry.

JeCh said...

Great, seeking based on time is a very good solution. That's what I meant; maybe I just didn't express it clearly. What I was thinking about is this situation: you want to seek to time 00:01:00 and there is no source frame starting at that point. Now you can either process/display the frame closest to this time or blend the previous and following frames together.

It is also a question whether it makes sense to seek to any point in time or to skip in intervals of frames depending on the output framerate. Jumping by output "frames" lets the user really see what the output will look like. Just some thoughts...

According to the MediaInfo site, it is multiplatform - Linux/Windows: http://sourceforge.net/project/showfiles.php?group_id=86862&package_id=90614&release_id=612843. I think the only missing GCC dependencies are for Win9x support or the GUI only. I might be wrong, though.

rick_777 said...

I understand your point. But time is a continuum, and there is no point in the video where no frame is being played. Let's say frame 1 plays from 0 to 30ms. And frame 2 plays from 30ms to 59.999 ms.

If you want the frame at 30ms, you go for frame 2, because it starts there. Before 30ms, even if it's 29.99999999999999999999ms, it's frame 1.

As simple as that. Perhaps the problem is that you think of frames as points in time, when they're actually regions in time. It's the frame transitions that are points in time. And each point corresponds to the next frame.
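
To put it in numbers (purely illustrative; the 30 ms frame duration comes from the example above):

// With 30 ms frames numbered from 1, the frame covering time t (in ms) is
//   frame = floor(t / 30) + 1
// so t = 29.999... ms still maps to frame 1, and t = 30 ms is the first
// instant that belongs to frame 2.
unsigned int FrameForTime(double t_ms, double frameduration_ms) {
    return (unsigned int)(t_ms / frameduration_ms) + 1; // truncation acts as floor for t >= 0
}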

Now. Regarding the user jumping to frames instead of points in time, that doesn't concern the core. That concerns the UI, so we're not worried about that. Perhaps you haven't noticed, but Adobe Premiere allows you to switch from frames to seconds in the timeline. It's just the same.

Variable framerates are something that has to do with the codec, not with the video format itself. Let's assume that we know what the rates will be. We'll simply use the maximum framerate to define that. But those are details that we'll deal with later.

I'll give MediaInfo a second look. Perhaps I'll mail the author and ask him directly.

See ya.