DirectShow Introduction and Simple Playback
Update: DirectShow is no longer part of the DirectX SDK; it has now moved into the Platform SDK.
DirectShow is a media streaming API that originally shipped with the DirectX SDK. It can perform audio and video playback and capture.
It directly supports the following formats:
- ASF (Advanced Systems Format)
- MPEG (Moving Picture Experts Group)
- AVI (Audio-Video Interleaved)
- MP3 (MPEG Audio Layer 3)
- WAV sound files
DirectShow uses the concept of filters that take an input and provide an output. For example, a filter may take MPEG data as input, decode it and output a set of image frames for display.
Many filters can be connected together to perform playback. This is known as a filter graph.
An AVI file provides a stream of data that interleaves audio and video. We need to separate the audio from the video so that each can be sent to the correct playback device (sound card and video card), and we must also decode the video data (it is compressed in this case). The set of filters required to do this is shown in the image below (taken directly from the DirectX help file):
Fortunately DirectShow can create the above set of filters automatically for the file types it directly supports. You can also write your own filters, for example to support custom data streams.
There are 3 DirectShow objects we need to create instances of. They are:
Graph Builder (Filter Graph Manager)
- Type is IGraphBuilder
- Methods for building the filter graph
- Methods for creating the next two objects
Media Control
- Type is IMediaControl
- Controls streaming
- Methods for controlling playback e.g. Run, Stop & Pause
Media Event
- Type is IMediaEvent
- Interfaces with filter graph manager
- Allows testing of events e.g. if playback is complete
DirectShow, like all DirectX APIs, is based on the COM design model. Unlike Direct3D, which hides a lot of the COM methods, here we need to work with COM directly to obtain our DirectShow objects. Since DirectShow is a separate library we also need to link with its library file and include its header:
Library File: strmiids.lib
Header File: dshow.h
It would be wise to create a class to hold the DirectShow objects, instantiate them, release them and provide methods to control playback. In the code below I assume the class has the following member variables:
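A minimal sketch of such a class is given here. The member names are the ones used in the calls later in these notes; the class name DirectShowPlayer and the IsInitialised helper are my own assumptions, added only for illustration. Forward declarations are enough for the class definition itself; the source file that makes the DirectShow calls would include dshow.h.

```cpp
// Forward declarations of the DirectShow interfaces. The .cpp file that
// actually calls their methods would include <dshow.h> instead.
struct IGraphBuilder;
struct IMediaControl;
struct IMediaEvent;

// Hypothetical wrapper class (the name is an assumption).
class DirectShowPlayer
{
public:
    // True once all three interface pointers have been obtained.
    bool IsInitialised() const
    {
        return m_graphBuilder != nullptr &&
               m_mediaControl != nullptr &&
               m_mediaEvent   != nullptr;
    }

private:
    IGraphBuilder* m_graphBuilder = nullptr;  // filter graph manager
    IMediaControl* m_mediaControl = nullptr;  // Run, Stop and Pause
    IMediaEvent*   m_mediaEvent   = nullptr;  // event notifications
};
```

Starting every pointer at nullptr means the destructor can safely tell which objects were actually created if initialisation fails part way through.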
To set up DirectShow you will need an initialise function that creates the objects and checks for errors. The first thing we have to do is to initialise COM, which is simple:
// Initialise COM
CoInitialize(NULL);
Next we need to create our filter graph manager, which provides the interfaces needed to obtain the other two objects:
// Create the Filter Graph Manager
HRESULT hr = CoCreateInstance (CLSID_FilterGraph, NULL, CLSCTX_INPROC_SERVER, IID_IGraphBuilder, (void **)&m_graphBuilder);
This is a rather complex-looking function as it communicates directly with COM. It instantiates a filter graph manager and sets your pointer m_graphBuilder to point at the new object. CoCreateInstance can be used to create any registered COM object, but the call is often hidden from us, e.g. in Direct3D there is a function that does all this for us (Direct3DCreate9). It takes the class identifier (CLSID) of the COM object we require (in this case a filter graph), the identifier (IID) of the interface we want back, and some parameters that determine set-up; for more details look in the DirectX help.
Once the filter graph manager COM object is instantiated we can use it to obtain our other two DirectShow objects. This is achieved by calling its QueryInterface(...) method, which asks the object for a pointer to another interface that it supports. Again this is a common COM method that is hidden from us in Direct3D, e.g. in device creation.
To instantiate our media control object:
hr = m_graphBuilder->QueryInterface(IID_IMediaControl, (void **)&m_mediaControl);
To instantiate our media event object:
hr = m_graphBuilder->QueryInterface(IID_IMediaEvent, (void **)&m_mediaEvent);
Note: as with most DirectX calls these return an HRESULT. Remember to always check that they have not failed.
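One way to make that check hard to forget is a small helper. The sketch below is only an illustration: the name CheckResult is my own, and it takes the HRESULT as a 32-bit signed integer so that it compiles without the Windows headers. A negative value means failure, which is the same test the FAILED macro performs.

```cpp
#include <cstdint>
#include <cstdio>

// Hypothetical helper: returns true if the call succeeded, otherwise
// reports which step failed and returns false.
// An HRESULT is a 32-bit signed value; negative means failure.
inline bool CheckResult(std::int32_t hr, const char* step)
{
    if (hr < 0)
    {
        std::fprintf(stderr, "%s failed, HRESULT = 0x%08X\n",
                     step, static_cast<unsigned int>(hr));
        return false;
    }
    return true;
}
```

Each call can then be followed by a line such as: if (!CheckResult(hr, "QueryInterface IMediaControl")) return false;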
If the above three calls succeeded we now have all three of the DirectShow objects we require to handle streamed playback.
To play back a file we need to create our set of filters that form our filter graph. This is easy for the supported types as DirectShow does it for us via its RenderFile function e.g.
HRESULT hr = m_graphBuilder->RenderFile(L"music.mp3", NULL);
Note: The L before the filename is required as the function takes a Unicode (wide-character) string. Unicode is an alternative to ASCII that provides support for many different alphabets, which is essential for localisation. The L prefix tells the compiler to store the string literal as a wide string, so it only works with hard-coded strings. If you were to pass in a filename variable you would need to convert it yourself using the MultiByteToWideChar function described below.
If the above was successful you can now play back your media stream by calling the media control Run method:
// Run the graph - note: starts a new thread
hr = m_mediaControl->Run();
You can stop playback:
hr = m_mediaControl->Stop();
or pause it:
hr = m_mediaControl->Pause();
If the file was an MP3 it would play back via the sound card. If it were an AVI, a separate window would open and the video would be played in that. Playing video in the main window is a little more complex; see the next set of notes on how to do this: Main Window Video.
How do I know when it has stopped?
You can use the IMediaPosition interface to find out when the video / audio has stopped playing. You obtain this via a QueryInterface call as above but with the ID IID_IMediaPosition. You can then call its get_CurrentPosition and get_Duration methods to work out where you are in the file; both return a time in seconds.
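As a rough sketch, the check could look like the function below. The name PlaybackFinished is my own, and the interface type is a template parameter only so that the example compiles without the Windows headers; in real code you would pass the IMediaPosition pointer obtained from QueryInterface. Treating "position has reached duration" as "finished" is an assumption that suits simple playback.

```cpp
// Returns true once the current position has reached the end of the stream.
// MediaPosition is expected to be IMediaPosition from <dshow.h>.
template <typename MediaPosition>
bool PlaybackFinished(MediaPosition* mediaPosition)
{
    if (mediaPosition == nullptr)
        return false;

    double current  = 0.0;  // REFTIME is a double, in seconds
    double duration = 0.0;

    // A negative HRESULT means failure (the same test as FAILED()).
    if (mediaPosition->get_CurrentPosition(&current) < 0)
        return false;
    if (mediaPosition->get_Duration(&duration) < 0)
        return false;

    return current >= duration;
}
```

This would be called once per frame or on a timer. The media event object described earlier can also report completion, which avoids polling.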
MultiByteToWideChar
The MultiByteToWideChar function converts a char string to a wchar_t wide string. It takes the char string and some flags determining code page etc. and fills a wide string buffer. One issue is how big the buffer should be. You could do something like this:
MultiByteToWideChar(CP_ACP, 0, charString, -1, newWideString, 256);
However, are you sure that 256 will be big enough? How big should you make it? You can actually find the exact size by calling MultiByteToWideChar twice and allocating memory dynamically, like so:
// First call: passing -1 as the length and no output buffer returns the
// required size in wide characters, including the terminating null
int wideStringLength = MultiByteToWideChar(CP_ACP, 0, charString, -1, NULL, 0);
wchar_t *newWideString = new wchar_t[wideStringLength];
// Second call: perform the conversion into the correctly sized buffer
MultiByteToWideChar(CP_ACP, 0, charString, -1, newWideString, wideStringLength);
// Use newWideString and then remember to free it with delete[] newWideString
As with all DirectX objects we must release them when we have finished. If you have created a class for DirectShow support the release calls could be placed in the destructor.
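A common way to write those release calls is with a small helper, sketched below. SafeRelease is a widely used idiom rather than part of DirectShow, and the order shown in the comment (reverse order of creation, followed by CoUninitialize) is my assumption of how the destructor would be laid out.

```cpp
// Releases a COM interface pointer if it is set, then clears it so that it
// cannot be released twice.
template <typename T>
void SafeRelease(T*& pointer)
{
    if (pointer != nullptr)
    {
        pointer->Release();
        pointer = nullptr;
    }
}

// Intended use in the destructor, in reverse order of creation:
//   SafeRelease(m_mediaEvent);
//   SafeRelease(m_mediaControl);
//   SafeRelease(m_graphBuilder);
//   CoUninitialize();   // balances the earlier CoInitialize call
```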
And that's all there is to simple playback of media streams using DirectShow. Update: in the February release of DirectX, DirectShow moved into the extras folder, and it has been removed completely from the April SDK release. It is due to reappear in the Platform SDK at a later date.