DASH MPD Files Explained: The Navigation Map

As a developer, when you want to integrate smooth adaptive video streaming capabilities into your application or website, you will most likely encounter the terms DASH and MPD. DASH is today’s mainstream high-efficiency streaming protocol, and MPD is its core and soul. Simply put, MPD is the roadmap that tells the player “where the video is, what versions are available, and how to play it”.

1. What are DASH and MPD?

DASH stands for Dynamic Adaptive Streaming over HTTP. As the name suggests, it is an HTTP-based adaptive streaming technology. Its core idea is:

  1. Split the video file into countless small segments.
  2. Provide multiple different quality (e.g., resolution, bitrate) versions of the same video.
  3. The player dynamically selects the most suitable quality segment to download and play based on the current network speed, ensuring a smooth viewing experience.

MPD stands for Media Presentation Description. It is an XML-format manifest file containing metadata for all the above information. The first thing a player must do is fetch this MPD file before it can start intelligently pulling video segments.

2. The Structure of an MPD File: Like a Book’s Table of Contents

We can visualize an MPD file as the structure of a book, making it very intuitive to understand.

  • <MPD> Root Element: The book cover, listing the title (mediaPresentationDuration total duration), publication date (publishTime), etc.
  • <Period> Element: Chapters in the book. A live stream might have only one chapter, while a movie might be divided into two chapters: the main feature and the trailer. Chapters are arranged chronologically; one finishes before the next starts.
  • <AdaptationSet> Element: Categories under a chapter. For example, a chapter usually contains two main categories: “Video” and “Audio”. It defines common properties for the same type of media stream, such as the video codec being AVC or the audio codec being EC-3.
  • <Representation> Element: Specific versions under a category. This is the most critical part, representing a quality specification of the video or audio.
  • <SegmentTemplate> / <SegmentList>: Defines how to find each small segment. It provides a path template for downloading segments. This is the core of the player’s puzzle.

Summary of Hierarchy: MPD -> Period (Chapter) -> AdaptationSet (Category: Video/Audio) -> Representation (Quality Version) -> SegmentTemplate (Segment Path Rules)

3. Real-world Analysis: A Real MPD Example

Let’s use a real MPD file (from the famous Big Buck Bunny test video) to gain a deeper understanding. This file describes a VOD (Video on Demand) video with a total duration of approximately 634 seconds (10 minutes 34 seconds). It offers 10 video qualities ranging from 180p to 4K and 1 audio quality. Download link: https://dash.akamaized.net/akamai/bbb_30fps/bbb_30fps.mpd


<MPD mediaPresentationDuration="PT634.566S" minBufferTime="PT2.00S"
     profiles="urn:hbbtv:dash:profile:isoff-live:2012,urn:mpeg:dash:profile:isoff-live:2011" type="static"
     xmlns="urn:mpeg:dash:schema:mpd:2011" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
     xsi:schemaLocation="urn:mpeg:DASH:schema:MPD:2011 DASH-MPD.xsd">
    <BaseURL>./</BaseURL>
    <Period>
        <AdaptationSet mimeType="video/mp4" contentType="video" subsegmentAlignment="true" subsegmentStartsWithSAP="1"
                       par="16:9">
            <SegmentTemplate duration="120" timescale="30" media="$RepresentationID$/$RepresentationID$_$Number$.m4v"
                             startNumber="1" initialization="$RepresentationID$/$RepresentationID$_0.m4v"/>
            <Representation id="bbb_30fps_1024x576_2500k" codecs="avc1.64001f" bandwidth="3134488" width="1024"
                            height="576" frameRate="30" sar="1:1" scanType="progressive"/>
            <Representation id="bbb_30fps_1280x720_4000k" codecs="avc1.64001f" bandwidth="4952892" width="1280"
                            height="720" frameRate="30" sar="1:1" scanType="progressive"/>
            <Representation id="bbb_30fps_1920x1080_8000k" codecs="avc1.640028" bandwidth="9914554" width="1920"
                            height="1080" frameRate="30" sar="1:1" scanType="progressive"/>
            <Representation id="bbb_30fps_320x180_200k" codecs="avc1.64000d" bandwidth="254320" width="320" height="180"
                            frameRate="30" sar="1:1" scanType="progressive"/>
            <Representation id="bbb_30fps_320x180_400k" codecs="avc1.64000d" bandwidth="507246" width="320" height="180"
                            frameRate="30" sar="1:1" scanType="progressive"/>
            <Representation id="bbb_30fps_480x270_600k" codecs="avc1.640015" bandwidth="759798" width="480" height="270"
                            frameRate="30" sar="1:1" scanType="progressive"/>
            <Representation id="bbb_30fps_640x360_1000k" codecs="avc1.64001e" bandwidth="1254758" width="640"
                            height="360" frameRate="30" sar="1:1" scanType="progressive"/>
            <Representation id="bbb_30fps_640x360_800k" codecs="avc1.64001e" bandwidth="1013310" width="640"
                            height="360" frameRate="30" sar="1:1" scanType="progressive"/>
            <Representation id="bbb_30fps_768x432_1500k" codecs="avc1.64001e" bandwidth="1883700" width="768"
                            height="432" frameRate="30" sar="1:1" scanType="progressive"/>
            <Representation id="bbb_30fps_3840x2160_12000k" codecs="avc1.640033" bandwidth="14931538" width="3840"
                            height="2160" frameRate="30" sar="1:1" scanType="progressive"/>
        </AdaptationSet>
        <AdaptationSet mimeType="audio/mp4" contentType="audio" subsegmentAlignment="true" subsegmentStartsWithSAP="1">
            <Accessibility schemeIdUri="urn:tva:metadata:cs:AudioPurposeCS:2007" value="6"/>
            <Role schemeIdUri="urn:mpeg:dash:role:2011" value="main"/>
            <SegmentTemplate duration="192512" timescale="48000"
                             media="$RepresentationID$/$RepresentationID$_$Number$.m4a" startNumber="1"
                             initialization="$RepresentationID$/$RepresentationID$_0.m4a"/>
            <Representation id="bbb_a64k" codecs="mp4a.40.5" bandwidth="67071" audioSamplingRate="48000">
                <AudioChannelConfiguration schemeIdUri="urn:mpeg:dash:23003:3:audio_channel_configuration:2011"
                                           value="2"/>
            </Representation>
        </AdaptationSet>
    </Period>
</MPD>

Line-by-line Analysis:

1. Global Information (<MPD> Tag)

  • mediaPresentationDuration="PT634.566S": The total duration of the media, 634.566 seconds.
  • minBufferTime="PT2.00S": For smooth playback, the player must pre-download at least 2 seconds of data.
  • type="static": One of the most critical attributes! static indicates this is a VOD (Video on Demand) video. All video segments are already generated.
  • <BaseURL>./</BaseURL>: The base URL for all media segment paths. ./ indicates a relative path; the segment folder is in the same directory as the MPD file.

2. Video Track (<AdaptationSet> for Video)

  • mimeType="video/mp4": The container format is MP4.
  • <SegmentTemplate>: Segment Rules - This is the recipe for how the player constructs file URLs!
    • timescale="30" + duration="120": 120 / 30 = 4 seconds. Each video segment is 4 seconds long.
    • initialization="$RepresentationID$/$RepresentationID$_0.m4v": The initialization segment path. $RepresentationID$ is a variable that will be replaced by the specific quality ID.
    • media="$RepresentationID$/$RepresentationID$_$Number$.m4v": The media segment path. $Number$ is a variable representing the segment sequence number.
    • Example: For the version with id="bbb_30fps_1280x720_4000k", the path to its 5th segment is: ./bbb_30fps_1280x720_4000k/bbb_30fps_1280x720_4000k_5.m4v
  • <Representation>: Quality Versions - Here lists 10 qualities.
    • 4K Version Example:
      <Representation id="bbb_30fps_3840x2160_12000k"
                     codecs="avc1.640033"
                     bandwidth="14931538"
                     width="3840" height="2160"
                     frameRate="30"/>
      bandwidth="14931538" indicates a bitrate of ~14.9 Mbps, which is the most critical number for the player to make switching decisions.
    • Low Quality Version Example:
      <Representation id="bbb_30fps_320x180_200k"
                     codecs="avc1.64000d"
                     bandwidth="254320"
                     width="320" height="180"
                     frameRate="30"/>
      The bitrate is only ~254 kbps, saving a significant amount of data.

3. Audio Track (<AdaptationSet> for Audio)

  • Usually there is only one audio stream, as different languages are placed in different AdaptationSet elements.
  • Audio <SegmentTemplate>:
    • timescale="48000" + duration="192512": 192512 / 48000 ≈ 4.01 seconds. The segment duration aligns with the video’s ~4 seconds for easier audio-video synchronization.
  • Audio <Representation>:
    • codecs="mp4a.40.5": Represents AAC-LC audio codec.
    • bandwidth="67071": Bitrate is approximately 67 kbps.
    • <AudioChannelConfiguration value="2"/>: Stereo (2 channels).

4. The Player’s Workflow

  1. Fetch MPD: Download and parse this XML file.
  2. Understand Structure: Know that this is a VOD video with 10 video qualities and 1 audio quality, and all segments are about 4 seconds long.
  3. Download Initialization Segments: Download the initialization segments for audio and video (e.g., ..._0.m4v and ..._0.m4a) based on the templates.
  4. Adaptive Loop:
    • Monitor Network: Calculate the current download speed.
    • Select Quality: Compare current network speed with the bandwidth values to select the highest video quality that can play smoothly (e.g., 720p).
    • Download Media Segments: Based on the media attribute template, construct the URL for the next 4-second video segment and download it. Also download the corresponding audio segment.
    • Decode and Play: Feed segments into the decoder for playback.
    • Repeat Continuously: After downloading each segment, re-evaluate network conditions and repeat this process until playback ends.

5. Practical Tips for Developers

  1. Static vs Dynamic: type="static" indicates VOD, where the entire MPD information is complete. type="dynamic" indicates Live streaming, where the MPD updates continuously and includes attributes like minimumUpdatePeriod.
  2. Bitrate Adaptation Logic: The DASH protocol itself only provides the “roadmap”. How to select the most suitable Representation based on bandwidth is an algorithm the player developer needs to implement. You can use open-source libraries like dash.js, Shaka Player, or ExoPlayer, which have built-in mature algorithms.
  3. Debugging Tools: The “Network” tab in your browser’s F12 Developer Tools is the best place to observe the DASH workflow. You will see requests for the .mpd file, as well as requests for numerous .m4s or .m4v/.m4a segments.

Summary

The MPD file is the “brain” and “roadmap” of the DASH streaming system. Through a clear XML structure, it efficiently organizes complex multi-bitrate media stream information in a loosely coupled manner. Through the line-by-line interpretation of the example above, we can see how the player relies on core elements like BaseURL, SegmentTemplate, and Representation to piece together actual media segment URLs and intelligently complete adaptive switching based on the bandwidth attribute. Understanding this gives you the key to integrating DASH streaming functionality.