Clarification about when CodecPrivate is mandatory #446

Open
wants to merge 1 commit into
base: master
51 changes: 37 additions & 14 deletions codec_specs.md
@@ -71,8 +71,9 @@ which **MUST** be stored within the `CodecPrivate Element`. When the Initialization
within a track, then that updated Initialization data **MUST** be written into the `CodecState Element`
of the first `Cluster` to require it. If the encoding does not require any form of Initialization,
then `none` **MUST** be used to define the Initialization and the `CodecPrivate Element`
**SHOULD NOT** be written and **MUST** be ignored. Data that is defined Initialization to be
stored in the `CodecPrivate Element` is known as `Private Data`.
**SHOULD NOT** be written and **MUST** be ignored. If the encoding does require any form of Initialization,
then `CodecPrivate Element` **MUST** be written and **MUST** be provided to the decoder.
robUx4 marked this conversation as resolved.
The Initialization data to be stored in the `CodecPrivate Element` is referred to as `Private Data`.
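
The rule as worded in this hunk can be sketched from the reader's side as follows. This is only an illustration under stated assumptions: `private_data_for_decoder` and its `init_required` flag are hypothetical names, and codec mappings where `CodecPrivate` is optional (the VP9 case raised in review) are not covered.

```python
from typing import Optional


def private_data_for_decoder(init_required: bool,
                             codec_private: Optional[bytes]) -> Optional[bytes]:
    """Return the bytes a reader would hand to the decoder, or None.

    `init_required` is a hypothetical flag meaning the codec mapping
    defines an Initialization other than `none`.
    """
    if not init_required:
        # Initialization is `none`: any CodecPrivate found MUST be ignored.
        return None
    if codec_private is None:
        # Initialization is required, so CodecPrivate MUST have been written.
        raise ValueError("CodecPrivate is required by this codec mapping")
    # The Private Data MUST be provided to the decoder unchanged.
    return codec_private
```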

### Codec BlockAdditions

@@ -332,8 +333,8 @@ Codec Name: Theora

Initialization: The `Private Data` contains the first three Theora packets in order. The lengths of the packets precede them. The actual layout is:

* Byte 1: number of distinct packets `#p` minus one inside the CodecPrivate block. This **MUST** be "2" for current (as of 2016-07-08) Theora headers.
* Bytes 2..n: lengths of the first `#p` packets, coded in Xiph-style lacing. The length of the last packet is the length of the CodecPrivate block minus the lengths coded in these bytes minus one.
* Byte 1: number of distinct packets `#p` minus one inside the `Private Data`. This **MUST** be "2" for current (as of 2016-07-08) Theora headers.
* Bytes 2..n: lengths of the first `#p` packets, coded in Xiph-style lacing. The length of the last packet is the length of the `Private Data` minus the lengths coded in these bytes minus one.
* Bytes n+1..: The Theora identification header, followed by the comment header, followed by the codec setup header. Those are described in the [Theora specs](http://www.theora.org/doc/Theora.pdf).
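
A minimal sketch of how a reader might split such `Private Data` back into its packets. The function name `split_xiph_private_data` is hypothetical. The sketch assumes the usual Xiph lacing rule (each length is the sum of consecutive bytes up to and including the first byte below 255) and takes the last packet to be whatever remains after the count byte, the lacing bytes, and the packets whose lengths were coded.

```python
from typing import List


def split_xiph_private_data(data: bytes) -> List[bytes]:
    """Split Xiph-laced Private Data into its header packets."""
    if not data:
        raise ValueError("empty Private Data")
    packet_count = data[0] + 1  # byte 1 stores the packet count minus one
    pos = 1
    lengths: List[int] = []
    # Lengths are coded for every packet except the last one.
    for _ in range(packet_count - 1):
        length = 0
        while True:
            if pos >= len(data):
                raise ValueError("truncated lacing bytes")
            byte = data[pos]
            pos += 1
            length += byte
            if byte != 255:  # a byte below 255 ends this length
                break
        lengths.append(length)
    # The last packet takes everything that is left.
    remaining = len(data) - pos - sum(lengths)
    if remaining < 0:
        raise ValueError("coded lengths exceed the Private Data size")
    lengths.append(remaining)
    packets: List[bytes] = []
    for length in lengths:
        packets.append(data[pos:pos + length])
        pos += length
    return packets
```

For Theora the result would be expected to hold three packets: identification, comment, and setup header, in that order.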

### V_PRORES
@@ -389,7 +390,7 @@ Description: FFV1 is a lossless intra-frame video encoding format designed to ef
Compared to uncompressed video, FFV1 offers storage compression, frame fixity, and self-description,
which makes FFV1 useful as a preservation or intermediate video format. [Draft FFV1 Specification](https://datatracker.ietf.org/doc/draft-ietf-cellar-ffv1/)

Initialization: For FFV1 versions 0 or 1, `Private Data` **SHOULD NOT** be written. For FFV1 version 3 or greater, the `Private Data` **MUST** contain the FFV1 Configuration Record structure, as defined in https://tools.ietf.org/html/draft-ietf-cellar-ffv1-04#section-4.2, and no other data.
Initialization: For FFV1 versions 0 or 1, none. For FFV1 version 3 or greater, the `Private Data` contains the FFV1 Configuration Record structure, as defined in https://tools.ietf.org/html/draft-ietf-cellar-ffv1-04#section-4.2, and no other data.

## Audio Codec Mappings

@@ -471,9 +472,11 @@ Codec ID: A_AC3

Codec Name: (Dolby™) AC3

Description: BSID <= 8 !! The private data is void ??? Corresponding ACM wFormatTag : 0x2000 ; channel number have
Description: For BSID <= 8. Corresponding ACM wFormatTag: 0x2000; the channel number has
to be read from the corresponding audio element

Initialization: none

### A_AC3/BSID9

Codec ID: A_AC3/BSID9
@@ -549,10 +552,10 @@ Codec ID: A_VORBIS
Codec Name: Vorbis

Initialization: The `Private Data` contains the first three Vorbis packets in order. The lengths of the packets precede them. The actual layout is:
- Byte 1: number of distinct packets `#p` minus one inside the CodecPrivate block.
- Byte 1: number of distinct packets `#p` minus one inside the `Private Data`.
This **MUST** be "2" for current (as of 2016-07-08) Vorbis headers.
- Bytes 2..n: lengths of the first `#p` packets, coded in Xiph-style lacing.
The length of the last packet is the length of the CodecPrivate block minus the lengths coded in these bytes minus one.
The length of the last packet is the length of the `Private Data` minus the lengths coded in these bytes minus one.
- Bytes n+1..: The [Vorbis identification header](https://xiph.org/vorbis/doc/Vorbis_I_spec.html),
followed by the [Vorbis comment header](https://xiph.org/vorbis/doc/v-comment.html)
followed by the [codec setup header](https://xiph.org/vorbis/doc/Vorbis_I_spec.html).
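
From the writer's side, the layout above could be produced roughly as follows. This is a sketch, not a reference muxer: `xiph_lace_length` and `build_xiph_private_data` are hypothetical helper names, and the three Vorbis header packets are assumed to be already available as byte strings.

```python
from typing import Sequence


def xiph_lace_length(length: int) -> bytes:
    """Code one length in Xiph-style lacing: 255-valued bytes, then the rest."""
    full, rest = divmod(length, 255)
    # A final byte below 255 always terminates the length, even when it is 0.
    return b"\xff" * full + bytes([rest])


def build_xiph_private_data(packets: Sequence[bytes]) -> bytes:
    """Assemble the Private Data from the header packets, in order."""
    if not 1 <= len(packets) <= 256:
        raise ValueError("packet count must fit in one byte")
    out = bytearray([len(packets) - 1])  # byte 1: packet count minus one
    for packet in packets[:-1]:          # the last length is implicit
        out += xiph_lace_length(len(packet))
    for packet in packets:
        out += packet
    return bytes(out)
```

With three packets the first byte comes out as 2, matching the **MUST** in the layout.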
@@ -815,24 +818,30 @@ Codec Name: UTF-8 Plain Text

Description: Basic text subtitles. For more information, see (#subtitles) on Subtitles.

Initialization: none

### S_TEXT/SSA

Codec ID: S_TEXT/SSA

Codec Name: Subtitles Format

Description: The [Script Info] and [V4 Styles] sections are stored in the codecprivate. Each event is stored in its own Block.
Description: Each event is stored in its own Block.
For more information, see (#ssa-ass-subtitles) on SSA/ASS.

Initialization: The `Private Data` contains the [Script Info] and [V4 Styles] sections.

### S_TEXT/ASS

Codec ID: S_TEXT/ASS

Codec Name: Advanced Subtitles Format

Description: The [Script Info] and [V4 Styles] sections are stored in the codecprivate. Each event is stored in its own Block.
Description: Each event is stored in its own Block.
For more information, see (#ssa-ass-subtitles) on SSA/ASS.

Initialization: The `Private Data` contains the [Script Info] and [V4+ Styles] sections.

### S_TEXT/WEBVTT

Codec ID: S_TEXT/WEBVTT
@@ -841,6 +850,8 @@ Codec Name: Web Video Text Tracks Format (WebVTT)

Description: Advanced text subtitles. For more information, see (#webvtt) on WebVTT.

Initialization: none

### S_IMAGE/BMP

Codec ID: S_IMAGE/BMP
@@ -852,6 +863,8 @@ The timestamp in the block header of Matroska indicates the start display time,
the duration is set with the Duration element. The full data for the subtitle bitmap
is stored in the Block's data section.

Initialization: none

### S_DVBSUB

Codec ID: S_DVBSUB
@@ -861,6 +874,8 @@ Codec Name: Digital Video Broadcasting (DVB) subtitles
Description: This is the graphical subtitle format used in the Digital Video Broadcasting standard.
For more information, see (#digital-video-broadcasting-dvb-subtitles) on Digital Video Broadcasting (DVB).

Initialization: none

### S_VOBSUB

Codec ID: S_VOBSUB
@@ -869,16 +884,16 @@ Codec Name: VobSub subtitles

Description: The same subtitle format used on DVDs. Only format version 7 and newer are supported.
VobSubs consist of two files, the .idx containing information, and the .sub, containing the actual data.
The .idx file is stripped of all empty lines, of all comments and of lines beginning with `alt:` or `langidx:`.
The line beginning with `id:` **SHOULD** be transformed into the appropriate Matroska track language element
and is discarded. All remaining lines but the ones containing timestamps and file positions
are put into the `CodecPrivate` element.

For each line containing the timestamp and file position, data is read from the appropriate
position in the .sub file. This data consists of an MPEG program stream, which in turn
contains SPU packets. The MPEG program stream data is discarded, and each SPU packet
is put into one Matroska frame.

Initialization: The .idx file is stripped of all empty lines, of all comments and of lines beginning with `alt:` or `langidx:`.
The line beginning with `id:` **SHOULD** be transformed into the appropriate Matroska track language element
Contributor

The SHOULD should probably be a MUST and mention that the line is discarded. But the wording for this whole codec is questionable and could be fixed separately.

Contributor

poke

If we leave SHOULD, we have to explain in which cases it applies and in which it doesn't.

and is discarded. The `Private Data` contains all remaining lines but the ones containing timestamps and file positions.
Contributor

Should be CodecPrivate not Private Data.
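
The .idx filtering described for S_VOBSUB can be sketched as below. Several details are assumptions rather than statements from this diff: comment lines are taken to start with `#`, the lines carrying timestamps and file positions are taken to start with `timestamp:`, and the language is read as the first comma-separated field of the `id:` line. The function name `idx_to_private_data` is hypothetical.

```python
from typing import List, Optional, Tuple


def idx_to_private_data(idx_text: str) -> Tuple[str, Optional[str]]:
    """Return (lines kept for the CodecPrivate, language from the `id:` line)."""
    kept: List[str] = []
    language: Optional[str] = None
    for raw in idx_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # empty lines and comments (assumed '#' prefix)
        if line.startswith(("alt:", "langidx:")):
            continue  # stripped per the mapping
        if line.startswith("id:"):
            # Assumed form "id: en, index: 0"; the line itself is discarded.
            language = line[3:].split(",")[0].strip()
            continue
        if line.startswith("timestamp:"):
            continue  # timestamps and file positions stay out (assumed prefix)
        kept.append(line)
    return "\n".join(kept), language
```

The returned language string would still need mapping onto the Matroska track language element, which is outside this sketch.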


### S_HDMV/PGS

Codec ID: S_HDMV/PGS
@@ -888,6 +903,8 @@ Codec Name: HDMV presentation graphics subtitles (PGS)
Description: This is the graphical subtitle format used on Blu-rays. For more information,
see (#hdmv-text-subtitles) on HDMV text presentation.

Initialization: none

### S_HDMV/TEXTST

Codec ID: S_HDMV/TEXTST
@@ -897,6 +914,8 @@ Codec Name: HDMV text subtitles
Description: This is the textual subtitle format used on Blu-rays. For more information,
see (#hdmv-presentation-graphics-subtitles) on HDMV graphics presentation.

Initialization: none

### S_KATE

Codec ID: S_KATE
@@ -907,6 +926,8 @@ Description: A subtitle format developed for ogg. The mapping for Matroska is de
on the [Xiph wiki](http://wiki.xiph.org/index.php/OggKate#Matroska_mapping).
As with Theora and Vorbis, Kate headers are stored in the private data as Xiph-laced packets.

Initialization: none

## Button Codec Mappings

### B_VOBBTN
@@ -919,6 +940,8 @@ Description: Based on [MPEG/VOB PCI packets](http://dvd.sourceforge.net/dvdinfo/
The file contains a header consisting of the string "butonDVD" followed by the width and height
in pixels (16 bits integer each) and 4 reserved bytes. The rest is full [PCI packets](http://dvd.sourceforge.net/dvdinfo/pci_pkt.html).

Initialization: none
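
As an illustration of the B_VOBBTN file header described above, a reader might parse it along these lines. The byte order of the two 16-bit fields is not stated in this text, so it is exposed as a parameter and the big-endian default is purely an assumption. `parse_vobbtn_header` is a hypothetical name.

```python
import struct
from typing import Tuple

VOBBTN_MAGIC = b"butonDVD"  # 8-byte string that opens the header
VOBBTN_HEADER_SIZE = 16     # magic (8) + width (2) + height (2) + reserved (4)


def parse_vobbtn_header(data: bytes,
                        byte_order: str = ">") -> Tuple[int, int, bytes]:
    """Return (width, height, remaining PCI packet bytes).

    `byte_order` is a struct prefix ('>' or '<'); the default is an assumption.
    """
    if len(data) < VOBBTN_HEADER_SIZE:
        raise ValueError("data shorter than the 16-byte header")
    if data[:8] != VOBBTN_MAGIC:
        raise ValueError("missing 'butonDVD' magic string")
    width, height = struct.unpack_from(byte_order + "HH", data, 8)
    # Bytes 12..15 are reserved and skipped; the rest is full PCI packets.
    return width, height, data[VOBBTN_HEADER_SIZE:]
```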

## Block Addition Mappings

Registered `BlockAddIDType` are:
2 changes: 1 addition & 1 deletion ebml_matroska.xml
@@ -398,7 +398,7 @@ If this Element is used, then any Language Elements used in the same TrackEntry
see [@!I-D.ietf-cellar-codec] for more info.</documentation>
</element>
<element name="CodecPrivate" path="\Segment\Tracks\TrackEntry\CodecPrivate" id="0x63A2" type="binary" maxOccurs="1">
<documentation lang="en" purpose="definition">Private data only known to the codec.</documentation>
<documentation lang="en" purpose="definition">Private data only known to the codec. This element **MUST NOT** be present if the codec mapping specification defines no initialization or an initialization `none`, else **MUST** be present.</documentation>
Contributor

IMO present and with a size of 0 would also mean 'none'. For example in VP9 you can have one to give the 10/12 bits profiles. But there may be none. IMO writing an empty field is better than no field at all in this case. It's clear that it's the default profile. (there are tons of 10 bits VP9 files with no CodecPrivate and there's no way to tell if they are 8 or 10 bits).

This is also in line with

If the encoding does require any form of Initialization then CodecPrivate Element MUST be written

V_PRORES would probably benefit from that as well.

Contributor

Since we're trying to stay away from "empty elements" this comment is not valid anymore.

The VP9 case still stands. Although there is some definition of what to put in the CodecPrivate, in most cases the element is not there at all. It falls into neither the "no initialization" nor the "initialization `none`" category. So in the end it should be up to the codec to decide when it makes sense to write it and what it means when it's not there (for VP9 that means 8 bits and some unknown profile).

Contributor

poke

IMO this addition is not good, it's up to the codec to tell what it means if it's present or not.

</element>
<element name="CodecName" path="\Segment\Tracks\TrackEntry\CodecName" id="0x258688" type="utf-8" maxOccurs="1">
<documentation lang="en" purpose="definition">A human-readable string specifying the codec.</documentation>