Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clarification about when CodecPrivate is mandatory #446

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

JeromeMartinez
Copy link
Contributor

More wording about when CodecPrivate is mandatory, following https://mailarchive.ietf.org/arch/msg/cellar/8vUqFFDT-Dqu1kc2ZwE_qOBvH8M/.

Also comply to the sentence:

Each encoding supported for storage in Matroska MUST have a defined Initialization.

And use the same wording Private Data everywhere.

I am not 100% sure about the Initialization: none (when nothing was previously written, I consider that there is no init), please double-check.

ebml_matroska.xml Outdated Show resolved Hide resolved
codec_specs.md Outdated Show resolved Hide resolved
@JeromeMartinez
Copy link
Contributor Author

@mbunkus please review again.

Copy link
Contributor

@mbunkus mbunkus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. I just noticed two more things, one of which is factually wrong in the original already, the other being a question of style.

I'd be fine having both of those issues resolved in separate MRs (especially the one about verbiage as there are a lot more pieces of text that would benefit from similar scrutiny) and accepting this one as-is. Thoughts?

codec_specs.md Outdated
For more information, see (#ssa-ass-subtitles) on SSA/ASS.

Initialization: The `Private Data` contains the [Script Info] and [V4 Styles] sections.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, the original text was wrong. For S_TEXT/ASS the two sections are [Script Info] (this one was correct) and [V4+ Styles] (this one was incorrect; it's [V4 Styles] without the + for SSA but [V4+ Styles] with the + for ASS). So let's correct it while we touch this part anyway.

Thanks!

codec_specs.md Outdated
**SHOULD NOT** be written and **MUST** be ignored. Data that is defined Initialization to be
**SHOULD NOT** be written and **MUST** be ignored. If the encoding does require any form of Initialization,
then `CodecPrivate Element` **MUST** be written and **MUST** be provided to the decoder.
Data that is defined Initialization to be
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know that this sentence wasn't changed, but to be honest, I didn't like the original anyway. Maybe we can change it right now, too. On the other hand changing verbiage might be better handled as a separate MR.

If we wanted to change it now, the following might be better:

The Initialization data to be stored in the CodecPrivate Element is referred to as Private Data.

Again, I'd be fine accepting this MR without this particular change.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we wanted to change it now, the following might be better

Obvious bad typos, I fixed them in the same commit. Please review again.

Copy link
Contributor

@robUx4 robUx4 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly OK apart from 'none' which IMO should also accept the empty value (meaning default value in Matroska). Especially for codecs that can have 'none' and some actual value. Or maybe they should NOT have 'none' but define the default value instead. (which would also save a tiny bit of space).
It might be tricky to define a default value though. It could be a large binary blob (Vorbis).

@@ -398,7 +398,7 @@ If this Element is used, then any Language Elements used in the same TrackEntry
see [@!I-D.ietf-cellar-codec] for more info.</documentation>
</element>
<element name="CodecPrivate" path="\Segment\Tracks\TrackEntry\CodecPrivate" id="0x63A2" type="binary" maxOccurs="1">
<documentation lang="en" purpose="definition">Private data only known to the codec.</documentation>
<documentation lang="en" purpose="definition">Private data only known to the codec. This element **MUST NOT** be present if the codec mapping specification defines no initialization or an initialization `none`, else **MUST** be present.</documentation>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO present and with a size of 0 would also mean 'none'. For example in VP9 you can have one to give the 10/12 bits profiles. But there may be none. IMO writing an empty field is better than no field at all in this case. It's clear that it's the default profile. (there are tons of 10 bits VP9 files with no CodecPrivate and there's no way to tell if they are 8 or 10 bits).

This is also in line with

If the encoding does require any form of Initialization then CodecPrivate Element MUST be written

V_PRORES would probably benefit from that as well.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we're trying to stay away from "empty elements" this comment is not valid anymore.

The VP9 case still stands. Although there is some definition of what to put in the CodecPrivate, in most cases the element is not there at all. It doesn't fall in neither the no initialization nor the initialization to 'none' categories. So in the end it should be up to the codec to decide when it makes sense to put it and what it means when it's not there (for VP9 that means 8 bits and some unknown profile).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

poke

IMO this addition is not good, it's up to the codec to tell what it means if it's present or not.

codec_specs.md Show resolved Hide resolved

For each line containing the timestamp and file position data is read from the appropriate
position in the .sub file. This data consists of a MPEG program stream which in turn
contains SPU packets. The MPEG program stream data is discarded, and each SPU packet
is put into one Matroska frame.

Initialization: The .idx file is stripped of all empty lines, of all comments and of lines beginning with `alt:` or `langidx:`.
The line beginning with `id:` **SHOULD** be transformed into the appropriate Matroska track language element
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The SHOULD should probably be a MUST and mention that the line is discarded. But the wording for this whole codec is questionable and could be fixed separately.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

poke

If we leave SHOULD we have to explain in which case do use it and in which case we don't.

@robUx4 robUx4 added clarifications codec mapping spec_codecs Codec Matroska spec document target labels Jan 2, 2021
@@ -398,7 +398,7 @@ If this Element is used, then any Language Elements used in the same TrackEntry
see [@!I-D.ietf-cellar-codec] for more info.</documentation>
</element>
<element name="CodecPrivate" path="\Segment\Tracks\TrackEntry\CodecPrivate" id="0x63A2" type="binary" maxOccurs="1">
<documentation lang="en" purpose="definition">Private data only known to the codec.</documentation>
<documentation lang="en" purpose="definition">Private data only known to the codec. This element **MUST NOT** be present if the codec mapping specification defines no initialization or an initialization `none`, else **MUST** be present.</documentation>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

poke

IMO this addition is not good, it's up to the codec to tell what it means if it's present or not.


For each line containing the timestamp and file position data is read from the appropriate
position in the .sub file. This data consists of a MPEG program stream which in turn
contains SPU packets. The MPEG program stream data is discarded, and each SPU packet
is put into one Matroska frame.

Initialization: The .idx file is stripped of all empty lines, of all comments and of lines beginning with `alt:` or `langidx:`.
The line beginning with `id:` **SHOULD** be transformed into the appropriate Matroska track language element
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

poke

If we leave SHOULD we have to explain in which case do use it and in which case we don't.


For each line containing the timestamp and file position data is read from the appropriate
position in the .sub file. This data consists of a MPEG program stream which in turn
contains SPU packets. The MPEG program stream data is discarded, and each SPU packet
is put into one Matroska frame.

Initialization: The .idx file is stripped of all empty lines, of all comments and of lines beginning with `alt:` or `langidx:`.
The line beginning with `id:` **SHOULD** be transformed into the appropriate Matroska track language element
and is discarded. The `Private Data` contains all remaining lines but the ones containing timestamps and file positions.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be CodecPrivate not Private Data.

robUx4 added a commit to robUx4/matroska-specification that referenced this pull request Nov 10, 2024
robUx4 added a commit that referenced this pull request Nov 11, 2024
robUx4 added a commit to robUx4/matroska-specification that referenced this pull request Dec 14, 2024
robUx4 added a commit to robUx4/matroska-specification that referenced this pull request Dec 22, 2024
"If the encoding does not require any form of Initialization,
then `none` **MUST** be used to define the Initialization"

Ref. ietf-wg-cellar#446

Co-authored-by: Jérôme Martinez <[email protected]>
robUx4 added a commit that referenced this pull request Jan 4, 2025
"If the encoding does not require any form of Initialization,
then `none` **MUST** be used to define the Initialization"

Ref. #446

Co-authored-by: Jérôme Martinez <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clarifications codec mapping spec_codecs Codec Matroska spec document target
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants