-
Notifications
You must be signed in to change notification settings - Fork 45
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make the TrackEntry\Language mandatory #451
Conversation
Following the bug/inconsistency found in EBML [1] it might be better to make the track language mandatory. It will give better past/present compatibility. In light of the next Track Selection explanation it's also vital to know the language when you have more than one audio/subtitle track. This may be a problem for video tracks that were not mandatory and technically didn't have a defined language. The flag does make sense though, even for video. For example some text on the screen may appear differently between languages. This change will turn every track of every file that doesn't have this value set assume it's in English. That's probably what any system dealing with track selection is already doing anyway given it was the official default value. [1] ietf-wg-cellar/ebml-specification#394
I think this a simple solution, but not the best. I have read all the other issues and PRs to this topic and yes there are many tools/apps out which handle mandatory and default value not correct. Like MKVToolNix which don't write the track-language element if the value is eng, but this is wrong. The header of the track-language element must be written, but it's possible to omit the value. And when reading the track section and there is no language element, then the language is undefined and not English(eng). |
I know you didn't mean it like that, but I find this somewhat insulting. All those programs worked spec-compliant for close to 18 years. Their authors followed what we told them how to handle mandatory elements that aren't present. We just changed the specs under them for no good reason and suddenly they all have to do more work and deal with bug reports and ambiguity. From my point of view it's the other way around: the EBML RFC is buggy and doesn't follow the what's been established practice for 18 years. |
ready to merge? |
Dang, I thought I had approved this one already. Ready from my point of view. |
Yes we talked about it at the Matroska meeting and we agreed, but in this PR are 3(now 2) approval was left. |
It sometimes has a language, you can have the star wars intro in several languages or japanime with signs in japanese or translated. As I understand:
|
No. According to the new rules, such an element (called There are btw three more elements that need to be made (partially) mandatory in order not to break old files: |
For parsers that act before-spec-change-like:
Signalling "there is no linguistic content" or "the language is unknown" or "there are multiple languages" can all be done with special ISO 639-2 codes:
The "I don't know!" situation would call for using |
We are lucky here that there is a value for "I don't know!" but it may not be the case for other strings, so IIUC we'll have to be careful when setting a non 0-sized default value for string, in the case of that a 0-sized string makes sense for an element. |
MTX reads all fine, in the merge tab the value "und" is used, same in the header editor. |
So it is now the wrong behavior, do I understand correctly now? |
Mmmh... yes, "eng" should be used -> new RFC specs |
Indeed.
It can't break compliant older files: All strings with a default value must be in ISO-639-2 form; an empty string is not. |
But also with a mandatory attribute a Matroska writer can write only the element header with a zero-size and a parser have to use "eng". |
Yes, if this feature is used (which makes no sense for a mandatory element), then old parsers are broken. (I presume your post was supposed to be a reply to my comment on your post that you cited?) |
No. I have comment my own words, because I recognized that adding a mandatory attribute not resolves the issue with zero-size elements for some softwares. |
Yes. We're starting with the track language as it's the most common and problematic one. It's also a test to check what is the impact of such a change before we generalize it. |
We already established that the "zero-size" feature is probably not used in the wild, so IMO that's not a big deal if we break it. Current implementation use an empty string in this case which, as @mkver mentioned, is not a valid language string. So in any case such a software doesn't read a proper value anyway (or if it's half smart it detects the empty string and use the default value for that element in the end). As I said, in VLC this is already the case where the language is set to "eng" by default, even before the Track is read. This is the case in libavformat as well. That covers most desktop readers. There are also web browsers (WebM) that can parse this element as well. But I'm not aware they can select a track based on its language so even if they read the element (wrong) it's not used. It's mostly informational in the file. I'm not sure how this is handled in the default Android media player, is it possible to select the track based on the language ? Then there are the other appliances (TVs, Blu-ray players, media streamers) that can read .MKV files. The good ones can show the language of each track so the user can select it. I have no idea if they are using their own parser, a parser they buy with the chip (along with the codec handling code) or use libavformat. We can eliminate the case of 'zero-size' because those files don't exist anyway. This is likely untested and/or unhandled. The issue is when the element is not present. Did they assume it was And a video track with a set language is also very unlikely to be any use to them either. It's really only used for audio and subtitles. So given all that, I think making the |
Totally agree. |
2 missing approvals |
Following the bug/inconsistency found in EBML [1] it might be better to make
the track language mandatory. It will give better past/present compatibility.
In light of the next Track Selection explanation it's also vital to know the
language when you have more than one audio/subtitle track.
This may be a problem for video tracks that were not mandatory and technically
didn't have a defined language. The flag does make sense though, even for video.
For example some text on the screen may appear differently between languages.
This change will turn every track of every file that doesn't have this value set
assume it's in English. That's probably what any system dealing with track selection
is already doing anyway given it was the official default value.
[1] ietf-wg-cellar/ebml-specification#394