Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SCC: negative cursor leads to strange behavior #406

Closed
humberthardy opened this issue Oct 3, 2023 · 9 comments
Closed

SCC: negative cursor leads to strange behavior #406

humberthardy opened this issue Oct 3, 2023 · 9 comments
Assignees

Comments

@humberthardy
Copy link
Contributor

Hello,

I may have found an issue in SCCReader

If the cursor becomes negative in SccCaptionText.append, old text is overwritten by the new one

Here is a test.

  1. The cursor is put at row 15, col 16 and writes "Eeech!"
  2. The cursor is put at col 8 on the same line (15) and writes "Scary"

The current result is Scary!h instead of Scary Eeech!

I am not sure if it is a bug or a bad\unauthorized SCC sequence, any thought about that?

  def test_scc_with_negative_cursor(self):
    scc_content = """Scenarist_SCC V1.0
00:00:01:00	94AE 94AE 9420 9420 94F8 94F8 45E5 E5E3 68A1 94F4 94F4 D3E3 61F2 79A1 942C 942C 942F 942F
00:00:02:00	942F 942F
    """
    scc_disassembly_expected = """\
00:00:01:00	{ENM}{ENM}{RCL}{RCL}{1516}{1516}Eeech!{1508}{1508}Scary!{EDM}{EDM}{EOC}{EOC}
00:00:02:00	{EOC}{EOC}
"""
    scc_disassembly = to_disassembly(scc_content)
    self.assertEqual(scc_disassembly_expected, scc_disassembly)
    import ttconv.srt.writer as srt_writer

    doc = to_model(scc_content)
    current_srt = srt_writer.from_model(doc)

    #bad_srt = """10:00:01,400 --> 00:00:02,033\nScary!h!\n"""
    expected_srt = """10:00:01,400 --> 00:00:02,033\nScary!  Eeech!"""
    self.assertEqual(current_srt, expected_srt)
@palemieux
Copy link
Contributor

@humberthardy Did you try the test case with the PR at #395, which is a subtanstial refactor of the SCC reader?

@humberthardy
Copy link
Contributor Author

master and PR #395 have the same behaviour

@humberthardy
Copy link
Contributor Author

Thank you @valnoel, I will test it soon!

@humberthardy
Copy link
Contributor Author

I found some errors, but, I think they are more related to the parent branch, PR #315`.
I can open issues in your repo @valnoel or, I can try to fix it too.

Keep in mind that the main problem is (probably) a malformed SCC 😄

  1. There is no default style
    With ttconv/master, if no style has been selected, we pop-on by default.
    https://github.com/sandflow/ttconv/blob/b2cc10a963633e1f09682c9b5330eeddfbdbb555/src/main/python/ttconv/scc/reader.py#L220C26-L220C26
    I have a scc with no RCL & RU. the whole content is lost, probably better to pop up by default than ignore it?

2 - Backspace at beginning of a line.
SCC code : 10:01:44;17 94AE 9420 9470 9723 946E 80C1 92B0 20ec 6120 e6e9 6e20 64e5 7320 616e 6edc e573 2031 38b0 b02c 942C 8080 8080 942F

src/main/python/ttconv/scc/context.py", line 160, in backspace
    self.get_caption_to_process().get_current_text().backspace()
AttributeError: 'NoneType' object has no attribute 'backspace'

Easy fix: Rather than check for get_current_text() is not None, maybe we can comment context.backspace() for SccExtendedCharacter in line.py?

3 -Roll up with empty caption
SCC code : 10:03:20:16 94ad 94ad 9426 9426 92d0 92d0 a880 9138 9138 2064 942c 942c e575 f820 76ef e9f8 2c20 e56e 2061 6e67 ec61 e973 29ba

ttconv/scc/context.py", line 467, in process_text
    self.active_caption.append_text(word)
AttributeError: 'NoneType' object has no attribute 'append_text'

4 - Adding text with no active line
SCC code: 10:55:31:29 2080 3280 2046 3180

ttconv/scc/caption_paragraph.py", line 129, in append_text
    self.get_current_line().add_text(text)
AttributeError: 'NoneType' object has no attribute 'add_text'

@valnoel
Copy link
Collaborator

valnoel commented Oct 12, 2023

@humberthardy Thanks for the feedback.

I wasn't sure we wanted to make the SCC reader resilient or constraining regarding the CEA-608 specification until we discuss with @palemieux...
So I've updated #395 with e749eb9 to make the SCC parsing less restrictive and to handle your cases.

Could you please check if this works for you at this point? Thanks!

@humberthardy
Copy link
Contributor Author

humberthardy commented Oct 16, 2023

Thanks @valnoel for your amazing work, it seems to work better.
I've tested it blind on over a thousand SCCs and it seems to work 99% of the time.
I still have a few errors (~ 10) I don't know if it's worth fixing it.
I've identified a few, if you are interested, you can find a sample below.
The last one is probably the most interesting if I get it: we are switching from Popon to rollup, the SCCContext is in rollup style but the current paragraph still in Popon style leading to a runtime error.

Again: thank you!

  def test_scc_content_leads_to_python_error(self):
    scc_content = """\
Scenarist_SCC V1.0

00:00:03:01	9370 6970 7375 6d00 942c 942f

09:59:59:18	9420 9476 9723 2080 9420 942c 942f 9420 97f4 a8e3 616d e5f2 6120 7368 75f4 f4e5 f229 9452 9723 91ae bce3 6861 eff4 e9e3 20e3 61e3 ef70 68ef 6e79 9470 9723 91ae e5f8 e973 f473 20f4 68f2 ef75 6768 ef75 f420 70f2 ef67 f261 6d3e 9420 942c 942f 9420 94f4 a870 ec61 79e6 75ec 206d 7573 e9e3 2980

11:00:09:07	942c
"""

#     text_alignment = self.active_caption.guess_text_alignment()
#   File "ttconv/src/main/python/ttconv/scc/caption_paragraph.py", line 288, in guess_text_alignment
#   left_border = longest_line.get_indent() + longest_line.get_leading_spaces()
# File "ttconv/src/main/python/ttconv/scc/caption_line.py", line 165, in get_leading_spaces
# first_text = self.get_texts()[index].get_text()
# IndexError: list index out of range
    scc_content = """\
Scenarist_SCC V1.0

01:02:28:03	9420 9420 94D0 94D0 D0D5 49D3 2043 A745 D354 2051 D545 4C51 D545 2043 C84F D345 2051 D5A7 4FCE 94F2 94F2 D045 D554 20D0 C1D3 20D3 4520 C445 C2C1 5252 C1D3 D345 52AE 942C 942C 942F 942F

01:02:34:04	942C 942C

01:02:41:17	2020
"""
#     self.buffered_caption.append_text(word)
#   File "ttconv/src/main/python/ttconv/scc/caption_paragraph.py", line 129, in append_text
#   self.get_current_line().add_text(text)
# AttributeError: 'NoneType' object has no attribute 'add_text'


    scc_content = '''\
Scenarist_SCC V1.0

11:19:24:05	9420 946E A861 7070 EC61 7564 E973 73E5 6DE5 6EF4 7329 9420 942C 942F

11:19:30:17	9426 94AD 946E 3E3E 3E20 CE61 F2F2 61F4 E575 F2BA 2043 E5F4 F4E5
'''
# 11:19:24:05	{RCL}{15WhI}(applaudissements){RCL}{EDM}{EOC}
# 11:19:30:17	{RU3}{CR}{15WhI}>>> Narrateur: Cette
#                       ^{CR}:  RuntimeError: Cannot roll-Up PopOn-styled caption.
#

    dis = to_disassembly(scc_content)
    doc = to_model(scc_content)

    self.assertIsNotNone(doc)

    region_1 = doc.get_region("pop1")
    self.assertIsNotNone(region_1)

@valnoel
Copy link
Collaborator

valnoel commented Oct 24, 2023

@humberthardy I've updated my PR, addressing the last cases you posted. Could you please have a look at it? Thanks!

@humberthardy
Copy link
Contributor Author

I no longer see any python runtime errors.
Thank you very much @valnoel

@palemieux
Copy link
Contributor

Closed by #395

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants