Delay time #53

Open
wants to merge 4 commits into
base: master
2 changes: 1 addition & 1 deletion LICENSE
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
MIT License

-Copyright (c) 2015-2018 Isaac Bernat
+Copyright (c) 2015-2025 Isaac Bernat

Permission is hereby granted, free of charge, to any person obtaining
a copy of this software and associated documentation files (the
7 changes: 5 additions & 2 deletions README.md
@@ -2,7 +2,7 @@
 1. [Get the subtitles](https://github.com/isaacbernat/netflix-to-srt#get-the-subtitles) (`.xml` dfxp or `.vtt` files from Netflix, YouTube... streaming media services).
 - [From Netflix](https://github.com/isaacbernat/netflix-to-srt#from-netflix)
 - [From YouTube](https://github.com/isaacbernat/netflix-to-srt#from-youtube)
-2. [Convert them](https://github.com/isaacbernat/netflix-to-srt#convert-them-into-srt) into `.srt` files.
+2. [Convert them](https://github.com/isaacbernat/netflix-to-srt#convert-them-into-srt) into `.srt` files (and/or shift timestamps).
 3. [Star this repo ⭐](https://github.com/isaacbernat/netflix-to-srt#star-this-repo)

## Get the subtitles
@@ -69,6 +69,7 @@ You need FireFox and AdblockPlus Add-On. *not tested on other browsers*
 - Copy your subtitle files in the same directory as `to_srt.py`
 - Or use `-i INPUT_PATH` and `-o OUTPUT_PATH` for custom file locations
 - All `.xml` and `.vtt` files in the input directory will generate a converted `.srt` file on the output one
+- *Optional:* Use the `-d DELAY_MS` parameter to shift all timestamps by the given number of milliseconds. Positive values delay the subtitles; negative values make them appear earlier. Example: `python to_srt.py -i samples/delay -o samples/delay -d -1500` converts every eligible file in `samples/delay` and shifts its subtitles 1.5 seconds earlier than the original.
 - Enjoy! (And **star the repo ⭐** if you liked it ;D)
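
To make the `-d` arithmetic concrete, here is a minimal sketch of what a delay of -1500 ms does to a single SRT timestamp. The helper name `shift` is hypothetical and not part of `to_srt.py`; it assumes the shifted time stays non-negative and only illustrates the expected result.

```python
from datetime import timedelta


def shift(timestamp, delay_ms):
    """Shift one 'HH:MM:SS,mmm' timestamp by delay_ms (hypothetical helper)."""
    hms, ms = timestamp.split(",")
    h, m, s = (int(part) for part in hms.split(":"))
    total = timedelta(hours=h, minutes=m, seconds=s, milliseconds=int(ms))
    total += timedelta(milliseconds=delay_ms)
    # Convert back to whole milliseconds, then split into SRT fields.
    total_ms = total // timedelta(milliseconds=1)
    h, rest = divmod(total_ms, 3_600_000)
    m, rest = divmod(rest, 60_000)
    s, ms = divmod(rest, 1_000)
    return f"{h:02}:{m:02}:{s:02},{ms:03}"


print(shift("00:01:15,450", -1500))  # 00:01:13,950
```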

## Star this repo
@@ -84,7 +85,9 @@ If you like this project, please **star the repository ⭐**. It's free and it h
 - Thanks for your contribution!

 ## Why this repository?
-VLC player could not reproduce that kind of xml subtitles and I could not find any tool that could easily transform the xml files to a suitable format (e.g. SubRip (`.srt`)) in Linux or Mac, so I wrote this script and decided to share. I got a request for WebVTT (`.vtt`) and did the same.
+[VideoLAN's VLC media player](https://www.videolan.org/vlc/) could not play that kind of XML subtitles, and I could not find any tool that could easily convert the XML files to a suitable format (e.g. SubRip (`.srt`)) on Linux or Mac, so I wrote this script and decided to share it. I later got a request for WebVTT (`.vtt`) support and added it too.
+
+Similarly, adjusting timestamps in 50 ms increments with VLC's hotkeys (G, H and/or J) was inconvenient for large mismatches (e.g. 60 seconds because of openings or recaps), so I added the `-d DELAY_MS` parameter to shift all the subtitle lines at once.

 ## TODOs
 - More robust file parsing than just some quick and dirty regexes.
33 changes: 33 additions & 0 deletions samples/delay/sample_vtt.vtt
@@ -0,0 +1,33 @@
WEBVTT

NOTE Netflix
NOTE Profile: webvtt-lssdh-ios8
NOTE Date: 2019/11/27 21:10:46

NOTE SegmentIndex
NOTE Segment=591.758 7643@494 61
NOTE Segment=596.513 12163@8137 99
NOTE Segment=595.011 9765@20300 77
NOTE Segment=597.430 8648@30065 68
NOTE Segment=390.473 4112@38713 34
NOTE /SegmentIndex




1
00:01:15.450 --> 00:01:17.577 position:50.00%,middle align:middle size:80.00% line:84.67%
Que veux-tu ?

2
00:01:17.660 --> 00:01:21.247 position:50.00%,middle align:middle size:80.00% line:79.33%
Je veux savoir si une femme
montera sur le trône de Kattegat.

3
00:01:22.165 --> 00:01:24.250 position:50.00%,middle align:middle size:80.00% line:84.67%
Tu veux dire, après le décès de Ragnar ?

4
00:01:59.503 --> 00:02:00.629 position:50.00%,middle align:middle size:80.00% line:84.67%
Oui.
17 changes: 17 additions & 0 deletions samples/delay/sample_vtt_minus_900.srt
@@ -0,0 +1,17 @@
1
00:01:14,550 --> 00:01:16,677
Que veux-tu ?

2
00:01:16,760 --> 00:01:20,347
Je veux savoir si une femme
montera sur le trône de Kattegat.

3
00:01:21,265 --> 00:01:23,350
Tu veux dire, après le décès de Ragnar ?

4
00:01:58,603 --> 00:01:59,729
Oui.

17 changes: 17 additions & 0 deletions samples/delay/sample_vtt_plus_900.srt
@@ -0,0 +1,17 @@
1
00:01:16,350 --> 00:01:18,477
Que veux-tu ?

2
00:01:18,560 --> 00:01:22,147
Je veux savoir si une femme
montera sur le trône de Kattegat.

3
00:01:23,065 --> 00:01:25,150
Tu veux dire, après le décès de Ragnar ?

4
00:02:00,403 --> 00:02:01,529
Oui.
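
As a sanity check on the two sample outputs, the sketch below compares a few timestamps copied from `sample_vtt.vtt` with their counterparts in the `minus_900` and `plus_900` files. The helper `to_ms` is hypothetical and exists only for this comparison.

```python
def to_ms(ts):
    """Convert 'HH:MM:SS,mmm' or 'HH:MM:SS.mmm' to milliseconds (hypothetical helper)."""
    hms, ms = ts.replace(".", ",").split(",")
    h, m, s = (int(p) for p in hms.split(":"))
    return ((h * 60 + m) * 60 + s) * 1000 + int(ms)


# Timestamps of cues 1 and 4, copied from the sample files.
source = ["00:01:15.450", "00:01:17.577", "00:01:59.503", "00:02:00.629"]
minus = ["00:01:14,550", "00:01:16,677", "00:01:58,603", "00:01:59,729"]
plus = ["00:01:16,350", "00:01:18,477", "00:02:00,403", "00:02:01,529"]

# Every shifted timestamp should differ from its source by the same amount.
minus_deltas = {to_ms(b) - to_ms(a) for a, b in zip(source, minus)}
plus_deltas = {to_ms(b) - to_ms(a) for a, b in zip(source, plus)}
print(minus_deltas, plus_deltas)  # {-900} {900}
```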

38 changes: 38 additions & 0 deletions samples/delay/sample_xml.xml
@@ -0,0 +1,38 @@
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<tt xmlns:tt="http://www.w3.org/ns/ttml" xmlns:ttm="http://www.w3.org/ns/ttml#metadata" xmlns:ttp="http://www.w3.org/ns/ttml#parameter" xmlns:tts="http://www.w3.org/ns/ttml#styling" ttp:cellResolution="40 19" ttp:pixelAspectRatio="1 1" ttp:tickRate="10000000" ttp:timeBase="media" tts:extent="640px 480px" xmlns="http://www.w3.org/ns/ttml">
<head>
<ttp:profile use="http://netflix.com/ttml/profile/dfxp-ls-sdh"/>
<styling>
<style tts:color="white" tts:fontFamily="monospaceSansSerif" tts:fontSize="100%" xml:id="style0"/>
</styling>
<layout>
<region xml:id="region_00">
<style tts:textAlign="left"/>
<style tts:displayAlign="center"/>
<style tts:backgroundColor="transparent"/>
</region>
<region xml:id="region_01">
<style tts:textAlign="left"/>
<style tts:displayAlign="center"/>
<style tts:backgroundColor="transparent"/>
</region>
<region xml:id="region_02">
<style tts:textAlign="left"/>
<style tts:displayAlign="center"/>
<style tts:backgroundColor="transparent"/>
</region>
</layout>
</head>
<body style="style0">
<div xml:space="preserve">
<p begin="241910001t" end="280280000t" region="region_00" tts:extent="65.00% 5.33%" tts:origin="17.50% 79.29%" xml:id="subtitle0">Ricky! Ricky! Wake the f***</p>
<p begin="241910001t" end="280280000t" region="region_01" tts:extent="42.50% 5.33%" tts:origin="20.00% 84.62%" xml:id="subtitle1">up! Ricky, get up!</p>
<p begin="281115000t" end="301970001t" region="region_00" tts:extent="52.50% 5.33%" tts:origin="22.50% 79.29%" xml:id="subtitle2">Bubbles, f*** off! I'm</p>
<p begin="281115000t" end="301970001t" region="region_01" tts:extent="45.00% 5.33%" tts:origin="20.00% 84.62%" xml:id="subtitle3">sleeping in today. </p>
<p begin="302805001t" end="330330000t" region="region_00" tts:extent="75.00% 5.33%" tts:origin="10.00% 79.29%" xml:id="subtitle4">No, you're not. You told me you</p>
<p begin="302805001t" end="330330000t" region="region_01" tts:extent="60.00% 5.33%" tts:origin="17.50% 84.62%" xml:id="subtitle5">were gonna help me put up</p>
<p begin="331165000t" end="359117508t" region="region_00" tts:extent="67.50% 5.33%" tts:origin="15.00% 79.29%" xml:id="subtitle6">flyers for the grand opening</p>
<p begin="591165000t" end="608117508t" region="region_01" tts:extent="62.50% 5.33%" tts:origin="17.50% 84.62%" xml:id="subtitle7">of the Shed and Breakfast.</p>
</div>
</body>
</tt>
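
The `begin` and `end` attributes in this sample are tick counts, and the `<tt>` element declares `ttp:tickRate="10000000"`. The sketch below shows one way such a value could map to an SRT timestamp with a delay applied. `ticks_to_srt` is a hypothetical helper, not the converter's actual code, and truncating to whole milliseconds is an assumption that happens to agree with the sample outputs.

```python
TICK_RATE = 10_000_000  # value of ttp:tickRate in the sample's <tt> element


def ticks_to_srt(ticks, delay_ms=0):
    """Convert a tick string such as '241910001t' to 'HH:MM:SS,mmm' (hypothetical helper)."""
    # Integer division truncates to whole milliseconds (an assumption).
    total_ms = int(ticks.rstrip("t")) * 1000 // TICK_RATE + delay_ms
    h, rest = divmod(total_ms, 3_600_000)
    m, rest = divmod(rest, 60_000)
    s, ms = divmod(rest, 1_000)
    return f"{h:02}:{m:02}:{s:02},{ms:03}"


print(ticks_to_srt("241910001t", -900))  # 00:00:23,291
print(ticks_to_srt("359117508t", 900))   # 00:00:36,811
```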
22 changes: 22 additions & 0 deletions samples/delay/sample_xml_minus_900.srt
@@ -0,0 +1,22 @@
1
00:00:23,291 --> 00:00:27,128
Ricky! Ricky! Wake the f***
up! Ricky, get up!

2
00:00:27,211 --> 00:00:29,297
Bubbles, f*** off! I'm
sleeping in today.

3
00:00:29,380 --> 00:00:32,133
No, you're not. You told me you
were gonna help me put up

4
00:00:32,216 --> 00:00:35,011
flyers for the grand opening

5
00:00:58,216 --> 00:00:59,911
of the Shed and Breakfast.
22 changes: 22 additions & 0 deletions samples/delay/sample_xml_plus_900.srt
Original file line number Diff line number Diff line change
@@ -0,0 +1,22 @@
1
00:00:25,091 --> 00:00:28,928
Ricky! Ricky! Wake the f***
up! Ricky, get up!

2
00:00:29,011 --> 00:00:31,097
Bubbles, f*** off! I'm
sleeping in today.

3
00:00:31,180 --> 00:00:33,933
No, you're not. You told me you
were gonna help me put up

4
00:00:34,016 --> 00:00:36,811
flyers for the grand opening

5
00:01:00,016 --> 00:01:01,711
of the Shed and Breakfast.
37 changes: 32 additions & 5 deletions to_srt.py
@@ -69,11 +69,36 @@ def xml_cleanup_spans_end(span_end_re, text, has_cursive):
return text


-def to_srt(text, extension):
+def to_srt(text, extension, delay_ms):
     if extension.lower() == ".xml":
-        return xml_to_srt(text)
-    if extension.lower() == ".vtt":
-        return vtt_to_srt(text)
+        text = xml_to_srt(text)
+    elif extension.lower() == ".vtt":
+        text = vtt_to_srt(text)
+    return shift_srt_timestamp(text, delay_ms)


+def shift_srt_timestamp(text, delay_ms=0):
+    """Shift every SRT timestamp in text by delay_ms milliseconds."""
+    if not delay_ms:
+        return text
+
+    def shift_time(time_str, shift):
+        h, m, s_ms = time_str.split(":")
+        s, ms = s_ms.split(",")
+        total_ms = int(h)*3600000 + int(m)*60000 + int(s)*1000 + int(ms)
+        # Clamp at zero so a large negative delay cannot yield a negative timestamp.
+        new_ms = max(0, total_ms + shift)
+
+        h, new_ms = divmod(new_ms, 3600000)
+        m, new_ms = divmod(new_ms, 60000)
+        s, ms = divmod(new_ms, 1000)
+        return f"{h:02}:{m:02}:{s:02},{ms:03}"
+
+    def replace_timestamp(match):
+        start = shift_time(match[1], delay_ms)
+        end = shift_time(match[2], delay_ms)
+        return f"{start} --> {end}"
+
+    timestamp_regex = r"(\d{2}:\d{2}:\d{2},\d{3}) --> (\d{2}:\d{2}:\d{2},\d{3})"
+    return re.sub(timestamp_regex, replace_timestamp, text)


def convert_vtt_time(line):
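
A quick way to sanity-check timestamp shifting is to run it on a small SRT snippet. The sketch below is a standalone illustration, not the PR's exact code: `shift_block` is a hypothetical name, and clamping negative results to zero is an assumption about the desired behaviour when a negative delay is larger than the first cue's start time.

```python
import re

TIMESTAMP = re.compile(r"(\d{2}):(\d{2}):(\d{2}),(\d{3})")


def shift_block(srt_text, delay_ms):
    """Shift every 'HH:MM:SS,mmm' timestamp in srt_text by delay_ms (hypothetical helper)."""
    def repl(match):
        h, m, s, ms = (int(g) for g in match.groups())
        # Clamp at zero (an assumption) so no timestamp becomes negative.
        total = max(0, ((h * 60 + m) * 60 + s) * 1000 + ms + delay_ms)
        h, rest = divmod(total, 3_600_000)
        m, rest = divmod(rest, 60_000)
        s, ms = divmod(rest, 1_000)
        return f"{h:02}:{m:02}:{s:02},{ms:03}"
    return TIMESTAMP.sub(repl, srt_text)


snippet = "1\n00:00:00,500 --> 00:00:02,000\nHello\n"
print(shift_block(snippet, -900))
```

With a delay of -900 ms the start time would fall below zero, so it is clamped to `00:00:00,000`, while the end time becomes `00:00:01,100`. Note that this pattern matches any timestamp-shaped text, including one that appears inside a subtitle line.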
@@ -205,14 +230,16 @@ def main():
                         help=help_text.format("input", directory))
     parser.add_argument("-o", "--output", type=str, default=directory,
                         help=help_text.format("output", directory))
+    parser.add_argument("-d", "--delay", type=int, default=0,
+                        help="delay all subtitles by the given number of milliseconds")
     a = parser.parse_args()
     filenames = [fn for fn in os.listdir(a.input)
                  if fn[-4:].lower() in SUPPORTED_EXTENSIONS]
     for fn in filenames:
         with codecs.open(u"{}/{}".format(a.input, fn), 'rb', "utf-8") as f:
             text = f.read()
         with codecs.open(u"{}/{}.srt".format(a.output, fn[:fn.rfind('.')]), 'wb', "utf-8") as f:
-            f.write(to_srt(text, fn[-4:]))
+            f.write(to_srt(text, fn[-4:], a.delay))


if __name__ == '__main__':
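
Because `-d` takes an integer that may be negative, it is worth confirming that `argparse` reads a value such as `-1500` as the option's argument and not as another flag. The sketch below rebuilds only the `-d/--delay` option in isolation; the surrounding parser setup is simplified and is not the script's full `main()`.

```python
import argparse

# Minimal parser with only the delay option from the PR.
parser = argparse.ArgumentParser()
parser.add_argument("-d", "--delay", type=int, default=0,
                    help="delay all subtitles by the given number of milliseconds")

# A negative number is accepted as the option's value because the parser
# defines no options that themselves look like negative numbers.
print(parser.parse_args(["-d", "-1500"]).delay)  # -1500
print(parser.parse_args([]).delay)               # 0
```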