doc:appunti:linux:video:subtitleripper
| doc:appunti:linux:video:subtitleripper [2024/02/01 11:13] – [OCR the images from the .sub file] niccolo | doc:appunti:linux:video:subtitleripper [2024/02/01 11:56] (current) – [How to rip DVD subtitles with vobsub2srt] niccolo | ||
|---|---|---|---|
| Line 9: | Line 9: | ||
| * **lsdvd** - From the official Debian repository. | * **lsdvd** - From the official Debian repository. | ||
| * **vobcopy** - From the official Debian repository. | * **vobcopy** - From the official Debian repository. | ||
| + | * **mediainfo** - From the official Debian repository. | ||
| * **mkvtoolnix** - From the official Debian repository. | * **mkvtoolnix** - From the official Debian repository. | ||
| * **vobsub2srt** - From the Deb Multimedia repository. | * **vobsub2srt** - From the Deb Multimedia repository. | ||
| ===== Ripping the .vob from the DVD ===== | ===== Ripping the .vob from the DVD ===== | ||
| + | |||
| + | A DVD can contain several **titles**, so you first need to identify which one to rip; generally it is the longest one or the one with the most chapters. We check the DVD content using the **lsdvd** tool: | ||
| + | |||
| + | < | ||
| + | lsdvd /dev/sr0 | ||
| + | Disc Title: DVD_TITLE | ||
| + | Title: 01, Length: 01: | ||
| + | Title: 02, Length: 00: | ||
| + | Title: 03, Length: 00: | ||
| + | Title: 04, Length: 00: | ||
| + | Title: 05, Length: 00: | ||
| + | Title: 06, Length: 00: | ||
| + | Longest track: 01 | ||
| + | </ | ||
| + | |||
| + | The longest title is **#1**, so we will extract it using **vobcopy**: | ||
| <code bash> | <code bash> | ||
| vobcopy -n ' | vobcopy -n ' | ||
| </ | </ | ||
| + | |||
| + | The resulting file will be saved into the working directory (as specified by the **%%-o%%** option) and will be named after the DVD title, something like **DVD_TITLE.vob**. | ||
| + | |||
| + | You can inspect the content of the file using the **mediainfo** tool; in our case the file contains one video stream, two audio streams and three subtitle streams. The subtitles are in the standard DVD format, VobSub, which is an image (bitmap) format, not text. | ||
| + | |||
| ===== Converting the .vob into .mkv format ===== | ===== Converting the .vob into .mkv format ===== | ||
| Line 22: | Line 44: | ||
| As far as I know, there is no tool capable of extracting the VobSub subtitles directly from the vob file; we might hope that **ffmpeg** could do this, but it seems not. | As far as I know, there is no tool capable of extracting the VobSub subtitles directly from the vob file; we might hope that **ffmpeg** could do this, but it seems not. | ||
| - | Fortunately **mkvextract** can extract the VobSub stream from a //mkv// file, so we first use ffmpeg to convert the //vob// into //mkv//. In the following example all the streams are copied, without re-encoding. At this step you may want to re-encode the video to squeeze the MPEG2 stream into the more efficient H264 format. | + | Fortunately the **mkvextract** | 
| <code bash> | <code bash> | ||
| Line 37: | Line 59: | ||
| ===== Extracting .sub and .idx files from the .vob ===== | ===== Extracting .sub and .idx files from the .vob ===== | ||
| + | |||
| + | From the //mkv// file it is now possible to create **two files** (.sub and .idx) for each subtitle stream. The stream numbering expected by '' | ||
| <code bash> | <code bash> | ||
| mkvextract ' | mkvextract ' | ||
| </ | </ | ||
| + | |||
| + | The result will be two files: **subtitles-3.sub** and **subtitles-3.idx**. It is possible to repeat the command to extract the other subtitles (**#4** and **#5** in our example). | ||
| ===== OCR the images from the .sub file ===== | ===== OCR the images from the .sub file ===== | ||
| <code bash> | <code bash> | ||
| - | vobsub2srt --ifo ' | + | vobsub2srt --ifo ' | 
| </ | </ | ||
| The .IFO file is used to get the correct palette, width and height, but it is not mandatory. | The .IFO file is used to get the correct palette, width and height, but it is not mandatory. | ||
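
The title selection and the repeated subtitle extraction described above could be scripted roughly as follows. This is only a sketch under stated assumptions: the helper names ''longest_track'' and ''extract_subs'' and the ''MKVEXTRACT'' override variable are hypothetical and are not part of any of the tools; the track IDs 3, 4 and 5 are the ones from the example on this page; and the ''mkvextract SOURCE tracks ID:FILE'' argument order is the one accepted by recent MKVToolNix releases (older releases may expect ''tracks'' before the source file).

```bash
#!/usr/bin/env bash
# Sketch only: helper names and the MKVEXTRACT override are hypothetical.

# Read lsdvd output on stdin and print the number found on
# its "Longest track:" summary line.
longest_track() {
    awk '/^Longest track:/ { print $3 }'
}

# Run mkvextract once per track ID given after the file name.
# For a VobSub track each run is expected to produce a .sub/.idx pair.
# Set MKVEXTRACT=echo to print the arguments instead of running the tool.
extract_subs() {
    local mkv=$1
    shift
    local id
    for id in "$@"; do
        "${MKVEXTRACT:-mkvextract}" "$mkv" tracks "${id}:subtitles-${id}.sub"
    done
}

# Usage (device and file names are placeholders):
#   lsdvd /dev/sr0 | longest_track
#   extract_subs 'DVD_TITLE.mkv' 3 4 5
```

Running ''extract_subs'' with ''MKVEXTRACT=echo'' first is a cheap way to check the generated arguments before touching any file.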
doc/appunti/linux/video/subtitleripper.1706782423.txt.gz · Last modified:  by niccolo
                
                