Previously on my blog: In my TiVo2Podcast stuff I automated the process of putting chapters around commercials, but had to call out to a small C++ app I wrote to put the chapters in using libmp4v2.
A few weeks ago I was looking at some Ruby gems for a project I was working on and stumbled across ffi, a foreign function interface gem for Ruby, or as its docs put it: “a ruby extension for programmatically loading dynamic libraries, binding functions within them, and calling those functions from Ruby code.” As long as you know the function signatures that you need, it's pretty trivial to make the calls from Ruby. You do need to be aware of memory management sometimes, but overall it's pretty easy, especially for basic use. If you’re only going to be working in Ruby and need access to a C library, this is much easier than mucking with SWIG, that’s for sure.
The mind-blowing part for me is that the authors of the gem have made it smart enough to know what flavor of Ruby VM and platform the code is running on, and it does the right thing whether it's JRuby, Windows, or whatever. While I haven’t had a chance to use it yet, I suspect this property will be useful with JRuby at work in the future.
In the process of working on my TiVo to Video Podcast stuff I finally got to the point where I wanted to do something about commercials. The ideal solution would be callable from Ruby, or at least from the command line, so that it can be used in an automated manner. After a few weeks of research and playing around I found a workable solution that required me to write some code.
Early on I made the design decision to put chapter markers in around the commercials rather than cut them out. The main reason for this is “What if the detection is wrong?” For a 30-minute show, you’d end up missing more than half of it in a worst-case scenario. Since it's easy to jump ahead to the next chapter on my iPhone and iPad, this seemed like a good decision.
Here are the three most important facts I learned in my research…
- Commercial detection: No question, comskip is the right tool to use here. The downside is that by default it's a Windows-only command line tool. It works perfectly under Wine, so that mitigates it. Right now I use it via Wine, but the source is available, so in the long run it would be good to have a native Linux binary to call. Comskip creates a variety of output formats, so I picked one that seemed to be the easiest to work with.
- There is no good command line tool to add chapters to an existing mp4/m4v file: I dug around and found a lot of potential solutions, but they all either didn't run on Linux, couldn’t take the resultant files that comskip spit out, or just weren't a good fit for what I was doing.
- The MP4v2 library had primitives for adding the chapters: From this point forward, it was just writing some code that did exactly what I wanted.
The code below expects three arguments: 1) the video file to work on, 2) the chapter file output from comskip in ZoomPlayer chapter format, and 3) the total length in seconds of the video file. The last one I might be able to remove once I have more brain time to devote to this.
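// Compile with something like: g++ AddChapterInfo.cpp -o AddChapterInfo -lmp4v2 -lboost_regex
#include <cstdint>
#include <fstream>
#include <string>
#include <boost/lexical_cast.hpp>
#include <boost/regex.hpp>
#include <mp4v2/mp4v2.h>
int main(int argc, char *argv[])
{
    if (argc < 4) return 1;  // expects: video file, chapter file, length in seconds
    char *m4vfilename = argv[1];
    char *chapfilename = argv[2];
    // Convert to milliseconds so it matches the chapter track's timescale.
    uint32_t total_length = boost::lexical_cast<uint32_t>(argv[3]) * 1000;
    std::ifstream chapfile(chapfilename);
    // Assumed pattern for the ZoomPlayer chapter lines; adjust it to match your file.
    boost::regex chpre("^AddChapterBySecond\\((\\d+),");
    boost::smatch rem;
    std::string s;
    MP4FileHandle m4vfile = MP4Modify(m4vfilename);
    if (m4vfile == MP4_INVALID_FILE_HANDLE) return 1;
    // Add the chapter track, have it reference the first track (should be the
    // video) and set the "clock ticks per second" to 1000, i.e. milliseconds.
    MP4TrackId chapter_track = MP4AddChapterTextTrack(m4vfile, 1, 1000);
    uint32_t last_time = 0;
    while (getline(chapfile, s)) {
        if (boost::regex_search(s, rem, chpre)) {
            uint32_t t = boost::lexical_cast<uint32_t>(rem.str(1)) * 1000;
            if (t > last_time) {
                MP4AddChapter(m4vfile, chapter_track, t - last_time);
                last_time = t;
            }
        }
    }
    if (total_length > last_time)
        MP4AddChapter(m4vfile, chapter_track, total_length - last_time);
    MP4Close(m4vfile);
}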
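The part of the loop above that is easiest to get wrong is that MP4AddChapter is handed a duration rather than a timestamp, so each chapter's length is the gap between consecutive start times, and the last chapter runs to the end of the file. Here is a minimal standalone sketch of just that arithmetic, with no mp4v2 dependency. The helper name chapter_durations_ms is made up for illustration, and it assumes start times arrive in seconds and in ascending order.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of mp4v2): turn chapter start times in
// seconds into per-chapter durations in milliseconds.
std::vector<uint32_t> chapter_durations_ms(const std::vector<uint32_t>& starts_s,
                                           uint32_t total_s)
{
    std::vector<uint32_t> durations;
    uint32_t last_ms = 0;
    for (uint32_t start_s : starts_s) {
        uint32_t t = start_s * 1000;
        // Skip a start at 0 (or anything out of order) so the subtraction
        // below can never underflow.
        if (t > last_ms) {
            durations.push_back(t - last_ms);
            last_ms = t;
        }
    }
    // The final chapter covers whatever is left of the video.
    uint32_t total_ms = total_s * 1000;
    if (total_ms > last_ms) {
        durations.push_back(total_ms - last_ms);
    }
    return durations;
}
```

For a 30 minute show (1800 seconds) with chapter starts at 0, 300 and 480 seconds, this gives three durations: 300000, 180000 and 1320000 milliseconds.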
I’ve been using this code for over two weeks straight and it has been operating perfectly, but obviously it could be made a lot more robust, especially in the area of error handling. I’ve only run into issues when comskip guesses commercials wrong, which is exactly the payoff for putting chapters in instead of nuking the commercials altogether.
In the long run, I should either write and release a generic tool that helps the next poor sap like me or work on using SWIG bindings to mp4v2 so I could just do the calls in Ruby.
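As one example of the kind of error handling I mean, the length argument could be checked before it is used, since a bad value would otherwise throw out of boost::lexical_cast. This is only a sketch of one way to do it with the standard library. The function name parse_seconds and the nine-digit cap are my own assumptions, not anything from mp4v2 or comskip.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper: accept only a plain run of decimal digits.
// Returns false (and leaves `out` untouched) for empty strings, signs,
// whitespace, trailing junk, or anything longer than nine digits.
bool parse_seconds(const std::string& text, uint32_t& out)
{
    // Nine digits is at most 999,999,999, which always fits in a uint32_t.
    if (text.empty() || text.size() > 9) {
        return false;
    }
    uint32_t value = 0;
    for (char c : text) {
        if (c < '0' || c > '9') {
            return false;
        }
        value = value * 10 + static_cast<uint32_t>(c - '0');
    }
    out = value;
    return true;
}
```

A caller would bail out with a usage message when this returns false instead of letting an exception escape from main.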