Capturing and Posting Video Screenshots at Random Intervals

I recently created a couple of bot Twitter accounts, @ScreenshotsVice which shares screencaps from episodes of "Miami Vice" and @Pasolinibot which shares caps from the filmography of Pier Paolo Pasolini, both at regular intervals.

Both accounts work the same way. When a cronjob is triggered, a series of screencaptures is dumped from a video library (if they were set to refresh on the previous run), the listing of the capture directory is shuffled, and the script selects one capture at random and posts it to Twitter.

Because the intervals are managed by cronjobs, I'll focus on what happens when a job is triggered.

The Python script that selects and posts the capture relies on tweepy, a Twitter API client for Python, so you will need your Twitter Developer API credentials. The script that generates the captures requires that ffmpeg be installed, and it needs two directories: the one where the video library is stored and the one where you'll store the screencaptures.
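
Before wiring anything into cron, it can help to confirm that these prerequisites are in place. The following is a minimal sketch and not part of the original package: `missing_prerequisites` is a hypothetical helper, and the list of variables assumes the environment variable names used by the two scripts in this post.

```python
import shutil

# Environment variables read by the two scripts in this post.
REQUIRED_VARS = [
    "SOURCE_DIR",
    "SOURCE_DIR_OUT",
    "twitter_consumer_key",
    "twitter_consumer_secret",
    "twitter_access_token",
    "twitter_access_token_secret",
]


def missing_prerequisites(env, which=shutil.which):
    """Return a list of human-readable problems; an empty list means ready."""
    problems = []
    # shutil.which returns None when the executable is not on PATH.
    if which("ffmpeg") is None:
        problems.append("ffmpeg not found on PATH")
    for name in REQUIRED_VARS:
        # Treat unset and empty variables the same way.
        if not env.get(name):
            problems.append(f"environment variable {name} is not set")
    return problems
```

Calling `missing_prerequisites(os.environ)` at the top of a job and exiting when the list is non-empty gives a clearer failure than a `KeyError` halfway through a run.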

To generate the capture library, the following script is used:

#!/bin/bash

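# Required: where the videos live and where the captures go.
LNVC_DATA_PATH=${SOURCE_DIR}
# No suffix here, so this matches the SOURCE_DIR_OUT path the poster script reads
LNVC_PREV_PATH=${SOURCE_DIR_OUT}
# Optional flags: set a flag to an empty string to disable it.
REFRESH_SCREENS=1   # clear the capture directory before dumping
DUMP_INTERVAL=1     # interval, in seconds, used for interval dumps
DUMP_RANDOM=1       # dump a frame from a random timecode in each video
RANDOMIZE_NAMES=1   # re-randomize every capture filename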

if [ -z "${LNVC_DATA_PATH}" ]; then
    echo "No source path found. Set LNVC_DATA_PATH." ; exit 1
fi

if [ -z "${LNVC_PREV_PATH}" ]; then
    echo "No target path found. Set LNVC_PREV_PATH." ; exit 1
fi

if [ -n "${REFRESH_SCREENS}" ]; then
    # :? aborts if the variable is empty, so this can never expand to "rm -rf /*"
    rm -rf "${LNVC_PREV_PATH:?}"/*
fi

This sets up the options: whether to refresh the library, whether to randomize filenames, and where the video source and the capture output live. Only the latter two are required.

After this, the video library is iterated, and the captures created:

# Generate a 13-character random name for a capture file
random_name() {
    head /dev/urandom | tr -dc A-Za-z0-9 | head -c 13
}

for v in "$LNVC_DATA_PATH"/*; do
    if [ -n "${DUMP_RANDOM}" ]; then
        # Read the duration (HH:MM:SS.ss) that ffmpeg reports for the file
        LENGTH=$(ffmpeg -i "$v" 2>&1 | grep Duration | cut -d ' ' -f 4 | sed 's/,//')
        if [[ $LENGTH =~ ^([0-9]+):([0-9]+):([0-9]+) ]]; then
            # Convert to whole seconds; 10# stops bash reading "08" as octal
            TOTAL=$(( 10#${BASH_REMATCH[1]} * 3600 + 10#${BASH_REMATCH[2]} * 60 + 10#${BASH_REMATCH[3]} ))
            # Seek to a random second inside the video and grab one frame
            (( TOTAL > 0 )) && ffmpeg -ss $(( RANDOM % TOTAL )) -i "$v" -frames:v 1 "$LNVC_PREV_PATH/$(random_name).jpg"
        fi
    fi

    if [ -n "${DUMP_INTERVAL}" ]; then
        # One frame every DUMP_INTERVAL seconds; %05d numbers the frames
        ffmpeg -i "$v" -vf fps=1/"$DUMP_INTERVAL" "$LNVC_PREV_PATH/$(random_name)-%05d.jpg"
        # The filenames will be prefixed by a random string, but per-video; if you'd like them randomized in totality, set `RANDOMIZE_NAMES`
        if [ -n "${RANDOMIZE_NAMES}" ]; then
            # Use full paths rather than cd, so the working directory is unchanged
            for f in "$LNVC_PREV_PATH"/*.jpg; do
                mv "$f" "$LNVC_PREV_PATH/$(random_name).jpg"
            done
        fi
    fi
done

If DUMP_RANDOM is enabled, a frame is dumped from a random timecode in each video file. If DUMP_INTERVAL is enabled, each video file is dumped into captures at every interval (useful for films, rather than slices from an episode of a TV series). Filenames can optionally be re-randomized, which helps when selecting images randomly to post.
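
One way to pick a random timecode that is guaranteed to fall inside the video is to convert the duration to seconds first and then choose a single offset. The Python sketch below illustrates that idea; it is not the code the bots run, and `parse_duration` and `random_timecode` are hypothetical helper names. It assumes a duration string in the `HH:MM:SS.ss` form that ffmpeg prints.

```python
import random


def parse_duration(text):
    """Convert an ffmpeg-style duration such as '00:47:12.34' to whole seconds."""
    hours, minutes, seconds = text.split(":")
    # int() parses "08" as decimal 8, so leading zeros are harmless here.
    return int(hours) * 3600 + int(minutes) * 60 + int(float(seconds))


def random_timecode(duration_text, rng=random):
    """Pick a uniformly random HH:MM:SS offset that lies inside the video."""
    total = parse_duration(duration_text)
    if total <= 0:
        raise ValueError("video is shorter than one second")
    offset = rng.randrange(total)  # 0 <= offset < total
    hours, remainder = divmod(offset, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"
```

Choosing one offset in seconds, rather than picking the hour, minute, and second independently, keeps every second of the video equally likely and avoids timecodes past the end of the file.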

This script has dumped captures from the videos in SOURCE_DIR into SOURCE_DIR_OUT, which our poster script will read from.

First, we have to set up the Twitter authentication:

import tweepy
import os, random


def main():
    twitter_auth_keys = {
        "consumer_key"        : os.environ['twitter_consumer_key'],
        "consumer_secret"     : os.environ['twitter_consumer_secret'],
        "access_token"        : os.environ['twitter_access_token'],
        "access_token_secret" : os.environ['twitter_access_token_secret']
    }

    auth = tweepy.OAuthHandler(
            twitter_auth_keys['consumer_key'],
            twitter_auth_keys['consumer_secret']
            )
    auth.set_access_token(
            twitter_auth_keys['access_token'],
            twitter_auth_keys['access_token_secret']
            )
    api = tweepy.API(auth)

Then lock in our selected image:

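    # List the capture filenames in the output directory
    files = os.listdir(os.environ['SOURCE_DIR_OUT'])
    # Shuffle repeatedly; random.choice below is already uniform, so this is optional
    for i in range(1,10000):
        random.shuffle(files)
    # Pick one capture and upload it so it can be attached to the tweet
    path = os.path.join(os.environ['SOURCE_DIR_OUT'], random.choice(files))
    media = api.media_upload(path)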

You'll see above that it reads the captures into a list of filenames and shuffles the list before selecting one at random. Then we post:

    tweet = "#pasolini"
    post_result = api.update_status(status=tweet, media_ids=[media.media_id])


if __name__ == "__main__":
    main()
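
The selection step can also be pulled out into a small function that is easy to test without touching Twitter. This is a sketch under assumptions rather than the packaged code: `pick_capture` and `IMAGE_EXTENSIONS` are hypothetical names, and filtering by extension is an addition that guards against stray non-image files in the capture directory.

```python
import os
import random

# Extensions treated as postable captures (an assumption for this sketch).
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png")


def pick_capture(directory, rng=random):
    """Return the full path of one randomly chosen image in `directory`."""
    candidates = [
        name for name in os.listdir(directory)
        if name.lower().endswith(IMAGE_EXTENSIONS)
    ]
    if not candidates:
        # Fail loudly instead of letting random.choice raise IndexError.
        raise FileNotFoundError(f"no captures found in {directory}")
    return os.path.join(directory, rng.choice(candidates))
```

In the poster script, the result of `pick_capture(os.environ['SOURCE_DIR_OUT'])` could be passed to `api.media_upload` in place of the hand-built path.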

The package can be found on GitHub, along with some additional examples (one of which uses a Kubernetes CronJob resource).
