Intro
What started as a very small "Oh, I'll just spin up my spare server to process these videos over a few nights" turned into a much larger project of "What if I have 4 servers do it instead of just 1?"
So the following is a recap of the results, mostly so that if I decide to do this again I don't have to scour Google for bash scripting help.
The Approach
ASSUMPTION: /drives/primary/convert is a shared location that is already mapped on all the servers.
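Since everything hinges on that shared location, a quick pre-flight check on each server can save a confusing night. This is only a minimal sketch and not part of the original script: the function name `check_share` and its optional node-name argument are made up for illustration, and the subdirectory names are the ones the script further down uses.

```shell
#!/bin/bash
# Hypothetical helper: verify the shared conversion tree is usable from this node.
# Usage: check_share /drives/primary/convert [node-name]
check_share() {
    local root=$1
    local node=${2:-$(hostname)}   # default to this machine's hostname
    local sub
    for sub in input output processed; do
        if [ ! -d "$root/$sub" ] || [ ! -w "$root/$sub" ]; then
            echo "missing or read-only: $root/$sub" >&2
            return 1
        fi
    done
    # Each node works in its own directory so workers never share in-flight files.
    mkdir -p "$root/operating/$node"
}
```

Running `check_share /drives/primary/convert` on each server before enabling the job should return success everywhere; a failure on one box usually means the mapping is missing there.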
When attempting to parallelize a process, controlling which processor is doing what is the name of the game. I took a masterless approach: all the workers operate as if they are just one in a million, or the only one. The script does these basic steps:
- Every minute, look and see if there is a file to operate on
- If there is, grab a file and move it to its own safe directory to operate on
- Convert the file (i.e., create a new file) using HandBrakeCLI to a common output directory with the same name
- Move the input file to a processed directory
- Repeat
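The "grab a file" step is what keeps the workers from stepping on each other, so here is a minimal sketch of it in isolation. The function name `claim_one` is hypothetical. It assumes the input and operating directories live on the same filesystem, so that `mv` is a plain rename; how strictly a network share enforces that only one rename wins depends on the share, so treat this as an illustration rather than a guarantee.

```shell
#!/bin/bash
# Hypothetical helper: try to claim one file from directory $1 by moving it into $2.
# Prints the claimed path and returns 0, or returns 1 if nothing could be claimed.
claim_one() {
    local input=$1 active=$2 file name
    mkdir -p "$active"
    for file in "$input"/*; do
        [ -f "$file" ] || continue   # empty directory: the glob stays literal
        name=${file##*/}
        # The move is the claim. If another worker got there first, mv fails
        # and we simply try the next candidate.
        if mv "$file" "$active/$name" 2>/dev/null; then
            echo "$active/$name"
            return 0
        fi
    done
    return 1
}
```

For the "every minute" part, a crontab line such as `* * * * * /path/to/convert.sh` would do, where the path is a placeholder for wherever the script is saved.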
The Dirty Script
#!/bin/bash
# Lock directory is used to make sure another cron job on the current server isn't already running.
lockdir=/tmp/AXgqg0lsoeykp9L9NZjIuaqvu7ANILL4foeqzpJcTs3YkwtiJ0
mkdir "$lockdir" || {
    echo "lock directory exists. exiting"
    exit 1
}
# Take pains to remove the lock directory when the script terminates.
# The trap is set right after the lock is taken so it covers the whole run.
# KILL cannot be trapped, so it is not listed.
trap 'rmdir "$lockdir"' EXIT
trap 'exit 1' INT TERM

# Get the hostname, make that the node.
NODE=$(hostname)
INPUT=/drives/primary/convert/input
ACTIVE=/drives/primary/convert/operating/$NODE
OUTPUT=/drives/primary/convert/output
PROCESSED=/drives/primary/convert/processed
mkdir -p "$ACTIVE"

for FILE in "$INPUT"/*; do
    # With no input files the glob stays unexpanded, so skip anything that isn't a real file.
    [ -f "$FILE" ] || continue
    printf '\n+++++++++\n'
    NAME="${FILE##*/}"
    echo "$NAME"
    echo "$FILE" >> "${NODE}sLog.txt"
    echo "$FILE"
    # If another node grabbed the file first, the move fails and we try the next one.
    mv "$FILE" "$ACTIVE/" || continue
    # echo "${ACTIVE}/${NAME}"
    # echo "DOIN WORK: ${OUTPUT}/${NAME%%.*}.m4v"
    # touch "${OUTPUT}/${NAME%%.*}".m4v
    length1=$(ffprobe -i "${ACTIVE}/${NAME}" -show_format -v quiet | sed -n 's/duration=//p')
    length2=$(echo "$length1/1" | bc)
    length3=$(echo "$length2-6" | bc)
    # Trimming off 6 seconds at the beginning and hopefully about 6 seconds at the end.
    HandBrakeCLI -Z Normal -i "${ACTIVE}/${NAME}" -o "${OUTPUT}/${NAME%%.*}".m4v --start-at duration:6 --stop-at duration:"${length3}"
    mv "${ACTIVE}/${NAME}" "${PROCESSED}/${NAME}"
    printf '\n----------\n'
    # We break here because we just want cron to get one file at a time. I dunno how else to do it.
    break
done
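To make the trim arithmetic concrete: ffprobe reports a fractional duration, the script truncates it to whole seconds, and then subtracts 6. Here is a small sketch of the same calculation. The function name `stop_offset` is made up, and it swaps `bc` for plain shell parameter expansion and integer arithmetic, which is a substitution and not what the script above does.

```shell
#!/bin/bash
# Hypothetical helper: given a duration in seconds (possibly fractional) and a
# number of seconds to trim (default 6), print the whole-second value that the
# script computes as length3 and passes to --stop-at.
stop_offset() {
    local duration=$1 trim=${2:-6}
    local whole=${duration%.*}   # drop the fractional part, like "$length1/1" in bc
    echo $(( ${whole:-0} - trim ))
}

stop_offset 125.40 6   # prints 119
```

Note that a clip shorter than 6 seconds gives a negative value here, and the script does not guard against that case.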