This video shows the first (or fourth, depending on your point of view) Star Wars film — Episode IV: A New Hope — recut so that its dialogue is sorted into alphabetical order. Each word is displayed on screen along with a count of its occurrences in the script. It runs 43 minutes and change, although you probably wouldn’t want to sit through the entire thing:
Everyone associates lightsabers with Star Wars, so you might be surprised that the word is used only once in the film:
Even Wedge — one of the few rebels who survives the original trilogy — gets mentioned more than lightsabers in the first film:
This is a project you wouldn’t want to do purely manually. If I were assigned this task, I’d write a program that uses the subtitle file to identify every word in the dialogue, along with the approximate time (give or take a fraction of a second) at which each word is uttered, and then sorts the words alphabetically. Courtesy of the people behind the Matroska file format for videos, here’s a sample of a subtitle file, which should give you an idea of the information these files hold. Oddly enough, it features dialogue from another Star Wars film:
1
00:02:17,440 --> 00:02:20,375
Senator, we’re making our final approach into Coruscant.

2
00:02:20,476 --> 00:02:22,501
Very good, Lieutenant.
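To make the idea concrete, here is a minimal Python sketch of the first half of such a program: it reads subtitle text in the format shown above, estimates when each word is spoken, and builds an alphabetical index with occurrence counts. This is only my guess at an approach, not how the video's creator actually did it. The function names (`parse_srt`, `word_index`) are made up for illustration, and the timing rests on a loud assumption: each cue's words are spread evenly across that cue's time span, which real speech won't respect.

```python
import re
from collections import defaultdict

# Matches a cue timing line such as "00:02:17,440 --> 00:02:20,375".
TIMING = re.compile(
    r"(\d+):(\d+):(\d+)[,.](\d+)\s*-->\s*(\d+):(\d+):(\d+)[,.](\d+)"
)


def to_seconds(h, m, s, ms):
    """Convert hour/minute/second/millisecond strings to seconds."""
    return int(h) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000


def parse_srt(text):
    """Yield (start, end, dialogue) for each cue in subtitle text."""
    # Cues are separated by blank lines.
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.strip().splitlines()
        for i, line in enumerate(lines):
            match = TIMING.search(line)
            if match:
                g = match.groups()
                # Everything after the timing line is dialogue.
                yield to_seconds(*g[:4]), to_seconds(*g[4:]), " ".join(lines[i + 1:])
                break


def word_index(text):
    """Map each lowercased word to its estimated start times, sorted A to Z."""
    index = defaultdict(list)
    for start, end, dialogue in parse_srt(text):
        # Words are runs of letters, optionally joined by apostrophes.
        words = re.findall(r"[A-Za-z]+(?:['’][A-Za-z]+)*", dialogue)
        if not words:
            continue
        # ASSUMPTION: words are evenly spaced within the cue.
        step = (end - start) / len(words)
        for i, word in enumerate(words):
            index[word.lower()].append(round(start + i * step, 3))
    return dict(sorted(index.items()))


SAMPLE = """1
00:02:17,440 --> 00:02:20,375
Senator, we're making our final approach into Coruscant.

2
00:02:20,476 --> 00:02:22,501
Very good, Lieutenant.
"""

# Print each word with its occurrence count and estimated times.
for word, times in word_index(SAMPLE).items():
    print(f"{word} ({len(times)}): {times}")
```

With the word list and timestamps in hand, the remaining work would be cutting the matching clips out of the video and joining them in order, most likely with a video tool driven by this index. The estimated times would almost certainly need manual nudging to land cleanly on each word.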