This video shows the first (or fourth, depending on your point of view) Star Wars film — Episode IV: A New Hope — recut so that its dialogue is sorted into alphabetical order. Each word is displayed on screen along with a count of its occurrences in the script. It runs 43 minutes and change, although you probably wouldn’t want to sit through the entire thing:
Everyone associates lightsabers with Star Wars, so you might be surprised that the word is used only once in the film:
Even Wedge — one of the few rebels who survives the original trilogy — gets mentioned more than lightsabers in the first film:
This is a project you wouldn’t want to do purely by hand. If I were assigned this task, I’d write a program that uses the subtitle file to identify every word in the dialogue, along with the approximate time (give or take a fraction of a second) when each word is uttered, and then sorts the words alphabetically. Courtesy of the people behind the Matroska file format for videos, here’s a sample of a subtitle file, which should give you an idea of the information such files hold. Oddly enough, it features dialogue from another Star Wars film:
1
00:02:17,440 --> 00:02:20,375
Senator, we’re making our final approach into Coruscant.

2
00:02:20,476 --> 00:02:22,501
Very good, Lieutenant.
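To make the idea concrete, here is a minimal sketch of how such a program might start. It is not how the video's creator did it. It assumes the subtitles are in SubRip (.srt) layout like the sample above, and it guesses each word's timing by spreading the words of a cue evenly across that cue's time span, which is only a rough approximation. The names `SAMPLE_SRT`, `to_seconds`, `parse_cues`, `word_times` and `alphabetized` are made up for this illustration.

```python
import re
from collections import Counter

# The two sample cues from above, in SubRip (.srt) layout.
SAMPLE_SRT = """\
1
00:02:17,440 --> 00:02:20,375
Senator, we're making our final approach into Coruscant.

2
00:02:20,476 --> 00:02:22,501
Very good, Lieutenant.
"""

TIME_RE = re.compile(r"(\d+):(\d+):(\d+)[,.](\d{3})")
WORD_RE = re.compile(r"[A-Za-z'’]+")  # letters plus straight or curly apostrophes


def to_seconds(stamp):
    """Convert an 'HH:MM:SS,mmm' timestamp to seconds."""
    h, m, s, ms = TIME_RE.fullmatch(stamp.strip()).groups()
    return int(h) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000


def parse_cues(srt_text):
    """Yield (start, end, text) for each well-formed cue."""
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        lines = block.splitlines()
        if len(lines) < 3 or "-->" not in lines[1]:
            continue  # skip anything that doesn't look like a cue
        start, end = lines[1].split("-->")
        yield to_seconds(start), to_seconds(end), " ".join(lines[2:])


def word_times(srt_text):
    """Return (word, approx_start, approx_end) for every spoken word.

    Assumption: words are evenly spaced within their cue. Real speech
    isn't, so these times are only good to a fraction of a second at best.
    """
    out = []
    for start, end, text in parse_cues(srt_text):
        words = WORD_RE.findall(text.lower())
        if not words:
            continue
        step = (end - start) / len(words)
        for i, word in enumerate(words):
            out.append((word, start + i * step, start + (i + 1) * step))
    return out


def alphabetized(srt_text):
    """Sort words alphabetically (then by time) and count occurrences."""
    entries = sorted(word_times(srt_text), key=lambda e: (e[0], e[1]))
    counts = Counter(word for word, _, _ in entries)
    return entries, counts


if __name__ == "__main__":
    entries, counts = alphabetized(SAMPLE_SRT)
    for word, start, end in entries:
        print(f"{word:12} x{counts[word]}  {start:8.2f}s to {end:8.2f}s")
```

The resulting list of time ranges is what you would hand to a video tool to cut and reassemble the clips. That step is left out here, and in practice you would probably want to nudge the cut points by hand or with a forced-alignment tool.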