Last week all of the KTRU files finished their next phase of transformation. Both the master files (WAV) and the access files (MP3) now have embedded metadata. While the metadata that each format accepts varies, each file carries a title, date, and other relevant information within the file itself.
For example, when you play a song file on your computer, embedded metadata tells the player the title and artist for that song. The same is true for KTRU’s digital files, but the WAV files contain a bit more information, such as detailed descriptions.
How did we do it? First, we used an open source tool called BWF MetaEdit to embed information into the WAVs. Our metadata coordinator, Scott Carlson, created some Excel spreadsheets that helped make the embedding process easier and more automated.
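As a rough illustration of how a spreadsheet can feed the embedding step, here is a minimal Python sketch that turns rows of descriptive metadata into a CSV file of the kind BWF MetaEdit can import. This is not Scott's actual workflow. The column names (FileName, Description, OriginationDate, INAM, ICRD) are assumptions modeled on BWF MetaEdit's core CSV layout and should be checked against the tool's documentation, and the file name and record values are made up.

```python
import csv
import io

# Assumed column names, modeled on BWF MetaEdit's core CSV layout.
# Verify them against the BWF MetaEdit documentation before real use.
COLUMNS = ["FileName", "Description", "OriginationDate", "INAM", "ICRD"]


def build_core_csv(records):
    """Return CSV text with a header row and one row per WAV file."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    for rec in records:
        writer.writerow({
            "FileName": rec["path"],
            # The BWF description field holds at most 256 characters.
            "Description": rec["description"][:256],
            "OriginationDate": rec["date"],  # expected as yyyy-mm-dd
            "INAM": rec["title"],            # title, stored in the INFO chunk
            "ICRD": rec["date"],             # creation date, stored in the INFO chunk
        })
    return out.getvalue()


# Hypothetical record; a real one would come from the spreadsheet.
sample = [{
    "path": "ktru_0001.wav",
    "title": "Sample broadcast",
    "date": "1990-01-01",
    "description": "Placeholder description.",
}]
csv_text = build_core_csv(sample)
print(csv_text)
```

Keeping the metadata in a spreadsheet and generating the import file from it means a correction is made once in the spreadsheet and then re-exported, rather than retyped file by file.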
For MP3s, we used a Python script that Scott wrote. He was unable to find a reliable open source tool for writing metadata to MP3s, so he created his own, which also helped automate the process. If any of this doesn’t make sense, the short version is that there was a lot of copying and pasting of information involved.
Why do this? If we didn’t embed metadata, the files would have no information beyond the file name. Embedding metadata ensures that 10 years down the line, the file can tell its story.
What’s next? All of the WAVs and MP3s will be sorted into two groups: files that can go online, and files that cannot go online but will be kept nearline (meaning they can only be listened to in the reading room). After that, we will need to use our old metadata and create some new metadata to prepare items to go into the institutional repository (scholarship.rice.edu).