okay, so writing code. the code will be pretty simple, i think. the minimum viable thing here is just a table showing date, interviewees, publication, and then (if applicable) a link to somewhere on the internet where you can see/hear that interview.
writing code to generate that table is tightly coupled to how this library is organized. organization is actually more important than the code, i think. i don't really have any background in that kind of organization, so my approach might be a little naive. i'd definitely like some thoughts first on the organization patterns that i'm developing.
consider how i lay out a few 2003 interviews:
there's a directory for each year, and each of those directories contains subdirectories that each correspond to a particular interview. the subdirectories are named according to the following convention: YEAR[MONTH][DAY]_INTERVIEWEE[+INTERVIEWEE...]_PUBLICATION
there are three fields separated by underscores: date, interviewee(s) and publication. for date, month and day are optional, as we may not know what month or day to put. there can be one interviewee or many interviewees, and if there are many then their initials will be separated with a plus. the last field is publication, although not in any strict sense.
so the directory structure, by its naming convention, is human-readable and machine-parseable.
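to make the "machine-parseable" part concrete, here's a minimal sketch of a parser for that naming convention. the names `DIR_NAME`, `Interview` and `parse_dir_name` are made up for illustration, and i'm assuming a four-digit year, two-digit month and day, and that only the publication field may contain further underscores. it doesn't check that month or day are in a valid range.

```python
import re
from typing import List, NamedTuple, Optional

# YEAR[MONTH][DAY]_INTERVIEWEE[+INTERVIEWEE...]_PUBLICATION
# assumption: YYYY, optional MM, optional DD, all digits
DIR_NAME = re.compile(
    r"^(?P<year>\d{4})(?P<month>\d{2})?(?P<day>\d{2})?"
    r"_(?P<who>[^_+]+(?:\+[^_+]+)*)"   # one or more names joined by "+"
    r"_(?P<publication>.+)$"           # everything after the second underscore
)


class Interview(NamedTuple):
    year: int
    month: Optional[int]  # None when the month is unknown
    day: Optional[int]    # None when the day is unknown
    interviewees: List[str]
    publication: str


def parse_dir_name(name: str) -> Interview:
    """Split an interview directory name into its three fields."""
    m = DIR_NAME.match(name)
    if m is None:
        raise ValueError(f"not an interview directory name: {name!r}")
    return Interview(
        year=int(m["year"]),
        month=int(m["month"]) if m["month"] else None,
        day=int(m["day"]) if m["day"] else None,
        interviewees=m["who"].split("+"),
        publication=m["publication"],
    )
```

with this, `parse_dir_name("200303_cbz_fuse-fm-kate-fitzpatrick")` gives year 2003, month 3, no day, interviewees `["cbz"]` and publication `"fuse-fm-kate-fitzpatrick"`.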
those interview directories then contain whatever media is associated with that interview and some of my metadata about the interview and those files.
notes.txt is just a plaintext file with any notes on that interview. it's unstructured because there's little need for structure right now, so it's too soon for me to say specifically how it should work.
source.yaml is where i declare where i got a particular file. actually, that's not always true. let's just take a look at some examples.
==> 2003/200303_cbz_fuse-fm-kate-fitzpatrick/source.yaml <==
in this case, that's an archived version of the page that says that Fuse FM did an interview with Cedric. that's not where i got the mp3 file, though; i don't know where my copy came from.
==> 2003/20031003_orl_raw-time/source.yaml <==
- file: 2003-10-03-orl-raw-time.mkv
in this case, you can find this interview at that url - actually that youtube video is a whole show at Emo's followed by the interview (the standalone video no longer exists on youtube, afaict), but whatever, you can see it for yourself there. this kind of source attribution will be useful when it comes time to create a webpage for all of these interviews, so that i can just link to stuff that's out there without having to host it myself.
==> 2003/2003_tmv_triple-j/source.yaml <==
- format: audio
here's something spooky: i don't have this interview, and i don't know how to get it - however, i believe that it happened, and i link to a source that informs my belief. oh well.
the other two interview dirs don't have a source.yaml - that's because i don't have a source for that stuff.
here's another common type of interview:
it's just some text that i copied, pasted, lightly massaged (for example, if the HTML used bold to indicate an interviewee's response, then here, in plaintext, i have to indicate that by some other means) and attributed accordingly. oh yeah, all text is just gonna be plaintext. maybe later we can make it markdown, but for now it's plaintext.
this system seems to work well for whatever files i have, as well as for missing content.
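as a rough idea of how the table could fall out of this layout, here's a sketch that walks a library root and builds one row per interview directory. the function names `table_rows` and `format_table` are hypothetical, and so is the last column: it only reports whether a source.yaml exists, since actually pulling a link out of it would depend on the file's structure, which this sketch doesn't try to guess.

```python
from pathlib import Path


def table_rows(root):
    """Collect (date, interviewees, publication, has-source) per interview dir."""
    rows = []
    for year_dir in sorted(p for p in Path(root).iterdir() if p.is_dir()):
        for d in sorted(p for p in year_dir.iterdir() if p.is_dir()):
            # split on the first two underscores only, so the
            # publication field keeps any underscores of its own
            parts = d.name.split("_", 2)
            if len(parts) != 3:
                continue  # skip anything that doesn't follow the convention
            date, who, publication = parts
            has_source = (d / "source.yaml").is_file()
            rows.append(
                (date, who.replace("+", ", "), publication,
                 "yes" if has_source else "no")
            )
    return rows


def format_table(rows):
    """Render rows as a plaintext table with left-aligned, padded columns."""
    header = ("date", "interviewees", "publication", "source.yaml?")
    all_rows = [header, *rows]
    widths = [max(len(r[i]) for r in all_rows) for i in range(len(header))]
    return "\n".join(
        "  ".join(cell.ljust(w) for cell, w in zip(row, widths)).rstrip()
        for row in all_rows
    )
```

something like `print(format_table(table_rows(".")))` run from the library root would then print the table. directories with no source.yaml still show up as rows, which matches the idea that missing content is tracked too.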
let's end with some stats: i know of approximately 486 ORL/TMV-related interviews, and i have (in some form) about 354 of those. i'm continuing to trawl this site for old interviews. sometimes the source is live; great. if not, i can often come up with something with either a search engine or archive.org's wayback machine. but uhh i would definitely love it if someone with their own interview library just gave me the whole thing, because this work is really tedious.