Great Perfect Thanks - A long entry about digital archiving
billetdoux
A long entry about digital archiving


I have a confession. I am a Scrapbooker.

Okay, well, not really. More of a neurotic half-baked archivist. I don't compulsively arrange everything in little books, though I used to, before the sheer volume got out of hand. I still save massive amounts of physical paraphernalia - I have a box for "general stuff while living on brookline street," and a box for the barbarian group. I have a pile for ticket stubs, and a box for my dear beloved girlfriend. I have a box for rock and roll flyers and trinkets, and I have a box of company ID badges - the kind they give you when you are a visitor to a big corporate headquarters. I have a lot of those.

On top of all of these, of course, I have the boxes from previous situations throughout my life - boxes from old jobs, old loves, old houses... and maybe 20 boxes of photographs from the land before time when I shot film. Oh, and 5 big boxes of letters and four binders of negatives. I also save all my phone bills and that sort of thing, and my taxes, and, and, and.

When I moved into this house two years ago, I was in a seriously nostalgic mood. I photographed or scanned nearly every knickknack and clipping lying around my house, and I posted them all online. Here you can see my collection of IDs up to that point, my business cards past, a few soccer photos, the handmade book from an art exhibit I had at my house in 1996, and a catalog of every knickknack I had lying around the house. I also scanned endless magazine covers, catalogs, photos, etc. - anything that I had been saving through the years but wasn't sure exactly why - such as this collection: three Design Within Reach catalog covers that tell stories on the cover. So, I did all this scanning, and then I packed it all up and moved into my new house. I didn't throw any of it away, of course, but I did leave it in storage. The thinking was that I would have the digital versions handy, and so I could probably get by using those for any eventual need (such as, say, this essay). And if I ever really needed the original versions, I would have them nearby and could pull them out.

Alongside this, I digitized every one of my CDs. Something like 2,000-3,000 of them. I bought a big hard drive and ripped them all onto the hard drive, so I could play them in iTunes, in my living room, on my iPod, in my car, etc. Again, I didn't throw away the CDs, but I didn't keep them out, either. If I ever needed the original, it would be near at hand.

What I discovered, then, over the course of the following two and a half years is that I didn't, in fact, need the originals. This probably isn't that surprising to many of you. I mean, it's junk, really.

I did, however, find much use for having it digitized - I could refer to it online, I could search it, it was indexable, it was easily brought up in all manner of conversations, and I could share it far more easily with people that were interested.

The Digitizing of the Archive
I started to think of all the other things that this would apply to: my photos, my vinyl, my writings. And thus the first part of my digital scrapbooking obsession was born.

Since I moved in here, I have now undertaken several digitization projects. I work on most of them every weekend. They are in various stages of completion. They are:
  1. The digitization of all my CDs: This is complete, of course, though new CDs are always being purchased, and must be digitized. Upon purchase, a CD is immediately ripped into iTunes at a very high bit rate. Album art is applied - the entire booklet, if possible. The tags are corrected. The CD is then placed on a shelf and almost never touched again, save for occasional raids on the CD pile for road trips.
  2. My 10" vinyl: This is done. I had about 100 10" records. They have all been ripped, converted to 256k AACs, tagged and had artwork applied.
  3. My 7" vinyl: I have about 1000 7" records. I am about 80% through ripping them all. This has been the most rewarding of the music rips, because 7"s are a) easily and quickly listened to, and b) replete with rare and otherwise unreleased material. I have heard works I hadn't heard in years. I have been reconnected with bands I forgot I loved (Space Needle, Movietone, Fuxa, Asha Vida). It's been fun. I have a stack on my desk of the remaining to rip 7"s and it gets me excited every time, especially the sub pop single of the month by Pigface, with the original version of "empathy" featuring Michael Gira on vocals. It's sticking out from the pile about 2/3 of the way down.
  4. My 12" vinyl: I have SO FAR to go on this. I pulled out about 200 that I hadn't heard in ages, that are unreleased on CD, and that I want to rip. I have about 40 of those left. As for the rest... no idea yet.
  5. My Cassettes: I haven't even begun on this yet. Most of these will be half.comable, of course, and I'll probably already have a good chunk in the archives, between my own CDs and the three other collections I have ripped for friends. But there's a lot of awesome demos, local music, etc, and never mind my own demos from the 90's. I'm really excited about this and hope to start it soon.
  6. My VHS tapes: I'm more or less done with this. I have about 40 outstanding video cassettes that I either could not buy DVD versions of on half.com or haven't gotten around to ripping yet - mainly home movies, music videos and the like. I plan on tackling these this winter. My good friend Jon Whitney helped me greatly in this project - I loaned him about 20 video tapes for ripping for his night The Sound Your Eyes Can Follow at River Gods, and he ripped a good chunk of these videos from VHS for me.
  7. My home videos: I haven't done any of these yet, save the ones that come out of my cheap digital cameras. They are going to happen this winter. That's gonna be hella nostalgic.
  8. My photographs: I have about 10,000 photographs. I tackled these first. I probably scanned and cleaned up about 2,000 of them. I think that's all I really need. I now have a more or less complete photo history of my life digitized.
  9. My Journals: I have more or less digitized them all, save 2 hard bound, handwritten journals from 1991 and 1992 that I'm not quite done yet. I then have about three hours of work getting them all up into Livejournal, which I am using to archive them. (see my LJ calendar page for a ridiculous amount of entries spanning 18 years, though they are not, as of yet, viewable to anyone. One day, I plan on doing so. I just need to finish. It'll be about one more year.)
  10. My Mix tapes and mix CDs: I plan on making these playlists in iTunes, once I have everything ripped. This won't be for 2-3 years. It's gonna be awesome, though. Every time I hear "Push" by the Cure, I expect to hear "Age of Consent" by New Order after it, because of a mix tape from 1987. I have all these tapes, and I have a notebook showing the tracklisting for every mix tape I made 1988-1996. I have the tracklisting of basically every mix tape I ever made. It's gonna rule when I compile these. Aww yeah. Instant nostalgia.
  11. My parents' old vacation slides: I'm just getting started on this. They are being cataloged and sorted. I am investigating bulk scanning, cuz I can't bear to scan 500 of them by hand. I know machines exist for this, I just need to find out costs.
  12. My dad's old Super 8 films: Only opened the box once, though it's been sitting on my desk for two years.
  13. My DVDs: I'm not ripping my feature films (unless they are PAL or a non-US region and need translating), but only animation, music videos and the like - things that will play well on Channel Rick. I finished this task about a month ago.

So I work on one or more of these archiving projects every weekend. They all require differing amounts of intelligence and attention, so there's one to do no matter what my mood. Sometimes you want a mindless task. Sometimes you want to be engaged.

So, then, this is a good chunk of my digital scrapbooking.

The Storage of New Material
However, there's more.

Over time, my life has been basically becoming more and more digital. Everything I've described to you (save for the additional info about new CDs), has been about transitioning old archival material into the digital world. I've said little about the endless amount of NEW digital material that comes into my possession every day. Every day people email you photos. Every day you download some new bit of software, or a music video, or an mp3. Every day you look at a PDF. Every day someone at work sends you a clipping of some dumb movie, or a quick little sketch of your coworker with a poopy party hat on. These things, I feel, need to be saved.

There's all this talk on how society is digital now, and everything is saved, and it's the end of archiving, but I can assure you, nothing could be further from the truth. It's so impossible to save everything - it's like our digital culture has willfully worked to make the archiving of your digital material nearly impossible. At the very least, it requires copious amounts of attention and discipline. It's maddening. I tried, for a short time, to save all my SMS messages, but I had to let them go. There's literally no way to do it. Saving IM logs is doable, but it's maddeningly complex across multiple computers, gadgets, applications and platforms. I used to neurotically save software, in the pre-internet days, only to stop saving additional software (not throwing away the previously archived, of course), somewhere in 1999, thinking it was all on the internet now. Then in the early 2000s I found, time and time again, that software on the internet is ephemeral. Like the great SoundJam disappearance of 2001. And don't even get me started on how ridiculously hard it was for a while there to find the standalone StuffIt Expander for OS X. This stuff happens. So now I archive software again.

Basically, you can break down the types of "acquired digital material" into the following categories:
  • Music Emailed or IM'd to me: Goes into iTunes, gets tagged properly
  • Photos Emailed or IM'd to me: Go into the photo library, into a folder named "OPP" (other people's photos, get it?) and into folders by individual emailer, and get cataloged in iView
  • Software: Goes into the Software Archive Folder
  • PDFs, funny drawings, band flyers, schematics, manuals: Go into the "Scrapbook" folder, and get cataloged in iView
  • Receipts, invoices, itineraries, Ticketmaster pdfs: Go into the receipts folder.
  • Downloaded videos - First all videos (WMV, Flash, etc.) are converted using VisualHub, and then they go into a "movies to add" folder. Once every two weeks, they get transferred over to the AV Mac in the living room, and they get tagged and go into the AV computer's iTunes, like the DVDs above
  • Downloaded music - it's rare I do this, but when I do, I clean up the tags, and put it into iTunes. I don't bother downloading anything in WMA format that's just audio - that's just a joke.
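
A minimal sketch of how the filing rules above could be automated, in Python. The folder names echo the categories in the list, but the extension lists, the keyword check for receipts, and the function name `destination_for` are all assumptions made up for illustration. It only decides where a file should go; the tagging and cataloging in iTunes and iView stay manual.

```python
from pathlib import PurePosixPath

# Assumed extension lists; adjust to taste.
PHOTO_EXTS = {".jpg", ".jpeg", ".png", ".gif", ".tif", ".tiff"}

# Hypothetical keywords that mark a file as a receipt-type document.
# Note this is crude: a photo named "ticket_stub.jpg" would also match.
RECEIPT_WORDS = ("receipt", "invoice", "itinerary", "ticket")

# Extension -> folder, mirroring the categories in the list above.
FOLDER_BY_EXT = {
    ".dmg": "Software Archive",
    ".sit": "Software Archive",
    ".zip": "Software Archive",
    ".pdf": "Scrapbook",
    ".mov": "Movies To Add",
    ".wmv": "Movies To Add",
    ".flv": "Movies To Add",
    ".mp3": "Music To Tag",
    ".m4a": "Music To Tag",
}


def destination_for(filename, sender=None):
    """Return the folder a newly acquired file should be filed under."""
    name = filename.lower()
    ext = PurePosixPath(name).suffix

    # Receipts, invoices, itineraries and ticket PDFs get their own folder.
    if any(word in name for word in RECEIPT_WORDS):
        return PurePosixPath("Receipts")

    # Other people's photos are grouped by whoever sent them.
    if ext in PHOTO_EXTS:
        base = PurePosixPath("Photo Library") / "OPP"
        return base / sender if sender else base

    # Everything else is routed by extension, with a catch-all at the end.
    return PurePosixPath(FOLDER_BY_EXT.get(ext, "Unsorted"))
```

For example, `destination_for("party.JPG", sender="jon")` files the photo under the sender's own OPP folder, and anything unrecognized lands in a catch-all folder to be sorted by hand on the weekend.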

I spend about 2-3 hours each weekend organizing and putting everything away, tagging it and filing it.

I realize this all seems kind of neurotic, and I suppose it is, but it's been absolutely amazing how often having all this stuff at my disposal has come in handy. Photos especially, but, I mean, Channel Rick has really been born of this, and it's made my (admittedly very limited) TV watching so much better. Sure, I could watch Weeds, or I could also watch my favorite bands on obscure live broadcasts from 10 years ago. It's amazing.

Backup
So I back up all of this crap pretty obsessively, of course. What's the point of archiving if you're gonna lose it in a hard drive crash? By the way - your hard drive is going to crash. Soon. And you are going to lose everything. Back it up. I cannot convey to you how painful it is to lose this stuff. The more of your life goes online, the more traumatic this will feel. It's happened to me twice. It's... horrible. Back up.

Currently I back up using Time Machine in my Developer Preview of Mac OS X Leopard. I am a registered developer using a legal developer's copy. I hope I'm not violating my NDA by saying I'm using it. I'm testing it. Hell, I'm probably the world's most obsessive Mac-based digital archiver, so I figure I should test Time Machine. I backed up everything before I moved over to Leopard, but it's working swimmingly. Either way, though, whether Time Machine now, or Deja Vu before this, I back up nightly. It's automated. It makes copies of any files that have been deleted in the day. And it's on a different RAID from my home directory. It's really great having this online, and it's really great having the backup be automated. I used to use CDs and then DVDs to archive, but it never really worked. So now it's all right there, online, and backed up. I like that.
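
For anyone without Time Machine or Deja Vu, the core idea of an automated nightly backup can be sketched in a few lines of Python. This is not how either of those tools works internally; it is a bare-bones mirror, assuming a hypothetical function name `backup_changed`, that copies only files that are new or have changed since the last run. It does not keep old versions or copies of deleted files.

```python
import shutil
from pathlib import Path


def backup_changed(source: Path, mirror: Path) -> list:
    """Copy new or changed files from source into mirror.

    Returns the relative paths that were copied on this run.
    """
    copied = []
    for src in sorted(source.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source)
        dst = mirror / rel

        # Skip files whose mirrored copy has the same size and is at least as new.
        if dst.exists():
            s, d = src.stat(), dst.stat()
            if d.st_size == s.st_size and d.st_mtime >= s.st_mtime:
                continue

        dst.parent.mkdir(parents=True, exist_ok=True)
        # copy2 preserves the modification time, so the next run sees the file as unchanged.
        shutil.copy2(src, dst)
        copied.append(rel)
    return copied
```

Pointed at a home directory and a folder on a separate drive, and run from a nightly scheduled job, this gives the "automated, on another disk" part of the setup described above.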

I should, of course, have an offsite backup. I'm not sure how to pull this off what with the full archive being somewhere around 1.5TB right now. I'll probably buy 2 more 750 GB SATA drives and 2 more trays for the RAID, and then swap the backup array in and out every week. I might get around to this. We'll see. I have a bit of reshuffling to do coming up what with the new computer and its 4-bay SATA internal RAID enclosure.

Versioning, Source Apps
So this leads me to my biggest grief. I have been doing this in some capacity since... maybe 1993 or so. I have files on my G5 right now that were originally created on my old Apple //c. I have a TON of files made in really old versions of AppleWorks, MS Word, etc. I have been religiously migrating these up in versions through the years, and can I just tell you it's a fucking pain? There's this whole thing with open formats that's going on now... big debates in governments and organizations about whether they want to save all their documents in Word format, when they don't know the future of MS Word, and whether they'll even be able to open their documents in the future. I can tell you now that though this hasn't hit the consumer level yet, it will. It's only a matter of time.

We've lucked out in a lot of ways that JPG and MP3 and PDF are the order of the day on the web (ignore where PDF came from, I think it's pretty safely open what with its full integration into OS X). iTunes Music Store files are the worst right now - god knows what they're gonna do with those in the future, so every time I buy an album on iTunes I have to burn it to CD-RW and re-rip it into iTunes. There's a digital-to-digital conversion loss in there, of course, since re-encoding an already compressed file degrades it, so I don't buy anything I love too terribly much. Besides, everything's cheaper on half.com anyway.

A Call for OS-Level Version Migration
But nonetheless, applications die, and despite our best efforts we're gonna have files saved in .graffle or .keynote or .chat docs. What we really need is an OS-level version migration system, similar to Time Machine, but which migrates documents, according to a user's settings, to newer versions of the same documents. Of course it saves the original, Time Machine style, for posterity's and provenance's sake, but basically any time a new version of some app comes out, the service would migrate them - maybe once a month, or whatever you set it to.

This really shouldn't be that hard - iView and DeBabelizer have a myriad of converter information saved in them to open documents - it would be relatively trivial to develop a set of converters that migrate your documents according to your preferences. I would love to see this, more than anything, as a future feature of an OS. Just make it automatically work. The vast majority of people out there really don't care that they are losing some original metadata when they migrate their files from MS Word 95 to MS Word X, and if you did care, or were an archivist, you could set your settings appropriately. And the system could, of course, migrate over the important metadata - date created and modified, original application version, etc.
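
To make the idea concrete, here is a toy sketch in Python of what such a migration service might do for one file. Everything in it is hypothetical: the `CONVERTERS` registry, the `migrate` function, the `.provenance.json` sidecar, and the single stand-in converter (old Latin-1 `.text` files to UTF-8 `.txt`). A real system would plug in actual format converters; the point is only the shape: look up a converter, keep the original, write the new version, and carry the metadata along.

```python
import json
import shutil
from pathlib import Path


def latin1_to_utf8(data: bytes) -> bytes:
    """Stand-in converter: re-encode legacy Latin-1 text as UTF-8."""
    return data.decode("latin-1").encode("utf-8")


# Hypothetical registry: old extension -> (new extension, converter function).
CONVERTERS = {".text": (".txt", latin1_to_utf8)}


def migrate(path: Path, originals_dir: Path):
    """Migrate one file to its newer format, or return None if no converter applies."""
    rule = CONVERTERS.get(path.suffix.lower())
    if rule is None:
        return None
    new_ext, convert = rule
    stat = path.stat()

    # Keep the untouched original for posterity and provenance.
    originals_dir.mkdir(parents=True, exist_ok=True)
    kept = originals_dir / path.name
    shutil.copy2(path, kept)

    # Write the migrated version next to where the old file lived.
    new_path = path.with_suffix(new_ext)
    new_path.write_bytes(convert(path.read_bytes()))

    # Carry the important metadata over in a sidecar file.
    sidecar = new_path.with_name(new_path.name + ".provenance.json")
    sidecar.write_text(json.dumps({
        "original_name": path.name,
        "original_format": path.suffix.lower(),
        "original_modified": stat.st_mtime,
        "kept_original": str(kept),
    }, indent=2))

    path.unlink()
    return new_path
```

Run once a month over a documents folder, something like this would cover the "migrate, but never lose the original" behavior described above; the user preferences would amount to choosing which entries in the registry are switched on.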

I'm sort of on the front end of a wave of consumers here, what with having more or less my whole life online (or scheduled to be, anyway, by the end of 2007), but I'm not really that far ahead. Everyone owns a digital camera and an iPod. It's true there's a crazy culture now, especially in the youth, of embracing the ephemeral nature of digital files - buying/downloading songs more than once, as needed, and taking countless digital photos and deleting them after looking at them only once. It gives me shivers to think of all this archival material being thrown away, but that's just me. Indeed, we are already seeing the rise of the hard to find digital file - "iTunes originals" that go offline eventually, parody music videos that are driven underground by copyright issues (like the Karen Carpenter Barbie film), etc., etc. And as much as we're happy to throw away certain music, no one is happy when their hard drive dies with all of the ripped CDs on it that they spent months on.

But anyway, despite this throw-it-away culture, I do believe people want to save more and more. These issues are going to crop up more and more. In the two years since I've started this, I've noticed a definite rise in the profile of digital scrapbooking. Livejournal calls its photo utility scrapbook, and there seem to be a host of services available if you google it.

Archiving isn't really dead, but it definitely needs some new thinking applied to it.

Current Music: Venture Bros

Comments
terrajen From: [info]terrajen Date: September 5th, 2006 03:59 am (UTC) (Link)
wow...and here i thought i was a digital packrat.
billetdoux From: [info]billetdoux Date: September 5th, 2006 04:30 am (UTC) (Link)
Oh yeah?? Tell me more. Do you save all this sort of stuff as well? Where do you keep it all?
violetshuraka From: [info]violetshuraka Date: September 5th, 2006 04:07 am (UTC) (Link)
i just realized i had my mom's old super 8 movies to transfer, one fell out of my closet as i was trying to hide some more junk away. it has been on my list to do for about 4 years now!!! oops.
billetdoux From: [info]billetdoux Date: September 5th, 2006 04:29 am (UTC) (Link)
Yeah, the super 8. I've been holding on to those for like 10 years and still haven't gotten to them, but I'm getting close, I can feel it. 2007 will be THE YEAR.
capricornia From: [info]capricornia Date: September 5th, 2006 04:13 am (UTC) (Link)
best post ever!
billetdoux From: [info]billetdoux Date: September 5th, 2006 04:29 am (UTC) (Link)
ha! thanks! sometimes it's good to geek out.

And thanks, though I think the computer day post of 2002 was even geekier.
magneticwoman From: [info]magneticwoman Date: September 5th, 2006 04:35 am (UTC) (Link)
do you really think iview is better than iphoto? i HATE iphoto and every single update they make to it only makes it worse.

isn't there any good photo archiving software??? i feel like we've had this talk before. deja vu?

ps i love this post. i think you might have a photo/scan/something about the various hair pins that you found under your bed. that one was my favorite.
billetdoux From: [info]billetdoux Date: September 5th, 2006 04:52 am (UTC) (Link)
I think, basically, we need to write one. I'm working on it. iView is definitely better, but it's not there, and Microsoft bought it, so we can't expect it to improve.

I do still have the hairpins, of course. They have been archived and cataloged, absolutely. You'll need them again one day, I've no doubt.
magneticwoman From: [info]magneticwoman Date: September 5th, 2006 05:17 am (UTC) (Link)
do they have little tags on them? with dates?
billetdoux From: [info]billetdoux Date: September 5th, 2006 05:34 am (UTC) (Link)
They are in helium-gas-filled containers to protect them from the ravages of time.
_perihelion_ From: [info]_perihelion_ Date: September 5th, 2006 06:50 pm (UTC) (Link)
we need to write one

if you can design it, I can write it. :)
billetdoux From: [info]billetdoux Date: September 6th, 2006 01:58 am (UTC) (Link)
You don't happen to know Objective C/Cocoa, do you? ;)
_perihelion_ From: [info]_perihelion_ Date: September 6th, 2006 02:06 am (UTC) (Link)
of course, dude. this is what I do. :)
billetdoux From: [info]billetdoux Date: September 6th, 2006 09:51 pm (UTC) (Link)
Really? You do cocoa? You a full timer?
_perihelion_ From: [info]_perihelion_ Date: September 7th, 2006 12:05 am (UTC) (Link)
contractor. I go where the work is. which is nice because I like it that way. :)

currently I'm under contract until at least May. but that's not to say I can't take a part-time gig on the side, if you've got work you need done.
billetdoux From: [info]billetdoux Date: September 11th, 2006 07:18 pm (UTC) (Link)
We are getting into cocoa work. You should send me your resume. Rick (at) barbarian group (one word) (dot) com
savia From: [info]savia Date: September 5th, 2006 05:20 am (UTC) (Link)
Great post! You and my main man are very much alike in the save-everything-and-organize-it way. He's slowly moving toward digitization and whatnot.

[info]buttler just finished uploading his 5,000+ CDs to iTunes (he was a rock critic for several years so has a ridiculous amount of music), and has a ton of vinyl he wants to digitize, but no idea where to start. I haven't a clue, either. I imagine it has something to do with hooking a turntable up to the computer - any suggestions on where to start (especially on a tight budget)? Also, tapes. Is it easy to split up the LPs and tapes into separate tracks?


billetdoux From: [info]billetdoux Date: September 5th, 2006 05:42 am (UTC) (Link)
Man, I could write a whole post about ripping vinyl. But basically, get a good turntable, get a phono preamp (I recommend this one: http://www.amazon.com/ART-USB-Phono-Plus/dp/B000BBGCCI/sr=8-16/qid=1157434530/ref=sr_1_16/102-2665652-6168933?ie=UTF8&s=electronics), and an audio-to-USB converter. I recommend the Griffin iMic because it comes with Final Vinyl, the best software for ripping the audio (though it's buggy, and only slightly better than other software).

A slightly easier route is to use the new Numark TTUSB, though it only comes with Audacity and has no adjustable gain on its crappy phono preamp.

Save the files to AIFFs, and drag them into iTunes and convert 'em to whatever you like listening to. Then Tag 'em. It's the tagging that's a pain and man, there's no easy way around that. TEDIOUS.

The whole process is pretty pleasant, though. You get to listen to your old records, which is pretty fun.

If you're rich, I'd also recommend just checking the iTunes Music Store before you rip - it's amazing the shit I thought was rare that's up there now.

If you already have a turntable, though, all you need is a cheap phono preamp (mine was $30) and a cheap USB audio interface (the iMic is $29.99). That's it. Then you're good to go.

Final Vinyl uses software to find the breaks in the records. It works so-so. Other software lets you hit a key on the keyboard between songs and it marks it. This works a little bit better. Either way, it's easy, but tedious.
emeraldexile From: [info]emeraldexile Date: September 5th, 2006 01:33 pm (UTC) (Link)
wow.. i wanna be that organized. the best i can do is categorize photos into month/year they were taken. & design within reach rules. i love their furniture & lighting
billetdoux From: [info]billetdoux Date: September 14th, 2006 03:58 am (UTC) (Link)
I love them - but I probably won't love them as much when my FIVE HUNDRED POUND new shelf arrives.
vomitola From: [info]vomitola Date: September 5th, 2006 01:34 pm (UTC) (Link)
organizing is so delicious.
watchamacallit From: [info]watchamacallit Date: September 5th, 2006 03:07 pm (UTC) (Link)
Wow. And I thought I had my shit together arranging my iTunes for a whopping twenty minutes yesterday! I have very little stored on the computer. Only a smattering of CDs and photos (it doesn't help that I don't have a digital camera).
ru8y From: [info]ru8y Date: September 6th, 2006 07:40 pm (UTC) (Link)
I've never been big on saving things because I don't like having a lot of stuff.

But now with the ability to save a nearly infinite amount of information in a small space, I'm finding more and more that I do want to save -- but rather than lacking space, I'm lacking the organizational framework.

Thanks for the informative post.

billetdoux From: [info]billetdoux Date: September 14th, 2006 04:00 am (UTC) (Link)
How are your C++ chops? Or do you stick to the objective?
Dr. Rickford Webbington
Name: Dr. Rickford Webbington