Going into final editing

I’ve got Terri Wells’ edits back in the mail, so now I have to make a final run through the book. After making all the corrections, there will still be work to get it up on Smashwords. My experience with JHOVE Tips for Developers, which I did mostly as a practice run, shows that it will take several revision cycles to get the book’s style to satisfy all of Smashwords’ criteria. (JHOVE Tips still doesn’t qualify for the premium catalog.) Smashwords doesn’t have any provision for submitting a book as a private draft, so please don’t buy it till I say here that it’s ready.

The amount of support that I’ve gotten on this project has been fantastic. I hope you’ll be as happy with the result as I am.

Current status

I’ve had to change proofreaders at a late date, but I think the new proofreader will do very well. I’m still committed to getting the book out in April.

I’d changed the default page of filesthatlast.com to point at the “About” page. Unfortunately, this left no way to get to the posts page, and every solution to this that I’ve seen requires writing PHP, which isn’t allowed on WordPress-hosted blogs. I really want to attract more attention to the “About” page, which is the one that actually promotes the book, but for the moment I’ve just changed the default page back.

The advantage of independence

The Signal is a very good blog on digital preservation. It has a serious limitation, though: it’s published by the Library of Congress, which, as a government agency, has to stay neutral on businesses and products. I heard at the recent OPF Hackathon that people who write for it have been required to take out comments endorsing or criticizing specific products.

I don’t have that limitation. In Files that Last, I name names and make recommendations. Here are some things in FTL that you’ll never find on The Signal:

  • “Sometimes content providers decide to stop supporting old DRM systems which require you to have online access, making the stuff you paid for suddenly useless. Major League Baseball did this in 2007 with its videos.”
  • “According to several websites, in 2012 Sky News yanked a story which was embarrassing to Formula One Racing. It was ‘withdrawn for further review’ and later restored to the website in a redacted form. In some countries, removing or rewriting news stories because of governmental censorship is routine.”
  • “Amazon’s use of the word ‘purchased’ for Kindle content is an outright lie.”
  • “iPhoto is hostile to intelligent users and digital preservation. Flee from it.”

If you want to see more statements on digital preservation with no punches pulled, you’ll be able to in April when the book comes out.

The tale of a preservation geek

Files that Last goes to the proofreader tomorrow, but just today I came across a story that I wanted to add to it. Rather than mess with the existing text, I’m entering it as an appendix. Here it is, as it currently stands.

Just a day before this book is due to go to the proofreader, I’ve come across the story of a person who really exemplifies the term “preservation geek.” A story on Reason magazine’s website, “Amateur Beats Gov’t at Digitizing Newspapers: Tom Tryniski’s Weird, Wonderful Website,” tells us of a retired computer engineer, Tom Tryniski, who has digitized over 22 million newspaper pages, many dating back to the 19th century, and made them available on Fultonhistory.com. It’s a truly ugly, Flash-based website, but that’s not the point here. What the site shows is that high budgets and formal training in library science aren’t necessary for doing valuable preservation work.

Tryniski started by digitizing old postcards for neighbors in Fulton, New York. Then he spent a year digitizing the entire run of the Oswego Valley News by hand on a flatbed scanner. In 2003 he got a microfilm scanner at a fire sale and started getting microfilms of newspapers from libraries and historical societies in exchange for the digitized copies. He’s paying his own expenses, apparently less than $1000 a month. The setup sounds very fragile; he has a “server that’s located in a gazebo on his front deck,” and the article doesn’t say a word about offsite backup for his growing farm of computers and drives. If anything bad happens to him or his house, the whole archive might vanish.

What one person does, though, someone else can do better. It would take more money, but not a lot more, to set up a better server environment and a secondary backup, and it would just take a little taste and programming skill to set up a better-looking site.

The article raises the question of how much supporting metadata is needed:

Asked for the rationale behind this byzantine system, a spokesperson for the NEH denied that breaking up the funding into small grants drives up costs, adding that the goal is partially to teach small libraries how to digitize newspapers in accordance with the Library of Congress’ “high technical” standards. That way they’ll be able to take that know-how and apply it to other projects.

But [Brian] Hansen [the general manager of Newspapers.com] says the Library of Congress’ detailed specifications for analyzing each newspaper page are of questionable value to users and a major reason his firm has to charge so much.

“Why not use the money for a lighter index to get more pages online? It would be interesting to sit down with the Library of Congress and the NEH and have a conversation about what’s the best thing we can do for consumers,” says Hansen.

Even so, less than one-third of the funding goes to the actual scanning and indexing by firms like iArchives. The NEH says the remaining money—more than $2 per newspaper page— goes for “identification and selection of the files to be digitized, metadata creation, cataloguing, reviewing files for quality control, and scholarship on the scope, content and significance on each digitized newspaper title, and in some cases specialized language expertise.”

Certainly there’s value in all that information, but it adds cost. The approach which the Library of Congress takes isn’t necessarily the approach that you should take as a Level 1 archivist with a server in a gazebo. Having a little information on a lot of newspaper pages is better in some ways than having a lot of information on relatively few pages.

There are high and low roads, and the efforts of eager amateurs can make a significant contribution to the retention of information. Preservation geeks, go forth and archive!


Reminder

Please get in any last-minute comments on the advance draft by the end of tomorrow (Thursday). Thanks to those who already have given feedback, and to those who are looking it over now. More feedback means a better book in April.

FTL cards

If anyone wants some cards to hand out, promoting the upcoming publication of FTL, let me know. I’ll be taking a bunch to the Hackathon.

Next steps for FTL

I’m planning to send Files that Last to the proofreader on March 7 or 8, before I go to the OPF Hackathon. She’s set aside a block of time for it, and I don’t want to mess up her schedule if I don’t have to. If those of you who have advance copies could send feedback by then, that would be great. I can sneak in changes after that, but the closer the proofreader’s copy is to the final copy, the easier it will be all around.

Pre-publication FTL now available for backers


The pre-publication version of Files that Last is now done, and instructions for downloading it have been sent to all backers at the $25 level and higher. If you didn’t get that message and think you should have, let me know.

This week I attended the Personal Digital Archiving conference at the University of Maryland and was glad to see how digital preservation is starting to catch on with non-library people! The things I learned there resulted in a few last-minute additions to the book.

Thanks to all of you who backed Files that Last at any level. All of you helped to make it possible. It’s still on target for an April release.

First draft complete

The first draft of Files that Last is now complete at 64,708 words. The next step for me is to go over it carefully before letting anyone else see it. In early February I’ll make it available to the people who chose the $25 reward level on Kickstarter. My plan is to put it up as PDF and ODT files in a password-protected directory, and send out the password to everyone who’s authorized to see it. If you would prefer to see it some other way, let me know and I’ll see if I can arrange it.

Next steps

This used to be the blog for the Kickstarter campaign for the book Files that Last. Now it’s the blog for the forthcoming book. So where do we go from here?

My Kickstarter promises are to deliver an early version to supporters at the $25+ level in February and a completed book to supporters at the $10+ level in April. I see no reason I shouldn’t meet those dates. By a pleasant coincidence, Preservation Week is April 21-27, and if I keep to my schedule I should be able to release the book to coincide with it.

I need to figure out how to advertise a tech e-book. Right now I know effectively nothing about that.

But there will be a book, and a reasonably polished one at that!
