Crowdsource, Please

by wjw on May 15, 2011

Like every other midlist writer on the planet, I’m striving to get my out-of-print books and stories online so that (a) you can enjoy them, and (b) I can make a few bucks.

To this end, I embarked upon a Cunning Plan.  I discovered that my work had been pirated, and was available for free on BitTorrent sites located in the many outlaw server dens of former Marxist countries.  So I downloaded my own work from thence with the intention of saving the work of scanning my books— I figured I’d let the pirates do the work, and steal from them.   While this seemed karmically sound, there proved a couple problems.

First, the scans were truly dreadful and full of errors.   (Even if you’re desperate for my work, I can’t really recommend them.)  A lot of time has been spent copy-editing, both by me and by Kathy— which isn’t really so bad, because this would have to be done anyway.

But second, apparently a few of my books were so obscure that they flew under the radar of even the pirates! You can’t imagine how astounded I was when I discovered this.

I could really use some decent scans of some of my books, and I figure some among you must have better scanners and OCR than the piece of crap that’s currently sitting on my shelf.

So I’m willing to trade.  Should any of you volunteer to provide scans of Days of Atonement, Angel Station, and Knight Moves, that lucky individual will get a signed, personalized copy of the WJW book of his or her choice (assuming I actually have a copy, of course). Plus, whatever book you scan will spend digital eternity with your name in it, along with my eternal thanks.  Sound good?

Crowdsourcing.  It’s so 21st Century!  You want to do this, right?

Let’s talk.

{ 53 comments… read them below or add one }

Zora May 15, 2011 at 6:38 am

I’ve been turning public domain books into free ebooks for eight years (at Distributed Proofreaders). I’ve proofread about 70,000 pages for them. I have a scanner, the latest copy of Abbyy Finereader, and great knowledge of common OCR errors, which we term scannos (clown for down, arid for and, that sort of thing). I believe that you, as blog owner, have access to my email address. Feel free to contact me. You’ll have to send me copies of the books you want scanned, as I don’t have paper copies. Old, battered copies are fine, as long as the print is clear. It might be best if I were to send you XHTML files, which you could subject to a second proofreading pass and then convert to the desired formats with Calibre (great free software). Just one proofing isn’t enough to catch all the errors, though I could commit to catching 95% of them.

Scanning the books is easy. Correcting the errors is the tricky part — which is why pirates skip it. The proofing is going to take more time than you expect, even though I’m dang fast at it.

Phiala May 15, 2011 at 2:33 pm

And it’s too late for your first round, but I have a set of sed scripts that automatically corrects the most common OCR errors; wrote them for another author friend who was doing something similar.
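
For anyone curious what a pass like that might look like, here is a rough sketch in Python rather than sed. It is not Phiala's actual scripts. The `SCANNOS` table and the `flag_scannos` name are made up for illustration, and only the two word pairs come from Zora's comment above. Because "clown" and "arid" are also real words, the sketch flags suspects for a human to check instead of replacing them blindly.

```python
import re

# Hypothetical scanno table: misread word -> likely intended word.
# Only these two pairs appear in the thread; a real list would be far longer.
SCANNOS = {"clown": "down", "arid": "and"}


def flag_scannos(text, scannos=SCANNOS):
    """Return (line_number, found_word, suggestion) for each suspect word."""
    pattern = re.compile(
        r"\b(" + "|".join(map(re.escape, scannos)) + r")\b",
        re.IGNORECASE,
    )
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in pattern.finditer(line):
            word = match.group(0)
            hits.append((lineno, word, scannos[word.lower()]))
    return hits


if __name__ == "__main__":
    sample = "He sat clown arid wept.\nThe clown laughed."
    for lineno, word, suggestion in flag_scannos(sample):
        print(f"line {lineno}: '{word}' might be '{suggestion}'")
```

The second line of the sample shows why a person still has to make the call: that "clown" is probably correct.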

John Appel May 15, 2011 at 5:59 pm

I have copies of both Days of Atonement and Angel Station if those would be helpful.

on May 15, 2011 at 8:50 pm

A quick search shows that Knight Moves does have a scan floating around.

Tangental Tip May 15, 2011 at 8:53 pm

I don’t have the digitized books to offer (or copies to scan) but I do have a tip on a great scanner for books– the Plustek Opticbook 3600. Costs (a lot) more than the 30 buck printer/scanner combos you can buy at Wal-Mart, but worth it if you have a lot of books you want to digitize non-destructively.

http://plustek.com/usa/products/opticbook-series/opticbook-3600-plus/introduction.html

Nadine A May 15, 2011 at 9:33 pm

I can scan in Knight Moves, if you still need that one.

jher May 15, 2011 at 9:47 pm

i’ve a decent copy of knight moves and a scanner. tell me where to send jpgs…

Michael Roberts May 15, 2011 at 10:10 pm

Count me in for proofreading. I do a lot of it and I’m always up for reading your work. And my sister has a document feed scanner (she’s a CPA) that I can borrow. I’ve got both Days of Atonement and Angel Station in paperback, but finding them in the boxes is always a pain. I might have Knight Moves, too; not sure.

Andrew Smith May 15, 2011 at 11:32 pm

I have an actual paperback copy of Days of Atonement somewhere on my shelves. I’d be willing to send it to you free of charge if you’d like to have an original to scan yourself. I entered my email as prompted, feel free to use it to contact me.

Greg Weeks May 15, 2011 at 11:54 pm

I have a copy of Knight Moves. It took a bit of searching to find it. How soon do you need the scans? I’m not going to proof the text though. That’s what http://www.pgdp.net is for. I found out you were searching from Boing Boing.

Greg Weeks

Lydia May 16, 2011 at 12:04 am

Amazon has KNIGHT MOVES in pb for as low as a penny plus shipping – I’d be happy to get a copy and have it sent if that would be helpful.

Jeff Forbes May 16, 2011 at 1:08 am

I also have a copy of Days of Atonement and Angel Station. Not a lot of free time to scan, however.

Eddie Cochrane May 16, 2011 at 2:28 am

I’ve found a torrent containing Knight Moves if you are interested in a pirate copy of it. It is a collection called “Fantasy & Science Fiction Authors UVWYZ – PDF” mentioning the file “Walter Jon Williams – Knight Moves.pdf” in the details. I can download a copy and email if you like, it’s only 136.05 MB for the whole thing so even if it’s a single archive it shouldn’t take too long.

Joe in Australia May 16, 2011 at 3:16 am

I’ve got copies of both the UK and USA first editions of Angel Station.

What? No, I was just boasting. It’s such a great book. Why hasn’t anyone turned it into a movie?

Randall Neff May 16, 2011 at 3:17 am

What file format and resolution would you like?
Scan, convert images to PDF, run Acrobat Pro OCR to put text “underneath” the image?
I have Days of Atonement and Angel Station in hardcover.
(Signed, by the way)

john aho May 16, 2011 at 3:34 am

Did you try the bowels of IRC?

I found Knight Moves on one server and forwarded you a copy.

heteromeles May 16, 2011 at 3:43 am

I’ve got a copy of Knight Moves. No scanner though.

Bruce May 16, 2011 at 3:52 am

I, too have been a volunteer at Distributed Proofreaders, and I have one of the better book scanners. I have all three books in paperback, which is harder to scan well due to the narrower margins. For this project, I think I could get two of them in hardcover through my library (one ILL, one local) for better quality scans. I’m not a collector of author signed books, would it be possible to get ebooks rather than paper?

Bruce May 16, 2011 at 6:37 am

I’m assuming you won’t want to approve this one. Just out of curiosity, I scanned and OCR’d the first two chapters of Knight Moves, which is the one that I don’t think I can get in hardcover from the library. I had to rescan a few pages because part of the page was too close to the edge, and I had to adjust a couple of text area boundaries in FineReader 10. I didn’t actually proofread it, but from just eyeballing the results it looks to me like the OCR is pretty clean. The biggest problem with it is that it’s creating paragraph breaks at some page boundaries. They’re easy to spot because they are paragraph breaks that are not indented.

I’ve only put up the FineReader HTML output of the two chapters I’ve scanned here:
http://www.zuhause.org/knight_moves_ch1-2.htm
If you want me to continue on this, I expect that you’ll want the scanned images as well, and maybe some other output formats from FineReader. It can write Word 2007, RTF, and OpenOffice formats. The raw images are about 1MB each; when I create images for proofreading at PGDP, I usually postprocess them down to about 100k, which is not really suitable for OCR but is usually clean enough for proofreading.
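
The unindented paragraph breaks described above lend themselves to a simple cleanup pass. What follows is only a sketch under stated assumptions: the text has already been split into a list of paragraph strings, and every genuine paragraph starts with a space or tab. The function name `merge_unindented_breaks` is hypothetical, and none of this is a FineReader feature.

```python
def merge_unindented_breaks(paragraphs):
    """Join any paragraph that lacks a leading indent onto the one before it.

    Assumes genuine paragraphs begin with a space or tab, so an unindented
    "paragraph" is treated as the tail of the previous one, split by a page
    boundary during OCR.
    """
    merged = []
    for para in paragraphs:
        is_indented = para.startswith((" ", "\t"))
        if merged and not is_indented:
            # Spurious break: glue this text back onto the previous paragraph.
            merged[-1] = merged[-1].rstrip() + " " + para.strip()
        else:
            merged.append(para)
    return merged


if __name__ == "__main__":
    pages = ["  First paragraph ends on", "the next page.", "  Second paragraph."]
    for para in merge_unindented_breaks(pages):
        print(repr(para))
```

The heuristic fails wherever the OCR drops the indent on a real paragraph, so the merged output would still need a proofreading pass.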

Bruce May 16, 2011 at 6:42 am

Oh, and I see that it had some issues with punctuation and italics, like on page 9, it didn’t OCR a ? and ! correctly.

Bruce May 16, 2011 at 6:49 am

And a couple other pages, 21 and 23, aren’t indenting the beginning of the paragraphs.

Chuck LeDuc Díaz May 16, 2011 at 8:05 am

I managed to score a real-life copy of the elusive Solip:System last year, so I’ll give it a shot.

Andrew Robinson May 16, 2011 at 12:07 pm

I only own paperbacks of ‘Hardwired’ and ‘Voice of the Whirlwind’, but I’ll ask around the secondhand bookshops of London to try and buy the missing ones for you.

Andrew Robinson, former party leader, Pirate Party UK.

Andrew Robinson May 16, 2011 at 12:41 pm

I’ve sourced a copy of Knight Moves that can be scanned, so we’ve got all 3 between us. What’s the best scan format for you guys, or should I get it mailed to someone with a better scanner than my 600dpi flatbed?

Can the proofreading process be easily broken down into manageable chunks for crowdsourcing? If so, we can put the word out and probably get a few thousand eyes on it fairly quickly.

Kevin O'Neill May 16, 2011 at 1:12 pm

I have an old paperback version of ‘Days of Atonement’ that I can separate from the binding and run through a copy machine (saves to PDF) and then OCR. Do you want the original PDF scans and the text file?

Let the crowd proof it for you :)

kto

jon May 16, 2011 at 2:57 pm

Either “from there” or “thence” not “from thence.” The former is standard, the middle is archaic, and the latter is poseurishly snooty.

Michael Walsh May 16, 2011 at 3:37 pm

For the Old Earth Books reprint of Clifford Simak’s novel “Way Station” we also used a pirated online copy. Obviously it was proofed, but still …

Oriol May 16, 2011 at 3:51 pm

I have found Knight Moves. The quality is good.

Send me an email and I will send you the PDF. My address is violetagris(at)ymail.com.

SV May 16, 2011 at 3:56 pm

Hi Mr. Williams,

You might be interested in the open source book scanner project. A colleague of mine built one for a local library, to digitize out-of-copyright works. It’s surprisingly effective, won’t destroy the original book, and can be built in a weekend by a geek of moderate skill.

http://diybookscanner.org/forum/viewtopic.php?f=1&t=262
http://diybookscanner.org/forum/viewtopic.php?f=3&t=302&start=0

kacir May 16, 2011 at 4:10 pm

I have just mailed you scanned copy of Knight Moves.

Luis May 16, 2011 at 5:01 pm

Hi,

If you can’t find them, I have no problem in retyping… for free.

Greg Tucker May 16, 2011 at 5:08 pm

I have copies of all 3 books (paperback) and would be honored to help the project.

Coherent May 16, 2011 at 5:17 pm

I have Knight Moves here. I will email it to you momentarily for your perusal.

Malcolm Farmer May 16, 2011 at 6:11 pm

I’m game– I scan stuff for Distributed Proofreaders (Hi, Zora!), and have hard copies of Knight Moves (Tor) and Angel Station (Orbit, UK). And a scanner, ABBYY FineReader, and copies of DP’s pre-proofing and post-proofing processing software. I can scan Angel Station there if there are no other electronic copies sitting on someone’s hard drive already….

David Karger May 16, 2011 at 7:36 pm

I have Knight Moves. But scanning a whole book (without destroying it) is a lot of work for an autograph! Would you consider subdividing the task, putting up a doc where people could “claim” and scan, OCR, and proofread specific pages? I’d be happy to do a few.

I’d also be willing to accept reduced compensation—say your initials on a blank page.

DensityDuck May 16, 2011 at 7:45 pm

Dammit, and I just got finished my own edit of “City On Fire” (finished “Metropolitan” about a week ago.) I think the biggest pain in the ass was fixing all the quotation marks–for some reason, whoever scanned it turned all the double-quotes into singles. ‘The result was that all the dialogue looked like this’, he said, ‘and it’s necessary to go through every sentence.’

PS: Jesus, Walter, you sure do love hyphens! If I never have to paste — again it’ll be too soon!
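
The single-quote problem described above can be partly automated. This is a minimal sketch, assuming the scan uses curly quotes and that a closing mark followed by a letter is an apostrophe rather than the end of dialogue. The function name `single_to_double_quotes` is hypothetical. The heuristic will get plural possessives inside dialogue wrong (dogs’ bones), so it saves pasting but not checking.

```python
def single_to_double_quotes(text):
    """Convert curly single-quoted dialogue to curly double quotes.

    An opening single quote always becomes an opening double quote. A closing
    single quote becomes a closing double quote only when dialogue is open and
    the next character is not a letter; otherwise it is kept as an apostrophe.
    """
    out = []
    dialogue_open = False
    for i, ch in enumerate(text):
        nxt = text[i + 1] if i + 1 < len(text) else ""
        if ch == "\u2018":
            out.append("\u201c")
            dialogue_open = True
        elif ch == "\u2019" and dialogue_open and not nxt.isalpha():
            out.append("\u201d")
            dialogue_open = False
        else:
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    sample = (
        "\u2018The result was that all the dialogue looked like this\u2019, "
        "he said, \u2018and it\u2019s necessary to go through every sentence.\u2019"
    )
    print(single_to_double_quotes(sample))
```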

Vulpine May 16, 2011 at 7:46 pm

If you didn’t mind the copy being torn apart, a clean scan would be pretty easy and I have a pretty good way of pulling the text from some kinds of scanned documents. On the other hand, leaving the book intact makes it re-usable but scans aren’t likely to be as clear. Either way, I don’t have any copies of the books and would need to obtain one to scan it.

DensityDuck May 16, 2011 at 7:47 pm

Oh, and: “Constan-tine” “Con-stantine” It was funny to see all the places where the scan hadn’t realized that something was a hyphenation.
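
Leftover hyphenation like that could be caught with a word list. This is a sketch only: `rejoin_hyphenated` and the `known_words` set are hypothetical, and it assumes you supply the names and words you expect to find whole, such as character names from the book. Hyphenated forms whose joined version is not in the list are left alone, so genuine compounds survive.

```python
import re


def rejoin_hyphenated(text, known_words):
    """Rejoin 'Constan-tine' style splits when the joined word is known.

    known_words is a set of lowercase words expected to appear unhyphenated.
    """
    def fix(match):
        joined = match.group(1) + match.group(2)
        # Only rejoin when the result is a word we were told to expect.
        return joined if joined.lower() in known_words else match.group(0)

    return re.sub(r"\b([A-Za-z]+)-([a-z]+)\b", fix, text)


if __name__ == "__main__":
    names = {"constantine"}
    print(rejoin_hyphenated("Constan-tine met a well-known Con-stantine.", names))
```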

John Appel May 16, 2011 at 8:11 pm

I’ll clarify: I have hardcopies of those two books which I’d offer up for scanning. I’d offer to scan’em, but my flatbed scanner gave up the ghost some time back.

TorrentFreak Reader May 16, 2011 at 8:48 pm

Of course, you didn’t steal from “the pirates”, and they didn’t steal from you. You may already be aware of that… But I think the confusion between copying and stealing is a bad thing.

Dave May 16, 2011 at 9:14 pm

Oooh does this mean I’ll be able to buy an e-copy of Solip:System? Splendid!

Tilmon Hocutt May 16, 2011 at 9:23 pm

Yeah I have Angel Station in pb and have a scanner. Guess he already has the other 3 books by him that I have scanned.

TimC May 16, 2011 at 10:11 pm

My local library have Angel Station and Knight Moves. Ping me via email if you’d like me to scan them.

Lee Edward McIlmoyle May 16, 2011 at 10:37 pm

No signal. Sorry. Just wanted to say that, after reading the comments to this post, I have to say, you have some very cool fans.

And kudos to you for making a minus into a plus. Congratulations, and good luck with the eBook ‘re-release’ plan. I truly believe it’s time for writers to take back control of their back catalogues, and yours seems like a very intelligent method of implementing that.

Chris Mills May 16, 2011 at 11:15 pm

I’d be happy to scan any of those three. I have a nice, big, ex-library edition of Days of Atonement and would love a signed WJW edition (I like Angel Station a lot).

Let me know if I can be of assistance.

Bob May 16, 2011 at 11:56 pm

You must not be looking very hard. I found Knight Moves in under 2 minutes. Looks pretty clean too.

I have a little story that I made up, and I’ll tell it to you if you don’t read that much into it. It’s called the Tale of the Pythian Kassandra, and it’s about a priestess of the Delphic Apollo. I like to think of her as a sturdy, big-hipped woman with a straight Grecian nose and a slight mustache, not very bright, good-hearted, a little vague, and new to the job—the priests never chose the Pythia for her brains, you see; they didn’t want the oracle challenging their power. The oracle was open for business only one month out of the year, so we’ll have to picture this story as taking place toward the end of Kassandra’s busy time; the pilgrims have been winding their way up to the temple for weeks now, and Kassandra’s been breathing the inspiring vapors so often she’s half-addled.

wjw May 17, 2011 at 1:30 am

Wow, the one day I get boing’d, I’m traveling and have no Internet! This is posted off a borrowed computer in a hotel lobby.

Okay, I’ve now got multiple copies of Knight Moves, so thanks everybody.

I’ll have to get back to y’all later on the other books, after I get home and have time to organize my response.

Zora May 17, 2011 at 2:33 am

Might be a good idea to organize ourselves and spread this out. There are two books left to do. Contribute two, old, sacrificial books. Cut off the spine with a guillotine and send the pages through the offered document feed scanner. 300 dpi, save as .pngs. One or two people to do the OCR. Two people to proofread each book — two passes. Let WJW do the collecting of email addresses and organizing of the group. The more people involved, the faster we’ll get this done.

I should perhaps mention that I have a new, fast computer and the latest Abbyy. I could do the OCR in under half an hour. But if someone else has the same setup and wants to do it, fine.

If Greg Weeks wants to give the files a final once-over and validate the XHTML, do accept his help. He’s the guy who has organized a lot of the sf at Distributed Proofreaders. You have him to thank for the etexts of H. Beam Piper and Astounding Stories, among other things.

DensityDuck May 17, 2011 at 4:42 am

Oh, and since I forgot to say: I plan to purchase these as soon as they’re made available. I’ll only steal it if I can’t get it (…did that make sense?)

MollyKate May 17, 2011 at 10:01 am

I’ve been producing out of print and hard to find ebooks for years and years – up into the 4 digits long since. The thing most people really don’t realize is that proofreading and formatting are the key to a decent ebook… there IS no way to produce a “clean scan” with only a scanner and AbbyyFineReader. Come on, guys – it’s not brain surgery! Even publishing companies hire proofreaders, or used to, though sometimes these days I wonder.

(Just could not resist putting in my two cents worth, having spent way more energy than I wanted to on trying to get groups of volunteer proofers to do a decent job, before I gave up and started working alone.)

I saw a copy of the original post on a private email list, and I have to say it is the most entertaining of its type I’ve seen. In fact, I don’t think I’ve enjoyed an author letter so much in decades. WJW clearly deserves good clean files of all his books and many sales.

Copyright © 2010 WJW. All Rights Reserved.