Bill Converted to Images so YOU can’t read it
Found the link via Drudge, but it doesn’t seem to be there any more.
I’m having problems believing Congress is even pretending to act in our best interests.
I’m currently looking for optical character recognition software to see if I can’t work towards converting this thing to normal text. I can’t take a substantial bite “in time” but here’s the link: http://www.humanevents.com/article.php?id=30700
Here are the first 3 paragraphs.
Democratic staffers released the final version of the stimulus bill at about 11 p.m. last night after delaying the release for hours to put it into a format which people cannot “search” on their home computers.
Instead of publishing the bill as a regular internet document — which people can search by “key words” and otherwise, the Dems took hours to convert the final bill from the regular searchable format into “pdf” files, which can be read but not searched.
Three of the four .pdf files had no text embedded, just images of the text, which did not permit text searches of the bill. That move to conceal the bill’s provisions had not been remedied this morning at the time of publication of this article. (You can find the entire bill on the House Appropriations website: http://appropriations.house.gov/)
February 16th, 2009 at 4:06 pm
This is paraphrased, but “don’t ascribe to malice what can adequately be explained by ineptitude”. I’m betting somebody who didn’t really know what they were doing converted the doc to pdf.
I’d say the decision to distribute as PDF was actually the right one (though obviously not as images). The statement that it should have been published as “a regular internet document” is actually pretty dumb too. PDF is “a regular internet document”. If what they mean is it should have been published as web pages as well as in a downloadable format, OK, but can you imagine the general Internet user trying to save all those web pages for offline reading, or trying to print it? And don’t even get me started if what they meant was it should be published as a Word doc.
It’s entirely possible they felt they didn’t have time to prepare all the individual web pages (and heaven forbid they tried to publish it as all one page!) in time for initial release. (And hours to prepare it as a PDF? Gimme a break.)
Best answer in my opinion would have been a properly tagged and bookmarked PDF as soon as it was ready, then a properly laid out, standards-compliant series of web pages as soon after as possible. Embarrassing that they didn’t have the technical services available to do that for such an important document.
If anybody really thought that publishing as images was going to keep it from being available in searchable format for more than a couple of hours, they really are incompetent (as you point out, Mike, an hour or so of OCR work would resolve that situation).
I guess the real question in my mind is: did they get any benefit from releasing it, but in a format that wouldn’t be searchable for a couple of hours? i.e. did the delay serve a tactical purpose?
February 16th, 2009 at 4:18 pm
Well, I think it’s regrettably unknowable. I tried downloading it but gave up.
PDFs don’t bother me in the least. It’s a decent (if cumbersome) standard “formal” document format.
There’s now a wiki with the full text and dissection efforts going on.
February 17th, 2009 at 1:59 am
I had a quick look at the .gov page and had no trouble downloading a 650-page PDF that was regular text as expected. Maybe I wasn’t looking at the same thing you were talking about, though.
And no surprise that a full text dissection effort is already in full swing :)
P