Locked out: automatic WordPress upgrade woes

August 2nd, 2008

This morning I decided to do a bit of maintenance work on my blog and noticed that my WordPress installation was out of date with the current release. I was running 2.5.1 and 2.6.0 is now available.

I had used the automatic upgrade plugin for my last upgrade and everything went very smoothly. This time, however, I was unable to log into the admin console after completing the upgrade.

It seems that there is a browser cache issue, or perhaps something specific to Firefox 3. I tried using the lost password feature and received a magic URL via email, but this also failed to work. After finding this post during an initial Google search, I tried getting to my site via Safari, and that worked. Anyhow, perhaps this saves you some panic if you find yourself in the same situation.

Paper describing the weaver package published in Computational Statistics

June 14th, 2008

It seems like a lifetime ago that I developed the weaver package for caching code chunks in Sweave documents. The paper that I presented at DSC 2007 has finally been published in Computational Statistics. The title is Caching Code Chunks in Dynamic Documents: The weaver package. Here’s the abstract:

Authoring dynamic documents can become tedious for authors when a document contains one or more time consuming code chunks and each edit requires reprocessing all of the document. We introduce the weaver package that allows computationally expensive code chunks to be cached in order to speed up the edit/process/review cycle for dynamic documents authored using the Sweave framework.


And here is a link to an unofficial PDF and to the weaver package.


Setting a longer timeout for Net::HTTP Requests in Ruby

April 13th, 2008

In order to exercise a RESTful web service I’ve been working on, I wrote a quick Ruby script to hammer the service with large update requests. After uncovering and fixing a handful of concurrency issues in the server, I started seeing timeout errors in my test script when I sent numerous simultaneous updates. The errors look like:

/opt/local/lib/ruby/1.8/timeout.rb:54:in `rbuf_fill': execution expired (Timeout::Error)
from /opt/local/lib/ruby/1.8/timeout.rb:56:in `timeout'
from /opt/local/lib/ruby/1.8/timeout.rb:76:in `timeout'
from /opt/local/lib/ruby/1.8/net/protocol.rb:132:in `rbuf_fill'
from /opt/local/lib/ruby/1.8/net/protocol.rb:116:in `readuntil'
from /opt/local/lib/ruby/1.8/net/protocol.rb:126:in `readline'
from /opt/local/lib/ruby/1.8/net/http.rb:2017:in `read_status_line'
from /opt/local/lib/ruby/1.8/net/http.rb:2006:in `read_new'
from /opt/local/lib/ruby/1.8/net/http.rb:1047:in `request'
from /opt/local/lib/ruby/1.8/net/http.rb:1034:in `request'
from /opt/local/lib/ruby/1.8/net/http.rb:543:in `start'
from /opt/local/lib/ruby/1.8/net/http.rb:1032:in `request'
from /opt/local/lib/ruby/1.8/net/http.rb:842:in `post'
from ./rndsender.rb:21:in `post_update'
from ./rndsender.rb:76:in `main'
from ./rndsender.rb:80

Adding a rescue as shown below allows you to handle the timeout error:

require 'net/http'

def post_update(path, payload)
  http = Net::HTTP.new(@host, @port)
  res = http.post(path, payload, {'Content-Type' => 'application/xml'})
  case res
  when Net::HTTPSuccess
    puts "update posted"
  else
    res.error!   # raise an exception for any non-success response
  end
rescue Timeout::Error => e
  puts "update timeout error"
end

After searching a bit on the web, I came across this post that had the magic incantation for adjusting the timeout in the Net::HTTP API. Here it is:

http = Net::HTTP.new(@host, @port)
http.read_timeout = 500
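
Putting the two pieces together, here is a minimal sketch of a client that sets its timeouts up front and rescues the timeout error. The helper names build_client and post_with_timeout are made up for illustration, and the 10-second open timeout is an assumed value; only the 500-second read timeout comes from the snippet above.

```ruby
require 'net/http'

# Build a Net::HTTP client with explicit timeouts (values in seconds).
# The 500-second read timeout mirrors the setting above; the 10-second
# open timeout is an illustrative assumption.
def build_client(host, port, read_timeout = 500, open_timeout = 10)
  http = Net::HTTP.new(host, port)   # no connection is opened yet
  http.open_timeout = open_timeout   # max wait while establishing the connection
  http.read_timeout = read_timeout   # max wait for each read from the socket
  http
end

# POST a payload and report a timeout instead of letting it propagate.
# Returns the response, or nil if the request timed out.
def post_with_timeout(http, path, payload)
  http.post(path, payload, { 'Content-Type' => 'application/xml' })
rescue Timeout::Error
  warn "update timeout error"
  nil
end
```

Returning nil on timeout lets the caller decide whether to retry or give up, which is handy in a load-generating script.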

And in case you are interested in actually making use of the timeout, be warned! Read this in which you will learn that

Ruby’s Thread#raise, Thread#kill, and the timeout.rb standard library based on them are inherently broken and should not be used for any purpose. And by extension, net/protocol.rb and all the net/* libraries that use timeout.rb are also currently broken (but they can be fixed).


Free Range Kids: Support for Sane Parenting

April 13th, 2008

Reading my RSS feeds this morning, I came across a post on Schneier on Security about Overestimating Threats Against Children that led me to a great op-ed essay by Lenore Skenazy, Why I Let My 9-Year-Old Ride the Subway Alone. Here’s an excerpt:

I left my 9-year-old at Bloomingdale’s (the original one) a couple weeks ago. Last seen, he was in first floor handbags as I sashayed out the door.

[snip]

No, I did not give him a cell phone. Didn’t want to lose it. And no, I didn’t trail him, like a mommy private eye. I trusted him to figure out that he should take the Lexington Avenue subway down, and the 34th Street crosstown bus home. If he couldn’t do that, I trusted him to ask a stranger. And then I even trusted that stranger not to think, “Gee, I was about to catch my train home, but now I think I’ll abduct this adorable child instead.”

Long story short: My son got home, ecstatic with independence.

The author has started a website, Free Range Kids, as a forum for sane parenting. Since the birth of my son, I have a new understanding of how easy it is to overestimate risks when your kids are involved in the equation. This natural protective instinct is… enhanced by the fear-mongering in the media. Too often, parent support groups seem to reinforce the quick-to-worry, slow-to-think approach in spreading news about the latest threat to our precious ones. So I’m glad to see Skenazy’s essay getting some attention and even happier to see a website focused on making sane choices instead of simply being as safe as possible (although I have started sleeping on the floor since so many Americans die from tragic bed falls).


A quick look at CouchDB Performance

December 16th, 2007

I had a chance to play with CouchDB last week and carry out a few performance-related experiments. The installation and initial configuration went surprisingly smoothly considering that CouchDB is a new project using a language, Erlang, that is not yet standard fare. I used the apt-get targets suggested on the CouchDB wiki, ran configure and make, and issued a few chmod commands so that my test user could access the default locations for CouchDB’s data, log, and run directories. I was up and running within half an hour.

The next pleasant surprise was the web-based database browser that comes with the installation. Not only did it provide immediate feedback that I had a working installation, but the GUI actually allows you to do useful things like create/delete databases and add/edit/remove documents. Here’s a screenshot showing the document view for one of my test documents:


Next, I installed the CouchObject Ruby gem to quickly have a way of working with CouchDB from Ruby. The hardest part here was figuring out that you have to require ‘rubygems’ to use gems. If Ruby can find rubygems after I install it and if I call gem install as root, why can’t Ruby find those packages without additional help? Sigh. Anyhow, once I could load the gem, using it to manipulate CouchDB was no problem.

Performance Experiments

First of all, as is clear from the above, this was my first time using CouchDB and it is entirely possible that I missed some configuration levers that would totally change the numbers I saw. Second, whether the numbers below look good or bad to you will depend on the end use you have in mind — perhaps CouchDB isn’t the right hammer for me.

The test machine is a dual Xeon 2.4GHz server with 4GB RAM running Debian 4.0. The system has one ~60GB SCSI disk. All tests were run with CouchDB 0.7.2.

All tests use a small document template that looks like this:


// Example document
{
"name":12345, // integer
"color":"white",
"type":"washer",
"tstamp":12345 // long
}

For all of the following tests, the client was run on the same server that was running CouchDB.

Test 1: Sequential Document Creation

For Test 1, I created new documents in an empty database one after another in a loop.


=== Add single doc in a loop ===
| N | sec | Docs/sec |
|--------+------+----------|
| 1000 | 9 | 111 |
| 10000 | 102 | 98 |
| 100000 | 1075 | 93 |
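
For reference, a timing loop of this kind can be sketched as follows. This is not the script used for the numbers above: make_doc and time_sequential are hypothetical helpers, the field values are illustrative, and the block passed in stands for whatever client call actually creates a document.

```ruby
require 'json'

# Build one test document shaped like the template above.
# The field values are illustrative.
def make_doc(i)
  { 'name' => i, 'color' => 'white', 'type' => 'washer',
    'tstamp' => Time.now.to_i }
end

# Call the given block once per document, one after another, and
# report throughput. The block is expected to send the JSON string
# to the server; here it is left to the caller.
def time_sequential(n)
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  n.times { |i| yield JSON.generate(make_doc(i)) }
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
  { :n => n, :sec => elapsed, :docs_per_sec => n / elapsed }
end
```

Keeping the send step in a block means the same harness can time any client library, and the Docs/sec column is simply N divided by the elapsed seconds.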

The times scale linearly, but I was surprised to see a 1.3GB file size for the database in the 100K case. Given the small example document, I estimated an optimistic lower bound on the space required for storage:


"name" => 4B
"color" => 9B (5B for "white", 4B int for length)
"type" => 10B
"tstamp" => 8B
"_id" => 36B (assume internally assigned ID stored
as 32B string + length)

The total is 67B/doc, but let’s call it 100B/doc to have a nice round number. So 100K documents translates to 9MB. A final DB size of 1.3GB implies an inflation factor of 148. I know disk space is cheap, but is it that cheap?
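
The arithmetic above can be reproduced with a few lines of Ruby. This sketch assumes binary units (1MB = 1024 * 1024 bytes) and rounds the estimate down to whole megabytes, which is how the 9MB and 148 figures come out.

```ruby
# Estimated bytes per field, taken from the list above.
FIELD_BYTES = {
  'name'   => 4,
  'color'  => 9,
  'type'   => 10,
  'tstamp' => 8,
  '_id'    => 36
}

MB = 1024 * 1024                                  # assumption: binary megabytes

per_doc      = FIELD_BYTES.values.sum             # sum of the per-field estimates
rounded_doc  = 100                                # rounded up, as in the text
estimated_mb = (100_000 * rounded_doc) / MB       # integer division drops the fraction
actual_mb    = 1.3 * 1024                         # 1.3GB expressed in MB
inflation    = (actual_mb / estimated_mb).round   # observed size / estimated size

puts "per doc: #{per_doc}B, estimate: #{estimated_mb}MB, inflation: #{inflation}x"
```

Keeping the fraction (about 9.5MB) instead of rounding down gives a factor closer to 140, which does not change the conclusion.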

Test 2: Bulk Document Creation

For Test 2, I used the _bulk_docs API to create each set of documents in one call.


=== Add N docs in bulk ===
| N | sec | Docs/sec |
|--------+-----+----------|
| 1000 | 0.6 | 1667 |
| 10000 | 15 | 667 |
| 100000 | NA | NA |

When attempting to add 100K documents in bulk the client said:


/usr/lib/ruby/1.8/timeout.rb:54:in `rbuf_fill': execution expired (Timeout::Error)

And the server said:


eheap_alloc: Cannot allocate 729810240 bytes of memory (of type "heap").
Aborted

Here I was surprised both that 100K documents was enough to exhaust the system’s memory and that the result was a complete crash of the server. I wonder how hard it would be to take advantage of Erlang’s touted fault-tolerance features to make the server more robust in such situations.

Test 3: Repeated Bulk Document Creation

In Test 3, I ran a loop that added 1000 documents per iteration via the bulk API. I was able to add 200K documents in 96 seconds. The big surprise was that this resulted in a DB size of 38MB (an inflation factor of 2). It is not clear to me why the DB size varies so much between the document-at-a-time and bulk access methods. But it does suggest that if you know how to use CouchDB, you can get very good performance and that, as CouchDB developer Damien Katz puts forth in this post on his blog, there is ample room for some big optimizations in CouchDB.
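
The batching approach from Test 3 can be sketched like this. The bulk_payloads helper is hypothetical, and the {"docs": [...]} request body is the shape documented for current CouchDB releases; I am assuming it here and it may not match what 0.7.2 expects.

```ruby
require 'json'

BATCH_SIZE = 1000   # documents per bulk request, as in Test 3

# Split the documents into batches and wrap each batch in a JSON
# request body for the _bulk_docs API. The {"docs": [...]} shape is
# an assumption based on current CouchDB documentation.
def bulk_payloads(docs, batch_size = BATCH_SIZE)
  docs.each_slice(batch_size).map do |batch|
    JSON.generate('docs' => batch)
  end
end

# Illustrative documents following the template used in the tests.
docs = (1..2500).map do |i|
  { 'name' => i, 'color' => 'white', 'type' => 'washer', 'tstamp' => i }
end

payloads = bulk_payloads(docs)
puts "#{docs.size} docs -> #{payloads.size} bulk requests"
```

Each payload would then be POSTed to the database's _bulk_docs URL. At 1000 documents per request, the 200K documents in Test 3 come to 200 requests, or roughly 2000 documents per second over 96 seconds.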

Conclusions

  1. CouchDB was easy to install and use. The available documentation was helpful.
  2. Performance-wise, it is fairly easy to discover limitations of the current system (as is completely reasonable, IMO, for an alpha release), especially for use cases involving lots of documents.
  3. My takeaway is that CouchDB is a compelling solution for many use cases, but not yet ready for large collections with high throughput demands. I suspect that is going to change as performance optimizations are implemented and replication and partitioning features are added in the future.


