Archive for the ‘programming’ Category

New sharding support (table fragmentation) in Erlang’s Mnesia database?

Sunday, August 10th, 2008

I was just browsing through the Mnesia User’s Guide and was surprised to see a lot more detail on the features supporting table fragmentation, Mnesia’s term for data sharding. It sure looks like Mnesia has an impressive feature set for building a fault-tolerant large data store. In particular, it looks like you can manage the set of nodes hosting the db as well as the number of fragments for a given table, and add or remove both at run time. I need to find some time to take it for a test drive. Is it easy to use? Is it fast enough to use as the backing store of a high performance web service?


A quick look at CouchDB Performance

Sunday, December 16th, 2007

I had a chance to play with CouchDB last week and carry out a few performance-related experiments. The installation and initial configuration went surprisingly smoothly considering that CouchDB is a new project using a language, Erlang, that is not yet standard fare. I used the apt-get targets suggested on the CouchDB wiki, ran configure and make, and issued a few chmod commands so that my test user could access the default locations for CouchDB’s data, log and run directories. I was up and running within half an hour.

The next pleasant surprise was the web-based database browser that comes with the installation. Not only did it provide immediate feedback that I had a working installation, but the GUI actually allows you to do useful things like create/delete databases and add/edit/remove documents. Here’s a screenshot showing the document view for one of my test documents:

[Screenshot: CouchDB document view for a test document]

Next, I installed the CouchObject Ruby gem to quickly have a way of working with CouchDB from Ruby. The hardest part here was figuring out that you have to require ‘rubygems’ to use gems. If Ruby can find rubygems after I install it and if I call gem install as root, why can’t Ruby find those packages without additional help? Sigh. Anyhow, once I could load the gem, using it to manipulate CouchDB was no problem.

Performance Experiments

First of all, as is clear from the above, this was my first time using CouchDB and it is entirely possible that I missed some configuration levers that would totally change the numbers I saw. Second, whether the numbers below look good or bad to you will depend on the end use you have in mind — perhaps CouchDB isn’t the right hammer for me.

The test machine is a dual Xeon 2.4GHz server with 4GB RAM running Debian 4.0. The system has one ~60GB SCSI disk. All tests were run with CouchDB 0.7.2.

All tests use a small document template that looks like this:


// Example document
{
"name":12345, // integer
"color":"white",
"type":"washer",
"tstamp":12345 // long
}

For all of the following tests, the client was run on the same server that was running CouchDB.

Test 1: Sequential Document Creation

For Test 1, I created new documents in an empty database one after another in a loop.


=== Add single doc in a loop ===
| N | sec | Docs/sec |
|--------+------+----------|
| 1000 | 9 | 111 |
| 10000 | 102 | 98 |
| 100000 | 1075 | 93 |

The times scale linearly, but I was surprised to see a 1.3GB file size for the database in the 100K case. Given the small example document, I estimated an optimistic lower bound on the space required for storage:


"name" => 4B
"color" => 9B (5B for "white", 4B int for length)
"type" => 10B
"tstamp" => 8B
"_id" => 36B (assume internally assigned ID stored
as 32B string + length)

The total is 67B/doc, but let’s call it 100B/doc to have a nice round number. So 100K documents translates to about 10MB. A final DB size of 1.3GB implies an inflation factor of roughly 130. I know disk space is cheap, but is it that cheap?
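
The back-of-the-envelope estimate above can be reproduced in a few lines of Python. This is only a sketch of the arithmetic: the per-field sizes are the ones assumed in the list above, the function and variable names are made up for illustration, and the 1.3GB figure is treated as decimal gigabytes (an assumption). Depending on how you round and which units you use, the factor lands somewhere between roughly 130 and 150.

```python
# Per-field size assumptions in bytes, copied from the estimate above.
FIELD_BYTES = {
    "name": 4,     # integer
    "color": 9,    # 5B for "white" + 4B length
    "type": 10,    # 6B for "washer" + 4B length
    "tstamp": 8,   # long
    "_id": 36,     # 32B string + 4B length
}


def estimated_bytes(num_docs, bytes_per_doc):
    """Optimistic lower bound on storage for num_docs documents."""
    return num_docs * bytes_per_doc


def inflation_factor(actual_bytes, estimate_bytes):
    """How many times larger the file on disk is than the estimate."""
    return actual_bytes / estimate_bytes


raw_per_doc = sum(FIELD_BYTES.values())   # sum of the field sizes
rounded_per_doc = 100                     # rounded up to a convenient number
estimate = estimated_bytes(100_000, rounded_per_doc)
actual = 1.3e9                            # assumption: 1.3GB read as decimal GB
factor = inflation_factor(actual, estimate)

print(raw_per_doc, estimate, round(factor))
```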

Test 2: Bulk Document Creation

For Test 2, I used the _bulk_docs API to create each set of documents in one call.


=== Add N docs in bulk ===
| N | sec | Docs/sec |
|--------+-----+----------|
| 1000 | 0.6 | 1667 |
| 10000 | 15 | 667 |
| 100000 | NA | NA |

When attempting to add 100K documents in bulk the client said:


/usr/lib/ruby/1.8/timeout.rb:54:in `rbuf_fill': execution expired (Timeout::Error)

And the server said:


eheap_alloc: Cannot allocate 729810240 bytes of memory (of type "heap").
Aborted

Here I was surprised both that 100K documents was enough to exhaust the system’s memory and that the result was a complete crash of the server. I wonder how hard it would be to take advantage of Erlang’s touted fault tolerant features to make the server more robust to such situations.

Test 3: Repeated Bulk Document Creation

In Test 3, I ran a loop in which each iteration added 1000 documents via the bulk API. I was able to add 200K documents in 96 seconds (roughly 2,000 docs/sec). The big surprise was that this resulted in a DB size of 38MB (an inflation factor of 2). It is not clear to me why the DB size varies so much between the document-at-a-time and bulk access methods. But it does suggest that if you know how to use CouchDB, you can get very good performance and that, as CouchDB developer Damien Katz puts forth in this post on his blog, there is ample room for some big optimizations in CouchDB.
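
The batching pattern from Test 3 can be sketched as follows. The tests here were driven from Ruby via CouchObject; this sketch uses Python's standard library instead, purely for illustration. The helper names, the base URL, and the database name are hypothetical, and the request body shape (a JSON object with a "docs" array posted to _bulk_docs) is an assumption that should be checked against the CouchDB version you run.

```python
import json
import urllib.request


def chunked(items, size):
    """Yield successive lists of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def make_doc(i):
    # Same shape as the small example document used in these tests.
    return {"name": i, "color": "white", "type": "washer", "tstamp": i}


def bulk_body(batch):
    # Assumed request format: {"docs": [...]} encoded as JSON.
    return json.dumps({"docs": batch}).encode("utf-8")


def post_bulk(base_url, db, batch):
    # base_url and db are placeholders, e.g. "http://localhost:5984" and "testdb".
    request = urllib.request.Request(
        f"{base_url}/{db}/_bulk_docs",
        data=bulk_body(batch),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status


def load(base_url, db, total, batch_size=1000):
    """Insert `total` documents, one bulk request per `batch_size` documents."""
    docs = [make_doc(i) for i in range(total)]
    for batch in chunked(docs, batch_size):
        post_bulk(base_url, db, batch)
```

Keeping each request to a fixed number of documents bounds how much the server has to hold in memory at once, which is the difference between this test and the single 100K-document call in Test 2.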

Conclusions

  1. CouchDB was easy to install and use. The available documentation was helpful.
  2. Performance-wise, it is fairly easy to discover limitations of the current system (as is completely reasonable, IMO, for an alpha release) especially for use cases involving lots of documents.
  3. My takeaway is that CouchDB is a compelling solution for many use cases, but not yet ready for large collections with high throughput demands. I suspect that is going to change as performance optimizations are implemented and replication and partitioning features are added in the future.


Amazon announces SimpleDB, Momentum Builds for Simple Databases

Friday, December 14th, 2007

Adding to the momentum behind non-relational simple databases, Amazon announced its SimpleDB web service product. As described on the SimpleDB main page,

Amazon SimpleDB is a web service for running queries on structured data in real time. This service works in close conjunction with Amazon Simple Storage Service (Amazon S3) and Amazon Elastic Compute Cloud (Amazon EC2), collectively providing the ability to store, process and query data sets in the cloud. These services are designed to make web-scale computing easier and more cost-effective for developers.

SimpleDB shares a number of similarities with CouchDB, which describes itself on the CouchDB Quick Overview as:

A document database server, accessible via a RESTful JSON API.
Ad-hoc and schema-free with a flat address space.
Distributed, featuring robust, incremental replication with bi-directional conflict detection and management.
Query-able and index-able, featuring a table oriented reporting engine that uses Javascript as a query language.

Both provide a simple RESTful API, although SimpleDB looks like it is more XML based whereas CouchDB uses JSON. Both provide access to an ad-hoc database of items (or documents in CouchDB parlance) that consist of key/value pairs and each provides a mechanism to query the items by their contents. And both are implemented using Erlang (CouchDB for sure, SimpleDB according to this post on Inside Looking Out).

While the similarity between SimpleDB and CouchDB is quite evident, it wasn’t until I read over the Detailed Description section of the SimpleDB main-page that I realized that document databases, where documents consist of key/value pairs, are really very close to Google’s BigTable concept (you just have to turn your head and squint a bit). To get a feel for this, take a look at the description of Hbase’s data model (Hbase is an open source BigTable-like simple database that integrates with Hadoop) and compare to SimpleDB and CouchDB.
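
One way to see the resemblance is to flatten a key/value document into BigTable-style cells, where each value is addressed by a (row key, column, timestamp) triple. The snippet below is only an illustrative sketch: the function names, the row key, and the timestamp are made up, and real systems such as BigTable or Hbase add column families, versioning, and sorted storage on top of this idea.

```python
def document_to_cells(doc_id, doc, timestamp):
    """Flatten a key/value document into ((row, column, timestamp), value) cells."""
    return [((doc_id, key, timestamp), value) for key, value in sorted(doc.items())]


def cells_to_document(cells):
    """Rebuild the key/value document from the cells of a single row."""
    return {column: value for (_row, column, _ts), value in cells}


doc = {"name": 12345, "color": "white", "type": "washer"}
cells = document_to_cells("doc-1", doc, 1197590400)  # arbitrary example timestamp

for cell in cells:
    print(cell)
```

Read this way, a document ID plays the role of the row key and each attribute name plays the role of a column, which is the "squint" needed to line up the two models.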

Aside from the buzz, I’ve been thinking a lot recently about simple databases and how a fast, highly scalable, and flexible key/value store is an essential component of just about any serious web application. And I’ve been lamenting that an open source implementation that can be used as a building block of web-scale applications doesn’t yet exist (although in time, projects like CouchDB, Hbase, and ThruDB may fill this void). Despite not being free, nor open source, perhaps Amazon’s SimpleDB is the building block I’ve been looking for. But I’m not sure.

For one thing, I’m wary of ending up with a web app that is tightly coupled to AWS. EC2 makes a lot of sense to me because the boundary between your app and AWS is clear. In that model you can run your app on your servers and deploy extra nodes to EC2 when you need more power. But without a local alternative to SimpleDB, one would have to be very careful not to end up with an app that can only run on AWS. That also complicates the development process, since you can’t develop and test offline. The cost model of SimpleDB is attractive, so in the end I guess my concern boils down to not having a non-AWS, local-only solution…

Another aspect that I’m uncertain of is the choice of XML (or JSON for CouchDB). For real-time processing of large volumes of documents, I think it may make more sense to have a more compact data representation and have interfaces that are more integrated into the programming languages being used. For this, I really like what the Facebook developers have made available in the Thrift project. Although such an approach makes the schema somewhat less flexible, I’d really like to see a simple database that makes use of Thrift and focuses on speed and scalability. Sounds like a perfect project to dive into learning Erlang in the spare time I don’t have :-)
Edited to add:
This post has a more detailed SimpleDB vs CouchDB comparison.


ctime in Unix Means Last Change Time, Not Create Time

Sunday, November 18th, 2007

I’ve always thought that “ctime” provides the creation time of a file on a Unix filesystem, and I’ve always been wrong about that. A better mnemonic is change time, since the ctime indicates the last time a file’s metadata (inode) was changed. It isn’t as if this information is deeply hidden. Indeed, if you read the man page for stat, you will likely find a fairly straightforward description. For example, on OS X you will see:


# cut from "man stat"

st_ctime Time when file status was last changed (inode data modification).
Changed by the chmod(2), chown(2), link(2), mknod(2), rename(2),
unlink(2), utimes(2) and write(2) system calls.

The Linux man page for stat summarizes ctime behavior nicely: “The field st_ctime is changed by writing or by setting inode information (i.e., owner, group, link count, mode, etc.).”
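
A quick way to check this behavior is to change only a file's metadata and compare the timestamps before and after. The sketch below uses Python's os.stat and assumes a Unix-like system (on Windows, st_ctime means something different); the function name is made up for illustration, and the short sleep is only there so the clock moves past the filesystem's timestamp granularity.

```python
import os
import tempfile
import time


def stat_around_chmod():
    """Create a file, chmod it, and return the stat results before and after."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"hello")
        os.close(fd)
        before = os.stat(path)
        time.sleep(0.1)          # let the clock advance a little
        os.chmod(path, 0o644)    # metadata-only change: no file data is written
        after = os.stat(path)
        return before, after
    finally:
        os.remove(path)


before, after = stat_around_chmod()
print("mtime changed:", after.st_mtime_ns != before.st_mtime_ns)
print("ctime changed:", after.st_ctime_ns != before.st_ctime_ns)
```

On a typical Unix filesystem the chmod leaves mtime alone and moves ctime forward, which is exactly why "creation time" is the wrong reading.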

So if this information is so easy to find, why have I (and I suspect I’m not completely alone on this misconception) been operating with a faulty ctime definition all this time? I think it is because even computer programmers are human. Our primary mode of understanding the world (and this includes computer systems) is to observe patterns and tell ourselves stories about what is happening and why. As you can confirm from reading any email help forum, looking and guessing comes naturally to us humans, reading fine manuals does not. So I saw the pattern atime, ctime, and mtime and I made up a story about access time, creation time (bzzt, wrong), and modification time. It’s a sensible story that happens to be wrong.

I don’t think there is anything wrong with the looking and guessing approach — even for programmers. In fact, I think good software engineers develop a keen intuition about computer systems and use it to great advantage. Making guesses and quickly verifying them is often much faster than finding, reading, and absorbing all of the fine manuals out in the world. But it is important to remember that when things don’t work as expected, we should assess our assumptions and RTFM as soon as possible.

Part of git history

Sunday, August 5th, 2007

I’ve been using git as my primary version control system for a number of months now. Even though most of the projects I interact with use Subversion, I’m able to live in a git world by using git-svn to pull commits from svn repositories and send commits back. And now I’m officially part of the history of git. By which I mean that a completely trivial patch that I wrote has been integrated into the git source tree.

So what’s so interesting about git? Here are a few thoughts…

Speed. I spend a lot of time working with the Bioconductor code base, which consists of 243 (and growing) contributed packages all in the same directory in an svn repository. Both an update (svn update) and a diff (svn diff) are quite time consuming in this configuration. For example, re-running ‘svn up’ after completing an update takes about 14 seconds on my laptop. The equivalent operation using ‘git svn rebase’ takes 2 seconds. Obtaining a complete status (‘svn status’) takes 4s vs 2s with git (and git’s diff operation is even faster).

Fast local branches. It is very easy and fast to create local branches in your git repository where you can work on a particular feature or bug fix. This allows me to work on a number of fixes/features at the same time without intermingling the changes; I switch from one local branch to the other and keep work separate. This is much faster and cheaper than having multiple svn working copies. Perhaps more important is the git-rebase command that makes it easy to maintain a series of patches (commits) on a local branch and continue to track upstream changes.

Improved log functionality. Since the entire repository is local, log and diff in git are much much faster than svn and are still available when you are offline. In addition, git provides a really convenient view of history that includes the commit log, the diffstat giving a summary of the files changed in the commit, and the actual patch. This makes it possible for me to review commits across a large repository very quickly. This log feature alone is worth using git for if you need to keep tabs on a large project.

So how do you get started? Here are a few tips.

Install git
Git doesn’t use a standard configure script. Instead, you can find all of the options in config.mak.in. You should copy this to config.mak and edit it to suit your needs. On my OS X laptop I have the following in config.mak

prefix=/Users/seth/scm
NO_EXPAT=nope

Then I just ‘make && make install’

Install Subversion with the Perl swig bindings
I built the latest Subversion from source and followed the directions in ‘subversion/bindings/swig/INSTALL’ to install the Swig bindings (you will need swig). This worked fine for me on OS X, but I have never succeeded in building the Perl bindings on our SuSE Linux servers without resorting to various tricks to add libs to the link command. If you encounter problems, contact me and perhaps I can help.

Clone a Subversion Repository
The following command will clone a Subversion repository and set it up for use with git-svn. What you are doing here is importing all Subversion history into a local git repository, so depending on the repository, this can take a while. What is surprising is that the git repository size, including all history, is often the same as or not much larger than an svn working copy.

git svn clone http://someserver.com/path/to/svn/project/trunk myproj

In the resulting git repository, myproj, there should be a master branch which tracks an internal-use-only branch called remotes/git-svn.

Update the master branch to latest upstream changes

git svn rebase

View some changes

git log --stat -p

You can also set some configuration options and get nice colorization of various git outputs, like diff.

Warning
Git is not Subversion and follows a radically different model in that it is a distributed SCM rather than a centralized SCM. It is really worth the time to read up a bit on git to get a sense of how it works. It will take some getting used to.