Last updated Aug 28, 2008 12:14 am GMT-5

Zope Serving Binary Large Objects from the Filesystem - Benchmarks and Patches

We all know Zope is slow at serving large binaries. And sometimes people like to put files on the filesystem instead of in the ZODB for one reason or another, but this often slows things down further.

Just how slow are we talking?

Let's look at the built-in File class and several popular products that store data on the filesystem instead of in the ZODB. I tried ExtFile 1.1.3; the lesser-known but much enhanced ExtFile 1.4.0; LocalFS-1.2-andreas (the only version I've got running with Zope 2.7); the Filesystem Directory View (FSDV) provided by CMF 1.4.2 (CMFCore/FSFile.py); and finally, a newcomer, ExternalFolder 0.1. As a point of reference, I also benchmarked Apache 2.

(Note: If you should come across LocalFS version 1.1, which is somewhat hidden on the LocalFS SourceForge page, do NOT deploy it! It has a showstopper bug: for every download of a file of size N bytes, it writes at least N bytes to the ZODB! No, I'm not making this up.)

(Note 2: Andreas' patched LocalFS-1.1, which bears little resemblance to the buggy LocalFS-1.1 from SourceForge, is available at http://www.easyleading.org/Downloads/LocalFS-1.1-andreas.tar.gz ... but due to spammer abuse, the site blocks several million North American users, including me :-( I happened to find an accessible tarball of the CVS version here ... note the "Download Tarball" link at bottom.)

The following table shows requests per second as measured by ab (Apache Bench) version 1.3. Measurements were taken with ab -n 100 -c 10. Tests were run on a Pentium III 1.2 GHz laptop with 512 MB RAM running Gentoo Linux.

Results are shown sorted by best performance on the largest file (right-most column). Results measured in requests per second (higher is better).

size (kb):                      5       50      500     5000    50000
Apache2                   2083.33  1818.18   561.80    85.32     8.05
ExtFile 1.4.0              151.75   137.93    35.57     5.66     0.09
LocalFS-1.2-andreas        138.89   120.77    40.31     6.44     0.08
ExternalFolder 0.1         152.44   138.89    35.68     6.08     0.08
File (OFS.Image.File)      174.52   156.49    38.36     7.70     0.07
ExtFile 1.1.3              162.07   139.28    25.73     0.80     0.03*
FSDV (CMFCore.FSFile)       94.16    90.50    26.02     0.83     0.03**
*(188 failed requests)
**(189 failed requests)

This benchmark does not show the whole sorry truth of the situation; FSDV, ExtFile 1.1.3, and LocalFS all commit the sin of loading the entire file into memory and returning it all at once. Good way to eat up your server's memory! On my hardware, FSDV and ExtFile-1.1.3 were unable to stand up to the load on the largest file; most of the requests timed out.

ExtFile 1.4.0 does better - it streams the file in 64k chunks using RESPONSE.write().
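To make the difference concrete, here is a minimal sketch in stand-alone Python 3. FakeResponse is a hypothetical stand-in for Zope's RESPONSE that only records write() calls, and the "file" is an in-memory buffer; neither function is taken from any of the products above.

```python
import io

CHUNK_SIZE = 64 * 1024  # 64k, the chunk size mentioned above


class FakeResponse:
    """Hypothetical stand-in for Zope's RESPONSE; it only records write() calls."""

    def __init__(self):
        self.chunks = []

    def write(self, data):
        self.chunks.append(data)


def serve_all_at_once(fileobj):
    """The memory-hungry pattern: the whole file becomes one string."""
    return fileobj.read()


def serve_in_chunks(fileobj, response, chunk_size=CHUNK_SIZE):
    """Stream the file; only one chunk is read into memory at a time."""
    while True:
        chunk = fileobj.read(chunk_size)
        if not chunk:
            break
        response.write(chunk)


payload = b"x" * (200 * 1024)  # a 200k "file" for the demo
response = FakeResponse()
serve_in_chunks(io.BytesIO(payload), response)
largest_chunk = max(len(c) for c in response.chunks)
```

With the streaming version, the largest piece handed to the response is one chunk, no matter how big the file is.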

(LocalFS 1.0 did something even weirder - load the whole file into memory, then wrap the data up in a temporary zope File object! Crazy stuff. Andreas' version seems to perform marginally better, even though it does the same thing; I haven't looked at why.)

A Partial Solution

At Pycon 2004, Chris McDonough and I played with one way to improve the situation. Chris added support for iterators to ZPublisher and ZServer. Zope code that wants ZPublisher to stream data can now return one of these instead of a string. More information at http://dev.zope.org/Wikis/DevSite/Proposals/FasterStaticContentServing ... hmm maybe i should move this page there.

So, here's my experiment: I've patched ExtFile, FSDV, LocalFS, and ExternalFolder to return an iterator (Chris's ZPublisher.Iterators.filestream_iterator, to be precise). I also instantiated my FileCacheManager to give the same benefit to File objects associated with it. Here are the results, measured in requests per second (higher is better).
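The shape of such a patch can be sketched roughly as follows. This is stand-alone Python 3, not actual Zope code: FileStreamIterator and index_html are hypothetical stand-ins, and the real ZPublisher.Iterators.filestream_iterator may differ in detail. The sketch assumes the publisher pulls chunks from the returned object after the method has returned, so the method sets Content-Length up front.

```python
import os
import tempfile


class FileStreamIterator:
    """Hypothetical file iterator; yields the file in fixed-size chunks."""

    def __init__(self, path, streamsize=1 << 16):
        self._file = open(path, "rb")
        self._size = os.path.getsize(path)
        self.streamsize = streamsize

    def __iter__(self):
        return self

    def __next__(self):
        data = self._file.read(self.streamsize)
        if not data:
            self._file.close()
            raise StopIteration
        return data

    def __len__(self):
        # Total size in bytes, so the caller can set Content-Length.
        return self._size


def index_html(path, headers):
    """Hypothetical patched download method: return an iterator, not a string."""
    iterator = FileStreamIterator(path)
    headers["Content-Length"] = str(len(iterator))
    return iterator


# Demo with a temporary 150000-byte file; "headers" is a plain dict here.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"y" * 150000)
    demo_path = tmp.name

headers = {}
body = index_html(demo_path, headers)
chunks = list(body)  # what a publisher would do, one chunk at a time
os.remove(demo_path)
```

The method itself returns almost immediately; the chunk-by-chunk reading happens when the consumer iterates over the result.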

size (kb):                              5       50      500     5000    50000
Apache2                           2083.33  1818.18   561.80    85.32     8.05
File (OFS.Image.File) (CACHED)     155.52   148.59    82.03    17.52     2.01
ExtFile 1.1.3 (PATCHED)            169.49   156.49    81.43    17.97     1.93
ExternalFolder 0.1 (PATCHED)       165.02   138.50    57.37    12.63     1.82
FSDV (CMFCore.FSFile) (PATCHED)     94.97    89.13    56.27    14.33     1.70
ExtFile 1.4.0 (PATCHED)            162.34   135.50    57.08    10.31     1.68
LocalFS-1.2-andreas (PATCHED)      137.17    92.25    54.95    10.66     1.17
LocalFS-1.2-andreas                138.89   120.77    40.31     6.44     0.08
ExtFile 1.4.0                      151.75   137.93    35.57     5.66     0.09
ExternalFolder 0.1                 152.44   138.89    35.68     6.08     0.08
File (OFS.Image.File)              174.52   156.49    38.36     7.70     0.07
ExtFile 1.1.3                      162.07   139.28    25.73     0.80     0.03*
FSDV (CMFCore.FSFile)               94.16    90.50    26.02     0.83     0.03**
*(188 failed requests)
**(189 failed requests)

Summary:

Much better for big files, generally the same or worse for tiny files. For the 50 MB file, the slowest patched product (LocalFS, 1.17 requests per second) is about 13 times faster than the fastest unpatched product (ExtFile 1.4.0, 0.09 requests per second)! And as Chris M. pointed out, it's fast enough to saturate a T3 line.
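As a rough sanity check on the T3 remark, the requests-per-second figures can be converted to bandwidth. This is a back-of-the-envelope sketch that assumes 1 kb = 1024 bytes and a nominal T3 rate of 44.736 Mbit/s; it ignores HTTP and TCP overhead.

```python
T3_MBIT_PER_SEC = 44.736  # nominal T3 (DS3) line rate


def throughput_mbit(requests_per_sec, size_kb):
    """Convert a benchmark score to megabits per second (assumes 1 kb = 1024 bytes)."""
    return requests_per_sec * size_kb * 1024 * 8 / 1e6


# 50000 kb column: slowest patched (LocalFS) and fastest unpatched (ExtFile 1.4.0)
slowest_patched = throughput_mbit(1.17, 50000)
fastest_unpatched = throughput_mbit(0.09, 50000)

print("slowest patched:   %.1f Mbit/s" % slowest_patched)
print("fastest unpatched: %.1f Mbit/s" % fastest_unpatched)
print("T3 line:           %.1f Mbit/s" % T3_MBIT_PER_SEC)
```

Under those assumptions the slowest patched result works out to roughly 479 Mbit/s, well above a T3, while the fastest unpatched result (about 37 Mbit/s) falls short of it.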

There are some remaining mysteries (Why does FSDV have so much more overhead on smaller files than the others? Why is ExtFile-1.4 slower than ExtFile-1.1.3?), but given the triviality of the patches, I'm pretty happy.

Here are the Patches:

No warranty, blah blah. None of these are well-tested aside from this benchmark.

Next problem: ZEO

We all know that if you want to run a scalable, high-performance Zope site, you need to run ZEO, right?

There is one gotcha. Large binaries in the ZODB are much worse when ZEO is in the picture.

Actually that's an oversimplification - if some or most of the Pdata objects are in the ZEO cache, the speed is almost identical to that of FileStorage. But as soon as you hit a large blob that isn't in the ZEO cache at all, it gets much slower.

UPDATE 2008/08/28: It's not nearly as bad as it used to be. Zope 2.11 performs much much better than Zope 2.7 in this regard, because a relevant bug has been fixed in ZEO. For large uncached files, ZEO in Zope 2.11 is about 1/5 the speed of FileStorage, as seen in this table.

Scores are requests-per-second, higher is better. For this test, ZEO and Zope were run on the same box. I ran this with ab2 -n1 since I was only interested in uncached results.

size (kb):                                    32     256   32000
File (via FileStorage; Zope 2.11)         111.11   93.62    1.33
File (via ZEO.ClientStorage; Zope 2.11)   121.12   49.61    0.27

For comparison, here's a similar test with Zope 2.7 on older hardware. Again, ZEO was running on localhost and I used ab -n 1. Results measured in requests per second (higher is better). Note that for the largest file, ZEO was 25 times slower than FileStorage.

size (kb):                                   12     324    1216   38332
File (via FileStorage; Zope 2.7)         125.00   47.62   14.08    0.50
File (via ZEO.ClientStorage; Zope 2.7)    38.46    3.26    0.90    0.02

Mitigation for ZEO

  1. Upgrade to Zope 2.10 or later.
  2. In zope.conf, increase your zeo-client cache size to as much disk space as you can spare (but no more than the size of Data.fs). This will reduce the likelihood of hitting uncached data.
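For the second item, the setting lives in the zeoclient section of zope.conf. The fragment below is only a sketch modelled on the stock zope.conf layout: the server address, storage name, and the 500MB figure are placeholder assumptions, so substitute values that match your own ZEO server and Data.fs size.

```
<zodb_db main>
  mount-point /
  <zeoclient>
    # placeholder address of the ZEO server
    server localhost:8100
    storage 1
    name zeostorage
    var $INSTANCE/var
    # on-disk ZEO client cache; the default is much smaller
    cache-size 500MB
  </zeoclient>
</zodb_db>
```

Zope needs a restart for the new cache size to take effect.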
