I want to attach a USB stick to the AVM Fritz!Box 7170 as USB storage and write to it through the integrated FTP server. When writing a bunch of files, write performance drops to under 50 KB/s, although the stick can easily sustain 512 KB/s. Why is the performance so bad, and why the drop?
I replaced the stock AVM firmware with Freetz but got similar results. What caught my attention is a drop in performance that sets in with the fourth file and does not recover over time. The following tests were done using the Freetz modification with Linux kernel 2.6.13.1-ohio.
Performance drop when writing
Look at these numbers when copying a bunch of files to the stick using scp:
$ scp tmp* root@fritz.box:/var/media/ftp/uStor00/
tmp1 100% 2048KB 682.7KB/s 00:03
tmp2 100% 2048KB 512.0KB/s 00:04
tmp3 100% 2048KB 512.0KB/s 00:04
tmp4 100% 2048KB 55.4KB/s 00:37
tmp5 100% 2048KB 38.6KB/s 00:53
Every following transfer then stays at roughly 40 to 55 KB/s. Issuing a sync command to flush out dirty buffers makes no difference, so the slowdown is not caused by the USB stick still being busy with earlier writes.
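As a sanity check, the rates scp reports can be recomputed from the file size and the elapsed time. This is a minimal sketch meant to run on a workstation, not on the box; the sizes and durations are copied from the scp output above, and the names transfers and rate_kb_per_s are only illustrative.

```python
# Recompute the average transfer rates from the scp output above.
# Each entry is (file name, size in KB, elapsed seconds) as printed by scp.
transfers = [
    ("tmp1", 2048, 3),
    ("tmp2", 2048, 4),
    ("tmp3", 2048, 4),
    ("tmp4", 2048, 37),
    ("tmp5", 2048, 53),
]


def rate_kb_per_s(size_kb, seconds):
    """Average throughput in KB/s for one transfer."""
    return size_kb / seconds


rates = {name: rate_kb_per_s(size, secs) for name, size, secs in transfers}

for name, rate in rates.items():
    print(f"{name}: {rate:6.1f} KB/s")

# How much slower is the last transfer compared to the last fast one?
slowdown = rates["tmp3"] / rates["tmp5"]
print(f"tmp5 is about {slowdown:.0f}x slower than tmp3")
```

The recomputed values match the scp columns, so the drop from tmp3 to tmp4 is a real change in sustained throughput and not a display artifact.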
Let’s have a look at the VFS cache
The Linux kernel reveals some interesting cache and memory information in /proc/meminfo. These are numbers taken after a fresh boot:
# cat /proc/meminfo
MemTotal: 30204 kB
MemFree: 9632 kB # unused, completely free memory
Buffers: 280 kB
Cached: 6280 kB # memory used for cached files
SwapCached: 0 kB
Active: 8652 kB
Inactive: 1524 kB
HighTotal: 0 kB
HighFree: 0 kB
LowTotal: 30204 kB
LowFree: 9632 kB
SwapTotal: 0 kB
SwapFree: 0 kB
Dirty: 0 kB # memory waiting to be written to disk
Writeback: 0 kB # memory actively being written to disk
Mapped: 8040 kB
Slab: 6028 kB
CommitLimit: 15100 kB
Committed_AS: 5724 kB
PageTables: 240 kB
VmallocTotal: 1048560 kB
VmallocUsed: 4056 kB
VmallocChunk: 1043636 kB
While copying the first files, the relevant numbers read like this:
MemFree: 1716 kB
Cached: 13704 kB
Active: 8976 kB
Inactive: 8928 kB
Dirty: 6836 kB # lots of data waiting to be written
Writeback: 444 kB # lots of data being actively written
We see that the cache fills up quickly with buffers that are marked to be written to the stick (marked dirty), and that the pdflush daemon has already started to write out chunks of consecutive data to the USB stick. Remember that USB sticks perform well when streaming out chunks of data that fit their physical structure, but badly when writing small chunks, because a lot of the flash memory keeps being reread and rewritten. Performance is good here because there are plenty of dirty buffers, so the kernel can optimize how it writes them out.
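To compare snapshots without reading the whole dump each time, the four interesting fields can be pulled out of the /proc/meminfo text. The following is a small sketch, assuming the output has been copied to a machine with Python; parse_meminfo, watched_fields and WATCHED are hypothetical helper names, and the sample text is the snapshot shown above.

```python
def parse_meminfo(text):
    """Return {field: value in kB} for each 'Name: value kB' line."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a 'Name: value' line, e.g. a shell prompt
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


# The fields this post keeps looking at.
WATCHED = ("MemFree", "Cached", "Dirty", "Writeback")


def watched_fields(text):
    """Reduce a full meminfo dump to the watched fields."""
    info = parse_meminfo(text)
    return {key: info[key] for key in WATCHED if key in info}


sample = """\
MemFree: 1716 kB
Cached: 13704 kB
Active: 8976 kB
Inactive: 8928 kB
Dirty: 6836 kB # lots of data waiting to be written
Writeback: 444 kB
"""

print(watched_fields(sample))
```

Trailing annotations such as "# before: 9632 kB" are ignored, because only the first number after the first colon is read.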
Writing file ‘tmp1’
Let’s go back and look at the numbers exactly after tmp1 has been written (2048 kB):
MemFree: 7100 kB # before: 9632 kB
Cached: 8456 kB # before: 6280 kB
Dirty: 0 kB
Writeback: 0 kB
The buffers have all been flushed, so the stick is idle. Our cache grew by 2176 kB, roughly the size of the file, taken from the free memory; it now also contains the file tmp1.
Writing file ‘tmp2’
Copying file tmp2 (2048 kB) is fast and the memory info after copying is no surprise:
MemFree: 5084 kB # 2016 kB less
Cached: 10504 kB # 2048 kB more
Dirty: 0 kB
Writeback: 0 kB
tmp3 (2048 kB) is no surprise either, because there is still unused memory left. But now it gets interesting, because write performance drops drastically with tmp4.
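The bookkeeping for tmp1 and tmp2 can be checked by subtracting consecutive snapshots. This is only a sketch using the MemFree and Cached values quoted above; the helper name delta and the dictionary names are made up for illustration.

```python
def delta(before, after):
    """Per-field change in kB between two meminfo snapshots."""
    return {key: after[key] - before[key] for key in before if key in after}


# Values in kB, copied from the snapshots above.
after_boot = {"MemFree": 9632, "Cached": 6280}
after_tmp1 = {"MemFree": 7100, "Cached": 8456}
after_tmp2 = {"MemFree": 5084, "Cached": 10504}

print("tmp1:", delta(after_boot, after_tmp1))
print("tmp2:", delta(after_tmp1, after_tmp2))
```

For both files the cache grows by about the 2048 kB file size (2176 kB and 2048 kB), while free memory shrinks by a similar amount (2532 kB and 2016 kB). The figures are not exact mirrors of each other because other allocations on the box change between snapshots as well.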
Writing file ‘tmp4’ with no free memory
While tmp4 is being written and performance has dropped to about 30 KB/s, the numbers look like this:
MemFree: 1148 kB
Cached: 13988 kB
Dirty: 12 kB
Writeback: 36 kB
Of course free memory is useless; we’d rather have everything go into the cache. The cache stays filled (we have tmp1, tmp2 and tmp3 in the cache), but the values for Dirty and Writeback are too low.
Before, the file to be written was loaded completely into the cache first and marked dirty. The pdflush daemon started with a delay and found plenty of cached data to write to disk.
The number of blocks marked dirty now never seems to exceed 50 kB. The pdflush daemon can only flush out small chunks of up to 36 kB at once (usually less), resulting in a lot of USB operations and overhead and low performance.
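A claim like "never exceeds 50 kB" is easier to back up with a sampler that records the peak values over many readings. Below is a rough sketch, assuming a system where Python is available (which the Fritz!Box itself most likely is not); read_field_kb and peak_during are hypothetical names, and the replayed text is the single tmp4 snapshot from above rather than a real series of measurements.

```python
import time


def read_field_kb(text, field):
    """Return the kB value of one meminfo field, or None if it is absent."""
    for line in text.splitlines():
        if line.startswith(field + ":"):
            return int(line.split()[1])
    return None


def peak_during(read_text, fields=("Dirty", "Writeback"), samples=10, interval=0.0):
    """Call read_text() repeatedly and keep the highest value seen per field."""
    peaks = {field: 0 for field in fields}
    for _ in range(samples):
        text = read_text()
        for field in fields:
            value = read_field_kb(text, field)
            if value is not None:
                peaks[field] = max(peaks[field], value)
        time.sleep(interval)
    return peaks


# Replay the tmp4 snapshot; on a live Linux system the reader could be
# lambda: open("/proc/meminfo").read() with a non-zero interval.
tmp4_snapshot = "MemFree: 1148 kB\nCached: 13988 kB\nDirty: 12 kB\nWriteback: 36 kB\n"

peaks = peak_during(lambda: tmp4_snapshot, samples=3)
print(peaks)
```

Passing the reader in as a function keeps the sampler independent of where the numbers come from, so the same loop works on captured text and on a live /proc/meminfo.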
Clearing the cache helps
The Freetz kernel unfortunately does not expose /proc/sys/vm/drop_caches to drop all cached buffers (that interface only appeared in Linux 2.6.16). But what happens if we rm tmp1?
MemFree: 1604 kB
Cached: 14004 kB
Nothing. tmp1 is not in the cache anymore; most likely tmp4 has taken its place because it is newer. But tmp2 is still in the cache, so let’s rm it:
MemFree: 3464 kB # rm tmp2 frees up the cached memory
Cached: 12152 kB # the rm'ed file is removed from cache
Now we have over 3 MB free and unused memory and the file is not in the cache anymore.
Writing tmp5
Now let’s copy tmp5 (2048 kB). These numbers were taken during the copy to show the values of Dirty and Writeback, so the file is only partly transferred at this point:
MemFree: 2204 kB
Cached: 12948 kB
Dirty: 152 kB
Writeback: 424 kB
We again see higher numbers for Dirty and Writeback as parts of the copied file are moved into the cache and marked dirty. The pdflush daemon again gets large chunks of buffers to stream to the medium, and we get a fairly high transfer rate.
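Adding Dirty and Writeback gives a rough measure of how much data is queued for the stick at a given moment. The sketch below compares the three snapshots taken during a copy; the values are copied from this post, while the phase labels and the helper pending_kb are only illustrative.

```python
# Dirty and Writeback in kB, sampled while a copy was in progress.
phases = {
    "first files": {"Dirty": 6836, "Writeback": 444},
    "tmp4 (no free memory)": {"Dirty": 12, "Writeback": 36},
    "tmp5 (after rm tmp2)": {"Dirty": 152, "Writeback": 424},
}


def pending_kb(sample):
    """Data queued for the medium: waiting (Dirty) plus in flight (Writeback)."""
    return sample["Dirty"] + sample["Writeback"]


pending = {name: pending_kb(sample) for name, sample in phases.items()}

for name, kb in pending.items():
    print(f"{name:24s} {kb:5d} kB")
```

With free memory available, several hundred to several thousand kB are queued at once (7280 kB and 576 kB); without it, only 48 kB are, which fits the observation that the slow phase consists of many small writes.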
Broken kernel behaviour
This is the fairly old Linux kernel 2.6.13.1-ohio from Freetz. The behaviour of the VFS and pdflush seems to be broken and thus results in very poor write performance:
- When there is no free memory available, why doesn’t the kernel free more old cache memory for the new buffers to be marked dirty?
- It seems the pdflush daemon is forced to write out as soon as there are dirty buffers and memory is low (= no free memory). Why does the kernel seem to prefer freeing memory by writing out dirty buffers over dropping the read cache to make room for more dirty buffers?
- Allocating new buffers seems to stall while the pdflush daemon is writing out dirty memory.
- New buffers are still taken from old cached files, so after the whole file has been copied, it is completely in the cache. Why not put it completely into the cache before starting to write out, instead of stalling the allocation of new buffers?
- rm’ing a file that is in the cache frees up the cache, resulting in a performance boost until that free memory is used by the cache again and the pdflush daemon writes out much smaller chunks. In practice that won’t happen, since a normal Linux system should never have large amounts of free memory.
This looks like a kernel bug that prevents the Fritz!Box 7170 from ever achieving good write performance on my USB stick and other media.
Summary
- When copying files to the Fritz!Box, the kernel keeps these files in its cache only while free memory is available.
- It writes out the cached files to storage, with good performance because there are big chunks to be written. The files remain in the cache in case they are read.
- With a full cache and no free memory, new files aren’t cached anymore but directly written to the medium, resulting in a lot of small writes with big overhead and bad performance. The file is still in the cache after writing is done.
- Clearing up the cache results in free memory and write performance boosts until the cache is saturated again.
- Because the cache is only freed on unmount, free memory is almost never available in practice, which makes writing data to USB sticks a pain.
External hard drives might work better because of their fast integrated hardware caches, which can absorb lots of small chunks. But on a USB stick without a hardware cache, performance is killed by the small writes.
It is unlikely that this bug will be fixed by AVM or by Freetz for the Fritz!Box 7170, because it seems to be a flaw in the Linux kernel they use, and AVM no longer updates the 7170 firmware.
Is this a known bug, and has it been fixed in newer kernels?