February 18, 2017

Building a NAS Part 5

FreeNAS Impressions

Since my last post, I’ve had a good amount of time to play around with FreeNAS, and I want to say that it is the real deal. This is a complete, fully-featured enterprise storage operating system. The GUI is easy to work with, and there are a lot of features available.

Performance Issues

So I set up two drives in a mirror and created two datasets. I shared one dataset with CIFS and the other with NFS. I used my desktop computer (Windows 10) and my laptop (Debian 8 Testing) to connect to the shares, both machines using a wired connection.

I started copying some (large) files to the FreeNAS server to gauge performance. The Windows machine could transfer the data to the CIFS dataset at 100+ MB/s, so no problems there. My Debian laptop could transfer to the NFS dataset at 100+ MB/s, which is expected. The issues came up when I used the Windows machine to transfer data using NFS and also when I used the Debian machine to transfer using CIFS. The Windows machine could only manage 25 MB/s over NFS and the Debian machine could only manage 35 MB/s over CIFS. Something was amiss.
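For anyone who wants to repeat this kind of test without reading numbers off a copy dialog, here is a minimal Python sketch of timing a single file copy. The function name and the paths are hypothetical, and it assumes the share is already mounted as a normal path on the client. It is only a rough gauge: the copy call can return before the server has actually put the data on disk, so large files give more honest numbers than small ones.

```python
import os
import shutil
import time


def copy_throughput_mb_s(src, dst):
    """Copy src to dst and return the average rate in MB/s (1 MB = 10**6 bytes)."""
    size_bytes = os.path.getsize(src)
    start = time.perf_counter()
    shutil.copyfile(src, dst)
    elapsed = time.perf_counter() - start
    return size_bytes / elapsed / 1e6


# Hypothetical usage, with the NFS or CIFS share mounted at /mnt/nas:
# print(copy_throughput_mb_s("/home/me/big.iso", "/mnt/nas/big.iso"))
```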

Debian Solution

The problem with the Debian machine was simple: I did not have the cifs-utils package installed. I do not know what I was using before, but after installing that package, my CIFS transfer speeds jumped to 100+ MB/s, the same as the Windows machine and what I was expecting. So that got resolved.

Windows, NFS, & Synchronous Writes

The problem with Windows and NFS was more complicated.

First, I want to briefly describe synchronous and asynchronous writes. When you transfer data to another machine, it first goes into the receiving machine’s memory, and then it is written to that machine’s disks. With a synchronous write, the sender waits until the receiving machine has confirmed that the data is on disk before it sends more. Asynchronous writes do not wait for this confirmation.

Waiting for the confirmation is critical to data integrity. Knowing that the data is actually on the disk system is obviously good. However, this benefit comes at the expense of speed. Synchronous writes are slower because the client machine has to wait until the receiving system’s disks have confirmed that the data was written.
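The same trade-off exists on a single machine, which makes it easy to see in code. This is a minimal Python sketch on a local file, not NFS itself, and the two function names are my own. The first function returns as soon as the operating system has accepted the data; the second also asks the operating system to flush the file to the storage device before returning, which is the slower but safer behavior.

```python
import os


def write_buffered(path, data):
    """Return once the OS has accepted the data; it may still be only in memory."""
    with open(path, "wb") as f:
        f.write(data)


def write_synchronous(path, data):
    """Return only after the OS has been asked to flush the data to storage."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()             # push Python's own buffer down to the OS
        os.fsync(f.fileno())  # ask the OS to flush the file to the device
```

Timing the two functions on many small writes is a quick way to feel the cost of waiting for confirmation.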

The problem with just using asynchronous writes shows up when something goes wrong between the client sending the data and the server’s disks writing it. Let’s say that asynchronous writes are being used and the server crashes during the transfer. The client thinks that all the data it sent is on the server’s disks, because the server told the client that it was. However, the server’s disks don’t have all the data. Since writing to disk is slower than the transfer, there was some data in the server’s memory that had not been written to the disks yet. When the server crashed, the contents of its volatile system memory were lost. Now there is a mismatch between what the client thinks the server has and what the server’s disks actually have. This is called data loss.
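The crash scenario can be written down as a toy model. This is only an illustration in Python, with made-up names (ToyServer, blocks_lost); it is not how FreeNAS or NFS is implemented. The server acknowledges every block it receives, but only a synchronous server puts the block on disk before acknowledging.

```python
class ToyServer:
    """Toy model: received blocks sit in RAM until they are flushed to disk."""

    def __init__(self, synchronous):
        self.synchronous = synchronous
        self.ram = []   # volatile: lost on a crash
        self.disk = []  # persistent: survives a crash

    def receive(self, block):
        """Accept one block and return an acknowledgement to the client."""
        self.ram.append(block)
        if self.synchronous:
            self.flush()  # do not acknowledge until the block is on disk
        return True

    def flush(self):
        self.disk.extend(self.ram)
        self.ram.clear()

    def crash(self):
        self.ram.clear()


def blocks_lost(synchronous, blocks):
    """Return the blocks the client believes are saved but the disk lacks."""
    server = ToyServer(synchronous)
    acked = [b for b in blocks if server.receive(b)]
    server.crash()  # crash before any background flush has happened
    return [b for b in acked if b not in server.disk]
```

In this model, blocks_lost(True, ["a", "b", "c"]) comes back empty, while blocks_lost(False, ["a", "b", "c"]) returns all three blocks, which is the mismatch described above.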

For testing, I disabled synchronous writes on my NFS share. I then copied a file to my FreeNAS box from Windows. My speeds jumped from 25 MB/s to around 70 MB/s. So a pretty big improvement. Since I am not completely sure that I want to deal with the risk of asynchronous writes, for now I have simply reverted to the defaults. I will just avoid using NFS on Windows. This doesn’t address the problem, but a proper solution is costly (either in money or performance).

I’m still left with a few questions. Notably, why did the Debian machine not have any issues with synchronous writes? Was it transferring data asynchronously? Also, why did CIFS have no problems with synchronous writes? Is CIFS transferring data asynchronously?

POSTSCRIPT: I found some more info after some research. The default behavior of FreeNAS datasets is to write data asynchronously unless the client asks for synchronous writes. So I think that anything achieving a fast transfer rate is more than likely using asynchronous writes. I will have to see how much of a risk this is and if there is a way to address it. I think an uninterruptible power supply would be a cost-effective solution.

There’s actually a solution to the synchronous write problem: a SLOG. In my understanding, ZFS makes a record of each synchronous write on stable storage before acknowledging it, and only later writes the data to its final place in the pool. This record is called the ZFS Intent Log (ZIL). As far as I can tell, the ZIL entry includes the data for a typical synchronous write, not just a note that a write is pending. The problem is that, by default, the ZIL lives on the pool’s own disks, so now you have to do even more writing to those disks. So you can move the ZIL to a separate, high-speed, non-volatile device, called a separate intent log, or SLOG. After a system crash, ZFS can go through the persistent SLOG, see which logged writes never made it to the pool, and apply them.
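Here is how I picture the log-and-replay idea, as a toy Python model. The class and method names are made up, and the model assumes that each log entry carries a copy of the data, which is a simplification on my part. It is not ZFS code. The point is only that a write is acknowledged once it is in the persistent log, so a crash that wipes RAM can be repaired by replaying the log.

```python
class ToyPool:
    """Toy model of a pool with a separate, persistent log device."""

    def __init__(self):
        self.slog = []  # persistent, fast log device
        self.pool = {}  # persistent, slower main disks
        self.ram = {}   # volatile, pending writes

    def sync_write(self, name, data):
        """Log the write persistently, then acknowledge it."""
        self.slog.append((name, data))
        self.ram[name] = data
        return True

    def commit(self):
        """Write pending data to the main disks; the log is no longer needed."""
        self.pool.update(self.ram)
        self.ram.clear()
        self.slog.clear()

    def crash_and_recover(self):
        """Lose RAM, then replay logged writes that never reached the pool."""
        self.ram.clear()
        for name, data in self.slog:
            self.pool[name] = data
        self.slog.clear()
```

If you write "a", commit, write "b", and then call crash_and_recover(), the pool ends up holding both "a" and "b", even though "b" was only in RAM and the log at the moment of the crash.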