In a previous post,
Thumper: Putting Blastwave on ZFS,
I glanced at some information and jumped to completely the wrong conclusion.
In the comments, Boyd kindly pointed out that I should probably investigate it
a little more thoroughly. So I have. Just to recap, I am trying to install
software with pkgadd onto a ZFS filesystem. The whole filesystem is
17 terabytes, and still has 17TB available. The steps I followed were:
zfs create zpool1/software
zfs create zpool1/software/blastwave
zfs set mountpoint=/opt/csw zpool1/software/blastwave
pkgadd -d http://www.blastwave.org/pkg_get.pkg
And the errors I was receiving were along the lines of:
pkgadd: ERROR: unable to create package object </opt/csw/bin>.
pathname does not exist
pathname does not exist
unable to fix attributes
/opt/csw/bin
for pretty much every file/directory in the package. This resulted in pkgadd
noting that “Installation of <cswpkgget> partially failed.” As Boyd
suggested, I reran the failing scenario under truss:
truss -Df -o pkgadd.truss pkgadd -d http://www.blastwave.org/pkg_get.pkg
so we’re getting time deltas (-D), following child processes (-f) and dumping
that lovely trace out to a file (-o) for later examination. Examining the output
is fun, because the error messages appear to be written with write() one
character at a time, so grepping through the file took a while. Still, here are
(what I think are) the relevant calls before the error message:
910: 0.0001 lxstat(2, "/opt/csw/bin", 0xFEFAAF40) Err#2 ENOENT
910: 0.0001 lstat64("/opt/csw/bin", 0x080450D0) Err#2 ENOENT
910: 0.0002 mkdir("/opt/csw/bin", 0755) = 0
910: 0.0001 xstat(2, "/opt/csw/bin", 0xFEFAAF40) = 0
910: 0.0001 door_info(8, 0x080430C0) = 0
910: 0.0001 door_call(8, 0x080430F8) = 0
910: 0.0001 statvfs("/opt/csw/bin", 0xFEFAAFC8) Err#79 EOVERFLOW
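Because the message text is scattered across single-character writes, I found it easier to search for failed system calls than for the message itself. Here is a minimal sketch of that approach. It relies on truss tagging failures with Err#, as in the lines above; the helper name failed_calls is made up for illustration, and dropping ENOENT is my own choice, since "does not exist" lookups are usually harmless noise.

```shell
# List failed system calls recorded in a truss log.
# failed_calls is a hypothetical helper, not something truss provides.
failed_calls() {
    # truss marks a failed call with "Err#<number> <NAME>".
    # ENOENT lines are filtered out: existence checks fail all the time.
    grep 'Err#' "$1" | grep -v 'ENOENT'
}
```

Running `failed_calls pkgadd.truss` against the trace above should leave the statvfs() line as the one to look at.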
So, what’s happening here? Well, first of all the program is looking to see if
/opt/csw/bin exists. Since it determines that it doesn’t (which is fine, not
an error), it creates the directory with mkdir(). The interesting bit is that
we then call statvfs(), which returns file system information, and it fails
with EOVERFLOW. There, methinks, is the problem. So, what does
that error mean? According to the manual page:
One of the values to be returned cannot be represented correctly in the structure pointed to by buf.
OK, interesting. A little googling finds a similar problem with bootadm in
bug 6419989
on OpenSolaris. Jan succinctly describes the problem:
The long and short of it is that if the amount of free space available in the root filesystem is greater than can be represented in the vfstat’s f_bavail, the statvfs call will fail with EOVERFLOW. To fix this, bootadm must be compiled with -D_FILE_OFFSET_BITS=64 [ … ]
So it looks like pkgadd is suffering from the same problem on Solaris 10 U3. To
verify that this was the cause, I tweaked the /opt/csw file system so that the
quota was well below 2TB (as far as I can tell, the largest size that a 32-bit
count of 512-byte blocks can represent):
zfs set quota=2g zpool1/software/blastwave
pkgadd -d http://www.blastwave.org/pkg_get.pkg
which worked without error. Score! It looks from the bug tracker that there are
a number of applications with the same problem, but that folks testing out the
ZFS root file system support are quickly finding & reporting them. Question is,
do I need to report this one, or has it already been sorted? Since it’s
Solaris 10 U3 (rather than OpenSolaris), how do I go about correctly reporting
issues?
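As a footnote on the 2TB figure, the arithmetic is easy to reproduce in the shell. This is only a sketch: it assumes the block counts in the non-large-file statvfs structure are 32 bits wide and are reported in 512-byte units. The 512 is my assumption rather than something I have confirmed for ZFS, so check f_frsize on your own system.

```shell
# Largest size a 32-bit block count can describe.
# unit_bytes=512 is an assumption, not a value read from the system.
unit_bytes=512
max_bytes=$(( 4294967296 * unit_bytes ))   # 2^32 units of 512 bytes
echo "limit: $max_bytes bytes, or $(( max_bytes / 1024 / 1024 / 1024 / 1024 )) TiB"
# prints: limit: 2199023255552 bytes, or 2 TiB
```

A 17TB filesystem is well over that limit, while a 2g quota is comfortably under it, which fits what I saw.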