Solving Closures of Attributes for a Relation

When we design relational schemas for databases, there are many tasks that rely on, or at least benefit from, the closure of attributes with respect to a given set of functional dependencies (FDs). However, it could take us a lot of efforts to enumerate all the combinations of the attributes and solve each closure of them, which will grow exponentially with the number of attributes.

Consider a relation R with attributes A1, A2, …, An. The number of the combinations of no attribute, one attribute, to all attributes, are n choose 0, n choose 1, …, n choose n, respectively, which are the binomial coefficients of n. The sum of them equals 2 raised to the power of n.

A First Course in Database Systems, Third Edition by Jeffrey D. Ullman, and Jennifer Widom shows us in Example 3.13 that there are two kinds of obvious simplifications.

  • Closing the empty set and the set of all attributes cannot yield a nontrivial FD.
  • If we already know that the closure of some set X is all attributes, then we cannot discover any new FD’s by closing supersets of X.

Apart from these, I have not read other written ways of simplification. I supposed that it would be required to close other combinations of attributes with brute-force, and only by then could I spot useful closures and useless ones.

By saying ‘useless’, I mean closures that are: 1) deduced from trivial FDs only, e.g. {A, B}+ = {A, B}, or 2) augmented by only trivial FDs, e.g. {A, C}+ = {A, B, C} when {A}+ = {A, B}, except if this closure indicate a key with a minimal initial set, in the last example if ABC are all the attributes and AC is the minimal possible combination to get them.

The idea of brute-force worked me through quite a few exercises until I stumbled upon an exercise over a relation with 6 attributes but only 4 FDs, excerpted here:

Consider a relation Stocks(B, O, I, S, Q, D). Let the set of FDs for Stocks be S→D, I→B, IS→Q, and B→O.

A First Course in Database Systems, Third Edition. Exercise 3.5.3.

The combinations of the attributes, excluding the empty set and universal set, are:

  • B, O, I, S, Q, D;
  • BO, BI, BS, BQ, BD, OI, OS, OQ, OD, IS, IQ, ID, SQ, SD,QD;

With the contrast between the number of the combinations and the number of the FDs, I feel neither worthy to list all the combinations nor easy to solve all the closures correctly. Let alone relations with even more attributes! I should find some other rules to make the problem simpler.

Consider the process of closing attributes regarding a set of FDs. Attributes in the initial set (left side of the equal sign) of closures will

  • contribute themselves to the resulting set (right side of the equal sign), always;
  • contribute the right-hand side (RHS) of any FD, if they satisfy its left-hand side (LHS);
  • contribute nothing else, if they do not satisfy the left-hand side of any FD.

Thus generally, on one hand, attributes won’t contribute anything more than themselves to closures if they don’t appear in any FD’s LHS, i.e. (1) attributes that may contribute more than themselves to closures all reside in the LHS of the given FDs. On the other hand, all the FDs won’t contribute more than their both sides, i.e. (2) attributes that contribute only themselves to closures may become indispensable for a key, sometimes.

Claim (1) makes me think that we should start with combinations of the LHS of the given FDs. Beware, though, of the trivial situations when some RHS and LHS of individual FDs overlap. For example, with AB→C and CD→E, ABD, instead of ABCD, will suffice to close to ABCDE.

This consideration leads to another claim: (3) An FD with part of its LHS in a closure’s resulting set can contribute its RHS if and only if the closure is augmented with at least the rest part of the FD’s LHS. Of course, we should try to find the ‘least’ of such part.

With these 3 claims, I’m ready to present my

Procedure of getting closures of a relation given a set of FDs:

  1. Start with any FD. Solve the closure of its LHS with respect to all the FDs. (Claim 1)
  2. Find candidates for augmentation by solving every set difference of any other FD’s LHS and this closure’s resulting set, respectively, i.e. attributes in the FD’s LHS but not in the closure’s resulting set. (Claim 3)
  3. Augment the closure with the set differences from the last step, one by one, generating a series of new closures.
  4. For any newly generated closure, repeat steps 2-3, until no new closure can be found.
  5. Repeat the above steps for all other FDs.
  6. Finally, augment every closure with all the attributes that are absent from its resulting set, to reveal keys of the relation. (Claim 2)
  7. Anytime, ignore a closure whose initial set is a superset of some other closure’s initial set and is a subset of that closure’s resulting set.

This procedure can reduce the amount of work when the number of FDs are much fewer than the number of combinations of attributes, which is often the case when attributes reach more than four.

Let’s try this procedure on the Stocks exercise.

With S→D, we have S+=SD at first. The candidates for SD are I, and B. Augmented closures are SI+=SIDBQO and SB+=SBDO. SI is a key. The candidates for SBDO is I. The augmented closure is SIB+ can be ignored since SIB is a superset of SI and a subset of SI+, or simply put, SI is already a key.

With I→B, we have I+=IBO at first. The candidate for IBO is S. We have solved IS+ already.

With IS→Q, we have solved IS+ already. There is no candidate for augmentation in any way, since it’s already a key.

With B→O, we have B+=BO at first. The candidates for BO are S, and I. Augmented closures are BS+ and BI+. We have solved BS+ already. BI+ can be ignored since BI is a superset of I and a subset of I+.

Until now, we have only these useful closures:


Augment S+ with IBQO, since SIBQO is a superset of SI and a subset of SI+, it can be ignored. The same applies for the rest attempts to find a key other than SI.

Thanks to the simplicity of the FDs, we can solve all the useful closures with just 10 enumerations in this exercise, much less than all the combinations of the attributes.


Bug Check 0x124 when Sysprep generalizes Windows 10

We have recently been preparing for the deployment of a big number of new desktop computers. The plan was to follow the Microsoft ways using WDS and/or MDT, whose procedures generally include setting up a referencing computer and customizing it, then capturing the image to deploy.

The process went along very well during our early stage tests on virtual machines, until two samples of the physical machines were delivered to us for further preparation, when things began to get complicated.

The model was DELL Optiplex 3240 AIO, an all-in-one computer, and the problem was most of the time when Sysprep was run to generalize the OS, it turned out a blue screen of death, with Bug Check error type WHEA_UNCORRECTABLE_ERROR, no matter if firmware was set in UEFI mode or legacy BIOS mode.

Performing postmortem analyses on the Minidump files, I found out the error code was 0x124, and the first parameter was always 0x4, which means an uncorrectable PCI Express error occurred. After running !analyze -v, WinDbg reported the problem was probably caused by GenuineIntel, but I highly doubted it was the CPU to blame.

Most of the documents, including MSDN and the Microsoft community Wiki, says that this Bug Check is almost always caused by physical hardware failures, especially overclocking. In spite of that, I still could not convince myself. Obviously, all of the two sample machines were showing the same symptom, and if it were a hardware issue, the quality would be disastrous. After all, it’s DELL.

Finally, I googled ‘sysprep bsod 124’ and found one case on Microsoft TechNet forum, titled ‘BSOD when sysprep Windows 10‘, which was nearly identical to our situation: all-in-one computer, same Bug Check error, same type number, only the brand is HP. In the replies, someone provided another kernel debugger helper command, !errrec, which shows detailed WHEA error records given the address, i.e. the second parameter in the Bug Check.

Later, I found the command was already provided in the MSDN document of Bug Check 0x124. The solution could come much earlier, had I been patient enough to follow the document!

In the output of !errrec the related Device Ids were revealed.

  • 10ec:8168, Realtek Gigabit Ethernet
  • 8086:24f3, Intel Dual Band Wireless-AC 8260

All of them are network adapters.

Since it’s not practical to remove any hardware from inside of an all-in-one computer, we decided to firstly try uninstalling all network adapters from Device Manager before doing Sysprep, and it worked!

When the network devices are removed manually from Device Manager one by one, with Sysprep run immediately afterwards, blue screens are never observed again. The method really worked throughout our subsequent generalization and capturing processes.

But why does such a serious problem happen in such a fundamental procedure? Obviously, the scenario has never been tested by the manufacturers, neither DELL nor Realtek, Intel, etc. Don’t DELL’s teams use Sysprep any more? Guess so and not.

  • Sysprep may be still in use. When doing the search, I saw some comments on Sysprep, which recommend building a reference computer offline, to prevent background tasks, esp. Windows Updates from altering the system without solicitation. Maybe DELL make their images in a closed lab environment where our scenario never happens.
  • Sysprep may be fading out. With prevailing DISM techniques, servicing an image offline might be no longer a question but perhaps a best practice. When altering an image offline, device drivers never sneak in so our scenario will never happen.

Use Write-Verbose to output messages when writing PowerShell scripts

It seems to be intuitive to use Write-Host to output debugging or informative messages to the console when writing and debugging PowerShell scripts, as we usually do in the “hello, world” examples, but it’s not right.

The correct way of doing this looks like the following function:

function my-function

    Write-Verbose "the message"

By default, the verbose messages won’t display. To show them, call the function with the standard Verbose option:

my-function -Verbose


What does ‘Exp’ mean in source code files from a CVS repository?

We often see an ‘Exp’ in source code files from a CVS repository, eg.

$Id: samp.c,v 1.5 1993/10/19 14:57:32 ceder Exp $

$Header: /projects/compbio/cvsroot/CompbioCVSDoc/cvs-manual/cvs_76.html,v 1.1 1997/04/19 23:10:26 markd Exp $

What does it mean?

Answer from Guide to CVS commands:

Any identifier is acceptable for state. A useful set of states is `Exp’ (for experimental), `Stab’ (for stable), and `Rel’ (for released). By default, the state of a new revision is set to `Exp’ when it is created. The state is visible in the output from cvs log, and in the $Log$ and $State$ keywords. Note that CVS uses the dead state for its own purposes; to take a file to or from the dead state use commands like cvs remove and cvs add, not cvs admin -s.

Overlay Images with EXIF Timestamp using ImageMagick

Sometimes we may need to annotate an image with its timestamp. Like the following:

DSC02268 => DSC02268

Here’s how to do this:

on Windows 7, open cmd with delayed expansion on.

cmd /v:on

set PATH to ImageMagick.

path C:\Program Files (x86)\ImageMagick-6.3.3-Q16;%path%

change current directory to where image files exist.

cd /d D:\TEMP\2014-10-22


if not exist withdate mkdir withdate
for %i in (*.jpg) do @(
  set o=withdate\%i
  echo %i =^> !o!
  for /f "tokens=1-2" %x in ('convert -ping -format "%[EXIF:DateTimeOriginal]" %i info:') do @(
    set d=%x
    set d=!d::=-! %y
  for /f %x in ('convert -ping -format "%[height]" %i info:') do @(
    set /a fontsize=%x/25
    set /a padding=%x/50
    set /a strokewidth=%x/25/12
  ) >nul
  convert -gravity SouthEast -font Tahoma -pointsize !fontsize! ^
    -stroke black -strokewidth !strokewidth! -annotate +!padding!+!padding! "!d!" ^
    -stroke none -fill white -annotate +!padding!+!padding! "!d!" ^
    %i !o!


zfs on FreeBSD 10 snippets

1. easily identify new disks with gpt label

1.1 gpart create -s gpt da0
1.2 gpart add -t freebsd-zfs -a 100M -s 136G -l nas01.0.bay5 da0
gpart add -t freebsd-zfs -a 100M -s 136G da1
gpart modify -i 1 -l nas01.ext1.bay1 da15
1.3 true > /dev/da0p1 # workaround for kern/154226
1.4 gpart show -lp

2. create/import zpool (transactional object layer; pooled storage layer)

zpool create tank mirror gpt/nas01.0.bay1 gpt/nas01.0.bay5
zpool attach tank mirror-0 gpt/nas01.0.bay3
zpool add tank mirror gpt/nas01.0.bay2 gpt/nas01.0.bay6
zpool add tank spare gpt/nas01.0.bay8
zpool import

3. create zfs dataset (zfs posix layer; zfs volume emulator)

zfs create -o quota=5G -o copies=2 tank/home
zfs set compression=on tank/home
zfs create -o atime=off -o exec=off -o setuid=off -o logbias=throughput tank/ghost

4. move var and tmp

# ref. freebsd on zfs
/etc/rc.conf: zfs_enable=”YES”

5. samba

# … pkg install samba42 …
# ref. /usr/local/etc/smb.conf
/etc/rc.conf: samba_enable=”YES”

6. nfs

/etc/rc.conf # ref. nfs on freebsd
zfs set sharenfs=’-network -maproot=0′ tank/vmware

7. iscsi

# ref. iscsi on freebsd

8. tuning for vmware esxi

zfs set atime=off tank/vmware
# compromise data integrity for performance, ups required
zfs set sync=disabled tank/vmware

9. set up monitor

# /etc/periodic.conf:

A. view and monitor

zfs get all tank/home
zfs list -t all -o name,type,creation -s creation
zpool iostat -v tank 5

B. backup

B.0 preparation:

#tail -F /var/log/messages
#dd if=/dev/random bs=1m count=10240 of=/dev/ada8
gpart create -s gpt ada8
gpart add -t freebsd-zfs -a 4K -s 1953514544 -l nas01.bak01 ada8
geli init gpt/nas01.bak01

B.1 first-time:

geli attach gpt/nas01.bak01
zpool create -R /bak01root bak01 gpt/nas01.bak01.eli
zfs snapshot -r tank@2013-03-25
zfs hold -r latest-backup tank@2013-03-25
zfs send -R tank@2013-03-25 | zfs receive -duvF bak01
zpool export bak01
geli detach gpt/nas01.bak01

B.2 incremental:

zfs snapshot -r tank@`date +%F`
zfs hold -r latest-backup tank@`date +%F`
geli attach gpt/nas01.bak01
zpool import -N -R /bak01root bak01
zfs list -H -d 1 -t snapshot -o name tank | xargs zfs holds
#zfs list -H -t snapshot -o name | grep ‘^bak01.*@yyyy-mm-dd’ | xargs -n1 zfs rollback
zfs send -R -i yyyy-mm-dd tank@`date +%F` | zfs receive -duvF bak01
zpool export bak01
geli detach gpt/nas01.bak01
# perform (B.3), and then
zfs list -H -d 1 -t snapshot -o name tank | xargs zfs holds
zfs release -r latest-backup tank@yyyy-mm-dd
#zfs get userrefs | sort -k 3

B.3 verify on another machine booted from LiveCD:

geli attach gpt/nas01.bak03
zpool import -N bak03
zpool scrub bak03
gstat -p
zpool status
zpool export bak03
geli detach gpt/nas01.bak03

C. replace disk

gpart show -lp da0
zpool replace tank gpt/nas01.0.bay1 gpt/nas01.0.bay8
zpool replace tank 3545861042431994935 gpt/nas01.0.bay5
zpool detach tank gpt/nas01.0.bay1
gpart modify -i 1 -l nas01.0.bay1.FAIL-`date +%Y%m%d` da0
# replace hardware
=> (1) partition and label the disk immediately after insertion
=> (F) erase zfs metadata if “zpool replace” complains
zpool add tank spare gpt/nas01.0.bay1

D. reduce mirror level

zpool detach tank gpt/nas01.0.bay3

E. destroy dataset

zfs destroy tank/test

F. erase zfs vdev metadata (tcsh arithmetic)

# cf. zpool labelclear
set p=gpt/nas01.0.bay5
zdb -l /dev/${p}
set s=`diskinfo ${p} | cut -f 3`
@ s = $s / 1024 / 512 – 1
dd if=/dev/zero of=/dev/${p} bs=512K count=1
dd if=/dev/zero of=/dev/${p} bs=512K count=2 seek=$s
# ref. p.22

NetFlow vs sFlow

Nutshell: NetFlow for routers and firewalls, while sFlow for Ethernet switches.

NetFlow v9 ==> IPFIX (IETF) draft

NetFlow: High CPU usage. Some vendor with hardware.
sFlow: Free license for vendors. Hardware (expansion card) is the only choice.

NetFlow records every packet, header and payload truncated at 1200 bytes. Flexible, can get URLs, hostnames, etc. Sampled NetFlow exists.
sFlow takes samples. Designed for Ethernet.


  1. NetFlow vs. sFlow for Network Monitoring and Security: The Final Say
  2. NetFlow or sFlow: which is the open standard?

Differences between partners and resellers

Original from:

  • Partners understand solutions can be complicated.
  • Partners usually have Solution Architects that work with engineering and project management teams to make sure all parties understand all aspects of a solution.
  • Designs are created, reviewed, and explained to customers before installation begins.
  • Partners and customers are able install and test solutions together before engineers come to your location and start unboxing equipment and plugging in equipment.
  • Partners have true integration facilities that can be used to configure your entire solution.
  • Partners invest in day-two support options for their clients. They provide options for onsite engineering and/or call center support.

VMware vShield Endpoint may consume too much memory on Windows XP

The problem affects:

VMware vCenter Server 5.0 Update 1 and modules 5.0.0U1
    VMware Tools 8.6.5 build-621624
      vsepflt.sys build-443031

The problem is fixed in:

VMware Tools 8.6.5 build-731933
    vsepflt.sys build-652273

VMware vShield Endpoint Thin Agent 1.0 Update 3 Release Notes

After installation of Thin Agent drivers (vShield Endpoint 1.0.0, vShield Endpoint 1.0.0 Update 1, or vShield Endpoint 1.0.0 Update 2) on Windows XP/2003 guests, overall memory usage of the guest VM may increase substantially with all processes consuming higher amounts of memory than usual. This behavior is not observed in later versions of Windows viz. Vista and later. It results from a combination of the way the Thin Agent reads a file during its scan and Windows internal behavior.