Stitching data

How data are stitched

We stitch data using our MATLAB package, StitchIt, which is designed specifically for BakingTray data. Raw tiles are loaded, transformed as defined by parameters in the recipe, and placed at their theoretically correct positions. StitchIt does not use the overlap regions to align tiles. If the microscope and stage axes are well aligned, good results can be obtained without elaborate stitching procedures. If you need higher accuracy, try BigStitcher.
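The idea of placing tiles at nominal positions, with no registration, can be illustrated with a minimal Python sketch. This is not StitchIt code: the function names, the square-tile assumption, and the `overlap_frac` parameter are hypothetical, and the recipe-driven transforms are left out. Where tiles overlap, the tile pasted last simply overwrites the earlier one.

```python
def tile_origins(n_rows, n_cols, tile_px, overlap_frac):
    """Return {(row, col): (y, x)} top-left pixel origins for a regular tile grid."""
    # Distance between neighbouring tile origins, i.e. the stage step in pixels.
    step = round(tile_px * (1 - overlap_frac))
    return {(r, c): (r * step, c * step)
            for r in range(n_rows) for c in range(n_cols)}


def stitch(tiles, tile_px, overlap_frac):
    """Paste square tiles onto a canvas at their nominal positions.

    `tiles` maps (row, col) to a tile_px x tile_px list of lists.
    No alignment is attempted: the overlap regions are never compared.
    """
    n_rows = 1 + max(r for r, _ in tiles)
    n_cols = 1 + max(c for _, c in tiles)
    origins = tile_origins(n_rows, n_cols, tile_px, overlap_frac)
    step = round(tile_px * (1 - overlap_frac))

    height = step * (n_rows - 1) + tile_px
    width = step * (n_cols - 1) + tile_px
    canvas = [[0] * width for _ in range(height)]

    for key, tile in tiles.items():
        y0, x0 = origins[key]
        for dy, row in enumerate(tile):
            # Later tiles overwrite earlier ones inside the overlap.
            canvas[y0 + dy][x0:x0 + tile_px] = row
    return canvas


def constant_tile(value, tile_px):
    """Helper for the example: a tile filled with a single value."""
    return [[value] * tile_px for _ in range(tile_px)]


# Example: a 2 x 2 grid of 4-pixel tiles with 25 % overlap.
tiles = {
    (0, 0): constant_tile(1, 4),
    (0, 1): constant_tile(2, 4),
    (1, 0): constant_tile(3, 4),
    (1, 1): constant_tile(4, 4),
}
stitched = stitch(tiles, tile_px=4, overlap_frac=0.25)
```

Because the positions come purely from the grid geometry, any misalignment between the microscope and stage axes shows up directly as seams in the output, which is why good alignment matters for this approach.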

StitchIt provides the following features:

  • Illumination correction for vignetting.

  • Rotation and distortion correction of tiles with user-defined parameters.

  • Pre-processing during acquisition, display of last completed section to web, and automatic initiation of stitching when acquisition completes. This feature requires a Linux PC.

  • Correction of intensity differences across different optical sections.

  • Optional removal of tile seams in stitched images.

  • Down-sampling the dataset to a single multi-page TIFF stack or MHD file.

  • Various useful tools for interacting with the raw data.

PC hardware choices

Stitching is performed by a dedicated PC, not the acquisition PC. Specs to consider:

  • Favour number of cores over clock speed.

  • StitchIt does not use CUDA so a fancy graphics card is not needed.

  • 64 GB of RAM should be enough. If you expect to regularly acquire multiple brains at resolutions finer than 2.5 by 2.5 by 5 microns, opt for 128 GB.

  • Plan for storing about 8 to 20 acquisitions at a time on the PC before they go to central storage. Remember that each acquisition will transiently occupy about 3 x the raw data size: in addition to the raw data you will have the stitched version, possibly a cropped version of this, and the compressed raw data. About 12 TB is a reasonable storage size to aim for, and you can achieve this with platter or SSD RAID. Err on the side of having less rather than more storage, since users tend to leave data on the analysis PC and forget about it rather than processing it and sending it on to backed-up central storage. You can start smaller and add storage only if you really need it.
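
The storage arithmetic above can be sketched in a few lines of Python. The factor of 3 and the 12 TB figure come from the text; the raw size of 200 GB per acquisition is purely an illustrative assumption, so substitute the size of your own datasets.

```python
import math

TRANSIENT_FACTOR = 3  # raw + stitched (+ cropped) + compressed raw, per the text


def peak_storage_gb(n_acquisitions, raw_gb_per_acquisition,
                    transient_factor=TRANSIENT_FACTOR):
    """Peak disk usage if every acquisition is at its transient maximum."""
    return n_acquisitions * raw_gb_per_acquisition * transient_factor


def acquisitions_that_fit(capacity_gb, raw_gb_per_acquisition,
                          transient_factor=TRANSIENT_FACTOR):
    """Number of acquisitions a volume can hold at transient peak size."""
    per_acquisition = raw_gb_per_acquisition * transient_factor
    return math.floor(capacity_gb / per_acquisition)


# Hypothetical example: 200 GB of raw data per acquisition.
RAW_GB = 200  # assumption for illustration only
needed_for_8 = peak_storage_gb(8, RAW_GB)
fit_in_12tb = acquisitions_that_fit(12_000, RAW_GB)
```

Under that assumed raw size, 8 acquisitions need 4,800 GB at their peak and a 12 TB volume holds 20 acquisitions, which matches the 8 to 20 range suggested above.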

Operating system choices

We suggest that the analysis PC runs Linux, because managing large datasets is easier on Linux than on Windows. In particular:

  • Multiple concurrent users can log on remotely (via command line or remote desktop clients like X2Go).

  • BTRFS software RAID is easy to manage (e.g. you can swap out failed drives remotely over the command line if you have a spare in place) and is fast enough.

  • The command line tools are generally pretty handy for getting you out of trouble if something weird has happened to the data.

  • Having remote command-line access via SSH is really helpful and tmux allows these sessions to be persistent, which makes working remotely very easy.

  • rsync is already there waiting for you to simplify transfers to central storage. Note that rsync is also available on Windows.

If you have established workflows for dealing with the above on Windows and prefer this, you can also run StitchIt on Windows. However, please note that the syncAndCrunch command, which handles processing and web display during acquisition, is Linux-only and there are no plans to port it to Windows.
