Big image data formats
Prerequisites
Before starting this lesson, you should be familiar with:
Learning Objectives
After completing this lesson, learners should be able to:
Understand the concepts of lazy-loading, chunking and scale pyramids
Know some file formats that implement chunking and scale pyramids
Motivation
Modern microscopy frequently generates image data in the GB to TB range. Such data cannot be opened naively. First, the data may not fit into the working memory (RAM) of your computer. Second, loading all of it into memory would take a long time. It is therefore important to know the concepts and implementations that enable swift interaction with such big image data.
Concept map
Figure
Activities
- Inspect the size of a large TIFF file on disk.
- Compare the file size to the size of your computer’s memory.
- Open this file in an image viewer.
- Observe that this takes some time.
- Observe that your memory fills up.
- Open a big image (several GB) from a TIFF file.
- Lazy-load a big image (several GB) from a TIFF file.
- Observe that TIFF chunking is plane-wise.
- Observe that slicing “at an angle” requires loading everything.
- Lazy-load from a file format with resolution pyramid and 3-D chunking (e.g. OME-Zarr, BDV/XML/HDF5)
- Observe that it opens very quickly (thanks to resolution pyramid and chunking).
- Observe that one can swiftly slice the data at all angles (thanks to 3-D chunking).
- Inspect the file format to see the resolution pyramid and chunks.
- Only load one specific resolution level.
- This is useful as most software does not support multi-resolution data for computations.
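The difference between plane-wise and 3-D chunking in the activities above can be made concrete with a little arithmetic. The following is a minimal sketch that is not tied to any file format library: the function name `voxels_read_for_slice`, the volume shape and the chunk shapes are illustrative assumptions, and it assumes that a reader always fetches whole chunks.

```python
from math import ceil, prod


def voxels_read_for_slice(shape, chunk_shape, axis):
    """Voxels fetched from disk to extract one full slice orthogonal to `axis`.

    Assumes whole chunks are always read and, for simplicity, that edge
    chunks are as large as all other chunks.
    """
    n_chunks = 1
    for dim, (size, chunk) in enumerate(zip(shape, chunk_shape)):
        if dim == axis:
            continue  # the slice is one voxel thick here: a single layer of chunks
        n_chunks *= ceil(size / chunk)
    return n_chunks * prod(chunk_shape)


# Hypothetical volume, axis order (z, y, x)
shape = (512, 2048, 2048)
plane_chunks = (1, 2048, 2048)  # plane-wise chunking, as in a TIFF stack
cube_chunks = (64, 64, 64)      # 3-D chunking (assumed chunk size)

total_voxels = prod(shape)
z_slice_planes = voxels_read_for_slice(shape, plane_chunks, axis=0)
y_slice_planes = voxels_read_for_slice(shape, plane_chunks, axis=1)
y_slice_cubes = voxels_read_for_slice(shape, cube_chunks, axis=1)
z_slice_cubes = voxels_read_for_slice(shape, cube_chunks, axis=0)

print("fraction of volume read for one y-slice")
print("  plane-wise chunks:", y_slice_planes / total_voxels)
print("  3-D chunks:       ", y_slice_cubes / total_voxels)
```

With these assumed numbers, a single y-slice costs the whole volume under plane-wise chunking but only 1/32 of it with 64×64×64 chunks. The price is that a z-slice reads more data with cubic chunks than with plane-wise chunks, which is why 3-D chunking is a trade-off in favor of access from all directions.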
ImageJ GUI
Please note that no specific images are mentioned yet, because we do not yet know where to host big image data. Until we find a solution, please use appropriate image data of your own for your course.
Open a big TIFF image
- Check big TIFF image file size on disk
- Compare to your computer’s RAM
- Open Fiji
- [ Plugins › Utilities › Monitor Memory… ]
- Load TIFF as a whole: [ File › Open… ]
- Observe that this is slow, because it needs to load the whole image.
- Observe how the memory fills up.
- Slice in x,y,z: [ Image › Stacks › Orthogonal Views ]
- Observe that this is relatively fast, because all data is in memory already.
- Lazy load: [ Plugins › Bio-Formats › Bio-Formats Importer ]
- Use virtual stack
- This enables plane-wise lazy-loading.
- Opening is faster than before, because only the requested plane is loaded.
- Changing z-slices takes some time, because it needs to lazy-load the corresponding chunk (i.e. plane).
- Slice in x,y,z: [ Image › Stacks › Orthogonal Views ]
- This takes more time now, because the orthogonal views cut through every plane, so all data needs to be loaded.
- Check the memory monitor while this is running.
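The plane-wise lazy-loading behavior seen in this activity can also be imitated outside of Fiji. The sketch below is an analogy only: it uses a raw binary file and NumPy's `np.memmap` as a stand-in for a plane-wise stored TIFF, it is not how Bio-Formats works internally, and the file name and the tiny volume shape are assumptions chosen so that it runs quickly.

```python
import os
import tempfile

import numpy as np

# Tiny stand-in volume, axis order (z, y, x); real data would be several GB
shape = (16, 64, 64)
path = os.path.join(tempfile.mkdtemp(), "volume.raw")  # hypothetical file

# Write the volume plane by plane, so that each z-plane is contiguous on disk.
# Every voxel of plane z holds the value z, which makes the result easy to check.
with open(path, "wb") as f:
    for z in range(shape[0]):
        plane = np.full(shape[1:], z, dtype=np.uint16)
        f.write(plane.tobytes())

# Lazy-load: this maps the file but does not read the pixel data yet
lazy = np.memmap(path, dtype=np.uint16, mode="r", shape=shape)

# Reading one z-plane touches one contiguous block of the file
z_plane = np.array(lazy[5])

# Reading one y-slice needs a few rows from every single plane
y_slice = np.array(lazy[:, 10, :])

print(z_plane.shape, y_slice.shape)
```

The z-plane corresponds to changing the z-slice in a virtual stack, while the y-slice corresponds to one of the orthogonal views, which has to visit all planes of the file.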
Open a big XML/HDF5 image
- Check XML/HDF5 image file size on disk.
- Compare to your computer’s RAM.
- Open Fiji
- [ Plugins › Utilities › Monitor Memory… ]
- Lazy load XML/HDF5 using [ Plugins › BigDataViewer › Open XML/HDF5 ]
- This is fast, because of lazy-loading and resolution pyramids.
- Zoom in (it will fetch higher resolution levels).
- Inspect the data at different angles [ Left button drag ]
- This is fast, because of 3D chunking.
- Lazy load XML/HDF5 using [ Plugins › Bio-Formats › Bio-Formats Importer ]
- Use virtual stack
- This enables plane-wise lazy-loading.
- Choose a resolution level.
- Inspect the data (we are now in the same situation as with a TIFF image with plane-wise chunking, see above).
- HDFView
- Inspect the .h5 file to see the resolution pyramids and chunking.
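To get a feeling for what the resolution levels in such a file contain, here is a minimal sketch of how a resolution pyramid could be computed. It is an illustration under assumptions, not the procedure used by BigDataViewer or any specific writer: the helper names `downsample_2x` and `build_pyramid` are hypothetical, averaging 2×2×2 blocks is just one possible downsampling method, and real files often use different factors per axis.

```python
import numpy as np


def downsample_2x(volume):
    """Halve each axis by averaging non-overlapping 2x2x2 blocks."""
    # Crop odd-sized axes so that the volume divides evenly into blocks
    z, y, x = (s - s % 2 for s in volume.shape)
    v = volume[:z, :y, :x].astype(np.float32)
    # Split every axis into (number of blocks, 2) and average over the 2s
    v = v.reshape(z // 2, 2, y // 2, 2, x // 2, 2)
    return v.mean(axis=(1, 3, 5))


def build_pyramid(volume, n_levels):
    """Return [full resolution, 1/2 resolution, 1/4 resolution, ...]."""
    levels = [volume]
    for _ in range(n_levels - 1):
        levels.append(downsample_2x(levels[-1]))
    return levels


# Tiny synthetic volume, axis order (z, y, x)
volume = np.arange(32 * 64 * 64, dtype=np.float32).reshape(32, 64, 64)
pyramid = build_pyramid(volume, n_levels=3)
shapes = [level.shape for level in pyramid]

for level, shape in enumerate(shapes):
    print("level", level, "shape", shape)
```

With a factor of 2 along all three axes, each level holds 1/8 of the voxels of the previous one, so all additional levels together add less than 1/7 to the size of the full-resolution data. A viewer can then open the smallest level first and fetch finer levels only for the region that is zoomed in on.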
Assessment
Fill in the blanks
- Opening data piece-wise on demand is also called ___ .
- Storing data piece-wise is also called ___ .
- To enable fast inspection of spatial data at different scales (as in Google Maps), one can use ___ .
Solution
- lazy-loading
- chunking
- resolution pyramids
Explanations
Follow-up material
Recommended follow-up modules:
Learn more: