After completing this lesson, learners should be able to:
Open and save various image file formats
Understand the difference between image voxel data and metadata
Understand that converting between image file formats likely leads to loss of information
Motivation
There are numerous ways to save image data on disk. Virtually every microscope vendor has their own file format. It is thus very important to understand how to open those files and inspect their content. Moreover, some software will open only specific image file formats, and thus it is sometimes necessary to re-save the data. During such image file format conversions information can be lost; it is important to be aware of this and to avoid such information loss as much as possible.
python BioIO
# %%
# Load different image files and access various levels of metadata
# minimal conda env for this module
# conda create -n ImageFileFormats python=3.10
# conda activate ImageFileFormats
# pip install bioio bioio-tifffile bioio-lif bioio-czi bioio-ome-tiff bioio-ome-zarr notebook
# %%
# Load .tif file with minimal metadata
# - Observe that BioImage chooses the correct reader plugin
# - Observe that the return object is not the image matrix
from bioio import BioImage
image_url = 'https://github.com/NEUBIAS/training-resources/raw/master/image_data/xy_8bit__nuclei_PLK1_control.tif'
bioimage = BioImage(image_url)
print(bioimage)
print(type(bioimage))
# %%
# Print some object attributes
# - Observe that the object is 5 dimensional with most dimensions being empty
# - Observe that the dimension order is always time, channel, z, y, x, (TCZYX)
print(bioimage.dims)
print(bioimage.shape)
print(f'Dimension order is: {bioimage.dims.order}')
print(type(bioimage.dims.order))
print(f'Size of X dimension is: {bioimage.dims.X}')
# %%
# Extract image data
# - Observe that the returned numpy.array is still 5 dimensional
image_data = bioimage.data
print(type(image_data))
print(image_data)
print(image_data.shape)
# %%
# Extract specific part of image data
# - Observe that numpy.array is reduced to populated dimensions only
yx_image_data = bioimage.get_image_data('YX')
print(type(yx_image_data))
print(yx_image_data)
print(yx_image_data.shape)
# %%
# Access pixel size
import numpy as np
print(bioimage.physical_pixel_sizes)
print(f'A pixel has a length of {np.round(bioimage.physical_pixel_sizes.X,2)} microns in X dimension.')
# %%
# Access general metadata
print(type(bioimage.metadata))
print(bioimage.metadata)
# %%
# Load .tif file with extensive metadata
image_url = "https://github.com/NEUBIAS/training-resources/raw/master/image_data/xy_16bit__collagen.md.tif"
bioimage = BioImage(image_url)
print(bioimage)
print(type(bioimage))
# - Observe that the image is larger than the previous
print(bioimage.dims)
# %%
# Access image and reduce to only populated dimensions
yx_image_data = bioimage.data.squeeze()
print(type(yx_image_data))
print(yx_image_data)
print(yx_image_data.shape)
# %%
# Access pixel size
print(bioimage.physical_pixel_sizes)
print(f'A pixel has a length of {np.round(bioimage.physical_pixel_sizes.Y,2)} microns in Y dimension.')
# Access general metadata
# - Observe that metadata are more extensive than in the previous image
print(type(bioimage.metadata))
print(bioimage.metadata)
# %%
# Load .lif file
# - Observe that BioImage chooses the correct reader plugin
# - Observe that the return object has 4 different channels
# - Observe that the general metadata are an abstract element
image_url = "https://github.com/NEUBIAS/training-resources/raw/master/image_data/xy_xyc__two_images.lif"
bioimage = BioImage(image_url)
print(bioimage)
print(type(bioimage))
print(bioimage.dims)
print(bioimage.metadata)
print(type(bioimage.metadata))
# %%
# Access channel information
print(bioimage.channel_names)
# %%
# Access image data for all channels
img_4channel = bioimage.data.squeeze()
# Alternative
img_4channel = bioimage.get_image_data('CYX')
# - Observe that numpy.array shape is 3 dimensional representing channel,y,x
print(img_4channel.shape)
# Access only one channel
img_1channel = bioimage.get_image_data('YX', C=0)
# Alternative
img_1channel = img_4channel[0]
# - Observe that numpy.array shape is 2 dimensional representing y,x
print(img_1channel.shape)
# %%
# Access different images in one image file (scenes)
# - Observe that one image file can contain several scenes
# - Observe that they can be different in various aspects
print(bioimage.scenes)
print(f'Current scene: {bioimage.current_scene}')
# - Observe that the image in the current scene has 4 channels and Y/X dimensions have the size of 1024
print(bioimage.dims)
print(bioimage.physical_pixel_sizes)
# Switch to second scene
# - Observe that the image in the other scene has only one channel and Y/X dimensions are half as large as the first scene
# - Observe that the pixel sizes are doubled
bioimage.set_scene(1)
print(bioimage.dims)
print(bioimage.physical_pixel_sizes)
# %%
# Load .czi file
# file needs first to be downloaded from https://github.com/NEUBIAS/training-resources/raw/master/image_data/xyz__multiple_images.czi
# save file in the same directory as this notebook
# - Observe that BioImage chooses the correct reader plugin
# - Observe that the return object has a z dimension
bioimage = BioImage('xyz__multiple_images.czi')
print(bioimage)
print(type(bioimage))
# %%
# Little exercise in between
# Access image dimensions
print(bioimage.dims)
# Access general metadata
# - Observe that metadata are abstract
print(bioimage.metadata)
print(type(bioimage.metadata))
# Access pixel size
print(bioimage.physical_pixel_sizes)
# Access image data for all z-planes
img_3d = bioimage.data.squeeze()
# Alternative
img_3d = bioimage.get_image_data('ZYX')
# - Observe that numpy.array shape is 3 dimensional representing z,y,x
print(img_3d.shape)
# Access only one z-plane
img_2d = bioimage.get_image_data('YX', Z=0)
# Alternative
img_2d = img_3d[0]
# - Observe that numpy.array shape is 2 dimensional representing y,x
print(img_2d.shape)
# %%
# Little exercise:
# Participants should try to open one of their files with python
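The dimension handling used throughout the script can also be tried without downloading any file. The following is a minimal sketch using only NumPy; the array `tczyx` is synthetic stand-in data with hypothetical sizes (an assumption for illustration, not output from BioImage), laid out in the same TCZYX order.

```python
import numpy as np

# Synthetic stand-in for a 5D image in TCZYX order:
# 1 time point, 4 channels, 1 z-plane, 6 x 8 pixels (hypothetical sizes).
tczyx = np.arange(1 * 4 * 1 * 6 * 8, dtype=np.uint16).reshape(1, 4, 1, 6, 8)

# squeeze() drops every dimension of size 1, leaving channel, y, x.
cyx = tczyx.squeeze()
print(cyx.shape)  # (4, 6, 8)

# Explicit indexing keeps control over which axes are dropped.
yx_first_channel = tczyx[0, 0, 0, :, :]
print(yx_first_channel.shape)  # (6, 8)

# Both routes give the same pixel values for the first channel.
same = np.array_equal(cyx[0], yx_first_channel)
print(same)  # True
```

Note that `squeeze()` removes all axes of size 1, so for a single-channel image the channel axis would disappear as well. Asking for an explicit dimension order, as done with `get_image_data('CYX')` in the script, is likely the less ambiguous choice when the number of channels is not known in advance.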
Resaving images in different file formats very often leads to a loss of metadata or distortion of the pixel values. It is critical to be aware of this!
Checks to be done after each resaving
Open the resaved image in all relevant applications and check whether the pixel values and/or metadata differ from the original image
This is critical for the scientific integrity of the resaving
Open the image in a web browser and observe how the image is rendered
This can be useful to share previews with collaborators
[Image > Adjust > Brightness/Contrast] such that cells appear saturated
[File > Save As > GIF…]
Open with Fiji
Pixel values have changed
Open with a web browser
Movie plays and looks as when you saved it
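The check "did the pixel values change?" can also be scripted. Below is a minimal sketch using only NumPy. It assumes, purely for illustration, that an 8-bit export maps the chosen display range linearly to 0-255 and clips everything outside of it; the helper names `to_8bit_display` and `pixel_values_changed` are hypothetical and not part of Fiji or BioIO.

```python
import numpy as np


def to_8bit_display(image, display_min, display_max):
    """Map [display_min, display_max] linearly to 0..255 and clip the rest.

    Assumption: this mimics a display-range based 8-bit export.
    """
    scaled = (image.astype(np.float64) - display_min) / (display_max - display_min)
    return np.round(np.clip(scaled, 0.0, 1.0) * 255).astype(np.uint8)


def pixel_values_changed(original, resaved):
    """Return True if shape, data type, or any pixel value differs."""
    return (
        original.shape != resaved.shape
        or original.dtype != resaved.dtype
        or not np.array_equal(original, resaved)
    )


# Synthetic 16-bit image: dim background with one bright "cell".
original = np.full((4, 4), 200, dtype=np.uint16)
original[1:3, 1:3] = [[1000, 2000], [3000, 4000]]

# Narrow display range so that the bright pixels appear saturated.
resaved = to_8bit_display(original, display_min=0, display_max=1000)

print(resaved[1:3, 1:3])                          # all four bright pixels are 255
print(pixel_values_changed(original, resaved))    # True
print(pixel_values_changed(original, original.copy()))  # False
```

In this sketch the four distinct intensities of the bright region collapse into the single value 255, so the original measurements cannot be recovered from the resaved data, even though the rendered image looks the same as it did on screen.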
Assessment
True or false
One could use Excel’s XLSX file format for saving image data.
Solution
One could use Excel’s XLSX file format for saving image data. True: the matrix on each sheet could represent one image plane, and the first sheet could store the metadata and the mapping of each sheet (image plane) to its z, c, t coordinates, e.g. sheet 12: c = 2, z = 3, t = 1.
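The mapping between sheets and coordinates mentioned in the solution can be made concrete. The following is a minimal sketch, assuming a hypothetical layout in which the channel varies fastest, then z, then t, with all numbers counted from 1 and only the sheets that hold pixel data being counted. The functions `plane_to_czt` and `czt_to_plane` and this ordering are illustrative choices, not an established convention.

```python
def plane_to_czt(plane, n_channels, n_z):
    """Convert a 1-based image-plane number to 1-based (c, z, t).

    Assumed plane order: channel varies fastest, then z, then t.
    """
    index = plane - 1  # switch to 0-based counting
    t, rest = divmod(index, n_channels * n_z)
    z, c = divmod(rest, n_channels)
    return c + 1, z + 1, t + 1


def czt_to_plane(c, z, t, n_channels, n_z):
    """Inverse of plane_to_czt for the same assumed plane order."""
    return (t - 1) * n_channels * n_z + (z - 1) * n_channels + (c - 1) + 1


# Hypothetical stack with 3 channels and 5 z-planes per time point.
print(plane_to_czt(1, n_channels=3, n_z=5))   # (1, 1, 1)
print(plane_to_czt(5, n_channels=3, n_z=5))   # (2, 2, 1)
print(plane_to_czt(16, n_channels=3, n_z=5))  # (1, 1, 2)
print(czt_to_plane(2, 2, 1, n_channels=3, n_z=5))  # 5
```

Whatever ordering is chosen, it has to be written down in the metadata sheet, because the pixel matrices alone do not reveal which plane belongs to which channel, z-position, or time point.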
Discuss
What are the pros and cons of converting an image into another format?
What are the pros and cons of splitting metadata and image pixel data into separate files?
Do you know any good file formats for image metadata?
Solution
(A) Sometimes it is necessary to convert to another format to be able to open the image in a specific software. (B) Converting an image to another format typically loses information, e.g. because the file format that you are saving to cannot represent all the metadata of the original image file. Thus, it is in general recommended to keep the original image file. (C) Converting to a file format with good compression may save you considerable disk space.
(A) Metadata typically is much smaller than the pixel data. Thus, it can be a good idea to keep metadata in a separate file that can be readily inspected (inspecting the potentially TB-sized pixel data files can be tricky). (B) The best file formats for metadata and pixel data can be very different due to the nature of the data, thus splitting can make sense. (C) Having separate files always bears the risk that you lose one of them, e.g. you may forget to copy both to a new folder.
TXT, XML, and JSON are good formats for image metadata, because they are human-readable standard formats that can be opened with any text editor.
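To illustrate why a text format such as JSON suits metadata, here is a minimal sketch that writes a few metadata fields to a separate file and reads them back using only the Python standard library. The field names and values are invented for illustration and do not follow any particular metadata standard; the file name is hypothetical as well.

```python
import json
import tempfile
from pathlib import Path

# Invented example metadata; keys and values are illustrative only.
metadata = {
    "dimension_order": "TCZYX",
    "shape": [1, 4, 1, 1024, 1024],
    "pixel_size_um": {"y": 0.25, "x": 0.25},
    "channel_names": ["ch0", "ch1", "ch2", "ch3"],
}

with tempfile.TemporaryDirectory() as folder:
    sidecar = Path(folder) / "my_image.metadata.json"  # hypothetical name

    # indent=2 keeps the file readable in any text editor.
    sidecar.write_text(json.dumps(metadata, indent=2), encoding="utf-8")

    # Read the file back and parse it again.
    text_on_disk = sidecar.read_text(encoding="utf-8")
    restored = json.loads(text_on_disk)

print(text_on_disk)
print(restored == metadata)  # True
```

The printed text is exactly what a collaborator would see when opening the file in a text editor, which is the practical advantage of such formats over metadata buried in a large binary image file.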