Note
See this notebook for an explanation with examples of the different types of representations in torchmil.
Spatial and sequential representation
In torchmil, bags can be represented in two ways: sequential and spatial.
In the sequential representation, bag['X'] is a tensor of shape (bag_size, dim), with one row per instance.
This representation is the most common in MIL.
When the bag has some spatial structure, the sequential representation can be coupled with a graph using an adjacency matrix or with the coordinates of the instances. These are stored as bag['adj'] (of shape (bag_size, bag_size)) and bag['coords'] (of shape (bag_size, coords_dim)), respectively.
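As an illustration of the keys described above, the following is a minimal sketch of a bag in sequential representation, written with plain PyTorch. The coordinate values and the rule used to build the adjacency matrix (connecting instances that are one grid step apart) are assumptions made for this example; torchmil datasets may build bag['adj'] differently.

```python
import torch

bag_size, dim, coords_dim = 5, 8, 2

# Hypothetical integer grid coordinates, one row per instance: (bag_size, coords_dim).
coords = torch.tensor([[0, 0], [0, 1], [1, 0], [1, 1], [2, 1]])

# Illustrative adjacency rule (an assumption, not necessarily torchmil's):
# two instances are neighbours if their Manhattan distance is exactly 1.
dist = (coords[:, None, :] - coords[None, :, :]).abs().sum(dim=-1)
adj = (dist == 1).float()  # (bag_size, bag_size)

bag = {
    "X": torch.randn(bag_size, dim),  # instance features, (bag_size, dim)
    "adj": adj,
    "coords": coords,
}

for key, value in bag.items():
    print(key, tuple(value.shape))
```

The three tensors share the same first dimension, so row i of bag['X'], bag['adj'] and bag['coords'] all refer to the same instance.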
Alternatively, the spatial representation can be used.
In this case, bag['X'] is a tensor of shape (coord1, ..., coordN, dim), where N=coords_dim is the number of dimensions of the space.
In torchmil, you can convert from one representation to the other using the functions torchmil.data.seq_to_spatial and torchmil.data.spatial_to_seq. Both functions need the coordinates of the instances in the bag, stored as bag['coords'].
Example: Whole Slide Images
Due to their large resolution, Whole Slide Images (WSIs) are usually represented as bags of patches. Each patch is an image, from which a feature vector is typically extracted. The spatial representation of a WSI has shape (height, width, feat_dim), while the sequential representation has shape (bag_size, feat_dim). The coordinates correspond to the positions of the patches in the WSI.
SETMIL is an example of a model that uses the spatial representation of a WSI.
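To make the two WSI shapes concrete, here is a small sketch in plain PyTorch, without calling torchmil. The grid size, the patch coordinates and the random features are hypothetical, and leaving empty grid cells at zero is an assumption of this example.

```python
import torch

height, width, feat_dim = 3, 4, 16

# Hypothetical (row, col) grid positions of the patches that were kept.
coords = torch.tensor([[0, 0], [0, 2], [1, 1], [2, 3]])  # (bag_size, 2)
bag_size = coords.shape[0]

# Stand-in for the feature vectors extracted from each patch.
X_seq = torch.randn(bag_size, feat_dim)  # (bag_size, feat_dim)

# Spatial representation: place each feature vector at its grid cell.
# Cells without a patch stay at zero (an assumption of this sketch).
X_spatial = torch.zeros(height, width, feat_dim)
X_spatial[coords[:, 0], coords[:, 1]] = X_seq

print(tuple(X_seq.shape), tuple(X_spatial.shape))
```

Only 4 of the 12 grid cells hold features here, which shows why the sequential representation is usually more compact for sparse slides.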
torchmil.data.seq_to_spatial(X, coords)
Computes the spatial representation of a bag given the sequential representation and the coordinates.
Given the input tensor X of shape (batch_size, bag_size, dim) and the coordinates coords of shape (batch_size, bag_size, n),
this function returns the spatial representation X_esp of shape (batch_size, coord1, coord2, ..., coordn, dim).
In this representation, the coordinates index the elements of the spatial tensor:
X_esp[batch, i1, i2, ..., in, :] = X[batch, idx, :] where (i1, i2, ..., in) = coords[batch, idx].
Parameters:
- X (Tensor) – Sequential representation of shape (batch_size, bag_size, dim).
- coords (Tensor) – Coordinates of shape (batch_size, bag_size, n).
Returns:
- X_esp (Tensor) – Spatial representation of shape (batch_size, coord1, coord2, ..., coordn, dim).
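The indexing rule stated above can be sketched in plain PyTorch as follows. This is not the torchmil implementation: the function name seq_to_spatial_sketch is hypothetical, and the choices to size the grid as the largest coordinate plus one and to fill unoccupied positions with zeros are assumptions of this example.

```python
import torch

def seq_to_spatial_sketch(X, coords):
    """Illustrative re-implementation of the indexing rule (not torchmil's code).

    X: (batch_size, bag_size, dim)
    coords: (batch_size, bag_size, n), non-negative integers
    """
    batch_size, bag_size, dim = X.shape
    n = coords.shape[-1]
    coords = coords.long()
    # Assumption: grid size along each axis is the largest coordinate + 1.
    grid = (coords.reshape(-1, n).max(dim=0).values + 1).tolist()
    # Assumption: positions without an instance are filled with zeros.
    X_esp = torch.zeros(batch_size, *grid, dim, dtype=X.dtype, device=X.device)
    # Batch index of every instance, shape (batch_size, bag_size).
    batch_idx = torch.arange(batch_size, device=X.device)[:, None].expand(batch_size, bag_size)
    # One index tensor per leading axis: (batch, i1, ..., in).
    index = (batch_idx, *coords.unbind(dim=-1))
    X_esp[index] = X
    return X_esp

X = torch.arange(12, dtype=torch.float32).reshape(2, 3, 2)
coords = torch.tensor([[[0, 0], [0, 1], [1, 1]],
                       [[1, 0], [0, 0], [1, 1]]])
X_esp = seq_to_spatial_sketch(X, coords)
print(tuple(X_esp.shape))
```

With these inputs, X_esp[0, 1, 1] holds X[0, 2], because coords[0, 2] is (1, 1).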
torchmil.data.spatial_to_seq(X_esp, coords)
Computes the sequential representation of a bag given the spatial representation and the coordinates.
Given the spatial tensor X_esp of shape (batch_size, coord1, coord2, ..., coordn, dim) and the coordinates coords of shape (batch_size, bag_size, n),
this function returns the sequential representation X_seq of shape (batch_size, bag_size, dim).
Each element of the sequential representation is read from the spatial tensor at the position given by its coordinates:
X_seq[batch, idx, :] = X_esp[batch, i1, i2, ..., in, :] where (i1, i2, ..., in) = coords[batch, idx].
Parameters:
- X_esp (Tensor) – Spatial representation of shape (batch_size, coord1, coord2, ..., coordn, dim).
- coords (Tensor) – Coordinates of shape (batch_size, bag_size, n).
Returns:
- X_seq (Tensor) – Sequential representation of shape (batch_size, bag_size, dim).
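The gather described above can likewise be sketched in plain PyTorch. As before, this is an illustration of the indexing rule and not the torchmil implementation; the function name spatial_to_seq_sketch and the example tensors are hypothetical.

```python
import torch

def spatial_to_seq_sketch(X_esp, coords):
    """Illustrative re-implementation of the indexing rule (not torchmil's code).

    X_esp: (batch_size, coord1, ..., coordn, dim)
    coords: (batch_size, bag_size, n), non-negative integers
    """
    batch_size, bag_size, _ = coords.shape
    coords = coords.long()
    # Batch index of every instance, shape (batch_size, bag_size).
    batch_idx = torch.arange(batch_size, device=X_esp.device)[:, None].expand(batch_size, bag_size)
    # Read X_esp[batch, i1, ..., in, :] for each instance.
    return X_esp[(batch_idx, *coords.unbind(dim=-1))]

X_esp = torch.randn(2, 3, 4, 5)  # batch of 2, 3x4 grid, dim 5
coords = torch.tensor([[[0, 0], [2, 3]],
                       [[1, 1], [0, 2]]])  # (batch_size=2, bag_size=2, n=2)
X_seq = spatial_to_seq_sketch(X_esp, coords)
print(tuple(X_seq.shape))
```

Here X_seq[0, 1] equals X_esp[0, 2, 3], since coords[0, 1] is (2, 3). Grid positions that no coordinate points to are simply not read.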