Some notes about how I use the NVIDIA Jetson Nano (4 GB)
Various implementations of a simple reconstruction procedure for acoustic-resolution optoacoustic microscopy based on the virtual detector concept, including a small literature review.
CUDA provides a small amount of fast on-chip memory called shared memory. As the name suggests, this memory is accessible to all threads within a block. We want to use this property so that the threads of a block load data from global memory into shared memory once, work on it together, and then write the result back to global memory, which avoids repeated accesses to global memory. Nevertheless, there are some rules one needs to respect for high performance.
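The load, share, and write-back pattern can be sketched with a block-wise sum. This is only a minimal illustration under stated assumptions, not a tuned implementation: the names `kBlockSize`, `blockSum`, and `sumWithSharedMemory` are hypothetical, the block size is assumed to be a power of two, and CUDA error checking is left out for brevity.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Hypothetical block size; the reduction loop below assumes a power of two.
constexpr int kBlockSize = 256;

// Each block copies its slice of the input into shared memory with one
// global read per thread, reduces the slice inside shared memory, and
// writes a single partial sum back to global memory.
__global__ void blockSum(const float* in, float* partial, int n)
{
    __shared__ float tile[kBlockSize];

    const int tid = threadIdx.x;
    const int gid = blockIdx.x * blockDim.x + threadIdx.x;

    // Load phase: out-of-range threads contribute zero.
    tile[tid] = (gid < n) ? in[gid] : 0.0f;
    __syncthreads();  // all loads must finish before any thread reads tile

    // Tree reduction that only touches shared memory.
    for (int stride = blockDim.x / 2; stride > 0; stride >>= 1) {
        if (tid < stride) {
            tile[tid] += tile[tid + stride];
        }
        __syncthreads();  // reached by every thread, so kept outside the if
    }

    // Write-back phase: one global write per block.
    if (tid == 0) {
        partial[blockIdx.x] = tile[0];
    }
}

// Host helper (hypothetical): launches the kernel and adds up the per-block
// partial sums on the CPU. Error checking is omitted in this sketch.
float sumWithSharedMemory(const std::vector<float>& host)
{
    const int n = static_cast<int>(host.size());
    if (n == 0) {
        return 0.0f;
    }
    const int blocks = (n + kBlockSize - 1) / kBlockSize;

    float* dIn = nullptr;
    float* dPartial = nullptr;
    cudaMalloc(&dIn, n * sizeof(float));
    cudaMalloc(&dPartial, blocks * sizeof(float));
    cudaMemcpy(dIn, host.data(), n * sizeof(float), cudaMemcpyHostToDevice);

    blockSum<<<blocks, kBlockSize>>>(dIn, dPartial, n);

    // This blocking copy on the default stream also waits for the kernel.
    std::vector<float> partial(blocks);
    cudaMemcpy(partial.data(), dPartial, blocks * sizeof(float),
               cudaMemcpyDeviceToHost);

    cudaFree(dIn);
    cudaFree(dPartial);

    float total = 0.0f;
    for (float p : partial) {
        total += p;
    }
    return total;
}
```

Calling `sumWithSharedMemory` on a host vector returns the sum of its elements. Each input value is read from global memory exactly once, and every intermediate addition happens in shared memory. The two `__syncthreads()` calls are what make the sharing safe: without them, a thread could read a slot of `tile` before its neighbour has written it.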