CUDA – rabbit hole 101

NVIDIA Jetson Nano Setup

Posted on 2024-02-01 by hofmannu

Some notes about how I use the NVIDIA Jetson Nano (4 GB).

Continue reading “NVIDIA Jetson Nano Setup”

Simple MATLAB implementations of SAFT

Posted on 2021-01-25 by hofmannu

Various implementations of a simple reconstruction procedure for acoustic resolution optoacoustic microscopy based on the virtual detector concept, including a small literature review.

Continue reading “Simple MATLAB implementations of SAFT”

Memory access patterns to shared memory in CUDA

Posted on 2020-10-21 (updated 2021-02-16) by hofmannu

CUDA provides a small amount of memory for its threads called shared memory. As the name suggests, this memory is available to all threads within a block simultaneously. We want to use this property to let the threads of a block read data from global memory into shared memory, work on it together, and afterwards write the result back into global memory, which avoids multiple accesses to global memory. Nevertheless, there are some rules one needs to respect for high performance.
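The load, share, and write-back pattern described above can be sketched roughly as follows. This is only a minimal illustration, not the code from the full post: the kernel name `blockSum`, the host helper `blockSumsOnDevice`, and the block size of 256 are assumptions made for this example, the input is assumed to be non-empty, and error checking of the CUDA runtime calls is omitted for brevity.

```cuda
#include <vector>
#include <cuda_runtime.h>

// Assumed block size for this sketch; must be a power of two for the reduction below.
constexpr int BLOCK_SIZE = 256;

// Hypothetical kernel: each block sums BLOCK_SIZE consecutive input elements.
__global__ void blockSum(const float* in, float* out, int n)
{
    // Shared memory, visible to all threads of this block.
    __shared__ float tile[BLOCK_SIZE];

    const int tid = threadIdx.x;
    const int gid = blockIdx.x * blockDim.x + tid;

    // 1) Every thread reads exactly one element from global into shared memory.
    //    Threads past the end of the input contribute zero.
    tile[tid] = (gid < n) ? in[gid] : 0.0f;
    __syncthreads(); // wait until the whole tile is loaded

    // 2) The threads work on the shared data together (tree reduction).
    for (int stride = blockDim.x / 2; stride > 0; stride /= 2) {
        if (tid < stride) {
            tile[tid] += tile[tid + stride];
        }
        __syncthreads(); // all partial sums of this step must be visible
    }

    // 3) A single thread writes the block result back to global memory.
    if (tid == 0) {
        out[blockIdx.x] = tile[0];
    }
}

// Hypothetical host helper: copies the data to the GPU, launches the kernel,
// and returns one partial sum per block. Assumes a non-empty input.
std::vector<float> blockSumsOnDevice(const std::vector<float>& hostIn)
{
    const int n = static_cast<int>(hostIn.size());
    const int blocks = (n + BLOCK_SIZE - 1) / BLOCK_SIZE;

    float* dIn = nullptr;
    float* dOut = nullptr;
    cudaMalloc(&dIn, n * sizeof(float));
    cudaMalloc(&dOut, blocks * sizeof(float));
    cudaMemcpy(dIn, hostIn.data(), n * sizeof(float), cudaMemcpyHostToDevice);

    // The launch uses BLOCK_SIZE threads per block so that it matches the tile size.
    blockSum<<<blocks, BLOCK_SIZE>>>(dIn, dOut, n);

    std::vector<float> hostOut(blocks);
    // cudaMemcpy on the default stream waits for the kernel to finish.
    cudaMemcpy(hostOut.data(), dOut, blocks * sizeof(float), cudaMemcpyDeviceToHost);

    cudaFree(dIn);
    cudaFree(dOut);
    return hostOut;
}
```

Each input element is read from global memory exactly once, all intermediate sums live in shared memory, and only one value per block is written back. Calling `blockSumsOnDevice` on a vector of 1000 ones should give four partial sums: 256, 256, 256, and 232.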

Continue reading “Memory access patterns to shared memory in CUDA”