Using C-Reduce to understand Clang compiler bugs
Suppose we have a crash while compiling a huge application from source, e.g. a Python package with native C++ code. A source file fails to compile with the followin...
We all love our CV/blog websites hosted on GitHub Pages. We also love Jupyter notebooks for revolutionizing the look and feel of daily data processing. Now imagine that...
Web-conferencing platforms are on the rise during these unprecedented times. On the other hand, the vulnerabilities of Zoom and its lack of privacy motivate us to ...
In order to quickly explore PyTorch internals, I decided to compile and install a Debug build on my local machine. The first problem was that modern Clang surpris...
The CUDA compiler does not handle infinite loops properly. For instance, the loop below will be completely eliminated from the resulting assembly, along with its ...
Recent 5.x and 6.x GCC compilers are causing NVCC to produce the following kind of weird compile errors:
GPU-equipped clusters are often managed by the SLURM job control system. Essentially, a developer logs into the frontend node via SSH, builds the application and then qu...
OpenACC enables a rapid transition of serial C/C++/Fortran into GPU-enabled parallel code. However, due to its high-level nature, OpenACC does not offer access to GPU-s...
The performance of GPUs can be exposed to applications through two principal kinds of programming interfaces: with manual parallel programming (CUDA or Open...
Multiple presentations about OpenMP 4.0 support on NVIDIA GPUs date back to 2012. There is, however, still very limited OpenMP 4.0 production-ready tools availabili...