I'll be talking about the -fanalyzer
static analysis option I added to
GCC:
- overview of the analyzer and its internal implementation
- what I've changed so far for GCC 12
- my plans for further development of the analyzer
("Prepared project report": 25 minutes, including questions)
Points-to analysis is a static code analysis that calculates the pointer-pointee relationship between expressions and static memory locations. The results of the points-to analysis may be used by multiple optimizations and analyses. Of particular interest, a precise points-to analysis is necessary to perform data-layout optimizations at the level of alias sets. We use the high level,...
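As a rough illustration of what a points-to analysis computes, here is a minimal flow-insensitive, inclusion-based (Andersen-style) sketch. It is not the analysis presented in the talk; the statement encoding and the function name `points_to` are assumptions made up for this example.

```python
from collections import defaultdict


def points_to(statements):
    """Flow-insensitive, inclusion-based points-to analysis (illustrative only).

    Each statement is a (kind, lhs, rhs) tuple in a hypothetical encoding:
      ("addr",  "p", "x")  models  p = &x
      ("copy",  "p", "q")  models  p = q
      ("load",  "p", "q")  models  p = *q
      ("store", "p", "q")  models  *p = q
    Returns a dict mapping each name to the set of locations it may point to.
    """
    pts = defaultdict(set)

    def add(target, values):
        # Merge values into pts[target]; report whether the set grew.
        before = len(pts[target])
        pts[target] |= set(values)
        return len(pts[target]) != before

    changed = True
    while changed:  # iterate until a fixed point is reached
        changed = False
        for kind, lhs, rhs in statements:
            if kind == "addr":
                changed |= add(lhs, {rhs})
            elif kind == "copy":
                changed |= add(lhs, pts[rhs])
            elif kind == "load":
                for target in list(pts[rhs]):
                    changed |= add(lhs, pts[target])
            elif kind == "store":
                for target in list(pts[lhs]):
                    changed |= add(target, pts[rhs])
    return dict(pts)


# p = &a; q = &b; r = p; *r = q; s = *p
example = [
    ("addr", "p", "a"),
    ("addr", "q", "b"),
    ("copy", "r", "p"),
    ("store", "r", "q"),
    ("load", "s", "p"),
]
result = points_to(example)
```

In this example `r` aliases `p`, so the store through `r` makes `a` point to `b`, and the later load through `p` gives `s` the same target.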
Bunsen is a toolkit for compact storage and analysis of DejaGNU test results. The toolkit includes a storage engine that compresses and indexes a large collection of test result logs in a Git repository, a Python library for querying and analyzing the test result collection, and a simple CGI service for accessing query results through a web browser.
In this talk I will give an in-depth look...
Abstract:
AMD has been working on adding support for GPU compute debugging to GDB. Early on, it became apparent that current DWARF would not be sufficient to support optimized SIMT/SIMD code, so we came up with extensions and generalizations that we intend to propose to DWARF 6. Although designed with GPUs in mind, the extensions are generic and can just as well be used to improve quality of...
CTF (Compact C Type Format) is a debugging format whose main (but not only) purpose is to convey type information of C program constructs. BTF is a similar format used in the Linux kernel to support the portable execution of BPF programs. Both formats share a common ancestor and show some remarkable similarities. However, they are not the same format, their application goals are different, are...
BPF is a virtual machine that resides in the Linux kernel. Initially intended for user-level packet capture and filtering, BPF is nowadays generalized to serve as a general-purpose infrastructure also for non-networking purposes. BPF programs are often written manually, directly in assembly instructions. However, people often want to write their BPF programs in C. We recently added support...
Prepared presentation
In this talk we present an overview of gprofng, a next generation profiling tool for Linux.
This profiler has its roots in the Performance Analyzer from the Oracle Developer Studio product. Gprofng is a standalone tool, however, and specifically targets Linux. It includes several tools to collect and view the performance data. Various processors from Intel, AMD, and...
This talk will discuss the methods used in constructing the recent improvement in complex divide in libgcc, where the gross error rate dropped from more than 1 per 100 tests to less than 1 per 10 million tests. The change in accuracy is platform independent, while the modest performance loss varies with platform. We also discuss flaws and likely areas for reducing the remaining small errors.
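The abstract does not spell out the method, so the following is only background on why complex division can go badly wrong: the textbook formula overflows in its intermediate products even when the true quotient is perfectly representable. The sketch contrasts it with Smith's 1962 scaling method, a well-known classical remedy. It is not the libgcc code, and the function names are made up for this example.

```python
import math


def naive_divide(a, b, c, d):
    """Compute (a + bi) / (c + di) with the textbook formula."""
    denom = c * c + d * d  # may overflow to inf for large c or d
    return ((a * c + b * d) / denom, (b * c - a * d) / denom)


def smith_divide(a, b, c, d):
    """Compute (a + bi) / (c + di) with Smith's method.

    Dividing through by the larger of |c| and |d| keeps the intermediate
    values near the magnitude of the operands. A zero divisor
    (c == d == 0) is not handled here and raises ZeroDivisionError.
    """
    if abs(c) >= abs(d):
        r = d / c
        denom = c + d * r
        return ((a + b * r) / denom, (b - a * r) / denom)
    r = c / d
    denom = c * r + d
    return ((a * r + b) / denom, (b * r - a) / denom)


big = 1e200
naive = naive_divide(big, big, big, big)   # intermediates overflow
smith = smith_divide(big, big, big, big)   # stays in range
print(naive, smith)
```

For the operands above the exact quotient is 1 + 0i; the naive formula yields NaN components because both numerator and denominator overflow, while the scaled version recovers the exact answer.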
The malloc library provided by glibc offers considerable flexibility in deciding when to use mmap for larger allocations and when to use sbrk/trim. The default settings for the decision thresholds are reasonable for many applications. Three tunables are available to adjust these settings. The limits on these settings have not been changed since 2006. Server-class systems now have much more...
The existing implementation of the OpenACC "kernels" construct in GCC
is unable to cope with many language constructs found in real HPC
codes, which generally leads to very bad performance. This talk
presents upcoming changes to the "kernels" implementation that improve
the performance significantly:
- A more unified internal representation of "kernels" and "parallel"
regions as a...
BoF to discuss topics related to concurrency and offloading work onto AMD and NVIDIA accelerators using OpenMP and OpenACC.
Of particular interest is the implementation of the missing OpenMP 5.0 and 5.1 features, including memory allocators, unified shared memory, C++ attributes, etc.
Related topics and trends can also be discussed, be it base language concurrency features, offloading without using...
The annual GNU Toolchain mindfulness and meditation session. A cordial question-and-answer session with the GCC Steering Committee and the GLIBC, GDB, and Binutils stewards will also be held.
This is a lightning talk.
One of the hurdles necessary to overcome for the M1 Darwin GCC port is
supporting the Darwin ABI specification. GCC is designed to process
argument passing the same way, regardless of whether the argument is
named or variadic. This, however, does not leave scope to accommodate the
Darwin modifications to the AArch64 ABI, which specifies that...
Recent x86 processors support "non_temporal" stores which bypass the cache when storing data. It is widely understood that normal stores to cache are appropriate when it is likely that the data may be needed before the cache is full. It is also understood that stores of large blocks of data which exceed the available cache allow the overall application to run faster when the block of stores...
Discuss topics related to the rs6000 / Power / PowerPC toolchain, including support for Power10.
The GNU C Library is used as the C library in the GNU systems and most systems with the Linux kernel. The library is primarily designed to be a portable and high performance C library. It follows all relevant standards including ISO C11 and POSIX.1-2008. It is also internationalized and has one of the most complete internationalization interfaces known.
This BoF aims to bring together...
A demonstration of debugging OpenMP/OpenACC kernels using GDB, and a quick overview of how it was achieved and what still needs to be done.
CORE-V is a family of RISC-V processor cores developed to commercially robust standards by the Open Hardware Group, a consortium of industrial and academic organizations.
In the first part of this talk we give an update on the work on the GNU tool chain for the CV32E40P, the first of the CORE-V family with custom extensions for branching, autoincrement load/store, hardware loops, multiply...
This is more of a placeholder than anything else: there's an email thread going around that was a bit inconclusive as to whether or not we should have one of these, so I figured it'd be easier to just make one.