Commit 334e4a6e authored by Alexandru Dura

Save the lecture notes

parent f4bf3438
@@ -36,3 +36,19 @@
- dynamic scheduling
- each thread picks a new chunk once done with the current chunk
- guided scheduling
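A minimal sketch of the two schedules on an unbalanced loop. The `work` function, the chunk size of 16, and the function names are illustrative assumptions, not part of the lecture. With `schedule(dynamic, 16)` a thread fetches the next 16 iterations once it finishes its current chunk; with `schedule(guided)` the chunks start large and shrink as the loop progresses.

```c
#include <assert.h>

/* Hypothetical workload: the cost grows with i, so iterations are unbalanced. */
static long work(long i)
{
    long acc = 0;
    for (long k = 0; k <= i; k++)
        acc += k % 7;
    return acc;
}

/* Sum work(i) over [0, n); threads grab chunks of 16 iterations on demand. */
long sum_dynamic(long n)
{
    assert(n >= 0);
    long total = 0;
    #pragma omp parallel for schedule(dynamic, 16) reduction(+:total)
    for (long i = 0; i < n; i++)
        total += work(i);
    return total;
}

/* Same sum; the runtime hands out chunks of decreasing size. */
long sum_guided(long n)
{
    assert(n >= 0);
    long total = 0;
    #pragma omp parallel for schedule(guided) reduction(+:total)
    for (long i = 0; i < n; i++)
        total += work(i);
    return total;
}
```

Both functions return the same value; only the assignment of iterations to threads differs. Compiled without OpenMP support, the pragmas are ignored and the loops run serially.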
* <2020-02-11 Tue>
** Benchmarking
- starting a parallel region has high cost
- use orphaned directives instead, so that one parallel region is reused across function calls
- LLVM seems to be faster than GCC, on par with ICC
- base your plots on the serial reference
- false sharing: threads share a cache line without actually sharing the data
- CC-NUMA - cache-coherent non-uniform memory access
- physical memory is initialized when a block returned by malloc is first touched
- an OpenMP construct costs on the order of 1000 cycles
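The orphaned-directive advice above can be sketched as follows. The function names `scale` and `run_steps` are hypothetical. The `omp for` in `scale` is orphaned: it sits outside the lexical extent of any parallel region and binds to whichever region is active when the function is called, so the team of threads is started once instead of once per step.

```c
#include <assert.h>

/* Orphaned worksharing loop: no parallel region is opened here.
   Called from inside a parallel region, the iterations are split across
   the team; called from serial code, the loop simply runs on one thread. */
void scale(double *a, long n, double factor)
{
    #pragma omp for
    for (long i = 0; i < n; i++)
        a[i] *= factor;
    /* The implicit barrier at the end of "omp for" keeps the steps ordered. */
}

/* Open the parallel region once and reuse it for every step. */
void run_steps(double *a, long n, int steps)
{
    assert(n >= 0 && steps >= 0);
    #pragma omp parallel
    {
        /* Every thread runs this loop and meets the same worksharing
           construct in the same order, as OpenMP requires. */
        for (int s = 0; s < steps; s++)
            scale(a, n, 2.0);
    }
}
```

The alternative of placing `#pragma omp parallel for` inside `scale` would pay the region start-up cost on every call.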
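One common way to avoid the false sharing mentioned above is to pad per-thread data so that each item sits on its own cache line. This is a sketch under assumptions: a 64-byte cache line (typical on x86, but not universal), and hypothetical names `padded_counter` and `count_padded`.

```c
#include <assert.h>

#define CACHE_LINE 64 /* assumption: cache line size in bytes */

/* One counter per slot, padded so that the value fields of neighbouring
   slots are CACHE_LINE bytes apart and never share a cache line. */
struct padded_counter {
    long value;
    char pad[CACHE_LINE - sizeof(long)];
};

/* Each slot is updated by exactly one thread, so no data is shared.
   Without the padding, neighbouring counters would still share a line
   and every update would invalidate that line in the other caches. */
void count_padded(struct padded_counter *slots, int nslots, long iters)
{
    assert(nslots >= 0 && iters >= 0);
    #pragma omp parallel for schedule(static, 1)
    for (int t = 0; t < nslots; t++)
        for (long i = 0; i < iters; i++)
            slots[t].value += 1;
}
```

An optimizing compiler may collapse the inner loop into a single addition, which hides the effect in a benchmark; measuring it usually needs a loop body the compiler cannot simplify.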
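The first-touch note above matters on CC-NUMA machines: with the default Linux policy, a page is placed on the memory node of the thread that first writes to it. A minimal sketch, assuming that policy and using the hypothetical names `alloc_first_touch` and `sum_static`, initializes the array with the same static schedule that the later computation uses, so each thread mostly works on memory close to it.

```c
#include <assert.h>
#include <stdlib.h>

/* Allocate n doubles and touch them in parallel. malloc only reserves
   address space; the pages become backed by physical memory when they are
   first written, here by the thread that will use them later. */
double *alloc_first_touch(long n, double init)
{
    assert(n > 0);
    double *a = malloc((size_t)n * sizeof *a);
    if (a == NULL)
        return NULL;
    #pragma omp parallel for schedule(static)
    for (long i = 0; i < n; i++)
        a[i] = init;
    return a;
}

/* Same static schedule as the initialization, so each thread reads the
   index range it touched first. */
double sum_static(const double *a, long n)
{
    double total = 0.0;
    #pragma omp parallel for schedule(static) reduction(+:total)
    for (long i = 0; i < n; i++)
        total += a[i];
    return total;
}
```

Initializing the array serially (or with `calloc` followed by a serial loop) would place all pages near the initializing thread, and the other threads would then access remote memory.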
** Scalasca OpenMP and MPI profiler
- tolerate a 10-20% increase in run time under the profiler, but no more; otherwise you might be measuring a completely different thing
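The overhead rule above is simple arithmetic on two wall-clock times, one from the plain run and one from the instrumented run. The helpers below are hypothetical and do not use any Scalasca API; they only show the check.

```c
#include <assert.h>
#include <stdbool.h>

/* Relative slowdown of the instrumented run, in percent of the plain run. */
double overhead_percent(double t_plain, double t_instrumented)
{
    assert(t_plain > 0.0);
    return 100.0 * (t_instrumented - t_plain) / t_plain;
}

/* True if the slowdown stays within the tolerated limit, e.g. 20.0. */
bool overhead_acceptable(double t_plain, double t_instrumented,
                         double max_percent)
{
    return overhead_percent(t_plain, t_instrumented) <= max_percent;
}
```

For example, a run that takes 10.0 s plain and 11.5 s instrumented has 15% overhead and passes a 20% limit, while 13.0 s instrumented (30%) does not.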
* <2020-02-13 Thu>
- SIMD optimizations