Tag about-low-level
20 bookmarks have this tag.
Low-level stuff: CPUs, syscalls, embedded.
This offers an interesting technical analysis of systemd (in part 3). I’m not a huge fan of the social/historical parts (1-2, 4), although they offer some perspective.
Latency, throughput and port usage of x86 instructions.
This post is all about speculative compilation, or just speculation for short, in the context of the JavaScriptCore virtual machine.
Some thoughts on custom allocator interfaces with nice examples.
How OpenBSD prohibited all syscalls from unknown locations.
A nice post about SIMD algorithms using Rust’s portable SIMD as an example.
How to spin before sleeping so that it actually helps rather than harms.
Microbenchmark for futexes + spinlocks and some useful links at the bottom.
Because spin locks are so simple and fast, it seems to be a good idea to use them for short-lived critical sections. For example, if you only need to increment a couple of integers, should you really bother with complicated syscalls? In the worst case, the other thread will spin just for a couple of iterations…
Unfortunately, this logic is flawed! A thread can be preempted at any time, including during a short critical section. If it is preempted, that means that all other threads will need to spin until the original thread gets its share of CPU again. And, because a spinning thread looks like a good, busy thread to the OS, the other threads will spin until they exhaust their quanta, preventing the unlucky thread from getting back on the processor!
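To make the failure mode above concrete, here is a minimal sketch (not taken from any of the linked posts) of a lock that spins only a bounded number of times and then yields to the scheduler, so a preempted lock holder gets a chance to run. The names `SpinThenYield`, `SPIN_LIMIT` and `demo` are made up for illustration, and the limit of 100 is an arbitrary, untuned assumption. Yielding only softens the problem; a mutex that blocks in the kernel is still the safer default.

```rust
use std::cell::UnsafeCell;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;

/// Hypothetical lock for illustration: spin briefly, then yield the CPU
/// instead of burning the rest of the time slice.
pub struct SpinThenYield<T> {
    locked: AtomicBool,
    value: UnsafeCell<T>,
}

// Safety: all access to `value` is serialized by the `locked` flag.
unsafe impl<T: Send> Sync for SpinThenYield<T> {}

/// Arbitrary, untuned spin budget (an assumption, not a recommendation).
const SPIN_LIMIT: u32 = 100;

impl<T> SpinThenYield<T> {
    pub fn new(value: T) -> Self {
        Self {
            locked: AtomicBool::new(false),
            value: UnsafeCell::new(value),
        }
    }

    /// Runs `f` with exclusive access to the protected value.
    /// Note: if `f` panics, the lock stays held (kept simple on purpose).
    pub fn with_lock<R>(&self, f: impl FnOnce(&mut T) -> R) -> R {
        let mut spins = 0;
        while self
            .locked
            .compare_exchange_weak(false, true, Ordering::Acquire, Ordering::Relaxed)
            .is_err()
        {
            if spins < SPIN_LIMIT {
                // Cheap busy-wait: the holder is probably about to finish.
                std::hint::spin_loop();
                spins += 1;
            } else {
                // The holder may have been preempted: let the OS run it.
                thread::yield_now();
            }
        }
        let result = f(unsafe { &mut *self.value.get() });
        self.locked.store(false, Ordering::Release);
        result
    }
}

/// Four threads each increment a shared counter 1000 times.
pub fn demo() -> u64 {
    let lock = Arc::new(SpinThenYield::new(0u64));
    let handles: Vec<_> = (0..4)
        .map(|_| {
            let lock = Arc::clone(&lock);
            thread::spawn(move || {
                for _ in 0..1000 {
                    lock.with_lock(|n| *n += 1);
                }
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    lock.with_lock(|n| *n)
}
```

Calling `demo()` should return the total number of increments. Without the `yield_now` branch the result is the same, but waiters can burn whole time slices while the lock holder is off the CPU.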
The [board] game is about creating a small shellcode in memory by copying existing instructions and then exploiting a buffer overflow to jump into it, so that you can overwrite your opponent’s return address to force them to go to the game_over() function. There are other mechanics as well and more layers of strategy (like setting the exception handler or monkeypatching).
This blog post is one of those things that just blew up. From a tiny observation at work about odd behaviors of spinlocks, I spent months trying to find good benchmarks (still not entirely successful), writing my own spinlocks, mutexes and condition variables, and even contributing a patch to the Linux kernel. The main thing I’ll try to do is give some more informed guidance on the endless discussion of mutex vs spinlock. Besides that, I found that most mutex implementations are really good, that most spinlock implementations are pretty bad, and that the Linux scheduler is OK but far from ideal. The most popular replacement, the MuQSS scheduler, has other problems instead (the Windows scheduler is pretty good, though).
The site contains all the lectures, project materials and tools necessary for building a general-purpose computer system and a modern software hierarchy from the ground up.
What if all software suddenly disappeared? What's the minimum you'd need to bootstrap a practical system? I decided to start with a one sector (512-byte) seed and find out how far I can get.
Detailed explanation of futexes, including some possible pitfalls.
The paper’s claim:
False.
Compilers do optimize atomics, memory accesses around atomics, and utilize architecture-specific knowledge. This paper illustrates a few such optimizations, and discusses their implications.
Interestingly, none of the optimizations proposed in the paper actually work on GCC or Clang.
The first (as far as I know) description of a ring buffer based on two mmaps. I hope to make a better one sometime, but for now this’ll be the best explanation I have.
A free book about atomics and locks that also serves as a nice cheatsheet for x86_64, aarch64 and futexes.
An example of false sharing in a real-ish workload.
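For readers who have not run into false sharing before, here is a minimal sketch of the usual mitigation (it is not code from the linked post): give each thread's counter its own cache line. The names `Padded` and `count_padded` are made up for illustration, and the 64-byte line size is an assumption that holds for common x86_64 parts but not for every CPU.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::thread;

/// Hypothetical wrapper: aligning to 64 bytes also pads the size to 64,
/// so adjacent counters in a Vec never share a (64-byte) cache line.
#[repr(align(64))]
pub struct Padded(pub AtomicU64);

/// Each thread increments only its own counter; returns the sum.
pub fn count_padded(threads: usize, iters: u64) -> u64 {
    let counters: Vec<Padded> = (0..threads)
        .map(|_| Padded(AtomicU64::new(0)))
        .collect();

    // Scoped threads may borrow `counters` without an Arc.
    thread::scope(|s| {
        for c in &counters {
            s.spawn(move || {
                for _ in 0..iters {
                    c.0.fetch_add(1, Ordering::Relaxed);
                }
            });
        }
    });

    counters.iter().map(|c| c.0.load(Ordering::Relaxed)).sum()
}
```

With a plain `Vec<AtomicU64>` the result is identical, but neighbouring counters would sit on the same cache line, so every increment could invalidate that line on the other cores. How much this costs depends on the hardware and the workload, so measure before and after.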