What I gained from learning to hand-build a CPU

The professor's course on hand-building a CPU: CS61C Computer Architecture (Update: Logisim Debug tips)

To me, all code can now be mapped to a concrete hardware implementation; code exists at the physical level, not just as an abstraction.

As long as you're good at writing assembly, a bare CPU can run any program, but you have to manage everything yourself. If you want multithreading, you must manually control what the CPU is fed: thread A's code for a while, then thread B's code. That is very cumbersome, and it is what motivates the operating system.
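
To make the "feed A for a while, then B" idea concrete, here is a minimal sketch of round-robin time slicing. It is only a simulation: each character in a string stands for one instruction, and the names `toy_thread` and `run_round_robin` are made up for illustration. A real OS would also have to save and restore registers on every switch.

```c
#include <stddef.h>

/* A toy "thread": each character of `code` stands for one instruction. */
typedef struct {
    const char *code; /* instruction stream */
    size_t pc;        /* index of the next instruction to feed to the CPU */
} toy_thread;

/*
 * Feed instructions to an imaginary single-core CPU, switching to the next
 * thread after at most `quantum` instructions. The order in which
 * instructions "ran" is written to `trace`, which must be large enough to
 * hold every instruction of every thread plus a terminating '\0'.
 */
void run_round_robin(toy_thread *threads, size_t n, size_t quantum, char *trace)
{
    size_t out = 0;
    int progressed = 1;

    while (progressed) {
        progressed = 0;
        for (size_t i = 0; i < n; i++) {
            toy_thread *t = &threads[i];
            /* Run this thread until its time slice or its code runs out. */
            for (size_t q = 0; q < quantum && t->code[t->pc] != '\0'; q++) {
                trace[out++] = t->code[t->pc++];
                progressed = 1;
            }
        }
    }
    trace[out] = '\0';
}
```

With thread A holding "AAAAA", thread B holding "bbb", and a quantum of 2, the trace comes out as "AAbbAAbA": the CPU only ever sees one stream of instructions, and the interleaving is entirely the scheduler's doing.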

Why are operating systems written in C plus assembly? Because C provides pointers that can directly access specific addresses (for example, page tables). Why use assembly? Because some instructions are hardware-specific. Switching into or out of machine mode in RISC-V, for example, relies on privileged instructions and control registers, and the C language has no construct that maps to them, so that part has to be written in assembly. Ultimately, the CPU only needs machine code: convert C plus assembly into machine code, feed it to the CPU, and the operating system runs.
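
The pointer half of that answer can be sketched in a few lines. This is an illustration under assumptions, not kernel code: `fake_page_table` is an ordinary array standing in for a table at a hardware-defined physical address, and `set_pte`, `get_pte`, and `PTE_VALID` are hypothetical names. The bit layout (valid flag in bit 0, physical page number starting at bit 10) is loosely modeled on RISC-V Sv39.

```c
#include <stdint.h>

/*
 * Stand-in for a page table at a fixed physical address. In a real kernel
 * the address would come from the hardware or from a control register;
 * an ordinary array plays that role here so the sketch runs as a normal
 * user program.
 */
static uint64_t fake_page_table[512];

/* Hypothetical flag: bit 0 marks an entry as valid. */
#define PTE_VALID 0x1ULL

/* Write one entry through a raw address, which is what C pointers allow. */
void set_pte(uintptr_t table_addr, unsigned index, uint64_t phys_page)
{
    /* Integer -> pointer cast: the address is just a number to us. */
    volatile uint64_t *table = (volatile uint64_t *)table_addr;
    table[index] = (phys_page << 10) | PTE_VALID;
}

/* Read one entry back through the same raw address. */
uint64_t get_pte(uintptr_t table_addr, unsigned index)
{
    volatile uint64_t *table = (volatile uint64_t *)table_addr;
    return table[index];
}
```

Calling `set_pte((uintptr_t)fake_page_table, 3, 0x80000)` stores `(0x80000 << 10) | 1` in entry 3. The `volatile` qualifier tells the compiler not to cache or drop the access, which matters when the memory is also read by hardware.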

How are locks implemented? Two threads accessing the same variable cause conflicts, so you add a lock. But a lock is also a variable, call it variable A, so if two threads request A at the same time, won't that also cause a conflict? Yes, if A were read and written like an ordinary variable. It is not: acquiring a lock relies on atomic instructions implemented in hardware (in RISC-V, for example, amoswap or the lr/sc pair), which read and update A in one indivisible step. We design the hardware and its machine/assembly instructions, expose the lock interface to the operating system, and the OS wraps it and provides it to high-level languages.
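
As a rough illustration of how that hardware support surfaces in C, here is a minimal spinlock built on C11's `atomic_flag`. The names `spinlock`, `spin_trylock`, `spin_lock`, and `spin_unlock` are my own; the compiler is what turns `atomic_flag_test_and_set_explicit` into the target's atomic instruction. Real OS locks add much more (sleeping, fairness, interrupt handling), so treat this as a sketch.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* The lock is still a variable in memory, but it is only touched atomically. */
typedef struct {
    atomic_flag held;
} spinlock;

#define SPINLOCK_INIT { ATOMIC_FLAG_INIT }

/* Try once. Returns true if the caller now owns the lock. */
bool spin_trylock(spinlock *l)
{
    /*
     * test-and-set writes "held" and returns the previous value in one
     * indivisible step, so two threads can never both see "was free".
     */
    return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

/* Busy-wait until the lock is acquired. */
void spin_lock(spinlock *l)
{
    while (!spin_trylock(l)) {
        /* spin */
    }
}

/* Release the lock so another thread's test-and-set can succeed. */
void spin_unlock(spinlock *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}
```

Usage is the familiar pattern: declare `spinlock l = SPINLOCK_INIT;`, then wrap the shared-variable access between `spin_lock(&l)` and `spin_unlock(&l)`. A second `spin_trylock(&l)` while the lock is held returns false.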

What language is the C compiler written in? Initially it was written in assembly. (The CPU wants machine code and doesn't need to know how that machine code was produced, so a first compiler could just as well be written in another high-level language; for example, the first Java compiler was written in C.) That first compiler supports only the most basic high-level -> assembly translation, i.e., a subset of C syntax. Once it supports enough of C, you can write a new compiler in that subset of C itself, meaning the new compiler is written in C. This process is known as bootstrapping. I saw this explained in 'Operating Systems: Reality Restored'.

Why does MySQL's prepared statement processing speed things up? I haven't studied specific papers, but I can guess the principle: the client sends an SQL string containing placeholders, and the server parses it once into an internal representation (a parsed statement, not literal machine code). When the client later sends parameters, the server only needs to bind them to the placeholders in that already-parsed form instead of parsing the SQL text again: parse once, run many times.
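
The "parse once, run many times" idea can be shown with a toy that has nothing to do with MySQL's actual internals. In this sketch, which is purely my own analogy, the "statement" is a postfix arithmetic expression with `?` placeholders; `toy_prepare` parses the text once into a small plan, and `toy_execute` runs that plan with different parameters without touching the text again. All names here are hypothetical.

```c
#include <stddef.h>

#define TOY_MAX_OPS 16

typedef enum { OP_PARAM, OP_ADD, OP_MUL } op_kind;

/* The "prepared statement": the result of parsing, kept for reuse. */
typedef struct {
    op_kind ops[TOY_MAX_OPS];
    size_t n_ops;
    size_t n_params; /* number of '?' placeholders found */
} toy_plan;

/* "Prepare": parse the text once. Returns 0 on success, -1 on error. */
int toy_prepare(const char *text, toy_plan *plan)
{
    plan->n_ops = 0;
    plan->n_params = 0;
    for (const char *p = text; *p != '\0'; p++) {
        if (*p == ' ') continue;
        if (plan->n_ops == TOY_MAX_OPS) return -1;
        switch (*p) {
        case '?': plan->ops[plan->n_ops++] = OP_PARAM; plan->n_params++; break;
        case '+': plan->ops[plan->n_ops++] = OP_ADD; break;
        case '*': plan->ops[plan->n_ops++] = OP_MUL; break;
        default:  return -1; /* unknown token */
        }
    }
    return 0;
}

/*
 * "Execute": run the stored plan on a small stack machine. `params` must
 * hold plan->n_params values. No parsing happens here.
 */
int toy_execute(const toy_plan *plan, const long *params, long *result)
{
    long stack[TOY_MAX_OPS];
    size_t sp = 0, next = 0;

    for (size_t i = 0; i < plan->n_ops; i++) {
        if (plan->ops[i] == OP_PARAM) {
            stack[sp++] = params[next++];
        } else {
            if (sp < 2) return -1; /* operator without two operands */
            long b = stack[--sp];
            long a = stack[--sp];
            stack[sp++] = (plan->ops[i] == OP_ADD) ? a + b : a * b;
        }
    }
    if (sp != 1) return -1;
    *result = stack[0];
    return 0;
}
```

Preparing `"? ? + ? *"` once and executing it with `{2, 3, 4}` gives 20, and with `{1, 2, 10}` gives 30. The parsing cost is paid a single time, which is the saving I am guessing at above.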
