Deep-tech discussions with an early-stage founder in the BioxComputing domain. [Work In Progress]
Adding a new custom instruction in the RISC-V LLVM backend (the full pipeline from LLVM IR to RISC-V assembly to machine hex code). Hacking the internals of the LLVM backend (TableGen). Contributions to the following files:
IntrinsicsRISCV.td file
RISCVInstrInfo.td file
LLVM IR -> asm code -> binary encodings [Internals]
Step 1: LLVM IR → RISC-V asm code (llc tool)
Step 2: MachineInstr → MC layer → MCInst → MCStreamer → MCAsmStreamer → AsmWriter → .s file → MCObjectStreamer → MCCodeEmitter → .
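The two steps above can be traced end to end on a toy function. A sketch, with the commands as comments; the exact flags and target triples are indicative and depend on your LLVM build:

```c
/* add.c — a minimal function to push through the pipeline.
 *
 * Indicative commands (adjust triples/flags to your LLVM build):
 *   clang --target=riscv64 -S -emit-llvm add.c -o add.ll   # C -> LLVM IR
 *   llc -mtriple=riscv64 add.ll -o add.s                   # IR -> RISC-V asm (llc)
 *   llvm-mc -triple=riscv64 -filetype=obj add.s -o add.o   # asm -> binary encodings
 *   llvm-objdump -d add.o                                  # inspect the hex
 */
int add(int a, int b) {
    return a + b;
}
```

For a custom instruction, the interesting part is that the last two steps only work once the instruction has an encoding in RISCVInstrInfo.td for the MCCodeEmitter to emit.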
STAGE 1: Preprocessing
Command: gcc -E hello.c -o hello.i
What happens:
#include <stdio.h> gets replaced by the entire contents of stdio.h: thousands of lines of function declarations, type definitions, and macros. printf is not defined yet, just declared; the preprocessor doesn’t know what printf does, only that it exists.
Output: hello.i, pure C code, no # directives, just expanded text. Roughly 800 lines for this tiny program.
// ... thousands of lines from stdio.h
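What the preprocessor does is easiest to see on a hello.c of this shape (a sketch; the exact line count of hello.i varies by libc version):

```c
/* hello.c — running `gcc -E hello.c -o hello.i` replaces the #include
 * below with the full text of stdio.h and expands GREETING textually.
 * printf is only *declared* by the expanded header, not defined; its
 * definition arrives much later, at link time, from libc. */
#include <stdio.h>

#define GREETING "Hello, preprocessor!"  /* pure textual substitution */

void say_hello(void) {
    /* after -E this line reads: printf("%s\n", "Hello, preprocessor!"); */
    printf("%s\n", GREETING);
}
```

Grep hello.i for GREETING and you won’t find it: macros are gone after this stage, which is why they are invisible to every later compiler pass.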
There’s an interesting history behind this post: I gave an interview at IBM for a compiler engineering position. The discussion was great fun and revolved around everything from C to low-level compiler and OS internals, including computer-architecture-level intricacies. We delved deep into privilege escalation (user to kernel mode), the runtime stack, and eventually syscalls and interrupts/traps. The basic question was the difference between an interrupt and a syscall, which anyone can answer, but it gets interesting when you start looking at the syscall as the main character.
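The interrupt-vs-syscall distinction is easiest to see from userspace. A minimal Linux-only sketch (the `syscall(2)` wrapper is real; the helper name `raw_write` is mine), doing a write(2) "by hand":

```c
#define _GNU_SOURCE
#include <unistd.h>       /* syscall() */
#include <sys/syscall.h>  /* SYS_write */

/* A write issued directly: syscall number plus arguments, no libc
 * wrapper logic or buffering. On x86-64 this ends in a `syscall`
 * instruction, on RISC-V in an `ecall` — a synchronous, program-
 * requested trap from user mode into kernel mode. */
long raw_write(int fd, const void *buf, unsigned long len) {
    return syscall(SYS_write, fd, buf, len);
}
```

Unlike a hardware interrupt, which arrives asynchronously from a device at a point the program did not choose, this trap is requested by the program itself, at a known instruction, with arguments deliberately placed in registers.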
Understanding process-to-process communication in depth at the OS level. This again comes from one of the interview discussions at IBM for the compiler role.
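One concrete way to ground that discussion is the simplest kernel-mediated IPC primitive: a pipe shared across fork(). A sketch of mine (the helper `pipe_roundtrip` is hypothetical; the syscalls are real) — the kernel owns the pipe buffer, so every byte crosses the user/kernel boundary twice:

```c
#include <string.h>
#include <unistd.h>     /* pipe, fork, read, write, close */
#include <sys/wait.h>   /* waitpid */

/* Child writes a message into a pipe; parent reads it back.
 * Returns the number of bytes the parent read (or -1 on error). */
ssize_t pipe_roundtrip(char *out, size_t cap) {
    int fds[2];
    if (pipe(fds) != 0) return -1;      /* fds[0]=read end, fds[1]=write end */

    pid_t pid = fork();
    if (pid == 0) {                     /* child: write and exit */
        close(fds[0]);
        const char *msg = "hello from child";
        write(fds[1], msg, strlen(msg));
        close(fds[1]);
        _exit(0);
    }
    close(fds[1]);                      /* parent: read what the child sent */
    ssize_t n = read(fds[0], out, cap);
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return n;
}
```

The two processes never touch each other’s address spaces; the only shared state is the kernel-side buffer behind the two file descriptors.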
Over the past few years, through my work with Unikraft on unikernels, a compiler research internship at CERN/Berkeley Lab, hardware–software co-design at Vicharak Computers, and systems research for neuromorphic chips at IISc, I’ve been trying to find my place in the low-level world.
But I kept running into the same wall: a lack of deep understanding of C’s intricacies, the kind that every serious low-level engineer or hacker is expected to master.
When nothing goes right, you open your laptop and end up doing the things you would do anyway, whatever the fk is going on in the world; for me, that has been the love of understanding my computer. Reverse engineering grounds me and improves my focus, and it wouldn’t be wrong to call it a “mindset” rather than a “discipline”.
My remote interview experience with the Co-founder/CEO of Zettascale Computing Corp (a YCombinator-funded deep-tech startup); this time it was XPUs, not GPUs or TPUs.
Technical discussions: architecture-level questions:
→ Why are you even doing this? What is the impact? Real impact, not how the next ChatGPT will be much faster.
→ Creative models like BNNs or LGNs don’t work well on GPUs.
→ They quickly get disregarded.
→ What about training BNNs on XPUs? Will we be able to train them on XPUs, given that on GPUs they are extremely slow?
This isn’t just another roadmap but my personal low-level learning path (which will never end, I know) that I have been following while figuring out the complete end-to-end low-level stack for a novel computing architecture: a memristor-based, in-memory, brain-inspired chip.
Questions to answer: “How does userspace talk to hardware?” “Where do syscalls actually live?” “How would my neuromorphic accelerator appear to Linux?” “How does the kernel map physical devices to memory?” “How would I add a new instruction or device?”
Another interview-motivated post, this time for an NVIDIA LLVM IR engineering intern position. They wanted me to have contributed to a real open-source codebase, specifically LLVM/LLVM IR. So here I will try to explain some of the interesting hacks around LLVM IR, the experiments I did, and what I learnt. I have learned the very hard way that it doesn’t matter if you know hundreds of topics or concepts or have read dozens of technical blogs.