Posts

optimizing a server

What is a server? It is a software program that manages resources over a network. The whole network could be visualized like this: Today we are writing a server. A dumb one, at first. We’ll benchmark how much load the server can handle, and then we optimize. Difference between WebSockets and an HTTP server model: WebSockets maintain a persistent, stateful connection where both client and server can continuously exchange data. In a traditional HTTP model, the client usually sends a request, receives a response, and the connection is then closed. ...

threads

Threads are the smallest unit of execution that an operating system can schedule inside a process. People (me) get confused about what threads are and what processes are. This article will talk about threading as a programming concept and not the theory behind processes and threads. Nevertheless, we’ll talk about processes and threads too. Processes and threads:

Process A
├── Thread 1
├── Thread 2
└── Thread 3

Process B
├── Thread 1
└── Thread 2

A process is an independent program instance. It has its own memory and resources. Think of it like an agent. A process can have one or more threads in it. Threads share the process memory and resources, though each thread has its own execution state and stack. How processes and threads are implemented depends on the specific programming language and operating system, so I highly recommend checking out the wiki. ...
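A minimal sketch of the "shared memory, separate stacks" idea, using Python's standard threading module. The names worker, run_workers, and shared_results are hypothetical and not taken from the article; the lock is there only so that appends from different threads do not interleave badly.

```python
import threading

# Hypothetical example: several threads in one process write to the same list.
shared_results = []          # lives in process memory, visible to every thread
lock = threading.Lock()      # guards access to the shared list


def worker(worker_id):
    # local_value is a local variable, so it lives on this thread's own stack.
    local_value = worker_id * worker_id
    # The list is shared, so every thread appends to the same object.
    with lock:
        shared_results.append(local_value)


def run_workers(count):
    """Start `count` threads, wait for all of them, return the sorted results."""
    shared_results.clear()
    threads = [threading.Thread(target=worker, args=(i,)) for i in range(count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(shared_results)


if __name__ == "__main__":
    print(run_workers(4))
```

The results are sorted before returning because the order in which threads finish is not guaranteed.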

notes on quantization

“Compute solves a lot of problems.” If we just had enough compute, a lot of the problems that we experience today would be solved: longer contexts, smarter weights and biases, etc. But right now we don’t have infinite compute. That’s the sad reality. So we optimize. Quantization is our attempt at just that. History. Reference: https://arxiv.org/abs/2103.13630 Quantization is a form of compression. It is the process of mapping a large set of continuous or high-precision values into a smaller, discrete set of values. ...
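To make the "mapping to a smaller discrete set" concrete, here is a rough sketch of one simple scheme, symmetric uniform quantization to signed 8-bit integers. This is an illustrative assumption, not necessarily the scheme the article goes on to describe; quantize and dequantize are hypothetical helper names.

```python
def quantize(values, num_bits=8):
    """Map floats to signed integers using a single shared scale factor."""
    qmax = 2 ** (num_bits - 1) - 1            # 127 for 8 bits
    largest = max(abs(v) for v in values)
    scale = largest / qmax if largest > 0 else 1.0
    # Round each value to the nearest integer step and clamp to the range.
    quantized = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate floats from the integers."""
    return [q * scale for q in quantized]


if __name__ == "__main__":
    weights = [-1.0, 0.0, 0.5, 1.0]
    q, scale = quantize(weights)
    restored = dequantize(q, scale)
    for original, approx in zip(weights, restored):
        print(original, approx, abs(original - approx))
```

Each restored value differs from the original by at most about half a step (scale / 2), which is the precision given up in exchange for storing small integers instead of floats.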

notes on PolyBlocks

Recently, at PyTorch Day India in Bangalore, I saw a talk on AI compilers. Here is the link: YouTube. [Picture from the session] I didn't know there were Indian labs working on the AI compiler problem. But it turns out there are. PolyMage Labs is an IISc lab in Bangalore working on PolyBlocks. Since AI is moving fast, there is a clear need for efficient AI compilers that can lower high-level tensor programs to IR for GPUs, TPUs, and other backends. PolyBlocks minimizes dependency on external vendor libraries like cuBLAS/cuDNN while still generating highly optimized code via compiler-driven transformations and tiling. ...

MIME-ish implementation to share images over ssh

Some days back I was studying for a computer networks exam. I came across a few protocols that were very interesting, like SMTP (Simple Mail Transfer Protocol), telnet, and SCP (Secure Copy Protocol), just to name a few. SMTP and a little bit of theory: Simple Mail Transfer Protocol is a protocol used to transfer mail between servers. It was first defined in 1981. Server-to-server relay uses port 25, while mail clients typically submit messages on port 587. ...

Python's argparse

In this article I’d like to introduce you to a rather useful Python library. It’s called argparse, it ships with the standard library, and recently I have been using it as my go-to for a couple of things. I first got to know about this library when participating in a Kaggle competition. It was pretty intimidating at first because you’re not sure what’s going on, but after this article I am hoping you’d know how to deal with code that mentions argparse. We’ll also talk about config files and how this library can be used to write config files. ...
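For a quick feel of what argparse code looks like, here is a small sketch. The script description and the flags --epochs, --lr, and --verbose are made up for illustration; only the argparse calls themselves (ArgumentParser, add_argument, parse_args) are the library's real API.

```python
import argparse


def build_parser():
    """Build a parser for a hypothetical training script."""
    parser = argparse.ArgumentParser(description="Hypothetical training script")
    # type= converts the raw string from the command line; default= is used
    # when the flag is not passed at all.
    parser.add_argument("--epochs", type=int, default=10, help="number of epochs")
    parser.add_argument("--lr", type=float, default=1e-3, help="learning rate")
    # store_true makes this a boolean switch: present means True.
    parser.add_argument("--verbose", action="store_true", help="print extra logs")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.epochs, args.lr, args.verbose)
```

Saved as a script, this could be run with something like `--epochs 3 --verbose`, and passing `-h` prints the auto-generated help text built from the help= strings.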

Thoughts on AI; updates on essays

If you didn’t know, I recently started writing more. Published a new website. Yes, the website you’re reading this at. The reason was to get good at understanding and learning. With the coming of AI, writing code has never been easier. And to be honest, I don’t think AI has any role in this. This was way before AI came. The main thing that drives the world imo is an idea. Ideas and implementations. Now the way we implement things has been changing for ages. The one example I like to think about is of compilers and assembly programmers when the C language came along. Pretty sure all of them were in the same position developers today are. But that’s another story. Implementations change, but the main thing that drives technology, science, math, and all the important stuff is, as I said, ideas. And to get better ideas, we don’t just need intellect. No. We need creativity, we need people who can understand deeply. Who can think. And I don’t use the word think lightly. Thinking was never easy. And in today’s world, it’s even harder. Which is why I started writing. Because believe it or not, writing is thinking. Every week I have this essay that I have to think about, learn, and write about. ...

Notes on torch code compilation

Before we see what torch.compile does, we should first understand PyTorch’s default mode and why we’d ever want to move away from it. PyTorch runs in eager mode by default. Think of it as PyTorch reading and executing your code op by op, as Python encounters each line. It’s immediate, flexible, and great for prototyping, but it pays a Python interpreter cost on every single operation. For production and deployment, we want to skip that cost. That’s where compilation comes in. ...

Notes on SIMD

Today we look at matrix multiplication (matmul, as we will call it in this essay). Since the last essay was on backprop, it was only logical to think about the most fundamental math operation that lets us run that algorithm. That is, matmul. Also, the numbers in this essay are going to shock you. Like really. So if you think I am making this up, you should check out my code for this essay. ...

Backpropagation: first draft

I’m assuming you understand the basic idea of neural networks. This essay focuses purely on the backpropagation algorithm itself. What is backpropagation? Backpropagation is an algorithm that computes how much each weight and bias should change to reduce the loss. It tells us not just whether parameters should go up or down, but by how much, based on their actual impact on the loss function. We use math to figure that out. ...
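As a tiny illustration of "by how much", here is a sketch for the smallest possible case: one weight, one bias, one training example, and a squared-error loss. The model pred = w * x + b and the function names are assumptions chosen for the example, not the essay's own setup. The gradients come from the chain rule, and a finite-difference check confirms the weight gradient numerically.

```python
def forward_and_gradients(w, b, x, y):
    """Return the loss and its gradients with respect to w and b."""
    pred = w * x + b                  # forward pass
    loss = (pred - y) ** 2            # squared-error loss
    dloss_dpred = 2 * (pred - y)      # derivative of the loss w.r.t. pred
    grad_w = dloss_dpred * x          # chain rule: dpred/dw = x
    grad_b = dloss_dpred              # chain rule: dpred/db = 1
    return loss, grad_w, grad_b


def numerical_grad_w(w, b, x, y, eps=1e-6):
    """Approximate dloss/dw by nudging w slightly in both directions."""
    loss_plus, _, _ = forward_and_gradients(w + eps, b, x, y)
    loss_minus, _, _ = forward_and_gradients(w - eps, b, x, y)
    return (loss_plus - loss_minus) / (2 * eps)


if __name__ == "__main__":
    loss, grad_w, grad_b = forward_and_gradients(w=0.5, b=0.0, x=2.0, y=3.0)
    print(loss, grad_w, grad_b)
    print(numerical_grad_w(0.5, 0.0, 2.0, 3.0))
```

A negative gradient for w means that increasing w would lower the loss, so a gradient descent step (w minus learning rate times grad_w) moves w upward.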