<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom">
    <channel>
        <title>Raghunandhan VR</title>
        <link>https://raghu.app/</link>
        <description>I own a computer and I like to develop things with it.</description>
        <lastBuildDate>Mon, 06 Apr 2026 05:28:42 GMT</lastBuildDate>
        <docs>https://validator.w3.org/feed/docs/rss2.html</docs>
        <generator>Feed for Node.js</generator>
        <language>en</language>
        <image>
            <title>Raghunandhan VR</title>
            <url>https://raghu.app/opengraph-image.png</url>
            <link>https://raghu.app/</link>
        </image>
        <copyright>All rights reserved 2026, Raghunandhan VR</copyright>
        <atom:link href="https://raghu.app/rss.xml" rel="self" type="application/rss+xml"/>
        <item>
            <title><![CDATA[cognitive-load]]></title>
            <link>https://raghu.app/writings/cognitive-load</link>
            <guid isPermaLink="false">https://raghu.app/writings/cognitive-load</guid>
            <pubDate>Mon, 06 Apr 2026 05:28:42 GMT</pubDate>
            <content:encoded><![CDATA[import MermaidDiagram from '@/app/components/md/mermaid'
import { BlogViewCounter } from '@/app/components/ui/blog-view-counter'
import { TableOfContents } from '@/app/components/ui/table-of-contents'

export const metadata = {
  title: 'The Cognitive Load: Why Code Makes Your Brain Hurt',
  description: 'How to write code that doesn\'t make people want to rage quit',
  alternates: {
    canonical: '/writings/cognitive-load',
  },
  openGraph: {
    images: [
      {
        url: `/api/og?title=The+Cognitive+Load:+Why+Code+Makes+Your+Brain+Hurt`,
        width: 1200,
        height: 630,
      },
    ],
  },
};

# The Cognitive Load: Why Code Makes Your Brain Hurt    
<BlogViewCounter slug="/writings/cognitive-load" createdAt={new Date('2025-06-24')} />
<TableOfContents />

Ever spent 3 hours debugging what should've been a 5-minute fix? The code worked perfectly, but understanding it felt impossible. The structure was... complex. Very complex.

It has a name: **cognitive load**.

## What Actually Is Cognitive Load?

> Cognitive load is how much a developer needs to think in order to complete a task.

When you're reading code, you're juggling things like:
- Variable values and their states
- Control flow logic  
- Function call sequences
- Business rules and constraints

The average person can hold roughly **4 such chunks** in working memory. Once you cross that threshold, everything becomes exponentially harder.

*Picture this: You've been asked to fix a bug in a completely unfamiliar Go project. The codebase uses many design patterns, abstractions, and modern technologies. While well-intentioned, **this creates a significant cognitive load challenge.***

<MermaidDiagram diagram={`
graph TD
    A["🧠 Fresh Brain"] --> B["🧠+ One Thing"]
    B --> C["🧠++ Two Things"]  
    C --> D["🧠+++ Three Things"]
    D --> E["🤯 brain.exe has stopped working"]
    
    style A fill:transparent,stroke:#4CAF50,stroke-width:2px
    style B fill:transparent,stroke:#FFC107,stroke-width:2px
    style C fill:transparent,stroke:#FF9800,stroke-width:2px
    style D fill:transparent,stroke:#FF5722,stroke-width:2px
    style E fill:transparent,stroke:#F44336,stroke-width:3px
`} />

We should reduce cognitive load as much as possible, because **time spent understanding is time not spent solving**.

## The Two Flavors of Mental Torture

**Intrinsic load** - The unavoidable complexity of the problem you're solving. Building a payment system? Some complexity is inherent to financial transactions. Can't reduce this much.

**Extraneous load** - Unnecessary complexity created by how the information is presented. Poor variable names, confusing abstractions, showing off with "clever" code. This is what kills productivity and can be massively reduced.

---

For the rest of this post, I'll use these cognitive load indicators:  
`🧠` - Fresh working memory, zero cognitive load  
`🧠+` - One fact in our working memory  
`🧠++` - Two facts, getting harder  
`🧠+++` - Three facts, approaching limit  
`🤯` - Cognitive overload, more than 4 facts

*Our brains are way more complex than this, but this simple model works great for understanding code complexity.*

## Complex Conditionals That Melt Your Brain

Let me show you some real pain:

```go
// This makes your brain work overtime 🧠+++ then 🤯
if user.Age > 18 && 
   (user.HasLicense || user.HasPermit) && 
   (user.IsActive && !user.IsBanned) && 
   (user.Subscription == "premium" || user.TrialDays > 0) && 
   time.Since(user.LastLogin) < 30*24*time.Hour {
    // By now you've forgotten what the first condition was
    processUserAction(user)
}
```

By the time you reach `processUserAction()`, your brain is tracking:
- Age check `🧠+`
- License/permit logic `🧠++` 
- Active status AND ban status `🧠+++`
- Subscription OR trial logic `🧠++++`
- Time calculation `🤯`

**Cognitive overload achieved!** Your brain can't hold all this.

Much better approach:

```go
// Your brain can actually relax now 🧠
isAdult := user.Age > 18
canDrive := user.HasLicense || user.HasPermit  
isActiveUser := user.IsActive && !user.IsBanned
hasValidSubscription := user.Subscription == "premium" || user.TrialDays > 0
isRecentlyActive := time.Since(user.LastLogin) < 30*24*time.Hour

if isAdult && canDrive && isActiveUser && hasValidSubscription && isRecentlyActive {
    processUserAction(user) // Crystal clear what's happening
}
```

Each variable name tells a story. Your brain doesn't need to rebuild the logic every single time.

## Nested Hell That Makes You Want to Quit

```go
func processOrder(order Order) error {
    if order.User.IsValid { // 🧠+, okay valid users only
        if order.Payment.IsSuccessful { // 🧠++, payment worked
            if order.Inventory.IsAvailable { // 🧠+++, items in stock  
                if order.Shipping.AddressValid { // 🤯, what were we checking again?
                    return createOrder(order)
                }
            }
        }
    }
    return errors.New("order processing failed")
}
```

Each nested `if` adds another layer to your mental stack. By the fourth condition, you've completely lost track of what needs to be true for the order to succeed.

Early returns save your sanity:

```go
// Much easier to follow 🧠
func processOrder(order Order) error {
    if !order.User.IsValid {
        return errors.New("invalid user")
    }
    
    if !order.Payment.IsSuccessful {
        return errors.New("payment failed") 
    }
    
    if !order.Inventory.IsAvailable {
        return errors.New("out of stock")
    }
    
    if !order.Shipping.AddressValid {
        return errors.New("invalid shipping address")
    }
    
    // If we're here, everything's good! 🧠
    return createOrder(order)
}
```

Each check is independent. Your brain doesn't need to keep track of nested conditions. **Linear thinking is natural for humans** - embrace it!

## Too Many Shallow Modules (The Microservice Challenge)

Here's a story that illustrates the problem perfectly. I once consulted for a startup where a team of **5 developers** had created **17 microservices**! 🤦‍♂️

Every "simple" feature required changes across 4+ services. Want to add a new user field? You need to:
1. Update the user service `🧠+`
2. Modify the auth service `🧠++`  
3. Change the notification service `🧠+++`
4. Update the analytics service `🧠++++`
5. Pray nothing breaks `🤯`

They were 10 months behind schedule because they spent more time managing service interactions than building features.

<MermaidDiagram diagram={`
graph LR
    %% Main Event
    UserRegistration[["User Registration"]]

    %% Auth Flow
    UserRegistration --> AuthService(["Auth Service"])
    AuthService --> DatabaseCheck[("Database Check")]

    %% User Flow
    UserRegistration --> UserService(["User Service"])
    UserService --> ProfileCreation[("Profile Creation")]

    %% Email Flow
    UserRegistration --> EmailService(["Email Service"])
    EmailService --> WelcomeEmail[("Welcome Email")]

    %% Analytics Flow
    UserRegistration --> AnalyticsService(["Analytics Service"])
    AnalyticsService --> TrackEvent[("Track Event")]

    %% Notification Flow
    UserRegistration --> NotificationService(["Notification Service"])
    NotificationService --> PushNotification[("Push Notification")]

    %% Billing Flow
    UserRegistration --> BillingService(["Billing Service"])
    BillingService --> SetupBilling[("Setup Billing")]

    %% Styling
    classDef main stroke:#E91E63,stroke-width:3px,fill:#fff5f8
    classDef auth stroke:#9C27B0,stroke-width:2px,fill:#f9f0fb
    classDef user stroke:#3F51B5,stroke-width:2px,fill:#f0f3fb
    classDef email stroke:#2196F3,stroke-width:2px,fill:#f0f8ff
    classDef analytics stroke:#00BCD4,stroke-width:2px,fill:#e8fbfd
    classDef notification stroke:#009688,stroke-width:2px,fill:#e8f7f5
    classDef billing stroke:#4CAF50,stroke-width:2px,fill:#edf9ee
    classDef action fill:#ffffff,stroke:#cccccc,stroke-width:1px

    class UserRegistration main
    class AuthService auth
    class UserService user
    class EmailService email
    class AnalyticsService analytics
    class NotificationService notification
    class BillingService billing
    class DatabaseCheck,ProfileCreation,WelcomeEmail,TrackEvent,PushNotification,SetupBilling action
`} />

This is what we call **shallow modules** - the interface complexity is huge compared to the tiny functionality each provides. **You have to keep in mind each module's responsibilities AND all their interactions.**

Compare this to **deep modules** - simple interface, complex functionality hidden inside.

Think about the Unix I/O interface in Go:

```go
// Simple interface, massive complexity hidden underneath
file, err := os.Open("data.txt")     // 🧠+
defer file.Close()
buffer := make([]byte, 1024)
n, err := file.Read(buffer)          // 🧠+
```

This simple interface has **hundreds of thousands of lines** of complexity hidden under the hood - filesystems, hardware drivers, memory management, buffering. But you don't need to think about any of that! `🧠+`

> The best components provide powerful functionality yet have a simple interface.

**Reality check**: If your team is smaller than a cricket team, you probably don't need microservices yet.

## When Being "Smart" Backfires

We've all encountered code that tries to be too clever. Here's an example:

```go
// Look how clever I am! 🤯
result := users.Filter(func(u User) bool { 
    return u.Active && u.Verified 
}).Map(func(u User) UserScore { 
    return UserScore{User: u, Score: calculateComplexScore(u)} 
}).Filter(func(us UserScore) bool { 
    return us.Score > threshold 
}).Map(func(us UserScore) ProcessedUser { 
    return transformUserData(us.User) 
}).Sort(func(a, b ProcessedUser) bool { 
    return a.Priority > b.Priority 
}).Take(limit)
```

vs. the version that won't make your teammates hate you:

```go
// Your future self will thank you 🙏🧠
var activeUsers []User
for _, user := range users {
    if user.Active && user.Verified {
        activeUsers = append(activeUsers, user)
    }
}

var usersWithScores []UserScore  
for _, user := range activeUsers {
    score := calculateComplexScore(user)
    usersWithScores = append(usersWithScores, UserScore{
        User: user, 
        Score: score,
    })
}

var qualifiedUsers []ProcessedUser
for _, userScore := range usersWithScores {
    if userScore.Score > threshold {
        processed := transformUserData(userScore.User)
        qualifiedUsers = append(qualifiedUsers, processed)
    }
}

sort.Slice(qualifiedUsers, func(i, j int) bool {
    return qualifiedUsers[i].Priority > qualifiedUsers[j].Priority
})

if len(qualifiedUsers) > limit {
    qualifiedUsers = qualifiedUsers[:limit]
}
```

Yeah, it's more lines. But at 2 AM when something breaks, you'll actually understand what's happening. **Clever code is tomorrow's debugging nightmare.**

## Business Logic Lost in HTTP Status Codes

Picture this conversation:

**Backend team**: "We return 401 for expired JWT tokens, 403 for insufficient access, and 418 for banned users."

**Frontend devs**: "Wait, what was 418 again?" `🧠+`

**QA team**: "I got a 403, is that expired token or insufficient access?" `🧠++`

**New intern**: "Why 418? Isn't that for teapots?" `🧠+++`

**Everyone**: *cognitive overload* `🤯`

Engineers on the frontend need to temporarily hold this mapping in their brains:
- `401` = expired JWT token `🧠+`
- `403` = insufficient access `🧠++`  
- `418` = banned users `🧠+++`

When a QA engineer hits a `403`, they can't jump straight to testing - **they have to recreate the backend team's mental model first.**

Why hold this custom mapping in working memory? Better to abstract away from HTTP protocol details:

```go
type APIResponse struct {
    Success bool   `json:"success"`
    Code    string `json:"code"`
    Message string `json:"message"`
}

// Much clearer 🧠
response := APIResponse{
    Success: false,
    Code:    "jwt_expired", 
    Message: "Your session has expired, please log in again",
}
```

Cognitive load on frontend: `🧠` (fresh)  
Cognitive load on QA: `🧠` (fresh)

Same applies to numeric status codes in databases - **prefer self-describing strings**. We're not in the 640K RAM era anymore!
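To make the contrast concrete, here's a minimal Go sketch. The `OrderStatus` type and its values are hypothetical, invented for illustration, but the pattern applies anywhere you'd otherwise store magic numbers:

```go
package main

import "fmt"

// OrderStatus is a hypothetical self-describing string type,
// used instead of magic numbers like 1, 2, 3 in database rows.
type OrderStatus string

const (
	StatusPending   OrderStatus = "pending"
	StatusPaid      OrderStatus = "paid"
	StatusShipped   OrderStatus = "shipped"
	StatusCancelled OrderStatus = "cancelled"
)

func main() {
	status := StatusShipped
	// A log line or DB row now explains itself - no mental lookup table. 🧠
	fmt.Printf("order status: %s\n", status)
}
```

A few extra bytes per row buys you logs, dumps, and queries that anyone can read without asking the backend team what `3` means.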

## Inheritance Nightmare That Makes You Want to Scream

Here's what happened when I had to change admin user functionality:

```
AdminController → UserController → GuestController → BaseController
```

To understand what an admin can do: `🧠`
- First, check `BaseController` for basic functionality `🧠+`
- Then `GuestController` for guest-specific logic `🧠++`  
- Next `UserController` for user modifications `🧠+++`
- Finally `AdminController` for admin features `🧠++++`

But wait! There's `SuperAdminController` that extends `AdminController`. By changing `AdminController`, I might break superadmin functionality, so let me check that too: `🤯`

**Five levels deep** just to understand what happens when an admin clicks a button.

You know how family recipes work? Your grandmother's base recipe, your mother's modifications, then your own tweaks. That's manageable because each person adds something meaningful. But imagine if you had to check 5 generations of recipe modifications just to know if you need to add salt! 

Better approach - **prefer composition over inheritance**:

```go
// Much clearer structure 🧠
type AdminUser struct {
    permissions    *PermissionManager
    authentication *AuthService  
    userManagement *UserManager
}

func (a *AdminUser) BanUser(userID string) error {
    if !a.permissions.CanBanUsers() {
        return errors.New("insufficient permissions")
    }
    return a.userManagement.Ban(userID)
}
```

Now when something breaks, you know exactly where to look.

## Testing Your Code's Brain Damage Level

Want to know if your code has cognitive load problems? Here's my **30-minute rule**:

1. Grab someone new to your codebase
2. Ask them to make a "simple" change  
3. Count how many files they need to open
4. Time how long they're confused

If they're scratching their head for more than **30 minutes** on a simple change, your code has serious cognitive load issues. `🤯`

I once saw someone spend **6 hours** figuring out how to add a single validation rule because the logic was split across 12 different "clean" abstractions. The validation itself took 5 minutes to write.

## Abusing DRY (Don't Repeat Yourself) Principle

**Here's where good intentions go wrong.** We're so obsessed with not repeating code that we create tight coupling between unrelated components.

I consulted for a company where they extracted a "common validation utility" used by 8 different services. Sounds good, right? Wrong. When they needed to change validation logic for **one specific use case**, they couldn't because it would break the other 7 services. `🤯`

The team ended up spending 2 weeks coordinating changes across 8 services just to add a new field validation. 

**Rob Pike said it best**: *"A little copying is better than a little dependency."*

Sometimes it's better to have 3 similar functions than 1 "flexible" function with 15 parameters and conditional logic everywhere.

```go
// Bad: Over-abstracted 🤯
func ValidateUser(user User, mode string, strict bool, legacy bool, skipEmail bool) error {
    if mode == "login" {
        if !strict && legacy {
            // ... complex conditional logic
        }
    } else if mode == "registration" {
        if !skipEmail {
            // ... more conditionals  
        }
    }
    // ... 50 more lines of conditional madness
}

// Good: Clear and simple 🧠
func ValidateUserLogin(user User) error {
    if user.Email == "" {
        return errors.New("email required")
    }
    if user.Password == "" {
        return errors.New("password required") 
    }
    return nil
}

func ValidateUserRegistration(user User) error {
    if err := ValidateUserLogin(user); err != nil {
        return err
    }
    if len(user.Password) < 8 {
        return errors.New("password too short")
    }
    return nil
}
```

**All your dependencies are your code.** When some imported library breaks, you're the one debugging 10+ levels of stack traces at 3 AM.

## Cognitive Load in Familiar vs New Projects

Here's something tricky: **familiarity feels like simplicity, but they're different things.**

You might think your 2-year-old codebase is "simple" because you know it well. But watch a new developer try to understand it. If they're confused for more than 30 minutes straight on a simple task, your code has high cognitive load.

<MermaidDiagram diagram={`
graph TD
    A["New Developer 🧠"]
    A --> B["File 1: 🧠+"]
    B --> C["File 2: 🧠++"] 
    C --> D["File 3: 🧠+++"]
    D --> E["File 4: 🤯"]
    
    F["Experienced Dev 🧠"]
    F --> G["Same Files: 🧠"]
    
    style A fill:transparent,stroke:#FF5722,stroke-width:2px
    style B fill:transparent,stroke:#FF9800,stroke-width:2px
    style C fill:transparent,stroke:#FFC107,stroke-width:2px
    style D fill:transparent,stroke:#F44336,stroke-width:2px
    style E fill:transparent,stroke:#D32F2F,stroke-width:3px
    style F fill:transparent,stroke:#4CAF50,stroke-width:2px
    style G fill:transparent,stroke:#2E7D32,stroke-width:2px
`} />

**The more mental models someone needs to learn, the longer it takes them to be productive.**

If you can keep cognitive load low, new people can contribute meaningfully within their first few hours. I've seen this happen - it's beautiful!

## What Actually Works: Practical Rules

### 1. Use Names That Tell Stories

```go
// Bad 😵🤯
flag := checkStuff(data)
if flag {
    doThing()
}

// Good ✅🧠  
isEligibleForDiscount := checkUserEligibility(userData)
if isEligibleForDiscount {
    applyDiscount()
}
```

### 2. Extract Complex Logic Into Named Functions

```go
// Bad 😵🤯
if user.Subscription.Type == "premium" && 
   user.Subscription.ValidUntil.After(time.Now()) && 
   contains(user.Features, "advanced_analytics") {
    // do premium stuff
}

// Good ✅🧠
func canAccessAdvancedFeatures(user User) bool {
    return user.Subscription.Type == "premium" &&
           user.Subscription.ValidUntil.After(time.Now()) &&
           contains(user.Features, "advanced_analytics")
}

if canAccessAdvancedFeatures(user) {
    // do premium stuff  
}
```

### 3. Keep Related Code Together

Don't make people hunt through 10 files to understand one feature. **If it changes together, it should live together.**

### 4. Write Comments for "Why", Not "What"

```go
// Bad 😵
// Increment counter by 1
counter++

// Good ✅
// Track failed login attempts for rate limiting
failedLoginCounter++
```

### 5. Prefer Boring, Obvious Solutions

Your code will be read 10x more than it's written. Make it boring and obvious.

## Tight Coupling with Framework Magic

There's a lot of "magic" in frameworks. By relying too heavily on framework quirks, **you force all future developers to learn that magic first.** It can take months.

I've seen teams spend more time debugging framework internals than solving business problems. Keep your core logic separate from framework magic.

```go
// Business logic shouldn't be tied to framework
type UserService struct {
    db Database
}

func (s *UserService) CreateUser(userData UserData) (*User, error) {
    // Pure business logic here 🧠
    if err := s.validateUserData(userData); err != nil {
        return nil, err
    }
    return s.db.CreateUser(userData)
}

// Framework adapter  
type GinUserController struct {
    userService *UserService
}

func (c *GinUserController) CreateUser(ctx *gin.Context) {
    var userData UserData
    if err := ctx.ShouldBindJSON(&userData); err != nil {
        ctx.JSON(400, gin.H{"error": err.Error()})
        return
    }
    
    user, err := c.userService.CreateUser(userData)
    if err != nil {
        ctx.JSON(400, gin.H{"error": err.Error()})
        return
    }
    
    ctx.JSON(201, user)
}
```

**By no means am I saying avoid frameworks!** Just don't let framework magic leak into your business logic. Use frameworks like libraries, not like religion.

## A Simple Analogy 🌶️

Good code is like a **well-organized spice rack**. Each compartment has a clear purpose, everything has its place, and you don't need to search through the whole rack to find what you need during the dinner rush.

When code is poorly organized, it's like dumping all your spices in one container - technically everything's there, but good luck finding the right one when you're cooking for a crowd!

## Quick Wins to Reduce Brain Burn

1. **Use descriptive variable names** - `isEligibleForDiscount` beats `flag` any day
2. **Extract complex conditions** - Give them meaningful names  
3. **Prefer early returns** - Reduce nesting levels
4. **Keep related code together** - Don't make people hunt across files
5. **Use consistent patterns** - Once someone learns your style, they shouldn't have to relearn it
6. **Write boring, obvious code** - Clever code is tomorrow's debugging nightmare
7. **Test with new team members** - If they're confused for >30 minutes, fix it

## The Bottom Line (Don't Ignore This!)

Your code will be read **10x more times** than it's written. Every confusing pattern, every "clever" trick, every unnecessarily complex structure adds to the mental burden of everyone who comes after you (including future you).

I can't tell you how many times I've looked at my own code from six months ago and thought, "What was I thinking when I wrote this?" (Spoiler: I clearly wasn't thinking clearly)

The goal isn't to show off how smart you are - it's to solve problems without making people's brains hurt.

**Ask yourself**: "Will a developer understand this quickly, or will they need to rebuild my entire thought process?"

If it's the latter, time to refactor! 🔧

---

**Here's the thing**: The best code feels boring to read. There are no surprises, no mental gymnastics, no "aha! moments" required.

Boring code is beautiful code - it gets the job done without making your brain work overtime.]]></content:encoded>
            <author>raghunandhanvr@outlook.com (Raghunandhan VR)</author>
        </item>
        <item>
            <title><![CDATA[bloom-filter]]></title>
            <link>https://raghu.app/writings/bloom-filter</link>
            <guid isPermaLink="false">https://raghu.app/writings/bloom-filter</guid>
            <pubDate>Mon, 06 Apr 2026 05:28:42 GMT</pubDate>
            <content:encoded><![CDATA[import Image from 'next/image';
import { Suspense } from 'react';
import BloomFilterVisual from './bloom-filter-visual';
import FalsePositiveVisual from './false-positive-visual';
import MermaidDiagram from '@/app/components/md/mermaid';
import { BlogViewCounter } from '@/app/components/ui/blog-view-counter';
import PageLoader from '@/app/components/ui/page-loader';
import { TableOfContents } from '@/app/components/ui/table-of-contents';

export const metadata = {
  title: 'Bloom Filters: How Systems Check Name Availability in Milliseconds',
  description: 'A deep dive into the clever data structure behind instant username checks across the web',
  alternates: {
    canonical: '/writings/bloom-filter',
  },
  openGraph: {
    images: [
      {
        url: `/api/og?title=Bloom+Filters:+How+Systems+Check+Name+Availability+in+Milliseconds`,
        width: 1200,
        height: 630,
      },
    ],
  },
};

# Bloom Filters: How Systems Check Name Availability in Milliseconds
<BlogViewCounter slug="/writings/bloom-filter" createdAt={new Date('2025-04-11')} />
<TableOfContents />

## When "That's Already Taken" Happens Way Too Fast

2 AM on a random night, chain-drinking Red Bull to see if I would get wings, trying to register a domain for a side project. The usual suspects—my go-to domain names—were all taken (obviously). As I typed one idea after another into the registrar's search box, I started noticing something weird.

The "Sorry, this domain is taken" message appeared *instantly*. No delay. No loading icon. Nothing.

The same thing happened when I tried to find a username for a new GitHub account. And when creating a custom URL on Bitly. And *again* when setting up a Discord server.

Wait, how the heck are these completely different systems checking availability so blazingly fast? They must be querying massive databases with millions or billions of entries... right?

After wasting several hours of my life on this bizarre rabbit hole (instead of, you know, actually working on my project), I found something fascinating: most of these systems probably use a clever data structure called a **Bloom filter**.

## Bloom Filters: Fast Set Membership at Scale

At its core, a Bloom filter is a probabilistic data structure designed to answer one question with extraordinary efficiency: "Is this element in the set?" The mathematical beauty lies in its asymptotic properties:

- **Time complexity**: O(k) for both insertion and lookups, where k is the number of hash functions
- **Space complexity**: O(m), where m is the bit array size, independent of the items stored
- **Memory efficiency**: Can represent n elements in roughly n * log₂(e) * log₂(1/ε) ≈ 1.44n·log₂(1/ε) bits, where ε is the false positive rate (about 9.6 bits per element at ε = 1%)

The trade-off is brilliant: by accepting a controlled probability of false positives (but never false negatives), you gain tremendous space efficiency and constant-time operations regardless of how many items you're storing.

<MermaidDiagram diagram={`
graph TD
    A["Standard Data Structures"] --> B["Sets: O(n) space, O(1) lookup"]
    A --> C["Hash Tables: O(n) space, O(1) avg lookup"]
    A --> D["Trees: O(n) space, O(log n) lookup"]
    E["Bloom Filter: O(m) space, O(k) lookup<br>m << n for large datasets"]
    style E fill:#d4f0fd,stroke:#0077b6
`} />

Think of it as a compact fingerprint of your dataset - not enough information to reconstruct the data, but enough to definitively say "that's not in our system" with 100% confidence.

## How Fast Username Checks Actually Work

Before diving into the Bloom filter magic, let's think about the problem these systems are solving:

- Twitter needs to check against 450+ million usernames
- Domain registrars check billions of registered domains
- URL shorteners like Bitly check billions of existing short links

Doing a direct database lookup for *every* check would be painfully slow and expensive. Consider what happens when you're typing "@coolname99" letter by letter, and the UI gives you real-time feedback:

1. Is "c" taken? *Database check*
2. Is "co" taken? *Database check*
3. Is "coo" taken? *Database check*
4. ...and so on

That's essentially a DDoS attack on your own database. Not great.

<MermaidDiagram diagram={`
sequenceDiagram
    User->>+Frontend: Types "cooldev2"
    Frontend->>+Bloom Filter: Check "cooldev2"
    Note over Bloom Filter: Hash1("cooldev2") % m = pos1<br>Hash2("cooldev2") % m = pos2<br>Hash3("cooldev2") % m = pos3<br>Check if bits at positions are set
    Bloom Filter-->>-Frontend: "Definitely not in system"
    Frontend-->>-User: "Available!"
`} />

## Enter the Bloom Filter: A Probabilistic BS Detector

A Bloom filter is basically a fancy sieve for data. It answers one simple question: *"Is this thing in the set?"* with:

- "No, definitely not" (100% confidence)
- "Probably yes" (some uncertainty)

The weird part? It's almost scarily space-efficient. You can represent millions of usernames in just a megabyte or so of memory.

How's this useful? Here's a diagram of the typical flow:

<MermaidDiagram diagram={`
flowchart TD
    A[Username Check] --> B{In Bloom Filter?}
    B -->|"Definitely Not"| C[Available!]
    B -->|"Probably Yes"| D{Check Database}
    D -->|Exists| E[Already Taken]
    D -->|Doesn't Exist| F[Available - False Positive]
`} />

When you check if a username is available:

1. First, consult the super-fast Bloom filter
2. If it says "definitely not in the system," you're good to go!
3. If it says "probably in the system," then (and only then) do a proper database check

This cuts down database queries by 90%+ for new, unique usernames—which is exactly what most people are looking for.
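A minimal sketch of that gate in Go. The `Filter` interface and the map-backed stand-ins are invented for illustration; a real system would plug in an actual Bloom filter and a real database:

```go
package main

import "fmt"

// Filter is a hypothetical interface over any membership structure that
// may return false positives but never false negatives.
type Filter interface {
	MightContain(name string) bool
}

// mapFilter is a stand-in for a real Bloom filter so this sketch runs.
type mapFilter map[string]bool

func (m mapFilter) MightContain(name string) bool { return m[name] }

// IsAvailable consults the cheap filter first and only touches the slow,
// expensive store on a "probably yes" answer.
func IsAvailable(name string, f Filter, store map[string]bool) bool {
	if !f.MightContain(name) {
		return true // definitely not registered - no DB query needed
	}
	return !store[name] // rare path: confirm against the source of truth
}

func main() {
	taken := map[string]bool{"alex": true, "john": true}
	filter := mapFilter{"alex": true, "john": true}

	fmt.Println(IsAvailable("cooldev2023", filter, taken)) // true, filter short-circuits
	fmt.Println(IsAvailable("alex", filter, taken))        // false, confirmed by the store
}
```

The point of the pattern is the early return: most availability checks never reach the database at all.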

## How These Things Actually Work

Honestly, I was super skeptical at first. How can something possibly tell you with certainty that an item isn't in a set without storing the whole set?

The trick is surprisingly simple. A Bloom filter is basically just an array of bits (zeros and ones) with everything initially set to zero.

<Suspense fallback={<PageLoader />}>
<BloomFilterVisual />
</Suspense>

<MermaidDiagram diagram={`
graph LR
    subgraph "Adding 'alex' to the filter"
    A["alex"] --> H1["Hash fn #1"]
    A --> H2["Hash fn #2"] 
    A --> H3["Hash fn #3"]
    H1 --> P1["Position 1"]
    H2 --> P3["Position 3"]
    H3 --> P5["Position 5"]
    end
    subgraph "Resulting bit array"
    BF["[0, 1, 0, 1, 0, 1, 0, 0, 0, 0]"]
    style BF text-align:center
    end
    P1 -.-> BF
    P3 -.-> BF
    P5 -.-> BF
`} />

When GitHub adds a new username, let's say "devninja42":

1. They run "devninja42" through multiple hash functions
2. Each hash function outputs a position in the bit array
3. They flip those specific bits to 1

Later, when checking if "codewizard99" exists:

1. Run "codewizard99" through the same hash functions
2. Check if ALL the corresponding bits are 1
   - If any bit is 0: Username DEFINITELY doesn't exist
   - If all bits are 1: Username PROBABLY exists (do a DB check)

Here's a simplified view of adding "alex" to the filter:

<MermaidDiagram diagram={`
flowchart LR
    A["alex"] --> H1["Hash 1"]
    A --> H2["Hash 2"] 
    A --> H3["Hash 3"]
    H1 --> P1["Position 2"]
    H2 --> P2["Position 5"]
    H3 --> P3["Position 9"]
    P1 & P2 & P3 -.-> BF["[0,0,1,0,0,1,0,0,0,1]"]
`} />

The real magic is that multiple usernames can set the same bits to 1, which is why you occasionally get "false positives" – where the filter thinks something might exist when it actually doesn't.

<MermaidDiagram diagram={`
graph TD
    subgraph "Hash Functions Selection"
    HF["k independent hash functions<br>MurmurHash3, FNV, Jenkins"]
    end
    subgraph "Mathematical Properties"
    MP["False Positive Rate (p) = (1 - e^(-kn/m))^k<br>where:<br>k = number of hash functions<br>n = number of items<br>m = bit array size"]
    end
    subgraph "Optimal Configuration"
    OC["For minimal false positives:<br>k = (m/n)ln(2)<br>m = -n*ln(p)/(ln(2))²"]
    end
    HF --> MP
    MP --> OC
`} />

## Simple Username Checker

I couldn't sleep without trying this myself, so I hacked together a quick Bloom filter in Go (trust me, this is not GPT-generated code). If you're not a programmer, no worries—just focus on the results:

```go
package main

import (
	"fmt"
	"hash/fnv"
)

type BloomFilter struct {
	bits      []bool
	size      int
	numHashes int
}

func NewBloomFilter(size, numHashes int) *BloomFilter {
	return &BloomFilter{
		bits:      make([]bool, size),
		size:      size,
		numHashes: numHashes,
	}
}

func (bf *BloomFilter) hash(s string, seed int) int {
	h := fnv.New32a()
	h.Write([]byte(s))
	h.Write([]byte{byte(seed)})
	return int(h.Sum32()) % bf.size
}

func (bf *BloomFilter) Add(username string) {
	for i := 0; i < bf.numHashes; i++ {
		pos := bf.hash(username, i)
		bf.bits[pos] = true
	}
}

func (bf *BloomFilter) MightContain(username string) bool {
	for i := 0; i < bf.numHashes; i++ {
		pos := bf.hash(username, i)
		if !bf.bits[pos] {
			return false // definitely not in set
		}
	}
	return true // maybe in set
}

func main() {
	// create filter with 100 bits and 3 hash functions
	filter := NewBloomFilter(100, 3)
	
	// add some common usernames
	takenNames := []string{"admin", "user", "john", "alex", "emma"}
	for _, name := range takenNames {
		filter.Add(name)
	}
	
	// check various usernames
	testNames := []string{"john", "cooldev2023", "alex", "hacker99"}
	
	for _, name := range testNames {
		if filter.MightContain(name) {
			fmt.Printf("'%s' might be taken. Checking database...\n", name)
			
			// simulate DB check (just check our list)
			found := false
			for _, taken := range takenNames {
				if taken == name {
					found = true
					break
				}
			}
			
			if found {
				fmt.Printf("  Database confirms: '%s' is taken\n", name)
			} else {
				fmt.Printf("  False alarm! '%s' is actually available\n", name)
			}
		} else {
			fmt.Printf("'%s' is definitely available!\n", name)
		}
	}
}
```

And when I ran this with just a tiny 100-bit filter:

```
'john' might be taken. Checking database...
  Database confirms: 'john' is taken
'cooldev2023' is definitely available!
'alex' might be taken. Checking database...
  Database confirms: 'alex' is taken
'hacker99' might be taken. Checking database...
  False alarm! 'hacker99' is actually available
```

See that last result? The filter thought "hacker99" might exist (a false positive), but the database check confirmed it's actually available. That's the small trade-off for all this efficiency.

But here's the wild part - with just 100 bits, I was already getting decent results. Scale that up to a few megabytes, and you can efficiently filter billions of items with a false positive rate under 1%.

## False Positives (Or: Why I Thought My Cool Username Was Taken)

If you've ever been surprised that a seemingly random username was already taken, it might actually have been a false positive!

<Suspense fallback={<PageLoader />}>
<FalsePositiveVisual />
</Suspense>

These happen when different usernames end up setting the same bits to 1. It's like a hash collision but visual - all your hash functions just happen to hit positions that were already set by other usernames.

For URL shorteners and domain registrars, these false positives mean unnecessary database checks, but that's WAY better than checking the database for every single lookup.

## Tuning and Optimizing

There's some math involved in making these filters efficient (sorry). The false positive rate depends on:

- How big your bit array is (m)
- How many hash functions you use (k)
- How many items you're storing (n)

<MermaidDiagram diagram={`
graph TD
    A["Optimal hashes (k) = (m/n) * ln(2)"]
    B["False positive rate ≈ (1 - e^(-kn/m))^k"]
`} />

For my side project, I ended up using a Bloom filter with:
- 1 million bits (about 125KB)
- 7 hash functions
- False positive rate around 0.01 (1%)

This could handle about 100,000 usernames with just a 1% chance of unnecessary database checks. Scaling this up is just a matter of increasing the bit array size.
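
To see where those numbers come from, here's a quick back-of-the-envelope sketch in Go that plugs n = 100,000 and p = 0.01 into the formulas above (`optimalParams` is just my own helper name):

```go
package main

import (
	"fmt"
	"math"
)

// optimalParams returns the bit array size (m) and hash count (k)
// for n expected items and target false positive rate p, using
// m = -n*ln(p)/(ln 2)^2 and k = (m/n)*ln 2.
func optimalParams(n int, p float64) (m, k int) {
	mf := -float64(n) * math.Log(p) / (math.Ln2 * math.Ln2)
	kf := mf / float64(n) * math.Ln2
	return int(math.Ceil(mf)), int(math.Round(kf))
}

func main() {
	m, k := optimalParams(100000, 0.01)
	fmt.Printf("bits: %d (~%d KB), hashes: %d\n", m, m/8/1024, k)
	// prints: bits: 958506 (~117 KB), hashes: 7
}
```

Rounding that up to an even 1 million bits is where the ~125KB figure comes from.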

## How the Big Players Do It

Companies like Twitter, GitHub, or Bitly probably use industrial-strength implementations. Many use Redis with the RedisBloom module, which basically handles all the complexity for you:

```go
package main

import (
	"fmt"

	redisbloom "github.com/RedisBloom/redisbloom-go"
)

func main() {
	// connect to Redis (the RedisBloom module must be loaded)
	client := redisbloom.NewClient("localhost:6379", "username-filter", nil)

	// create filter with 0.01 error rate, 100K capacity
	if err := client.Reserve("username-filter", 0.01, 100000); err != nil {
		fmt.Println("reserve:", err) // e.g. the filter already exists
	}

	// add username
	client.Add("username-filter", "coolcoder123")

	// check if exists
	exists, err := client.Exists("username-filter", "webdev99")
	if err == nil && exists {
		// maybe exists, check database
	} else {
		// definitely available
	}
}
```

This setup can handle millions of users across distributed systems while keeping lookups blazing fast.

## The Weird Case of Deleted Usernames

Here's a quirk: standard Bloom filters don't let you delete items. Once a bit is set to 1, it stays that way.

For username systems where people might delete accounts and free up names, there's a variant called a "Counting Bloom Filter" that uses counters instead of just bits:

<MermaidDiagram diagram={`
graph TD
    A["Standard: [0,1,0,1,1,0,1]"]
    B["Counting: [0,2,0,1,3,0,1]"]
    C["Add: increment counters"]
    D["Remove: decrement counters"]
    C --> B
    D --> B
`} />

Each position tracks how many items have set that bit, so you can decrement when removing items. This uses more memory but allows for username recycling.

The implementation details get more interesting when we look at the time complexity. For both standard and counting Bloom filters:
- Add operation: O(k) time complexity, where k is the number of hash functions
- Query operation: O(k) time complexity
- Space complexity: O(m) where m is the bit array size

These operations are also parallelizable since each hash function can be computed independently, which is crucial for high-throughput systems.
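
Here's a toy counting filter in Go, reusing the same FNV hashing trick from my earlier example (my own sketch, not production code):

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// CountingBloomFilter swaps each bit for a small counter so that
// items can be removed by decrementing.
type CountingBloomFilter struct {
	counters  []uint8
	size      int
	numHashes int
}

func NewCountingBloomFilter(size, numHashes int) *CountingBloomFilter {
	return &CountingBloomFilter{
		counters:  make([]uint8, size),
		size:      size,
		numHashes: numHashes,
	}
}

func (cbf *CountingBloomFilter) hash(s string, seed int) int {
	h := fnv.New32a()
	h.Write([]byte(s))
	h.Write([]byte{byte(seed)})
	return int(h.Sum32()) % cbf.size
}

func (cbf *CountingBloomFilter) Add(item string) {
	for i := 0; i < cbf.numHashes; i++ {
		cbf.counters[cbf.hash(item, i)]++
	}
}

func (cbf *CountingBloomFilter) Remove(item string) {
	for i := 0; i < cbf.numHashes; i++ {
		if pos := cbf.hash(item, i); cbf.counters[pos] > 0 {
			cbf.counters[pos]--
		}
	}
}

func (cbf *CountingBloomFilter) MightContain(item string) bool {
	for i := 0; i < cbf.numHashes; i++ {
		if cbf.counters[cbf.hash(item, i)] == 0 {
			return false
		}
	}
	return true
}

func main() {
	cbf := NewCountingBloomFilter(100, 3)
	cbf.Add("alex")
	fmt.Println(cbf.MightContain("alex")) // true
	cbf.Remove("alex")
	fmt.Println(cbf.MightContain("alex")) // false: the counters are back to zero
}
```

Note the uint8 counters: in practice 4-bit counters are common, and you need to guard against overflow when many items land on the same position.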

## How Tech Giants Actually Implement This

Major platforms like Twitter, GitHub, and LinkedIn don't just use basic Bloom filters - they integrate them into sophisticated distributed systems:

### Twitter's Approach
Twitter uses a combination of in-memory Bloom filters with persistent storage. Their implementation reportedly:
- Distributes filters across their microservice architecture
- Uses multiple layers of filters with varying false positive rates
- Implements real-time synchronization for username changes
- Employs custom hash functions optimized for string usernames

### LinkedIn's Feed Ranking Implementation
LinkedIn has publicly discussed using Bloom filters in their feed ranking algorithm:
```java
// simplified pseudocode based on LinkedIn's approach
class DistributedBloomFilter {
    private BloomFilter<String> localFilter;
    private DistributedCache cache;
    
    public DistributedBloomFilter(int expectedItems, double falsePositiveRate) {
        // optimal bit size and hash count from the standard formulas
        // (shown for illustration; Guava's BloomFilter.create derives these itself)
        int optimalBits = (int) (-expectedItems * Math.log(falsePositiveRate) / (Math.log(2) * Math.log(2)));
        int optimalHashes = (int) (optimalBits / expectedItems * Math.log(2));
        
        this.localFilter = BloomFilter.create(
            Funnels.stringFunnel(Charset.defaultCharset()),
            expectedItems,
            falsePositiveRate);
            
        // initialize distributed cache connection
        this.cache = new RedisDistributedCache();
    }
    
    public void add(String item) {
        // add locally
        localFilter.put(item);
        
        // propagate to distributed cache
        cache.setBits(getHashPositions(item));
    }
    
    public boolean mightContain(String item) {
        // fast path - check local filter first
        if (!localFilter.mightContain(item)) {
            return false;
        }
        
        // double-check with distributed cache
        return cache.checkBits(getHashPositions(item));
    }
}
```

### Google's BigTable Implementation
Google uses Bloom filters extensively to reduce disk I/O in BigTable. Their published papers reveal:
- They use Bloom filters to check if an SSTable might contain a specific row/column pair
- Each SSTable has its own Bloom filter that stays in memory
- Their implementation reportedly reduces read operations by up to 50%

<MermaidDiagram diagram={`
graph TD
    A[Client Query] --> B{In Memory BF?}
    B -->|No| C[Return Not Found]
    B -->|Maybe| D{Check Block Cache}
    D -->|Found| F[Return Data]
    D -->|Not Found| E{Check SSTable}
    E -->|Found| F
    E -->|Not Found| C
`} />

## The Instagram Problem: Scaling to Billions

For massive platforms like Instagram with a billion+ users, a single Bloom filter gets unwieldy. Enter scalable Bloom filters, which are basically chains of Bloom filters that grow over time:

<MermaidDiagram diagram={`
graph TD
    A[Check Username] --> B{In Filter 1?}
    B -->|Yes| C[Maybe Exists]
    B -->|No| D{In Filter 2?}
    D -->|Yes| C
    D -->|No| E{In Filter 3?}
    E -->|Yes| C
    E -->|No| F[Definitely Available]
`} />

Each filter has a fixed capacity. When one fills up, a new one is created for new usernames. Checks go through all filters - if any says "maybe," do a database check.
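
A toy version of that chain in Go might look like this (tiny capacities so the growth is visible; real implementations also tighten the false positive rate of each new filter):

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// fixedFilter is a minimal Bloom filter with a fill limit.
type fixedFilter struct {
	bits      []bool
	numHashes int
	count     int
	capacity  int
}

func newFixedFilter(size, numHashes, capacity int) *fixedFilter {
	return &fixedFilter{bits: make([]bool, size), numHashes: numHashes, capacity: capacity}
}

func (f *fixedFilter) positions(s string) []int {
	out := make([]int, f.numHashes)
	for i := range out {
		h := fnv.New32a()
		h.Write([]byte(s))
		h.Write([]byte{byte(i)})
		out[i] = int(h.Sum32()) % len(f.bits)
	}
	return out
}

func (f *fixedFilter) add(s string) {
	for _, p := range f.positions(s) {
		f.bits[p] = true
	}
	f.count++
}

func (f *fixedFilter) mightContain(s string) bool {
	for _, p := range f.positions(s) {
		if !f.bits[p] {
			return false
		}
	}
	return true
}

// ScalableFilter appends a fresh filter whenever the newest one
// hits capacity; lookups consult every filter in the chain.
type ScalableFilter struct {
	filters []*fixedFilter
}

func (sf *ScalableFilter) Add(s string) {
	if n := len(sf.filters); n == 0 || sf.filters[n-1].count >= sf.filters[n-1].capacity {
		sf.filters = append(sf.filters, newFixedFilter(1024, 3, 2)) // capacity 2 just for the demo
	}
	sf.filters[len(sf.filters)-1].add(s)
}

func (sf *ScalableFilter) MightContain(s string) bool {
	for _, f := range sf.filters {
		if f.mightContain(s) {
			return true
		}
	}
	return false
}

func main() {
	sf := &ScalableFilter{}
	for _, name := range []string{"alice", "bob", "carol"} {
		sf.Add(name) // the third add spills into a second filter
	}
	fmt.Println(len(sf.filters))              // 2
	fmt.Println(sf.MightContain("alice"))     // true
	fmt.Println(sf.MightContain("zz_unseen")) // false, barring a rare false positive
}
```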

The mathematical formula for determining the optimal parameters in a production environment is:

<MermaidDiagram diagram={`
graph TD
    A["Optimal bit array size (m) = -n * ln(p) / (ln(2)²)"]
    B["Optimal hash count (k) = (m/n) * ln(2)"]
    C["Where: n = expected items, p = acceptable false positive rate"]
`} />

For Instagram's scale (1B+ users), a Bloom filter with a 1% false positive rate would need approximately:
- 9.6 billion bits (~1.2GB of memory)
- 7 hash functions

## Real-World Performance: Mind = Blown

Just to see how efficient this really is, I tested with a simulated dataset of 10 million usernames:

<MermaidDiagram diagram={`
graph LR
    A[DB Only: 15ms per check]
    B[Bloom+DB: 0.1ms for available names]
    C[Memory: 1.2MB vs 500MB+ for full list]
`} />

The results were shocking:
- Reduced DB queries by over 95%
- Available usernames got sub-millisecond responses
- Used less than 1/400th of the memory

No wonder these systems feel instantaneous when telling me my clever domain name ideas are already taken!

## Wanna Give It a Try?

Beyond usernames and domains, Bloom filters are useful anywhere you need to quickly check set membership:

- Browser spell checkers (is this word likely in the dictionary?)
- Cache systems (is this item likely in the cache?)
- Network routers (have I seen this packet before?)
- Cryptocurrency (is this transaction in the mempool?)
- Ask LLMs to give more project ideas

They're one of those magical data structures that seem too good to be true but actually work incredibly well for the right problems.

## So Next Time...

The next time you're frantically typing usernames into Twitter or domain ideas into GoDaddy at 2 AM, and you get that instant "Sorry, that's taken" message, now you know there's probably a Bloom filter working behind the scenes - not a direct database query.

Kind of makes you appreciate the clever engineering that goes into making these everyday experiences feel so seamless, doesn't it?

## Wanna Dig Deeper?
- [Bloom Filters by Example](https://llimllib.github.io/bloomfilter-tutorial/) - With interactive demos
- [RedisBloom](https://redis.io/docs/stack/bloom/) - For production use
- [bloom v4](https://github.com/bits-and-blooms/bloom) - A solid Go library
- [The Math Behind It All](https://en.wikipedia.org/wiki/Bloom_filter#Probability_of_false_positives) - For the brave]]></content:encoded>
            <author>raghunandhanvr@outlook.com (Raghunandhan VR)</author>
        </item>
        <item>
            <title><![CDATA[db]]></title>
            <link>https://raghu.app/writings/db</link>
            <guid isPermaLink="false">https://raghu.app/writings/db</guid>
            <pubDate>Mon, 06 Apr 2026 05:28:42 GMT</pubDate>
            <content:encoded><![CDATA[import { BlogViewCounter } from '@/app/components/ui/blog-view-counter'
import { TableOfContents } from '@/app/components/ui/table-of-contents'

export const metadata = {
  title: 'Other DB Options',
  description: 'A look at some interesting managed database products that offer unique features and cost-saving benefits.',
  alternates: {
    canonical: '/writings/db',
  },
  openGraph: {
    images: [
      {
        url: `/api/og?title=Other+DB+Options`,
        width: 1200,
        height: 630,
      },
    ],
  },
};

# Exploring Cost-Effective Managed Database Options Beyond the Big Three
<BlogViewCounter slug="/writings/db" createdAt={new Date('2024-08-10')} />
<TableOfContents />

## PlanetScale (MySQL)

[PlanetScale](https://planetscale.com/) uses Vitess, developed at YouTube, for advanced sharding and scaling. Features like database branching (similar to Git branching) and GitHub's gh-ost for schema migrations make it a solid option for MySQL. PlanetScale is my personal choice, and they also share great Vitess knowledge on their [YouTube channel](https://www.youtube.com/c/PlanetScale).

## Neon.tech (Serverless PostgreSQL)

[Neon](https://neon.tech/) offers a serverless PostgreSQL that automatically scales up or down with zero downtime, saving costs and optimizing performance. It also has database branching. Keep in mind, 'serverless' is just a fancy term—they handle the servers for us, so we don't need to worry about scaling or maintenance, but our database is still running on servers somewhere.

## TimescaleDB (PostgreSQL extension)

[TimescaleDB](https://www.timescale.com/) is a time-series database built on PostgreSQL, perfect for handling time-series data. It might be helpful for blockchain teams and those dealing with logging. TimescaleDB is open-source.

## DragonflyDB (Redis alternative)

[DragonflyDB](https://dragonflydb.io/) is faster and more memory-efficient than Redis, fully compatible with Redis protocols for efficient caching. In a project where Redis was handling a large volume of caching for session management and API rate limiting, switching to DragonflyDB allowed us to handle more traffic with the same resources, reduce memory consumption, and keep costs down, all while being fully compatible with our existing Redis setup.

## ClickHouse (Analytical DB)

[ClickHouse](https://clickhouse.com/), developed by Yandex, is excellent for real-time analytical reporting with SQL, handling huge datasets efficiently. Its retrieval speed is really surprising. ClickHouse is open-source.

## FaunaDB (Serverless DB)

With [FaunaDB](https://fauna.com/), there are no worries about scaling or server maintenance. It supports complex queries with GraphQL and FQL.

## CockroachDB (NewSQL)

[CockroachDB](https://www.cockroachlabs.com/) is cloud-native and SQL-compatible, ensuring robust transaction processing across regions without the high costs.

Yes, I know I've missed a few, like [Xata](https://xata.io/), [AWS Aurora](https://aws.amazon.com/rds/aurora/), etc. But the focus here is on services that truly make "cost-effective" worth considering. These solutions are not only ideal for developers handling side projects but also beneficial for enterprise-level companies aiming to optimize costs and scale their infrastructure efficiently.]]></content:encoded>
            <author>raghunandhanvr@outlook.com (Raghunandhan VR)</author>
        </item>
        <item>
            <title><![CDATA[best-coding-ai]]></title>
            <link>https://raghu.app/writings/best-coding-ai</link>
            <guid isPermaLink="false">https://raghu.app/writings/best-coding-ai</guid>
            <pubDate>Mon, 06 Apr 2026 05:28:42 GMT</pubDate>
            <content:encoded><![CDATA[import { BlogViewCounter } from '@/app/components/ui/blog-view-counter'
import { TableOfContents } from '@/app/components/ui/table-of-contents'

export const metadata = {
  title: 'The Best Coding AI',
  description: 'This blog helps you understand why you shouldn\'t just trust AI model reviews or jump on every new hyped release. Here\'s how to actually evaluate what works.',
  alternates: {
    canonical: '/writings/best-coding-ai',
  },
  openGraph: {
    images: [
      {
        url: `/api/og?title=The+Best+Coding+AI`,
        width: 1200,
        height: 630,
      },
    ],
  },
};

# The Best Coding AI
<BlogViewCounter slug="/writings/best-coding-ai" createdAt={new Date('2025-09-02')} />
<TableOfContents />

This blog helps you understand why you shouldn't just trust AI model reviews or jump on every new hyped release.

Which model's actually better, *GPT 5* or *Opus 4.1?* Everyone wants a straight answer, but it's not that simple. Some people share opinions really early, while others properly test things before saying anything.

## Who's giving the opinion?

First thing: check **who's talking**. How many years of engineering experience do they have *before the AI era?* Are they using AI models to code daily? People new to AI coding usually form opinions too quickly, and one or two bad experiences throw them off. These models work on probability, so failures are expected. Experienced developers share opinions only after extensive testing across different scenarios.

Get ideas from people who have at least **5 years of coding experience before the AI era**. They understand what good code looks like without AI assistance, so they can better evaluate whether AI's actually helping or just generating noise. These are the people who debugged with *StackOverflow*, read through expert discussions, understood not just what worked but *WHY* it worked. That depth matters when evaluating AI output.

There's a whole generation now that doesn't know what StackOverflow is. They copy-paste errors into chat windows and get instant answers. Sure, the code works. Ask them why it works that way instead of another way? Blank stares. Ask about edge cases? Nothing. They're trading **deep understanding for quick fixes**.

Newer developers often judge models on subjective criteria like design quality. Nothing wrong with that, but it's harder to establish objective benchmarks for aesthetics.

## How're they using it?

Same model gives totally different results based on:
1. **The platform** (*Cursor*, *Claude Code*, etc.), each with its own predefined context, instructions, system prompts, and tool integrations
2. **Programming language context** (*TypeScript*, *Python*)
3. **Codebase size and architectural complexity**

Coding agents combine a model (GPT-5), instructions (system prompts), tools (file I/O operations), all running in execution loops. Sometimes one loop uses *multiple models*. Cursor trains specific models just for codebase search or diff application. It's not always about the base model. Codex has dedicated models for code review workflows.

One developer finds *GPT-5* excellent using it in *Copilot* with *TypeScript* across thousands of files. Another finds it inadequate with a completely different stack. This is *before* considering prompt engineering and context provision strategies. The complexity compounds quickly.

Benchmarks like *SWE-bench*? They're necessary but don't reflect production usage. SWE-bench tests Python exclusively. If you're working in TypeScript, the relevance drops. Models overfit to benchmarks, and popular benchmarks create training data contamination issues.

## The depth problem

AI gives you answers fast, but the knowledge you gain is **shallow**. Back when we had to read multiple *StackOverflow* threads, you came out understanding not just what worked, but *why* it worked. Every great developer got there by understanding systems deeply and understanding other developers' thought processes. That's exactly what we're **losing**.

The acceleration has begun and we can't stop it. But that doesn't mean we let it make us worse developers. The future isn't about whether we use AI, it's about **how we use it**.

## So how to pick?

The only reliable method is **testing multiple models in your actual workflow**. But here's how to do it properly:

1. When AI gives you an answer, **interrogate it**. Ask *why*. Takes longer, but that's the point.
2. Do code reviews differently. Don't just check if code works. Ask what other approaches were considered. *Why this one*? Make understanding the process as important as the result.
3. Build things from scratch sometimes. Yes, AI can generate that authentication system. But build one yourself first. You'll write worse code, but you'll **understand every line**.
4. Find where smart people discuss code. *Reddit*, *Discord*, wherever. That's where you'll find discussions that make you think differently.

Learn from others if you analyze their context: their experience level, tech stack, and use case complexity. This helps filter noise and identify relevant opinions.

If someone's been shipping production code with AI for months, uses your tech stack, and has similar architectural complexity, their opinion carries more weight than someone testing on hobby projects. But remember, even the best AI is just a tool. The developers who'll survive are the ones who understand the **fundamentals underneath**.
]]></content:encoded>
            <author>raghunandhanvr@outlook.com (Raghunandhan VR)</author>
        </item>
        <item>
            <title><![CDATA[ai]]></title>
            <link>https://raghu.app/writings/ai</link>
            <guid isPermaLink="false">https://raghu.app/writings/ai</guid>
            <pubDate>Mon, 06 Apr 2026 05:28:42 GMT</pubDate>
            <content:encoded><![CDATA[import Image from 'next/image';
import { Suspense } from 'react';
import Tokenization  from './tokenization';
import SelfAttention from './self-attention';
import NeuralNetwork from './neural-network';
import ContextAwareResponse from './context-aware-response';
import Transformer from './transformer';
import MermaidDiagram from '@/app/components/md/mermaid'
import { BlogViewCounter } from '@/app/components/ui/blog-view-counter'
import PageLoader from '@/app/components/ui/page-loader'
import { TableOfContents } from '@/app/components/ui/table-of-contents'

export const metadata = {
  title: 'Understanding AI Models',
  description: 'A comprehensive guide to understanding how modern AI models work, from data collection to deployment, with practical insights for developers.',
  alternates: {
    canonical: '/writings/ai',
  },
  openGraph: {
    images: [
      {
        url: `/api/og?title=Understanding+AI+Models`,
        width: 1200,
        height: 630,
      },
    ],
  },
};

# Understanding AI Models: A Developer's Guide
<BlogViewCounter slug="/writings/ai" createdAt={new Date('2024-11-29')} />
<TableOfContents />

### Before We Start
**Bro**, this AI knows if you search for the best place to eat **Dosa, Vada, and Filter Coffee** at 8AM, it’ll guide you to the top-notch tiffin shop. But if you ask the same question at 10PM? It might suggest, “**Vada and filter coffee at night? Bro, it might duck your sleep, how about I suggest something else, or are you feeling like a Dosa Night special?**” And no, this isn’t just an if-else, it’s a **billion trillions zillions** of if-else conditions, powered by complex **neural networks** and **deep learning** algorithms.

I decided not to use **[ChatGPT](https://openai.com/chatgpt)** or any other LLM just to code, debug, and write content. Let's understand how they actually work. Not to become experts, but enough to know what's happening under the hood.

While reading this blog, let's do something interesting: we'll not just learn concepts, but also imagine building our own **Large Language Model (LLM)** step by step. This mental model will help you understand how models like **[ChatGPT](https://openai.com/chatgpt)** actually work, know why they sometimes give weird answers, learn how to talk to them more effectively, and appreciate the complexity behind what happens after you just type "Hello!" and send it to GPT.

### Overview
Let's start with a high-level overview of how AI models work:

<MermaidDiagram diagram={`
graph TD
    subgraph DataPipeline[Data Pipeline]
        A[(Raw Data Store)]
        B[Data Processing]
        C[Training Dataset]
    end
    
    subgraph ModelDev[Model Development]
        D[Base Model]
        E[Specialized Model]
        F[Production Model]
    end
    
    subgraph Deploy[Production Environment]
        G[API Service]
        H[Model Registry]
        I[Monitoring System]
    end

    A -->|Extract & Validate| B
    B -->|Clean & Transform| C
    C -->|Initialize Training| D
    D -->|Domain Adaptation| E
    E -->|Human Feedback Loop| F
    F -->|Version Control| H
    H -->|Deploy| G
    G -->|Track Metrics| I
    I -.->|Performance Feedback| B
    
    style DataPipeline fill:transparent,stroke:#333,stroke-width:1px
    style ModelDev fill:transparent,stroke:#333,stroke-width:1px
    style Deploy fill:transparent,stroke:#333,stroke-width:1px
`} />

Yeah, let's break the whole process down into smaller steps.

### 1. Everything Starts with Data

#### The Data Hunger Games
Before any AI magic happens, we need data. Lots of it. Here's what AI models actually learn from:

<MermaidDiagram diagram={`
graph TD
    A[Web Scraping] -->|Raw Data| D[Processing Pipeline]
    B[User Content] -->|Filtered Data| D
    C[Academic Sources] -->|Curated Data| D
    D -->|Tokenization| E[Processed Data]
    E -->|Embedding| F[Training Ready Data]
`}/>
Data is the lifeblood of AI models. Imagine trying to learn a new language without ever hearing it spoken. That's what AI models face without data. They need vast amounts of information to understand patterns, context, and nuances. From web scraping to user-generated content, every piece of data contributes to the model's learning process. It's like a never-ending buffet for AI, where more data means better understanding and performance.

#### Simple Data Collector
<details>
<summary>Example Code for Simple Data Collector</summary>

```python
import requests
from bs4 import BeautifulSoup

class SimpleDataCollector:
    def __init__(self):
        self.collected_data = []
    
    def collect_from_url(self, url):
        # Fetch content from URL
        response = requests.get(url)
        soup = BeautifulSoup(response.text, 'html.parser')
        
        # Extract text content
        text = soup.get_text()
        
        # Basic cleaning
        cleaned_text = ' '.join(text.split())
        self.collected_data.append(cleaned_text)
        
    def get_data(self):
        return self.collected_data

# Usage
collector = SimpleDataCollector()
collector.collect_from_url('https://example.com')
data = collector.get_data()
```
</details>

#### Tokenization: Breaking Down Language
Tokenization is like teaching a child to read by breaking down sentences into words. It's the process of converting text into smaller, manageable pieces called tokens. These tokens are the building blocks that AI models use to understand and generate language. By breaking down language into tokens, models can analyze and learn from vast amounts of text data efficiently.

<Suspense fallback={<PageLoader />}>
<Tokenization />
</Suspense>

<details>
<summary>Example Code for Simple Tokenizer</summary>

```python
class SimpleTokenizer:
    def __init__(self):
        self.vocab = {}  # word -> id mapping
        self.next_id = 0
    
    def tokenize(self, text):
        # Split into words
        words = text.lower().split()
        
        # Convert to token ids
        tokens = []
        for word in words:
            if word not in self.vocab:
                self.vocab[word] = self.next_id
                self.next_id += 1
            tokens.append(self.vocab[word])
            
        return tokens
    
    def decode(self, tokens):
        # Convert token ids back to words
        reverse_vocab = {id: word for word, id in self.vocab.items()}
        return ' '.join(reverse_vocab[token] for token in tokens)

# Usage
tokenizer = SimpleTokenizer()
text = "Hello world of AI"
tokens = tokenizer.tokenize(text)  # [0, 1, 2, 3]
decoded = tokenizer.decode(tokens)  # "hello world of ai"
```
</details>

#### Building Our LLM: Data Collection Phase

1. **The Data Shopping List**
    - [Wikipedia dumps](https://dumps.wikimedia.org/) (knowledge base)
    - [Books and literature](https://www.goodreads.com/) (language understanding) 
    - [Code repositories](https://github.com/) (technical knowledge)
    - [Social media](https://www.twitter.com/) (current trends)
    - [Academic papers](https://arxiv.org/) (specialized knowledge)

2. **Pattern Recognition Examples**
    - User behavior patterns (late-night vs. daytime searches)
    - Language patterns (formal vs. casual)
    - Context switching (different domains)
    - Regional patterns (location-specific content)

### 2. Neural Networks: The Digital Brain

#### Basic Building Blocks
Neural networks are the digital brains of AI models. They mimic the way our brains work, with neurons and synapses, to process information. Each layer of a neural network transforms the input data, allowing the model to learn complex patterns and make predictions. It's like teaching a computer to recognize faces by showing it thousands of pictures until it can identify a face on its own.

<Suspense fallback={<PageLoader />}>
<NeuralNetwork />
</Suspense>

<details>
<summary>Example Code for Simple Neural Network</summary>

```python
import numpy as np

class SimpleNeuralNetwork:
    def __init__(self, input_size, hidden_size, output_size):
        # Initialize weights with random values
        self.hidden_weights = np.random.randn(input_size, hidden_size)
        self.output_weights = np.random.randn(hidden_size, output_size)
    
    def sigmoid(self, x):
        # Activation function
        return 1 / (1 + np.exp(-x))
    
    def forward(self, inputs):
        # Forward pass through the network
        self.hidden = self.sigmoid(np.dot(inputs, self.hidden_weights))
        self.output = self.sigmoid(np.dot(self.hidden, self.output_weights))
        return self.output

# Usage
nn = SimpleNeuralNetwork(input_size=3, hidden_size=4, output_size=2)
sample_input = np.array([0.5, 0.3, 0.7])
prediction = nn.forward(sample_input)
```
</details>

#### Pattern Recognition in Action
Neural networks excel at pattern recognition. They can identify patterns in data that are too complex for humans to see. For example, they can determine whether a user wants workout songs or party songs based on the time of day and location. This ability to recognize patterns makes neural networks powerful tools for tasks like image recognition, language translation, and more.

Here's what's happening inside these networks:
- Input: "I need songs"
- Context: Time is 6 AM, user is at the gym
- Model thinks: "87% chance they mean workout songs"

But if:
- Input: Same "I need songs"
- Context: Time is 10 PM, user is at a party
- Model thinks: "92% chance they mean party songs"

### 3. Deep Learning: Going Deeper

Deep learning takes neural networks to the next level by adding more layers and complexity. It's like stacking multiple brains on top of each other, each one learning from the previous one. This allows deep learning models to understand and generate more complex data, from natural language to images and beyond.

<MermaidDiagram diagram={`
graph TD
    I((Input)) --> H1((Hidden 1))
    H1 --> H2((Hidden 2))
    H2 --> H3((Hidden 3))
    H3 --> O((Output))
`}/>

<details>
<summary>Example Code for Deep Learning Model</summary>

```python
import numpy as np

class DeepLearningModel:
    def __init__(self, layer_sizes):
        self.weights = []
        self.layers = []
        
        # Initialize weights for each layer
        for i in range(len(layer_sizes) - 1):
            w = np.random.randn(layer_sizes[i], layer_sizes[i+1]) * 0.01
            self.weights.append(w)
    
    def relu(self, x):
        # ReLU activation function
        return np.maximum(0, x)
    
    def forward(self, inputs):
        current_input = inputs
        self.layers = [inputs]
        
        # Forward pass through each layer
        for w in self.weights:
            z = np.dot(current_input, w)
            current_input = self.relu(z)
            self.layers.append(current_input)
            
        return current_input

# Usage
model = DeepLearningModel([4, 8, 6, 2])  # layer sizes: input 4, hidden 8 and 6, output 2
sample_input = np.array([0.2, 0.7, 0.1, 0.9])
output = model.forward(sample_input)
```
</details>

### 4. Attention: The Game Changer

#### Beyond Simple Pattern Matching

Attention mechanisms revolutionized AI by allowing models to focus on specific parts of the input data. It's like having a spotlight that highlights the most important information, enabling the model to make more accurate predictions. This concept was popularized by the paper "[Attention is All You Need](https://arxiv.org/abs/1706.03762)" and is the foundation of transformer models like GPT.

#### How Attention Works in Practice
Attention mechanisms allow models to understand context and relationships between words in a sentence. For example, in the sentence "The dog **chased** the cat because **it** was scared," attention helps the model determine that "it" refers to "cat." This ability to focus on relevant parts of the input makes attention mechanisms crucial for tasks like language translation and text generation.

<Suspense fallback={<PageLoader />}>
<SelfAttention />
</Suspense>

```python
def example_attention_flow():
    sentence = ["The", "dog", "chased", "the", "cat", "because", "it", "was", "scared"]
    focus_word = "it"
    context = {
        "it": ["cat", 0.8],  # 80% attention to "cat"
        "cat": ["scared", 0.7],  # 70% attention to "scared"
        "dog": ["chased", 0.6]  # 60% attention to "chased"
    }
    return context[focus_word]

# The model learns these attention patterns during training
attention_result = example_attention_flow()  # Returns ["cat", 0.8]
```

### 5. Transformers: The Backbone of Modern AI

Transformers have become the backbone of modern AI, especially in natural language processing tasks. They leverage attention mechanisms to process input data in parallel, making them highly efficient and effective for large-scale data.

<Suspense fallback={<PageLoader />}>
<Transformer />
</Suspense>

#### How Transformers Work
Transformers use a mechanism called self-attention to weigh the importance of different words in a sentence relative to each other. This allows them to capture context and relationships more effectively than previous models.

Here's a simplified explanation of the transformer architecture:
- **Input Embedding**: Converts input tokens into vectors.
- **Positional Encoding**: Adds information about the position of each token.
- **Self-Attention**: Computes attention scores for each token pair.
- **Feedforward Neural Network**: Processes the attention outputs.
- **Output Layer**: Generates predictions or further embeddings.
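
Positional encoding is worth a closer look, since the transformer itself has no notion of word order. The original paper uses fixed sinusoids, which take only a few lines of NumPy (a minimal sketch of the formula, not PyTorch's internals):

```python
import numpy as np

def positional_encoding(seq_len, d_model):
    # Each position gets a unique pattern of sines and cosines so the
    # model can tell token order apart without recurrence
    positions = np.arange(seq_len)[:, np.newaxis]   # (seq_len, 1)
    dims = np.arange(d_model)[np.newaxis, :]        # (1, d_model)
    angle_rates = 1.0 / np.power(10000, (2 * (dims // 2)) / d_model)
    angles = positions * angle_rates
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles[:, 0::2])  # even dimensions: sine
    pe[:, 1::2] = np.cos(angles[:, 1::2])  # odd dimensions: cosine
    return pe

pe = positional_encoding(seq_len=10, d_model=16)
print(pe.shape)  # (10, 16)
```

These vectors are simply added to the input embeddings before the first attention layer.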


<details>
<summary>Example Code for a Simple Transformer</summary>

```python
import torch
import torch.nn as nn

class SimpleTransformer(nn.Module):
    def __init__(self, input_dim, model_dim, num_heads, num_layers):
        super(SimpleTransformer, self).__init__()
        self.encoder_layer = nn.TransformerEncoderLayer(d_model=model_dim, nhead=num_heads)
        self.transformer_encoder = nn.TransformerEncoder(self.encoder_layer, num_layers=num_layers)
        self.linear = nn.Linear(model_dim, input_dim)

    def forward(self, src):
        output = self.transformer_encoder(src)
        return self.linear(output)

# Usage
model = SimpleTransformer(input_dim=512, model_dim=512, num_heads=8, num_layers=6)
src = torch.rand((10, 32, 512))  # (sequence_length, batch_size, input_dim)
output = model(src)
```
</details>

Transformers have enabled breakthroughs in AI by allowing models to understand context and relationships in data more effectively. They are the foundation of many state-of-the-art models, including GPT, BERT, and more.

#### Context-Aware Responses: Understanding the GOAT

AI models can provide different responses based on the context. Let's see how this works with our GOAT (Greatest of All Time) example:

<Suspense fallback={<PageLoader />}>
<ContextAwareResponse />
</Suspense>

This example demonstrates how AI models can adapt their responses based on the given context. Whether we're talking about cricket, football, or Tamil cinema, the model understands the context and provides an appropriate answer.

### 6. Training: Teaching Our AI

#### The Learning Process
Training an AI model is like teaching a child to ride a bike. It involves trial and error, with the model learning from its mistakes. During training, the model processes vast amounts of data, adjusting its parameters to minimize errors and improve accuracy. This iterative process is what enables AI models to learn and make predictions.

<MermaidDiagram diagram={`
graph TD
    D[Training Data] -->|Batching| B[Forward Pass]
    B -->|Loss Calculation| C[Backward Pass]
    C -->|Update Weights| D
`}/>

<details>
<summary>Example Code for Simple Training Loop</summary>

```python
import numpy as np

class ModelTrainer:
    def __init__(self, model, learning_rate=0.01):
        self.model = model
        self.lr = learning_rate
        
    def calculate_loss(self, predictions, targets):
        # Simple MSE loss
        return np.mean((predictions - targets) ** 2)
        
    def train_step(self, inputs, targets):
        # Forward pass
        outputs = self.model.forward(inputs)
        
        # Calculate loss
        loss = self.calculate_loss(outputs, targets)
        
        # Simple backward pass (gradient descent)
        gradient = 2 * (outputs - targets)
        self.model.update_weights(gradient, self.lr)
        
        return loss

# Usage
# assumes SimpleNeuralNetwork also defines update_weights(gradient, lr)
trainer = ModelTrainer(model=SimpleNeuralNetwork(2, 4, 1))
input_data = np.array([[0.1, 0.2], [0.3, 0.4]])
target_data = np.array([[0.3], [0.7]])

for epoch in range(100):
    loss = trainer.train_step(input_data, target_data)
    if epoch % 10 == 0:
        print(f"Epoch {epoch}, Loss: {loss}")
```
</details>

### 7. Fine-tuning and RLHF: Teaching Manners

#### Making AI Helpful and Safe
Fine-tuning and Reinforcement Learning from Human Feedback (RLHF) are like teaching AI models good manners. They refine the model's behavior, ensuring it provides helpful and safe responses. By incorporating human feedback, models learn to prioritize user satisfaction and safety, making them more reliable and user-friendly.

<MermaidDiagram diagram={`
graph TD
    M[Base Model] -->|Human Feedback| R[Reward Model]
    R -->|Training| P[Policy Update]
    P -->|Improved| F[Fine-tuned Model]
`}/>

<details>
<summary>Example Code for Simple RLHF</summary>

```python
class SimpleRLHF:
    def __init__(self, base_model):
        self.base_model = base_model
        self.reward_scores = {}
        
    def collect_feedback(self, response, human_score):
        # Store human feedback scores
        self.reward_scores[response] = human_score
        
    def generate_response(self, prompt, temperature=0.7):
        # Generate multiple responses
        responses = [
            self.base_model.generate(prompt, temperature)
            for _ in range(3)
        ]
        
        # Pick response with highest historical reward
        best_response = max(
            responses,
            key=lambda r: self.reward_scores.get(r, 0)
        )
        
        return best_response

# Usage
rlhf = SimpleRLHF(base_model=PretrainedModel())  # PretrainedModel stands in for any generative model

# Collect feedback
response = "I can help you with that task!"
human_score = 0.9  # High score for helpful response
rlhf.collect_feedback(response, human_score)

# Generate improved response
prompt = "How can you help me?"
better_response = rlhf.generate_response(prompt)
```
</details>

### 8. Deployment: Going Live

#### From Training to Production
Deploying an AI model is like launching a rocket. After rigorous testing and fine-tuning, the model is ready to go live and serve users. This involves setting up infrastructure, ensuring scalability, and monitoring performance to maintain reliability and efficiency.

<MermaidDiagram diagram={`
flowchart LR
    C[Client] --> LB[Load Balancer]
    LB --> S1[Server 1]
    LB --> S2[Server 2]
    S1 --> C
    S2 --> C
`}/>

<details>
<summary>Example Code for API Server Deployment</summary>

```python
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()
model = LoadModel("path/to/model")  # LoadModel is a placeholder for your model-loading helper

class Query(BaseModel):
    text: str
    temperature: float = 0.7

@app.post("/generate")
async def generate_text(query: Query):
    try:
        response = model.generate(
            query.text,
            temperature=query.temperature
        )
        return {"response": response}
    except Exception as e:
        return {"error": str(e)}

# Run with: uvicorn app:app --host 0.0.0.0 --port 8000
```
</details>

### 9. Future Trends and Real-world Applications

The AI landscape is evolving faster than we update our npm packages. Here are some exciting trends:

1. **[Multimodal Models](https://arxiv.org/abs/2106.10752)**
   - Text + Images (like DALL-E)
   - Text + Code (like Copilot)
   - Text + Audio (like Whisper)

2. **[Efficient Models](https://arxiv.org/abs/2009.01325)**
   - Smaller, faster models
   - Specialized for specific tasks
   - Optimized for edge devices

3. **Enhanced Privacy**
   - Local model deployment
   - [Federated learning](https://en.wikipedia.org/wiki/Federated_learning)
   - [Differential privacy](https://en.wikipedia.org/wiki/Differential_privacy)

<MermaidDiagram diagram={`
graph TD
    subgraph "Future of AI"
        M[Multimodal] -->|Integration| F[Future Models]
        E[Efficiency] -->|Optimization| F
        P[Privacy] -->|Enhancement| F
    end
`}/>

### Conclusion

Understanding how AI models work has changed how I interact with them. It's like knowing a friend really well - you know when they'll be helpful and when they might need a bit more context.

Remember:
1. These models learn from patterns in our digital lives
2. They understand context better than you might think
3. But they're still programs - amazingly sophisticated ones, but programs nonetheless
4. The better you understand them, the better you can work with them

### Resources for Going Deeper
- [3Blue1Brown's Neural Networks series](https://www.youtube.com/playlist?list=PLZHQObOWTQDNU6R1_67000Dx_ZCJB-3pi)
- [Distill - visual explanations of ML concepts, GNNs and more](https://distill.pub)
- ["Attention Is All You Need"](https://arxiv.org/abs/1706.03762) - The Transformer paper
- [Hugging Face Documentation](https://huggingface.co/docs)]]></content:encoded>
            <author>raghunandhanvr@outlook.com (Raghunandhan VR)</author>
        </item>
        <item>
            <title><![CDATA[skipper]]></title>
            <link>https://raghu.app/writings/skipper</link>
            <guid isPermaLink="false">https://raghu.app/writings/skipper</guid>
            <pubDate>Mon, 06 Apr 2026 05:28:42 GMT</pubDate>
            <content:encoded><![CDATA[import MermaidDiagram from '@/app/components/md/mermaid'
import { BlogViewCounter } from '@/app/components/ui/blog-view-counter'
import { TableOfContents } from '@/app/components/ui/table-of-contents'

export const metadata = {
  title: 'Inside a High-Performance Reverse Proxy',
  description: 'About Zalando Skipper architecture, performance optimizations, and unique features.',
  alternates: {
    canonical: '/writings/skipper',
  },
  openGraph: {
    images: [
      {
        url: `/api/og?title=Inside+a+High-Performance+Reverse+Proxy`,
        width: 1200,
        height: 630,
      },
    ],
  },
};

# Inside a High-Performance Reverse Proxy
<BlogViewCounter slug="/writings/skipper" createdAt={new Date('2025-02-02')} />
<TableOfContents />

We recently switched to [Zalando Skipper](https://github.com/zalando/skipper) for our reverse proxy needs after trying out Nginx, HAProxy, and Traefik.

I'm writing this blog because I spent weeks evaluating different reverse proxies for our infrastructure, and Skipper's approach to solving common proxy challenges really stood out. While digging through their source code, I found several interesting implementation details that I think are worth sharing - especially if you're building networked services in Go. 

Looking at their codebase and PRs, it's clear that Zalando properly follows the principles of good [dx](/writings/dx) - the code is well-structured, thoroughly documented, and designed with extensibility in mind. Whether you're evaluating reverse proxies or just interested in high-performance Go code, there's a lot to learn from their approach.

Skipper's Go-based architecture, extensibility, and performance in high-concurrency scenarios made it stand out. I've been learning from their codebase how they handle memory management, goroutines, and routing.

<MermaidDiagram diagram={`
graph TD
    A[Incoming Request] --> B[Parsing]
    B --> C{Routing}
    C -->|Match| D[Filters]
    C -.->|No Match| Z[404]
    D --> E[Backend]
    E --> F[Response Processing]
    F --> G[Response to Client]

    H[Memory Management] -.-> B
    H -.-> E
    I[Concurrency Model] -.-> C
    I -.-> D
    J[Routing Engine] -.-> C
    K[Load Balancing] -.-> E
    L[Observability] -.-> F
`} />

## 1. Memory Management and Concurrency

Skipper manages memory pretty efficiently. Here are some interesting techniques they use:

### Object Pooling

They're using object pooling for request contexts. Here's a simplified version:

```go
import (
    "net/http"
    "sync"
    "time"

    "github.com/zalando/skipper/filters"
)

type proxyContext struct {
    Request        *http.Request
    Response       http.ResponseWriter
    roundTripStart time.Time
    filters        []filters.Filter
}

func (ctx *proxyContext) reset(w http.ResponseWriter, r *http.Request) {
    ctx.Request = r
    ctx.Response = w
    ctx.roundTripStart = time.Now()
    ctx.filters = ctx.filters[:0] // keep the backing array, drop old entries
}

var proxyContextPool = sync.Pool{
    New: func() interface{} {
        return &proxyContext{
            filters: make([]filters.Filter, 0, 10),
        }
    },
}

func acquireContext(w http.ResponseWriter, r *http.Request) *proxyContext {
    ctx := proxyContextPool.Get().(*proxyContext)
    ctx.reset(w, r)
    return ctx
}

func releaseContext(ctx *proxyContext) {
    ctx.Request = nil
    ctx.Response = nil
    proxyContextPool.Put(ctx)
}
```

What's cool here is they pre-allocate the `filters` slice. This avoids allocations during request processing, which is crucial for low latency under high load. The time complexity for acquiring and releasing contexts is O(1), which is great for performance.

### Request Handling

Here's how Skipper handles requests:

```go
func (p *proxy) ServeHTTP(w http.ResponseWriter, r *http.Request) {
    ctx := acquireContext(w, r)
    defer releaseContext(ctx)

    if err := p.process(ctx); err != nil {
        p.errorHandler(ctx, err)
    }
}
```

This setup allows Skipper to handle a large number of concurrent connections efficiently. Each request gets its own goroutine, leveraging Go's concurrency model.

## 2. Filter Chain Execution

Skipper's filter chain is both flexible and fast. Here's how they process it:

```go
func (p *proxy) process(ctx *proxyContext) error {
    for _, f := range ctx.filters {
        if err := f.Request(ctx); err != nil {
            return err
        }
    }

    if err := p.forward(ctx); err != nil {
        return err
    }

    for i := len(ctx.filters) - 1; i >= 0; i-- {
        if err := ctx.filters[i].Response(ctx); err != nil {
            return err
        }
    }

    return nil
}
```

The time complexity here is O(n), where n is the number of filters. But what's clever is that by keeping the filter functions small and simple, they're allowing the Go compiler to inline these calls, reducing function call overhead.

I'm still exploring their filter implementations, but I've been working on a custom rate limiting filter. Below's just a very basic example:

```go
type rateLimitFilter struct {
    limit rate.Limit
    burst int
    limiter *rate.Limiter
}

func (f *rateLimitFilter) Request(ctx filters.FilterContext) {
    if !f.limiter.Allow() {
        ctx.Serve(&http.Response{
            StatusCode: http.StatusTooManyRequests,
            Body:       io.NopCloser(strings.NewReader("Rate limit exceeded")),
        })
    }
}
```

This filter integrates seamlessly with Skipper's existing chain. I'm impressed by how easy it is to extend Skipper's functionality.

## 3. Routing and Load Balancing

Skipper's routing engine is pretty smart. They use a trie for static routes and regex for dynamic ones.

### Trie-based Route Matching

Here's a simplified version of their trie-based route matching:

```go
type trieNode struct {
    children map[string]*trieNode
    route    *Route
}

func (t *trie) lookup(path string) *Route {
    node := t.root
    // Trim the leading slash so "/api/v1" splits into ["api", "v1"]
    segments := strings.Split(strings.Trim(path, "/"), "/")
    for _, segment := range segments {
        child, exists := node.children[segment]
        if !exists {
            return nil
        }
        node = child
    }
    return node.route
}
```

This gives O(k) lookup time for static routes, where k is the number of path segments. It's much faster than iterating through a flat list of routes on every request.

<MermaidDiagram diagram={`
graph TD
    A[Root] --> B[api]
    A --> C[static]
    B --> D[v1]
    B --> E[v2]
    D --> F[users]
    D --> G[products]
    E --> H[auth]
    C --> I[css]
    C --> J[js]
`} />

### Load Balancing

Their weighted round-robin load balancing is interesting. Here's a simplified version in the spirit of the classic interleaved algorithm:

```go
type WeightedRoundRobinLB struct {
    endpoints     []*Endpoint
    weights       []int
    current       int // index of the last endpoint returned
    currentWeight int // weight threshold for the current sweep
    mu            sync.Mutex
}

func (lb *WeightedRoundRobinLB) maxWeight() int {
    max := 0
    for _, w := range lb.weights {
        if w > max {
            max = w
        }
    }
    return max
}

func (lb *WeightedRoundRobinLB) Next() *Endpoint {
    lb.mu.Lock()
    defer lb.mu.Unlock()

    for {
        lb.current = (lb.current + 1) % len(lb.endpoints)
        if lb.current == 0 {
            // Completed a sweep: lower the threshold so lighter
            // endpoints become eligible on later sweeps
            lb.currentWeight--
            if lb.currentWeight <= 0 {
                lb.currentWeight = lb.maxWeight()
            }
        }
        if lb.weights[lb.current] >= lb.currentWeight {
            return lb.endpoints[lb.current]
        }
    }
}
```

This allows for fine-grained control over traffic distribution. A single selection can loop over the endpoint list more than once while the weight threshold decays, but amortized across a full cycle the per-request cost stays small.

## 4. Dynamic Configuration

One thing that really stands out is Skipper's ability to update routing configuration on the fly. This is super useful in Kubernetes environments. Here's a snippet from their Kubernetes ingress controller:

```go
func (c *Client) LoadAll() ([]*eskip.Route, error) {
    ingresses, err := c.getIngresses()
    if err != nil {
        return nil, err
    }

    routes := make([]*eskip.Route, 0, len(ingresses))
    for _, ing := range ingresses {
        rs, err := c.ingressToRoutes(ing)
        if err != nil {
            klog.Errorf("error converting ingress %v/%v to routes: %v", ing.Namespace, ing.Name, err)
            continue
        }
        routes = append(routes, rs...)
    }

    return routes, nil
}
```

This function is part of a loop that periodically checks for changes in Kubernetes Ingress resources and updates Skipper's routing table. The time complexity here is O(n * m), where n is the number of ingresses and m is the average number of routes per ingress.

## 5. Observability

I'm still exploring their observability features, but Skipper integrates well with Prometheus for metrics collection. Here's a basic example of how they set up a histogram for request durations:

```go
var (
    requestDuration = prometheus.NewHistogramVec(
        prometheus.HistogramOpts{
            Name:    "skipper_serve_route_duration_seconds",
            Help:    "The duration in seconds of serving requests.",
            Buckets: []float64{.005, .01, .025, .05, .1, .25, .5, 1, 2.5, 5, 10},
        },
        []string{"route"},
    )
)

func init() {
    prometheus.MustRegister(requestDuration)
}
```

This allows for detailed monitoring of request latencies across different routes.

---

I'm still digging into their codebase, but I'm impressed by what I've seen so far. The attention to performance optimization, from low-level memory management to high-level routing strategies, is evident throughout.

While Nginx and HAProxy are solid, Skipper's Go-based architecture and focus on dynamic configuration make it a great fit for our container-based setup. Its extensibility has allowed us to implement custom logic that would have been tricky with other solutions. If you're dealing with high-traffic scenarios or complex routing needs, Skipper is definitely worth a look.

The code is [well-documented](https://opensource.zalando.com/skipper/reference/architecture/), and the community seems pretty active on Slack.

Peace.]]></content:encoded>
            <author>raghunandhanvr@outlook.com (Raghunandhan VR)</author>
        </item>
        <item>
            <title><![CDATA[oauth]]></title>
            <link>https://raghu.app/writings/oauth</link>
            <guid isPermaLink="false">https://raghu.app/writings/oauth</guid>
            <pubDate>Mon, 06 Apr 2026 05:28:42 GMT</pubDate>
            <content:encoded><![CDATA[import MermaidDiagram from "@/app/components/md/mermaid";
import { BlogViewCounter } from "@/app/components/ui/blog-view-counter";
import { TableOfContents } from "@/app/components/ui/table-of-contents";

export const metadata = {
  title: "How OAuth Works?",
  description:
    "Understanding OAuth flows, tokens, and security. A practical guide to implementing OAuth in your applications.",
  alternates: {
    canonical: "/writings/oauth",
  },
  openGraph: {
    images: [
      {
        url: `/api/og?title=How+OAuth+Works`,
        width: 1200,
        height: 630,
      },
    ],
  },
};

# How OAuth Works?

<BlogViewCounter slug="/writings/oauth" createdAt={new Date("2025-07-14")} />
<TableOfContents />

### Understanding OAuth Implementation

When I was working at **[Freightify](https://freightify.com)**, I built an IAM service that handles 25K+ daily active users. OAuth was one of those concepts that looked complex at first, but the implementation becomes straightforward once you understand the core principles.

Let me explain what happens behind those "Login with Google" buttons.

## The Problem OAuth Solves

Consider this scenario - your application needs to access user data from another service. The traditional approach involves asking users for their username and password of that service. This approach has several critical issues:

1. Your application stores and manages credentials that don't belong to you
2. Users must trust you with their passwords
3. If your system gets compromised, all user credentials are exposed
4. You get complete access to user accounts, not just what you need
5. Users cannot revoke access without changing their passwords everywhere

OAuth addresses these issues by implementing a token-based system where users can grant limited access to applications without sharing their actual credentials.

<MermaidDiagram
  diagram={`
graph TD
    A[User Requests Access] --> B[App Redirects to OAuth Provider]
    B --> C[User Login & Consent]
    C --> D[Authorization Code Returned]
    D --> E[App Exchanges Code for Token]
    E --> F[Access Token Received]
    F --> G[API Calls with Token]
    G --> H[Protected Data Access]
`}
/>

## Core OAuth Components

The OAuth system involves several key components:

**Resource Owner**: The user who owns the data. This is your end user.

**Client**: Your application that wants to access user resources. This can be your web app, mobile app, or any service.

**Authorization Server**: The server that authenticates users and issues access tokens. Examples include Google, Facebook, GitHub.

**Resource Server**: The server hosting the protected resources. This may be the same as the authorization server or a separate service.

**Access Token**: A string representing the authorization granted to the client. This acts as a temporary access key.

**Refresh Token**: A token used to obtain new access tokens when the current one expires.

## OAuth Authorization Code Flow

The Authorization Code Flow is the most commonly used OAuth flow. Here's the step-by-step process:

### Step 1: Discovery and Initial Request

Most services provide OAuth endpoint information at a well-known location. This eliminates the need to hardcode URLs.

```go
type AuthorizationServerMetadata struct {
    Issuer                string   `json:"issuer"`
    AuthorizationEndpoint string   `json:"authorization_endpoint"`
    TokenEndpoint         string   `json:"token_endpoint"`
    ScopesSupported       []string `json:"scopes_supported"`
}

func discoverAuthServer(baseURL string) (*AuthorizationServerMetadata, error) {
    resp, err := http.Get(baseURL + "/.well-known/oauth-authorization-server")
    if err != nil {
        return nil, err
    }
    defer resp.Body.Close()

    var metadata AuthorizationServerMetadata
    if err := json.NewDecoder(resp.Body).Decode(&metadata); err != nil {
        return nil, err
    }

    return &metadata, nil
}
```

### Step 2: Redirect User to Authorization Server

Your application redirects the user to the authorization server with specific parameters:

```go
func buildAuthorizationURL(authEndpoint, clientID, redirectURI, state string, scopes []string) string {
    params := url.Values{
        "response_type": {"code"},
        "client_id":     {clientID},
        "redirect_uri":  {redirectURI},
        "scope":         {strings.Join(scopes, " ")},
        "state":         {state},
    }

    return authEndpoint + "?" + params.Encode()
}

// Usage
authURL := buildAuthorizationURL(
    "https://auth.raghu.app/oauth/authorize",
    "your-client-id",
    "https://yourapp.com/callback",
    "random-state-value",
    []string{"read:profile", "write:posts"},
)
```

The parameters are:

- `response_type=code`: Specifies the authorization code flow
- `client_id`: Your application identifier
- `redirect_uri`: Where to send the user after authorization
- `scope`: The permissions you're requesting
- `state`: Random value for security (prevents CSRF attacks)

### Step 3: User Authorization

The user gets redirected to the authorization server where they see a consent screen. If they approve, they are sent back to your application with an authorization code.

The redirect looks like:

```
https://yourapp.com/callback?code=abc123def456&state=random-state-value
```
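
On your server, the callback handler needs to pull these parameters back out. A small hypothetical helper (`parseCallback` is not part of any library; the `error` query parameter it checks is how RFC 6749 providers report denials):

```go
import (
    "fmt"
    "net/url"
)

// parseCallback extracts the authorization code and state from the URL
// the provider redirects the user back to.
func parseCallback(rawURL string) (code, state string, err error) {
    u, err := url.Parse(rawURL)
    if err != nil {
        return "", "", err
    }
    q := u.Query()
    if e := q.Get("error"); e != "" {
        return "", "", fmt.Errorf("authorization failed: %s", e)
    }
    if q.Get("code") == "" {
        return "", "", fmt.Errorf("callback missing authorization code")
    }
    return q.Get("code"), q.Get("state"), nil
}
```

The returned state must then be compared against the value you generated before the redirect, which is covered in the state validation section later.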

### Step 4: Exchange Code for Token

Your application takes this authorization code and exchanges it for an access token:

```go
type TokenResponse struct {
    AccessToken  string `json:"access_token"`
    TokenType    string `json:"token_type"`
    ExpiresIn    int    `json:"expires_in"`
    RefreshToken string `json:"refresh_token"`
    Scope        string `json:"scope"`
}

func exchangeCodeForToken(tokenEndpoint, clientID, clientSecret, code, redirectURI string) (*TokenResponse, error) {
    data := url.Values{
        "grant_type":    {"authorization_code"},
        "code":          {code},
        "redirect_uri":  {redirectURI},
        "client_id":     {clientID},
        "client_secret": {clientSecret},
    }

    resp, err := http.PostForm(tokenEndpoint, data)
    if err != nil {
        return nil, err
    }
    defer resp.Body.Close()

    if resp.StatusCode != http.StatusOK {
        return nil, fmt.Errorf("token exchange failed: %s", resp.Status)
    }

    var token TokenResponse
    if err := json.NewDecoder(resp.Body).Decode(&token); err != nil {
        return nil, err
    }

    return &token, nil
}
```

### Step 5: Use Access Token

Now you can make requests to the resource server using the access token:

```go
func makeAuthenticatedRequest(resourceURL, accessToken string) (*http.Response, error) {
    req, err := http.NewRequest("GET", resourceURL, nil)
    if err != nil {
        return nil, err
    }

    req.Header.Set("Authorization", "Bearer "+accessToken)

    client := &http.Client{}
    return client.Do(req)
}

// Example usage
resp, err := makeAuthenticatedRequest("https://api.raghu.app/user/profile", accessToken)
```

## Handling Token Refresh

Access tokens expire for security reasons. When they expire, you use the refresh token to obtain a new one:

```go
func refreshAccessToken(tokenEndpoint, clientID, clientSecret, refreshToken string) (*TokenResponse, error) {
    data := url.Values{
        "grant_type":    {"refresh_token"},
        "refresh_token": {refreshToken},
        "client_id":     {clientID},
        "client_secret": {clientSecret},
    }

    resp, err := http.PostForm(tokenEndpoint, data)
    if err != nil {
        return nil, err
    }
    defer resp.Body.Close()

    var token TokenResponse
    if err := json.NewDecoder(resp.Body).Decode(&token); err != nil {
        return nil, err
    }

    return &token, nil
}
```

## PKCE - For Public Clients

When building mobile apps or SPAs, you cannot securely store client secrets. PKCE (Proof Key for Code Exchange) solves this problem:

```go
import (
    "crypto/rand"
    "crypto/sha256"
    "encoding/base64"
)

func generatePKCEChallenge() (verifier, challenge string, err error) {
    // Generate code verifier
    bytes := make([]byte, 32)
    if _, err := rand.Read(bytes); err != nil {
        return "", "", err
    }
    verifier = base64.RawURLEncoding.EncodeToString(bytes)

    // Generate code challenge
    hash := sha256.Sum256([]byte(verifier))
    challenge = base64.RawURLEncoding.EncodeToString(hash[:])

    return verifier, challenge, nil
}

func buildPKCEAuthURL(authEndpoint, clientID, redirectURI, state, codeChallenge string, scopes []string) string {
    params := url.Values{
        "response_type":         {"code"},
        "client_id":             {clientID},
        "redirect_uri":          {redirectURI},
        "scope":                 {strings.Join(scopes, " ")},
        "state":                 {state},
        "code_challenge":        {codeChallenge},
        "code_challenge_method": {"S256"},
    }

    return authEndpoint + "?" + params.Encode()
}

func exchangeCodeWithPKCE(tokenEndpoint, clientID, code, redirectURI, codeVerifier string) (*TokenResponse, error) {
    data := url.Values{
        "grant_type":    {"authorization_code"},
        "code":          {code},
        "redirect_uri":  {redirectURI},
        "client_id":     {clientID},
        "code_verifier": {codeVerifier},
    }

    resp, err := http.PostForm(tokenEndpoint, data)
    if err != nil {
        return nil, err
    }
    defer resp.Body.Close()

    var token TokenResponse
    if err := json.NewDecoder(resp.Body).Decode(&token); err != nil {
        return nil, err
    }

    return &token, nil
}
```

## OAuth Scopes: The Permission System

Scopes define what permissions you are requesting. Follow the principle of least privilege - only request what you actually need.

Common scope patterns:

- `read:profile` - Read user profile information
- `write:posts` - Create posts on behalf of user
- `admin:users` - Administrative access to user management

```go
type ScopeManager struct {
    requestedScopes []string
    grantedScopes   []string
}

func (s *ScopeManager) HasScope(scope string) bool {
    for _, granted := range s.grantedScopes {
        if granted == scope {
            return true
        }
    }
    return false
}

func (s *ScopeManager) ValidateRequest(requiredScope string) error {
    if !s.HasScope(requiredScope) {
        return fmt.Errorf("insufficient scope: required %s", requiredScope)
    }
    return nil
}
```

## Token Storage and Security

Proper token storage is critical for security:

```go
type TokenStore struct {
    tokens map[string]*TokenResponse
    mutex  sync.RWMutex
}

func NewTokenStore() *TokenStore {
    return &TokenStore{
        tokens: make(map[string]*TokenResponse),
    }
}

func (ts *TokenStore) StoreToken(userID string, token *TokenResponse) {
    ts.mutex.Lock()
    defer ts.mutex.Unlock()
    ts.tokens[userID] = token
}

func (ts *TokenStore) GetToken(userID string) (*TokenResponse, bool) {
    ts.mutex.RLock()
    defer ts.mutex.RUnlock()
    token, exists := ts.tokens[userID]
    return token, exists
}

func (ts *TokenStore) DeleteToken(userID string) {
    ts.mutex.Lock()
    defer ts.mutex.Unlock()
    delete(ts.tokens, userID)
}
```

## Common OAuth Issues and Solutions

### Token Expiration

Always check token expiry and refresh when needed:

```go
func (ts *TokenStore) GetValidToken(userID, tokenEndpoint, clientID, clientSecret string) (*TokenResponse, error) {
    token, exists := ts.GetToken(userID)
    if !exists {
        return nil, fmt.Errorf("no token found for user %s", userID)
    }

    // ExpiresIn is a lifetime in seconds, not a timestamp, so a real store
    // should record the issue time and compare against issuedAt + ExpiresIn.
    // Here, for simplicity, we assume ExpiresIn was rewritten to an absolute
    // Unix expiry when the token was stored.
    if time.Now().Unix() > int64(token.ExpiresIn) {
        // Token expired: refresh it
        newToken, err := refreshAccessToken(
            tokenEndpoint,
            clientID,
            clientSecret,
            token.RefreshToken,
        )
        if err != nil {
            return nil, err
        }

        ts.StoreToken(userID, newToken)
        return newToken, nil
    }

    return token, nil
}
```

### State Parameter Validation

Always validate the state parameter to prevent CSRF attacks:

```go
func validateState(receivedState, expectedState string) error {
    if receivedState != expectedState {
        return fmt.Errorf("state mismatch: possible CSRF attack")
    }
    return nil
}
```
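For the check above to be meaningful, the state must be unpredictable. A minimal sketch of generating one with `crypto/rand` (the helper name is mine):

```go
import (
    "crypto/rand"
    "encoding/base64"
)

// generateState returns a cryptographically random, URL-safe value
// suitable for use as the OAuth state parameter.
func generateState() (string, error) {
    b := make([]byte, 32) // 256 bits of entropy
    if _, err := rand.Read(b); err != nil {
        return "", err
    }
    return base64.RawURLEncoding.EncodeToString(b), nil
}
```

Store the generated value in the user's session before redirecting, then compare it with the state echoed back on the callback.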

## OpenID Connect (OIDC)

OIDC builds on top of OAuth to provide identity information. When you include the `openid` scope, you receive an ID token along with the access token:

```go
type OIDCTokenResponse struct {
    AccessToken  string `json:"access_token"`
    TokenType    string `json:"token_type"`
    ExpiresIn    int    `json:"expires_in"`
    RefreshToken string `json:"refresh_token"`
    IDToken      string `json:"id_token"`
    Scope        string `json:"scope"`
}

// ID Token contains user information
type IDTokenClaims struct {
    Subject   string `json:"sub"`
    Name      string `json:"name"`
    Email     string `json:"email"`
    Picture   string `json:"picture"`
    ExpiresAt int64  `json:"exp"`
    IssuedAt  int64  `json:"iat"`
}
```

## Building Your Own OAuth Server

If you need to build your own OAuth server, here's a basic structure:

```go
type OAuthServer struct {
    clients map[string]*Client
    codes   map[string]*AuthorizationCode
    tokens  map[string]*AccessToken
}

type Client struct {
    ID           string
    Secret       string
    RedirectURIs []string
    Scopes       []string
}

type AuthorizationCode struct {
    Code        string
    ClientID    string
    UserID      string
    Scopes      []string
    ExpiresAt   time.Time
    RedirectURI string
}

func (server *OAuthServer) HandleAuthorization(w http.ResponseWriter, r *http.Request) {
    clientID := r.URL.Query().Get("client_id")
    redirectURI := r.URL.Query().Get("redirect_uri")
    state := r.URL.Query().Get("state")

    // Validate the client and its registered redirect URI
    client, exists := server.clients[clientID]
    if !exists {
        http.Error(w, "Invalid client", http.StatusBadRequest)
        return
    }

    validRedirect := false
    for _, uri := range client.RedirectURIs {
        if uri == redirectURI {
            validRedirect = true
            break
        }
    }
    if !validRedirect {
        http.Error(w, "Invalid redirect URI", http.StatusBadRequest)
        return
    }

    // Show consent page to user
    // After user approves, generate authorization code
    code := generateAuthorizationCode()
    server.codes[code] = &AuthorizationCode{
        Code:        code,
        ClientID:    clientID,
        UserID:      getCurrentUserID(r),
        ExpiresAt:   time.Now().Add(10 * time.Minute),
        RedirectURI: redirectURI,
    }

    // Redirect back to the client with the code and original state, properly escaped
    redirect, err := url.Parse(redirectURI)
    if err != nil {
        http.Error(w, "Invalid redirect URI", http.StatusBadRequest)
        return
    }
    q := redirect.Query()
    q.Set("code", code)
    q.Set("state", state)
    redirect.RawQuery = q.Encode()
    http.Redirect(w, r, redirect.String(), http.StatusFound)
}
```

## JWT Token Validation

Many OAuth providers use JWT tokens. Here's how to validate them properly:

```go
import (
    "crypto/rsa"
    "encoding/base64"
    "encoding/json"
    "fmt"
    "math/big"
    "net/http"

    // dgrijalva/jwt-go is archived; golang-jwt/jwt is the maintained,
    // API-compatible fork
    "github.com/golang-jwt/jwt/v4"
)

type JWKSResponse struct {
    Keys []struct {
        Kid string `json:"kid"`
        Kty string `json:"kty"`
        Use string `json:"use"`
        N   string `json:"n"`
        E   string `json:"e"`
    } `json:"keys"`
}

func getPublicKey(jwksURL, kid string) (*rsa.PublicKey, error) {
    resp, err := http.Get(jwksURL)
    if err != nil {
        return nil, err
    }
    defer resp.Body.Close()

    var jwks JWKSResponse
    if err := json.NewDecoder(resp.Body).Decode(&jwks); err != nil {
        return nil, err
    }

    for _, key := range jwks.Keys {
        if key.Kid != kid {
            continue
        }

        // N and E are base64url-encoded big-endian integers, not PEM,
        // so build the RSA public key from them directly
        nBytes, err := base64.RawURLEncoding.DecodeString(key.N)
        if err != nil {
            return nil, err
        }
        eBytes, err := base64.RawURLEncoding.DecodeString(key.E)
        if err != nil {
            return nil, err
        }

        e := 0
        for _, b := range eBytes {
            e = e<<8 | int(b)
        }

        return &rsa.PublicKey{N: new(big.Int).SetBytes(nBytes), E: e}, nil
    }

    return nil, fmt.Errorf("key %q not found in JWKS", kid)
}

func validateJWTToken(tokenString, jwksURL string) (*jwt.Token, error) {
    token, err := jwt.Parse(tokenString, func(token *jwt.Token) (interface{}, error) {
        if _, ok := token.Method.(*jwt.SigningMethodRSA); !ok {
            return nil, fmt.Errorf("unexpected signing method: %v", token.Header["alg"])
        }

        kid, ok := token.Header["kid"].(string)
        if !ok {
            return nil, fmt.Errorf("token missing kid header")
        }
        return getPublicKey(jwksURL, kid)
    })

    if err != nil {
        return nil, err
    }

    if !token.Valid {
        return nil, fmt.Errorf("invalid token")
    }

    return token, nil
}
```

## Error Handling

Handle OAuth errors properly to provide a good user experience:

```go
type OAuthError struct {
    Error            string `json:"error"`
    ErrorDescription string `json:"error_description"`
    ErrorURI         string `json:"error_uri"`
}

func handleOAuthError(resp *http.Response) error {
    var oauthErr OAuthError
    if err := json.NewDecoder(resp.Body).Decode(&oauthErr); err != nil {
        return fmt.Errorf("failed to decode error response: %v", err)
    }

    switch oauthErr.Error {
    case "invalid_grant":
        return fmt.Errorf("authorization code expired or invalid")
    case "invalid_client":
        return fmt.Errorf("client authentication failed")
    case "access_denied":
        return fmt.Errorf("user denied access")
    case "invalid_scope":
        return fmt.Errorf("requested scope is invalid")
    default:
        return fmt.Errorf("oauth error: %s - %s", oauthErr.Error, oauthErr.ErrorDescription)
    }
}

func exchangeCodeWithErrorHandling(tokenEndpoint, clientID, clientSecret, code, redirectURI string) (*TokenResponse, error) {
    data := url.Values{
        "grant_type":    {"authorization_code"},
        "code":          {code},
        "redirect_uri":  {redirectURI},
        "client_id":     {clientID},
        "client_secret": {clientSecret},
    }

    resp, err := http.PostForm(tokenEndpoint, data)
    if err != nil {
        return nil, fmt.Errorf("token request failed: %v", err)
    }
    defer resp.Body.Close()

    if resp.StatusCode != http.StatusOK {
        return nil, handleOAuthError(resp)
    }

    var token TokenResponse
    if err := json.NewDecoder(resp.Body).Decode(&token); err != nil {
        return nil, fmt.Errorf("failed to decode token response: %v", err)
    }

    return &token, nil
}
```

## Token Introspection

Check whether a token is still valid using the provider's introspection endpoint (RFC 7662):

```go
type IntrospectionResponse struct {
    Active    bool     `json:"active"`
    Scope     string   `json:"scope"`
    ClientID  string   `json:"client_id"`
    Username  string   `json:"username"`
    TokenType string   `json:"token_type"`
    Exp       int64    `json:"exp"`
    Iat       int64    `json:"iat"`
    Sub       string   `json:"sub"`
    Aud       []string `json:"aud"`
}

func introspectToken(introspectURL, token, clientID, clientSecret string) (*IntrospectionResponse, error) {
    data := url.Values{
        "token": {token},
    }

    req, err := http.NewRequest("POST", introspectURL, strings.NewReader(data.Encode()))
    if err != nil {
        return nil, err
    }

    req.SetBasicAuth(clientID, clientSecret)
    req.Header.Set("Content-Type", "application/x-www-form-urlencoded")

    client := &http.Client{}
    resp, err := client.Do(req)
    if err != nil {
        return nil, err
    }
    defer resp.Body.Close()

    var introspection IntrospectionResponse
    if err := json.NewDecoder(resp.Body).Decode(&introspection); err != nil {
        return nil, err
    }

    return &introspection, nil
}
```

## Production Configuration

Manage OAuth configuration properly:

```go
type OAuthConfig struct {
    ClientID          string
    ClientSecret      string
    AuthURL           string
    TokenURL          string
    IntrospectURL     string
    JWKSURL           string
    RedirectURI       string
    Scopes            []string
    Timeout           time.Duration
    RetryAttempts     int
    TokenCacheTimeout time.Duration
}

func NewOAuthConfig() *OAuthConfig {
    return &OAuthConfig{
        ClientID:          os.Getenv("OAUTH_CLIENT_ID"),
        ClientSecret:      os.Getenv("OAUTH_CLIENT_SECRET"),
        AuthURL:           os.Getenv("OAUTH_AUTH_URL"),
        TokenURL:          os.Getenv("OAUTH_TOKEN_URL"),
        IntrospectURL:     os.Getenv("OAUTH_INTROSPECT_URL"),
        JWKSURL:           os.Getenv("OAUTH_JWKS_URL"),
        RedirectURI:       os.Getenv("OAUTH_REDIRECT_URI"),
        Scopes:            strings.Split(os.Getenv("OAUTH_SCOPES"), ","),
        Timeout:           30 * time.Second,
        RetryAttempts:     3,
        TokenCacheTimeout: 5 * time.Minute,
    }
}

func (c *OAuthConfig) Validate() error {
    if c.ClientID == "" {
        return fmt.Errorf("OAUTH_CLIENT_ID is required")
    }
    if c.ClientSecret == "" {
        return fmt.Errorf("OAUTH_CLIENT_SECRET is required")
    }
    if c.AuthURL == "" {
        return fmt.Errorf("OAUTH_AUTH_URL is required")
    }
    if c.TokenURL == "" {
        return fmt.Errorf("OAUTH_TOKEN_URL is required")
    }
    return nil
}
```

## Retry Logic and Rate Limiting

Handle transient failures and rate limits:

```go
import (
    "math"
    "time"
)

type HTTPClient struct {
    client        *http.Client
    retryAttempts int
    baseDelay     time.Duration
}

func NewHTTPClient(timeout time.Duration, retryAttempts int) *HTTPClient {
    return &HTTPClient{
        client: &http.Client{
            Timeout: timeout,
        },
        retryAttempts: retryAttempts,
        baseDelay:     time.Second,
    }
}

func (h *HTTPClient) doWithRetry(req *http.Request) (*http.Response, error) {
    var lastErr error

    for attempt := 0; attempt <= h.retryAttempts; attempt++ {
        resp, err := h.client.Do(req)
        if err != nil {
            lastErr = err
            if attempt < h.retryAttempts {
                delay := time.Duration(math.Pow(2, float64(attempt))) * h.baseDelay
                time.Sleep(delay)
                continue
            }
            return nil, lastErr
        }

        // Handle rate limiting
        if resp.StatusCode == 429 {
            resp.Body.Close()
            retryAfter := resp.Header.Get("Retry-After")
            if retryAfter != "" {
                if delay, err := time.ParseDuration(retryAfter + "s"); err == nil {
                    time.Sleep(delay)
                    continue
                }
            }
            time.Sleep(h.baseDelay * time.Duration(attempt+1))
            continue
        }

        // Only retry on 5xx errors
        if resp.StatusCode >= 500 && attempt < h.retryAttempts {
            resp.Body.Close()
            delay := time.Duration(math.Pow(2, float64(attempt))) * h.baseDelay
            time.Sleep(delay)
            continue
        }

        return resp, nil
    }

    return nil, lastErr
}
```

## Middleware for API Protection

Protect your APIs with OAuth tokens:

```go
func OAuthMiddleware(config *OAuthConfig) func(http.Handler) http.Handler {
    return func(next http.Handler) http.Handler {
        return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
            authHeader := r.Header.Get("Authorization")
            if authHeader == "" {
                http.Error(w, "Authorization header required", http.StatusUnauthorized)
                return
            }

            tokenParts := strings.Split(authHeader, " ")
            if len(tokenParts) != 2 || tokenParts[0] != "Bearer" {
                http.Error(w, "Invalid authorization header format", http.StatusUnauthorized)
                return
            }

            token := tokenParts[1]

            // Validate token
            introspection, err := introspectToken(config.IntrospectURL, token, config.ClientID, config.ClientSecret)
            if err != nil {
                http.Error(w, "Token validation failed", http.StatusUnauthorized)
                return
            }

            if !introspection.Active {
                http.Error(w, "Token is not active", http.StatusUnauthorized)
                return
            }

            // Add user info to context
            ctx := context.WithValue(r.Context(), "user_id", introspection.Sub)
            ctx = context.WithValue(ctx, "scopes", strings.Split(introspection.Scope, " "))

            next.ServeHTTP(w, r.WithContext(ctx))
        })
    }
}
```

## Testing OAuth Implementation

Test your OAuth flows properly:

```go
func TestOAuthFlow(t *testing.T) {
    // Mock OAuth server
    mockServer := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        switch r.URL.Path {
        case "/token":
            response := TokenResponse{
                AccessToken:  "mock_access_token",
                TokenType:    "Bearer",
                ExpiresIn:    3600,
                RefreshToken: "mock_refresh_token",
                Scope:        "read write",
            }
            json.NewEncoder(w).Encode(response)
        case "/introspect":
            response := IntrospectionResponse{
                Active:   true,
                Scope:    "read write",
                ClientID: "test_client",
                Sub:      "user123",
                Exp:      time.Now().Add(time.Hour).Unix(),
            }
            json.NewEncoder(w).Encode(response)
        }
    }))
    defer mockServer.Close()

    config := &OAuthConfig{
        ClientID:      "test_client",
        ClientSecret:  "test_secret",
        TokenURL:      mockServer.URL + "/token",
        IntrospectURL: mockServer.URL + "/introspect",
    }

    // Test token exchange
    token, err := exchangeCodeWithErrorHandling(
        config.TokenURL,
        config.ClientID,
        config.ClientSecret,
        "test_code",
        "http://localhost:8080/callback",
    )

    assert.NoError(t, err)
    assert.Equal(t, "mock_access_token", token.AccessToken)

    // Test token introspection
    introspection, err := introspectToken(
        config.IntrospectURL,
        token.AccessToken,
        config.ClientID,
        config.ClientSecret,
    )

    assert.NoError(t, err)
    assert.True(t, introspection.Active)
}
```

## Key Points

1. **OAuth is about delegation of access, not authentication**
2. **Always use HTTPS** for OAuth flows
3. **Implement proper state validation** to prevent CSRF attacks
4. **Use PKCE for public clients** - Mobile apps and SPAs require this
5. **Store tokens securely** and handle expiration properly
6. **Follow the principle of least privilege** with scopes
7. **Validate all parameters** and handle errors gracefully

## Summary

Understanding OAuth is essential when building modern applications that integrate with external services. The concepts may appear complex initially, but the patterns become clear with implementation experience.

These fundamentals helped me build a robust IAM service at Freightify that handles thousands of users daily. OAuth, when implemented correctly, provides a secure and scalable way to handle authorization in distributed systems.

Security should be built from the ground up, not added as an afterthought. OAuth provides the necessary tools for secure implementation.
]]></content:encoded>
            <author>raghunandhanvr@outlook.com (Raghunandhan VR)</author>
        </item>
        <item>
            <title><![CDATA[munnar]]></title>
            <link>https://raghu.app/writings/munnar</link>
            <guid isPermaLink="false">https://raghu.app/writings/munnar</guid>
            <pubDate>Mon, 06 Apr 2026 05:28:42 GMT</pubDate>
            <content:encoded><![CDATA[import { BlogViewCounter } from '@/app/components/ui/blog-view-counter'
import { TableOfContents } from '@/app/components/ui/table-of-contents'

export const metadata = {
  title: 'Munnar',
  description: 'Just a few tips for exploring Munnar',
  alternates: {
    canonical: '/writings/munnar',
  },
  openGraph: {
    images: [
      {
        url: `/api/og?title=Munnar`,
        width: 1200,
        height: 630,
      },
    ],
  },
};

# Munnar
<BlogViewCounter slug='/writings/munnar' createdAt={new Date('2025-09-27')} />
<TableOfContents />

I've explored most of the mountains in South India, and Munnar is simply the best, no question.

Still, for those who want a true deep-forest experience, go to [Gavi](https://www.google.com/search?q=Gavi+Forest). Now, let's talk about Munnar.

Being from a village near Udumalpet, I'm less than 60km from Munnar, so I've spent countless weekends exploring these hills. 

Four major roads lead out of Munnar town:
- One towards **Udumalpet** (my home route)
- One towards **Vattavada** (most popular tourist route)
- One towards **Theni** (Bodi and Kumili in Poppara junction)
- Another towards **Cochin** (fully damaged and nasty as of 2025)

All roads except Cochin are fantastic for travel.

## Vattavada Road (Common Tourist Spots, Constant Traffic)

These places suit **families and big groups**, and this road stays jammed most of the way to Top Station. Perfect for **first-time visitors** who want the typical Munnar experience.

1. [Boomer uncle tourist spots](https://www.google.com/search?q=Munnar+botanical+garden+elephant+safari) like botanical garden, elephant safari. Just skip them.
2. [Top Station](https://www.google.com/search?q=Top+Station+Munnar) - The highest point, great views but always crowded
3. [Vattavada](https://www.google.com/search?q=Vattavada+Munnar) - Small village with organic farms, less touristy
4. [Dams on the way](https://www.google.com/search?q=Munnar+dams) - Good photo spots, especially during monsoon
5. [Tea estates on the road](https://www.google.com/search?q=Munnar+tea+estates) - Multiple stops for tea tasting

## Theni Road (Mostly Trekking)

I highly recommend this for **trekking enthusiasts**. This route takes you through some of the most **challenging and rewarding treks** in the region.

1. [Chokramudi Trek](https://www.google.com/search?q=Chokramudi+Trek+Munnar) - Moderate difficulty, 3-4 hours, amazing sunrise views
2. [Kollukumalai Trek](https://www.google.com/search?q=Kollukumalai+Trek+Munnar) - Starts at 5am, need to reach before sunrise, but worth it
3. [Mesapulimala Trek](https://www.google.com/search?q=Mesapulimala+Trek+Munnar) - One of the toughest, 6-8 hours, for experienced trekkers
4. [Tea estates on the road](https://www.google.com/search?q=Theni+road+tea+estates+Munnar) - Less crowded than Vattavada route, better for photography

## Udumalpet Road

This is **my home route** - less touristy, more authentic. Perfect for those who want to experience the **real Munnar without the crowds**.

1. [Thoovanam waterfalls Trek](https://www.google.com/search?q=Thoovanam+waterfalls+Trek+Munnar) - 2-3 hours, beautiful waterfall, good for families
2. [Kanthaloor half day visit](https://www.google.com/search?q=Kanthaloor+Munnar) - Apple orchards and spice gardens, unique to this area
3. [Anamudi](https://www.google.com/search?q=Anamudi+Munnar) - South India's highest peak, restricted access but worth the effort
4. [Chinnar forest safari](https://www.google.com/search?q=Chinnar+forest+safari+Munnar) - Wildlife spotting, best during early morning or late evening
5. [Tea estates on the road](https://www.google.com/search?q=Udumalpet+road+tea+estates+Munnar) - Working plantations, not just tourist spots

## Hostels

1. [Zostel](https://www.google.com/search?q=Zostel+Munnar)
2. [The hostellers](https://www.google.com/search?q=The+hostellers+Munnar)

## Campsites

1. [Cox Cargill](https://www.google.com/search?q=Cox+Cargill+Munnar)
2. [Cloud Farm Munnar](https://www.google.com/search?q=Cloud+Farm+Munnar)
3. [Outernest campers](https://www.google.com/search?q=Outernest+campers+Munnar)
4. [Magic valley](https://www.google.com/search?q=Magic+valley+Munnar)
5. [Wild sherpas tenting and camping](https://www.google.com/search?q=Wild+sherpas+Munnar)

## Decent stays for family

1. [Tea county](https://www.google.com/search?q=Tea+county+Munnar)
2. [The Fog Munnar](https://www.google.com/search?q=The+Fog+Munnar)
3. [Thrill Holiday](https://www.google.com/search?q=Thrill+Holiday+Munnar)
4. [K Mansion](https://www.google.com/search?q=K+Mansion+Munnar)
5. [Cloud castle resorts](https://www.google.com/search?q=Cloud+castle+resorts+Munnar)

## Luxury stays

1. [Eden woods resort](https://www.google.com/search?q=Eden+woods+resort+Munnar)
2. [Elephant passage](https://www.google.com/search?q=Elephant+passage+Munnar)
3. [Ragamaya resort](https://www.google.com/search?q=Ragamaya+resort+Munnar)
4. [The panoramic getaway](https://www.google.com/search?q=The+panoramic+getaway+Munnar)

## Best Airbnbs

1. [Mudhouse marayoo](https://www.google.com/search?q=Mudhouse+marayoo+Munnar)
2. [Footprint - wilderness experience](https://www.google.com/search?q=Footprint+Munnar+wilderness)

## Peak Male Experience (5 Days)

**5 days for guys who want to feel themselves 100% in the forest:**

- **Day 1**: **Deep in the forest and camp**, one day inside the forest, no phone, no internet, no distractions, just you and the forest
- **Day 2**: **Hidden waterfall trek** through dense forests, ending at a crystal clear natural pool
- **Day 3**: **Off-road biking** to mountain peaks, just pure one day adventure, peak male experience
- **Day 4**: **Volunteering in Vattavada** - work with locals
- **Day 5**: **Night camping** under starry skies at a secluded mountain peak

*This will make you feel the forest and you.* 

Contact me at **+91 8667322394**.

Still need more adventure? Let's talk about [Gavi](https://www.google.com/search?q=Gavi).

## Self Drive Bikes

1. [Gokulam](https://www.google.com/search?q=Gokulam+Munnar+bike+rental): +91 9447237165
2. [Shalom](https://www.google.com/search?q=Shalom+Munnar+bike+rental): +91 8281792798
3. [MBR](https://www.google.com/search?q=MBR+Munnar+bike+rental): +91 9447303119
4. [Sangeetha](https://www.google.com/search?q=Sangeetha+Munnar+bike+rental): +91 9447220648

## Self Drive Cars

1. [Royalpicks](https://www.google.com/search?q=Royalpicks+Munnar+car+rental): +91 9629926888

## The plans below are for those new to Munnar

## Boys Trip (New to Munnar)

**4 days with trekking, volunteering, and authentic experiences for first-time visitors:**

- **Day 1**: Rent 300cc+ bike, explore **Vattavada route** early morning, stay in campsites (40km from Munnar)
- **Day 2**: Return from Vattavada, head to **Udumalpet road**, stay in Marayoor, explore Kanthaloor road (60km from Munnar)
- **Day 3**: Start at 5am for **Kollukumalai trek** (reach before sunrise), return and do **Chokramudi trek** based on energy (both on Theni road, 30km from Munnar)
- **Day 4**: **Thoovanam falls** and **Chinnar safari**, volunteer in Vattavada, explore remaining spots

## Family Trip Plans

### 3-Day Family Plan (No Trekking)

- **Day 1**: Vattavada route - Top Station, tea estates, dams, botanical garden
- **Day 2**: Udumalpet road - Kanthaloor, Thoovanam falls, Chinnar safari
- **Day 3**: Munnar town - Tea museum, local markets, relaxation

### 3-Day Family Plan (Light Trekking)

- **Day 1**: Vattavada route + easy trek to Top Station
- **Day 2**: Udumalpet road - Kanthaloor, Thoovanam falls (easy trek), Chinnar safari
- **Day 3**: Theni road - Chokramudi trek (moderate), tea estates

### 3-Day Family Plan (With Elderly)

- **Day 1**: Munnar town - Tea museum, local sightseeing, botanical garden
- **Day 2**: Vattavada route - Dams, tea estates (no trekking), comfortable stays
- **Day 3**: Udumalpet road - Kanthaloor (by car), Chinnar safari (comfortable vehicle)
]]></content:encoded>
            <author>raghunandhanvr@outlook.com (Raghunandhan VR)</author>
        </item>
        <item>
            <title><![CDATA[homelab]]></title>
            <link>https://raghu.app/writings/homelab</link>
            <guid isPermaLink="false">https://raghu.app/writings/homelab</guid>
            <pubDate>Mon, 06 Apr 2026 05:28:42 GMT</pubDate>
            <content:encoded><![CDATA[import MermaidDiagram from '@/app/components/md/mermaid'
import { BlogViewCounter } from '@/app/components/ui/blog-view-counter'
import { TableOfContents } from '@/app/components/ui/table-of-contents'

export const metadata = {
  title: 'From a Gaming Laptop to a Family-Managed Server',
  description: 'Transform a gaming laptop into a powerful home server, managed remotely from Chennai with family support. A complete guide covering everything from basic setup to crypto mining, with plenty of real-world experiences.',
  alternates: {
    canonical: '/writings/homelab',
  },
  openGraph: {
    images: [
      {
        url: `/api/og?title=From+a+Gaming+Laptop+to+a+Family-Managed+Server`,
        width: 1200,
        height: 630,
      },
    ],
  },
};

# From a Gaming Laptop to a Family-Managed Server
<BlogViewCounter slug="/writings/homelab" createdAt={new Date('2024-09-21')} />
<TableOfContents />

## Introduction

Moving to Chennai for an SDE role meant leaving behind my trusted **ASUS ROG laptop** - a machine that had faithfully served through countless RCB matches and intense coding sessions. With its **RTX 3060**, **16GB RAM**, and blazing-fast **1TB SSD**, it felt wrong to let it gather dust at home. One particularly frustrating evening, while battling with cloud costs for my side projects and sipping my evening filter coffee, an idea struck. Why not transform this gaming beast into a home server?

## The Birth of a Distributed System

But this wasn't going to be just another server setup. Being **600 kilometers away** from the hardware meant this would become a family project, whether they were ready for it or not. The gaming machine that once ran **PUBG** would now need to run **Docker containers**, and my casual gaming discussions with Dad would turn into talks about **system monitoring** and **power backups**.

Yes, I can finally SSH into my own server and deploy or do whatever I want.

Let's see what we're building here:

<MermaidDiagram diagram={`
graph TD
    Internet((Internet)) --> CF[Cloudflare DNS]
    CF --> Router[Home Router]
    Router -->|Port Forward| Server[Gaming Laptop Server]
    Server --> |Container| NGP[Nginx Proxy]
    NGP --> |Container| Apps[Application Stack]
    NGP --> |Container| Dev[Development Stack]
    NGP --> |Container| AI[AI Models]
    Apps --> Media[Media Server]
    Apps --> Cloud[Personal Cloud]
    Dev --> Git[Git Server]
    Dev --> CI[CI/CD Pipeline]
    AI --> LLM[Local LLMs]
    AI --> Code[Code Assistants]
`}/>

### Running a Server Remotely

Running a server remotely adds a whole new dimension to the concept of **distributed systems**. Your primary "operations team" consists of family members who think rebooting means switching the power off and on again. Every aspect of the setup needed to be foolproof, every instruction crystal clear, and every potential failure accounted for. **"Mom, laptop mela oru button irukum, please click it once?"** became our version of high-availability system management.

> **First Deployment Story**: During the initial setup weekend, I spent more time on video calls explaining what a power button looks like than actually configuring the server. Dad's suggestion to **"put a bright sticker near important buttons"** turned out to be surprisingly effective DevOps wisdom!

## The Family Support Matrix

Our version of **high availability** relies more on human reliability than system redundancy:

- **Mom**: 
  - **Primary Role**: Power Management 
  - **Backup Role**: Basic Restarts 
  - **Special Skills**: UPS beep interpreter

- **Dad**: 
  - **Primary Role**: Electricity Monitor 
  - **Backup Role**: Backup Power 
  - **Special Skills**: Power cut predictor

- **Sister**: 
  - **Primary Role**: Service Monitoring 
  - **Backup Role**: Cable Management 
  - **Special Skills**: Grafana dashboard reader

## Technical Foundation

### Infrastructure Evolution: From Windows to Linux

Remember trying to run multiple Docker containers on Windows? It's like trying to find parking in T Nagar during Diwali season - technically possible but way too complicated. Here's why Windows doesn't cut it:

- High resource overhead
- Random updates (worse than Chennai traffic during rains)
- WSL2 + Docker = Unnecessary complexity
- Permission systems more complex than ECR road signals

### Base Installation

First, grab Ubuntu Server (LTS version). Like choosing between idli and dosa - go with what's proven and reliable. During installation:
- Minimal installation (we're not running a gaming PC anymore)
- Enable OpenSSH server (for remote access)
- Skip optional snaps (we'll use Docker instead)

### Power Management (Critical for Laptop Servers!)

> **Power Cut Protocol**: Created a laminated guide showing the UPS beep patterns and their meanings. Mom now identifies critical battery status faster than any monitoring tool!

My native's power situation can be as unpredictable as Chennai Super Kings' batting order. Let's handle that:

```bash
#!/bin/bash
# /usr/local/bin/power-handler.sh

LOG="/var/log/power-handler.log"

check_power() {
    power_status=$(cat /sys/class/power_supply/AC*/online)
    
    if [[ $power_status == "0" ]]; then
        echo "$(date): Power lost, initiating safe mode" >> $LOG
        systemctl suspend
    fi
}

while true; do
    check_power
    sleep 5
done
```

Make it a service:

```bash
sudo nano /etc/systemd/system/power-handler.service

[Unit]
Description=Power Management Service
After=network.target

[Service]
Type=simple
ExecStart=/usr/local/bin/power-handler.sh
Restart=always

[Install]
WantedBy=multi-user.target
```

Disable lid-close suspend so the server stays up when the lid is shut:

```bash
sudo sed -i 's/#HandleLidSwitch=suspend/HandleLidSwitch=ignore/' /etc/systemd/logind.conf
sudo systemctl restart systemd-logind
```

### Network and DDNS Setup

Next, let's solve the dynamic IP problem. Home ISPs love changing IPs more frequently than Thatha changes TV channels:

1. **Assign a Static IP**: Configure your router to assign a static IP to your gaming laptop. This ensures that the laptop always has the same local IP address, making it easier to manage.

    ```bash
    ip addr | grep enp
    sudo nano /etc/netplan/00-installer-config.yaml #update the addresses
    ```

2. **Port Forwarding**: Forward the required ports on your router to the static IP of your gaming laptop. This allows external access to services running on the laptop. Common ports to forward include:
   - **80** (HTTP)
   - **443** (HTTPS)
   - **any other ports your services need**

<details>
<summary>DDNS Update Script</summary>

```bash
#!/bin/bash
API_TOKEN="your_cloudflare_token"
ZONE_ID="your_zone_id"
RECORD_ID="your_record_id"
DOMAIN="server.yourdomain.com"
LOG_FILE="/var/log/ddns-updater.log"

CURRENT_IP=$(curl -s http://ipv4.icanhazip.com)
OLD_IP=$(cat /tmp/last_ip.txt 2>/dev/null)

if [[ "$CURRENT_IP" != "$OLD_IP" ]]; then
    echo "$(date): IP changed from $OLD_IP to $CURRENT_IP" >> $LOG_FILE
    
    RESPONSE=$(curl -s -X PUT "https://api.cloudflare.com/client/v4/zones/$ZONE_ID/dns_records/$RECORD_ID" \
        -H "Authorization: Bearer $API_TOKEN" \
        -H "Content-Type: application/json" \
        --data "{
            \"type\":\"A\",
            \"name\":\"$DOMAIN\",
            \"content\":\"$CURRENT_IP\",
            \"proxied\":true
        }")
    
    if echo "$RESPONSE" | grep -q '"success":true'; then
        echo "$CURRENT_IP" > /tmp/last_ip.txt
        telegram-send "Server IP updated successfully"
    else
        telegram-send "Failed to update server IP!"
    fi
fi
```

</details>

Add to crontab to run every 5 minutes:
```bash
*/5 * * * * /usr/local/bin/cloudflare-ddns.sh
```

### Basic Infrastructure

Here's our complete setup:

<MermaidDiagram diagram={`
graph TD
    Internet((Internet)) --> CF[Cloudflare]
    CF --> Router[Home Router]
    Router --> Nginx[Nginx Proxy]
    Nginx --> |Web Services| Services[Service Stack]
    Nginx --> |Media| Plex[Media Server]
    Nginx --> |Development| Dev[Dev Stack]
    Nginx --> |Crypto| Trading[Trading Stack]
    
    subgraph Service Stack
        Monitoring[Grafana]
        Cloud[NextCloud]
        Git[Gitea]
    end
    
    subgraph Recovery
        Power[Power Management]
        Auto[Auto Recovery]
        Alert[Alert System]
    end

    Services --> Recovery
`}/>

### Core Services Setup

**Nginx Proxy Manager (The Gateway):** Manages routing and SSL termination using Docker. Essential for securing and directing incoming network traffic.
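
As a sketch, Nginx Proxy Manager runs nicely from a single Compose file (the `jc21/nginx-proxy-manager` image and ports 80/443/81 are the project's documented defaults; the volume paths are just my layout):

```yaml
# docker-compose.yml (sketch)
services:
  nginx-proxy-manager:
    image: jc21/nginx-proxy-manager:latest
    restart: unless-stopped
    ports:
      - "80:80"    # http
      - "443:443"  # https
      - "81:81"    # admin ui
    volumes:
      - ./data:/data
      - ./letsencrypt:/etc/letsencrypt
```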

**Media Server (Plex):** Organizes and streams multimedia content, optimized with Docker and hardware acceleration to enhance media delivery.

**Development Environment:** Incorporates Gitea for code hosting, Postgres for database operations, and a Hardhat node for blockchain interactions, all maintained within Docker for environment consistency.

**Install whatever you need:** The stack is flexible enough to host any number of additional services; just configure each one deliberately for your environment.

### Crypto & Trading Setup

A dedicated trading bot and a mining-monitor dashboard, both Docker-based, handle trading on Binance and keep tabs on mining operations.

**Installation Note:** As with the rest of the stack, install and configure only the services your setup actually needs. That keeps the server lean and easy to reason about.

## Family Operations Guide

### Visual Status Indicators

- **Green LED**: All Systems Normal 🟢
  - **Action Required**: None 
  - **Family Member**: Anyone

- **Yellow LED**: Check Required 🟡
  - **Action Required**: Call if persists 
  - **Family Member**: Sister

- **Red LED**: Immediate Action 🔴
  - **Action Required**: Call Immediately 
  - **Family Member**: Mom

- **Trading Alert**: Price Target Hit 📊
  - **Action Required**: Check Dashboard 
  - **Family Member**: Dad

## Monitoring Dashboard

### System Metrics

Install Grafana and Prometheus on the server. Together they provide dashboards and alerting for CPU, memory, disk, and network health, so you can spot and fix issues before anyone at home notices. You can even track the server's power status and battery health by feeding UPS events (via the beep interpreter) into the same dashboards.
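
A minimal Prometheus scrape config is enough to get host metrics flowing into Grafana (this assumes `node_exporter` on its default port 9100 and optional Docker daemon metrics on 9323; adjust the targets to your host):

```yaml
# /etc/prometheus/prometheus.yml (sketch)
global:
  scrape_interval: 15s

scrape_configs:
  - job_name: "node"
    static_configs:
      - targets: ["localhost:9100"]   # node_exporter
  - job_name: "docker"
    static_configs:
      - targets: ["localhost:9323"]   # docker daemon metrics, if enabled
```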

### Alert System

- **Disk Space Low**: 
  - **Message**: "Storage running low!" 
  - **Action**: Delete old movies 
  - **Family Member**: Sister

- **High CPU**: 
  - **Message**: "Server needs rest" 
  - **Action**: Wait 30 minutes 
  - **Family Member**: Anyone

- **Power Issue**: 
  - **Message**: "Check UPS" 
  - **Action**: Verify power 
  - **Family Member**: Mom

- **Trading Alert**: 
  - **Message**: "Price target hit!" 
  - **Action**: Check trading dashboard 
  - **Family Member**: Dad

If a container dies, a cron-driven recovery script brings it back automatically:

```bash
#!/bin/bash
# Auto Recovery Script
SERVICES=("plex" "gitea" "nginx" "trading-bot" "mining-monitor")
LOG="/var/log/auto-recovery.log"

check_and_restart() {
    local service=$1
    # 2>/dev/null keeps a missing container from spamming stderr
    if [[ $(docker inspect -f '{{.State.Running}}' "$service" 2>/dev/null) != "true" ]]; then
        echo "[$(date)] Restarting $service" >> "$LOG"
        docker restart "$service"
    fi
}

for service in "${SERVICES[@]}"; do
    check_and_restart "$service"
done
```

## Future Plans

### Hardware Upgrades
- **Better UPS system**: High priority
- **4G backup internet**: Medium priority
- **External storage array**: Low priority
- **Mining optimization**: Low priority

### Services to Add
- **Home automation hub**: Medium priority
- **Network-wide ad blocking**: High priority
- **Advanced trading algorithms**: Low priority
- **ML-based prediction models**: Medium priority

### Family Training
- **Basic Docker commands**: High priority
- **Grafana dashboard reading**: Medium priority
- **Trading dashboard monitoring**: Low priority
- **Basic security protocols**: High priority

## Lessons Learned

What started as a way to repurpose my gaming laptop has evolved into a full family project. The machine that once ran **PUBG** now serves as our family's digital hub, running everything from **AI models** to **crypto operations**. More importantly, it's brought an unexpected tech awareness to the family - from Mom's expertise in UPS signals to Sister's proud Grafana dashboard monitoring.

My family thought I was slightly crazy when I suggested this project. Now Dad checks the mining stats before his morning coffee, Sister has become our home's junior DevOps engineer, and Mom knows exactly which beep means what. Who would've thought a gaming laptop could bring both technology and family together this way?

## Conclusion

Remember:
1. **Make everything foolproof**
2. **Document extensively**
3. **Keep family involved**
4. **Appreciate the support**
5. **Plan for failures**
6. **Keep security tight** (especially if you're running anything interesting)

Whether you're a developer looking to save on cloud costs or someone wanting to learn about self-hosting, remember - with some creativity and family support, that old gaming laptop can become something amazing!

P.S. Special thanks to my family, who now casually drops terms like **"hash rate"** and **"system updates"** in regular conversations! And to potential attackers - yes, this blog post is also a honeypot. Have fun!]]></content:encoded>
            <author>raghunandhanvr@outlook.com (Raghunandhan VR)</author>
        </item>
        <item>
            <title><![CDATA[fault-tolerance]]></title>
            <link>https://raghu.app/writings/fault-tolerance</link>
            <guid isPermaLink="false">https://raghu.app/writings/fault-tolerance</guid>
            <pubDate>Mon, 06 Apr 2026 05:28:42 GMT</pubDate>
            <content:encoded><![CDATA[import { BlogViewCounter } from "@/app/components/ui/blog-view-counter";
import { TableOfContents } from "@/app/components/ui/table-of-contents";

export const metadata = {
  title: "Fault Tolerance",
  description:
    "Understanding fault tolerance patterns, isolation strategies, and operational practices for building resilient distributed systems.",
  alternates: {
    canonical: "/writings/fault-tolerance",
  },
  openGraph: {
    images: [
      {
        url: `/api/og?title=Fault+Tolerance`,
        width: 1200,
        height: 630,
      },
    ],
  },
};

# Fault Tolerance

<BlogViewCounter
  slug="/writings/fault-tolerance"
  createdAt={new Date("2025-07-20")}
/>
<TableOfContents />

## Isolation

- Physical and logical independence between components
- Failure containment—prevent cascading failures
- Critical path with minimal dependencies
- Failures remain localized to origin component

```sql
-- auth service down? order queries still execute

select * from orders where user_id = 123;
```

## Redundancy

- Multiple copies of every critical component
- Isolated replicas across availability zones
- Geographic distribution for regional failure protection
- Eliminate single points of failure

```
primary: us-east-1a | replica1: us-east-1b | replica2: us-east-1c
```

## Static Stability

- Maintain last known good state during failures
- Overprovision capacity for load absorption
- No configuration changes during incidents
- Fail to known safe state

```yaml
# config service fails, use cached configuration

last_known_good: { max_connections: 1000, timeout: 30s }
```

## Architecture Patterns

### Control Plane vs Data Plane

- Control plane: management, billing, configuration
- Data plane: serves traffic, stores data, operates independently
- Unidirectional dependency: data plane never depends on control plane

```
control plane api down → database continues serving queries
```

### Multi-Zone/Multi-Region

- Minimum three availability zones per cluster
- Automatic failover between zones
- Read replicas across multiple regions
- Regional promotion capabilities

```
cluster topology:

primary(us-east) → read_replica(eu-west) → read_replica(ap-south)
```

## Operational Practices

### Continuous Failover Testing

- Weekly production failover exercises
- Proactive issue detection
- Query buffering during transitions
- Failover as standard operation

```bash
# scheduled weekly failover

mysql> set global read_only=1; -- on current primary: demote
mysql> set global read_only=0; -- on chosen replica: promote
```

### Progressive Delivery

- Deploy to development environments first
- Feature flags enable granular rollout control
- Multi-week validation before production
- Minimize change impact radius

```json
{
  "feature_new_replication": {
    "dev": true,
    "staging": true,
    "prod": false
  }
}
```

### Synchronous Replication

- Commit acknowledgment requires replica confirmation
- Zero data loss during failovers
- All replicas maintain promotion readiness
- MySQL semi-sync, Postgres synchronous commit

```sql
-- mysql semi-sync: require minimum 1 replica acknowledgment

set global rpl_semi_sync_master_wait_for_slave_count = 1;
```

## Failure Handling

### Instance Level

- Immediate replica promotion on primary failure
- Block storage: volume detach/reattach to healthy instances
- Local storage: provision replacements, decommission failed instances

```bash
# ebs volume migration

aws ec2 detach-volume --volume-id vol-dead
aws ec2 attach-volume --volume-id vol-dead --instance-id i-healthy --device /dev/sdf
```

### Zone Level

- Automatic failover to healthy zone replicas
- Query routing transparently redirects traffic
- Zero manual intervention required

```
ProxySQL: us-east-1a (failed) → us-east-1b (new primary)

application connection strings remain unchanged
```

### Region Level

- Regional clusters operate independently
- Cross-region failures isolated
- Manual promotion of read-only regions available

```sql
-- promote eu-west read replica to primary (postgres 12+)

select pg_promote();
```
]]></content:encoded>
            <author>raghunandhanvr@outlook.com (Raghunandhan VR)</author>
        </item>
        <item>
            <title><![CDATA[dx]]></title>
            <link>https://raghu.app/writings/dx</link>
            <guid isPermaLink="false">https://raghu.app/writings/dx</guid>
            <pubDate>Mon, 06 Apr 2026 05:28:42 GMT</pubDate>
            <content:encoded><![CDATA[import { BlogViewCounter } from '@/app/components/ui/blog-view-counter'
import { TableOfContents } from '@/app/components/ui/table-of-contents'

export const metadata = {
  title: 'On Developer Experience',
  description: 'Some thoughts on what makes a great DX',
  alternates: {
    canonical: '/writings/dx',
  },
  openGraph: {
    images: [
      {
        url: `/api/og?title=On+Developer+Experience`,
        width: 1200,
        height: 630,
      },
    ],
  },
};

# On Developer Experience
<BlogViewCounter slug='/writings/dx' createdAt={new Date('2025-01-24')} />
<TableOfContents />

My perspective on what creates a good developer experience:

## Shipping

- Stop talking, start shipping—just get it out
- We’re only as good as our last release

## Collaboration

- Why waste time in meetings when pair programming gets things done?
- Clear, direct chats beat long email threads

## Intensity

- Think big—actually do it
- If something feels off, say it
- Don’t shy away from tough questions

## Craftsmanship

- Details matter—take full ownership of what you build
- Leave the code cleaner than you found it, but only if you’re working on that part

## Autonomy

- Figure out your role, then own it
- If you’re stuck, ask for help early and move on

## Humility

- It’s fine to be wrong—just learn and move forward
- Confidence is good; arrogance isn’t
- Be open to others’ ideas, even if you’re sure you’re right
]]></content:encoded>
            <author>raghunandhanvr@outlook.com (Raghunandhan VR)</author>
        </item>
        <item>
            <title><![CDATA[deletion-focused]]></title>
            <link>https://raghu.app/writings/deletion-focused</link>
            <guid isPermaLink="false">https://raghu.app/writings/deletion-focused</guid>
            <pubDate>Mon, 06 Apr 2026 05:28:42 GMT</pubDate>
            <content:encoded><![CDATA[import { BlogViewCounter } from '@/app/components/ui/blog-view-counter'
import { TableOfContents } from '@/app/components/ui/table-of-contents'

export const metadata = {
  title: 'Code for Deletion, Not Reuse',
  description: 'Good code isn\'t reusable; it\'s deletable. Every abstraction is a bet on future requirements you\'ll probably get wrong.',
  alternates: {
    canonical: '/writings/deletion-focused',
  },
  openGraph: {
    images: [
      {
        url: `/api/og?title=Code+for+Deletion%2C+Not+Reuse`,
        width: 1200,
        height: 630,
      },
    ],
  },
};

# Code for Deletion, Not Reuse
<BlogViewCounter slug="/writings/deletion-focused" createdAt={new Date('2025-07-29')} />
<TableOfContents />

**TL;DR**: Good code isn't **reusable**; it's **deletable**. Every abstraction is a bet on future requirements you'll probably get wrong. Write code that's **easy to throw away**.

## Why?

Software requirements **change faster than we can predict them**. The code you write today solving tomorrow's imagined problems becomes the **legacy nightmare you can't remove** next year.

**Key Insights**:
- **Duplication is cheaper than the wrong abstraction**
- Tight coupling makes change expensive. **Optimize for disposal** instead
- **Maintenance cost > Writing cost**
- Ask **"How do I delete this?"** not "How do I reuse this?"

## The Reusability Trap

Every shared abstraction creates a **web of dependencies**. The more code that depends on your "reusable" component, the more expensive it becomes to change until it's **effectively frozen forever**.

```go
// What starts as "reusable"
type AuthService interface {
    Authenticate(ctx context.Context, opts ...AuthOption) (*User, error)
}

// Becomes unmaintainable
func (a *AuthManager) Authenticate(ctx context.Context, opts ...AuthOption) (*User, error) {
    // 47 microservices depend on this
    // 15 auth providers
    // 500 lines of edge cases
    // Can't change without breaking everything
}
```

**Every consumer multiplies the cost of change.**

## But What About DRY?

**The Counterargument**: "Don't Repeat Yourself is a fundamental principle! Duplication leads to bugs when you update one place but forget another. We should abstract early to prevent inconsistencies."

**The Reality**: DRY prevents one type of bug but creates another: **premature abstraction**. Yes, duplication can cause sync issues, but **wrong abstractions cause architectural paralysis**. A duplicated bug is annoying. A wrong abstraction touching 50 files is **a crisis**.

```go
// The DRY advocate's dream
type DataProcessor interface {
    Process(data interface{}) (interface{}, error)
}

// The reality: each implementation fights the abstraction
func (p *JSONProcessor) Process(data interface{}) (interface{}, error) {
    // 40 lines of type assertions and special cases
}
```

## Copy-Paste Driven Development

Duplication lets you understand patterns through **experience, not speculation**. It's easier to extract the right abstraction from **three examples** than to guess it from one.

```go
// Monday: Stripe handler
func handleStripe(amount int64, token string) error {
    return stripe.Charge(amount, token)
}

// Tuesday: PayPal handler (copied)
func handlePayPal(amount int64, token string) error {
    return paypal.Process(amount, token)
}

// Friday: Pattern emerges naturally
type PaymentHandler interface {
    Process(amount int64, token string) error
}
```

**Abstraction after repetition, not before.**

## "But This Doesn't Scale!"

**The Counterargument**: "In large organizations, we need shared libraries and consistent patterns. Without reusable components, every team reinvents the wheel. This leads to chaos."

**The Reality**: Shared libraries become **dependency nightmares**. That "standard" authentication library? Now you need **6 months and 12 teams** to agree on any change. Meanwhile, teams work around it, creating the **very inconsistency you tried to avoid**.

```go
// The "shared" library everyone depends on
import "company/shared/auth" // version locked since 2019

// What teams actually do
func authenticateUser(token string) (*User, error) {
    // Call the shared library
    user, err := auth.Validate(token)
    if err != nil {
        return nil, err
    }

    // Then work around its limitations
    if user.Type == "special_case_shared_lib_doesnt_handle" {
        // 50 lines of workarounds
    }
    return user, nil
}
```

## Layer by Volatility

**Business logic changes constantly. Infrastructure rarely does.** Keep them separate: **volatile code at the top, stable code at the bottom**. This way, you **mostly delete from the top**.

```
┌─────────────────┐
│ Business Logic  │ ← Changes daily (easy to delete)
├─────────────────┤
│ Domain Models   │ ← Changes monthly
├─────────────────┤
│ HTTP/Database   │ ← Changes yearly (hard to delete)
└─────────────────┘
```

## "What About Code Reviews?"

**The Counterargument**: "Duplicated code makes reviews harder. Reviewers see the same logic repeatedly. Abstractions make code more readable and reviewable."

**The Reality**: Reviewing a **wrong abstraction is worse than reviewing duplication**. With duplication, you **see exactly what code does**. With a bad abstraction, you chase through **layers of indirection** wondering why `AbstractFactoryBuilderStrategy` exists.

```go
// Easy to review (even if duplicated)
func calculateTax(amount float64) float64 {
    return amount * 0.08
}

// "Abstracted" version
func calculateTax(amount float64) float64 {
    return getTaxStrategy().
        withRegion(getRegion()).
        withRules(loadRules()).
        calculate(amount)
}
```

**Which would you rather debug at 3 AM?**

## The Deletion Checklist

Before writing code, ask:
- **Can I delete this without touching other files?**
- **What breaks when I remove this?**
- Is this solving **today's problem** or **tomorrow's maybe-problem**?
- Am I creating an abstraction from **actual patterns** or **imagined ones**?

## The Hard Truth

Yes, this approach has tradeoffs. Yes, you'll have some duplication. Yes, it goes against what you learned in CS class. But here's what the DRY advocates won't tell you: **most production codebases are haunted by abstractions someone created in 2015 that nobody can remove**.

Remember: **Every abstraction is a bet on the future, and the house always wins**. Today's "perfect" reusable component is tomorrow's **legacy bottleneck that three teams are afraid to touch**. Write code that **admits it might be wrong**. Build systems that can evolve by **subtraction, not just addition**. Because in the end, the code that survives isn't the code that does everything. **It's the code that can gracefully disappear when its time is up**. ]]></content:encoded>
            <author>raghunandhanvr@outlook.com (Raghunandhan VR)</author>
        </item>
        <item>
            <title><![CDATA[decentralization]]></title>
            <link>https://raghu.app/writings/decentralization</link>
            <guid isPermaLink="false">https://raghu.app/writings/decentralization</guid>
            <pubDate>Mon, 06 Apr 2026 05:28:42 GMT</pubDate>
            <content:encoded><![CDATA[import MermaidDiagram from '@/app/components/md/mermaid'
import { BlogViewCounter } from '@/app/components/ui/blog-view-counter'
import { TableOfContents } from '@/app/components/ui/table-of-contents'

export const metadata = {
  title: 'Rethinking the Foundations: Towards a Truly Decentralized Internet',
  description: 'This analysis explores the need for a revamped internet protocol stack to ensure reliability, security, and real decentralization in light of recent major service outages.',
  alternates: {
    canonical: '/writings/decentralization',
  },
  openGraph: {
    images: [
      {
        url: `/api/og?title=Rethinking+the+Foundations:+Towards+a+Truly+Decentralized+Internet`,
        width: 1200,
        height: 630,
      },
    ],
  },
};

# Rethinking the Foundations: Towards a Truly Decentralized Internet
<BlogViewCounter slug="/writings/decentralization" createdAt={new Date('2024-10-07')} />
<TableOfContents />

Despite the promises of **blockchain** and **decentralization**, most digital interactions today still rely on **centralized services**, a paradox starkly highlighted by recent high-profile outages. For example, a **Cloudflare outage** in June 2023 significantly disrupted internet access, emphasizing the vulnerability of centralized internet infrastructure. Furthermore, when **AWS** faced an outage in December, it impacted millions, underscoring the precariousness of relying on a few centralized nodes for critical digital services. As someone passionate about blockchain, I find it frustrating to see the technology's potential for decentralization being undermined by the centralized infrastructure it often runs on.

## The Infrastructure Reality

The internet's current architecture resembles a **medieval kingdom** more than a democratic network. Every packet of data flows through centralized checkpoints, each controlled by a handful of corporate giants. This isn't just about market dominance—it's about fundamental architectural vulnerability:

<MermaidDiagram diagram={`
graph TD
    A[User Request] --> B[DNS Lookup]
    B --> C[Root Servers]
    C --> D[Cloud Provider]
    D --> E[CDN]
    E --> F[Application]
    
    subgraph "Failure Points"
        C
        D
        E
    end
    
    subgraph "Reliability Issues"
        G[Single Region Failure]
        H[Provider Outage]
        I[Route Hijacking]
    end
    
    style C fill:#ff6666
    style D fill:#ff6666
    style E fill:#ff6666
`}/>

The numbers reveal the true extent of this centralization:

1. **Infrastructure Control**
- Three cloud providers control **65%** of all internet infrastructure
- Five companies own **80%** of submarine cables
- Two CDN providers handle **85%** of content delivery

2. **Reliability Costs**
- Average downtime cost: **$5,600** per minute
- **98%** of organizations lose **$100,000+** per hour of downtime
- Multi-region redundancy increases costs by **2-3x**

3. **Service Dependencies**
- **89%** of IPFS gateways operate through **3 organizations**
- **63%** of Ethereum API calls route through **Infura** 
- **92%** of NPM packages depend on centralized registries 

## Beyond Web3's False Promise

I'm a big fan of **Blockchain**. It frustrates me that we are not using blockchain technology the way it was intended. **Web3** aims to decentralize the internet using blockchain technology. However, several fundamental issues prevent it from fulfilling this promise.

<MermaidDiagram diagram={`
graph TD
    subgraph "Web3 Reality"
        A[dApp Frontend] --> B[RPC Provider]
        B --> C[Blockchain]
        C --> D[Cloud Infrastructure]
        
        style B fill:#ff6666
        style D fill:#ff6666
    end
    
    subgraph "Hidden Dependencies"
        E[Domain Control]
        F[Gateway Access]
        G[Storage Systems]
    end
`}/>

### Centralization in Infrastructure

- **Dependence on Centralized Gateways**: Many decentralized applications rely on services like **Infura** or **Alchemy** to interact with the blockchain. These centralized gateways become single points of failure and control, undermining the decentralized ethos.

- **Cloud-Hosted Nodes**: A significant number of blockchain nodes run on centralized cloud providers like **AWS** and **Google Cloud**. This centralizes control and makes the network vulnerable to outages or censorship by these companies.

- **Centralized Storage Solutions**: While the blockchain stores transaction data, larger files like images or videos are often kept off-chain on centralized servers or semi-centralized networks that depend on central gateways.

### Economic and Scalability Challenges

- **High Transaction Costs**: Networks like **Ethereum** experience high gas fees during congestion, making small or frequent transactions impractical for everyday use.

- **Scalability Limitations**: Current blockchain architectures struggle to handle a large number of transactions per second, leading to slow processing times and hindering widespread adoption.

### Governance and Control Issues

- **Concentration of Power**: Despite aiming for decentralization, a small number of entities often hold significant influence over network decisions due to large token holdings or control over mining and validation processes.

### Usability Barriers

- **Complex User Experience**: Managing wallets and private keys is technical and user-unfriendly, creating barriers for mainstream adoption and pushing users toward centralized solutions that compromise decentralization.

The fundamental issue is that true decentralization cannot be achieved if critical components like infrastructure and access points remain centralized. Building decentralized applications on top of centralized services contradicts the very principles that **Web3** advocates.

## The Protocol-Level Solution

Instead of building another layer on top of broken infrastructure, we're rebuilding the foundation itself. This isn't another blockchain project or **Web3** initiative—it's a fundamental reimagining of how computers talk to each other.

### The New Protocol Stack

Our protocol reimagines internet infrastructure from the ground up, ensuring both decentralization and reliability:

<MermaidDiagram diagram={`
graph TD
    subgraph "Protocol Layers"
        A[Universal Protocol] --> B[Resource Layer]
        B --> C[Trust Layer]
        C --> D[Network Layer]
    end
    
    subgraph "Reliability Systems"
        E[Redundancy Control]
        F[Automated Failover]
        G[Load Distribution]
    end
    
    subgraph "Security Measures"
        H[Cryptographic Validation]
        I[Byzantine Consensus]
        J[Reputation Systems]
    end
`}/>

### 1. Decentralized Addressing System

Unlike traditional **DNS** controlled by **ICANN**, our system provides cryptographically secure, censorship-resistant addressing with built-in redundancy:

<MermaidDiagram diagram={`
graph LR
    A[Domain: mysite.key] --> B[Public Key Hash]
    B --> C[Distributed Ledger]
    C --> D[Frontend Location]
    C --> E[Backend Services]
    C --> F[Database Shards]

    subgraph "Reliability"
        G[Multiple Copies]
        H[Geographic Distribution]
        I[Automatic Replication]
    end
`}/>

This system replaces traditional **DNS**, which is controlled by centralized entities like **ICANN**, with a decentralized, cryptographically secure method of addressing internet resources. By using a distributed ledger to store the association between domain names and their corresponding resources, it ensures that addressing is resistant to censorship and manipulation. Each domain name, such as "mysite.key," is linked to a public key hash, which then points to various network resources like frontend locations, backend services, and database shards.

Every resource has multiple service providers, ensuring **24/7 availability**:
- Content is automatically replicated across geographic regions
- Service providers compete on reliability and performance
- Automatic failover ensures continuous operation
- Byzantine fault tolerance handles malicious actors

Reliability is enhanced by storing multiple copies of each resource in different geographic locations, which not only protects against regional failures but also improves load times by serving users from nearby locations. Automatic replication further ensures that changes to any resource are quickly propagated across the network, maintaining consistency and uptime.
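
As a toy illustration only (not the actual protocol), binding a name to a hash of its owner's public key lets anyone verify the mapping without asking a central registry:

```shell
#!/bin/sh
# Toy only: derive an address as the first 16 hex chars of sha256(public key).
# The key format and the truncation are invented for illustration.
pubkey="example-ed25519-public-key-bytes"
addr=$(printf '%s' "$pubkey" | sha256sum | awk '{print substr($1, 1, 16)}')
echo "mysite.key -> $addr"
```

The real scheme would record such bindings on the distributed ledger; the point is that anyone holding the public key can recompute the address and verify the binding independently.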

### 2. Universal Resource Protocol

The protocol provides a unified interface for all digital services while ensuring consistent performance:

<MermaidDiagram diagram={`
graph TD
    subgraph "Resource Types"
        A[Static Content] --> D[Universal Protocol]
        B[Compute Tasks] --> D
        C[Storage Operations] --> D
    end
    
    subgraph "Quality Assurance"
        E[Performance Monitoring]
        F[Automatic Scaling]
        G[Load Balancing]
    end
    
    D --> E
    D --> F
    D --> G
`}/>

This protocol serves as a unified interface for accessing various types of digital services, including static content, compute tasks, and storage operations. By standardizing how resources are accessed, it simplifies the development of decentralized applications and improves interoperability between different services.

Built-in reliability features:
- Dynamic resource allocation based on demand
- Automatic load balancing across providers
- Real-time performance monitoring
- Quality-based provider selection

Reliability and performance are key focuses. The protocol incorporates dynamic resource allocation to adjust resource distribution based on real-time demand, ensuring efficient use of network resources. Automatic load balancing distributes requests across multiple providers to avoid overloading individual nodes and to reduce latency. Performance monitoring continuously assesses the quality of service provided, enabling the system to make adjustments as needed.

### 3. Native Trust System

Every resource interaction generates its own trust and verification mechanisms:

<MermaidDiagram diagram={`
graph TD
    A[Resource Request] --> B[Market Discovery] 
    B --> C[Automatic Escrow]
    C --> D[Execution]
    D --> E[Settlement]
    
    subgraph "Trust Verification"
        F[Resource Proof]
        G[Performance Monitor] 
        H[Payment Release]
    end
    
    subgraph "Security"
        I[Fraud Prevention]
        J[Dispute Resolution]
        K[Quality Assurance]
    end

    D --> F
    D --> G
    E --> H
    D --> I
    E --> J
    D --> K
`}/>

This component of the protocol is designed to facilitate safe and reliable transactions between unknown parties within the network. It begins with a market discovery process to identify potential service providers. Once a suitable provider is found, an automatic escrow mechanism is used to secure payments until the service is satisfactorily delivered.

When accessing any service:
1. Multiple providers are automatically discovered
2. Service quality is continuously monitored
3. Poor performers are automatically replaced
4. Payments are released only for quality service
5. Disputes are resolved through protocol rules

Trust and security are managed through continuous monitoring of service quality and performance, with mechanisms in place to handle disputes and prevent fraud. If a provider fails to meet agreed-upon service levels, the protocol can automatically switch to a different provider. This ensures that only providers delivering acceptable levels of service are compensated, thereby incentivizing high-quality service across the network.

## Real-World Implementation

Let's examine how existing services migrate and operate on the new protocol stack, with built-in reliability and security measures.

### 1. Video Streaming Platform

Consider how a **YouTube-like service** operates with guaranteed uptime and performance:

<MermaidDiagram diagram={`
graph TD
    subgraph "Content Flow"
        A[Video Upload] --> B[Content Chunking]
        B --> C[DHT Distribution] 
        C --> D[Edge Caching]
    end

    subgraph "Reliability Layer"
        E[Geographic Replication]
        F[Provider Redundancy]
        G[Quality Monitoring]
    end

    subgraph "Economic Incentives"
        H[Performance Rewards]
        I[Uptime Bonuses] 
        J[Quality Multipliers]
    end

    B --> E
    B --> F
    D --> G
    D --> H
    D --> I
    D --> J
`}/>

The protocol ensures **24/7 availability** through:
- Automatic content replication across regions
- Dynamic provider selection based on performance
- Economic incentives for reliable service
- Instant failover to backup providers

A **YouTube-like service** on this new protocol would benefit from a distributed hosting model where videos are chunked into smaller segments and distributed across a decentralized hash table (**DHT**), ensuring efficient retrieval. Edge caching techniques would be employed to deliver content quickly to users worldwide, regardless of origin server locations. The reliability layer involves geographic replication and provider redundancy to handle potential outages seamlessly. The economic incentives layer would reward providers based on performance metrics like uptime and stream quality, encouraging a competitive and high-quality service environment.
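To make the chunking step concrete, here is a toy sketch of splitting content into chunks and deriving a **DHT** key for each one. A real system would use a cryptographic hash (such as SHA-256) and a proper DHT like Kademlia; the tiny chunk size and FNV-1a hash below are stand-ins for illustration only:

```typescript
// Hypothetical sketch: chunk a video and derive a content-addressed DHT key
// per chunk, so any node holding the chunk can serve it.
const CHUNK_SIZE = 4; // bytes; tiny for illustration (real chunks are MBs)

function toyHash(data: Uint8Array): number {
  // FNV-1a, 32-bit: a toy stand-in for a cryptographic content hash.
  let h = 0x811c9dc5;
  for (const byte of data) {
    h = Math.imul(h ^ byte, 0x01000193) >>> 0;
  }
  return h >>> 0;
}

function chunkAndKey(video: Uint8Array): { key: number; data: Uint8Array }[] {
  const chunks: { key: number; data: Uint8Array }[] = [];
  for (let i = 0; i < video.length; i += CHUNK_SIZE) {
    const data = video.slice(i, i + CHUNK_SIZE);
    // The key is derived from the content itself, so identical chunks
    // deduplicate naturally across the network.
    chunks.push({ key: toyHash(data), data });
  }
  return chunks;
}
```

Because keys are content-derived, edge caches can verify a chunk by rehashing it, which is what lets untrusted nodes serve content safely.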

### 2. Financial Trading Systems

Secure, high-frequency trading with guaranteed execution:

<MermaidDiagram diagram={`
graph TD
    A[Trade Order] --> B[Market Discovery]
    B --> C[Multi-Provider Execution]
    C --> D[Settlement]

    subgraph "Security Layer"
        E[Order Validation]
        F[Fraud Prevention]
        G[Dispute Resolution]
    end

    subgraph "Performance"
        H[Latency Monitoring]
        I[Provider Ranking]
        J[Automatic Failover]
    end

    B --> E
    C --> F
    D --> G
    C --> H
    C --> I
    C --> J
`}/>

For financial trading platforms, the protocol ensures secure and fast transactions necessary for high-frequency trading. Trade orders are routed through a market discovery process that selects the best execution paths among multiple providers, minimizing latency and maximizing reliability. Security measures like order validation and fraud prevention are built into the protocol to protect against malicious activities and ensure the integrity of trades. Performance metrics such as latency monitoring and provider ranking help maintain high service standards, crucial for the demands of financial markets.
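The provider-ranking and failover behavior described above might be sketched like this (the types and SLA threshold are assumptions for illustration):

```typescript
// Hypothetical sketch: rank execution providers by observed latency and fail
// over to the next-best provider when the preferred one exceeds the SLA.
interface Provider {
  id: string;
  latencyMs: number; // rolling measurement from the performance layer
}

function rankProviders(providers: Provider[]): Provider[] {
  // Lowest latency first; ties broken by id for deterministic routing.
  return [...providers].sort(
    (a, b) => a.latencyMs - b.latencyMs || a.id.localeCompare(b.id)
  );
}

function selectWithFailover(
  providers: Provider[],
  maxLatencyMs: number
): Provider | null {
  for (const p of rankProviders(providers)) {
    if (p.latencyMs <= maxLatencyMs) return p; // first acceptable provider wins
  }
  return null; // no provider meets the SLA: reject rather than silently degrade
}
```

Returning `null` instead of the least-bad provider matters for trading: an order that cannot meet its latency bound should be rejected, not executed late.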

### 3. AI/ML Infrastructure

Distributed **AI computing** with guaranteed resources:

<MermaidDiagram diagram={`
graph TD
    A[AI Model] --> B[Resource Discovery]
    B --> C[Distributed Training]

    subgraph "Resource Markets"
        D[GPU Allocation]
        E[Memory Markets]
        F[Network QoS]
    end

    subgraph "Reliability"
        G[Hardware Redundancy]
        H[Checkpoint Systems]
        I[Result Validation]
    end

    B --> D
    B --> E
    B --> F
    C --> G
    C --> H
    C --> I
`}/>

Distributed **AI** and machine learning workloads would utilize the protocol to discover and allocate computational resources like **GPU** and memory across the network. This ensures that **AI models** can be trained efficiently on distributed datasets without central bottlenecks. Resource markets for computing power and memory ensure that resources are allocated based on demand and performance, with reliability systems like hardware redundancy and checkpoint systems providing fault tolerance and continuous operation.
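A minimal sketch of the resource-discovery step, matching a training job against market offers (the offer/requirement shapes are invented for this example):

```typescript
// Hypothetical sketch: match a training job's requirements against offers in
// the GPU/memory markets, taking the cheapest offer that is sufficient.
interface ResourceOffer {
  provider: string;
  gpus: number;
  memoryGb: number;
  pricePerHour: number;
}

interface JobRequirements {
  gpus: number;
  memoryGb: number;
}

function discoverResources(
  offers: ResourceOffer[],
  job: JobRequirements
): ResourceOffer | null {
  // Only offers that satisfy every requirement are candidates.
  const sufficient = offers.filter(
    (o) => o.gpus >= job.gpus && o.memoryGb >= job.memoryGb
  );
  if (sufficient.length === 0) return null;
  // Among sufficient offers, the market selects on price.
  return sufficient.reduce((best, o) =>
    o.pricePerHour < best.pricePerHour ? o : best
  );
}
```

In a real market the selection would also weigh reliability scores and checkpoint support, but the filter-then-price shape of the decision would be the same.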

## The Economic Ecosystem

A self-sustaining economy that rewards reliability:

<MermaidDiagram diagram={`
graph TD
    subgraph "Market Forces"
        A[Resource Demand] --> D[Price Discovery]
        B[Quality Metrics] --> D
        C[Reliability Score] --> D
    end

    subgraph "Provider Incentives" 
        E[Uptime Rewards]
        F[Performance Bonuses]
        G[Reputation Points]
    end

    D --> E
    D --> F
    D --> G
`}/>

The protocol prevents resource monopolization through:
- Dynamic pricing based on supply and demand
- Multiple provider requirements
- Anti-cartel mechanisms
- Small provider protections

The protocol supports a dynamic economic model where resource demand, quality metrics, and reliability scores directly influence pricing and provider incentives. This setup prevents monopolization, encourages competition, and ensures that smaller providers can compete fairly. Service providers earn based on performance, promoting a high-quality, reliable service delivery across the network.
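One way the three market inputs could combine into a discovered price, as a sketch; the multipliers below are illustrative weights, not protocol constants:

```typescript
// Hypothetical sketch: price discovery from demand, quality, and reliability.
interface MarketSignals {
  demand: number;      // current demand relative to supply, > 0
  quality: number;     // aggregated quality metric, 0..1
  reliability: number; // historical reliability score, 0..1
}

function discoverPrice(basePrice: number, s: MarketSignals): number {
  // Scarcity raises the price; a floor prevents a race to zero that would
  // drive small providers out of the market.
  const scarcity = Math.max(s.demand, 0.1);
  // Proven quality and reliability let a provider command a premium, which
  // is what funds the uptime rewards and performance bonuses.
  const premium = 1 + 0.5 * s.quality + 0.5 * s.reliability;
  return basePrice * scarcity * premium;
}
```

Note that the premium rewards track record rather than size, which is one concrete way the pricing rule itself can act as a small-provider protection.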

## Migration Strategy

Seamless transition without service disruption:

<MermaidDiagram diagram={`
graph LR
    A[Current System] --> B[Hybrid Phase]
    B --> C[Protocol Native]
    
    subgraph "Phase 1"
        D[Content Migration]
        E[Performance Testing]
        F[Provider Selection]
    end
    
    subgraph "Phase 2"
        G[Full Integration]
        H[Legacy Support]
        I[Optimization]
    end
`}/>

The transition to this new protocol would be phased to minimize disruptions. Initially, existing services would operate in a hybrid mode, maintaining compatibility with legacy systems while gradually integrating with the new protocol. Over time, services would fully migrate to become protocol-native, optimizing their operations to leverage the full benefits of the decentralized infrastructure.
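The hybrid phase boils down to a routing rule: serve from the protocol-native path when the resource has migrated, and fall back to the legacy system otherwise. A minimal sketch, with both fetchers assumed rather than drawn from any real API:

```typescript
// Hypothetical sketch: request routing during the hybrid phase.
// A native fetcher returns null for resources that have not migrated yet.
type NativeFetcher = (resource: string) => string | null;
type LegacyFetcher = (resource: string) => string;

function hybridFetch(
  resource: string,
  native: NativeFetcher,
  legacy: LegacyFetcher
): string {
  // Prefer the protocol-native path; fall back so users never notice
  // which phase of the migration a given resource is in.
  return native(resource) ?? legacy(resource);
}
```

Because the fallback is per-resource rather than per-service, content can migrate incrementally and Phase 2's legacy support shrinks to whatever has not moved yet.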

## Looking Forward

This isn't just another layer on existing infrastructure—it's a fundamental reimagining of how computers communicate. We're building an internet that's:

- Truly decentralized at the protocol level
- Economically self-sustaining
- Automatically trustworthy
- Universally accessible
- Inherently reliable

The internet began as a decentralized protocol for resilient communication. Through this new protocol stack, we're finally fulfilling that original vision—not through another application layer, but through fundamental protocol innovation that guarantees reliability, security, and true decentralization.]]></content:encoded>
            <author>raghunandhanvr@outlook.com (Raghunandhan VR)</author>
        </item>
    </channel>
</rss>