How a Browser Works: A Beginner-Friendly Guide to Browser Internals

1. What a browser actually is (beyond “it opens websites”)
A browser isn't just an app that opens websites; it is software that requests data from the internet, interprets it, and draws it on the screen. It works like a translator and painter between the user and web servers.
Core Job
The browser first resolves the hostname in the URL to an IP address via DNS, then sends an HTTP/HTTPS request to the server. The server returns data such as HTML, CSS, JS, and images, which the browser parses to compute a layout and then draw pixels on the screen.
Main Components
User Interface: Things like address bar, tabs, bookmarks, back/forward buttons.
Rendering Engine: Understands HTML/CSS to build the page's structure and style (like Blink in Chrome).
JavaScript Engine: Executes JS code for dynamic behavior such as click handlers and animations (V8 in Chrome).
2. Main parts of a browser (high-level overview)
At a high level, a browser is made of a few key components that together load, render, and manage websites.
User Interface
This is the visible part where you see the address bar, back/forward buttons, bookmarks, tabs, and menu. Basically, the user interacts with the browser through this.
Browser Engine
This is a bridge between the user interface and the rendering engine. It passes commands between them and coordinates their work.
Rendering Engine
Its job is to parse HTML and CSS to create the page layout and display it on the screen, like Blink (Chrome) or Gecko (Firefox).
JavaScript Engine
Executes JS code to handle dynamic behavior such as button clicks and animations, for example, the V8 engine in Chrome.
Networking Layer
Sends HTTP/HTTPS requests to the server, fetches data, and handles all network-related tasks.
UI Backend and Data Storage
The UI backend draws basic widgets like form controls and buttons using the operating system's UI facilities; data storage saves cookies, history, and cache on the local device.

3. User Interface: address bar, tabs, buttons
The User Interface is the browser's topmost layer, the part we interact with directly: the address bar, tabs, and buttons. Together these make browsing easy and visual.
Address Bar
This is a text box where you type a URL (like www.google.com) or a search query. Press Enter and the browser fetches the page from the server; it also offers auto-suggestions as you type.
Tabs
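Tabs let you open multiple websites in one window, like in Chrome. Each tab loads an independent page, and switching between them is easy. Every open tab still uses memory, though, and browsers such as Chrome typically run tabs in separate processes so that one crashing page doesn't take down the others.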
Buttons
Back/Forward: To go to the previous or next page, for history navigation.
Refresh/Stop: To reload the page or stop loading.
Home: Takes you to the default homepage.
These buttons are for quick actions, giving the user fast control.
4. Browser Engine
This works as a bridge between the UI (like the address bar and tabs) and the Rendering Engine. When you type a URL or press a button, it passes the command to the rendering engine and handles coordination, such as triggering network requests or JS execution.
Rendering Engine
This directly parses HTML and CSS, creates the DOM, calculates layout, and paints pixels on the screen (like Blink in Chrome). Its sole focus is turning content into visuals; it does not talk to the UI directly.
5. Networking: how a browser fetches HTML, CSS, JS
Step-by-Step Process
The browser first resolves the hostname in the URL to an IP address via DNS, then establishes a TCP connection with the server (plus a TLS handshake for HTTPS). It sends a GET request for the HTML (like index.html), and the server responds with a 200 OK status and the HTML in the response body.
While parsing the HTML, the browser finds references to CSS (<link>) and JS (<script>), so it sends separate requests to download those files. These downloads run in parallel for speed, and the browser checks its cache first to see whether a copy is already available locally.
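The request step above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not how a real browser is implemented: the helper names build_get_request and resolve are made up for this example, and the request carries only the bare minimum of headers.

```python
import socket
from urllib.parse import urlsplit


def build_get_request(url):
    """Return (host, port, raw_request_bytes) for a minimal HTTP/1.1 GET."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    # Default ports: 443 for https, 80 for http.
    port = parts.port or (443 if parts.scheme == "https" else 80)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    request = (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n"
        "\r\n"
    )
    return host, port, request.encode("ascii")


def resolve(host, port):
    """DNS step: ask the OS resolver for an IP address (needs network access)."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    return infos[0][4][0]  # sockaddr tuple starts with the IP address


host, port, raw = build_get_request("http://example.com/index.html")
print(host, port)
print(raw.decode("ascii"))
```

The sketch only builds the request text; calling resolve(host, port) and then opening a socket to the returned address would perform the actual DNS lookup and TCP connection described above.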
6. HTML parsing and DOM creation
When the browser gets HTML from the server, it doesn't show it directly on the screen. First, it parses the HTML and then builds a structure called the DOM (Document Object Model).
Step by step
HTML comes from the server as a simple text file:
html
<html>
  <body>
    <h1>Hello</h1>
    <p>World</p>
  </body>
</html>
Tokenization
The browser breaks this text into small parts called tokens:
<html> → start tag
<body> → start tag
<h1> → start tag
Hello → text
</h1> → end tag
… and so on.
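To make tokenization concrete, here is a toy tokenizer in Python. It is a rough sketch that assumes tags without attributes, which is far simpler than what a real browser's tokenizer handles; the tokenize function and its token labels are invented for this example.

```python
import re

# One alternative per token type: end tag, start tag, or a run of text.
TOKEN_RE = re.compile(r"</(\w+)>|<(\w+)>|([^<]+)")


def tokenize(html):
    """Split a tiny HTML string into (kind, value) tokens."""
    tokens = []
    for end, start, text in TOKEN_RE.findall(html):
        if end:
            tokens.append(("end tag", end))
        elif start:
            tokens.append(("start tag", start))
        elif text.strip():  # ignore whitespace-only text between tags
            tokens.append(("text", text.strip()))
    return tokens


for token in tokenize("<html><body><h1>Hello</h1><p>World</p></body></html>"):
    print(token)
```

Each printed pair corresponds to one entry in the token list above, for example ("start tag", "h1") followed by ("text", "Hello").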
Parsing
Now the browser understands from these tokens:
which element is inside which
which is the parent
which is the child
Building the DOM Tree
From this information, the browser builds a tree structure called the DOM.
Example DOM tree:
text
Document
└── html
    └── body
        ├── h1
        │   └── "Hello"
        └── p
            └── "World"
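The parent/child bookkeeping can be sketched with Python's standard html.parser module. This is a simplified illustration, not a real DOM: the Node and TreeBuilder classes are hypothetical names for this example, and the sketch assumes every start tag has a matching end tag.

```python
from html.parser import HTMLParser


class Node:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []


class TreeBuilder(HTMLParser):
    """Builds a tree by tracking which element is currently open."""

    def __init__(self):
        super().__init__()
        self.root = Node("Document")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        # A start tag opens a new child of the current element.
        node = Node(tag, self.current)
        self.current.children.append(node)
        self.current = node

    def handle_endtag(self, tag):
        # An end tag closes the current element: step back to its parent.
        if self.current.parent is not None:
            self.current = self.current.parent

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.current.children.append(Node(f'"{text}"', self.current))


def dump(node, depth=0):
    """Return the tree as indented lines, one per node."""
    lines = ["  " * depth + node.name]
    for child in node.children:
        lines.extend(dump(child, depth + 1))
    return lines


builder = TreeBuilder()
builder.feed("<html><body><h1>Hello</h1><p>World</p></body></html>")
builder.close()
print("\n".join(dump(builder.root)))
```

The printed indentation mirrors the example DOM tree: html sits under Document, body under html, and the h1 and p elements each hold one text node.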
7. CSS parsing and CSSOM creation
Just as HTML creates the DOM, CSS creates the CSSOM (CSS Object Model).
The browser doesn't use CSS directly; first it parses it and builds a structure.
Step by step
CSS comes from the server as simple text:
css
body {
  background: white;
}
h1 {
  color: red;
  font-size: 24px;
}
Tokenization
The browser breaks the CSS text into small parts:
Selectors → body, h1
Properties → background, color, font-size
Values → white, red, 24px
Parsing
Now the browser understands:
Which rule applies to which element
Which property goes with which value
Building the CSSOM
Combining all these rules, the browser builds a structure called CSSOM.
This is also tree-like, just like the DOM, but it's only for CSS rules.
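As a rough illustration of the steps above, the following Python sketch turns CSS text into a nested dictionary of selectors, properties, and values. It assumes flat rules with no comments, at-rules, or nesting, and the parse_css name is made up for this example; a real CSSOM is a much richer structure.

```python
import re

# One rule looks like: selector { declarations }
RULE_RE = re.compile(r"([^{}]+)\{([^{}]*)\}")


def parse_css(css):
    """Return {selector: {property: value}} for very simple CSS."""
    rules = {}
    for selector, body in RULE_RE.findall(css):
        declarations = {}
        for declaration in body.split(";"):
            if ":" in declaration:
                prop, value = declaration.split(":", 1)
                declarations[prop.strip()] = value.strip()
        # Later rules for the same selector override earlier properties.
        rules.setdefault(selector.strip(), {}).update(declarations)
    return rules


css = """
body { background: white; }
h1 { color: red; font-size: 24px; }
"""
print(parse_css(css))
```

The result maps body to its background and h1 to its color and font-size, which answers the two questions from the parsing step: which rule applies to which element, and which property goes with which value.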
8. How DOM and CSSOM come together
DOM and CSSOM together create the Render Tree in the browser, which becomes the base for visually rendering the page. This is the main step in the critical rendering path where content and style combine.
How the Render Tree Is Built
The DOM tree contains every element, visible or not. By matching it against the CSSOM, the browser keeps only the nodes that will be rendered and skips, for example, elements with display: none. Example: if the DOM has <div><p>Hello</p></div> and the CSS is div { color: blue; } p { font-size: 20px; }, the Render Tree will contain styled <div> and <p> nodes.
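The matching step can be sketched in Python using plain dictionaries for both trees. This is a deliberately small model: it matches tag-name selectors only and ignores style inheritance, and the extra span element with display: none is a hypothetical addition to show a skipped node.

```python
def build_render_tree(node, cssom):
    """Attach styles to a DOM node and drop anything with display: none."""
    style = dict(cssom.get(node["tag"], {}))
    if style.get("display") == "none":
        return None  # the node and its whole subtree are left out
    children = []
    for child in node.get("children", []):
        rendered = build_render_tree(child, cssom)
        if rendered is not None:
            children.append(rendered)
    return {"tag": node["tag"], "style": style, "children": children}


dom = {"tag": "div", "children": [
    {"tag": "p", "children": []},
    {"tag": "span", "children": []},  # hypothetical hidden element
]}
cssom = {
    "div": {"color": "blue"},
    "p": {"font-size": "20px"},
    "span": {"display": "none"},
}

tree = build_render_tree(dom, cssom)
print(tree)
```

The resulting tree keeps the styled div and p nodes and has no entry for the span, because its rule sets display: none.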
9. Layout (reflow), painting, and display
What Is Layout/Reflow
After the Render Tree is ready, the browser performs layout (the term used in Chrome) or reflow (the term used in Firefox): it calculates the exact position (x, y coordinates) and size (width, height) of every visible node. This is based on the viewport and on layout rules such as flexbox or grid. Example: a <div style="width: 100px; height: 50px;"> gets a 100 by 50 px box positioned inside its parent.
If the DOM or CSS changes (like adding a class via JS), a partial reflow is triggered: only the affected elements are recalculated. Reflow is still costly, so batch such changes together.
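A very reduced version of layout can be sketched in Python. The sketch assumes plain block layout only: each box fills its parent's width unless it has a fixed width, and children stack vertically. The Box class and layout function are hypothetical names for this example; flexbox, grid, margins, and padding are all left out.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Box:
    name: str
    width: Optional[int] = None  # fixed width in px; None means fill the parent
    height: int = 0              # fixed height in px; 0 means fit the children
    children: List["Box"] = field(default_factory=list)
    x: int = 0
    y: int = 0


def layout(box, x, y, available_width):
    """Assign position and size to a box and its descendants; return its height."""
    box.x, box.y = x, y
    if box.width is None:
        box.width = available_width
    cursor = y
    for child in box.children:
        # Each child starts where the previous one ended.
        cursor += layout(child, x, cursor, box.width)
    box.height = max(box.height, cursor - y)
    return box.height


div = Box("div", width=100, height=50)
p = Box("p", height=20)
body = Box("body", children=[div, p])

layout(body, 0, 0, 800)  # 800 px wide viewport (an assumed value)
for b in (body, div, p):
    print(b.name, b.x, b.y, b.width, b.height)
```

Here the div keeps its fixed 100 by 50 px size, the p fills the 800 px width and starts at y = 50 directly below the div, and the body grows to a height of 70 to contain both.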
Painting Process
After layout comes painting. The browser traverses the Render Tree and records draw commands (paint records) such as "fill rect blue" or "draw text black", covering colors, backgrounds, borders, text, and shadows. The page may be split into layers, and the recorded commands are then rasterized into pixels. Invisible parts are skipped.
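The idea of recording draw commands can be sketched in Python as follows. This is only an illustration: the dictionary fields, the visible flag, and the command tuples are assumptions made for this example and do not reflect any browser's real paint record format.

```python
def paint(box, commands=None):
    """Walk laid-out boxes depth-first and record draw commands in paint order."""
    if commands is None:
        commands = []
    if not box.get("visible", True):
        return commands  # invisible parts are skipped, including their children
    if box.get("background"):
        commands.append(("fill rect", box["background"],
                         box["x"], box["y"], box["width"], box["height"]))
    if box.get("text"):
        commands.append(("draw text", box.get("color", "black"),
                         box["text"], box["x"], box["y"]))
    for child in box.get("children", []):
        paint(child, commands)
    return commands


page = {
    "x": 0, "y": 0, "width": 800, "height": 70, "background": "blue",
    "children": [
        {"x": 0, "y": 0, "width": 800, "height": 20, "text": "Hello"},
        {"x": 0, "y": 20, "width": 800, "height": 20, "text": "Hidden",
         "visible": False},
    ],
}

for command in paint(page):
    print(command)
```

The parent's background is recorded before the child's text, so the text ends up drawn on top, and the box marked as not visible produces no commands at all.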
Display
Finally comes display: the painted layers are composited into the final image and shown on screen. This step is fast on the GPU, especially for transforms and animations. Browsers typically target 60 fps for smoothness, which leaves about 16.7 ms per frame.
10. Very basic idea of parsing (using a simple math example)
Math Expression Example
Suppose you need the result of "2 + 3 * 4". First comes tokenization: breaking the string into meaningful parts, here the numbers (2, 3, 4), the operators (+, *), and an end-of-input marker. This is like separating <div> or "hello" in HTML.
Then apply the syntax rules: multiplication (*) first, addition (+) later. A tree forms with + at the root; its children are 2 and the subtree 3 * 4 = 12, so the final result is 14. This is called a parse tree, and the order of operations is encoded in its shape.
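The same example can be worked through in a small Python sketch: tokenize the string, build a tree in which multiplication binds tighter than addition, then evaluate the tree. It handles only non-negative integers with + and *, and all function names are invented for this illustration.

```python
import re


def tokenize(expr):
    """Split the input into number and operator tokens."""
    return re.findall(r"\d+|[+*]", expr)


def parse_term(tokens, pos):
    # term := number ("*" number)*
    node = int(tokens[pos])
    pos += 1
    while pos < len(tokens) and tokens[pos] == "*":
        node = ("*", node, int(tokens[pos + 1]))
        pos += 2
    return node, pos


def parse_expr(tokens, pos=0):
    # expr := term ("+" term)*
    node, pos = parse_term(tokens, pos)
    while pos < len(tokens) and tokens[pos] == "+":
        right, pos = parse_term(tokens, pos + 1)
        node = ("+", node, right)
    return node, pos


def evaluate(node):
    """Evaluate a tree of (operator, left, right) tuples and integer leaves."""
    if isinstance(node, int):
        return node
    op, left, right = node
    if op == "+":
        return evaluate(left) + evaluate(right)
    return evaluate(left) * evaluate(right)


tokens = tokenize("2 + 3 * 4")
tree, _ = parse_expr(tokens)
print(tokens)          # ['2', '+', '3', '*', '4']
print(tree)            # ('+', 2, ('*', 3, 4))
print(evaluate(tree))  # 14
```

Because parse_expr asks parse_term for each operand, the multiplication is grouped into its own subtree before the addition is applied, which is exactly the "multiplication first" rule.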
Same Logic in HTML/CSS
In the HTML "<h1>Hello</h1>", the tokens are <h1>, "Hello", and </h1>. The rules build a tree: an h1 node with a text node inside. In the CSS "h1 { color: red; }", the tokens are "h1", "{", "color", ":", "red", ";", and "}". The tree becomes a rule node with its declarations.