stevus / web-browser

Web Browser

Considerations:

  • Detailing all activities happening inside a web browser from URL entry to paint complete.
  • Identifying individual browser components
  • Estimating the level of difficulty of building individual browser components
  • Browser completeness

Overview


The complexity of each module is represented by the number of 💻

Overall Process

[Figures: Generalized understanding of process (web); My current understanding]

Example Implementation

Elements

Browser Engine - HTML module

The objective of this module is to output a DOM (Document Object Model) tree to be consumed by the rendering engine.

💻 💻 💻

Library Browser Language Stability
Flex
Lex
Yacc
Bison
Expat Python

Development considerations:

Creating the DOM Tree

The DOM Tree maintains the hierarchy of all the HTML nodes (visual and nonvisual) on the page.

<!DOCTYPE html>
<html>
  <head>
    <meta name="viewport" content="width=device-width,initial-scale=1" />
    <link href="style.css" rel="stylesheet" />
    <title>Critical Path</title>
  </head>
  <body>
    <p>Hello <span>web performance</span> students!</p>
    <div><img src="awesome-photo.jpg" /></div>
  </body>
</html>
  1. The HTML module of the Browser Engine receives an input byte stream representing HTML content
  2. A Lexer tokenizes the input byte stream and converts it into equivalent HTML nodes
  3. A Parser arranges the HTML nodes into an Abstract Syntax Tree (DOM tree)

It is stated above that the HTML module contains a Lexer and a Parser, and not a Tokenizer and a Parser, because a Lexer performs operations that a Tokenizer does not, namely:

  • A Tokenizer breaks a stream of text into tokens, usually by looking for delimiters (whitespace, angle brackets, etc.).
  • A Lexer is basically a tokenizer, but it usually attaches extra context to the tokens -- this token is a <body> tag, that token is a <div> tag, this other token is an <img /> tag.

Taking this to completion:

  • A Parser takes the stream of tokens from the lexer and turns it into an abstract syntax tree representing the structure (usually a program) described by the original text.[2]

Some implementations combine the processes of lexing and parsing and call the whole module a Parser[3], when in reality two separate operations are occurring on the input byte stream.
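The lexer and parser steps above can be sketched in a few lines of Python. This is a minimal illustration, not how any production browser does it: the names `lex`, `parse`, `Node`, and `VOID_TAGS` are made up for this example, the regular expression only handles well-formed markup like the sample page, and real HTML parsing follows a far more forgiving algorithm.

```python
import re
from dataclasses import dataclass, field

SAMPLE_HTML = """<!DOCTYPE html>
<html>
  <head>
    <meta name="viewport" content="width=device-width,initial-scale=1" />
    <link href="style.css" rel="stylesheet" />
    <title>Critical Path</title>
  </head>
  <body>
    <p>Hello <span>web performance</span> students!</p>
    <div><img src="awesome-photo.jpg" /></div>
  </body>
</html>"""

# Elements that never have children or a closing tag (partial list, assumption).
VOID_TAGS = {"meta", "link", "img", "br", "hr", "input"}

TOKEN_RE = re.compile(
    r"<!DOCTYPE[^>]*>"                                               # doctype (ignored)
    r"|</(?P<close>[a-zA-Z][\w-]*)\s*>"                              # closing tag
    r"|<(?P<open>[a-zA-Z][\w-]*)(?P<attrs>[^>]*?)(?P<selfclose>/?)>"  # opening tag
    r"|(?P<text>[^<]+)",                                             # text between tags
    re.IGNORECASE,
)


@dataclass
class Node:
    tag: str
    attrs: dict = field(default_factory=dict)
    text: str = ""
    children: list = field(default_factory=list)


def lex(html):
    """Lexer: turn the input into (kind, value, attrs) tokens with context attached."""
    for m in TOKEN_RE.finditer(html):
        if m.group("close"):
            yield ("close", m.group("close").lower(), {})
        elif m.group("open"):
            tag = m.group("open").lower()
            attrs = dict(re.findall(r'([\w-]+)="([^"]*)"', m.group("attrs")))
            yield ("open", tag, attrs)
            # Void or self-closing elements are closed immediately.
            if m.group("selfclose") or tag in VOID_TAGS:
                yield ("close", tag, {})
        elif m.group("text") and m.group("text").strip():
            yield ("text", m.group("text").strip(), {})


def parse(tokens):
    """Parser: arrange the tokens into a tree using a stack of open elements."""
    root = Node("#document")
    stack = [root]
    for kind, value, attrs in tokens:
        if kind == "open":
            node = Node(value, attrs)
            stack[-1].children.append(node)
            stack.append(node)
        elif kind == "close":
            if len(stack) > 1 and stack[-1].tag == value:
                stack.pop()
        else:
            stack[-1].children.append(Node("#text", text=value))
    return root
```

Calling `parse(lex(SAMPLE_HTML))` returns a `#document` node whose only child is the `html` element. Keeping `lex` and `parse` as separate functions mirrors the two-operation split described above.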

The DOM tree changes when:

  • A piece of JavaScript has created, removed, or modified HTML nodes on the page.

Links

Browser Engine - CSS Module

💻 💻

The objective of this module is to output a CSSOM (Cascading Style Sheet Object Model) tree to be consumed by the rendering engine.

Library Browser Language Stability

Creating the CSSOM Tree

The CSSOM Tree maintains the knowledge of all styles of the DOM tree, so long as a rule has been specified in external, inline, or embedded CSS.

body {
  font-size: 16px;
}
p {
  font-weight: bold;
}
span {
  color: red;
}
p span {
  display: none;
}
img {
  float: right;
}

The CSSOM (CSS Object Model) tree is generated much like the DOM tree.

  1. The CSS module of the Browser Engine receives an input byte stream representing CSS content from one of multiple inlets:
    • An input byte stream originating from an external CSS stylesheet, referenced by a <link> tag, discovered by the HTML module while constructing the DOM tree
    • An embedded stylesheet referenced inside a <style> tag, discovered by the HTML module while constructing the DOM tree
    • An inline style belonging to an HTML node discovered by the HTML module while constructing the DOM tree
  2. A Lexer tokenizes the input byte stream, turning it into the corresponding CSS rules
  3. A Parser arranges each CSS selector into an Abstract Syntax Tree (CSSOM tree)

The CSSOM has a tree structure because the browser starts with the most general rule applicable to a specific node (for example, if it is a child of a body element, then all body styles apply) and then recursively refines the computed styles by applying more specific rules; that is, CSS rules "cascade down".
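The "cascade down" behaviour can be sketched as follows. This is a simplified illustration under stated assumptions: `parse_css`, `matches`, `computed_style`, and the `INHERITED` set are names invented for this example, only type selectors and descendant selectors are handled, and rules are applied in source order with no specificity calculation.

```python
import re

SAMPLE_CSS = """
body { font-size: 16px; }
p { font-weight: bold; }
span { color: red; }
p span { display: none; }
img { float: right; }
"""

# Properties that a child inherits from its parent (small subset, assumption).
INHERITED = {"font-size", "font-weight", "color"}


def parse_css(css):
    """Turn CSS text into a list of (selector_parts, declarations) rules."""
    rules = []
    for selector, body in re.findall(r"([^{}]+)\{([^}]*)\}", css):
        declarations = {}
        for declaration in body.split(";"):
            if ":" in declaration:
                prop, value = declaration.split(":", 1)
                declarations[prop.strip()] = value.strip()
        rules.append((selector.split(), declarations))
    return rules


def matches(selector, path):
    """True if a descendant selector such as ['p', 'span'] matches the element
    at the end of path, where path lists tag names from the root down."""
    if selector[-1] != path[-1]:
        return False
    ancestors = iter(path[:-1])
    # Each earlier selector part must appear, in order, among the ancestors.
    return all(part in ancestors for part in selector[:-1])


def computed_style(rules, path):
    """Walk from the root to the element, inheriting and then refining styles."""
    style = {}
    for depth in range(1, len(path) + 1):
        current = path[:depth]
        # Only inheritable properties flow down from the parent.
        style = {k: v for k, v in style.items() if k in INHERITED}
        for selector, declarations in rules:
            if matches(selector, current):
                style.update(declarations)
    return style
```

For the sample stylesheet, `computed_style(parse_css(SAMPLE_CSS), ["html", "body", "p", "span"])` picks up the body font size and the paragraph font weight through inheritance before the `span` and `p span` rules are applied.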

Every browser provides a default set of styles, also known as "user agent styles". These styles form the initial CSSOM tree, and the external, inline, or embedded styles simply override these defaults.[4]

Construction of the CSSOM tree is on the critical rendering path, but it generally completes very quickly, typically in less time than a single DNS lookup.[4]

Changes in the CSSOM tree

The CSSOM tree is refreshed when:

  • The dimensions of the browser viewport have changed
  • The visibility of an HTML node has changed

Question: How are media queries represented in the CSSOM tree?

Links

Rendering Engine

💻 💻 💻 💻 💻

This module interprets the parsed HTML and creates a plan for displaying it on the screen. Many person-years have gone into developing the different versions of this module.

🚩 There is not a lot of community exploration and research on this subject which leads me to believe it is heavily misunderstood.

Library Browser Language Stability
Trident Internet Explorer
Gecko Firefox
Webkit Safari and Chrome 0-27
KHTML KDE desktop environment. Webkit forked from KHTML some years ago
Elektra Opera 4-6
Presto Opera 7-12
Blink Chrome 28+, Opera 15+, webkit fork

Creating the Render Tree

The Render Tree maintains the knowledge of which nodes are rendered onto the page, where they are located, and how they should appear according to constraints such as screen size and user interaction.

  • Combine the DOM Tree and CSSOM Tree.
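A rough sketch of that combination step is shown below. It assumes each DOM node is a plain dictionary that already carries its computed style; the node layout, the `NON_VISUAL` set, and `build_render_tree` are hypothetical names chosen for this example only.

```python
# Tags that never produce visible output (partial list, assumption).
NON_VISUAL = {"head", "meta", "link", "title", "script", "style"}

# A DOM for the sample page with computed styles already attached.
SAMPLE_DOM = {
    "tag": "html",
    "children": [
        {"tag": "head", "children": [{"tag": "title"}]},
        {
            "tag": "body",
            "style": {"font-size": "16px"},
            "children": [
                {
                    "tag": "p",
                    "style": {"font-weight": "bold"},
                    "children": [
                        {"tag": "span", "style": {"color": "red", "display": "none"}}
                    ],
                },
                {"tag": "div", "children": [{"tag": "img", "style": {"float": "right"}}]},
            ],
        },
    ],
}


def build_render_tree(node):
    """Return a render-tree copy of node, or None if the node is not rendered."""
    style = node.get("style", {})
    if node["tag"] in NON_VISUAL or style.get("display") == "none":
        return None  # the node and its whole subtree are skipped
    children = []
    for child in node.get("children", []):
        rendered = build_render_tree(child)
        if rendered is not None:
            children.append(rendered)
    return {"tag": node["tag"], "style": style, "children": children}
```

With the sample page, the `head` subtree and the `span` hidden by `display: none` are absent from the result, while the `img` element is kept.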

Additional Information

A Regular Expression engine can pre-scan the page to find all required HTML, CSS, and JS objects and methods, along with their counts, so that the rendering engine is set up before the rendering loop begins.

Things like ids, tags with their widgets, and nested CSS rules need to be combined in each render loop. Some examples follow:

  • A set of both horizontal and vertical layouts that handle the cells below as box frames for div 1 sections
  • A canvas or frame that allows painting or drawing widgets and shapes as box frames for div 1 to div 6 sections
  • A combination of a rectangle and a label or text widget, stacked, for block-level elements: the cell's background, spacing, and colored border
  • A combination of a rectangle and a label or text widget for hyperlink elements
  • A combination of a rectangle and a button widget for button widgets, with the added background and borders provided by the extra rectangle drawn on the canvas
  • A similar combination for every other widget, such as entry, text entry, combo box, scroll box, radio buttons, check boxes, dividers for hr tags, etc.

All of the above requires massive libraries, in pairs, for either CSS float or display to handle horizontal and vertical layout, and to store all tag or id names by count and section so that JS can run its after-render methods.

The cell allows margins and borders between the rectangle object and the label or widget objects.

Margins and padding are easily added to the widgets as integers, using math and their own property names. Some names may require minor tokenizing by the CSS tokenizer earlier on: regular expressions do not cope well with names such as background-color or text-width, where the "-color" or "-width" after the delimiter is confused with color or width, so minor tokenizing is required for these.

width = parent_width + margin_left   # the Frame; note the regular expression tokenized "-" to "_" in margin_left
width = parent_width + padding_left  # the Cell, etc.

A Pic of Chrome beside Tk testing HTML Cell Rendering to adjust border and spacing properties.

It does require math within the Cell Methods also.

CSS can be transferred without a lot of tokenizing, simply by assigning the CSS value directly to the widget's value variables, such as background=background-color.

CSS is render-blocking: the browser blocks page rendering until it has received and processed all of the page's CSS. The reason is that later CSS rules can override earlier ones, so rendering with only part of the CSS would leave the view in an inconsistent state until the rest has been processed.
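The direct assignment of CSS values to widget options described above, together with the "-" to "_" tokenizing, might look roughly like this. The `CSS_TO_WIDGET` mapping and the `to_widget_options` function are hypothetical; the only assumption about the toolkit is that it accepts option names such as `background`, `foreground`, and `borderwidth`, as Tk does.

```python
import re

# Hypothetical mapping from CSS property names to Tk-style option names.
CSS_TO_WIDGET = {
    "background-color": "background",
    "color": "foreground",
    "border-width": "borderwidth",
}


def to_widget_options(declarations):
    """Convert CSS declarations into keyword-style widget options.

    Unmapped properties keep their name with "-" replaced by "_" so that they
    are valid identifiers, and pixel lengths become plain integers.
    """
    options = {}
    for prop, value in declarations.items():
        name = CSS_TO_WIDGET.get(prop, prop.replace("-", "_"))
        pixels = re.fullmatch(r"(-?\d+)px", value)
        options[name] = int(pixels.group(1)) if pixels else value
    return options
```

Options such as `background` and `foreground` could then be passed straight to a widget constructor, while values like `margin_left` are not widget options and would feed the frame and cell width math instead.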

Width and height also have to be carried and returned for each new widget combination in the stack.

Spacing will not fully match between WebKit, Firefox, and Edge, because each engine ships its own default styles and draws its own form widgets. The engines also come from separate lineages (Internet Explorer was built on licensed Spyglass Mosaic code, while Firefox descends from Netscape Navigator), so their CSS engines were developed independently and render some CSS differently.

Links

Javascript Engine

💻 💻 💻 💻 💻

Library Browser Language Stability
SpiderMonkey Gecko/Firefox
TraceMonkey SpiderMonkey in Firefox 3.5 (developed as 3.1); introduces JIT (just-in-time) compilation
KJS Konqueror, tied to KHTML
JScript Trident, Internet Explorer
JavaScriptCore WebKit, used by the Safari browser
SquirrelFish WebKit; its successor, SquirrelFish Extreme, adds JIT like TraceMonkey
V8 Google's Javascript engine used in Chrome and Opera
Chakra

A JavaScript engine includes a Lexer and a Parser, plus an interpreter or compiler that executes the parsed code.

This module is entirely separate from the Browser Engine.

Several of these tend to be tied to a particular rendering engine.

GUI Toolkit

This is the module that draws the elements created by the rendering engine to the screen.

Library Browser Language Stability
wxWidgets
Qt
Tcl/Tk
GTK

User Interface

This module handles displaying the UI for operations such as:

  • Navigation between pages
  • Page history
  • Clearing temporary files
  • Typing in a URL
  • Autocompleting URLs

Library Browser Language Stability
Skia Chrome

Networking

💻 💻

Navigation is the first step of loading a web page. It happens when the user enters a URL in the address bar or clicks on a link.

This module handles all of the complexity and subtlety of the HTTP protocol, e.g. data transfer, expires headers, different protocol versions, TLS, etc.

Library Browser Language Stability
Necko Firefox

URL request methods: multi-threaded I/O request methods for CSS, image, media, and JS files. These have to be live to be fast and efficient.

Development considerations:

DNS lookup

The first step is to find the IP address where the resources are located. This is done by a DNS lookup.

A Domain Name System (DNS) server matches website hostnames (like www.example.com) to their corresponding Internet Protocol (IP) addresses. The DNS server holds a database of public IP addresses and their corresponding domain names.

For example, if you visit www.example.com, the DNS server returns the corresponding IP address, such as 93.184.216.34.
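From a program's point of view, the lookup is usually delegated to the operating system's resolver. The sketch below uses Python's standard `socket.getaddrinfo`; the `resolve` wrapper is a name introduced only for this example, and the addresses returned for any real hostname depend on the network and can change over time.

```python
import socket


def resolve(hostname):
    """Return the IP addresses the system resolver reports for hostname."""
    infos = socket.getaddrinfo(hostname, None, type=socket.SOCK_STREAM)
    addresses = []
    for _family, _type, _proto, _canonname, sockaddr in infos:
        ip = sockaddr[0]  # first element of the address tuple is the IP string
        if ip not in addresses:
            addresses.append(ip)
    return addresses
```

Calling `resolve("www.example.com")` on a connected machine returns a list of IPv4 and/or IPv6 addresses; an IP literal such as `"127.0.0.1"` is returned unchanged without any lookup.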

3-way TCP Handshake

The next step is to establish a TCP connection with the server. This is done by a 3-way TCP handshake.

TCP-Handshake

First, the client sends a request to open up a connection to the server with a SYN packet.

The server then responds with a SYN-ACK packet, which acknowledges the request and asks the client to open a connection in the other direction.

Finally, the client sends an ACK packet to the server, acknowledging the server's SYN.
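The three packets and their sequence and acknowledgement numbers can be modelled as a small simulation. Nothing here touches the network: `Segment` and `three_way_handshake` are illustrative names, and in real code the operating system performs this exchange when a program calls something like `socket.create_connection`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Segment:
    sender: str
    flags: str
    seq: int
    ack: Optional[int] = None


def three_way_handshake(client_isn, server_isn):
    """Return the three segments exchanged, given each side's initial sequence number."""
    # 1. Client asks to open a connection.
    syn = Segment("client", "SYN", seq=client_isn)
    # 2. Server acknowledges the client's SYN and sends its own SYN.
    syn_ack = Segment("server", "SYN-ACK", seq=server_isn, ack=syn.seq + 1)
    # 3. Client acknowledges the server's SYN.
    ack = Segment("client", "ACK", seq=syn.seq + 1, ack=syn_ack.seq + 1)
    return [syn, syn_ack, ack]
```

Each acknowledgement number is the other side's sequence number plus one, because a SYN consumes one sequence number.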

TLS handshake

If the website uses HTTPS (encrypted HTTP protocol), the next step is to establish a TLS connection via a TLS handshake.

TLS-Handshake

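During this step, several more messages are exchanged between the browser and the server. The steps below describe an RSA key exchange, as used in TLS 1.2 and earlier; TLS 1.3 uses a shorter handshake.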

  1. Client says hello: The browser sends the server a message that includes which TLS version and cipher suite it supports and a string of random bytes known as the client random.
  2. Server hello message and certificate: The server sends a message back containing the server's SSL certificate, the server's chosen cipher suite, and the server random (a random string of bytes that's generated by the server).
  3. Authentication: The browser verifies the server's SSL certificate with the certificate authority that issued it. This way the browser can be sure that the server is who it says it is.
  4. The premaster secret: The browser sends one more random string of bytes called the premaster secret, which is encrypted with a public key that the browser takes from the SSL certificate from the server. The premaster secret can only be decrypted with the private key by the server.
  5. Private key used: The server decrypts the premaster secret.
  6. Session keys created: The browser and server generate session keys from the client random, the server random, and the premaster secret.
  7. Client finished: The browser sends a message to the server saying it has finished.
  8. Server finished: The server sends a message to the browser saying it has also finished.
  9. Secure symmetric encryption achieved: The handshake is completed and communication can continue using the session keys.
  10. Now requesting and receiving data from the server can begin.
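Step 6 is the part of the handshake that is easiest to show in code. The sketch below assumes TLS 1.2 with its SHA-256-based pseudorandom function as defined in RFC 5246; the text above does not say which TLS version or cipher suite is in use, so treat that as an assumption. The function names are chosen for this example, and a real browser would rely on a vetted TLS library rather than code like this.

```python
import hashlib
import hmac


def p_sha256(secret, seed, length):
    """P_SHA256 data expansion function from RFC 5246, section 5."""
    output = b""
    a = seed  # A(0) = seed
    while len(output) < length:
        a = hmac.new(secret, a, hashlib.sha256).digest()  # A(i) = HMAC(secret, A(i-1))
        output += hmac.new(secret, a + seed, hashlib.sha256).digest()
    return output[:length]


def master_secret(premaster_secret, client_random, server_random):
    """Derive the 48-byte master secret shared by browser and server."""
    seed = b"master secret" + client_random + server_random
    return p_sha256(premaster_secret, seed, 48)


def key_block(master, client_random, server_random, length):
    """Expand the master secret into key material for the session keys."""
    seed = b"key expansion" + server_random + client_random
    return p_sha256(master, seed, length)
```

Because both sides feed the same three inputs into the same function, they arrive at identical session keys without those keys ever being sent over the wire.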

Database

Stores things relevant to the browser such as:

  • Cookies
  • Cache
  • Security Certificates
  • Bookmarks
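As one possible shape for such a store, the sketch below keeps cookies in SQLite through Python's standard `sqlite3` module. The table layout and the function names `open_store`, `set_cookie`, and `cookies_for` are assumptions made for this example; they do not describe how any particular browser stores its data.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) the browser database and make sure the cookie table exists."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS cookies ("
        " host TEXT NOT NULL,"
        " name TEXT NOT NULL,"
        " value TEXT NOT NULL,"
        " expires INTEGER,"  # Unix timestamp; NULL means a session cookie
        " PRIMARY KEY (host, name))"
    )
    return db


def set_cookie(db, host, name, value, expires=None):
    """Insert a cookie, replacing any existing cookie with the same host and name."""
    db.execute(
        "INSERT OR REPLACE INTO cookies VALUES (?, ?, ?, ?)",
        (host, name, value, expires),
    )
    db.commit()


def cookies_for(db, host, now):
    """Return the unexpired cookies for host as a name-to-value dictionary."""
    rows = db.execute(
        "SELECT name, value FROM cookies"
        " WHERE host = ? AND (expires IS NULL OR expires > ?)",
        (host, now),
    )
    return dict(rows.fetchall())
```

The same database file could hold further tables for the cache index, certificates, and bookmarks.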

On Analyzing Browser Builds

It's necessary to determine a baseline test for all browsers to consume, and to assess how close each comes to rendering controlled layouts. It's also nice to know where the stopping point is.[5]

Assessment will focus on:

  • Visual accuracy
  • Time

Web Browser Projects

Tensor Programming - Rust Browser

Custom Design

Josh on Design - Rust Browser

Custom Design

A Toy Rendering Engine

Custom Design

EinkBro

Kosmonaut

Polypane

Uses Electron WebView; Not a true browser implementation

Metastream

Lexbor

Custom Design

Browser From Scratch - Viethung

Custom Design

Open Source Web Browsers

How Web Browsers Work

Libraries related to Web Browsers

Miscellaneous (I need to sort)

Ad Block for Web Browsers

This is the main source of motivation: to learn how undesired content is blocked from view and prevented from running.

Image from https://brave.com/adauctions-economist/

On Bypassing Paywalls

Footnotes

  1. Understanding DOM, CSSOM, Render Tree, Layout, and Painting

  2. Looking for a clear definition of what a "tokenizer", "parser" and "lexers" are and how they are related to each other and used?

  3. Constructing a Document Tree

  4. How Browsers Work - Building the CSSOM

  5. Josh on Design: Building a Rust Web Browser
