ft_server

This project takes us fully into the world of web servers. Having seen how the network works, we take another step and learn how to set up our own server on any device thanks to Docker.

How to run the server?

Clone the ft_server repository:

git clone https://github.com/pgomez-a/ft_server.git && cd ft_server

To build the image, run the following from the main directory:
```
docker build -t ft_server .
```
Once the image is built, start the web server container:
```
docker run -d -p 80:80 -p 443:443 ft_server
```

Main objectives

We will have to create a web server with Nginx inside a single Docker container. This container must run the Debian Buster operating system. Our web server also has to provide three services, WordPress, phpMyAdmin and MySQL, and make sure they all work together. When possible, our server should use the SSL protocol.
Lastly, we must be able to manage our server so that it works with an automatic index that can be disabled.

Once the objectives are understood, it is time to read and learn in order to configure our ft_server. Here is some of the information that I read and studied in order to configure my server. However, I advise you to read as much as you can about the technologies named in the objectives, and on different pages, so that you do not depend on a single source of information.

What is a server?

A server is a computer, or a program running on one, that attends to the requests of a client and returns a response accordingly. The term server therefore has two meanings in the field of computing: the first refers to the computer (hardware part) and the second refers to the program that runs on the computer (software part):

Hardware server: physical machine integrated into a computer network in which, in addition to the Operating System (OS), one or more software-based servers operate. An alternative name for a hardware-based server is host.
Server software: program that offers a special service that other programs called clients can use locally or through a network. The type of service depends on the type of server software. The basis of communication is centered on the client-server model.

How does a server work?

The client-server model makes it possible to distribute tasks among different computers and make them accessible to more than one end user independently. Each service available through a network is offered by a server that is permanently on standby. This is the only way to ensure that clients can always reach the server and use the service according to their needs.

How does a web server work?

The main task of a web server is to store and organize web pages and deliver them to clients such as web browsers or crawlers. Communication between server and client is based on HTTP, the Hypertext Transfer Protocol, or on HTTPS, which is its encrypted variant. As a rule, HTML documents and the elements embedded in them are transmitted.
The most popular web servers are Apache HTTP Server, Internet Information Services (IIS) and Nginx.

What is Debian?

Debian is an all-volunteer organization dedicated to developing free software and promoting the ideals of the free software community. The Debian project arose with the intention of creating a new distribution (OS) based on the Linux kernel.

Linux is an OS: a set of programs that allows the user to interact with the computer and run other programs. An operating system consists of several fundamental programs that the computer needs in order to communicate with users and receive their instructions. The most important part of an OS is the kernel, while the rest of the system consists of other programs, many of which were developed for the GNU project. Since the Linux kernel does not by itself make up an OS, what we commonly refer to as Linux is actually GNU/Linux.
Linux is modeled after Unix. From the beginning, it was developed to be a multi-tasking and multi-user system:

Multi-tasking: allows several processes or tasks to run apparently at the same time.
Multi-user: allows several users to be served simultaneously.

What is the GNU project?

The GNU project has developed a set of free software tools to be used on UNIX and other Unix-like operating systems, such as Linux. These tools make it possible to create and delete files and to compile programs, among many other things.

Debian Directory Tree

The structure of Linux directories, as well as their content and functions, is defined in the Filesystem Hierarchy Standard (FHS). The whole tree starts from a common root called root (/). The FHS distinguishes between:

Static Directories vs Dynamic Directories.
Shareable Directories vs Non-Shareable Directories.

root (/)

The directory from which all other directories descend, regardless of whether they are physically stored on the same disk or on separate drives.

bin, sbin (/bin/ /sbin/)

They are both static directories. The bin directory stores the binaries needed to guarantee basic functions at the user level, including many of the commands used daily such as ls, cat, cp, mv, etc.
The sbin directory does the same, but for binaries related to the tasks of the OS and that can only be managed by the root user, such as restoring, booting, etc.

boot (/boot/)

Static directory that includes all the executables and files necessary in the system boot process, and that must be used before the kernel begins to give the execution orders of the different system modules.

dev (/dev/)

It contains the devices connected to the system, represented as files. This includes storage devices such as any hard drive or USB memory that the system can understand as a logical storage volume, but also other devices such as terminals. Each device the system knows about has an entry in this directory.

etc (/etc/)

In charge of storing the configuration files both at the level of components of the operating system itself, as well as the programs and applications installed later.

home (/home/)

Standard user directory and therefore intended to store all user files such as photos, videos, documents, etc. Inside /home/ we find the home directories of all users, named according to the username used.

lib (/lib/)

It includes the essential libraries that are necessary so that all the binaries found in the /bin/ and /sbin/ directories can run correctly, as well as the modules of the same kernel.
On a 64-bit OS, in addition to /lib/ there is another directory called /lib64/, which holds libraries for 64-bit applications.

media (/media/)

Represents the mount point of all logical volumes that are temporarily mounted.

opt (/opt/)

It contains the files of self-contained, add-on programs that do not follow the convention of spreading their files across the different standard subdirectories. These files are normally read-only.

proc (/proc/)

It contains information about the processes and applications that are running at any given time on the system. Nothing is really saved on disk here: the files are virtual, and the kernel generates their content at the moment they are read.

root (/root/)

It would be like the /home/ directory of the root user. It is separated from the rest of the users of the system.

srv (/srv/)

It is used to store files and directories related to servers installed within the system.

sys (/sys/)

Like /proc/ it contains virtual files that provide kernel information regarding OS events.

tmp (/tmp/)

It is used to store temporary files of all kinds. It is a directory designed to store short-lived files, and they are usually deleted automatically when the system is restarted. However, these files should not be deleted manually as they may affect the functionality of some programs.
There is another tmp in /var/tmp/ arranged also for the storage of temporary files but whose content is not deleted when the system is restarted.

usr (/usr/)

It is used to store all files that are read-only and related to user utilities, including all software installed through package managers, such as apt-get.

var (/var/)

It contains several files with system information, such as log files, emails from system users, databases, etc.

Nginx

Nginx is a well-known open source web server. In practical terms, it is software that processes the requests of network users and makes sure that the exchange of information takes place.
Its development has focused on very high performance: serving as many clients as possible at the same time while consuming as few resources as possible.
Like Apache, Nginx is modular software. This means that its features come in the form of modules that the administrator can activate or deactivate. When activated, some of the features available are:

Application Acceleration: streamlines content delivery.
Reverse Proxy Server: a proxy is an intermediary that relays requests and responses between client and server. A reverse proxy sits in front of one or more backend servers and forwards client requests to them, so the client only ever talks to the proxy.
TLS encryption: for secure data transfer.
Bandwidth management: to improve performance.

While Apache, in its classic configuration, dedicates a process or thread to each client connection, Nginx is event-driven. Dedicating a process to each connection consumes resources, so under heavy load the server slows down and users have to wait to access it. Nginx, by contrast, handles all requests in a few worker processes thanks to its event-oriented architecture, which saves resources.

Directives

All Nginx configuration files are located in the /etc/nginx/ directory, which matches the role of /etc/ in the FHS hierarchy that we saw in the Debian section. The main configuration file is nginx.conf.
In Nginx, configuration options are called directives. These are organized in groups called blocks. In this file we find four fundamental configuration directives:

user: defines the user and group credentials used by worker processes.
worker_processes: defines the number of worker processes. The optimal value depends on the hardware, in particular the number of CPU cores. For this reason, it is advisable to use the value auto and let Nginx choose.
error_log: configures the file where errors are logged and the logging level.
pid: defines a file that will store the ID of the main process.
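
As a rough illustration of these four directives, here is a minimal sketch written as a shell snippet. It writes an example nginx.conf into a temporary directory rather than touching /etc/nginx/. The user name, log path, pid path and the worker_connections value are assumptions for the example, not values taken from this project.

```shell
# Scratch location (hypothetical); a real server reads /etc/nginx/nginx.conf.
conf_dir="$(mktemp -d)"
main_conf="$conf_dir/nginx.conf"

# Quoted 'EOF' keeps the shell from expanding anything inside the file.
cat > "$main_conf" <<'EOF'
user www-data;               # user that the worker processes run as
worker_processes auto;       # let Nginx pick a value from the CPU cores
error_log /var/log/nginx/error.log warn;
pid /run/nginx.pid;          # file that stores the main process ID

events {
    worker_connections 1024; # illustrative value
}
EOF

echo "wrote $main_conf"
```

On a machine with Nginx installed, a file like this can be checked with nginx -t -c followed by its path before it is used.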

You can create individual configuration files for each virtual server and save them in the /etc/nginx/sites-available/ folder, where you can modify them whenever you need. For Nginx to take them into account, the files must also appear in the /etc/nginx/sites-enabled/ folder, usually as symbolic links.
The configuration file of the default virtual server is named default, and uses the following directives:

listen: specifies the address and port on which the server accepts requests.
root: the path of the directory with the web content.
index: default files that are served when the URL does not name a specific file.
server_name: domain name or names that this server block answers to.
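
To see how these directives fit together, the following sketch writes a small server block to a temporary file. The domain localhost, the root path /var/www/html and the index file names are assumptions chosen for the example.

```shell
# Scratch file standing in for /etc/nginx/sites-available/default.
site_dir="$(mktemp -d)"
site_conf="$site_dir/default"

cat > "$site_conf" <<'EOF'
server {
    listen 80;                    # port for plain HTTP
    server_name localhost;        # domain this block answers to
    root /var/www/html;           # directory with the web content
    index index.html index.php;   # tried in order when the URL names no file
}
EOF

# On a real server the file would then be enabled with a symbolic link:
#   ln -s /etc/nginx/sites-available/default /etc/nginx/sites-enabled/default
cat "$site_conf"
```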

HTTPS configuration with SSL certificate

SSL (Secure Sockets Layer, today superseded by TLS) is a security standard that enables the transfer of encrypted data between a browser and a web server. An SSL certificate is the file that identifies the server and carries its public key. Basically, the SSL layer allows two parties to have a private conversation. In order to use the secure HTTPS protocol on a test server or on a local network, we can use self-signed certificates.
A self-signed certificate is one that has not been validated by a Certificate Authority (CA). The level of encryption can be the same as with any other type of certificate, but as it is not validated by a CA, the browser will show a warning when the site is opened.
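
One common way to produce such a certificate is the openssl command-line tool. The sketch below assumes openssl is installed. The file names, the 365-day validity and the CN=localhost subject are illustrative choices, not the ones used in this repository.

```shell
# Work in a scratch directory so nothing under /etc/ssl is touched.
cert_dir="$(mktemp -d)"

# -x509    : emit a self-signed certificate instead of a signing request
# -nodes   : leave the private key unencrypted (no passphrase prompt)
# -newkey  : generate a fresh 2048-bit RSA key alongside the certificate
# -subj    : fill in the subject non-interactively
openssl req -x509 -nodes -newkey rsa:2048 -days 365 \
    -subj "/CN=localhost" \
    -keyout "$cert_dir/localhost.key" \
    -out "$cert_dir/localhost.crt" 2>/dev/null

# Subject and issuer are identical, which is what "self-signed" means.
openssl x509 -in "$cert_dir/localhost.crt" -noout -subject -issuer
```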

HTTP to HTTPS redirection

Normally, when an SSL certificate is installed, we will have two server blocks for the same domain. The first is for the HTTP version on port 80 and the second for the HTTPS version on port 443. To redirect HTTP to HTTPS, the block on port 80 answers every request with a permanent redirect (status code 301) to the same address under https://, typically with the return directive.
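
A minimal sketch of the two server blocks follows, again written to a temporary file. The certificate paths and the localhost domain are assumptions. $host and $request_uri are built-in Nginx variables, so the heredoc is quoted to stop the shell from expanding them.

```shell
redirect_conf="$(mktemp)"

cat > "$redirect_conf" <<'EOF'
# Plain HTTP: answer every request with a permanent redirect to HTTPS.
server {
    listen 80;
    server_name localhost;
    return 301 https://$host$request_uri;
}

# HTTPS: the block that actually serves the site.
server {
    listen 443 ssl;
    server_name localhost;
    ssl_certificate     /etc/ssl/certs/localhost.crt;    # hypothetical path
    ssl_certificate_key /etc/ssl/private/localhost.key;  # hypothetical path
    root /var/www/html;
}
EOF

cat "$redirect_conf"
```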

Snakeoil

If you've done some research on self-signed certificates you may have come across the term snakeoil. In cryptography, this term refers to any cryptographic method or product considered false or fraudulent.

Autoindex

When a URL ends in /, Nginx treats it as a request for a directory and looks for an index file there. For example, for the path /assets/css/, Nginx looks in the css/ directory for an index.html and, if one is present, serves it. If there is none, the result depends on the autoindex directive: with autoindex off (the default) Nginx returns a 403 Forbidden error, and with autoindex on it generates a listing of the files in the directory.
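
Since the project asks for an automatic index that can be disabled, one possible approach is to flip the directive with sed. This is only a sketch: set_autoindex is a hypothetical helper, and the location block lives in a temporary file instead of the real site configuration.

```shell
autoindex_conf="$(mktemp)"

cat > "$autoindex_conf" <<'EOF'
location / {
    autoindex on;
}
EOF

# set_autoindex (hypothetical helper): $1 is "on" or "off", $2 is the file.
set_autoindex() {
    sed "s/autoindex [a-z]*;/autoindex $1;/" "$2" > "$2.tmp" && mv "$2.tmp" "$2"
}

set_autoindex off "$autoindex_conf"
grep autoindex "$autoindex_conf"

# On a running server the change takes effect after: nginx -s reload
```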

LEMP Stack

LEMP is a group of software that can be used to serve dynamic web pages and web applications. It is an acronym that describes: the Linux OS (in our case Debian), an Nginx web server (the E comes from its pronunciation, "Engine-X"), a MySQL database (in our case MariaDB) and PHP, which handles dynamic processing.

MariaDB

A database is a set of data belonging to the same context and systematically stored for later use. In computing, it is an organized collection of structured information, or data, typically stored in a computer system.
The software used to manage a database is called a "Database Management System" (DBMS).
There are two broad types of databases:

Relational databases, such as MySQL.
Non-relational databases, such as MongoDB.

Relational databases organize information into tables whose rows are linked to one another through identifiers (keys). Non-relational databases, as the name suggests, do not rely on this table-and-key model and store data in other shapes, such as documents or key-value pairs. In a relational database:

Each table consists of a set of rows and columns.
Each row contains information about a single entity. This is known as a record.
Each column contains one piece of information about every entity. This is known as an attribute or field.

One way to manage the data in a database is SQL. Almost all relational DBMSs use SQL, including MariaDB and MySQL.

Once the Nginx web server is installed, we need a database management system to store and manage the data for our site. In our case we use MariaDB. MariaDB will allow us to create our own databases and tables, as well as decide which users can access and modify them, all with a simple syntax. However, despite its simplicity, it can be much more convenient to use a control panel to manage our database. We will obtain this control panel by installing phpMyAdmin.
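
To give an idea of that syntax, the sketch below writes a few SQL statements to a temporary file. The database name wordpress, the user wp_user and the password change_me are placeholders invented for the example.

```shell
sql_file="$(mktemp)"

cat > "$sql_file" <<'EOF'
-- Create a database and a user that is allowed to work only with it.
CREATE DATABASE IF NOT EXISTS wordpress;
CREATE USER IF NOT EXISTS 'wp_user'@'localhost' IDENTIFIED BY 'change_me';
GRANT ALL PRIVILEGES ON wordpress.* TO 'wp_user'@'localhost';
FLUSH PRIVILEGES;
EOF

# With MariaDB running, the file could be loaded with:
#   mysql -u root < "$sql_file"
cat "$sql_file"
```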

PHP

A website is a site on the World Wide Web that contains hierarchically organized documents. Each document contains text and/or graphics that appear as digital information on the computer screen. One way to classify the existing types of web pages is:

Static Web Page: a page mainly focused on displaying permanent information, where the visitor is limited to reading that information without being able to interact with the page.
Dynamic Web Page: a page that contains applications within the web itself, providing greater interactivity for the visitor.

The most common extensions of web pages are:

html, htm, asp, jsp and php. Only .html and .htm files are static web pages and, therefore, the only ones that the browser can display directly. The rest must be processed by a program on the web server before the result is sent to the browser.
Nginx does not execute PHP itself: it hands PHP scripts to PHP-FPM (FastCGI Process Manager), an alternative FastCGI implementation for PHP, and returns the output to the client.
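
The hand-off from Nginx to PHP-FPM is configured with a location block. In the sketch below, the socket path assumes the PHP 7.3 packages of Debian Buster, and snippets/fastcgi-php.conf is the helper file shipped with the Debian Nginx package. Both are assumptions to check against your own installation.

```shell
php_conf="$(mktemp)"

cat > "$php_conf" <<'EOF'
# Send every request for a .php file to the PHP-FPM socket.
location ~ \.php$ {
    include snippets/fastcgi-php.conf;
    fastcgi_pass unix:/run/php/php7.3-fpm.sock;
}
EOF

cat "$php_conf"
```

This block belongs inside the server block of the site configuration.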

PhpMyAdmin

Tool written in PHP for administering MySQL through web pages, using a web browser. With it you can create and delete databases; create, delete and alter tables; delete, edit and add fields; etc.

Wordpress

WordPress is a content management system focused on the creation of any type of web page. There are many applications of this type, which are known as Content Management Systems (CMS). The reason for using this type of application is very simple: they allow you to create websites and their contents in a visual way, without having to program.

Docker

The idea behind Docker is to create lightweight and portable containers for software applications that can run on any machine with Docker installed, regardless of the operating system underneath. When talking about Docker, we will handle different concepts:

Container: something self-contained that can be carried from one place to another, that is, portable. For users to access an application, it must be running somewhere, and it needs a series of programs installed in order to run correctly. Docker lets us put all of those programs in a container together with the application. In this way, the application can run on any machine that has Docker installed, without any further requirements.
Image: a static representation of the application or service together with its configuration and dependencies. To run the application, a container is created as an instance of the image, and it runs on the Docker host.
Dockerfile: file that Docker reads to build an image. It contains all the commands that we would otherwise execute on the command line to assemble it.
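
To make the relationship between Dockerfile and image more concrete, here is a sketch that writes a small Dockerfile into a temporary build directory. The package list, the srcs/ directory and the start.sh script are hypothetical and do not describe the Dockerfile of this repository.

```shell
build_dir="$(mktemp -d)"

cat > "$build_dir/Dockerfile" <<'EOF'
# Base image required by the project.
FROM debian:buster

# Install the services inside the image (illustrative package list).
RUN apt-get update && apt-get install -y nginx mariadb-server php-fpm php-mysql

# Copy configuration files and a start script (hypothetical names).
COPY srcs/ /tmp/srcs/

# Document the HTTP and HTTPS ports.
EXPOSE 80 443

# Command that runs when a container is started from the image.
CMD ["bash", "/tmp/srcs/start.sh"]
EOF

cat "$build_dir/Dockerfile"
```

Each instruction adds a layer to the image, and CMD only takes effect when a container is started from that image.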

Once you have read a little about Docker and how it works, you will see that there are a number of very useful commands that you will have to use frequently. My goal with this README is for you to be able to understand why you need to create a web server and why you use the programs you use, so that is why I will focus solely on explaining how each one works without giving examples (remember that you must read more than one source to carry out your projects):

docker images: lists the images stored locally. They appear in a table showing the repository name, tag, unique image ID, creation date and size.
docker search: looks up the name of an image in the registry from the command line.
docker pull: downloads an image.
docker run: creates a container from an image and starts it.
docker ps: shows the containers that are running.
docker start: starts again a container that has been stopped.
docker stop: stops a running container.
docker rm: removes a container that has already been stopped.
docker attach: connects our terminal to a running container in the foreground.

Vocabulary

Crawler: computer program that inspects pages of the World Wide Web in a methodical and automated way.
Service: set of activities that seek to respond to the needs of a client.
Linux kernel: Linux OS kernel. Its main function is to be in charge of controlling the computer hardware. Specifically, this kernel is responsible for managing the system memory and the time of the processes.
PID: process identifier, integer number used by the kernel of some OS to uniquely identify a process. That is, each process is numbered to differentiate it from the rest.
Process: running program.

To end

I have many more things noted in the notebook. If you want to discuss some of the topics covered here or if you have any questions, do not hesitate to contact me. I will be happy to talk about these technologies and help you with anything. 😉