A website that fetches the source of another website, cleans the code, and finds the most common words.
This project was originally developed for a university course when I was a student. Later, as a more experienced developer, I refactored the whole project to follow better practices and added unit tests.
Because this project is no longer deployed and serves no further purpose, it is no longer maintained.
- Teamwork in the first release.
- Fetched the data with PHP and processed it with JavaScript.
- Improved semantic HTML, JavaScript code, and practices.
- Used regular expressions to clean the data instead of for loops and if statements.
- Implemented unit testing with Jest.
These instructions will get you a copy of the project up and running on your local machine.
The programs you need are:
- PHP 7.
- Node.js.
Install the JavaScript dependencies:
npm install
Finally run the server:
php -S localhost:8080
Unit tests verify the behavior of the functions and filters. Snapshots are included to record the expected results of several functions and filters. Run the tests with:
npm run test
Note: You can run the tests in watch mode with npm run test:watch.
Generate the coverage report for the functions and filters with:
npm run test:coverage
- Paste a complete URL in the input and click on "get information".
- The program cleans the source code (removes HTML tags, JavaScript, and CSS code) to keep only the meaningful text (words with 1 or 2 characters, special characters, and some common words are removed).
- Then the program calculates the most common words and tags.
Note: At the moment this program does not work well with websites that are rendered on the client side and/or have an unusual structure.
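The cleaning and counting steps can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names (cleanSource, mostCommonWords, mostCommonTags), the stop-word list, and the sample HTML are all assumptions made for the example.

```javascript
// Hypothetical helpers; the project's real functions and filters may differ.

// Small illustrative stop-word list (an assumption, not the project's list).
const STOP_WORDS = new Set(["the", "and", "for", "with", "that", "this"]);

// Remove scripts, styles, comments, tags, entities, and special characters,
// leaving only lowercase words separated by single spaces.
function cleanSource(html) {
  return html
    .replace(/<script\b[^>]*>[\s\S]*?<\/script>/gi, " ") // JavaScript code
    .replace(/<style\b[^>]*>[\s\S]*?<\/style>/gi, " ")   // CSS code
    .replace(/<!--[\s\S]*?-->/g, " ")                    // HTML comments
    .replace(/<[^>]+>/g, " ")                            // remaining tags
    .replace(/&[a-z]+;|&#\d+;/gi, " ")                   // HTML entities
    .replace(/[^\p{L}\s]/gu, " ")                        // digits and special characters
    .replace(/\s+/g, " ")
    .trim()
    .toLowerCase();
}

// Turn a Map of counts into the top `limit` [key, count] pairs,
// sorted by count (descending) and then alphabetically.
function topEntries(counts, limit) {
  return [...counts.entries()]
    .sort((a, b) => b[1] - a[1] || a[0].localeCompare(b[0]))
    .slice(0, limit);
}

// Count words, skipping words with 1 or 2 characters and stop words.
function mostCommonWords(text, limit = 10) {
  const counts = new Map();
  for (const word of text.split(" ")) {
    if (word.length <= 2 || STOP_WORDS.has(word)) continue;
    counts.set(word, (counts.get(word) || 0) + 1);
  }
  return topEntries(counts, limit);
}

// Count opening tags by name in the raw HTML.
function mostCommonTags(html, limit = 10) {
  const counts = new Map();
  for (const match of html.matchAll(/<([a-z][a-z0-9]*)\b/gi)) {
    const tag = match[1].toLowerCase();
    counts.set(tag, (counts.get(tag) || 0) + 1);
  }
  return topEntries(counts, limit);
}

// Usage with a made-up page:
const sample = `<html><head><style>p { color: red; }</style></head>
<body><p>Cats chase mice.</p><p>Cats sleep a lot &amp; cats purr!</p>
<script>var cats = 1;</script></body></html>`;

console.log(mostCommonWords(cleanSource(sample), 3));
// [ [ 'cats', 3 ], [ 'chase', 1 ], [ 'lot', 1 ] ]
console.log(mostCommonTags(sample, 1));
// [ [ 'p', 2 ] ]
```

Regular expressions like these only work on the HTML that the server returns, which is consistent with the limitation noted above: text that a page injects later with client-side JavaScript never reaches the cleaning step.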
- Martín S. Campos mascam97
- Some classmates who do not use GitHub anymore :´(.
You're free to contribute to this project by submitting issues and/or pull requests.
This personal project is licensed under the MIT License.