Nyheimh / new-relic-coding-challenge-3

NR Browser Coding Challenge

Objective

Create a JavaScript program, executable from the command line, that takes one or more texts and returns a list of the 100 most common three-word sequences.

Installation and Running Application

This project uses JavaScript for simplicity, ease of use, and to meet requirements.

Getting Started

Clone the repo (if using GitHub)

  • HTTPS
git clone https://github.com/Nyheimh/new-relic-coding-challenge-3.git

or

Download Zip 

Install the dependencies:

npm install

To run the program:

node index.js text/intro.txt
node index.js text/moby_dick.txt
node index.js text/counting.txt
node index.js text/unicode.txt

To process all four text files in a single run:

node index.js text/intro.txt text/counting.txt text/unicode.txt text/moby_dick.txt

Example output for node index.js text/intro.txt:

│ (index) │ sequence                 │ count │
│ 0       │ 'welcome to my'          │ 7     │
│ 1       │ 'to my nr'               │ 5     │
│ 2       │ 'this is this'           │ 4     │
│ 3       │ 'test this is'           │ 4     │
│ 4       │ 'my nr code'             │ 3     │
│ 5       │ 'hi this this'           │ 2     │
│ 6       │ 'this this is'           │ 2     │
│ 7       │ 'is this is'             │ 2     │
│ 8       │ 'test file this'         │ 2     │
│ 9       │ 'file this is'           │ 2     │
│ 10      │ 'is this hi'             │ 2     │
│ 11      │ 'welcome to welcome'     │ 2     │
│ 12      │ 'to my welcome'          │ 2     │
│ 13      │ 'my welcome to'          │ 2     │
│ 14      │ 'my nr welcome'          │ 2     │
│ 15      │ 'nr welcome to'          │ 2     │
│ 16      │ 'nr code welcome'        │ 2     │
│ 17      │ 'code welcome to'        │ 2     │
│ 18      │ 'this is is'             │ 1     │
│ 19      │ 'this hi hi'             │ 1     │
│ 20      │ 'hi hi this'             │ 1     │
│ 21      │ 'this hi welcome'        │ 1     │
│ 22      │ 'hi welcome welcome'     │ 1     │
│ 23      │ 'welcome welcome to'     │ 1     │
│ 24      │ 'to welcome to'          │ 1     │
│ 25      │ 'nr code challenge'      │ 1     │
│ 26      │ 'code challenge welcome' │ 1     │
│ 27      │ 'challenge welcome to'   │ 1     │

Requirements

  • The program accepts as arguments a list of one or more file paths (e.g., ./solution.rb file1.txt file2.txt ...).
  • The program also accepts input on stdin (e.g., cat file1.txt | ./solution.rb).
  • The program outputs a list of the 100 most common three-word sequences.
  • The program ignores punctuation, line endings, and is case insensitive (e.g., “I love\nsandwiches.” should be treated the same as "(I LOVE SANDWICHES!!)"). Ensure contractions aren't split into two words (e.g., "shouldn't" should not become "shouldn t").
  • The program should be well tested.
  • The program should be well structured and understandable.
  • The program is capable of processing large files efficiently.
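
The requirements above can be read as three steps: normalize and split the text into words, count each run of three consecutive words, then sort and keep the top 100. The following is a minimal sketch of that reading, not the code in index.js. The function names (tokenize, countSequences, topSequences) and the word pattern are assumptions made for illustration.

```javascript
// Hypothetical helper names; the repository's real functions may differ.

// Lowercase the text and extract words. Letters and digits from any script
// count as word characters, and an apostrophe between them keeps a
// contraction such as "shouldn't" in one piece.
function tokenize(text) {
  const wordPattern = /[\p{L}\p{N}]+(?:['’][\p{L}\p{N}]+)*/gu;
  return text.normalize("NFC").toLowerCase().match(wordPattern) ?? [];
}

// Count every run of three consecutive words.
function countSequences(words) {
  const counts = new Map();
  for (let i = 0; i + 2 < words.length; i++) {
    const key = `${words[i]} ${words[i + 1]} ${words[i + 2]}`;
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  return counts;
}

// Sort by descending count and keep the first `limit` entries.
function topSequences(counts, limit = 100) {
  return [...counts.entries()]
    .sort((a, b) => b[1] - a[1])
    .slice(0, limit)
    .map(([sequence, count]) => ({ sequence, count }));
}

const sample = "I love\nsandwiches. (I LOVE SANDWICHES!!) Shouldn't split.";
console.table(topSequences(countSequences(tokenize(sample))));
```

Because punctuation, case, and line endings are discarded before counting, both spellings of the sandwich sentence in the sample contribute to the same 'i love sandwiches' entry.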

Testing

To run tests:

npm test

GitHub Actions

This project uses GitHub Actions for CI/CD. When changes are pushed to a branch, the Jest test suite runs before those changes can be merged into the main branch.

Extra Credit

  • The program can process large files efficiently. Consider memory consumption and speed, especially if handling 1,000 Moby Dicks at once.
  • The program handles Unicode characters (e.g., the ü in Süsse or ß in Straße).
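
One way to approach both extra-credit items is to read input as a stream, one line at a time, and keep only the last two words in memory between lines. The sketch below assumes that approach; it is not taken from index.js, and countFromStream and main are hypothetical names. The Unicode property escapes (\p{L}, \p{N}) are what let words such as Süsse and Straße through intact.

```javascript
const fs = require("node:fs");
const readline = require("node:readline");

// Hypothetical helper: counts three-word sequences from any readable stream
// (a file stream or process.stdin) without loading the whole text at once.
async function countFromStream(stream, counts = new Map()) {
  const wordPattern = /[\p{L}\p{N}]+(?:['’][\p{L}\p{N}]+)*/gu;
  const lines = readline.createInterface({ input: stream, crlfDelay: Infinity });
  const window = []; // the most recent words, carried across line breaks
  for await (const line of lines) {
    for (const word of line.toLowerCase().match(wordPattern) ?? []) {
      window.push(word);
      if (window.length === 3) {
        const key = window.join(" ");
        counts.set(key, (counts.get(key) ?? 0) + 1);
        window.shift(); // slide the window forward by one word
      }
    }
  }
  return counts;
}

// Assumed CLI shape: file paths as arguments, otherwise read stdin.
// Sequences are not counted across file boundaries in this sketch.
async function main(paths) {
  const counts = new Map();
  if (paths.length === 0) {
    await countFromStream(process.stdin, counts);
  } else {
    for (const path of paths) {
      await countFromStream(fs.createReadStream(path, { encoding: "utf8" }), counts);
    }
  }
  return counts;
}
```

Under these assumptions, main(process.argv.slice(2)) would cover both the file-path and stdin cases. Streaming bounds the memory used for the text itself, but the map of counts still grows with the number of distinct sequences.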

Future Enhancements

  • (What would you do next, given more time)

  • Regex has limitations. After user testing, I realized that some edge cases were missed, such as a missing apostrophe, which required extra lines of handling. In addition, partial matches around dashes/hyphens (-) and newlines (\n) are edge cases that had to be worked around. These limitations of regex introduced complexity that needed to be accounted for.

  • Incorporate Docker for efficiency and scalability.

  • Add a performance metric in Jest tests to measure the program's efficiency.
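
The performance metric mentioned in the last item could start as a small timing helper. This is only a sketch: measure is a hypothetical name, and the workload is a stand-in rather than the program's own counting code.

```javascript
const { performance } = require("node:perf_hooks");

// Hypothetical helper: runs `fn` once and reports how long it took.
function measure(fn) {
  const start = performance.now();
  const result = fn();
  return { result, elapsedMs: performance.now() - start };
}

// Stand-in workload: 300,000 synthetic words drawn from a 50-word vocabulary.
// A real test would call the program's own counting function instead.
const words = Array.from({ length: 300000 }, (_, i) => `w${i % 50}`);

const { result, elapsedMs } = measure(() => {
  const counts = new Map();
  for (let i = 0; i + 2 < words.length; i++) {
    const key = `${words[i]} ${words[i + 1]} ${words[i + 2]}`;
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  return counts;
});

console.log(`${result.size} distinct sequences in ${elapsedMs.toFixed(1)} ms`);
```

Inside a Jest test, elapsedMs could then be compared against a time budget with expect(elapsedMs).toBeLessThan(...). The budget would need to be generous, since timings vary between machines and CI runners.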

Known Issues

  • (Bug and Issues)

There are no known issues at this time.
