dwyl / english-words

📝 A text file containing 479k English words for all your dictionary/word-based projects, e.g. auto-completion / autosuggestion

Epic: English-Words Project Roadmap 🗺️

nelsonic opened this issue

Just Want a List of Words? ❓

If you only care about the list of words in this repo, 📝
that's great; use them and have an awesome day! 🎉

Want More? 🚀

For the minuscule minority of people who want more, this issue is for you! 🙌

Brief History / Context

A few years ago I needed a list of English Words for a work project. 👨‍💻
Went searching and didn't find a ready-made list of English Words ... 🔍 🤷‍♂️

But found this StackOverflow Question and Answer:
https://stackoverflow.com/questions/2213607/how-to-get-english-language-word-database

Extracted the words from the Excel file that was on InfoChimps (now 404) and dumped them in a .txt file.
Put it on GitHub, linked to it in a comment on SO and didn't give it any more thought. 👌

Sadly, the work project that used the words was closed source for a company that got acquired, and the App was shut down. 😢 The folly of working on closed source things is that you often have nothing to show for years of your life! 💭

Meanwhile, many thousands of people have downloaded the word list and the repo has 8.3k ⭐ 🤯

The mini [Open Source] demo project I created: nelsonic/autocomplete ➡️ wordsy.herokuapp.com ... will soon be taken offline by Heroku's Bean-counters 🙄

I outlined what I wanted to do in autocomplete#tasks but it's very incomplete ...
so this issue will give a muuuuch better roadmap of what we're doing. 🤞

What challenge are we solving? 🤔

The original purpose of this repo will 100% be maintained. ✅
What we are doing is enhancing the repo with a showcase App that allows people to look up words with auto-completion and suggest changes to the list.

With that in mind, this is the plan:

  1. High quality list of English words in an easy to extract file/format e.g. .txt, .json and .zip
  2. Instructions for how to use the words in various programming languages; code examples.
  • JavaScript/TypeScript
  • Python
  • Elixir
  • Dart
  • Rust
  • Invite contributions from the community for code examples from more programming languages [but NOT frameworks]
    Make it clear that we really don't want a React sample because we don't want to encourage anyone to use it.
  3. Clarity on the process for updating the words list: adding, correcting and removing [invalid] words.
  4. Automate the creation of the .zip file so that we don't have people attempting to submit Pull Requests with Zip Files.

We're never going to accept a PR with a zip file. It's an easy attack vector for smuggling in a malicious executable, and a crafted archive can overwrite files outside the target folder when it is extracted.
Read more: https://github.com/snyk/zip-slip-vulnerability
It's not that we don't "trust" people ... but we know that not everyone on GitHub has good intentions.
Crime pays, otherwise there wouldn't be any crims ... And cyber-crime pays big BTCs! So let's just avoid it. 👌

  5. Allow anyone to look up words with auto-completion and to make suggestions via a Web App/UI. That will invite way more people, including non-technical people who don't know how to use GitHub, to help maintain and improve the list of words.
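
As a possible starting point for the Python code example mentioned in the plan, here is a minimal sketch of loading the list and doing a prefix lookup, which is the core of auto-completion. It assumes a plain-text file with one word per line. The function names `load_words`, `suggest` and `is_word` are made up for illustration and are not part of the repo.

```python
from bisect import bisect_left
from pathlib import Path


def load_words(path):
    """Read one word per line, lower-case, de-duplicate and sort."""
    text = Path(path).read_text(encoding="utf-8")
    return sorted({line.strip().lower() for line in text.splitlines() if line.strip()})


def suggest(sorted_words, prefix, limit=10):
    """Return up to `limit` words that start with `prefix`.

    In a sorted list all words sharing a prefix sit next to each other,
    so a binary search finds the first one and we walk forward from there.
    """
    prefix = prefix.lower()
    i = bisect_left(sorted_words, prefix)
    matches = []
    while (
        i < len(sorted_words)
        and len(matches) < limit
        and sorted_words[i].startswith(prefix)
    ):
        matches.append(sorted_words[i])
        i += 1
    return matches


def is_word(sorted_words, candidate):
    """Exact-match check using the same sorted list."""
    candidate = candidate.lower()
    i = bisect_left(sorted_words, candidate)
    return i < len(sorted_words) and sorted_words[i] == candidate
```

Typical use would be `words = load_words("words.txt")` followed by `suggest(words, "auto")`, where `words.txt` stands in for whichever file you downloaded from the repo. Keeping everything in one sorted list avoids any dependency, at the cost of holding all the words in memory.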
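
The .zip automation could be as small as a script that CI runs whenever the word list changes, so the archive is always generated from the reviewed text file and never submitted by hand. A rough sketch using Python's standard `zipfile` module follows. The function name `build_zip` and any file names passed to it are placeholders, not the repo's actual layout.

```python
import zipfile
from pathlib import Path


def build_zip(source, target):
    """Compress a single text file into a fresh .zip archive."""
    source, target = Path(source), Path(target)
    # Mode "w" overwrites any existing archive, so the result always
    # reflects the current contents of the source file.
    with zipfile.ZipFile(target, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        # Store only the bare file name so the archive holds no directory paths.
        archive.write(source, arcname=source.name)
    return target
```

For example, `build_zip("words.txt", "words.zip")` would regenerate the archive from the text file, again with both file names being assumptions.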

Todo

  • Review the existing/open PRs and try to merge them: #155
  • Create Phoenix App 🆕 ... Note: waiting for Phoenix v1.7 to do this to minimise time wasted with updates ... ⏳
  • Re-create basic features from nelsonic/autocomplete:
    • Use PostgreSQL for simplicity.
    • If we notice too much query latency, we can switch to SQLite or ETS for speed.
  • Load the current English Words List into the DB
  • Determine/decide what other metadata we want to store for each word. 💭
  • Discuss any other features we want to have. (please comment!) 💬
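
To make the "load the words into the DB" and auto-complete steps a bit more concrete, here is a rough sketch in Python using the standard-library `sqlite3` module. SQLite is used here only because it needs no server; the roadmap's first choice is PostgreSQL and the real app is planned in Phoenix. The table name `words`, its single `word` column and the helper names `create_db` and `complete` are all hypothetical.

```python
import sqlite3


def create_db(words, path=":memory:"):
    """Create a one-column table and load the given words into it."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE IF NOT EXISTS words (word TEXT PRIMARY KEY)")
    # INSERT OR IGNORE skips duplicates instead of raising an error.
    conn.executemany(
        "INSERT OR IGNORE INTO words (word) VALUES (?)",
        ((w.strip().lower(),) for w in words if w.strip()),
    )
    conn.commit()
    return conn


def complete(conn, prefix, limit=10):
    """Return up to `limit` words starting with `prefix`, in sorted order."""
    prefix = prefix.lower()
    # A range query from `prefix` up to `prefix` + the highest code point
    # selects every word with that prefix and can use the primary-key index.
    rows = conn.execute(
        "SELECT word FROM words WHERE word >= ? AND word < ? ORDER BY word LIMIT ?",
        (prefix, prefix + "\U0010ffff", limit),
    )
    return [row[0] for row in rows]
```

Any extra metadata we decide to store per word would become additional columns on the same table.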

@LuchoTurtle does this answer your questions regarding the "roadmap" for this repo? 💭 🤞