elasticdog / genhost

generate unused hostnames by randomly picking from a word list

Removal of confusing words

zqad opened this issue

Hi,
I have used genhost for a while now, but I did a double take when it suggested the hostname develop. While this is a perfectly usable word in the context of the mnemonic word list, it carries strong associations in a setting where hostnames are not supposed to be tied to a service.
So, my suggestion is to curate the list. I have gone through my local genhost and removed a number of words that might be confusing in our local context. To simplify discussion, I split the words into discrete categories. The words in the middle list feel more subjectively confusing to me; I have removed them from my list, but they might not be objectively confusing.
My suggested course of action is that we discuss for a bit whether any words should be removed and, if so, which ones, after which I can summarize the decision as a pull request.

Names that have a meaning in a hostname/network context

  • address
  • cloud
  • data
  • develop
  • email
  • example
  • laptop
  • local
  • machine
  • mailbox
  • mobile
  • monitor
  • network
  • proxy
  • service

Names that might have a meaning in a hostname/network context

  • absent
  • alert
  • alias
  • broken
  • connect
  • crash
  • delete
  • front
  • halt
  • image
  • legacy
  • nobody
  • null

Specific common technologies

  • acrobat
  • chef
  • matrix
  • python
  • ruby
  • salt

The biggest problem that I find is that the wordlist isn't denoted in Motorola Assembly markup, which is my native tongue, and can also be confusing. It also lacks the planet Remulak, and Melmac, which my native fauna and flora come from.

I have no right to speak for this product, but feel free to create your own wordlist and make any alterations you see fit with your own fork. YMMV.

Oh, but I have my own fork. This ticket summarizes the work I have done there, but since the whole matter is subjective as you say, I formatted the changes as an issue instead of a MR so that I could create a MR based on the feedback here.

Just doing a quiet fork might be the default method here on GitHub, as I have seen it being done for many repos. But I would much rather see if my changes can help someone else, and if so upstream them.

Sorry for my lack of communication here...the word list has always included words that subjectively don't make sense in a technical context, but those words are inevitably different for every organization. There are a ton of other words on the list that have technical connotations like wheel, voodoo, mono, exit, etc. but I've preferred to leave the list as a direct copy of the mnemonic encoding project to allow people to make their own customizations. The special properties of the words on the list make them all valuable, in some context. I wrote more about that in A Proper Server Naming Scheme:

These 1633 words were chosen very specifically to be short (4-7 letters), phonetically different from one another, easy to understand over the phone, and also recognizable internationally. The mnemonic word list should be much less prone to typos and transposed characters when compared to more structural names. A lot of time and research went into these words, and their properties make them ideal for our purpose.

If an organization is consistent about randomly selecting host names from this list, then there should be a preexisting understanding that the names themselves have no special association with the function of the host. I do understand that there's potential for confusion, which is also why it's mentioned in the README:

If a hostname has the potential to be confusing based on technical jargon (like email.example.com), simply ignore it and generate a replacement.

That's a long way of saying that I agree with you, but I also think there's no subset of words that would be universally perfect. I always appreciate when users contribute feedback/improvements to my projects, but in this particular case, maintaining a fork with the subset of words that makes sense in your context seems like the right move.
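For anyone who lands here and wants to go the fork route, a local curation step could look roughly like the sketch below. It is only an illustration and not how genhost itself works: the function names `curate` and `pick_hostname` are hypothetical, and the blocked set simply reuses the first category and the technology names from the lists earlier in this thread, so swap in whatever is confusing in your own environment.

```python
import random

# Words flagged earlier in this thread; this set is an assumption about what
# one organization finds confusing, not a recommendation for everyone.
BLOCKED = {
    "address", "cloud", "data", "develop", "email", "example", "laptop",
    "local", "machine", "mailbox", "mobile", "monitor", "network", "proxy",
    "service", "acrobat", "chef", "matrix", "python", "ruby", "salt",
}


def curate(words, blocked=BLOCKED):
    """Return the words that are not blocked, keeping their original order."""
    blocked_lower = {word.lower() for word in blocked}
    return [word for word in words if word.lower() not in blocked_lower]


def pick_hostname(words, rng=random):
    """Pick one candidate hostname at random from an already curated list."""
    if not words:
        raise ValueError("word list is empty after curation")
    return rng.choice(words)


if __name__ == "__main__":
    # A tiny stand-in for the full mnemonic word list.
    sample = ["wheel", "develop", "voodoo", "python", "mono"]
    curated = curate(sample)
    print(curated)
    print(pick_hostname(curated))
```

Filtering once and keeping the result as the fork's word list means every later pick is already safe, whereas the README's approach of ignoring a confusing name and generating a replacement needs no changes to the list at all.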

I am happy with that answer. My hope is that if someone thinks something similar, they will find this ticket and use this discussion as a basis for how to move forward. Closing the ticket.