lrstanley / girc

:bomb: girc is a flexible IRC library for Go :ok_hand:

Home Page: https://pkg.go.dev/github.com/lrstanley/girc


Allow extra characters

qaisjp opened this issue

Somewhat related to https://github.com/lrstanley/girc/projects/1#card-6997933

Functions like IsValidNick and IsValidUser exist:

girc/format.go

Lines 214 to 241 in 51b8e09

```go
// IsValidNick validates an IRC nickname. Note that this does not validate
// IRC nickname length.
//
//   nickname =  ( letter / special ) *8( letter / digit / special / "-" )
//   letter   =  0x41-0x5A / 0x61-0x7A
//   digit    =  0x30-0x39
//   special  =  0x5B-0x60 / 0x7B-0x7D
func IsValidNick(nick string) bool {
	if len(nick) <= 0 {
		return false
	}
	// Check the first index. Some characters aren't allowed for the first
	// index of an IRC nickname.
	if (nick[0] < 'A' || nick[0] > '}') && nick[0] != '?' {
		// a-z, A-Z, '_\[]{}^|', and '?' in the case of znc.
		return false
	}
	for i := 1; i < len(nick); i++ {
		if (nick[i] < 'A' || nick[i] > '}') && (nick[i] < '0' || nick[i] > '9') && nick[i] != '-' {
			// a-z, A-Z, 0-9, -, and _\[]{}^|
			return false
		}
	}
	return true
}
```

girc/format.go

Lines 243 to 286 in 51b8e09

```go
// IsValidUser validates an IRC ident/username. Note that this does not
// validate IRC ident length.
//
// The validation checks are much like what characters are allowed with an
// IRC nickname (see IsValidNick()), however an ident/username can:
//
// 1. Must either start with alphanumberic char, or "~" then alphanumberic
// char.
//
// 2. Contain a "." (period), for use with "first.last". Though, this may
// not be supported on all networks. Some limit this to only a single period.
//
// Per RFC:
//   user = 1*( %x01-09 / %x0B-0C / %x0E-1F / %x21-3F / %x41-FF )
//          ; any octet except NUL, CR, LF, " " and "@"
func IsValidUser(name string) bool {
	if len(name) <= 0 {
		return false
	}
	// "~" is prepended (commonly) if there was no ident server response.
	if name[0] == '~' {
		// Means name only contained "~".
		if len(name) < 2 {
			return false
		}
		name = name[1:]
	}
	// Check to see if the first index is alphanumeric.
	if (name[0] < 'A' || name[0] > 'Z') && (name[0] < 'a' || name[0] > 'z') && (name[0] < '0' || name[0] > '9') {
		return false
	}
	for i := 1; i < len(name); i++ {
		if (name[i] < 'A' || name[i] > '}') && (name[i] < '0' || name[i] > '9') && name[i] != '-' && name[i] != '.' {
			// a-z, A-Z, 0-9, -, and _\[]{}^|
			return false
		}
	}
	return true
}
```

It would be useful if we could abstract this into an interface NickValidator.
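To make the proposal concrete, here is a minimal sketch of what such an interface might look like. Everything in it is an assumption for illustration: `NickValidator`, `strictValidator`, and `puppetValidator` are hypothetical names that do not exist in girc, and the strict implementation simply copies the character checks from the `IsValidNick` function quoted above.

```go
package main

import (
	"fmt"
	"strings"
)

// NickValidator is a hypothetical interface (not part of girc) that decides
// whether a nickname is acceptable.
type NickValidator interface {
	IsValidNick(nick string) bool
}

// strictValidator repeats the character checks of the IsValidNick function
// quoted in this issue.
type strictValidator struct{}

func (strictValidator) IsValidNick(nick string) bool {
	if len(nick) == 0 {
		return false
	}
	if (nick[0] < 'A' || nick[0] > '}') && nick[0] != '?' {
		return false
	}
	for i := 1; i < len(nick); i++ {
		c := nick[i]
		if (c < 'A' || c > '}') && (c < '0' || c > '9') && c != '-' {
			return false
		}
	}
	return true
}

// puppetValidator is a hypothetical wrapper that also accepts one trailing
// "~d" suffix and delegates the rest of the nickname to another validator.
type puppetValidator struct {
	base NickValidator
}

func (v puppetValidator) IsValidNick(nick string) bool {
	return v.base.IsValidNick(strings.TrimSuffix(nick, "~d"))
}

func main() {
	var strict NickValidator = strictValidator{}
	var puppet NickValidator = puppetValidator{base: strict}
	for _, nick := range []string{"alice", "alice~d", "al~ice"} {
		fmt.Println(nick, strict.IsValidNick(nick), puppet.IsValidNick(nick))
	}
}
```

With this shape, a client could keep the strict behaviour by default and let a caller swap in a more permissive validator for a customised server. How such a validator would be wired into the client configuration is left open here.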

My particular use-case is that we have a customised IRC server that allows certain users to use the tilde (~) character: Discord puppets have a ~d suffix. See qaisjp/go-discord-irc.

Would you be willing to accept a pull request that implements this?

(This would prevent the need to fork your repo.)

My first question would be: why? Why choose a character that isn't standard (if it is, excuse all of this; it's been a while since I've looked at the spec)? Why not compromise and use a different, already-supported character? Why choose a character that is unlikely to be supported by some clients, bots, libraries, etc.?

Yes, I had thought about moving these things into a separate package, but the goal is not to make it pluggable in the sense of letting people "replace" components with their own. The only reasons someone would want to deviate from what I have are that what I am checking for is wrong (in which case a PR to allow other characters makes sense), or that they're not following standards. I don't really support deviating from the standard, as this is already a huge issue with the IRC community and protocol.

The goal of a separate package was to let users use only the formatting and/or validation without importing all of the other cruft, for things other than the client library (e.g. maybe someone stores logs in a db and wants to validate input into that db).

Closing this given my last comments, though I can re-open it if you have further questions or concerns.

Why should the client validate what the server is going to validate anyway? Don't second-guess yourself: if the server doesn't want it, then the server will complain. The RFCs are not gospel. IRC has long since evolved past the RFCs, and there are dozens of implementations with varying levels of conformance for which girc would be useful if not for the fact that it second-guesses itself.

This is the most reasonable "authoritative" reference on what's valid:

https://modern.ircdocs.horse/#user-message

It says nothing about permissible characters for USER et al. The only reasonable thing to enforce is that there are no spaces, because otherwise it wouldn't fit into the IRC message grammar.
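As a rough sketch of what "only enforce the message grammar" could mean in practice, the check below rejects a value only if it could not be sent as a single middle parameter of an IRC message. `fitsMessageGrammar` is a hypothetical helper, not part of girc. Rejecting NUL, CR, LF, empty values, and a leading colon goes beyond the "no spaces" rule stated above, and is an assumption based on how IRC messages are delimited.

```go
package main

import (
	"fmt"
	"strings"
)

// fitsMessageGrammar is a hypothetical helper (not part of girc). It reports
// whether param can be sent as a single middle parameter: it must be
// non-empty, must not start with ':' (which would mark a trailing
// parameter), and must not contain a space, NUL, CR, or LF.
func fitsMessageGrammar(param string) bool {
	if param == "" || strings.HasPrefix(param, ":") {
		return false
	}
	return !strings.ContainsAny(param, " \x00\r\n")
}

func main() {
	for _, p := range []string{"alice~d", "first.last", "two words", "evil\r\nQUIT"} {
		fmt.Printf("%q fits: %v\n", p, fitsMessageGrammar(p))
	}
}
```

Anything that passes this check is handed to the server, which remains the authority on which characters it accepts.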