jcugat / django-custom-user

Custom user model for Django with the same behaviour as the default User class but with email instead of username.

EmailUser considers a@b.com and a@B.com different users

joerick opened this issue

The domain part of an email address is case-insensitive. These days the local-part is treated as case-insensitive too by nearly every email provider.

With EmailUser I was able to create two users in the admin with addresses a@b.com and a@B.com - this should be impossible, these should not be judged as unique.

I'm grappling with this in a project I'm working on, where users can type an email address to invite other users to a project. If they use a different case, the user who already has an account gets another signup email! Not good. So I have to use get(email__iexact='...'), but now get() could raise MultipleObjectsReturned, since more than one account can match!

Currently I'm 'fixing' with the following manager:

from custom_user.models import EmailUserManager

class UserManager(EmailUserManager):
    def filter(self, *args, **kwargs):
        # Match the email case-insensitively instead of exactly.
        if 'email' in kwargs:
            kwargs['email__iexact'] = kwargs.pop('email')
        return super(UserManager, self).filter(*args, **kwargs)

    def get(self, *args, **kwargs):
        if 'email' in kwargs:
            kwargs['email__iexact'] = kwargs.pop('email')
        return super(UserManager, self).get(*args, **kwargs)

But this feels pretty hacky and I'm concerned that these won't pass to the queryset, so User.objects.get(email='...') works fine but User.objects.filter(is_admin=True).filter(email='...') won't.
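The lookup rewrite that the manager above repeats in filter() and get() can be pulled into one small helper. This is only a sketch: the name with_case_insensitive_email is hypothetical and not part of django-custom-user, and it works on a plain dict so it can be tried without Django.

```python
def with_case_insensitive_email(lookups):
    """Return a copy of `lookups` where an exact `email` lookup
    is replaced by a case-insensitive `email__iexact` lookup."""
    rewritten = dict(lookups)  # copy, so the caller's dict is left untouched
    if "email" in rewritten:
        rewritten["email__iexact"] = rewritten.pop("email")
    return rewritten


if __name__ == "__main__":
    print(with_case_insensitive_email({"email": "A@b.com", "is_active": True}))
```

A manager or custom queryset method could then pass `**with_case_insensitive_email(kwargs)` to its parent. Putting the override on a custom QuerySet subclass rather than only on the manager is what would let chained calls such as filter(...).filter(email='...') pick it up as well.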

Any ideas on how this could be resolved and perhaps incorporated into django-custom-user?

The Django devs are not interested in fixing this in EmailField, according to this bug from 4 years ago. https://code.djangoproject.com/ticket/17561

This is an interesting SO question on the subject - http://stackoverflow.com/questions/7773341/case-insensitive-unique-model-fields-in-django (which is also where the above hack came from)

I'm not sure what would be the best way to fix this in django-custom-user. Maybe the easiest way is to have the email field unique (this is already done) and always validate the email field in all the forms (create, update, ...) to save the lower() version of the string to the database. The downside is that users would see their input changed (a bit unintuitive).
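A minimal sketch of the "store the lower() version" idea, written as a plain function so it runs without Django. The name canonical_email is hypothetical, and stripping surrounding whitespace is an extra assumption not mentioned above. In a form this logic would typically live in the email field's clean step.

```python
def canonical_email(raw):
    """Return the form of the address that would be stored and compared.

    Assumption: lowercasing the whole address is acceptable, which is
    exactly the trade-off discussed above (the user's casing is lost).
    """
    return raw.strip().lower()


if __name__ == "__main__":
    typed = "Joe.Bloggs@Example.COM"
    print(typed, "is stored as", canonical_email(typed))
```

With this in place the existing unique constraint on the email column is enough, because both spellings collapse to the same stored value.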

Another option is to save the user input as is, and always check with iexact whether the email already exists. The main problem is that this opens a race condition, since the unique constraint no longer protects us at the database level. We could end up with two users with the same email in different cases, and get(email__iexact='...') raising an exception.
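The race can be illustrated with a deterministic simulation. This is a sketch only: a Python list stands in for the user table, and the function names are hypothetical. Two signup requests both run the case-insensitive check before either one saves.

```python
def email_taken(users, email):
    """Application-level check, similar in spirit to an iexact lookup."""
    return any(existing.lower() == email.lower() for existing in users)


def simulate_interleaved_signups(first, second):
    users = []  # stands in for a table with no case-insensitive constraint
    # Both requests perform their check before either one has saved.
    first_ok = not email_taken(users, first)
    second_ok = not email_taken(users, second)
    if first_ok:
        users.append(first)
    if second_ok:
        users.append(second)
    return users


if __name__ == "__main__":
    print(simulate_interleaved_signups("a@b.com", "a@B.com"))
```

Both rows end up stored, so a later case-insensitive get() would match two accounts. Only a constraint enforced by the database itself closes this window.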

So, not sure how to proceed here :/

Great summary @jcugat. I don't like the idea of changing the case of the email address, since that might be meaningful to the user.

The query should be iexact, for sure. Interesting what you say about the race, hadn't thought of that. I can think of two ways to enforce this at the DB level:

  • A custom index as described in this SO answer. This would enforce uniqueness, but it's Postgres-specific and I don't know the best way to tell Django to add this index and deal with migrations etc. Maybe someone here has an idea?

  • Another field lookup_email that stores the lowercased email that is unique. Set in save(), e.g.

      def save(self, *args, **kwargs):
          # Keep the lowercased copy in sync with the email on every save.
          self.lookup_email = self.email.lower()
          super(AbstractEmailUser, self).save(*args, **kwargs)
    

Thoughts?
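To make the first option concrete, here is a sketch of a unique index on the lowercased email. It uses SQLite from the Python standard library purely so that it runs anywhere; that is a substitution, since the SO answer referred to above targets Postgres. The table name, index name and helper functions are hypothetical. SQLite's lower() only folds ASCII letters, so this is not a full answer for internationalized addresses.

```python
import sqlite3


def make_user_table():
    """Create an in-memory table with a case-insensitive unique index."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL)"
    )
    # The index is on an expression, so uniqueness is judged on lower(email)
    # while the email column keeps whatever casing the user typed.
    conn.execute(
        "CREATE UNIQUE INDEX users_email_lower_uniq ON users (lower(email))"
    )
    return conn


def add_user(conn, email):
    """Insert a user; return False if the database rejects a duplicate."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO users (email) VALUES (?)", (email,))
    except sqlite3.IntegrityError:
        return False
    return True


if __name__ == "__main__":
    conn = make_user_table()
    print(add_user(conn, "a@b.com"))
    print(add_user(conn, "a@B.com"))
```

Because the database performs the check at insert time, there is no gap between "check" and "save" for a second request to slip through, and the original casing is preserved.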

There was an interesting discussion in the django-developers mailing list regarding this same issue. Leaving this here for future reference: https://groups.google.com/d/topic/django-developers/SW7_qI81G58/discussion

Thanks @jcugat, the Postgres CITEXT option might work well for me.

It actually is going to get more complicated as email addresses become more internationalized. The rules for internationalization are different for the local part and domain part of addresses. In the local part, complex Unicode input from end users should be normalized to a canonical Unicode form. For example, if a user enters two code points to achieve an accented letter, that's the same email address as when the accented letter is represented by a single pre-composed code point.
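The accented-letter case can be shown with the standard library's unicodedata module. This is a sketch only: choosing NFC as the canonical form is an assumption here, the function name is hypothetical, and domain handling (which follows different rules) is deliberately left out.

```python
import unicodedata


def normalize_local_part(email):
    """Apply NFC normalization to the part before the last '@'.

    The domain is returned unchanged; an input without '@' is
    returned as is.
    """
    local, sep, domain = email.rpartition("@")
    return unicodedata.normalize("NFC", local) + sep + domain


if __name__ == "__main__":
    decomposed = "jose\u0301@example.com"  # 'e' followed by a combining acute accent
    composed = "jos\u00e9@example.com"     # single pre-composed code point
    print(decomposed == composed)
    print(normalize_local_part(decomposed) == normalize_local_part(composed))
```

The two strings look identical on screen but compare unequal until they are normalized, so the normalization would need to run both at account creation and at login.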

I worked through this while trying to build an email address validation module:

https://github.com/JoshData/python-email-validator

I'd be interested to find a way to get Django to do this properly, to ensure that a) no user accounts are created with invalid email addresses, and b) the right normalizations are in place at user account creation (to prevent non-uniqueness) and at login (in case the user enters a different character string but equivalent address). I took a stab at this in https://github.com/if-then-fund/django-betteruser (same purpose as this repository) but I'd rather work with others on solving this than invent my own user model. :)

@JoshData This is not exclusive to Django of course. Countless sites use email as the user identifier; it almost seems like it's unfashionable to log in with usernames now. I'm guessing Facebook has addressed this, for example.

Will check out your code (I'm implementing this too).

As for the OP, I'm going to go with the CITEXT option since I'm using PostgreSQL and need to ship ASAP.

@jcugat What would be the best way to implement CITEXT via Django do you think? FYI, as will become apparent, I'm a Django newbie.

All I want to do is modify django-custom-user's email field in the model to a CIEmailField, and I really don't want to fork this repo just for that.

I could override the model, but Django doesn't allow you to override model fields. So I think that just leaves me with writing a migration script to change the field. I wrote this, but it doesn't work:

# -*- coding: utf-8 -*-
from __future__ import unicode_literals

from django.db import migrations, models
from django.contrib.postgres.operations import CITextExtension
from django.contrib.postgres.fields import CIEmailField

class Migration(migrations.Migration):

    dependencies = [
        ('custom_user', '0002_initial_django18')
    ]

    operations = [
        CITextExtension(),
        migrations.AlterField('EmailUser', 'email', CIEmailField(verbose_name='email address', db_index=True, max_length=255, unique=True))
    ]

Error:

...
  File "/usr/www/claimer-api/env/lib/python3.6/site-packages/django/db/migrations/operations/fields.py", line 200, in state_forwards
    state.models[app_label, self.model_name_lower].fields
KeyError: ('api', 'emailuser')

Looking at the source for fields.py, I suspect this means I'm unable to edit a model from another app in this way? Any help would be appreciated!

Since Django 1.10, fields of abstract models can be overridden. So it should be as easy as following the instructions to extend the EmailUser model and setting the email field there:

from custom_user.models import AbstractEmailUser
from django.contrib.postgres.fields import CIEmailField

class CaseInsensitiveEmailUser(AbstractEmailUser):
    email = CIEmailField(_('email address'), max_length=255, unique=True, db_index=True)

Haven't tried it, @assembledadam could you check if it works?

Sorry it took a while to get round to it, but yes, it worked like a charm. Thanks. Often the simplest approach is the best: letting makemigrations do the work!

Just had to also import _ (from django.utils.translation import ugettext_lazy as _) and install citext on the PostgreSQL instance (CREATE EXTENSION citext;).

Edit: for anyone reading, as my previous post actually already implies, one can simply add CITextExtension() (from django.contrib.postgres.operations import CITextExtension) to have the migration install the extension if necessary.

Looks like this is resolved with the above snippet!