jupyterhub / jupyterhub-deploy-teaching

Reference deployment of JupyterHub and nbgrader on a single server

Home Page: http://jupyterhub-deploy-teaching.readthedocs.io/en/latest/

nbgrader too welded to configuration

ellisonbg opened this issue

Some original use cases for this deployment seem to be broken right now:

  • Deploy with no nbgrader config at all: The current name of this deployment is jupyterhub-deploy-teaching, but that is a bit of a misnomer. In reality, this deployment is also well suited to small teams of data scientists working on a single shared system. Originally, the nbgrader config could be left as is and everything would work fine. That is no longer the case, because the default config causes the instructor user to be used. We should add a use_nbgrader flag to disable nbgrader completely.
  • Multiple nbgrader configs, set up after initial deployment: We often run a single server for a long time with this deployment and add more nbgrader/formgrade configurations over time. Previously, I could edit the nbgrader vars in the host_vars file and rerun the deploy-formgrade.yml playbook to add a new nbgrader setup. That no longer works, because the formgrade user and the formgrade API token are injected into the jupyterhub config file. It is important to support adding new nbgrader courses without restarting or reconfiguring the hub. To support this use case, we may want to move the vars related to a particular nbgrader course to the deploy-formgrade.yml playbook to make that relationship clearer.
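
As a rough sketch of the use_nbgrader flag proposed in the first bullet: the fragment below assumes a boolean host variable with that name and a hypothetical role called nbgrader. The host group name and the role name are illustrative assumptions, not taken from the current playbooks.

```yaml
---
# host_vars sketch: switch nbgrader off for a plain shared-server deployment.
# "use_nbgrader" is the flag name proposed in this issue; it does not exist yet.
use_nbgrader: false

---
# Playbook sketch: apply the nbgrader role only when the flag is on.
# The group "jupyterhub_hosts" and the role "nbgrader" are assumed names.
- hosts: jupyterhub_hosts
  roles:
    - role: nbgrader
      # default(false) keeps existing inventories working if the flag is unset
      when: use_nbgrader | default(false) | bool
```

With the flag defaulting to off, a deployment that never mentions nbgrader would not touch the instructor user at all.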

Relaunching the Hub to reload config should generally be safe (it doesn't stop user servers, and needn't stop the proxy, either), and is the right thing to do whenever the config changes. But you only need to modify the Hub if you are:

  • adding new services (e.g. a new /services/course endpoint)
  • adding new API tokens

When you add a course, are you updating the course (only one active at a time), or creating a new one that should be running concurrently with the old one?

#35 should cover the first bullet easily enough, and really covers the titular issue.

I think I need more information on how to deal with the second. Exactly what information do you want to be able to change and re-run with?

I typically edit the following variables to describe an entirely new instructor, section, etc.:

nbgrader_course_id: mycourse
nbgrader_owner: instructor
nbgrader_base_dir: "{{home_dir}}/{{nbgrader_owner}}/nbgrader/{{nbgrader_course_id}}"
nbgrader_graders: 
    - instructor
    - grader
nbgrader_port: 5005

And then rerun the formgrade stuff to start the formgrader for that new course/section/instructor.

Does that start an additional course, or is the previous course defunct when you do that?

Sorry I wasn't clear - yes, it starts an additional course. We often run a jupyterhub instance for multiple quarters and have multiple sections each quarter. To be more precise, the use case is multiple courses over time, some running one after another and some in parallel.

Gotcha. Sounds like the nbgrader setup should really be a list of courses, rather than a singleton. While I wouldn't expect changing the nbgrader info and deploying to delete the course files, I would expect nbgrader courses no longer reflected in the ansible config to become defunct (i.e. formgrader no longer running, etc.). Here's what I think makes sense right now:

  • nbgrader courses is a list
  • nbgrader deploy modifies jupyterhub services config, relaunches hub to reload config
  • make sure hub relaunch is set up for minimum disruption (proxy stays running, as do single-user servers)
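
To make the first point concrete, here is one possible shape for a course list. The variable name nbgrader_courses and the per-course keys are assumptions that mirror the existing singleton vars (nbgrader_course_id, nbgrader_owner, nbgrader_graders, nbgrader_port); the second course and its values are made up for illustration. Each course is given its own port on the assumption that concurrently running formgraders cannot share one.

```yaml
---
# host_vars sketch: one entry per course/section instead of a single set of vars.
# "nbgrader_courses" and its keys are hypothetical names.
nbgrader_courses:
  - course_id: mycourse
    owner: instructor
    graders:
      - instructor
      - grader
    port: 5005
  - course_id: mycourse-section2   # illustrative second, concurrent course
    owner: instructor2
    graders:
      - instructor2
    port: 5006

---
# Task sketch: loop over the list to create each course's base directory,
# using the same path layout as the current nbgrader_base_dir variable.
- name: create the base directory for each nbgrader course
  file:
    path: "{{ home_dir }}/{{ item.owner }}/nbgrader/{{ item.course_id }}"
    state: directory
    owner: "{{ item.owner }}"
  with_items: "{{ nbgrader_courses }}"
```

Removing an entry from the list would then be the signal that a course is defunct, while its files stay on disk.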

Does that make sense?

I'll try to do this as part of updating the deployment to JupyterHub 0.7 next week.

Should be fixed in master.

The only part I haven't done is to ensure that the proxy and single-user servers stay running. I will open a separate issue for that.