CCI-MOC / hil

Hardware Isolation Layer, formerly Hardware as a Service

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Unable to delete a nic after a networking action on it

naved001 opened this issue · comments

I can't seem to delete a nic on a node.

[naved@kzn-vbmi01res ~]$ hil node nic delete neu-3-9 nic1
[naved@kzn-vbmi01res ~]$ hil node nic delete neu-3-9 nic2
^[[A^[[A[naved@kzn-vbmi01res ~]$ hil node show neu-3-9
project	  :  None
nics	  :  [{u'port': None, u'switch': None, u'macaddr': u'00:0f:53:0d:f7:29', u'networks': {}, u'label': u'nic2'}, {u'port': None, u'switch': None, u'macaddr': u'00:0f:53:0d:f7:28', u'networks': {}, u'label': u'nic1'}]
name	  :  neu-3-9
metadata	  :  {}

Curl returns a 500 error. Before running this.

I then repeated this in my development environment. All you have to do is, register a nic, run a networking operation on it (connect to a network, or run revert port). And then try to delete it.

IntegrityError: (sqlite3.IntegrityError) NOT NULL constraint failed: networking_action.nic_id [SQL: u'UPDATE networking_action SET nic_id=? WHERE networking_action.id = ?'] [parameters: (None, 2)] (Background on this error at: http://sqlalche.me/e/gkpj)
INFO:werkzeug:127.0.0.1 - - [19/Sep/2018 17:19:30] "DELETE /v0/node/dell1/nic/nic1 HTTP/1.1" 500 -

Probably because we create an entry in the neworking_action table for querying later so we can't just delete the nic because the networking_action table refers to that nic_id.

It was easily fixed by clearing my networking_action table.

Solution 1.

backref=db.backref('current_action', uselist=False))

We change that to this:

backref=db.backref('current_action', uselist=False, cascade="all,delete"))

http://docs.sqlalchemy.org/en/rel_0_9/orm/cascades.html

Comes with a migration script.

Solution 2.

In the nic_delete call, manually check and delete for any entries in the networking_action table and then delete the nic. For instance we require the nic to be deleted (by a user) before a node can be deleted, rather than doing a cascade delete on node.

@zenhack thoughts?

We should be refusing to delete a nic that has pending operations on it. But it shouldn't be throwing 500s, we should be detecting the issue and reporting it to the user cleanly.

It's not just pending operations anymore. We keep the networking actions in the table and mark them DONE (or 'ERROR`) so that users can query the status later on.

And we keep a networking action for that nic until a new action on that nic replaces it.