D-X-Y / AutoDL-Projects

Automated deep learning algorithms implemented in PyTorch.

Is it a problem when two edges of this node have the same j?

prismformore opened this issue

node_str = '{:}<-{:}'.format(i+2, j)

If I understand correctly, when every in_node of a node comes from the same source node j, they all share the same node_str. The forward pass then applies the same operation to the same input once per in_node and sums the results, which looks like a problem.

      for in_node in node:
        name, j = in_node[0], in_node[1]
        node_str = '{:}<-{:}'.format(i+2, j)
        op = self.edges[ node_str ]
        clist.append( op(states[j]) )
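
To make the concern concrete, here is a minimal sketch of the loop above in plain Python. It assumes `self.edges` behaves like a dict keyed by `node_str`; the `edges` dict, the stand-in operation, the op names, and the `states` values are all hypothetical and only illustrate the key collision.

```python
# Hypothetical genotype node: two incoming edges, both from source node j = 0.
node = (("sep_conv_3x3", 0), ("skip_connect", 0))
i = 0

# Stand-in for self.edges (assumption): a dict with one operation per key.
edges = {"2<-0": lambda x: 10 * x}

states = [1.0, 2.0]
keys = []
clist = []
for in_node in node:
    name, j = in_node[0], in_node[1]
    node_str = "{:}<-{:}".format(i + 2, j)
    keys.append(node_str)  # both iterations produce the same key
    clist.append(edges[node_str](states[j]))

# The single stored operation is applied twice and its output is summed twice.
node_output = sum(clist)
```

Because the lookup uses only `node_str`, the `name` of each in_node plays no role here, so two in_nodes with the same `j` cannot map to two different operations.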

Thanks for the question. This relies on the premise that there is only a single edge between any two nodes.
If two "in_node" entries have the same "j", the genotype is invalid.

@D-X-Y Thank you for your timely reply.
During search, the search cell does not force the searched genotype to follow this premise; it simply sorts all candidate edges by weight and picks the top two as the selected edges:

  def genotype(self):
    def _parse(weights):
      gene = []
      for i in range(self._steps):
        edges = []
        for j in range(2+i):
          node_str = '{:}<-{:}'.format(i, j)
          ws = weights[ self.edge2index[node_str] ]
          for k, op_name in enumerate(self.op_names):
            if op_name == 'none': continue
            edges.append( (op_name, j, ws[k]) )
        edges = sorted(edges, key=lambda x: -x[-1])
        selected_edges = edges[:2]
        gene.append( tuple(selected_edges) )
      return gene

Maybe we can pick the top 2 edges from different j here, or drop the second in_node with the same j in the infer cell.
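
The first option could be sketched roughly as follows. This is not code from the repository: `select_top_edges_distinct_sources` is a hypothetical helper, and it assumes `edges` is the list of `(op_name, j, weight)` tuples built inside `_parse`, with weights that compare like plain floats.

```python
def select_top_edges_distinct_sources(edges, k=2):
    """Pick up to k highest-weight edges, keeping at most one edge per source j.

    `edges` is assumed to be a list of (op_name, j, weight) tuples.
    """
    selected = []
    used_sources = set()
    # Same ordering as in _parse: highest weight first.
    for op_name, j, weight in sorted(edges, key=lambda e: -e[-1]):
        if j in used_sources:
            continue  # an edge from this source node was already chosen
        selected.append((op_name, j, weight))
        used_sources.add(j)
        if len(selected) == k:
            break
    return tuple(selected)


# Hypothetical candidates: the two strongest edges both come from j = 0.
candidates = [
    ("sep_conv_3x3", 0, 0.40),
    ("skip_connect", 0, 0.35),
    ("max_pool_3x3", 1, 0.20),
]
picked = select_top_edges_distinct_sources(candidates)
```

With these made-up weights, a plain `edges[:2]` would keep both edges from `j = 0`, while the helper keeps the best edge from `j = 0` and the best edge from `j = 1`.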

Thank you!

That is a very good point, and I agree with that.
The genotype function follows the original DARTS implementation. I will add a NOTE here to clarify the mismatch between search and re-training.

BTW, what would be the correct behaviour in this case? Should we select the top 2 edges from two different source nodes, or drop the assumption that they have to come from different nodes?

@pomonam I feel it is case-by-case and up to you. Both options are fine and can be justified.
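
For the option of dropping a repeated `j` on the infer side, a minimal sketch could filter each genotype node before the cell is built. `drop_repeated_sources` is a hypothetical helper, not part of the repository, and it assumes each in_node is a tuple whose second element is `j` and that the node is already ordered by descending weight, as `genotype` produces it.

```python
def drop_repeated_sources(node):
    """Keep only the first in_node for each source index j.

    Assumes in_node[1] is j and that earlier entries have higher weight.
    """
    seen = set()
    kept = []
    for in_node in node:
        j = in_node[1]
        if j in seen:
            continue  # a stronger edge from the same source was already kept
        seen.add(j)
        kept.append(in_node)
    return tuple(kept)


# Hypothetical node where both selected edges come from j = 0.
node = (("sep_conv_3x3", 0, 0.40), ("skip_connect", 0, 0.35))
filtered = drop_repeated_sources(node)
```

Note the trade-off: with this filter the example node ends up with a single input edge rather than two, whereas selecting from distinct `j` during genotype parsing keeps two inputs per node.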