fabvio / ld-lsi

Deep learning based lane/freespace detector embedded in ROS node (built for UC3M LSI)

CNN does not find any lanes when trying to infer on own data

AlexKaravaev opened this issue · comments

Hi there!
First of all, thank you very much for sharing the code and for the awesome paper you have released. Really great work!
Now, to the problem. I am trying to infer on my own data using the pretrained models, and the results are very bad: ego_lane_points and other_lane_points are always empty, even though the data is pretty similar to bdd_data. Am I missing something? Should I somehow preprocess the data, or does the net need to be retrained?
Also, I have converted the code to drop ROS and to run on Python 3 instead of Python 2, just to test it out on a few images, so I could easily have broken something in the code. But I only deleted all the ROS stuff and changed a few things for Python 3 (e.g. substituting '/' with '//').
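
To illustrate the kind of substitution I mean, here is a toy example of the division semantics (not code from the repo):

print(3 / 2)   # Python 2: 1 (floor division on ints); Python 3: 1.5
print(3 // 2)  # floor division: 1 in both Python 2 and Python 3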

This is the CNN "node" code:

class LDCNNNode:
    """
        CNN Node. It takes an image as input and process it using the neural network. Then it resizes the output
        and publish it on a topic.
    """
    def img_received_callback(self, image, name):
        '''
            Callback for image processing.
            It submits the image to the CNN, extracts the output, then resizes it
            for clustering and publishes it on a topic.

            Args:
                image: image published on a topic by the camera
                name: identifier of the image, used when publishing and for debug output
        '''
        try:
            ### Pytorch conversion
            print("Received image")
            start_t = time.time()
            input_tensor = torch.from_numpy(image)
            input_tensor = input_tensor.float() // 255
            input_tensor = input_tensor.permute(2,0,1).unsqueeze(0)
            print(input_tensor.size())
        except Exception as e:
            print("Cannot convert image to pytorch. Exception: %s" % e)

        try:
            ### PyTorch 0.4.0 compatibility inference code
            if torch.__version__ < "0.4.0":
                input_tensor = Variable(input_tensor, volatile=True)
                output = self.cnn(input_tensor)
            else:
                with torch.no_grad():
                    input_tensor = Variable(input_tensor)
                    output = self.cnn(input_tensor)

            if self.with_road:
                output, output_road = output
                road_type = output_road.max(dim=1)[1][0]
            ### Classification
            output = output.max(dim=1)[1]
            output = output.float().unsqueeze(0)

            ### Resize to desired scale for easier clustering
            output = F.interpolate(output, size=(output.size(2) // self.resize_factor, output.size(3) // self.resize_factor) , mode='nearest')

            ### Obtaining actual output
            ego_lane_points = torch.nonzero(output.squeeze() == 1)
            other_lanes_points = torch.nonzero(output.squeeze() == 2)

            ego_lane_points = ego_lane_points.view(-1).cpu().numpy()
            other_lanes_points = other_lanes_points.view(-1).cpu().numpy()

        except Exception as e:
            print("Cannot obtain output. Exception: %s" % e)

        print("-ego: {}".format(ego_lane_points))
        self.pub.publish(ego_lane_points,other_lanes_points,-1 if not self.with_road else road_type,name)
        self.time.append(time.time() - start_t)
        print("Sent lanes information to clustering node with " \
                 + " %s ego lane points and %s other lanes points. %s fps" % (len(ego_lane_points), len(other_lanes_points), len(self.time) // sum(self.time)))
        ### Debug visualization options
        if self.debug:
            try:
                # Convert the image and substitute the colors for egolane and other lane
                output = output.squeeze().unsqueeze(2).data.cpu().numpy()
                output = output.astype(np.uint8)

                output = cv2.cvtColor(output, cv2.COLOR_GRAY2RGB)
                output[np.where((output == [1, 1, 1]).all(axis=2))] = COLORS_DEBUG[0]
                output[np.where((output == [2, 2, 2]).all(axis=2))] = COLORS_DEBUG[1]

                # Blend the original image and the output of the CNN
                output = cv2.resize(output, (image.shape[1], image.shape[0]), interpolation=cv2.INTER_NEAREST)
                image = cv2.addWeighted(image, 1, output, 0.4, 0)
                if self.with_road:
                    cv2.putText(image, ROAD_MAP[road_type], (20, 40), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 0, 255), 2)

                # Visualization
                print("Visualizing output")
                cv2.imwrite("./out/cnn_{}.jpg".format(name), cv2.resize(image, (320, 240), cv2.INTER_NEAREST))
                #cv2.waitKey(1)
            except Exception as e:
                print("Visualization error. Exception: %s" % e)

    def __init__(self):
        """
            Class constructor.
        """
        try:
            # Adding models path to PYTHONPATH to import modules
            print(os.path.join(MODULE_PATH,'res','models'))
            sys.path.insert(0, os.path.join(MODULE_PATH,'res','models'))

            # Initialize CNN parameters with defaults
            model_name = 'erfnet'
            weights_name = 'weights_erfnet.pth'
            self.resize_factor = 1
            self.debug = True
            self.with_road = False
            queue_size = 10
            
        except Exception as e:
            print("Cannot load parameters. Check your roscore. %s" % e)

        try:
            weights_path = os.path.join(MODULE_PATH, 'res', 'weights', weights_name)
            print(weights_path)
            # Assuming the main constructor is method Net()
            self.cnn = importlib.import_module(model_name).Net()

            # Select GPU if available, otherwise fall back to CPU
            device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
            model_dict = torch.load(weights_path, map_location=lambda storage, loc: storage)
            self.cnn = torch.nn.DataParallel(self.cnn, device_ids=[0]).to(device)
            print(device)
            self.cnn.load_state_dict(model_dict)
            self.cnn.eval()

            print("Initialized CNN %s", model_name)
        except Exception as e:
            print("Cannot load neural network. Exception: %s" % e)

        self.time = []

        try:            
            # Publisher interface to send messages to the clustering node
            self.pub = ros_interface_pub(LDClusteringNode())

            # ROS node setup
        except Exception as e:
            print("Cannot initialize ros node. Exception: %s" % e)    

This is how I read the images:

def start_inferring(node, path):
    for image in path.iterdir():
        print("--- Processing {} ---".format(str(image)))
        oriimg = cv2.imread(str(image),cv2.IMREAD_COLOR)
        print(oriimg.shape)

        img = cv2.resize(oriimg,(640,360))
        node.img_received_callback(img, image.stem)
        print("---------------------------\n")       

This is a link to the data sample that I infer on:

Hi, and thanks for your interest. The problem is here, and it is related to Python 3:

input_tensor = input_tensor.float() // 255

Before the conversion to float, the input tensor holds np.uint8 values, i.e. values from 0 to 255. You have to normalize the input so that its values lie in [0, 1]. With the double-slash operator you are performing floor division, so virtually every pixel of your input image collapses to 0 and the network sees a black image. To avoid other problems similar to this one, I will update the code to use torch.div.
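
For example, a minimal sketch of the corrected conversion (variable names follow your snippet):

input_tensor = torch.from_numpy(image)
input_tensor = input_tensor.float() / 255   # true division: uint8 [0, 255] -> float [0, 1]
# equivalently, and immune to '/' -> '//' rewrites:
# input_tensor = torch.div(input_tensor.float(), 255)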

This is the output I get on the image you uploaded.

[image: out]

@fabvio Thank you very much!