Shr1ftyy / carla_nn

Currently working on a Level 2 capable self-driving system in CARLA


Level 2-3 Autonomous Driving System - Using CARLA

This repo is now archived. It was a fun research project and I learnt a lot, but all good things must come to an end...

  • orbslam.py - toy implementation of monocular SLAM (ORB-SLAM) - INCOMPLETE
  • model.py - neural network model - the architecture needs improvement; it is currently just a basic ResNet
  • renderer.py - radar point cloud visualization, very slow lol

TODO:

  • Machine Learning

    • Research different ANN architectures
      • Implement recurrence to model the temporal dynamics of the scene
      • Semantic Segmentation
    • Look more into automatic labelling of moving vs. non-moving objects in the scene (I have my own little idea I want to implement 😄).
    • Set up data trigger infrastructure - data collection for edge cases
  • SLAM

    • Read more of Multiple View Geometry
    • Figure out what to do after getting Camera Matrix (projection matrix?)
    • Get at least a toy implementation of ORB-SLAM up and running
  • Misc.

    • Leverage Cloud GPU for training and stuff (Google Colab)
    • git gud

Goal:

My final goal is to get a Level 2-3 self-driving system up and running, whilst also learning about the relevant concepts along the way.

Preview:

Radar Visualization (radar point cloud preview GIF)

NOTES

Multiple View Geometry in Computer Vision:

Notation used in the book:

  • A bold-face symbol such as x always represents a column vector, and its transpose is the row vector xᵀ.
  • A point in the plane is represented by the column vector (x, y)ᵀ, rather than its transpose, the row vector (x, y).
  • A line may naturally be represented by the vector (a, b, c)ᵀ.
  • The set of equivalence classes of vectors in ℝ³ − (0, 0, 0)ᵀ forms the projective space ℙ².
  • A conic is a curve described by a second-degree equation in the plane. In Euclidean geometry conics are of three main types: hyperbola, ellipse, and parabola.
  • The equation of a conic in inhomogeneous coordinates is ax² + bxy + cy² + dx + ey + f = 0, i.e. a polynomial of degree 2.
  • "Homogenizing" this by the replacements x → x₁/x₃, y → x₂/x₃ gives ax₁² + bx₁x₂ + cx₂² + dx₁x₃ + ex₂x₃ + fx₃² = 0 (2.1), or in matrix form: xᵀCx = 0 (a small numeric check of the matrix form is sketched below).
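
The matrix form packs the conic coefficients into the symmetric matrix C = [[a, b/2, d/2], [b/2, c, e/2], [d/2, e/2, f]]. A minimal NumPy sanity check of xᵀCx = 0 (the unit circle example is my own, not from the book):

```python
import numpy as np

# Conic coefficients for the unit circle: x^2 + y^2 - 1 = 0
a, b, c, d, e, f = 1.0, 0.0, 1.0, 0.0, 0.0, -1.0

# Symmetric conic matrix C such that x^T C x = 0 for homogeneous x = (x1, x2, x3)^T
C = np.array([
    [a,     b / 2, d / 2],
    [b / 2, c,     e / 2],
    [d / 2, e / 2, f    ],
])

on_conic  = np.array([1.0, 0.0, 1.0])   # (1, 0) in homogeneous coords, lies on the circle
off_conic = np.array([2.0, 0.0, 1.0])   # (2, 0), does not lie on the circle

print(on_conic @ C @ on_conic)    # ~0.0 -> point satisfies x^T C x = 0
print(off_conic @ C @ off_conic)  # 3.0  -> point is off the conic
```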

Overview of Steps for implementing ORB-SLAM (rough OpenCV sketches follow the list):

  1. Get two consecutive images; the image points in the first are denoted x and in the second x'.

  2. Extract features (keypoints) from each image, along with a descriptor for every keypoint, using ORB.

  3. Match points between the two images with k-NN on the point descriptors.

  4. Filter out bad matches using the ratio test (comparing the descriptor distances of the best and second-best matches).

  5. Estimate the fundamental matrix F with the 8-point algorithm, using RANSAC to filter out outlier matches.

  6. Perform SVD on F, and find focal length values from the singular values s1 and s2 of the matrix S (need to explore a different method, currently unsure).

  7. Set the intrinsic matrix K using the focal length and the centre point of the pinhole (need to investigate better methods for extracting the focal length and the other intrinsic parameters).

  8. Repeat steps 1-4, then convert the image points to homogeneous coordinates (x, y) -> (x, y, 1) and normalize x and x' into x̂ and x̂':

    x̂ = K⁻¹x

    x̂' = K⁻¹x'

  9. Find the essential matrix E using the normalized coordinates x̂ and x̂'.

  10. Perform Singular Value Decomposition on E

  11. Extract the rotation R and translation t as demonstrated here: Determining R and t from E

  12. And finally, extract 3D points as shown here: 3D points from image points
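
A rough sketch of steps 1-5 with OpenCV. The function names are the standard cv2 API; the image paths and parameter values (feature count, ratio threshold, RANSAC settings) are placeholders of my own, not values from this repo:

```python
import cv2
import numpy as np

# 1. Load two consecutive frames (placeholder paths)
img1 = cv2.imread("frame_000.png", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("frame_001.png", cv2.IMREAD_GRAYSCALE)

# 2. Extract ORB keypoints and descriptors from each image
orb = cv2.ORB_create(nfeatures=2000)
kp1, des1 = orb.detectAndCompute(img1, None)
kp2, des2 = orb.detectAndCompute(img2, None)

# 3. Match descriptors with k-NN (k=2, Hamming distance for binary ORB descriptors)
bf = cv2.BFMatcher(cv2.NORM_HAMMING)
matches = bf.knnMatch(des1, des2, k=2)

# 4. Ratio test: keep a match only if its best descriptor distance is
#    clearly smaller than the second-best one
good = [m for m, n in matches if m.distance < 0.75 * n.distance]

pts1 = np.float32([kp1[m.queryIdx].pt for m in good])
pts2 = np.float32([kp2[m.trainIdx].pt for m in good])

# 5. Fundamental matrix with RANSAC to reject the remaining outliers
F, inlier_mask = cv2.findFundamentalMat(pts1, pts2, cv2.FM_RANSAC, 1.0, 0.99)
pts1 = pts1[inlier_mask.ravel() == 1]
pts2 = pts2[inlier_mask.ravel() == 1]
```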
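
And a sketch of steps 7-12, continuing from the variables above. Since step 6 (recovering the focal length from F) is still an open question in the list, K is simply assumed here with hypothetical values, e.g. derived from a CARLA camera's FOV and resolution. Note that cv2.recoverPose performs the SVD of E and the R/t extraction of steps 10-11 internally:

```python
# 7. Assume a pinhole intrinsic matrix K (placeholder values: fx = fy = f,
#    principal point at the image centre). For a CARLA camera, f can be derived
#    from the FOV and image width: f = width / (2 * tan(fov / 2)).
w, h, f = 800, 600, 400.0
K = np.array([
    [f,   0.0, w / 2],
    [0.0, f,   h / 2],
    [0.0, 0.0, 1.0  ],
])

# 8-9. The normalization x_hat = K^-1 x is equivalent to folding K into F:
#      E = K^T F K (same camera for both frames)
E = K.T @ F @ K

# 10-11. recoverPose internally performs the SVD of E and chooses the (R, t)
#        pair that places the triangulated points in front of both cameras
_, R, t, pose_mask = cv2.recoverPose(E, pts1, pts2, K)

# 12. Triangulate 3D points from the two camera projection matrices
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])        # first camera at the origin
P2 = K @ np.hstack([R, t])                                # second camera posed by (R, t)
points_h = cv2.triangulatePoints(P1, P2, pts1.T, pts2.T)  # 4 x N homogeneous points
points_3d = (points_h[:3] / points_h[3]).T                # N x 3 Euclidean points
```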

License: MIT License

