cvat-ai / cvat

Annotate better with CVAT, the industry-leading data engine for machine learning. Used and trusted by teams at any scale, for data of any scale.

Home Page: https://cvat.ai

What to do with my additions?

Fred-Erik opened this issue · comments

Actions before raising this issue

  • I searched the existing issues and did not find anything similar.
  • I read/searched the docs

Is your feature request related to a problem? Please describe.

Hi CVAT devs,

For the past few weeks, I've been busy adjusting CVAT to my own needs, and several of these changes might be useful for you to have. But as I'm new to both Django and React, I'm pretty sure my coding style is not up to par for public CVAT. I've also implemented a lot of changes that might not be useful in the official CVAT repo. So I'm wondering: is there a way I can share the useful changes without having to take months to learn the details of Django/React/CVAT?

  1. I'm currently working on creating a crops-based view, a bit similar to this. My use case is as follows:
  • run automatic annotation on a dataset in CVAT, which detects traffic signs and sets an attribute on each one that says which of a set of >500 traffic signs it is
  • correct bounding boxes in normal CVAT view
  • switch to the new crops-based view. This shows all traffic signs in the current job grouped per attribute value, so all (for example) stop signs are clustered together
  • in this view you can easily spot the one crop that differs from the rest (e.g. a parking sign in the midst of twenty stop signs) and correct the label
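
The grouping step behind such a view could look roughly like the sketch below. This is only an illustration of the idea, not the code from my branch: the `Crop` class and its field names are hypothetical and do not mirror CVAT's data model.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Crop:
    """One annotated box. Field names are hypothetical, not CVAT's schema."""
    shape_id: int
    frame: int
    box: tuple[float, float, float, float]  # (xtl, ytl, xbr, ybr)
    sign_type: str  # value of the traffic-sign attribute


def group_by_attribute(crops):
    """Cluster crops by attribute value so each cluster can be reviewed at a glance."""
    groups = defaultdict(list)
    for crop in crops:
        groups[crop.sign_type].append(crop)
    return dict(groups)


crops = [
    Crop(1, 0, (10, 10, 50, 50), "stop"),
    Crop(2, 3, (20, 15, 60, 55), "stop"),
    Crop(3, 7, (5, 5, 40, 40), "parking"),
]
groups = group_by_attribute(crops)
for sign_type, members in groups.items():
    print(sign_type, [c.shape_id for c in members])
```

Each group would then be rendered as a grid of image crops, so a mislabeled crop stands out visually among its neighbours.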

I have now created the backend for this as a Django app, and this works. I'm hesitant to integrate the frontend with the existing CVAT interface though, as the level of complexity of the React codebase is overwhelming. So I'm thinking of creating a separate Vue app (which I am familiar with), with a link or iframe in the current CVAT interface to open the current job in the crop interface. I would prefer to integrate into cvat-ui proper, but without help I will not be able to accomplish this, I'm afraid.

My smaller additions up to now have been (in chronological order):

  1. support many-values attributes. Increased the AttributeSpec max length from 4k to 32k, added search to the Select element, and changed the interface to not show all options at once (as that doesn't work with the number of options I have). I use it to label the make and model of a car, for which I have >300 options. I can now simply type "Golf" and I get the "Volkswagen Golf" option.
  2. make new users workers instead of users, so they cannot do anything unless given permission
  3. fix frame rotation reset on frame deletion
  4. arrow keys control the size of the rectangle annotation that is hovered over, allowing automatic annotations to be corrected without using the mouse as much, which has worked very well for our labeling team
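
To illustrate the search behaviour from item 1: the matching is essentially a case-insensitive substring filter over the attribute values. The sketch below shows the idea in Python for brevity; the actual change lives in the React Select element, and `search_options` is a hypothetical name.

```python
def search_options(options, query):
    """Return the options that contain the query, ignoring case."""
    needle = query.strip().casefold()
    if not needle:
        # An empty query matches everything.
        return list(options)
    return [option for option in options if needle in option.casefold()]


options = ["Volkswagen Golf", "Volkswagen Polo", "Ford Focus"]
matches = search_options(options, "golf")
print(matches)
```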

I also plan to add these things:

  1. confidence per attribute and shape coming from automatic annotations, and being able to visualise the confidences and use them for filtering in both the normal CVAT view and my new crop-based view
  2. automatic annotation using open vocabulary object detection, similar to https://github.com/CVHub520/X-AnyLabeling
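
For the confidence idea in item 1, the filtering I have in mind is roughly the following. This is a sketch under my own assumptions: the `Shape` class, its fields, and the threshold value are hypothetical and not part of CVAT today.

```python
from dataclasses import dataclass, field


@dataclass
class Shape:
    """An automatically annotated shape with hypothetical confidence fields."""
    label: str
    confidence: float  # detector confidence for the shape itself
    attribute_confidences: dict = field(default_factory=dict)  # per-attribute confidence


def needs_review(shape, threshold):
    """Flag a shape if the detection or any of its attributes is below the threshold."""
    scores = [shape.confidence, *shape.attribute_confidences.values()]
    return min(scores) < threshold


shapes = [
    Shape("traffic_sign", 0.97, {"sign_type": 0.97}),
    Shape("traffic_sign", 0.97, {"sign_type": 0.25}),
    Shape("traffic_sign", 0.25, {"sign_type": 0.97}),
]
to_review = [s for s in shapes if needs_review(s, threshold=0.5)]
print(len(to_review))
```

The same predicate could drive both a filter in the normal view and the ordering of crops in the crop-based view, so annotators see the least certain predictions first.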

I have also changed these things which you might not want:

  1. created serverless functions for object detection, with attributes of certain objects filled in by other neural networks. E.g. an object detector detects a license plate and an OCR network adds the license plate reading. The networks are proprietary, but I could share the serverless script
  2. hide registration of new user in login GUI
  3. lots of changes to get it working behind my reverse proxy setup, but those changes end up adding a lot of hard-coded URLs/IPs in various files. I guess I could write some documentation on where to add what
  4. add Issue creation to Standard view
  5. hid lots of GUI options I don't need (layers, Cloud Storages, etc.)
  6. changed shortcuts so as not to interfere with the arrow keys-based rectangle shape changes
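
The two-stage idea from item 1 (a detector followed by a second network that fills in an attribute) can be sketched as below. The networks are replaced by stand-in functions, and the dictionary layout is only illustrative; I have not checked it here against the response format CVAT's serverless functions expect.

```python
def annotate(image, detect, read_plate):
    """Run the detector, then let a second model fill in an attribute for license plates."""
    results = []
    for detection in detect(image):
        attributes = []
        if detection["label"] == "license_plate":
            text = read_plate(image, detection["points"])
            attributes.append({"name": "reading", "value": text})
        results.append({**detection, "attributes": attributes})
    return results


# Stand-ins for the proprietary networks, for illustration only.
def fake_detect(image):
    return [
        {"label": "license_plate", "points": [10, 20, 110, 50], "confidence": 0.9},
        {"label": "car", "points": [0, 0, 300, 200], "confidence": 0.8},
    ]


def fake_read_plate(image, points):
    return "AB-123-C"


annotations = annotate(None, fake_detect, fake_read_plate)
print(annotations[0]["attributes"])
```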

Describe the solution you'd like

If an experienced CVAT dev could help me figure out which code would be useful to share, isolate it from my branch (which currently contains all the changes described above), and bring it up to the standard you expect for CVAT code, I could share some of my work.

Describe alternatives you've considered

Keep my changes internal, develop them in my own way, and not bother with sharing. I'd prefer not to, but I'm worried I don't have the skills (and time) to refactor my additions to make them good enough to be accepted by the official CVAT devs.

Additional context

No response