The idea is simple: any site's UI should be understandable with a little 'common sense', and GPT is one big 'common sense' machine. What if we could describe to GPT the webpage currently open in the browser, along with our goal? Could GPT then operate the mouse and keyboard?
Useful commands:
- Start the Selenium server: `docker run -d -p 4444:4444 -p 7900:7900 --shm-size="2g" selenium/standalone-chrome:latest`
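
Once the container is running, a script can attach to it and turn the open page into text that could be handed to GPT together with a goal. The following is only a minimal sketch under stated assumptions: it assumes the `selenium` Python package (version 4) is installed and that the server is reachable on `localhost` at the port mapped above. The helper names (`connect`, `list_elements`, `describe_page`, `main`), the prompt layout, and the example URL and goal are hypothetical and are not part of this project.

```python
from typing import Dict, List

# Assumption: the container from the docker command above runs on this machine,
# so the WebDriver endpoint is the mapped port 4444.
SELENIUM_URL = "http://localhost:4444"


def connect(server_url: str = SELENIUM_URL):
    """Open a Chrome session on the remote Selenium server."""
    # Imported lazily so describe_page() can be used without selenium installed.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    return webdriver.Remote(command_executor=server_url, options=options)


def list_elements(driver) -> List[Dict[str, str]]:
    """Collect the tag and visible text of common interactive elements."""
    from selenium.webdriver.common.by import By

    found = driver.find_elements(By.CSS_SELECTOR, "a, button, input, select, textarea")
    return [
        {
            "tag": el.tag_name,
            "text": (el.text or el.get_attribute("placeholder") or "").strip(),
        }
        for el in found
    ]


def describe_page(title: str, url: str, elements: List[Dict[str, str]], goal: str) -> str:
    """Build a plain-text description of the page (hypothetical prompt layout)."""
    lines = [f"Goal: {goal}", f"Page: {title} ({url})", "Interactive elements:"]
    for index, element in enumerate(elements):
        lines.append(f"[{index}] <{element['tag']}> {element['text']}")
    return "\n".join(lines)


def main() -> None:
    """Example run; requires the Selenium container to be up."""
    driver = connect()
    try:
        driver.get("https://example.com")  # placeholder page for illustration
        prompt = describe_page(
            driver.title,
            driver.current_url,
            list_elements(driver),
            goal="Open the first link on the page",  # placeholder goal
        )
        print(prompt)
    finally:
        driver.quit()  # always release the browser session
```

Calling `main()` prints the description for the placeholder page. Sending that text to GPT and mapping its answer back to clicks or keystrokes is deliberately left out here. The second mapped port, 7900, should serve the image's browser-based VNC viewer, which is handy for watching the session while experimenting.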
List of valuable resources: