SingularGPT is an open-source project that aims to automate complex tasks, such as device automation, using large language models such as ChatGPT and GPT-4.
For example:
Suppose you want to accomplish a task on your device without writing automation scripts, testing them, debugging them, and fiddling with coordinates. You describe the task in plain language instead.
Query: Hey, please click on the item with text “Document Writer”, after that click on the image with path “image.png”, after that scroll down, then find the element that is on top of the text “File” and double left click it.
SingularGPT will process the query and perform the task.
Does it use the old way, with XPath, CSS/JS selectors, or raw coordinates?
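from selenium.webdriver.common.by import By  # "driver" is an existing Selenium WebDriver

# Locate a link by XPath and click it
element_xpath = driver.find_element(By.XPATH, '//a[@href="http://github.com/login"]')
element_xpath.click()

# or locate a button by CSS selector and click it
element_css = driver.find_element(By.CSS_SELECTOR, "button.btn-primary")
element_css.click()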
Nope! It uses newer GUI element detection techniques instead:
zex.text('Menu').click()
# FindLeftOf() locates the element that is just to the left of the target element.
zex.text('Edit').FindLeftOf().click()
You can even locate and act on the element to the left or right of a target, or the element nearest to it.
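To make the idea of relative lookup concrete, here is a minimal sketch of how "left of" and "nearest" could be resolved once elements have been detected as bounding boxes. It is not ZexUI's implementation: the `Box` class and the `find_left_of` and `find_nearest` helpers are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass
from math import hypot
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Box:
    """Hypothetical detected GUI element: a text label plus a pixel bounding box."""
    label: str
    x: int  # left edge
    y: int  # top edge
    w: int  # width
    h: int  # height

    @property
    def center(self) -> Tuple[float, float]:
        return (self.x + self.w / 2, self.y + self.h / 2)


def find_left_of(target: Box, boxes: List[Box]) -> Optional[Box]:
    """Return the closest box that lies fully left of target and overlaps it vertically."""
    candidates = [
        b for b in boxes
        if b is not target
        and b.x + b.w <= target.x          # entirely to the left of the target
        and b.y < target.y + target.h      # vertical ranges overlap
        and target.y < b.y + b.h
    ]
    # The closest candidate is the one whose right edge is furthest right.
    return max(candidates, key=lambda b: b.x + b.w, default=None)


def find_nearest(target: Box, boxes: List[Box]) -> Optional[Box]:
    """Return the box whose center is closest to the target's center."""
    tx, ty = target.center
    others = [b for b in boxes if b is not target]
    return min(
        others,
        key=lambda b: hypot(b.center[0] - tx, b.center[1] - ty),
        default=None,
    )


# Illustrative layout: a menu bar with a status label far below it.
boxes = [
    Box("File", 10, 5, 40, 20),
    Box("Edit", 60, 5, 40, 20),
    Box("View", 120, 5, 40, 20),
    Box("Status", 60, 400, 80, 20),
]
edit = boxes[1]
left_neighbor = find_left_of(edit, boxes)
nearest_neighbor = find_nearest(edit, boxes)
```

Requiring vertical overlap keeps "left of" from matching elements on a different row, which is a design choice of this sketch and may differ from what the library does.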
ZexUI is a standalone library that uses image processing techniques for GUI automation.
Note that this project currently works on Linux with an X11 display server.
You can also run it in Google Colab with a GPU.
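A quick way to check the platform requirement before running anything is sketched below. This is a heuristic, not part of the project: the `looks_like_linux_x11` helper is a hypothetical name, and it assumes that a set `DISPLAY` variable with no explicit Wayland session indicates X11.

```python
import os
import sys
from typing import Mapping


def looks_like_linux_x11(
    platform: str = sys.platform,
    env: Mapping[str, str] = os.environ,
) -> bool:
    """Heuristic: Linux, a DISPLAY variable, and no explicit Wayland session."""
    if not platform.startswith("linux"):
        return False
    if not env.get("DISPLAY"):
        # X11 clients normally find the server through DISPLAY.
        return False
    # Assumption: treat a declared Wayland session as unsupported,
    # even though XWayland may also set DISPLAY.
    return env.get("XDG_SESSION_TYPE", "x11").lower() != "wayland"


if __name__ == "__main__":
    print("Linux + X11 detected:", looks_like_linux_x11())
```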
The requirements.txt file specifies the following packages:
paddleocr: A deep learning-based optical character recognition (OCR) toolkit.
opencv-python-headless: The OpenCV computer vision library, built without GUI components.
google-cloud-vision: The client library for Google Cloud Vision, a cloud-based service that can be used to extract text from images.
numpy: A fundamental package for scientific computing in Python.
matplotlib: A visualization library in Python for 2D plots and graphs.
You can install these packages, along with their dependencies, using the following command:
pip install -r requirements.txt
Make sure that you run this command in the same directory where the requirements.txt file is located.
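After installing, you may want to confirm that the packages are importable. The sketch below does that without importing the heavy modules themselves. The `missing_modules` helper is a hypothetical name, and the import names in `ASSUMED_IMPORT_NAMES` are the usual ones for these packages (the pip name and the import name differ for some of them).

```python
import importlib.util
from typing import Iterable, List

# Import names assumed for the packages listed in requirements.txt:
# opencv-python-headless is imported as cv2, google-cloud-vision as google.cloud.vision.
ASSUMED_IMPORT_NAMES = ["paddleocr", "cv2", "google.cloud.vision", "numpy", "matplotlib"]


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the module names that cannot be located in the current environment."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # For dotted names find_spec imports the parent package first and
            # raises ModuleNotFoundError when that parent is absent.
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    print("Missing modules:", missing_modules(ASSUMED_IMPORT_NAMES) or "none")
```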
🌟 Quickstart
Create a .env file that sets OPENAI_API to your OpenAI API key, or pass OPENAI_API as an environment variable.
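The two options for supplying the key can be pictured with the following sketch, which reads OPENAI_API from the process environment and falls back to a .env file. It is not the project's loader: `parse_env_text` and `get_openai_key` are hypothetical helpers, the simple KEY=VALUE file format is an assumption, and so is the choice to let the environment variable take precedence over the file.

```python
import os
from pathlib import Path
from typing import Dict, Mapping, Optional


def parse_env_text(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines; blank lines and # comments are skipped."""
    values: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def get_openai_key(
    env_path: str = ".env",
    environ: Mapping[str, str] = os.environ,
) -> Optional[str]:
    """Prefer the process environment, then fall back to the .env file."""
    if environ.get("OPENAI_API"):
        return environ["OPENAI_API"]
    path = Path(env_path)
    if path.is_file():
        return parse_env_text(path.read_text(encoding="utf-8")).get("OPENAI_API")
    return None


if __name__ == "__main__":
    print("OPENAI_API is set:", get_openai_key() is not None)
```

Keep the .env file out of version control so the key is not committed by accident.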
Write your prompt query in Prompts