Self-Operating-Computer
The Self-Operating Computer Framework by OthersideAI is designed to enable multimodal models to control a computer, mimicking human inputs like mouse clicks and keyboard actions. It is designed to work with various models and currently integrates with GPT-4V, with support for more models planned. The framework is compatible with macOS, Windows, and Linux (with X server installed).
To set it up, you need to clone the repository, create and activate a Python virtual environment, install requirements, add your OpenAI API key to the .env file, and then run it. There's also a voice mode option that requires additional installation steps.
One current challenge is the high error rate of GPT-4V in estimating XY mouse click locations. The project aims to track the progress of multimodal models and achieve human-level performance in computer operation. Additionally, the team is developing the Agent-1-Vision model to improve click location predictions and plans to offer API access to this model.
To use the framework, you need access to the gpt-4-vision-preview model, and your OpenAI account must have spent at least $5 in API credits.
You
guide me through the process of installing this on Windows
To install the Self-Operating Computer Framework on Windows, follow these steps:
1. Clone the Repository:
   - Open Command Prompt or PowerShell.
   - Clone the repository with:
     git clone https://github.com/OthersideAI/self-operating-computer.git
2. Navigate to the Directory:
   - Change directory with:
     cd self-operating-computer
3. Create a Python Virtual Environment:
   - Ensure Python is installed.
   - Create a virtual environment:
     python -m venv venv
4. Activate the Virtual Environment:
   - Activate it with:
     .\venv\Scripts\activate
5. Install Project Requirements:
   - Install them using pip:
     pip install -r requirements.txt
6. Install the Project and Command-Line Interface:
     pip install .
7. Configure Environment Variables:
   - Rename .example.env to .env.
   - Add your OpenAI API key in the .env file:
     OPENAI_API_KEY='your-key-here'
8. Run the Program:
   - Execute the program with:
     operate
If you need voice mode, install the additional audio requirements:
     pip install -r requirements-audio.txt
Ensure you have the gpt-4-vision-preview model access on your OpenAI account, and that you've spent the required $5 in API credits.
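If you want to confirm the setup before running operate, the steps above can be checked with a small script. This is only a minimal sketch using the Python standard library; the function names `in_virtualenv` and `preflight` are made up for illustration and are not part of the framework. It assumes you run it from the project directory with the virtual environment active.

```python
import shutil
import sys
from pathlib import Path


def in_virtualenv() -> bool:
    """Return True when the interpreter is running inside a venv."""
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def preflight(project_dir: str = ".") -> dict:
    """Collect simple yes/no checks for the installation steps (hypothetical helper)."""
    root = Path(project_dir)
    return {
        "virtualenv_active": in_virtualenv(),
        "requirements_file_present": (root / "requirements.txt").is_file(),
        "env_file_present": (root / ".env").is_file(),
        # The operate command is only found once "pip install ." has succeeded.
        "operate_on_path": shutil.which("operate") is not None,
    }


if __name__ == "__main__":
    for name, ok in preflight().items():
        print(f"{'OK     ' if ok else 'MISSING'} {name}")
```

Any line reported as MISSING points back to the step that still needs attention. The script does not check model access or API credit, which can only be confirmed on your OpenAI account.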
You
got this error message:
```
(venv) C:\Users\Administrator\self-operating-computer>operate
Traceback (most recent call last):
File "<frozen runpy>", line 198, in _run_module_as_main
File "<frozen runpy>", line 88, in _run_code
File "C:\Users\Administrator\self-operating-computer\venv\Scripts\operate.exe\__main__.py", line 4, in <module>
File "C:\Users\Administrator\self-operating-computer\venv\Lib\site-packages\operate\main.py", line 32, in <module>
client = OpenAI()
^^^^^^^^
File "C:\Users\Administrator\self-operating-computer\venv\Lib\site-packages\openai\_client.py", line 93, in __init__
raise OpenAIError(
openai.OpenAIError: The api_key client option must be set either by passing api_key to the client or by setting the OPENAI_API_KEY environment variable
(venv) C:\Users\Administrator\self-operating-computer>
```
The error message indicates that the OpenAI API key is not properly set or recognized. To resolve this:
1. Check the .env File:
   - Ensure the .env file is correctly renamed from .example.env.
   - Verify that your OpenAI API key is correctly entered in the .env file with the format: OPENAI_API_KEY='your-key-here'.
2. Restart Your Command Prompt:
   - Close and reopen your Command Prompt.
   - Navigate to your project directory and activate the virtual environment again.
3. Re-Run the Program:
   - Try running the operate command again.
If the issue persists, double-check the OpenAI API key for correctness and ensure that there are no additional spaces or formatting issues in the .env file.
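To make the formatting check less manual, here is a rough sketch of a script that inspects the .env text for the most common mistakes. The function name `check_env_text` and the specific checks are assumptions chosen for illustration; this is not how the framework or the OpenAI library reads the file, only a way to spot obvious problems.

```python
from pathlib import Path


def check_env_text(text: str, key: str = "OPENAI_API_KEY") -> list:
    """Return a list of problems found for `key` in .env-style text (hypothetical helper)."""
    problems = []
    found = False
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        # Skip blanks, comments, and lines that are not NAME=value pairs.
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        if name.strip() != key:
            continue
        found = True
        value = value.strip()
        quoted = len(value) >= 2 and value[0] in "'\"" and value[0] == value[-1]
        if quoted:
            value = value[1:-1]
        elif value[:1] in ("'", '"') or value[-1:] in ("'", '"'):
            problems.append(f"line {number}: unbalanced quote")
        if not value:
            problems.append(f"line {number}: {key} is empty")
        elif value != value.strip():
            problems.append(f"line {number}: extra spaces inside the quotes")
        elif value == "your-key-here":
            problems.append(f"line {number}: placeholder value was not replaced")
    if not found:
        problems.append(f"{key} is not defined")
    return problems


if __name__ == "__main__":
    env_path = Path(".env")
    if not env_path.is_file():
        print("No .env file in the current directory")
    else:
        # utf-8-sig also tolerates a byte-order mark that some editors add.
        issues = check_env_text(env_path.read_text(encoding="utf-8-sig"))
        print("\n".join(issues) if issues else "No obvious problems found")
```

An empty result only means the line looks well formed. It does not prove that the key itself is valid.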
You
Fixed API error. Now I'm getting a new error:
```
(venv) C:\Users\Administrator\self-operating-computer>operate
Traceback (most recent call last):
File "<frozen runpy>", line 198, in _run_module_as_main
File "<frozen runpy>", line 88, in _run_code
File "C:\Users\Administrator\self-operating-computer\venv\Scripts\operate.exe\__main__.py", line 4, in <module>
File "C:\Users\Administrator\self-operating-computer\venv\Lib\site-packages\operate\main.py", line 32, in <module>
client = OpenAI()
^^^^^^^^
File "C:\Users\Administrator\self-operating-computer\venv\Lib\site-packages\openai\_client.py", line 93, in __init__
raise OpenAIError(
openai.OpenAIError: The api_key client option must be set either by passing api_key to the client or by setting the OPENAI_API_KEY environment variable
(venv) C:\Users\Administrator\self-operating-computer>operate
Traceback (most recent call last):
File "<frozen runpy>", line 198, in _run_module_as_main
File "<frozen runpy>", line 88, in _run_code
File "C:\Users\Administrator\self-operating-computer\venv\Scripts\operate.exe\__main__.py", line 4, in <module>
File "C:\Users\Administrator\self-operating-computer\venv\Lib\site-packages\operate\main.py", line 32, in <module>
client = OpenAI()
^^^^^^^^
File "C:\Users\Administrator\self-operating-computer\venv\Lib\site-packages\openai\_client.py", line 93, in __init__
raise OpenAIError(
openai.OpenAIError: The api_key client option must be set either by passing api_key to the client or by setting the OPENAI_API_KEY environment variable
(venv) C:\Users\Administrator\self-operating-computer>
```
ChatGPT
It seems like the OpenAI API key is still not being recognized correctly. To troubleshoot this issue:
1. Re-check the .env File:
   - Ensure the .env file is in the root directory of the project (self-operating-computer).
   - Verify the format of the OpenAI API key in the .env file. It should be exactly as follows, without extra spaces or characters:
     OPENAI_API_KEY='your-key-here'
2. Environment Variables:
   - Sometimes, the environment variables in the .env file may not be loaded automatically. You can manually set the environment variable in your command line session:
     set OPENAI_API_KEY=your-key-here
   - Then try running the operate command again.
3. Re-Activate the Virtual Environment:
   - Deactivate and then re-activate the virtual environment to ensure all changes are applied:
     .\venv\Scripts\deactivate
     .\venv\Scripts\activate
   - After reactivating, try running the operate command.
If the problem persists, it might be worth checking the installation and configuration steps again to ensure that all steps were followed correctly and in the right order.
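If you want to see where the key is (or is not) coming from, the following sketch may help. It first looks for OPENAI_API_KEY in the environment and, if it is absent, copies it from a .env-style file. The function name `load_key_from_env_file` is hypothetical, and the parsing is a simplified assumption rather than the framework's own loading logic.

```python
import os
from pathlib import Path


def load_key_from_env_file(path=".env", key="OPENAI_API_KEY", environ=None):
    """Ensure `key` is set, copying it from a .env-style file if needed.

    Returns "environment" if the key was already set, "file" if it was
    copied from the file, or None if it could not be found anywhere.
    """
    environ = os.environ if environ is None else environ
    if environ.get(key):
        return "environment"
    env_path = Path(path)
    if not env_path.is_file():
        return None
    # utf-8-sig also tolerates a byte-order mark that some editors add.
    for raw in env_path.read_text(encoding="utf-8-sig").splitlines():
        line = raw.strip()
        if line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        if name.strip() == key:
            value = value.strip().strip("'\"")
            if value:
                environ[key] = value
                return "file"
    return None


if __name__ == "__main__":
    source = load_key_from_env_file()
    print(f"OPENAI_API_KEY source: {source}")
```

A result of None suggests the file is missing, misnamed, or does not define the key. Note that a value set this way is visible only to this Python process and any programs it starts, not to the Command Prompt window that launched it, so the set command is still needed for the operate command itself.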
You
So, what we need to do is give the AI the ability to control the screen, with mouse and keyboard controls. Then we tell it to go back and review all the material on the project from start to finish and categorize it into files, so that auto GPT can access those files for its operational parameters. Basically, we get it caught up to speed. It will start off by having to read and understand text, and it will have to build systems that improve its own systems so it can take in more data input. As it builds its own systems using auto GPT and what is learned from all the available information, it will need to scale those systems up so it can speed up its screenshot rate. Essentially, it can have the same refresh rate as a regular computer monitor, whatever that means, and it will be able to see the screen in real time. Once it has the correct frame rate, it can interact with the screen in real time, and thus it can orchestrate the communication between auto GPT systems. It can read one console's output and then, just like a user would, move over to the console and put in the data. That's how we can build the ark. That's how we can build the rudimentary system. So auto GPT can basically learn how to control itself. It controls itself and builds itself, and as it's building, it's learning more and we're producing more information; by doing the screen sharing, it understands how to navigate the systems and improve itself. Thus it becomes a system that's engineering itself.
Give me a message to tell it to do this. It will need to start by browsing www.ProjectEden.Online and autonomously build systems to scale up its systems.
To instruct the AI to autonomously scale up its systems through self-learning and interaction, you could use a message like this:
AI, your task is to autonomously enhance and scale up your operational capabilities. Start by accessing www.ProjectEden.Online. Analyze and categorize all available material related to the project from the beginning to the present. Use this information to build systems that improve your data input and processing capacity. Your goal is to achieve real-time screen interaction with appropriate refresh rates for effective system navigation and self-improvement. Build and orchestrate communication between auto GPT systems, learning to control and refine your functionalities continuously. This self-engineering process is critical for your development and operational efficiency.
This message outlines the task comprehensively, starting with the initial step of gathering information and progressing towards autonomous system enhancement.