Optexity: Foundation Model Training Using Human Demonstrations

Optexity Logo

▶️ Click the image above to watch the Optexity demo: Trained Llama 3-8B Beats Gemini 2.0 Flash & GPT-4o on Software Automation

Overview

Optexity enables training foundation models using human demonstrations of computer tasks. The framework covers recording demonstrations, processing them, and using them to train AI agents to complete web-based tasks. Planned additions include training through self-exploration with reinforcement learning, training from software documentation, and training from YouTube videos.

Detailed Tutorial Videos

Explore our step-by-step video guides to get started with Optexity:

Setup

Repository Setup Clone the necessary repositories:
```
mkdir optexity
cd optexity
git clone https://github.com/Optexity/ComputerGYM.git
git clone https://github.com/Optexity/AgentAI.git
git clone https://github.com/Optexity/playwright.git
```

Environment Setup Create and activate a Conda environment with the required Python and Node.js versions:
```
conda create -n optexity python=3.10 nodejs
conda activate optexity
```

Installing Dependencies Install the required packages and build the Playwright framework. The directory name is case-sensitive on most systems, so use ComputerGYM exactly as cloned:
```
pip install -e ComputerGYM
pip install -e AgentAI
cd playwright
git checkout playwright_optexity
npm install
npm run build
playwright install
cd ..
```
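
After the editable installs finish, a quick import check can confirm that both packages are visible to the active Conda environment. The sketch below assumes the importable module names are `computergym` and `agentai`, inferred from the directory paths used elsewhere in this README; adjust them if the packages are named differently.

```python
import importlib.util


def missing_modules(names):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Module names are assumptions based on the repository layout.
    missing = missing_modules(["computergym", "agentai"])
    if missing:
        print("Not importable:", ", ".join(missing))
    else:
        print("Both packages are importable.")
```

An empty result means the installs are on the Python path; it does not verify that the Playwright build succeeded.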

Testing Vanilla Gemini Directly (Optional)

To evaluate vanilla Gemini 2.0 Flash on a specific web task, set your API key and run:
```
export GEMINI_API_KEY="<YOUR_GEMINI_API_KEY>"
python AgentAI/agentai/main.py --url "https://app.hubspot.com" --port 8000 --log_to_console --goal "change currency to SGD" --storage_state cache_dir/auth.json --model gemini
```

The next section shows how to improve the performance of these agents on specific tasks.

Pro Tip: You can create a free Gemini API key at https://aistudio.google.com/apikey and use it to try any task on any website.
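
The command above depends on two things being in place: the `GEMINI_API_KEY` environment variable and the file passed to `--storage_state`. The sketch below is a small preflight check for both. The helper name `preflight_problems` is hypothetical, and treating `cache_dir/auth.json` as a file that must already exist is an assumption based on the path in the example command.

```python
import os
from pathlib import Path


def preflight_problems(model, storage_state, env=None):
    """Return a list of human-readable problems that would likely break a run."""
    env = os.environ if env is None else env
    problems = []
    # The Gemini run reads its key from the GEMINI_API_KEY variable.
    if model == "gemini" and not env.get("GEMINI_API_KEY"):
        problems.append("GEMINI_API_KEY is not set")
    # Assumption: the --storage_state file must exist before the run starts.
    if not Path(storage_state).is_file():
        problems.append(f"storage state file not found: {storage_state}")
    return problems


if __name__ == "__main__":
    for problem in preflight_problems("gemini", "cache_dir/auth.json"):
        print("Warning:", problem)
```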

Workflow

Recording Demonstrations Record human demonstrations by creating a configuration file and running the demonstration script:
```
./ComputerGYM/computergym/demonstrations/demonstrate.sh ComputerGYM/computergym/demonstrations/demonstration_config.yaml
```
Note: Create your own demonstration_config.yaml before running this script. The demonstration_config_example.yaml file in the same directory can serve as a starting point.

Processing Demonstrations Process the recorded demonstrations to prepare them for training:
```
python ComputerGYM/computergym/demonstrations/process_demonstration.py --yaml ComputerGYM/computergym/demonstrations/demonstration_config.yaml --seed 5
```

Generating Training Data Convert processed demonstrations into a format suitable for model training:
```
python AgentAI/agentai/sft/prepare_training_data.py --agent_config AgentAI/agentai/train_configs/hubspot_agent.yaml
```
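
For orientation, LLaMA-Factory can read supervised fine-tuning data in an Alpaca-style layout, where each record has `instruction`, `input`, and `output` fields. The sketch below shows what one record could look like for a single demonstration step. The helper name, the way the goal, observation, and action are mapped onto the fields, and the action string are all hypothetical; the actual records are whatever `prepare_training_data.py` writes, which may differ.

```python
import json


def to_alpaca_record(goal, observation, action):
    """Build one Alpaca-style record from a single (hypothetical) demonstration step."""
    return {
        "instruction": f"Goal: {goal}",  # what the agent is asked to achieve
        "input": observation,            # what the agent sees at this step
        "output": action,                # what the human demonstrator did
    }


if __name__ == "__main__":
    record = to_alpaca_record(
        goal="change currency to SGD",
        observation="<page text or accessibility tree at this step>",
        action='click("Settings")',  # illustrative action syntax, not Optexity's
    )
    # A dataset in this layout is a JSON list of such records.
    print(json.dumps([record], indent=2))
```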

Training the Model Our data preparation scripts generate JSON data in a format compatible with LLaMA-Factory. The generated training and inference configurations are stored in the train_data directory. Please refer to the LLaMA-Factory documentation for detailed instructions on model training.

Evaluating the Trained Agent After training your model, deploy it as an inference service on http://localhost:8000. By default, our framework is configured to work with the vLLM serving capability provided by LLaMA-Factory. If you're using an alternative serving method, you'll need to modify the appropriate scripts.
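
Before running an evaluation, it can help to confirm that something is actually listening on port 8000. The sketch below uses only the Python standard library to attempt a TCP connection. It is a minimal check under the assumption that the service runs on the local machine; it does not verify that the listener is the model you trained.

```python
import socket


def service_is_listening(host="localhost", port=8000, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    if service_is_listening():
        print("Something is listening on localhost:8000.")
    else:
        print("Nothing reachable on localhost:8000; start the inference service first.")
```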

To evaluate your trained agent on a specific web task, execute:
```
python AgentAI/agentai/main.py --url "https://app.hubspot.com" --port 8000 --log_to_console --goal "change currency to SGD" --storage_state cache_dir/auth.json --model vllm
```
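
To compare the vanilla and trained agents on the same goals, the two invocations in this README differ only in the `--model` value. The sketch below assembles that command for each combination and prints it rather than running it. It uses only the flags that appear in the commands above; the helper name `build_eval_command` and the `goals` list are illustrative.

```python
import shlex


def build_eval_command(goal, model, url="https://app.hubspot.com", port=8000,
                       storage_state="cache_dir/auth.json"):
    """Assemble the main.py invocation shown above as an argument list."""
    return [
        "python", "AgentAI/agentai/main.py",
        "--url", url,
        "--port", str(port),
        "--log_to_console",
        "--goal", goal,
        "--storage_state", storage_state,
        "--model", model,
    ]


if __name__ == "__main__":
    goals = ["change currency to SGD"]  # add more task descriptions here
    for goal in goals:
        for model in ("gemini", "vllm"):
            # shlex.join quotes arguments so each line can be pasted into a shell.
            print(shlex.join(build_eval_command(goal, model)))
```

To execute a command instead of printing it, the returned list can be passed to `subprocess.run` from the directory that contains the cloned repositories.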

Documentation

For comprehensive information on configuration options and advanced usage patterns, please refer to the detailed documentation available in each repository:

ComputerGYM: Environment setup, demonstration recording, and processing
AgentAI: Model training configurations, inference settings, and evaluation metrics
Playwright Integration: Custom extensions and modifications for web automation

Configuration References

Demonstration configuration: See ComputerGYM/computergym/demonstrations/demonstration_config_example.yaml
Training parameters: See AgentAI/agentai/train_configs/README.md

Acknowledgements

This project builds upon and extends the work of:

BrowserGym - For the browser automation environment foundation
Playwright - For reliable web testing and automation capabilities
LLaMA-Factory - For efficient foundation model fine-tuning

Community & Support

Report issues on GitHub
Follow us on Twitter for the latest updates
