feat(agent/workspace): Add GCS and S3 FileWorkspace providers (Significant-Gravitas#6485)

* refactor: Rename FileWorkspace to LocalFileWorkspace and create FileWorkspace abstract class
  - Rename `FileWorkspace` to `LocalFileWorkspace`, a more descriptive name for the class that represents a workspace backed by local files.
  - Create a new abstract base class `FileWorkspace` as the parent of `LocalFileWorkspace`, so that file workspaces can be extended and customized more easily in the future (see the sketch below).
  - Update import statements and references to `FileWorkspace` throughout the codebase to match the new naming.
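
A minimal sketch of the resulting split, assuming simplified method names and signatures (the real classes in `autogpt.file_workspace` carry more configuration, and the real `write_file` is awaited by callers, as the server changes further down show):

```python
import abc
from pathlib import Path


class FileWorkspace(abc.ABC):
    """Interface that every workspace backend implements."""

    @property
    @abc.abstractmethod
    def root(self) -> Path:
        """The workspace root that relative paths are resolved against."""

    @abc.abstractmethod
    def read_file(self, path: str | Path, binary: bool = False) -> str | bytes: ...

    @abc.abstractmethod
    def write_file(self, path: str | Path, content: str | bytes) -> None: ...


class LocalFileWorkspace(FileWorkspace):
    """Backend that stores workspace files in a local directory."""

    def __init__(self, root: Path, restrict_to_root: bool = True):
        self._root = root.resolve()
        # The real class also enforces restrict_to_root when resolving paths.
        self._restrict_to_root = restrict_to_root

    @property
    def root(self) -> Path:
        return self._root

    def read_file(self, path: str | Path, binary: bool = False) -> str | bytes:
        full_path = self._root / path
        return full_path.read_bytes() if binary else full_path.read_text()

    def write_file(self, path: str | Path, content: str | bytes) -> None:
        full_path = self._root / path
        full_path.parent.mkdir(parents=True, exist_ok=True)
        if isinstance(content, bytes):
            full_path.write_bytes(content)
        else:
            full_path.write_text(content)
```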

* feat: Add S3FileWorkspace + tests + test setups for CI and Docker
  - Added S3FileWorkspace class to provide an interface for interacting with a file workspace and storing files in an S3 bucket.
  - Updated pyproject.toml to include dependencies for boto3 and boto3-stubs.
  - Implemented unit tests for S3FileWorkspace.
  - Added MinIO service to Docker CI to allow testing S3 features in CI.
  - Added autogpt-test service config to docker-compose.yml for local testing with MinIO.
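
A hedged sketch of how an S3-backed workspace can store files with boto3; the class and method names here are illustrative rather than copied from `S3FileWorkspace`:

```python
import os

import boto3


class S3WorkspaceSketch:
    """Stores workspace files as objects in an S3 (or S3-compatible) bucket."""

    def __init__(self, bucket_name: str):
        # boto3 resolves AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY from the
        # environment; S3_ENDPOINT_URL lets this point at MinIO, as the CI
        # setup below does.
        self._s3 = boto3.resource("s3", endpoint_url=os.getenv("S3_ENDPOINT_URL"))
        self._bucket = self._s3.Bucket(bucket_name)

    def write_file(self, path: str, content: bytes) -> None:
        self._bucket.Object(path).put(Body=content)

    def read_file(self, path: str, binary: bool = False) -> str | bytes:
        data = self._bucket.Object(path).get()["Body"].read()
        return data if binary else data.decode()

    def list_files(self, prefix: str = "") -> list[str]:
        return [obj.key for obj in self._bucket.objects.filter(Prefix=prefix)]
```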

* ci(docker): tee test output instead of capturing

* fix: Improve error handling in S3FileWorkspace.initialize()
  - Do not tolerate all `botocore.exceptions.ClientError`s
  - Re-raise the exception if the error is anything other than "NoSuchBucket" (see the sketch below)
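
A sketch of the narrowed error handling described above, written as a standalone helper under assumed names; the actual `initialize()` body may differ in detail:

```python
import boto3
import botocore.exceptions


def ensure_bucket(s3, bucket_name: str):
    """Return the bucket, creating it only if it genuinely does not exist."""
    try:
        s3.meta.client.head_bucket(Bucket=bucket_name)
        return s3.Bucket(bucket_name)
    except botocore.exceptions.ClientError as e:
        # Only "bucket not found" is tolerated; credential, permission and
        # networking errors are re-raised instead of being silently swallowed.
        if e.response["Error"]["Code"] not in ("404", "NoSuchBucket"):
            raise
        return s3.create_bucket(Bucket=bucket_name)


# e.g. ensure_bucket(boto3.resource("s3"), "autogpt")
```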

* feat: Add S3 workspace backend support and S3Credentials
  - Added support for S3 workspace backend in the Autogpt configuration
  - Added a new sub-config `S3Credentials` to store S3 credentials
  - Modified the `.env.template` file to include variables related to S3 credentials
  - Added a new `s3_credentials` attribute on the `Config` class to store S3 credentials
  - Moved the `unmasked` method from `ModelProviderCredentials` to the parent `ProviderCredentials` class to handle unmasking for S3 credentials

* fix(agent/tests): Fix S3FileWorkspace initialization in test_s3_file_workspace.py
  - Update the S3FileWorkspace initialization in test_s3_file_workspace.py to include the required S3 credentials.

* refactor: Remove S3Credentials and add get_workspace function
  - Remove `S3Credentials` as boto3 will fetch the config from the environment by itself
  - Add `get_workspace` function in `autogpt.file_workspace` module
  - Update `.env.template` and tests to reflect the changes

* feat(agent/workspace): Make agent workspace backend configurable
  - Modified the `autogpt.file_workspace.get_workspace` function to take either a workspace `id` or a `root_path` (see the sketch below).
  - Modified `FileWorkspaceMixin` to use the `get_workspace` function to set up the workspace.
  - Updated the type hints and imports accordingly.
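
A minimal, self-contained sketch of that dispatch using a stand-in workspace type; the real `get_workspace` constructs `LocalFileWorkspace`, `S3FileWorkspace` or `GCSFileWorkspace`, and the bucket-prefix layout shown for remote backends is an assumption:

```python
import enum
from dataclasses import dataclass
from pathlib import Path, PurePosixPath


class FileWorkspaceBackendName(str, enum.Enum):
    LOCAL = "local"
    GCS = "gcs"
    S3 = "s3"


@dataclass
class WorkspaceStub:  # stand-in for the real FileWorkspace subclasses
    backend: FileWorkspaceBackendName
    root: Path | PurePosixPath


def get_workspace(
    backend: FileWorkspaceBackendName,
    *,
    id: str = "",
    root_path: Path | None = None,
) -> WorkspaceStub:
    # Callers pass either a local root_path or an agent id; remote backends
    # map the id to a prefix inside the configured bucket.
    assert bool(id) ^ (root_path is not None), "need exactly one of id / root_path"
    if backend is FileWorkspaceBackendName.LOCAL:
        assert root_path is not None, "local backend needs a root_path"
        return WorkspaceStub(backend, root_path)
    return WorkspaceStub(backend, PurePosixPath("agents") / id / "workspace")


# e.g. get_workspace(FileWorkspaceBackendName.S3, id="AutoGPT-123")
```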

* feat(agent/workspace): Add GCSFileWorkspace for Google Cloud Storage
  - Added support for Google Cloud Storage as a storage backend option in the workspace.
  - Created the `GCSFileWorkspace` class to interface with a file workspace stored in a Google Cloud Storage bucket.
  - Implemented the `GCSFileWorkspaceConfiguration` class to handle the configuration for Google Cloud Storage workspaces.
  - Updated the `get_workspace` function to include the option to use Google Cloud Storage as a workspace backend.
  - Added unit tests for the new `GCSFileWorkspace` class.
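
A hedged sketch of the Google Cloud Storage counterpart using the `google-cloud-storage` client, which picks up credentials via Application Default Credentials; names are illustrative, not copied from `GCSFileWorkspace`:

```python
from google.cloud import storage


class GCSWorkspaceSketch:
    """Stores workspace files as blobs in a Google Cloud Storage bucket."""

    def __init__(self, bucket_name: str):
        self._client = storage.Client()  # uses Application Default Credentials
        self._bucket = self._client.bucket(bucket_name)

    def write_file(self, path: str, content: str | bytes) -> None:
        self._bucket.blob(path).upload_from_string(content)

    def read_file(self, path: str, binary: bool = False) -> str | bytes:
        blob = self._bucket.blob(path)
        return blob.download_as_bytes() if binary else blob.download_as_text()

    def list_files(self, prefix: str = "") -> list[str]:
        return [b.name for b in self._client.list_blobs(self._bucket, prefix=prefix)]
```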

* fix: Unbreak use of non-local workspaces in AgentProtocolServer
  - Modify the `_get_task_agent_file_workspace` method to handle both local and non-local workspaces correctly
Pwuts committed Dec 7, 2023
1 parent fdd7f8e commit 1f40d72
Showing 20 changed files with 1,425 additions and 152 deletions.
14 changes: 13 additions & 1 deletion .github/workflows/autogpt-ci.yml
@@ -83,6 +83,15 @@ jobs:
matrix:
python-version: ["3.10"]

services:
minio:
image: minio/minio:edge-cicd
ports:
- 9000:9000
options: >
--health-interval=10s --health-timeout=5s --health-retries=3
--health-cmd="curl -f https://localhost:9000/minio/health/live"
steps:
- name: Checkout repository
uses: actions/checkout@v3
@@ -154,8 +163,11 @@ jobs:
tests/unit tests/integration
env:
CI: true
OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
PLAIN_OUTPUT: True
OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
S3_ENDPOINT_URL: http://localhost:9000
AWS_ACCESS_KEY_ID: minioadmin
AWS_SECRET_ACCESS_KEY: minioadmin

- name: Upload coverage reports to Codecov
uses: codecov/codecov-action@v3
31 changes: 21 additions & 10 deletions .github/workflows/autogpt-docker-ci.yml
@@ -89,6 +89,15 @@ jobs:
test:
runs-on: ubuntu-latest
timeout-minutes: 10

services:
minio:
image: minio/minio:edge-cicd
options: >
--name=minio
--health-interval=10s --health-timeout=5s --health-retries=3
--health-cmd="curl -f https://localhost:9000/minio/health/live"
steps:
- name: Check out repository
uses: actions/checkout@v3
@@ -124,23 +133,25 @@ jobs:
CI: true
PLAIN_OUTPUT: True
OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
S3_ENDPOINT_URL: http://minio:9000
AWS_ACCESS_KEY_ID: minioadmin
AWS_SECRET_ACCESS_KEY: minioadmin
run: |
set +e
test_output=$(
docker run --env CI --env OPENAI_API_KEY \
--entrypoint poetry ${{ env.IMAGE_NAME }} run \
pytest -v --cov=autogpt --cov-branch --cov-report term-missing \
--numprocesses=4 --durations=10 \
tests/unit tests/integration 2>&1
)
test_failure=$?
docker run --env CI --env OPENAI_API_KEY \
--network container:minio \
--env S3_ENDPOINT_URL --env AWS_ACCESS_KEY_ID --env AWS_SECRET_ACCESS_KEY \
--entrypoint poetry ${{ env.IMAGE_NAME }} run \
pytest -v --cov=autogpt --cov-branch --cov-report term-missing \
--numprocesses=4 --durations=10 \
tests/unit tests/integration 2>&1 | tee test_output.txt
echo "$test_output"
test_failure=${PIPESTATUS[0]}
cat << $EOF >> $GITHUB_STEP_SUMMARY
# Tests $([ $test_failure = 0 ] && echo '✅' || echo '❌')
\`\`\`
$test_output
$(cat test_output.txt)
\`\`\`
$EOF
31 changes: 24 additions & 7 deletions autogpts/autogpt/.env.template
@@ -8,9 +8,32 @@ OPENAI_API_KEY=your-openai-api-key
## EXECUTE_LOCAL_COMMANDS - Allow local command execution (Default: False)
# EXECUTE_LOCAL_COMMANDS=False

### Workspace ###

## RESTRICT_TO_WORKSPACE - Restrict file operations to workspace ./data/agents/<agent_id>/workspace (Default: True)
# RESTRICT_TO_WORKSPACE=True

## DISABLED_COMMAND_CATEGORIES - The list of categories of commands that are disabled (Default: None)
# DISABLED_COMMAND_CATEGORIES=

## WORKSPACE_BACKEND - Choose a storage backend for workspace contents
## Options: local, gcs, s3
# WORKSPACE_BACKEND=local

## WORKSPACE_STORAGE_BUCKET - GCS/S3 Bucket to store workspace contents in
# WORKSPACE_STORAGE_BUCKET=autogpt

## GCS Credentials
# see https://cloud.google.com/storage/docs/authentication#libauth

## AWS/S3 Credentials
# see https://boto3.amazonaws.com/v1/documentation/api/latest/guide/configuration.html

## S3_ENDPOINT_URL - If you're using non-AWS S3, set your endpoint here.
# S3_ENDPOINT_URL=

### Miscellaneous ###

## USER_AGENT - Define the user-agent used by the requests library to browse website (string)
# USER_AGENT="Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.97 Safari/537.36"

@@ -29,12 +52,6 @@ OPENAI_API_KEY=your-openai-api-key
## EXIT_KEY - Key to exit AutoGPT
# EXIT_KEY=n

## PLAIN_OUTPUT - Plain output, which disables the spinner (Default: False)
# PLAIN_OUTPUT=False

## DISABLED_COMMAND_CATEGORIES - The list of categories of commands that are disabled (Default: None)
# DISABLED_COMMAND_CATEGORIES=

################################################################################
### LLM PROVIDER
################################################################################
@@ -201,5 +218,5 @@ OPENAI_API_KEY=your-openai-api-key
## Note: Log file output is disabled if LOG_FORMAT=structured_google_cloud.
# LOG_FILE_FORMAT=simple

## PLAIN_OUTPUT - Disables animated typing in the console output.
## PLAIN_OUTPUT - Disables animated typing and the spinner in the console output. (Default: False)
# PLAIN_OUTPUT=False
42 changes: 25 additions & 17 deletions autogpts/autogpt/autogpt/agents/features/file_workspace.py
@@ -5,11 +5,15 @@
if TYPE_CHECKING:
from pathlib import Path

from ..base import BaseAgent
from ..base import BaseAgent, Config

from autogpt.file_workspace import FileWorkspace
from autogpt.file_workspace import (
FileWorkspace,
FileWorkspaceBackendName,
get_workspace,
)

from ..base import AgentFileManager, BaseAgentConfiguration
from ..base import AgentFileManager, BaseAgentSettings


class FileWorkspaceMixin:
@@ -22,32 +26,36 @@ def __init__(self, **kwargs):
# Initialize other bases first, because we need the config from BaseAgent
super(FileWorkspaceMixin, self).__init__(**kwargs)

config: BaseAgentConfiguration = getattr(self, "config")
if not isinstance(config, BaseAgentConfiguration):
raise ValueError(
"Cannot initialize Workspace for Agent without compatible .config"
)
file_manager: AgentFileManager = getattr(self, "file_manager")
if not file_manager:
return

self.workspace = _setup_workspace(file_manager, config)
self._setup_workspace()

def attach_fs(self, agent_dir: Path):
res = super(FileWorkspaceMixin, self).attach_fs(agent_dir)

self.workspace = _setup_workspace(self.file_manager, self.config)
self._setup_workspace()

return res

def _setup_workspace(self) -> None:
settings: BaseAgentSettings = getattr(self, "state")
assert settings.agent_id, "Cannot attach workspace to anonymous agent"
app_config: Config = getattr(self, "legacy_config")
file_manager: AgentFileManager = getattr(self, "file_manager")

def _setup_workspace(file_manager: AgentFileManager, config: BaseAgentConfiguration):
workspace = FileWorkspace(
file_manager.root / "workspace",
restrict_to_root=not config.allow_fs_access,
)
workspace.initialize()
return workspace
ws_backend = app_config.workspace_backend
local = ws_backend == FileWorkspaceBackendName.LOCAL
workspace = get_workspace(
backend=ws_backend,
id=settings.agent_id if not local else "",
root_path=file_manager.root / "workspace" if local else None,
)
if local and settings.config.allow_fs_access:
workspace._restrict_to_root = False # type: ignore
workspace.initialize()
self.workspace = workspace


def get_agent_workspace(agent: BaseAgent) -> FileWorkspace | None:
47 changes: 30 additions & 17 deletions autogpts/autogpt/autogpt/app/agent_protocol_server.py
@@ -33,7 +33,11 @@
from autogpt.commands.user_interaction import ask_user
from autogpt.config import Config
from autogpt.core.resource.model_providers import ChatModelProvider
from autogpt.file_workspace import FileWorkspace
from autogpt.file_workspace import (
FileWorkspace,
FileWorkspaceBackendName,
get_workspace,
)
from autogpt.models.action_history import ActionErrorResult, ActionSuccessResult

logger = logging.getLogger(__name__)
@@ -340,7 +344,7 @@ async def create_artifact(
else:
file_path = os.path.join(relative_path, file_name)

workspace = get_task_agent_file_workspace(task_id, self.agent_manager)
workspace = self._get_task_agent_file_workspace(task_id, self.agent_manager)
await workspace.write_file(file_path, data)

artifact = await self.db.create_artifact(
@@ -361,7 +365,7 @@ async def get_artifact(self, task_id: str, artifact_id: str) -> Artifact:
file_path = os.path.join(artifact.relative_path, artifact.file_name)
else:
file_path = artifact.relative_path
workspace = get_task_agent_file_workspace(task_id, self.agent_manager)
workspace = self._get_task_agent_file_workspace(task_id, self.agent_manager)
retrieved_artifact = workspace.read_file(file_path, binary=True)
except NotFoundError:
raise
@@ -376,24 +380,33 @@ async def get_artifact(self, task_id: str, artifact_id: str) -> Artifact:
},
)

def _get_task_agent_file_workspace(
self,
task_id: str | int,
agent_manager: AgentManager,
) -> FileWorkspace:
use_local_ws = (
self.app_config.workspace_backend == FileWorkspaceBackendName.LOCAL
)
agent_id = task_agent_id(task_id)
workspace = get_workspace(
backend=self.app_config.workspace_backend,
id=agent_id if not use_local_ws else "",
root_path=agent_manager.get_agent_dir(
agent_id=agent_id,
must_exist=True,
)
/ "workspace"
if use_local_ws
else None,
)
workspace.initialize()
return workspace


def task_agent_id(task_id: str | int) -> str:
return f"AutoGPT-{task_id}"


def get_task_agent_file_workspace(
task_id: str | int,
agent_manager: AgentManager,
) -> FileWorkspace:
return FileWorkspace(
root=agent_manager.get_agent_dir(
agent_id=task_agent_id(task_id),
must_exist=True,
)
/ "workspace",
restrict_to_root=True,
)


def fmt_kwargs(kwargs: dict) -> str:
return ", ".join(f"{n}={repr(v)}" for n, v in kwargs.items())
10 changes: 10 additions & 0 deletions autogpts/autogpt/autogpt/config/config.py
@@ -20,6 +20,7 @@
OPEN_AI_CHAT_MODELS,
OpenAICredentials,
)
from autogpt.file_workspace import FileWorkspaceBackendName
from autogpt.logs.config import LoggingConfig
from autogpt.plugins.plugins_config import PluginsConfig
from autogpt.speech import TTSConfig
@@ -51,10 +52,19 @@ class Config(SystemSettings, arbitrary_types_allowed=True):
chat_messages_enabled: bool = UserConfigurable(
default=True, from_env=lambda: os.getenv("CHAT_MESSAGES_ENABLED") == "True"
)

# TTS configuration
tts_config: TTSConfig = TTSConfig()
logging: LoggingConfig = LoggingConfig()

# Workspace
workspace_backend: FileWorkspaceBackendName = UserConfigurable(
default=FileWorkspaceBackendName.LOCAL,
from_env=lambda: FileWorkspaceBackendName(v)
if (v := os.getenv("WORKSPACE_BACKEND"))
else None,
)

##########################
# Agent Control Settings #
##########################
14 changes: 0 additions & 14 deletions autogpts/autogpt/autogpt/core/resource/model_providers/schema.py
@@ -172,24 +172,10 @@ class ModelProviderCredentials(ProviderCredentials):
api_version: SecretStr | None = UserConfigurable(default=None)
deployment_id: SecretStr | None = UserConfigurable(default=None)

def unmasked(self) -> dict:
return unmask(self)

class Config:
extra = "ignore"


def unmask(model: BaseModel):
unmasked_fields = {}
for field_name, field in model.__fields__.items():
value = getattr(model, field_name)
if isinstance(value, SecretStr):
unmasked_fields[field_name] = value.get_secret_value()
else:
unmasked_fields[field_name] = value
return unmasked_fields


class ModelProviderUsage(ProviderUsage):
"""Usage for a particular model from a model provider."""

16 changes: 15 additions & 1 deletion autogpts/autogpt/autogpt/core/resource/schema.py
@@ -1,7 +1,7 @@
import abc
import enum

from pydantic import SecretBytes, SecretField, SecretStr
from pydantic import BaseModel, SecretBytes, SecretField, SecretStr

from autogpt.core.configuration import (
SystemConfiguration,
@@ -39,6 +39,9 @@ def update_usage_and_cost(self, *args, **kwargs) -> None:
class ProviderCredentials(SystemConfiguration):
"""Struct for credentials."""

def unmasked(self) -> dict:
return unmask(self)

class Config:
json_encoders = {
SecretStr: lambda v: v.get_secret_value() if v else None,
@@ -47,6 +50,17 @@ class Config:
}


def unmask(model: BaseModel):
unmasked_fields = {}
for field_name, _ in model.__fields__.items():
value = getattr(model, field_name)
if isinstance(value, SecretStr):
unmasked_fields[field_name] = value.get_secret_value()
else:
unmasked_fields[field_name] = value
return unmasked_fields


class ProviderSettings(SystemSettings):
resource_type: ResourceType
credentials: ProviderCredentials | None = None