Configure environment variables by creating a .env file:
Start by running
```bash
cp .env.example .env
```
You should then see a file of the form:
```
# Hugging Face Configuration
HF_CACHE_PATH=~/.cache/huggingface
HF_TOKEN= # Required for gated models

# Inference Server Configuration
INFERENCE_SERVER_PORT=50000 # External port for vLLM service
MODEL=meta-llama/Llama-3.1-70B-Instruct
MAX_MODEL_LEN=4096 # Context length
GPU_COUNT=1 # Number of GPUs to use
TENSOR_PARALLEL_SIZE=1 # Should be equal to GPU_COUNT

# Sui Configuration
SUI_CONFIG_PATH=~/.sui/sui_config

# Atoma Node Service Configuration
ATOMA_SERVICE_PORT=3000 # External port for Atoma service
```
You need to fill in the HF_TOKEN variable with your Hugging Face API token. See the official [documentation](https://huggingface.co/docs/hub/security-tokens) for more information on how to create one.
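Optionally, you can confirm that the token works before starting the containers. The snippet below is a quick sanity check, assuming `curl` is installed: it loads the variables from `.env` into your shell and asks the Hugging Face `whoami-v2` endpoint which account the token belongs to.

```bash
# Export the variables defined in .env into the current shell
set -a && source .env && set +a

# A valid token returns a JSON document describing your account; an invalid one returns an error
curl -s -H "Authorization: Bearer $HF_TOKEN" https://huggingface.co/api/whoami-v2
```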
Configure config.toml, using config.example.toml as a template, by running:
```bash
cp config.example.toml config.toml
```
You should now have a config.toml file with the following contents:
```toml
[atoma-service]
inference_service_url = "http://vllm:8000" # Internal Docker network URL for inference service
embeddings_service_url = ""
multimodal_service_url = ""
models = ["meta-llama/Llama-3.1-70B-Instruct"] # Replace it with the list of models you want to deploy
revisions = [""]
service_bind_address = "0.0.0.0:3000" # Bind to all interfaces

[atoma-sui]
http_rpc_node_addr = ""
atoma_db = ""
atoma_package_id = ""
toma_package_id = ""
request_timeout = { secs = 300, nanos = 0 }
max_concurrent_requests = 10
limit = 100
node_small_ids = [0, 1, 2] # List of node IDs under control
task_small_ids = [] # List of task IDs under control
sui_config_path = "~/.sui/sui_config/client.yaml"
sui_keystore_path = "~/.sui/sui_config/sui.keystore"

[atoma-state]
database_url = "sqlite:///app/data/atoma.db"
```
You can run multiple services on the same node, such as inference (chat completions), embeddings, and multi-modal, by setting the corresponding service URLs in config.toml, as sketched below.
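For example, a node that also serves embeddings and multi-modal requests would fill in the two URLs that are left empty above. The host names and ports below are placeholders, not the actual values; use the service names and internal ports defined in your docker-compose.yml.

```toml
# Placeholder host names and ports -- replace with the values from your docker-compose.yml
embeddings_service_url = "http://embeddings:80"
multimodal_service_url = "http://multimodal:80"
```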
Create required directories
```bash
mkdir -p data logs
```
Start the containers
If you plan to run a chat completions service:
```bash
# Build and start all services
COMPOSE_PROFILES=chat_completions_vllm docker compose up --build

# Or run in detached mode
COMPOSE_PROFILES=chat_completions_vllm docker compose up -d --build
```
For text embeddings:
```bash
# Build and start all services
COMPOSE_PROFILES=embeddings_tei docker compose up --build

# Or run in detached mode
COMPOSE_PROFILES=embeddings_tei docker compose up -d --build
```
For image generation:
```bash
# Build and start all services
COMPOSE_PROFILES=image_generations_mistral docker compose up --build

# Or run in detached mode
COMPOSE_PROFILES=image_generations_mistral docker compose up -d --build
```
It is possible to run any combination of the above profiles, provided the node has enough GPU compute available. For example, to run all three services simultaneously, run:
```bash
# Build and start all services
COMPOSE_PROFILES=chat_completions_vllm,embeddings_tei,image_generations_mistral docker compose up --build

# Or run in detached mode
COMPOSE_PROFILES=chat_completions_vllm,embeddings_tei,image_generations_mistral docker compose up -d --build
```
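After the containers come up, you can verify their status and watch the logs with standard Docker Compose commands:

```bash
# List the running services and the ports they publish
docker compose ps

# Follow the logs of the running services (Ctrl+C to stop following)
docker compose logs -f
```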
Container Architecture
The deployment consists of two main services:
vLLM Service: Handles the AI model inference
Atoma Node: Manages the node operations and connects to the Atoma Network
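The two containers communicate over the internal Docker network, which is why config.toml points inference_service_url at http://vllm:8000 rather than at a localhost port. As a rough check, you can curl the vLLM health endpoint from inside the node container; the service name `atoma-node` and the presence of `curl` inside that container are assumptions, so adjust to whatever your docker-compose.yml defines.

```bash
# "atoma-node" is a hypothetical service name; vLLM's /health endpoint returns HTTP 200 once the model is loaded
docker compose exec atoma-node curl -s -o /dev/null -w "%{http_code}\n" http://vllm:8000/health
```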
Service URLs
vLLM Service: http://localhost:50000 (configured via INFERENCE_SERVER_PORT)
Atoma Node: http://localhost:3000 (configured via ATOMA_SERVICE_PORT)
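From the host, you can reach both services on their published ports. The vLLM container exposes an OpenAI-compatible API, so listing the loaded models is a convenient smoke test; the /health path on the Atoma node is an assumption, so check the Atoma node API reference for its actual endpoints. Adjust the ports if you changed INFERENCE_SERVER_PORT or ATOMA_SERVICE_PORT.

```bash
# vLLM: list the models the inference server is currently serving (OpenAI-compatible API)
curl -s http://localhost:50000/v1/models

# Atoma node: print the HTTP status code of a hypothetical /health endpoint
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:3000/health
```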