Deploying an Atoma node
Quickstart
Clone the repository
git clone https://github.com/atoma-network/atoma-node.git
cd atoma-node
Configure environment variables by creating a .env file
Start by running:
cp .env.example .env
You should then see a file of the form:
# Hugging Face Configuration
HF_CACHE_PATH=~/.cache/huggingface
HF_TOKEN= # Required for gated models
# Inference Server Configuration
INFERENCE_SERVER_PORT=50000 # External port for vLLM service
MODEL=meta-llama/Llama-3.1-70B-Instruct
MAX_MODEL_LEN=4096 # Context length
GPU_COUNT=1 # Number of GPUs to use
TENSOR_PARALLEL_SIZE=1 # Should be equal to GPU_COUNT
# Sui Configuration
SUI_CONFIG_PATH=~/.sui/sui_config
# Atoma Node Service Configuration
ATOMA_SERVICE_PORT=3000 # External port for Atoma service
You need to fill in the HF_TOKEN variable with your Hugging Face API key. See the official [website](https://huggingface.co/docs/hub/security-tokens) for more information on how to create an HF API key.
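To confirm the token is valid before starting the services, you can query the Hugging Face whoami endpoint. This is just a quick sanity check, assuming HF_TOKEN is exported in your shell (otherwise substitute the token directly):
# Verify the Hugging Face token (should return your account details)
curl -s -H "Authorization: Bearer $HF_TOKEN" https://huggingface.co/api/whoami-v2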
Configure config.toml, using config.example.toml as a template, by running:
cp config.example.toml config.toml
You should now have a config.toml file with the following contents:
[atoma-service]
inference_service_url = "http://vllm:8000" # Internal Docker network URL for inference service
embeddings_service_url = ""
multimodal_service_url = ""
models = ["meta-llama/Llama-3.1-70B-Instruct"] # Replace it with the list of models you want to deploy
revisions = [""]
service_bind_address = "0.0.0.0:3000" # Bind to all interfaces
[atoma-sui]
http_rpc_node_addr = ""
atoma_db = ""
atoma_package_id = ""
toma_package_id = ""
request_timeout = { secs = 300, nanos = 0 }
max_concurrent_requests = 10
limit = 100
node_small_ids = [0, 1, 2] # List of node IDs under control
task_small_ids = [] # List of task IDs under control
sui_config_path = "~/.sui/sui_config/client.yaml"
sui_keystore_path = "~/.sui/sui_config/sui.keystore"
[atoma-state]
database_url = "sqlite:///app/data/atoma.db"
You can enable multiple services for your node, such as inference, embeddings, and multimodal generation, by setting the corresponding service URL for each.
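The [atoma-sui] section points at your local Sui client configuration and keystore. If the Sui CLI is installed, you can confirm which address and environment the node will use, as a quick sanity check before starting the containers:
# Show the active Sui address and environment
sui client active-address
sui client active-env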
Create required directories
mkdir -p data logs
Start the containers
If you plan to run a chat completions service:
# Build and start all services
COMPOSE_PROFILES=chat_completions_vllm docker compose up --build
# Or run in detached mode
COMPOSE_PROFILES=chat_completions_vllm docker compose up -d --build
For text embeddings:
# Build and start all services
COMPOSE_PROFILES=embeddings_tei docker compose up --build
# Or run in detached mode
COMPOSE_PROFILES=embeddings_tei docker compose up -d --build
For image generation:
# Build and start all services
COMPOSE_PROFILES=image_generations_mistral docker compose up --build
# Or run in detached mode
COMPOSE_PROFILES=image_generations_mistral docker compose up -d --build
It is possible to run any combination of the above, provided the node has enough GPU compute available. For example, to run all services simultaneously, simply run:
# Build and start all services
COMPOSE_PROFILES=chat_completions_vllm,embeddings_tei,image_generations_mistral docker compose up --build
# Or run in detached mode
COMPOSE_PROFILES=chat_completions_vllm,embeddings_tei,image_generations_mistral docker compose up -d --build
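All of the profiles above expect GPU access from inside Docker. Before starting them, it can be worth confirming that the NVIDIA Container Toolkit is working; the CUDA image tag below is only an example, any CUDA base image will do:
# Confirm that containers can see the GPUs
docker run --rm --gpus all nvidia/cuda:12.4.1-base-ubuntu22.04 nvidia-smi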
Container Architecture
The deployment consists of two main services:
vLLM Service: Handles the AI model inference
Atoma Node: Manages the node operations and connects to the Atoma Network
Service URLs
vLLM Service: http://localhost:50000 (configured via INFERENCE_SERVER_PORT)
Atoma Node: http://localhost:3000 (configured via ATOMA_SERVICE_PORT)
Volume Mounts
HuggingFace cache:
~/.cache/huggingface:/root/.cache/huggingface
Sui configuration:
~/.sui/sui_config:/root/.sui/sui_config
Logs:
./logs:/app/logs
SQLite database:
./data:/app/data
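Model weights downloaded into the HuggingFace cache can be very large (well over 100 GB for 70B-class models), so it is worth keeping an eye on the disk usage of the mounted directories:
# Check how much space the mounted directories use
du -sh ~/.cache/huggingface ./data ./logs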
Managing the Deployment
Check service status:
docker compose ps
View logs:
# All services
docker compose logs
# Specific service
docker compose logs atoma-node
docker compose logs vllm
# Follow logs
docker compose logs -f
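When debugging a specific issue, it is often quicker to filter the logs than to read them in full, for example:
# Follow the Atoma node logs and surface only error lines
docker compose logs -f atoma-node | grep -i error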
Stop services:
docker compose down
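After changing .env or config.toml, you can rebuild and recreate a single service rather than the whole stack (the service name here matches the compose service used above):
# Rebuild and recreate only the Atoma node service
docker compose up -d --build atoma-node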
Troubleshooting
Check if services are running:
docker compose ps
Test vLLM service:
curl http://localhost:50000/health
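For a deeper check than the health endpoint, you can send a minimal chat completion request straight to the inference service. This assumes vLLM is exposing its OpenAI-compatible API on the configured port and that the model name matches your config.toml:
curl http://localhost:50000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "meta-llama/Llama-3.1-70B-Instruct", "messages": [{"role": "user", "content": "Hello"}], "max_tokens": 16}'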
Test Atoma Node service:
curl http://localhost:3000/health
Check GPU availability:
docker compose exec vllm nvidia-smi
View container networks:
docker network ls
docker network inspect atoma-network
Security Considerations
Firewall Configuration
# Allow Atoma service port
sudo ufw allow 3000/tcp
# Allow vLLM service port
sudo ufw allow 50000/tcp
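If the node does not need to be reachable from arbitrary hosts, consider restricting these rules to trusted addresses instead (the IP below is a placeholder, substitute your own):
# Allow the Atoma service port only from a trusted host
sudo ufw allow from 203.0.113.10 to any port 3000 proto tcp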
HuggingFace Token
Store HF_TOKEN in .env file
Never commit .env file to version control
Consider using Docker secrets for production deployments
Sui Configuration
Ensure Sui configuration files have appropriate permissions
Keep keystore file secure and never commit to version control
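For example, a conservative set of permissions for the Sui configuration directory and keystore (adjust the paths if yours differ):
# Restrict the Sui config directory and keystore to the current user
chmod 700 ~/.sui/sui_config
chmod 600 ~/.sui/sui_config/sui.keystore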