Atoma node deployment
Deploying an Atoma node
Quickstart
Clone the repository
```bash
git clone https://github.com/atoma-network/atoma-node.git
cd atoma-node
```

Configure environment variables by creating a `.env` file:
Start by running
```bash
cp .env.example .env
```

You should then see a file of the form:

```bash
# Hugging Face Configuration
HF_CACHE_PATH=~/.cache/huggingface
HF_TOKEN= # Required for gated models
# Inference Server Configuration
INFERENCE_SERVER_PORT=50000 # External port for vLLM service
MODEL=meta-llama/Llama-3.1-70B-Instruct
MAX_MODEL_LEN=4096 # Context length
GPU_COUNT=1 # Number of GPUs to use
TENSOR_PARALLEL_SIZE=1 # Should be equal to GPU_COUNT
# Sui Configuration
SUI_CONFIG_PATH=~/.sui/sui_config
# Atoma Node Service Configuration
ATOMA_SERVICE_PORT=3000 # External port for Atoma service
```

You need to fill in the `HF_TOKEN` variable with your Hugging Face API key. See the official [website](https://huggingface.co/docs/hub/security-tokens) for more information on how to set up an HF API key.
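If you prefer to fill it in from the command line, you can edit the existing `HF_TOKEN` line in place. The token value below is a placeholder, and `sed -i` as written assumes GNU sed:

```bash
# Replace the placeholder with your own Hugging Face token (hf_xxx... is not a real token)
sed -i 's|^HF_TOKEN=.*|HF_TOKEN=hf_xxxxxxxxxxxxxxxxxxxx|' .env
```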
Configure `config.toml`, using `config.example.toml` as a template, by running:
```bash
cp config.example.toml config.toml
```

You should now have a `config.toml` file with the following contents:

```toml
[atoma-service]
inference_service_url = "http://vllm:8000" # Internal Docker network URL for inference service
embeddings_service_url = ""
multimodal_service_url = ""
models = ["meta-llama/Llama-3.1-70B-Instruct"] # Replace it with the list of models you want to deploy
revisions = [""]
service_bind_address = "0.0.0.0:3000" # Bind to all interfaces
[atoma-sui]
http_rpc_node_addr = ""
atoma_db = ""
atoma_package_id = ""
toma_package_id = ""
request_timeout = { secs = 300, nanos = 0 }
max_concurrent_requests = 10
limit = 100
node_small_ids = [0, 1, 2] # List of node IDs under control
task_small_ids = [] # List of task IDs under control
sui_config_path = "~/.sui/sui_config/client.yaml"
sui_keystore_path = "~/.sui/sui_config/sui.keystore"
[atoma-state]
database_url = "sqlite:///app/data/atoma.db"
```

You can set up multiple services for your node, such as inference, embeddings, and multimodal, by setting each service's URL.
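As a quick sanity check before starting the containers, you can confirm that `config.toml` parses and see which service URLs are set. This sketch assumes Python 3.11+ (for the standard-library `tomllib` module) is available on the host:

```bash
# Parse config.toml and print the configured service URLs (requires Python 3.11+)
python3 - <<'EOF'
import tomllib

with open("config.toml", "rb") as f:
    config = tomllib.load(f)

for key in ("inference_service_url", "embeddings_service_url", "multimodal_service_url"):
    print(f"{key} = {config['atoma-service'].get(key, '')!r}")
EOF
```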
Create required directories
```bash
mkdir -p data logs
```

Start the containers
If you plan to run a chat completions service:
```bash
# Build and start all services
COMPOSE_PROFILES=chat_completions_vllm docker compose up --build

# Or run in detached mode
COMPOSE_PROFILES=chat_completions_vllm docker compose up -d --build
```

For text embeddings:
```bash
# Build and start all services
COMPOSE_PROFILES=embeddings_tei docker compose up --build

# Or run in detached mode
COMPOSE_PROFILES=embeddings_tei docker compose up -d --build
```

For image generation:
```bash
# Build and start all services
COMPOSE_PROFILES=image_generations_mistral docker compose up --build

# Or run in detached mode
COMPOSE_PROFILES=image_generations_mistral docker compose up -d --build
```

It is possible to run any combination of the above, if a node has enough GPU compute available. For example, to run all services simultaneously, simply run:
```bash
# Build and start all services
COMPOSE_PROFILES=chat_completions_vllm,embeddings_tei,image_generations_mistral docker compose up --build

# Or run in detached mode
COMPOSE_PROFILES=chat_completions_vllm,embeddings_tei,image_generations_mistral docker compose up -d --build
```

Container Architecture
The deployment consists of two main services:
vLLM Service: Handles the AI model inference
Atoma Node: Manages the node operations and connects to the Atoma Network
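To see which services are actually enabled for your chosen profiles, you can ask Compose to list them. The profile name below is taken from the chat completions example above; adjust it to match your deployment:

```bash
# List the services defined in the compose file for the selected profile(s)
COMPOSE_PROFILES=chat_completions_vllm docker compose config --services
```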
Service URLs
vLLM Service: `http://localhost:50000` (configured via INFERENCE_SERVER_PORT)
Atoma Node: `http://localhost:3000` (configured via ATOMA_SERVICE_PORT)
Volume Mounts
HuggingFace cache: `~/.cache/huggingface:/root/.cache/huggingface`
Sui configuration: `~/.sui/sui_config:/root/.sui/sui_config`
Logs: `./logs:/app/logs`
SQLite database: `./data:/app/data`
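Before the first start, you can confirm that these host-side paths exist so the bind mounts succeed. The paths assume the defaults from `.env` and the directories created above:

```bash
# Check that the host-side directories backing the bind mounts are present
ls -ld ~/.cache/huggingface ~/.sui/sui_config ./logs ./data
```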
Managing the Deployment
Check service status:
```bash
docker compose ps
```

View logs:

```bash
# All services
docker compose logs
# Specific service
docker compose logs atoma-node
docker compose logs vllm
# Follow logs
docker compose logs -f
```

Stop services:
```bash
docker compose down
```

Troubleshooting
Check if services are running:
```bash
docker compose ps
```

Test vLLM service:

```bash
curl http://localhost:50000/health
```

Test Atoma Node service:

```bash
curl http://localhost:3000/health
```

Check GPU availability:

```bash
docker compose exec vllm nvidia-smi
```

View container networks:

```bash
docker network ls
docker network inspect atoma-network
```

Security Considerations
Firewall Configuration
```bash
# Allow Atoma service port
sudo ufw allow 3000/tcp

# Allow vLLM service port
sudo ufw allow 50000/tcp
```
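After adding the rules, you can verify they are active (this assumes `ufw` is the firewall in use, as in the commands above):

```bash
# Show the currently active firewall rules
sudo ufw status
```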
HuggingFace Token

Store HF_TOKEN in the `.env` file
Never commit the `.env` file to version control
Consider using Docker secrets for production deployments (see the sketch below)
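A minimal sketch of what that could look like, assuming the node runs under Docker Swarm (plain Compose supports file-based secrets instead; `hf_token` is a name chosen here for illustration):

```bash
# Create a Docker secret from the token already exported in your shell (Swarm mode only)
printf '%s' "$HF_TOKEN" | docker secret create hf_token -
```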
Sui Configuration
Ensure Sui configuration files have appropriate permissions (see the example below)
Keep the keystore file secure and never commit it to version control
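For example, a common hardening step is to restrict permissions on the default configuration directory used above (adjust the paths if yours differ):

```bash
# Make the Sui config directory and keystore readable only by your user
chmod 700 ~/.sui/sui_config
chmod 600 ~/.sui/sui_config/sui.keystore
```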