# Atoma node deployment

**Quickstart**

1. Clone the repository

```bash
git clone https://github.com/atoma-network/atoma-node.git
cd atoma-node
```

2. Configure environment variables by creating a `.env` file.

Start by copying the example file:

```bash
cp .env.example .env
```

The resulting `.env` file should look like this:

```bash
# Hugging Face Configuration
HF_CACHE_PATH=~/.cache/huggingface
HF_TOKEN=   # Required for gated models

# Inference Server Configuration
INFERENCE_SERVER_PORT=50000    # External port for vLLM service
MODEL=meta-llama/Llama-3.1-70B-Instruct
MAX_MODEL_LEN=4096            # Context length
GPU_COUNT=1                   # Number of GPUs to use
TENSOR_PARALLEL_SIZE=1        # Should be equal to GPU_COUNT

# Sui Configuration
SUI_CONFIG_PATH=~/.sui/sui_config

# Atoma Node Service Configuration
ATOMA_SERVICE_PORT=3000       # External port for Atoma service
```

Set the `HF_TOKEN` variable to your Hugging Face API key. See the official [Hugging Face documentation](https://huggingface.co/docs/hub/security-tokens) for more information on how to create one.
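Before starting the containers, it can help to check the two `.env` constraints mentioned above: `HF_TOKEN` must be filled in, and `TENSOR_PARALLEL_SIZE` should equal `GPU_COUNT`. The following is a minimal sketch; the helper names `env_value` and `check_env` are illustrative and are not part of atoma-node. It assumes values contain no `#` character, since everything from `#` onward is treated as a comment.

```bash
# Print the value of KEY from an env file (default: .env),
# dropping inline comments and trailing whitespace.
# Assumption: values do not contain '#'.
env_value() {
  local key="$1" file="${2:-.env}"
  sed -n "s/^${key}=//p" "$file" | head -n 1 \
    | sed -e 's/[[:space:]]*#.*$//' -e 's/[[:space:]]*$//'
}

# Return non-zero if HF_TOKEN is empty or the GPU settings disagree.
check_env() {
  local file="${1:-.env}" status=0
  if [ -z "$(env_value HF_TOKEN "$file")" ]; then
    echo "HF_TOKEN is empty in $file" >&2
    status=1
  fi
  if [ "$(env_value GPU_COUNT "$file")" != "$(env_value TENSOR_PARALLEL_SIZE "$file")" ]; then
    echo "TENSOR_PARALLEL_SIZE should equal GPU_COUNT in $file" >&2
    status=1
  fi
  return "$status"
}
```

Running `check_env .env` from the repository root prints a message for each problem it finds and returns a non-zero status if there is at least one.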

3. Configure `config.toml`, using `config.example.toml` as a template, by running:

```bash
cp config.example.toml config.toml
```

You should now have a `config.toml` file with the following contents:

```toml
[atoma-service]
inference_service_url = "http://vllm:8000"    # Internal Docker network URL for inference service
embeddings_service_url = ""
multimodal_service_url = ""
models = ["meta-llama/Llama-3.1-70B-Instruct"] # Replace it with the list of models you want to deploy
revisions = [""]
service_bind_address = "0.0.0.0:3000"         # Bind to all interfaces

[atoma-sui]
http_rpc_node_addr = ""
atoma_db = ""
atoma_package_id = ""
toma_package_id = ""
request_timeout = { secs = 300, nanos = 0 }
max_concurrent_requests = 10
limit = 100
node_small_ids = [0, 1, 2]  # List of node IDs under control
task_small_ids = []         # List of task IDs under control
sui_config_path = "~/.sui/sui_config/client.yaml"
sui_keystore_path = "~/.sui/sui_config/sui.keystore"

[atoma-state]
database_url = "sqlite:///app/data/atoma.db"
```

You can run multiple services on your node, such as inference, embeddings and multimodal, by setting the corresponding URL (`inference_service_url`, `embeddings_service_url`, `multimodal_service_url`) in `config.toml`.
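To see at a glance which of the three service URLs are actually set, a small shell helper can scan `config.toml` for non-empty values. This is a sketch only: `configured_services` is a hypothetical helper, and it assumes each key sits on its own line with a double-quoted value, as in the example file.

```bash
# Print the *_service_url keys in a config file (default: config.toml)
# whose quoted value is non-empty.
configured_services() {
  local file="${1:-config.toml}"
  grep -E '^(inference|embeddings|multimodal)_service_url[[:space:]]*=[[:space:]]*"[^"]+"' "$file" \
    | sed -E 's/[[:space:]]*=.*$//'
}
```

With the example configuration shown earlier, `configured_services config.toml` lists only `inference_service_url`, because the other two URLs are empty strings.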

4. Create required directories

```bash
mkdir -p data logs
```

5. Start the containers

If you plan to run a chat completions service:

```bash
# Build and start all services
COMPOSE_PROFILES=chat_completions_vllm docker compose up --build

# Or run in detached mode
COMPOSE_PROFILES=chat_completions_vllm docker compose up -d --build
```

For text embeddings:

```bash
# Build and start all services
COMPOSE_PROFILES=embeddings_tei docker compose up --build

# Or run in detached mode
COMPOSE_PROFILES=embeddings_tei docker compose up -d --build
```

For image generation:

```bash
# Build and start all services
COMPOSE_PROFILES=image_generations_mistral docker compose up --build

# Or run in detached mode
COMPOSE_PROFILES=image_generations_mistral docker compose up -d --build
```

You can run any combination of the above if the node has enough GPU compute available. For example, to run all services simultaneously:

```bash
# Build and start all services
COMPOSE_PROFILES=chat_completions_vllm,embeddings_tei,image_generations_mistral docker compose up --build

# Or run in detached mode
COMPOSE_PROFILES=chat_completions_vllm,embeddings_tei,image_generations_mistral docker compose up -d --build
```
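Since `COMPOSE_PROFILES` is a comma-separated list, a typo in a profile name is easy to miss. The sketch below builds the value from individual names and rejects anything other than the three profiles mentioned in this guide. The function name `compose_profiles` and the `KNOWN_PROFILES` variable are illustrative; adjust the list if your compose file defines other profiles.

```bash
# Profiles named in this guide (assumption: these match your compose file).
KNOWN_PROFILES="chat_completions_vllm embeddings_tei image_generations_mistral"

# Join the given profile names with commas, failing on unknown names.
compose_profiles() {
  local profile joined=""
  for profile in "$@"; do
    case " $KNOWN_PROFILES " in
      *" $profile "*) joined="${joined:+$joined,}$profile" ;;
      *) echo "unknown profile: $profile" >&2; return 1 ;;
    esac
  done
  echo "$joined"
}
```

For example, `COMPOSE_PROFILES="$(compose_profiles chat_completions_vllm embeddings_tei)" docker compose up -d --build` starts the chat completions and embeddings services together.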

**Container Architecture**

The deployment consists of two main services:

* **vLLM Service**: Handles the AI model inference
* **Atoma Node**: Manages the node operations and connects to the Atoma Network

**Service URLs**

* vLLM Service: `http://localhost:50000` (configured via `INFERENCE_SERVER_PORT`)
* Atoma Node: `http://localhost:3000` (configured via `ATOMA_SERVICE_PORT`)

**Volume Mounts**

* HuggingFace cache: `~/.cache/huggingface:/root/.cache/huggingface`
* Sui configuration: `~/.sui/sui_config:/root/.sui/sui_config`
* Logs: `./logs:/app/logs`
* SQLite database: `./data:/app/data`
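
The host side of each mount should exist before the containers start. As a quick pre-flight check, the following sketch reports any host directory that is missing; `missing_mount_dirs` is a hypothetical helper name.

```bash
# Print every argument that is not an existing directory.
missing_mount_dirs() {
  local dir
  for dir in "$@"; do
    if [ ! -d "$dir" ]; then
      echo "$dir"
    fi
  done
}
```

Calling `missing_mount_dirs "$HOME/.cache/huggingface" "$HOME/.sui/sui_config" ./logs ./data` prints nothing when all four host paths are in place.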

**Managing the Deployment**

Check service status:

```bash
docker compose ps
```

View logs:

```bash
# All services
docker compose logs

# Specific service
docker compose logs atoma-node
docker compose logs vllm

# Follow logs
docker compose logs -f
```

Stop services:

```bash
docker compose down
```

**Troubleshooting**

1. Check if services are running:

```bash
docker compose ps
```

2. Test vLLM service:

```bash
curl http://localhost:50000/health
```

3. Test Atoma Node service:

```bash
curl http://localhost:3000/health
```

4. Check GPU availability:

```bash
docker compose exec vllm nvidia-smi
```

5. View container networks:

```bash
docker network ls
docker network inspect atoma-network
```
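
Large models can take a while to load, so the health checks in steps 2 and 3 may fail at first even when nothing is wrong. A minimal sketch of a retry loop is shown below; `retry_until` is an illustrative helper, and the attempt count and delay in the usage example are arbitrary starting points.

```bash
# Run a command repeatedly until it succeeds or the attempts run out.
# Usage: retry_until <max_attempts> <delay_seconds> <command> [args...]
retry_until() {
  local attempts="$1" delay="$2" i=1
  shift 2
  while :; do
    if "$@"; then
      return 0
    fi
    if [ "$i" -ge "$attempts" ]; then
      return 1
    fi
    i=$((i + 1))
    sleep "$delay"
  done
}
```

For example, `retry_until 30 10 curl -fsS http://localhost:50000/health` polls the vLLM health endpoint every 10 seconds, up to 30 times, and returns a non-zero status if it never responds successfully.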

**Security Considerations**

1. Firewall Configuration

```bash
# Allow Atoma service port
sudo ufw allow 3000/tcp

# Allow vLLM service port
sudo ufw allow 50000/tcp
```

2. HuggingFace Token

* Store `HF_TOKEN` in the `.env` file
* Never commit the `.env` file to version control
* Consider using Docker secrets for production deployments

3. Sui Configuration

* Ensure Sui configuration files have appropriate permissions
* Keep the keystore file secure and never commit it to version control
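
One way to apply the permission advice above is to restrict secret files to their owner. The sketch below wraps `chmod 600` with an existence check; `restrict_file` is a hypothetical helper, and owner-only read/write is an assumption about what "appropriate permissions" means for your setup.

```bash
# Restrict a secrets file to owner read/write only.
restrict_file() {
  local file="$1"
  if [ ! -f "$file" ]; then
    echo "not found: $file" >&2
    return 1
  fi
  chmod 600 "$file"
}
```

Typical targets would be `restrict_file .env` and `restrict_file "$HOME/.sui/sui_config/sui.keystore"`. To confirm that `.env` is excluded from version control, `git check-ignore -q .env` returns a zero status when the file is ignored.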


