Atoma Node
Atoma Nodes are the backbone of Atoma's decentralized compute layer. They are responsible for executing AI workloads and providing compute resources to the network. Nodes are rewarded with native TOMA tokens for their contributions to the network.
In this section, we explain how to set up a node and connect it to the Atoma Network, so anyone with available computing resources can participate.
Note: The Atoma Node is currently under development, and the following documentation applies only to Atoma's alpha release.
Requirements
Have Rust and Cargo installed. For more details, please consult here.
Have a machine with either one or more NVIDIA GPUs, or any MacBook Pro that supports Metal.
For NVIDIA GPUs, it is recommended to have at least CUDA 12.1 installed.
To use optimized CUDA kernels with Flash Attention2 support, it is recommended to have an NVIDIA GPU with the Ampere architecture or newer (see more details below).
It is possible to run the Atoma Node relying solely on a CPU, although performance will likely be far from optimal.
Clone Atoma's node repository.
Have a Hugging Face API key, to download the models used for inference.
For IPFS compatibility, it is recommended to have a local IPFS node running (see more details below).
For Gateway compatibility, it is recommended to create a Gateway account and have access to a valid API key (see more details here).
To support the current Atoma native UI application, it is recommended to have a Supabase account and access to a valid API key (see more details here).
Have a Sui wallet with some SUI tokens in it. Please visit the official Sui CLI website to install the Sui CLI and follow the instructions to set up a wallet.
Have Atoma's native faucet TOMA tokens available in your Sui wallet. You can request faucet tokens here.
Configuration file
After you have installed Rust and Cargo and cloned Atoma's node repository, you need to specify a set of configuration parameters that allow the node to connect to the Atoma Network. We recommend creating a config.toml file in the root of the Atoma node repository with the following parameters:
To fill in the above configuration file, you will need to:
Fill the Sui client parameters with the path to your Sui config file, together with your node registration small id. To register your node on the Atoma contract, follow the instructions below. As an example, if your Sui config file is located at ~/.sui/sui_config/client.yaml (the standard file path) and your node registration small id is 1234, your config.toml file should look like this:
Fill the inference parameters with your Hugging Face API key and the model you want to use for inference. Your node downloads the model weights from Hugging Face to the ./models directory and uses the DTYPE and USE_FLASH_ATTENTION parameters to select the best inference configuration for serving that model. Please refer to our supported models page for more information about the models supported in the current alpha release.
If you have a machine with an NVIDIA GPU with CUDA 12.1 or newer and an Ampere or newer GPU architecture, you can use the optimized CUDA kernels and Flash Attention2 by setting USE_FLASH_ATTENTION to true; otherwise, set it to false.
You can create a free Hugging Face account here. You will find your API key on your account page, under the Settings tab.
If you wish to deploy a Llama3.1 8b instruct model for inference with bf16 precision (we suggest either bf16 or fp16 precision for most models, or else a quantized precision type), set the DTYPE parameter to bf16. In this case, your config.toml file should look like this:
The model id for a specific AI model can be found on the model's page on the Hugging Face website. For example, the model id for Llama3.1 8b instruct can be found here.
If your machine has multiple GPU devices, you can specify which GPU device ids to use by setting the device_ids parameter to a list of integers, e.g. device_ids = [0, 1] to use the first and second GPU devices. With more than one GPU device, the model weights are automatically split across the available GPUs, with tensor parallelism applied. For example, with 1 NVIDIA RTX 3090/4090 it is possible to run a Llama3.1 8b model with bf16 or fp16 precision. With 1 NVIDIA A100 it is possible to run a Llama3.1 8b model with fp32 precision, or a Llama3.1 70b model quantized to int4 or int8. With 2 NVIDIA A100 GPUs it is possible to run a Llama3.1 70b model with bf16 or fp16 precision, across both GPUs. With 8 NVIDIA A100 or H100 GPUs it is possible to run a Llama3.1 405b model with fp8 precision, across all 8 GPUs, whereas with 16 NVIDIA A100 or H100 GPUs it is possible to run a Llama3.1 405b model with bf16 or fp16 precision, across all 16 GPUs.
Fill the input_manager, output_manager and streamer parameters with your Firebase project credentials and the URL to your local IPFS daemon. To run a local IPFS daemon, you can follow the instructions here; more details can also be found below. The Supabase credentials are required for the node to support Atoma's current alpha native UI application. They are not strictly necessary if you are only interested in contributing to Atoma's compute layer at the smart contract level.
Fill the event_subscriber parameters with the URL of your Sui RPC provider. We suggest using a Sui RPC provider that is geographically close to your node, to reduce latency and improve performance. We recommend a few providers, such as BlockVision and Shinami.
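The GPU sizing examples above follow from simple arithmetic: the memory needed for the weights is roughly the parameter count times the bytes per parameter. The sketch below illustrates that rule of thumb. The function names are hypothetical, the 80 GB figure assumes the larger A100/H100 variant, and the estimate ignores the KV cache and activations, which need additional memory in practice.

```python
# Approximate bytes needed to store one model parameter at each precision.
BYTES_PER_PARAM = {
    "fp32": 4.0,
    "bf16": 2.0,
    "fp16": 2.0,
    "fp8": 1.0,
    "int8": 1.0,
    "int4": 0.5,
}


def weight_memory_gb(num_params_billions: float, dtype: str) -> float:
    """Approximate size of the model weights in GB (1 GB = 1e9 bytes)."""
    return num_params_billions * BYTES_PER_PARAM[dtype]


def weights_fit(num_params_billions: float, dtype: str,
                num_gpus: int, gpu_memory_gb: float) -> bool:
    """True if the weights alone fit in the combined memory of all GPUs.

    This is a lower bound only: it leaves no headroom for the KV cache,
    activations, or framework overhead.
    """
    total_memory_gb = num_gpus * gpu_memory_gb
    return weight_memory_gb(num_params_billions, dtype) <= total_memory_gb


# Llama3.1 8b in bf16 on one 24 GB RTX 3090/4090.
print(weights_fit(8, "bf16", num_gpus=1, gpu_memory_gb=24))
# Llama3.1 70b in bf16 on two GPUs, assuming 80 GB each.
print(weights_fit(70, "bf16", num_gpus=2, gpu_memory_gb=80))
# Llama3.1 405b in bf16 on eight GPUs, assuming 80 GB each.
print(weights_fit(405, "bf16", num_gpus=8, gpu_memory_gb=80))
```

Under these assumptions, the first two checks pass and the third fails, which matches the guidance that a 405b model needs fp8 on 8 GPUs or 16 GPUs for bf16/fp16.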
Run the Atoma Node
To run the Atoma Node, run the following commands at the root of the Atoma node repository:
The FEATURE flag can be one of the following:
cuda: to run the Atoma Node with CUDA support.
metal: to run the Atoma Node with Metal support.
flash-attn: to run the Atoma Node with Flash Attention2 support (for Ampere or newer GPU architecture).
cpu: to run the Atoma Node with CPU support.
The YOUR_CONFIG_TOML_FILE_PATH should be the path to the config.toml file you created in the previous step. If you created it at the root of the Atoma node's repository, it should be ../config.toml.
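Before starting the node, it can help to confirm that the config file mentions every section discussed in the configuration step. The sketch below is a lightweight pre-flight check, not a full TOML parser. The section names come from this page, but treating them as top-level TOML tables is an assumption about the file layout, and the helper name is hypothetical.

```python
import re

# Section names taken from the configuration step; treating them as
# top-level TOML tables is an assumption about the config layout.
EXPECTED_SECTIONS = (
    "client",
    "inference",
    "input_manager",
    "output_manager",
    "streamer",
    "event_subscriber",
)

# Matches a plain table header such as "[inference]", optionally followed
# by a comment. Dotted or array-of-table headers are deliberately ignored.
_TABLE_HEADER = re.compile(r"^\s*\[([A-Za-z0-9_-]+)\]\s*(?:#.*)?$")


def missing_sections(config_text: str) -> list:
    """Return the expected section names that have no table header."""
    found = set()
    for line in config_text.splitlines():
        match = _TABLE_HEADER.match(line)
        if match:
            found.add(match.group(1))
    return [name for name in EXPECTED_SECTIONS if name not in found]


# Example with a deliberately incomplete config (headers only).
example = "[client]\n[inference]\n"
print(missing_sections(example))
```

To check a real file, read it first, for example with pathlib.Path("config.toml").read_text(), and pass the resulting string to missing_sections.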
Once your node is running, you should be able to see its logs in the terminal. If you have set up the node correctly, log lines starting with INFO should appear within a few seconds. You can also check how long it takes for the node to load the model weights onto the GPU device.
Your node will start listening for incoming inference requests once you have registered it on the Atoma smart contract on the Sui network.
Node registration
To register your node on the Atoma smart contract, you first need some SUI tokens in your Sui wallet (the wallet whose keypair client information is specified in the config.toml file above). You will also need some faucet TOMA tokens in your wallet. You can request TOMA tokens from the Atoma faucet here.
The first step is to clone Atoma's smart contract repository and build the smart contract package.
To register your node on the Atoma smart contract, you need to run the following command:
Then you need to subscribe your node to the AI model you are currently hosting, as follows:
To replace YOUR_MODEL_ID, consult the right value here. You can find your ECHELON value here; it depends on your GPU hardware specs.
Flash Attention 2 requirements
Using the optimized CUDA kernels and Flash Attention2 requires one or more NVIDIA GPUs with the Ampere architecture or newer.
CUDA requirements
We support any NVIDIA 20xx series GPU or newer. It is recommended to have a driver compatible with CUDA 12.1 or newer installed. For more details about how to update your NVIDIA driver, please refer to the NVIDIA website.
Metal requirements
We support any Apple Silicon M-series chip. It is recommended to have a machine compatible with Metal 3.0 or newer.
IPFS compatibility
It is recommended to have a local IPFS node running, to store AI-generated model outputs on behalf of the user, if the user has requested it. You can run a local IPFS node by following these steps:
Install the IPFS daemon by following the instructions here.
In a separate terminal, run ipfs init to initialize the IPFS node.
In that same terminal, run ipfs daemon to start the IPFS daemon.
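Once the daemon is running, you can confirm that it is reachable before pointing the node at it. The sketch below queries the version endpoint of the IPFS (Kubo) RPC API. It assumes the daemon listens on the default API address http://127.0.0.1:5001; adjust the URL if your setup differs. The helper name is hypothetical.

```python
import json
import urllib.error
import urllib.request
from typing import Optional


def ipfs_version(api_url: str = "http://127.0.0.1:5001",
                 timeout: float = 2.0) -> Optional[str]:
    """Return the daemon's version string, or None if it is unreachable.

    The Kubo RPC API only accepts POST requests, hence method="POST".
    """
    request = urllib.request.Request(
        api_url + "/api/v0/version", method="POST"
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return json.load(response).get("Version")
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, or a non-JSON response.
        return None


version = ipfs_version()
if version is None:
    print("IPFS daemon not reachable at the default API address")
else:
    print("IPFS daemon is running, version", version)
```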
Gateway compatibility
It is recommended to have a Gateway account, to store AI-generated model outputs on behalf of the user, if the user has requested it. You can create a free Gateway account here. Once logged in, you can create a new API key by navigating to the API Keys section and clicking the Create API Key button. The API key is used to authenticate requests to the Gateway API.