AIME API Server
Unlock the Power of Custom AI Solutions
To simplify AI deployment, we provide the AIME API Server. Designed for flexibility and control, the AIME API Server is your gateway to offering powerful AI functionality through an API that you can customize, extend and scale as needed.
The AIME API Server is a scalable API server designed specifically for AI model inference, helping you run and scale your AI models as a working web application. It handles load balancing of requests and distributes the load across the individual GPU nodes, which simplifies offering AI models as services so you can exploit the full potential of these technologies.
We are continuously improving the API server with new features: additional open-source models, streaming for various input and output modalities, and advanced admin backend features.
Find out more about the AIME API Server here:
AIME API Server for Model Inference
Do you have a deep learning model running in your console or Jupyter notebook that you would like to make available to your company or deploy to the world? AIME API is the easy and scalable way to do so. Turn your script into a secure and robust web API that acts as your interface to the mobile, browser and desktop world.
With AIME API, you deploy deep learning models (PyTorch, TensorFlow) through a job queue as an API endpoint capable of serving millions of model inference requests.
The AIME API Server solution implements a distributed server architecture in which a central AIME API Server communicates through a job queue with a scalable GPU compute cluster. The GPU compute cluster can be heterogeneous and distributed across different locations without requiring an interconnect. The model compute jobs are processed by so-called compute workers, which connect to the AIME API Server through a secure HTTPS interface. The compute workers can be located independently of the API server; they only need internet access to request jobs and send back the compute results.
This enables the creation of AI-based software products by simply developing a frontend that connects to the API, while GPU worker nodes log on to the API server in an easily scalable manner. Load balancing is handled by the API server.
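The job queue pattern described above can be illustrated with a small, self-contained Python sketch. It is an in-process simulation only and does not use the AIME API: the names `run_worker` and `serve`, the worker labels and the stand-in `model_fn` are all hypothetical, and real compute workers would request jobs over HTTPS rather than from a local queue. It shows how pull-based workers balance load naturally, because each worker takes the next job only when it is free.

```python
import queue
import threading


def run_worker(name, jobs, results, model_fn):
    """Hypothetical compute worker: pull jobs until a None sentinel arrives."""
    while True:
        job = jobs.get()
        if job is None:  # sentinel: no more work for this worker
            break
        job_id, payload = job
        # Report which worker handled the job together with the model output.
        results.put((job_id, name, model_fn(payload)))


def serve(payloads, worker_names, model_fn):
    """Hypothetical central server: enqueue jobs and collect worker results."""
    jobs = queue.Queue()
    results = queue.Queue()
    threads = [
        threading.Thread(target=run_worker, args=(name, jobs, results, model_fn))
        for name in worker_names
    ]
    for thread in threads:
        thread.start()
    for job_id, payload in enumerate(payloads):
        jobs.put((job_id, payload))
    for _ in threads:
        jobs.put(None)  # one sentinel per worker
    for thread in threads:
        thread.join()
    collected = {}
    while not results.empty():
        job_id, name, output = results.get()
        collected[job_id] = (name, output)
    return collected


if __name__ == "__main__":
    # str.upper stands in for a model inference call.
    done = serve(["hello", "world", "aime"], ["gpu-0", "gpu-1"], str.upper)
    for job_id in sorted(done):
        print(job_id, done[job_id])
```

Adding capacity in this sketch means passing more worker names; the server side does not change, which mirrors how additional GPU workers can log on to a central queue.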
Integrated in OpenWebUI, vLLM-ready
Data Sovereignty & Privacy with OpenWebUI & vLLM Integration
In response to the growing demand for data privacy and GDPR-compliant large language model integrations, AI agents, and RAG solutions, AIME has extended the OpenWebUI framework with an AIME API compatible interface.
Use OpenWebUI as a modern LLM chat frontend powered by the AIME API Server as a scalable backend. This allows you to run your own private version of a ChatGPT-like LLM, ensuring complete data sovereignty without compromising on performance or functionality.
Now you can operate large language models securely and in compliance with the highest data protection standards, all while maintaining full control over your data and choosing from the large pool of language models that vLLM supports. Run OpenWebUI on your own hardware as an on-premises solution or connect to our GDPR-compliant GPU cloud offerings.
AIME MLC
The Deep Learning Framework Management System
The AIME Machine Learning Container Manager (AIME MLC) is a framework designed to streamline the deployment and management of deep learning environments based on Docker. The necessary libraries, GPU drivers and operating system libraries for various deep learning framework versions are bundled in preconfigured Docker containers and can be instantiated with a single command. This flexibility is a significant benefit for researchers and engineers experimenting with different AI models.
AIME MLC simplifies working with deep learning frameworks such as TensorFlow and PyTorch, making it versatile for various applications in artificial intelligence. Users can customize environments with specific versions of CUDA, drivers, OS and Python libraries, tailoring each configuration to a specific project and hardware setup and switching quickly between different projects or users.
AIME ML Container Manager for Model Training & Inference
AIME MLC is particularly useful for AI research and development, enterprise AI solutions, and any application involving computationally intensive GPU usage like AI model training.
A key aspect of AIME MLC is its seamless integration into high-performance GPU environments, such as those provided by AIME's cloud infrastructure. This makes it particularly effective for applications requiring intensive computational resources, like training deep learning models on multi-GPU systems. The framework facilitates efficient resource allocation and management, allowing for optimized performance and scalability.
AIME servers and workstations ship with the AIME ML Container Manager preinstalled. It is also integrated with AIME's high-performance GPU cloud services, providing a robust infrastructure to easily set up, accelerate, and switch between deep learning projects and frameworks.
The AIME ML Container Manager makes life easier for developers so they do not have to worry about framework version installation issues.
Step 1: Configure
Log in to your dedicated instance through SSH or remote desktop. Everything is already installed, and thanks to the AIME ML container system, your preferred deep learning framework can be configured right away with a single command:
> mlc create my-container Pytorch 2.5.0
Step 2: Start working!
Your deep learning framework is ready to be used. You can now start developing, training or deploying your model, for example as an AIME API worker to scale up your application!
> mlc open my-container
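Once inside the container, a quick sanity check can confirm that the framework sees the GPUs. The following is a minimal sketch, assuming a PyTorch container as in the example above; the helper name `describe_environment` is hypothetical and not part of AIME MLC. It uses only standard PyTorch calls and degrades gracefully if PyTorch is not installed.

```python
def describe_environment():
    """Report whether PyTorch is importable and which GPUs it can see."""
    info = {"torch_installed": False, "cuda_available": False, "gpus": []}
    try:
        import torch
    except ImportError:
        # PyTorch is missing in this environment; return the defaults.
        return info
    info["torch_installed"] = True
    info["torch_version"] = torch.__version__
    info["cuda_available"] = torch.cuda.is_available()
    if info["cuda_available"]:
        info["gpus"] = [
            torch.cuda.get_device_name(index)
            for index in range(torch.cuda.device_count())
        ]
    return info


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

If `cuda_available` is reported as false inside a GPU container, the installed framework build or the driver setup is the first thing to check.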
AIME Benchmarking Suite
Measure Performance with Comprehensive Benchmarking Tools
As both a hardware and software company, AIME has developed a suite of benchmarking tools to evaluate and optimize AI performance across various hardware and software configurations.
The AIME Benchmark Suite provides detailed insights by:
- Benchmarking GPUs, from RTX 3090 to the latest H200 models.
- Evaluating AI model inference and training on diverse configurations and GPUs using popular open-source AI models.
- Benchmarking PyTorch and TensorFlow on multi-GPU setups, fully compatible with both NVIDIA CUDA and AMD ROCm environments.
Additionally, we offer a benchmarking tool for the AIME API Server, designed to test, monitor, and compare the performance of connected AI models (GPU workers). These performance evaluations make it possible to fine-tune AI deployments for maximum efficiency and reliability. The insights we gain help our customers make informed decisions about their hardware and software setups and achieve optimal performance for their AI workloads.
Read our benchmarking results in our hands-on blog articles, which we regularly publish on current open-source AI models, for example:
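To give a sense of what such a measurement involves, here is a minimal latency and throughput sketch in plain Python. It is not the AIME benchmarking tool: the function name `benchmark`, its parameters and the reported metrics are assumptions chosen for illustration, and the callable `fn` stands in for a single inference request.

```python
import statistics
import time


def benchmark(fn, n_requests=50, warmup=5):
    """Time repeated calls to fn and summarize latency and throughput."""
    for _ in range(warmup):
        fn()  # warm-up calls are excluded from the measurement
    latencies = []
    start = time.perf_counter()
    for _ in range(n_requests):
        t0 = time.perf_counter()
        fn()
        latencies.append(time.perf_counter() - t0)
    total = time.perf_counter() - start
    return {
        "requests": n_requests,
        "throughput_rps": n_requests / total,
        "mean_s": statistics.mean(latencies),
        # quantiles(n=20) yields 19 cut points; the last is the 95th percentile.
        "p95_s": statistics.quantiles(latencies, n=20)[-1],
    }


if __name__ == "__main__":
    # A small CPU-bound function stands in for a model inference request.
    stats = benchmark(lambda: sum(range(100_000)))
    for key, value in stats.items():
        print(f"{key}: {value}")
```

Reporting a tail percentile alongside the mean matters for services, since a few slow requests can dominate the user experience even when average latency looks fine.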
- PyTorch Benchmarks
- GPU Benchmarks
- Llama 3.x Blog Articles
- FLUX-Benchmarks
Contact us
Call us or send us an email if you have any questions. We would be glad to help you find the most suitable compute solution.
AIME GmbH
Marienburger Str. 1
10405 Berlin
Germany