Turn any computer or edge device into a command center for your computer vision projects.
The simplest way to serve AI/ML models in production
This repository allows you to get started with GUI-based training of a state-of-the-art Deep Learning model with little to no configuration needed! No-code training with TensorFlow has never been so easy.
The Qualcomm® AI Hub Models are a collection of state-of-the-art machine learning models optimized for performance (latency, memory etc.) and ready to deploy on Qualcomm® devices.
A Beautiful Flask Web API for Yolov7 (and custom) models
Lightweight Inference server for OpenVINO
Train and run predictions with pre-trained deep learning models through a GUI (web app). No more endless parameters, no more manual data preprocessing.
CLI & Python API to easily summarize text-based files with transformers
This repository allows you to get started with training a State-of-the-art Deep Learning model with little to no configuration needed! You provide your labeled dataset and you can start the training right away. You can even test your model with our built-in Inference REST API. Training classification models with GluonCV has never been so easy.
This is a repository for an image classification inference API using the GluonCV framework. The inference REST API works on CPU and GPU and is supported on Windows and Linux. Models trained with our GluonCV classification training repository can be deployed in this API, and several models can be loaded and used at the same time.
the small distributed language model toolkit; fine-tune state-of-the-art LLMs anywhere, rapidly
An open-source framework for Retrieval-Augmented Generation (RAG) that uses semantic search to retrieve the expected results and generate a human-readable conversational response with the help of an LLM (Large Language Model).
Practice for Machine Learning in Production course
A Node.js backend that exposes a Typescript implementation of the deCheem inference engine.
A networked inference server for Whisper so you don't have to keep waiting for the audio model to reload for the x-hundredth time.
Computer Vision API built using FastAPI and pre-trained models converted to ONNX format
Chat prompt template evaluation and inference monitoring
A message queue based server architecture to asynchronously handle resource-intensive tasks (e.g., ML inference)
Text components powering LLMs & SLMs for geniusrise framework