infrence

Nectar-X-Studio is a powerful local AI inference application that lets users download, create, and run agents and large language models on their own machine. With no internet connection required, Nectar ensures privacy-first, high-performance inference using cutting-edge open-source models from Hugging Face, Ollama, and beyond.

ai ml ai-agents infrence stable-diffusion gguf-model-support

Updated Mar 20, 2026
Python

rdcm / triton-ng

Rust SDK for writing custom backends for NVIDIA Triton Inference Server

rust nvidia infrence triton-inference-server custom-backend

Updated Apr 11, 2026
Rust

YL-Raj / llm-balance-paraphraser

Local-first LLM toolkit: token/KV-cache/VRAM analyzer, Ollama paraphrase pipeline, and a weighted load-balancer with health checks. 100% local — no API keys, no data leaves your machine. 51 passing tests.

privacy tokenizer load-balancer fastapi infrence llm local-llm ollama

Updated Jun 1, 2026
Python

Ryuk1811 / Duplex

Duplex is an advanced, strictly client-side application designed to interface with multiple Large Language Models simultaneously. Run local instances through Ollama and multiple cloud APIs in a unified, privacy-first interface.

api ai webapp chat-application chat-app multimodal infrence llm llm-inference llm-tools

Updated Jun 11, 2026
TypeScript

infrence

Here are 8 public repositories matching this topic...

Eamon2009 / Quadtrix.cpp

thanos / ex_datalog

ckorikov / 2025-ttie

petlukk / Cougar

headlessripper / Nectar-X-Studio

rdcm / triton-ng

YL-Raj / llm-balance-paraphraser

Ryuk1811 / Duplex
