OpenMOSS repositories

MOSS-TTS

Public

MOSS‑TTS Family is an open‑source speech and sound generation model family from MOSI.AI and the OpenMOSS team. It is designed for high‑fidelity, high‑expressive…

audio text-to-speech multimodal voice-cloning llm audio-tokenizer

Python • Apache License 2.0 • 254 • 2.8k • 5 • 2 • Updated Jun 2, 2026

Llamascopium

Public

Performant framework for training, analyzing and visualizing Sparse Autoencoders (SAEs) and their frontier variants.

sparse-autoencoders interpretability sparse-dictionary mechanistic-interpretability

Python • 29 • 219 • 8 • 0 • Updated Jun 2, 2026

MOSS-Video-Preview

Public

A real-time video understanding foundation model with gated cross-attention. Offline & real-time inference.

Python • Apache License 2.0 • 4 • 138 • 0 • 0 • Updated Jun 1, 2026

MOSS-VL

Public

MOSS-VL is the core multimodal model series within the OpenMOSS ecosystem, dedicated to visual understanding.

Python • Apache License 2.0 • 4 • 256 • 0 • 0 • Updated Jun 1, 2026

MOSS

Public

An open-source tool-augmented conversational language model from Fudan University

natural-language-processing deep-learning text-generation dialogue-systems large-language-models chatgpt

Python • Apache License 2.0 • 1.1k • 12k • 235 • 6 • Updated May 27, 2026

.github

Public

0 • 0 • 0 • 0 • Updated May 27, 2026

MOSS-TTS-Nano

Public

MOSS-TTS-Nano is an open-source multilingual tiny speech generation model from MOSI.AI and the OpenMOSS team. With only 0.1B parameters, it is designed for real…

multilingual realtime tts english chinese streaming-audio multi-modality voice-clone audio-tokenizer

Python • Apache License 2.0 • 432 • 3.3k • 45 • 5 • Updated May 26, 2026

Awesome-WAM

Public

A curated, continuously updated reading list, paper blogs, and resources for World Action Models (WAMs) in embodied AI.

HTML • MIT License • 16 • 633 • 1 • 2 • Updated May 24, 2026

MOSS-Audio

Public

MOSS-Audio is an open-source foundation model for unified audio understanding, enabling speech, sound, music, captioning, QA, and reasoning in real-world scenar…

Python • 36 • 511 • 11 • 0 • Updated May 19, 2026

sglang

Public

Python • Apache License 2.0 • 0 • 3 • 0 • 0 • Updated May 12, 2026

MOSS-VL-Demo

Public

Vue • 0 • 5 • 0 • 0 • Updated May 11, 2026

MOSS-Music

Public

MOSS-Music is an open-source music understanding model targeting musical captioning, lyrics ASR, structural analysis, chord / key / tempo reasoning, and lon…

Python • 5 • 83 • 2 • 0 • Updated May 9, 2026

MOSS-TTS-Nano-Reader

Public

JavaScript • 5 • 46 • 0 • 0 • Updated May 7, 2026

MOVA

Public

MOVA: Towards Scalable and Synchronized Video–Audio Generation

multimodal diffusion-models sglang video-audio-generation

Python • Apache License 2.0 • 87 • 1k • 33 • 3 • Updated May 6, 2026

MOSS-Audio-Tokenizer

Public

MOSS-Audio-Tokenizer is a Causal Transformer-based audio tokenizer built on the CAT architecture. Trained on 3M hours of diverse audio, it supports streaming an…

audio music tokenizer speech tts unified speech-representation

Python • Apache License 2.0 • 15 • 218 • 3 • 1 • Updated May 6, 2026

mlx-audio

Public

A text-to-speech (TTS), speech-to-text (STT) and speech-to-speech (STS) library built on Apple's MLX framework, providing efficient speech analysis on Apple Sil…

Python • MIT License • 614 • 6 • 0 • 0 • Updated Apr 27, 2026

MOSS-TTS-Nano-Demo

Public

CSS • 1 • 1 • 0 • 0 • Updated Apr 13, 2026

llama.cpp

Public

C++ • MIT License • 2 • 5 • 0 • 2 • Updated Apr 8, 2026

BandPO

Public

Official implementation of BandPO: Bridging Trust Regions and Ratio Clipping via Probability-Aware Bounds for LLM Reinforcement Learning. BandPO replaces canoni…

Python • GNU General Public License v3.0 • 4 • 49 • 0 • 0 • Updated Apr 8, 2026

imclaw-skill

Public

Python • 0 • 7 • 0 • 0 • Updated Apr 3, 2026

TransformerLens

Public

A library for mechanistic interpretability of GPT-style language models

Python • MIT License • 580 • 2 • 0 • 0 • Updated Mar 31, 2026

Embodied-Planner-R1

Public

Embodied-Planner-R1: Unleashing Embodied Task Planning Ability in LLMs via Reinforcement Learning

Python • Apache License 2.0 • 1 • 27 • 0 • 4 • Updated Mar 30, 2026

DiRL

Public

Python • Apache License 2.0 • 7 • 160 • 0 • 1 • Updated Mar 30, 2026

OurClaw

Public

Institutional OpenClaw Solution. Share One Claw with Others.

TypeScript • MIT License • 3 • 24 • 0 • 0 • Updated Mar 30, 2026

RoboOmni

Public

Official code of "RoboOmni: Proactive Robot Manipulation in Omni-modal Context"

Python • 6 • 109 • 6 • 0 • Updated Mar 28, 2026

MOSS-TTSD

Public

MOSS-TTSD is a spoken dialogue generation model designed for expressive multi-speaker synthesis. It features long-context modeling, flexible speaker control, a…

streaming finetune text-to-speeh large-language-models sglang speech-dialogue-generation

Python • Apache License 2.0 • 131 • 1.3k • 52 • 0 • Updated Mar 23, 2026

TTSD-eval

Public

Python • 0 • 4 • 0 • 0 • Updated Mar 16, 2026

OpenMOSS.github.io

Public

JavaScript • 0 • 2 • 0 • 0 • Updated Mar 3, 2026

Website

Public

wangye

JavaScript • 3 • 0 • 0 • 1 • Updated Mar 2, 2026

FRoM-W1

Public

[ArXiv 26] FRoM-W1: Towards General Humanoid Whole-Body Control with Language Instructions

whole-body-control g1 humanoid-robots h1 motion-generation foundation-models text-to-motion unitree fftai

Python • Apache License 2.0 • 7 • 166 • 3 • 0 • Updated Feb 13, 2026

OpenMOSS (SII)

52 repositories
