---
title: README
emoji: 🏆
colorFrom: red
colorTo: yellow
sdk: static
pinned: false
---

## InfiX.ai

Welcome to **InfiX.ai**! We believe our research will eventually lead to decentralized Generative AI—a future where everyone can access, contribute to, and benefit from AI equally.  
Our Mission: **Generative AI for all, intelligence in every task.**

---
### 🤖 Our Model Series
#### 🔗 Model Fusion & Model Merging
> **Model fusion** refers to the process of combining multiple trained models—often from different domains, architectures, or training datasets—into a single, more powerful model. The goal is to integrate their strengths and knowledge, improving performance, generalization, or efficiency.  
> **Model merging** is a specific type of model fusion that involves combining the internal parameters (typically weights) of two or more pretrained models to produce a single model that inherits knowledge from all sources. Unlike ensemble methods, model merging produces a single merged model rather than relying on multiple models at inference time.
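
As a rough illustration of parameter-level merging, the sketch below linearly interpolates the weights of two models that share the same architecture. It is a minimal toy example, not the method used by any model listed here: the `merge_weights` helper, the dictionary layout, and the plain-list "tensors" are all hypothetical stand-ins for real checkpoints.

```python
def merge_weights(state_dicts, coefficients=None):
    """Linearly combine parameters of models that share one architecture.

    state_dicts: list of {parameter_name: list of floats} (hypothetical layout).
    coefficients: one mixing weight per model; defaults to a uniform average.
    """
    if coefficients is None:
        coefficients = [1.0 / len(state_dicts)] * len(state_dicts)
    if len(coefficients) != len(state_dicts):
        raise ValueError("need exactly one coefficient per model")

    names = state_dicts[0].keys()
    for sd in state_dicts:
        if sd.keys() != names:
            raise ValueError("all models must have the same parameter names")

    merged = {}
    for name in names:
        tensors = [sd[name] for sd in state_dicts]
        # Weighted sum, element by element, across all source models.
        merged[name] = [
            sum(c * t[i] for c, t in zip(coefficients, tensors))
            for i in range(len(tensors[0]))
        ]
    return merged


# Two toy "models" with identical parameter shapes.
model_a = {"layer.weight": [1.0, 2.0], "layer.bias": [0.0]}
model_b = {"layer.weight": [3.0, 4.0], "layer.bias": [1.0]}
merged = merge_weights([model_a, model_b])
```

The result is a single set of parameters, so inference cost matches that of one source model, which is the contrast with ensembling drawn above.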

- [InfiFusion](https://huggingface.co/collections/InfiX-ai/infifusion-683c7d7f00c71614ba8ceb96): **InfiFusion** is a logit-level fusion pipeline based on Universal Logit Distillation, enhanced with Top-K filtering and logits standardization. It supports both pairwise and unified fusion strategies to balance performance and efficiency.
- [InfiGFusion](https://huggingface.co/InfiX-ai/InfiGFusion-14B): **InfiGFusion** is a structure-aware extension that builds co-activation graphs from logits and aligns them via an efficient Gromov-Wasserstein loss approximation, capturing cross-dimension semantic dependencies for stronger reasoning.
- [InfiFPO](https://huggingface.co/InfiX-ai/InfiFPO-14B): **InfiFPO** is a lightweight fusion method applied during the preference alignment phase. It injects fused model behavior into preference learning, providing a richer signal during DPO-style fine-tuning.
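
To make the logit-level ingredients mentioned above more concrete, here is a toy sketch of Top-K filtering and logits standardization feeding a simple distance between a source model and a target model. It is not the InfiFusion implementation. The order of the steps, the function names, and the choice of an L1 distance between sorted probability vectors (which lets the two models have different vocabulary sizes) are all assumptions made for illustration.

```python
import math


def top_k_filter(logits, k):
    """Keep only the k largest logits, sorted in descending order."""
    return sorted(logits, reverse=True)[:k]


def standardize(logits):
    """Shift to zero mean and scale to unit variance (no-op scale if constant)."""
    mean = sum(logits) / len(logits)
    var = sum((x - mean) ** 2 for x in logits) / len(logits)
    std = math.sqrt(var) or 1.0  # avoid division by zero for constant inputs
    return [(x - mean) / std for x in logits]


def softmax(logits):
    """Numerically stable softmax."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sorted_prob_distance(source_logits, target_logits, k):
    """L1 distance between the sorted top-k probability vectors of two models."""
    p = softmax(standardize(top_k_filter(source_logits, k)))
    q = softmax(standardize(top_k_filter(target_logits, k)))
    return sum(abs(a - b) for a, b in zip(p, q))


# Hypothetical next-token logits; the two vocabularies differ in size.
source = [2.0, 0.5, -1.0, 0.1]
target = [1.8, 0.7, -0.9, 0.0, -2.0]
loss = sorted_prob_distance(source, target, k=3)
```

Because only sorted top-k values are compared, the sketch never needs a token-to-token mapping between the two vocabularies; a real pipeline would compute this per position over batches of tensors rather than Python lists.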


#### 🧠 Reasoning-Enhanced Low-Resource Training Pipeline

- [InfiR](https://huggingface.co/papers/2502.11573): **InfiR** aims to advance AI systems by improving reasoning, reducing adoption barriers, and addressing privacy concerns through smaller model sizes. 
- [InfiR-FP8](https://huggingface.co/InfiX-ai): **InfiR-FP8** is a smaller reasoning-enhanced model trained from scratch in FP8 precision. Training converged successfully while reducing memory usage by 10% and improving training speed by 20%. The model will be released in mid-September.
- [InfiAlign](https://huggingface.co/papers/2508.05496): **InfiAlign** is a scalable and data-efficient post-training framework that combines supervised fine-tuning (SFT) and reinforcement learning (RL) with a high-quality data selection pipeline to enhance reasoning in large language models.
- [InfiMMR](https://huggingface.co/papers/2505.23091): **InfiMMR** is a novel three-phase curriculum framework that systematically enhances multimodal reasoning capabilities in small language models through foundational reasoning activation, cross-modal adaptation, and multimodal reasoning enhancement. 

#### 🖥️ Advanced Vision-Native Agent for GUI Interaction
- [InfiGUIAgent](https://huggingface.co/papers/2501.04575): **InfiGUIAgent** is a GUI agent that embeds native hierarchical and expectation-reflection reasoning through a unique two-stage supervised pipeline, enabling robust, multi-step GUI task automation. 
- [InfiGUI-R1](https://huggingface.co/papers/2504.14239v1): **InfiGUI-R1** is a GUI agent developed via the Actor2Reasoner framework, which evolves a reactive model into a deliberative reasoner capable of sophisticated planning and error recovery through spatial reasoning distillation and reinforcement learning.
- [InfiGUI-G1](https://huggingface.co/papers/2508.05731): **InfiGUI-G1** is a multimodal GUI agent that employs Adaptive Exploration Policy Optimization (AEPO) to improve semantic alignment in GUI grounding. The novel training framework achieves up to **8.3%** relative improvement over baseline methods.

---
### 📰 News

- 🔥[2025/8/11] Our paper "[InfiGUI-G1: Advancing GUI Grounding with Adaptive Exploration Policy Optimization](https://arxiv.org/abs/2508.05731)" released. More information can be found in [the repository](https://github.com/InfiXAI/InfiGUI-G1). Model is available [here](https://huggingface.co/InfiX-ai/InfiGUI-G1-7B).
- 🔥[2025/5/20] Our paper "[InfiGFusion: Graph-on-Logits Distillation via Efficient Gromov-Wasserstein for Model Fusion](https://arxiv.org/abs/2505.13893)" released. More information can be found in [the repository](https://github.com/InfiXAI/InfiGFusion). Model is available [here](https://huggingface.co/InfiX-ai/InfiGFusion-14B).
- 🔥[2025/5/20] Our paper "[InfiFPO: Implicit Model Fusion via Preference Optimization in Large Language Models](https://arxiv.org/abs/2505.13878)" released. More information can be found in [the repository](https://github.com/InfiXAI/InfiFPO). Model is available [here](https://huggingface.co/InfiX-ai/InfiFPO-14B).
- 🔥[2025/4/19] Our paper "[InfiGUI-R1: Advancing Multimodal GUI Agents from Reactive Actors to Deliberative Reasoners](https://arxiv.org/abs/2504.14239)" released. More information can be found in [the repository](https://github.com/Reallm-Labs/InfiGUI-R1).
- 🔥[2025/1/9] Our paper "[InfiGUIAgent: A Multimodal Generalist GUI Agent with Native Reasoning and Reflection](https://arxiv.org/abs/2501.04575)" released.