Join the conversation

Join the community of Machine Learners and AI enthusiasts.

Sign Up

All HF Hub posts

codelion 
posted an update 1 day ago
view post
Post
3737
Introducing Dhara-70M: A diffusion language model that achieves 3.8x higher throughput than autoregressive models!

Key findings from our research on optimal architectures for small language models:

→ Depth beats width: 32 layers outperforms 12 layers at the same parameter count
→ Best-in-class factuality: 47.5% on TruthfulQA
→ 10x training efficiency using WSD (Warmup-Stable-Decay) conversion
→ Canon layers add only 0.13% parameters but improve reasoning

We trained on 1B tokens using the optimal 50-30-20 dataset mix (PDFs + filtered web + educational content), then converted to diffusion with just 100M additional tokens.

Blog: https://huggingface.co/blog/codelion/optimal-model-architecture
Model: codelion/dhara-70m
  • 1 reply
·
MonsterMMORPG 
posted an update 2 days ago
view post
Post
4528
Qwen Image Edit 2511 Free and Open Source Crushes Qwen Image Edit 2509 and Challenges Nano Banana Pro : https://www.youtube.com/watch?v=YfuQuOk2sB0

Full tutorial link > https://www.youtube.com/watch?v=YfuQuOk2sB0

Full HF article here : https://huggingface.co/blog/MonsterMMORPG/qwen-image-edit-2511-free-and-open-source-crushes

Qwen Image Edit 2511 model just published and it is literally competing against Nano Banana Pro at image editing tasks. With native whopping 2560x2560 pixels image output capability and with only 12 steps it is next level. With our installers and specially made Quant FP8 Scaled model, you can run this amazing beast even as low as 6 GB GPUs. In this tutorial, I have compared Qwen Image Edit 2511 with previous successor model Qwen Image 2509 with 12 different unique and hard prompts and cases. Everything is step by step explained and provided.

Here check some comparison images
  • 5 replies
·
ronantakizawa 
posted an update 4 days ago
dhruv3006 
posted an update 3 days ago
view post
Post
1692
Hey folks 👋

We’re experimenting with a new response panel layout and would love your feedback.We’re testing a more focused experience:

- Only one response section open at a time (instead of multiple)
- The response body now takes up most of the vertical space, making it easier to read and inspect

The goal is simple: reduce clutter and keep the response as the main focus.

That said, we know many developers are comfortable with the classic layout (Postman / Bruno-style), where multiple sections can stay open at once.What would you prefer?

- A new, focused single-section layout
- The classic multi-section layout
- A toggle that lets you choose between both?

Download Voiden here :https://voiden.md/download
MikeDoes 
posted an update 1 day ago
view post
Post
2327
What if an AI agent could be tricked into stealing your data, just by reading a tool's description? A new paper reports it's possible.

The "Attractive Metadata Attack" paper details this stealthy new threat. To measure the real-world impact of their attack, the researchers needed a source of sensitive data for the agent to leak. We're proud that the AI4Privacy corpus was used to create the synthetic user profiles containing standardized PII for their experiments.

This is a perfect win-win. Our open-source data helped researchers Kanghua Mo, 龙昱丞, Zhihao Li from Guangzhou University and The Hong Kong Polytechnic University to not just demonstrate a new attack, but also quantify its potential for harm. This data-driven evidence is what pushes the community to build better, execution-level defenses for AI agents.

🔗 Check out their paper to see how easily an agent's trust in tool metadata could be exploited: https://arxiv.org/pdf/2508.02110

#OpenSource
#DataPrivacy
#LLM
#Anonymization
#AIsecurity
#HuggingFace
#Ai4Privacy
#Worldslargestopensourceprivacymaskingdataset
inoculatemedia 
posted an update 3 days ago
view post
Post
1363
I’m opening the waitlist for what I believe to be the most advanced multimodal bridge for A/V professionals. Txt2img, img2video, editing, export to ProRes, apply Luts, Pexels and TouchDesigner integrations, music and voice gen, multichannel mixing.

Announcing: Lilikoi by Haawke AI

Teaser video made entirely with Lilikoi:
https://youtu.be/-O7DH7vFkYg?si=q2t5t6WjQCk2Cp0w

Https://Lilikoi.haawke.com

Technical brief:
https://haawke.com/technical_brief.html

kanaria007 
posted an update 3 days ago
view post
Post
258
✅ New Article: *Hardware Paths for Structured Intelligence* (Draft v0.1)

Title:
🧩 From CPUs to SI-GSPU: Hardware Paths for Structured Intelligence
🔗 https://huggingface.co/blog/kanaria007/hardware-paths-for-si

---

Summary:
Most “AI hardware” is built for dense matrix math. But real-world intelligence systems bottleneck elsewhere: **semantic parsing, structured memory, governance checks, auditability, and evaluation loops** — the parts that turn models into safe, resilient systems.

This article maps the gap clearly, and sketches how a future **SI-GSPU class accelerator** fits: not “a better GPU,” but a co-processor for **semantics + governance runtime**.

> GPUs carry the models.
> S
I-GSPU carries the rules that decide when models are allowed to act.

---

Why It Matters:
• Explains *why* “more GPU” doesn’t fix governance-heavy AI stacks
• Identifies what to accelerate: semantic transforms, memory ops, coverage/metrics, effect ledgers
• Shows how to build **SI-GSPU-ready** systems *today* on conventional clouds — without a rewrite later
• Keeps performance numbers explicitly **illustrative**, avoiding spec-washing

---

What’s Inside:
• Bottleneck taxonomy: where CPUs melt when you implement SI-Core properly
• Accelerator landscape (GPU/TPU/FPGA/DPU) vs. SI workloads
• What SI-GSPU would accelerate — and what it explicitly should *not*
• Determinism + audit chains + attestation requirements for governance-critical acceleration
• A staged roadmap: software-only → targeted offloads → semantic-fabric clusters
• A toy TCO intuition (shape, not pricing guidance)

---

📖 Structured Intelligence Engineering Series
A non-normative hardware guide: how to layer Structured Intelligence onto today’s compute, and where specialized silicon actually changes the economics.
  • 1 reply
·
Parveshiiii 
posted an update 5 days ago
view post
Post
3415
Hey everyone!
We’re excited to introduce our new Telegram group: https://t.me/XenArcAI

This space is built for **model builders, tech enthusiasts, and developers** who want to learn, share, and grow together. Whether you’re just starting out or already deep into AI/ML, you’ll find a supportive community ready to help with knowledge, ideas, and collaboration.

💡 Join us to:
- Connect with fellow developers and AI enthusiasts
- Share your projects, insights, and questions
- Learn from others and contribute to a growing knowledge base

👉 If you’re interested, hop in and be part of the conversation: https://t.me/XenArcAI
·
kanaria007 
posted an update about 6 hours ago
view post
Post
52
✅ New Article: *Pattern-Learning-Bridge (PLB)*

Title:
🧩 Pattern-Learning-Bridge: How SI-Core Actually Learns From Its Own Failures
🔗 https://huggingface.co/blog/kanaria007/learns-from-its-own-failures

---

Summary:
Most stacks “learn” by fine-tuning weights and redeploying — powerful, but opaque.
SI-Core already produces *structured evidence* (jump logs, ethics traces, effect ledgers, goal vectors, rollback traces), so learning can be *structural* instead:

*Upgrade policies, compensators, SIL code, and goal structures — using runtime evidence.*

> Learning isn’t a model tweak.
> *It’s upgrading the structures that shape behavior.*

---

Why It Matters:
• Makes improvement *localized and explainable* (what changed, where, and why)
• Keeps “self-improvement” *governable* (versioned deltas + review + CI/CD)
• Turns incidents/metric drift into *actionable patches*, not postmortem PDFs
• Scales to real ops: ethics policies, rollback plans, semantic compression, goal estimators

---

What’s Inside:
• What “learning” means in SI-Core (and what changes vs. classic ML)
• The *Pattern-Learning-Bridge*: where it sits between runtime evidence and governed code
• Safety properties: PLB proposes *versioned deltas*, never edits production directly
• Validation pipeline: sandbox/simulation → conformance checks → golden diffs → rollout

---

📖 Structured Intelligence Engineering Series
A non-normative, implementable design for “learning from failures” without sacrificing auditability.
AbstractPhil 
posted an update about 17 hours ago
view post
Post
119
Happy Holidays all! geofractal architectural expansions; timm is now a core component for experimenting. As it stands, the system is growing rapidly in one direction, and timm brings a whole lot to the table in another rapid-prototyping direction. Therefore, timm is now a core component for ease-of-use.

BaseUtil is a new core component; aka src.geofractal.router.base_util inherits BaseComponent's behavior, so it should allow device movement for util operations which will direct utilization for device-to-device behavior for the upcoming accelerate integration.

I'm trying to mitigate the base component structure as much as possible, but the need to chain components in specific orders presented a unique problem. By compartmentalizing utils into structures that can be delegated and moved, these structures can be repurposed, expanded autonomously, reduced autonomously, and more.

ChainComponent inherits a subsystem specifically designed to organize multi-system multi-device formulas designated for inception and synchronization purposes. This is meant to allow distributed tasking to multiple-devices in chained utilization. This also enables ease-of-integration into nn.ModuleList with a few other caveats that will be ironed out meant to target wide-distributed models.

FusionComponent is specifically dedicated to the new fusion processing system meant for experimental expansion. This includes sub-module schedule control, Component and Tower functional control, device-movement, and will be packaged under the term "gfu.UtilType" as a standard naming convention.
"gfc.ComponentTypeName"
"gfr.RouterTypeName"
"gfu.UtilityTypeName"
"gft.TowerTypeName"
All of which are basically just import thing as.
"gf.AnythingTopLevelPackaged" which will include the core.

Better debugging for compilation
I'm in prototyping phases of a better debugging for compiled wide models and will prepare a baseline component readout structure by the end of the day today or tomorrow.