Ujjwal-Tyagi 
posted an update about 5 hours ago
We are sleepwalking into a crisis. I am deeply concerned about AI model safety right now because, as the community rushes to roll out increasingly powerful open-source models, we are completely neglecting the most critical aspect: safety. It seems that nobody is seriously thinking about the potential consequences of unregulated model outputs or the necessity of robust guardrails. We are essentially planting the seeds for our own destruction if we prioritize raw performance over security.

This negligence is terrifyingly evident when you look at the current landscape. Take Qwen Image 2512, for example; while it delivers undeniably strong performance, it has incredibly weak guardrails that make it dangerous to deploy. In stark contrast, Z Image might not get as much hype for its power, but it has much better safety guardrails than Qwen Image 2512.

It is imperative that the open-source community and developers recognize that capability without responsibility is a liability. We must actively work on protecting these models from bad actors who seek to exploit them for malicious purposes, such as generating disinformation, creating non-consensual imagery, or automating cyberattacks. It is no longer enough to simply release a powerful model; we must build layers of defense that make it resistant to jailbreaking and adversarial attacks. Developers need to prioritize alignment and robust filtering techniques just as much as they prioritize benchmark scores. We cannot hand such potent tools to the world without ensuring they have the safety mechanisms to prevent them from being turned against us.
KingNish 
posted an update 30 days ago
Muon vs MuonClip vs Muon+AdamW

Muon has gone from an experiment to a mainstream optimizer, but does it hold up for fine‑tuning? We ran head‑to‑head tests on Qwen3‑4B (10k+ high‑quality instruction rows) to find out.

Short story: Pure Muon converged fastest at the start, but its gradient‑norm spikes made training unstable. MuonClip (Kimi K2’s clipping) stabilizes long pretraining runs, yet in our small‑scale fine‑tune it underperformed, with lower token accuracy and slower convergence. The winner was the hybrid: Muon for 2D layers + AdamW for 1D layers. It delivered the best balance of stability and final performance, and even beat vanilla AdamW.
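
To make the hybrid split concrete, here is a minimal sketch of how parameters might be routed by tensor rank. It is not the code used in these experiments: the function name, the parameter names, and the shapes are hypothetical, and plain shape tuples stand in for real model tensors so the snippet runs without any ML framework.

```python
from typing import Dict, List, Tuple


def split_params_by_rank(
    shapes: Dict[str, Tuple[int, ...]],
) -> Tuple[List[str], List[str]]:
    """Route 2D weight matrices to Muon and everything else to AdamW.

    `shapes` maps a parameter name to its shape. This helper is
    hypothetical and only illustrates the routing rule.
    """
    muon_names: List[str] = []
    adamw_names: List[str] = []
    for name, shape in shapes.items():
        if len(shape) == 2:
            muon_names.append(name)   # matrices (e.g. linear weights)
        else:
            adamw_names.append(name)  # biases, norm scales, other shapes
    return muon_names, adamw_names


# Hypothetical parameter names and shapes, for illustration only.
example_shapes = {
    "layers.0.attn.q_proj.weight": (4096, 4096),
    "layers.0.attn.q_proj.bias": (4096,),
    "layers.0.norm.weight": (4096,),
    "layers.0.mlp.up_proj.weight": (11008, 4096),
}

muon_names, adamw_names = split_params_by_rank(example_shapes)
print("Muon :", muon_names)
print("AdamW:", adamw_names)
```

In a real training script the two name lists would be used to build two optimizer parameter groups. Whether 2D embedding or output-head matrices should also go to AdamW is a separate design choice that this sketch does not address.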

Takeaway: for small-scale fine-tuning, hybrid = practical and reliable.

Next Step: scale to larger models/datasets to see if Muon’s spikes become catastrophic or if clipping wins out.

Full Blog Link: https://huggingface.co/blog/KingNish/optimizer-part1
KingNish 
posted an update about 1 month ago
KingNish 
posted an update 5 months ago
KingNish 
posted an update 7 months ago
What's currently the biggest gap in open-source datasets?
not-lain 
posted an update 10 months ago
not-lain 
posted an update 11 months ago
not-lain 
posted an update 12 months ago
We now have more than 2000 public AI models using ModelHubMixin🤗
not-lain 
posted an update 12 months ago
Published a new blog post 📖
In this blog post I walk through the transformer architecture, emphasizing how tensor shapes propagate through each layer.
🔗 https://huggingface.co/blog/not-lain/tensor-dims
some interesting takeaways :
not-lain 
posted an update about 1 year ago
Ever wondered how you can make an API call to a visual-question-answering model without sending an image URL? 👀

You can do that by converting your local image to base64 and sending it to the API.

I recently made some changes to my library "loadimg" that make converting images to base64 a breeze.
🔗 https://github.com/not-lain/loadimg

API request example 🛠️:
from loadimg import load_img
from huggingface_hub import InferenceClient

# imgPath_url_pillow_or_numpy is a placeholder: replace it with your own
# local path, URL, Pillow image, or NumPy array
my_b64_img = load_img(imgPath_url_pillow_or_numpy, output_type="base64")

# replace the placeholder key with your own Hugging Face token
client = InferenceClient(api_key="hf_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx")

messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": "Describe this image in one sentence.",
            },
            {
                "type": "image_url",
                "image_url": {
                    # base64 allows using images without uploading them to the web
                    "url": my_b64_img
                },
            },
        ],
    }
]

stream = client.chat.completions.create(
    model="meta-llama/Llama-3.2-11B-Vision-Instruct",
    messages=messages,
    max_tokens=500,
    stream=True,
)

# print the streamed reply piece by piece; a chunk's content can be None
for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="")
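
For readers who want to see the conversion step itself, the following is a rough standard-library sketch of turning image bytes into a base64 data URL. It is not how loadimg is implemented; the helper name `to_data_url` and the demo bytes are assumptions made for illustration.

```python
import base64
import mimetypes


def to_data_url(image_bytes: bytes, filename: str = "image.png") -> str:
    """Encode raw image bytes as a base64 data URL.

    Hypothetical helper: the MIME type is guessed from the file name,
    falling back to a generic binary type when it cannot be determined.
    """
    mime, _ = mimetypes.guess_type(filename)
    mime = mime or "application/octet-stream"
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime};base64,{encoded}"


# Demo input: just the 8-byte PNG file signature, not a full image.
demo_bytes = b"\x89PNG\r\n\x1a\n"
demo_url = to_data_url(demo_bytes, "image.png")
print(demo_url)
```

For a real file, the bytes would come from something like `open(path, "rb").read()`, and the resulting string could be placed in the "url" field of the message in the same way as `my_b64_img`.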
davidaparicio 
in nerdyface/llama-v1 about 1 year ago

Brazilian Portuguese

1
#2 opened about 1 year ago by
davidaparicio
davidaparicio 
posted an update about 1 year ago
Amazing workshop! Let's go!!