Hilko
hilkob
AI & ML interests: None yet
Recent Activity
reacted to RakshitAralimatti's post with 🧠 · 17 days ago
I built something crazy you've never seen before.
Please check - https://huggingface.co/blog/RakshitAralimatti/streaming-data-rag
A real-time Streaming Data to RAG system that listens to live radio, transcribes it on the fly, and lets you query across TIME.
Not just "what was discussed", but "what happened in the last 10 minutes on channel 0?" or "at 9 AM, what was the breaking news?" This is RAG that understands temporal context.
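Not the linked project's code, but a minimal sketch of the temporal-RAG idea the post describes: transcript chunks carry a timestamp and channel id, and retrieval applies a time/channel filter before ranking by embedding similarity. All names and shapes below are illustrative assumptions.

```python
# Minimal sketch of time-windowed retrieval over timestamped transcript chunks
# (illustrative only; not the linked project's implementation).
from dataclasses import dataclass
from datetime import datetime, timedelta
import numpy as np

@dataclass
class Chunk:
    ts: datetime        # when this transcript chunk was spoken
    channel: int        # which radio channel/stream it came from
    text: str
    emb: np.ndarray     # embedding of `text` (any sentence encoder would do)

def retrieve(chunks, query_emb, since=None, until=None, channel=None, k=3):
    # 1) hard temporal/channel filter, 2) cosine-similarity ranking
    pool = [c for c in chunks
            if (since is None or c.ts >= since)
            and (until is None or c.ts <= until)
            and (channel is None or c.channel == channel)]
    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))
    return sorted(pool, key=lambda c: cos(c.emb, query_emb), reverse=True)[:k]

# "what happened in the last 10 minutes on channel 0?"
rng = np.random.default_rng(0)
now = datetime.now()
chunks = [Chunk(now - timedelta(minutes=m), 0, f"segment from {m} min ago", rng.normal(size=8))
          for m in (2, 5, 40)]
hits = retrieve(chunks, rng.normal(size=8), since=now - timedelta(minutes=10), channel=0)
```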
reacted to RakshitAralimatti's post with 🔥 · 17 days ago
reacted to hesamation's post with ❤️ · 23 days ago
this is big... 50 AI researchers from ByteDance, Alibaba, Tencent, and other labs/universities just published a 300-page paper with surprising lessons about coding models and agents (data, pre- and post-training, etc).
key highlights:
> small LLMs can beat proprietary giants
RL (RLVR specifically) gives small open-source models an edge over big models in reasoning. a 14B model trained with RLVR on high-quality verified problems can match the performance of OpenAI's o3 (a toy sketch of the verifiable-reward idea follows after this list).
> models have a hard time learning Python.
mixing programming languages during pre-training is good, but Python behaves differently from statically typed languages. mixing languages with similar syntax (Java and C#, or JavaScript and TypeScript) creates high positive synergy, while mixing Python heavily into the training of statically typed languages can actually hurt because of Python's dynamic typing.
> not all languages are equal (coding scaling laws)
the amount of data required to specialize a model on a language drastically depends on the language. the paper argues that languages like C# and Java are easier to learn (less training data required), while languages like Python and JavaScript are actually trickier to learn, ironically (these are the languages you see AI used for the most :)
> MoE vs Dense (ability vs stability)
MoE models offer higher capacity but are much more fragile during SFT than dense models: training hyperparameters have a more drastic effect on MoE models, while dense models stay more stable. MoE models also require constant learning rate schedules to avoid routing instability (see the LR schedule sketch after this list).
> code models are "insecure" by default (duh)
training on public repos makes models learn years of accumulated insecure coding patterns, and safety fine-tuning often does little for code. a model might refuse to write a hate-speech email but will happily generate a SQL-injection-vulnerable function because it "works" (see the toy example after this list).
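For context on "high-quality verified problems": a toy sketch (not the paper's actual setup) of what a verifiable reward for code looks like. The reward is not a learned judge but a binary pass/fail signal from executing the candidate against known test cases; the function and names here are hypothetical.

```python
# Toy sketch of a verifiable reward (RLVR) for code, not the paper's setup:
# reward 1.0 only if the generated solution passes every known test case.
def verifiable_reward(candidate_src: str, tests: list[tuple[tuple, object]],
                      fn_name: str = "solve") -> float:
    ns: dict = {}
    try:
        exec(candidate_src, ns)            # define the candidate function
        fn = ns[fn_name]
        ok = all(fn(*args) == expected for args, expected in tests)
        return 1.0 if ok else 0.0
    except Exception:
        return 0.0                          # crashes or wrong names count as failure

# A model-generated solution earns reward 1.0 only when all tests pass.
reward = verifiable_reward("def solve(a, b):\n    return a + b",
                           tests=[((1, 2), 3), ((0, 0), 0)])
```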
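On the constant learning rate point, a minimal PyTorch sketch (illustrative, not the paper's training code) of keeping the LR flat during SFT, versus the decaying schedules more commonly used for dense models.

```python
# Illustration of a constant LR schedule for MoE SFT (assumed setup, toy model).
import torch

model = torch.nn.Linear(16, 16)                       # stand-in for an MoE model
opt = torch.optim.AdamW(model.parameters(), lr=1e-5)

# Constant schedule: the LR stays at 1e-5 for the whole run
# (ConstantLR with factor=1.0 applies no decay at all).
sched = torch.optim.lr_scheduler.ConstantLR(opt, factor=1.0)

for step in range(3):                                 # dummy training steps
    opt.zero_grad()
    loss = model(torch.randn(4, 16)).pow(2).mean()
    loss.backward()
    opt.step()
    sched.step()

# A dense model would more often tolerate a decaying schedule instead, e.g.
# torch.optim.lr_scheduler.CosineAnnealingLR(opt, T_max=total_steps).
```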
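And a toy illustration of the insecure-by-default point (my example, not from the paper): the string-formatted query "works" but is SQL-injectable, while the parameterized version is the safe habit a model trained on old public repos often fails to pick up.

```python
# String-formatted SQL vs parameterized SQL (toy sqlite3 example).
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'hunter2')")

def get_secret_unsafe(name: str):
    # vulnerable: name = "x' OR '1'='1" dumps every row
    return conn.execute(f"SELECT secret FROM users WHERE name = '{name}'").fetchall()

def get_secret_safe(name: str):
    # parameterized query: the driver escapes the value
    return conn.execute("SELECT secret FROM users WHERE name = ?", (name,)).fetchall()

print(get_secret_unsafe("x' OR '1'='1"))   # leaks all secrets
print(get_secret_safe("x' OR '1'='1"))     # returns nothing
```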
read the full paper:
https://huggingface.co/papers/2511.18538
Organizations: None yet