Hilko
hilkob
AI & ML interests: None yet
Recent Activity
reacted to RakshitAralimatti's post with 🧠 · 17 days ago
I built something crazy you've never seen before.
Please check - https://huggingface.co/blog/RakshitAralimatti/streaming-data-rag
A real-time Streaming Data to RAG system that listens to live radio, transcribes it on the fly, and lets you query across TIME.
Not just "what was discussed", but "what happened in the last 10 minutes on channel 0?" or "at 9 AM, what was the breaking news?" This is RAG that understands temporal context.
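Not the linked project's code, but a minimal sketch of the temporal-RAG idea the post describes: transcript chunks carry a timestamp and channel id, and retrieval applies a time/channel filter before ranking by embedding similarity. All names and shapes below are illustrative assumptions.

```python
# Minimal sketch of time-windowed retrieval over timestamped transcript chunks
# (illustrative only; not the linked project's implementation).
from dataclasses import dataclass
from datetime import datetime, timedelta
import numpy as np

@dataclass
class Chunk:
    ts: datetime        # when this transcript chunk was spoken
    channel: int        # which radio channel/stream it came from
    text: str
    emb: np.ndarray     # embedding of `text` (any sentence encoder would do)

def retrieve(chunks, query_emb, since=None, until=None, channel=None, k=3):
    # 1) hard temporal/channel filter, 2) cosine-similarity ranking
    pool = [c for c in chunks
            if (since is None or c.ts >= since)
            and (until is None or c.ts <= until)
            and (channel is None or c.channel == channel)]
    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))
    return sorted(pool, key=lambda c: cos(c.emb, query_emb), reverse=True)[:k]

# "what happened in the last 10 minutes on channel 0?"
rng = np.random.default_rng(0)
now = datetime.now()
chunks = [Chunk(now - timedelta(minutes=m), 0, f"segment from {m} min ago", rng.normal(size=8))
          for m in (2, 5, 40)]
hits = retrieve(chunks, rng.normal(size=8), since=now - timedelta(minutes=10), channel=0)
```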
reacted to RakshitAralimatti's post with 🔥 · 17 days ago
reacted to hesamation's post with ❤️ · 23 days ago
this is big... 50 AI researchers from ByteDance, Alibaba, Tencent, and other labs/universities just published a 300-page paper with surprising lessons about coding models and agents (data, pre- and post-training, etc).
key highlights:
> small LLMs can beat proprietary giants
RL (RLVR specifically) gives small open-source models an edge over big models in reasoning. a 14B model trained with RLVR on high-quality verified problems can match the performance of OpenAI's o3 (a toy sketch of the verifiable-reward idea follows after this list).
> models have a hard time learning Python.
mixing programming languages during pre-training is good, but Python behaves differently from statically typed languages. mixing languages with similar syntax (Java and C#, or JavaScript and TypeScript) creates high positive synergy, while mixing Python heavily into the training of statically typed languages can actually hurt because of Python's dynamic typing.
> not all languages are equal (coding scaling laws)
the amount of data required to specialize a model on a language drastically depends on the language. the paper argues that languages like C# and Java are easier to learn (less training data required), while languages like Python and JavaScript are actually trickier to learn, ironically (these are the languages you see AI used for the most :)
> MoE vs Dense (ability vs stability)
MoE models offer higher capacity but are much more fragile during SFT than dense models: training hyperparameters have a more drastic effect on MoE models, while dense models stay more stable. MoE models also require constant learning rate schedules to avoid routing instability (see the LR schedule sketch after this list).
> code models are "insecure" by default (duh)
training on public repos makes models learn years of accumulated insecure coding patterns, and safety fine-tuning often does little for code. a model might refuse to write a hate-speech email but will happily generate a SQL-injection-vulnerable function because it "works" (see the toy example after this list).
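For context on "high-quality verified problems": a toy sketch (not the paper's actual setup) of what a verifiable reward for code looks like. The reward is not a learned judge but a binary pass/fail signal from executing the candidate against known test cases; the function and names here are hypothetical.

```python
# Toy sketch of a verifiable reward (RLVR) for code, not the paper's setup:
# reward 1.0 only if the generated solution passes every known test case.
def verifiable_reward(candidate_src: str, tests: list[tuple[tuple, object]],
                      fn_name: str = "solve") -> float:
    ns: dict = {}
    try:
        exec(candidate_src, ns)            # define the candidate function
        fn = ns[fn_name]
        ok = all(fn(*args) == expected for args, expected in tests)
        return 1.0 if ok else 0.0
    except Exception:
        return 0.0                          # crashes or wrong names count as failure

# A model-generated solution earns reward 1.0 only when all tests pass.
reward = verifiable_reward("def solve(a, b):\n    return a + b",
                           tests=[((1, 2), 3), ((0, 0), 0)])
```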
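On the constant learning rate point, a minimal PyTorch sketch (illustrative, not the paper's training code) of keeping the LR flat during SFT, versus the decaying schedules more commonly used for dense models.

```python
# Illustration of a constant LR schedule for MoE SFT (assumed setup, toy model).
import torch

model = torch.nn.Linear(16, 16)                       # stand-in for an MoE model
opt = torch.optim.AdamW(model.parameters(), lr=1e-5)

# Constant schedule: the LR stays at 1e-5 for the whole run
# (ConstantLR with factor=1.0 applies no decay at all).
sched = torch.optim.lr_scheduler.ConstantLR(opt, factor=1.0)

for step in range(3):                                 # dummy training steps
    opt.zero_grad()
    loss = model(torch.randn(4, 16)).pow(2).mean()
    loss.backward()
    opt.step()
    sched.step()

# A dense model would more often tolerate a decaying schedule instead, e.g.
# torch.optim.lr_scheduler.CosineAnnealingLR(opt, T_max=total_steps).
```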
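And a toy illustration of the insecure-by-default point (my example, not from the paper): the string-formatted query "works" but is SQL-injectable, while the parameterized version is the safe habit a model trained on old public repos often fails to pick up.

```python
# String-formatted SQL vs parameterized SQL (toy sqlite3 example).
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'hunter2')")

def get_secret_unsafe(name: str):
    # vulnerable: name = "x' OR '1'='1" dumps every row
    return conn.execute(f"SELECT secret FROM users WHERE name = '{name}'").fetchall()

def get_secret_safe(name: str):
    # parameterized query: the driver escapes the value
    return conn.execute("SELECT secret FROM users WHERE name = ?", (name,)).fetchall()

print(get_secret_unsafe("x' OR '1'='1"))   # leaks all secrets
print(get_secret_safe("x' OR '1'='1"))     # returns nothing
```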
read the full paper:
https://huggingface.co/papers/2511.18538
Organizations: None yet