CISPO: Clipped Importance Sampling Policy Optimization →
https://huggingface.co/papers/2506.13585
This RL algorithm from the MiniMax-M1 project clips importance-sampling weights instead of per-token updates, so every token (even rare but crucial ones) still contributes to learning rather than being dropped by token-level clipping. CISPO also drops KL penalties and uses group-relative advantages like GRPO (a minimal sketch follows this list).
PAPO: Perception-Aware Policy Optimization → https://huggingface.co/papers/2507.06448
Enhances RL in vision-language tasks by adding a KL-based perception loss to the GRPO objective for better visual alignment during training. It boosts accuracy by 4–8% and reduces perception errors by ~30%.
OPO: On-Policy RL with Optimal Baseline → https://huggingface.co/papers/2505.23585
A simplified RL algorithm from Microsoft that enforces strict on-policy training, using freshly sampled outputs from the current policy for every update to avoid off-policy drift. Its optimal baseline minimizes gradient variance without auxiliary models or extra regularization.
EXPO: Expressive Policy Optimization → https://huggingface.co/papers/2507.07986
Trains complex policies by pairing a large base model with a lightweight edit policy that suggests better actions, selecting the best of both without backpropagating through the base.
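To make the CISPO item above concrete, here is a minimal PyTorch-style sketch of the idea as I read it; the function name, clip range, and reduction are illustrative assumptions, not the MiniMax-M1 implementation.

```python
import torch

def cispo_loss(logp_new, logp_old, advantages, eps_low=0.2, eps_high=0.2):
    """Hedged sketch of a CISPO-style objective (not the official MiniMax-M1 code).

    Instead of clipping the PPO surrogate per token (which zeroes out gradients for
    tokens outside the trust region), the importance-sampling weight itself is clipped
    and treated as a constant (stop-gradient), so every token still contributes a
    REINFORCE-style gradient scaled by its clipped IS weight.
    """
    ratio = torch.exp(logp_new - logp_old)                            # pi_theta / pi_old, per token
    weight = torch.clamp(ratio, 1 - eps_low, 1 + eps_high).detach()   # clipped IS weight, no gradient
    # advantages: group-relative values (as in GRPO), broadcast over the tokens of each response
    return -(weight * advantages * logp_new).mean()
```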

Reinforcement Learning (RL) won't stay stuck in the same old PPO loop: in the last two months alone, researchers have introduced a new wave of techniques that are reshaping how we train and fine-tune LLMs, VLMs, and agents.
Here are 9 fresh policy optimization techniques worth knowing:
1. GSPO: Group Sequence Policy Optimization → Group Sequence Policy Optimization (2507.18071)
Shifts optimization, clipping, and rewarding from the token level to the sequence level, capturing the full picture and improving stability compared to GRPO. A GSPO-token variant still allows token-level fine-tuning (a sketch follows this list).
2. LAPO: Length-Adaptive Policy Optimization → LAPO: Internalizing Reasoning Efficiency via Length-Adaptive Policy Optimization (2507.15758)
A two-stage RL framework that trains models to adaptively control reasoning length by learning typical solution lengths for shorter and more efficient reasoning.
3. HBPO: Hierarchical Budget Policy Optimization → Hierarchical Budget Policy Optimization for Adaptive Reasoning (2507.15844)
This one trains models to adapt reasoning depth to problem complexity. It divides training samples into subgroups with different token budgets, using budget-aware rewards to align reasoning effort with task difficulty.
4. SOPHIA: Semi-off-policy reinforcement learning → Semi-off-Policy Reinforcement Learning for Vision-Language Slow-thinking Reasoning (2507.16814)
Combines on-policy visual understanding from vision-language models (VLMs) with off-policy reasoning from a language model, assigning outcome-based rewards and propagating visual rewards backward through the reasoning steps.
5. RePO: Replay-Enhanced Policy Optimization → RePO: Replay-Enhanced Policy Optimization (2506.09340)
Introduces a replay buffer into on-policy RL for LLMs, retrieving diverse off-policy samples to broaden the training data available for each prompt.
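For intuition on the sequence-level shift in GSPO (item 1), here is a hedged sketch; the tensor shapes, clip range, and length normalization follow my reading of the paper and are not the authors' code.

```python
import torch

def gspo_loss(logp_new, logp_old, advantages, mask, eps=0.2):
    """Hedged sketch of a GSPO-style sequence-level objective (illustrative only).

    The importance ratio is a length-normalized ratio for the whole response and is
    clipped once per sequence, instead of per token as in PPO/GRPO.

    logp_new, logp_old: (batch, seq_len) per-token log-probs under current / old policy
    advantages:         (batch,) group-relative advantage, one value per response
    mask:               (batch, seq_len) 1 for response tokens, 0 for padding
    """
    lengths = mask.sum(dim=-1).clamp(min=1)
    seq_log_ratio = ((logp_new - logp_old) * mask).sum(dim=-1) / lengths
    ratio = torch.exp(seq_log_ratio)                      # one ratio per sequence
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1 - eps, 1 + eps) * advantages
    return -torch.minimum(unclipped, clipped).mean()
```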
Read further below ⬇️
If you like it, also subscribe to the Turing Post: https://www.turingpost.com/subscribe

Time to look at some free useful resources that can help you upgrade your knowledge of AI and machine learning!
Today we offer you these 6 must-read surveys that can be your perfect guides to the major fields and techniques:
1. Foundations of Large Language Models by Tong Xiao and Jingbo Zhu → https://arxiv.org/abs/2501.09223
Many recommend this 270-page book as a good resource to focus on fundamental concepts, such as pre-training, generative models, prompting, alignment, and inference
2. Large Language Models Post-Training: Surveying Techniques from Alignment to Reasoning -> A Survey on Post-training of Large Language Models (2503.06072)
Read this to master policy optimization (RLHF, DPO, GRPO), supervised and parameter-efficient fine-tuning, reasoning, integration, and adaptation techniques
3. Agentic Large Language Models, a survey by Leiden University → https://arxiv.org/abs/2503.23037
Surveys agentic LLMs across reasoning, tools, and multi-agent collaboration, highlighting their synergy. It also explores their promise, risks, and applications in medicine, finance, and science.
4. A Survey of Context Engineering for Large Language Models → A Survey of Context Engineering for Large Language Models (2507.13334)
Defines Context Engineering as systematic information design for LLMs beyond prompting, covering retrieval, processing, management, and architectures like RAG and multi-agent systems
5. A Survey of Generative Categories and Techniques in Multimodal Large Language Models → https://arxiv.org/abs/2506.10016
Covers multimodal models, exploring six generative modalities, key techniques (SSL, RLHF, CoT), architectural trends, and challenges
6. Large Language models for Time Series Analysis: Techniques, Applications, and Challenges → https://arxiv.org/abs/2506.11040
Explains how LLMs transform time series analysis by enhancing pattern recognition and long-term dependency handling + shows how to build them
Also, subscribe to the Turing Post: https://www.turingpost.com/subscribe

FreeLoRA → https://huggingface.co/papers/2507.01792
Enables training-free image generation with multiple subjects by fine-tuning each LoRA module on one subject. During inference, subject-aware activation applies modules only to their target tokens, ensuring clean, interference-free fusion.
LoRA-Augmented Generation (LAG) → https://huggingface.co/papers/2507.05346
Uses large collections of task-specific LoRA adapters without needing extra training or data. It selects and applies the most relevant adapters at each layer and token, excelling in knowledge-intensive tasks.
ARD-LoRA (Adaptive Rank Dynamic LoRA) → https://huggingface.co/papers/2506.18267
Adjusts the rank of LoRA adapters dynamically across transformer layers and heads by learning per-head scaling factors through a meta-objective. It balances performance and efficiency, using fewer parameters and less memory.
WaRA → https://huggingface.co/papers/2506.24092
Designed for vision tasks, it uses wavelet transforms to decompose weight updates into multiple resolutions, capturing both coarse and detailed patterns.
BayesLoRA → https://huggingface.co/papers/2506.22809
Adds uncertainty estimation to LoRA adapters using MC-Dropout, helping models gauge confidence in unfamiliar situations. It detects variance outside fine-tuned distributions, supporting more cautious and adaptive model behavior.
Dual LoRA Learning (DLoRAL) → https://huggingface.co/papers/2506.15591
Trains two LoRA branches for video super-resolution: C-LoRA captures temporal coherence from degraded input, while D-LoRA improves visual detail, enhancing both spatial detail and temporal consistency.
Safe Pruning LoRA (SPLoRA) → https://huggingface.co/papers/2506.18931
Improves the safety of LoRA-tuned LMs by selectively removing LoRA layers that reduce alignment, using a new E-DIEM metric to detect safety-related shifts without relying on data labels.
PLoP (Precise LoRA Placement) → https://huggingface.co/papers/2506.20629
A lightweight method that automatically selects optimal LoRA adapter placement during fine-tuning based on the model and task

LoRA (Low-Rank Adaptation) is a popular lightweight method for fine-tuning AI models. Instead of updating the full model, it adds small trainable components (low-rank matrices) while keeping the original weights frozen; only these adapters are trained. A minimal sketch of the idea is below.
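Here is that idea in a minimal PyTorch sketch; the rank, scaling, and initialization values are illustrative defaults, not a reference implementation.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Minimal LoRA wrapper around a frozen linear layer (illustrative sketch).

    Output = W x + (alpha / r) * B A x, where W stays frozen and only the low-rank
    matrices A (r x in_features) and B (out_features x r) are trained.
    """
    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False                       # original weights stay frozen
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))  # zero init: no change at step 0
        self.scale = alpha / r

    def forward(self, x):
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)
```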
Recently, many interesting new LoRA variations came out, so it’s a great time to take a look at these 13 clever approaches:
1. T-LoRA → T-LoRA: Single Image Diffusion Model Customization Without Overfitting (2507.05964)
A timestep-dependent LoRA method for adapting diffusion models with a single image. It dynamically adjusts updates and uses orthogonal initialization to reduce overlap, achieving better fidelity–alignment balance than standard LoRA
2. SingLoRA → SingLoRA: Low Rank Adaptation Using a Single Matrix (2507.05566)
Simplifies LoRA by using only one small matrix instead of the usual two, multiplying it by its own transpose (A × Aᵀ). It uses half the parameters of LoRA and avoids scale mismatch between the two matrices (sketched after this list)
3. LiON-LoRA → LiON-LoRA: Rethinking LoRA Fusion to Unify Controllable Spatial and Temporal Generation for Video Diffusion (2507.05678)
Improves control and precision in video diffusion models when training data is limited. It builds on LoRA, adding 3 key principles: linear scalability, orthogonality, and norm consistency. A controllable token and modified self-attention enable smooth adjustment of motion
4. LoRA-Mixer → LoRA-Mixer: Coordinate Modular LoRA Experts Through Serial Attention Routing (2507.00029)
Combines LoRA and mixture-of-experts (MoE) to adapt LLMs for multiple tasks. It dynamically routes task-specific LoRA experts into linear projections of attention modules, supporting both joint training and frozen expert reuse
5. QR-LoRA → QR-LoRA: Efficient and Disentangled Fine-tuning via QR Decomposition for Customized Generation (2507.04599)
Separates content and style when combining multiple LoRA adapters. It implements QR decomposition to structure parameter updates, where the orthogonal Q matrix reduces interference between features, and the R matrix captures specific transformations
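As a companion to item 2, here is a hedged sketch of the single-matrix idea; it omits the paper's training details (such as any ramp-up scheduling) and assumes a square weight matrix, so treat it as an illustration rather than the authors' method.

```python
import torch
import torch.nn as nn

class SingLoRALinear(nn.Module):
    """Illustrative SingLoRA-style layer: one low-rank matrix A, update = A @ A.T."""
    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        assert base.in_features == base.out_features, "sketch assumes a square layer"
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False                        # frozen pretrained weights
        self.A = nn.Parameter(torch.randn(base.in_features, r) * 0.01)
        self.scale = alpha / r

    def forward(self, x):
        delta = self.A @ self.A.T                          # symmetric low-rank update (d x d)
        return self.base(x) + self.scale * (x @ delta)
```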
Read further in the comments 👇
If you like it, also subscribe to the Turing Post: https://www.turingpost.com/subscribe

AllVoiceLab MCP Server -> https://github.com/allvoicelab/AllVoiceLab-MCP
Enables AI agents to access advanced text-to-speech, voice conversion, and video translation APIs, powering use cases like global content localization, AI audiobooks, and voice-driven media production.
MCP Email Server -> https://github.com/Shy2593666979/mcp-server-email
For email functionality: write and send emails with multiple recipients, add and search files within specified directories.
Google Admin MCP Server -> https://github.com/securityfortech/google-admin-mcp
Manage Google Workspace users through the Admin Directory API (list, create, get info about users, etc.)
Android MCP Server -> https://github.com/minhalvp/android-mcp-server
Provides programmatic control over Android devices through ADB (Android Debug Bridge).
DeepView MCP -> https://github.com/ai-1st/deepview-mcp
Enables IDEs (Cursor, Windsurf, etc.) to analyze large codebases using Gemini's extensive context window.
Calculator MCP Server -> https://github.com/githejie/mcp-server-calculator
May sound easy, but it's essential for precise numerical calculations within LLMs.
MCP Aggregator -> https://github.com/nazar256/combine-mcp
Combines multiple MCP servers into a single interface for more convenient use

MCP is redefining how AI assistants connect to the world of data and tools, so no wonder MCP servers are in high demand now. That’s why we’ve curated 13 cool MCP servers to upgrade your workflow:
1. Hugging Face Official MCP Server -> https://github.com/evalstate/hf-mcp-server
Provides access to and interaction with Hugging Face models, datasets, and Gradio Spaces for dynamic tool integration and configuration across environments.
2. Browser MCP -> https://browsermcp.io/
An MCP server + Chrome extension that lets you automate your browser with AI apps like VS Code, Claude, Cursor, and Windsurf.
3. Bright Data MCP -> https://github.com/brightdata/brightdata-mcp
This one is for working with data in real-time: searching the web, navigating websites, taking action and retrieving data.
4. JSON MCP -> https://github.com/VadimNastoyashchy/json-mcp
Interact with JSON files: split, merge, find specific data, and validate content within them.
5. Octagon Deep Research MCP -> https://github.com/OctagonAI/octagon-deep-research-mcp
Allows for deep research via AI agents, integrating seamlessly with MCP clients like Claude Desktop and Cursor for powerful, unlimited research capabilities.
6. VLM Run MCP Server -> https://docs.vlm.run/mcp/introduction
Gives an agent the ability to see, understand, and process visual content.
Read further in the comments 👇
P.S.:
Our most read explanation of MCP on Hugging Face https://huggingface.co/blog/Kseniase/mcp
Our first list of 13 awesome MCP servers: https://huggingface.co/posts/Kseniase/204958200717570
If you like it, also subscribe to the Turing Post: https://www.turingpost.com/subscribe

DeepResearcher -> https://github.com/GAIR-NLP/DeepResearcher
An RL framework for training deep research agents end-to-end in real-world environments with web search, exhibiting emergent behaviors like planning, multi-source validation, self-reflection, and honestly admitting when the agent doesn't know the answer.
Search-R1 -> https://github.com/PeterGriffinJin/Search-R1
Features interleaved search access and an open-source RL training pipeline supporting various algorithms (PPO, GRPO, etc.), LLMs (LLaMA3, Qwen2.5, etc.), and search engines (online, local, retrievers).
ReCall -> https://github.com/Agent-RL/ReCall
Trains LLMs to reason with tools via RL, with no supervised tool-use data needed. It enables agentic tool use in the style of OpenAI o3 and supports synthetic data generation across diverse environments and multi-step tasks.
OWL -> https://github.com/camel-ai/owl
Built on the CAMEL-AI framework, it enables dynamic multi-agent collaboration for task automation across diverse domains
Here's an awesome study exploring the entire roadmap of Deep Research assistants. Don't forget to check it out -> https://huggingface.co/papers/2506.18096

Deep Research agents are quickly becoming our daily co-workers — built for complex investigations, not just chat. With modular architecture, advanced tool use and real web access, they go far beyond typical AI. While big-name agents get the spotlight, we want to highlight some powerful recent open-source alternatives:
1. DeerFlow -> https://github.com/bytedance/deer-flow
A modular multi-agent system combining LMs and tools for automated research and code analysis. It links a coordinator, a planner, a team of specialized agents, and a reporter, and converts reports to speech via Text-to-Speech (TTS)
2. Alita -> https://github.com/CharlesQ9/Alita
Uses a single problem-solving module for scalable reasoning through simplicity. It self-evolves by generating and reusing Model Context Protocols (MCPs) from open-source tools to build external capabilities for diverse tasks
3. WebThinker -> https://github.com/RUC-NLPIR/WebThinker
Lets reasoning models autonomously search the web and navigate pages. Deep Web Explorer allows interaction with links and follow-up searches. Through a Think-Search-and-Draft process, models generate and refine reports in real time. RL training with preference pairs improves the workflow
4. SimpleDeepSearcher -> https://github.com/RUCAIBox/SimpleDeepSearcher
A lightweight framework showing that supervised fine-tuning is a real alternative to complex RL, using simulated web interactions and multi-criteria curation to generate high-quality training data
5. AgenticSeek -> https://github.com/Fosowl/agenticSeek
A private, on-device assistant that picks the best agent expert for browsing, coding, or planning—no cloud needed. Includes voice input via speech-to-text
6. Suna -> https://github.com/kortix-ai/suna
Offers web browsing, file and doc handling, CLI execution, site deployment, and API/service integration—all in one assistant
Subscribe to the Turing Post: https://www.turingpost.com/subscribe
Read further ⬇️

Constraint-Based Decoding -> https://huggingface.co/papers/2502.05111
Guide generation using hard constraints, like context-free grammar (CFG) rules. This keeps outputs aligned with task goals, especially in structured prediction or planning. Can be combined with symbolic solvers or logic-checking agents.
Exploration Prompts (Explore-then-Pick) -> https://huggingface.co/papers/2506.09014
Generate multiple diverse responses via sampling, then use a learned Sample Set Aggregator (SSA), trained with reinforcement learning, to pick the best answer. Similar to “draft → verify” strategies, but the final selection is done by a trained model, not heuristics (see the sketch after this list).
Prompt Perturbation Sampling for Inference -> https://huggingface.co/papers/2502.11027
From a pool of diverse model responses sampled with prompt perturbation, distill only the most elegant, logically consistent outputs to improve metrics like Pass@10. This is a post-generation inference technique.
Prompt Ordering via Embedding Clustering -> https://openreview.net/pdf?id=1Iu2Yte5N6
Uncovers that few-shot prompt permutations form clusters in the model’s embedding space — especially by the first demonstration — and uses this to design a cluster-based ordering method for generating strong in-context example sequences.
Controlled Prompting Variations -> https://huggingface.co/papers/2504.02111
Controlled “bad” prompts (like irrelevant info or misleading framing) expose fragilities in model reasoning. So use light adversarial prompting in evaluations to find breaking points. Plus: remove irrelevant info to reduce confusion and improve focus, standardize format to minimize inconsistency and hallucination, and explicitly prompt for step-by-step reasoning to boost accuracy and transparency
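To illustrate the explore-then-pick pattern mentioned above, here is a generic Python sketch; `sample_response` and `score_candidates` are hypothetical callables standing in for the model's sampler and the paper's learned Sample Set Aggregator.

```python
from typing import Callable, List

def explore_then_pick(prompt: str,
                      sample_response: Callable[[str], str],
                      score_candidates: Callable[[str, List[str]], List[float]],
                      n: int = 8) -> str:
    """Sample several diverse candidates, then let a separate selector choose one.
    In the paper the selector is an RL-trained aggregator, not a simple heuristic."""
    candidates = [sample_response(prompt) for _ in range(n)]   # diverse sampling (temperature > 0)
    scores = score_candidates(prompt, candidates)              # learned scorer ranks the whole set
    best = max(range(n), key=lambda i: scores[i])
    return candidates[best]
```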

Everyone’s chasing top reasoning, but it's still the bottleneck for many real-world tasks. This week, let's spotlight some powerful techniques that have shown promise in helping LLMs achieve more consistent logic, planning, and depth:
1. Retrieval-Augmented CoT Chaining (RAG+CoT) -> CoT-RAG: Integrating Chain of Thought and Retrieval-Augmented Generation to Enhance Reasoning in Large Language Models (2504.13534)
Combines Chain-of-Thought prompting with retrieval augmentation at intermediate steps. Relevant documents are fetched after each reasoning subgoal, updating context dynamically. Great for open-domain QA, math, logic, and multi-hop fact-checking (a sketch follows this list)
2. Tool-use by example injection -> Self-Training Large Language Models for Tool-Use Without Demonstrations (2502.05867)
Injects few-shot tool interaction examples during training to implicitly teach calling patterns. Helps in plug-and-play tool use without training new architectures
3. Visual Scratchpads, or multimodal reasoning support -> Imagine while Reasoning in Space: Multimodal Visualization-of-Thought (2501.07542)
Using structured visual inputs or sketchable intermediate steps (diagrams, grids, trees) boosts performance in tasks like planning, geometry, and multi-agent simulation. In practice, GPT-4o, Claude, and Gemini show marked improvements thanks to this
4. System 1 vs System 2 Prompt switching -> Adaptive Deep Reasoning: Triggering Deep Thinking When Needed (2505.20101)
Switching between a fast, intuitive response mode and a slow, deliberate reasoning mode is among the most popular AI trends. E.g., models tend to respond more reliably when explicitly instructed to “think like a researcher.” This can also reduce hallucinations in open-ended generation and debate tasks
5. Adversarial Self-Chat Fine-Tuning -> Self-playing Adversarial Language Game Enhances LLM Reasoning (2404.10642)
Generate debates between model variants or model vs human, then fine-tune on the winner’s response. It helps models learn to better defend their reasoning. Used in Claude’s Constitutional AI and SPPO-style tuning
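For item 1, here is a hedged sketch of retrieval-augmented CoT chaining; `llm` and `retrieve` are hypothetical callables, and the stopping condition is a simplification rather than the CoT-RAG paper's procedure.

```python
from typing import Callable, List

def rag_cot(question: str,
            llm: Callable[[str], str],
            retrieve: Callable[[str], List[str]],
            max_steps: int = 5) -> str:
    """Fetch evidence after each reasoning subgoal and fold it back into the context."""
    context = f"Question: {question}\n"
    for step in range(max_steps):
        subgoal = llm(context + "State the next reasoning step or sub-question, "
                                "or say FINAL ANSWER if you can answer now.")
        if "FINAL ANSWER" in subgoal.upper():
            break
        docs = retrieve(subgoal)                           # retrieval at the intermediate step
        context += f"Step {step + 1}: {subgoal}\nEvidence: {' '.join(docs)}\n"
    return llm(context + "Now give the final answer.")
```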
Read further below👇
Also, subscribe to the Turing Post: https://www.turingpost.com/subscribe

seq-JEPA -> https://huggingface.co/papers/2505.03176
A world modeling framework that learns invariant and equivariant representations from view sequences and transformations, using a transformer to predict future states. Excels in sequence-based tasks.
AD-L-JEPA -> https://huggingface.co/papers/2501.04969
Learns spatial world models via Bird’s Eye View (BEV) embeddings without explicit generation or manual pair creation, simplifying training and boosting representation quality. Excels in LiDAR 3D object detection and transfer learning.
SAR-JEPA -> https://huggingface.co/papers/2311.15153
Predicts multi-scale Synthetic Aperture Radar (SAR) gradient features from locally masked patches. SAR-JEPA handles small targets and speckle noise and integrates domain-specific features to improve SSL signals.
HEP-JEPA -> https://huggingface.co/papers/2502.03933
A transformer-based foundation model for high-energy collider tasks. Using the JetClass dataset of 100M jets, it predicts embeddings of unseen jet constituents from partial context.
ECG-JEPA -> https://huggingface.co/papers/2410.13867
JEPA for self-supervised ECG representation learning designed to excel at ECG-based heart arrhythmia diagnosis
Check out more types of JEPA here -> https://huggingface.co/posts/Kseniase/646284586461230

Since Meta released the newest V-JEPA 2 this week, we thought it's a good time to revisit a few other interesting JEPA variants. JEPA, or Joint Embedding Predictive Architecture, is a self-supervised learning framework that predicts the latent representation of a missing part of the input.
Here are 11 JEPA types that you should know about:
1. V-JEPA 2 -> V-JEPA 2: Self-Supervised Video Models Enable Understanding, Prediction and Planning (2506.09985)
Trained on 1M+ hours of internet videos and a small amount of robot interaction data, V-JEPA 2 can watch, understand, answer questions, and help robots plan and act in the physical world
2. Time-Series-JEPA (TS-JEPA) -> Time-Series JEPA for Predictive Remote Control under Capacity-Limited Networks (2406.04853)
It's a time-series predictive model that learns compact, meaningful representations. A self-supervised semantic actor then uses them to generate control commands without raw data
3. Denoising JEPA (D-JEPA) -> Denoising with a Joint-Embedding Predictive Architecture (2410.03755)
Combines JEPA with diffusion techniques. By treating JEPA as masked image modeling and next-token prediction, D-JEPA generates data auto-regressively, incorporating diffusion and flow-matching losses
4. CNN-JEPA -> CNN-JEPA: Self-Supervised Pretraining Convolutional Neural Networks Using Joint Embedding Predictive Architecture (2408.07514)
This SSL approach applies the JEPA idea to CNNs using a sparse encoder, depthwise separable convolutions, and improved masking. On ImageNet-100, CNN-JEPA outperforms I-JEPA with 73.3% accuracy
5. Stem-JEPA -> Stem-JEPA: A Joint-Embedding Predictive Architecture for Musical Stem Compatibility Estimation (2408.02514)
Identifies instrument stems by mapping mixes and stems into a shared space using an encoder and predictor. It captures timbre, harmony, and rhythm for tasks like stem retrieval, alignment, and genre or key estimation
6. DMT-JEPA (Discriminative Masked Targets JEPA) -> DMT-JEPA: Discriminative Masked Targets for Joint-Embedding Predictive Architecture (2405.17995)
Improves discriminative power by generating masked targets from semantically similar neighboring patches and uses lightweight cross-attention for aggregation
Read further below👇
Also, subscribe to the Turing Post -> https://www.turingpost.com/subscribe

LRM - Large Reasoning Model (DeepSeek-R1, OpenAI's o3) -> https://huggingface.co/papers/2501.09686
Advanced AI systems specifically optimized for multi-step logical reasoning, complex problem-solving, and structured thinking. LRMs incorporate test-time scaling, Chain-of-Thought reasoning, tool use, external memory, strong math and code capabilities, and a more modular design for reliable decision-making.
MoE - Mixture of Experts (e.g. Mixtral) -> https://www.turingpost.com/p/moe
Uses many sub-networks called experts, but activates only a few per input, enabling massive scaling with sparse computation (a toy routing sketch follows this list).
SSM - State Space Model (Mamba, RetNet) -> https://huggingface.co/papers/2111.00396
- our overview of SSMs and Mamba: https://www.turingpost.com/p/mamba
A neural network that defines the sequence as a continuous dynamical system, modeling how hidden state vectors change in response to inputs over time. SSMs are parallelizable and efficient for long contexts
RNN - Recurrent Neural Network (advanced variants: LSTM, GRU) -> https://huggingface.co/papers/1912.05911
- detailed article about LSTM: https://www.turingpost.com/p/xlstm
Processes sequences one step at a time, passing information through a hidden state that acts as memory. RNNs were widely used in early NLP and time-series tasks but struggle with long-range dependencies compared to newer architectures
CNN - Convolutional Neural Network (MobileNet, EfficientNet) -> https://huggingface.co/papers/1511.08458
Automatically learns patterns from visual data. It uses convolutional layers to detect features like edges, textures, or shapes. Less dominant than Transformers now, but still used in edge applications and visual processing.
SAM - Segment Anything Model (developed by Meta AI) -> https://huggingface.co/papers/2304.02643
A foundation model trained on over 1 billion segmentation masks. Given a prompt (like a point or box), it segments the relevant object.
LNN – Liquid Neural Network (LFMs - Liquid Foundation Models by Liquid AI) -> https://arxiv.org/pdf/2006.04439
- more about LFMs https://www.turingpost.com/p/liquidhyena
LNNs use differential equations to model neuronal dynamics, adapting their behavior in real time. They continuously update their internal state, which is great for time-series data, robotics, and real-world decision making
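As promised in the MoE item above, here is a toy routing sketch in PyTorch; real MoE layers add load-balancing losses, capacity limits, and efficient expert dispatch, so treat this as an illustration only.

```python
import torch
import torch.nn as nn

class TopKMoE(nn.Module):
    """Toy sparse Mixture-of-Experts layer: only k experts run for each token."""
    def __init__(self, dim: int, num_experts: int = 8, k: int = 2):
        super().__init__()
        self.router = nn.Linear(dim, num_experts)          # scores every expert per token
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
            for _ in range(num_experts)
        )
        self.k = k

    def forward(self, x):                                  # x: (num_tokens, dim)
        weights, idx = self.router(x).softmax(dim=-1).topk(self.k, dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.k):                         # send each token to its chosen experts
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out
```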

Let’s refresh some fundamentals today to stay fluent in what we all work with. Here are some of the most popular model types that shape the vast world of AI (with examples in brackets):
1. LLM - Large Language Model (GPT, LLaMA) -> Large Language Models: A Survey (2402.06196)
+ history of LLMs: https://www.turingpost.com/t/The%20History%20of%20LLMs
LLMs are trained on massive text datasets to understand and generate human language. They are mostly built on the Transformer architecture and predict the next token. LLMs scale by increasing the overall parameter count across all components (layers, attention heads, MLPs, etc.)
2. SLM - Small Language Model (TinyLLaMA, Phi models, SmolLM) -> A Survey of Small Language Models (2410.20011)
Lightweight LM optimized for efficiency, low memory use, fast inference, and edge use. SLMs work using the same principles as LLMs
3. VLM - Vision-Language Model (CLIP, Flamingo) -> An Introduction to Vision-Language Modeling (2405.17247)
Processes and understands both images and text. VLMs map images and text into a shared embedding space or generate captions/descriptions from both
4. MLLM - Multimodal Large Language Model (Gemini) -> A Survey on Multimodal Large Language Models (2306.13549)
A large-scale model that can understand and process multiple types of data (modalities) — usually text + other formats, like images, videos, audio, structured data, 3D or spatial inputs. MLLMs can be LLMs extended with modality adapters or trained jointly across vision, text, audio, etc.
5. LAM - Large Action Model (InstructDiffusion, RT-2) -> Large Action Models: From Inception to Implementation (2412.10047)
Understands and generates action sequences by predicting action tokens (discrete/continuous instructions) that guide agents. Trained on behavior datasets, LAMs generalize across tasks, environments, and modalities - video, sensor data, etc.
Read about LRM, MoE, SSM, RNN, CNN, SAM and LNN below👇
Also, subscribe to the Turing Post: https://www.turingpost.com/subscribe

Filesystem MCP Server -> https://github.com/modelcontextprotocol/servers/tree/HEAD/src/filesystem
Read, write, and search files, plus create, delete, list, and move directories specified via args.
Notion MCP Server -> https://github.com/makenotion/notion-mcp-server
Enables models to interact with your Notion workspace to automate tasks such as searching, reading, creating, and updating pages and databases.
Markdownify MCP Server -> https://github.com/zcaceres/markdownify-mcp
Converts various file types (PDFs, images, audio) and web pages to Markdown format.
Fetch MCP Server -> https://github.com/modelcontextprotocol/servers/tree/main/src/fetch
Allows LLMs to retrieve and process content from web pages, converting HTML to Markdown.
Mobile Next - MCP server for Mobile Development and Automation -> https://github.com/mobile-next/mobile-mcp
Enables agents and LLMs to interact with iOS/Android apps using accessibility snapshots or screenshot-based taps.
MCP installer -> https://github.com/anaisbetts/mcp-installer
This one is quite hilarious - "MCP for MCP". It allows you to ask your model (Claude, for example) to install MCP servers hosted in npm or PyPi for you.

MCP changed how agents connect with tools.
After writing the most read explanation of MCP on Hugging Face (https://huggingface.co/blog/Kseniase/mcp), we chose these 13 awesome MCP servers that you can work with:
1. Agentset MCP -> https://github.com/agentset-ai/mcp-server
For quick, efficient building of intelligent, doc-based apps using the open-source Agentset platform for RAG
2. GitHub MCP Server -> https://github.com/github/github-mcp-server
Integrates GitHub APIs into your workflow, allowing you to build AI tools and apps that interact with GitHub's ecosystem
3. arXiv MCP -> https://github.com/andybrandt/mcp-simple-arxiv
Allows working with research papers on arXiv through effective search and access to their metadata, abstracts, and links
4. MCP Run Python -> https://github.com/pydantic/pydantic-ai/tree/main/mcp-run-python
Runs Python code in a sandbox via Pyodide in Deno, keeping it isolated from the rest of the operating system
5. Safe Local Python Executor -> https://github.com/maxim-saplin/mcp_safe_local_python_executor
A lightweight tool for running LLM-generated Python code locally, using Hugging Face’s LocalPythonExecutor (from smolagents framework) and exposing it via MCP for AI assistant integration
6. Cursor MCP Installer -> https://github.com/matthewdcage/cursor-mcp-installer
Automatically adds MCP servers to Cursor for development convenience
7. Basic Memory -> https://memory.basicmachines.co/docs/introduction
This knowledge management system connects to LLMs and lets you build a persistent semantic graph from your conversations with AI agents
Read further in the comments 👇
If you like it, also subscribe to the Turing Post: https://www.turingpost.com/subscribe

T-JEPA -> https://huggingface.co/papers/2410.05016
This one is for tabular (structured) data. By masking one subset of a table’s features and predicting their latent representation from another subset, it learns rich, label-agnostic embeddings.
ACT-JEPA -> https://huggingface.co/papers/2501.14622
Merges imitation and self-supervised learning to learn policy embeddings without heavy expert data. It predicts chunked actions and abstract observations in latent space, filtering noise, modeling dynamics, and cutting compounding errors.
Brain-JEPA -> https://huggingface.co/papers/2409.19407
Applies JEPA in a brain-dynamics foundation model for demographic, disease, and trait prediction.
3D-JEPA -> https://huggingface.co/papers/2409.15803
JEPA for 3D representation learning. It samples one rich context block and several target blocks, then predicts each target’s embedding from the context.
Point-JEPA -> https://huggingface.co/papers/2404.16432
Brings joint-embedding predictive learning to point clouds. A lightweight sequencer orders patch embeddings. It lets the model choose context and target patches and reuse distance calculations for speed

JEPA, or Joint Embedding Predictive Architecture, is an approach to building AI models introduced by Yann LeCun. Unlike generative models, it predicts the latent representation of a missing or future part of the input rather than the next token or pixel. This encourages conceptual understanding, not just low-level pattern matching, so JEPA is a step toward teaching AI to reason abstractly. A minimal training-step sketch is shown below.
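Here is a hedged, modality-agnostic sketch of one JEPA training step; the encoder/predictor modules, EMA rate, and loss are illustrative assumptions, and real I-JEPA/V-JEPA recipes add specific masking strategies and ViT backbones.

```python
import torch
import torch.nn.functional as F

def jepa_step(context_encoder, predictor, target_encoder,
              x_context, x_target, optimizer, ema=0.996):
    """One generic JEPA step: predict the latent of the hidden part, not its pixels/tokens."""
    with torch.no_grad():
        target_repr = target_encoder(x_target)             # latent target, no gradient
    pred = predictor(context_encoder(x_context))            # predicted latent from visible context
    loss = F.mse_loss(pred, target_repr)                    # distance measured in embedding space

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    # momentum (EMA) update keeps the target encoder a slow-moving copy of the context encoder
    with torch.no_grad():
        for p_t, p_c in zip(target_encoder.parameters(), context_encoder.parameters()):
            p_t.mul_(ema).add_(p_c, alpha=1 - ema)
    return loss.item()
```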
Here are 12 types of JEPA you should know about:
1. I-JEPA -> Self-Supervised Learning from Images with a Joint-Embedding Predictive Architecture (2301.08243)
A non-generative, self-supervised learning framework designed for processing images. It works by masking parts of an image and then predicting the latent representations of those masked regions
2. MC-JEPA -> MC-JEPA: A Joint-Embedding Predictive Architecture for Self-Supervised Learning of Motion and Content Features (2307.12698)
Simultaneously interprets video data - dynamic elements (motion) and static details (content) - using a shared encoder
3. V-JEPA -> Revisiting Feature Prediction for Learning Visual Representations from Video (2404.08471)
Presents vision models trained by predicting future video features, without pretrained image encoders, text, negative sampling, or reconstruction
4. UI-JEPA -> UI-JEPA: Towards Active Perception of User Intent through Onscreen User Activity (2409.04081)
Masks unlabeled UI sequences to learn abstract embeddings, then adds a fine-tuned LLM decoder for intent prediction.
5. Audio-based JEPA (A-JEPA) -> A-JEPA: Joint-Embedding Predictive Architecture Can Listen (2311.15830)
Masks spectrogram patches with a curriculum, encodes them, and predicts hidden representations.
6. S-JEPA -> S-JEPA: towards seamless cross-dataset transfer through dynamic spatial attention (2403.11772)
Signal-JEPA is used in EEG analysis. It adds a spatial block-masking scheme and three lightweight downstream classifiers
7. TI-JEPA -> TI-JEPA: An Innovative Energy-based Joint Embedding Strategy for Text-Image Multimodal Systems (2503.06380)
Text-Image JEPA uses self-supervised, energy-based pre-training to map text and images into a shared embedding space, improving cross-modal transfer to downstream tasks
Find more types below 👇
Also, explore the basics of JEPA in our article: https://www.turingpost.com/p/jepa
If you liked it, subscribe to the Turing Post: https://www.turingpost.com/subscribe