In a Training Loop 🔄

527 2315 28522

John Smith PRO

John6666

John6666cat

AI & ML interests

None yet

Recent Activity

Organizations

reacted to Reubencf's post with 🤗 about 6 hours ago

Post

280

Now Live: The Reubencf/Nano_Banana_Editor now includes 10 free requests/day! 🍌 I'm personally sponsoring these credits to help make open AI accessible to all.
(Note: Limits are subject to change based on funding).

Enjoy!

reacted to dhruv3006's post with 👀 about 6 hours ago

Post

133

The problem: Hardcoded URLs, tokens, and IDs make API workflows brittle and painful to maintain.
What devs do today:
Duplicate values across files or manually swap configs for dev, staging, and prod. This is easy to break and hard to scale.
Why Voiden:
- Voiden Variables let you define once and reuse everywhere.
- Switch environments easily, keep secrets out of request files, and reuse dynamic values across requests.

Static requests → dynamic, maintainable workflows.

Use Voiden here: https://voiden.md
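
The "define once, reuse everywhere" idea from the post can be illustrated with a minimal, generic Python sketch. This is not Voiden's syntax or API; the environment names, URLs, and the `render_request` helper are all hypothetical and exist only to show a request written once and filled in per environment.

```python
from string import Template
from typing import Optional

# Hypothetical per-environment variable sets; names and URLs are illustrative only.
ENVIRONMENTS = {
    "dev": {"base_url": "http://localhost:8000", "user_id": "42"},
    "prod": {"base_url": "https://api.example.com", "user_id": "42"},
}

# The request is written once, with placeholders instead of hardcoded values.
REQUEST_TEMPLATE = Template("GET $base_url/users/$user_id")


def render_request(env_name: str, secrets: Optional[dict] = None) -> str:
    """Fill the request template from the chosen environment plus optional secrets."""
    variables = {**ENVIRONMENTS[env_name], **(secrets or {})}
    # substitute() raises KeyError if a placeholder has no value, so gaps fail loudly.
    return REQUEST_TEMPLATE.substitute(variables)


dev_request = render_request("dev")
prod_request = render_request("prod")
```

Switching environments changes only the variable set passed in, while the request definition itself stays untouched.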

reacted to rajkumarrawal's post with 👍 about 6 hours ago

Post

274

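I submitted the paper "AgencyBench: Benchmarking the Frontiers of Autonomous Agents in 1M-Token Real-World Contexts" by Keyu Li (@weizhihao1), Junhao Shi, Dequan Wang (@dqwang), Yang Xiao (@YangXiao-nlp), Mohan Jiang, Jie Sun (@Sunshine279), Yunze Wu, Shijie Xia, Xiaojie Cai, Tianze Xu, Weiye Si, Wenjie Li, and Pengfei Liu, from SJTU (Shanghai Jiao Tong University), PolyUHK (The Hong Kong Polytechnic University), and GAIR (SII-GAIR), to Daily Papers on Hugging Face.

Potentially another direction for benchmarking the frontiers of autonomous agents in 2026.

Some of the observations found are: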

-- Long-horizon tasks remain challenging :
Even frontier models struggle with sustained reasoning over real world tasks that require 1M tokens and 90 tool calls, indicating limits in long context autonomy.

-- Proprietary models outperform open source models:
Closed source models achieve a higher average score (48.4%) than open source counterparts (32.1%), revealing a persistent performance gap on complex agentic tasks.

-- Feedback driven self correction varies widely:
Models like GPT 5.2 and Claude show strong gains from iterative feedback, while others (e.g. DeepSeek V3.2) exhibit minimal or no improvement after feedback.

-- Efficiency trade-offs are significant:
High performing models often consume far more tokens and time, while some models (e.g. Grok 4.1 Fast) are more token efficient despite lower absolute scores.

-- Agentic scaffolds strongly influence performance:
Models tend to perform best within their native or optimized ecosystems, highlighting that agent performance depends on tight coupling between the model and its scaffold, not on the model alone.

..... many more...

AgencyBench: Benchmarking the Frontiers of Autonomous Agents in 1M-Token Real-World Contexts (2601.11044)

1 reply

reacted to kostakoff's post with 🤗 about 6 hours ago

Post

312

Hi Everyone!
I am new to LLM AI, and I think it is very cool.
Thanks to the HF authors that Hugging Face exists. It is a great place to start learning AI.

1 reply

reacted to mahimairaja's post with 👀 about 6 hours ago

Post

334

Is the lack of vLLM support for Transformers v5 frustrating only me?

reacted to hassenhamdi's post with 🔥 about 6 hours ago

Post

630

Google published the paper. I shipped the code. 🚀

DeepMind just released PACEvolve (Progress-Aware Consistent Evolution), a massive overhaul of the AlphaEvolve framework. It solves the critical issues of "Context Pollution" and "Mode Collapse" that have historically crippled evolutionary coding agents.

But there was no public implementation. So I built one.

Introducing OpenPACEvolve: A fully open-source, production-grade implementation of the PACEvolve framework.

🛠 I engineered this framework solo, but I wasn't working alone. I orchestrated custom coding agents powered by Claude Opus 4.5 as Engineer, with Gemini Pro 3 Preview ensuring fidelity and quality.

By leveraging these SOTA models, I was able to translate complex theoretical research into functional, modular Python architecture in record time. This is what the future of AI engineering looks like: Human architectural oversight + AI velocity.

🧠 What OpenPACEvolve Solves: Unlike standard agents that get "stuck" in loops, this framework implements the paper's full recipe for long-horizon stability:
✅ Hierarchical Context Management (HCM): Bi-level pruning to keep the agent's memory clean.
✅ Momentum-Based Backtracking (MBB): Uses "power-law backtracking" to detect stagnation and force pivots.
✅ Self-Adaptive Crossover: Intelligent code-sharing between parallel "islands."

👨‍💻 This project is more than a repo; it's a demonstration of rapid research-to-production cycles using next-gen AI workflows.

📎 Link to the paper: https://arxiv.org/abs/2601.10657

The code is live. The agents are ready. Check out the repository below. 👇
https://github.com/hassenhamdi/OpenPACEvolve
Star the repo 🌟.

reacted to branikita's post with 🚀 about 6 hours ago

Post

758

Our engineer Alan from https://robonine.com/ (Educational Robotics) integrated Feetech STS3250 and STS3215 servo motors into the prototype and completed the first test run of a 6-DOF semi-SCARA manipulator.

During motion, the structure demonstrates high stiffness with no visible backlash or mechanical play. The kinematic chain remains stable throughout the test trajectory, confirming the rigidity of the mechanical design and joint assembly.

The next stage includes full assembly with all actuators operating in backlash compensation mode, followed by quantitative measurement of positioning accuracy and repeatability.

reacted to mitkox's post with 🚀 about 6 hours ago

Post

502

GLM-4.7-Flash is fast, good and cheap.
3,074 tokens/sec peak at 200k tokens context window on my desktop PC.
Works with Claude Code and opencode for hours. No errors, drop-in replacement for the Anthropic cloud AI.
MIT licensed, open weights, free for commercial use and modifications.
Supports speculative decoding using MTP, which is highly effective in mitigating latency.
Great for on-device AI coding as AWQ 4-bit at 18.5 GB. Hybrid inference on a single consumer GPU + CPU RAM.

2 replies

reacted to AdinaY's post with 🔥 about 6 hours ago

Post

207

AgentCPM-report 🔥 local DeepResearch agent released from OpenBMB

openbmb/AgentCPM-Report

✨ 8B - Apache 2.0
✨ Gemini-2.5-Pro level DeepResearch report generation
✨ Fully offline, privacy-first local deployment
✨ + GGUF version

1 reply

reacted to kanaria007's post with ❤️ about 6 hours ago

Post

290

✅ New Article: *Effectful Ops That Don’t Break the World* (v0.1)

Title:
🧾 Effectful Ops in SI-Core: RML and Compensator Patterns
🔗 https://huggingface.co/blog/kanaria007/effectful-ops-in-si-core

---

Summary:
Structured Intelligence systems don’t just *think*—they *change the world* (payments, bookings, city actuators, learning/medical records). In distributed reality, partial failures and retries are normal, so “do it once” is a myth.

This article is a practical cookbook for making effectful operations *retry-safe, reversible (when possible), and auditable*, using *RML levels (1→3)*, *Sagas + compensators*, and “single storyline” effect traces—then measuring quality via *RBL / RIR / SCI*.

> A compensator is *another effect*, not a magical “undo”.

---

Why It Matters:
• Prevents double-apply / half-committed states by defaulting to *idempotency + durable traces*
• Makes rollback *engineering-real*: compensators must be *idempotent*, monotone toward safety, and bounded to a durable terminal/pending state
• Handles “can’t undo” honestly: model *partial reversibility* + remaining risk + follow-up tasks
• Turns failure handling into metrics you can operate: *RBL (rollback latency), RIR (rollback integrity), SCI (structural inconsistencies)*

---

What’s Inside:
• RML levels overview: *RML-1 (idempotent effects)* → *RML-2 (Sagas/compensators)* → *RML-3 (goal-native reversible flow graphs)*
• Compensator patterns: idempotent refunds, append-only “compensating logs”, corrective/restitution effects
• Cross-domain templates (payments / reservations / city / learning) + common pitfalls (ghost holds, out-of-order msgs)
• A full walkthrough: partial success → compensate → re-plan & re-apply as *one coherent conversation with the world*
• Implementation path: effect records → idempotency → mini-sagas → metrics → lift critical flows toward RML-3

---

📖 Structured Intelligence Engineering Series
This is the *how-to-design / how-to-operate* layer for effectful systems.
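
To make the idempotency and compensator ideas above concrete, here is a minimal Python sketch of a generic Saga with idempotency keys. It is not code from the article and does not implement its RML levels or metrics; the `Ledger` class, the key format, and the `charge`, `refund`, and `run_saga` helpers are hypothetical names chosen for illustration, and the in-memory dictionary merely stands in for a durable effect log.

```python
from dataclasses import dataclass, field


@dataclass
class Ledger:
    """Toy in-memory stand-in for a durable effect log (hypothetical)."""
    applied: dict = field(default_factory=dict)  # idempotency key -> amount
    balance: int = 0

    def apply(self, key: str, amount: int) -> bool:
        """Apply an effect once per key; a retry with the same key is a no-op."""
        if key in self.applied:
            return False
        self.applied[key] = amount
        self.balance += amount
        return True


def charge(ledger: Ledger, order_id: str, amount: int) -> bool:
    return ledger.apply(f"charge:{order_id}", amount)


def refund(ledger: Ledger, order_id: str, amount: int) -> bool:
    # The compensator is another effect with its own idempotency key, not an "undo".
    return ledger.apply(f"refund:{order_id}", -amount)


def run_saga(steps) -> bool:
    """Run (action, compensator) pairs; on failure, compensate completed steps in reverse."""
    completed = []
    try:
        for action, compensator in steps:
            action()
            completed.append(compensator)
    except Exception:
        for compensator in reversed(completed):
            compensator()
        return False
    return True


def fail_booking():
    raise RuntimeError("booking service unavailable")


ledger = Ledger()
ok = run_saga([
    (lambda: charge(ledger, "order-1", 100), lambda: refund(ledger, "order-1", 100)),
    (fail_booking, lambda: None),
])
retry_applied = refund(ledger, "order-1", 100)  # retried compensator does not refund twice
```

Because both the charge and the refund are keyed, replaying either one after a timeout leaves the balance unchanged, which is the retry-safety property the summary describes.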

reacted to Parveshiiii's post with 👀 1 day ago

Post

129

📢 The Announcement
Subject: XenArcAI is now Modotte – A New Chapter Begins! 🚀

Hello everyone,

We are thrilled to announce that XenArcAI is officially rebranding to Modotte!

Since our journey began, we’ve been committed to pushing the boundaries of AI through open-source innovation, research, and high-quality datasets. As we continue to evolve, we wanted a name that better represents our vision for a modern, interconnected future in the tech space.

What is changing?

The Name: Moving forward, all our projects, models, and community interactions will happen under the Modotte banner.

The Look: You’ll see our new logo and a fresh color palette appearing across our platforms.

What is staying the same?

The Core Team: It’s still the same people behind the scenes, including our founder, Parvesh Rawal.

Our Mission: We remain dedicated to releasing state-of-the-art open-source models and datasets.

Our Continuity: All existing models, datasets, and projects will remain exactly as they are—just with a new home.

This isn’t just a change in appearance; it’s a commitment to our next chapter of growth and discovery. We are so grateful for your ongoing support as we step into this new era.

Welcome to the future. Welcome to Modotte.

Best regards, The Modotte Team

reacted to projectlosangeles's post with ❤️🔥 1 day ago

Post

1250

Check out Orpheus Karaoke! Turn any MIDI into a unique Karaoke MIDI!

projectlosangeles/Orpheus-Karaoke

reacted to efecelik's post with 🔥 1 day ago

Post

1134

🎮 Introducing: Paper Popularity Game

Think you know which AI papers go viral? Test your instincts!
I built a little game where you try to guess the popularity of AI research papers from the Hugging Face Daily Papers feed.

How it works:
You'll see two papers side by side—read the titles, check the abstracts, and pick which one you think got more upvotes from the HF community.

It's a great way to discover trending AI research while having fun.
It also tests your intuition about what the ML community finds interesting.

Try it out:
efecelik/paper-popularity-game
Would love to hear your high scores and feedback!

reacted to davidmezzetti's post with 🚀 1 day ago

Post

128

Do you like to "think" in Jupyter Notebooks? Want to integrate OpenCode with that? Then check this project out.

NCoder is an open-source AI coding agent that connects a locally running OpenCode server to a Jupyter Notebook via TxtAI.

https://github.com/neuml/ncoder

reacted to nyuuzyou's post with 🔥 1 day ago

Post

1258

🏛️ Google Code Archive Dataset - nyuuzyou/google-code-archive

Expanding beyond the modern code series, this release presents a massive historical snapshot from the Google Code Archive. This dataset captures the open-source landscape from 2006 to 2016, offering a unique time capsule of software development patterns during the era before GitHub's dominance.

Key Stats:

- 65,825,565 files from 488,618 repositories
- 47 GB compressed Parquet storage
- 454 programming languages (Heavily featuring Java, PHP, and C++)
- Extensive quality filtering (excluding vendor code and build artifacts)
- Rich historical metadata: original repo names, file paths, and era-specific licenses

This is one of those releases that I'm most interested in getting feedback on. Would you like to see more old code datasets?

reacted to unmodeled-tyler's post with 🚀 1 day ago

Post

1157

NEW MODEL: vanta-research/mox-small-1

Mox-Small-1 has landed on the Hub!

Finetuned from the fantastic Olmo3.1 32B architecture by AllenAI, Mox-Small-1 was trained using the same datasets and methodology as Mox-Tiny-1, making this model our second addition to the Mox-1 family of models.

Mox-1 is designed to prioritize clarity, honesty, and genuine utility over blind agreement. These models are perfect for when you want to be challenged in a constructive, helpful way.

By utilizing Olmo3.1 32B's architecture, Mox-Small-1 brings greater conversational depth and reasoning quality to the Mox-1 model family. Check it out!

reacted to lorraine2's post with ❤️ 1 day ago

Post

1422

📽️ New NVIDIA paper: Motion Attribution for Video Generation 📽️

We propose MOTIVE, a method for taking query video clips and identifying which training data will improve or degrade performance after finetuning, enabling sophisticated data curation and beyond!

🔎 Project Page: https://research.nvidia.com/labs/sil/projects/MOTIVE/
📖 Full Paper: https://arxiv.org/abs/2601.08828

Check out more work from the NVIDIA Spatial Intelligence Lab here: https://research.nvidia.com/labs/sil/

This project was led by the great work of Xindi (Cindy) Wu, along with Despoina Paschalidou, Jun Gao, Antonio Torralba, Laura Leal-Taixé, Olga Russakovsky, and Sanja Fidler.

reacted to phronos-research's post with 👀 1 day ago

Post

1119

Can we measure how AI interaction reshapes human cognition? We built two semantic association instruments that pit humans against Claude Haiku—testing divergent thinking and communicability under constraint. Try the instruments and contribute to the dataset: https://instruments.phronos.org/ins-001/

Explanation here: https://phronos.org/dispatches/semantic-cartography

reacted to AdinaY's post with 🔥 1 day ago

Post

1204

DeepSeek R1 dropped one year ago 🐳 and a lot has changed.

With @irenesolaiman, we’re launching a blog series about how that moment reshaped AI + open source in 2025, starting with strategic shifts and the explosion of new open models in China!

https://huggingface.co/blog/huggingface/one-year-since-the-deepseek-moment
