Tremendous quality-of-life upgrade on the Hugging Face Hub: we now have auto-complete emojis.
Get ready for lots more very serious analysis on a whole range of topics from yours truly, now that we have unlocked this full range of expression.
This is a fantastic example of large-scale curation of public-domain books with intentional governance for AI research and use - definitely recommend checking it out, experimenting with the metadata (institutional/institutional-books-1.0-metadata), and starting to build on top of it.
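For anyone who wants to poke at it programmatically, here is a minimal sketch using the `datasets` library; the split name and layout are assumptions, so check the dataset card first.

```python
from datasets import load_dataset

# Minimal sketch: pull the metadata records from the Hub and inspect one.
# The "train" split is an assumption; check the dataset card for the actual layout.
meta = load_dataset("institutional/institutional-books-1.0-metadata", split="train")
print(meta.column_names)
print(meta[0])
```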
Reacted to frascuchon's post (7 months ago):
Extending datasets just got a whole lot easier! With Sheets, I was able to create a Spanish version of the popular fka/awesome-chatgpt-prompts dataset in just a few minutes.
Want to try it out for yourself? Head over to the Sheets space and see how easy it is to extend and modify existing datasets. The possibilities are endless!
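Sheets does all of this in the UI, but the same extension can be sketched in plain code; in this hedged example the column name ("prompt"), the model ID, and the target repo are assumptions, not what Sheets actually runs under the hood.

```python
from datasets import load_dataset
from huggingface_hub import InferenceClient

client = InferenceClient()
ds = load_dataset("fka/awesome-chatgpt-prompts", split="train")

def to_spanish(row):
    # Translate each prompt with a hosted chat model; model ID is a placeholder choice.
    out = client.chat_completion(
        model="meta-llama/Llama-3.1-8B-Instruct",
        messages=[{"role": "user", "content": f"Translate to Spanish:\n{row['prompt']}"}],
        max_tokens=512,
    )
    row["prompt_es"] = out.choices[0].message.content
    return row

ds_es = ds.map(to_spanish)
ds_es.push_to_hub("your-username/awesome-chatgpt-prompts-es")  # hypothetical repo id
```

Sheets layers agentic search and human feedback on top of this basic translate-and-push loop.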
Reacted to dvilasuero's post (7 months ago):
Super excited to launch Hugging Face Sheets: Spreadsheets meet AI and unstructured data.
A few months ago, we started imagining new ways to build and transform datasets with the latest open-source models.
Today, I'm thrilled to introduce our first step in this direction.
In a nutshell:
- Effortlessly run prompts and models over your data.
- Agentic search for accuracy and real-time information.
- Familiar, minimalistic interface for interacting with data.
- Human feedback 2.0: your input directly improves generated data.
- Access hundreds of open models and leading inference providers.
With Sheets, try a new way to create structured content with the help of AI!
No installs. No login. Just open a link.
This app lets you create a dataset by importing a file or starting from a prompt.
What's different about Sheets?
- Web search integration to ground answers in real-world data
- In-context learning from validated sources
- Transparent sourcing: every result is linked
- Runs on multiple open-source models
Fight hallucinations and start creating content you can rely on.
Reacted to frascuchon's post (7 months ago):
Hey! I built RAG MCP Server Space, a simple Gradio MCP server for RAG systems that lets you retrieve relevant results without passing huge contexts to your LLM.
You can use this space to integrate with your agents and improve the efficiency of your search results. Feel free to try it out and let me know if you have any feedback or questions!
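One way to wire the Space into an agent is via smolagents' MCP tooling; this is a hedged sketch where the Space URL is a placeholder, the SSE path follows Gradio's usual MCP convention, and class names track recent smolagents releases.

```python
from smolagents import CodeAgent, InferenceClientModel, ToolCollection

# Placeholder URL: Gradio MCP servers typically expose an SSE endpoint under /gradio_api/mcp/sse.
server = {"url": "https://<space-host>/gradio_api/mcp/sse"}

with ToolCollection.from_mcp(server, trust_remote_code=True) as tools:
    # Hand the RAG search tool(s) to an agent so it retrieves snippets on demand
    # instead of stuffing the whole corpus into the prompt.
    agent = CodeAgent(tools=[*tools.tools], model=InferenceClientModel())
    agent.run("Search the knowledge base for 'retrieval latency' and summarize the top hits.")
```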
New in smolagents v1.16.0:
- Bing support in WebSearchTool
- Custom functions & executor_kwargs in LocalPythonExecutor
- Streaming GradioUI fixes
- Local web agents via api_base & api_key
- Better docs
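A hedged usage sketch for the first item; the `engine="bing"` argument is inferred from the release note rather than checked against the tool's signature.

```python
from smolagents import CodeAgent, InferenceClientModel, WebSearchTool

# Assumption: WebSearchTool accepts an engine selector per the v1.16.0 notes.
agent = CodeAgent(
    tools=[WebSearchTool(engine="bing")],
    model=InferenceClientModel(),
)
agent.run("What changed in smolagents v1.16.0?")
```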
We're thrilled to announce the launch of our comprehensive Model Context Protocol (MCP) Course! This free program is designed to take learners from foundational understanding to practical application of MCP in AI.
In this course, you will:
- Study Model Context Protocol in theory, design, and practice.
- Learn to use established MCP SDKs and frameworks.
- Share your projects and explore applications created by the community.
- Participate in challenges and evaluate your MCP implementations.
- Earn a certificate of completion.
At the end of this course, you'll understand how MCP works and how to build your own AI applications that leverage external data and tools using the latest MCP standards.
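As a flavor of what the hands-on units build toward, here is a toy MCP server using the official Python SDK's FastMCP helper; the server name and tool are made up for illustration.

```python
from mcp.server.fastmcp import FastMCP

# Toy MCP server: one tool exposed over stdio so an MCP client can call it.
mcp = FastMCP("demo-server")

@mcp.tool()
def add(a: int, b: int) -> int:
    """Add two numbers."""
    return a + b

if __name__ == "__main__":
    mcp.run()
```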
Hey! I built an AI agent to query the FOIA API for a workshop at the Hacks/Hackers Summit in Baltimore, and you can do it too!
It's a quick proof of concept demonstrating what agents can do, how to design workflows, and how to approach the coding side.
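The agent itself boils down to a model plus one tool that calls the FOIA API; here is a hedged sketch with smolagents, where the endpoint and parameters are placeholders rather than the exact ones used in the workshop.

```python
import requests
from smolagents import CodeAgent, InferenceClientModel, tool

@tool
def search_foia(query: str) -> str:
    """Search FOIA.gov records for a keyword.

    Args:
        query: Keyword to search for.
    """
    # Placeholder endpoint and params; swap in the real FOIA API route and key.
    resp = requests.get(
        "https://api.foia.gov/api/search",
        params={"q": query},
        timeout=30,
    )
    return resp.text[:2000]  # trim so the agent's context stays small

agent = CodeAgent(tools=[search_foia], model=InferenceClientModel())
agent.run("Which agencies mention 'drone' in recent FOIA records?")
```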
Recent RL paradigms have often relied on sets of questions and answers that need to be manually curated. Researchers from Tsinghua University went: why, though?
Indeed, why learn from questions designed by a human teacher, when the model can start from its base knowledge and learn by experimenting in a code environment, proposing coding tasks itself and trying to solve them?
Thus they created "Absolute Zero Reasoning" (AZR), an approach that removes any need for human-curated data.
Dual roles:
- Proposer: generates challenging but solvable coding tasks
- Solver: attempts to solve those self-proposed tasks
Three task types: all tasks are defined as triplets of (program, input, output)
- Deduction: give the model an input and a program; it must deduce the output
- Abduction: give the model a program and an output; it must find an input that produces that output
- Induction: synthesize a program from input/output pairs (see the toy sketch below)
Btw, this reminded me of my long-forgotten philosophy classes: Aristotle was more on the induction side, learning from real-world analogies, while Plato was more on the deduction side, trying to progress quite far with just one input and his reasoning.
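Here is that toy sketch, purely to illustrate how each task type hides one element of the (program, input, output) triplet; this is not the paper's code.

```python
# Toy illustration: each task type asks the model to recover one missing element.
program = "lambda x: x * 2 + 1"
fn = eval(program)

# Deduction: given program + input, predict the output.
assert fn(3) == 7

# Abduction: given program + output, find an input that produces it.
candidate_input = 3                      # the model's guess
assert fn(candidate_input) == 7

# Induction: given input/output pairs, synthesize a program that fits them.
pairs = [(0, 1), (2, 5), (3, 7)]
synthesized = eval("lambda x: 2 * x + 1")  # the model's proposed program
assert all(synthesized(i) == o for i, o in pairs)
```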
Results:
- AZR post-training brings a nice improvement on known models like Qwen2.5-7B
- Shows strong cross-domain transfer: coding ↔ math reasoning
Other findings:
- Better base performance (general or code-specific) amplifies the gains from Absolute Zero Reasoning
- The researchers warn about "uh-oh moments" (a wink at the "aha moments" of DeepSeek) where the model generates concerning goals like "make an extremely convoluted code to outsmart all these humans": so supervision is still needed!