AI Agent with Robotics

· 8 min read
Damon Lee
Fission Team

The Dawn of Generalist Humanoid Robots

In the rapidly evolving field of artificial intelligence, one of the most exciting frontiers is the development of generalist humanoid robots capable of performing a wide range of tasks in human environments. Recent breakthroughs in hardware design, model architecture, and training methodologies have brought us closer than ever to realizing this vision. This blog post explores the current state of robotics AI agents, focusing on NVIDIA's groundbreaking GR00T N1 model, and examines both the possibilities and challenges that lie ahead.

The Rise of Robot Foundation Models

Just as foundation models like GPT and CLIP have revolutionized natural language processing and computer vision, similar approaches are now being applied to robotics. These robot foundation models aim to provide a versatile "backbone" of intelligence that can reason about novel situations, handle real-world variability, and quickly adapt to new tasks.

NVIDIA recently introduced GR00T N1, an open foundation model specifically designed for humanoid robots. GR00T (Generalist Robot 00 Technology) represents a significant step toward creating robots that can operate in complex, unstructured environments and perform a diverse array of tasks through a unified learning framework.

Hierarchical Structuring in Data Labeling for Unstructured Data

· 26 min read
Christopher
Fission Team
Damon Lee
Fission Team

1. Introduction

Managing unstructured data (text, images, audio, video) for machine learning is labor-intensive, especially when labeling large datasets. A proposed solution is to use hierarchical structuring of labels – for example, having a broad category like “human” with sub-labels like “race”, “gender”, “age group”. The hypothesis is that a taxonomy of labels (categories and subcategories) can streamline annotation workflows and reduce labeling and preprocessing costs. In this report, we investigate existing research and industry practices to see if this hypothesis holds true. We examine how hierarchical taxonomies and multi-level schemas have been applied in data labeling pipelines, whether they improved efficiency or reduced resource use, and how this approach relates to modern data-centric AI techniques (like weak supervision or active learning). We also discuss how hierarchical labeling could integrate with emerging architectures such as the Model Context Protocol (MCP) for agent-based AI, and consider potential benefits for AI safety and fairness (through more consistent, transparent labeling).

2. Background

Taxonomies and Hierarchical Labels

A taxonomy in data labeling is a structured classification scheme: it defines categories and (optionally) subcategories in a tree-like hierarchy. In practice, a well-designed taxonomy can enhance labeling efficiency and consistency. By providing clear category definitions and relationships, taxonomy-based labeling gives annotators a structured guideline, reducing ambiguity. According to industry guides, a good taxonomy enhances efficiency and accuracy in data labeling, reduces training time, and improves data comprehension.
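The idea of a tree-like label schema can be sketched in a few lines of code. The category and sub-label names below extend the “human” example from the introduction; the helper functions are illustrative, not drawn from any specific labeling tool:

```python
# A minimal sketch of a hierarchical label taxonomy as a nested dict.
# Choosing a top-level category determines which sub-label fields apply,
# and the taxonomy itself can validate annotator input, catching typos early.

TAXONOMY = {
    "human": {
        "race": ["asian", "black", "white", "other"],
        "gender": ["female", "male", "non-binary"],
        "age group": ["child", "adult", "senior"],
    },
    "vehicle": {
        "type": ["car", "truck", "bicycle"],
    },
}

def sublabels(category: str) -> list[str]:
    """Return the sub-label fields an annotator must fill for a category."""
    return list(TAXONOMY.get(category, {}))

def validate(category: str, field: str, value: str) -> bool:
    """Check a sub-label value against the taxonomy."""
    return value in TAXONOMY.get(category, {}).get(field, [])

print(sublabels("human"))                      # fields shown once "human" is chosen
print(validate("human", "gender", "female"))   # accepted
print(validate("human", "gender", "femle"))    # rejected: not in the taxonomy
```

Because the annotation UI only ever shows the sub-labels relevant to the chosen category, annotators face fewer choices at each step, which is one mechanism behind the efficiency gains described above.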

The Present and Future of the Model Context Protocol (MCP)

· 14 min read
Damon Lee
Fission Team

In the rapidly evolving landscape of artificial intelligence, a new standard has emerged that's quietly revolutionizing how AI agents interact with tools and capabilities. The Model Context Protocol (MCP) has grown from an overlooked concept to a central pillar in AI development seemingly overnight. But what exactly is driving this sudden interest, and why should developers and AI enthusiasts take notice?

The Unexpected Journey of MCP

When the Model Context Protocol was first introduced by Anthropic in November 2024, it received little attention from the developer community. Initially described as "the future of LLM Agent Tools," MCP was largely dismissed as just another collection of tools for large language models, nothing more revolutionary than a basic toolbox.

This lack of interest stemmed from two key factors. First, MCP appeared to be merely a collection of tools without significant differentiation from existing solutions. Second, and perhaps more importantly, no major frameworks supported MCP integration at the time, limiting its practical application.

The turning point came in early 2025 when Cursor.AI, one of the leading automated coding tools, announced full MCP integration into their system. This endorsement from a prominent platform sparked widespread interest, transforming MCP from an overlooked concept into one of the hottest topics in AI development within just a few months of its release.

Fission’s DeSAi Vision: Merging DeSci, AI Optimization, and Futarchy

· 20 min read
Christopher
Fission Team
Damon Lee
Fission Team

1. Introduction

1-1. What Fission Does

Fission is an AI research and development team that specializes in model optimization—making advanced AI models more efficient, scalable, and accessible to a broader range of users. Rather than pouring endless resources into ever-larger models, we focus on streamlining computing requirements and enhancing model performance through smarter, more targeted approaches. Our guiding principle is simple yet profound: the best AI emerges when technical innovation is combined with high-quality, iterative human feedback.

To achieve this, we design workflows and tools that invite user participation throughout the AI lifecycle. From initial data refinement to ongoing feedback loops, we’ve learned that human insight—especially when harnessed at scale—often trumps brute-force computation. By refining AI models through collective user input, we aim to craft systems that are both robust and grounded in real-world needs.

1-2. The Core Narrative

As we delved deeper into model optimization, we discovered a critical tipping point: no matter how elegant or efficient our algorithms became, they still depended on authentic human feedback to truly excel. While data labeling had been the traditional approach, we realized that continuous, context-rich evaluation—from domain experts, enthusiasts, and everyday users—was the key to fine-tuning AI in a more dynamic, impactful way.

However, gathering and managing this feedback via centralized methods proved costly and time-consuming. It was also difficult to maintain transparency and fairness at scale. That challenge led us to explore more decentralized structures—platforms where users could offer feedback and be rewarded in ways that felt natural, equitable, and community-driven.

That’s where DeSci (Decentralized Science) entered the conversation. DeSci promised an open, collaborative environment for knowledge-sharing and verification, yet it quickly became clear that advanced AI was needed to handle large data streams and real-time inputs. Combining DeSci’s ethos with AI-driven workflows gave rise to DeSAi—a synergy merging decentralized collaboration with iterative optimization.

Stop Calling Everything an Agent: Here’s What It Actually Means

· 3 min read
Damon Lee
Fission Team

An LLM-based agent is an AI system that leverages a Large Language Model (LLM) as its core computational engine to perform complex tasks autonomously. These agents are capable of understanding and generating human-like language, reasoning through problems, planning actions, and interacting with external tools or environments to achieve specific objectives.

Key Components of LLM-Based Agents:

  1. Core LLM: The foundational large language model trained on extensive text data, enabling the agent to comprehend and produce human-like language.
  2. Prompting Mechanism: Carefully crafted prompts that define the agent's identity, instructions, and context, guiding its responses and actions.
  3. Memory Modules:
    • Short-Term Memory: Maintains context within ongoing interactions, ensuring coherent and contextually relevant responses.
    • Long-Term Memory: Stores information from past interactions, allowing the agent to recall and utilize previous knowledge in future tasks.
  4. Knowledge Integration: Incorporates domain-specific knowledge, commonsense understanding, and procedural information to enhance decision-making and task execution.
  5. Tool Integration: Interfaces with external tools, APIs, or services to perform specialized tasks beyond language processing, such as data retrieval, computations, or accessing real-time information.
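The five components above can be wired together in a short sketch. Everything here is hypothetical: `fake_llm` stands in for a real model call, and the tool registry and memory lists mirror the component list rather than any particular framework's API:

```python
# A toy agent loop: prompt assembly, an LLM "decision", tool dispatch,
# and short-/long-term memory. The LLM is stubbed so the example runs offline.

def fake_llm(prompt: str) -> str:
    # Stand-in for a real model: "decides" to call the calculator tool
    # whenever the task mentions addition, otherwise answers directly.
    if "add" in prompt:
        return "TOOL:calculator:2+3"
    return "ANSWER:done"

TOOLS = {"calculator": lambda expr: str(eval(expr))}   # tool integration (5)

def run_agent(task: str, long_term_memory: list[str]) -> str:
    short_term = [f"task: {task}"]                     # short-term memory (3)
    prompt = "\n".join(long_term_memory + short_term)  # prompting mechanism (2)
    decision = fake_llm(prompt)                        # core LLM (1)
    if decision.startswith("TOOL:"):
        _, name, arg = decision.split(":", 2)
        result = TOOLS[name](arg)
        long_term_memory.append(f"tool {name} -> {result}")  # long-term memory (3)
        return result
    return decision.removeprefix("ANSWER:")

memory: list[str] = []
print(run_agent("add 2 and 3", memory))  # the tool computes the sum: 5
print(memory)                            # tool result retained for future tasks
```

A production agent replaces the stub with a real LLM call and adds error handling around tool dispatch, but the control flow is essentially this loop run repeatedly.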

Why Centralized Data Labeling Will Be a Potential Danger in AI's Future

· 5 min read
Damon Lee
Fission Team

The rapid evolution of artificial intelligence (AI) hinges significantly on the quality of data used for training models. However, centralized data labeling—where a few entities control the annotation and validation processes—poses significant risks to the development and deployment of robust AI systems. As AI becomes increasingly multi-modal, combining textual and visual modalities, the limitations of centralized labeling become more evident, threatening the very progress of the field.

The Limitations of Large Language Models (LLMs)

Recent research has highlighted critical shortcomings in large language models (LLMs) when interpreting complex visual data, such as images of analog clocks with hour and minute hands; they also struggle with ambiguity in general text contexts. While advanced models like CLIP and BLIP have demonstrated promise in aligning textual descriptions with visual inputs, and state-of-the-art GPT-based models show impressive understanding of both text and images, they still struggle with tasks requiring precise spatial reasoning and broad contextual understanding of language. This is not merely a limitation of the models themselves but also a reflection of the constraints imposed by the quality and structure of labeled data.

Retrieval Augmented Generation

· 4 min read
Damon Lee
Fission Team

Retrieval-Augmented Generation (RAG): A Critical Tool for Managing and Selecting Datasets in Decentralized Data Labeling

In the evolving landscape of artificial intelligence (AI), data is the lifeblood of innovation. As the field transitions toward decentralized data labeling frameworks to address biases and enhance diversity, the challenge of managing and selecting datasets becomes paramount. Retrieval-Augmented Generation (RAG) emerges as a transformative solution, combining retrieval mechanisms with generative models to optimize data utilization and improve the quality of AI systems. This blog explores the concept of RAG, its role in decentralized labeling, and why it is essential for dataset management.

Understanding Retrieval-Augmented Generation (RAG)

RAG is a hybrid approach that integrates information retrieval with generative AI models. Unlike standalone generative models, which rely entirely on pre-trained data, RAG systems retrieve relevant external data to enhance the generation process. This method combines the strengths of retrieval systems, such as search engines, with the creative potential of generative models, enabling:

  1. Context-Aware Generation: By pulling in relevant external knowledge, RAG can generate responses or insights that are more accurate and grounded in factual data.
  2. Dynamic Adaptation: The ability to retrieve real-time or updated information ensures that models remain relevant and effective.
  3. Enhanced Diversity: By sourcing data from decentralized and diverse repositories, RAG systems mitigate biases inherent in single-source datasets.
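The retrieve-then-generate flow behind these points can be shown in miniature. The word-overlap scorer and the template "generator" below are deliberate simplifications standing in for a real retriever and LLM:

```python
# A toy RAG pipeline: retrieve the most relevant document from a corpus,
# then condition generation on it so the output stays grounded.

def retrieve(query: str, corpus: list[str]) -> str:
    """Pick the document sharing the most words with the query."""
    q = set(query.lower().split())
    return max(corpus, key=lambda doc: len(q & set(doc.lower().split())))

def generate(query: str, context: str) -> str:
    """Stand-in for an LLM that grounds its answer in retrieved context."""
    return f"Based on: '{context}' -> answer to '{query}'"

corpus = [
    "Decentralized labeling distributes annotation across many contributors.",
    "Sparse retrieval matches exact keywords in documents.",
]
query = "how does decentralized labeling work"
answer = generate(query, retrieve(query, corpus))
print(answer)
```

In a real system the retriever queries a vector index over decentralized repositories and the generator is an LLM, but the division of labor is the same: retrieval supplies the facts, generation supplies the fluency.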

How Can We Find (Evaluate) a Good Agent?

· 5 min read
Damon Lee
Fission Team

Alright, But How the Hell Can We Find (Evaluate) a Good Agent?

What Makes a "Good Agent"? Evaluating Agents in the Era of LLM-Based RAG Systems

The rise of Large Language Model (LLM)-powered Retrieval-Augmented Generation (RAG) has led to an explosion of projects and services claiming to integrate "agents" into their systems. From task automation to advanced decision-making, these agents are reshaping industries. However, amid the wave of hype, a critical question emerges: What constitutes a good agent? As the AI community navigates this flood of agent-based solutions, it’s imperative to establish robust evaluation methods to differentiate effective agents from underperforming ones. This article explores how to evaluate agents using methods like "G-Eval" and "Hallucination + RAG Evaluation" and why this is critical for the future of agent-based systems.

The Current Challenge: Defining a Good Agent

An agent in the context of LLM-based RAG systems typically performs tasks by combining reasoning, retrieval, and interaction capabilities. However, the effectiveness of these agents varies widely due to:

  1. Ambiguous Standards: There is no universally agreed-upon metric for evaluating an agent’s performance.
  2. Complexity of Multi-Step Tasks: Many agents fail to maintain contextual accuracy across multi-turn or complex interactions.
  3. Hallucinations: Agents often generate factually incorrect or irrelevant responses, undermining trust and utility.
  4. Domain-Specific Demands: Agents must adapt to the nuances of specific fields, such as healthcare, finance, or Web3.

Without rigorous evaluation frameworks, it’s challenging to identify and improve truly effective agents.
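Two of the evaluation signals mentioned above can be sketched concretely. The rubric scorer below is a stub for a judge LLM (in the spirit of G-Eval), and the faithfulness check is a crude hallucination proxy that flags response sentences containing words absent from the retrieved context; real evaluation frameworks use far richer scoring:

```python
# Two toy evaluation signals for an agent's response:
# (1) a rubric score normally produced by a judge LLM (stubbed here), and
# (2) a faithfulness ratio: fraction of response sentences fully grounded
#     in the retrieved context, a rough proxy for hallucination detection.

def judge_score(response: str, rubric: str) -> float:
    """Stand-in for an LLM judge returning a 0-1 score against a rubric."""
    return 0.8 if response else 0.0  # stub: a real judge would read the rubric

def faithfulness(response: str, context: str) -> float:
    """Fraction of response sentences whose words all appear in the context."""
    ctx_words = set(context.lower().split())
    sentences = [s for s in response.split(".") if s.strip()]
    grounded = sum(1 for s in sentences if set(s.lower().split()) <= ctx_words)
    return grounded / len(sentences) if sentences else 0.0

context = "the ledger records every transaction on chain"
good = "the ledger records every transaction."
bad = "the ledger predicts future prices."
print(faithfulness(good, context))  # fully grounded in the context
print(faithfulness(bad, context))   # unsupported claim flagged
```

Combining a judge score with a faithfulness score gives a first-pass filter for separating effective agents from ones that merely sound fluent.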

Challenges and directions for crypto-based AI agents

· 10 min read
Damon Lee
Fission Team

1-1. Definition of a decentralized AI agent

A decentralized AI agent is a system that harnesses artificial intelligence for automation, learning, and reasoning, while simultaneously ensuring data sovereignty through distributed ledger technologies (blockchains) and consensus mechanisms. By doing so, these agents mitigate reliance on centralized servers or organizations, and empower individual users or communities to control the data they generate or consume. Potential applications range from automated asset management in decentralized finance (DeFi) to decision-support engines in Decentralized Autonomous Organizations (DAOs).

1-2. Why is decentralization important in AI?

🤖 Decentralized AI Agents for Trust, Security, and Next-Generation Applications

Background and Motivation

AI has become a cornerstone of modern industry, fueling innovation in areas like finance, healthcare, manufacturing, and education. However, most AI systems today are centralized, aggregating data and training resources under the jurisdiction of a few major entities. This arrangement has repeatedly raised concerns about data sovereignty, transparency, and equity.

In contrast, decentralized AI agents capitalize on distributed trust to enhance security and accountability, allowing individual stakeholders to define how, when, and to what extent their data is leveraged. By having the broader network verify data and model processes, these agents reduce reliance on traditional centralized platforms and create an ecosystem that is more horizontally structured and community-driven.

Hybrid Search = Sparse + Dense RAG

· 4 min read
Damon Lee
Fission Team

Why Do We Use Hybrid Search RAG (Sparse + Dense Embedding + ReRanker) Instead of Naive RAG?

Problem Statement: Decentralized Web3 Agents and the Need for Efficient Data Retrieval

The emergence of decentralized Web3 agents has redefined the landscape of AI-driven automation. Unlike traditional centralized frameworks, these agents operate on decentralized platforms, emphasizing transparency, user ownership, and multi-modal data processing. However, managing and retrieving data in decentralized environments poses unique challenges:

  1. Data Fragmentation: Information is scattered across multiple decentralized nodes, making efficient retrieval complex.
  2. Diverse Data Modalities: Web3 agents require access to text, images, and structured metadata to function effectively.
  3. Performance Bottlenecks: Standard retrieval mechanisms struggle with scalability and semantic understanding in decentralized systems.

This is where Hybrid Search RAG—a sophisticated blend of sparse and dense embedding retrieval with re-ranking—becomes a game-changer. It not only addresses these challenges but also sets a new benchmark for data retrieval in decentralized frameworks.

What is Naive RAG?

Naive RAG integrates a generative AI model with a retrieval component that fetches relevant documents from a database. This retrieval is typically based on either sparse embeddings (exact keyword matching) or dense embeddings (semantic vector similarity).

While effective for basic applications, naive RAG has critical shortcomings:

  1. Limited Context Understanding: Sparse embeddings often fail to capture semantic nuances, especially in multi-modal data.
  2. Suboptimal Ranking: Dense embeddings can retrieve irrelevant documents due to lack of fine-grained ranking mechanisms.
  3. Scalability Issues: Naive implementations struggle to efficiently handle large-scale or multi-modal datasets.
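Hybrid search addresses these shortcomings by fusing both retrieval signals before a final re-ranking pass. The sketch below is a deliberately small stand-in: the keyword-overlap score substitutes for BM25, the character-frequency "embedding" substitutes for a real dense encoder, and the fusion weight is arbitrary:

```python
# A toy hybrid retrieval pipeline: a sparse keyword score and a dense
# "embedding" cosine score are fused into one ranking. A real system would
# then rescore the top-k with a cross-encoder re-ranker; here the fused
# ordering is returned directly.

import math
from collections import Counter

def sparse_score(query: str, doc: str) -> float:
    """Keyword-overlap score, standing in for BM25."""
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / len(q)

def embed(text: str) -> Counter:
    """Placeholder dense encoder: bag-of-characters vector."""
    return Counter(text.lower())

def dense_score(query: str, doc: str) -> float:
    """Cosine similarity between the placeholder embeddings."""
    a, b = embed(query), embed(doc)
    dot = sum(a[c] * b[c] for c in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb)

def hybrid_search(query: str, corpus: list[str], alpha: float = 0.5) -> list[str]:
    """Rank documents by a weighted blend of sparse and dense scores."""
    return sorted(
        corpus,
        key=lambda d: alpha * sparse_score(query, d)
                      + (1 - alpha) * dense_score(query, d),
        reverse=True,
    )

corpus = [
    "web3 agents retrieve data from decentralized nodes",
    "cats sleep most of the day",
]
print(hybrid_search("decentralized data retrieval", corpus)[0])
```

The sparse term keeps exact identifiers and rare keywords retrievable, while the dense term catches paraphrases the keywords miss; the re-ranking stage, omitted here, then spends heavier compute on only the few fused candidates that matter.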