Christopher
Fission Team

Hierarchical Structuring in Data Labeling for Unstructured Data

· 26 min read
Christopher
Fission Team
Damon Lee
Fission Team

1. Introduction

Managing unstructured data (text, images, audio, video) for machine learning is labor-intensive, especially when labeling large datasets. One proposed solution is hierarchical structuring of labels: for example, a broad category like “human” with sub-labels like “race”, “gender”, and “age group”. The hypothesis is that a taxonomy of labels (categories and subcategories) can streamline annotation workflows and reduce labeling and preprocessing costs. In this report, we investigate existing research and industry practices to see whether this hypothesis holds. We examine how hierarchical taxonomies and multi-level schemas have been applied in data labeling pipelines, whether they improved efficiency or reduced resource use, and how this approach relates to modern data-centric AI techniques such as weak supervision and active learning. We also discuss how hierarchical labeling could integrate with emerging architectures such as the Model Context Protocol (MCP) for agent-based AI, and consider potential benefits for AI safety and fairness through more consistent, transparent labeling.

2. Background

Taxonomies and Hierarchical Labels

A taxonomy in data labeling is a structured classification scheme: it defines categories and (optionally) subcategories in a tree-like hierarchy. In practice, a well-designed taxonomy can improve labeling efficiency and consistency, because clear category definitions and relationships give annotators a structured guideline and reduce ambiguity. Industry guides likewise state that a good taxonomy enhances efficiency and accuracy in data labeling, reduces training time, and improves data comprehension.
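To make the tree-like structure concrete, here is a minimal sketch of a label taxonomy in Python. It reuses the “human” example from the introduction; the class and function names (`LabelNode`, `build_taxonomy`, `is_valid_label`, `leaf_paths`) and the slash-separated path format are our own assumptions for illustration, not part of any particular labeling tool.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class LabelNode:
    """One category in the taxonomy; children are its subcategories."""
    name: str
    children: dict[str, LabelNode] = field(default_factory=dict)

    def add(self, name: str) -> LabelNode:
        """Return the child called `name`, creating it if it is missing."""
        return self.children.setdefault(name, LabelNode(name))


def build_taxonomy(paths: list[str]) -> LabelNode:
    """Build a tree from slash-separated paths (a hypothetical format)."""
    root = LabelNode("root")
    for path in paths:
        node = root
        for part in path.split("/"):
            node = node.add(part)
    return root


def is_valid_label(root: LabelNode, path: str) -> bool:
    """Check that every level of `path` exists in the taxonomy."""
    node = root
    for part in path.split("/"):
        if part not in node.children:
            return False
        node = node.children[part]
    return True


def leaf_paths(node: LabelNode, prefix: str = "") -> list[str]:
    """List the full path of every leaf category below `node`."""
    if not node.children:
        return [prefix] if prefix else []
    paths: list[str] = []
    for name, child in node.children.items():
        child_prefix = f"{prefix}/{name}" if prefix else name
        paths.extend(leaf_paths(child, child_prefix))
    return paths


# Illustrative taxonomy based on the example in the introduction.
taxonomy = build_taxonomy(["human/race", "human/gender", "human/age group"])
print(is_valid_label(taxonomy, "human/gender"))
print(is_valid_label(taxonomy, "human/height"))
print(leaf_paths(taxonomy))
```

Rejecting labels that fall outside the tree, as `is_valid_label` does, is one simple way a taxonomy can enforce the consistency described above: an annotator can only submit a category that the schema defines.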

Fission’s DeSAi Vision: Merging DeSci, AI Optimization, and Futarchy

· 20 min read
Christopher
Fission Team
Damon Lee
Fission Team

1. Introduction

1-1. What Fission Does

Fission is an AI research and development team that specializes in model optimization—making advanced AI models more efficient, scalable, and accessible to a broader range of users. Rather than pouring endless resources into ever-larger models, we focus on streamlining computing requirements and enhancing model performance through smarter, more targeted approaches. Our guiding principle is simple yet profound: the best AI emerges when technical innovation is combined with high-quality, iterative human feedback.

To achieve this, we design workflows and tools that invite user participation throughout the AI lifecycle. From initial data refinement to ongoing feedback loops, we’ve learned that human insight—especially when harnessed at scale—often trumps brute-force computation. By refining AI models through collective user input, we aim to craft systems that are both robust and grounded in real-world needs.

1-2. The Core Narrative

As we delved deeper into model optimization, we discovered a critical tipping point: no matter how elegant or efficient our algorithms became, they still depended on authentic human feedback to truly excel. While data labeling had been the traditional approach, we realized that continuous, context-rich evaluation—from domain experts, enthusiasts, and everyday users—was the key to fine-tuning AI in a more dynamic, impactful way.

However, gathering and managing this feedback via centralized methods proved costly and time-consuming. It was also difficult to maintain transparency and fairness at scale. That challenge led us to explore more decentralized structures—platforms where users could offer feedback and be rewarded in ways that felt natural, equitable, and community-driven.

That’s where DeSci (Decentralized Science) entered the conversation. DeSci promised an open, collaborative environment for knowledge-sharing and verification, yet it quickly became clear that advanced AI was needed to handle large data streams and real-time inputs. Combining DeSci’s ethos with AI-driven workflows gave rise to DeSAi—a synergy merging decentralized collaboration with iterative optimization.