Data labeling - Zelix Glossary

What it means

Data labeling is the work of attaching meaningful tags to raw data. The conversation transcript becomes 'sales enquiry / hot lead / wants pricing'. The image becomes 'damage / dent / front-bumper'. The email becomes 'support / billing / not urgent'. Without labels, a model has no signal about what counts as the right answer.
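
As a rough illustration, a labeled item can be thought of as the raw data plus its tags. The sketch below is one possible shape, not a Zelix schema: the `LabeledExample` class, its field names, and the sample message are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LabeledExample:
    """One piece of raw data plus the tags a person attached to it.

    Hypothetical structure for illustration only.
    """
    raw: str                 # the untouched source text (or a reference to an image)
    tags: tuple[str, ...]    # ordered from broadest tag to most specific

    def as_tag(self) -> str:
        # Render the tags in the 'a / b / c' style used in this entry.
        return " / ".join(self.tags)


# Invented sample message, tagged the way the entry describes.
example = LabeledExample(
    raw="Hi, could you send over pricing for the team plan?",
    tags=("sales enquiry", "hot lead", "wants pricing"),
)

print(example.as_tag())  # sales enquiry / hot lead / wants pricing
```

Keeping the raw text next to its tags means the same record can be reviewed by a person and later fed to a model.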

Labels can be applied retrospectively (going back through last quarter's conversations and tagging them) or prospectively (the team tags each new conversation as it closes). The first builds a training set; the second keeps the model improving over time.

Why it matters

Labeling is the difference between a model that learns your business and a model that learns a generic version of your business. A general LLM knows what 'angry customer' looks like in general; it does not know what 'angry customer' looks like in your industry, your tone, your edge cases. Labels teach it that.

Labeling has a real cost in time, usually from senior people who know the business. The cost of not labeling is steeper: a model that confidently misroutes or misclassifies until you retrain it with real labels.

Example

A B2B SaaS company labels 800 inbound emails over two weeks: which were sales-ready, which were support, which were spam, which were billing. The labeled set trains the routing agent. Inbound triage time drops from 22 minutes per email to under a minute, and senior salespeople stop being copied on billing questions.
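
To make the step from labeled set to routing concrete, here is a minimal sketch that assumes the router is a simple bag-of-words Naive Bayes classifier. That is a stand-in technique chosen for brevity, not a description of how any particular routing agent is built. The function names (`train`, `route`) and the eight sample emails are hypothetical; a real set would look more like the 800 emails in the example.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Deliberately naive: lowercase and split on whitespace.
    return text.lower().split()


def train(labeled_emails):
    """Count words per label from (text, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in labeled_emails:
        label_counts[label] += 1
        for word in tokenize(text):
            word_counts[label][word] += 1
            vocab.add(word)
    return label_counts, word_counts, vocab


def route(text, model):
    """Return the most likely label using Naive Bayes with add-one smoothing."""
    label_counts, word_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label, count in label_counts.items():
        score = math.log(count / total)  # prior: how common the label is
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in tokenize(text):
            if word in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical hand-labeled emails, two per category.
LABELED = [
    ("please send pricing for 50 seats", "sales"),
    ("we would like a demo and a quote", "sales"),
    ("the dashboard will not load after login", "support"),
    ("getting an error when exporting reports", "support"),
    ("my invoice shows the wrong amount", "billing"),
    ("please update the card on our invoice", "billing"),
    ("win a free prize click here now", "spam"),
    ("cheap loans click this link now", "spam"),
]

model = train(LABELED)
print(route("can you send a quote and pricing", model))  # sales
print(route("my invoice amount is wrong", model))        # billing
```

The point of the sketch is that the router knows nothing except what the labels told it: change the labels and the routing changes with them.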

Where this comes up

Workflow engineering
