Explosion
Explosion is a software company that builds tools and libraries for developers who create, deploy, and maintain Natural Language Processing (NLP) systems and Machine Learning (ML) workflows.
- Creator and maintainer of spaCy (machine learning / NLP framework) for production-grade text processing and statistical NLP pipelines.
- Developer of Prodigy (data annotation / ML training) for creating labeled datasets and iteratively improving ML models.
- Tooling and workflows for NLP model training, packaging, and deployment (Machine Learning Operations, MLOps) in Python-centric environments.
- Support, licensing, and consulting around integrating its NLP and annotation tools into enterprise and research workflows.
- Educational resources, documentation, and example projects focused on applied NLP, information extraction, and text analytics.
More About Explosion
Explosion focuses on software and tooling that support the full lifecycle of NLP systems, from data collection and annotation through model development and deployment into production environments. Its products are oriented toward software engineers, data scientists, and ML teams who build text analytics, information extraction, search, and language understanding components into applications and services.
The company is best known for spaCy, an open-source Python library for industrial-strength NLP. spaCy provides tokenization, part-of-speech tagging, syntactic dependency parsing, named entity recognition, text classification, and related components that can be composed into pipelines and integrated into larger applications. Architecturally, spaCy is implemented in Python with performance-critical components in Cython, and its statistical models can be trained or fine-tuned on custom data. It also interoperates with other Python ML frameworks, for example by wrapping PyTorch or TensorFlow models through its Thinc layer, and trained pipelines can be packaged as installable Python packages for deployment.
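As a rough illustration of how spaCy components compose into a pipeline, the sketch below builds a blank English pipeline and adds two rule-based components. It assumes spaCy v3 is installed. It deliberately avoids trained models so that nothing needs to be downloaded. The entity patterns and the sample sentence are illustrative assumptions, not part of any shipped model.

```python
import spacy

# Blank English pipeline: tokenizer only, no trained components.
nlp = spacy.blank("en")

# Rule-based sentence boundary detection.
nlp.add_pipe("sentencizer")

# Rule-based entity matcher; the patterns below are illustrative assumptions.
ruler = nlp.add_pipe("entity_ruler")
ruler.add_patterns([
    {"label": "ORG", "pattern": "Explosion"},
    {"label": "PRODUCT", "pattern": "Prodigy"},
])

doc = nlp("Explosion develops Prodigy. It is an annotation tool.")

tokens = [token.text for token in doc]
sentences = [sent.text for sent in doc.sents]
entities = [(ent.text, ent.label_) for ent in doc.ents]

print(nlp.pipe_names)
print(entities)
```

A production pipeline would more likely load a trained package with `spacy.load(...)` and rely on statistical components for tagging, parsing, and entity recognition; the rule-based components here only show the composition mechanism.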
Explosion also develops Prodigy, a commercial annotation tool for creating and curating labeled datasets for NLP and other ML tasks. Prodigy runs as a scriptable application with a web-based interface in which teams annotate text, images, or other data. It supports active learning workflows, where model predictions guide which examples to label next, with the goal of reducing annotation effort while maintaining or improving model quality. Prodigy integrates closely with spaCy-based pipelines but can also be used with other ML back ends.
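The active learning idea described above can be sketched with generic uncertainty sampling: examples whose predicted probability is closest to 0.5 are queued for annotation first. This is a minimal illustration in plain Python and does not reproduce Prodigy's own selection logic; `toy_score` and its probability table are hypothetical stand-ins for a real model.

```python
from typing import Callable, List, Sequence


def uncertainty(p: float) -> float:
    """Map a probability to [0, 1]: highest at p == 0.5, lowest at p == 0 or 1."""
    return 1.0 - abs(p - 0.5) * 2.0


def select_for_annotation(
    examples: Sequence[str],
    score: Callable[[str], float],
    k: int,
) -> List[str]:
    """Return the k examples the model is least sure about."""
    ranked = sorted(examples, key=lambda ex: uncertainty(score(ex)), reverse=True)
    return list(ranked[:k])


def toy_score(text: str) -> float:
    """Hypothetical stand-in for a model's positive-class probability."""
    table = {
        "great product": 0.95,
        "it arrived": 0.52,
        "terrible": 0.05,
        "okay I guess": 0.40,
    }
    return table.get(text, 0.5)


pool = ["great product", "it arrived", "terrible", "okay I guess"]
queue = select_for_annotation(pool, toy_score, k=2)
print(queue)
```

In a real loop, the labels collected for the selected examples would be used to update the model, and the remaining pool would be re-scored before the next round.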
In enterprise and institutional environments, Explosion’s tools are used to implement domain-specific NLP pipelines for tasks such as document processing, contract analysis, customer interaction analysis, and scientific text mining. Teams can start with off-the-shelf language models and adapt them to internal taxonomies, entities, and classification schemes, using Prodigy for annotation and spaCy for model training and packaging. The resulting components can be deployed as microservices, exposed through REST APIs, run as batch processing jobs, or embedded in existing Python-based applications.
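One way to picture these deployment options is a single analysis function placed behind two thin entry points, one service-style (JSON in, JSON out) and one batch-style. The sketch below uses only the standard library. `keyword_analyzer`, its lexicon, and the `DOC_TYPE` label are hypothetical stand-ins for a trained pipeline, and the JSON field names are assumptions rather than any product's API.

```python
import json
from typing import Callable, Iterable, Iterator, List, Tuple

# Hypothetical analyzer interface: text in, (span_text, label) pairs out.
Analyzer = Callable[[str], List[Tuple[str, str]]]


def keyword_analyzer(text: str) -> List[Tuple[str, str]]:
    """Toy analyzer that labels a fixed set of keywords (illustration only)."""
    lexicon = {"contract": "DOC_TYPE", "invoice": "DOC_TYPE"}
    found = []
    for raw in text.split():
        word = raw.strip(".,;:!?")
        if word.lower() in lexicon:
            found.append((word, lexicon[word.lower()]))
    return found


def handle_request(body: str, analyze: Analyzer) -> str:
    """Service-style entry point: parse a JSON request, return a JSON response."""
    payload = json.loads(body)
    entities = analyze(payload["text"])
    return json.dumps(
        {"entities": [{"text": t, "label": label} for t, label in entities]}
    )


def run_batch(lines: Iterable[str], analyze: Analyzer) -> Iterator[str]:
    """Batch-style entry point: emit one JSON record per input line."""
    for line in lines:
        yield handle_request(json.dumps({"text": line}), analyze)


response = handle_request('{"text": "Please review the contract."}', keyword_analyzer)
print(response)
```

Keeping the analyzer behind a plain callable means the same component can be mounted in a web framework, a scheduled job, or an existing application without changes to the analysis code.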
From a directory and marketplace perspective, Explosion fits categories such as NLP frameworks, data annotation platforms, and MLOps tooling for text analytics. Its offerings target organizations that prefer programmatic, scriptable workflows over purely graphical platforms and that require on-premises or self-managed options for handling sensitive text data. By focusing on developer experience, reproducible pipelines, and integration with standard Python tooling, Explosion’s products fit into modern ML architectures that emphasize modular components, containerization, and continuous improvement of models based on newly labeled data.