How Mumu Migrated From Prometheus to InfluxDB and Tripled Their Metric Coverage
By Charles Mahler, Jun 25, 2026
When a team uses an internal Slack channel for everything from contact form submissions to deployment alerts and server warnings, the notification engine quickly becomes critical infrastructure. When the same team builds that engine as a product for other teams to use, the bar gets even higher.
Mumu is an all-in-one productivity platform for modern teams. While most companies stitch together separate SaaS tools for org charts, agile estimation, internal Q&A, skill mapping, recognition, and notifications, Mumu offers all of those as connected modules under a single subscription. The premise is that your organizational structure shouldn’t be replicated across five different databases; it should live in one place and flow into every workflow your team uses.
In this blog, we will go over why the Mumu team rebuilt their monitoring stack on InfluxDB 3 and how the migration went.
Why pull-based monitoring stopped making sense
Like many teams running their own infrastructure, Mumu started with Prometheus as its primary monitoring solution. Over time, the main problem was a fundamental mismatch between Prometheus’s pull-based collection model and the kind of data Mumu was working with. Rather than telemetry that can be scraped at a regular interval, Mumu often needs to track discrete events such as user-triggered actions, scripts completing, and pipelines finishing. For that kind of event tracking, push-based delivery made more sense.
That architectural mismatch wasn’t the only problem. As Mumu evaluated alternatives like Betterstack, VictoriaMetrics, PostHog, Graphite, Datadog, and New Relic, issues related to transparency became a concern. Several of the SaaS solutions came with documentation that made it genuinely hard to understand what was happening under the hood, particularly around how metrics were ingested and stored. For a team that ships fast and needs to be able to debug its own pipeline, that was a dealbreaker.
Why InfluxDB 3 was the right option
Two things about InfluxDB stood out during evaluation. The first was that self-hosting was effortless. Mumu runs its own dedicated servers, and spinning up an InfluxDB 3 Core instance using the official Docker image took almost no time or configuration overhead.
The second factor was the push-based HTTP API. InfluxDB’s Line Protocol HTTP API lets Mumu’s services emit metrics at the exact line of code where an event occurs. There is no sidecar, no exposition format, and no scrape interval, just a POST request at the moment the event happens.
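To make the push model concrete, here is a minimal sketch of formatting an event as line protocol and preparing the POST. It is illustrative only and not Mumu's actual driver: the measurement, tag, and field names are hypothetical, and the `/api/v3/write_lp` path with its `db` and `precision` parameters is an assumption that should be checked against the API reference for your InfluxDB 3 version.

```typescript
// Sketch only: names are hypothetical, and the write endpoint and its query
// parameters are assumptions to verify against your InfluxDB 3 instance.
type FieldValue = number | string | boolean;

interface MetricPoint {
  measurement: string;
  tags?: Record<string, string>;
  fields: Record<string, FieldValue>;
  timestampMs?: number; // when omitted, the server assigns the time
}

// Line protocol escaping: measurements escape commas and spaces;
// tag keys, tag values, and field keys also escape equals signs.
const escapeMeasurement = (s: string): string => s.replace(/([, ])/g, "\\$1");
const escapeKey = (s: string): string => s.replace(/([,= ])/g, "\\$1");

function formatFieldValue(v: FieldValue): string {
  if (typeof v === "string") return '"' + v.replace(/(["\\])/g, "\\$1") + '"';
  if (typeof v === "boolean") return v ? "true" : "false";
  if (!Number.isFinite(v)) throw new Error("line protocol cannot represent NaN or Infinity");
  return String(v); // numbers without a suffix are written as float fields
}

function toLineProtocol(p: MetricPoint): string {
  const tagPart = Object.entries(p.tags ?? {})
    .sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0)) // stable key order
    .map(([k, v]) => `,${escapeKey(k)}=${escapeKey(v)}`)
    .join("");
  const fieldPart = Object.entries(p.fields)
    .map(([k, v]) => `${escapeKey(k)}=${formatFieldValue(v)}`)
    .join(",");
  if (fieldPart === "") throw new Error("a point needs at least one field");
  const ts = p.timestampMs === undefined ? "" : ` ${Math.trunc(p.timestampMs)}`;
  return `${escapeMeasurement(p.measurement)}${tagPart} ${fieldPart}${ts}`;
}

interface WriteRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Builds the request without sending it, so the caller chooses the HTTP client.
function buildWriteRequest(
  baseUrl: string,
  db: string,
  token: string,
  points: MetricPoint[],
): WriteRequest {
  return {
    // Assumed endpoint and parameters; timestamps above are in milliseconds.
    url: `${baseUrl}/api/v3/write_lp?db=${encodeURIComponent(db)}&precision=millisecond`,
    method: "POST",
    headers: {
      Authorization: `Bearer ${token}`,
      "Content-Type": "text/plain; charset=utf-8",
    },
    body: points.map(toLineProtocol).join("\n"),
  };
}
```

On a runtime with a global `fetch`, the resulting object can be passed as `fetch(req.url, req)` at the point in the code where the event happens. Keeping the formatting separate from the network call also makes the serialization easy to unit test.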
In hindsight, the team’s biggest evaluation lesson was that they should have built a small proof of concept with InfluxDB earlier. The time spent evaluating other tools wasn’t wasted, since it gave them context and confidence in the final decision, but InfluxDB’s simplicity would have been apparent within an afternoon.
Migration process
The migration involved three phases over a 3-month period: dual writing to InfluxDB and the existing Prometheus setup, validation, and finally, decommissioning the Prometheus infrastructure.
Phase 1: Dual-Writing via Vector
The first move was to make the same metrics flow into both systems at once. Mumu added InfluxDB as a second sink alongside Prometheus in their existing Vector pipeline, so every metric was written to both simultaneously. That parallel run is what made the eventual cutover low-risk: the team could confirm performance and validate the two systems against each other before switching.
Migrating to InfluxDB is made easy using AI agents.
About 80% of the migration work, such as Vector configuration changes, the dual-write sink setup, and the boilerplate around the new HTTP drivers in Go and TypeScript, was generated by coding agents.
A migration becomes a lot less daunting when the routine work compresses into hours, but what really made this work was what the agent had to work with on the InfluxDB side. InfluxDB 3 exposes a full REST API with a published OpenAPI specification. When an AI agent can read that contract directly, it doesn’t have to guess at parameter names from stale blog posts or hallucinate endpoint shapes from vague documentation. It reads the spec, generates correct client code, and gets the integration right on the first pass. InfluxDB also has an MCP server for integrating with AI agents, although it wasn’t used by Mumu.
This is an important property in a world where agents are doing more and more of the integration work. The systems that will be easiest to adopt over the next few years are not necessarily the ones with the most features; they’re the ones whose APIs are legible to machines.
Phase 2: Validation
Running two systems in parallel only helps if you actually compare them, and this is where the team spent its caution wisely. The validation approach was deliberately simple: they duplicated their Grafana panels side by side, with one panel pulling from Prometheus and an identical panel pulling from InfluxDB. When two panels showing the same metric look identical for weeks on end, confidence accumulates quickly.
Beyond visual parity, four things got specific attention:
- Retention policies: Confirming data was being stored at the right granularity and for the expected duration.
- Tag cardinality: Making sure the tagging strategy wouldn’t cause write-performance problems at scale. Keeping cardinality low on high-volume metric streams is a lesson the team internalized early.
- Batch write behavior: Validating that the NestJS batching logic produced correct time series data with no gaps or duplicates.
- Dashboard parity: Rebuilding key Grafana dashboards from scratch against InfluxDB to confirm they told the same story as their Prometheus equivalents.
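The team's comparison was visual, using side-by-side Grafana panels. As a complement, the same checks could in principle be scripted. The sketch below is an assumption and not something the source describes: it takes two already-fetched series for the same metric (one from each backend, with hypothetical `timeMs` and `value` fields) and reports duplicates, gaps, and value mismatches.

```typescript
// Sketch only: an automated variant of the parity check, not Mumu's method.
interface Sample {
  timeMs: number;
  value: number;
}

interface ParityReport {
  duplicates: number[]; // timestamps appearing more than once in the candidate
  missing: number[];    // timestamps in the reference but absent from the candidate
  mismatched: number[]; // timestamps whose values differ beyond the tolerance
}

function compareSeries(
  reference: Sample[],
  candidate: Sample[],
  relTolerance: number = 0.001,
): ParityReport {
  const seen = new Map<number, number>();
  const duplicates: number[] = [];
  for (const s of candidate) {
    if (seen.has(s.timeMs)) duplicates.push(s.timeMs);
    else seen.set(s.timeMs, s.value);
  }

  const missing: number[] = [];
  const mismatched: number[] = [];
  for (const r of reference) {
    const v = seen.get(r.timeMs);
    if (v === undefined) {
      missing.push(r.timeMs);
      continue;
    }
    // Relative difference, guarded against division by zero.
    const scale = Math.max(Math.abs(r.value), Math.abs(v), 1e-12);
    if (Math.abs(r.value - v) / scale > relTolerance) mismatched.push(r.timeMs);
  }
  return { duplicates, missing, mismatched };
}
```

An empty report over a representative time window is a reasonable signal that batching introduced no gaps or duplicates. The tolerance default is arbitrary and would need tuning per metric.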
Phase 3: Cutover
Because Mumu’s Go and TypeScript codebases already had proper abstractions and interfaces for metric delivery, writing a new driver that sent metrics to InfluxDB via the HTTP API required almost no changes to the application code. The abstraction layer in their app code meant that swapping the metrics backend was a contained, well-scoped task rather than a sprawling refactor.
The TypeScript driver came in at around 160 lines of code, largely because the team leaned on the official InfluxDB 3 client library package. The Go implementation was slightly longer due to manual HTTP handling, retry logic, and error handling, but was still a straightforward, bounded piece of work. Once the drivers were in place, the team decommissioned Prometheus for business metrics and declared the migration complete.
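The kind of abstraction that makes this swap cheap can be sketched as a small driver interface. Everything below is hypothetical (the interface, class, and metric names are illustrative and not taken from Mumu's codebase); the point is only that application code depends on the interface, so replacing the backend means writing one new class.

```typescript
// Sketch only: all names are hypothetical.
interface BusinessMetric {
  name: string;
  tags: Record<string, string>;
  value: number;
}

// The only surface that application code depends on.
interface MetricsDriver {
  send(metric: BusinessMetric): void;
}

// Stand-in backend that records metrics in memory. A real InfluxDB driver
// would serialize the metric and POST it over HTTP instead.
class RecordingDriver implements MetricsDriver {
  readonly sent: BusinessMetric[] = [];

  send(metric: BusinessMetric): void {
    this.sent.push(metric);
  }
}

// Wraps any driver so a metrics failure never breaks the calling code path.
class SafeDriver implements MetricsDriver {
  private readonly inner: MetricsDriver;
  private readonly onError: (err: unknown) => void;

  constructor(inner: MetricsDriver, onError: (err: unknown) => void) {
    this.inner = inner;
    this.onError = onError;
  }

  send(metric: BusinessMetric): void {
    try {
      this.inner.send(metric);
    } catch (err) {
      this.onError(err);
    }
  }
}

// Application code: unaware of which backend receives the metric.
class DeploymentService {
  private readonly metrics: MetricsDriver;

  constructor(metrics: MetricsDriver) {
    this.metrics = metrics;
  }

  recordMigration(durationMs: number, env: string): void {
    this.metrics.send({ name: "db_migration", tags: { env }, value: durationMs });
  }
}
```

With this shape, moving from one backend to another changes only the object passed to the `DeploymentService` constructor. The in-memory driver doubles as a test fixture.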
Benefits of InfluxDB 3—from 150 to 560 metrics
Before InfluxDB, Mumu collected around 150 metrics. Today, they collect 560 metrics, and that number is constantly increasing.
That growth didn’t come from a dedicated instrumentation initiative. There was no mandate and no quarter-long observability push; it happened organically because adding a new metric became a one-line HTTP call. When friction drops that far, engineers instrument things they would previously have skipped.
The number is less a measure of throughput than a measure of how much the team’s relationship with its own data changed once the cost of asking a question fell to nearly zero. And because metric delivery was suddenly cheap, Mumu started instrumenting things that would have seemed impractical before:
- Operational automation: Mumu sends a metric for every command executed on their servers, with automations built on top using MsgGO, and certain commands automatically trigger a Slack alert. The result is passive visibility into operational activity with no manual reporting required.
- CI/CD observability: They emit metrics from their Bitbucket pipelines, including how long each pipeline runs. Over time, this has established a baseline for normal build duration, making regressions easy to spot.
- Per-developer environments: Every developer tags their metrics with an env field set to their local environment name, such as local:john, local:kate, and so on. Each developer can observe their own environment in Grafana, test new instrumentation locally before it ships, and build personal dashboards, all without polluting shared production data.
- General script instrumentation: Bash scripts, custom CLI commands, and database migration durations during deployments are all now tracked, where before each would have demanded disproportionate effort.
SQL on time series data
Volume was not the only shift. InfluxDB 3’s SQL support meaningfully improved Mumu’s ability to build dashboards and debug metric data. Before the migration, querying time series data meant learning a specialized query language and reasoning about its particular semantics. With SQL, any engineer on the team can write an ad-hoc query to investigate a metric anomaly, validate that a new event is being tracked correctly, or prototype a Grafana panel without consulting documentation.
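As an illustration of the kind of ad-hoc query this enables, the sketch below builds a SQL statement for average pipeline duration per day, filtered to one developer's environment. The table and column names (`pipeline_finished`, `duration_s`, `env`) are hypothetical, and the `/api/v3/query_sql` path with its `db`, `q`, and `format` parameters is an assumption to verify against your InfluxDB 3 version.

```typescript
// Sketch only: table and column names are hypothetical; the query endpoint
// and its parameters are assumptions.

// Quote a value as a SQL string literal by doubling embedded single quotes.
function sqlStringLiteral(s: string): string {
  return "'" + s.replace(/'/g, "''") + "'";
}

// Average pipeline duration per day for one environment over the last N days.
function dailyPipelineDurationSql(env: string, days: number): string {
  if (!Number.isInteger(days) || days <= 0) {
    throw new Error("days must be a positive integer");
  }
  return [
    "SELECT date_bin(INTERVAL '1 day', time) AS day,",
    "       avg(duration_s) AS avg_duration_s,",
    "       count(*) AS runs",
    "FROM pipeline_finished",
    `WHERE env = ${sqlStringLiteral(env)}`,
    `  AND time >= now() - INTERVAL '${days} days'`,
    "GROUP BY 1",
    "ORDER BY 1",
  ].join("\n");
}

// Builds a GET URL for the SQL query; the caller performs the request.
function buildQueryUrl(baseUrl: string, db: string, sql: string): string {
  return (
    `${baseUrl}/api/v3/query_sql?db=${encodeURIComponent(db)}` +
    `&q=${encodeURIComponent(sql)}&format=json`
  );
}
```

The same statement can be pasted into a Grafana panel's query editor while prototyping, with the `env` filter swapped for `local:kate` or a production value.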
The qualitative win is harder to put a number on, but is equally important: the metrics are now trusted. Engineering time that used to go into questioning whether a dashboard was telling the truth now goes into acting on what it shows.
Architecture overview
Mumu runs entirely on its own dedicated servers, giving the team full control, room for hardware-level optimization, and predictable costs. The application layer runs on Docker, with services written in Go and NestJS; Go handles core infrastructure-level work while NestJS handles application-layer operations.
Metrics reach InfluxDB along two paths. Vector collects and transforms log-based metrics from the server environment and forwards them to InfluxDB. The Go and NestJS services send business and application metrics directly over the HTTP API.
The tagging strategy reflects that split. Server-level and infrastructure metrics carry richer tag sets like env, container, process_name, process_instance, and service. Business and application metrics are tagged more lightly, typically just env plus a small number of domain-specific identifiers. On the NestJS side, metrics are batched and flushed either every 60 seconds or when the batch size crosses a configured threshold, which is a configuration the team continues to tune to balance data freshness against RAM usage and write overhead. Grafana sits on top of it all, querying InfluxDB directly.
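The flush rule described above (every 60 seconds, or when the batch crosses a size threshold) can be sketched as follows. This is a simplified illustration and not the team's NestJS code: the class name, the default batch size of 500, and the injected clock are assumptions, and the age check runs on `add` or on an explicit `flushIfDue` call where a real service would also invoke it from a timer.

```typescript
// Sketch only: names and the default batch size are hypothetical.
type FlushFn = (lines: string[]) => void;

class MetricBatcher {
  private buffer: string[] = [];
  private lastFlushMs: number;
  private readonly flush: FlushFn;
  private readonly maxBatchSize: number;
  private readonly maxAgeMs: number;
  private readonly now: () => number;

  constructor(
    flush: FlushFn,
    maxBatchSize: number = 500,
    maxAgeMs: number = 60000,
    now: () => number = () => Date.now(), // injectable clock for tests
  ) {
    this.flush = flush;
    this.maxBatchSize = maxBatchSize;
    this.maxAgeMs = maxAgeMs;
    this.now = now;
    this.lastFlushMs = now();
  }

  // Queue one line-protocol line and flush if either limit is reached.
  add(line: string): void {
    this.buffer.push(line);
    this.flushIfDue();
  }

  // Flush when the batch is full or the oldest pending data is too old.
  flushIfDue(): void {
    const full = this.buffer.length >= this.maxBatchSize;
    const stale =
      this.buffer.length > 0 && this.now() - this.lastFlushMs >= this.maxAgeMs;
    if (full || stale) this.flushNow();
  }

  // Hand the current batch to the sink and start a fresh buffer.
  flushNow(): void {
    if (this.buffer.length === 0) return;
    const batch = this.buffer;
    this.buffer = [];
    this.lastFlushMs = this.now();
    this.flush(batch);
  }
}
```

A larger `maxBatchSize` reduces write overhead at the cost of memory and freshness, which is the trade-off the team is tuning. Calling `flushNow` on shutdown avoids losing the final partial batch.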
Future plans for utilizing InfluxDB 3
For Mumu, InfluxDB has unlocked more than just an internal observability story. The team is now actively building toward making it a first-class part of their product surface.
The most immediate project is integrating InfluxDB as a delivery target inside MsgGO. Today, MsgGO routes messages to Slack, Telegram, Discord, Email, SMS, and Webhooks. Adding InfluxDB as a target means that any system already sending events through MsgGO, such as contact forms, deployment notifications, server alerts, and application events, will be able to route structured event data directly into InfluxDB with no additional integration work. Since Mumu uses MsgGO heavily inside its own infrastructure, this would pay off immediately in its own workflows, with the customer-facing benefits coming close behind.
Further out, the team is evaluating whether to move user-activity statistics that are currently kept as activity records in a NoSQL database into InfluxDB. That data is inherently time series in nature, and putting it in InfluxDB would let them expose richer usage analytics inside the Mumu dashboard without needing a separate query infrastructure. And they want to lean on InfluxDB’s SQL interface to drive product decisions, using internal usage metrics to understand which modules see the most engagement, where users drop off, and how feature adoption shifts after a release.