Cumulus Networks Launches NetQ, a Telemetry-Based Fabric Validation System — Evolving Beyond its Network OS

NetQ increases network agility, improves uptime, and reduces management complexity with the widest surface area across the data center network from host to switch

MOUNTAIN VIEW, Calif. -- Cumulus Networks®, the leader in bringing web-scale networking to enterprise cloud, today announced the availability of NetQ, a telemetry-based fabric validation system that ensures the network is behaving as intended. With over 600 customers, including 32% of the Fortune 50, Cumulus Networks is evolving beyond its Network OS to bring web-scale efficiencies across the entire network lifecycle. Businesses can now design, build, and operate their data center networks with the speed and agility of web-scale giants. NetQ reduces the complexity of managing the network, dramatically improves network uptime, greatly enhances network agility, and is a critical step along the journey to intent-based networking. It provides monitoring, visibility, and instrumentation for the modern web-scale network, baked in from day one.

“Customers are adopting web-scale networking en masse; they need greater agility every single day,” said Josh Leslie, CEO, Cumulus Networks. “NetQ gives operators the confidence to move at higher, web-scale speeds by enabling a preventative approach to network validation. That’s not just good news for the engineers, but the whole business. Adding NetQ to automation workstreams frees people from fire-drills, making them more productive and enabling them to focus on business-accelerating decisions.”

A recent study conducted by the Information Technology Intelligence Group reported that “98% of organizations say a single hour of downtime costs over $100,000.” Manual errors are the main culprit behind network downtime, and isolating and resolving a problem can take hours. Traditional tools such as ping and traceroute, invented over 20 years ago, are stuck in the dark ages and rely heavily on ancient management protocols and the basic system events that they provide. Additionally, these tools are highly manual, box-by-box, reactive, and unable to match the rate of change that modern web-scale workflows require, leaving network engineers ill-equipped to operate in highly automated environments where workloads are orchestrated by software. Whether or not a problem is in the network, network engineers lack confidence in traditional tools because those tools cannot ensure the network is behaving as it should. Web-scale IT workflows require closed-loop fabric validation and real-time network status updates that keep up with a scaling data center.

According to Gartner, “The mentality of ‘play it safe; don't risk an outage’ is pervasive in mainstream enterprise networking teams. While this was effective during the relatively static business models of the past two decades, it is no longer viable during the disruptive times of digital business transformation. Network teams are handcuffed into incremental improvements due to the zero-risk tolerance inflicted on most network teams. Conversely, web-scale organizations place high importance on innovation, which it enables by managing risk, rather than just trying to ‘avoid the bad’. Web-scalers aren't better because they operate at scale; they can operate at scale because they adopted a better approach to dealing with risk.” - Bringing Web-Scale Networking Concepts to Your Data Center, Gartner’s Joe Skorupa and Andrew Lerner (November 2016)

NetQ is built for the modern enterprise cloud and gives operators the comfort to identify, embrace, and manage risk. By validating fabric-wide network correctness within a DevOps workflow, NetQ prevents errors while rolling out configurations in production, provides proactive alerts of network state changes, and helps diagnose problems that would otherwise take hours to chase down or are impossible to solve with the naked eye.

Key capabilities and benefits of NetQ include:

1. Preventative Workflows to Validate Configurations with Confidence: The worry that the automated propagation of a mistaken change will cause an outage has hampered the adoption of highly automated network changes. NetQ enables incremental checks so that changes to a network environment can be validated pod by pod. Adding validation within automated rollouts reduces the risk of programmatic configuration changes and helps operators avoid manual deployment errors, one of the main causes of network downtime.

2. Proactive Notifications & Alerts for Faster Remediation: NetQ detects faulty network state, a cause of network downtime, and alerts the user in real time with the precise fault location for faster remediation. NetQ notifications can flag state changes, including ad-hoc changes made outside of automation workstreams, before trouble arises, so the network team learns of potential trouble even before a ticket is opened.

3. Time Machine Diagnostics to Determine Root Cause: Leveraging fabric-wide telemetry instead of box-by-box polling data and system statistics alone, an approach only possible with an open, extensible network operating system such as Cumulus Linux, NetQ enables network operators to go back in time and replay a previous network state. In doing so, they can dissect “needle in the haystack” problems even when the issue occurred yesterday or the associated application workload is no longer in service.

4. Widest Surface Area, Fabric-Wide for Holistic Network Validation: NetQ analyzes both Linux host and Cumulus Linux network devices, creating the widest surface area in the industry for fabric validation systems.

5. Delegate Access to Accelerate Operations: Visibility into the state of the network puts power into the hands of the user. NetQ provides a remote command-line interface and a web-based service console that allow infrastructure operators and application owners to self-validate, either identifying a problem or proving the network is good, which reduces the number of poorly defined or unnecessary trouble tickets.

6. Incorporates into your Existing Operations Toolset: NetQ seamlessly incorporates into your existing automation, change management, end-user notification (PagerDuty, Slack, Splunk, etc.), and other operational toolsets to validate network state, streamlining the workflow.

“We’ve integrated NetQ into our CI/CD environment and use it to detect network state deviation when pushing config changes from our Ansible playbooks. With NetQ, we’re able to make changes with much more certainty and quickly roll back if things don’t go as planned. NetQ is enabling us to operate with a higher degree of confidence at a much faster speed since issues are detected early, reducing downstream complexities,” said Romain Aviolat at Nagravision.

“Historically, we have leveraged a combination of custom scripts and SNMP for monitoring and alerting which can be cumbersome to maintain and are prone to human error. NetQ provides a platform that has empowered us to automate and enhance our monitoring and alerting stack, resulting in reduced administration time when carrying out daily tasks such as upgrades and network changes,” said Tynan Young at Campaign Monitor.

"Red Hat believes automation is key to building, managing and scaling modern cloud networks. Red Hat collaborates with technology partners such as Cumulus Networks to bring solutions like NetQ for telemetry-based fabric validation to Ansible users, helping to further our vision of enabling test-driven deployment and continuous compliance," said Andrius Benokraitis, Principal Product Manager, Red Hat.

NetQ is immediately available and requires Cumulus Linux Release 3.3. Additional information:

About Cumulus Networks

Cumulus Networks is leading the transformation of bringing web-scale networking to enterprise cloud. As the only systems solution that fully unlocks the vertical network stacks of the modern data center, Cumulus Linux's network switch allows you to affordably build and efficiently operate your network just like the world’s largest data centers. By allowing operators to use standard hardware components, Cumulus Networks offers unprecedented operational speed and agility, at the industry’s most competitive cost. Cumulus Networks has received venture funding from Andreessen Horowitz, Battery Ventures, Sequoia Capital, Peter Wagner and four of the original VMware founders.

For more information visit cumulusnetworks.com or follow @cumulusnetworks.

CUMULUS, the Cumulus Logo, CUMULUS NETWORKS, and the Rocket Turtle Logo (the “Marks”) are trademarks and service marks of Cumulus Networks, Inc. in the U.S. and other countries. You are not permitted to use the Marks without the prior written consent of Cumulus Networks.

The registered trademark Linux® is used pursuant to a sublicense from LMI, the exclusive licensee of Linus Torvalds, owner of the mark on a worldwide basis.

Contacts

Bite for Cumulus Networks
Sara Shaughnessy, 203-246-1926
sara.shaughnessy@biteglobal.com
