We’ve all heard how big, unstructured data is being generated at an unprecedented pace today, and about the tremendous value that can be created by capturing insights from it, whether from tweets, cybersecurity feeds, or geospatial data. Companies like Cloudera*, MongoDB*, Elastic and others have made huge strides in storing and processing this unstructured data, and have built large businesses around it.
Recently, however, several macro trends related to agile software development and the “Internet of Things” (IoT) have added another dimension to big data: “fast”, time-stamped data. Time is of the essence when analyzing and acting on this fast big data, known formally as “time-series data”.
Where do we find this data? Historically, hedge funds and financial powerhouses have benefited to the tune of millions of dollars by processing stock-price data in split seconds. Michael Lewis’ book “Flash Boys” described how high-frequency trading firms gamed the system by analyzing supply and demand for stock purchases and locking in profits, all enabled by ultra-high-speed time-series analytics. Now IoT and agile software development, sometimes also referred to as the DevOps movement, can unlock significant value of their own by leveraging time-series storage and analytics.
First, in terms of IoT, Intel predicts there will be more than 200 billion “smart objects” by 2020, or around 26 for every person on Earth. Whether it’s a Nest thermostat or camera in your home, an Apple Watch on your wrist, or sensors monitoring oil fields or smart buildings, all these devices are generating enormous amounts of time-stamped data with potentially game-changing applications.
Storing and processing this granular data and extracting valuable insights from it can unlock tremendous value across industries. Healthcare companies may want to analyze timely exercise patterns from wearable devices across their patient population to recommend preventative health action. Likewise, auto companies in Detroit can use time-series data to detect fault patterns in manufacturing lines in real time and proactively address them to avoid costly car recalls later.
Smart building companies can detect physical security break-ins or sub-optimal energy usage in real time and act on them. Drones and vision sensors in cars can throw off critical data that needs to be processed in real time to guide autonomous driving. The possibilities for using this data are immense, but all of them rest on the same foundation: high-speed time-series data processing, storage, and analytics.
The second macro trend driving the growth of time-series data is the explosion in cloud computing and agile IT, whereby IT execs are finding it more strategic to lease IT equipment as a service from vendors like Amazon Web Services and Microsoft Azure than to run it themselves. The recent growth of outsourced cloud computing has been staggering: Last year, Amazon Web Services alone generated $7.88 billion in revenue. At the same time, developers are consuming IT infrastructure without operations support, the so-called DevOps movement. Both factors combined mean infrastructure is often shared across hundreds of customers, sometimes for seconds or minutes. Think of Airbnb taken to the extreme, with shared IT infrastructure.
To enable this shared IT and serverless movement, cloud providers and customers alike need to closely track usage, compliance, and security in their shared environment down to the microsecond. As the old adage goes, “You can’t manage what you can’t measure!” If the underlying, ephemeral infrastructure exists only for split seconds, monitoring and securing it with real-time metrics is key to consuming it effectively.
The major issue for people hoping to leverage time-series data, however, is that traditional SQL databases are not set up to handle this type of rolling, time-sensitive data. Some of the market leaders in NoSQL can address time-series storage and analytics by tweaking their systems, but we believe the category is sizable and critical enough to warrant a purpose-built time-series database and analytics platform.
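To make “rolling, time-sensitive data” a little more concrete, here is a minimal Python sketch of two operations that time-series workloads tend to lean on: expiring old points and rolling raw readings up into fixed time windows. This is purely illustrative and is not InfluxDB’s API or implementation; the function names, sample readings, and window sizes are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def apply_retention(points, now, max_age_s):
    """Keep only (timestamp, value) points no older than max_age_s seconds."""
    return [(ts, value) for ts, value in points if now - ts <= max_age_s]


def downsample(points, window_s):
    """Average (timestamp, value) points into fixed windows of window_s seconds."""
    buckets = defaultdict(list)
    for ts, value in points:
        # Align each timestamp to the start of the window it falls in.
        buckets[ts - ts % window_s].append(value)
    return [(start, mean(values)) for start, values in sorted(buckets.items())]


# Hypothetical CPU readings: (timestamp in seconds, percent utilization).
readings = [(0, 10.0), (20, 30.0), (65, 50.0), (90, 70.0), (130, 90.0)]

# Drop anything older than two minutes, then average per one-minute window.
recent = apply_retention(readings, now=130, max_age_s=120)
per_minute = downsample(recent, window_s=60)
print(per_minute)
```

With a general-purpose database, logic like this usually has to be written and scheduled by hand, which is part of why a purpose-built time-series platform is appealing.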
Today, we’re announcing Battery’s investment in time-series data company InfluxData**. According to DB-Engines, which tracks the database management market, InfluxData currently boasts the most users in its market, and time-series is the fastest-growing category in the NoSQL space. Battery is leading the $16 million Series B round with participation from several other investors, including Mayfield and Trinity, who have experience with other prominent open-source and data-focused companies such as Docker and Couchbase.
InfluxData, based in San Francisco, is led by industry veterans and open-source community leaders Paul Dix and Todd Persen, as well as experienced CEO Evan Kaplan. Enterprises including Nordstrom, Cisco, eBay, AXA, SolarCity and Telefonica already use InfluxData, and we are looking forward to watching the company grow as more enterprises leverage InfluxDB for DevOps monitoring, cybersecurity and IoT solutions.
*MongoDB and Cloudera were portfolio companies of Mr. Thakker’s at his previous firm, Intel Capital.
**Denotes a current or former Battery portfolio company. For a full list of all Battery investments and exits, please click here.