Abstract Security Adds Data Lake to Reduce Storage Costs
Abstract Security this week added a data lake, dubbed LakeVilla, to its portfolio of tools for moving data between cybersecurity tools, providing a less expensive alternative to storing data in a security information and event management (SIEM) platform.
Company CEO Colby DeRodeff said LakeVilla doesn’t replace the need for a SIEM but rather provides an option for storing telemetry data that is not in active use on inexpensive object storage services running on Amazon Web Services (AWS), Microsoft Azure and Google Cloud Platform (GCP).
Unlike a general-purpose data lake, however, LakeVilla allows that data to be queried and searched without first having to be rehydrated, he added. Archived data can also be replayed whenever required, said DeRodeff.
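Abstract has not published LakeVilla’s internals, but the general pattern of querying cold telemetry in place is well established. The sketch below illustrates it using DuckDB as a stand-in query engine over hypothetical Parquet files in an S3 bucket; the bucket name and log schema are assumptions, not LakeVilla specifics:

```python
# Sketch: query archived telemetry in place on object storage, without
# rehydrating it into a SIEM first. DuckDB, the bucket name and the
# column names are illustrative stand-ins, not LakeVilla's actual API.
import duckdb

con = duckdb.connect()
con.execute("INSTALL httpfs; LOAD httpfs;")   # enables s3:// paths
con.execute("SET s3_region = 'us-east-1';")   # assumed region; credentials
                                              # come from the usual AWS chain

# Search cold DNS logs for a suspicious domain directly in the archive.
rows = con.execute("""
    SELECT ts, src_ip, query_name
    FROM read_parquet('s3://example-telemetry-archive/dns/*.parquet')
    WHERE query_name LIKE '%.suspicious-domain.example'
    ORDER BY ts DESC
    LIMIT 100
""").fetchall()
```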
Storing data has become a significant issue as the volume of telemetry data being collected continues to increase exponentially, he added. Cybersecurity teams tend to store that data in case it might be needed at some future date, but the costs quickly add up. LakeVilla reduces those costs by enabling cold data to be stored more cost-effectively using comparatively inexpensive object storage in the cloud, said DeRodeff.
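For teams rolling their own version of this idea on plain AWS S3, the cost savings come from storage tiering. The following sketch, with an assumed bucket name, prefix and cutoff periods, shows a lifecycle policy that automatically moves older telemetry to cheaper storage classes:

```python
# Sketch: tier aging telemetry to cheaper S3 storage classes.
# Bucket name, prefix and day thresholds are illustrative assumptions.
import boto3

s3 = boto3.client("s3")
s3.put_bucket_lifecycle_configuration(
    Bucket="example-telemetry-archive",
    LifecycleConfiguration={
        "Rules": [{
            "ID": "tier-cold-telemetry",
            "Status": "Enabled",
            "Filter": {"Prefix": "telemetry/"},
            "Transitions": [
                # Rarely queried after 30 days: infrequent-access tier
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                # Retained mainly for compliance after 90 days: archive tier
                {"Days": 90, "StorageClass": "GLACIER"},
            ],
        }]
    },
)
```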
Cybersecurity teams are constantly looking for proverbial needles in haystacks of data. By reducing the volume of data in a SIEM, it should become easier to discover anomalies that are indicators of compromise warranting further investigation, added DeRodeff. The paradox is that the more data there is to analyze, the harder it can be to identify patterns. There is simply too much noise to discern any actual signals, said DeRodeff.
Unfortunately, many cybersecurity teams lack data management expertise, so there is a clear need for tools and platforms that make it simpler to move and store massive amounts of telemetry data. It’s not enough to simply build pipelines for moving data. Cybersecurity teams need to follow best practices to keep costs under control. Hopefully, there will come a day soon when there are also artificial intelligence (AI) agents that make it even simpler to manage telemetry data at scale.
In the meantime, cybersecurity teams might want to revisit how much telemetry data is being stored in which systems across the portfolio of cybersecurity tools and platforms they have deployed. Much of the data collected by multiple tools may well prove redundant. At a time when more organizations than ever are sensitive to the total cost of cybersecurity, reducing those costs can make a crucial difference, especially if the alternative is reducing the size of the cybersecurity staff, regardless of how many potential threats there may be.
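One simple way to size that redundancy is to fingerprint a normalized form of each event and count duplicates across sources. The sketch below assumes hypothetical field names; real telemetry would need per-source normalization before the comparison is meaningful:

```python
# Sketch: estimate how much telemetry is duplicated across tools by
# hashing the fields that identify an event and counting repeats.
# Field names here are hypothetical placeholders.
import hashlib
import json
from collections import Counter

def event_fingerprint(event: dict) -> str:
    """Hash identifying fields only, ignoring per-tool metadata."""
    core = {k: event.get(k) for k in ("timestamp", "src_ip", "dst_ip", "action")}
    return hashlib.sha256(json.dumps(core, sort_keys=True).encode()).hexdigest()

def redundancy_ratio(events: list[dict]) -> float:
    """Fraction of events that duplicate one already collected elsewhere."""
    counts = Counter(event_fingerprint(e) for e in events)
    duplicates = sum(c - 1 for c in counts.values())
    return duplicates / len(events) if events else 0.0
```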
Regardless of motivation, cybersecurity teams clearly need to find more efficient ways of analyzing massive amounts of telemetry data using tiered storage services. After all, the cause of an incident is most likely going to be found in the telemetry data most recently collected. That doesn’t necessarily mean older telemetry data should be thrown away, but it does present an opportunity to bring some order to telemetry chaos.