
Time-series Databases, Graph Databases, Kafka

by Srini Vadlamani on May 9, 2022

Issue #3 of Data Watcher, a weekly newsletter of curated content on topics related to data management and data protection

Welcome to Data Watcher #3, where I curate interesting articles on data management and data protection for data practitioners and executives. This week’s themes are time-series databases, graph databases, and Kafka.


Time-series Databases & Kafka

Kafka Summit London 2022 just wrapped up this week, so I’ll dig around for interesting talks to share over the coming weeks. In the meantime, take a look at this short presentation by @russellsavage from last year’s summit, titled Kafka & InfluxDB: BFFs for Enterprise Data Applications.

InfluxDB is an open-source time-series database, similar to Prometheus, and is typically used to store metrics data generated by sensors, infrastructure, and applications. The presentation above explains how robust real-time data pipelines can be built by combining InfluxDB with Kafka.

Though the focus in the presentation is on InfluxDB, the general architectural pattern seems to be the following:

Time-series databases are good at ingesting, storing and analyzing metrics data in real-time.
However, reliable delivery of data on the ingest path (if one cares about that sort of thing) is hard to guarantee without over-engineering the pipeline.
Adding a message broker like Kafka to the mix allows data to be staged before it is ingested into the time-series database.

Kafka provides scale, durability, and fault-tolerance, and allows for an architecture with a clean separation of concerns.
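
To make the staging idea concrete, here is a minimal sketch in plain Python. It does not use Kafka or InfluxDB client libraries; `StagingLog`, `FlakySink`, and `drain` are hypothetical names standing in for a broker topic, a time-series store, and a consumer loop. The one point it tries to show is that the consumer only advances its offset after the sink accepts a batch, so a failed write is retried rather than lost.

```python
from dataclasses import dataclass, field


@dataclass
class StagingLog:
    """In-memory stand-in for a durable, append-only broker topic (assumption)."""
    records: list = field(default_factory=list)
    committed: int = 0  # offset of the next record not yet acknowledged by the sink

    def append(self, record):
        self.records.append(record)

    def poll(self, max_records):
        # Always read from the last committed offset, so uncommitted data is re-read.
        return self.records[self.committed:self.committed + max_records]

    def commit(self, count):
        self.committed += count


class FlakySink:
    """Hypothetical time-series store whose first `failures` writes fail."""

    def __init__(self, failures):
        self.failures = failures
        self.stored = []

    def write(self, batch):
        if self.failures > 0:
            self.failures -= 1
            raise ConnectionError("sink unavailable")
        self.stored.extend(batch)


def drain(log, sink, batch_size=2, max_attempts=10):
    """Move staged records into the sink, committing only after a successful write."""
    attempts = 0
    while log.committed < len(log.records) and attempts < max_attempts:
        attempts += 1
        batch = log.poll(batch_size)
        try:
            sink.write(batch)
        except ConnectionError:
            continue  # offset not committed, so the same batch is retried
        log.commit(len(batch))
    return attempts


log = StagingLog()
for i in range(5):
    log.append({"measurement": "cpu", "value": 0.1 * i, "ts": i})

sink = FlakySink(failures=2)
attempts = drain(log, sink)
print(len(sink.stored), "records stored after", attempts, "attempts")
```

The metrics producer never needs to know the sink was down; it keeps appending to the log, which is the separation of concerns the pattern is after.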

In the case of InfluxDB, integration with Kafka is achieved using the Telegraf agent, which can act as both a Kafka producer and consumer as the picture below illustrates.

Integrating InfluxDB with Kafka using Telegraf (Source: InfluxDB online resources)

Additional reading

Charles Mahler writes about a similar use case in How Companies Are Using InfluxDB and Kafka in Production.

Graph Databases

Graph databases, such as Neo4j, Amazon Neptune, and Oracle’s Graph, model data as nodes, edges, and properties (attributes), unlike relational databases, which model data as tables linked by primary key / foreign key relationships.
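
As a rough illustration of that data model, the sketch below represents a tiny property graph in plain Python. It is not the API of Neo4j, Neptune, or Oracle's Graph; the node ids, labels, relationship types, and the `neighbors` helper are all made up for the example.

```python
# Nodes: id -> properties (the "label" key plays the role of a node type).
nodes = {
    "alice": {"label": "Person", "name": "Alice"},
    "bob": {"label": "Person", "name": "Bob"},
    "acme": {"label": "Company", "name": "Acme"},
}

# Edges: (source id, relationship type, target id, edge properties).
edges = [
    ("alice", "KNOWS", "bob", {"since": 2019}),
    ("alice", "WORKS_AT", "acme", {"role": "engineer"}),
    ("bob", "WORKS_AT", "acme", {"role": "analyst"}),
]


def neighbors(node_id, rel_type):
    """Follow outgoing edges of one relationship type from a node."""
    return [dst for src, rel, dst, _props in edges
            if src == node_id and rel == rel_type]


def sources(node_id, rel_type):
    """Follow incoming edges of one relationship type into a node."""
    return [src for src, rel, dst, _props in edges
            if dst == node_id and rel == rel_type]


print(neighbors("alice", "KNOWS"))
print(sources("acme", "WORKS_AT"))
```

Relationships are stored as first-class records that carry their own properties, so following one is a direct lookup on the edge rather than a join between tables.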

@neelimatd talks about an interesting application of graph databases in A Brilliant Use for Graph Databases: Mapping Legacy Software.

In re-architecting an old monolithic legacy application, how does one easily identify its component modules, and the inter-dependencies among them?

This requires both deep domain knowledge from module owners and experts, and good documentation, which is often inadequate, incomplete, or stale. It’s why teams often struggle with modernizing their old legacy applications!

Oracle’s Graph helps simplify this process using a concept called Bounded Context, which is a self-contained, independent module of the application (similar to a service in a Service Mesh). For example, in an e-commerce application, Order Management and Payment Processing might each be an independent module (hence, a bounded context).

Graph has pre-built algorithms that use historical data about monolithic applications collected over 30-40 years to identify bounded contexts in complex monolithic applications. It then helps visualize and analyze the inter-dependencies and relationships among the bounded contexts, thus saving teams valuable time in the first step of modernizing their application.
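
To get a feel for the idea, here is a deliberately naive sketch in plain Python. It is not Oracle's algorithm, which the article does not spell out. It simply treats modules as nodes and call counts as edge weights, ignores weak links, and groups what remains into connected components as candidate bounded contexts. The module names, call counts, and `min_calls` threshold are invented for illustration.

```python
from collections import defaultdict

# Hypothetical call counts between modules of a monolith: (caller, callee) -> calls.
calls = {
    ("OrderService", "OrderRepository"): 120,
    ("OrderService", "CartValidator"): 95,
    ("PaymentGateway", "CardTokenizer"): 80,
    ("PaymentGateway", "FraudCheck"): 60,
    ("OrderService", "PaymentGateway"): 3,
}


def candidate_contexts(calls, min_calls):
    """Group modules that are strongly coupled (at least min_calls calls apart)."""
    graph = defaultdict(set)
    for (caller, callee), count in calls.items():
        graph[caller]  # register both modules even if the link is weak
        graph[callee]
        if count >= min_calls:
            graph[caller].add(callee)
            graph[callee].add(caller)

    seen = set()
    groups = []
    for start in sorted(graph):
        if start in seen:
            continue
        # Depth-first walk over strong links only.
        stack, group = [start], set()
        while stack:
            module = stack.pop()
            if module in seen:
                continue
            seen.add(module)
            group.add(module)
            stack.extend(graph[module] - seen)
        groups.append(sorted(group))
    return groups


contexts = candidate_contexts(calls, min_calls=10)
for group in contexts:
    print(group)
```

With these made-up numbers, the order-related and payment-related modules fall into two separate groups, and the single weak link between `OrderService` and `PaymentGateway` is what remains as a dependency between the two contexts.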

Frankly, all this sounds a little too good to be true! Check it out, though, as it’s an interesting idea.

Additional reading:

The extended article, Graphs simplify software development, talks about two other use cases of Oracle’s Graph – Software testing, and Identifying business processes in an application – in addition to Identifying Bounded Context.

That’s all for this edition of Data Watcher. Hope you enjoy reading the linked articles!

Huge shoutout to the folks at @InfluxDB and @Oracle for the content!

Cheers, and hope you’re having a great weekend! Read original post on Dragon’s Egg.

The post Time-series Databases, Graph Databases, Kafka appeared first on Cyral.

*** This is a Security Bloggers Network syndicated blog from Blog Archive - Cyral authored by Srini Vadlamani. Read the original post at: https://cyral.com/blog/time-series-databases-graph-databases-kafka/
