Home » Security Bloggers Network » Connecting Apache Zeppelin and Apache Drill, PostgreSQL, etc.

Connecting Apache Zeppelin and Apache Drill, PostgreSQL, etc.

by hrbrmstr on August 11, 2018

A previous post showed how to use a different authentication provider to wire up Apache Zeppelin and Amazon Athena. As noted in that post, Zeppelin is a “notebook” alternative to Jupyter (and other) notebooks. Unlike Jupyter, I can tolerate Zeppelin and it’s got some nifty features like plug-and-play JDBC access. Plus it can do some nifty things automagically, like turn the output of a simple aggregation query into a visualization like the one shown at the top of the post.

Drill + Zeppelin (normally that’d be a deadly combination)

The steps to wire-up Apache Drill are similar to those for Athena:

Go to the Interpreter menu (it’s a drop down of the top right main Zeppelin menu bar)
Create a new one (I named mine — unimaginatively — drill)
Set the default.driver to org.apache.drill.jdbc.Driver
Fill in the necessary JDBC default.url. I use jdbc:drill:zk=localhost:2181 and you can have multiple ones if you have need to connect to more than one Drill cluster.
Setup authentication parameters if you need to
Under Dependencies, add an arifact and use the path JAR in $DRILL_HOME/jars/jdbc-driver/. In my case that’s /usr/local/drill/jars/jdbc-driver/drill-jdbc-all-1.14.0.jar

We can use one of Drill’s built-in datasets to test out our connection.

You can do the same thing in the Query box in the Drill web interface, but — even with the ACE editor now being embedded on the page — I think the Zeppelin interface is better and it’s possible to sequence a number of steps in the same notebook (i.e. use a %shell paragraph to grab some JSON data and copy it to a Drill-accessible folder then have a %drill paragraph right below it convert it to parquet and a %spark paragraph below that do some ML on the data and a %knitr block make a proper visualization with R).

Drill + PostgreSQL/etc.

By now, you’ve likely figured out it’s the same, heh, drill for other databases with JDBC support.

For PostgreSQL (i.e. a %postgres interpreter) you need to obtain the JDBC driver and wire its location up as an artifact; use org.postgresql.Driver for the default.driver; enter an appropriate default.url for your setup, which is more than likely jdbc:postgresql://localhost:5432/…if not (i.e. the Zeppelin node and PostgreSQL node are on different systems), then you’ll need to ensure PostgreSQL is configured to listen on an interface that is accessible to the Zeppelin node; enter authentication info and fire up a test query for data that you have. Something like:

FIN

Fear not! There shall be no more simple “wiring-up” posts unless a commenter asks for one or there’s some weird complexity with a new one that deserves some exposition, especially since there are plenty of examples over at the Zeppelin main site. Hopefully these posts have encouraged you do give Zeppelin a try if you haven’t already. I tried working with very early versions of Zeppelin a while back and left it sit for a while, so give it a chance if you did the same. Version 0.9.0 is about to come out and it’s looking even better than 0.8.0, plus it dropped in perfectly on Ubuntu and macOS (even 10.14 beta), too.

Drop a note in the comments if you have any tips/tricks for connecting to other JDBC sources.

*** This is a Security Bloggers Network syndicated blog from rud.is authored by hrbrmstr. Read the original post at: https://rud.is/b/2018/08/11/connecting-apache-zeppelin-and-apache-drill-postgresql-etc/

August 11, 2018August 11, 2018 hrbrmstr Apache Drill, apache zeppelin, zeppelin

Connecting Apache Zeppelin and Apache Drill, PostgreSQL, etc.

Drill + Zeppelin (normally that’d be a deadly combination)

Drill + PostgreSQL/etc.

FIN

Senator Sanders Wants to Own AI Companies — and Hand America’s Adversaries the Keys

NIST’s Nine: The PQC Signature Race Moves to Round Three

The Quantum Arms Race: Why Washington Just Wrote a $2 Billion Check to Nine Companies

Beyond Moore’s Law: The Hyper-Acceleration of Autonomous AI Cyber Capabilities

The Exception Economy: When Security Teams Stop Protecting and Start Negotiating

GoPlus’s Latest Report Highlights How Blockchain Communities Are Leveraging Critical API Security Data To Mitigate Web3 Threats

C2A Security’s EVSec Risk Management and Automation Platform Gains Traction in Automotive Industry as Companies Seek to Efficiently Meet Regulatory Requirements

Zama Raises $73M in Series A Lead by Multicoin Capital and Protocol Labs to Commercialize Fully Homomorphic Encryption

RSM US Deploys Stellar Cyber Open XDR Platform to Secure Clients

ThreatHunter.ai Halts Hundreds of Attacks in the past 48 hours: Combating Ransomware and Nation-State Cyber Threats Head-On

Randall Munroe’s XKCD ‘Horizontal Stabilizers’