At ShiftLeft we’re firm believers in the value of open source software. We leverage too many libraries to count, which massively scales our feature velocity and reliability. We also believe in contributing back when and where we can, so it is my pleasure to introduce you to our latest contribution: Gaum.
While in some ways like an object-relational mapper (O/RM), it differs in several key ways.
It’s pretty simple: we needed it to deliver our application security service.
“The road to hell is paved with good intentions.” (Someone a long time ago… who was probably right)
We started building our service in early 2017 and our stack looks very different today from what we started with. We had the basic layout:
The value of cross-references was mutating along with the inherently evolving roadmap of an early-stage startup. However, our overarching goal remained:
To provide a rock-solid service that yields the most useful, actionable data to our users in the most concise way possible.
Much of our initial API was a bare-bones implementation in Go. We hand-wrote a simple persistence layer, which was more than enough at the time, because we were focused on building aspects of our service that didn’t require anything more. But then, suddenly, our simple persistence wasn’t enough, and we started mutating our structure as the data showed us the path. We had built an elegant and simple solution, but what we needed was a very flexible and generic one, at least for a period.
“You should spend more on differentiating software that makes your business more desirable, and less on software that doesn’t…” (Bruce Perens)
We decided that, until we better understood the best way to shape our data, we shouldn’t worry about optimizing how efficiently we stored it. The tricky thing with data efficiency is that you first need to figure out the best way to extract the information your service requires before you can determine the best architecture. A parallel could be drawn to a workbench: you first need to use it, work on it, and live with it, so that you can study the mess, derive a usage pattern for your tools from it, and only then arrange them.
And there we made a compromise: an O/RM. O/RMs, like many other technologies that bridge two different paradigms, have their fair share of detractors and supporters… and we’ve certainly experienced both sides. At first it was wonderful. In about a week we moved our code base to the O/RM (I am intentionally omitting the name because I don’t believe in shaming open source projects), and for a period it was good: we changed the structure of our data, added columns, made queries, and moved information around, and it was all done relatively easily, almost “magically.”
“Magic always comes with a price.” (Robert Carlyle, as Rumpelstiltskin)
The cracks in the convenience of an O/RM became evident fairly quickly as our data structure stabilized. When you have a solid data shape, you start thinking in terms of powerful queries, products, and operations. O/RMs make simple, very common operations easy to introduce, but they greatly increase the difficulty of uncommon ones. It turned out that our honeymoon period with the O/RM was coming to an end, and our frustrations with it were just beginning.
“I hate this O/RM.” (Everybody at my company)
Our code base is pretty clean, so adding or removing dependencies is quite easy, yet the sunk cost fallacy was a bit hard to defeat. We stretched the O/RM like the last piece of chewing gum, with a lot of creativity, until we realized we were bypassing most of the O/RM and going straight to raw SQL, which is quite error-prone. It was time for us to reconsider our approach.
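To make the raw SQL hazard concrete: with hand-assembled Postgres queries, the `$1`, `$2`, … placeholders and the argument list are maintained separately, so they drift apart as conditions are added or removed. The sketch below is a hypothetical helper written only for illustration (the `cond` type and `buildWhere` function are assumptions, and this is not Gaum’s API); it shows one way a thin layer can keep the two in sync.

```go
package main

import (
	"fmt"
	"strings"
)

// cond is a hypothetical WHERE fragment written with "?" placeholders,
// bundled with the arguments those placeholders stand for.
type cond struct {
	expr string
	args []interface{}
}

// buildWhere appends the fragments to base, joined with AND, and rewrites every
// "?" as a Postgres positional placeholder ($1, $2, ...). Numbering is derived
// from the arguments actually collected, so it cannot drift out of sync.
// Simplification: a literal "?" inside a quoted SQL string would also be rewritten.
func buildWhere(base string, conds ...cond) (string, []interface{}, error) {
	var sb strings.Builder
	sb.WriteString(base)
	var args []interface{}
	n := 0
	for i, c := range conds {
		if want := strings.Count(c.expr, "?"); want != len(c.args) {
			return "", nil, fmt.Errorf("condition %q expects %d args, got %d", c.expr, want, len(c.args))
		}
		if i == 0 {
			sb.WriteString(" WHERE ")
		} else {
			sb.WriteString(" AND ")
		}
		for _, r := range c.expr {
			if r == '?' {
				n++
				fmt.Fprintf(&sb, "$%d", n)
				continue
			}
			sb.WriteRune(r)
		}
		args = append(args, c.args...)
	}
	return sb.String(), args, nil
}

func main() {
	// Table and column names here are made up for the example.
	q, args, err := buildWhere(
		"SELECT id, name FROM projects",
		cond{"org_id = ?", []interface{}{42}},
		cond{"created_at > ?", []interface{}{"2018-01-01"}},
	)
	if err != nil {
		panic(err)
	}
	fmt.Println(q)
	fmt.Println(args...)
}
```

The resulting query string and argument slice can be handed to any Postgres driver that accepts positional parameters. Dropping or reordering a condition renumbers the placeholders automatically, which is exactly the bookkeeping that tends to go wrong when done by hand.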
“You’re not wrong, Walter. You’re just an a******.” (The Dude in `The Big Lebowski`)
There are many things about our O/RM we appreciated, but many others, while close to what we needed, ultimately weren’t close enough. Many of its approaches felt fresh and fit our model very well. Others did not.
A big deterrent was non-determinism. In many respects there is no clear path between Go and SQL, so the behavior in SQL is often hard to predict from its Go counterpart, and it is hard to discern clear patterns in how different structures translate from one to the other. It is impossible to cover every gap between two paradigms, which is completely understandable, but every gap left uncovered becomes a dreaded “edge case.” Many times I went from method call to docs to code, only to be forced to write a test case to learn what the O/RM would do in a given situation. Even then, my conclusions were valid only for a specific version, because the behavior was not guaranteed by design. Another limiting aspect of O/RMs is that the numerous edge cases are resolved by defining many corresponding conventions for how to use the library. That works most of the time for basic usage, but it doesn’t scale as your codebase grows.
There were some things though that we really liked about the idea of an O/RM:
We wanted to:
Essentially, we wanted the library to help us talk to the DB, without being a bottleneck.
Our solution is Gaum, a flexible open source library to talk to Postgres. Gaum has several key features, including:
We’re optimistic that the flexibility we’ve built into Gaum will be beneficial for many use cases. It offers a range of options that let you work at as high or as low a level as you need, and we’ve already leveraged it in many ways, but we aren’t done yet. We’ll keep growing Gaum as we gain experience using it to handle our data, and we hope it will be helpful for yours as well!
Introducing Gaum: An Open Source O/RM That isn’t an O/RM was originally published in ShiftLeft Blog on Medium, where people are continuing the conversation by highlighting and responding to this story.
*** This is a Security Bloggers Network syndicated blog from ShiftLeft Blog - Medium authored by Horacio Duran. Read the original post at: https://blog.shiftleft.io/introducing-gaum-an-open-source-o-rm-that-isnt-an-o-rm-e7fd2880396e?source=rss----86a4f941c7da---4