Building resilient and secure systems – Lessons from Devoxx Poland

Devoxx Poland is a large-scale, developer-first conference that brings together nearly 3,000 engineers for 3 days of presentations and workshops from some of the most forward-thinking minds in the space. This year I found an unofficial theme forming around building, testing and securing the complex architectures of modern applications. Some talks in particular left me with a radically new outlook on how to tackle these complexity challenges. I’ve outlined a few highlights from the conference that I think are critical to all engineers.

Testing the software architecture

During his presentation, “Testing software architecture”, Mark Richards gave incredible insights into how and why the architecture behind our applications needs to be tested. Although developers are generally great at writing unit tests, stress testing our architecture is not something we have been very effective at.

“We are familiar with writing unit tests, we are well practiced at this kind of thing, but we are not so good at testing our architecture.” – Mark Richards

As Mark explained during his talk, testing architecture is different to testing applications or services, but it can be done, and the test that does it is called a fitness function.

So what are we actually looking to test when we run a fitness function? Mark gave some areas of your architecture you need to stress test. These include the system’s:

  • Performance
  • Elasticity
  • Recoverability
  • Responsiveness
  • Data integrity
  • Security
  • Availability
  • Fault tolerance
  • Concurrency
  • Scalability
  • Data consistency
  • Reliability

Mark explained that there are two kinds of fitness function: trend-based and threshold-based. Trend-based fitness functions check whether a measurement is getting better or worse over time. Threshold-based fitness functions check whether a measurement has crossed a set limit. But when do we actually run these functions?
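Before getting to timing, here is a minimal Python sketch of the two styles. It is an illustration only, not code from Mark’s talk: the function names, the 200 ms limit, the 10% tolerance, the window size and the sample response times are all assumptions chosen for the example.

```python
from statistics import mean


def threshold_fitness(samples_ms, limit_ms):
    """Threshold-based: pass only if every sample stays at or under the limit."""
    return max(samples_ms) <= limit_ms


def trend_fitness(history_ms, window=3, tolerance=0.10):
    """Trend-based: fail if the recent average is more than `tolerance`
    worse than the average of the window just before it."""
    if len(history_ms) < 2 * window:
        return True  # not enough history to judge a trend
    previous = mean(history_ms[-2 * window:-window])
    recent = mean(history_ms[-window:])
    return recent <= previous * (1 + tolerance)


# Hypothetical response times in milliseconds, oldest first.
response_times = [110, 112, 115, 130, 138, 145]

print(threshold_fitness(response_times, limit_ms=200))  # True
print(trend_fitness(response_times))                    # False
```

With this made-up data the threshold check still passes, because nothing has reached 200 ms, while the trend check fails because the latest window is noticeably slower than the one before it. That is the kind of early warning a trend-based function is meant to give.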

Fitness functions can be triggered by an event, usually within your CI/CD pipelines, or run continuously in production. Triggered functions are useful because they run before we go into production. The big problem is that we have to write these tests based on how we think users will behave in our applications, and users never behave how we expect them to. Continuous functions are cheaper and more accurate, but they only alert us once our system’s fitness starts to fail or breaches a predefined performance threshold in production. Obviously, finding out about issues in your architecture when it is already in production is less than ideal.

So which should you use? Unsurprisingly, the answer is both: triggered functions give early detection, while continuous functions give cheaper, more accurate detection at a later stage.

So you know what fitness functions are and you know when to run them, but how do we actually perform them? A great takeaway from Mark’s talk was the set of tools he put forward to help perform architecture tests.

“These tests have nothing to do with functionality, it has to do with structure.” – Mark Richards
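To make “structure, not functionality” concrete, here is a small Python sketch of a structural check. It is not one of the tools from the talk; it is a hand-rolled illustration using only the standard library, and the layer names (`domain`, `web`), the rule itself and the sample modules are hypothetical.

```python
import ast

# Hypothetical layering rule: code in the "domain" layer must not import "web".
FORBIDDEN = {"domain": {"web"}}


def imported_top_level_packages(source):
    """Return the set of top-level package names imported by `source`."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            found.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            found.add(node.module.split(".")[0])
    return found


def layer_violations(modules):
    """`modules` maps dotted module names to source text.
    Returns (module, forbidden_package) pairs for every broken rule."""
    violations = []
    for name, source in modules.items():
        layer = name.split(".")[0]
        banned = FORBIDDEN.get(layer, set())
        for package in sorted(imported_top_level_packages(source) & banned):
            violations.append((name, package))
    return violations


# Stand-in source files; a real check would read them from the repository.
modules = {
    "domain.orders": "from web.handlers import render\n",
    "domain.pricing": "import decimal\n",
    "web.handlers": "import domain.orders\n",
}

print(layer_violations(modules))  # [('domain.orders', 'web')]
```

Nothing here runs the application or checks what it does. The check only looks at which parts of the code depend on which, so it could sit in a CI/CD pipeline as a triggered fitness function.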

Building residual systems through residuality theory

A talk from Barry O’Reilly was one of my favorite presentations of Devoxx. It truly challenged me and the audience on how we think about building resilient systems for the real world. The talk introduced a new theory of system testing called residuality theory. Understanding it is a journey, but stay with me until the end and you won’t regret it.

The whole talk focused on the idea that our software systems face a huge number of random stressors. In the real world, these stressors, or events if you prefer, cause our systems to change, sometimes in unexpected and damaging ways. The issue is that we cannot possibly predict every stressor that reality will throw our way. But using the mathematical principles of what Barry calls ‘residuality theory’, we don’t need to find all possible stressors. We simply need to focus on ensuring the residue of our systems works despite these events. What’s the residue, you ask?

After a stressor event, our systems may change, and what is left over is called the residue. In software terms, the residue is the functionality your software still has after a stressor occurs. Real-world stressors can be unbelievably random: users will do things you never expected them to do in a million years. But using the principles of Kauffman’s Boolean networks, Barry actually proved that when completely random events happen in systems that are linked, patterns start to form and the possible outcomes are reduced. This is because nodes within the linked systems act as attractors that give structure to the randomness. These attractors are the components that will, in the end, affect your residue. To summarize: a stressor is an external event, the components this event impacts are the attractors, and the leftover functionality is the residue.
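The claim that linked random systems collapse into a few patterns can be explored with a toy random Boolean network in Python. This is a sketch of the textbook Kauffman model, not Barry’s proof or method, and the network size (10 nodes), the number of inputs per node (2) and the seed are assumptions. In this classic model an attractor is a state, or cycle of states, that the network settles into, which is a slightly narrower meaning than the component-level one used above.

```python
import random


def random_boolean_network(n, k, rng):
    """Each node gets k distinct random input nodes and a random truth table."""
    inputs = [rng.sample(range(n), k) for _ in range(n)]
    tables = [[rng.randrange(2) for _ in range(2 ** k)] for _ in range(n)]
    return inputs, tables


def step(state, inputs, tables):
    """Synchronously update every node from the current state."""
    new_state = []
    for node_inputs, table in zip(inputs, tables):
        index = 0
        for source in node_inputs:
            index = (index << 1) | state[source]
        new_state.append(table[index])
    return tuple(new_state)


def attractor_from(state, inputs, tables):
    """Follow the trajectory until a state repeats; return that cycle's states."""
    seen = {}
    path = []
    while state not in seen:
        seen[state] = len(path)
        path.append(state)
        state = step(state, inputs, tables)
    return frozenset(path[seen[state]:])


def count_attractors(n=10, k=2, seed=0):
    """Start from every possible state and count the distinct attractors reached."""
    rng = random.Random(seed)
    inputs, tables = random_boolean_network(n, k, rng)
    attractors = set()
    for number in range(2 ** n):
        state = tuple((number >> bit) & 1 for bit in range(n))
        attractors.add(attractor_from(state, inputs, tables))
    return len(attractors)


print("possible starting states:", 2 ** 10)
print("distinct attractors reached:", count_attractors())
```

Running it with different seeds lets you compare the 1,024 possible starting states with the number of attractors they end up in, which is the reduction in possible outcomes described above.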

“The curse of high dimensionality – When we try to do random things as human beings we fail miserably.” – Barry O’Reilly

“We focus on things that have a high probability based on our own experiences.” – Barry O’Reilly

So now to the crux of the presentation: why on earth does any of this matter? Because Barry put forward a change in mindset when testing for resilience in software architecture. Instead of testing for every possible scenario, focus on obscure and random stressors. These will reveal your attractors, which in turn will show your residue. If you focus on having a functional residue, you will address the issues behind more stress events than traditional thinking allows. A game changer.

Keeping your code base secure

A well-tested and well-designed architecture is meaningless if it is not secure. The final presentation I want to discuss was from Olimpiu Pop and Steve Poole. The talk, titled “Three things developers should know to keep their code secure”, was a great developer-first look into security.

Olimpiu and Steve demonstrated how hugely cyber crime has expanded on a global scale. It was only around 2016 that cyber crime first surpassed the drug trade as the most profitable organized crime activity in the world. Today, cyber crime costs about 23 times more each year than the drug trade, at about $11.5 trillion a year.

“If cyber crime was a country it would be the 3rd biggest economy in the world by GDP.” – Steve Poole

During the talk, the presenters outlined how the zero-day window, the time in which you have to patch a previously unknown security vulnerability, has shrunk to nearly nothing. And with the help of AI, malicious actors are getting even more efficient at finding and exploiting vulnerabilities in your software. A new phrase was even coined, “prompt kiddies”, a variation on what we previously knew as script kiddies: lesser-skilled malicious actors using tools like ChatGPT to do malicious things.

“AI won’t take your job, but bad guys with AI might.” – Steve Poole

Key to the presentation was how malicious actors are targeting software today. Much of it centers on the software supply chain and the open-source components it is made up of, in particular on malicious actors exploiting vulnerabilities in these packages. Interestingly, Steve outlined that in 94% of the cases where a vulnerable package was used, there was a non-vulnerable package available. Luckily, the talk outlined some core actions developers can take to keep their applications secure:

  • Using an SBOM (software bill of materials) – This new requirement creates great transparency into our dependencies and dependents.
  • Introducing reproducible builds – This is a mechanism to double-check the builds we use.
  • Using Sigstore – A very new development enabling you to start signing builds.

While none of these three actions is a silver bullet, together they can form an effective defense against vulnerabilities in your application.
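As one illustration of “double-checking the builds we use”, here is a minimal Python sketch that compares the SHA-256 digests of two artifacts built independently from the same source. It was not shown in the talk; the file names and contents are stand-ins, and a real setup would compare artifacts produced by separate build machines.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path):
    """Stream a file through SHA-256 and return the hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def builds_match(first, second):
    """Treat two builds as reproducible only if they are byte-for-byte identical."""
    return sha256_of(first) == sha256_of(second)


with tempfile.TemporaryDirectory() as workdir:
    # Stand-ins for artifacts from two independent builds of the same source.
    build_a = Path(workdir, "app-build-a.jar")
    build_b = Path(workdir, "app-build-b.jar")
    tampered = Path(workdir, "app-tampered.jar")
    build_a.write_bytes(b"example artifact contents")
    build_b.write_bytes(b"example artifact contents")
    tampered.write_bytes(b"example artifact contents + injected code")

    print(builds_match(build_a, build_b))   # True
    print(builds_match(build_a, tampered))  # False
```

The comparison only means something if the build itself is deterministic, so that the same source always produces the same bytes. Signing a build and verifying its digest address different risks, which is one reason the three actions work better together.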

Some of the pushback against security is about productivity, based on the idea that security slows an organization down. To counter this argument, I want to finish on what I found to be one of the most interesting and powerful points in the presentation:

“Companies that cared more about productivity than security, were less productive than the ones that gave priority to security.” – Steve Poole

Conclusion

Devoxx Poland was an incredible conference with lots of radically new insights and game changers. Of course, there were plenty of other fantastic presentations I could have written 100 more blog posts about (who knows, maybe I will). But I came away with new perspectives, tools and knowledge on how to build, test and secure modern applications. Do check out the Devoxx conference page to see where the next one in your area is.

*** This is a Security Bloggers Network syndicated blog from GitGuardian Blog - Automated Secrets Detection authored by Mackenzie Jackson. Read the original post at: https://blog.gitguardian.com/building-resilient-and-secure-systems-devoxx-poland/

Mackenzie Jackson

Mackenzie Jackson is the developer advocate at GitGuardian. He is passionate about technology and building a community of engaged developers to shape future tools and systems.
