We were delighted to welcome a stellar line-up of speakers: Vlad, co-founder & CTO of Revolut; Maxime, Director of Engineering at Criteo; Eliot, co-founder & CTO at PhotoRoom; and Ori, co-founder & CSO of Platform.sh, in a panel moderated by Balderton investor Zoe.
We were kindly hosted at the Paris HQ of Ekimetrics. The panel was followed by four breakout sessions covering cloud spend management, multi-cloud management, LLMs in devops and innovations in infrastructure-as-code. The sessions were led by Florian from PlayPlay, Julien from Ekimetrics, Margaux from Balderton, and Dmitry from Emma.ms.
The cloud often feels free and infinite...
While cloud costs might not be top of mind for younger companies, they become more of a problem with scale and traction. As Vlad put it, ‘the cloud often feels free and infinite’. This is a key concern for larger teams across the industry, where it is not uncommon for engineers to never look at the bill.
During the cloud spend management breakout session, a range of internal and external tools was discussed, along with the main steps of cloud cost reduction.
Another approach is not being on the cloud at all! Criteo decided to remain bare metal, running its own first-party data centres while most of the world transitioned to the cloud. This approach clearly isn’t for everyone, but at Criteo’s scale it saves a lot of money compared with what the equivalent cloud footprint would cost. The on-premise approach does of course add complexity, including the need for dedicated infrastructure teams.
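The cloud-versus-bare-metal trade-off above comes down to a break-even calculation. The sketch below illustrates the shape of that calculation with entirely made-up numbers (the prices, amortisation period and team cost are assumptions for illustration, not Criteo’s figures):

```python
# Hypothetical break-even sketch: cloud vs. owned hardware.
# All figures below are illustrative assumptions, not any company's numbers.

def monthly_cloud_cost(n_servers: int, price_per_server: float) -> float:
    """On-demand cost scales roughly linearly with fleet size."""
    return n_servers * price_per_server

def monthly_bare_metal_cost(n_servers: int, capex_per_server: float,
                            amortisation_months: int, opex_per_server: float,
                            fixed_team_cost: float) -> float:
    """Owned hardware: amortised capex + per-server opex + a fixed
    infrastructure-team cost that the cloud would otherwise absorb."""
    amortised = n_servers * capex_per_server / amortisation_months
    return amortised + n_servers * opex_per_server + fixed_team_cost

# Assumed: $600/mo cloud instance vs. a $12,000 server amortised over
# 48 months, $150/mo power + colocation, $80k/mo for a platform team.
for n in (100, 1_000, 5_000):
    cloud = monthly_cloud_cost(n, 600)
    metal = monthly_bare_metal_cost(n, 12_000, 48, 150, 80_000)
    print(f"{n:>5} servers: cloud ${cloud:>10,.0f}/mo  bare metal ${metal:>10,.0f}/mo")
```

With these assumed numbers the fixed team cost makes bare metal more expensive at 100 servers, while at 1,000+ servers it comes out well ahead, which is the scale effect the panel described.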
In the case of PhotoRoom, after initially integrating Stability AI into its stack, the company ran into latency issues and opted for Nvidia’s TensorRT. Compiling the model not only transformed processing speeds (sometimes by as much as 10x) but also turned out to be a huge source of cloud cost savings. A clear reminder that the solution to one problem is often tangled up with another…
A key topic of discussion was the growing importance of evaluating and mitigating the carbon emissions of one’s infrastructure. This starts with server location: if the energy comes from a clean source, that’s a big win. A good place for a quick assessment is Electricity Maps. Then come embedded emissions - such as those from building and shipping servers - which are much harder to quantify.
This is a real priority for Maxime’s team at Criteo, whose assessments attribute the majority of their emissions to the building and shipping of servers, and who are reducing spare capacity as much as possible.
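The operational side of the assessment above is simple arithmetic: energy consumed multiplied by the carbon intensity of the local grid, which is the kind of figure Electricity Maps reports. A minimal sketch, using illustrative (assumed) intensity values rather than live data:

```python
# Rough operational-emissions estimate.
# Grid carbon intensity in gCO2eq/kWh; values below are illustrative
# assumptions, of the kind Electricity Maps reports per region.
GRID_INTENSITY = {"FR": 56, "DE": 381, "PL": 635}

def operational_emissions_kg(power_kw: float, hours: float, region: str) -> float:
    """Energy used (kWh) times grid intensity, converted from g to kg."""
    return power_kw * hours * GRID_INTENSITY[region] / 1000

# A 10 kW rack running for a 730-hour month in three different grids:
for region in ("FR", "DE", "PL"):
    kg = operational_emissions_kg(10, 730, region)
    print(f"{region}: {kg:,.0f} kgCO2eq/month")
```

The same workload can differ by an order of magnitude in emissions depending purely on where the servers sit, which is why the discussion started with server location.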
Not considering the carbon impact of our workloads is not business friendly
Vlad, co-founder and CTO of Revolut
While there are many reasons to use multiple cloud providers, doing so inevitably adds complexity: engineers need to know their way around each cloud, and infrastructure and cost management become harder when figures have to be aggregated across providers. If a company can find enough capacity with a single provider, staying on one cloud is probably the best option - it is simpler to manage and easier to reason about architecturally. That said, avoiding lock-in to any one platform pays off at cost-negotiation time, so steering clear of services offered by only one cloud is encouraged.
One very valid reason to embrace a multi-cloud strategy is serious resource capacity constraints, as is the case for many AI-native companies that need lots of GPU capacity and can’t necessarily find it all within a single cloud provider.
There was general agreement that most teams want a cloud-agnostic approach rather than a truly multi-cloud one: the flexibility to run on several cloud providers without the added complexity, which they prefer to have abstracted away. Kubernetes-based stacks facilitate this.
While it is still early days for LLMs in DevOps, many were curious about the numerous exciting LLM applications emerging - not just in implementing code (writing, maintaining and debugging), but also in setting up code (developer portals, IDEs) and deploying code (CI/CD, hosting, feature flags). On the implementation side, players like Tabnine, Ghostwriter, Cody by Sourcegraph and Pulumi.ai were covered with live demos.
We also discussed the recent shifts by some (though not all) open-source players towards more restrictive licensing models, Terraform’s being the most prominent example, and potential successors like OpenTofu.
The breakout session covered not just the different tools on the market, but also a wider discussion of how LLMs will affect DevOps across these verticals in the coming years. We can expect an increasing range of applications over time - not just using LLMs to write code faster, but also to actually host and deploy it. That said, it will be a while before this comes into play, and developer input will still be required.