Dan was formerly the CTO at Criteo, Europe’s largest ad tech firm, where Justin grew and led the Data & Analytics team. Justin joined Criteo two years before its IPO, and saw the company grow more than 8x in eight years. Justin is currently a Director of Engineering at Google.
At Criteo, they were tracking over 4B devices and browsers, corresponding to close to half the Internet population. They served 4B personalised ads, and handled over 300B real-time bid requests per day. 100PB of data was stored in Hadoop (making it the largest cluster in Europe), and 3TB worth of new data was produced every single day. This was a scale comparable to Twitter or Netflix at the time.
With so much at stake, a best-in-class analytics team couldn’t have been more important. Along the way, Justin and his team learned valuable lessons about building and scaling a world-class Data & Analytics team, and you can find Justin’s top tips for start-ups below.
Dan Teodosiu, Margaux Wehr and Justin Coffey
As early as possible, but think carefully about what - and who - you need.
Never underestimate the value of decision support. Especially when it comes to technical decision support, you need to be thinking about it from the early stages of a company's growth. The more you can use data to cut through the ambiguity of technical and product decisions, the better off you'll be.
That said, crawl before you walk. Avoid investing in advanced data science too soon. More to the point, classic decision support, business intelligence, and old-school time series analysis get you a really long way. As your tools are not going to be super mature, you’ll want to favour folks who are able to cut through the accidental complexity inherent in immature data analytics stacks. You should also hire people who are good with stats and visualisations so that the data you collect isn’t used to tell you the wrong things.
Even once your startup has reached 100 or 200 people, you're probably still delivering off your initial “eight page strategy deck”, trying to discover the one big thing that differentiates you in the market. It takes time to reach product-market fit, and getting there is going to need a fair amount of analyst time. You also don't want your analytics to be a single point of failure, so think about a small analyst team of two to three people at the start.
Then, at around 200 or 300 employees, when you’re starting to explore derivatives of your initial business idea, that’s when you need to think about a rotating cast of people for data analytics jobs who are able to handle different types of missions.
At Criteo, the genesis of my job building and running Data & Analytics teams boiled down to just solving a scalability problem.
There’s the proverbial “laptop analytics stack”. In the early days at Criteo, there was a local BI team literally building all of their automation on a single laptop. There was a freak-out moment when the cleaning staff came in one Friday night and closed the lid on said laptop, and the team's entire automation pipeline went offline for the weekend. They all came in on Monday and nothing had run, and it was of course at the most inopportune time and caused a bunch of heartburn in the org.
On the other hand, you shouldn’t explicitly try to squash the “Cambrian explosion” of innovation in interesting automation that analysts will come up with. It’s a balance.
So you have to invest at the same time in centralising that automation as quickly as you can, so that you can have a well-lit path to something stable. When something is found to be mission critical, you want to be able to get it into a safe place.
As you scale your company, growing pains are inevitable. Here are some key learnings from Justin’s time at Criteo:
Good data architecture means you’re going to have better outcomes. It’s easier to maintain and adapt, easier for analysts to interface with, and easier for the leadership team to understand how data is flowing at a macro-level.
A centralised data catalogue can also be incredibly valuable (though granted, it might require 10x more work than you think!). A good data catalogue can tell people where the best data is, who owns it, and what the nature of the data is. It shows you what data you can deprecate because, for example, you can see that nobody's using it. It lets you come up with sane migration plans and things like that. And it's also critical from a policy perspective in our current and future regulatory environment: data provenance tracking becomes non-negotiable.
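To make the catalogue idea concrete, here is a minimal sketch in Python. It is an illustration only, not how Criteo built its catalogue: the `CatalogueEntry` fields, the dataset names, the 180-day threshold, and both helper functions are assumptions made up for this example. It shows the three things described above: ownership and description, spotting unused data, and tracing provenance.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta
from typing import List, Optional


@dataclass
class CatalogueEntry:
    """One dataset in a hypothetical central catalogue."""
    name: str
    owner: str                          # team accountable for the dataset
    description: str                    # the nature of the data
    upstream: List[str] = field(default_factory=list)  # datasets it is derived from
    last_read: Optional[date] = None    # most recent known read by any consumer


def deprecation_candidates(entries, today, unused_days=180):
    """Return names of datasets with no recorded read in the last `unused_days`."""
    cutoff = today - timedelta(days=unused_days)
    return [e.name for e in entries if e.last_read is None or e.last_read < cutoff]


def lineage(entries, name):
    """Follow upstream links and list every dataset that `name` depends on."""
    by_name = {e.name: e for e in entries}
    seen, stack = [], list(by_name[name].upstream)
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.append(current)
        if current in by_name:
            stack.extend(by_name[current].upstream)
    return seen


# Hypothetical entries, invented for illustration.
catalogue = [
    CatalogueEntry("raw_events", "platform", "Unprocessed event logs",
                   last_read=date(2024, 5, 1)),
    CatalogueEntry("daily_revenue", "analytics", "Revenue aggregated by day",
                   upstream=["raw_events"], last_read=date(2024, 5, 20)),
    CatalogueEntry("legacy_clicks", "unknown", "Old click export",
                   upstream=["raw_events"], last_read=date(2023, 1, 10)),
]

unused = deprecation_candidates(catalogue, today=date(2024, 6, 1))
sources = lineage(catalogue, "daily_revenue")
```

In a real organisation the `last_read` and `upstream` values would have to come from query logs and pipeline metadata rather than being entered by hand, which is a large part of why a catalogue takes more work than expected.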