Podcast

May 18, 2026

72x Faster Software Delivery with a Former AI Skeptic

agentic coding

agile methodology

context management

software architecture

legacy rewrite

ai workflows

Odevo Lead Engineer Dominic Warchalowski explains how to redesign Agile processes and repository structures to deliver software 72x faster using agentic AI.

Hosted by

Deejay

Featuring

Dominic Warchalowski

Guest Role & Company

Lead Engineer @ Odevo

Episode Transcript

Deejay (00:03) This is the Waves of Innovation podcast and I am Deejay, your host. In this episode, I am talking to Dominic Warchalowski, who is a lead engineer from Odevo. When Dom and I first met at the beginning of Odevo's agentic coding journey, Dom was quite an AI skeptic. He had had bad experiences with AI coding before and doubted the quality that could be achieved. Fast forward to now, and it's been four or five months since he's written any code by hand, and he's actually quite happy about that. Even more important, he's now part of a team that is delivering code 72 times faster than previously. So that is delivering eight years' worth of work in 11 months with a third of the people, which I'm sure you can agree is pretty impressive. Now this isn't a full-on software factory. Humans are still in the loop. Dominic and his colleagues are reading the code. And in this episode, we go quite deep on what that methodology looks like. So if you're a software engineer or you're responsible for software engineers, and you want to know what that journey feels like from a personal perspective and also what kind of methodology could give you that 72 times speed increase, definitely stick around for the full episode. It's a long one and we do get a little bit of kind of cabin fever and go slightly brain dead towards the end, but I hope it's worth it.

Deejay (01:23) Warchalowski, I think that's close enough. Thank you very much for joining me. It's good. And for context for the dear listeners, we were over at Odevo where you work

Dominic (01:25) Perfect. Yep. Thanks for having me.

Deejay (01:38) having a few beers at an after-work event a couple of weeks ago. And you were showing me on your laptop all the cool stuff that you're doing with a big software development project that's fully, like, agentic-first.
I think it's safe to say, kind of using AI in all the ways, requirements gathering, through to the coding process, which we'll get to and is really cool and exciting and why I was really interested in getting your story out there. But do you want to start by telling people a little bit about what you do and, for folks that haven't listened to previous episodes, what Odevo do, and then we can take it from there.

Dominic (02:12) Absolutely. Yeah. So I work at Odevo. Odevo is a property management company. I'm a lead engineer for a small team of five people in total. I've been working here for almost two years now. In three weeks, it will have been two years. And in February, we started this agentic journey, like you said. So.

Deejay (02:35) Cool. So you're now on this new project, which we'll get to in a bit, but winding it all the way back. I mean, I can't remember whether I met you in the first cohort of the training or whether you were in the pilot cohorts or whether you were in one of the later ones.

Dominic (02:53) I think I couldn't make it to the first one. So I think it was the first official cohort. And that was still when I was very much an AI skeptic.

Deejay (03:04) Yeah, and it would be good to kind of get your perspective on that because I think, you personally, both in your work, you've been on a journey, but personally, in terms of your approach and how much you've enjoyed working with the tools, it has been a bit of a journey. Like, where were you at before, like maybe last summer, before we started working with you folks? And when all the agentic coding hubbub was happening, what were you thinking then? And I mean, how long had you been developing at that point? And what was your kind of take on what you were hearing in social media and all of those fun places?

Dominic (03:38) Yeah, that's a good one. So I have been developing in C# since 2015, I think. So I had about 10, 11 years under my belt.
And last year during summer, of course, the whole AI thing was happening. There was ChatGPT, which was free at the time. And of course I did test writing some code with it that I could maybe put into production, and I did not have a great time at all. It constantly told me I was right. It never gave me the correct answer. On the same question, I could get three different answers if I just kept asking, are you sure? That did not give me a confident feeling that this was something that I could use at all in my job. I was not impressed at all. I didn't exactly take a decision, but it was more like: this is worthless, I'm not going to do this. At the time I think I was also kind of oblivious to what was happening on Twitter, LinkedIn. I was not following that at all. So I think it kind of went past me in a way until more and more people started using it here at Odevo. And I just couldn't understand why they were doing it because it was so bad.

Deejay (05:13) Yeah, so many people have had that early bad experience. I mean, you mentioned ChatGPT. So were you copying and pasting code into a web app? And then it's making some decisions kind of in isolation, which are very questionable by the sounds of things. Yeah, and

Dominic (05:31) Yeah, yeah, as we know now, it's all about context. And that I didn't know at the time: I just gave a single statement and expected a good answer. So maybe it was also an input error. But in the general sense of things, it was not a thing at the time at all, like agentic. I don't think I'd heard the word until like November or something. So there was a whole movement within Odevo where people started talking a little bit more about it, and I was still very much against it because that initial experience for me was so devastating for my trust with it that I just didn't want to give it another chance. No, it's just, yeah, this isn't great.
And then I think it was around November that Opus 4.6 came out. And so a team member said, just try it once more, it is really good. And then when I tried that after a long while of not touching it at all, I immediately had a kind of neck-hair-raising moment of, wow, okay, this is not the thing that I tried a couple of months ago. And still, from that moment we didn't go all in on it. Like, in our process we were still very much doing Kanban and just splitting our tickets and delivering smaller pieces of stories. And maybe, I can't exactly remember, maybe I used some AI at the time, some Claude, to do some deliveries, but not as a main driver or anything. It was still just manual coding.

Deejay (07:17) Sure, sure. Did you, between using ChatGPT and copying and pasting into a web interface and then using Claude Code and the new version of Opus in November, did you go through the awkward GitHub Copilot phase in VS Code? Like, I know a lot of people that have done that. And also, similar to your experience of using ChatGPT as a web UI, people that have used early versions of VS Code GitHub Copilot that were not very good, and they were basically just glorified autocomplete. And I know companies where they went through a whole training program with those very early versions of VS Code GitHub Copilot, and it kind of put everyone off. Did you go through that phase or was it straight from copy-paste to full-on hardcore agentic terminal?

Dominic (08:07) No, absolutely not. So maybe I said it a little bit wrong, because it was GitHub Copilot in VS Code with the Opus model.

Deejay (08:13) Okay. Okay, that wasn't you saying that, that was me making assumptions.

Dominic (08:20) Okay, so yeah, no, it was in VS Code and I was using it, and then when Opus 4.6, or 4.5 maybe it even was, became available in VS Code in Copilot, that's when it got real for me. So I don't think I touched a terminal until the training from Resync in January.
So before that, yeah, sorry.

Deejay (08:39) Okay. And with the, sorry, with the latency of transnational video calls. When you were using VS Code GitHub Copilot, and you were trying out Opus 4.5 for the first time, did you go straight to trying to use it in agentic mode? Had you been using it in agentic mode already, or were you using, like, ask and edit and a much more selective "just change this thing here, please" kind of usage of it?

Dominic (09:14) Yeah, now you're asking me details from like seven, eight months ago. Yeah, what I can remember is at first it was, like, kind of ask and edit, simple stuff. I don't think I ever got around to doing, like, real agentic work with it. So I think I just used it to plan stuff.

Deejay (09:16) Which is a million years in AI time.

Dominic (09:38) to create this file or refactor this file and make it use certain XYZ patterns instead, and then do one file at a time. So not like complete features in multiple files, but more like a per-file kind of process that I did. And it was not as much of a process other than: I don't know how to code this, maybe Opus knows or GitHub Copilot knows. So there wasn't that much thought behind it at the time.

Deejay (10:10) Got you. So at that point you were very much still in the driving seat. You're going around implementing a story and you're asking Opus to help you out with bits of it, and presumably reading the code afterwards and writing the tests to go with it, that kind of thing.

Dominic (10:25) Exactly. So it was basically just offshoring a little bit of the writing of the syntax and thinking of a pattern for me every now and then, but the delivery was still all me in a way. If that makes sense.

Deejay (10:42) Yeah, yeah, that makes sense. So that was in GitHub Copilot kind of around about November time.
And then we did the training together and you ended up using, or being shown, or being made to use Claude Code in the terminal, whether you liked it or not, for a little while. Was that a change that, did you stick with that? Or do you still like being in an IDE? Well, maybe we should, before we get to the present day, because I know the way that you're working now is reasonably different. Like, once you'd seen Claude Code in the terminal, did you immediately pick that up or were you still wanting to be in an IDE where you choose which files you get to see?

Dominic (11:21) I think it was kind of a mix, because I did have a colleague and he told me: download OpenCode and then you can use your GitHub Copilot license in the terminal to just use OpenCode. So I think it was just before the AI training that I started doing that a little, but at that time it was still like, I do stuff in the terminal and then I go to my IDE to look at the code, verify it, run it. At that point, I do think that I did 100% kind of, how do you say it? Like offshore, I'm not sure if that's the correct word, but the typing of the syntax 100% to OpenCode just before the training. Then we got to the training, and when we got to use Claude Code, we did have the basic subscription at the time. So that was not even like an hour of programming. Some of the exercises I wasn't able to finish because I gave a wrong prompt, then I had to redo it two times and my tokens ran out. So I went back to OpenCode. Yeah.

Deejay (12:19) Hahaha. Yeah. Yeah, when we were doing those exercises, it was the variability of trying to do the agentic coding stuff with people, of like either the agents come up with completely different ideas, or Amazon Kiro is working really slowly today and everything takes four hours when it should have taken 20 minutes. That was, yeah, that was fun.
So you've kind of gone from having a bad experience with copying and pasting, just seeing what LLMs in isolation, the kind of

Dominic (12:49) Remember.

Deejay (13:05) bad code that they can write, to starting to use agentic coding tools, realizing that Opus 4.5 as a model is really quite smart and, coupled with an agent loop, it can do quite cool things. And you mentioned that you were working in a kind of typical Kanban fashion of picking up a story, going and, as a human, reading it, implementing it, and then maybe asking for an agent to help you. Before you got to the big cool new project, which we'll still talk about in a bit, how did you find trying to tie together agents and the working practices that you already had? Did it fit together or was there a mismatch there?

Dominic (13:54) Good question. The thing that we immediately noticed was that, yes, we could absolutely take tickets from Jira and if they were well defined enough, then we could do a delivery for, let's say, a story, so not a full epic, but a story, a little bit of a bigger story, within an hour or two. And then, I'm not sure if we all in the team started doing it, but I started doing it, another colleague started doing it. And then suddenly we had bigger PRs because we batched the work in a bigger batch. And then we kind of got stuck on the PR step. And I think you had it also in another podcast: we already started feeling the bottleneck at that point. And we had just started doing this. The PRs didn't go through. Some of them were waiting for two, three days because there were just too many, and multiple people were doing a little bit bigger things at the same time. So yeah, we noticed that there was a kind of friction there directly.
And we quickly also said, I can remember at the time that I said, okay, if we're going to do this thing, even to this degree, then we need to change our process already. Our process was very much designed to be very small batches, short cycle time, basically not miniature stories, but really small. If we had to create a PR for every small little ticket, put it through Jira to every column like that, this is not going to work. We need to change something.

Deejay (15:52) Okay, so the way that you were working, just to make sure I've got that clear, the way that you were working before doing agentic coding, you were favoring small batches anyway.

Dominic (16:03) There was a side effect of, not being forced, but the process we had. If we take the agentic part out of the way, we were still very much a classic development team just doing normal classic development, and it was Kanban, small stories, splitting them, slicing workshops. We've tried flow as well, we had a whole training about that. And then when we started, like last year, October, November, when we started incorporating this a little bit more, then we found out this classic way of dividing up the work, taking smaller tickets, that didn't work. Does that make sense if I explain it like that?

Deejay (16:47) Yeah, yeah. No, it does make sense. And one of the reasons I wanted to clarify was just, you know, I have for the longest time been pushing people towards smaller batches of change. Like, you know, you've never had the misfortune of working on a backlog with me. Another customer that I'm working with at the moment does have the misfortune of me being backlog manager and writing tickets. And just the tediously, excruciatingly, pedantically tiny stories and tickets that I write have definitely caused some people to question my methods in the past.
But you know, that was always conventional wisdom before: make everything tiny so that each change is small and atomic, and so you don't end up with one uber story being stuck in progress for six weeks because only one aspect of it turns out to be tricky. Then coding agents can deliver so much so quickly that the transaction cost of picking up a story, raising a PR, getting someone to look at it, merging it, all of that overhead is so massive compared to the actual coding time, which has now shrunk down to a teeny tiny amount that I'm gesturing with my fingers that the listeners won't be able to see. So yeah, you had at that point been doing the training, maybe done the training, and you were feeling this kind of impedance mismatch between the speed of agentic coding and the ways that you had previously been working. How long after that? What was the time gap between that and the cool new thing that you're doing, which sooner or later we should stop teasing everyone about and explain in more detail? Did you carry on working in that fashion for a while? Did you make any changes within that team and within that process before doing the cool new project?

Dominic (18:34) Yeah, so I have to think really hard because this was February and that's also a long time ago, but at that time we were still working on the current product that's still live. And from that moment that we had the training, we went through it together, we sat down and we said, this process doesn't work. We need to change this. So then I also told our product manager and our agile coach: hey, this new thing that's happening, you've been to the training, you've seen what's happening. Everything that we just put together, like the configuration of our way of working that we did over the last year, I'm going to scratch that because this will not work for us anymore.
And not really knowing how we wanted to divide our work, we just basically started experimenting. And instead of taking a large epic and dividing it into 15 stories, we just took the epic and started putting it, a little bit naively, into Claude. Because I think at that time we did get our proper Claude subscriptions. So then we were like, you know what? Jira MCP, we can get the tickets and we're just going to work on it. So we did an experiment as well with a colleague where we had a pretty well-defined epic. Of course, there were some holes in it, but we went with it anyway. And we did this experiment where we would both sit with the exact same tickets, the same epic, and would go through the process of trying to deliver it at the same time, without any guardrails, without any proper system around it, and see how that would go. And maybe important to note as well, at that point we got an okay: do your process, see how you want to experiment, see how you want to do it. So we did this experiment. And for me, well, there were so many hallucinations, stuff that was not in the story. All the business logic, all the stuff was just put straight into controllers. My colleague had this thing where he told the agent before: hey, these patterns, they are in the codebase, but we're going away from them, so don't use them. And then Claude said, of course, I will not do that. And then when the production cycle was done and the epic should have been delivered, it of course used all the wrong patterns and those kinds of things. So maybe this is a little bit of a mix of both process and the experience that we had of, how do you say that properly? Like just putting it raw into Claude and just kind of vibing. It was kind of vibing what we did, vibe coding.
And yeah, that was not a great experience, to put it like that. And this was before we went to the new project. So that's between the training and the new project. Yeah.

Deejay (21:48) Yeah. Sure. Cool. Cool. And I think there's a couple of really interesting things in there. I mean, to the latter point, having some structure and some guardrails around it. The first thing that I was thinking is, you were talking there about recognizing that the process doesn't work anymore. Being able to talk to your bosses and your colleagues and say, hey, that stuff that we've been working on for the last year and all these new ways of working, well, I'm sorry, but they're already out of date and we need to be able to change those. That is a really nice position for you to be in as an individual, and also for an organization to actually be able to recognize that and act on it. There are so many stories floating around at the moment as more and more people start to adopt agentic engineering in one way or another. Kind of seeing echoes of the past, where people are trying to use existing processes, their existing SDLC, and just automate one tiny bit of it. And it's not working: either the PRs mount up and there's a huge bottleneck there, or the quality control isn't good enough. And 10 years ago, when the cloud was not new, but cloud platforms were certainly a new thing, I used to somewhat arrogantly think, well, it's all very stupid, people taking all this new technology and trying to do it in old ways, and don't they know the new ways? And I used to make a joke that giving people a cloud platform was a bit like giving someone that's only ever ridden a horse a Ferrari, them getting in the front seat, slapping the side of the seat and shouting giddy up and being surprised they don't go anywhere.
But what I think your experiment showed was that the old ways don't work, but we're not quite sure of the best new ways yet. And as an industry, we're learning that, and certainly as individuals trying this out for the first time. You gave it a go and you learned valuable information. And so, you know, sharing on a podcast or a conference talk. For the listeners, I've been nagging Dom into, I haven't checked whether you'd like to be called Dom or Dominic either. But I'm going to call you Dom anyway.

Dominic (24:06) That's fine.

Deejay (24:10) But yeah, I've been trying to nag Dom into giving a conference talk about all this. So if you think that he should, find him on LinkedIn and then tell him that his story is really interesting and that he should share this more widely. Yes. So that was, like, learning there at an industry level. And the fact that you are able to change means that hopefully you folks aren't going to get stuck in the rabbit hole that some others are in, where people are going, agentic coding doesn't work because it causes a bottleneck somewhere else. That's probably because you're not changing enough.

Dominic (24:40) Yeah, 100% agree on that part. [inaudible] So I think that's exactly it, because we were there as well. We had this whole process, which we have been working with for years. Over the years I've worked with Scrum, Kanban, and it's all kind of same-ish in a way. It's this kind of synergy between everything. They are good processes for what they meant at the time. And we were also trying to get this agentic way of working into that same old process at first. It wasn't like in one day that we decided this. It was, okay, we're going to use this and we're going to make the pull requests bigger. And then you find out that the PRs are not going through. And then at every step of the way, we kind of felt like, okay, an agentic way of working is not just trying to fit it into this process.
You have to design the process around the agentic way of working instead of the other way around. You cannot make the agentic way of working fit your process. In my opinion.

Deejay (26:05) Yeah. Yeah. I think that's being seen around the industry. It's Monday, the 11th of May at the time of recording. And a couple of days ago, I got a mail shot from the DORA group about a new DORA ROI report. And again, repeating for I think the third report in a row, that if you don't have the right fundamentals in place in your process, then it's not going to work. And also you need to change things to enable this to work well. So talking of changing things to get the most out of agentic coding, the thing that I was quite excited by when you were showing me your laptop and all the cool stuff that you were working on with other folks was just the end-to-end nature of it all. So do you want to introduce roughly what you folks are doing? I mean, as I understand it, there is a product that you folks have that's been built over five years. There's been lots of learning that's happened over that. It's a successful project. And to reflect back on the last episode, if you haven't listened to that one, dear listener, Tomas was talking about Odevo now making kind of bigger bets and realizing it can be more ambitious. So as I understand it, there's this thing that's been built over five years and grew in complexity. And I don't know about you, but whenever you build software, it's kind of like, if I'd have known that back then I probably would have done it differently. And so you folks are working on a V2, going full-on, full power, agentic coding all the way.

Dominic (27:53) Yeah. So yeah, I can share a little bit of background as well.
I haven't been 100% involved in all the background story of how everything went, but there are a lot of people here in the company that have been very interested in AI for a long time. And at a certain point it came up that, hey, maybe we could rebuild the system in an agentic way. Maybe now is the time to take a look at this thing that's happening. Agents are getting so good. A manager here, he did a rebuild of the system with BMAD. I'm not sure how long it took him. I don't think it was that long. And then he had a decent version running on his local computer that was just rebuilt from the ground up in a modern tech stack. And around that time, internally, we had a lot of good people with AI and a lot of people were already very into it, but this agentic way of working was not really a thing yet. So we looked outside and got a vendor who helped us, who had a lot of industry experience in general. Even if we take away the agentic development part, they have delivered software used by hundreds of millions of people over the years. And they have set up an agentic way of working that we are now also using. And how do I say that? So that was the moment that we got really into the agentic part: when we had this external vendor who helped us set this up, that's when the ball really got rolling. And we were like, okay, yes, we can do this. We have the old product, which we still have live. And at the same time, we can start rebuilding this at the speed of light, which is really, really cool.

Deejay (30:20) Yeah, the way that the logic changes around. For years and years and years as a grey-haired consultant, I've been saying, you know, don't do big rewrites. They're too risky. The calculus on that has changed now. You know, now that the cost of doing so has shrunk so dramatically and the speed at which you can move and get fast feedback has gone up.
There are many more cases where a rewrite is at least worth trying.

Dominic (30:34) They were.

Deejay (30:50) Which I guess in part explains what you're up to. And we should probably name the external vendor, give them props where credit's due. It's Endgame, end.game, is that how they stylize their name? So yes, I've been in a meeting with a couple of them and they're very knowledgeable folks. So if you need help like this, then definitely reach out to them, and also Resync if you like. So you've got this product that's still live and has been built over five years.

Dominic (31:01) Correct.

Deejay (31:20) And so, emboldened by seeing what can be done with agentic coding, with support from an external vendor, the training journey that you've gone on, finding that trying to do things the old-fashioned way doesn't get the most out of the tools, what was the mission for this new endeavor? What were you trying to achieve?

Dominic (31:43) I think one of the main reasons is development velocity. The product that we have, it's showing its age. And over the years, organic growth happens. And, as in every piece of software ever written, it always gets more complex to deliver at a certain velocity. So I think that is probably one of the main reasons that this is this really big thing. And also because of a new, fresh tech stack that is just industry standard, maybe not the correct word, but very, very modern in its way. And having this uncomplex codebase, which is just a breeze to work in, built from the ground up and designed for agentic engineering is... Yeah, I'm not sure how to put it in words, but if you compare it to the old-school way of developing, this is where you want to go. Does that answer your question? Maybe I rambled a little bit, but...

Deejay (33:05) No, no, no, that's all good context. And so the belief is that you can go much faster now.
How quickly, what's the objective? Everyone loves a good deadline, or at least certain managers love a good deadline. I used to give conference talks about why they're evil and wrong. And I think, along with estimates, so many other things have to go out the window these days. But what kind of time box have you put on trying to rebuild this system that was built over five years with the new way of working?

Dominic (33:36) So for the MVP, from the moment that the discovery work started, I think it's five months in total to deliver the MVP, to be on feature parity with the old product. And another six months, so December 2026, to be at the level where we predicted to be three years from now. That is the kind of time scale we're working at now. Does that make sense?

Deejay (34:12) So that is, yeah, yeah, that does. That's a total of, if you achieve that in December, that will be eight years' worth of development in 11 months. And is the team size the same?

Dominic (34:21) Correct. No, the team size is smaller. I think we were about 15 people before, divided over two teams, three teams even, sorry. And now we are eight people.

Deejay (34:40) And are those eight people all in one team or are those split as well?

Dominic (34:43) No, sorry, no, there's two teams. And we also have a person that is taking more care of the DevOps kind of stuff, who is not contributing with feature delivery, but is very important in all the other ways, like with the agentic part, where we sometimes don't have time to think about context management, cleanup of MD files, looking more into the agentic process itself. So we have, let's say, seven people working on features and one supporting role.

Deejay (35:19) Cool, cool. Gotcha.
Just like, you know, we're talking about what might happen in December, and I think June was the first sort of MVP date, but it's going well at the moment, right? It's not just that this is all speculation that this might work well. You're making substantial progress and you're on track with the MVP.

Dominic (35:41) Absolutely. Yeah, nothing else to say to that. If you would have asked me in November if this would have been a reality, I would have laughed at you. Yeah, it's that crazy and it goes that fast.

Deejay (35:53) Yeah. Yeah. So you're doing eight years' worth of work for 15 people with seven or eight people over 11 months. Let's talk about the process. This is the cool part where we'll hopefully nerd out. And we were talking before recording that a lot of material online is quite vague in these kind of parts. And people are a little bit scared of alienating people who aren't developers. Now I know you don't know the whole thing inside and out. So this isn't going to be a test, isn't going to be an exam. But it would be good to hear about the process and for you to explain all the cool things that I saw on your laptop and kind of end to end walk us through: what does it look like doing eight years' worth of work in 11 months, this quickly? What does that look like?

Dominic (36:31) Phew, okay. Where to start? I'm probably going to go a little bit back and forth here. But if we start at the very beginning, which I was not completely part of, there was a large analysis of all the codebases we had. I think we had like 15 projects with a lot of business logic, a lot of databases. And that was agentically, or with AI, analyzed to the last T. That took quite a while. I don't know the exact number, but I think it was around two weeks of full token burnage, so to speak.
⁓ From that came spec documents ⁓ and ⁓ Deejay (37:31) You Dominic (37:41) Again, I wasn't part of it. I'm this is kind of a little bit of secondhand information, but ⁓ Internally here some people from us went over the analysis ⁓ looked at what the agent produced did some corrections and Deejay (37:56) And is that, do you know if that was, was that looking at kind of coding style and summarizing the architecture? it trying to extract the business model of like, Hey, we've got these ⁓ data objects. We've got these SQL tables. These must be the entities and how they relate. Was it trying to like infer the kind of the logical model of the app, or was it more about the code underneath? Dominic (38:21) No, so if I'm correct, it was not so much about the code itself. It was, like you said, the schemas, ⁓ the data models, everything that kind of could deliver the business logic. So the business logic was derived from all the technical bits and pieces. So the coding styles, like we are not even, it's not even the same ⁓ tech stack anymore. So it was just business logic that was extracted and ⁓ then analyzed to if it was correct or not. Deejay (39:00) Got you. And I said, I wouldn't make this an exam. I'm absolutely making this an exam. Do you happen to know any of the tools that they were using? like a recent, we did a similar thing for a customer and I think we wrote a custom harness to like do a Ralph Wiggum loop of just run an agent infinitely kind of make a plan, do the first thing on that plan, do the thing, update the plan with the next step, exit. Dominic (39:05) ⁓ no. Mm-hmm. Deejay (39:26) And then an outer harness kind of starts that all over again. So you can just leave it unattended for days at a time. Was there something like that at play or was it more just kind of manually poking Claude code? Dominic (39:39) I think this was a proprietary piece of software from Endgame. So I don't know anything about it. ⁓ And I honestly don't know. 
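Note: the loop Deejay describes (make a plan, do the first item, update the plan, exit, and let an outer harness start the next run) can be sketched roughly as below. This is a minimal illustration and not the harness used by either company; `Plan`, `run_one_step`, `outer_harness` and `fake_agent` are hypothetical names, and a real harness would presumably start a fresh agent process per run and keep the plan in a file rather than in memory.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Plan:
    """The plan that survives between runs (held in memory here for brevity)."""
    todo: list[str]
    done: list[str] = field(default_factory=list)


def run_one_step(plan: Plan, agent: Callable[[str], list[str]]) -> None:
    """One inner run: take the first item, do it, update the plan, then exit."""
    task = plan.todo.pop(0)
    follow_ups = agent(task)      # the agent may discover new work while doing the task
    plan.done.append(task)
    plan.todo.extend(follow_ups)  # the next steps are written back into the plan


def outer_harness(plan: Plan, agent: Callable[[str], list[str]], max_runs: int = 100) -> int:
    """Outer loop: keep starting inner runs until the plan is empty or the budget is spent."""
    runs = 0
    while plan.todo and runs < max_runs:
        run_one_step(plan, agent)
        runs += 1
    return runs


def fake_agent(task: str) -> list[str]:
    """Hypothetical stand-in for a real coding agent call."""
    if task == "analyze billing schema":
        return ["document billing schema"]
    return []
```

The `max_runs` cap is an assumption added here so that an unattended loop cannot run forever.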
I just know they did the analysis. So I think this is the only thing that I really don't know the technical parts about. So sorry for that, if that's the first thing already. Deejay (39:52) Okay. No, no, that's cool. And I mean, I think there are probably many ways of doing that. And as with all of these things, there will be more and more tools appearing. And with agentic code, now everyone can make their own software. Like nobody is standardizing on anything anymore. Everyone is building their own, like, I'm going to build my own context management framework. And I'm going to build my own business domain understanding widget, Ralph Wiggum loop harness. Dominic (40:16) Really? Deejay (40:29) So, you know, even if you had used it all, chances are somebody there would have made it and nobody that's listening to this is going to use it. They'll probably just make their own. So the principles are probably more important. Dominic (40:30) Yeah, exactly. Yeah, and I also have a fun anecdote a little bit later down the line in the process where we did this exact thing that you just said about building it yourself. So I'll circle back on that later. Deejay (40:47) Cool. So there was the kind of two weeks of just throwing money at Anthropic and burning all the tokens, understanding the code base for the purpose of extracting the domain logic and the business logic into a model, because you were going to completely replace the code anyway. So you've got that. Do you remember what happened next in the process? Dominic (41:16) Yeah, so after that, and again, I wasn't part of all of it, but what I remember being part of was that then it was a lot of discussions, a lot of going deep into all the domain knowledge, all the domain business logic, those kind of things. And the cool thing here was, this was one of my aha moments, is that
all these discussions, everything was being recorded, transcribed, and during the meeting it was structured in a certain way so that it was easy for an agent to see the structure of the meeting just from the transcripts. The transcript was then fed into a kind of document system that would extract everything that was being talked about, every feature, and then could link that to existing documentation. So in every meeting, just by talking through it, no keyboard was used in this process. It was just talking, recording, feeding the transcript into this documenting system, updating the documentation and specs, and having a fresh, updated set of specs based on just talking for a full day. I'm just saying a full day, but I think it was probably a couple of weeks of just having meetings, talking, transcription, feeding it into the beast, and then having specs and all the files updated to the latest standards. Deejay (42:58) There's two things that kind of amused me about that. One is it seems like yet another example where if you are capable of being disciplined in the first place, you're going to benefit. And if you are not very organized, you're going to suffer. Like you mentioned having a well-structured meeting. I don't know how many people I've worked with over the years that are capable of having a well-structured meeting, normally because I'm in it and I'll just talk over the top of everyone and go off on tangents. Having the discipline to do that, not everyone is going to be able to manage. And then there's also a thing about trying to get everything right and as understood up front as possible. That's really interesting in that, you know, when spec-driven development first hit the scene, what, September 2025, ancient history, there was a lot of pushback of, this is just waterfall by a different name.
And I think that there's a bit of that that's true, but also the, the, again, the calculus of things has changed. Like we did agile and we just did things in small chunks because we're expecting the world to change whilst we were working on it. And then what was specified might have, you know, if it's going to take nine months to deliver something by the time you get to nine months time, everything might have changed. Your assumptions might be wrong, but if you can go from having everything specced and then in two days and agents written everything for you that that window of uncertainty between thinking and then implementing is very small. On the flip side of that, there's also though, sometimes you don't know what you want until you've seen it built. And then you're like, of course, you know, it shouldn't work like that. But in your case, you already had a working product. So I'm guessing that there was less of that. Dominic (44:41) Yeah. Correct. Deejay (44:49) need for kind of novel discovery. It was more just about clarifying like, we sure we know how this works? And like the agent has discovered the right thing. Dominic (44:59) Correct. there's a lot to unpack here, but I think an important detail here, what you mentioned, is that isn't this just waterfall? And yes, in this beginning phase, it was extremely important that some of the design decisions that were taken before regarding roles, permissions, the whole kind of backbone and schema of how the system was built, it needed an update. It was for where we need to go. It was not correct anymore. And most of the time was spent perfecting that basically just the schema behind everything. So roles, permissions, ⁓ hierarchy of certain structures of the data model. And that's... that's where a lot of time needed to be spent as well because it needed to be kind of perfect from the beginning, but because you don't want to find out three months down the line that your data model was incorrect. 
So, and yes, ⁓ I also have thought about the part where people say, okay, yeah, but this is just waterfall. And in a way, if you squint your eyes and you look at the outline of the process, yeah, it looks a lot like waterfall, but it's absolutely not. since you are not designing a complete ⁓ system from the ground up. You are taking subsets within the whole product, putting a lot of effort in the beginning of specing it, talking it through, thinking about all the edge cases, thinking about all the business rules that you... really want to have in place and then hand it over to your agent to do the implementation of it. But that's not a complete system. That's just a subsystem. Like it could be a single feature and it might sound like that you spent days in a meeting for a little thing. That's also not true. ⁓ there is a lot to unpack here for me as well because for some features it takes sometimes days to talk through it. Some features, you have a meeting with one person in 20 minutes, you talk through it, you feed the transcripts into the system and you build the feature. So it all depends on how important is the thing that you're building. Is it something regarding security? ⁓ You want to spend time on that. Is it something for your, ⁓ like I said before, the hierarchy of your whole data model? Yeah, you want to spend time on it. Is it adding a new page ⁓ with a form that can do two functions? You can spend an hour on it and you're done with the complete process. Deejay (48:06) Yeah, there's like, we could talk about this for hours, because it all gets quite philosophical really, in that like, I wonder how much that is human to human brain to brain doing like discovery and making sure that you understand things yourselves and like discovering through the act of communication, kind of like pair programming, you know, you would talk through ideas. 
And, you know, one of the things I, again, used to give a conference talk about was that measuring developer productivity is really difficult because it's discovery. It's not doing a thing that we already know how to do. Every time you do something, it's new by definition, because otherwise we could just reuse the code. So you're always grappling with the unknown. So I wonder how much of it is to do with that, of trying to discover the right answer, make sure that everyone's in alignment. And then the other part of that is about fundamental irreducible complexity. Like you mentioned security requirements and things like that. There are some systems that are just fundamentally complex; you can't make them any simpler. Stephen Wolfram talks about this idea of computational irreducibility, that some things just cannot be compressed beyond a certain point. And Donald Norman, who is big in the UX sphere and design circles, author of The Design of Everyday Things, Dominic (49:10) Mm-hmm. Deejay (49:30) wrote a follow-up book, Living with Complexity, about the difference between complexity and complication. Complexity is fundamental, and then humans have a bad habit of adding complication. Like when you get a code base and you're like, this is a mess, it could be refactored and we can make it so much nicer. That's an example of complication wrapping complexity. So with the different examples you were giving there, some things might take two days because there's discovery or they're fundamentally complex, and other things might be really quick because they're quite simple. I realize that wasn't really a question, so that's very hard to respond to. So you went through the process: the code base was analyzed, and then people are going through it and you're recording everything. You've got meetings in a well-defined structure. So presumably you're kind of making a point of talking about one thing.
We're now going to talk about this feature. So then, you know, send the transcripts and the agent can pick it up only talking about that feature. Dominic (50:29) correct. Deejay (50:32) And then moving on without kind of going around in circles and that kind of stuff. Is that the technique that you used? Dominic (50:38) Yeah, that is in general what we try to do. But of course, like you said, it's so hard to... People have ideas all the time and sometimes somebody comes, but what about this feature? Can we use this here as well? And then it kind of gets a little fuzzy while you're talking about it. to be fair... Deejay (50:43) Hahaha Dominic (50:57) especially like the not trying to make like ⁓ to advertise it, but the Opus models are extremely good at untangling that and putting the, even if you go crisscross through each other and talk about three features, it will end up 99 % in the right place ⁓ in the documentation. So ⁓ we've had meetings and it was complete chaos. And I was like, this transcript is unusable. And then still in the end it's like, Okay, well, this is perfect. Like I don't have to make a single edit to it. ⁓ Deejay (51:31) Nice. And in terms of the transcription, ⁓ and again, trying to make things tangible if we can, do you know what tools were being used? Were you using just kind of Google Meet Gemini transcription and did it understand all the domain specific terms or were you using something more kind of ⁓ targeted towards developers? Dominic (51:49) no, whatever could record a transcript. Slack can do it. ⁓ But we just going back to Microsoft Teams ⁓ using the transcript, even though the transcription isn't that great sometimes, like it just doesn't pick up on some words, but it's still. especially with like a good model, it will figure out what you're talking about. So it's just Microsoft Teams, ⁓ just using like, I put every meeting that I book, I make it record on by default, every meeting I do, because I want to have all the transcripts. 
And it's a bit of a fun story, because at first I just told my agent, hey, there's this VTT file, read it and find whatever features we talked about. And then here's the other document that it should go into, just do that thing. That got kind of tedious, and sometimes didn't work that great. So I built this app myself, just with my Claude subscription from work, of course, where I created something called Team Butler. With Microsoft, I'm not even sure what it's called, but it can log in, I think it's called delegated access. So it has access to all my meetings and it can go into all my meetings and fetch the transcripts there. And I created this little app that just runs certain prompts against all my meetings. So it knows what my standup meeting is by having a certain title in my meeting. I have another set of prompts that it knows to fire against the transcripts, and I have this whole web interface where I have all my meetings, and I have added Typesense to have a search index. It can go through all the transcripts and it's quite a nice UI as well. And it can identify which features we talked about. And then I have a table that I created where there is this feature called, whatever it's called, like documents. And then it shows me, on these dates, in these meetings, you talked about this feature; here you said this, but there you said that. You need to reconcile this ambiguity. And this is just handmade. Like there's no product behind this, it's just me writing a little bit. And it's not that advanced. It's just having some prompts, making sure that you tell the agent, hey, patch these features, and it can organize them for you. Deejay (54:26) Nice. I'm just thinking about some of the people that have the misfortune of working with me, whether they would like for me to have an agent that could spot all the contradictions I make from one meeting to the next.
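Note: the "which meetings mentioned which feature" table Dominic describes could be approximated as below. This is a rough sketch, not Team Butler itself: plain keyword matching stands in for the LLM prompts he runs, the function names and sample data are hypothetical, and the parser assumes a simple WebVTT file with no cue identifiers or NOTE blocks.

```python
import re
from collections import defaultdict

# Matches WebVTT voice tags such as <v Dominic> and </v>.
VOICE_TAG = re.compile(r"</?v[^>]*>")


def vtt_to_text(vtt: str) -> list[str]:
    """Return the spoken lines of a simple WebVTT transcript, without timing metadata."""
    spoken = []
    for raw in vtt.splitlines():
        line = raw.strip()
        # Skip blank lines, the file header, and cue timing lines.
        if not line or line == "WEBVTT" or "-->" in line:
            continue
        spoken.append(VOICE_TAG.sub("", line).strip())
    return spoken


def index_features(meetings: dict[str, str], features: list[str]) -> dict[str, list[tuple[str, str]]]:
    """Map each feature name to (meeting, sentence) pairs that mention it."""
    index: dict[str, list[tuple[str, str]]] = defaultdict(list)
    for meeting_id, vtt in meetings.items():
        for sentence in vtt_to_text(vtt):
            for feature in features:
                # Keyword match only; the real tool reportedly uses prompts instead.
                if feature.lower() in sentence.lower():
                    index[feature].append((meeting_id, sentence))
    return dict(index)
```

Listing every statement about a feature side by side, per meeting, is what makes a "here you said this, but there you said that" contradiction visible to a reviewer or to a follow-up prompt.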
They'd probably quite like that. So the fundamental tools you use are just the kind of free enterprise ones that everyone has access to. So Microsoft Teams transcription, and then you did some custom plumbing behind the scenes. So that will get transcribed and then Dominic (54:27) So. Deejay (54:53) What's the journey that all of that information goes on to become code, to become a running system? What happens next with those transcriptions? Dominic (55:03) Sure. So there's a little bit of overlap between periods, where we first did a very, very front-heavy refinement process, which looked more like waterfall. And then we have the process that we do now. So if I go back first to that initial period, I think I mentioned it before as well, we sat for two, two and a half weeks, just discussing a lot of features, going through them, talking about them, getting them in a scope file. So we had a scope file for this platform that we were building, and it had the 14 features. And every time we talked more about them, we discovered, we refined a little bit more about certain edge cases, certain functionality, and it all got added into this large scope document that was basically the high-level description of that platform. Then we have some specialized skills that can take a scope document, split up all the features that are in there, and create epics and stories from the scope documents. It will take all the details that we talked about and put them into the epics and stories, creating this folder structure that is just an epic and then a couple of MD files for each story. We go through those files. So we have epics and stories. And then we don't look at the epic. That's just for Jira, because that gets synced back to Jira. We only use Jira for the overlords, for the business people, so that they can still look at it.
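Note: the folder layout Dominic describes (one epic, then a markdown file per story) might look something like the sketch below. The path scheme, file names and headings are assumptions for illustration; the team's actual skill and templates are not described in the episode.

```python
import re


def slugify(title: str) -> str:
    """Lowercase a title and join its words with hyphens for use in a path."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def plan_epic_files(epic: str, stories: dict[str, list[str]]) -> dict[str, str]:
    """Map relative file paths to markdown content for one epic and its stories."""
    folder = f"epics/{slugify(epic)}"
    files = {f"{folder}/epic.md": f"# {epic}\n"}
    for number, (title, criteria) in enumerate(stories.items(), start=1):
        lines = [f"# {title}", "", "## Acceptance criteria"]
        lines += [f"- {item}" for item in criteria]
        path = f"{folder}/story-{number:02d}-{slugify(title)}.md"
        files[path] = "\n".join(lines) + "\n"
    return files
```

Returning a path-to-content mapping rather than writing to disk keeps the sketch easy to inspect; writing the files out would be a small loop over the result.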
But all the work we're doing happens only in our repo. So maybe as a good little side step here is that we have one main repo, kind of a hub repo. And that contains all our... Deejay (56:51) you Dominic (57:10) epics, stories, rules, skills, everything that is being used for all those Git sub repos that are in there. So it's kind of a mothership or hub. Deejay (57:20) And interestingly, you're using sub modules, aren't you? This isn't a mono repo in the traditional kind of flat sense of the term. You're using sub modules. And before we finish recording, you'll definitely need to explain why sub modules, because I'm fine with sub modules. I used them a lot 10 years ago. Everywhere I've ever introduced them, people have like chased after me with big sticks being like, why the hell did you do that? The training actually that we delivered with you folks, I structured as sub modules for reasons that seemed to make sense. Dominic (57:42) Mm-hmm. Deejay (57:49) I think it was something to do with Git. Yeah, was, I think I had an excuse, something to do with Git overlays of themes in one of the slideshows. so you've got this kind of Uber module with all of the, I guess, of context. So all of the epics, the stories, the schema, and then sub modules. So what's the, what's the rationale for that approach? Dominic (57:50) I remember Daniel. I remember. Correct. You mentioned it, context. whenever, like, so if we go back to an Epic and stories and they are in the top layer of the, let's call it just the system, so of all the repos, ⁓ there are a set of specialized skills ⁓ that we, from the Hub repo, we can always look down and take a story or a complete Epic with all its stories. and then make the agent think about all the sub repos we have. So we have a front end repo, we have a backend repo, we have another front end ⁓ module that connects to the same backend. So we have two front end modules. And then it can just figure out what goes where. 
Because when you actually start the development part, which is a little bit later, it can take a story. And I personally use Spec Kit, and then you can get Spec Kit to run for the correct repo from the hub repo. Maybe it sounds a little bit complex. I hope it comes over clear enough. Deejay (59:29) I think I'm following you at the moment. So we had all of the information that was transcribed, that got turned into a more structured format, which ends up in the Uber repo. And then we've got individual submodules for each deployable kind of component, presumably. I guess there's some microservices in there somewhere. The bit I'm interested in is you mentioned using Spec Kit. Like when we first spoke, I assumed that Dominic (59:45) Correct. Deejay (59:55) this was probably replacing some of those spec-driven development frameworks, but it sounds like it's not. So before we get too into that, we've got the Uber repo with all of the project-wide context for the whole system. It's Monday morning, I want to get my agent implementing some stuff. What happens next? How does it decide which Epic and which story to do? What does that process look like with you as the engineer kind of poking Claude saying, can you do some stuff now, please? Dominic (1:00:30) Yeah, so I immediately get these flashbacks to reading LinkedIn and Twitter or X, and having people think that there's this whole automated kind of agent system that does everything from end to end. In our case, it is still quite hands-on, in the sense that I don't poke my agent and say, do some work. Because what I do is sit with an Epic. So I take a complete Epic, a full feature, which could be two, three pages, a bunch of business logic. To set a little bit more context, we all kind of take an Epic now, not stories. We own every Epic. Also refinements, discovery.
And that's basically a one-person job at this moment. So just to give you a little bit of insight into what I did yesterday: I was going into my agent and like, okay, I have this feature that I want to do. It is one Epic, there's like seven stories to it. I take Spec Kit and I was like, hey, here are these stories. Which ones can be grouped together? Like you can find a couple that are purely for one frontend project, you group those together. You have a couple of them that involve the backend. You group those together. So you have a set of three stories and a set of four stories. And then you take Spec Kit and make Spec Kit break it down into even smaller pieces so that the agent knows how to take the stories and create workable chunks of work with it. Deejay (1:02:24) Yeah, that kind of makes sense to me in that you want the agent and its framework to be using the unit of currency that it's used to, the level of granularity that it's used to. However Spec Kit wants to break up stories, like, sure, Spec Kit, you do your thing. Dominic (1:02:38) Yeah, correct. And that's me. So I use Spec Kit and I use it in this way. And as I just said, flashbacks to LinkedIn, people talking about these automated systems. I also went into this expecting this kind of fully automated thing, but we have people using Spec Kit, like me and another colleague. And then there's someone who uses a Ralph loop, I think even with Beads. I don't use Beads, I just use markdown files. And then we have another one who just sits raw in the terminal. So it's not a very unified way of working, it's the, how do you, sure. Deejay (1:03:23) I mean, just to interrupt, that in itself is probably quite a powerful thing, right?
One of the things I think is a really interesting takeaway from this is, you know, you're on track to deliver eight years' work in 11 months with a third of the people, but still with humans in the loop. It's not a full-on end-to-end automated software factory. There's still oversight, and there's space for diversity of approach there. Dominic (1:03:29) Absolutely. Not yet. Deejay (1:03:53) So it's not that everybody has to do things the same way. The fact that the approach supports different tools and techniques, I think, is the strength, especially in an organization that's trying to learn, trying to figure things out, and where you want to bring engineers along on the journey instead of dictating to them, like, no, this is the one tool that you're going to use, and you have to do it exactly this way. That sounds like quite a nice feature. Dominic (1:04:19) Absolutely, and I want to zoom in on this, because I think you kind of hit the jackpot there. I have now realized over the months that, if we talk about the pre-agentic era, everything was kind of deterministic. Also the ways of working. Like there was a correct way to do your SQL. There was a correct way to write your C# code, and it needed to be this pattern. And that was also the way of thinking that we had, sometimes kind of blindly just accepting those facts, like, yeah, but here we use a repository pattern, here we use a service pattern, and yeah, but for this we use Kanban because this is our process, in kind of a rigid way. And we can work the way we do now, where one is using a Ralph loop and one is using Spec Kit, because all the pre-work, and I'm not sure if this is the correct term because I'm not a native English speaker, but the discrete deliverables that go into your agent, they are all the same. So the stories are kind of standardized, they follow templates, they have a bunch of properly structured acceptance criteria.
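Note: one way to keep story files "standardized" regardless of which tool consumes them is a small template check like the sketch below. The required headings are hypothetical; the episode only says that stories follow templates and carry structured acceptance criteria.

```python
# Hypothetical template: the section names are assumptions, not the team's actual format.
REQUIRED_SECTIONS = ("## Context", "## Acceptance criteria")


def validate_story(markdown: str) -> list[str]:
    """Return a list of template problems; an empty list means the story conforms."""
    problems = []
    lines = [line.strip() for line in markdown.splitlines()]
    if not lines or not lines[0].startswith("# "):
        problems.append("missing title")
    for section in REQUIRED_SECTIONS:
        if section not in lines:
            problems.append(f"missing section: {section}")
    if "## Acceptance criteria" in lines:
        start = lines.index("## Acceptance criteria") + 1
        criteria = []
        for line in lines[start:]:
            if line.startswith("## "):  # the next section ends the criteria list
                break
            if line.startswith("- "):
                criteria.append(line)
        if not criteria:
            problems.append("no acceptance criteria listed")
    return problems
```

A check like this could run before a story is handed to any agent, so that Spec Kit, a Ralph loop or a raw terminal session all start from the same shape of input.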
And that's where people kind of say, yeah, but that's so waterfall. But yes, you need to put the effort into getting the rigid structure in those documents because that's what the agent will follow. And whether the agent follows Spec Kit or a Ralph loop, the input is still the same. And that is what's so important for people to start realizing. We are also doing greenfield. So that means that our whole code base is built from the ground up to support agentic development. We have industry-leading kind of standards for testing, for structuring the code base itself, using an Nx monorepo still in there. Stuff like a set of golden rules that are not allowed to be broken. Everything from the ground up has been designed this way so that whatever agent you throw at it, the entropy is low, so you don't get a bunch of different outcomes. Deejay (1:07:00) Yeah, I think it's fascinating, the things that are the same but different, if that makes sense. You know, in the old days, a really well structured story would allow the implementing engineer to figure out, like, we need to cross the river, I will decide if I'm better at building bridges or boats. And there's an element of that here, but also you need the structure of Dominic (1:07:08) Mm. Deejay (1:07:28) these are the architectural patterns we're going to use, this is the data model. All of the agents have access to that context and are going to be on the same page, not necessarily about the bits inside that box. If you define the boundaries of the box through that context, the architectural standards and the data model and all those sorts of things, then the bit in the middle that is up to each implementer now is not so much how you write the code, but how you drive the agent, how you turn that into code.
But because the code is kind of constrained, the code that you get at the end should be roughly equivalent, whether you use Ralph loops or vibe coding or Spec Kit or whatever else. Dominic (1:08:01) Correct. Thank you. Exactly, and that's also the analogy that I'm trying to make now: the entropy of development is low in this way, where we have all these rules, we have kind of blueprints for how code should be structured. So it should use Zod schemas. The ERD is part of our repo. All the context is always available to make cookie-cutter code, or maybe not cookie cutter, but so that the agents always know how to structure the code in the correct way, because we have defined it up front. The things that we used to think about while developing the code, like you said, I'll decide how I cross the river. Like, am I going to use a repository pattern here or am I going to do CQRS here? No, no, that's all defined before the agent touches anything, because that has been decided up front. It is going to be this pattern, this pattern, this pattern. This code you're going to put in the controller. That code you're going to put in a service. You're going to split it by feature so that it stays a modular monolith. And there is not so much wiggle room for the agent to go off on its own path and create this chaotic thing, because that's what we did before. Like my first experiment, what I just talked about, is like, do this thing, and then you expect it to do X and it did Z. Deejay (1:09:57) Yeah. And we definitely have to loop back and try and rejoin the narrative as we're going through the system, because this is super interesting. No, don't apologize. Like this is fascinating. But where was I going with that? The approach is really defined and that guidance is consistent and you're working on a greenfield rewrite of an existing thing.
Dominic (1:10:06) Sure, sorry, Deejay (1:10:27) And I can imagine that one of the challenges that Brownfield projects have is having all of these different architectural approaches that like, we tried that this year and the next year we decided that. you know, Timmy writes his code like that and Sarah writes it this way. And, you know, all of that mess, like, and again, with entropy, instead of having a few well-defined things, you've got like this almost like white noise of totally different approaches through, from which no meaningful pattern. can be determined by the model. It can't kind of go, there's a strong signal. I should do exactly that instead. It's like, ⁓ I could do it that way or that way or that way or that way, or there's no guidance. I'll just make it up as I go along. Dominic (1:11:07) Yeah, that is, and. I don't even know how to answer there. Because I can imagine, like, I don't have the experience myself doing that for a longer time. I had these kind of three, four weeks where I did try this in the older code base where exactly like you said, we have a bunch of patterns that were mixed and matched over the years that do work. And it is a mixed bag of signals for the agent because it goes in, it reads the file and it sees pattern A and then it sees pattern B in another one and it tries to create this own kind of hallucinated version of it. So yeah, there is a difficulty there. And unfortunately, I'm not the expert at all, not even pretending to be in this, but ⁓ to give any clarity on that, on how that would work, what I would guess is that you can still... kind of distill certain patterns that you want and put it into a structured kind of rule sets for your agent to follow. And this is not a proven thing that I know, but... you want to give your agent as much standard structure as possible. So golden rules, anti-patterns, log all the anti-patterns. 
So if you know that your code base has certain patterns that are really bad for that type of code, create an anti-pattern rule, a markdown file with those patterns. Give the example, give also an example of how that should be corrected instead, and feed that into your CLAUDE.md file. With a little @ sign, you can make the agent pull in those files while it's loading the CLAUDE.md file. And then you can try to get more structure for your agents to start working in that way. It will probably not be perfect, but it's going to be better than just telling your agent, don't do this thing, while you're writing in your terminal. So vibe coding in a brownfield project, I think that is the worst thing that you could do. Deejay (1:13:29) And I'm sure we will as an industry be seeing a lot of evidence to support that in the coming months and years. So there's been big up-front discussion and discovery, or discovery first through the code base. What does the model look like? Not the structure of the code, but the structure of the relationships between entities, what is that supposed to look like? Then you talk that through in structured meetings, you transcribe that, put that into a big spec. Dominic (1:13:50) Yeah. Deejay (1:13:54) And then you've got a skill that breaks that down into epics and stories. Then it's down to each individual developer on your team to pick up an Epic and then hand it off to the agentic coding tool of their choice or the spec-driven framework of their choice to actually go and implement that. What do you do with testing? Does everybody come up with their own way of testing? Have you got a gold standard of, this must have an outside-in acceptance test from the perspective of the user? How does the quality assurance side of things work? Dominic (1:14:25) So we have a test coverage requirement of 80% for every pull request. So for example, with Spec Kit, I
I can't remember the exact steps now from memory because I just type it in my console, but there's like six or seven steps that you go through. Then you get to implementing, and we have, I think it's also a specialized skill, a QA or testing skill, that we can run either on the code that we have already generated or before you start implementing the code, and that follows a set of standards that is predefined. So it's not a per-epic kind of thing where I go, "Oh, what am I going to write for tests this time?" No, it's like the testing pyramid: I think 5% or 10% end-to-end, 20% integration tests, 80% unit tests. There's a whole bunch of other, stricter instructions in the testing skill. That is predetermined, and then it just goes off and starts writing all these tests. So that is 100% part of the process in doing this delivery, because with some pull requests, I had one a couple of days ago where, without tests, I had like 55 files touched. And that's a big pull request, that's massive. Deejay (1:16:03) Yes. I was about to ask, do you have humans doing code review? Do you get an agent to review first before getting a human to do it? Because that sounds like a big old PR. Dominic (1:16:16) That's a mood breaker if somebody gets that. No, there's just so much to unpack. But also during the process, when I do Spec Kit and then we implement the code, you test around, you create tests, like you let the agent write tests. And then I also use the Anthropic PR review toolkit, personally. Deejay (1:16:19) Yeah. Dominic (1:16:42) And that last time, I'm not sure what my agent did, but I had to do like seven review cycles. And if you've used the Anthropic PR review toolkit... I don't think our CTO was happy with how many tokens I burned that day, because I ran out of my Claude plan in 20 minutes. My five-hour limit got reached in like 20 minutes, but then it did like seven cycles going through it and found some critical issues.
Deejay (1:16:55) Nice. Dominic (1:17:12) And that's a good thing, because with that toolkit you can also instruct it to take a look at your own rules that are defined in the repo. It has kind of a feedback loop where it looks: okay, the agent before me, did it use all the correct rules? Did it avoid all the anti-patterns? Did it follow all these coding standards? And then it goes in this loop, checking all the code that has been generated until no more critical, high or medium issues are found. Then I open a pull request. And at that point, I am not going to ask my colleagues to go in there. So at that point there is not an extra set of eyes from the outside, but the effort that went into producing this code, with all the guardrails, with all the checks, with all the testing requirements, gives me the confidence for now to not require a colleague to take a look at it, because it's been flipped over like seven times already. Then, I promised at the beginning of our talk to get to this cool little thing called Rabizzard, which is our own kind of version of CodeRabbit. I'm sure you know it. So we tested a little bit with CodeRabbit and Greptile, these PR review products that you can hook into your GitHub. And one of our colleagues had a brilliant idea of, let's just try building this ourselves. So he went off and started building his own version of CodeRabbit and Greptile, hence the name Rabizzard. Yeah, it's a genius name. And that is our kind of... Deejay (1:18:48) Yeah. Dominic (1:19:06) ...basically another PR review toolkit that runs in GitHub, and it's able to check the code. I think it has like 15 different prompts that it fires at the code that we submit in GitHub in a pull request. It looks at security and a bunch of different things. I cannot even name them all right now.
And it will post comments for every mistake or potential issue that it finds, with different gradings. You have red, or critical, issues, and then orange is like medium issues. And it posts a prompt or a set of prompts for each issue that it found. So let's say even after all those cycles of the PR review toolkit from Anthropic, it still finds like three issues. It will also output a prompt for those three issues. I do look through those, like, "Hey, does this make sense?" I look at the code, like, "Oh yeah, that could be something." And then I just take the prompt from Rabizzard, put it back into Claude, and then it either finds something or it doesn't. And then in the GitHub interface, in the comments, I can reply back, "This is a false positive because X, Y, Z." That input goes back into Rabizzard and then it takes that learning. It has kind of its own eval database for that project, with all the false positives that I, for example, have reported back, and it will know for the next time: this is just how we write something, or this is just how we design it. So it's a really cool kind of in-house built thing. And it started out just for our team, because we didn't use CodeRabbit, we didn't use Greptile, but we had our own thing. And now a lot of other teams inside of Odevo are like, "Hey, can we get Rabizzard?" Because this is a really cool thing and it has helped us a lot over the last few months. Deejay (1:21:05) There is something deliciously amusing about a SaaS company building their own tools so they don't have to use another SaaS company's product. And how long did that tool take? Was that weeks' worth of effort? Was that a couple of days for somebody with an instance of Claude Code, you know? Dominic (1:21:13) It's really fun. I would have to ask my colleague. I don't exactly know. I think the first version was like a day, maybe. I'm sorry, Frederick, if I'm saying it wrong.
But maybe a couple of days, I don't know, for the first version. And he has been tweaking it over all these months, just improving the product, adding a new feature here, adding some better evals and keeping it up to date, adding new features where you can chat with it in... Deejay (1:21:35) Hahaha. Dominic (1:21:56) ...in GitHub. That was not a thing at first. It would just output "this is wrong." But now you can chat with it, and it can go back to the code, and you can have a conversation with it in GitHub if you don't trust its analysis. Deejay (1:22:13) And do you happen to know where it stores that information? You mentioned that it learns from when you tell it that it's made a false positive. Do you happen to know, is it just storing things in some Git repo as Markdown, or is there something more complicated and fancy, like vector databases and that kind of thing? Dominic (1:22:35) No, I'm not entirely sure on the implementation side, but I think it's just a bunch of MD files in the end where it just stores that. Deejay (1:22:41) Cool. I mean, if Odevo and Frederick want to open source it... I work with a few companies and I know one that would be quite interested in that for sure. If you can share the love, then maybe someone might use it. Maybe they just build their own because, you know, these days finding a thing and understanding it is harder than just building it with an agent. So... Dominic (1:23:04) I will have a chat with Frederick. Deejay (1:23:07) Yeah, yeah, the more open source, the better. So we've got this kind of process where each individual developer is tackling an epic, implementing it the way they want. You are asking the agents to write tests. You write some yourself and you're reading those tests. You're involved. You defer most of the code review to the PR review skill published by Anthropic. You do that for several iterations,
so that you have confidence in it, and you're reacting to those bits of feedback. And then you can raise a PR, and then this automated tool gives it one more check over, or several: 15 prompts, I think you said. You can check to see if any of the things it said are legitimate. And then presumably you're good to merge, and you're merging your own pull requests because you know they've been reviewed by a million agents. Dominic (1:23:56) Correct. At that point... Deejay (1:24:01) Yeah. Yeah. They've had quite a few virtual sets of eyes on them. And so at that point it's merged. How does updating the context in the Uber repo work? Is there some kind of automated process that's looking for changes in the latest stories? Or when you're implementing something, is it updating, I don't know, one of the ERDs or the SQL schema? Does that... Dominic (1:24:26) So yeah, sorry. That's a really good point, because I am going back and forth in the process here, since there are so many things that go into it. We treat it this way especially up until a feature is delivered: when we merge to main, at that point the full Hub repo, the main repo, has to be in sync with what has been delivered. Because sometimes what happens is that you refine the heck out of a story, you come to implementing it, you take a look at it and you're like, "This is just not correct," even though we went over it. The stories are in the Hub repo, they are committed to main. So we go back to the stories and we update them, to make sure that the stories and the specs, everything, match reality. Also, if we have schema changes or we add a new entity, we add it to the ERD, we add it to the schema file, so that it is the source of truth for the system and it can be reused as context for the next story or epic
by another person, so we don't have to let the agents go looking through the code for the truth, because that's where you will eat up a ton of context every time. So yeah, we... Deejay (1:25:53) Yeah, kind of trawling through the code base. So do you do that manually? Like, the change has been made as part of delivering one of the stories in the epics that you're responsible for, so you make sure that that's happened up in the Uber repo? Dominic (1:26:08) Correct. And we treat it as kind of a ledger as well. Maybe this is kind of an add-on and not a perfect answer to what you asked, but we treat the stories and epics as a ledger, in the sense that once we have implemented something, it has gone to main. So everything is merged. We have delivered the feature. And if we have to do an adjustment to the feature later, we create a new epic and story for it. So the original intent is still present in the spec files, but a new epic and story also exist to log the change. Because I remember so many times that you go and look at a piece of old code in the legacy system and you're like, "What could have been the intention here? What could possibly be the reason for this code?" And that is also what this prevents: the truth is in the repo, the whole history of everything is in there. You can always go back. You can take a look at it. The truth never changes because it's just there. Deejay (1:27:24) Yeah, and I remember us talking about that, actually. And I found it quite interesting because I've been playing with a kind of convergent approach of: screw the delta, screw the diffs, just try to make a change and ask the system to converge on it, kind of inspired by things like Kubernetes and its control loops.
But I think that only works if you have a limited number of people changing things at once and the changes are quite small. And there's plenty of space for lots of different approaches to work. So that's an interesting difference there. So you manually update the important parts of the context and make sure that that's up to date. Are there any other parts in the loop? So you've done that, you've implemented a story, you've done all the code reviews, it's been merged, you update the Uber repo. Is there anything else in the process, or has that taken us full cycle, to where presumably, once enough of you have implemented enough epics, you get back together and you talk about the next big set of features that are required? Dominic (1:28:41) Honestly, I think this kind of contains everything in the process, but there are so many more things that go into it. On the individual parts, I think we could talk here for like four or five hours easily. Deejay (1:28:50) Hahaha. Yes, I am mindful of the fact that it's nearly quarter to eight in Sweden and dear old Dominic is staying late after work to talk to me. Dominic (1:29:01) Yeah. No, that's fine. This is great fun. So I'm just looking, because I have a couple of notes here. Maybe there's something that goes into it that I glossed over. I probably did. Deejay (1:29:10) Yeah. Dominic (1:29:24) Let's see, agentic way of working. Maybe one thing that didn't get enough love is the discovery part, even before we get to refining stuff. Because at first we had an existing product that we could copy-paste, basically, almost. Of course, with a lot of adjustments that were needed, but it was still a case of "we know we need these features." So that was not really a big part of the discovery. But we're going so fast now. And I'm not saying we're running out of epics to do, but we are going to get to a place soon where we need discovery.
So we have a product manager who is also very much into the terminal nowadays and has a lot of skills as well with Claude. And she created her own discovery pipeline. That is a set of six skills that she designed, that can be used in the process of doing discovery work: taking user interviews, getting transcripts, translating that into kind of a scope document before we even get to our scoping, if that makes sense. So it is more like all the discovery work for the product manager. So there's a research analyst, which takes raw inputs, like user interviews. There's a scope writer, which takes that analysis and transforms it again into a new structured document. Deejay (1:30:54) Yeah. Dominic (1:31:11) A scope refiner, which takes the original scope, goes over it, tries to find ambiguities or contradictions in the original user's explanation of stuff, and then asks directed questions: okay, the user said this, but also this, how can we merge this into a correct answer? From there, there's an epic writer and a UX skill. We're now also going into the phase where we want to focus on more UX work that can be a part of the discovery process. And there's a bunch of other skills as well that are specialized in certain steps, which might not be part of a pipeline, but can be used in kind of distinct ways in their own right. Maybe that's a little bit vague, but I hope it comes across as something useful. Deejay (1:32:13) Yeah, I mean, I think one of the things that's evident there is that it is about more than just the coding process. And, you know, when you think about all of the other disciplines, like the UX designers and the product managers, and the amount of distillation that those folks have to do, and framing the right problem before you even start trying to tackle it, there's a lot of other stuff that can be helped massively... Dominic (1:32:22) Absolutely.
Deejay (1:32:43) ...by all of this, and it then becomes less surprising when you think that, okay, just sticking a coding agent inside the middle of a very long, complicated existing process, surprise, surprise, doesn't give you the massive gains that you folks have had. To repeat the point: eight years' worth of work with a third of the people, taking 11 months. It's a shame that years last 12 months and that the mathematics on that is a little bit difficult, but after we're done recording, I'll try figuring out the percentage speed increase there. But yeah, it sounds like a full end-to-end thing. And I do wonder whether people have to become more used to this kind of wholesale change rather than trying to incrementally evolve processes. Dominic (1:33:12) Yeah. Deejay (1:33:36) I think a really interesting experiment would be: how do you evolve a code base to what you're describing? Maybe people can't do a full rewrite. Maybe they can, and maybe they need to check their assumptions about that in light of new evidence of how fast things can go. But if folks can't do a complete rewrite, can you use agents to slowly reduce the amount of entropy in your code base, exactly like you said: reduce the number of needless variations, reduce the amount of complication and just get it down to that irreducible complexity, so that you've got a nice stable foundation to do the kind of work that you folks in Endgame are doing in this process? Dominic (1:34:22) Absolutely, and I think that is especially true for brownfield projects. Again, that is going to be the most important part: to reduce the, how can I say this correctly? Tame the agent, or the context of the agent, so that it can only have a handful of outcomes. That will... constrain, thank you, constrain it. Deejay (1:34:45) Yes, constrain it.
Dominic (1:34:50) And that is going to cost something. What I know from experience with legacy code bases is that you will need to rewrite certain parts of the code base. You will need to try to standardize patterns and get it into a standardized shape before, I think, you can really start to get the benefits from this way of working. You need rule sets, you need standardization for your agent to be efficient at this. So yeah, I think with a moderate to high amount of pre-work it will be possible in legacy code bases as well. It would need a very directed way forward. It would need refactoring work, and I know from experience as well that you don't always get time to refactor stuff just with the hope of making it work for your agent. So I can imagine that for a lot of companies it might not sound like an option. But if you can get that done and get your project into a better state, then even if your agent might hallucinate sometimes, you can still get massive benefits if you can get this end-to-end agentic mindset over your whole process. Because it's not like anything I've worked with before. Yeah, of course, the individual steps are still there: you still need to write a story, you still need to have an epic, you still need to "develop" the code, quote unquote. But it is not the same thing anymore at all. Personally, at work, I have not written a single line of code manually since January, right before the training. I have not written a single piece of syntax. Deejay (1:36:44) And how do you feel about that? I mean, it sounds like you're doing some stuff in your spare time to keep your skills sharp. Do you feel sad about that? Are you happy because you get to think about higher-level things? Do you still enjoy the things that you're doing on a day-to-day basis in your job? Are you enjoying those as much as you were writing syntax previously? Dominic (1:36:55) Hmm.
So there, I think we already had this conversation once in Slack. No, it was in Slack. And I can remember in the beginning, I went through the five stages of grief, because of my identity as a programmer. Like, I am a coder, a programmer. I love writing syntax. Deejay (1:37:18) Was this over beer? Dominic (1:37:39) That kind of disappeared from one day to the next. My value, what I could add by being able to write good code, suddenly had no value anymore. So personally, I had this whole rant once in the team where it got to me and I kind of got emotional, but I told my team I experienced a certain ego death, in that this whole ego, this whole person that I was because of this thing that I did, was just gone from one day to the next. So yeah, it wasn't easy in the beginning. And I certainly have had my doubts over the process. Like, do I even like this? Is this really fun? But yeah, humans, and especially me, can be quite rigid in the patterns in the brain, so to say, in getting used to stuff. I did this work for 12, 13 years, I think. I was a coder and now I write prompts to an agent. Just to be completely blunt, no, I did not always like it. But at this point, after a couple of months, it is really fun. I did get to the point again where I'm like, "I delivered this whole epic, and I did the refinement. I did make sure that it was done correctly." I am still in the loop for now. We'll see how it is a year from now, or six months. You know how fast it goes. But it is really fun, especially when you have a system that works this well, where you can trust your agents for 99%. I'm not pretending there are zero hallucinations ever. There was this example, just as a tiny side story, where we had to implement this table and you needed to be able to delete a document.
And I think there were three delete buttons in the table to delete the same document. So that stuff still happens, but being able to trust the process and the system to such a high degree makes the work fun, because you can focus on diving into this refinement, going into depth with the designer to go, "Hey, does this really look correct?" and just produce three different prototypes within 10 minutes, like, "Hey, let's try this layout, let's try this layout, let's try that layout," compare them, put it into your delivery and just deliver that thing. It's a different kind of, what's the word, a different kind of satisfaction that you get, because it's not the same dopamine pathways that I got from typing the code and doing the heavy technical thinking myself, but there's a lot of other rewarding stuff that comes from this. Deejay (1:40:40) Yeah, that makes perfect sense. And I think there'll be some people that really like the technical solutions. As you were speaking, the thing that was going through my head is that it's kind of like, instead of delivering code, you're delivering features now. You're delivering productivity, like big chunks of stuff. You know, when you're programming and you get something that's really nice, it's really neat, and it's elegant, and it's totally reusable, that's a nice feeling. And I think we need to substitute that with the satisfaction of, "I've got a whole bunch of work done and it's implemented the right way." I know I certainly had a moment when I had a vibe-coded project and it worked, passed all the acceptance tests, did what I needed it to do. But the code was just like, "That's not how I would write it." And so I vibed a bit more until it looked like code I would have written, bearing in mind I don't think anyone that's ever worked with me would say code that I've written would be any good. But it looked like, you know, the architecture that I was happy with.
And at that point, all of a sudden, I kind of became, not proud of it, but I felt like it was mine. I felt a kind of connection with it, like I was involved. So I can imagine that once you've got all those guide rails in place... I'm trying to avoid going into convergent evolution and taking us even longer into the evening, but like the guide rails that you were talking about and having all that context: things keep evolving to be crab-shaped because of the laws of physics and the biology and the ecosystem around them. A crab is a good solution to this. And I think with the guide rails and the context, you want your agents to be able to convergently evolve the code base to what it should be doing without coming up with crazy solutions. Dominic (1:42:19) Mm. Deejay (1:42:32) This is probably a good sign that we should wrap up, because I'm rambling and talking absolute nonsense now. Was there anything else that was particularly important to you that you wanted to say or wanted to share with people? Dominic (1:42:36) For sure. That's a very good question. Not really at the moment, I'm fully consumed by this way of working. But yeah, maybe one takeaway is that I think I can be considered one of the most skeptical people that have worked at Odevo regarding AI. I have been vocal against it and... Deejay (1:42:47) Have you got any side projects that you want to promote? Dominic (1:43:15) ...if you have any doubts about the effectiveness, correctness and feasibility of it, with the correct kind of system to work in, all those kind of... where was I going? These preconceived notions that I had, they all went out the window. So it is important to give it a shot, and you have to change your mindset. That is really important, because it is a different way of working. And it is, how do people say it? A new era, and you have to adjust yourself.
And manual coding, like you said before, there are going to be some pieces of software or complex systems that will need that manual attention. Okay, well, this just went completely off the rails, sorry. Maybe you... Deejay (1:44:21) No, no, it's fine. I'll be so bold as to finish your sentence. You know, there are things like missile guidance systems that want to be mathematically proven and all of those kinds of things. But a lot of what we do is moving strings in and out of databases. And I think that's more about the business logic and those sorts of things: we need to be sure about the business logic rather than the syntax, necessarily. But no, I think it's important, and I'm glad that you were willing to share the journey that you've been on, because I think there are not so many developers that are willing to talk about the change of mindset, and not so many that have actually gone on this journey in an enterprise where they are trying to deliver features. There are loads of people like me on podcasts and YouTube who are doing bits of consultancy, like, "Oh, well, I made a little toy project on the side and it worked really well, so it must work in a massive, several-thousand-person enterprise." But it's different when you're delivering features. So I think that's a really valuable thing to share. And I appreciate you staying late. It is now one minute past eight in Stockholm at the time of recording. And I look forward to hearing your conference talks. I'm going to... Dominic (1:45:36) Sure. Deejay (1:45:44) In the outro, I'm absolutely going to encourage people. Actually, I don't need to do it then, because I'm doing it now. Listeners, you need to find Dominic and his complicatedly spelled Polish surname and then harass him into submitting this as a conference talk.
Dominic (1:45:55) Cool, thank you so much for having me, DJ. Deejay (1:46:00) It has been great fun, and you win the award for the longest podcast yet. So that's another thing for you. Deejay (1:46:10) Hopefully you are still with us, didn't fall asleep and didn't get too frustrated about us going off topic and starting to ramble towards the end. If you did enjoy this episode, then I think it'd be really lovely if you could reach out to Dom and give him some feedback. This is the first public speaking of any kind that he's done. And I would really like it if he gave this as a conference talk, you know, with full diagrams and code snippets, so it's slightly easier to follow than just listening to two people talk. So any encouragement you can send his way, I think, would be a lovely thing. If you have any feedback regarding the podcast generally, then you can email wavesofinnovation@re-cinq.com. That domain name is R E dash C I N Q dot com. And otherwise, be good to each other and you'll hear me in the next one.
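
Editor's sketch: the merge gate described in the episode (an 80% coverage floor per pull request, plus repeated agent review until no critical, high or medium issues remain) could be modeled roughly as below. This is a minimal illustration under stated assumptions, not Odevo's implementation and not the API of the Anthropic PR review toolkit or Rabizzard. The names `Finding`, `review_until_clean` and `ready_to_merge`, the severity labels, and the `max_cycles` cap are all hypothetical.

```python
from dataclasses import dataclass

# Severities that trigger another review cycle (assumed labels).
BLOCKING = {"critical", "high", "medium"}
# Per-pull-request coverage floor mentioned in the episode.
MIN_COVERAGE = 80.0


@dataclass
class Finding:
    severity: str  # e.g. "critical", "high", "medium", "low"
    message: str


def blocking_findings(findings):
    """Keep only the findings that should block a merge."""
    return [f for f in findings if f.severity in BLOCKING]


def review_until_clean(review, fix, max_cycles=10):
    """Alternate review and fix passes until no blocking findings remain.

    `review` and `fix` are hypothetical callables standing in for an agent
    review pass and an agent fix pass. Returns the number of review cycles.
    """
    for cycle in range(1, max_cycles + 1):
        blockers = blocking_findings(review())
        if not blockers:
            return cycle
        fix(blockers)
    raise RuntimeError(f"still blocked after {max_cycles} review cycles")


def ready_to_merge(coverage_percent, findings):
    """Merge gate: coverage at or above the floor and nothing blocking."""
    return coverage_percent >= MIN_COVERAGE and not blocking_findings(findings)
```

The cap on cycles is a design choice added here so the loop cannot burn tokens indefinitely; the episode only says the loop runs until no critical, high or medium issues are found.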

Episode Highlights

Dominic transitioned from AI skeptic to leading a team delivering code 72 times faster.

Traditional Kanban processes bottlenecked quickly when combined with the sheer speed of agentic code generation.

Engineering processes must be redesigned around AI velocity rather than forcing AI into old workflows.

Odevo is rebuilding five years of legacy software in just eleven months using agentic workflows.

Heavy upfront discovery and strict acceptance criteria are now critical superpowers for reducing AI entropy.

Hub repositories and submodules provide precise context boundaries for agents navigating complex microservice architectures.

Standardized inputs allow engineers to use diverse agentic tools like SpecKit or raw terminal commands.

Share This Episode

https://re-cinq.com/podcast/integrating-ai-into-development-processes
