<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:media="http://search.yahoo.com/mrss/"><channel><title><![CDATA[Cody Django Redmond]]></title><description><![CDATA[Public expressions from a software engineering manager, sci-fi enthusiast and synth noodler.]]></description><link>https://codydjango.com/</link><image><url>https://codydjango.com/favicon.png</url><title>Cody Django Redmond</title><link>https://codydjango.com/</link></image><generator>Ghost 5.53</generator><lastBuildDate>Mon, 13 Apr 2026 14:56:38 GMT</lastBuildDate><atom:link href="https://codydjango.com/rss/" rel="self" type="application/rss+xml"/><ttl>60</ttl><item><title><![CDATA[AI Documentation Dividend]]></title><description><![CDATA[<p>A friend recently asked me what qualities make a great engineering manager, and one of the things I mentioned was risk management &#x2013; specifically, the value of investing in design artifacts like system diagrams, reference architectures, and clear interface boundaries before diving into execution. In my experience, the hard part</p>]]></description><link>https://codydjango.com/ai-documentation-dividend/</link><guid isPermaLink="false">699dc5d627058428dbad0246</guid><category><![CDATA[architecture]]></category><category><![CDATA[agentic]]></category><dc:creator><![CDATA[Cody Redmond]]></dc:creator><pubDate>Tue, 24 Feb 2026 15:39:58 GMT</pubDate><content:encoded><![CDATA[<p>A friend recently asked me what qualities make a great engineering manager, and one of the things I mentioned was risk management &#x2013; specifically, the value of investing in design artifacts like system diagrams, reference architectures, and clear interface boundaries before diving into execution. In my experience, the hard part was never knowing these things were valuable. 
The hard part was convincing anyone to pay for them.</p><p>That dynamic is shifting in an interesting way. As organizations start investing in agentic AI workflows for software development, there&apos;s a new top-down pressure to produce exactly the kind of context-rich documentation that engineering managers have been quietly advocating for years. Agentic systems need well-defined boundaries, clear interface contracts, and accurate architectural references to make good decisions autonomously. Leadership is now asking for these artifacts because AI needs them &#x2014; and that&apos;s creating buy-in that was historically very difficult to generate.</p><p>The irony isn&apos;t lost on me. Human engineers have always needed this context too. The difference is that &quot;invest in documentation&quot; never made it past a quarterly planning meeting, while &quot;enable our AI agents to work effectively&quot; apparently does. I&apos;m not complaining... I&apos;ll take the win! But it&apos;s worth naming: if your organization is finally building out these maintainability artifacts for AI, make sure your engineers benefit from them just as much.</p>]]></content:encoded></item><item><title><![CDATA[Happy 2026]]></title><description><![CDATA[<p>I&apos;m due for an update: Joni is now three years old, and doing great. The last few years have blown past us. I&apos;m happy and healthy. My job is engaging, and rewarding, and challenging. Leading a platform team is satisfying, and I&apos;m blessed to</p>]]></description><link>https://codydjango.com/happy-2026/</link><guid isPermaLink="false">69570ea327058428dbad0204</guid><category><![CDATA[life]]></category><dc:creator><![CDATA[Cody Redmond]]></dc:creator><pubDate>Fri, 02 Jan 2026 00:29:43 GMT</pubDate><content:encoded><![CDATA[<p>I&apos;m due for an update: Joni is now three years old, and doing great. The last few years have blown past us. I&apos;m happy and healthy. My job is engaging, and rewarding, and challenging. 
Leading a platform team is satisfying, and I&apos;m blessed to be working with some truly exceptional individuals. Audrey is wrapping up her first year as a human resources generalist, and she likes it. We love where we&apos;re living, and our house is still slowly undergoing renovations. My mom and one of my sisters moved nearby, which is nice. My other sister in Australia now has a second baby, and we&apos;ve been in contact more frequently.<br><br>I&apos;ve been meditating more frequently over the last year, and I plan to keep this up going into 2026. It sure seems appropriate, considering the precarious situation we all seem to be living in. I also hope to spend a little more time this year with music, and with physical activity. And less time with renovations :) </p>]]></content:encoded></item><item><title><![CDATA[What does it mean to "operationalize" an engineering team?]]></title><description><![CDATA[<p><br>In startup mode, it&apos;s the <strong>WHY</strong> and the <strong>WHAT</strong> that see the most consideration. The customer problem, the purpose, the service, the deliverable, etc. The <strong>HOW</strong> is often overlooked, so long as it&apos;s &quot;good enough&quot; and delivered as quickly as possible. A common strategy</p>]]></description><link>https://codydjango.com/what-does-it-mean-to-operationalize-an-engineering-team/</link><guid isPermaLink="false">6812c3a427058428dbad00a9</guid><category><![CDATA[software development]]></category><category><![CDATA[scaling]]></category><category><![CDATA[teams]]></category><dc:creator><![CDATA[Cody Redmond]]></dc:creator><pubDate>Thu, 01 May 2025 15:45:12 GMT</pubDate><content:encoded><![CDATA[<p><br>In startup mode, it&apos;s the <strong>WHY</strong> and the <strong>WHAT</strong> that see the most consideration. The customer problem, the purpose, the service, the deliverable, etc. 
The <strong>HOW</strong> is often overlooked, so long as it&apos;s &quot;good enough&quot; and delivered as quickly as possible. A common strategy during this phase is &quot;we&apos;ll figure it out as we go&quot; &#x2013; an ad-hoc approach.</p><p>As product-market fit is found, companies often choose to focus on growth, expanding their customer base and acquiring as many customers as possible within their market. This is the &quot;scale-up&quot; mode, and this is where <strong>HOW</strong> becomes significantly important: how to consistently deliver quality service as customers and usage volume grow exponentially.</p><p>It&apos;s the intentional tackling of the <strong>HOW</strong> problem that is the focus of &quot;Operations&quot;. This is the work of going from ad-hoc processes and reactive firefighting to a structured, process-driven operation that handles scale efficiently. Here&apos;s what operationalizing a reliability engineering team typically involves:</p><ul><li><strong>Playbooks</strong> - Creating standard operating procedures for common incidents, outages, and maintenance tasks</li><li><strong>Controls, Automations and self-service</strong> - Building tools that enable better control of the systems, or that automate repetitive tasks and allow other teams to help themselves</li><li><strong>Observability and monitoring</strong> - Implementation of metrics, logging, and alerting systems that provide visibility into system health and simplify troubleshooting</li><li><strong>Service Level Objectives (SLOs)</strong> - Identifying the critical user journeys, establishing clear reliability targets and tracking performance against them</li><li><strong>Refactoring legacy systems</strong> - Systematically identifying, updating or replacing early &quot;quick and dirty&quot; solutions with more scalable architectures that can handle increased load and complexity</li><li><strong>Incident management framework</strong> - Structured approach to handling incidents with clear 
roles, communication channels, and post-mortems</li><li><strong>Capacity planning</strong> - Regular forecasting of resource needs based on growth projections</li><li><strong>Knowledge management</strong> - Documentation systems that capture institutional knowledge and reduce dependency on specific team members</li><li><strong>Cross-training and hiring</strong> - Ensuring the team has redundancy in skills and is staffed appropriately for scale</li><li><strong>On-call rotation</strong> - Establishing a fair, sustainable schedule for engineers to handle off-hours incidents with clear escalation paths and support systems to prevent burnout</li></ul><p>The term &quot;operations&quot; has roots in both military and business process management, where it means to convert strategic goals into day-to-day operational processes. In software development, the language really gained traction with the rise of cloud operations in the 2010s, influenced by Google&apos;s approach to running large-scale SRE teams. It boils down to putting something into operation in a systematic, repeatable way.</p><p>Operationalizing is building systems, processes, and team structures that can handle a 10x scale without requiring a 10x headcount. While vertical scaling helps with raw capacity, it doesn&apos;t address processes that won&apos;t scale, or the exponential spikes or system patterns that can lead to cascading failures and erratic behavior.</p><p>In addition to efficiencies, a focus on operations is also highly rewarding for both engineers and end-users. In both cases we&apos;re investing in predictability, meaning a better and more consistent experience with fewer surprises. For engineers in particular, this can drastically improve work-life balance and increase flow time, reducing burnout.</p><p>Of course, it&apos;s not as easy as it seems. Especially when it comes to reworking legacy tech while scaling, or identifying and tuning SLOs. 
These are areas often underestimated, where experienced senior and staff engineers can make a huge difference. </p><p>For my next post, I&apos;ll tackle the &quot;HOW&quot; of operations &#x2013; very meta! By that, I mean the heuristics I use to identify priorities and sequencing. I&apos;ve found mapping activities, such as value-stream mapping and cognitive load/complexity mapping, to be super-useful in surfacing certain insights, and I think they&apos;re worthy of a post. I might also dip into how qualitative/anecdotal data is another useful driver, like the good work being done by the folks over at getdx.com. <br></p>]]></content:encoded></item><item><title><![CDATA[What makes a good manager?]]></title><description><![CDATA[<p>I was recently asked this question by a friend considering a move into engineering management. It&apos;s been a few years since I&apos;ve asked myself this question, and I find it interesting how my perspective on this has changed over the years.</p><p>Before I begin, I&apos;</p>]]></description><link>https://codydjango.com/what-makes-a-good-manager/</link><guid isPermaLink="false">664b6cb127058428dbacfbbe</guid><category><![CDATA[management]]></category><dc:creator><![CDATA[Cody Redmond]]></dc:creator><pubDate>Mon, 20 May 2024 19:31:45 GMT</pubDate><content:encoded><![CDATA[<p>I was recently asked this question by a friend considering a move into engineering management. It&apos;s been a few years since I&apos;ve asked myself this question, and I find it interesting how my perspective on this has changed over the years.</p><p>Before I begin, I&apos;d like to note that I won&apos;t address the manager&apos;s responsibilities, which are frequently framed as delivering value to the organization, often by achieving operational objectives while retaining and growing reports. 
Instead, I&apos;ll focus on the qualities that make for a great manager within the context of a team.</p><h3 id="consistent-practices-and-behaviours">Consistent Practices and Behaviours</h3><p>A manager creates a stable and productive environment by establishing a repeatable set of practices and behaviours. When team members know what to expect, they can focus their energy on solving problems rather than worrying about unpredictability. Practices can (and should) evolve, but there ought to be a core stability that reinforces a psychologically safe environment for work to get done.</p><p>For example, I have a weekly team meeting with an agenda that acts as a checklist of our most important responsibilities and allows the team to discuss significant issues that may have come up. I also have 1:1 meetings every two weeks and a monthly retrospective to uncover better ways of working.</p><p>I also have a standard method for feedback (both praise and constructive) that I incorporate into my onboarding of any new direct report.</p><ul><li><a href="https://www.linkedin.com/in/mark-horstman-373a3/?ref=codydjango.com">Mark Horstman</a> covers this topic in his book <a href="https://www.manager-tools.com/products/effective-manager-book-second-edition?ref=codydjango.com">The Effective Manager</a> and the <a href="https://www.manager-tools.com/?ref=codydjango.com">Manager Tools Podcast</a>. 
</li><li><a href="https://lethain.com/?ref=codydjango.com">Will Larson</a> has a blog post that covers how to systematize just enough for teams to be effective: <a href="https://lethain.com/work-policy-not-exceptions/?ref=codydjango.com">https://lethain.com/work-policy-not-exceptions/</a></li><li><a href="https://www.linkedin.com/in/claire-hughes-johnson-7058/?ref=codydjango.com">Claire Hughes Johnson</a> also recently published a <a href="https://press.stripe.com/scaling-people?ref=codydjango.com">book on this topic</a>, which describes the repeatable set of practices as the manager&apos;s &quot;operating system.&quot;</li></ul><h3 id="regulated-and-open">Regulated and Open</h3><p>Software development can evoke contrasting opinions, perspectives, and dynamics. Managers who successfully manage their own temperaments are far more effective when collaborating with others during times of increased tension. </p><p>Managers must effectively work with people not just on their team but throughout the organization. No matter the context, an effective manager leads from a place of responsibility, curiosity, and openness. The <a href="https://conscious.is/?ref=codydjango.com">Conscious Leadership Group</a> refers to this place as &quot;above the line.&quot; Conversely, being &quot;below the line&quot; means being closed off, defensive, and reactive.</p><ul><li>The book <a href="https://www.amazon.com/15-Commitments-Conscious-Leadership-Sustainable/dp/0990976904/?ref=codydjango.com">The Fifteen Commitments of Conscious Leadership</a> describes the qualities of conscious leadership in useful detail and provides methods for moving from &quot;below the line&quot; to &quot;above the line.&quot; </li></ul><h3 id="reliable-and-effective">Reliable and Effective</h3><p>A manager is responsible for the standard of quality across their teams. This means modelling professionalism and follow-through. 
It doesn&apos;t mean that you need to be stuffy or uptight; it just means that you know which qualities are most important to build trust and an effective team and work to push that standard higher. Sometimes, I refer to this as &quot;setting the tone.&quot; </p><p>For example, if I say I will do something, I do it and report back to the team. I use proper spelling, grammar, and full sentences, and I expect that folks come to meetings prepared. I use agendas for meetings, keep them on topic, and end them early instead of running down the clock. </p><ul><li>In his book <a href="https://www.amazon.ca/Art-Leadership-Small-Things-Done/dp/1492045691?ref=codydjango.com">The Art of Leadership: Small Things, Done Well</a>, Michael Lopp highlights that managers must set the highest standard for follow-through, which is crucial in building a high-trust team.</li><li>The central lesson of the book <a href="https://www.amazon.ca/Score-Takes-Care-Itself-Philosophy/dp/1591843472?ref=codydjango.com">The Score Takes Care of Itself</a> is that leaders who set, model, and maintain a high standard will create an environment where success is a natural outcome. </li><li>Another fun read that covers this ground is <a href="https://www.amazon.ca/Turn-Ship-Around-Turning-Followers/dp/1591846404?ref=codydjango.com">Turn the Ship Around!</a> by David Marquet. I liked this one, too. It goes a little deeper into how shifting language can be a powerful tool in shifting behaviours. </li></ul><h3 id="ambitious">Ambitious</h3><p>A line manager often links the CEO or VP and the staff, ensuring effective alignment of organizational goals. With clear understanding and articulation, aligning the organizational vision with the personal drivers of your direct reports is possible, which can result in a powerful source of motivation. 
</p><p>In addition to responding to top-down objectives, a great manager will also work from the bottom up, looking for opportunities or innovations that could generate new revenue streams or unlock untapped organizational value.</p><p>Lastly, a great manager understands that engineers love hard problems and that setting high targets is a precursor to achieving exceptional outcomes. But just as important is understanding that making unachievable commitments on someone else&apos;s behalf is a source of demotivation. In this regard, it&apos;s important to maintain awareness of the technical environment in which your team is operating and to collaborate with your team to identify ambitious targets that are also achievable. &#xA0; </p><ul><li>The book <a href="https://www.amazon.ca/Drive-Surprising-Truth-About-Motivates/dp/1594484805?ref=codydjango.com">Drive: The Surprising Truth About What Motivates Us</a> does a great job of unpacking how different people are motivated by different factors. It&apos;s worthwhile to check in regularly with your direct reports on the most important factors and then find ways to align those drivers with operational objectives or growth goals. &#xA0;</li></ul><h3 id="risk-adverse">Risk-Averse</h3><p>Lastly, I don&apos;t see this aspect come up in discussions very often&#x2014;perhaps because it&apos;s often relegated to project management, a separate area. But from my own experience, I&apos;ve noticed that a process that identifies and manages risk early enables far greater delivery and fewer surprises.</p><p>I like starting with a risk mitigation activity with my team that uncovers the assumptions, dependencies, and bets. Then, to raise our confidence, we validate assumptions and spike in the areas where we are least confident. We then move to system design diagramming and technical designs for significant areas. 
</p><p>If you can do this quickly and repeatably, you will often avoid working on the wrong thing for too long or seeing a project go drastically off-track. A direct report might occasionally feel unfamiliar with the &quot;design up front&quot; approach. They come around when they see that the entire thing can be accomplished within hours, that it generally yields better results, and that it&apos;s fun, too.</p><ul><li>In the book &quot;<a href="https://rightingsoftware.org/?ref=codydjango.com">Righting Software</a>,&quot; Juval L&#xF6;wy emphasizes the importance of addressing areas of volatility in a codebase and the critical role of <em>project design</em> in software architecture. In particular, he states that complex software benefits from considered design, scoping, and sequencing. The <em>design </em>involves defining the architecture, components and interactions; the <em>scoping </em>ensures the functional and nonfunctional requirements are met and communicated, and the <em>sequencing </em>ensures that the development workflow is optimal for the team, which I often interpret as &quot;allowing for the maximum number of iterations on running code&quot;.</li><li>The Pragmatic Engineer has <a href="https://blog.pragmaticengineer.com/efficient-software-project-management-at-its-roots/?ref=codydjango.com">lots</a> of <a href="https://blog.pragmaticengineer.com/what-agile-really-means/?ref=codydjango.com">content</a> about how engineering managers should be familiar with <a href="https://blog.pragmaticengineer.com/how-to-lead-a-project-in-software-development/?ref=codydjango.com">project management and risk management</a>. </li></ul>]]></content:encoded></item><item><title><![CDATA[Missing from my bookshelf: the 15 Commitments of Conscious Leadership]]></title><description><![CDATA[<p>This book was recently recommended to me from Bryan Dunn, the VP of Product at <a href="https://crowdbotics.com/?ref=codydjango.com">Crowdbotics</a>. 
It arrived at my door on Saturday, and I started reading it immediately. My general approach to a new book is to first read the intro and conclusion &#x2013; this generally gives me a</p>]]></description><link>https://codydjango.com/missing-from-my-bookshelf-the-15-commitments-of-conscious-leadership/</link><guid isPermaLink="false">65e9f78527058428dbacfb5b</guid><category><![CDATA[leadership]]></category><category><![CDATA[books]]></category><dc:creator><![CDATA[Cody Redmond]]></dc:creator><pubDate>Thu, 07 Mar 2024 18:33:59 GMT</pubDate><content:encoded><![CDATA[<p>This book was recently recommended to me by Bryan Dunn, the VP of Product at <a href="https://crowdbotics.com/?ref=codydjango.com">Crowdbotics</a>. It arrived at my door on Saturday, and I started reading it immediately. My general approach to a new book is to first read the intro and conclusion &#x2013; this generally gives me a good idea of how well the book reads, and how much of an investment I want to make. I was immediately hooked. The tone and language are friendly and interesting; a little hippy-dippy but not unfocused.</p><p>At this point, I&apos;m convinced this is a book that I&apos;ve been missing. Of course every leader likes to think that they&apos;re &quot;conscious&quot; and have a high degree of self-awareness, but this is one of those things that can be perfectly described as a blind spot. Emotions are tricky to navigate, and I&apos;m eager for more management tricks and tools. I&apos;m excited to dedicate my morning time over the next couple of weeks to learning new tricks and tools from this book. I&apos;ll come back to this post with updates as I go. 
</p>]]></content:encoded></item><item><title><![CDATA[Scaling People: Tactics for Management and Company Building]]></title><description><![CDATA[<p>In addition to my <a href="https://codydjango.com/new-role-in-new-industry/">new role</a> at <a href="https://crowdbotics.com/?ref=codydjango.com">Crowdbotics</a>, I&apos;ve also started reading a <a href="https://press.stripe.com/scaling-people?ref=codydjango.com">recently published book</a> on scaling people management. It was written by Claire Hughes Johnson - former Google VP and former COO of Stripe. I&apos;m excited because it purports to be a pragmatic take on management,</p>]]></description><link>https://codydjango.com/scaling-people-tactics-for-management-and-company-building/</link><guid isPermaLink="false">6507532727058428dbacfacb</guid><category><![CDATA[books]]></category><dc:creator><![CDATA[Cody Redmond]]></dc:creator><pubDate>Sun, 17 Sep 2023 20:21:35 GMT</pubDate><content:encoded><![CDATA[<p>In addition to my <a href="https://codydjango.com/new-role-in-new-industry/">new role</a> at <a href="https://crowdbotics.com/?ref=codydjango.com">Crowdbotics</a>, I&apos;ve also started reading a <a href="https://press.stripe.com/scaling-people?ref=codydjango.com">recently published book</a> on scaling people management. It was written by Claire Hughes Johnson - former Google VP and former COO of Stripe. I&apos;m excited because it purports to be a pragmatic take on management, complete with templates and workbook material.</p><p><strong>Update</strong>: I&apos;ve read this book, and it would have been a lot more eye-opening to me when I started my leadership journey. Although I appreciated the articulation and writing style, I had already come to many of the insights on my own. 
I will recommend this book to new managers or folks leaning in that direction.</p><p><strong>Notes from the introduction</strong>:</p><ul><li>You need process, and you need it sooner than you realize</li><li>A company will not get far without &quot;core processes&quot;: strong management and sound operating systems.</li><li>Companies and teams must establish a playing field where everyone participates and marks progress.</li></ul><blockquote>You know why playing a game is fun? Because it has rules, and you have a way to win. Picture a bunch of people showing up at an athletic field with random equipment and no rules. Someone is going to get hurt. You don&apos;t know how to play, you don&apos;t know how to score, and you don&apos;t know how to win.</blockquote><ul><li>Research has found that people who outperform in their fields employ strategies that move them past the autonomous stage of learning, like athletes who use speed workouts to improve their performance.</li><li>A combination of core frameworks, such as hiring and planning practices and underlying leadership principles, can help scale an organization. 
</li></ul><p>Workbooks: <a href="http://press.stripe.com/scaling-people/workbooks?ref=codydjango.com">http://press.stripe.com/scaling-people/workbooks</a></p><p>Consider reading: <a href="https://www.amazon.ca/Working-Backwards-Insights-Stories-Secrets?ref=codydjango.com">https://www.amazon.ca/Working-Backwards-Insights-Stories-Secrets</a></p><p>Two more books on Stripe Press that I can thoroughly recommend:</p><ul><li><a href="https://press.stripe.com/high-growth-handbook?ref=codydjango.com"><em>High Growth Handbook </em>by Elad Gil</a></li><li><a href="https://press.stripe.com/an-elegant-puzzle?ref=codydjango.com"><em>An Elegant Puzzle </em>by Will Larson</a></li></ul>]]></content:encoded></item><item><title><![CDATA[Democratizing Software Development in the Low-Code Industry.]]></title><description><![CDATA[<p>I&apos;m happy to announce that I&apos;m starting a new leadership position as a Platform Software Engineering Manager at <a href="https://crowdbotics.com/?ref=codydjango.com">Crowdbotics</a>. I&apos;m working with a fantastic team to simplify and accelerate software development using AI and low-code/no-code approaches. &#xA0; </p><p>Crowdbotics was founded in 2017 by</p>]]></description><link>https://codydjango.com/new-role-in-new-industry/</link><guid isPermaLink="false">65074a6527058428dbacfa2c</guid><category><![CDATA[career]]></category><dc:creator><![CDATA[Cody Redmond]]></dc:creator><pubDate>Sun, 17 Sep 2023 19:12:59 GMT</pubDate><content:encoded><![CDATA[<p>I&apos;m happy to announce that I&apos;m starting a new leadership position as a Platform Software Engineering Manager at <a href="https://crowdbotics.com/?ref=codydjango.com">Crowdbotics</a>. I&apos;m working with a fantastic team to simplify and accelerate software development using AI and low-code/no-code approaches. &#xA0; </p><p>Crowdbotics was founded in 2017 by Y Combinator alum and Forbes Magazine &quot;30 under 30&quot; Anand Kulkarni. 
Since then, Crowdbotics has paved the way for low-code software development, enabling anyone to generate production-grade applications in minutes using AI-assisted product requirement analysis, visual building tools and code generation. </p><p>Over 20,000 apps have been launched through Crowdbotics, including mission-critical healthcare applications, venture-backed software products earning millions in revenue, learning management platforms, and government tools. </p><p>I&apos;ll work closely with Product Management to develop and scale the architecture and engineering teams necessary to enable a growth vector in an exciting new domain. </p><p>Although Crowdbotics has a small headquarters in Berkeley, CA, it&apos;s a globally distributed remote-first company with members across all time zones. My team is primarily situated in PDT, although there are a few in EST, as well as members in Nairobi, Dubai and Nepal!</p>]]></content:encoded></item><item><title><![CDATA[Observability at Scale]]></title><description><![CDATA[<p><em><em>This is Part </em>IV<em> in Observability Engineering: Achieving Production Excellence</em></em></p><h3 id="build-versus-buy-and-return-on-investment"><br>Build Versus Buy and Return on Investment</h3><p>This chapter provides solid advice for those who are unfamiliar with the &quot;<a href="https://en.wikipedia.org/wiki/Not_invented_here?ref=codydjango.com">not invented here</a>&quot; syndrome. 
</p><p>When considering building with open source tools, weigh the full impact of hidden costs like</p>]]></description><link>https://codydjango.com/observability-at-scale/</link><guid isPermaLink="false">64ee39ed27058428dbacf952</guid><category><![CDATA[software testing]]></category><dc:creator><![CDATA[Cody Redmond]]></dc:creator><pubDate>Fri, 01 Sep 2023 20:16:53 GMT</pubDate><content:encoded><![CDATA[<p><em><em>This is Part </em>IV<em> in Observability Engineering: Achieving Production Excellence</em></em></p><h3 id="build-versus-buy-and-return-on-investment"><br>Build Versus Buy and Return on Investment</h3><p>This chapter provides solid advice for those who are unfamiliar with the &quot;<a href="https://en.wikipedia.org/wiki/Not_invented_here?ref=codydjango.com">not invented here</a>&quot; syndrome. </p><p>When considering building with open source tools, weigh the full impact of hidden costs like recruiting, hiring, and training to develop and maintain custom solutions and the opportunity costs of not delivering core business value. </p><h3 id="efficient-data-storage">Efficient Data Storage</h3><p>There are many challenges when it comes to storing, and especially querying, observability data, which has real-time requirements on billions of rows of ultrawide events of high-dimensionality and high-cardinality data. This chapter uses Honeycomb&apos;s Retriever implementation to elucidate the various tradeoffs. Other publicly available data stores up to the challenge include <a href="https://cloud.google.com/bigquery/?ref=codydjango.com">Google Cloud BigQuery</a>, <a href="https://clickhouse.com/?ref=codydjango.com">ClickHouse</a>, and <a href="https://druid.apache.org/?ref=codydjango.com">Apache Druid</a>. </p><h3 id="cheap-and-accurate-enough-sampling">Cheap and Accurate Enough: Sampling</h3><p>Your team is probably more concerned about traces that contain errors or poor performance. 
Sampling is an excellent technique for improving the signal-to-noise ratio on events you care about, drastically reducing complexity and costs when considering storage and query requirements. </p><p>Because sampling is so valuable when handling observability data at scale, it&apos;s becoming increasingly common for open-source instrumentation libraries such as OTel to provide sampling logic capabilities. </p><p>There are two different sampling strategies to use tactically:</p><ul><li>Head-based: The decision is made immediately and is propagated downstream via headers. Pro: reduces the overhead of collecting and storing unnecessary traces right at the source. Con: Potentially significant or anomalous traces may be missed or incomplete if only some services in a distributed system decide to sample a request.</li><li>Tail-based: The sampling decision occurs at the end of a transaction or request. The system collects all spans related to a trace and then decides whether or not to keep it based on various criteria. Pro: All meaningful traces are retained, leading to better insights. Con: More resource-intensive; implementation is more complex. </li></ul><h3 id="telemetry-management-with-pipelines">Telemetry Management with Pipelines</h3><p>More to come.</p><p></p>]]></content:encoded></item><item><title><![CDATA[Simplifying complexity]]></title><description><![CDATA[<p>As organizations evolve and software grows, so does complexity. 
Learning how to identify and curb complexity is a core skill to develop as an engineering manager.</p><p>Unconstrained complexity inevitably results in suboptimal outcomes, such as:</p><ul><li>A slowdown in a team&apos;s ability to accurately assess a problem or deliver</li></ul>]]></description><link>https://codydjango.com/how-i-simply-complexity/</link><guid isPermaLink="false">64c7ea8327058428dbacf648</guid><category><![CDATA[management]]></category><dc:creator><![CDATA[Cody Redmond]]></dc:creator><pubDate>Thu, 03 Aug 2023 20:27:49 GMT</pubDate><content:encoded><![CDATA[<p>As organizations evolve and software grows, so does complexity. Learning how to identify and curb complexity is a core skill to develop as an engineering manager.</p><p>Unconstrained complexity inevitably results in suboptimal outcomes, such as:</p><ul><li>A slowdown in a team&apos;s ability to accurately assess a problem or deliver a solution</li><li>A slowdown in the time it takes to onboard new engineers</li><li>An increase in incidents and rework</li><li>A drop in morale or an increase in attrition</li></ul><p>It&apos;s impossible to eliminate complexity: some problem spaces are just naturally complex, and there&apos;s no way around it. The trick for an engineering manager is to <em>manage </em>complexity, which means identifying, measuring, and simplifying where possible.</p><h2 id="talking-about-complexity">Talking about complexity</h2><p>Complexity can be found in many forms. It&apos;s helpful to be able to identify which aspect of complexity we intend to simplify. </p><h3 id="aspects-and-measurements">Aspects and measurements</h3><ul><li><strong>Cognitive load</strong>: characterized by an overwhelming diversity of detailed tasks that increases context switching and leaves few opportunities for deep work or system optimization, resulting in generally slow delivery and low morale. 
Measure with developer surveys, 1:1s, and onboarding metrics for new hires.</li><li><strong>Process complexity</strong>: characterized by too many cooks in the kitchen, too many meetings, too many required signoffs, slow decision-making, unmet requirements, and frequent change orders. </li><li><strong>Codebase complexity</strong>: characterized by slow builds, slow tests, longer cycle times on code changes and code reviews, and increased rework or deployment rollbacks. Measure with a maintainability index that suits the stage and engineering goals of the organization. </li><li><strong>System complexity</strong>: characterized by no single person knowing how the system works, no known success metrics, or low-quality metrics; quality or performance feedback often comes from end-users, and debugging or root cause analysis requires multiple people and a significant investment. Measure with build times, the time it takes to get from code written to code in production, the time for functional tests to run, rework ratios, and other DORA-inspired metrics.</li></ul><h3 id="assessment-and-approaches">Assessment and Approaches</h3><p>The <a href="https://hbr.org/2007/11/a-leaders-framework-for-decision-making?ref=codydjango.com">Cynefin framework</a> helps leaders assess their operating context so they can take appropriate action quickly. The four quadrants are <em>Simple</em>, <em>Complicated</em>, <em>Complex</em>, and <em>Chaotic</em>, where <em>Simple</em> is characterized by apparent cause-and-effect relationships, with correct answers that are based on facts and easily verified, and things become much less linear from there. I won&apos;t go into the details of each quadrant&apos;s characteristics (feel free to read the link or check out the wiki page &#x2013; it&apos;s excellent). 
But mapping complexities with this framework has often led me to the following actionable behaviours: </p><ul><li>Reduce <em>Chaotic</em> domains to <em>Complex</em> by drawing signal out of the noise with Observability and collaborative Event Storming.</li><li>Reduce <em>Complex</em> domains to <em>Complicated</em> by introducing abstractions, boundaries, patterns, and workflows. </li><li>Make <em>Complicated</em> domains <em>Simple</em> by introducing or improving tooling, access to specialists, or load-shedding via a specialized team to handle inherently complex systems that are core to the organization&apos;s revenue streams.</li><li>Eliminate <em>Simple</em> tasks via automation or outsourcing.</li></ul><h2 id="conclusion">Conclusion</h2><ul><li><strong>Divide and Conquer</strong>: Complexity in software is typically managed by a &quot;divide and conquer&quot; approach, which can be applied at any level of granularity when considering software systems.</li><li><strong>Abstractions, Interfaces and Boundaries</strong>: Introduce smaller cognitive loads with specialization and bounded contexts modelled on supporting the business&apos;s current and future revenue streams. Move teams to support the bounded contexts, and have those teams own the architecture that they depend on and document it with C4 patterns.</li><li><strong>Refactoring to Design Patterns</strong>: Refactor a complicated codebase towards named design patterns to increase understandability, maintainability, and flexibility. </li><li><strong>Introduction of Frameworks</strong>: If you&apos;re noticing the same sorts of use cases come up frequently, introducing a framework can make a huge impact, although it might require an up-front investment, so be prepared with a considered plan when proposing it to your team. 
</li><li><strong>Enabling Teams</strong>: Introduce a temporary enabling team to aid in strategic refactoring or short-term organizational projects such as GDPR, security improvements, etc. Enabling teams are also great for tackling technical debt or building technical surplus to unlock additional product velocity. They can also be transitioned easily to permanent developer experience teams if desired.</li><li><strong>Complex Subsystem Teams</strong>: Load-shed core complexity from stream-aligned product teams with the introduction of a team designed to handle a particularly complicated area.</li></ul><p><em>Bonus points:</em></p><ul><li><strong>Establish baseline performance measurements with SLAs, SLOs, and SLIs for services in production:</strong> Identifying baseline performance metrics enables a clear focus, especially when things are overwhelming. Knowing the performance and expectations of production services clarifies the capacity for taking on new objectives and supports wise decisions. For example, reducing complexity might be more critical when SLAs are unmet. </li><li><strong>Establish high-level goals with Objectives and Quantitative Key Results</strong>: Identify big goals and ensure they can be achieved incrementally (i.e. a mix of leading and lagging metrics means progress can be measured throughout development and not just at the very end). Working with Product Stakeholders to identify the right success metrics early results in a much simpler development process and managed expectations. </li></ul>]]></content:encoded></item><item><title><![CDATA[Coaching frameworks]]></title><description><![CDATA[<p>I received a question a few weeks back about whether I use a coaching framework and, if so, which one. 
My response was that I use a framework of my own design: a combination of <a href="https://en.wikipedia.org/wiki/SWOT_analysis?ref=codydjango.com">SWOT analysis</a> with <a href="https://en.wikipedia.org/wiki/Objectives_and_key_results?ref=codydjango.com">OKR</a>-style reporting. At the time, it didn&apos;t seem satisfying as an</p>]]></description><link>https://codydjango.com/coaching/</link><guid isPermaLink="false">64c4497727058428dbacf2ff</guid><category><![CDATA[management]]></category><dc:creator><![CDATA[Cody Redmond]]></dc:creator><pubDate>Sat, 29 Jul 2023 04:30:52 GMT</pubDate><content:encoded><![CDATA[<p>I received a question a few weeks back about whether I use a coaching framework and, if so, which one. My response was that I use a framework of my own design: a combination of <a href="https://en.wikipedia.org/wiki/SWOT_analysis?ref=codydjango.com">SWOT analysis</a> with <a href="https://en.wikipedia.org/wiki/Objectives_and_key_results?ref=codydjango.com">OKR</a>-style reporting. At the time, it didn&apos;t seem satisfying as an answer. The reality is always a little more nuanced; I&apos;ve read multiple books on coaching, practiced various systems with my directs and received executive coaching myself. With time and experience, I&apos;ve dialled in a system that has contributed to the professional growth and promotion of many of my reports. </p><p>But in retrospect, it&apos;s clearer to say that I use the <a href="https://hbr.org/2019/11/the-leader-as-coach?ref=codydjango.com">GROW model</a>. It&apos;s become widespread in tech, and it&apos;s generally the answer that folks want to hear.</p><h2 id="grow-model-of-coaching">GROW Model of Coaching </h2><p>So what is the GROW model? It&apos;s a way of speaking that is intended to result in better non-directive coaching. Why non-directive coaching? Because most leaders naturally employ a directed style of leadership, which takes place primarily through &quot;telling&quot; and leans heavily on authority. 
This style is common when training a report for a specific, low-variability job. </p><p>The challenge for modern teams is a fast-moving world: objectives change, technologies evolve, and it&apos;s no longer feasible to expect a manager or leader to always know the best way forward. Directed leadership also has the unfortunate property of stifling ownership and does not build organizational capacity well.</p><p>A GROW approach to coaching requires patience to learn and apply but is rewarding and energizing when used successfully. Avoiding leading questions means that the direct report always owns the problem to be solved, and the coaching method is specifically tailored to open perspective and draw out insight. </p><ul><li><strong>Goal</strong>: Asking what the person wants to achieve from the session and in the immediate future, such as &quot;<em>What do you want when you walk out the door that you don&apos;t have now?</em>&quot;</li><li><strong>Reality</strong>: This means asking probing questions about the current situation in detail. The trick here is to inquire about facts describing reality, such as <em>what</em>, <em>who</em>, <em>when</em>, and <em>where</em> &#x2013; but explicitly not <em>why</em>, because <em>why</em> is tied to motivations and justifications, which border on judgment and can raise defences. That is counter-productive. A good question here is, &quot;<em>What are the key things we need to know?</em>&quot;</li><li><strong>Options: </strong>This step is to help broaden the perspective. Often people who are seeking help feel like they have limited options. By opening the floor to creative thinking, more possibilities and perspectives emerge. I learned a trick from Product Management: <em>Diverge for Options, Converge for Decisions</em>. A good question here is, &quot;<em>If you had a magic wand, what would you do?</em>&quot;</li><li><strong>Will: </strong>Asking &quot;What will you do?&quot; encourages detailing a specific plan. 
Sometimes the plan might just be to learn more about the situation, to show appreciative inquiry in learning more about a problem or someone&apos;s perspective or concerns. Sometimes, it&apos;s about preparing for a conversation or drafting a proposal. Another trick here is to ask how confident the person is in their plan and how likely they will be to act on it. Investing in helping someone through a problem, only for them not to feel confident enough to act on it, can be demotivating for all parties. My job as a people manager is to validate that my coaching serves the intended function.</li></ul><h2 id="the-effective-manager">The Effective Manager</h2><p>This is the book I come back to more than anything. According to Mark Horstman, successful coaching is simply about asking for more. Supporting, mentoring, and coaching are not ends in themselves. People managers are intended to drive value for the company. Effective managing is giving frequent feedback on performance, and effective coaching is continuously asking for higher levels of performance. </p><p>This may sound aggressive or counter-intuitive, but in a high-trust environment, it&apos;s a superpower. Too many managers feel like coaching is something to apply to low-performers, but coaching high-performers is satisfying and can yield much more. </p><p>Either way, every direct report should know the performance expectations for their role, and a regular review on the topic should occur. Not every direct will be rushing to exceed them, and that&apos;s okay. They know what the expectations are and are directly responsible for their career development. They also know that their manager will work with them to achieve their performance goals.</p><p>These are the steps for effective coaching that Manager Tools (MT) recommends:</p><ol><li><strong>Collaborate to Set a Goal</strong>: Describe a behaviour or result to achieve by a date. 
For example: <em>by (four months from now), you will deliver admin features within one week on average without introducing any regressions</em>. (This is an example of the MT goal structure DBQ &#x2013; <em>Date Behaviour Quality</em>). MT specifies that coaching goals are long-term behavioural goals &#x2013; if they can be achieved in less than four months, a frequent feedback model will suffice.</li><li><strong>Collaborate to Brainstorm Resources</strong>: There are no silver bullets. Go for volume, not accuracy. </li><li><strong>Collaborate to Create a Plan: </strong>Each step in the plan contains a deadline and a behaviour, and counts as complete once the direct has reported it to the manager &#x2013; effectively, the plan is an accountability system that you are directly investing in for the sake of your report. The plan is only for the first few weeks &#x2013; it&apos;s not worth planning four months if they can&apos;t make it past the first week.</li><li><strong>The Direct Acts and Reports on the Plan: </strong>If the system has been set up correctly, we should receive regular updates in the form of task completion emails, and we then briefly discuss progress each week in 1:1s.</li></ol><p>Steps 3 and 4 then repeat until the goal from step 1 is achieved. If the direct report fails to accomplish a coaching task, we give the direct report negative feedback. MT gives an example: &quot;When you miss your coaching deadlines, that&apos;s more work for later. Can you change that?&quot; Similarly, positive feedback can be given when the direct report completes a coaching deliverable.</p><p>I love that Manager Tools willingly leverages what we know about organizational behaviour to set short deadlines on doable tasks to increase the chance of completion. Whenever possible, look for opportunities to observe the direct engaging in the behaviour we want, to provide the direct with feedback on what we observe, and make that a regular, very short-scope task. 
If coaching is behavioural, then the best way to achieve it is to enable and incentivize frequent practice.</p><h2 id="the-coaching-habit">The Coaching Habit</h2><p>This book claims that seven questions and a coaching habit will foster team autonomy and empowerment. Its central argument is that successful coaching comes from asking questions rather than providing answers. I found this book to be too long for how little substance it has, but if you are looking for a way to refresh your 1:1 questions, you can look to the seven questions for inspiration.</p>]]></content:encoded></item><item><title><![CDATA[Observability for Teams]]></title><description><![CDATA[<p><em>This is Part III in Observability Engineering: Achieving Production Excellence</em></p><h3 id="applying-observability-practices-in-your-team">Applying Observability Practices in Your Team</h3><p>Start with the most significant pain points, and then flesh out your instrumentation iteratively.</p><h3 id="observability-driven-development">Observability-Driven Development</h3><ul><li>A key finding of Accelerate: Building and Scaling High Performing Technology Organizations was that the inverse relationship between</li></ul>]]></description><link>https://codydjango.com/observability-for-teams/</link><guid isPermaLink="false">64b96e6127058428dbacf009</guid><category><![CDATA[software testing]]></category><dc:creator><![CDATA[Cody Redmond]]></dc:creator><pubDate>Fri, 28 Jul 2023 22:59:53 GMT</pubDate><content:encoded><![CDATA[<p><em>This is Part III in Observability Engineering: Achieving Production Excellence</em></p><h3 id="applying-observability-practices-in-your-team">Applying Observability Practices in Your Team</h3><p>Start with the most significant pain points, and then flesh out your instrumentation iteratively.</p><h3 id="observability-driven-development">Observability-Driven Development</h3><ul><li>A key finding of Accelerate: Building and Scaling High Performing Technology Organizations 
was that the inverse relationship between speed and quality is a myth: high-performing teams can release high-quality code quickly, and these two qualities are correlated and reinforce each other. Conversely, teams that move slowly tend to fail more often and take substantially longer to recover.</li><li>The key metric for the health and effectiveness of an engineering team is <u>the time elapsed from when code is written to when it is in production</u>. Every team should be tracking this metric and working to improve it.</li><li>Isolated test-driven development does not reveal whether customers are having a good experience with your service.</li><li>Observability should be used early in the software development life cycle, during the development process, to help catch defects earlier and reduce the cost of fixing them later. This is what is meant by &quot;Shifting Observability Left.&quot;</li></ul><h3 id="using-service-level-objectives-for-reliability">Using Service-Level Objectives for Reliability</h3><ul><li>Threshold alerting is for known unknowns only. This isn&apos;t sustainable; failures in distributed systems are inevitable and unpredictable. </li><li>A good alert must reflect immediate user impact, be actionable, be novel, and require investigation rather than rote action.</li><li>SLOs decouple the &quot;what&quot; and &quot;why&quot; behind incident alerting.</li><li>SLOs are excellent at communicating how to prioritize reliability against feature development. If we aren&apos;t hitting SLOs, the focus ought to be reliability.</li><li>Two types of SLOs: time-based measures (99th percentile latency less than 300 ms over each 5-minute window) and event-based measures (proportion of events that took less than 300 ms during a given rolling time window).</li><li>For time-based: with a 99% target, I&apos;m allowed 1 bad minute for every 100 minutes. 
For event-based: for every 100 events, I&apos;m allowed one bad event.</li><li>Prefer event-based SLOs: they provide a more reliable and granular way to quantify the state of a service. They are more precise. They measure brownouts better, where some events fail but not all of them. You can more reasonably measure an SLO with event-based availability targets.</li><li>If SLOs are not being met, but customers are also not complaining, then perhaps it&apos;s okay to relax the SLOs, if that could enable product development elsewhere. </li><li>If customers complain, it might be a poor leadership decision to relax the SLOs further.</li></ul><h3 id="debugging-slo-based-alerts">Debugging SLO-Based Alerts</h3><ul><li>Stop relying on experience to guess what is happening in a system. It&apos;s unreliable and unsustainable.</li><li>Observability is not specific to debugging. Debugging&apos;s concern is to remove a bug, but it says little about the overall state of a given system. Observability will tell you which systems are good candidates to improve, which may involve debugging but could also involve performance improvements, refactoring or redesigning to achieve a target SLO.</li></ul><h3 id="observability-and-the-software-supply-chain">Observability and the Software Supply Chain</h3><ul><li>Slack implemented Observability in the software supply chain, instrumenting the CI pipeline to solve complex problems throughout the CI workflow that were previously invisible or undetected.</li></ul>]]></content:encoded></item><item><title><![CDATA[Fundamentals of Observability]]></title><description><![CDATA[<p><em>This is Part II in Observability Engineering: Achieving Production Excellence</em></p><p><strong>We can&#x2019;t understand a complex system if it&#x2019;s a black box.</strong></p><p>Observability aims to understand and explain your system&apos;s internal state from its outputs, ideally without adding new metrics. 
</p><p><em>Structured events are the building</em></p>]]></description><link>https://codydjango.com/observability-engineering-part-ii/</link><guid isPermaLink="false">648bba669e85930544c4a61c</guid><category><![CDATA[software testing]]></category><dc:creator><![CDATA[Cody Redmond]]></dc:creator><pubDate>Fri, 28 Jul 2023 22:57:26 GMT</pubDate><content:encoded><![CDATA[<p><em>This is Part II in Observability Engineering: Achieving Production Excellence</em></p><p><strong>We can&#x2019;t understand a complex system if it&#x2019;s a black box.</strong></p><p>Observability aims to understand and explain your system&apos;s internal state from its outputs, ideally without adding new metrics. </p><p><em>Structured events are the building blocks of observability</em>. High cardinality, high dimensionality, and context-rich events facilitate discoverability, enabling a movement away from reactive, iterative debugging to an approach where curiosity is immediately rewarded.</p><p>To answer all possible questions with metrics, all metrics would have to be captured at all levels of granularity, which is unrealistic and prohibitively expensive. You&apos;d spend more time with metrics than with the actual software. Furthermore, domain expertise is still necessary to contextualize the metrics and make sense of them concerning the question at hand.</p><p>When emitting high-context events, we still get all metrics related to the event, but only those metrics, which is much more reasonable. We only require metrics relevant to the event&apos;s context, and the events can be compared for outliers in an existing data set. This is useful for performance analysis.</p><p>Chapter Five concludes that metrics are too low-level and isolated to serve as a building block for true software observability. 
They should instead be reserved for where they are efficient: monitoring low-variability infrastructure and system-level concerns.</p><h2 id="glossary">Glossary</h2><ul><li><strong>Metric:</strong> a pre-aggregated measurement, as a scalar value, collected to represent system state, with optional tags used for grouping and searching.</li><li><strong>Structured Event: </strong>a record of everything that occurred while one particular request interacted with your service, organized and formatted as key-value pairs so it&apos;s easily searchable. </li><li><strong>Distributed Trace</strong>: the tracking of interrelated events that occur throughout a distributed backend, usually in the service of a single request.</li><li><strong>Trace Span</strong>: the segments that comprise each part of a distributed trace. These might correlate to network jumps between services or to particular areas of measurement. Spans are typically differentiated as <em>root</em> spans and <em>parent-child</em> spans, and contain specific data used to enable the stitching of the trace: the trace ID, span ID, parent ID, timestamp, and duration. Additional data can also be added to a span as a series of tags to be leveraged in custom queries and sampling rules.</li></ul><h2 id="opentelemetry">OpenTelemetry</h2><p>OpenTelemetry (OTel) is a Cloud Native Computing Foundation incubating project formed by merging the OpenTracing and OpenCensus projects. This happened in 2019, so as a whole OTel is still relatively new. Despite its youth, it has seen rapid adoption in the industry, with technical committees composed of representatives from Google, Lightstep, Microsoft and Uber. Some benefits to using OpenTelemetry:</p><ul><li>Being vendor-agnostic and community-supported means you only have to instrument once to send telemetry data to different products. 
</li><li>Consistency in language and established semantic conventions help with alignment and ensure that everyone is on the same page.</li><li>Ample availability of libraries with broad language support</li></ul><p>OTel provides libraries, agents, tooling, and other components designed for capturing and managing telemetry data across your services.</p><h3 id="opentelementry-concepts">OpenTelemetry Concepts</h3><ul><li><strong>API</strong>: OTel libraries have a specific interface that developers use to interact with the OTel system</li><li><strong>SDK</strong>: The concrete implementation component of OTel that tracks state and batches data for transmission</li><li><strong>Tracer</strong>: A component in the SDK that tracks which span is currently active in a system process. It also enables adding attributes or events to the span or modifying its state. </li><li><strong>Meter</strong>: A component responsible for creating the instruments used to report measurements in your process. It also provides the ability to access and modify measurements, such as by adding or retrieving values at periodic intervals. </li><li><strong>Context propagation</strong>: The inbound request contains headers that the SDK deserializes to establish the current context for the process; the SDK also serializes that context to pass it downstream.</li><li><strong>Exporter</strong>: A plug-in for the SDK that translates OTel in-memory objects into the appropriate format required by a specific destination, such as stdout, a log file, Zipkin, Jaeger, Lightstep or Honeycomb.</li><li><strong>Collector</strong>: a standalone binary process that receives telemetry data in OTLP format, processes it, and sends it to one or more configured destinations. 
</li></ul><p>The next meetup is July 20th, where we cover <em>Part III: Observability For Teams.</em></p>]]></content:encoded></item><item><title><![CDATA[Engineering Management, in my own words]]></title><description><![CDATA[<h3 id="what-is-the-role-of-an-engineering-manager">What is the role of an engineering manager?</h3><p>At a high level, it&apos;s to deliver value, keep the engineering team engaged, and retain and grow direct reports. But let&apos;s unpack this a little further.</p><p>This means ruthlessly prioritizing while simplifying complexity and promoting deep work. Tactically,</p>]]></description><link>https://codydjango.com/engineering-management-in-my-own-words/</link><guid isPermaLink="false">64b6adf327058428dbacee06</guid><category><![CDATA[management]]></category><dc:creator><![CDATA[Cody Redmond]]></dc:creator><pubDate>Wed, 19 Jul 2023 23:40:06 GMT</pubDate><content:encoded><![CDATA[<h3 id="what-is-the-role-of-an-engineering-manager">What is the role of an engineering manager?</h3><p>At a high level, it&apos;s to deliver value, keep the engineering team engaged, and retain and grow direct reports. But let&apos;s unpack this a little further.</p><p>This means ruthlessly prioritizing while simplifying complexity and promoting deep work. Tactically, an effective EM knows which architectural and project considerations can be made to increase the throughput of delivered value as the organization pivots and scales.</p><p>As a leader, the EM sets the pace for others to follow and models the organization&apos;s values. The EM also must win the team&apos;s and their peers&apos; trust and be trusted to make wise and quick decisions when needed.</p><p>An empathic EM cultivates a high-trust environment and knows how to get the most from their team. They are practiced in delegation and coaching, and they give quick and effective feedback. They promote deep work, and they appreciate the role of developer experience in a highly effective team. 
</p><p>The engineering manager is also the primary point of contact with many stakeholders across the organization, working collaboratively with product managers, other engineering managers, and domain specialists. The engineering manager maintains frequent and disciplined communication with stakeholders, directors, and executives, keeping confidence high and surprises low. </p><h3 id="how-do-you-measure-your-engineering-team">How do you measure your engineering team?</h3><p>Maybe more than anything else, the meat of being an effective EM is a healthy obsession with metrics &#x2013; leveraging data for measurements in all areas. For example, here are just a few areas that jump to mind:</p><ol><li>Establishing quantitative criteria for business objectives and the continued measurement of progress towards said goals, such as via KPIs or OKRs.</li><li>Service-level indicators to form service-level objectives, such as expected response times or cycle times across a range of percentiles, to know a service is performing as expected and without degradation. </li><li>Internal measurements correlated to team velocity, such as measured improvements around previously identified bottlenecks like code review, the time it takes to run a test suite or the number of specific incidents per week. </li><li>Weighted decision matrices for capturing backlog items, technical debt, code quality and design complexity. For example, quantifying how decoupled a module is from the rest of the system or how many dependencies a given module has. </li><li>Individual performance measurements to facilitate promotions and growth goals for direct reports. For example, being able to draw attention to an individual&apos;s contributions in crucial areas, the quality of the contributions, and the correlated impact on the business. </li><li>Capacity estimates that influence architectural and infrastructure decisions. 
For example, being able to accurately estimate the load and utilization of a system so that it&apos;s used optimally under regular conditions while also scaling to peak conditions. </li></ol><p>At a business objective level, I use the OKR framework to ensure that my team delivers the right value at the right time and is moving in the same direction as the organization. At this level, I tend to use a mix of leading and lagging indicators. If a team is having difficulty hitting leading indicators, the lagging indicators will be even more challenging. More importantly, leading indicators enable faster iteration.</p><p>I work more tactically at the team level, employing a &quot;theory of constraints&quot; approach. Instead of generic engineering metrics (number of commits, DORA metrics), I look for bottlenecks, then figure out a metric that correlates with unblocking, and work toward that. For example, in my previous role, a team was bottlenecked on incidents causing rework and context switching. I identified &quot;incidents per month&quot; as a metric and strategized with the team for solutions, considering process change, architecture change, and additional testing. Later, the bottleneck moved to code reviews, and the focus became reducing the cycle time on a code review.</p><p>Throughout my experience, I&apos;ve found that using a mix of high-level and low-level objectives, and a mix of leading and lagging indicators, enables a predictability in software management that I really enjoy.</p>]]></content:encoded></item><item><title><![CDATA[The Path to Observability]]></title><description><![CDATA[These are notes that I've found interesting from Part 1 of Observability Engineering published by O'Reilly. 
I'm reading it as part of the Honeycomb-hosted book club.]]></description><link>https://codydjango.com/observability-engineering/</link><guid isPermaLink="false">648b49329e85930544c4a38b</guid><category><![CDATA[software testing]]></category><dc:creator><![CDATA[Cody Redmond]]></dc:creator><pubDate>Fri, 16 Jun 2023 01:26:36 GMT</pubDate><content:encoded><![CDATA[<p><em>This covers Part 1 of Observability Engineering: Achieving Production Excellence.</em></p><p>Notes that I&apos;ve found interesting from the first section of the book published by O&apos;Reilly. I&apos;m reading it as part of the <a href="https://www.honeycomb.io/?ref=codydjango.com">Honeycomb</a>-hosted <a href="https://info.honeycomb.io/observability-engineering-book-club-2023?ref=codydjango.com">book club</a>.</p><h3 id="what-s-a-metric">What&apos;s a metric?</h3><p>A <em>metric</em> was introduced in 1988 as the foundational substrate of monitoring. It&apos;s a single number with optional tags for grouping and searching. They are cheap and straightforward, enabling tooling and optimizations around collecting, storing, shipping, and analyzing. They are easy to aggregate in time-series buckets.</p><h3 id="what-s-observability">What&apos;s Observability?</h3><p>Originally coined in 1960, it was a characterization to describe mathematical control systems, defined as a measure of how well the internal states of a system can be inferred from knowledge of its external outputs. 
Extrapolate that to today, and the idea is that if structured events are emitted from throughout the system, they can be dynamically explored with granular control to give a much faster and more accurate diagnosis of the system state.</p><p>The book presents observability in software systems as the ability to &quot;<em>understand any system state your application may have gotten itself into, even new states you couldn&apos;t have predicted, without shipping custom code to handle it.</em>&quot;</p><p>The pillars of observability are described as:</p><ul><li>Structured events</li><li>Hypothesis-driven debugging</li><li>Tooling that supports high cardinality, high dimensionality, and explorability</li></ul><h3 id="how-does-it-differ-from-monitoring">How does it differ from Monitoring?</h3><p>Monitoring has been conventionally expressed as using logs, metrics, and traces to approximate overall system health. The telemetry is set up ahead of time based on assumptions of how a well-understood system will operate. As such, monitoring is well-suited for well-understood and less volatile systems, such as infrastructure-level analysis. </p><p>But with shifts towards continuous delivery and cloud native practices, software grows in complexity. New software systems can have a varied and emergent state space, contingent on invariable factors. Traditional monitoring and its assumptions might no longer be the best tool for the job.</p><p>Debugging with monitoring requires certain assumptions, and getting value from metrics requires a level of knowledge of the system and how it is supposed to function. The upper limit on effective troubleshooting is constrained by your ability to pre-declare conditions that describe what you might be looking for. Relying on intuition requires extensive experience.</p><p>Monitoring is great for infrastructure, Kafka and messaging queues. These are &quot;known territory&quot;, where you know what to look for. 
Observability is ideal for &quot;unknown territory&quot;, where you might not know what to look for. It&apos;s hard to find something when you don&apos;t know what to look for.</p><h3 id="what-does-observability-enable">What does Observability enable?</h3><p>Observability through high-cardinality events enables an exploration of the &quot;state space of system behavior&quot; as a manner of investigation. This changes the classic troubleshooting workflow from a reactive model that is heavily reliant on institutional data, experience, and intuition, to a proactive model that rewards curiosity. </p><p>It&apos;s basically a &quot;back-foot vs front-foot&quot; distinction. Troubleshooting with monitoring is reactive, and requires work up front to set the conditions. Observability enables investigation in the moment, to explore conditions with existing data.</p><h3 id="how-does-it-work">How does it work?</h3><p>Structured events are collections of key-value pairs, of arbitrary length. The data should be high-cardinality, because that is most useful for pinpointing data while debugging a system. For example, you will benefit from being able to tie data back to users, timeframes, nodes, processes, batches, etc. </p><p>Ideally the events are &quot;wide&quot; enough to carry all significant context variables that could influence the state space of system behavior. This can mean hundreds or even thousands of key-value pairs, enabling drilling down on any combination of those keys.</p><hr><p>Next Book Club meeting is scheduled for June 16th and will cover Part II: Fundamentals of Observability.</p>]]></content:encoded></item><item><title><![CDATA[Business books that are also enjoyable]]></title><description><![CDATA[<p>A friend is transitioning from the service industry to project management. She&apos;s looking to get into tech, and even though she&apos;s very tech-minded, she feels insecure about her lack of experience in the business. 
I offered to put together a list of books that would be valuable</p>]]></description><link>https://codydjango.com/my-favorite-business-books/</link><guid isPermaLink="false">648617499e85930544c4a294</guid><category><![CDATA[books]]></category><dc:creator><![CDATA[Cody Redmond]]></dc:creator><pubDate>Sun, 11 Jun 2023 20:50:45 GMT</pubDate><content:encoded><![CDATA[<p>A friend is transitioning from the service industry to project management. She&apos;s looking to get into tech, and even though she&apos;s very tech-minded, she feels insecure about her lack of experience in the business. I offered to put together a list of books that would be valuable in introducing common business situations and concepts. The only criterion is that I find the books readable, or even enjoyable. </p><ol><li><strong><a href="https://www.amazon.ca/Goal-Process-Ongoing-Improvement?ref=codydjango.com">The Goal</a>: A Process of Ongoing Improvement</strong><br>Eliyahu M. Goldratt<br><em>The classic pulpy novel about a manufacturing plant in crisis that introduces the &quot;Theory Of Constraints&quot; is also a great introduction to business in general, with a plethora of principles, stereotypes, and insights still relevant today.</em></li><li><strong><a href="https://www.amazon.ca/Phoenix-Project-DevOps-Helping-Business/dp/0988262592?ref=codydjango.com">The Phoenix Project</a>: A Novel about IT, DevOps, and Helping Your Business Win</strong><br>Gene Kim<br><em>The same winning formula as &quot;The Goal&quot;, but completely modernized for an organization headed towards a digital transformation; Introduction to DevOps and agile principles in action.</em></li><li><strong><a href="https://www.amazon.ca/Unicorn-Project-Developers-Disruption-Thriving-ebook/dp/B07QT9QR41?ref=codydjango.com">The Unicorn Project</a>: A Novel about Developers, Digital Disruption, and Thriving in the Age of Data</strong><br>Gene Kim<br><em>A follow-up to the Phoenix Project with emphasis on Developers and the role of 
Developer Experience in organizations achieving product success.</em></li><li><strong><a href="https://www.amazon.ca/Measure-What-Matters/dp/B07BMGX7W2?ref=codydjango.com">Measure What Matters</a>: How Google, Bono, and the Gates Foundation Rock the World with OKRs</strong><br>John Doerr<br><em>How to choose measurable objectives that drive behaviour and result in outcomes; How this system scales across an organization to ensure all teams are working towards the same goal.</em></li><li><strong><a href="https://www.amazon.ca/Drive-Surprising-Truth-About-Motivates/dp/1594484805?ref=codydjango.com">Drive</a>: The Surprising Truth About What Motivates Us</strong><br>Daniel H. Pink<br><em>How to attract and motivate knowledge workers through Purpose, Mastery and Autonomy.</em></li><li><strong><a href="https://www.amazon.ca/Five-Dysfunctions-Team-Leadership-Fable?ref=codydjango.com">The Five Dysfunctions of a Team</a>: A Leadership Fable</strong><br>Patrick Lencioni<br><em>People issues are messy; what to look for and what can be done.</em></li><li><strong><a href="https://www.jimcollins.com/books.html?ref=codydjango.com">Good To Great</a>: Why Some Companies Make the Leap... 
and Others Don&apos;t.</strong><br>Jim Collins<br><em>Where to focus, where to invest, and what to avoid; The importance of disciplined thought and behaviour.</em></li><li><strong><a href="https://teamtopologies.com/?ref=codydjango.com">Team Topologies</a>: Organizing Business and Technology Teams for Fast Flow</strong><br>Matthew Skelton, Manuel Pais<br><em>What sociology and science tell us are the most important factors for high-functioning teams in areas of high complexity; How to organize teams to achieve goals while adapting to changing conditions; If architecture of the system and architecture of the organization are at odds, the architecture of the organization wins.</em></li><li><strong><a href="https://www.amazon.ca/Sooner-Safer-Happier-Patterns-Antipatterns/dp/1942788916?ref=codydjango.com">Sooner Safer Happier</a>: Antipatterns and Patterns for Business Agility</strong><br>Jonathan Smart<br><em>Flow efficiency, Lead time and Throughput; Identifying types of work with Cynefin framework; Leveraging Lean and Agile when it makes sense.</em></li><li><strong><a href="https://senseandrespond.co/?ref=codydjango.com">Sense &amp; Respond</a>: How Successful Organizations Listen to Customers and Create New Products Continuously.<br></strong>Jeff Gothelf &amp; Josh Seiden<br><em>Introduction to concepts of Lean, Agile, and a Test-And-Learn approach; How a Sense &amp; Respond model can be used throughout an organization.</em></li></ol><h2 id="notable-mentions">Notable Mentions</h2><ol><li><strong><a href="https://www.amazon.ca/Peopleware-Productive-Projects-and-Teams/dp/B09WDTVQ59?ref=codydjango.com">Peopleware</a>: Productive Projects and Teams</strong><br>Tom DeMarco<br>Another classic book that by today&apos;s standards might appear a little less relevant than it did when it was published. Many scenarios may not appear directly applicable to modern situations. But I still love this book. 
The principles still stand today, and I enjoy it from the historical perspective, too.</li></ol>]]></content:encoded></item></channel></rss>