Conway’s Law was first articulated in 1967 by computer programmer Melvin Conway. Examining it reveals a radical insight into the development of complex programmatic systems:
“organizations which design systems … are constrained to produce designs which are copies of the communication structures of these organizations.”
In 2011, Harvard Business School published a paper: Exploring the Duality between Product and Organizational Architectures: A Test of the “Mirroring” Hypothesis. The collaborative work, conducted by MacCormack, Rusnak and Baldwin, studied what they called “the mirroring hypothesis”, or – as we know it – “Conway’s Law”. To gather data, the team analyzed software patterns to better understand the relationship between people and the systems they create.
During their analysis they separated organizations into two groups: tightly coupled and loosely coupled.
A tightly coupled organization is a typical corporate software development environment: employees work under the same roof and report up a chain of managers within an intentional, organized structure. Its communication is structured and bends to the physical confines of the workplace.
A loosely coupled organization is best exemplified by an open source project. Communication occurs between random contributors. It is unstructured, unmanaged and the team is often distributed. There is minimal – if any – social fabric.
Table from “A Test of the ‘Mirroring’ Hypothesis”
| | Tightly Coupled | Loosely Coupled |
| --- | --- | --- |
| Goals | Shared, Explicit | Diverse, Implicit |
| Membership | Closed, Contracted | Open, Voluntary |
| Authority | Formal, Hierarchical | Informal, Meritocratic |
| Location | Centralized, Colocated | Decentralized, Distributed |
| Behaviour | Planned, Coordinated | Emergent, Independent |
What did they learn? To frame their findings, they coined the term Propagation Cost.
They defined it as:
“the percentage of system elements that can be affected, on average, when a change is made to a randomly chosen element.”
A function or file has a propagation cost, which indicates the overall percentage of the system that is impacted when it is altered. In other words, if you change a file, how many other files are impacted by your change? Did your change break a bunch of stuff? If so, that is a high propagation cost.
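To make the idea concrete, here is a minimal Python sketch. The file names and dependency graph are invented for illustration, and the calculation is a simplified reading of the definition (the paper derives propagation cost from a visibility matrix, but the intuition is the same): for each file, count how much of the system is transitively impacted when it changes, then average.

```python
# Hypothetical dependency graph: deps[a] lists the files that depend on a,
# i.e. the files impacted if a changes. Names are invented for illustration.
deps = {
    "util.py": ["db.py", "api.py"],
    "db.py":   ["api.py"],
    "api.py":  ["app.py"],
    "app.py":  [],
}

def impacted(start, graph):
    """All files transitively impacted by a change to `start`."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

n = len(deps)
# Average fraction of the system impacted by a change to a random file.
propagation_cost = sum(len(impacted(f, deps)) for f in deps) / (n * n)
print(f"{propagation_cost:.0%}")
```

Here a change to `util.py` ripples into three of the four files, while a change to `app.py` touches nothing else, so the averaged cost sits in between.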
By comparing open source projects to projects that were first private, and then made public, the paper established a clear link: tightly coupled organizations build systems with significantly higher propagation costs, while loosely coupled organizations build systems with much lower propagation costs.
Why? The reason seems to be that contributors to loosely coupled organizations do not share any existing social context, and therefore write more approachable code, so that anyone can make alterations to the system with minimal context. Open source authors know that their code will be seen by the broad public, and just as a writer cleans up their language for a broad or international audience, the programmer cleans up their code. It is a function of applied empathy for the reader, or the next contributor.
And why do tightly coupled teams produce more tightly coupled programmatic systems? Imagine you and your peers gathered around one computer solving a problem and “co-authoring” a solution. Deep context is baked into the solution as you all think through it together, in the office or over lunch.
In a close-knit office environment, there is little motivation to factor accessibility into the code because the physical interaction makes the context apparent to more contributors. The primary author can also be relied upon to “be there” if clarity is ever needed. Now repeat this for the development of most of the system and we can imagine the impact.
The study implies partial proof of Conway’s Law: systems reflect the communication structures within which they are developed.
A second study, The Influence of Organizational Structure on Software Quality: An Empirical Case Study by Nagappan, Murphy and Basili of Microsoft, pushed the idea further. It revealed that Conway’s Law is bidirectional: poor code-level failure metrics also indicate poor organizational metrics:
“…organizational metrics when applied to data … were statistically significant predictors of failure-proneness.”
But what were these “organizational metrics” that they collected? And how did these metrics speak towards software failure proneness? Does this mean that code can be analyzed to see where organizations need to change?
The following table has been adapted from the article, with the descriptions expanded for clarity and context.
A low failure rate is good; a high failure rate is bad.
| Organizational Metric | Quality Indication |
| --- | --- |
| Number of engineers | The greater the number of engineers that touched the code, the higher the failure rate. |
| Number of ex-engineers | Teams that lost members had a decline in knowledge retention and therefore an increase in failure rate. |
| Edit frequency | The more edits to a component, the higher the instability, and thus the failure rate. |
| Depth of Master Ownership | Components “owned” by a “master” – a single person with 75% or more of the edits – had lower failure rates. |
| Percentage of Org contributing to development | The lower the overall percentage of the organization that contributed to development, the lower the failure rate. |
| Level of organizational code ownership | The more code contributors outside of the core working group, the higher the failure rate. |
| Overall Organization Ownership | The ratio of “masters” making edits to the code compared to the total number of engineers. The more the ratio favours “master” contribution, the lower the failure rate. |
| Organization Intersection Factor | The more groups contributing 10% or more of the edits to a component, the higher the failure rate. |
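To show how a metric like this can fall out of ordinary version-control data, here is a hedged Python sketch of the Depth of Master Ownership check. The edit history is invented, and this is only an illustration of the 75% threshold described above, not the study’s actual tooling.

```python
from collections import Counter

# Hypothetical edit history for one component: one entry per commit author.
edits = ["alice", "alice", "alice", "bob"]

counts = Counter(edits)
master, master_edits = counts.most_common(1)[0]
share = master_edits / len(edits)

# The study associates lower failure rates with components where a single
# "master" made 75% or more of the edits.
has_master = share >= 0.75
print(master, f"{share:.0%}", has_master)  # alice 75% True
```

Run over every component in a repository, a check like this would flag the components with no clear owner as candidates for higher failure-proneness.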
From the quality indications, an “ideal development team” is one that:
- is small
- has had no churn
- makes few edits
- has an owner of 75% contribution history or more
- does not need to work across an organization
- contains only one working group
After developing a means to determine what success and failure look like within a system, the team’s conclusion was:
“… organizational measures predict failure-proneness … with significant precision, recall and sensitivity.”
This is fascinating! And it pushes the proof of Conway’s Law even further: organizations can be analyzed to predict the quality of programmed work, and vice versa. But it is easy to “miss the forest for the trees” when drawing conclusions from other software systems, given the near-infinite nature of their potential complexity. And social dynamics are so rooted in chaos that one wonders about the repeatability and transferability of these studies to most environments. But this is science, and so we emerge with two key takeaways and consider them to be true:
1) Systems reflect the communication structures within which they are developed.
2) Organizations can be analyzed to predict the quality of programmed work and vice versa.
Or, in short, we agree that Conway’s Law is true.
The Reverse Conway
Awareness of Conway’s Law can help organizations establish healthy social structures as projects grow. Microservice architecture offers social scaling as a key benefit: architect your code into modular microservices first, and the optimal, modular organizational structure will follow. It is marketed as Conway’s Law in reverse: simpler, tidier systems with healthy communication create simpler, tidier teams with healthier communication.
It is a fair proposal, as when we think back to the “ideal team”, a microservice team will be small, have a “master” and have only one working group. But it can still be impacted by churn, make many updates and work across other organizational teams, perhaps even more often.
Yet the approach is still wise from a purely technical perspective, even with the social caveats. Compared to tightly coupled monolithic designs, loosely coupled microservice teams will produce systems that are more approachable and less failure-prone. But it cuts the other way: there will then be a mesh of independent pieces, each vulnerable to the drawbacks of poor communication between interdependent teams.
Ten microservice teams staffed by poor communicators will render a poor system in aggregate, no matter how competent the programmers and no matter how “API-like” they behave. It may be better than the chaotic, monolithic alternative, but it only scratches the surface of where you can generate the most improvement.
Communication is often called a “soft problem”. It is rooted in emotion, spirit, and chaos, and that is why weak teams try to engineer around it. On the surface, avoidance is natural, as there is evidence that open source projects operate well without a long-term plan, organizational structure, or any form of social interaction. It also appears as though organizations can “reverse Conway” their systems, using architectural changes to “fix” and “analyze” their people. But this is destructive.
Communication is not an engineering problem and it is not a “soft problem”. It is a human problem, the hardest problem, and the one that most teams lack the courage to face. A team optimizes for Conway’s Law not when it rearranges people around code like furniture, but when it creates an environment where each person is inspired and expected to improve their interpersonal communication. It does this when, instead of tooling over the problem or engineering around it, it goes through it. The technical brilliance of tomorrow will be born from those that learn to minimize ego and enhance their verbal, non-verbal, emotional and spiritual communication.
So how do you do it? Ask your teammates.