-
Notifications
You must be signed in to change notification settings - Fork 22
/
Copy pathmanipleservices.html
579 lines (514 loc) · 27.2 KB
/
manipleservices.html
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
<html>
<meta content="width=device-width, initial-scale=1.0" name="viewport">
<head>
<title>
leontrolski - Manipleservices
</title>
<style>
body {margin: 5% auto; background: #fff7f7; color: #444444; font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", Roboto, Helvetica, Arial, sans-serif; font-size: 16px; line-height: 1.8; max-width: 50rem;margin-left:30%;}
@media screen and (max-width: 800px) {body {font-size: 14px; line-height: 1.4; max-width: 90%;margin-left:5%;}}
#left{min-width:10%;position:fixed;top:0;left:0;padding:2rem;background:whitesmoke;height:100vh}
@media screen and (max-width: 1000px) {#left{display:none;}}
pre {width: 100%;background-color:white;line-height:1.3rem;padding:1rem;}
a {border-bottom: 1px solid #444444; color: #444444; text-decoration: none; text-shadow: 0 1px 0 #ffffff; }
a:hover {border-bottom: 1px solid #44444400;}
.inline {background: #b3b2b226; padding-left: 0.3em; padding-right: 0.3em; white-space: nowrap;}
.georgian {font-family: Georgia, serif;}
blockquote {font-style: italic;color:black;background-color:#f2f2f2;padding:2em;}
@media screen and (max-width: 800px) {blockquote {padding:1em;}}
details {border-bottom:solid 5px gray;}
th {text-align: left; border-right: solid 1px gray;border-bottom: solid 1px gray;padding: 0.5rem;}
td {border-right: solid 1px gray; padding: 0.5rem;}
.table-container{overflow-x:scroll;background:white;}
table {border-collapse: collapse;}
li {line-height: 1.4rem;}
.link-1 {display:table;font-size:1rem;margin-bottom:0.5rem;margin-top:0.5rem;}
.link-2 {display:table;font-size:0.8rem;margin-bottom:0.2rem;margin-left:2rem;}
</style>
<link href="https://unpkg.com/[email protected]/themes/prism-vs.css" rel="stylesheet">
<script src="https://cdnjs.cloudflare.com/ajax/libs/prism/1.20.0/components/prism-core.min.js">
</script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/prism/1.20.0/plugins/autoloader/prism-autoloader.min.js">
</script>
</head>
<body>
<div id="left">
<a class="link-1" href="#intro">Intro</a>
<a class="link-1" href="#aims">Aims</a>
<a class="link-1" href="#methods">Methods</a>
<a class="link-2" href="#splitting-into-components">Splitting Into Components</a>
<a class="link-2" href="#dependency-specification">Dependency Specification</a>
<a class="link-2" href="#communication">Communication</a>
<a class="link-2" href="#deployment-independence">Deployment Independence</a>
<a class="link-2" href="#version-independence">Version Independence</a>
<a class="link-2" href="#separate-repos">Separate Repos</a>
<a class="link-2" href="#language-independence">Language Independence</a>
<a class="link-2" href="#database-independence">Database Independence</a>
<a class="link-1" href="#proposal">Proposal</a>
</div>
<a href="index.html">
<img src="pic.png" style="height:2em">
⇦
</a>
<p><i>2023-05-24</i></p>
<h1>Manipleservices</h1>
<p class="georgian">Maniple: <em>a subdivision of a Roman legion</em>. From manipulus:
<em>bundle, handful.</em>
</p>
<h1 id="intro">Intro</h1>
<p>Discussions of system design often focus on one of two
architectures:</p>
<ul>
<li>
<p><strong>Trad Monolith</strong> - one service, handling entire
requests vertically via function calls.</p>
</li>
<li>
<p><strong>Modern Classic Microservices</strong> - many services,
running on independent machines (or VMs or Pods), talking over the
network (usually REST or some message bus).</p>
</li>
</ul>
<p>
This neat lumping together of ideas (think left-wing/right-wing) isn't particularly
helpful when it comes to solving our problems effectively - we're free to pick
and choose ideas from each approach.
</p>
<p>With that in mind, this document aims to decompose these approaches into
(somewhat) orthogonal:</p>
<ul>
<li>
<p><strong>Aims</strong> - what are the actual development/production
aims we're trying to achieve?</p>
</li>
<li>
<p><strong>Methods</strong> - How to structure the code? How should
components of the system communicate? etc.</p>
</li>
</ul>
<p>It then <a href="#proposal">proposes</a> a third approach - <strong>Manipleservices</strong>
- with the target of maximising our aims,
in commonly encountered software environments, given our available methods.
There is an accompanying skeleton
<a href="https://github.com/leontrolski/manipleservices/tree/main#manipleservices">Python + CircleCI repo</a> -
variations on the ideas in this repo can be applied to a wide variety of codebases.
The main takeaway from this document is just a restatement of YAGNI:
</p>
<blockquote class="georgian">
Start your architecture as simply as possible. As your system grows, work out what <b>aims</b> you have,
and pick <b>methods</b> that best reach those aims. Don't choose methods based on ideologies,
that solve problems you may never have, often at significant cost.
</blockquote>
<p><em><small>To be more generic, service-like units of the system will be
referred to as "components" instead of "services".</small></em></p>
<h1 id="aims"><strong>Aims</strong></h1>
<p>Regardless of the methods we choose, we tend to hold the same aims.
Some of them end up being conflicting in reality, and their
costs/benefits need to be weighed up against one another.</p>
<ul>
<li>
<p><strong>Reliability</strong> - The system should keep working for the
end user.</p>
</li>
<li>
<p><strong>Database</strong></p>
<ul>
<li>
<p><strong>Isolation</strong> - Frequent, difficult-to-avoid incidents
are often caused by schema migrations, query planner changes etc. We
would like these to be isolated.</p>
</li>
<li>
<p><strong>Performance</strong> (assuming a conventional db) - After
we've reached the limits of vertical scaling, we would like to be able
to split the data to achieve some level of horizontal scaling. For most
companies, application-code performance is irrelevant as apps scale
easily and they're cheap to run.</p>
</li>
<li>
<p><strong>Consistency</strong> - ACID-ier the better.</p>
</li>
</ul>
</li>
<li>
<p><strong>Performance</strong> - The system should run quickly for the end user,
without being too pricey for the company running it.</p>
</li>
<li>
<p><strong>CI</strong> - The CI should run quickly, both running tests
and deploying components.</p>
</li>
<li>
<p><strong>Developer Ergonomics</strong> - Developers would like to be
able to <a href="https://leontrolski.github.io/cmd-click-manifesto.html">CMD-click things</a>, they'd like their tests to run fast, they'd
like to have confidence their one-line change doesn't break something
miles away, they'd like well-typed data to work with.</p>
</li>
<li>
<p><strong>Testing</strong> - Testing any slice of the system - from
unit tests through to cross component tests - should be easy.</p>
</li>
<li>
<p><strong>Organisational</strong> - Management want to have specific teams responsible for
specific bits of the code, both during development and running in prod.</p>
</li>
<li>
<p><strong>Debugging</strong> - It should be easy to isolate the cause
of production problems and recreate them locally. If something starts
breaking, it should be easy to say - "change X caused this to start
breaking". It would be nice to be able to run two versions of a component in production to
see which one doesn't work.</p>
</li>
</ul>
<br>
<h1 id="methods"><strong>Methods</strong></h1>
<h2 id="splitting-into-components">Splitting Into Components</h2>
<p>Splitting into more small components (as opposed to fewer large
components) simply magnifies the effects of the other methods below.</p>
<h2 id="dependency-specification">Dependency Specification</h2>
<p>Monoliths tend to very <em>loosely</em> specify "this code depends on
this other code", often via language-level imports. In the wild,
microservices also tend to loosely specify dependencies by enumerating
service URLs in some config file.</p>
<p>An explicit dependency graph has many advantages, it can be achieved
with:</p>
<ul>
<li>
<p>Language-level packages (eg. Installable Python packages + lists of
requirements).</p>
</li>
<li>
<p>File-level build system (eg. <a href="https://bazel.build/">Bazel</a>).</p>
</li>
</ul>
<p>An explicit dependency graph enables:</p>
<ul>
<li>
<p>Quicker <strong>CI</strong> - <a href="https://circleci.com/docs/api/v2/index.html#operation/continuePipeline">only build/test/deploy</a> things that
changed/that's dependencies changed.</p>
</li>
<li>
<p>More comprehensive <strong>testing</strong> - we know if we change
component A, we should run the integration tests covering {A, B, C}.</p>
</li>
<li>
<p>Better <strong>developer ergonomics</strong> - tests run quicker, the
section of system to bear in mind while working on a task is
reduced.</p>
</li>
</ul>
<p><strong>Note</strong>: the <strong>aims</strong> achieved by using this method are
<em>mostly</em> <em>independent</em> from those achieved using the other
methods.
</p>
<h2 id="communication">Communication</h2>
<p>The most common ways components communicate are:</p>
<ul>
<li>
Language-level function calls.
</li>
<li>
REST.
</li>
<li>
Message buses.
</li>
<li>
RPC frameworks.
</li>
</ul>
<p>Using function calls means it's difficult to achieve
<strong>deployment independence</strong> (see below), however there are
numerous downsides to picking any of the other options in regards to
<strong>developer ergonomics</strong>:
</p>
<ul>
<li>
<p>Typing tends to be rubbish without significant investment in wacky
(often codegen-based) tooling - see gRPC and friends.</p>
</li>
<li>
<p>CMD-clickability goes out the window, back to grep.</p>
</li>
</ul>
<p>There are also various performance costs to constantly serialising
and flinging data over the wire.</p>
<p>Using message buses introduces further <strong>deployment
independence</strong> - if one service is down, the caller can still
give it work to do later. However there are often large
<strong>debugging</strong> costs - where did that message
originally come from?
</p>
<h2 id="deployment-independence">Deployment Independence</h2>
<p>The ability to deploy different components to
different types of machines, and also to scale parts of the system
independently we refer to as deployment independence. Deployment independence
often comes together with version independence (see below), but it's possible
to have one without the other.
</p>
<p>
Having things deployed separately can help <strong>organisationally</strong>,
for example, different teams can be billed according to their resource usage.
Occasionally, different types of workers might be more suitable for different types of
workloads, for example, there might be some <strong>performance</strong> reason to serve user
requests by some magic cloud worker near to the user.
</p>
<p>
The case that deployment independence in and of itself enables scaling is somewhat overstated.
Consider an inbound request that hits component <code>A</code>, this in turn talks to component <code>B</code>.
If <code>B</code> takes 98% of the CPU, there's a temptation to make <code>B</code> a separately deployed component,
this means you can scale up the number of machines accordingly right? This only acheives anything meaningful where the work done by <code>B</code>
is parallelisable <em>for each original call to <code>A</code></em>, if this is not the case, you may as well deploy
<code>A+B</code> on every machine and scale them together.
</p>
<h2 id="version-independence">Version Independence</h2>
<p>
This is the ability to have different versions of components running in prod together.
</p>
</p>
An important use case is the ability to eg: deploy v1.3.7 of your payments service
alongside v1.3.6, directing 10% of traffic to the former and 90% of traffic to the latter.
This should lead to increased <strong>reliability</strong> as you can test new software
for a subset of users and automatically roll back if some threshold of errors is crossed.
</p>
<p>
If you wish to test pre-prod, there are significant <strong>testing</strong> tooling
problems to overcome - are you going to test against the cross-product of
possible interacting services? Some small <strong>developer ergonomics</strong>
friction is introduced - if I CMD-click, do I hit the version of the function
that is deployed and causing bugs? Similar pain is introduced whilst <strong>debugging</strong>.
Also, there is often significant tooling overhead introduced implementing versioning and deploying versions
of packages (see the example below).
</p>
<h2 id="separate-repos">Separate Repos</h2>
<p>Choosing many-repos vs monorepo mostly impacts <strong>developer
ergonomics</strong> and <strong>CI</strong>.</p>
<p>One downside of many-repos is duplication of component level
boilerplate (think CircleCI config etc). The flipside with monorepos is
that tooling like Github/CircleCI can get a bit overwhelming, it should be
possible to overcome this with tooling though.</p>
<p>
The bigger downside with many-repos is that it <strong id="rot">induces code rot</strong>. In a monorepo,
if there is a small cleanup to do with a bit of library code, you:
<ol>
<li>Make the fix.</li>
<li>Run linters/tests.</li>
<li>Fix everything downstream.</li>
<li>Repeat steps 2-3.</li>
</ol>
With many-repos, you:
<ol>
<li>Make the fix.</li>
<li>Upversion the library's package.</li>
<li>Deploy the package to some repository.</li>
<li>Work out which downstream things depend on the library.</li>
<li>For each downstream thing:
<ul>
<li>Upversion the dependency.</li>
<li>Work out how to run the linters/tests.</li>
<li>Run the linters/tests.</li>
</ul>
</li>
<li>Realise you made a small error in the original fix.</li>
<li>Repeat steps 1-6.</li>
</ol>
What tends to happen is people either make the fix and ignore steps 2 onwards, thereby
making people downstream wary of updating. <strong>Or they just don't make the fix</strong>.
At Google, people making wide-reaching changes are responsible for cleaning up at least
80% of the mess they make - this is the way it should be.
</p>
<p>In theory, the decision to have one/many-repos should be independent
of <strong>version independence</strong>, in practice, one tends to
follow the other.</p>
<blockquote>Aside:<em> It should at least be at least feasible to not have one
follow the other, consider these scenarios:</em>
<ul>
<li>
<p><em>Many-repos where one repo depends <strong>always</strong> on the HEAD of another.</em></p>
</li>
<li>
<p><em>A single repo, where one Python package depends on another
package <strong>at a different commit within the same repo</strong>.</em></p>
</li>
</ul>
It's also possible to imagine some funky setup with one master repo (that's maybe in charge
of deployment and integration testing) that contains one <a href="https://git-scm.com/book/en/v2/Git-Tools-Submodules">Git submodule</a>
for each component.
</blockquote>
<h2 id="language-independence">Language Independence</h2>
<p>Choosing to have a mix of languages impacts the feasibility of
various other methods:</p>
<ul>
<li>
<p><strong>Communication</strong> - no more language-level function
calls. <p><em>(Aside: How difficult to implement is eg. a Python call to a Node
process to server-side render some React be?)</em></p></p>
</li>
<li>
<p><strong>Dependency Specification</strong> - no more utilising
language-level package tooling.</p>
</li>
</ul>
<p>In practice, there will often be eg. Python and TypeScript living
alongside each other. It might be advantageous to mix and match
different kinds of components and methods in this case.</p>
<h2 id="database-independence">Database Independence</h2>
<p>Sometimes a component has many databases (maybe a separate time
series db for a particular domain). The consensus is that the opposite - having many
components sharing one database - is a no-go, in this case the coupling is such that you
only <em>truly</em> have one component.
</p>
<p>More databases means:</p>
<ul>
<li>
<p>More <strong>isolation</strong>.</p>
</li>
<li>
<p>The potential for better <strong>performance</strong>.</p>
</li>
<li>
<p>The potential for worse <strong>performance</strong> by being forced
into doing cross-database JOINs at the application level.</p>
</li>
<li>
<p>Worse <strong>consistency</strong> - cross database transactions are
hard to do.</p>
</li>
<li>
<p>Worse <strong>debugging</strong>, the number of things you can JOIN
across to debug is reduced. This is somewhat surmountable with
data-lake-y things.</p>
</li>
</ul>
<h1 id="proposal">Proposal</h1>
<p><em><a href="https://github.com/leontrolski/manipleservices/tree/main#manipleservices">Example Python + CircleCI repo</a></em>.</p>
<p>
Manipleservices is a set of methods picked to maximise our aims within commonly seen, medium
sized codebases (that is to say, almost all codebases). The following table summarises the choices made,
comparing them to the two dominating approaches:
</p>
<div class="table-container">
<table>
<tr>
<th>Method</th>
<th>Trad Monolith</th>
<th>Modern Classic Microservices</th>
<th>Manipleservices</th>
</tr>
<tr>
<td><strong>Splitting Into Components</strong></td>
<td>One component</td>
<td>Many components</td>
<td>Handful of components</td>
</tr>
<tr>
<td><strong>Dependency Specification</strong></td>
<td>Loose via imports</td>
<td>Not really</td>
<td>Explicit at the language's package level (eg. in <code>pyproject.toml</code>s)</td>
</tr>
<tr>
<td><strong>Communication</strong></td>
<td>Function calls</td>
<td>REST</td>
<td>Mostly function calls</td>
</tr>
<tr>
<td><strong>Deployment Independence</strong></td>
<td>None</td>
<td>Loads</td>
<td>Nope</td>
</tr>
<tr>
<td><strong>Version Independence</strong></td>
<td>Nope</td>
<td>Yup</td>
<td>Mostly not</td>
</tr>
<tr>
<td><strong>Separate Repos</strong></td>
<td>Nope</td>
<td>Yup</td>
<td>Nope</td>
</tr>
<tr>
<td><strong>Language Independence</strong></td>
<td>Not really</td>
<td>Yup</td>
<td>Mostly not</td>
</tr>
<tr>
<td><strong>Database Independence</strong></td>
<td>One DB</td>
<td>One DB per component</td>
<td>One or many DBs per component</td>
</tr>
</table>
</div>
<p>Manipleservices adopt further deployment
independence/communication methods <strong>only when there
are concrete reasons to do so</strong>.</p>
<p>
Why did we pick these methods?
<ul>
<li><p><strong>Splitting Into Components</strong> - We pick a handful - it's all in the name, a maniple.</p></li>
<li><p><strong>Dependency Specification</strong> - We're explicit so we can always test/deploy everything that we need to, but nothing more.</p></li>
<li><p><strong>Communication</strong> - Until there's some RPC library with excellent typing and good CMD-clickability, function calls are a vastly superior dev experience. The benefits from sending things over wire aren't big enough for most medium codebases.</p></li>
<li><p><strong>Deployment Independence</strong> - The benefits of doing this are only really present with very large codebases/organisations. It's worth restating, that this is largely orthogonal to version independence.</p></li>
<li><p><strong>Version Independence</strong> - This might be the first thing you'd change when looking from a reliability perspective, being able to A-B test core features is good. However, until you actually see incidents where a root fix would have been to do this, the complexity introduced in testing and developer tooling probably isn't worth it.</p></li>
<li><p><strong>Separate Repos</strong> - The Github (or equivalent) tooling needs to improve for monorepos, getting the CI right is tricky, but code <a href="#rot">enters long term decline</a> in many-repos.</p></li>
<li><p><strong>Language Independence</strong> - There don't seem to be many projects that try get cross-language typing/in process communication working well. For this reason, combined with the choice of function-calls-by-default, we try keep the number of languages in use low.</p></li>
<li><p><strong>Database Independence</strong> - A handful of DBs makes it largely possible to avoid cross-component N+1 patterns, while still letting us tune specific databases for specific performance characteristics.</p></li>
</ul>
</p>
<h2><a href="https://github.com/leontrolski/manipleservices/tree/main#manipleservices">Manipleservices in Python</a></h2>
<ul>
<li><p>Components are installable packages. (See also <a href="https://blogs.newardassociates.com/blog/2023/you-want-modules-not-microservices.html">You Want Modules, Not Microservices</a>).</p></li>
<li><p>A dependency tree of the entire system is encoded in the <code>pyproject.toml</code> files via <code>-e</code> installs.</p></li>
<li><p>CI builds continuation jobs based on which component(s) change and their dependencies.</p></li>
<li><p>We only call functions from <code><other-package>.api</code>, this is by convention, but should probably be linted (and linted such that only "dumb data" can cross the interface).</p></li>
<li><p>Note that we don't import any types from <code><other-package></code>, we make our own types.
<ul>
<li><p>When we call <code><other-package>.api.f(a)</code>, <code>a</code> only has to conform to a <code>Protocol</code>, not a nominal type.</p></li>
<li><p>In the return type from <code><other-package>.api(...) -></code>, we can just choose which bits we need, we don't have to conform to the whole return type.</p></li>
<li><p><code>mypy</code> will check that everything lines up.</p></li>
</ul>
</p></li>
<li><p>The aim of the above, is that if we for some reason wanted to switch inter-package communication from function calls to eg. REST, it should be trivial.</p></li>
<li><p>In <code>.circleci/continue.py</code>, <code>changed_packages(...)</code> could be any function that returns a list of packages that you think have changed. Ditto with <code>add_dependant_packages(...)</code> - this would normally add any packages that depend on the ones that have changed.</p></li>
<li><p>As well as packages that are "service-like", you'd have eg: <code>packages/shop_integration_tests</code> that might depend on <code>account</code> and <code>shop</code> and run tests that depend on both of them.</p></li>
<li><p>To complement the above, we might want that by convention, eg: <code>shop</code> would monkeypatch any methods of <code>account.api</code> that it uses during testing.</p></li>
</ul>
<h3>File layout</h3>
<pre>├── README.md
└── packages
├── account
│ ├── pyproject.toml
│ ├── src
│ │ └── account
│ │ ├── __init__.py
│ │ ├── api
│ │ │ ├── get.py
│ │ │ └── post.py
│ │ ├── db.py
│ │ ├── py.typed
│ │ └── users.py
│ └── tests
│ └── test_user.py
└── shop
├── pyproject.toml
├── src
│ └── shop
│ ├── __init__.py
│ ├── api
│ │ ├── get.py
│ │ └── post.py
│ ├── basket.py
│ ├── client
│ │ └── account.py
│ └── py.typed
└── tests
└── test_basket.py
</pre>
</body>
</html>