<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>Distributed Systems on Jahvon Dockery</title>
    <link>https://jahvon.dev/tags/distributed-systems/</link>
    <description>Recent content in Distributed Systems on Jahvon Dockery</description>
    <generator>Hugo -- 0.148.1</generator>
    <language>en</language>
    <lastBuildDate>Tue, 27 May 2025 00:00:00 +0000</lastBuildDate>
    <atom:link href="https://jahvon.dev/tags/distributed-systems/index.xml" rel="self" type="application/rss+xml" />
    <item>
      <title>Distributing Work with Go Concurrency</title>
      <link>https://jahvon.dev/notes/distributing-work/</link>
      <pubDate>Tue, 27 May 2025 00:00:00 +0000</pubDate>
      <guid>https://jahvon.dev/notes/distributing-work/</guid>
      <description>&lt;p&gt;A few months back, I worked through VictoriaMetrics&amp;rsquo; &lt;a href=&#34;https://victoriametrics.com/blog/go-sync-mutex/index.html&#34;&gt;Go concurrency series&lt;/a&gt; and wanted to get some practice. So I implemented a few work distribution patterns from distributed systems to see how those concurrency primitives translate.&lt;/p&gt;
&lt;p&gt;Work distribution is fundamental to building scalable systems - you need ways to spread processing across multiple components while coordinating the results. Go&amp;rsquo;s goroutines and channels map well to distributed system concepts - channels as service communication, goroutines as system components, WaitGroups for coordination. Here&amp;rsquo;s what I learned.&lt;/p&gt;</description>
      <content:encoded><![CDATA[<p>A few months back, I worked through VictoriaMetrics&rsquo; <a href="https://victoriametrics.com/blog/go-sync-mutex/index.html">Go concurrency series</a> and wanted to get some practice. So I implemented a few work distribution patterns from distributed systems to see how those concurrency primitives translate.</p>
<p>Work distribution is fundamental to building scalable systems - you need ways to spread processing across multiple components while coordinating the results. Go&rsquo;s goroutines and channels map well to distributed system concepts - channels as service communication, goroutines as system components, WaitGroups for coordination. Here&rsquo;s what I learned.</p>
<h2 id="producer-consumer-async-work-distribution">Producer-Consumer: Async Work Distribution</h2>
<p>Producers generate work and send it through channels while consumers process it asynchronously.</p>
<div class="highlight"><pre tabindex="0" style="color:#d6cbb4;background-color:#252b2e;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-go" data-lang="go"><span style="display:flex;"><span><span style="color:#e67e80">type</span> ConsumerResult <span style="color:#e67e80">struct</span> {
</span></span><span style="display:flex;"><span>    ConsumerID <span style="color:#dbbc7f">int</span>
</span></span><span style="display:flex;"><span>    Data       <span style="color:#dbbc7f">string</span>
</span></span><span style="display:flex;"><span>}
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#859289;font-style:italic">// start multiple consumers</span>
</span></span><span style="display:flex;"><span><span style="color:#e67e80">for</span> id <span style="color:#7a8478">:=</span> <span style="color:#d699b6">0</span>; id &lt; numConsumers; id<span style="color:#7a8478">++</span> {
</span></span><span style="display:flex;"><span>    wg.<span style="color:#b2c98f">Add</span>(<span style="color:#d699b6">1</span>)
</span></span><span style="display:flex;"><span>    <span style="color:#e67e80">go</span> <span style="color:#e67e80">func</span>(consumerID <span style="color:#dbbc7f">int</span>) {
</span></span><span style="display:flex;"><span>        <span style="color:#e67e80">defer</span> wg.<span style="color:#b2c98f">Done</span>()
</span></span><span style="display:flex;"><span>        <span style="color:#e67e80">for</span> msg <span style="color:#7a8478">:=</span> <span style="color:#e67e80">range</span> msgChan {
</span></span><span style="display:flex;"><span>            result <span style="color:#7a8478">:=</span> ConsumerResult{
</span></span><span style="display:flex;"><span>                ConsumerID: consumerID,
</span></span><span style="display:flex;"><span>                Data: fmt.<span style="color:#b2c98f">Sprintf</span>(<span style="color:#b2c98f">&#34;processed-%d&#34;</span>, msg),
</span></span><span style="display:flex;"><span>            }
</span></span><span style="display:flex;"><span>            resultChan <span style="color:#7a8478">&lt;-</span> result
</span></span><span style="display:flex;"><span>        }
</span></span><span style="display:flex;"><span>    }(id)
</span></span><span style="display:flex;"><span>}
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#859289;font-style:italic">// start a single producer that sends work into a channel</span>
</span></span><span style="display:flex;"><span><span style="color:#e67e80">go</span> <span style="color:#e67e80">func</span>() {
</span></span><span style="display:flex;"><span>    <span style="color:#e67e80">defer</span> <span style="color:#d699b6">close</span>(msgChan)
</span></span><span style="display:flex;"><span>    <span style="color:#e67e80">for</span> i <span style="color:#7a8478">:=</span> <span style="color:#d699b6">1</span>; i <span style="color:#7a8478">&lt;=</span> <span style="color:#d699b6">25</span>; i<span style="color:#7a8478">++</span> {
</span></span><span style="display:flex;"><span>        msgChan <span style="color:#7a8478">&lt;-</span> i
</span></span><span style="display:flex;"><span>    }
</span></span><span style="display:flex;"><span>}()
</span></span></code></pre></div><p>Buffered channels give you backpressure - once the buffer fills because consumers can&rsquo;t keep up, the producer blocks instead of queueing unbounded work in memory.</p>
<p>Use this pattern for event streaming, async processing, or decoupling generation speed from processing speed. It maps directly to Kafka or microservice event handling.</p>
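<p>The snippet above leaves out the channel setup and the shutdown sequence. Here is a minimal runnable sketch of the whole flow. The <code>runPipeline</code> helper, its parameters, and the buffer size are assumptions made for illustration and are not taken from the original code. The step that is easy to miss is closing <code>resultChan</code> only after every consumer has finished, so the collecting loop can end.</p>

```go
package main

import (
	"fmt"
	"sync"
)

// ConsumerResult mirrors the struct from the snippet above.
type ConsumerResult struct {
	ConsumerID int
	Data       string
}

// runPipeline is a hypothetical helper: one producer feeds numConsumers
// consumers through a buffered channel, and all results are collected.
func runPipeline(numConsumers, numMessages, buffer int) []ConsumerResult {
	msgChan := make(chan int, buffer)
	resultChan := make(chan ConsumerResult, buffer)
	var wg sync.WaitGroup

	// start multiple consumers
	for id := 0; id < numConsumers; id++ {
		wg.Add(1)
		go func(consumerID int) {
			defer wg.Done()
			for msg := range msgChan {
				resultChan <- ConsumerResult{
					ConsumerID: consumerID,
					Data:       fmt.Sprintf("processed-%d", msg),
				}
			}
		}(id)
	}

	// single producer; the send blocks whenever the buffer is full
	go func() {
		defer close(msgChan)
		for i := 1; i <= numMessages; i++ {
			msgChan <- i
		}
	}()

	// close resultChan once every consumer has exited,
	// so the range loop below terminates
	go func() {
		wg.Wait()
		close(resultChan)
	}()

	var results []ConsumerResult
	for r := range resultChan {
		results = append(results, r)
	}
	return results
}

func main() {
	results := runPipeline(3, 25, 5)
	fmt.Println("results collected:", len(results))
}
```

<p>Which consumer handles which message is not deterministic, so the results arrive in no particular order.</p>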
<h2 id="worker-pools-controlled-work-distribution">Worker Pools: Controlled Work Distribution</h2>
<p>Worker pools give you structure - a fixed number of workers pull from the same job queue. It&rsquo;s like running N service instances behind a load balancer.</p>
<div class="highlight"><pre tabindex="0" style="color:#d6cbb4;background-color:#252b2e;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-go" data-lang="go"><span style="display:flex;"><span><span style="color:#e67e80">type</span> PoolJob <span style="color:#e67e80">struct</span> {
</span></span><span style="display:flex;"><span>    ID   <span style="color:#dbbc7f">int</span>
</span></span><span style="display:flex;"><span>    Data <span style="color:#dbbc7f">string</span>
</span></span><span style="display:flex;"><span>}
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#859289;font-style:italic">// start a fixed number of workers</span>
</span></span><span style="display:flex;"><span><span style="color:#e67e80">for</span> i <span style="color:#7a8478">:=</span> <span style="color:#d699b6">0</span>; i &lt; numWorkers; i<span style="color:#7a8478">++</span> {
</span></span><span style="display:flex;"><span>    wg.<span style="color:#b2c98f">Add</span>(<span style="color:#d699b6">1</span>)
</span></span><span style="display:flex;"><span>    <span style="color:#e67e80">go</span> <span style="color:#e67e80">func</span>(workerID <span style="color:#dbbc7f">int</span>) {
</span></span><span style="display:flex;"><span>        <span style="color:#e67e80">defer</span> wg.<span style="color:#b2c98f">Done</span>()
</span></span><span style="display:flex;"><span>        <span style="color:#e67e80">for</span> job <span style="color:#7a8478">:=</span> <span style="color:#e67e80">range</span> jobChan {
</span></span><span style="display:flex;"><span>            <span style="color:#859289;font-style:italic">// do some work</span>
</span></span><span style="display:flex;"><span>            time.<span style="color:#b2c98f">Sleep</span>(<span style="color:#d699b6">100</span> <span style="color:#7a8478">*</span> time.Millisecond)
</span></span><span style="display:flex;"><span>            results <span style="color:#7a8478">&lt;-</span> PoolResult{
</span></span><span style="display:flex;"><span>                WorkerID: workerID,
</span></span><span style="display:flex;"><span>                JobID:    job.ID,
</span></span><span style="display:flex;"><span>                Value:    fmt.<span style="color:#b2c98f">Sprintf</span>(<span style="color:#b2c98f">&#34;processed-%s&#34;</span>, job.Data),
</span></span><span style="display:flex;"><span>            }
</span></span><span style="display:flex;"><span>        }
</span></span><span style="display:flex;"><span>    }(i)
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>The job channel acts like a load balancer - work goes to whichever worker is available.</p>
<p>This pattern is good for CPU-heavy tasks or when you need predictable resource usage. It&rsquo;s similar to scaling microservice instances for ingress traffic.</p>
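<p>To run the pool end to end you also need the result type, a job feeder, and a shutdown sequence. The sketch below is one way to wire that up. The <code>PoolResult</code> fields are inferred from how the snippet above uses them, and the <code>runPool</code> helper is a hypothetical name introduced here. It assumes <code>numWorkers</code> is at least 1.</p>

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

type PoolJob struct {
	ID   int
	Data string
}

// PoolResult's fields are inferred from the snippet above (assumption).
type PoolResult struct {
	WorkerID int
	JobID    int
	Value    string
}

// runPool is a hypothetical helper: numWorkers (>= 1) goroutines share one job channel.
func runPool(numWorkers int, jobs []PoolJob) []PoolResult {
	jobChan := make(chan PoolJob)
	// buffered to len(jobs) so workers never block when sending results
	results := make(chan PoolResult, len(jobs))
	var wg sync.WaitGroup

	// start a fixed number of workers
	for i := 0; i < numWorkers; i++ {
		wg.Add(1)
		go func(workerID int) {
			defer wg.Done()
			for job := range jobChan {
				time.Sleep(10 * time.Millisecond) // stand-in for real work
				results <- PoolResult{
					WorkerID: workerID,
					JobID:    job.ID,
					Value:    fmt.Sprintf("processed-%s", job.Data),
				}
			}
		}(i)
	}

	// feed the jobs, then signal that no more are coming
	for _, job := range jobs {
		jobChan <- job
	}
	close(jobChan)

	// wait for the workers to drain the queue, then collect
	wg.Wait()
	close(results)

	out := make([]PoolResult, 0, len(jobs))
	for r := range results {
		out = append(out, r)
	}
	return out
}

func main() {
	jobs := make([]PoolJob, 0, 10)
	for i := 1; i <= 10; i++ {
		jobs = append(jobs, PoolJob{ID: i, Data: fmt.Sprintf("job-%d", i)})
	}
	for _, r := range runPool(3, jobs) {
		fmt.Printf("worker %d finished job %d: %s\n", r.WorkerID, r.JobID, r.Value)
	}
}
```

<p>Sizing the results buffer to the number of jobs keeps this sketch simple. For an unbounded job stream you would read results concurrently instead.</p>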
<h2 id="batch-processing-efficient-work-distribution">Batch Processing: Efficient Work Distribution</h2>
<p>Sometimes you need to group items into batches for efficiency or to respect downstream rate limits. This example handles batching by size and by time.</p>
<div class="highlight"><pre tabindex="0" style="color:#d6cbb4;background-color:#252b2e;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-go" data-lang="go"><span style="display:flex;"><span><span style="color:#e67e80">func</span> (p <span style="color:#7a8478">*</span>BatchProcessor) <span style="color:#b2c98f">startBatchAggregator</span>() {
</span></span><span style="display:flex;"><span>    <span style="color:#e67e80">go</span> <span style="color:#e67e80">func</span>() {
</span></span><span style="display:flex;"><span>        batch <span style="color:#7a8478">:=</span> <span style="color:#d699b6">make</span>([]<span style="color:#dbbc7f">int</span>, <span style="color:#d699b6">0</span>, p.batchSize)
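</span></span><span style="display:flex;"><span>        <span style="color:#859289;font-style:italic">// a Ticker fires repeatedly; a Timer would fire only once</span>
</span></span><span style="display:flex;"><span>        flushTimer <span style="color:#7a8478">:=</span> time.<span style="color:#b2c98f">NewTicker</span>(<span style="color:#d699b6">2</span> <span style="color:#7a8478">*</span> time.Second)
</span></span><span style="display:flex;"><span>        <span style="color:#e67e80">defer</span> flushTimer.<span style="color:#b2c98f">Stop</span>()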
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>        sendBatch <span style="color:#7a8478">:=</span> <span style="color:#e67e80">func</span>() {
</span></span><span style="display:flex;"><span>            <span style="color:#7a8478">&lt;-</span>p.rateLimiter.C <span style="color:#859289;font-style:italic">// wait for rate limiter</span>
</span></span><span style="display:flex;"><span>            batchCopy <span style="color:#7a8478">:=</span> <span style="color:#d699b6">make</span>([]<span style="color:#dbbc7f">int</span>, <span style="color:#d699b6">len</span>(batch))
</span></span><span style="display:flex;"><span>            <span style="color:#d699b6">copy</span>(batchCopy, batch)
</span></span><span style="display:flex;"><span>            p.batchChan <span style="color:#7a8478">&lt;-</span> batchCopy
</span></span><span style="display:flex;"><span>            batch = batch[:<span style="color:#d699b6">0</span>]
</span></span><span style="display:flex;"><span>        }
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>        <span style="color:#e67e80">for</span> {
</span></span><span style="display:flex;"><span>            <span style="color:#e67e80">select</span> {
</span></span><span style="display:flex;"><span>            <span style="color:#e67e80">case</span> item, ok <span style="color:#7a8478">:=</span> <span style="color:#7a8478">&lt;-</span>p.itemChan:
</span></span><span style="display:flex;"><span>                <span style="color:#e67e80">if</span> !ok {
</span></span><span style="display:flex;"><span>                    <span style="color:#e67e80">if</span> <span style="color:#d699b6">len</span>(batch) &gt; <span style="color:#d699b6">0</span> {
</span></span><span style="display:flex;"><span>                        <span style="color:#b2c98f">sendBatch</span>()
</span></span><span style="display:flex;"><span>                    }
</span></span><span style="display:flex;"><span>                    <span style="color:#d699b6">close</span>(p.batchChan)
</span></span><span style="display:flex;"><span>                    <span style="color:#e67e80">return</span>
</span></span><span style="display:flex;"><span>                }
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>                batch = <span style="color:#d699b6">append</span>(batch, item)
</span></span><span style="display:flex;"><span>                <span style="color:#e67e80">if</span> <span style="color:#d699b6">len</span>(batch) <span style="color:#7a8478">&gt;=</span> p.batchSize {
</span></span><span style="display:flex;"><span>                    <span style="color:#b2c98f">sendBatch</span>()
</span></span><span style="display:flex;"><span>                }
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>            <span style="color:#e67e80">case</span> <span style="color:#7a8478">&lt;-</span>flushTimer.C:
</span></span><span style="display:flex;"><span>                <span style="color:#e67e80">if</span> <span style="color:#d699b6">len</span>(batch) &gt; <span style="color:#d699b6">0</span> {
</span></span><span style="display:flex;"><span>                    <span style="color:#b2c98f">sendBatch</span>()
</span></span><span style="display:flex;"><span>                }
</span></span><span style="display:flex;"><span>            }
</span></span><span style="display:flex;"><span>        }
</span></span><span style="display:flex;"><span>    }()
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>The <code>select</code> with the flush timer emits a batch when it&rsquo;s full OR when the flush interval elapses. The rate limiter keeps batches from overwhelming the batch processor and its external dependencies. I used a simple timer-based limiter here, but you can replace it with much more sophisticated limiting logic as needed.</p>
<p>The batch processing pattern works well for database bulk operations, API integrations with rate limits, or protecting downstream services.</p>
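<p>The aggregator above depends on a <code>BatchProcessor</code> struct that isn&rsquo;t shown. As a rough standalone sketch of the same size-or-time batching idea, the hypothetical <code>batchItems</code> function below takes an input channel and returns a channel of batches. Its name and parameters are assumptions, and it leaves out the rate limiter to stay short.</p>

```go
package main

import (
	"fmt"
	"time"
)

// batchItems is a hypothetical helper: it groups items into slices of up to
// batchSize, flushing early whenever flushEvery elapses with items pending.
func batchItems(items <-chan int, batchSize int, flushEvery time.Duration) <-chan []int {
	out := make(chan []int)
	go func() {
		defer close(out)
		ticker := time.NewTicker(flushEvery)
		defer ticker.Stop()

		batch := make([]int, 0, batchSize)
		flush := func() {
			if len(batch) == 0 {
				return
			}
			// copy before sending so reusing the backing array is safe
			batchCopy := make([]int, len(batch))
			copy(batchCopy, batch)
			out <- batchCopy
			batch = batch[:0]
		}

		for {
			select {
			case item, ok := <-items:
				if !ok {
					flush() // send whatever is left before shutting down
					return
				}
				batch = append(batch, item)
				if len(batch) >= batchSize {
					flush()
				}
			case <-ticker.C:
				flush()
			}
		}
	}()
	return out
}

func main() {
	items := make(chan int)
	go func() {
		defer close(items)
		for i := 1; i <= 7; i++ {
			items <- i
		}
	}()
	for batch := range batchItems(items, 3, time.Second) {
		fmt.Println(batch)
	}
}
```

<p>With a fast producer like this one, the batches would typically be cut by size, with a final partial batch flushed when the input closes. A slow producer would see time-based flushes instead.</p>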
<h2 id="a-few-notes">A Few Notes</h2>
<p>Working through these patterns reinforced a few things:</p>
<ul>
<li>Channels behave like message queues with capacity limits and natural flow control.</li>
<li>Running multiple goroutines on the same function is basically horizontal scaling - the same pattern you&rsquo;d use for scaling system components.</li>
<li>These patterns compose well. Producer-consumer provides the foundation, worker pools add structure, batching adds efficiency.</li>
</ul>
<table>
  <thead>
      <tr>
          <th>Pattern</th>
          <th>Analogy</th>
          <th>Use Case</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>Producer-Consumer</td>
          <td>Message queues, event streams</td>
          <td>Event-driven architectures, async processing</td>
      </tr>
      <tr>
          <td>Worker Pools</td>
          <td>Load-balanced system components</td>
          <td>Controlled concurrency, predictable resources</td>
      </tr>
      <tr>
          <td>Batch Processing</td>
          <td>ETL pipelines, bulk APIs</td>
          <td>Rate limiting, bulk operations</td>
      </tr>
  </tbody>
</table>
<h3 id="error-handling">Error Handling</h3>
<p>Error handling in concurrent code needs to be explicit and planned up front, much as distributed systems need circuit breakers and retry logic. I used result structs that carry either data or an error:</p>
<div class="highlight"><pre tabindex="0" style="color:#d6cbb4;background-color:#252b2e;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-go" data-lang="go"><span style="display:flex;"><span><span style="color:#e67e80">type</span> WorkResult <span style="color:#e67e80">struct</span> {
</span></span><span style="display:flex;"><span>    Data <span style="color:#dbbc7f">string</span>
</span></span><span style="display:flex;"><span>    Err  <span style="color:#dbbc7f">error</span>
</span></span><span style="display:flex;"><span>}
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#e67e80">func</span> <span style="color:#b2c98f">worker</span>(jobs <span style="color:#7a8478">&lt;-</span><span style="color:#e67e80">chan</span> <span style="color:#dbbc7f">int</span>, results <span style="color:#e67e80">chan</span><span style="color:#7a8478">&lt;-</span> WorkResult) {
</span></span><span style="display:flex;"><span>    <span style="color:#e67e80">for</span> job <span style="color:#7a8478">:=</span> <span style="color:#e67e80">range</span> jobs {
</span></span><span style="display:flex;"><span>        <span style="color:#e67e80">if</span> job<span style="color:#7a8478">%</span><span style="color:#d699b6">7</span> <span style="color:#7a8478">==</span> <span style="color:#d699b6">0</span> { <span style="color:#859289;font-style:italic">// simulate some failures</span>
</span></span><span style="display:flex;"><span>            results <span style="color:#7a8478">&lt;-</span> WorkResult{Err: fmt.<span style="color:#b2c98f">Errorf</span>(<span style="color:#b2c98f">&#34;job %d failed&#34;</span>, job)}
</span></span><span style="display:flex;"><span>            <span style="color:#e67e80">continue</span>
</span></span><span style="display:flex;"><span>        }
</span></span><span style="display:flex;"><span>        results <span style="color:#7a8478">&lt;-</span> WorkResult{Data: fmt.<span style="color:#b2c98f">Sprintf</span>(<span style="color:#b2c98f">&#34;processed-%d&#34;</span>, job)}
</span></span><span style="display:flex;"><span>    }
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>For timeouts and cancellation, <code>context.Context</code> works well:</p>
<div class="highlight"><pre tabindex="0" style="color:#d6cbb4;background-color:#252b2e;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-go" data-lang="go"><span style="display:flex;"><span><span style="color:#e67e80">func</span> <span style="color:#b2c98f">workerWithTimeout</span>(ctx context.Context, jobs <span style="color:#7a8478">&lt;-</span><span style="color:#e67e80">chan</span> <span style="color:#dbbc7f">int</span>, results <span style="color:#e67e80">chan</span><span style="color:#7a8478">&lt;-</span> WorkResult) {
</span></span><span style="display:flex;"><span>    <span style="color:#e67e80">for</span> {
</span></span><span style="display:flex;"><span>        <span style="color:#e67e80">select</span> {
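</span></span><span style="display:flex;"><span>        <span style="color:#e67e80">case</span> job, ok <span style="color:#7a8478">:=</span> <span style="color:#7a8478">&lt;-</span>jobs:
</span></span><span style="display:flex;"><span>            <span style="color:#e67e80">if</span> !ok {
</span></span><span style="display:flex;"><span>                <span style="color:#e67e80">return</span> <span style="color:#859289;font-style:italic">// jobs channel closed: no more work</span>
</span></span><span style="display:flex;"><span>            }
</span></span><span style="display:flex;"><span>            <span style="color:#859289;font-style:italic">// process the job</span>
</span></span><span style="display:flex;"><span>            results <span style="color:#7a8478">&lt;-</span> WorkResult{Data: fmt.<span style="color:#b2c98f">Sprintf</span>(<span style="color:#b2c98f">&#34;processed-%d&#34;</span>, job)}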
</span></span><span style="display:flex;"><span>        <span style="color:#e67e80">case</span> <span style="color:#7a8478">&lt;-</span>ctx.<span style="color:#b2c98f">Done</span>():
</span></span><span style="display:flex;"><span>            results <span style="color:#7a8478">&lt;-</span> WorkResult{Err: ctx.<span style="color:#b2c98f">Err</span>()}
</span></span><span style="display:flex;"><span>            <span style="color:#e67e80">return</span>
</span></span><span style="display:flex;"><span>        }
</span></span><span style="display:flex;"><span>    }
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><h3 id="beyond-the-basics">Beyond the basics</h3>
<p>These patterns scratch the surface of Go&rsquo;s concurrency toolkit. The VictoriaMetrics series I mentioned dives deep into more advanced primitives like <code>sync.Mutex</code> for protecting shared state, <code>sync.Pool</code> for object reuse, <code>sync.Once</code> for one-time initialization, and <code>sync.Map</code> for concurrent map access. I recommend checking it out if you haven&rsquo;t already!</p>
<p>I intentionally stuck to channels and WaitGroups in my examples here - they mirror message passing between services naturally and keep the code readable. Once those patterns are solid, adding mutexes and other synchronization primitives becomes intuitive because you already understand the coordination challenges.</p>
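<p>As you build more complex systems, you&rsquo;ll need these other tools, and each has a rough distributed analogue. Mutexes play the role of distributed locks, <code>sync.Pool</code> loosely mirrors connection pooling in microservices (though the runtime may drop pooled objects at any time), <code>sync.Once</code> resembles one-time initialization guarded by leader election, and <code>sync.Map</code> acts like a shared cache that multiple services access concurrently. The primitives themselves only coordinate goroutines within a single process.</p>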
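<p>As a small taste of those primitives, here is a sketch of workers updating shared state behind a <code>sync.Mutex</code>, with <code>sync.Once</code> handling lazy initialization. The <code>Stats</code> type and its methods are hypothetical names made up for this example. They come from neither the VictoriaMetrics series nor my examples repo.</p>

```go
package main

import (
	"fmt"
	"sync"
)

// Stats is a hypothetical shared counter set guarded by a mutex.
type Stats struct {
	mu     sync.Mutex
	once   sync.Once
	counts map[string]int
}

// ensureInit allocates the map exactly once, however many goroutines call it.
func (s *Stats) ensureInit() {
	s.once.Do(func() {
		s.counts = make(map[string]int)
	})
}

// Record increments the counter for key; safe for concurrent use.
func (s *Stats) Record(key string) {
	s.ensureInit()
	s.mu.Lock()
	defer s.mu.Unlock()
	s.counts[key]++
}

// Count returns the current value for key; safe for concurrent use.
func (s *Stats) Count(key string) int {
	s.ensureInit()
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.counts[key]
}

func main() {
	var stats Stats
	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			stats.Record("jobs")
		}()
	}
	wg.Wait()
	fmt.Println(stats.Count("jobs")) // prints 100
}
```

<p>Without the mutex, the concurrent map writes would be a data race. A channel-based design would instead give the map to a single goroutine and send it updates.</p>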
<p><strong><a href="https://github.com/jahvon/go-concurrency-examples">Full code examples on GitHub</a></strong></p>
<p><em>Next: circuit breaker and fan-in/fan-out implementations.</em></p>
]]></content:encoded>
    </item>
  </channel>
</rss>
