The Future of Community Patch

Community Patch was my first real serverless application. It started as a Slack conversation between me and one of our developers where we identified the need for a “stupid simple patch server” that would let people get a cloud-hosted external patch source running without a lot of upfront work.

I wrote the StupidSimplePatchServer that day. An S3 bucket for the JSON definitions with an API Gateway to serve and manage them. It was more proof of concept than production ready service, but everything was there to transform it into a full blown resource for the Jamf community.

I launched Community Patch in April of 2018, and it has been serving Jamf admins for the three years since. The vision was to have a place where anyone could create and share patch definitions completely publicly. It was API driven with the hope of enabling better automation, and that did eventually happen as Jamf admins began integrating patch feed updates into their AutoPkg workflows. There are ~1,500 active feed subscriptions as of today. I’m really quite proud of how this has grown, and of the value it has given to others.

For the first year of its existence, Community Patch operated entirely within AWS’s free tier. Eventually usage went above those limits and the service cost a few dollars per month. Not an issue. Over time, with the kind of organic growth it has seen, Community Patch started costing more. The beautiful thing about serverless apps is that consistent workflows scale linearly in price so I was never subject to sticker shock. I could see the trend and knew what to expect.

As fortune would have it, the next two years of Community Patch would be funded by AWS through a combination of general use and open source credits I received. I focused during this time on where I wanted to take the project, and what to do once those credits ran their course.

This has been discussed a lot in the #communitypatch channel in the MacAdmins Slack, but I had been working on a major overhaul meant to address everything that was an issue, and everything that was learned with Beta 2 (the current iteration). This new architecture would be more cost effective (basing it on metrics and billing data I gathered), more performant, and fill in the gaps on many use cases my users brought up: API tokens, delegating permissions to other contributors, creating a unified patch feed instead of subscribing to many, etc. Much of the code for this next version is out in the repo today.

Then Jamf bought Kinobi just before JNUC 2020. 

I took a week to figure out where to go from here. Kinobi has a lot of tech for Jamf to take advantage of, including custom definition support – the key feature of Community Patch. With the Kinobi team now bringing their know-how and their stack to Jamf, I was left wondering whether it was a valuable use of my time to keep working on my version. At first I decided not to change course. I committed to updates to my Patch Server project, which enables self-hosted patch sources, and said I would revisit the funding question of Community Patch another day.

Today’s that day. It no longer makes sense to invest the time and effort into migrating the current user base to the new architecture. In fact, I don’t even know who all is using it! I am able to reach out to the contributors, as they signed up to be able to host definitions, but the service is public and open (back to the ~1,500 feed number). That makes migrating and retiring the old stack even more difficult.

It has been suggested I set up something like a Patreon, but the truth of the matter is I won’t be iterating on fixes or features for Community Patch going forward. People would be giving me money to keep the lights on, and that doesn’t sit well. Plus, it doesn’t change the fact that the service’s use continues to increase at a steady rate, which means the cost will continue to creep up over time if it’s still running.

This leads me to my decision: Community Patch will be shut down. My intent is to have this message shared as widely as possible, the web page updated to inform visitors of the shutdown, and the APIs ultimately shut down after JNUC 2021. When that happens your Jamf Pro server will error when attempting to retrieve updates, but it won’t lose the latest version of the definitions it has already stored.

Between now and then, there are options out there for migrating your custom patch titles. The Patch Server project can be run on your own hosts if you’re feeling up to managing your own. Jamf just announced more than 500 software titles are now available in the official feed (which is just the beginning), and we should expect to see more out of the integration of Kinobi in the future.

It is not lost on me that this is going to be a disruption for a lot of Jamf customers, and I do very much apologize for that. I’m looking forward to seeing what the future of Jamf’s patch service becomes with the people who I know are behind it, but my chapter in this is coming to a close.

A Quick Look at S3 Read Speeds and Python Lambda Functions

The other day I was having a conversation with a colleague about an asynchronous file hashing operation that triggers off new objects uploaded to an S3 bucket. At one point we were talking throughput. The design has a notification configuration that sends the S3 events into an SQS queue for processing. This means for the first minute we have five Lambda functions each processing a file one at a time (batch size of 1: this is an implementation decision, and for the sake of this article we won’t get into larger batch sizes), then at the second minute 65, the third 125, and so on.

The napkin math of this discussion assumed a 1 GB average file size and an ideal 100 MBps of throughput. At 10 seconds per file, or 6 files per minute per Lambda function, we could expect during a scale up to process 30 files in the first minute, 390 in the second minute, and 750 in the third minute. Our current implementation lets us hash through 1,170 of these (admittedly on the higher end) 1 GB files within 3 minutes.
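Here is a small sketch of that napkin math, assuming the 5-consumers-then-plus-60-per-minute scale-up behavior for SQS event sources that the numbers above imply:

# Napkin math for the SQS-driven Lambda scale-up described above.
FILE_SIZE_MB = 1000                                  # assume a 1 GB average file
THROUGHPUT_MBPS = 100                                # assume an ideal 100 MBps read speed
SECONDS_PER_FILE = FILE_SIZE_MB / THROUGHPUT_MBPS    # 10 seconds per file
FILES_PER_MINUTE_PER_FN = 60 / SECONDS_PER_FILE      # 6 files per minute per function

total_files = 0
concurrency = 5      # Lambda starts with 5 concurrent consumers for an SQS source
for minute in range(1, 4):
    files_this_minute = int(concurrency * FILES_PER_MINUTE_PER_FN)
    total_files += files_this_minute
    print(f"Minute {minute}: {concurrency} functions -> {files_this_minute} files")
    concurrency += 60    # and adds up to 60 more instances each minute

print(f"Total after 3 minutes: {total_files} files")    # 30 + 390 + 750 = 1,170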

That is, if we actually get 100 MBps.

What can we actually expect?

The rest of my afternoon became focused on finding out what the realistic throughput for this code would be. I stripped down the service’s Lambda function to the key items and preloaded an S3 bucket with four test files at sizes we commonly expect coming into the system: 100 MB, 500 MB, 1 GB, and 5 GB.

Here’s the code:

import hashlib
import time

import boto3

s3 = boto3.resource("s3")

BUCKET = "my-bucket"
KEYS = [
    "test100mb.file",
    "test500mb.file",
    "test1gb.file",
    "test5gb.file",
]
CHUNK_SIZE = 20000000


def lambda_handler(event, context):
    print("Starting…")
    for key in KEYS:
        stream_file(key)
    print("Complete")


def stream_file(key):
    """Stream a single S3 object and report the total time to hash it."""
    start_time = time.time()
    hash_digest = hashlib.sha1()
    s3_object = s3.Object(bucket_name=BUCKET, key=key).get()

    for chunk in read_in_chunks(s3_object):
        hash_digest.update(chunk)

    elapsed_time = time.time() - start_time
    print(
        f"File {key} – SHA1 {hash_digest.hexdigest()} – Total Seconds: {round(elapsed_time, 2)}"
    )


def read_in_chunks(s3_object: dict):
    """A generator that iterates over an S3 object in CHUNK_SIZE-byte chunks."""
    stream = s3_object["Body"]._raw_stream
    while True:
        data = stream.read(CHUNK_SIZE)
        if not data:
            break
        yield data

Now came the testing portion. I needed to see how this code performs not only at different memory settings (remember: the memory setting also allocates CPU and network IO to our functions), but also with different chunk sizes streamed from S3 for each object. The code above uses the chunk size to stream X bytes of the S3 object into memory, updates the hash digest with it, and discards the chunk before moving onto the next. This keeps our actual memory utilization very low. In fact, the above hashing operation works at 128 MB of memory for the Lambda function, even if the execution time isn’t great.

At this point I must inform you all that I did my testing like a barbarian of old by clicking “Invoke” in the console after changing my memory and chunk settings. If you’re looking to do performance testing, I recommend you go check out the AWS Lambda Power Tuning project for this. It’s pretty great.

The table below is the data I recorded as a part of this effort. A few things to note that limit this data and make it incomplete:

  • I only performed two runs at each configuration. This is a very limited data set and there’s clearly environmental variance between executions that affected the times. These could have been leveled out, and outliers dropped, had I obtained a larger data set.
  • With only one run being performed at a time I have no clear indication whether a mass number of parallel read operations on different files would impact read speeds. My assumption is no, but that is an assumption.
  • While it would be possible to multi-thread this workflow, and potentially multi-process it at much higher memory settings, I don’t see the benefit in doing so for the added code complexity. Plus, splitting up the downloads across threads likely won’t increase read speeds from S3, as there would then be multiple streams competing for bandwidth.

We’ll pick up on my thought process on the other side of this table.

File Size (MB) | Memory Used (MB) | First Run (Seconds) | Second Run (Seconds) | Avg Speed (MBps)

128 MB Memory / 1 MB Chunk Size
100  | 82  | 4.91   | 7.49   | 16.13
500  | 82  | 27.26  | 30.74  | 17.24
1000 | 82  | 55.7   | 74.82  | 15.32
5000 | 82  | 329.68 | 329.68 | 15.17

128 MB Memory / 10 MB Chunk Size
100  | 108 | 5.18   | 7.64   | 15.60
500  | 108 | 21.26  | 24.96  | 21.64
1000 | 108 | 43.32  | 49.82  | 21.47
5000 | 108 | 217.84 | 240.16 | 21.83

256 MB Memory / 10 MB Chunk Size
100  | 108 | 2.65   | 2.51   | 38.76
500  | 108 | 9.8    | 10.8   | 48.54
1000 | 108 | 20.32  | 21.48  | 47.85
5000 | 108 | 118.7  | 99.08  | 45.92

256 MB Memory / 20 MB Chunk Size
100  | 138 | 2.64   | 2.4    | 39.68
500  | 138 | 9.74   | 9.92   | 50.86
1000 | 138 | 19.58  | 19.92  | 50.63
5000 | 138 | 99.42  | 97.44  | 50.80

256 MB Memory / 50 MB Chunk Size
100  | 245 | 3.55   | 2.71   | 31.95
500  | 245 | 12.68  | 12.52  | 39.68
1000 | 245 | 25.54  | 25.4   | 39.26
5000 | 245 | 128.06 | 127.5  | 39.13

512 MB Memory / 20 MB Chunk Size
100  | 137 | 1.5    | 1.16   | 75.19
500  | 137 | 5.38   | 5.34   | 93.28
1000 | 137 | 13.76  | 13.8   | 72.57
5000 | 137 | 69.74  | 69.74  | 71.69

512 MB Memory / 50 MB Chunk Size
100  | 245 | 2.07   | 1.73   | 52.63
500  | 245 | 6.74   | 6.52   | 75.41
1000 | 245 | 13.66  | 14.3   | 71.53
5000 | 245 | 68.78  | 70.21  | 71.95

1024 MB Memory / 20 MB Chunk Size
100  | 137 | 1.2    | 1.1    | 86.96
500  | 137 | 6.6    | 5.29   | 84.10
1000 | 137 | 14.57  | 13.93  | 70.18
5000 | 137 | 72.76  | 69.57  | 70.26

1024 MB Memory / 50 MB Chunk Size
100  | 246 | 1.33   | 1.21   | 78.74
500  | 246 | 6.52   | 6.53   | 76.63
1000 | 246 | 14.46  | 14.61  | 68.80
5000 | 246 | 72.65  | 72.69  | 68.80

2048 MB Memory / 20 MB Chunk Size
100  | 138 | 1.09   | 1.06   | 93.02
500  | 138 | 5.33   | 5.35   | 93.63
1000 | 138 | 13.89  | 13.91  | 71.94
5000 | 138 | 69.69  | 69.56  | 71.81

I started off at the default 128 MB and a typo of 1 MB chunks (I thought I had written 10000000 😅 ). The smaller chunk size means we’re making many, many more requests to S3, so increasing it to 10 MB is a simple way to improve performance. At the higher chunk size we’re now getting close to utilizing all the available memory and we can’t increase it again.

I think it should be said that anyone deploying Lambda functions should default their memory setting to 256 MB from the start. The leap in performance is clear no matter what you’re doing, and at per-millisecond billing there’s no reason not to go for it.
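If you want to bump an existing function without touching its template, a minimal boto3 sketch would look something like the following (the function name is a placeholder, not something from this project):

import boto3

lambda_client = boto3.client("lambda")

# Raise an existing function to the 256 MB baseline recommended above.
lambda_client.update_function_configuration(
    FunctionName="my-function",  # placeholder name
    MemorySize=256,
)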

With the additional memory overhead I decided to see what would happen if I 5x’d the chunk size. While within the limit, my performance actually decreased. Dropping the chunk size down to 20 MB revealed a sweet spot (someone help me here, but I know I’ve heard the 20 MB number used in a few other places within AWS for chunking/in-memory caching) where we can now consistently get ~50 MBps reads from S3.

At 512 MB of memory and the 20 MB chunk size we’ve hit the optimal settings across object sizes. 70+ MBps baseline with variance up to 90+ MBps.

If I were to do more intensive performance testing I would be focused here. Increasing memory to 1024 MB and 2048 MB improved the read speed for < 1 GB objects, but not the ≥ 1 GB ones. I still tested 50 MB chunks at 512 MB and 1024 MB but it again resulted in performance hits.

It might be tempting to look at the speed increases for < 1 GB files and say the function should run at that to burn through those faster, but the timing difference is insignificant in our context with 1.06 seconds at 2048 MB vs 1.5 seconds at our “optimal” 512 MB for 100 MB objects.

I say it that way because this system isn’t expected to have to deal with constant, high volume ingress of objects to our bucket. Ingress will be inconsistent and spiky at certain times of a monthly cycle. Now, if I were expecting high volume ingress and at a more constant rate I might find the increase warranted. ~3,400 100 MB objects per hour vs ~2,400 is a very different kind of measurement.

I hope you all enjoyed coming along for this little journey. Perhaps some day I’ll come back to it and put it through some proper performance tuning analysis.

Event Driven Applications with DynamoDB Streams and EventBridge

Build flexible serverless applications that are driven by changes in your DynamoDB table.

Whenever I start a new serverless application there are always three core technologies that form the basis: API Gateway APIs, Lambda Functions, and DynamoDB Tables. They’re the bread and butter of most AWS developers in this day and age.

From there, as my application grows in complexity, I inevitably end up adding in EventBridge Event Bus for triggering my downstream business logic and automation in reaction to the API.

Many of my APIs are designed to be RESTful where I am writing, reading, and updating data in the backend. My Lambda Functions are mainly focused on validation of the requests – which can be simple schema validation or complex logic around related records that may or may not exist – and the DynamoDB operation the resource and method map to.

Ensuring that my API functions have one, and only one, job to perform keeps them simple and easy to understand. The data layer of my application, primarily the DynamoDB table, is the source of truth for the service. Rather than have sequential operations at the API layer trigger automation or business logic, I want that trigger to be the change in my data itself.

This is where marrying DynamoDB Streams to Event Bus comes in.

DynamoDB Streams and Lambda

When enabled, DynamoDB streams contain records of modifications to your table. These records are in the order they occurred and appear only once (no duplicates). This all occurs in near-real time which makes it an extremely attractive tool to turn on.

But, there are some gotchas. When you add a DynamoDB Stream as an event source to your Lambda Function it will receive batches of records serially (either from the start or the end of the stream). Lambda will not scale out horizontally, as it must ensure records are processed in order. This means you need to be very careful about the business logic you put into those stream processor functions. Your stream will only be processed as fast as the function consuming it, and if you introduce any bugs your entire pipeline will come grinding to a halt until you push a fix (I learned this the hard way).

There are features for DynamoDB event sources that allow you to work around, or design for, these issues. See the BisectBatchOnFunctionError and DestinationConfig options for more info.
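As a rough illustration only (the ARNs and names here are placeholders, not anything from a real deployment), configuring those options on an event source mapping with boto3 might look like this:

import boto3

lambda_client = boto3.client("lambda")

# Hypothetical example: wire a DynamoDB stream to a processor function with the
# failure-handling options mentioned above.
lambda_client.create_event_source_mapping(
    EventSourceArn="arn:aws:dynamodb:us-east-1:111111111111:table/MyTable/stream/2021-01-01T00:00:00.000",  # placeholder
    FunctionName="my-stream-processor",  # placeholder
    StartingPosition="LATEST",
    BatchSize=10,
    # Split a failing batch in half and retry so one bad record can be isolated.
    BisectBatchOnFunctionError=True,
    MaximumRetryAttempts=3,
    # Send metadata about records that still fail to a queue for inspection.
    DestinationConfig={
        "OnFailure": {
            "Destination": "arn:aws:sqs:us-east-1:111111111111:stream-failures"  # placeholder
        }
    },
)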

My DynamoDB tables are almost always of a single table design. I have various types of records I store in a single table with generic keys for my indexes to enable the queries I require.

MyTable:
  Type: AWS::DynamoDB::Table
  Properties:
    BillingMode: PAY_PER_REQUEST
    AttributeDefinitions:
      - AttributeName: pk
        AttributeType: S
      - AttributeName: sk
        AttributeType: S
      - AttributeName: gsi1_pk
        AttributeType: S
      - AttributeName: gsi1_sk
        AttributeType: S
    KeySchema:
      - AttributeName: pk
        KeyType: HASH
      - AttributeName: sk
        KeyType: RANGE
    GlobalSecondaryIndexes:
      - IndexName: GSI1
        KeySchema:
          - AttributeName: gsi1_pk
            KeyType: HASH
          - AttributeName: gsi1_sk
            KeyType: RANGE
        Projection:
          ProjectionType: ALL
    StreamSpecification:
      StreamViewType: NEW_AND_OLD_IMAGES

Since any kind of record could be stored in my table, writing logic in my Lambda Function for where to dispatch each stream record becomes more precarious. To me, the best solution is to dispatch ALL of my DynamoDB stream records to a centralized location that I can then build my business logic from.

EventBridge Event Bus

EventBridge is something I jumped onto during re:Invent 2019. Up until the introduction of the EventBridge Event Bus all serverless applications relied on AWS’s event sources for automated triggers. With Event Bus we now have our own custom eventing framework that plugs right into serverless applications.

There are a lot of features with EventBridge that do cool things around direct SaaS partner integrations (like Datadog), cross account eventing, and event discovery by pointing whatever you want at it, but don’t let those big use cases deter you from implementing one into small services.

We pay $1.00 for every 1,000,000 events we publish into a Bus, and we don’t pay for the rules that we attach to it.

We don’t pay for the rules we attach to our Event Bus.

Those rules enable patterns in your applications that before required all kinds of additional work and scaffolding to make happen. At a minimum a rule must define the source that triggers it. Past that, rules can become as fine grained as we desire. Emitting events with JSON payloads opens up the ability to drill into the details (effectively the body) and match against the content.

Rules are then able to invoke a wide range of AWS services. Beyond other Lambda Functions, you can pass events on to SQS Queues, SNS Topics, directly invoke Step Functions, call downstream HTTP endpoints… And then consider that you can have multiple rules triggering off the same events allowing parallel processing and workflows.

This flexibility and power dwarfs most other AWS offerings.

DynamoDB Events

As shown in the diagram at the start of this post, the goal is to emit changes to the DynamoDB table into an Event Bus where we can take full advantage of its Swiss army knife nature to plug in all of the business logic we want.

To this end, our DynamoDB stream processor has only one job to do:

from datetime import datetime
import json
import os

import boto3

EVENT_BUS = os.getenv("EVENT_BUS")
events_client = boto3.client("events")


def lambda_handler(event, context):
    """This Lambda function takes DynamoDB stream events and publishes them to an
    EventBridge EventBus in batches (DynamoDB streams can be submitted in batches of a
    maximum of 10).
    """
    events_to_put = []

    for record in event["Records"]:
        print(f"Event: {record['eventName']}/{record['eventID']}")
        table_arn, _ = record["eventSourceARN"].split("/stream")
        events_to_put.append(
            {
                "Time": datetime.utcfromtimestamp(
                    record["dynamodb"]["ApproximateCreationDateTime"]
                ),
                "Source": "my-service.database",
                "Resources": [table_arn],
                "DetailType": record["eventName"],
                "Detail": json.dumps(record),  # Gotcha here: Decimal() objects require handling
                "EventBusName": EVENT_BUS,
            }
        )

    events_client.put_events(Entries=events_to_put)
    return "ok"

This function takes a batch of DynamoDB stream records from the event source and translates them into the Event Bus event structure.

In my code example I treat the Source as something descriptive. For internal application events I tend to follow the pattern of service-name.component-name for labeling my sources. In this case it is simply my-service.database with the implication being if I end up with multiple tables they’re all the same source but different Resources – the table ARN here – that I can use as a part of my rule to filter out what I’m executing on. I map the DynamoDB action (INSERT, MODIFY, REMOVE) to DetailType and I pump the entire record into the Detail as JSON.

Now when I go to take action on changes in my table I can add complex rules looking for those specific attributes and details.

MyFunction:
  Type: AWS::Serverless::Function
  Properties:
    Runtime: python3.8
    CodeUri: ./src/my_function
    Handler: index.lambda_handler
    Events:
      TableChanges:
        Type: EventBridgeRule
        Properties:
          EventBusName: !Ref EventBus
          InputPath: $.detail
          Pattern:
            source:
              - my-service.database
            resources:
              - !GetAtt MyTable.Arn
            detail-type:
              - INSERT
              - MODIFY
            detail:
              dynamodb:
                Keys:
                  pk:
                    S: [{ "prefix": "OID#" }]
                  sk:
                    S: [{ "prefix": "UID#" }]

By preserving the entire DynamoDB stream record I am able to match on key patterns to enable rules for specific record types.

The example above is taken from a Lambda Function that listened for the creation and modification of records that described customer integrations and then wrote back a historical record stating what keys were changed and by whom.

I could take this same rule, change it to listen for INSERT and REMOVE on those same key prefixes, and pipe matching events into an SQS FIFO Queue that manages aggregate records for customers, tracking overall counts for things like the number of integrations or device counts (which would be a separate event rule going into the same queue).
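For illustration, here’s a sketch of that variation using boto3 (the rule, bus, and queue names are placeholders I made up); the same thing could equally be expressed as another EventBridgeRule event in the SAM template:

import json

import boto3

events_client = boto3.client("events")

# Hypothetical rule: match INSERT and REMOVE events for the same key prefixes
# and send them to an SQS FIFO queue that maintains aggregate counts.
events_client.put_rule(
    Name="customer-aggregate-changes",    # placeholder
    EventBusName="my-service-event-bus",  # placeholder
    EventPattern=json.dumps(
        {
            "source": ["my-service.database"],
            "detail-type": ["INSERT", "REMOVE"],
            "detail": {
                "dynamodb": {
                    "Keys": {
                        "pk": {"S": [{"prefix": "OID#"}]},
                        "sk": {"S": [{"prefix": "UID#"}]},
                    }
                }
            },
        }
    ),
)

events_client.put_targets(
    Rule="customer-aggregate-changes",
    EventBusName="my-service-event-bus",
    Targets=[
        {
            "Id": "aggregate-queue",
            "Arn": "arn:aws:sqs:us-east-1:111111111111:customer-aggregates.fifo",  # placeholder
            # FIFO queue targets require a message group ID.
            "SqsParameters": {"MessageGroupId": "customer-aggregates"},
        }
    ],
)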

This framework now allows the service to scale out, adding automation and workflows on DynamoDB events without having to touch the stream, the stream processor, or anything already hooked up to a rule, as they are all completely independent components.

Drawbacks?

This design pattern isn’t without its inefficiencies, which tend to show up at high scale.

The amount of data being emitted by your table into the Stream isn’t necessarily a major issue. Past the free tier, GetRecords requests will only run you $0.20 per million, assuming your records aren’t very large. If they are, you can switch to KEYS_ONLY instead of sending the entire item into the stream, which should still allow focused event rules.

That free tier covers 2,500,000 stream read request units every month. You may not notice it for quite some time.

You also run the risk of having a large amount of wasted events. At $1.00 per million on our Event Bus, maybe we don’t care too much at small scale. Once throughput ratchets up and millions of table events are going through every day, that becomes a different story. Ensuring our systems are designed around internal eventing should cut down on event and billing waste.

Lastly, I’m going to make mention of service quotas – which might be a bit of bike-shedding but I’m gonna do it anyway.

EventBridge’s PutEvents API ranges from 600 to 2,400 requests per second depending on which region you’re operating in. These are limits you can increase, but you could quickly spike into them before you realize it. Batching events (as shown in our stream processor function above) is your best friend to stave this off.

This is not a limit you would (likely) be able to hit off a single DynamoDB stream (you’re more likely to back up on the stream while having plenty of overhead for events). Add in multiple sources for your Event Bus and it’s something you could spike into quickly.