Sometimes it's okay to be a monolith. Or at least one monolith amongst many.

Scale out via the process model

Ideally we want to be able to handle an arbitrarily large number of parallel requests, limited only by our budget (and, regrettably, the third-party Backing Services we're relying on). Within reason, it's often efficient to handle multiple requests in parallel inside a single process, but beyond a certain point it gets cumbersome. That's where process scaling comes in.

The idea is simple: once your process starts to run out of capacity, add another process behind a load balancer. After all, we've already made sure that our processes are stateless, so adding more is easy, right? Right?
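To make the "add another process behind a load balancer" idea concrete, here is a minimal sketch in Python. It is a toy, in-memory simulation rather than a real load balancer: `RoundRobinBalancer` and `handle` are hypothetical names invented for illustration, and plain functions stand in for separate processes. The point is only that a stateless handler gives the same answer no matter which worker picks up the request.

```python
from typing import Callable, Dict, List

Handler = Callable[[str], str]


def handle(request: str) -> str:
    """Stateless handler: the response depends only on the request itself."""
    return request.upper()


class RoundRobinBalancer:
    """Toy stand-in for a load balancer (hypothetical, for illustration only)."""

    def __init__(self) -> None:
        self.workers: List[Handler] = []
        self.served: Dict[int, int] = {}  # worker index -> requests handled
        self._next = 0

    def add_worker(self, worker: Handler) -> None:
        # Scaling out is just registering one more identical worker.
        self.served[len(self.workers)] = 0
        self.workers.append(worker)

    def dispatch(self, request: str) -> str:
        # Pick workers in rotation; any worker can serve any request.
        index = self._next % len(self.workers)
        self._next += 1
        self.served[index] += 1
        return self.workers[index](request)


balancer = RoundRobinBalancer()
balancer.add_worker(handle)
responses_one = [balancer.dispatch(r) for r in ["a", "b", "c", "d"]]

# Capacity is running out, so add a second worker and send the same traffic.
balancer.add_worker(handle)
responses_two = [balancer.dispatch(r) for r in ["a", "b", "c", "d"]]
```

Because `handle` keeps no state between calls, `responses_one` and `responses_two` are identical even though the second batch is split across two workers. In a real deployment the workers would be separate OS processes or containers and the balancer would be something like a reverse proxy, but the shape of the argument is the same.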

Actually yeah, it really is. If you've been building according to the 12 Factor approach, there really isn't anything else to it. No caveats, no ifs or buts, just scale as far as the eye can see (and the wallet can stretch).

Some services take this a bit far (to my mind), like Google Cloud Functions not supporting more than one parallel request per runtime (at least in its first generation). I think that with modern frameworks, writing reasonably efficient code to run under multi-threaded servers is neither hard nor particularly error-prone, so knock yourself out. But once you reach a certain capacity and begin to see diminishing returns from scaling up with additional memory or processing resources, stop trying and go parallel.
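Once you do go parallel, the sizing question becomes simple arithmetic. The sketch below is a rough back-of-the-envelope helper, not a capacity-planning tool: `processes_needed` is a hypothetical function, the 20% headroom default is an assumption, and the throughput figures in the usage lines are made-up illustrative numbers rather than measurements.

```python
import math


def processes_needed(target_rps: float, per_process_rps: float,
                     headroom: float = 0.2) -> int:
    """Estimate how many identical processes are needed for a target load.

    target_rps:      requests per second you want to serve in total
    per_process_rps: requests per second one process sustains (measure this)
    headroom:        fraction of each process's capacity kept in reserve
                     (assumed default, tune to taste)
    """
    if per_process_rps <= 0:
        raise ValueError("per_process_rps must be positive")
    if not 0 <= headroom < 1:
        raise ValueError("headroom must be in [0, 1)")
    usable_rps = per_process_rps * (1 - headroom)
    # Always run at least one process, and round up partial processes.
    return max(1, math.ceil(target_rps / usable_rps))


# Illustrative numbers only: one process sustains ~150 req/s.
small = processes_needed(target_rps=100, per_process_rps=150)
large = processes_needed(target_rps=1000, per_process_rps=150)
```

With these example inputs a single process covers the small load, while the larger load needs nine processes once the headroom is accounted for. The useful part is that the answer scales linearly with traffic, which is exactly what stateless processes buy you.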

As a bonus, this leads very naturally into disposability.