Handling Complex Data Flows
At its core, programming is nothing more than stacking little blocks of logic on each other, building up a structure comprised of detailed organization, intentional design, and functional beauty.
Unless those goals are held front and center, it is only a matter of time before the system devolves into a convoluted mess of overlapping, overstepping code.
One place this happens frequently is in a complex flow, where data starts in one form and has been modified by the time it reaches the end. An example of this would be something like a sign-up form:
<?php

use Illuminate\Http\JsonResponse;
use Illuminate\Http\Request;

class FictitiousRegisterController
{
    public function register(Request $request): JsonResponse
    {
        // Strict comparison: a single "=" would assign the value and always pass.
        if ($request->invitationCode === 'knock-knock') {
            $user = User::create([
                'email' => $request->email,
                'password' => $request->password,
            ]);
            $user->roles()->attach('member');

            if ($request->subscribe) {
                MailingService::subscribe($user->email);
                $user->update([
                    'subscriber' => true,
                ]);
            }

            if ($request->has('group_uuids')) {
                collect($request->group_uuids)->each(function ($uuid) use ($user) {
                    if ($group = Group::whereUuid($uuid)->first()) {
                        $user->groups()->attach($group);
                    } else {
                        throw new InvalidGroupException($uuid . ' is an invalid group uuid!');
                    }
                });
            } else {
                $user->groups()->attach(Group::whereDefault()->first());
            }

            Mailer::sendWelcomeEmail($user->email);

            return response()->json([
                'message' => 'Welcome!',
            ]);
        }

        return response()->json([
            'message' => 'Error!',
        ]);
    }
}
This method has high cyclomatic complexity, meaning there are many potential paths through it, and it would benefit from some refactoring. A common approach is to break it into several private methods to isolate the logic into more manageable portions:
<?php

use Illuminate\Http\JsonResponse;
use Illuminate\Http\Request;

class FictitiousRegisterController
{
    public function register(Request $request): JsonResponse
    {
        if ($this->hasCorrectCode($request->invitationCode)) {
            $user = $this->createUser($request);
            $this->handleMailingList($request->subscribe, $user);
            $this->handleGroupAssignment($request, $user);
            Mailer::sendWelcomeEmail($user->email);

            return response()->json([
                'message' => 'Welcome!',
            ]);
        }

        return response()->json([
            'message' => 'Error!',
        ]);
    }

    private function createUser($request)
    {
        $user = User::create([
            'email' => $request->email,
            'password' => $request->password,
        ]);
        $user->roles()->attach('member');

        return $user;
    }

    private function hasCorrectCode($code)
    {
        // Strict comparison: "=" would assign and always return a truthy string.
        return $code === 'knock-knock';
    }

    private function handleMailingList($subscribe, $user)
    {
        if ($subscribe) {
            MailingService::subscribe($user->email);
            $user->update([
                'subscriber' => true,
            ]);
        }
    }

    private function handleGroupAssignment($request, $user)
    {
        if ($request->has('group_uuids')) {
            collect($request->group_uuids)->each(function ($uuid) use ($user) {
                if ($group = Group::whereUuid($uuid)->first()) {
                    $user->groups()->attach($group);
                } else {
                    throw new InvalidGroupException($uuid . ' is an invalid group uuid!');
                }
            });
        } else {
            $user->groups()->attach(Group::whereDefault()->first());
        }
    }
}
Before we move on, let me say that the above may be a perfectly valid solution for many use cases. Not every scenario will require extensive refactoring. That said, let’s see why we might want to consider something else.
There are a few things that bother me about this approach. For example, testing private methods is less than ideal. We would have to write an overarching feature test that has little visibility into the code aside from the input and output. Adding any more logic in the future means we have to modify the class and hope that our feature test factors in the new functionality.
Let’s look at another way
If we step up a level from the implementation and think about what is at play, we can see three basic elements:
- The request/response (the controller)
- The data itself (username/password/errors)
- The logic that acts upon the data
With those three players identified we can start breaking down the large method and relocate code into one of these three locations.
Items related to the gathering of input and the return of data in the response can all stay within the controller as that’s what it is there for.
<?php

use Illuminate\Http\JsonResponse;
use Illuminate\Http\Request;
use Illuminate\Pipeline\Pipeline;

class FictitiousRegisterController
{
    public function register(Request $request): JsonResponse
    {
        $traveler = (new RegisterTraveler())->setRequest($request);

        // Pipes run in the order listed here.
        $pipes = [
            ValidateInvitationCode::class,
            CreateUser::class,
            AssignPermissions::class,
            HandleMailingList::class,
            AssignToGroups::class,
            SendWelcomeEmail::class,
        ];

        return app(Pipeline::class)
            ->send($traveler)
            ->through($pipes)
            ->then(function ($traveler) {
                return response()->json([
                    'message' => 'Success',
                ]);
            });
    }
}
The actual data itself (username, password, groups) can be centralized into a simple data transfer object, or DTO.
<?php

class RegisterTraveler
{
    private $request;
    private $user;

    public function setRequest($request)
    {
        $this->request = $request;

        return $this;
    }

    public function getRequest()
    {
        return $this->request;
    }

    public function setUser($user)
    {
        $this->user = $user;

        return $this;
    }

    public function getUser()
    {
        return $this->user;
    }
}
Lastly, the individual steps that act on the data can be moved into a group of “steps”, each located in its own class.
<?php

class CreateUser implements PipeInterface
{
    public function handle($traveler, $next)
    {
        // Store the new user on the traveler so later pipes can reach it.
        $traveler->setUser(
            User::create([
                'email' => $traveler->getRequest()->email,
                'password' => $traveler->getRequest()->password,
            ])
        );

        return $next($traveler);
    }
}
<?php

class HandleMailingList implements PipeInterface
{
    public function handle($traveler, $next)
    {
        if ($traveler->getRequest()->subscribe) {
            MailingService::subscribe($traveler->getUser()->email);
            $traveler->getUser()->update([
                'subscriber' => true,
            ]);
        }

        return $next($traveler);
    }
}
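The pipes above implement a `PipeInterface` that is never shown, and `ValidateInvitationCode` appears in the pipe list without a body. As a rough sketch, assuming the interface only requires `handle($traveler, $next)`, the pair might look something like this. The interface body, the `InvalidInvitationCodeException` class, and the `StubTraveler` stand-in are all hypothetical and exist only so the example runs without Laravel.

```php
<?php

// Hypothetical contract: the article references PipeInterface but does not show it.
interface PipeInterface
{
    public function handle($traveler, $next);
}

// Hypothetical exception type, named for illustration only.
class InvalidInvitationCodeException extends Exception
{
}

class ValidateInvitationCode implements PipeInterface
{
    public function handle($traveler, $next)
    {
        // Anything other than the expected code aborts the flow with an exception.
        if ($traveler->getRequest()->invitationCode !== 'knock-knock') {
            throw new InvalidInvitationCodeException('Invalid invitation code.');
        }

        return $next($traveler);
    }
}

// Minimal stand-in for RegisterTraveler so the sketch runs on its own.
class StubTraveler
{
    private $request;

    public function __construct($request)
    {
        $this->request = $request;
    }

    public function getRequest()
    {
        return $this->request;
    }
}

// Stand-in for "the rest of the pipeline".
$passThrough = function ($traveler) {
    return 'passed';
};

$accepted = (new ValidateInvitationCode())->handle(
    new StubTraveler((object) ['invitationCode' => 'knock-knock']),
    $passThrough
);

try {
    (new ValidateInvitationCode())->handle(
        new StubTraveler((object) ['invitationCode' => 'wrong']),
        $passThrough
    );
    $rejected = null;
} catch (InvalidInvitationCodeException $e) {
    $rejected = $e->getMessage();
}
```

With a valid code the pipe simply hands the traveler to `$next`; with an invalid one it throws, which is the behavior the controller relies on later when it wraps the pipeline in a try/catch.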
Notice how the high-level theory has driven out a solution that closely reflects the primary players?
The controller receives the input, assembles it into a DTO, and a Laravel Pipeline churns through the individual steps, returning the modified output. It’s a beautiful thing.
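To make the "churning" a little more concrete, here is a stripped-down sketch of how a pipeline can thread a traveler through its pipes. It is a simplified stand-in written in plain PHP, in the same spirit as Laravel's `Pipeline` but not its actual implementation; `runPipeline` and `AppendStep` are hypothetical names used only for this illustration.

```php
<?php

// Simplified stand-in for a pipeline runner (hypothetical, not Laravel's Pipeline class).
// It wraps the pipes from last to first so each one receives the next as $next.
function runPipeline($traveler, array $pipes, callable $destination)
{
    $chain = array_reduce(
        array_reverse($pipes),
        function (callable $next, $pipe) {
            return function ($traveler) use ($pipe, $next) {
                return $pipe->handle($traveler, $next);
            };
        },
        $destination
    );

    return $chain($traveler);
}

// Hypothetical pipe that records its name on the traveler, then passes it along.
class AppendStep
{
    private $name;

    public function __construct(string $name)
    {
        $this->name = $name;
    }

    public function handle($traveler, $next)
    {
        $traveler->log[] = $this->name;

        return $next($traveler);
    }
}

$traveler = new stdClass();
$traveler->log = [];

$result = runPipeline(
    $traveler,
    [new AppendStep('validate'), new AppendStep('create'), new AppendStep('notify')],
    function ($traveler) {
        // The destination runs only after every pipe has called $next.
        return implode(' > ', $traveler->log);
    }
);

echo $result, PHP_EOL; // validate > create > notify
```

Each pipe decides whether to call `$next`, which is why a single pipe can modify the traveler, pass it along untouched, or stop the flow entirely.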
We’ve used this pipeline approach to break down complex flows at Zaengle a number of times with good success, but there were a few issues to resolve as we established our pattern:
- Exiting the pipeline
- What to do with partial completion
- Breaking down pipes into sub-pipelines
Exiting the pipeline
How can we abort if something goes wrong in one of the steps? The simplest way we’ve found is to throw an exception from a pipe. Wrapping the pipeline in a try/catch has allowed us to handle potential errors from the pipes.
<?php

use Illuminate\Http\JsonResponse;
use Illuminate\Http\Request;
use Illuminate\Pipeline\Pipeline;

class FictitiousRegisterController
{
    public function register(Request $request): JsonResponse
    {
        $traveler = (new RegisterTraveler())->setRequest($request);

        $pipes = [
            ValidateInvitationCode::class,
            CreateUser::class,
            AssignPermissions::class,
            HandleMailingList::class,
            AssignToGroups::class,
            SendWelcomeEmail::class,
        ];

        try {
            return app(Pipeline::class)
                ->send($traveler)
                ->through($pipes)
                ->then(function ($traveler) {
                    return response()->json([
                        'message' => 'Success',
                    ]);
                });
        } catch (Exception $e) {
            // Any exception thrown by a pipe ends up here.
            return response()->json([
                'message' => $e->getMessage(),
            ]);
        }
    }
}
<?php

class AssignToGroups implements PipeInterface
{
    public function handle($traveler, $next)
    {
        if ($traveler->getRequest()->has('group_uuids')) {
            collect($traveler->getRequest()->group_uuids)->each(function ($uuid) use ($traveler) {
                if ($group = Group::whereUuid($uuid)->first()) {
                    $traveler->getUser()->groups()->attach($group);
                } else {
                    throw new InvalidGroupException($uuid . ' is an invalid group uuid!');
                }
            });
        } else {
            $traveler->getUser()->groups()->attach(Group::whereDefault()->first());
        }

        return $next($traveler);
    }
}
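The abort behavior can be seen in isolation with a small plain-PHP sketch. It assumes a simplified runner (`runPipeline`, a hypothetical stand-in rather than Laravel's `Pipeline`) and two made-up pipes, `RecordStep` and `FailingStep`; the local `InvalidGroupException` is likewise only a stand-in for the one thrown above.

```php
<?php

// Simplified stand-in for a pipeline runner (hypothetical, not Laravel's Pipeline class).
function runPipeline($traveler, array $pipes, callable $destination)
{
    $chain = array_reduce(
        array_reverse($pipes),
        function (callable $next, $pipe) {
            return function ($traveler) use ($pipe, $next) {
                return $pipe->handle($traveler, $next);
            };
        },
        $destination
    );

    return $chain($traveler);
}

// Stand-in for the exception used in the article's AssignToGroups pipe.
class InvalidGroupException extends Exception
{
}

// Hypothetical pipe that records that it ran.
class RecordStep
{
    private $name;

    public function __construct(string $name)
    {
        $this->name = $name;
    }

    public function handle($traveler, $next)
    {
        $traveler->log[] = $this->name;

        return $next($traveler);
    }
}

// Hypothetical pipe that always fails and never calls $next.
class FailingStep
{
    public function handle($traveler, $next)
    {
        throw new InvalidGroupException('abc-123 is an invalid group uuid!');
    }
}

$traveler = new stdClass();
$traveler->log = [];

try {
    $message = runPipeline(
        $traveler,
        [new RecordStep('create user'), new FailingStep(), new RecordStep('send welcome email')],
        function ($traveler) {
            return 'Success';
        }
    );
} catch (Exception $e) {
    // The exception skips every remaining pipe and the destination callback.
    $message = $e->getMessage();
}
```

After this runs, `$traveler->log` contains only `'create user'`: the welcome email step never executes, and `$message` holds the exception text instead of `'Success'`.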
Partial completion
Since we’ve set up the try/catch already, adding a database transaction helps clean up database state if an anomaly occurs. Start the transaction before the pipeline kicks off, commit it if the process completes successfully, and roll back in the exception handler if there’s a problem.
<?php

use Illuminate\Http\JsonResponse;
use Illuminate\Http\Request;
use Illuminate\Pipeline\Pipeline;
use Illuminate\Support\Facades\DB;

class FictitiousRegisterController
{
    public function register(Request $request): JsonResponse
    {
        $traveler = (new RegisterTraveler())->setRequest($request);

        $pipes = [
            ValidateInvitationCode::class,
            CreateUser::class,
            AssignPermissions::class,
            HandleMailingList::class,
            AssignToGroups::class,
            SendWelcomeEmail::class,
        ];

        try {
            DB::beginTransaction();

            return app(Pipeline::class)
                ->send($traveler)
                ->through($pipes)
                ->then(function ($traveler) {
                    // Commit only once every pipe has finished without throwing.
                    DB::commit();

                    return response()->json([
                        'message' => 'Success',
                    ]);
                });
        } catch (Exception $e) {
            // Undo any writes made by the pipes that ran before the failure.
            DB::rollBack();

            return response()->json([
                'message' => $e->getMessage(),
            ]);
        }
    }
}
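The effect of the rollback can be checked outside of Laravel with a short plain-PDO sketch. This is only an illustration of the begin/commit/roll back shape, assuming the `pdo_sqlite` extension is available; the `users` table, the `registerUser` and `countUsers` helpers, and the simulated failure are all hypothetical.

```php
<?php

// Plain-PDO stand-in for Laravel's DB facade; requires the pdo_sqlite extension.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE users (email TEXT NOT NULL)');

// Hypothetical helper: insert a user, optionally fail in a later "step".
function registerUser(PDO $pdo, string $email, bool $failLater): string
{
    $pdo->beginTransaction();

    try {
        $insert = $pdo->prepare('INSERT INTO users (email) VALUES (?)');
        $insert->execute([$email]);

        if ($failLater) {
            // Simulates a later pipe throwing after the user row was written.
            throw new RuntimeException('Group assignment failed');
        }

        $pdo->commit();

        return 'Success';
    } catch (Exception $e) {
        // Discards the inserted row so no half-registered user remains.
        $pdo->rollBack();

        return $e->getMessage();
    }
}

function countUsers(PDO $pdo): int
{
    return (int) $pdo->query('SELECT COUNT(*) FROM users')->fetchColumn();
}

$failed = registerUser($pdo, 'a@example.com', true);
$afterFailure = countUsers($pdo);

$succeeded = registerUser($pdo, 'b@example.com', false);
$afterSuccess = countUsers($pdo);
```

The failed attempt leaves the table empty even though its insert ran, while the successful attempt leaves exactly one row. Laravel also offers a closure-based `DB::transaction()` helper that rolls back automatically when the closure throws, which may be worth considering if the explicit calls feel noisy.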
Sub-pipelines
Occasionally we have branching logic that makes sense to keep within a single pipe. If that logic becomes too complex, we break it apart into a sub-pipeline. The same principles apply to a sub-pipeline as to a normal pipeline.
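One possible shape for this, sketched in plain PHP: a pipe picks a branch, runs its own inner pipeline over the same traveler, and then hands the result to `$next`. The runner (`runPipeline`), the two inner pipes, and the simplified traveler with `requestedGroups` and `groups` properties are hypothetical and only loosely mirror the group assignment logic from earlier.

```php
<?php

// Simplified stand-in for a pipeline runner (hypothetical, not Laravel's Pipeline class).
function runPipeline($traveler, array $pipes, callable $destination)
{
    $chain = array_reduce(
        array_reverse($pipes),
        function (callable $next, $pipe) {
            return function ($traveler) use ($pipe, $next) {
                return $pipe->handle($traveler, $next);
            };
        },
        $destination
    );

    return $chain($traveler);
}

// Inner pipe: attach every group named in the request.
class AttachRequestedGroups
{
    public function handle($traveler, $next)
    {
        foreach ($traveler->requestedGroups as $group) {
            $traveler->groups[] = $group;
        }

        return $next($traveler);
    }
}

// Inner pipe: fall back to a default group.
class AttachDefaultGroup
{
    public function handle($traveler, $next)
    {
        $traveler->groups[] = 'default';

        return $next($traveler);
    }
}

// Outer pipe: chooses a branch and delegates it to a sub-pipeline.
class AssignToGroups
{
    public function handle($traveler, $next)
    {
        $subPipes = empty($traveler->requestedGroups)
            ? [new AttachDefaultGroup()]
            : [new AttachRequestedGroups()];

        // The sub-pipeline returns the traveler so the outer flow can continue.
        $traveler = runPipeline($traveler, $subPipes, function ($traveler) {
            return $traveler;
        });

        return $next($traveler);
    }
}

function makeTraveler(array $requestedGroups): stdClass
{
    $traveler = new stdClass();
    $traveler->requestedGroups = $requestedGroups;
    $traveler->groups = [];

    return $traveler;
}

$identity = function ($traveler) {
    return $traveler;
};

$withGroups = runPipeline(makeTraveler(['editors', 'authors']), [new AssignToGroups()], $identity);
$withoutGroups = runPipeline(makeTraveler([]), [new AssignToGroups()], $identity);
```

Because the inner pipes follow the same `handle($traveler, $next)` contract, an exception thrown inside the sub-pipeline still bubbles up to whatever try/catch wraps the outer one.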
Benefits of pipelines
Earlier I noted that it can be difficult to test a single method that calls a number of private methods. With a pipeline approach we have the freedom of testing individual pipes in isolation, as well as having higher-level feature tests that ensure the given input produces the expected output.
<?php

class CreateUserTest extends TestCase
{
    /** @test */
    public function it_creates_a_user()
    {
        $traveler = (new RegisterTraveler)->setRequest(new Request(['email' => 'test', 'password' => 'password']));

        (new CreateUser)->handle($traveler, function () {});

        $this->assertInstanceOf(User::class, $traveler->getUser());
    }
}
Conclusion
A pipeline is one solution we’ve used to handle complex data flows. There are other options that may work better for you. The biggest takeaway from this experience was to think one step above the actual implementation and identify what the main players were, define their responsibilities, and implement a solution that maintained their integrity.
(Update - we've consolidated the above technique into an open source package. Please feel free to use it, share with your network, and let us know what you think. You can find us on Twitter @zaengle.)
By Jesse Schutt
Director of Engineering
Jesse is our resident woodworker. His signature is to find the deeper meaning in a project and the right tool for the job.