Building a CSV Importer - Part 3

Written By Jesse Schutt
Posted on

If you've made it this far, you already know that we've imported data from a CSV file, mapped the columns into our application's domain, split the data into individual rows, and set up a Laravel Pipeline to ingest the information.

The last item I'd like to cover is how to keep track of what happens to the data during the pipeline flow. This is where the CSVRowLog model comes in to play!

Note: This article includes sections of code that I thought help illustrate the concept. Please reference the repository if you want to see unabridged code.

<?php
    
namespace App;
    
use Illuminate\Database\Eloquent\Model;
    
class CSVRowLog extends Model
{
    
    protected $table = 'csv_row_logs';
    
    protected $fillable = [
        'csv_row_id',
        'code',
        'pipe',
        'message',
        'level'
    ];
}

Migration:

<?php
    
use Illuminate\Support\Facades\Schema;
use Illuminate\Database\Schema\Blueprint;
use Illuminate\Database\Migrations\Migration;
    
class CreateCsvRowLogsTable extends Migration
{
    
    public function up()
    {
        Schema::create('csv_row_logs', function (Blueprint $table) {
            $table->increments('id');
            $table->unsignedInteger('csv_row_id')->index();;
            $table->foreign('csv_row_id')
                ->references('id')
                ->on('csv_rows')
                ->onDelete('cascade');
            $table->string('code')->nullable();
            $table->text('message')->nullable();
            $table->string('level')->nullable();
            $table->timestamps();
        });
    }
    
    public function down()
    {
        Schema::dropIfExists('csv_row_logs');
    }
}

The purpose of the CSVRowLog model is to track events that happen to individual CSVRow records as it passes through the pipeline.

Logging Errors

If something goes wrong we can terminate the import by throwing an exception from any of the pipes.

<?php
    
namespace App\Services\CSVImporter\Pipes;
    
use App\Participant;
use App\Services\CSVImporter\CSVImportTraveler;
use App\Services\CSVImporter\Exceptions\MissingParticipantEmailException;
    
class ImportParticipant implements CSVImporterPipe
{
    
    public function handle(CSVImportTraveler $traveler, \Closure $next)
    {
        if(!isset($traveler->getRow()->contents['email'])) {
            throw new MissingParticipantEmailException('No email was set for participant.');
        }
        
        // ...
 
    }
}

As an example, say the email column is missing on this ImportParticipant pipe. Throwing a new custom MissingParticipantEmailException will abort the pipeline progress.

By updating the CSVImporter we can catch the exceptions and convert them into CSVRowLog records to display back to the user.

<?php

namespace App\Services\CSVImporter;

use App\CSVRow;
use App\CSVRowLog;
use App\Services\CSVImporter\Exceptions\MissingParticipantEmailException;
use App\Services\CSVImporter\Pipes\AssignEmergencyContact;
use App\Services\CSVImporter\Pipes\ImportParticipant;
use App\Services\CSVImporter\Pipes\MapAllergies;
use Illuminate\Pipeline\Pipeline;

class CSVImporter
{

    private $traveler;

    public function __construct(CSVImportTraveler $traveler)
    {
        $this->traveler = $traveler;
    }

    public function importRow(CSVRow $row)
    {
        try {
            return app(Pipeline::class)
                ->send($this->traveler->setRow($row))
                ->through([
                    ImportParticipant::class,
                    MapAllergies::class,
                    AssignEmergencyContact::class
                ])->then(function ($progress) {
                    $this->traveler->getRow()->markImported();

                    return $progress;
                });
        } catch (\Exception $e) {
            $this->logException($e);

            return false;
        }
    }

    private function logException(\Exception $e)
    {
        switch (get_class($e)) {

            case MissingParticipantEmailException::class;
                $pipe = MissingParticipantEmailException::class;
                $code = MissingParticipantEmailException::CODE;
                break;
                
            default:
                $code = 'general_error';
                break;
        }

        $this->traveler->getRow()
            ->logs()
            ->create([
                'pipe'    => $pipe ?? null,
                'code'    => $code ?? null,
                'message' => $e->getMessage(),
                'level'   => CSVRowLog::LEVEL_ERROR
            ]);
    }
}

While there are ways to abstract the creation of logs, this example highlights the overarching flow:

  1. Send the data through the pipeline
  2. Raise an exception if something happens that is considered an "error"
  3. Catch the error and turn it into a log to keep track of what went wrong

Rolling Back Partial Progress

It's likely that if an exception is thrown in a pipe we don't want to keep the data from previous pipes as it would result in a partial import of the row. Simply wrapping the pipeline in a database transaction allows us to roll back any of the previous changes!

public function importRow(CSVRow $row)
{
    try {
        DB::beginTransaction();
        return app(Pipeline::class)
            ->send($this->traveler->setRow($row))
            ->through([
                ImportParticipant::class,
                MapAllergies::class,
                AssignEmergencyContact::class
            ])->then(function ($progress) {
                $this->traveler->getRow()->markImported();
                DB::commit();

                return $progress;
            });
    } catch (\Exception $e) {
        DB::rollBack();
        $this->logException($e);

        return false;
    }
}

Logging Warnings

In the case where something happens with the data that doesn't warrant a full-stop you can also write CSVRowLog records from within pipes.

<?php
    
namespace App\Services\CSVImporter\Pipes;
    
use App\Allergy;
use App\Services\CSVImporter\CSVImportTraveler;
    
class MapAllergies implements CSVImporterPipe
{
    
    const PARTICIPANT_HAS_DIETARY_NEEDS = 'participant_has_dietary_needs';
   
    public function handle(CSVImportTraveler $traveler, \Closure $next)
    {
       	// ...
    
        if(!empty($traveler->getRow()->contents['allergies'])) {
            $traveler->getRow()
                ->logs()
                ->create([
                    'code' => self::PARTICIPANT_HAS_DIETARY_NEEDS,
                    'pipe' => self::class,
                    'message' => 'Participant has dietary needs!'
                    'level' => 'info'
                ]);
        }
            
        // ....
    
    }
}

Displaying Logs

Now that we have CSVRowLog records related to CSVRow records we can simply output the results like this:

Csv Index
Csv Logs

In Conclusion

Hopefully this walkthrough has given you some ideas of how you can implement a CSV importer in your own application. Your implementation will look different than mine, but it's my hope that the high-level principles translate well.

Ping me on Twitter @jesseschutt if you have any questions or suggestions for improvement and check out the repository if you would like to see the all the code!

  1. Hero Image