Three weeks ago, together with Jonathan from Fluent C++, I announced a coding challenge: link here.

Let’s meet the winner and discuss some of the best solutions.

(Our choice is quite surprising! See why :))

First of all, I’d like to thank you all for the submissions to the challenge. The task was ambitious! The final solution wasn’t just a few lines of code, but more than 100: around 200 LOC on average, and sometimes even more. Writing such an app surely took a few good hours. We appreciate your time and effort!

We got 11 entries.

If you’re one of the participants, you should be proud of yourself! You’ve learnt some C++17 and written a working app!
Congratulations!

The rules  

As a reminder:

The task proposed in the challenge is to write a command line tool that takes in a CSV file, overwrites all the data of a given column by a given value, and outputs the results into a new CSV file.

In other words, you had to code a command line tool that transforms an input CSV file with some rules and then saves it as a new file.

Desired effect:

Replace the fields under the “City” label with “London”. We want all the people from the input file to now be located in London.

Not super simple, as it requires several elements like:

  • Reading and writing to a text file
  • Parsing CSV header
  • Parsing CSV lines
  • Searching for a selected column
  • Replacing the text
  • Error handling
  • Reading from command line arguments

Originally it was motivated by a simple PowerShell script:

Import-Csv .\input.csv | ForEach-Object {
    $_."City" = 'London'
    $_
} | Export-Csv .\output.csv -NoTypeInformation

Unfortunately, it’s not that simple in C++ :D A bit more LOC is needed :)

The Winner  

We selected:

Fernando B. Giannasi

Here’s his solution: link to code at Coliru

And here’s a surprising fact about Fernando:

He’s not a professional programmer :)

I work as an Intensive Care and Emergency Physician (a.k.a intensivist) in my cities ICU’s.

And his story:

I’m a Linux enthusiast since the 90’s, which in an almost natural way led me to be interested in programming.
I have a strong background on shell script and Python, which I have also used for data analysis.
The first contact I had with (mostly) C and C++ was before college, about 15 years ago, and it did not suit my needs since I often found myself struggling with awkward syntax and details/constraints from the language rather than the real problem I was trying to solve. So with Python, I went some years after…

But a few years ago I was working with Raspberry-Pi projects, and I felt the lack of performance of my approach using Python and Bash scripts, and I decided to give C++ another try.
Man, what a different language!!
All the algorithms I liked were there on the STL… And the containers, the performance, RAII, everything feels so natural that I never turned back.

Wow! So there’s hope in C++, with the modern features and coding style :) I hope more and more people will perceive C++ that way.

The winner’s solution  

Let’s dive into the code:

If we go from main() into details we get the following picture:

The main() core part:

try 
{
    if (argc != 5) { throw runtime_error("Bad arguments"); }

    auto [in_file, out_file] = get_file_handlers(argv[1], argv[4]);

    string_view new_value = argv[3];
    auto target_index = get_target_column(in_file, argv[2], ',');
    if (target_index) {
        do_work(in_file, out_file, *target_index, new_value, ',');
    }
    else {
        throw runtime_error("Column name doesn't exist in the input file");
    }
}

  • The code reads the input data from argv.
  • It opens the input and output files.
  • It finds the target column (the return value is optional<int>).
  • If the column index is found, we move on to the transform code that does all of the replacement.
  • If anything goes wrong, we get an exception.
  • A structured binding stores the input and output streams.
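The body of get_file_handlers isn't shown here, so the following is only a minimal sketch of how such a helper could look. The name open_files and the error messages are my own assumptions, not Fernando's code. It relies on the fact that file streams are movable, so two of them can be returned in a pair and unpacked with a structured binding:

```cpp
#include <fstream>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-in for get_file_handlers(); the original body is not
// shown in the article, so this is just one plausible shape.
[[nodiscard]] std::pair<std::ifstream, std::ofstream>
open_files(const std::string& in_path, const std::string& out_path)
{
    std::ifstream in{in_path};
    if (!in) { throw std::runtime_error("Cannot open input file: " + in_path); }

    std::ofstream out{out_path};
    if (!out) { throw std::runtime_error("Cannot open output file: " + out_path); }

    // Streams are movable, so both can be returned by value.
    return {std::move(in), std::move(out)};
}
```

At the call site, `auto [in_file, out_file] = open_files(argv[1], argv[4]);` gives each stream its own name without any extra struct.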

get_target_column:

the header:

[[nodiscard]] optional<int> get_target_column(ifstream& input,
                                              const string_view& label,
                                              const char delimiter)

and the core part:

auto tokens = split_string(first_line, delimiter);

if (auto it = find(begin(tokens), end(tokens), label); 
    it == tokens.end()) {
        return {}; 
}
else {
    return distance(begin(tokens), it);
}

  • It reads the first line of the input file and splits it into tokens (using the delimiter).
  • It returns the column index if the label is found, and an empty optional otherwise.
  • [[nodiscard]] reminds you to actually use the return value. See my post about C++17 attributes.
  • The code is super clean and easy to read.
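To try the lookup in isolation, here is a small self-contained sketch of the same idea: std::find plus std::distance inside a C++17 "if with initializer", with the result wrapped in an optional. The function name find_column and the choice of std::size_t for the index are my assumptions; the original returns optional<int> and reads the header from the stream itself.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <optional>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical helper: returns the position of `label` among the header
// tokens, or an empty optional when the label is missing.
[[nodiscard]] std::optional<std::size_t>
find_column(const std::vector<std::string>& header, std::string_view label)
{
    // `it` is visible only inside this if statement.
    if (auto it = std::find(header.begin(), header.end(), label);
        it != header.end()) {
        return static_cast<std::size_t>(std::distance(header.begin(), it));
    }
    return std::nullopt;
}
```

The caller can then test the result with a plain `if (index)` and dereference it with `*index`, exactly as main() does above.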

And below is the code that splits the string (line):

[[nodiscard]] auto split_string(const string_view& input, 
                                const char delimiter) 
{
    stringstream ss {input.data()};
    vector<string> result;

    for (string buffer; 
         getline(ss, buffer, delimiter);) 
            {result.push_back(move(buffer));}

    return result;
}

  • I don’t have to add any comments: the code is clean and very easy to read.
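One thing to keep in mind if you reuse this function elsewhere: a string_view is not guaranteed to be null-terminated, so passing input.data() to the stringstream only works when the view covers a whole null-terminated string, as it does in this program. Below is a minimal sketch of a variant that copies the view into a std::string first. The name split_view is mine, and this is just one possible way to do it:

```cpp
#include <sstream>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// Variant of split_string: builds a std::string from the view, so the result
// does not depend on what follows the view in memory.
[[nodiscard]] std::vector<std::string>
split_view(std::string_view input, char delimiter)
{
    std::istringstream ss{std::string(input)};
    std::vector<std::string> result;

    for (std::string buffer; std::getline(ss, buffer, delimiter);) {
        result.push_back(std::move(buffer));
    }
    return result;
}
```

With this version, splitting a sub-view such as `row.substr(0, 10)` yields only the fields inside that sub-view. Like the original, it drops a trailing empty field, because getline stops once the input is exhausted.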

And here’s the core part of the transformation:

string buffer;

getline(input, buffer); // for the header line
output << buffer << endl;

while (getline(input, buffer)) {
    auto tokens = split_string(buffer, delimiter);
    tokens[target_index] = new_value.data();

    for (auto& i: tokens) {
        output << i;
        output << (i == tokens.back() ? '\n':delimiter);
    }
}

Again: clean and expressive.

Here’s what motivated Fernando:

… I think that a simple problem screamingly demands a simple solution, since correct of course.

So I tried to write a solution that my wife or my son could easily understand…

I mean, C++ has so many nice features, but as with any other tool, using all of them to solve a simple problem didn’t feel right…

The code is a perfect example of modern C++. And this is why Jonathan and I chose him as the winner.

Worth mentioning  

With so many good entries, it was hard for us to pick the winner. Moreover, there are many possible solutions and approaches. You might also want to look at the following examples:

  • In this solution the author used line_iterator and tag_iterator. With those core tools, he was able to traverse the file efficiently. Also, such an approach looks very scalable and can be easily adapted to other requirements.
    • This is advanced code, so we were really impressed with the quality and the effort put into writing such a beauty.
  • In my C++17 articles, I forgot to mention that std::iterator is now deprecated. I am glad that all of the solutions that proposed an iterator took this change into account.
  • Surprisingly, a lot of people used std::experimental::ostream_joiner from Library Fundamentals V2. As far as I know, it is not in the standard yet, but it looks really good.
    • Used in solutions such as the one by William Killian.
    • See the cppreference link.
    • Basically, it’s a more convenient version of ostream_iterator: it writes the delimiter only between elements, so there is no trailing delimiter after the last value.

Summary  

Once again, thank you for the code. It was a great experience to review all the entries. I see how much I need to learn to write such code!

To end this post, I’d like to mention another quote by the winner:

If C++ has the reputation for being too much “expert friendly”, it do not need to be that way, specially if you keep simple things simple.

Isn’t that true? :)