Let’s see pimpl and its alternatives in a real application! I’ve implemented a small utility app - for file compression - where we can experiment with various designs.

Is it better to use pimpl or maybe abstract interfaces? Read on to discover.

Intro  

In my previous post I covered the pimpl pattern. I discussed the basic structure, extensions, pros and cons and alternatives. Still, the post might sound a bit “theoretical”. Today I’d like to describe a practical usage of the pattern. Rather than inventing artificial names like MyClass and MyClassImpl you’ll see something more realistic: like FileCompressor or ICompressionMethod.

Moreover, this is the first time I’ve used Conan to streamline the work with third-party libraries (as we need a few of them).

Ok, so what’s the example?

The app - command line file compressor  

As an example, I’ve chosen a utility app that helps with packing files.

Basic use case:

Users run this utility app in a console environment. A list of files (or directories) can be passed, along with the name of the output file. The extension of the output file also selects the compression method: .zip for zip, .bz2 for BZ compression, etc. Users can also run the app in help mode, which lists some basic options and the available compression methods. When the compression is finished, a simple summary is shown: bytes processed and the final size of the output file.

Requirements:

  • a console application
  • command line with a few options
    • output file - also specifies the compression method
    • list of files (also with directory support)
  • basic summary at the end of the compression process

The same can be achieved with the command line mode of your favourite archive manager (like 7z). Still, I wanted to see how hard it is to compress a file from C++.

The full source code can be found at my GitHub page: GitHub/fenbf/CompressFileUtil.

Simple implementation  

Let’s start simple.

When I was learning how to use Conan - through their tutorial - I came across a helpful library called Poco:

Modern, powerful open source C++ class libraries for building network- and internet-based applications that run on desktop, server, mobile and embedded systems.

One thing I’ve noticed was that it supports Zip compression. So all I have to do for the application is to use the library, and the compression is done.

I came up with the following solution:

Starting from main() and going into details of the implementation:

int main(int argc, char* argv[])
{
    auto inputParams = ParseCommandLine(argc, argv);

    if (inputParams.has_value())
    {
        auto params = inputParams.value();

        RunCompressor(params);
    }
    else
        ShowHelp();
}
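The shape of main() relies on ParseCommandLine returning a std::optional: an empty result means "show help". The real project parses arguments with Boost Program Options, so the following is only a minimal hand-rolled sketch. It assumes a hypothetical convention (first argument is the output file, the rest are inputs) and an assumed layout of InputParams with the two members used later in the article.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// Assumed layout: the article only shows that these two members exist.
struct InputParams
{
    std::vector<std::string> m_files;
    std::string m_output;
};

// Hypothetical convention: argv[1] is the output file, argv[2..] are inputs.
// Returns std::nullopt when there is not enough information to run.
std::optional<InputParams> ParseCommandLine(int argc, const char* const argv[])
{
    assert(argv != nullptr);

    if (argc < 3)
        return std::nullopt;

    InputParams params;
    params.m_output = argv[1];
    params.m_files.assign(argv + 2, argv + argc);
    return params;
}
```

Returning an optional keeps the "nothing to do" case out of InputParams itself, so main() only has to check has_value().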

I won’t discuss the underlying implementation of parsing the command line. Let’s skip to RunCompressor() instead:

void RunCompressor(const InputParams& params) noexcept
{
    try
    {
        FileCompressor compressor;
        compressor.Compress(params.m_files, params.m_output);
    }
    catch (const std::exception& ex)
    {
        std::cerr << "Error: " << ex.what() << '\n';
    }
    catch (...)
    {
        std::cerr << "Unexpected error\n";
    }
}

Ok, so what’s the deal with pimpl or abstract interfaces?

The first iteration has none of them :)

FileCompressor is declared in FileCompressor.h and is directly included by the file with main() (CompressFileUtil.cpp):

#include <Poco/Zip/Compress.h>

class FileCompressor
{
public:
    void Compress(const StringVector& vecFileNames, 
                  const string& outputFileName);

private:
    void CompressZip(const StringVector& vecFileNames, 
                     const string& outputFileName);
    void CompressOneElement(Poco::Zip::Compress& compressor, 
                            const string& fileName);
};

The class is straightforward: just one method, Compress, to which you pass a vector of strings (filenames) and the file name of the output archive to create. It checks the output file extension and forwards the work to CompressZip (only zip for now):

void FileCompressor::CompressZip(const StringVector& vecFileNames, 
                                 const string& outputFileName)
{
    std::ofstream out(outputFileName, std::ios::binary);
    Poco::Zip::Compress compressor(out, /*seekable output*/true);

    for (const auto& fileName : vecFileNames)
        CompressOneElement(compressor, fileName);

    compressor.close();
}
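The extension check that Compress performs before forwarding to CompressZip is not shown in the article. One possible way to do it, as a sketch only: the helper names GetLowercaseExtension and IsZipOutput are hypothetical, and std::filesystem (C++17) is assumed to be available.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <filesystem>
#include <string>

// Hypothetical helper: returns the extension of the output file in lower
// case, including the leading dot (".zip"), or an empty string if none.
std::string GetLowercaseExtension(const std::string& outputFileName)
{
    std::string ext = std::filesystem::path(outputFileName).extension().string();
    std::transform(ext.begin(), ext.end(), ext.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return ext;
}

// Hypothetical check that decides whether to forward to CompressZip.
bool IsZipOutput(const std::string& outputFileName)
{
    return GetLowercaseExtension(outputFileName) == ".zip";
}
```

Lower-casing the extension means that "Backup.ZIP" and "backup.zip" select the same method.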

CompressOneElement() uses Poco’s compressor to do all the magic:

Poco::File f(fileName);
if (f.exists())
{
    Poco::Path p(f.path());
    if (f.isDirectory())
    {
        compressor.addRecursive(p, Poco::Zip::ZipCommon::CL_MAXIMUM, 
                                /*excludeRoot*/true, p.getFileName());
    }
    else if (f.isFile())
    {
        compressor.addFile(p, p.getFileName(), 
                            Poco::Zip::ZipCommon::CM_DEFLATE,
                            Poco::Zip::ZipCommon::CL_MAXIMUM);
    }
}

Please notice two things:

  • Firstly: all of the private implementation is shown here (no fields, but private methods).
  • Secondly: types from a third party library are included (might be avoided by using forward declaration).

In other words: every time you decide to change the private implementation (add a method or field) every compilation unit that includes the file will have to be recompiled.
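The forward-declaration idea from the second point could look roughly like this. It is a sketch under two assumptions: that Poco::Zip::Compress is a plain class (forward declaring a third-party type is fragile and breaks if the library ever turns it into a typedef or template), and that StringVector is an alias for a vector of strings, which the article uses without showing its definition.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Forward declaration instead of #include <Poco/Zip/Compress.h>.
// It is enough here because the header only uses the type by reference.
namespace Poco { namespace Zip { class Compress; } }

// Assumed alias; not shown in the article.
using StringVector = std::vector<std::string>;

class FileCompressor
{
public:
    void Compress(const StringVector& vecFileNames,
                  const std::string& outputFileName);

private:
    void CompressZip(const StringVector& vecFileNames,
                     const std::string& outputFileName);
    void CompressOneElement(Poco::Zip::Compress& compressor,
                            const std::string& fileName);
};
```

This removes the include dependency on Poco from client code, but the private methods are still visible in the header, so changing them still forces clients to recompile.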

Now we’ve reached the main point of this article:

We aim for pimpl or an abstract interface to limit compilation dependencies.

Of course, the public interface might also change, but it’s probably less often than changing the internals.

In theory, we could avoid Poco types in the header - we could limit the number of private methods, maybe implement static free functions in FileCompressor.cpp. Still, sooner or later we’ll end up having private implementation revealed in the class declaration in one way or another.

I’ve shown the basic code structure and classes. But let’s now have a look at the project structure and how those third-party libraries will be plugged in.

Using Conan to streamline the work  

The first iteration implements only part of the requirements, but at least the project setup is scalable and a solid foundation for later steps.

As I mentioned before, with this project I’ve used Conan (Conan 1.0 was released on 10th January, so only a few days ago!) for the first time (apart from some little tutorials). Firstly, I needed to understand where I can plug it in and how it can help.

In short: in the case of our application, Conan does all the work to provide other libraries for the project. We are using some third party libraries, but a Conan package can be much more (and you can create your custom ones).

To fetch a package you have to specify its name in a special file: conanfile.txt (that is placed in your project directory).

It might look as follows:

[requires]
Poco/1.8.0.1@pocoproject/stable

[generators]
visual_studio

Full reference in the docs: conanfile.txt

Conan has several generators that do all the work for you. They collect information from dependencies, like include paths, library paths, library names or compile definitions, and they translate/generate a file that the respective build system can understand. I was happy to see “Visual Studio Generator” as one of them (your favourite build tool is probably also on the list of Conan’s Generators).

With this little setup the magic can start:

Now, all you have to do is to run (in that folder) the Conan tool and install the packages.

conan install . -s build_type=Debug -if build_debug -s arch=x86

This command will fetch the required packages (or use cache), also get package’s dependencies, install them in a directory (in the system), build the binaries (if needed) and finally generate correct build options (include/lib directories) for your compiler.

In the case of Visual Studio, in my project folder\build_debug I’ll get conanbuildinfo.props with all the settings. So I have to include that property file in my project and build it… and it should work :)

But why does Conan help here?

Imagine what you would have to do to add another library by hand. Each step:

  • download a proper version of the library,
  • download dependencies,
  • build all,
  • install,
  • set up Visual Studio (or another system) and provide the correct paths…

I hate doing such work. But with Conan, replacing libs and playing with various alternatives is very easy.

Moreover, Conan managed to install OpenSSL library - a dependency for Poco - and on Windows building OpenSSL is a pain as far as I know.

Ok… but where can you find all of the libraries?

Have a look here:

Let’s go back to the project implementation.

Improvements, more libs:  

The first version of the application uses only Poco to handle zip files, but we need at least two more:

  • Boost program options - to provide an easy way to parse the command line arguments.
  • BZ compression library - I’ve searched for various libs that would be easy to plug into the project, and BZ seems to be the easiest one.

In order to use the libraries, I have to add the proper links/names to conanfile.txt.

[requires]
Poco/1.8.0.1@pocoproject/stable
Boost.Program_Options/1.65.1@bincrafters/stable 
bzip2/1.0.6@conan/stable

Thanks to Bincrafters, Boost libraries are now divided into separate packages!

Still, Boost in general has a dense dependency graph (between the libraries), so the program options library that I needed brought in a lot of other Boost libs. Even so, it works nicely in the project.

We have all the libraries, so we move forward with the project. Let’s prepare some background work for the support of more compression methods.

Compression methods  

Since we want to have two methods (and maybe more in the future), it’s better to separate the classes. That will work better when we’d like to add another implementation.

The interface:

class ICompressionMethod
{
public:
    ICompressionMethod() = default;
    virtual ~ICompressionMethod() = default;

    virtual DataStats Compress(const StringVector& vecFileNames, 
                               const string& outputFileName) = 0;
};

Then we have two derived classes:

  • ZipCompression - converted from the first implementation.
  • BZCompression - BZ2 compression doesn’t provide an archiving option, so we can store just one file using that method. Still, it’s common to pack the files first (for example with TAR) and then compress that single file. In this implementation, for simplicity, I’ve used Zip (fastest mode) as the first step, and then BZ compresses the final package.

There’s also a factory class that simplifies the process of creating the required classes… but I’ll skip the details for now.
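Since the factory itself isn’t shown, here is one way such a class might be organised. Everything in this sketch is hypothetical: the CompressionMethodFactory name, the extension-to-creator map, the stub method classes and the layout of DataStats are assumptions for illustration, not the code from the repository.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <vector>

using StringVector = std::vector<std::string>;   // assumed alias

struct DataStats                                  // assumed layout
{
    std::size_t m_bytesProcessed = 0;
    std::size_t m_BytesSaved = 0;
};

class ICompressionMethod
{
public:
    virtual ~ICompressionMethod() = default;
    virtual DataStats Compress(const StringVector& vecFileNames,
                               const std::string& outputFileName) = 0;
};

// Stubs standing in for the real ZipCompression / BZCompression classes.
class ZipCompression : public ICompressionMethod
{
public:
    DataStats Compress(const StringVector&, const std::string&) override { return {}; }
};

class BZCompression : public ICompressionMethod
{
public:
    DataStats Compress(const StringVector&, const std::string&) override { return {}; }
};

// Hypothetical factory: maps a file extension to a creator function.
class CompressionMethodFactory
{
public:
    using Creator = std::function<std::unique_ptr<ICompressionMethod>()>;

    static const std::map<std::string, Creator>& Methods()
    {
        static const std::map<std::string, Creator> methods = {
            { ".zip", [] { return std::make_unique<ZipCompression>(); } },
            { ".bz2", [] { return std::make_unique<BZCompression>(); } }
        };
        return methods;
    }

    // Returns nullptr when the extension is not supported.
    static std::unique_ptr<ICompressionMethod> Create(const std::string& extension)
    {
        const auto it = Methods().find(extension);
        if (it == Methods().end())
            return nullptr;
        return it->second();
    }
};
```

A map like this has a side benefit: iterating over Methods() gives the list of supported extensions, which is what the help mode needs to print.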

We have all the required code, so let’s try the pimpl approach:

pimpl version  

The basic idea of the pimpl pattern is to have another class “inside” the class we want to divide. That ‘hidden’ class handles the whole private section.

In our case, we need CompressorImpl that implements the private details of FileCompressor.

The main class looks like that now:

class FileCompressor
{
public:
    FileCompressor();
    ~FileCompressor();

    // movable:
    FileCompressor(FileCompressor && fc) noexcept;   
    FileCompressor& operator=(FileCompressor && fc) noexcept;

    // and copyable
    FileCompressor(const FileCompressor& fc);
    FileCompressor& operator=(const FileCompressor& fc);

    void Compress(const StringVector& vecFileNames, 
                  const string& outputFileName);

private:
    class CompressorImpl;

    const CompressorImpl* Pimpl() const { return m_pImpl.get(); }
    CompressorImpl* Pimpl() { return m_pImpl.get(); }

    std::unique_ptr<CompressorImpl> m_pImpl;
};

The code is longer than in the first approach. This is because we have to write all the preparation code:

  • in the constructor we’ll create and allocate the private pointer.
  • we’re using unique_ptr, so the destructor must be defined in the cpp file, where CompressorImpl is a complete type; otherwise we get a compilation problem (the deleter cannot delete an incomplete type).
  • the class is movable and copyable, so the move and copy constructors and assignment operators have to be implemented.
  • CompressorImpl is forward declared in the private section.
  • Pimpl accessors are required to implement const methods properly. See why it’s essential in my previous post.
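The points above can be sketched in a single file as follows. This is a reduced illustration, not the project’s code: the Compress method is left out, and SetLabel/Label plus the m_label field are hypothetical additions whose only purpose is to make the deep copy observable.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// --- FileCompressor.h (sketch) ---
class FileCompressor
{
public:
    FileCompressor();
    ~FileCompressor();

    FileCompressor(FileCompressor&& fc) noexcept;
    FileCompressor& operator=(FileCompressor&& fc) noexcept;

    FileCompressor(const FileCompressor& fc);
    FileCompressor& operator=(const FileCompressor& fc);

    // Hypothetical accessors, added only to make the copy observable.
    void SetLabel(const std::string& label);
    const std::string& Label() const;

private:
    class CompressorImpl;
    std::unique_ptr<CompressorImpl> m_pImpl;
};

// --- FileCompressor.cpp (sketch) ---
class FileCompressor::CompressorImpl
{
public:
    std::string m_label;   // hypothetical private state
};

FileCompressor::FileCompressor()
    : m_pImpl(std::make_unique<CompressorImpl>()) { }

// Defined here, where CompressorImpl is a complete type.
FileCompressor::~FileCompressor() = default;

FileCompressor::FileCompressor(FileCompressor&&) noexcept = default;
FileCompressor& FileCompressor::operator=(FileCompressor&&) noexcept = default;

// Deep copy: clone the implementation object, not the pointer.
// A moved-from source has no implementation, so the copy gets none either.
FileCompressor::FileCompressor(const FileCompressor& fc)
    : m_pImpl(fc.m_pImpl ? std::make_unique<CompressorImpl>(*fc.m_pImpl) : nullptr) { }

FileCompressor& FileCompressor::operator=(const FileCompressor& fc)
{
    if (this != &fc)
        m_pImpl = fc.m_pImpl ? std::make_unique<CompressorImpl>(*fc.m_pImpl) : nullptr;
    return *this;
}

// The accessors assume the object has not been moved from.
void FileCompressor::SetLabel(const std::string& label) { m_pImpl->m_label = label; }
const std::string& FileCompressor::Label() const { return m_pImpl->m_label; }
```

The move operations can simply be defaulted in the cpp file, while the copy operations need a hand-written clone because unique_ptr is not copyable.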

And the CompressorImpl class:

class FileCompressor::CompressorImpl
{
public:
    CompressorImpl() { }

    void Compress(const StringVector& vecFileNames, 
                  const string& outputFileName);
};

Unique pointer for pimpl is created in the constructor of FileCompressor and optionally copied in the copy constructor.

Now, every method in the main class needs to forward the call to the private implementation, like:

void FileCompressor::Compress(const StringVector& vecFileNames, 
                              const string& outputFileName)
{
    Pimpl()->Compress(vecFileNames, outputFileName);
}

The ‘real’ Compress() method decides which Compression method should be used (by the extension of the output file name) and then creates the method and forwards parameters.

Ok… but what’s the deal with having to implement all of that additional code, plus some boilerplate, plus that pointer management and proxy methods… ?

How did pimpl break dependencies?  

The reason: Breaking dependencies.

After the core structure is working we can change the private implementation as much as we like and the client code (that includes FileCompressor.h) doesn’t have to be recompiled.

In this project I’ve used precompiled headers, and what’s more, the project is small, so the gain is hard to notice. But it might play a role when you have many dependencies.

Another essential property of pimpl is ABI compatibility; it’s not important in the case of this example, however. I’ll return to this topic in a future blog post.

Still, what if the whole compression code, together with the interface, sits in a different binary, a separate DLL? In that case, even if you change the private implementation, the ABI doesn’t change, so you can safely distribute a new version of the library.

Implementing more requirements  

Ok… so something should work now, but we have two more elements to implement:

  • showing stats
  • showing all available compression methods

How to do it in the pimpl version?

In case of showing stats:

Stats are already supported by compression methods, so we just need to return them.

So we declare a new method in the public interface:

class FileCompressor 
{
    ...
    void ShowStatsAfterCompression(ostream& os) const;
};

This will only be a proxy method:

void FileCompressor::ShowStatsAfterCompression(ostream& os) const
{
    Pimpl()->ShowStatsAfterCompression(os);
}

(This is where the Pimpl() accessors kick in: inside a const method they return a pointer to const, so the compiler won’t let us forget const on the corresponding method in CompressorImpl.)
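Why the accessors matter can be checked at compile time. In this sketch, Holder and Impl are hypothetical stand-ins with public members so that the difference is easy to inspect: a const owner with a bare unique_ptr still hands out a mutable implementation, while the const overload of Pimpl() does not.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <utility>

struct Impl { int m_value = 0; };   // hypothetical implementation type

class Holder                        // hypothetical owner, public for the demo
{
public:
    const Impl* Pimpl() const { return m_pImpl.get(); }
    Impl* Pimpl() { return m_pImpl.get(); }

    std::unique_ptr<Impl> m_pImpl = std::make_unique<Impl>();
};

// Through a const Holder, the raw unique_ptr still yields a mutable Impl...
static_assert(!std::is_const_v<std::remove_reference_t<
                  decltype(*std::declval<const Holder&>().m_pImpl)>>,
              "unique_ptr does not propagate const");

// ...while the accessor yields a pointer to const Impl.
static_assert(std::is_const_v<std::remove_pointer_t<
                  decltype(std::declval<const Holder&>().Pimpl())>>,
              "Pimpl() propagates const");
```

So calling a non-const method through Pimpl() from a const method is a compile error, whereas the same call through m_pImpl-> would silently compile.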

And… at last, the actual implementation:

void FileCompressor::CompressorImpl
::ShowStatsAfterCompression(ostream& os) const
{
    os << "Stats:\n";
    os << "Bytes Read: " << m_stats.m_bytesProcessed << "\n";
    os << "Bytes Saved: " << m_stats.m_BytesSaved << "\n";
}

So much code… just for writing a simple new method.

Ok… by now I hope you have an intuition of how pimpl works in our example. I’ve prepared another version that uses an abstract interface. Maybe it’s cleaner and easier to use than pimpl?

The Abstract Interface version  

If you read the section about compression methods - where ICompressionMethod is introduced - you might already have an idea of how to apply the same approach to FileCompressor.

Keep in mind that we want to break the physical dependency between the client code and the implementation. That’s why we declare an abstract interface and then provide some way to create the actual implementation (a factory?). The implementation lives only in the cpp file, so the client code won’t depend on it.

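class IFileCompressor
{
public:
    virtual ~IFileCompressor() = default;

    virtual void Compress(const StringVector& vecFileNames, 
                          const string& outputFileName) = 0;
    // declared here so that the override in the implementation class compiles
    virtual void ShowStatsAfterCompression(ostream& os) const = 0;

    static unique_ptr<IFileCompressor> CreateImpl();
};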

And then inside cpp file we can create the final class:

class FileCompressor : public IFileCompressor
{
public:
    void Compress(const StringVector& vecFileNames, 
                  const string& outputFileName) override;
    void ShowStatsAfterCompression(ostream& os) const override;

private:
    DataStats m_stats;
};

And the factory method:

unique_ptr<IFileCompressor> IFileCompressor::CreateImpl()
{
    return unique_ptr<IFileCompressor>(new FileCompressor());
}
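To see the whole arrangement from the client’s side, here is a single-file sketch with the three parts marked by comments. It makes a few assumptions: the interface also declares ShowStatsAfterCompression, the implementation is a stub that only counts files (m_filesSeen is hypothetical), and RunCompressor takes a stream parameter so the output can be inspected.

```cpp
#include <cassert>
#include <cstddef>
#include <iostream>
#include <memory>
#include <sstream>
#include <string>
#include <vector>

using StringVector = std::vector<std::string>;   // assumed alias

// --- IFileCompressor.h: everything the client sees ---
class IFileCompressor
{
public:
    virtual ~IFileCompressor() = default;
    virtual void Compress(const StringVector& vecFileNames,
                          const std::string& outputFileName) = 0;
    virtual void ShowStatsAfterCompression(std::ostream& os) const = 0;

    static std::unique_ptr<IFileCompressor> CreateImpl();
};

// --- FileCompressor.cpp: hidden from the client ---
namespace
{
class FileCompressor : public IFileCompressor
{
public:
    void Compress(const StringVector& vecFileNames,
                  const std::string&) override
    {
        m_filesSeen = vecFileNames.size();   // stub instead of real compression
    }

    void ShowStatsAfterCompression(std::ostream& os) const override
    {
        os << "Files: " << m_filesSeen << '\n';
    }

private:
    std::size_t m_filesSeen = 0;   // hypothetical private state
};
}

std::unique_ptr<IFileCompressor> IFileCompressor::CreateImpl()
{
    return std::make_unique<FileCompressor>();
}

// --- client code: depends on the interface only ---
void RunCompressor(const StringVector& files, const std::string& output,
                   std::ostream& os)
{
    auto compressor = IFileCompressor::CreateImpl();
    compressor->Compress(files, output);
    compressor->ShowStatsAfterCompression(os);
}
```

Note that the client can no longer write `FileCompressor compressor;` on the stack; it always works through the pointer returned by CreateImpl().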

Can that work?

How did the abstract interface break dependencies?  

With the abstract interface approach, the exact implementation is declared and defined in a separate cpp file. So if we change it, there’s no need to recompile the client code. That’s the same benefit we get with pimpl.

Was it easier than pimpl?

Yes!

No need for special classes, pointer management or proxy methods. When I implemented this, it was much cleaner.

Why might it be worse?

ABI compatibility.

If you want to add a new method to the public interface, it must be a virtual one. In pimpl, it can be a normal non-virtual method. The problem is that when you use a polymorphic type, you also get a hidden dependency on its vtable.

Now, if you add a new virtual method, the vtable might be completely different, so you cannot be sure that it will still work with the client’s code.

Also, ABI compatibility requires the size and layout of the class to be unchanged. So if you add a private member, that will change the size.
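The size argument can be illustrated with a few compile-time checks. All the names below are hypothetical "before and after" versions of a class, and the second check reflects how mainstream implementations lay out unique_ptr rather than a guarantee of the standard.

```cpp
#include <cassert>
#include <memory>

// Two "releases" of a class whose private data is part of its layout.
class PlainV1 { int m_a = 0; };
class PlainV2 { int m_a = 0; double m_b = 0.0; };   // one private member added

// Two "releases" of a pimpl-style handle: private data lives behind a pointer.
struct ImplV1 { int m_a = 0; };
struct ImplV2 { int m_a = 0; double m_b = 0.0; };
class HandleV1 { std::unique_ptr<ImplV1> m_pImpl; };
class HandleV2 { std::unique_ptr<ImplV2> m_pImpl; };

static_assert(sizeof(PlainV2) > sizeof(PlainV1),
              "adding a member changes the size of the visible class");
static_assert(sizeof(HandleV1) == sizeof(HandleV2),
              "the handle stays pointer-sized whatever the Impl contains");
```

Code compiled against the V1 layout of a plain class would allocate too little memory for a V2 object, while the handle’s footprint is the same in both versions.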

Comparison  

Let’s roughly compare what we’ve achieved so far with pimpl and the abstract interface.

| Feature | pimpl | Abstract Interface |
| Compilation firewall | Yes | Yes |
| ABI compatibility | Yes | No |
| How to add a new method | Add new method in the main class; implement proxy method; implement the actual implementation | Add new virtual method into the Interface; implement the override method in the implementation class |
| How to add a new private member? | Inside pimpl class; doesn’t affect ABI | Inside the interface implementation; changes size of the object, so is not binary compatible |
| Others | Quite not clean; harder to debug | It’s usually clean; cannot be used as a value on stack |

Summary  

This was a fun project.

We went from a straightforward implementation to a version where we managed to limit compilation dependencies. Two methods were tested: pimpl and abstract interface.

Personally, I prefer the abstract interface version. It’s much easier to maintain (as it’s only one class + interface), rather than a class that serves as a proxy plus the real private implementation.

What’s your choice?

Moreover, I enjoyed working with Conan as a package manager. It significantly improved the development speed! If I wanted to test a new library (a new compression method), I just had to find the proper link and update conanfile.txt. I hope to have more occasions to use this system. Maybe even as a producer of a package.

And here I’d like to thank JFrog-Conan for sponsoring and helping in writing this blog post.

But that’s not the end!

Some time in the future it would be cool to improve the code and return with an example of a separate DLL, to see what ABI compatibility really means… and how it works.