
How to implement the factory method pattern in C++ correctly

There's this one thing in C++ which has been making me feel uncomfortable for quite a long time, because I honestly don't know how to do it, even though it sounds simple:

How do I implement Factory Method in C++ correctly?

Goal: to allow the client to instantiate some object using factory methods instead of the object's constructors, without unacceptable consequences or a performance hit.

By "Factory method pattern", I mean both static factory methods inside an object or methods defined in another class, or global functions. Just generally "the concept of redirecting the normal way of instantiation of class X to anywhere else than the constructor".

Let me skim through some possible answers which I have thought of.

0) Don't make factories, make constructors.

This sounds nice (and is indeed often the best solution), but it is not a general remedy. First of all, there are cases when object construction is a task complex enough to justify its extraction to another class. But even putting that fact aside, even for simple objects, using just constructors often won't do.

The simplest example I know is a 2-D Vector class. So simple, yet tricky. I want to be able to construct it from both Cartesian and polar coordinates. Obviously, I cannot do:

struct Vec2 {
    Vec2(float x, float y);
    Vec2(float angle, float magnitude); // not a valid overload!
    // ...
};

My natural way of thinking is then:

struct Vec2 {
    static Vec2 fromLinear(float x, float y);
    static Vec2 fromPolar(float angle, float magnitude);
    // ...
};

Which, instead of constructors, leads me to usage of static factory methods... which essentially means that I'm implementing the factory pattern, in some way ("the class becomes its own factory"). This looks nice (and would suit this particular case), but fails in some cases, which I'm going to describe in point 2. Do read on.
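To make the "class becomes its own factory" idea concrete, here is a minimal sketch. The private constructor, the public x and y members and the vec2Example function are my own assumptions for illustration; they are not part of the declaration above.

```cpp
#include <cassert>
#include <cmath>

struct Vec2 {
    // Named factory functions replace the ambiguous constructor overloads.
    static Vec2 fromLinear(float x, float y) { return Vec2(x, y); }
    static Vec2 fromPolar(float angle, float magnitude) {
        return Vec2(magnitude * std::cos(angle), magnitude * std::sin(angle));
    }

    float x, y;  // assumed storage: Cartesian components

private:
    // Private, so the factories are the only way in (an assumption of this sketch).
    Vec2(float x_, float y_) : x(x_), y(y_) {}
};

// Usage: the call site states which coordinate system is meant.
inline void vec2Example() {
    Vec2 a = Vec2::fromLinear(3.0f, 4.0f);
    Vec2 b = Vec2::fromPolar(0.0f, 2.0f);  // angle 0, length 2
    assert(a.x == 3.0f && a.y == 4.0f);
    assert(std::fabs(b.x - 2.0f) < 1e-6f && std::fabs(b.y) < 1e-6f);
}
```

Both factories return by value, so this sketch inherits the return-by-value concerns discussed in point 2.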

Another case: trying to overload on two opaque typedefs of some API (such as GUIDs of unrelated domains, or a GUID and a bitfield). The types are semantically totally different (so, in theory, valid overloads) but actually turn out to be the same thing, like unsigned ints or void pointers.
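A small sketch of that typedef problem and one possible way around it, wrapping each typedef in its own struct. WindowId, TextureId, the handle structs and describe are all hypothetical names, not taken from any real API.

```cpp
#include <cassert>
#include <string>

// Hypothetical API typedefs: different meanings, same underlying type.
typedef unsigned int WindowId;
typedef unsigned int TextureId;

// void release(WindowId);
// void release(TextureId);  // declares the very same function again, so the
//                           // two cannot be given different definitions

// Thin wrapper structs give each meaning a distinct type again.
struct WindowHandle  { WindowId id; };
struct TextureHandle { TextureId id; };

inline std::string describe(WindowHandle h)  { return "window "  + std::to_string(h.id); }
inline std::string describe(TextureHandle h) { return "texture " + std::to_string(h.id); }

inline void handleExample() {
    assert(describe(WindowHandle{7u}) == "window 7");
    assert(describe(TextureHandle{7u}) == "texture 7");
}
```

This has the same flavour as wrapping constructor arguments in dedicated parameter types: it helps with overloading, but it does nothing for the allocation questions below.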

1) The Java Way

Java has it simple, as we only have dynamic-allocated objects. Making a factory is as trivial as:

class FooFactory {
    public Foo createFooInSomeWay() {
        // can be a static method as well,
        //  if we don't need the factory to provide its own object semantics
        //  and just serve as a group of methods
        return new Foo(some, args);
    }
}

In C++, this translates to:

class FooFactory {
public:
    Foo* createFooInSomeWay() {
        return new Foo(some, args);
    }
};

Cool? Often, indeed. But then, this forces the user to only use dynamic allocation. Static allocation is what makes C++ complex, but it is also what often makes it powerful. Also, I believe that there exist some targets (keyword: embedded) which don't allow for dynamic allocation. And that doesn't mean that the users of those platforms wouldn't like to write clean OOP.

Anyway, philosophy aside: In the general case, I don't want to force the users of the factory to be restrained to dynamic allocation.

2) Return-by-value

OK, so we know that 1) is cool when we want dynamic allocation. Why won't we add static allocation on top of that?

class FooFactory {
public:
    Foo* createFooInSomeWay() {
        return new Foo(some, args);
    }
    Foo createFooInSomeWay() {
        return Foo(some, args);
    }
};

What? We can't overload by the return type? Oh, of course we can't. So let's change the method names to reflect that. And yes, I've written the invalid code example above just to stress how much I dislike the need to change the method name. For example, we can no longer implement a language-agnostic factory design properly, since we have to change names, and every user of this code will need to remember how the implementation differs from the specification.

class FooFactory {
public:
    Foo* createDynamicFooInSomeWay() {
        return new Foo(some, args);
    }
    Foo createFooObjectInSomeWay() {
        return Foo(some, args);
    }
};

OK... there we have it. It's ugly, as we need to change the method name. It's imperfect, since we need to write the same code twice. But once done, it works. Right?

Well, usually. But sometimes it does not. When creating Foo, we actually depend on the compiler to perform the return value optimisation for us, because the C++ standard (before C++17) leaves it up to the compiler vendors to decide when the object is created in place and when it is copied on returning a temporary object by value. So if Foo is expensive to copy, this approach is risky.

And what if Foo is not copiable at all? Well, doh. (Note that in C++17 with guaranteed copy elision, not-being-copiable is no problem anymore for the code above)
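One way to check what a given compiler actually does is to count copies and moves. The following is only a sketch; Counted, makeCounted and countedExample are hypothetical names.

```cpp
#include <cassert>

// Hypothetical type that counts how often it is copied or moved.
struct Counted {
    static int copies;
    static int moves;
    explicit Counted(int v) : value(v) {}
    Counted(const Counted& o) : value(o.value) { ++copies; }
    Counted(Counted&& o) noexcept : value(o.value) { ++moves; }
    int value;
};
int Counted::copies = 0;
int Counted::moves = 0;

// Factory function returning a temporary by value.
inline Counted makeCounted(int v) { return Counted(v); }

inline void countedExample() {
    Counted c = makeCounted(7);
    assert(c.value == 7);
    // Guaranteed from C++17 on; before that it is only a permitted
    // optimisation, although mainstream compilers apply it by default.
    assert(Counted::copies == 0);
    assert(Counted::moves == 0);
}
```

If these assertions fail on an older or embedded toolchain, that is a sign the return-by-value factory really does pay for a copy or move there.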

Conclusion: Making a factory by returning an object is indeed a solution for some cases (such as the 2-D vector previously mentioned), but still not a general replacement for constructors.

3) Two-phase construction

Another thing that someone would probably come up with is separating the issue of object allocation and its initialisation. This usually results in code like this:

class Foo {
public:
    Foo() {
        // empty or almost empty
    }
    // ...
};

class FooFactory {
public:
    void createFooInSomeWay(Foo& foo, some, args);
};

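void clientCode() {
    Foo staticFoo;
    std::unique_ptr<Foo> dynamicFoo(new Foo());
    FooFactory factory;
    // the factory takes Foo&, so pass the objects themselves, not their addresses
    factory.createFooInSomeWay(staticFoo, some, args);
    factory.createFooInSomeWay(*dynamicFoo, some, args);
    // ...
}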

One may think it works like a charm. The only price we pay for in our code...

Since I've written all of this and left this as the last, I must dislike it too. :) Why?

First of all... I sincerely dislike the concept of two-phase construction and I feel guilty when I use it. If I design my objects with the assertion that "if it exists, it is in valid state", I feel that my code is safer and less error-prone. I like it that way.

Having to drop that convention AND changing the design of my object just for the purpose of making factory of it is.. well, unwieldy.

I know that the above won't convince many people, so let me give some more solid arguments. Using two-phase construction, you cannot:

initialise const or reference member variables,

pass arguments to base class constructors and member object constructors.

And probably there could be some more drawbacks which I can't think of right now, and I don't even feel particularly obliged to since the above bullet points convince me already.
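The two limitations listed above can be seen in a small sketch. Logger, Session and sessionExample are hypothetical names chosen for illustration.

```cpp
#include <cassert>

struct Logger { int lines = 0; };  // hypothetical dependency

class Session {
public:
    // const and reference members can only be set in the member
    // initialiser list, i.e. at construction time.
    Session(int id, Logger& log) : id_(id), log_(log) {}

    // A default constructor followed by a later init(int, Logger&) would
    // not compile for this class: a constructor must initialise log_, and
    // id_ cannot be assigned after construction.

    int id() const { return id_; }
    void touch() { ++log_.lines; }

private:
    const int id_;
    Logger& log_;
};

inline void sessionExample() {
    Logger log;
    Session s(5, log);
    s.touch();
    assert(s.id() == 5);
    assert(log.lines == 1);
}
```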

So: not even close to a good general solution for implementing a factory.

Conclusions:

We want to have a way of object instantiation which would:

allow for uniform instantiation regardless of allocation,

give different, meaningful names to construction methods (thus not relying on by-argument overloading),

not introduce a significant performance hit and, preferably, a significant code bloat hit, especially at client side,

be general, as in: possible to be introduced for any class.

I believe I have proven that the ways I have mentioned don't fulfil those requirements.

Any hints? Please provide me with a solution, I don't want to think that this language won't allow me to properly implement such a trivial concept.

@Zac, although the title is very similar, the actual questions are IMHO different.
Good duplicate but the text of this question is valuable in and of itself.
Two years after asking this, I have some points to add: 1) This question is relevant to several design patterns ([abstract] factory, builder, you name it, I don't like delving in their taxonomy). 2) The actual issue being discussed here is "how to cleanly decouple object storage allocation from object construction?".
@Dennis: only if you don't delete it. These kinds of methods are perfectly fine, as long as it is "documented" (source code is documentation ;-) ) that the caller takes ownership of the pointer (read: is responsible for deleting it when appropriate).
@Boris @Dennis you could also make it very explicit by returning a unique_ptr<T> instead of T*.

Nico

First of all, there are cases when object construction is a task complex enough to justify its extraction to another class.

I believe this point is incorrect. The complexity doesn't really matter. The relevance is what does. If an object can be constructed in one step (not like in the builder pattern), the constructor is the right place to do it. If you really need another class to perform the job, then it should be a helper class that is used from the constructor anyway.

Vec2(float x, float y);
Vec2(float angle, float magnitude); // not a valid overload!

There is an easy workaround for this:

struct Vec2 {
  // Parameter types nested in Vec2, matching the Vec2::Cartesian usage
  struct Cartesian {
    Cartesian(float x, float y): x(x), y(y) {}
    float x, y;
  };
  struct Polar {
    Polar(float angle, float magnitude): angle(angle), magnitude(magnitude) {}
    float angle, magnitude;
  };
  Vec2(const Cartesian &cartesian);
  Vec2(const Polar &polar);
  // ...
};

The only disadvantage is that it looks a bit verbose:

Vec2 v2(Vec2::Cartesian(3.0f, 4.0f));

But the good thing is that you can immediately see what coordinate type you're using, and at the same time you don't have to worry about copying. If you want copying, and it's expensive (as proven by profiling, of course), you may wish to use something like Qt's shared classes to avoid copying overhead.

As for the allocation type, the main reason to use the factory pattern is usually polymorphism. Constructors can't be virtual, and even if they could, it wouldn't make much sense. When using static or stack allocation, you can't create objects in a polymorphic way because the compiler needs to know the exact size. So it works only with pointers and references. And returning a reference from a factory doesn't work either, because while an object technically can be deleted by reference, it could be rather confusing and bug-prone; see "Is the practice of returning a C++ reference variable evil?" for example. So pointers are the only thing that's left, and that includes smart pointers too. In other words, factories are most useful when used with dynamic allocation, so you can do things like this:

class Abstract {
  public:
    virtual ~Abstract() {}
    virtual void doSomething() = 0; // "do" is a keyword, so it cannot be a method name
};

class Factory {
  public:
    Abstract *create();
};

Factory f;
Abstract *a = f.create();
a->doSomething();

In other cases, factories just help to solve minor problems like those with overloads you have mentioned. It would be nice if it was possible to use them in a uniform way, but it doesn't hurt much that it is probably impossible.
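For the polymorphic case, a slightly fuller sketch might look like this. It returns std::unique_ptr so ownership is explicit; Shape, Circle, Square, ShapeKind and makeShape are hypothetical names, and the enum-based selection is just one way to pick the concrete type.

```cpp
#include <cassert>
#include <memory>
#include <string>

class Shape {
public:
    virtual ~Shape() = default;
    virtual std::string name() const = 0;
};

class Circle : public Shape {
public:
    std::string name() const override { return "circle"; }
};

class Square : public Shape {
public:
    std::string name() const override { return "square"; }
};

enum class ShapeKind { Circle, Square };

// The caller only ever sees Shape; the concrete type is chosen at run time.
inline std::unique_ptr<Shape> makeShape(ShapeKind kind) {
    switch (kind) {
        case ShapeKind::Circle: return std::unique_ptr<Shape>(new Circle());
        case ShapeKind::Square: return std::unique_ptr<Shape>(new Square());
    }
    return nullptr;  // unreachable for valid enum values
}

inline void shapeExample() {
    std::unique_ptr<Shape> s = makeShape(ShapeKind::Square);
    assert(s->name() == "square");
}  // s is destroyed here, through the virtual destructor
```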


+1 for Cartesian and Polar structs. It is generally best to create classes and structs which directly represent the data they are intended for (as opposed to a general Vec struct). Your Factory is a good example too, but your example does not illustrate who owns the pointer 'a'. If the Factory 'f' owns it, then it probably will be destroyed when 'f' leaves scope, but if 'f' does not own it, it is important for the developer to remember to free that memory or else a memory leak can occur.
Of course an object can be deleted by reference! See stackoverflow.com/a/752699/404734. That of course raises the question of whether it is wise to return dynamic memory by reference, because the caller may accidentally assign the return value by copy. (The caller could of course also write something like int a = *returnsAPointerToInt() and would then face the same problem if dynamically allocated memory is returned. But with a pointer, going wrong takes an explicit dereference, whereas with a reference it only takes forgetting to declare the receiving variable as a reference.)
@Kaiserludi, nice point. I didn't think of that, but it's still an "evil" way to do things. Edited my answer to reflect that.
What about creating different non-polymorphic classes that are immutable? Is a factory pattern then appropriate to use in C++?
@daaxix, why would you need a factory to create instances of a non-polymorphic class? I don't see what immutability has to do with any of this.
Community

Simple Factory Example:

// Factory returns object and ownership
// Caller responsible for deletion.
#include <memory>
class FactoryReleaseOwnership{
  public:
    std::unique_ptr<Foo> createFooInSomeWay(){
      return std::unique_ptr<Foo>(new Foo(some, args));
    }
};

// Factory retains object ownership
// Thus returning a reference.
#include <boost/ptr_container/ptr_vector.hpp>
class FactoryRetainOwnership{
  boost::ptr_vector<Foo>  myFoo;
  public:
    Foo& createFooInSomeWay(){
      // Must take care that the factory lasts longer than all references.
      // Could make myFoo static so it lasts as long as the application.
      myFoo.push_back(new Foo(some, args));
      return myFoo.back();
    }
};

@LokiAstari Because use of smart pointers is the simplest way to lose control over memory, the control for which C/C++ are known to be supreme in comparison to other languages, and from which they gain the biggest advantage. Not to mention the fact that smart pointers produce memory overhead similar to other managed languages. If you want the convenience of automatic memory management, start programming in Java or C#, but don't put that mess into C/C++.
@lukasz1985 the unique_ptr in that example does not have performance overhead. Managing resources, including memory, is one of the supreme advantages of C++ over any other language because you can do it with no performance penalty and deterministically, without losing control, but you say exactly the opposite. Some people dislike things C++ does implicitly, like memory management through smart pointers, but if what you want is for everything to be obligatorily explicit, use C; the tradeoff is orders of magnitude fewer problems. I think it is unfair that you voted down a good recommendation.
@EdMaster: I did not respond previously because he was obviously trolling. Please don't feed the troll.
@LokiAstari he might be a troll, but what he says might confuse people
@yau: Yes. But: boost::ptr_vector<> is a tiny bit more efficient as it understands it owns the pointer rather than delegating the work to a subclass. BUT the main advantage of boost::ptr_vector<> is that it exposes its members by reference (not pointer) thus it is really easy to use with algorithms in the standard library.
Evan Teran

Have you thought about not using a factory at all, and instead making nice use of the type system? I can think of two different approaches which do this sort of thing:

Option 1:

struct linear {
    linear(float x, float y) : x_(x), y_(y){}
    float x_;
    float y_;
};

struct polar {
    polar(float angle, float magnitude) : angle_(angle),  magnitude_(magnitude) {}
    float angle_;
    float magnitude_;
};


struct Vec2 {
    explicit Vec2(const linear &l) { /* ... */ }
    explicit Vec2(const polar &p) { /* ... */ }
};

Which lets you write things like:

Vec2 v(linear(1.0, 2.0));

Option 2:

You can use "tags" like the STL does with iterators and such. For example:

struct linear_coord_tag {} linear_coord; // declare type and a global
struct polar_coord_tag {} polar_coord;

struct Vec2 {
    Vec2(float x, float y, const linear_coord_tag &) { /* ... */ }
    Vec2(float angle, float magnitude, const polar_coord_tag &) { /* ... */ }
};

This second approach lets you write code which looks like this:

Vec2 v(1.0, 2.0, linear_coord);

which is also nice and expressive while allowing you to have unique prototypes for each constructor.


Al.G.

You can read a very good solution in: http://www.codeproject.com/Articles/363338/Factory-Pattern-in-Cplusplus

The best solution is in the "comments and discussions" section; see "No need for static Create methods".

From this idea, I've done a factory. Note that I'm using Qt, but you can change QMap and QString for std equivalents.

#ifndef FACTORY_H
#define FACTORY_H

#include <QMap>
#include <QString>
#include <type_traits> // std::is_base_of

template <typename T>
class Factory
{
public:
    template <typename TDerived>
    void registerType(QString name)
    {
        static_assert(std::is_base_of<T, TDerived>::value, "Factory::registerType doesn't accept this type because it doesn't derive from the base class");
        _createFuncs[name] = &createFunc<TDerived>;
    }

    T* create(QString name) {
        typename QMap<QString,PCreateFunc>::const_iterator it = _createFuncs.find(name);
        if (it != _createFuncs.end()) {
            return it.value()();
        }
        return nullptr;
    }

private:
    template <typename TDerived>
    static T* createFunc()
    {
        return new TDerived();
    }

    typedef T* (*PCreateFunc)();
    QMap<QString,PCreateFunc> _createFuncs;
};

#endif // FACTORY_H

Sample usage:

Factory<BaseClass> f;
f.registerType<Descendant1>("Descendant1");
f.registerType<Descendant2>("Descendant2");
Descendant1* d1 = static_cast<Descendant1*>(f.create("Descendant1"));
Descendant2* d2 = static_cast<Descendant2*>(f.create("Descendant2"));
BaseClass *b1 = f.create("Descendant1");
BaseClass *b2 = f.create("Descendant2");
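For readers without Qt, here is a rough sketch of the same idea with standard library containers. StdFactory, Animal and Dog are hypothetical names, and returning std::unique_ptr instead of a raw pointer is my own change, not part of the answer above.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <type_traits>

template <typename T>
class StdFactory {
public:
    template <typename TDerived>
    void registerType(const std::string& name) {
        static_assert(std::is_base_of<T, TDerived>::value,
                      "TDerived must derive from T");
        creators_[name] = &createFunc<TDerived>;
    }

    // Returns nullptr when no type was registered under that name.
    std::unique_ptr<T> create(const std::string& name) const {
        auto it = creators_.find(name);
        if (it == creators_.end()) {
            return nullptr;
        }
        return it->second();
    }

private:
    template <typename TDerived>
    static std::unique_ptr<T> createFunc() {
        return std::unique_ptr<T>(new TDerived());
    }

    using CreateFunc = std::unique_ptr<T> (*)();
    std::map<std::string, CreateFunc> creators_;
};

// Hypothetical hierarchy used only to exercise the factory.
struct Animal {
    virtual ~Animal() = default;
    virtual std::string sound() const = 0;
};
struct Dog : Animal {
    std::string sound() const override { return "woof"; }
};

inline void stdFactoryExample() {
    StdFactory<Animal> f;
    f.registerType<Dog>("dog");
    assert(f.create("dog")->sound() == "woof");
    assert(f.create("cat") == nullptr);
}
```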

In my opinion, this concept defeats the purpose of the Factory Method pattern, which is to keep the implementation hidden from the user of the resulting object. This requires the caller to have knowledge of both derived and base classes, which eliminates the utility. If I wanted to make an object to implement an interface and I also have to know exactly the details of what to build, then I don't need a template to do it for me. The Factory Method pattern describes concrete and abstract factory interfaces where each concrete factory can build a particular type of object.
mbrcknl

I mostly agree with the accepted answer, but there is a C++11 option that has not been covered in existing answers:

Return factory method results by value, and

Provide a cheap move constructor.

Example:

struct sandwich {
  // Factory methods.
  static sandwich ham();
  static sandwich spam();
  // Move constructor.
  sandwich(sandwich &&);
  // etc.
};

Then you can construct objects on the stack:

sandwich mine{sandwich::ham()};

As subobjects of other things:

auto lunch = std::make_pair(sandwich::spam(), apple{});

Or dynamically allocated:

auto ptr = std::make_shared<sandwich>(sandwich::ham());

When might I use this?

If, on a public constructor, it is not possible to give meaningful initialisers for all class members without some preliminary calculation, then I might convert that constructor to a static method. The static method performs the preliminary calculations, then returns a value result via a private constructor which just does a member-wise initialisation.

I say 'might' because it depends on which approach gives the clearest code without being unnecessarily inefficient.
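As a sketch of that last point, consider a static method that does the preliminary work and then hands plain values to a private member-wise constructor. Endpoint, the "host:port" format and the default port of 80 are assumptions made up for this example.

```cpp
#include <cassert>
#include <string>
#include <utility>

class Endpoint {
public:
    // Preliminary calculation: split "host:port" before any member exists.
    // Assumes a well-formed port; std::stoi throws on malformed input.
    static Endpoint parse(const std::string& text) {
        const std::string::size_type colon = text.rfind(':');
        std::string host = text.substr(0, colon);  // whole string if no colon
        const int port = (colon == std::string::npos)
                             ? 80
                             : std::stoi(text.substr(colon + 1));
        return Endpoint(std::move(host), port);
    }

    const std::string& host() const { return host_; }
    int port() const { return port_; }

private:
    // Member-wise initialisation only; all the logic lives in parse().
    Endpoint(std::string host, int port)
        : host_(std::move(host)), port_(port) {}

    std::string host_;
    int port_;
};

inline void endpointExample() {
    Endpoint e = Endpoint::parse("example.org:8080");
    assert(e.host() == "example.org");
    assert(e.port() == 8080);
}
```

The implicitly generated move constructor is cheap here, so the result can be placed on the stack, in a member, or passed to std::make_shared just like the sandwich example.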


I used this extensively when wrapping OpenGL resources. Deleted copy constructors and copy assignment forcing the use of move semantics. I then created a bunch of static factory methods for creating each type of resource. This was a lot more readable than OpenGL's enum based runtime dispatch which often has a bunch of redundant function parameters depending on the enum passed. It's a very useful pattern, surprised this answer isn't higher up.
Jerry Coffin

Loki has both a Factory Method and an Abstract Factory. Both are documented (extensively) in Modern C++ Design, by Andrei Alexandrescu. The factory method is probably closer to what you seem to be after, though it's still a bit different (at least if memory serves, it requires you to register a type before the factory can create objects of that type).


Even if it is out-dated (which I dispute), it is still perfectly serviceable. I still use a Factory based on MC++D's in a new C++14 project to great effect! Moreover, the Factory and Singleton patterns are probably the least out-dated parts. While pieces of Loki like Function and the type manipulations can be replaced with std::function and <type_traits> and while lambdas, threading, rvalue refs have implications that may require some minor tweaking, there is no standard replacement for singletons of factories as he describes them.
Péter Török

I won't try to answer all of your questions, as I believe that would be too broad. Just a couple of notes:

there are cases when object construction is a task complex enough to justify its extraction to another class.

That class is in fact a Builder, rather than a Factory.

In the general case, I don't want to force the users of the factory to be restrained to dynamic allocation.

Then you could have your factory encapsulate it in a smart pointer. I believe this way you can have your cake and eat it too.

This also eliminates the issues related to return-by-value.

Conclusion: Making a factory by returning an object is indeed a solution for some cases (such as the 2-D vector previously mentioned), but still not a general replacement for constructors.

Indeed. All design patterns have their (language specific) constraints and drawbacks. It is recommended to use them only when they help you solve your problem, not for their own sake.

If you are after the "perfect" factory implementation, well, good luck.


Thanks for the answer! But could you explain how using a smart pointer would release the restriction of dynamic allocation? I didn't quite get this part.
@Kos, with smart pointers you can hide the allocation/deallocation of the actual object from your users. They only see the encapsulating smart pointer, which to the outside world behaves like a statically allocated object.
@Kos, not in the strict sense, AFAIR. You pass in the object to be wrapped, which you have probably allocated dynamically at some point. Then the smart pointer takes ownership of it and ensures that it is properly destroyed when it is no more needed (the time of which is decided differently for different kinds of smart pointers).
DAG

This is my C++11-style solution. The parameter 'base' is the base class of all sub-classes. The creators are std::function objects that create sub-class instances; each might be a binding to your sub-class's static member function 'create(some args)'. This may not be perfect, but it works for me, and it is kind of a 'general' solution.

#include <functional>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>

// key_t and your_hash_func are placeholders: supply your own key type and hash.
template <class base, class... params> class factory {
public:
  factory() {}
  factory(const factory &) = delete;
  factory &operator=(const factory &) = delete;

  std::unique_ptr<base> create(const std::string &name, params... args) {
    auto key = your_hash_func(name.c_str(), name.size());
    return create(key, args...);
  }

  // Throws std::bad_function_call if no creator is registered under key.
  std::unique_ptr<base> create(key_t key, params... args) {
    std::unique_ptr<base> obj{creators_[key](args...)};
    return obj;
  }

  void register_creator(const std::string &name,
                        std::function<base *(params...)> &&creator) {
    auto key = your_hash_func(name.c_str(), name.size());
    creators_[key] = std::move(creator);
  }

protected:
  std::unordered_map<key_t, std::function<base *(params...)>> creators_;
};

A usage example:

class base {
public:
  base(int val) : val_(val) {}

  virtual ~base() { std::cout << "base destroyed\n"; }

protected:
  int val_ = 0;
};

class foo : public base {
public:
  foo(int val) : base(val) { std::cout << "foo " << val << " \n"; }

  static foo *create(int val) { return new foo(val); }

  virtual ~foo() { std::cout << "foo destroyed\n"; }
};

class bar : public base {
public:
  bar(int val) : base(val) { std::cout << "bar " << val << "\n"; }

  static bar *create(int val) { return new bar(val); }

  virtual ~bar() { std::cout << "bar destroyed\n"; }
};

int main() {
  // Assumes the factory template above lives in a namespace called common.
  common::factory<base, int> factory;

  auto foo_creator = std::bind(&foo::create, std::placeholders::_1);
  auto bar_creator = std::bind(&bar::create, std::placeholders::_1);

  factory.register_creator("foo", foo_creator);
  factory.register_creator("bar", bar_creator);

  {
    // create() already returns a temporary, so no std::move is needed
    auto foo_obj = factory.create("foo", 80);
    foo_obj.reset();
  }

  {
    auto bar_obj = factory.create("bar", 90);
    bar_obj.reset();
  }
}

Looks nice to me. How would you implement (maybe some macro magic) static registration? Just imagine the base class being some servicing class for objects. The derived classes provide a special kind of servicing to those objects. And you want to progressively add different kinds of services by adding a class derived from base for each of those kinds of services.
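One common approach to static registration is a small registrar object per class whose constructor adds a creator to a registry. The sketch below is independent of the factory template above; Service, serviceRegistry, ServiceRegistrar, REGISTER_SERVICE, Printing and createService are all hypothetical names.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>

struct Service {
    virtual ~Service() = default;
    virtual std::string name() const = 0;
};

using ServiceCreator = std::function<std::unique_ptr<Service>()>;

// Function-local static: constructed on first use, which sidesteps
// initialisation-order problems between translation units.
inline std::map<std::string, ServiceCreator>& serviceRegistry() {
    static std::map<std::string, ServiceCreator> registry;
    return registry;
}

// Constructing a registrar adds one creator to the registry.
struct ServiceRegistrar {
    ServiceRegistrar(const std::string& key, ServiceCreator creator) {
        serviceRegistry()[key] = std::move(creator);
    }
};

// The "macro magic": one static registrar per registered class.
#define REGISTER_SERVICE(Type, key)                \
    static ServiceRegistrar registrar_##Type(      \
        key, [] { return std::unique_ptr<Service>(new Type()); })

struct Printing : Service {
    std::string name() const override { return "printing"; }
};
REGISTER_SERVICE(Printing, "printing");

// Returns nullptr when nothing is registered under key.
inline std::unique_ptr<Service> createService(const std::string& key) {
    auto it = serviceRegistry().find(key);
    if (it == serviceRegistry().end()) {
        return nullptr;
    }
    return it->second();
}
```

Adding a new service then only needs a new class plus one REGISTER_SERVICE line. One caveat: if the class lives in a static library and nothing else references its object file, the linker may drop it and the registration never runs.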
Matthieu M.

Factory Pattern

class Point
{
public:
  static Point Cartesian(double x, double y);
private:
};

And if your compiler does not support Return Value Optimization, ditch it; it probably does not contain much optimization at all...


Can this really be considered an implementation of the factory pattern?
@Dennis: As a degenerate case, I would think so. The problem with Factory is that it is quite generic and covers a lot of ground; a factory can add arguments (depending on the environment/setup) or provide some caching (related to Flyweight/Pools) for example, but these cases only make sense in some situations.
If only changing the compiler would be as easy as you make it sound :)
@rozina: :) It works well in Linux (gcc/clang are remarkably compatible); I admit Windows is still relatively closed, though it should get better on 64-bits platform (less patents in the way, if I remember correctly).
And then you have the whole embedded world with some subpar compilers.. :) I am working with one like that which does not have return value optimization. I wish it had though. Unfortunately switching it is not an option at this moment. Hopefully in the future it will be updated or we will make a switch for sth else :)
user1095108
extern std::pair<std::string_view, Base*(*)()> const factories[2];

decltype(factories) factories{
  {"blah", []() -> Base*{return new Blah;}},
  {"foo", []() -> Base*{return new Foo;}}
};

Florian Richoux

I know this question was answered 3 years ago, but this may be what you were looking for.

A couple of weeks ago, Google released a library allowing easy and flexible dynamic object allocation. Here it is: http://google-opensource.blogspot.fr/2014/01/introducing-infact-library.html