What is it?

This page describes some C++ improvements that could be added to the language, and I think they should be added. You can send me your opinion about each proposal on the page or discuss them in the newsgroup.

The layout of the page may change in the future. If you want to save a link to the page, use the URL: grizlyk1.narod.ru/cpp_new .


Updated: 07.04.07. List of modifications.

What kind of improvements are considered here?

All possible C++ improvements on the page are designed under one or more of the following conditions:

A) the goal of each improvement:
  • to improve support for old code or to ease upgrading old code to a new standard
  • to eliminate unnecessary expressions in new code and make the coding process easier
  • to add new useful behaviour to classes and other parts of the language

B) each improvement must be backward compatible:
  • old-style code can be compiled without any changes by a C++ compiler that supports the improvement

List of improvements.


Explicit memory specifications for POD pointers.

Abstract.

POD pointers to all types of memory (static, auto, heap) share the same syntax, which leads to runtime errors and prevents compile-time error detection.

Any kind of memory pointer can be implemented as a user C++ class, but low-level memory wrappers cannot ignore POD pointers, because they work with "operator new", and "new" returns POD pointers. If a user wants to implement their own memory pointer that is safe and free from POD pointers, the user must invent their own safe "operator new".

The simplest solution to the problem is an explicit memory declaration for POD pointers.

Semantics #1.

A C-style POD pointer does not carry enough information about how to work with the pointer, which is why the programmer must support POD pointer usage with their own conventions, implemented, for example, as comments. Look at the following example:

    //return "new[]", must be "delete[]" by external
    int*    get_data1();

    //return "new", must be "delete" by external
    int*    get_data2();

    //return static, do not delete by external
    int*    get_data3();

For a language without garbage collection (such as C++) this behaviour is too raw. It is dangerous because of possible memory leaks or double deletes, since the compiler does not control the pointer usage.
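Even without new keywords, part of this information can be moved from comments into types. The following is only a sketch under stated assumptions: it uses std::unique_ptr from the standard library of C++11 and later (not the explicit memory types proposed on this page), and the function bodies, the parameter "n" and the helper "sum_of_samples" are invented for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Owns an array created by "new[]"; it is released with "delete[]" automatically.
std::unique_ptr<int[]> get_data1(std::size_t n)   // "n" is a hypothetical parameter
{
    return std::unique_ptr<int[]>(new int[n]());  // elements are value-initialized to 0
}

// Owns a single object created by "new"; it is released with "delete" automatically.
std::unique_ptr<int> get_data2()
{
    return std::unique_ptr<int>(new int(7));
}

// Points to static memory; here a plain pointer still means "do not delete".
int* get_data3()
{
    static int value = 42;
    return &value;
}

// Hypothetical caller: the matching form of delete is chosen by the types.
int sum_of_samples()
{
    std::unique_ptr<int[]> arr = get_data1(3);
    std::unique_ptr<int> obj = get_data2();
    assert(arr && obj);
    return arr[0] + *obj + *get_data3();
}
```

The static case still rests on a convention only, which is the gap that explicit memory types for POD pointers are meant to close.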

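C++ has memory of the following types:
  • static
  • auto
  • heap (dynamic object, allocated by "new")
  • heap[] (dynamic array, allocated by "new[]")

We can use the correct memory type in the pointer declaration.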

Syntax.

Fortunately, C-style pointers can be extended. The general declaration of a pointer looks like this:
[extra_type] type * [extra_pointer] pointer = expression;

Note:
There are some exceptions to the regular rule, where [extra_pointer] declarations must be placed before [extra_type], for example:

mutable const int *pointer; //which is mutable: "const int" or "pointer"?
register const int *pointer; //which is register: "const int" or "pointer"?

I agree that a pointer cannot point to a register and that a const int outside of a class cannot be mutable, but a single regular place for the declaration is always better, for example

//not C++
const int *mutable pointer; //"mutable pointer to const int"
const int *register pointer; //"register pointer to const int"

An explicit memory type declaration can be added to the [extra_type] section. So for "const pointer to const std::list allocated by new" we get
const heap std::list * const pointer = expression;

In order to declare a function parameter of a concrete memory type we can write:
void foo( const char heap[] *const str);

Semantics #2.

In order to perform compile-time memory usage tests, we must declare rules for casting POD pointers with different memory types.

  1. The memory type declaration belongs to the [extra_type] section, not to the [extra_pointer] section, so we could declare a memory type for a non-pointer, like this
    const int heap not_pointer = expression;
    Do we need the explicit memory type declaration for non-pointers? I do not know. I think most people will want to declare non-pointers with the default memory type (without any keywords), so that the address of the object can be assigned to a pointer of any memory type without an explicit cast.

  2. Implicit casts between different memory types are allowed only for pairs including a traditional pointer with the default memory type

    const int heap * heap_pointer = new int(0);
    const int * pointer = heap_pointer;
    heap_pointer=pointer;

    // error heap -> heap[] implicit cast is not allowed
    const int heap[] * heap[]_pointer = heap_pointer ;
    // error heap[] -> heap implicit cast is not allowed
    heap_pointer=heap[]_pointer;

  3. An explicit cast between pointers with different memory types is allowed with the help of "static_cast<>".

    const int heap * heap_pointer = new int(0);
    const int * pointer = heap_pointer;
    heap_pointer=pointer;

    // ok heap -> heap[] explicit cast is allowed
    const int heap[] * heap[]_pointer = static_cast<const int heap[]*>(heap_pointer);
    // ok heap[] -> heap explicit cast is allowed
    heap_pointer=static_cast<const int heap*>(heap[]_pointer);

    For a cast of memory type we can allow an incomplete cast target expression

    const int heap * heap_pointer = new int(0);
    const int heap[] * heap[]_pointer = static_cast<heap[]*>(heap_pointer);
    heap_pointer=static_cast<heap*>(heap[]_pointer);

  4. Consider examples of explicit memory type usage

    • Explicit cast to "Tobj auto*" in memory wrapper.
      template<class Tobj>
      class Tauto_ptr
      {
          mutable Tobj heap *ptr;
      public:
          //return address to use
          Tobj auto*     get_ptr()const 
           { return static_cast<auto*>(ptr); }
      
          //return ownership
          Tobj heap*     move()const //it is correct "move()const"
           { Tobj heap *const tmp=ptr; ptr=0; return tmp; }
           
          Tauto_ptr(Tobj heap* p=0):ptr(p){}
          ~Tauto_ptr(){delete ptr;}
      };
      

    • Possible compile time error detection due to explicit memory type.
      extern int test;
      
      void foo()
      {
      int 	 tmp=0;
      
      int heap *h_ptr1=new int(2);  //ok
      int heap *h_ptr2=new int[10]; //error
      int heap *h_ptr3=&tmp;        //error
      int heap *h_ptr4=&test;       //error
      
      int auto *a_ptr1=&tmp;        //ok
      int auto *a_ptr2=&test;       //error
      int auto *a_ptr3=new int(2);  //error
      int auto *a_ptr4=new int[10]; //error
      
      int *ptr1=&test;              //no tests, ok
      int *ptr2=&tmp;               //no tests, ok
      int *ptr3=new int(2);         //no tests, ok
      int *ptr4=new int[10];        //no tests, ok
      
          delete[] h_ptr1;          //error
          delete   h_ptr1;          //ok
      	
          delete   ptr1;		 	  //no tests, ok
      }
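The cast rules 2 and 3 above can be approximated in ordinary C++ with a small class template. This is a minimal sketch, not the proposed language feature: the names "tagged_ptr", "heap_tag" and "heap_array_tag" are hypothetical, and the compiler checks only the tags, not where the memory really comes from.

```cpp
#include <cassert>
#include <type_traits>

struct heap_tag {};        // stands for the proposed "heap"
struct heap_array_tag {};  // stands for the proposed "heap[]"

template <class T, class Tag>
class tagged_ptr {
    T* p_;
public:
    tagged_ptr(T* p = nullptr) : p_(p) {}  // default memory type -> tagged: implicit
    operator T*() const { return p_; }     // tagged -> default memory type: implicit
};

typedef tagged_ptr<const int, heap_tag>       heap_pointer_t;
typedef tagged_ptr<const int, heap_array_tag> heap_array_pointer_t;

// Rule 2: implicit casts only to and from the default memory type.
static_assert(std::is_convertible<heap_pointer_t, const int*>::value,
              "heap -> default is implicit");
static_assert(std::is_convertible<const int*, heap_pointer_t>::value,
              "default -> heap is implicit");
static_assert(!std::is_convertible<heap_pointer_t, heap_array_pointer_t>::value,
              "heap -> heap[] is not implicit");

// Rule 3: the explicit cast is still available through static_cast<>.
bool explicit_cast_keeps_address()
{
    const int* raw = new int(0);
    heap_pointer_t heap_pointer = raw;
    heap_array_pointer_t heap_array_pointer =
        static_cast<heap_array_pointer_t>(heap_pointer);
    const int* back = heap_array_pointer;
    const bool same = (back == raw);
    delete raw;
    return same;
}
```

The implicit heap -> heap[] conversion is rejected because it would need two user-defined conversions in a row, while static_cast<> performs direct initialization and may use one of them.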
      

User defined aliases (typedef) of new memory types.

Because C++ forbids "implicit int", we gain the ability to declare user-defined aliases for modifiers like "const" and "volatile" and for the memory types "heap" and "auto":

    typedef heap    memtype;

This can be a useful refinement, because we can make traits with different modifiers and use the modifiers in other classes under one user-defined name:

struct Obj
{
    typedef heap    memtype;
};

struct Arr
{
    typedef heap[]  memtype;
};

template<class T>
struct Z
{
    typedef typename T::memtype memtype;
    Z memtype* get();
};

Z<Obj>	z_obj;
Z<Arr>	z_arr;
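Until such aliases exist, a similar trait can be sketched with empty tag types and overload selection. This is only an illustration in ordinary C++: "heap_tag", "heap_array_tag", "release", "release_as" and the counting type "Probe" are hypothetical names, and the tags are a convention rather than something the compiler verifies against the allocation.

```cpp
#include <cassert>

struct heap_tag {};        // stands for the proposed "heap"
struct heap_array_tag {};  // stands for the proposed "heap[]"

struct Obj { typedef heap_tag       memtype; };
struct Arr { typedef heap_array_tag memtype; };

// The tag selects the matching form of delete at compile time.
template <class T> void release(T* p, heap_tag)       { delete p; }
template <class T> void release(T* p, heap_array_tag) { delete[] p; }

// One user-defined name ("memtype") carries the memory kind into other code.
template <class Traits, class T>
void release_as(T* p)
{
    release(p, typename Traits::memtype());
}

// Hypothetical helper type that counts destructor calls.
struct Probe {
    static int destroyed;
    ~Probe() { ++destroyed; }
};
int Probe::destroyed = 0;

int destroyed_after_demo()
{
    Probe::destroyed = 0;
    release_as<Obj>(new Probe);     // uses "delete"
    release_as<Arr>(new Probe[3]);  // uses "delete[]"
    assert(Probe::destroyed == 4);
    return Probe::destroyed;
}
```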

More extensions of memory types.

The C++ extensions described in the previous sections can be added practically immediately, but some other extensions cannot, because they introduce new memory types.

  1. If C++ supports garbage-collected memory management, the corresponding memory types can be added as well:
    • garbage
              dynamic object under compiler control, allocated by "new garbage"
    • garbage[] or garbage_array
              dynamic array under compiler control, allocated by "new garbage[]"

  2. We have a small problem with the keyword "static" used as a memory type descriptor.

    • In global scope "static" is used to control the "public" attribute of a declaration, because all memory in global scope is "global memory under compiler control" by default, so we can get confusing and ambiguous declarations with "static".

    • The other problem is that most modern systems divide static memory into the following subtypes:
      • readonly code segment
      • readonly data segment
      • writeable initialized data segment
      • writeable non-initialized data segment

    Manual control over which kind of static data segment is used to place POD data or an object can be useful and allows us to resolve the "static" ambiguity, so the "static" memory type can be replaced by a concrete global segment and a combination of keywords like "code, data, non_init, readonly, writeonly, readable, writeable".

Problems.

Nothing.


Operator "new" returning RAII wrappers (non-pointers) for dynamic objects.

Abstract.

Because the expression "new T" returns "T*", and after that the "T*" often must be placed into an appropriate memory wrapper, we get a "two-stage" creation process here. If the memory wrapper is a parameter of a function, we can get a memory leak if the process of creating the object fails, because "new T" and the constructor of the wrapper together are not an "atomic" operation.

struct T{ T(const int); };
struct Wrapper{ Wrapper(T*); };
void foo(Wrapper,Wrapper,Wrapper);

void	boo()
{
	foo( new T(0), new T(1), new T(2) );
	/*
	can be unrolled into following sequence:
	
	-create memory
	a=new T(0), 
	//if T(1) throw, 
	//then T(0) will remain to be allocated and will be lost
	b=new T(1), c=new T(2);
	
	-create wrappers with the memory
	Wrapper(a),Wrapper(b),Wrapper(c);
	
	-call foo()
	foo(Wrapper(a),Wrapper(b),Wrapper(c));
	
	*/
}

The simplest way to solve the problem is to make a special form of the "new" expression that returns a user-defined memory wrapper instead of a POD pointer. This way dynamic memory can be used as safely and easily as auto or static memory.

Semantics.

The RAII idiom requires that most dynamic objects (objects created by "new" or "new[]") be placed into some kind of memory wrapper (auto_ptr<> for example) immediately after creation has completed.

The memory wrappers take ownership of the obtained dynamic memory area and ensure that the memory is correctly returned to the free state.

It is better if dynamic objects are declared and created at the point of their usage, like ordinary auto temporary objects, without any explicit temporary storage.

//created at the point of its usage - better
foo(new int(0));
boo(int(0));

//created with explicit temporary storage - worse
const int *const ptr=new int(0);
foo(ptr);

const int var=0;
boo(var);

But the expression "new T" returns "T*", and the "T*" must be placed into some memory wrapper.

Syntax.

The best way would be "operator new" overloaded in a concrete class plus a globally overloaded "operator new", but C++ does not allow overloading by return value. Even if C++ allowed overloading by return value, in the concrete case of "new" an overloaded return type could be a source of confusion.

So we must find an explicit sign to tell the compiler what kind of "new" we are using and declaring. I think the name "new&" can be used as "new returning a user-defined memory wrapper".

This is an example of a "new&" declaration:

//#define while C++ does not support "heap"
#define heap
#define heap[]

template<class T>
struct User_wrapper
{ 
	User_wrapper(T heap *const); 
};

struct T
{
	static User_wrapper<T> 
		operator new& (T heap* ptr)
 		 { return User_wrapper<T>(ptr); }
 		 
	T(const int);
};

typedef User_wrapper<T>	ptr;
void	foo(ptr,ptr,ptr);

void	boo()
{
	//safe now
	foo(new& T(0), new& T(1), new& T(2));
}

If C++ supports some extensions for template declarations, at least for "operator new&" only, then the user can quickly redefine the memory wrapper as needed. Consider the following example:

//global "::operator new&"
template<class T>
//C++ must support reference to "::operator new&", 
//instead of ::operator new& <T>
User_wrapper<T> ::operator new& (T heap *const ptr)
 {return User_wrapper<T>(ptr);}

//class T "T::operator new&"
template<class T>
//C++ must support reference to "T::operator new&", 
//instead of T::operator new& <T>
User_wrapper<T> T::operator new& (T heap* ptr)
 {return User_wrapper<T>(ptr);}

It is easy to see that the usage of "new&" is similar to "new". Just as the expression "new T(params);" is unrolled by the compiler into the pseudocode sequence "tmp=operator new(sizeof(T)), tmp->T::T(params), return tmp;", the expression "new& T(params);" can be unrolled into the pseudocode sequence "tmp=operator new(sizeof(T)), tmp->T::T(params), return operator new&(tmp);".
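The same unrolled sequence can be imitated with an ordinary function template, because a function call is never interleaved with the evaluation of the other arguments. The sketch below is an assumption-laden illustration, not the proposed "new&": it relies on variadic templates and rvalue references from C++11, and "new_ref", "Counted" and "live_after_failed_call" are hypothetical names.

```cpp
#include <cassert>
#include <utility>

// Wrapper that owns one object created by "new".
template <class T>
class User_wrapper {
    T* ptr_;
public:
    explicit User_wrapper(T* p) : ptr_(p) {}
    User_wrapper(User_wrapper&& o) noexcept : ptr_(o.ptr_) { o.ptr_ = nullptr; }
    User_wrapper(const User_wrapper&) = delete;
    User_wrapper& operator=(const User_wrapper&) = delete;
    ~User_wrapper() { delete ptr_; }
    T* get() const { return ptr_; }
};

// Plays the role of "new& T(params)": allocate, construct, wrap, all in one call.
template <class T, class... Args>
User_wrapper<T> new_ref(Args&&... args)
{
    return User_wrapper<T>(new T(std::forward<Args>(args)...));
}

// Hypothetical test type: counts live objects, the constructor throws for 1.
struct Counted {
    static int live;
    explicit Counted(int v) { if (v == 1) throw v; ++live; }
    ~Counted() { --live; }
};
int Counted::live = 0;

void foo(User_wrapper<Counted>, User_wrapper<Counted>, User_wrapper<Counted>) {}

// Returns the number of Counted objects still alive after the failed call.
int live_after_failed_call()
{
    assert(Counted::live == 0);
    try {
        foo(new_ref<Counted>(0), new_ref<Counted>(1), new_ref<Counted>(2));
    } catch (int) {
        // Wrappers that were already built have released their objects.
    }
    return Counted::live;
}
```

Whatever order the arguments are evaluated in, every object that was successfully created is already owned by a wrapper when the exception propagates, so nothing is lost.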

Problems.

Nothing.


How to send list of parameters into template.

Abstract.

This is complementary to the "new&" property of the language and does nearly the same. It does the same (makes an "atomic" RAII wrapper for a dynamic memory area) in the case where a concrete memory wrapper type (instead of its object) is passed as parameter.

The syntax of an ordinary C++ class declaration does not allow us to declare a class (class "A") with template<class T> so that it is able to obtain the list of parameters of "class T", which is unknown at the point of the class "A" declaration and known only at the point of the class "A" instantiation.

Consider the following example:

//Memory wrapper
template<class T>class Tptr;
//User class
struct T{ T(params_list); };

//We can do like this
Tptr<T>	tmp(new T(params_list));

/*
But we can not pass any "T::params_list" to constructor of 
class Tptr<T>, in other words, we can not do, for 
example, like this in order to hide memory allocation

Tptr<T>	tmp(T::params_list);
*/

Due to this limitation, C++ cannot automatically ("on the fly") generate the correct constructor for the memory wrapper "Tptr<T>", which is an obstacle to easy usage of the memory wrapper.

The simplest way to solve the problem is to introduce an extension to the templated class declaration. We can extend the syntax of template<> to allow programmers to create and use their own memory management in their programs and to declare other classes for similar jobs.

Semantics.

C++ has no built-in garbage-collecting memory manager that would allow allocating dynamic memory without explicit deallocation. In many cases the RAII idiom allows users to implement a concrete kind of memory management with the help of user-defined classes.

But, unfortunately, C++ syntax does not allow programmers to use the user-defined classes in practical cases. We get the same trouble here as in the example from the previous proposal:

void	boo()
{
	foo( new T(0), new T(1), new T(2) );
	/*
	can be unrolled into following sequence:
	
	-create memory
	a=new T(0), 
	//if T(1) throw, 
	//then T(0) will remain to be allocated and will be lost
	b=new T(1), c=new T(2);
	*/
}

I think people do not want a garbage collector for its own sake; what they really want is a usable RAII wrapper for easy dynamic memory allocation. In the first example (in the "Abstract" section) the class Tptr<T> can be used as a substitute for a garbage collector if C++ supports an extended template<> syntax.

Syntax.

In order to improve the behaviour of the expression Tptr<T> tmp(new T(params_list));, we can improve the template<> syntax. Consider:

  1. We can introduce a "define" template keyword. The template "define" must act nearly like the preprocessor "#define" directive. Consider the following points:
    • The expression (_empty_,_empty_) just shows the concrete format of the default declaration of "params_list" and expands into nothing, together with its parentheses.
    • If params_list is undefined, the compiler must ignore any incomplete declarations, such as the one at point "1", as if it were written:
      ifdef params_list
      Tptr params_list { new T params_list ; }
      endif
    • Unlike template parameters declared with "class" or "typename", if a template parameter declared with "define" is not given when the template is used, it is left undefined; in other words, all template parameters declared with "define" are undefined by default:
      //the declaration
      Tptr<int> tmp;
      //is the same as
      Tptr<int,(_empty_)> tmp;

    Example:
    template< class T, define params_list=(_empty_,_empty_) >
    class Tptr
    {
        T heap* ptr;
    public:
        Tptr():ptr( new T params_list ){}
        
        //If "params_list" is not defined 
        //compiler will ignore the following declaration
        //1
        Tptr params_list :ptr( new T params_list ){}
    };
    

  2. We can introduce a "list" template keyword. Unlike the "define" template keyword, "list" expects an expression like (expr,expr,expr) and then removes its parentheses.

    template<class T, list params_list=(_empty_,_empty_) >
    class Tptr
    {
        T heap* ptr;
    public:
        Tptr():ptr( new T(params_list) ){}
        //1
        Tptr(params_list):ptr( new T(params_list) ){}
    };
    

  3. In the case of multiple "list" and "define" template keywords, we can help the compiler assign the list of parameters to the correct type during declaration and instantiation with the help of a prefix like class_name::[member_name::]. Consider the following example:

    template< class T, define params_list=(T::(_empty_,_empty_)) >
    class Tptr
    {
        T heap* ptr;
    public:
        Tptr():ptr( new T params_list ){}
        Tptr params_list :ptr( new T params_list ){}
    };
    
    template< class T, list T::params_list=T::(_empty_,_empty_) >
    class Tptr
    {
        T heap* ptr;
    public:
        Tptr():ptr( new T(T::params_list) ){}
        Tptr(T::params_list):ptr( new T(T::params_list) ){}
    };
    
    Tptr< my_class, my_class::(1,45,6,sin(2.45),"test") >   tmp;
    

  4. Both the "list" and "define" template keywords are not macros and cannot be defined by "#define". In the case of point "1" the compiler must do the following:
    • Find, for the expression new T(params_list), the concrete declaration of the constructor of class "T" (in the general case, of a member of class T) by T::params_list
    • Automatically generate the correct constructor declaration for the expression Tptr(params_list) from the parameters of the concrete declaration of the constructor of class "T" found at the previous step
    • continue ordinary compiling

  5. In fact a template parameter declared with the "list" or "define" template keyword is not the same as one declared with the "class" or "typename" template keyword, and can be placed in a separate section of template parameters, like this: template<class T><define params_list=(T::(_empty_,_empty_))> class Tptr. For a memory wrapper we do not need a lot of templated classes with the same name and an equal first section of parameters that differ in the second section only.

With the new syntax we can do this:

//T::params_list? == T::(1,45,6,sin(2.45),"test")

//A
//not pretty, but can be used as single declaration
Tptr<T,(T::params_list)> tmp;
    foo( 
      Tptr<T,(T::params_list1)>(),
      Tptr<T,(T::params_list2)>(),
      Tptr<T,(T::params_list3)>()  );

//B
//better and suitable to define objects
Tptr<T>   tmp1<(T::params_list1)>,
          tmp2<(T::params_list2)>;
    foo( 
      Tptr<T>()<(T::params_list1)>,
      Tptr<T>()<(T::params_list2)>,
      Tptr<T>()<(T::params_list3)> );

//C
//best, but Tptr<T>::params_list can hide T::params_list
//if no T:: prefix
Tptr<T>   tmp(T::params_list);
    foo( 
      Tptr<T>(T::params_list1),
      Tptr<T>(T::params_list2),
      Tptr<T>(T::params_list3) );
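For comparison, variant "C" can be approximated with a constructor template that forwards whatever argument list it receives to a constructor of T. This is a hedged sketch that assumes variadic templates and perfect forwarding from C++11, not the "define"/"list" keywords proposed here; the members of "my_class" and the helper "first_field" are invented for illustration.

```cpp
#include <cassert>
#include <string>
#include <utility>

template <class T>
class Tptr {
    T* ptr;
public:
    // Accepts any parameter list and passes it on to a matching constructor of T.
    template <class... Params>
    explicit Tptr(Params&&... params)
        : ptr(new T(std::forward<Params>(params)...)) {}

    Tptr(const Tptr&) = delete;             // single owner, no copying
    Tptr& operator=(const Tptr&) = delete;
    ~Tptr() { delete ptr; }

    T& operator*() const { return *ptr; }
    T* operator->() const { return ptr; }
};

// Hypothetical user class with a three-parameter constructor.
struct my_class {
    int a;
    double b;
    std::string c;
    my_class(int a_, double b_, const char* c_) : a(a_), b(b_), c(c_) {}
};

int first_field()
{
    Tptr<my_class> tmp(1, 2.45, "test");  // no explicit "new" at the point of use
    assert(tmp->c == "test");
    return tmp->a;
}
```

Allocation and wrapping happen inside one constructor call, so several such temporaries can be passed to a function without the "two-stage" leak described above.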

Problems.

Nothing.


Explicit constructor in C-style array declaration.

Abstract.

There are many cases when we need a concrete number of objects allocated, and the number of objects will not change during the object lifetime. The way to do it is a C-style array. Then we can write an RAII wrapper for the array, so the array will be copyable and suitable as a member of other classes.

Unfortunately, C++ does not allow creating a C-style array with any constructor other than the default constructor. I do not see any reason for this limitation. A C-style array is useful because it is the only way to make an intrusive array allocated, for example, in auto memory, without slow dynamic memory allocation/deallocation.

The simplest way to avoid the problem is to declare a concrete constructor for C-style arrays and use it during creation.

Semantics.

A C-style array allocates a memory area of the total size of the array. In the area objects are placed one by one, aligned correctly. After the memory has been allocated, the default constructor is called for each object in the array.

It is easy to see that instead of the default constructor, any concrete constructor (but only one for all objects) can be called for all objects in the array. The compiler can easily generate internal functions to create an array with any concrete constructor.

There are many objects that normally must not be created with the default constructor. Due to the C-style array limitation the programmer must use an explicit "two-stage" creation sequence, which is slow and unsuitable here: during the first stage, memory is allocated and the default constructor is called; then, during the second stage, a user-defined member "construct" is called for each object of the array.

Syntax.

We must tell the compiler which constructor must be used in each C-style array declaration combined with the array creation. The simplest way, type_name variable_name(params)[size];, cannot be used, because it looks like a function declaration. If we move "(params)" close to type_name, like this: type_name(params) variable_name[size];, then again there is confusion with a function, and "new[]" has no barrier such as "variable_name": new type_name?(params)?[size];.

So we need a special mark to declare the constructor used for creation. There are two ways:

// -1- 
//use "=()"
type_name   variable_name[size]=type_name(params);
    =new    type_name[size]=type_name(params);

here "new" returns an r-value of POD pointer, so normally this
statement cannot be reserved for assignment; also
"type_name[size]" and "type_name(params)" have incompatible
types, but we can forbid this syntax for the "new" expression
and keep it only for variable declarations.

// -2-
//use "([" and "])"
type_name   variable_name([params])[size];
    =new    type_name([params])[size];    

Example:

void	foo()
{
int    tmp[40];

//the following tmp? are equal to "int tmp[40];"
int    tmp0[40]=int(0);
int    tmp1[40]=int(1);
int    tmp2([2])[40];
int    tmp3([3])[40];

int    (*ptr1)[40]=new int[1][40]=int(4);
int    (*ptr2)[40]=new int([5])[1][40];
}

template<class T>
class X
{
    static T	ext1[23];
    static T    ext2[23]=T(3,5,6);
    
    T           int1([3,5,6])[23];
};

template<class T>
T    X<T>::ext1[23]=T(3,5,6);

template<class T>
T    X<T>::ext2[23];
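Until such syntax exists, the effect for a fixed-size array in auto memory can be imitated with std::array and a pack expansion that repeats one constructor call per element. This is only a sketch under assumptions: it needs std::index_sequence from C++14, it covers std::array rather than a raw C-style array, and "make_filled", "Point" and "sum_of_x" are hypothetical names.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <utility>

// Expands to one "T(args...)" initializer for every index in I...
template <class T, std::size_t... I, class... Args>
std::array<T, sizeof...(I)> make_filled_impl(std::index_sequence<I...>,
                                             const Args&... args)
{
    return {{ (static_cast<void>(I), T(args...))... }};
}

// Builds an array of N objects, each created by the same constructor call.
template <class T, std::size_t N, class... Args>
std::array<T, N> make_filled(const Args&... args)
{
    return make_filled_impl<T>(std::make_index_sequence<N>{}, args...);
}

// Hypothetical type without a default constructor.
struct Point {
    int x, y;
    Point(int x_, int y_) : x(x_), y(y_) {}
};

int sum_of_x()
{
    std::array<Point, 4> pts = make_filled<Point, 4>(3, 5);
    int sum = 0;
    for (std::size_t k = 0; k < pts.size(); ++k)
        sum += pts[k].x;
    assert(pts[0].y == 5);
    return sum;
}
```

No default constructor and no second "construct" stage is needed: every element is created directly by the chosen constructor.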

Problems.

Nothing.


Auto_ptr idiom.

Abstract.

The auto_ptr is an RAII memory wrapper and can be implemented as a free-standing class with very light methods, whose internal state holds only a POD pointer to the allocated memory. We do not want an auto_ptr with any overhead, so auto_ptr has no extra runtime data in addition to the POD pointer.

I think more than 90% of all dynamic memory operations can be covered by this highly efficient memory wrapper, so complete support of the wrapper (of the auto_ptr idiom) is a very important part of the language.

Problems.

There are some problems with the implementation of auto_ptr in C++.

In order to implement auto_ptr, the language must support the following things:

  1. One of the problems is that C++ has no strict definition of "moveable", so in order to support auto_ptr C++ must integrate the "moveable" concept into the language.

  2. The other problem is that (unlike "copyable" objects) "moveable" objects can be turned into an incorrect state during their lifetime (before the object is deleted), and this is an ordinary (correct) event for a "moveable".

    That is why it is always possible to try to access a "moveable" object while it is in its incorrect state. Logically such an attempt is always an error, but a logically destroyed object is technically still accessible. Fortunately, the compiler can control correct access to a "moveable" at compile time with the help of extra compile-time attributes.

    So in order to support auto_ptr C++ must also define extra attributes to control access to "moveable".

Only then can auto_ptr be implemented in C++ and used safely and efficiently.


"Move", "Copy" and "Swap" operations.

Abstract.

There are at least two examples of operations whose rvalue always has a mutable internal state: the operations "move" and "swap". They are very popular operations, required for RAII wrappers and efficient data exchangers, but unfortunately they have no C++ operators of their own, so I think C++ must declare the operators.

Also the appearance of the new operations "move" and "swap" forces us to introduce an explicit operation "copy", to help the user with the interaction of copyable and moveable data types (see "moveable concept" below).

Semantics.

  1. Operator "move". The operator takes the internal state of the source object and places the state into the destination object; the source object obtains a new internal state (often random or incorrect). The operation is also known as "destructive read". Any class that supports the operation is called "moveable".

    The following example shows "assignment" overloaded as "move":

    struct A
    {
    	mutable int i;
    	
    	A& operator= (const A& src)
    	 {
    	  if( this != &src )
    	   { 
    	    i=src.i; 
    	    src.i=0; 
    	   }
    	  return *this;
    	 }
    };
    

    One can see that the user can require both behaviours together, "move" and "ordinary assign", so we need special syntax to express our desire to do "exactly move".

    Also, most people associate "operator=" with "copy assignment" rather than with "move assignment". Consider:

    A    a,b;
    
        // many people will expect here 
        // that "b" will be unchanged 
        a = b;
    
        // but the expression is unrolled to
    {
        a::i=b::i, //ok
        b::i=0;    //error - unexpected assignment to "b" from invisible source
    }
    

    The new syntax of "operator move" can look like "a=b", but be clearer, so that we will not expect "b" to remain unchanged.

    Properties of "operator move":

    • The "operator move" is a binary operator with a non-const destination parameter and a possibly const source parameter.
    • The "operator move" can return a reference to the destination parameter, as "operator assign" does.
    • The precedence of "operator move" is equal to that of assignment, and it associates from right to left, as "operator assign" does.

  2. Operator "copy". The operator takes the internal state of the source object and places a _copy_ of the internal state into the destination object; the source object always keeps its old, unchanged internal state. Any class that supports the operation is called "copyable" (for more details about copyable and moveable data types see "moveable concept" below).

    The question is: if we introduce "operator move", do we perhaps need "operator copy" to explicitly cover all possible operations ("move" and "copy")?

    On the one hand, we do not need it, because "ordinary assign" does exactly copying.

    On the other hand, later, working with moveable data types, one can see that in addition to "operator move" and "operator copy" we also need an operator "assignment by class declaration", especially for templates.

    In other words, we have three operators here, not two:

    • operator "move" ("move assignment")
    • operator "copy" ("copy assignment")
    • operator "assign" ("assignment by class declaration")

    And it is useful to reserve "operator=" for "assignment by class declaration" instead of "operator copy", because (see "moveable concept" below) the declaration of the destination variable is used as the selector of the move or copy type of assignment that will be applied. Consider:

    template<class T>
    T foo(T& a, const T& b)
    {
        // here "a=b;" means
        // assignment by declaration of "a"
        return a=b;
    }
    
    template<class A>
    void foo()
    {
    A           ca, cb;
    moveable A  ma, mb;
    
        // here "a=b;" instantiated
        // as operator "copy assignment" (a,b)
        foo<A>(ca,cb);
    
        // here "a=b;" instantiated
        // as operator "move assignment" (a,b)
        foo<moveable A>(ma,mb);
    }
    

    So if we want to express "always do copy" in the templated foo<>, we need the syntax of a separate "operator copy", similar to the syntax of "operator move".

    Also later, working with the interaction of moveable and copyable data types (for more details see "moveable concept" below), one can see that, unlike a constructor, an assignment can be applied far away from the declaration of its destination variable. In the implementation of something we can find code like this:

        foo(a);
        foo(a);
        a=b;
        foo(b);
    

    Well, looking at the assignment, can you say whether copy assignment or move assignment will be applied? You need to find the declaration of the variable "a". This is not so good, because the code looks vague and is hard to read in small parts.

    So, in spite of the fact that "move constructor" will not be expressed differently from "copy constructor", for the two reasons listed above we need syntax to express "move assignment" and "copy assignment" explicitly.

    The properties of "operator copy" are the same as for "operator move".

  3. Operator "swap". The operator exchanges the internal states of the source and destination objects. For POD types the operator can be implemented by an internal builtin function of the C++ compiler and can use a single CPU opcode to provide the most efficient hardware exchange, as for the "*p++" expression.

    The following example shows "assignment" overloaded as "swap":

    struct A
    {
    	mutable int i;
    	
    	void operator =(A& src)
    	 {
    	  if( this != &src )
    	   { 
    	    const int tmp=i; //keep old value of destination
    	    i=src.i; 
    	    src.i=tmp; 
    	   }
    	 }
    };
    

    Properties of "operator swap":

    • Unlike "operator move", the "operator swap" is a binary operator with both parameters non-const; the source parameter cannot be const (see below "non-const vs moveable").
    • The return type of "operator swap" is undefined (maybe a reference to the destination variable can be returned, as "operator assign" does).
    • The precedence and associativity of "operator swap" are undefined (maybe the precedence is equal to that of assignment and it associates from right to left, as "operator assign" does).
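For comparison, standard C++ since C++11 does not add new operators for this. It separates "copy assignment" from "move assignment" by the value category of the source, and std::move makes the move visible at the point of the assignment. The sketch below illustrates the three operations under that assumption; it is not the operator syntax proposed here, and "three_operations_demo" is a hypothetical name.

```cpp
#include <cassert>
#include <utility>

struct A {
    int i;
    explicit A(int v = 0) : i(v) {}
    A(const A&) = default;

    // "copy": the source keeps its old state.
    A& operator=(const A&) = default;

    // "move": the source gives up its state ("destructive read").
    A& operator=(A&& src) noexcept
    {
        if (this != &src) { i = src.i; src.i = 0; }
        return *this;
    }
};

bool three_operations_demo()
{
    A a(1), b(2), c(3);

    a = b;                                    // copy assignment
    const bool copy_ok = (a.i == 2 && b.i == 2);

    a = std::move(c);                         // move assignment, explicit at the call
    const bool move_ok = (a.i == 3 && c.i == 0);

    std::swap(a, b);                          // swap, both operands non-const
    const bool swap_ok = (a.i == 2 && b.i == 3);

    assert(copy_ok && move_ok && swap_ok);
    return copy_ok && move_ok && swap_ok;
}
```

Here the source of a move must be non-const, whereas the "operator move" described above allows a possibly const source.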

Pointers, indirect access to data and moveable data types.

  1. Consider the following example of a user-defined class designed to be a replacement for a POD pointer:
    template<class Tobj>
    class Tobject
    {
        Tobj heap*      ptr;
        
    public:
        typedef heap    memtype;
    
        void            detach(){ ptr=0; }
        void            destroy(){ delete ptr; ptr=0; }
    
        void            assign(Tobj heap* p){ ptr=p; }
        void            replace(Tobj heap* p)
         {
          if( p != ptr )
           {
            delete ptr;
            ptr=p;
           }
         }
    
        Tobj heap*      get()const { return ptr; }
        Tobj heap*      move()
         {
          Tobj heap* const tmp=ptr;
          ptr=0;
          return tmp;
         }
    
    public:
        Tobject(Tobj heap* p=0):ptr(p){}
    };
    

    In the example:

    • the Tobject<> encapsulates the concrete form of "delete" ("delete" instead of "delete[]") with the help of the method "destroy" and the typedef "memtype"
    • similarly to a POD pointer, the Tobject<> has the methods "assign" and "get"; for these methods C++ has the special operators "operator=" and "operator*" plus "operator->"
    • similarly to a POD pointer, the destructor of the Tobject<> does nothing and the constructor uses the method "assign"

    It is easy to see that, independently of the moveable or copyable type of "Tobj" that "ptr" points to, for each logical group of methods the Tobject<> declares at least a pair of methods with different names and implementations:

    • create object or place new value
      • assign
      • replace
    • delete object
      • detach
      • destroy
    • indirect access
      • get
      • move

    In the example Tobject<> is a copyable class, but its data may be only moveable. The concrete "copy behaviour" of Tobject<> can be defined by an external class, and auto_ptr is one kind of such external class.

    It is easy to see that "detach" and "replace" of Tobject<> are a kind of "syntax sugar": they can be implemented as "move" and as "destroy" plus "assign" respectively. "Move" is syntax sugar as well, because it can be implemented as "get" plus "assign".

    Dividing the methods into base and derived ones is not only possible; the division makes it look as if the current set of operators "operator=", "operator*" and "operator->" is enough to implement everything we need.

    But this opinion is wrong. A concrete implementation of the "copy behaviour" of Tobject<> can require both methods of a group at the same time, and that implementation can itself be an ordinary data type, so it can require a suitable operator for each method, just as a copyable data type does.
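
The "syntax sugar" relationships described above can be checked in today's C++. The following is only a sketch under assumptions: the proposed "heap" qualifier does not exist, so it is dropped, and the names PlainObject, Counted and demo_plain_object are hypothetical and not part of the proposal.

```cpp
#include <cassert>

// Standard C++ approximation of Tobject<>: the proposed "heap" qualifier
// is dropped, so nothing here checks where the pointer came from.
template<class Tobj>
class PlainObject   // hypothetical name
{
    Tobj* ptr;

public:
    explicit PlainObject(Tobj* p = nullptr) : ptr(p) {}

    // base methods
    Tobj* get() const     { return ptr; }
    void  assign(Tobj* p) { ptr = p; }
    void  destroy()       { delete ptr; ptr = nullptr; }

    // derived methods ("syntax sugar"), written via the base ones
    void  detach()        { assign(nullptr); }
    Tobj* move()          { Tobj* const tmp = get(); assign(nullptr); return tmp; }
    void  replace(Tobj* p)
    {
        if (p != get()) { destroy(); assign(p); }
    }
};

// Counts live objects so the effect of each method can be observed.
struct Counted
{
    static int live;
    Counted()  { ++live; }
    ~Counted() { --live; }
};
int Counted::live = 0;

// Returns the number of live objects left after all steps.
inline int demo_plain_object()
{
    PlainObject<Counted> a(new Counted);
    a.replace(new Counted);                   // the old object is deleted
    assert(Counted::live == 1);

    Counted* raw = a.move();                  // ownership leaves the wrapper
    assert(a.get() == nullptr && Counted::live == 1);

    delete raw;                               // the caller is now responsible
    return Counted::live;
}
```

As in the original Tobject<>, the wrapper has no destructor, so an object that is neither moved out nor destroyed explicitly would leak.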

  2. An important concrete case of "copy behaviour" of Tobject<> is auto_ptr, which requires both the "get" and "move" methods simultaneously. Moving a POD pointer out to a wrapper of another type is an ordinary operation for a wrapper.

    The operation of extracting internal state ("move") is also useful for some moveable classes that have clearly separated internal and external state, and for copyable pointers or iterators that point to moveable data.

    Note that the "move operation" of Tobject<> considered here is unrelated to the pointer itself; it relates only to the data the pointer points to (just as "operator*" relates to the data the pointer points to), so this "move operation" is not like the "binary move operation" considered in the previous clause. Maybe it would be better to call this operation of Tobject<> "extract", but the name "move" has become familiar to me.

    Consider the following example of auto_ptr, implemented while C++ has no moveable support (it seems to me that auto_ptr collects most of the possible difficulties of implementing a moveable data type).

    template<class Tobj, class Tmemory=Tobject<Tobj> >
    class Tauto_ptr
    {
        mutable Tmemory     memory;
    
    //info
    public:
        typedef typename Tmemory::memtype   memtype;
        char             is_owner()const { return memory.get()? 1:0; }
        
    //access
    public:
    
        Tobj auto*      get_ptr()const 
         { 
          return static_cast<Tobj auto*>(memory.get()); 
         }
    
        Tobj&           get_obj()const 
         { 
          Tobj auto* const tmp=get_ptr();
          if(!tmp)safe_throw<err_zptr>();
          return *tmp; 
         }
    
        Tobj memtype*   move()const { return memory.move(); }
        Tobj            copy()const { return *memory.get(); }
          
    public:
    
        Tobj&           operator*  ()const { return get_obj(); }
        Tobj auto*      operator-> ()const { return &get_obj(); }
    
        Tobj&           operator() ()const { return get_obj(); }
        Tobj auto*      operator() (const int)const { return get_ptr(); }
        
    //assign
    public:
    
        Tauto_ptr&      assign(Tobj heap* const new_adr) 
         { 
          memory.assign(new_adr);
          return *this;
         }
    
        Tauto_ptr&      replace(Tobj heap* const new_adr) 
         { 
          memory.replace(new_adr);
          return *this;
         }
         
    public:
        Tauto_ptr&      operator= (Tobj heap* const new_adr)
         { 
          return replace(new_adr); 
         }
    
    //creation
    public:
        Tauto_ptr(Tobj memtype* p=0):memory(p){}
        
    //copying
    public:
        Tauto_ptr(const Tauto_ptr& obj):memory(obj.move()){}
        Tauto_ptr& operator= (const Tauto_ptr& obj)
         {
          if( &obj != this ){ replace(obj.move()); }
          return *this;
         }
         
    //destroying
    public:
        void            detach()const { memory.detach(); }
        void            destroy()const { memory.destroy(); }
        ~Tauto_ptr(){ memory.destroy(); }
    };
    

    To learn why "move", "detach" and "destroy" have been declared const for Tauto_ptr, see "Non-const" is not the same as "moveable" below. Note that the operations are non-const for Tmemory, because Tmemory is a low-level class for working with a concrete kind of memory and is not intended to be a substitute for auto memory.
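
The core of Tauto_ptr, a "copy" that really moves out of a const source, can be reduced to a few lines of standard C++. This is a hedged sketch only: TransferPtr and demo_transfer are hypothetical names, and the memory qualifiers and the Tmemory policy of the example above are left out.

```cpp
#include <cassert>

// Hypothetical, reduced auto_ptr-like holder: copying transfers ownership.
// "mutable" lets the const source be emptied, as in Tauto_ptr above.
template<class T>
class TransferPtr
{
    mutable T* ptr;

public:
    explicit TransferPtr(T* p = nullptr) : ptr(p) {}
    ~TransferPtr() { delete ptr; }

    bool is_owner() const  { return ptr != nullptr; }
    T&   operator*() const { return *ptr; }

    // const, although it empties the holder
    T* move() const { T* const tmp = ptr; ptr = nullptr; return tmp; }

    // the "copy" operations are really moves
    TransferPtr(const TransferPtr& other) : ptr(other.move()) {}
    TransferPtr& operator=(const TransferPtr& other)
    {
        if (&other != this) { delete ptr; ptr = other.move(); }
        return *this;
    }
};

inline bool demo_transfer()
{
    const TransferPtr<int> source(new int(42));
    TransferPtr<int> target(source);          // source is silently emptied
    return !source.is_owner() && target.is_owner() && *target == 42;
}
```

Nothing in the declaration of "source" warns the reader that the copy empties it, which is the problem the following clauses try to solve.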

  3. The big reason to introduce the "operator extract" is the existence of moveable data types. Looking at the example of the auto_ptr implementation, one can see that the operators "operator*" and "operator->" return a reference (or a pointer).

    It is hard to implement "compile time attributes for moveable" (see "compile time attributes for moveable" below) if one "operator*" is used for both copy and move. In other words, we need a new "operator extract" alongside "operator*" to support compile time attributes of moveable data types.

    In the example of the auto_ptr implementation one can see the method "copy". That method can also require its own "operator copy extract". The other big reason to introduce the "operator move extract" and the "operator copy extract" is the interaction of copyable and moveable data types.

    Similarly to the previously considered arguments for the appearance of binary "move assignment" and "copy assignment", we need a unary "operator move extract" and "operator copy extract". The existence of the three operators allows the user to declare (and overload) three return values:

    return reference "unary operator *"     - to access the data the pointer points to
    return value     "unary operator move"  - to move the data the pointer points to
    return value     "unary operator copy"  - to make a copy of the data the pointer points to
    

    Properties of "unary operator move":

    • Operator "move extract" is a unary prefix operator and looks like "unary operator *" (I am sure all will be happy to find a new prefix operator in C++).
    • The operator "move extract" returns the value of the internal state of its object.
    • In spite of moving the internal state of its object, the operator "move extract" can be const (see "non-const vs moveable" below).
    • Precedence and associativity of operator "move extract" are similar to "unary operator *".

    Properties of "unary operator copy":

    • Operator "copy extract" is a unary prefix operator and looks like "unary operator *" (I am sure all will be happy to find a new prefix operator in C++).
    • The operator "copy extract" returns the value of the internal state of its object.
    • Precedence and associativity of operator "copy extract" are similar to "unary operator *".

    See "moveable concept" below to learn more about the operators.

Syntax.

We can use the following forms for the operators (there is no ambiguity). It is interesting that the forms of the operators reflect the fact that "move", "copy" and "swap" are different sides of one thing.

  1. Operator "move".
    T& T::operator<-<(const T& right);
    T& operator<-<(T& left, const T& right);
    
    a<-<b;
    

  2. Operator "copy".
    T& T::operator<+<(const T& right);
    T& operator<+<(T& left, const T& right);
    
    a<+<b;
    

  3. Operator "swap".
    void T::operator<->(T& right);
    void operator<->(T& left, T& right);
    
    a<->b;
    

Pointers, indirect access to moveable data.

It is hard to find a short form for the "unary operator copy", but for the "unary operator move" there is a beautiful one. The long forms of the "unary operators" have the same syntax as their "binary operators". The short form of the "unary operator move" is just an alias, so a user class can declare only one of them.

  1. Operator "unary move"
    T T::operator<();    //shortest, can be ambiguous with binary <
    T operator<(const T&);
    
    T T::operator<-<();  //no ambiguity here
    T operator<-<(const T&);
    
    a=<b;
    a=<-<b;
    

  2. Operator "unary copy"
    T T::operator<+<();  //no ambiguity here
    T operator<+<(const T&);
    
    a=<+<b;
    

Summary of operators:

== Binary operators ==

operator assignment by destination =
operator move assignment           <-<
operator copy assignment           <+<
operator swap                      <->

== Unary operators ==

operator unary move (short form)   <
operator unary move                <-<
operator unary copy                <+<
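
For readers who want to try the semantics of the table before any such operators exist, each of them can be given a named function in standard C++. This is a sketch under assumptions: all function names are hypothetical, and the source of a move is taken as non-const here because std::move can not move from a const object, which differs from the const source of the proposal.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical named stand-ins for the proposed operators.
template<class T> T& move_assign(T& left, T& right)        // a <-< b
{ left = std::move(right); return left; }

template<class T> T& copy_assign(T& left, const T& right)  // a <+< b
{ left = right; return left; }

template<class T> void swap_values(T& left, T& right)      // a <-> b
{ using std::swap; swap(left, right); }

template<class T> T unary_move(T& source)                  // <b  or  <-<b
{ return std::move(source); }

template<class T> T unary_copy(const T& source)            // <+<b
{ return source; }

inline bool demo_named_operators()
{
    std::string a("a"), b("b"), c("c");

    copy_assign(a, b);                 // a == "b", b unchanged
    bool ok = (a == "b" && b == "b");

    swap_values(a, c);                 // a == "c", c == "b"
    ok = ok && (a == "c" && c == "b");

    std::string d = unary_copy(c);     // d == "b", c unchanged
    ok = ok && (d == "b" && c == "b");

    std::string e = unary_move(c);     // e == "b"; c is valid but unspecified
    ok = ok && (e == "b");

    move_assign(a, e);                 // a == "b"; e is valid but unspecified
    return ok && (a == "b");
}
```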

Problems.

Nothing.


"Move", "Copy" and "Swap" operations- Appendix

NOTE: "Non-const" is not the same as "moveable".

It is very important to see the differences between "non-const" and "moveable". Comparing the operations "move" and "swap", we can easily find the differences.

I think some people try to declare a moveable (auto_ptr for example) as non-const in the hope of warning the user that the moveable has an unexpected, hidden assignment to its internal state after the operation "move" (with "move" implemented by overloading the operation "copy") has been executed.

Non-const can be used for any purpose, but since non-const can not protect a moveable from "double move", const must be used for its standard purpose: to protect against explicit changes of the external state of the moveable (against assignment, for example). Consider the differences:

  1. template<class T>
    void  swap_1(T& lhs, T& rhs)
    {
    const T dummy ( <lhs );
    
    //ok
    //here "const T" protect us from wrong assignment to temporary
        dummy= <rhs; //compile time error
        
        lhs= <rhs;
        rhs= <dummy;
        
    //error
    //here compiler must generate error
    //independent from constness of dummy
    //because "double move" is "once" violation
        <dummy;
    }
    
  2. template<class T>
    void  swap_2(T& lhs, T& rhs)
    {
          T dummy ( <lhs );
          
    //error
    //here compiler must generate error 
    //but "non-const T" can not protect us from wrong assignment to temporary
        dummy= <rhs; //ok
        
        lhs= <rhs;
        rhs= <dummy;
        
    //error
    //here compiler must generate error 
    //independent from constness of dummy
    //because "double move" is "once" violation
        <dummy;
    }
    

And the fact that "const dummy" is not strictly required for the function "swap" does not mean that no other function strictly requires a const moveable.

In contrast to "operator move", "operator swap" really requires non-const arguments, because (unlike "move") we are going to use the swap arguments after "operator swap" has been executed.
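
The behaviour of swap_1 can be approximated in standard C++ with a holder whose moving method is const. This is a sketch under assumptions: Slot, take, put and demo_swap_1 are hypothetical names, "take" plays the role of the proposed unary "<", and the "double move" check is of course not performed by today's compilers.

```cpp
#include <cassert>

// Hypothetical holder whose take() plays the role of unary "<":
// it is const, yet hands the internal pointer out exactly as "move" would.
template<class T>
class Slot
{
    mutable T* ptr;
public:
    explicit Slot(T* p = nullptr) : ptr(p) {}
    ~Slot() { delete ptr; }
    Slot(const Slot&) = delete;
    Slot& operator=(const Slot&) = delete;

    T*   take() const { T* const tmp = ptr; ptr = nullptr; return tmp; }
    void put(T* p)    { delete ptr; ptr = p; }   // non-const: changes external state
    const T& value() const { return *ptr; }
};

template<class T>
void swap_1(Slot<T>& lhs, Slot<T>& rhs)
{
    const Slot<T> dummy(lhs.take());
    // dummy.put(rhs.take());   // would not compile: dummy is const
    lhs.put(rhs.take());
    rhs.put(dummy.take());      // moving out of a const object is allowed
}

inline bool demo_swap_1()
{
    Slot<int> a(new int(1)), b(new int(2));
    swap_1(a, b);
    return a.value() == 2 && b.value() == 1;
}
```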


Compile time attributes to control moveable state.

Abstract.

In the previous chapters one can find that "moveable" needs extra compile time attributes to control the moveable state. This is because a "moveable" object can be turned into an incorrect state (for example after a "move" operation), but (unlike a "copyable") the object will not be deleted immediately after that.

Fortunately, without run-time overhead, a C++ compiler can check correct moveable usage at compile time in 90% of cases with the help of compile time class member attributes such as "once, ctor, dtor, after, befor, probe, level".

Semantics & Syntax.

  1. "Once" attribute. C++ must support the keyword "once" to check that a method of a class (operator "move" for example) is called only once during the lifetime of an object.

    The keyword "once" can not guarantee compile time control in all cases, but the compiler can do its best, and at least it can produce a warning that "once" control is lost here. Consider:

    //let's work here only with objects created by  "new T"
    template<class T>
    struct Tptr
    {
        mutable T heap* ptr;
    
        T auto* get_ptr()const
         {
          return static_cast<T auto*>(ptr);
         }
        T auto* operator *()const { return get_ptr(); }
        
        T heap* move()const once 
         { 
          T heap *const tmp(ptr); 
    
            ptr=0; 
            return tmp; 
         }
        T heap* operator < ()const once { return move(); }
        
    };
    
    template<class T>
    void    boo(const Tptr<T>&);
    
    template<class T>
    char    woo(const Tptr<T>& ptr){ return (*ptr)? 1: 0; }
    
    using   ::is_ok;
    using   lib::all_ok;
    
    template<class T>
    void    foo(const Tptr<T> ptr)
    {
        //ok
        if( ::is_ok() ){ ptr.move(); }
    
        //warning - possible double move();
        //if "::is_ok()" is not synced with "lib::all_ok()"
        if( !lib::all_ok() ){ ptr.move(); }
    
        //warning - compiler lost control for "move()const once"
        //due to hidden body of "boo<T>"
        boo<T>(ptr);
    
        //ok, due to accessable body of "woo<T>"
        woo<T>(ptr);
        
        //warning - more than possible double move();
        //if "::is_ok()" or "lib::all_ok()" was true
        ptr.move();
        
        //error - double move();
        ptr.move();
    }
    
    /*
    to avoid the first warning 
        //warning - possible double move();
    we need write like this
    
        if( ::is_ok() ){ ptr.move(); }
        else
         {
          if( !lib::all_ok() ){ ptr.move(); }
         }
    */
    

    Can the keyword "once" make a difference for overloading, as in Tobj heap *const once move()const once;? I think not.

        //ok
        Tobj heap *const      move()const once;
        //i think error - redeclaration of "move()const"
        Tobj heap *const      move()const;
    
        //ok together with "move()const once"
        Tobj * /*once*/       move()once;
        //i think error - redeclaration of "move()"
        Tobj *                move();
    

    Evidently, with "once" we can make some tests at compile time instead of at run time.
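
Until something like "once" exists, the same rule can only be enforced at run time. The following sketch shows that weaker, run-time stand-in so the intended rule is concrete; OncePtr and demo_once are hypothetical names, and the flag costs exactly the run-time overhead that the proposed keyword is meant to avoid.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical run-time stand-in for "once": the second call throws.
template<class T>
class OncePtr
{
    mutable T*   ptr;
    mutable bool moved;
public:
    explicit OncePtr(T* p) : ptr(p), moved(false) {}
    ~OncePtr() { delete ptr; }
    OncePtr(const OncePtr&) = delete;
    OncePtr& operator=(const OncePtr&) = delete;

    T* move() const
    {
        if (moved) throw std::logic_error("double move");
        moved = true;
        T* const tmp = ptr;
        ptr = nullptr;
        return tmp;
    }
};

inline bool demo_once()
{
    const OncePtr<int> p(new int(7));
    delete p.move();                 // first call: fine
    try { p.move(); }                // second call: detected, but only at run time
    catch (const std::logic_error&) { return true; }
    return false;
}
```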

  2. One can say that "once" is not enough for complete compile time testing. Consider:
        const Tptr<T>    a(new T);
        a.move();
        a.move();       //ok - the error detected
        a.get_ptr();    //wrong - the error not detected
    

    Let's extend Tptr class from previous example:

    template<class T>
    class Tptr
    {
        mutable T heap* ptr;
    
    public:
        T auto* get_ptr()const
         {
          return static_cast<T auto*>(ptr);
         }
        T auto* operator *()const { return get_ptr(); }
        
        T heap* move()const once 
         { 
          T heap *const tmp(ptr); 
    
            ptr=0; 
            return tmp; 
         }
        T heap* operator < ()const once { return move(); }
        
        Tptr&   replace(T heap *const data)
         {
          //(ptr!=0) is rare condition
          if(data != ptr)if(ptr)delete ptr;
          ptr=data;
          return *this;
         }
        Tptr& operator= (T heap *const data){ return replace(data); } 
    
    public:
        Tptr(T heap *const p=0):ptr(p){}
    
        //(ptr!=0) is rare condition
        ~Tptr(){ if(ptr)delete ptr; }
    
    	//moveable semantic - see below
        Tptr(const Tptr@ &obj):ptr(<obj){}
        Tptr& operator=(const Tptr@ &obj)
         {
          if( &obj != this)replace(<obj);
          return *this;
         }
    
    private:
    	//copyable semantic - disabled
        Tptr(const Tptr& obj);
        Tptr& operator=(const Tptr& obj);
        
    };
    

    Now, I think, we need to tell the compiler that

    • if "move()" has been executed, "get_ptr()" must not be executed,
    • if "move()" has not been executed, "get_ptr()" can be executed, subject to get_ptr's own "once" conditions.

    Let's assume that once "move" has been executed, its Tptr<T> looks as if it was logically deleted: no "returning internal state" operation can be applied, only the destructor or a new assignment.

    In fact "move" is logically a real destructor with its own name, and this destructor can return a value. So instead of "once" we need to use "dtor" and "ctor":

    • to mark "move" as "dtor",
    • to mark "get_ptr" as "returning state"
    • to mark "replace" as "ctor"

    Assume that "returning state" is the default attribute. Standard constructors and the destructor have the ctor/dtor attributes by default; assignment does not. Then for Tptr<T> we get:

        //can the "const dtor" pair be applied or not?
        //i think yes, because a dtor can destroy const objects
        T heap* move()const dtor;
        T heap* operator < ()const dtor;
        
        //can the "const ctor" pair in theory be applied or not?
        //i think no, because a ctor can not re-create objects
        Tptr&    replace(T heap *const data)ctor;
        Tptr& operator= (T heap *const data)ctor;
        
        Tptr& operator=(const Tptr@ &obj)ctor;
    

    Now all is nearly excellent:

    {
    const Tptr<T>    a(new T);
        a.get_ptr();         //ok - get
        a.get_ptr();         //ok - get
        delete a.get_ptr();  //ok - compile time error detected
        delete a.move();     //ok - destroy
        a.move();            //ok - compile time error detected
        a.get_ptr();         //ok - compile time error detected
        a.replace(new T);    //ok - compile time error detected
    
    Tptr<T>          b(new T);
        b.replace(new T);    //ok - new value
        b.replace(new T);    //ok - new value
        b.move();            //ok - destroy
        b.move();            //ok - compile time error detected
        b.get_ptr();         //ok - compile time error detected
        b.replace(new T);    //ok - new value
    
    T    tmp;
        b.replace(&tmp);     //ok - compile time error detected
    }
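
The rules that "dtor" and "ctor" would let the compiler check can be written down as a run-time model, which makes the intended state transitions explicit. This is only a sketch under assumptions: StatePtr and demo_state_ptr are hypothetical names, the "heap" and "auto" qualifiers are dropped, and every check here happens at run time instead of compile time.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical run-time model of the "dtor"/"ctor" attributes.
template<class T>
class StatePtr
{
    mutable T*   ptr;
    mutable bool destroyed;   // true after a "dtor"-marked call

    void require_alive() const
    {
        if (destroyed) throw std::logic_error("object is logically destroyed");
    }
public:
    explicit StatePtr(T* p = nullptr) : ptr(p), destroyed(false) {}
    ~StatePtr() { delete ptr; }
    StatePtr(const StatePtr&) = delete;
    StatePtr& operator=(const StatePtr&) = delete;

    // "returning state": needs a live object
    T* get_ptr() const { require_alive(); return ptr; }

    // "dtor": needs a live object, leaves it logically destroyed
    T* move() const
    {
        require_alive();
        T* const tmp = ptr;
        ptr = nullptr;
        destroyed = true;
        return tmp;
    }

    // "ctor": allowed in any state, makes the object live again
    StatePtr& replace(T* data)
    {
        if (data != ptr) delete ptr;
        ptr = data;
        destroyed = false;
        return *this;
    }
};

// True if get_ptr() is rejected after move() and accepted after replace().
inline bool demo_state_ptr()
{
    StatePtr<int> b(new int(1));
    delete b.move();
    bool rejected = false;
    try { b.get_ptr(); } catch (const std::logic_error&) { rejected = true; }
    b.replace(new int(2));
    return rejected && *b.get_ptr() == 2;
}
```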
    

  3. Now the last problem remains: we can try to get access to an unassigned value.
    Tptr<T>    a;
        a.get_ptr();    //wrong - error not detected
    

    We must tell the compiler that the variable "a" must be put into a correct state before use, but we can not leave "unassigned values" after default ctors. To solve the problem we can mark the sequence of function calls with the attributes "after" and "befor". To be able to test the state at run time (bypassing the compile time protection) we can mark a member with the attribute "probe".

    In the example we can create a function that must be called before any other function returning internal state can be called:

    protected:
        void     has_assigned(){}
        
    public:
    	//member with "probe" ignore compile time state
        char     is_ptr_assigned()const probe { return ptr? 1:0; }
        
        T auto* get_ptr()const after(has_assigned);
    
        //for operator * "after(has_assigned)" can be skipped 
        //due to get_ptr() call
        T auto* operator *()const { return get_ptr(); }
    
        //The expression "after(has_assigned||has_assigned)" 
        //is example of multiple conditions
        T heap*  move()const dtor after(has_assigned||has_assigned);
        
        //for operator < "after(has_assigned)" can be skipped 
        //due to move() call
        T heap*  operator < ()const dtor { return move(); }
        
        Tptr&    replace(T heap *const data)ctor
         {
          //(ptr!=0) is rare condition
          if(data != ptr)if(ptr)delete ptr;
          ptr=data;
          has_assigned();
          return *this;
         }
        Tptr& operator= (T heap *const data)ctor { return replace(data); }
        
    public:
        //here we can avoid problem with "has_assigned"
        Tptr():ptr(0){}
        Tptr(T heap *const p):ptr(p){ has_assigned(); }
        
    public:
        //but the important case can not be resolved at compile time
        //so we are forced to assume that we will not pass obj 
        //in incorrect state or do manual runtime tests
        //see "moveable concept" below to solve the problem
        Tptr(const Tptr@ &obj):ptr(<obj){has_assigned();}
    
        //the same as "Tptr(const Tptr@ & obj)"
        Tptr& operator=(const Tptr@ &obj)ctor
         {
          if( &obj != this)replace(<obj);
          has_assigned();
          return *this;
         }
    

    Now all is a little better:

    Tptr<T>    a;
        a.get_ptr();         //ok - compile time error detected
        a.replace(new T);    //ok - new value
        a.get_ptr();         //ok - get
    

  4. Also, instead of a state named "after(has_assigned)", we can add a standard attribute "ready" to mark the default state "after("ready")". The state name for "after" is put in quotes ("ready") so as not to mix the state name "ready" with a run-time identifier "ready".
    protected:
        void     has_assigned()ready {}
        
    public:
        T auto*  get_ptr()const after("ready");
    

    And to avoid dummy functions such as "has_assigned", whose call can be conditional at run time, we can write this:

    public:
        T auto*  get_ptr()const after("ready");
        
        //ctor:  can be called for destroyed object
        //ready: also turn object into ready state
        Tptr&    replace(T heap *const data)ctor ready
        
    public:
        Tptr():ptr(0){}
        //Tptr(T heap *const p):ptr(p){ has_assigned(); }
        Tptr(T heap *const p):ptr(p) ready{}
    

  5. Instead of, or together with, "ctor, dtor, after, befor" we can use a more general way to control the compile time sequence, with the help of the attribute "level". Consider:
    //"test" can be called from levels: "from1", "from2" or "from3"  
    //and turn object into level: "to"
       T heap* test() level(from1,from2,from3;to); 
    
    //"move" can be called from any level
    //and turn object into level: "0"
    	T heap* move() level(...;0); 
    	
    /*
    
    Predefined equivalence between "level" and "ctor,dtor,after,befor" 
    in last example:
    
    deleted: level 0 == dtor 
    created: level 1 == ctor 
    ready:   level 2 == ready, after (has_assigned) 
    
    */
    

    It is easy to see that "level" is more for machines than for humans.
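
The "level" rule, a set of allowed source levels plus one target level, is a small state machine. The following run-time sketch models it, assuming the equivalence listed above (0 = deleted, 1 = created, 2 = ready); LevelGuard, pass, pass_any and the lvl_ names are hypothetical and only illustrate what the compiler would track.

```cpp
#include <cassert>
#include <initializer_list>
#include <stdexcept>

// Hypothetical run-time model of level(from...; to).
class LevelGuard
{
    int level;
public:
    explicit LevelGuard(int start) : level(start) {}
    int current() const { return level; }

    // Allow the call only from one of the listed levels, then switch to "to".
    void pass(std::initializer_list<int> from, int to)
    {
        for (int f : from)
            if (f == level) { level = to; return; }
        throw std::logic_error("call not allowed from current level");
    }

    // level(...;to): allowed from any level
    void pass_any(int to) { level = to; }
};

// The equivalence listed above.
enum { lvl_deleted = 0, lvl_created = 1, lvl_ready = 2 };

inline bool demo_levels()
{
    LevelGuard g(lvl_created);               // after a default constructor
    bool early_get_rejected = false;
    try { g.pass({lvl_ready}, lvl_ready); }  // get_ptr: after("ready")
    catch (const std::logic_error&) { early_get_rejected = true; }

    g.pass({lvl_deleted, lvl_created, lvl_ready}, lvl_ready);  // replace: ctor ready
    g.pass({lvl_ready}, lvl_ready);                            // get_ptr: now allowed
    g.pass_any(lvl_deleted);                                   // move: level(...;0)
    return early_get_rejected && g.current() == lvl_deleted;
}
```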

  6. Note that all keywords that control compile time attributes ("after", for example) do not interact with run-time identifiers, due to syntax limitations. Consider:
    int  after=0;
    double  ready=0.0;
    
    struct A
    {
        void test() ready;  
        void after() dtor after(test);
    };
    

Problems.

Nothing.


Moveable concept.

Abstract.

We need "copyable" and "moveable" concepts integrated to C++ completely.

For example, we have four different kinds of parameters:

and for them we must have:

  1. what syntax must be used to declare each type of parameter of function
  2. what kind of ctor or operator must be used to create the concrete type of parameter during function call

Semantics & Syntax.

Problems.

To select a concrete moveable syntax and the relationships between "copyable" and "moveable" in accordance with all requirements.


Moveable concept: appendix.

Steps of "moveable concept" appearance.

This clause describes why the "moveable concept" has appeared and what problems exist in code without "moveable concept" support. It can help to understand the purpose of the "moveable concept" semantics and syntax.

  1. Local holder.

    I have some "holders" (used in practice in my programs) that correctly destroy dynamic memory on errors or returns. Their memory must be used only locally in the block:

    class  int_derived;
    extern uint field_size;
    void   get(char *const);
    uint   read(const char *const);
    
    {
    int                a(1);
    obj_holder<int>    b(new int_derived);
    array_holder<char> c(new char[field_size]);
    
        get(*c);
        if(!**c)safe_throw<err_type>("no input string");
    
        *b=read(*c);
        if(!*b)break;
    
        for(uint i=*b; i; --i)++a;
        return a;
    }
    

    There is a problem here: unlike "a", I can not return the address of "b" or "c" outside of the local block. At the "}" the destructors of the "holders" will free their dynamic memory.
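
The holder itself is not shown on this page, so the following is only a guess at its minimal shape in standard C++; ObjHolder, Probe, use_locally and demo_holder are hypothetical names. The sketch shows the property that matters here: the memory is freed at the closing brace both on a normal return and on an error.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical minimal obj_holder: owns one object, never gives it away.
template<class T>
class ObjHolder
{
    T* ptr;
public:
    explicit ObjHolder(T* p) : ptr(p) {}
    ~ObjHolder() { delete ptr; }
    ObjHolder(const ObjHolder&) = delete;
    ObjHolder& operator=(const ObjHolder&) = delete;

    T& operator*() const { return *ptr; }
};

// Counts live objects so destruction can be observed.
struct Probe
{
    static int live;
    Probe()  { ++live; }
    ~Probe() { --live; }
};
int Probe::live = 0;

inline int use_locally(bool fail)
{
    ObjHolder<Probe> b(new Probe);
    if (fail) throw std::runtime_error("no input string");
    return Probe::live;                 // 1 while the block is active
}

inline bool demo_holder()
{
    const int inside = use_locally(false);
    const bool freed_on_return = (inside == 1 && Probe::live == 0);

    bool freed_on_throw = false;
    try { use_locally(true); }
    catch (const std::runtime_error&) { freed_on_throw = (Probe::live == 0); }

    return freed_on_return && freed_on_throw;
}
```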

  2. Moveable concept.

    To solve the last problem, we can add the term "ownership" to the "holder" and define a concrete "ownership transfer" behaviour.

    An example of "concrete ownership transfer behaviour" is the behaviour of auto_ptr. The purpose of auto_ptr is to pass ownership to the destination in spite of the bounds of auto blocks.

    The most frequent application of auto_ptr's ownership transfer is from an unnamed temporary source to a named destination variable. That is why we need no runtime tests to share ownership between local blocks: the unnamed temporary is inaccessible and will be deleted automatically after the transfer has been completed.

    So the kind of "pass" for "auto_ptr" is "move". We can not "copy" an "auto_ptr", which is why auto_ptr is not copyable but moveable. To allow the user to express the fact that a data type is moveable we need a "moveable" keyword that can be applied similarly to the "const" or "volatile" keywords. "Moveable value" or "moveable reference" alone can not be used instead of an ordinary moveable data type.

    Of course, we can also use "auto_ptr" with a named source variable, like this:

    {
    int                sa(1);
    auto_obj<int>      sb(new int_derived(2));
    auto_array<char>   sc(new char[field_size]);
    
    {
    int                a(sa);
    auto_obj<int>      b(sb);
    auto_array<char>   c(sc);
    
        get(*c);
        if(!**c)safe_throw<err_type>("no input string");
    
        *b=read(*c);
        if(!*b)break;
    
        for(uint i=*b; i; --i)++a;
        return a;
    }}
    

    Here there is no problem of access to the named source variable after the internal block has been executed, due to "}}": the named source variable is never used after the transfer has been completed.

    But it is easy to see that the named source variable is accessible inside the internal block and between the two "}" of the pair "}}". Access to a named source variable after the move has been done is _always an error_, never UB.
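
The same situation can be reproduced with std::unique_ptr from C++11, used here only as a stand-in for the auto_obj of the example above; demo_named_source is a hypothetical name. The sketch shows that the named source stays in scope and compiles after the move, although it no longer owns anything.

```cpp
#include <cassert>
#include <memory>
#include <utility>

inline bool demo_named_source()
{
    std::unique_ptr<int> sb(new int(2));       // named source
    {
        std::unique_ptr<int> b(std::move(sb)); // ownership moves into the inner block
        *b += 1;
    }                                          // the int is freed here
    // "sb" is still in scope and still compiles; the standard library
    // guarantees it is empty, so dereferencing it here would be an error.
    return sb == nullptr;
}
```

The compiler accepts any later use of "sb" without a diagnostic, which is the gap the compile time attributes are meant to close.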

  3. Compile time attributes for moveable data type.

    The cause of the last error is the fact that the boundaries of blocks for "auto" and "dynamic" memory do not coincide. Take into account that "non-const" applied to a copyable can not help us to solve the problem: it can not protect us from access to the named source variable after the move has been done.

    To work with "moveable" we must take into account the presence of "ownership", which does not exist for "copyable". So algorithms _can not begin to support a moveable data type_ merely by adding a "moveable data type" declaration: the algorithms must be _re-designed_, just as they must be _re-designed_ to support "const".

    The following example shows run-time errors that can appear in an algorithm if we add moveable data type support by "re-declaration" only, without proper re-design:

    1. Original code designed for "copyable data type":
      template<class T>
      T 	foo(const T& data)
      {
      T	tmp(data);
      	tmp.foo();
      	return data;
      };
      

    2. New code only re-declared for "moveable data type" (incorrect):
      template<class T>
      moveable T 	foo(const moveable T & data )
      {
      	//possible run-time error due to "data" is reference
      moveable T	tmp(data);
      	tmp.foo();
      	//always run-time error
      	return data;
      };
      

    3. New code re-declared and re-designed for "moveable data type" (correct):
      template<class T>
      moveable T 	foo(const moveable T &:(ctor,ready;dtor) data )
      {
      	//ok due to ":(ctor,ready;dtor)" compile time attributes
      moveable T	tmp(data);
      	tmp.foo();
      	//ok
      	return tmp;
      };
      

    Without strict compiler support for detecting improper usage of a moveable data type, it will be _hard to re-design_ old existing code to support the new moveable data type, just as it would be hard to re-design old existing code to support a new "const" data type if the compiler did not detect the "assignment to const" error.

    The purpose of "compile time attributes for moveable data type" is to soften the boundaries of blocks for "auto" and "dynamic" memory. By the nature of dynamic memory, the life-time of variables belonging to that memory is not tied to the ordinary block boundaries controlled by the compiler.

    We use dynamic memory to make a kind of block for variables across ordinary block boundaries, because we need no "copy" to pass a variable across an ordinary block boundary, so we gain performance here. But at the same time we do not want to lose reliability, and "compile time attributes for moveable data type" provide this.

    With "compile time attributes for moveable data type" the last example logically looks like this:

    {
    sa:{
    int                sa(1);
    sb:{
    auto_obj<int>      sb(new int(2));
    sc:{
    auto_array<char>   sc(new char[field_size]);
    
    {
    int                a(sa);
    sa:}
    auto_obj<int>      b(sb);
    sb:}
    auto_array<char>   c(sc);
    sc:}
    
        get(*c);
        if(!**c)safe_throw<err_type>("no input string");
    
        *b=read(*c);
        if(!*b)break;
    
        for(uint i=*b; i; --i)++a;
        return a;
    }
    
    sc:{
    auto_array<char>   sc(new char[field_size]);
    sc:}
    
    }
    

    Now we have no errors. Technically, the life-time of a moveable will follow the ordinary boundaries "{}", but logically the moveable will be accessible from C++ only inside its logical boundaries "variable_name:{ variable_name:}".

    The syntax "variable_name:{ variable_name:}" can be suitable also, but it can not replace compile time attributes, which declare the logical boundaries implicitly through the class or function declaration.


Differences between "rvalue", "movability" and "moveable data type".

This clause describes the differences between the terms "rvalue", "movability" and "moveable data type". Maybe the terms do not look fundamental, but there is a fundamental difference between a "new ordinary data type" and a "refinement of an existing data type".

People use the words "moveable semantics" for both cases, so we need to see a sharp and clear difference between them: they behave differently and are really different things. Seeing it can help to understand the purpose of the "moveable concept" semantics and syntax.

Probably we have no predefined, indisputable definitions of "movability" and "moveable data type", but the terms must take into account all practically detected and theoretically visible properties of data types in programs.

  1. Why "new ordinary data type" is not the same as "refinement of existing data type".

    We can think that conceptually "movability", or just the "moveable property", is the ability of _any_ concrete object to allow the process "move" to be applied to it; the property can be unrelated to the type of the object and can appear in only one concrete _place_ of code.

    By contrast, conceptually a "moveable data type" is an ordinary data type that, in addition to "movability", defines some other important rules of behaviour for all objects of its type, _always_.

    This is an important difference, because a "moveable data type" relates to a _data type_, while the "moveable property" relates to a _place of code_. The conceptual difference results in differences in behaviour between "objects of moveable data type" and "objects with moveable property only".

  2. The example of the difference in behaviour is RVO (return value optimization):

    1. Moving from local values ( the example from r-value reference FAQ ).

      A further language refinement can be made at this point. When returning a non-cv-qualified object with automatic storage from a function, there should be an implicit cast to rvalue:

      string
      operator+ (const string& x, const string& y)
      {
          string result;
          result.reserve(x.size() + y.size());
          result = x;
          result += y;
          return result;  // as if return static_cast<string&&>(result);
      }
      

      The logic resulting from this implicit cast results in an automatic hierarchy of "move semantics" from best to worst:

      • If you can elide the move/copy, do so (by present language rules)
      • Else if there is a move constructor, use it
      • Else if there is a copy constructor, use it
      • Else the program is ill formed
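
      The hierarchy above can be observed in C++11 and later, where r-value references were adopted. The following is only a minimal sketch, not part of the FAQ: "Probe" and its counters are hypothetical names introduced here to record which constructor runs. Whether a move is elided depends on the compiler and the language version, so the sketch relies only on the fact that the copy constructor is never chosen.

```cpp
#include <cassert>
#include <utility>

// Hypothetical probe type: counts which constructor actually runs.
struct Probe {
    static int copies;
    static int moves;
    Probe() {}
    Probe(const Probe&) { ++copies; }
    Probe(Probe&&) noexcept { ++moves; }
};
int Probe::copies = 0;
int Probe::moves = 0;

// Returns a local object with automatic storage by value.
Probe make_probe() {
    Probe result;
    return result;  // elided, or else treated as an rvalue and moved
}

// Resets the counters, performs one return-by-value, reports (copies, moves).
std::pair<int, int> count_copies_and_moves() {
    Probe::copies = 0;
    Probe::moves = 0;
    Probe p = make_probe();
    (void)p;
    return std::make_pair(Probe::copies, Probe::moves);
}
```

      The number of moves may be zero, one or two, depending on how much the compiler elides; the number of copies stays zero.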

    2. The example "Moving from local values" can be coded in accordance with moveable concept syntax, as defined on the page that you are reading:

      When returning any object with automatic storage from a function, there should be an explicit cast to "moveable data type", so the function must be declared properly:

      • string
        operator+ (const string& x, const string& y)
        {
            string result;
            result.reserve(x.size() + y.size());
            result = x;
            result += y;
        
            //here (according to return type of "operator+"
            //declaration) "copy constructor" _always_ 
            //- must be public in the context
            //- never can be implicitly replaced by "move constructor"
            return result;
        }
        

        The logic resulting from this explicit cast results in an explicit hierarchy of "copy semantics" from best to worst:

        • If you can elide the move/copy, do so (by present language rules)
        • Else if there is a copy constructor, use it
          • copy constructor declared by user and it is public in the context
          • copy constructor can be auto-generated by compiler
        • Else the program is ill formed
        note, the question of "eliding" is unrelated to question of "data type"

      • moveable string
        operator+ (const string& x, const string& y)
        {
            string result;
            result.reserve(x.size() + y.size());
            result = x;
            result += y;
        
            //here (according to return type of "operator+"
            //declaration) either "move constructor"
            //or "copy constructor"
            //- must be public in the context
            return result;
        }
        

        The logic resulting from this explicit cast results in an explicit hierarchy of "move semantics" from best to worst:

        • If you can elide the move/copy, do so (by present language rules)
        • Else if there is a move constructor, use it
          • move constructor declared by user and it is public in the context
          • move constructor can be auto-generated by compiler
        • Else if there is a copy constructor, use it
          • copy constructor declared by user and it is public in the context
          • copy constructor can be auto-generated by compiler
        • Else the program is ill formed
        note, the question of "eliding" is unrelated to question of "data type"

    The example shows that the main difference between "moveable data type" and "moveable property (r-value reference)" is the following:

    • for "moveable data type", the behaviour of objects is defined only by the class _declaration_ and the variable _declaration_ (as any ordinary C++ type _must_ do); "moveable data type" looks like "volatile data type" or "const data type".

    • but for "moveable property (r-value reference)", the behaviour of objects is defined only by the concrete _place_ of usage of a variable of undefined type, _never_ by declarations (as an ordinary C++ type _never_ must do). "R-value reference" makes the type of a variable a secondary condition and the place of usage the primary condition of the variable's behaviour.
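
    The dependence on the place of usage can be seen in C++11 and later, where r-value references were adopted. This is a minimal sketch under that assumption; "Tracked" and "Origin" are hypothetical names introduced here, not part of either proposal. The declaration of "a" is identical in both functions, and only the expression at the place of use selects copy or move.

```cpp
#include <cassert>
#include <utility>

// Hypothetical probe type that records how each instance was created.
struct Tracked {
    enum Origin { Fresh, Copied, Moved };
    Origin origin;
    Tracked() : origin(Fresh) {}
    Tracked(const Tracked&) : origin(Copied) {}
    Tracked(Tracked&&) noexcept : origin(Moved) {}
};

// The declared type of 'a' is the same in both functions;
// only the expression at the place of use differs.
Tracked::Origin origin_of_plain_init() {
    Tracked a;
    Tracked b = a;             // lvalue expression: copy constructor
    return b.origin;
}

Tracked::Origin origin_of_moved_init() {
    Tracked a;
    Tracked b = std::move(a);  // cast at the place of use: move constructor
    return b.origin;
}
```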

  3. "Rvalue" and "moveable property".

    There is _no_ relationship between "rvalue" and "moveable property". We can easily imagine "conceptual moveable" as an rvalue as well as an lvalue:

    moveable T  add(const moveable T&, const moveable T&);
    
    extern moveable T&  src;
    moveable T  dst= add(src,src);
    

    In the example above, the return value of function "add" is "rvalue moveable data type", but "dst" and "src" are "lvalue moveable data type", even though "src" is a "reference to moveable".

    This is because "conceptual moveable" is a data type that can create a new instance from itself only by moving its internal state instead of copying it, while "conceptual copyable" is a data type that can create a new instance from itself by moving its internal state as well as by copying it.

    Both "conceptual moveable" and "conceptual copyable" can be "lvalue" as well as "rvalue" or "temporary".
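
    Standard C++ has no "moveable" keyword, but a type that can create a new instance only by moving can be approximated in C++11 and later by deleting the copy operations. The following is a sketch under that assumption; "Handle" and the value -1 used for the emptied source are hypothetical choices made here. It shows a named object (an lvalue) of a move-only type.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical move-only handle: the class declaration itself forbids copying.
class Handle {
public:
    explicit Handle(int id) : id_(id) {}
    Handle(const Handle&) = delete;
    Handle& operator=(const Handle&) = delete;
    // Moving transfers the state; -1 is an arbitrary marker for the emptied source.
    Handle(Handle&& other) noexcept : id_(other.id_) { other.id_ = -1; }
    int id() const { return id_; }
private:
    int id_;
};

static_assert(!std::is_copy_constructible<Handle>::value,
              "copying is forbidden by the declaration");
static_assert(std::is_move_constructible<Handle>::value,
              "moving is allowed");

// 'src' is a named object, hence an lvalue, yet its type is move-only.
int move_from_lvalue() {
    Handle src(7);
    Handle dst(std::move(src));  // new instance created by moving the state
    return dst.id();
}
```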

  4. Co-existence of "moveable property" and "moveable data type".

    In theory, "moveable property" is just a limited subset of "moveable data type", so we need only a complete "moveable data type" instead of "moveable property".

    But "moveable property" can have its own behaviour (as "r-value reference" does), orthogonal to the behaviour of an ordinary C++ data type. In that case both behaviours can co-exist simultaneously.

    Consider the following declaration of class string, supporting all of them:

    class string
    {
    /* *************************************
        copy and move 
    **************************************** */
    //copyable data type
    public:
        string(const string&);
        any_type operator= (const string&);
    
    //r-value refinement of copyable data type
    public:
        string(const? string&&);
        any_type operator= (const? string&&);
    
    //moveable data type
    public:
        string(moveable const string&);
        any_type operator= (moveable const string&)ctor;
    
    /* *************************************
        copy and move from non-const source
    **************************************** */
    //does not define copyable data type if unhided
    //can override implicit copy ctor/assignment
    public:
        //compile time error
        //string(string&) is hidden by
        //string(const string&)
        string(string&);
    
        //compile time error
        //operator= (string&) is hidden by
        //operator= (const string&)
        any_type operator= (string&);
    
    //does not define moveable data type if unhided
    //can override implicit move ctor/assignment
    public:
        //compile time error
        //string(moveable string&) is hidden by
        //string(moveable const string&)
        string(moveable string&);
    
        //compile time error
        //operator= (moveable string&) is hidden by
        //operator= (moveable const string&)
        any_type operator= (moveable string&)ctor;
    };
    

    I am sure that "moveable data type" is a more regular and important thing than "r-value reference", so if C++ can not contain both, then "moveable data type" must be selected.


The "moveable concept" proposal in comparison with the "r-value reference" proposal.

Let's consider the "r-value reference" proposal ( r-value reference FAQ ) and the "moveable concept" proposal and find some of the differences between them.

The "r-value reference", being a refinement of copyable data type, can not claim to represent "moveable data type" as an ordinary data type in C++. In contrast, "moveable concept" is designed to introduce a pure, ordinary, separate "moveable data type". The "r-value reference" has behaviour orthogonal to "moveable data type", so they can co-exist simultaneously in one class.

The "r-value reference" syntax "int&&" can be used to explicitly cast a "temporary rvalue of type int" into a "reference to temporary lvalue of type int", but in general this question is unrelated to moveable data type.

Also, the fact that moving from a temporary during its life-time is safe is unrelated to the properties of moveable data type, just as RVO is unrelated to moveable data type or copyable data type.

And yet, if the "r-value reference" proposal allows the process "move" to be applied to copyable, the proposal is a kind of "moveable property" rather than "moveable data type". The "r-value reference" does not introduce a separate data type; it is rather a special set of rules for working with copyable data type in special places of code, unrelated to the data type declaration.

The "r-value reference" never can replace an ordinary moveable data type ("r-value reference" is just a limited subset of a correct moveable data type, and it has special rules for interacting with copyable that are not defined by the class declaration), because with the help of "r-value reference" the user never will be able to express the simplest kinds of algorithms with moveable data type (auto_ptr, for example).

  1. Concerning "r-value" term applied to "reference" term.

    If the address of an "rvalue" can be taken, then the "rvalue" has been implicitly cast into an "lvalue", because by definition the address of an "rvalue" never can be taken.

    Of course, the fact that no "rvalue" has an address does not mean that an "rvalue" can not be cast (implicitly or explicitly) into an "lvalue", but by itself "r-value reference" sounds paradoxical: "address of a thing that has no address".

    For CPU opcodes "rvalue" means the data is immediate:

    • here "data" is "rvalue" :
      enum { data=3 };
      register int tmp=data;
      ;.code segment
      ;    //3 -> tmp
      ;    movl $_data,%eax
      

    • here "data" is "lvalue":
      int data=3;
      ;.data segment
      ;    _data	.int 3
      register int tmp=data;
      ;.code segment
      ;    //3 -> tmp
      ;    movl _data,%eax
      

    • we can find its address as following:
      //register int *const ptr=&data;
      int auto* register const ptr=&data;
      ;.code segment
      ;    //&data -> ptr
      ;    leal _data,%ebx
      

    By the way, it would be better to call "r-value reference" "reference to temporary", because the semantics of "reference" assumes "lvalue": a "reference" will in the best case be implemented as a compile time alias and in the worst case as an encapsulated "const POD pointer".
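
    The implicit cast from "rvalue" to "lvalue" described above can be seen in C++11 and later, where r-value references were adopted. This is a minimal sketch under that assumption; the function name is hypothetical. The literal itself has no address, but the named reference bound to the temporary is an lvalue whose address can be taken.

```cpp
#include <cassert>

// Binds a temporary to an rvalue reference, then takes the reference's address.
int modify_through_address() {
    int&& ref = 3;     // the temporary's lifetime is extended to that of 'ref'
    int* ptr = &ref;   // allowed: the named reference 'ref' is an lvalue
    // int* bad = &3;  // ill-formed: a literal is an rvalue and has no address
    *ptr = 4;          // writes into the temporary through its address
    return ref;
}
```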

  2. There is no "moveable value" in "r-value reference" proposal.

    I think we must note that the difference between "rvalue reference" and "ordinary reference" can be only in behaviour, because the semantics of the term "reference" means "address", in contrast to the semantics of the term "value". The difference is the size of memory allocated for a parameter passed to a function:

    ( 
      sizeof(parameter passed by any reference)
      == sizeof(parameter passed by pointer ) 
    )
     != sizeof(parameter passed by value )
    

    With the help of the "r-value reference" proposal we can not explicitly declare a value that must be created by the moveable ctor. Note that moveable value and moveable reference both have meaning for the auto_ptr wrapper: in the first case we pass ownership, in the second case we do not.
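
    For comparison, the difference between passing ownership and not passing it can be sketched with std::unique_ptr, the C++11 standard library successor of auto_ptr. This is only an illustration in standard C++, not the syntax proposed on this page; the function names are hypothetical.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Takes the pointer by value: the parameter is created by the move constructor,
// so ownership is transferred into the function.
int consume(std::unique_ptr<int> owner) { return *owner; }

// Takes the pointer by reference: the caller keeps ownership.
int inspect(const std::unique_ptr<int>& observer) { return *observer; }

// Returns true if 'p' still owns its object after being passed by reference
// and no longer owns it after being passed by value.
bool ownership_follows_parameter_kind() {
    std::unique_ptr<int> p(new int(42));
    int seen = inspect(p);
    bool kept = (p != nullptr) && seen == 42;
    int taken = consume(std::move(p));
    bool released = (p == nullptr) && taken == 42;
    return kept && released;
}
```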

    This can be a problem if a data type has moveable and copyable behaviour simultaneously, because copyable will be hidden by moveable.

    Also it is hard to declare (or, looking at declarations, hard to guess) what kind of ctor will be used.

    I think that even if we can hide the absence of moveable value with the help of a special technique, we must not do so; we must give the user the ability to clearly express his intent about the data type.

  3. Copyable in comparison with Moveable.

    The "r-value reference" proposal requires that the source after "move" remains in a self consistent state (all internal invariants are still intact).

    The requirement is theoretically correct (it can exist), but practically useless, because even the auto_ptr wrapper behaves differently.

    Note, it is very important for auto_ptr that its state after "move" becomes incorrect and that the user has no access to auto_ptr during the incorrect state, because otherwise we can not implement the desired behaviour of auto memory substitution.

    The "r-value reference" proposal requires that, from a client code point of view, choosing move instead of copy means that you don't care what happens to the state of the source.

    Here "copying from non-const source" and "move" are mixed up. There are differences between them. For "move", the source state _never_ will be correct or merely undefined; the state _always_ will be incorrect.

    Also the "r-value reference" proposal just ignores the fact that "moveable data type" can be const, and the constness can help us to detect errors at compile time.

    The situation is as if the "moveable" data type were sharply divided into internal and external state, and "move" changes the placement of the internal state. A source object without internal state can not exist, and we may not assign some random value to the source's internal state in order to make the state "as if correct or consistent".

    Of course, we can imagine a kind of data type for which some other (but correct) value exists for the source after "move", but we can not demand this specific behaviour from all moveable data; by its nature a moveable data type must be able to turn the source into an incorrect state after "move".

    For PODs, move and copy are identical operations. The cause of this POD behaviour is just the fact that POD data is a copyable data type without a specific implementation of "move", so "copy" is used instead of "move".
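
    The POD case can be checked in C++11 and later. This is a minimal sketch; "Point" is a hypothetical type introduced here. Its implicitly generated move constructor copies each member, so the source keeps its value after the "move".

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical POD type with no user-defined copy or move operations.
struct Point {
    int x;
    int y;
};

static_assert(std::is_trivially_copyable<Point>::value,
              "Point is trivially copyable");

// "Moves" from a POD object and reports whether the source kept its value.
bool source_unchanged_after_move() {
    Point src = {1, 2};
    Point dst = std::move(src);  // implicit move constructor copies each member
    return src.x == 1 && src.y == 2 && dst.x == 1 && dst.y == 2;
}
```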

    In contrast to "moveable data type", the external state of the source of a "copyable data type" _always_ remains unchanged after "copy"; the source state _never_ can be changed by "copy", because we can not get two equal copies if the source is changed by the copy.

  4. Binding temporaries to references.

    The "r-value reference" proposal has been designed to preserve functionality without changing the meaning of any existing code, instead of introducing "moveable data type" as such.

    The heroic attempt to make moveable from unchanged old code, which works only with copyable, can be ironically compared with the appearance of an "implicit const reference T&&&":

    template<class T>
    void foo(T&&& t);
    
    here "t" is implicitly "const", because in the case:
        int	tmp=3;
        foo<int>( tmp );
    
    the compiler can elide the copy ctor when creating "t" and do perfect forwarding.

    Note, introducing ordinary "moveable data type" will not require any changes to code working with copyable and does not change execution of the code.

    The "r-value reference" proposal has also been designed to preserve vague overloading by constness of the parameter:

    void foo(const A& t);  // #1
    void foo(A&& t);       // #2
    
    A source();
    const A const_source();
    
        foo(source());        // binds to #2
        foo(const_source());  // binds to #1
    

    Overloading by const/non-const parameter is a suspicious idea; it can be a place of vague casts and human errors.
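
    The binding rules from the listing above can be checked in C++11 and later. This is a minimal sketch: the overloads here return their own number instead of void, and the three "bind_" functions are hypothetical helpers added for illustration.

```cpp
#include <cassert>

struct A {};

// Each overload reports its own number from the listing above.
int foo(const A&) { return 1; }  // #1
int foo(A&&)      { return 2; }  // #2

A source() { return A(); }
const A const_source() { return A(); }

int bind_non_const_rvalue() { return foo(source()); }        // non-const rvalue
int bind_const_rvalue()     { return foo(const_source()); }  // const rvalue
int bind_lvalue()           { A a; return foo(a); }          // named object
```

    A named object, like the const rvalue, binds to #1; only the non-const rvalue binds to #2.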


Operator "parameter passed by".

Abstract.

In some cases it is useful to protect objects of a class from being passed as a parameter by value or by reference. To do this, C++ must introduce operators for each concrete way a parameter can be passed. By declaring the operators as user-defined in the public or protected section of a class, we can control the ability of the object to be passed as a parameter by reference or by value.

Syntax.

This is an example of possible syntax for the operators:

struct A
{
	int i;
	
protected:
//copyable parameters

	operator (const A&)  ()const {return *this;}
	operator (A&)        (){return *this;}

	operator (const A*)  ()const {return this;}
	operator (A*)        (){return this;}

	operator (const A)   ()const {return *this;}
	operator (A)         (){return *this;}

protected:
//moveable parameters

	operator (const A@ &)()const {return *this;}
	operator (A@ &)      (){return *this;}

	operator (const A@ *)()const {return this;}
	operator (A@ *)      (){return this;}

	operator (const A@)  ()const {return *this;}
	operator (A@)        (){return *this;}
};

void	foo1(const A&);
void	foo2(const A*);
void	foo3(const A);

void	foo4(const A@ &);
void	foo5(const A@ *);
void	foo6(const A@);

void	boo()
{
A	a;

//the following code must not be compiled 
//due to operators are protected in this context

const A    &f1=a;     //operator (const A&)
const A    *f2=&a;    //operator (const A*)
const A    f3=a;      //operator (const A)

const A    @& f4=a;   //operator (const A@ &)
const A    @* f5=&a;  //operator (const A@ *)
const A    @f6=a;     //operator (const A@)

    foo1(a);          //operator (const A&)
    foo2(&a);         //operator (const A*)
    foo3(a);          //operator (const A)

    foo4(a);          //operator (const A@ &)
    foo5(&a);         //operator (const A@ *)
    foo6(a);          //operator (const A@)
}

Note that A::operator (const A)()const is not the same as A::operator const A ()const or A::A(const A). The operators of the form operator (type) are intended only for access control for external users of the class. Probably the operators can have no body, only a declaration, with the body built-in. Probably there are other ways to do it.
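
Standard C++ can approximate only part of this control. The following is a sketch under that limitation, not the proposed syntax: a protected copy constructor blocks "parameter passed by value" for external users, while passing by reference can not be restricted this way. The derived class "B" and the helper function are hypothetical names introduced here.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical approximation: a protected copy constructor blocks
// "parameter passed by value" for external users of the class.
struct A {
    int i;
    A() : i(0) {}
protected:
    A(const A& other) : i(other.i) {}
};

// Derived classes still have access to the protected constructor.
struct B : A {
    B() {}
    B(const B& other) : A(other) {}
};

void foo1(const A&) {}  // by reference: cannot be restricted this way
// void foo3(const A);  // may be declared, but a call foo3(a) would not compile

static_assert(!std::is_copy_constructible<A>::value,
              "A cannot be passed by value from outside the class");
static_assert(std::is_copy_constructible<B>::value,
              "B can, through its own public copy constructor");

// Passing by reference is unaffected by the protected copy constructor.
int pass_by_reference_still_works() {
    A a;
    foo1(a);
    return a.i;
}
```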

Problems.

Nothing.


List of modifications:

04.04.07  |  
04.03.07  |  01.03.07  |  
28.02.07  |  23.02.07  |  22.02.07  |  21.02.07  |  20.02.07  |  18.02.07  |  


