What is it?
This page describes some C++ improvements that can be added to the language and, I think, must be added. You can send me your opinion about each proposal on the page or discuss them in a newsgroup.
The layout of the page may change in the future. If you want to save a link to the page, use the URL: grizlyk1.narod.ru/cpp_new .
Updated: 07.04.07. List of modifications.
What kind of improvements are considered here?
All C++ improvements on the page have been designed under one or more of the following conditions:
List of improvements.
Explicit memory specifications for POD pointers.
Abstract.
POD pointers to all types of memory (static, auto, heap) share the same syntax, which leads to runtime errors and prevents compile-time error detection.
Any kind of memory pointer can be implemented as a user C++ class, but low-level memory wrappers cannot ignore POD pointers, because they work with "operator new", and "new" returns POD pointers. A user who wants to implement a memory pointer that is safe and free from POD pointers must also invent a safe "operator new".
The simplest solution to the problem is an explicit memory declaration for POD pointers.
Semantics #1.
A C-style POD pointer does not carry enough information about how the pointer must be used, which is why the programmer must support POD pointer usage with his own conventions, expressed, for example, as comments. Look at the following example:
//returns "new[]" memory, the caller must "delete[]" it
int* get_data1();
//returns "new" memory, the caller must "delete" it
int* get_data2();
//returns static memory, the caller must not delete it
int* get_data3();
For a language without garbage collection (as C++ is) this behaviour is too raw: it risks memory leaks or double deletes, because the compiler does not control pointer usage.
C++ has memory of the following types:
We can state the correct memory type in the pointer declaration.
Syntax.
Fortunately, C-style pointers can be extended. The general declaration of a pointer looks like this:
[extra_type] type * [extra_pointer] pointer = expression;
There are some exceptions to the regular rule, where [extra_pointer] declarations must be placed before [extra_type], for example:
mutable const int *pointer; //what is mutable: "const int" or "pointer"?
register const int *pointer; //what is register: "const int" or "pointer"?
Note:
I agree that a pointer cannot point to a register and that a const int outside of a class cannot be mutable, but a single regular place for the declaration is always better, for example:
//not C++
const int *mutable pointer; //"mutable pointer to const int"
const int *register pointer; //"register pointer to const int"
An explicit memory type declaration can be added to the [extra_type] section. So for "const pointer to const std::list allocated by new" we get:
const heap std::list * const pointer = expression;
To declare a function parameter of a concrete memory type we can write:
void foo( const char heap[] *const str);
Semantics #2.
To perform compile-time memory usage checks, we must define rules for casting between POD pointers with different memory types.
const int heap not_pointer = expression;
const int heap * heap_pointer = new int(0);
const int * pointer = heap_pointer;
heap_pointer=pointer;
// error heap -> heap[] implicit cast is not allowed
const int heap[] * heap[]_pointer = heap_pointer ;
// error heap[] -> heap implicit cast is not allowed
heap_pointer=heap[]_pointer;
const int heap * heap_pointer = new int(0);
const int * pointer = heap_pointer;
heap_pointer=pointer;
// ok heap -> heap[] explicit cast is allowed
const int heap[] * heap[]_pointer = static_cast<const int heap[]*>(heap_pointer);
// ok heap[] -> heap explicit cast is allowed
heap_pointer=static_cast<const int heap*>(heap[]_pointer);
For a memory type cast we can allow an incomplete cast target expression:
const int heap * heap_pointer = new int(0);
const int heap[] * heap[]_pointer = static_cast<heap[]*>(heap_pointer);
heap_pointer=static_cast<heap*>(heap[]_pointer);
template<class Tobj>
class Tauto_ptr
{
mutable Tobj heap *ptr;
public:
//return address to use
Tobj auto* get_ptr()const
{ return static_cast<auto*>(ptr); }
//return ownership
Tobj heap* move()const //it is correct "move()const"
{ Tobj heap *const tmp=ptr; ptr=0; return tmp; }
Tauto_ptr(Tobj heap* p=0):ptr(p){}
~Tauto_ptr(){delete ptr;}
};
extern int test;
void foo()
{
int tmp=0;
int heap *h_ptr1=new int(2); //ok
int heap *h_ptr2=new int[10]; //error
int heap *h_ptr3=&tmp; //error
int heap *h_ptr4=&test; //error
int auto *a_ptr1=&tmp; //ok
int auto *a_ptr2=&test; //error
int auto *a_ptr3=new int(2); //error
int auto *a_ptr4=new int[10]; //error
int *ptr1=&test; //no tests, ok
int *ptr2=&tmp; //no tests, ok
int *ptr3=new int(2); //no tests, ok
int *ptr4=new int[10]; //no tests, ok
delete[] h_ptr1; //error
delete h_ptr1; //ok
delete ptr1; //no tests, ok
}
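The implicit and explicit cast rules above can be approximated in today's C++ with a thin class around the POD pointer. The following is only a sketch under assumptions: it needs C++11, and the names tagged_ptr, heap_tag, heap_array_tag, new_object, new_array and destroy are hypothetical stand-ins for the proposed "heap" and "heap[]" keywords, not part of the proposal.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical tag types standing in for the proposed "heap" and "heap[]" keywords.
struct heap_tag {};
struct heap_array_tag {};

// Thin wrapper around a POD pointer that remembers its memory type at compile time.
template<class T, class Tag>
class tagged_ptr
{
    T* ptr_;
public:
    explicit tagged_ptr(T* p = nullptr) : ptr_(p) {}
    // Dropping the memory type is implicit, as in "const int * pointer = heap_pointer;".
    operator T*() const { return ptr_; }
    T* get() const { return ptr_; }
};

// Factory functions play the role of "new" and "new[]" returning typed pointers.
template<class T>
tagged_ptr<T, heap_tag> new_object(const T& value)
{ return tagged_ptr<T, heap_tag>(new T(value)); }

template<class T>
tagged_ptr<T, heap_array_tag> new_array(std::size_t size)
{ return tagged_ptr<T, heap_array_tag>(new T[size]()); }

// Each overload accepts only the matching memory type, so "delete" versus
// "delete[]" is selected by the type of the pointer.
template<class T> void destroy(tagged_ptr<T, heap_tag> p) { delete p.get(); }
template<class T> void destroy(tagged_ptr<T, heap_array_tag> p) { delete[] p.get(); }

// heap -> heap[] does not convert implicitly, mirroring the rules above.
static_assert(!std::is_convertible<tagged_ptr<int, heap_tag>,
                                   tagged_ptr<int, heap_array_tag> >::value,
              "heap -> heap[] must not convert implicitly");
```

Unlike the proposed keywords, this emulation cannot check "&tmp" or "&test", because the address-of operator still yields an untyped POD pointer; that part needs language support.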
User defined aliases (typedef) of new memory types.
Because C++ forbids "implicit int", we gain the ability to declare user-defined aliases for modifiers such as "const" and "volatile" and for the memory types "heap" and "auto":
typedef heap memtype;
This can be a useful refinement, because we can make traits with different modifiers and use the modifiers in other classes through one user-defined name:
struct Obj
{
typedef heap memtype;
};
struct Arr
{
typedef heap[] memtype;
};
template<class T>
struct Z
{
typedef typename T::memtype memtype;
Z memtype* get();
};
Z<Obj> z_obj;
Z<Arr> z_arr;
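Until modifiers can be aliased directly, the same selection by one user-defined name can be sketched with ordinary tag types. This is an illustration under assumptions (C++11; heap_tag, heap_array_tag and release_memory are hypothetical names), not the proposed syntax.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical tag types standing in for the proposed "heap" and "heap[]".
struct heap_tag {};
struct heap_array_tag {};

// Traits classes publish their memory type under one name, as Obj and Arr do above.
struct Obj { typedef heap_tag memtype; };
struct Arr { typedef heap_array_tag memtype; };

// Release functions selected by the memory type tag.
template<class T> void release_memory(T* p, heap_tag) { delete p; }
template<class T> void release_memory(T* p, heap_array_tag) { delete[] p; }

// Z picks up the memory type from its parameter and uses it to choose
// the matching form of delete.
template<class Traits>
struct Z
{
    typedef typename Traits::memtype memtype;
    template<class T>
    static void release(T* p) { release_memory(p, memtype()); }
};
```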
More extensions of memory types.
The extensions of C++ described in the previous sections can in fact be added immediately, but some other extensions cannot, because they introduce new memory types.
Manual control over "what kind of static data segment is used to place POD data or an object" can be useful and can resolve the ambiguity of "static": the "static" memory type can be replaced by a concrete global segment and a combination of keywords like "code, data, non_init, readonly, writeonly, readable, writeable".
Problems.
Nothing.
Operator "new" returning RAII wrappers (non-pointers) for dynamic objects.
Abstract.
Because the expression "new T" returns "T*", and the "T*" must then often be placed into an appropriate memory wrapper, we get a "two-stage" creation process. If the memory wrapper is a function parameter, a memory leak is possible when object creation fails, because "new T" and the constructor of the wrapper are not one "atomic" operation.
struct T{ T(const int); };
struct Wrapper{ Wrapper(T*); };
void foo(Wrapper,Wrapper,Wrapper);
void boo()
{
foo( new T(0), new T(1), new T(2) );
/*
can be unrolled into following sequence:
-create memory
a=new T(0),
//if T(1) throw,
//then T(0) will remain to be allocated and will be lost
b=new T(1), c=new T(2);
-create wrappers with the memory
Wrapper(a),Wrapper(b),Wrapper(c);
-call foo()
foo(Wrapper(a),Wrapper(b),Wrapper(c));
*/
}
The simplest way to solve the problem is to add a special form of the "new" expression that returns a user-defined memory wrapper instead of a POD pointer. This makes dynamic memory as safe and easy to use as auto or static memory.
Semantics.
The RAII idiom requires that most dynamic objects (objects created by "new" or "new[]") be placed into some kind of memory wrapper (auto_ptr<>, for example) immediately after creation has completed.
The memory wrappers take ownership of the obtained dynamic memory area and ensure that the memory is correctly returned to the free state.
It is better if dynamic objects are declared and created at the point of their usage, like ordinary auto temporary objects, without any explicit temporary storage.
//created at the point of its usage - better
foo(new int(0));
boo(int(0));
//created with explicit temporary storage - worse
const int *const ptr=new int(0);
foo(ptr);
const int var=0;
boo(var);
But the expression "new T" returns "T*", and the "T*" must be placed into some memory wrapper.
Syntax.
The best way could be an "operator new" overloaded in a concrete class plus an overloaded global "operator new", but C++ does not allow overloading by return value. Even if C++ allowed overloading by return value, in the concrete case of "new" it could be a source of confusion.
So we must find an explicit sign to tell the compiler what kind of "new" we are using and declaring. I think the name "new&" can be used for "new returning a user-defined memory wrapper".
This is an example of a "new&" declaration:
//empty #define while C++ does not support "heap"
//("heap[]" can not be emulated by a macro and is not used below)
#define heap
template<class T>
struct User_wrapper
{
User_wrapper(T heap *const);
};
struct T
{
static User_wrapper<T>
operator new& (T heap* ptr)
{ return User_wrapper<T>(ptr); }
T(const int);
};
typedef User_wrapper<T> ptr;
void foo(ptr,ptr,ptr);
void boo()
{
//safe now
foo(new& T(0), new& T(1), new& T(2));
}
If C++ supports some extensions for template declarations, at least for "operator new&" only, then the user can quickly redefine the memory wrapper as needed. Consider the following example:
//global "::operator new&"
template<class T>
//C++ must support reference to "::operator new&",
//instead of ::operator new& <T>
User_wrapper<T> ::operator new& (T heap *const ptr)
{return User_wrapper<T>(ptr);}
//class T "T::operator new&"
template<class T>
//C++ must support reference to "T::operator new&",
//instead of T::operator new& <T>
User_wrapper<T> T::operator new& (T heap* ptr)
{return User_wrapper<T>(ptr);}
It is easy to see that the usage of "new&" is similar to "new": just as the expression "new T(params);" is unrolled by the compiler into the pseudocode sequence "tmp=operator new(sizeof(T)), tmp->T::T(params), return tmp;", the expression "new& T(params);" can be unrolled into the pseudocode sequence "tmp=operator new(sizeof(T)), tmp->T::T(params), return operator new&(tmp);".
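The "atomic" allocate-and-wrap step can be imitated today with a function template that never exposes the POD pointer to the caller. The sketch below assumes C++11 (variadic templates and rvalue references); owning_wrapper, new_wrapped, Counted and take3 are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <stdexcept>
#include <utility>

// Hypothetical wrapper, analogous to User_wrapper<T> in the text: it owns one
// object created by "new" and deletes it in the destructor.
template<class T>
class owning_wrapper
{
    T* ptr_;
public:
    explicit owning_wrapper(T* p) : ptr_(p) {}
    owning_wrapper(owning_wrapper&& other) : ptr_(other.ptr_) { other.ptr_ = nullptr; }
    owning_wrapper(const owning_wrapper&) = delete;
    owning_wrapper& operator=(const owning_wrapper&) = delete;
    ~owning_wrapper() { delete ptr_; }
    T& operator*() const { return *ptr_; }
};

// Plays the role of "new& T(params)": allocation, construction and wrapping
// happen inside one function call, so no raw pointer is visible to the caller.
template<class T, class... Args>
owning_wrapper<T> new_wrapped(Args&&... args)
{
    return owning_wrapper<T>(new T(std::forward<Args>(args)...));
}

// Test type that counts live objects and throws for a negative argument.
struct Counted
{
    static int& alive() { static int n = 0; return n; }
    explicit Counted(int v)
    {
        if (v < 0) throw std::runtime_error("bad value");
        ++alive();
    }
    ~Counted() { --alive(); }
};

// Counterpart of foo(Wrapper,Wrapper,Wrapper) from the Abstract.
inline void take3(owning_wrapper<Counted>, owning_wrapper<Counted>, owning_wrapper<Counted>) {}
```

If one of the three constructors throws, the wrappers that were already created release their objects during stack unwinding, whatever order the compiler chose for evaluating the arguments.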
Problems.
Nothing.
How to send list of parameters into template.
Abstract.
This language property is complementary to "new&" and does nearly the same thing. It does the same (makes an "atomic" RAII wrapper for a dynamic memory area) for the case where a concrete memory wrapper (instead of its object) is passed as a parameter.
The syntax of an ordinary C++ class declaration does not allow one class (class "A") to be declared with template<class T> so that it can obtain the list of parameters of "class T", which is unknown at the point of declaration of class "A" and known only at the point of its instantiation.
Consider the following example:
//Memory wrapper
template<class T>class Tptr;
//User class
struct T{ T(params_list); };
//We can do like this
Tptr<T> tmp(new T(params_list));
/*
But we can not pass any "T::params_list" to constructor of
class Tptr<T>, in other words, we can not do, for
example, like this in order to hide memory allocation
Tptr<T> tmp(T::params_list);
*/
Due to this limitation, C++ cannot automatically ("on the fly") generate a correct constructor for the memory wrapper "Tptr<T>", which is an obstacle to easy usage of the memory wrapper.
The simplest way to solve the problem is to introduce an extension to the templated class declaration. We can extend the syntax of template<> to allow programmers to create and use their own memory management in their programs and to declare other classes for similar jobs.
Semantics.
C++ has no built-in garbage-collecting memory manager that would allow dynamic memory to be allocated without explicit deallocation. In many cases the RAII idiom allows users to implement a concrete kind of memory management with the help of user-defined classes.
But, unfortunately, C++ syntax does not allow programmers to use these user-defined classes in practical cases. We get the same problems here as in the example from the previous section:
void boo()
{
foo( new T(0), new T(1), new T(2) );
/*
can be unrolled into following sequence:
-create memory
a=new T(0),
//if T(1) throw,
//then T(0) will remain to be allocated and will be lost
b=new T(1), c=new T(2);
*/
}
I think people do not want a garbage collector as such; what they really want is a usable RAII wrapper for easy dynamic memory allocation. In the first example (in the "Abstract" section) the class Tptr<T> can be used as a substitute for a garbage collector if C++ supports the extended template<> syntax.
Syntax.
To improve the behaviour of the expression "Tptr<T> tmp(new T(params_list));" we can extend the template<> syntax. Consider:
The "define" template keyword. The template "define" must act nearly like the preprocessor "#define" directive. Consider the following example:
The default "(_empty_,_empty_)" just shows the concrete format of the default declaration of "params_list" and expands into nothing, together with its parentheses.
If "params_list" is undefined, the compiler must ignore any incomplete declarations, such as the one at point "1", as if it were written:
ifdef params_list
Tptr params_list { new T params_list ; }
endif
//the declaration
Tptr<int> tmp;
//is the same as
Tptr<int,(_empty_)> tmp;
template< class T, define params_list=(_empty_,_empty_) >
class Tptr
{
T heap* ptr;
public:
Tptr():ptr( new T params_list ){}
//If "params_list" is not defined
//compiler will ignore the following declaration
//1
Tptr params_list :ptr( new T params_list ){}
};
The "list" template keyword. Unlike the "define" template keyword, "list" expects an expression like "(expr,expr,expr)" and then removes its parentheses.
template<class T, list params_list=(_empty_,_empty_) >
class Tptr
{
T heap* ptr;
public:
Tptr():ptr( new T(params_list) ){}
//1
Tptr(params_list):ptr( new T(params_list) ){}
};
With both the "list" and "define" template keywords, we can help the compiler to bind the list of parameters to the correct type during declaration and instantiation with the help of a prefix like "class_name::[member_name::]". Consider the following example:
template< class T, define params_list=(T::(_empty_,_empty_)) >
class Tptr
{
T heap* ptr;
public:
Tptr():ptr( new T params_list ){}
Tptr params_list :ptr( new T params_list ){}
};
template< class T, list T::params_list=T::(_empty_,_empty_) >
class Tptr
{
T heap* ptr;
public:
Tptr():ptr( new T(T::params_list) ){}
Tptr(T::params_list):ptr( new T(T::params_list) ){}
};
Tptr< my_class, my_class::(1,45,6,sin(2.45),"test") > tmp;
The "list" and "define" template keywords are not macros and cannot be defined by "#define". In the case of point "1" the compiler must do the following:
for "new T(params_list)", find the concrete declaration of the constructor of class "T" (in the general case, of a member of class T) by T::params_list;
for "Tptr(params_list)", substitute the parameters of the concrete declaration of the constructor of class "T" found at the previous step.
A parameter declared with the "list" or "define" template keyword is not the same as one declared with the "class" or "typename" template keyword, and can be placed in a separate section of template parameters, like this: "template<class T><define params_list=(T::(_empty_,_empty_))> class Tptr". For a memory wrapper we do not need a lot of templated classes with the same name and an equal first section of parameters that differ in the second section only.
With the new syntax we can write:
//T::params_list? == T::(1,45,6,sin(2.45),"test")
//A
//not beauty, but can be used as single declaration
Tptr<T,(T::params_list)> tmp;
foo(
Tptr<T,(T::params_list1)>(),
Tptr<T,(T::params_list2)>(),
Tptr<T,(T::params_list3)>() );
//B
//better and suitable to define objects
Tptr<T> tmp1<(T::params_list1)>,
tmp2<(T::params_list2)>;
foo(
Tptr<T>()<(T::params_list1)>,
Tptr<T>()<(T::params_list2)>,
Tptr<T>()<(T::params_list3)> );
//C
//best, but Tptr<T>::params_list can hide T::params_list
//if no T:: prefix
Tptr<T> tmp(T::params_list);
foo(
Tptr<T>(T::params_list1),
Tptr<T>(T::params_list2),
Tptr<T>(T::params_list3) );
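Variant C can be approximated with a constructor template that accepts an arbitrary parameter list and forwards it to "new T". The sketch below assumes C++11 variadic templates, which are a different mechanism from the "list" and "define" keywords proposed here; my_class and its members are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Sketch of the Tptr<T> wrapper from the text. The constructor template
// receives any parameter list and forwards it to the constructor of T,
// so the "new" expression is hidden inside the wrapper.
template<class T>
class Tptr
{
    T* ptr;
public:
    template<class... Params>
    explicit Tptr(Params&&... params_list)
        : ptr(new T(std::forward<Params>(params_list)...)) {}
    Tptr(Tptr&& other) : ptr(other.ptr) { other.ptr = nullptr; }
    Tptr(const Tptr&) = delete;
    Tptr& operator=(const Tptr&) = delete;
    ~Tptr() { delete ptr; }
    T& operator*() const { return *ptr; }
    T* operator->() const { return ptr; }
};

// Hypothetical user class with a three-parameter constructor.
struct my_class
{
    int a;
    double b;
    std::string c;
    my_class(int a_, double b_, const std::string& c_) : a(a_), b(b_), c(c_) {}
};
```

With this, "Tptr<my_class> tmp(1, 2.5, "test");" creates the dynamic object without a visible "new", which is close to what variant C asks for.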
Problems.
Nothing.
Explicit constructor in C-style array declaration.
Abstract.
There are many cases where we need a concrete number of objects allocated, and the number of objects will not change during the object lifetime. The way to do this is a C-style array. We can then write an RAII wrapper for the array, so that the array becomes copyable and suitable as a member of other classes.
Unfortunately, C++ does not allow a C-style array to be created with any constructor other than the default constructor (short of writing an initializer for every element by hand). I do not see any reason for the limitation. A C-style array is useful because it is the only way to make an intrusive array allocated, for example, in auto memory, without slow dynamic memory allocation/deallocation.
The simplest way to avoid the problem is to declare a concrete constructor for C-style arrays and use it during creation.
Semantics.
A C-style array allocates one memory area of the total size of the array. Objects are placed in the area one by one, correctly aligned. After the memory has been allocated, the default constructor is called for each object in the array.
It is easy to see that, instead of the default constructor, any concrete constructor (but only one for all) can be called for all objects in the array. The compiler can easily generate internal functions to create an array with any concrete constructor.
There are many objects that normally must not be created with the default constructor. Due to the C-style array limitation, the programmer must use an explicit "two-stage" creation sequence, which is slow and unsuitable here: during the first stage, memory is allocated and the default constructor is called; then, during the second stage, a user-defined member "construct" is called for each object of the array.
Syntax.
We must tell the compiler which constructor to use in each C-style array declaration that is combined with the array creation. The simplest form, "type_name variable_name(params)[size];", cannot be used, because it looks like a function declaration. If we move "(params)" next to type_name, like this: "type_name(params) variable_name[size];", there is again confusion with a function, and "new[]" has no barrier such as "variable_name": "new type_name?(params)?[size];".
So we need a special mark to declare the constructor used for creation. There are two ways:
// -1-
//use "=()"
type_name variable_name[size]=type_name(params);
=new type_name[size]=type_name(params);
Here "new" returns an r-value POD pointer, so normally this
statement cannot be reserved for assignment; also,
"type_name[size]" and "type_name(params)" have incompatible
types. But we can forbid this form of syntax for the "new" expression
and keep it only for variable declarations.
// -2-
//use "([" and "])"
type_name variable_name([params])[size];
=new type_name([params])[size];
Example:
void foo()
{
int tmp[40];
//the following tmp? are equal to "int tmp[40];"
int tmp0[40]=int(0);
int tmp1[40]=int(1);
int tmp2([2])[40];
int tmp3([3])[40];
int (*ptr1)[40]=new int[1][40]=int(4);
int (*ptr2)[40]=new int([5])[1][40];
}
template<class T>
class X
{
static T ext1[23];
static T ext2[23]=T(3,5,6);
T int([3,5,6])[23];
};
template<class T>
T X<T>::ext1[23]=T(3,5,6);
template<class T>
T X<T>::ext2[23];
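The internal function that the compiler would generate for such declarations can be imitated by hand with raw storage and placement new. The sketch below assumes C++11; fixed_array and Point are hypothetical names, and the class only illustrates "one constructor for all elements" without dynamic memory, not the proposed syntax.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical fixed-size array whose elements are all built with the same
// constructor arguments, without a default constructor and without heap use.
template<class T, std::size_t N>
class fixed_array
{
    // Raw, correctly aligned storage; no T is constructed here yet.
    alignas(T) unsigned char storage_[N * sizeof(T)];
    std::size_t built_;

    T* slot(std::size_t i) { return reinterpret_cast<T*>(storage_) + i; }
    void destroy_built() { while (built_ > 0) slot(--built_)->~T(); }
public:
    template<class... Params>
    explicit fixed_array(const Params&... params) : built_(0)
    {
        try {
            // One constructor, called for every element in order.
            for (; built_ < N; ++built_)
                ::new (static_cast<void*>(slot(built_))) T(params...);
        } catch (...) {
            destroy_built(); // roll back the elements built so far
            throw;
        }
    }
    fixed_array(const fixed_array&) = delete;
    fixed_array& operator=(const fixed_array&) = delete;
    ~fixed_array() { destroy_built(); }

    T& operator[](std::size_t i) { return *slot(i); }
    std::size_t size() const { return N; }
};

// Element type without a default constructor.
struct Point
{
    int x, y;
    Point(int x_, int y_) : x(x_), y(y_) {}
};
```

A declaration such as "fixed_array<Point, 23> arr(3, 5);" then corresponds roughly to the proposed "Point arr([3,5])[23];".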
Problems.
Nothing.
Auto_ptr idiom.
Abstract.
The auto_ptr is an RAII memory wrapper that can be implemented as a free-standing class with very light methods, whose internal state holds only a POD pointer to the allocated memory. We do not want auto_ptr to have any overhead, so auto_ptr has no extra runtime data in addition to the POD pointer.
I think more than 90% of all dynamic memory operations can be covered by this highly efficient memory wrapper, so complete support of the wrapper (of the auto_ptr idiom) is a very important part of the language.
But to implement auto_ptr, the language must support some things:
Problems.
There are some problems with implementing auto_ptr in C++.
Unlike auto_ptr itself, there can be many POD pointers holding the same address (the one that has been placed in an auto_ptr), and we can use these pointers while the auto_ptr exists.
But to implement auto_ptr, the language must support the following things:
That is why we can always try to access a "moveable" object while it is in an incorrect state. Logically such an attempt is always an error, but a logically destroyed object is technically still accessible. Fortunately, the compiler can control correct access to a "moveable" at compile time with the help of extra compile-time attributes.
So to support auto_ptr, C++ must also define extra attributes to control access to a "moveable".
Only then can auto_ptr be implemented in C++ and used safely and efficiently.
"Move", "Copy" and "Swap" operations.
Abstract.
There are at least two operations whose right-hand operand always has a mutable internal state: "move" and "swap". They are very popular operations, required for RAII wrappers and efficient data exchange, but unfortunately they have no C++ operators of their own, so I think C++ must declare such operators.
The appearance of the new operations "move" and "swap" also forces us to introduce an explicit "copy" operation, to help the user with the interaction of copyable and moveable data types (see "moveable concept" below).
Semantics.
The following example shows "assignment" overloaded as "move":
struct A
{
mutable int i;
A& operator= (const A& src)
{
if( this != &src )
{
i=src.i;
src.i=0;
}
return *this;
}
};
One can see that the user may need both behaviours together, "move" and "ordinary assign", so we need special syntax to express the desire to do "exactly move".
Also, most people associate "operator=" with "copy assignment" rather than with "move assignment". Consider:
A a,b;
// many people will expect here
// that "b" will be unchanged
a = b;
// but the expression is unrolled to
{
a::i=b::i, //ok
b::i=0; //error - unexpected assignment to "b" from invisible source
}
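The surprise is easy to reproduce with a compilable copy of the struct above. In this small sketch the helper assign_from is a hypothetical name added only to show that the call site looks like a harmless copy through a const reference.

```cpp
#include <cassert>

// The struct from the text: "operator=" takes a const reference but still
// empties the source through the mutable member.
struct A
{
    mutable int i;
    A& operator= (const A& src)
    {
        if( this != &src )
        {
            i=src.i;
            src.i=0; // hidden write to the right-hand side
        }
        return *this;
    }
};

// Helper for the demonstration (not part of the original text): the source
// is passed as const, yet it is modified by the assignment inside.
inline void assign_from(A& dst, const A& src) { dst = src; }
```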
The new syntax of "operator move" can be similar to "a=b" but clearer, so that we will not expect "b" to remain unchanged.
Properties of "operator move":
The question is: if we introduce "operator move", do we perhaps also need "operator copy", to cover all possible operations ("move" and "copy") explicitly?
On the one hand, we do not need it, because "ordinary assign" does exactly copying.
On the other hand, when working with moveable data types later, one can see that we also need an operator "assignment by class declaration" in addition to "operator move" and "operator copy", especially for templates.
In other words, we have three operators here, not two:
It is useful to reserve "operator=" as "assignment by class declaration" instead of "operator copy", because (see "moveable concept" below) the declaration of the destination variable is used as the selector of the move or copy kind of assignment that will be applied. Consider:
template<class T>
T foo(T& a, const T& b)
{
// here "a=b;" means
// assignment by declaration of "a"
return a=b;
}
template<class A>
void foo()
{
A ca, cb;
moveable A ma, mb;
// here "a=b;" instantiated
// as operator "copy assignment" (a,b)
foo<A>(ca,cb);
// here "a=b;" instantiated
// as operator "move assignment" (a,b)
foo<moveable A>(ma,mb);
}
So if we want to express "always copy" in the templated foo<>, we need the syntax of a separate "operator copy", similar to the syntax of "operator move".
Also, when working later with the interaction of moveable and copyable data types (for more details see "moveable concept" below), one can see that, unlike a constructor, an assignment can be applied far away from the declaration of its destination variable. In some implementation we can find code like this:
foo(a);
foo(a);
a=b;
foo(b);
Well, can you say, looking at the assignment, whether copy assignment or move assignment will be applied? You need to find the declaration of the variable "a". This is not good, because the code looks vague and is hard to read in small parts.
So, in spite of the fact that a "move constructor" will not be expressed differently from a "copy constructor", for the two reasons listed above we need syntax to express "move assignment" and "copy assignment" explicitly.
The properties of "operator copy" are the same as those of "operator move".
The following example shows "assignment" overloaded as "swap":
struct A
{
mutable int i;
void operator =(A& src)
{
if( this != &src )
{
const int tmp=i;
i=src.i;
src.i=tmp;
}
}
};
Properties of "operator swap":
Pointers, indirect access to data and moveable data types.
template<class Tobj>
class Tobject
{
Tobj heap* ptr;
public:
typedef heap memtype;
void detach(){ ptr=0; }
void destroy(){ delete ptr; ptr=0; }
void assign(Tobj heap* p){ ptr=p; }
void replace(Tobj heap* p)
{
if( p != ptr )
{
delete ptr;
ptr=p;
}
}
Tobj heap* get()const { return ptr; }
Tobj heap* move()
{
Tobj heap* const tmp=ptr;
ptr=0;
return tmp;
}
public:
Tobject(Tobj heap* p=0):ptr(p){}
};
In the example:
It is easy to see that, regardless of whether the type "Tobj" that "ptr" points to is moveable or copyable, for each logical group of methods Tobject<> declares at least a pair of methods with different names and implementations:
In the example, Tobject<> is a copyable class, but its data can be moveable only. The concrete "copy behaviour" of Tobject<> can be defined by an external class, and auto_ptr is one kind of such an external class.
It is easy to see that "detach" and "replace" of Tobject<> are a kind of "syntactic sugar" and can be implemented as "move" and "destroy" plus "assign". "move" is syntactic sugar too, because it can be implemented as "get" plus "assign".
Dividing the methods into base and derived ones is not only possible; the division even makes it look as if the current set of operators "operator=", "operator*" and "operator->" is enough to implement all we need.
But this opinion is wrong, because a concrete implementation of the "copy behaviour" of Tobject<> may need both methods of any group together, and at the same time the implementation can be an ordinary data type, and an ordinary data type may require suitable operators for each method, as a copyable data type does.
For some moveable classes that have clearly separated internal and external state, and for copyable pointers or iterators pointing to moveable data, the operation of extracting the internal state ("move") is also useful.
Note that the "move operation" of Tobject<> considered here is unrelated to the pointer itself and relates only to the data the pointer points to (just as "operator*" relates to the data the pointer points to), so this "move operation" is not like the "binary move operation" considered in the previous clause. Maybe it would be better to call this operation of Tobject<> "extract", but the name "move" has become familiar to me.
Consider the following example of auto_ptr, implemented while C++ has no moveable support (it seems to me that auto_ptr collects most of the possible difficulties of implementing a moveable data type).
template<class Tobj, class Tmemory=Tobject<Tobj> >
class Tauto_ptr
{
mutable Tmemory memory;
//info
public:
typedef typename Tmemory::memtype memtype;
char is_owner()const { return memory.get()? 1:0; }
//access
public:
Tobj auto* get_ptr()const
{
return static_cast<Tobj auto*>(memory.get());
}
Tobj& get_obj()const
{
Tobj auto* const tmp=get_ptr();
if(!tmp)safe_throw<err_zptr>();
return *tmp;
}
Tobj memtype* move()const { return memory.move(); }
Tobj copy()const { return *memory.get(); }
public:
Tobj& operator* ()const { return get_obj(); }
Tobj auto* operator-> ()const { return &get_obj(); }
Tobj& operator() ()const { return get_obj(); }
Tobj auto* operator() (const int)const { return get_ptr(); }
//assign
public:
Tauto_ptr& assign(Tobj heap* const new_adr)
{
memory.assign(new_adr);
return *this;
}
Tauto_ptr& replace(Tobj heap* const new_adr)
{
memory.replace(new_adr);
return *this;
}
public:
Tauto_ptr& operator= (Tobj heap* const new_adr)
{
return replace(new_adr);
}
//creation
public:
Tauto_ptr(Tobj memtype* p=0):memory(p){}
//copying
public:
Tauto_ptr(const Tauto_ptr& obj):memory(obj.move()){}
Tauto_ptr& operator= (const Tauto_ptr& obj)
{
if( &obj != this ){ replace(obj.move()); }
return *this;
}
//destroying
public:
void detach()const { memory.detach(); }
void destroy()const { memory.destroy(); }
~Tauto_ptr(){ memory.destroy(); }
};
To learn why "move", "detach" and "destroy" have been declared const for Tauto_ptr, see NOTE: "Non-const" is not the same as "moveable" below. Note that these operations are non-const for Tmemory, because Tmemory is a low-level class for working with a concrete kind of memory and is not meant to be a substitute for auto memory.
It is hard to implement "compile time attributes for moveable" (see "compile time attributes for moveable" below) if a single "operator*" is used for both copy and move. In other words, we need a new "operator extract" in addition to "operator*" in order to support compile-time attributes of a moveable data type.
In the example auto_ptr implementation one can see the method "copy". This method may also require its own "operator copy extract". The other big reason to introduce "operator move extract" and "operator copy extract" is the interaction of copyable and moveable data types.
Similarly to the arguments considered earlier for introducing binary "move assignment" and "copy assignment", we need unary "operator move extract" and "operator copy extract". The existence of the three operators allows the user to declare (and overload) three return values:
return reference "unary operator *" - to access the data the pointer points to
return value "unary operator move" - to move the data the pointer points to
return value "unary operator copy" - to make a copy of the data the pointer points to
Properties of "unary operator move":
Properties of "unary operator copy":
See "moveable concept" below to learn more about the operators.
Syntax.
We can use the following forms for the operators (there is no ambiguity). It is interesting that the forms of the operators reflect the fact that "move", "copy" and "swap" are different sides of one thing.
T& T::operator<-<(const T& right);
T& operator<-<(T& left, const T& right);
a<-<b;
T& T::operator<+<(const T& right);
T& operator<+<(T& left, const T& right);
a<+<b;
void T::operator<->(T& right);
void operator<->(T& left, T& right);
a<->b;
Pointers, indirect access to moveable data.
It is hard to find a short form for "unary operator copy", but for "unary operator move" there is a beautiful one. The long forms of the "unary operators" have the same syntax as their "binary operators". The short form of "unary operator move" is just an alias, so a user class can declare only one of them.
T T::operator<(); //shortest, can be ambiguous with binary <
T operator<(const T&);
T T::operator<-<(); //no ambiguity here
T operator<-<(const T&);
a=<b;
a=<-<b;
T T::operator<+<(); //no ambiguity here
T operator<+<(const T&);
a=<+<b;
Summary of operators:
== Binary operators ==
operator assignment by destination =
operator move assignment <-<
operator copy assignment <+<
operator swap <->
== Unary operators ==
operator unary move (short form) <
operator unary move <-<
operator unary copy <+<
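Until such operators exist, the three binary operations can be spelled as named member functions. The sketch below is only an illustration under assumptions: the class Buffer and the names copy_from, move_from and swap_with are hypothetical, C++11 is assumed for nullptr, and the source of "move" is taken by const reference with a mutable member, as in the forms above.

```cpp
#include <cassert>

// Hypothetical class with separately named operations standing in for the
// proposed operators: copy_from for "<+<", move_from for "<-<", swap_with for "<->".
class Buffer
{
    mutable int* data_; // mutable, so that "move" can take a const source
public:
    explicit Buffer(int v) : data_(new int(v)) {}
    Buffer(const Buffer& other)
        : data_(other.data_ ? new int(*other.data_) : nullptr) {}
    ~Buffer() { delete data_; }

    // "a <+< b": b is left unchanged.
    Buffer& copy_from(const Buffer& right)
    {
        if (this != &right) {
            int* fresh = right.data_ ? new int(*right.data_) : nullptr;
            delete data_;
            data_ = fresh;
        }
        return *this;
    }
    // "a <-< b": b gives up its internal state and becomes empty.
    Buffer& move_from(const Buffer& right)
    {
        if (this != &right) {
            delete data_;
            data_ = right.data_;
            right.data_ = nullptr;
        }
        return *this;
    }
    // "a <-> b": both objects remain usable afterwards.
    void swap_with(Buffer& right)
    {
        int* tmp = data_;
        data_ = right.data_;
        right.data_ = tmp;
    }
    // "a = b": assignment chosen by the class declaration; this class is copyable.
    Buffer& operator=(const Buffer& right) { return copy_from(right); }

    bool empty() const { return data_ == nullptr; }
    int value() const { return *data_; }
};
```

Named functions make the intent visible at the point of use, which is the readability argument made for explicit "move assignment" and "copy assignment" above; what they cannot give is the compile-time control over a moved-from object.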
Problems.
Nothing.
"Move", "Copy" and "Swap" operations - Appendix
NOTE: "Non-const" is not the same as "moveable".
It is very important to see the differences between "non-const" and "moveable". By comparing the operations "move" and "swap" we can easily find them.
I think some people try to declare a moveable (auto_ptr, for example) as non-const in the hope of warning the user that the moveable undergoes an unexpected, hidden assignment to its internal state after the "move" operation (with "move" implemented by overloading the "copy" operation) has executed.
Non-const can be used for any purpose, but since non-const cannot protect a moveable from "double move", const must be used for its standard purpose: protecting the external state of the moveable from explicit changes (from assignment, for example). Consider the differences:
template<class T>
void swap_1(T& lhs, T& rhs)
{
const T dummy ( <lhs );
//ok
//here "const T" protect us from wrong assignment to temporary
dummy= <rhs; //compile time error
lhs= <rhs;
rhs= <dummy;
//error
//here compiler must generate error
//independent from constness of dummy
//because "double move" is "once" violation
<dummy;
}
template<class T>
void swap_2(T& lhs, T& rhs)
{
T dummy ( <lhs );
//error
//here compiler must generate error
//but "non-const T" can not protect us from wrong assignment to temporary
dummy= <rhs; //ok
lhs= <rhs;
rhs= <dummy;
//error
//here compiler must generate error
//independent from constness of dummy
//because "double move" is "once" violation
<dummy;
}
The fact that "const dummy" is not strictly required for the function "swap" does not mean that no other function strictly requires a const moveable.
In contrast to "operator move", "operator swap" really does require non-const arguments, because (unlike "move") we are going to use the swap arguments after "operator swap" has executed.
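For comparison, here is a minimal sketch of the same three-step swap in standard C++11, where std::move stands in for the proposed unary "<" operator. The names swap_by_move and demo_swap are hypothetical, and std::unique_ptr is used only because its moved-from state is well defined. Note that a standard compiler does not diagnose a second move from "dummy"; that check is exactly what the proposal asks for.

```cpp
#include <memory>
#include <utility>

// Swap written as three explicit moves (standard C++11, not the proposed syntax).
template <class T>
void swap_by_move(T& lhs, T& rhs)
{
    T dummy(std::move(lhs));   // lhs is now moved-from
    lhs = std::move(rhs);      // rhs is now moved-from
    rhs = std::move(dummy);    // dummy is now moved-from
    // Using dummy again here would be the "double move" described in the text;
    // a standard compiler accepts it silently.
}

// Hypothetical helper used only for the demonstration.
inline int demo_swap()
{
    std::unique_ptr<int> a(new int(1));
    std::unique_ptr<int> b(new int(2));
    swap_by_move(a, b);        // both arguments stay usable after the swap
    return *a * 10 + *b;       // a now holds 2, b now holds 1
}
```

Both arguments must be non-const here because both are assigned to and then used again, which matches the remark about "operator swap".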
Compile time attributes to control moveable state.
Abstract.
The previous chapters show that a "moveable" needs extra compile time attributes to control its state. This is because a "moveable" object can be put into an incorrect state (for example, after a "move" operation), but (unlike a "copyable") the object will not be deleted immediately after that.
Fortunately, without runtime overhead, a C++ compiler can check correct moveable usage at compile time in about 90% of cases with the help of compile time class member attributes such as: "once, ctor, dtor, after, befor, probe, level".
Semantics & Syntax.
The keyword "once" cannot guarantee compile time control in all cases, but the compiler can do its best and at least produce a warning that "once" control is lost here. Consider:
//let's work here only with objects created by "new T"
template<class T>
struct Tptr
{
mutable T heap* ptr;
T auto* get_ptr()const
{
return static_cast<auto*>(ptr);
}
T auto* operator *()const { return get_ptr(); }
T heap* move()const once
{
T heap *const tmp(ptr);
ptr=0;
return tmp;
}
T heap* operator < ()const once { return move(); }
};
template<class T>
void boo(const Tptr<T>&);
template<class T>
char woo(const Tptr<T>& ptr){ return (*ptr)? 1: 0; }
using ::is_ok;
using lib::all_ok;
template<class T>
void foo(const Tptr<T> ptr)
{
//ok
if( ::is_ok() ){ ptr.move(); }
//warning - possible double move();
//if "::is_ok()" is not synced with "lib::all_ok()"
if( !lib::all_ok() ){ ptr.move(); }
//warning - compiler lost control for "move()const once"
//due to hidden body of "boo<T>"
boo<T>(ptr);
//ok, due to accessible body of "woo<T>"
woo<T>(ptr);
//warning - more than possible double move();
//if "::is_ok()" or "lib::all_ok()" was true
ptr.move();
//error - double move();
ptr.move();
}
/*
to avoid the first warning
//warning - possible double move();
we need write like this
if( ::is_ok() ){ ptr.move(); }
else
{
if( !lib::all_ok() ){ ptr.move(); }
}
*/
Can the keyword "once" be used to distinguish overloads, as in "Tobj heap *const once move()const once;"? I think not.
//ok
Tobj heap *const move()const once;
//i think error - redeclaration of "move()const"
Tobj heap *const move()const;
//ok together with "move()const once"
Tobj * /*once*/ move()once;
//i think error - redeclaration of "move()"
Tobj * move();
Evidently, with "once" we can replace some runtime tests with compile time tests.
const Tptr<T> a(new T);
a.move();
a.move(); //ok - the error detected
a.get_ptr(); //wrong - the error not detected
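To make the trade-off concrete, here is a minimal sketch of the runtime test that "once" is meant to replace. It is written in standard C++11; the class OncePtr and the helper second_move_is_rejected are hypothetical names, and the flag check is an assumption about how such a test would usually be written by hand.

```cpp
#include <stdexcept>

// Runtime emulation of "move() once": a second call is rejected when it
// executes, instead of being rejected by the compiler.
template <class T>
class OncePtr
{
    T* ptr_;
    bool moved_;
public:
    explicit OncePtr(T* p) : ptr_(p), moved_(false) {}
    ~OncePtr() { delete ptr_; }
    OncePtr(const OncePtr&) = delete;
    OncePtr& operator=(const OncePtr&) = delete;

    // Hands the pointer to the caller; allowed only once per object.
    T* move()
    {
        if (moved_) throw std::logic_error("double move");
        moved_ = true;
        T* tmp = ptr_;
        ptr_ = nullptr;
        return tmp;
    }
};

// Hypothetical helper: returns true when the second move() is rejected.
inline bool second_move_is_rejected()
{
    OncePtr<int> a(new int(7));
    delete a.move();                 // first move: fine
    try { a.move(); }                // second move: detected only at run time
    catch (const std::logic_error&) { return true; }
    return false;
}
```

The flag costs memory and a branch on every call, and the error appears only when the faulty path actually runs; the "once" attribute aims to remove both costs.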
Let's extend Tptr class from previous example:
template<class T>
class Tptr
{
mutable T heap* ptr;
public:
T auto* get_ptr()const
{
return static_cast<auto*>(ptr);
}
T auto* operator *()const { return get_ptr(); }
T heap* move()const once
{
T heap *const tmp(ptr);
ptr=0;
return tmp;
}
T heap* operator < ()const once { return move(); }
Tptr& replace(T heap *const data)
{
//(ptr!=0) is rare condition
if(data != ptr)if(ptr)delete ptr;
ptr=data;
return *this;
}
Tptr& operator= (T heap *const data){ return replace(data); }
public:
Tptr(T heap *const p=0):ptr(p){}
//(ptr!=0) is rare condition
~Tptr(){ if(ptr)delete ptr; }
//moveable semantic - see below
Tptr(const Tptr@ &obj):ptr(<obj){}
Tptr& operator=(const Tptr@ &obj)
{
if( &obj != this)replace(<obj);
return *this;
}
private:
//copyable semantic - disabled
Tptr(const Tptr& obj);
Tptr& operator=(const Tptr& obj);
};
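As a point of comparison, the "copy disabled, move enabled" shape of this class can be sketched in standard C++11 with deleted copy operations and an rvalue-reference move constructor. This is not the proposed "@" syntax; Holder and demo_holder are hypothetical names, and the static_assert lines only confirm what the declarations say.

```cpp
#include <type_traits>
#include <utility>

// A holder that can be moved but not copied (standard C++11 sketch).
template <class T>
class Holder
{
    T* ptr_;
public:
    explicit Holder(T* p = nullptr) : ptr_(p) {}
    ~Holder() { delete ptr_; }

    // moveable semantic
    Holder(Holder&& other) noexcept : ptr_(other.ptr_) { other.ptr_ = nullptr; }
    Holder& operator=(Holder&& other) noexcept
    {
        if (this != &other) { delete ptr_; ptr_ = other.ptr_; other.ptr_ = nullptr; }
        return *this;
    }

    // copyable semantic - disabled
    Holder(const Holder&) = delete;
    Holder& operator=(const Holder&) = delete;

    T* get() const { return ptr_; }
};

static_assert(!std::is_copy_constructible<Holder<int> >::value, "copy is disabled");
static_assert(std::is_move_constructible<Holder<int> >::value, "move is enabled");

// Hypothetical helper: moves a into b and reads the value through b.
inline int demo_holder()
{
    Holder<int> a(new int(5));
    Holder<int> b(std::move(a));
    return (a.get() == nullptr) ? *b.get() : -1;
}
```

After the move, "a" is still a named, accessible object holding a null pointer, and nothing stops later code from calling a.get() on it.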
Now, I think, we need to tell to compiler that
Let's assume that once "move" has been executed, its Tptr<T> is logically deleted: no operation "returning internal state" can be applied, only the destructor or a new assignment.
In fact, "move" is logically a real destructor with its own name, and this destructor can return a value. So instead of "once" we need to use "dtor" and "ctor".
Assume that "returning state" is the default attribute. Standard constructors and the destructor have the ctor/dtor attributes by default; assignment does not. For Tptr<T> we then get:
//can "const dtor" pair be applied or not?
//i think yes, because dtor can destroy const objects
T heap* move()const dtor;
T heap* operator < ()const dtor;
//can "const ctor" pair in theory be applied or not?
//i think no, because ctor can not re-create objects
Tptr& replace(T heap *const data)ctor;
Tptr& operator= (T heap *const data)ctor;
Tptr& operator=(const Tptr@ &obj)ctor;
Now everything is nearly excellent:
{
const Tptr<T> a(new T);
a.get_ptr(); //ok - get
a.get_ptr(); //ok - get
delete a.get_ptr(); //ok - compile time error detected
delete a.move(); //ok - destroy
a.move(); //ok - compile time error detected
a.get_ptr(); //ok - compile time error detected
a.replace(new Tobj); //ok - compile time error detected
Tptr<T> b(new T);
b.replace(new Tobj); //ok - new value
b.replace(new Tobj); //ok - new value
b.move(); //ok - destroy
b.move(); //ok - compile time error detected
b.get_ptr(); //ok - compile time error detected
b.replace(new Tobj); //ok - new value
T tmp;
b.replace(&tmp); //ok - compile time error detected
}
Tptr<T> a;
a.get_ptr(); //wrong - error not detected
We must tell the compiler that the variable "a" must be put into a correct state before use, but we cannot leave "unassigned values" after default ctors. To solve the problem we can mark the required sequence of function calls with the attributes "after" and "befor". To be able to test the state at run time (bypassing compile time protection) we can mark a member with the attribute "probe".
In the example we can create a function that must be called before any other function returning internal state can be called:
protected:
void has_assigned(){}
public:
//member with "probe" ignore compile time state
char is_ptr_assigned()const probe { return ptr? 1:0; }
T auto* get_ptr()const after(has_assigned);
//for operator * "after(has_assigned)" can be skipped
//due to get_ptr() call
T auto* operator *()const { return get_ptr(); }
//The expression "after(has_assigned||has_assigned)"
//is example of multiple conditions
T heap* move()const dtor after(has_assigned||has_assigned);
//for operator < "after(has_assigned)" can be skipped
//due to move() call
T heap* operator < ()const dtor { return move(); }
Tptr& replace(T heap *const data)ctor
{
//(ptr!=0) is rare condition
if(data != ptr)if(ptr)delete ptr;
ptr=data;
has_assigned();
return *this;
}
Tptr& operator= (T heap *const data)ctor { return replace(data); }
public:
//here we can avoid problem with "has_assigned"
Tptr():ptr(0){}
Tptr(T heap *const p):ptr(p){ has_assigned(); }
public:
//but the important case can not be resolved at compile time
//so we are forced to assume that we will not pass obj
//in incorrect state or do manual runtime tests
//see "moveable concept" below to solve the problem
Tptr(const Tptr@ &obj):ptr(<obj){has_assigned();}
//the same as "Tptr(const Tptr@ & obj)"
Tptr& operator=(const Tptr@ &obj)ctor
{
if( &obj != this)replace(<obj);
has_assigned();
return *this;
}
Now everything is a little better:
Tptr<T> a;
a.get_ptr(); //ok - compile time error detected
a.replace(new Tobj); //ok - new value
a.get_ptr(); //ok - get
protected:
void has_assigned()ready {}
public:
T auto* get_ptr()const after("ready");
To avoid dummy functions such as "has_assigned" (whose call can be conditional at run time), we can write:
public:
T auto* get_ptr()const after("ready");
//ctor: can be called for destroyed object
//ready: also turn object into ready state
Tptr& replace(T heap *const data)ctor ready;
public:
Tptr():ptr(0){}
//Tptr(T heap *const p):ptr(p){ has_assigned(); }
Tptr(T heap *const p):ptr(p) ready{}
//"test" can be called from levels: "from1", "from2" or "from3"
//and turn object into level: "to"
T heap* test() level(from1,from2,from3;to);
//"move" can be called from any level
//and turn object into level: "0"
T heap* move() level(...;0);
/*
Predefined equivalence between "level" and "ctor,dtor,after,befor"
in last example:
deleted: level 0 == dtor
created: level 1 == ctor
ready: level 2 == ready, after (has_assigned)
*/
It is easy to see that "level" suits machines better than humans.
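The three levels in the table above can be read as a small state machine. The sketch below emulates it at run time in standard C++11, assuming the mapping deleted = 0, created = 1, ready = 2 from the comment, and assuming that "move" requires the ready state as in the earlier "after(has_assigned)" example. LevelPtr, Level and demo_levels are hypothetical names; the proposal would perform these checks at compile time instead.

```cpp
#include <stdexcept>

enum class Level { Deleted = 0, Created = 1, Ready = 2 };

template <class T>
class LevelPtr
{
    T* ptr_;
    Level level_;
public:
    LevelPtr() : ptr_(nullptr), level_(Level::Created) {}       // ctor
    explicit LevelPtr(T* p) : ptr_(p), level_(Level::Ready) {}  // ctor ready
    ~LevelPtr() { delete ptr_; }
    LevelPtr(const LevelPtr&) = delete;
    LevelPtr& operator=(const LevelPtr&) = delete;

    // "probe": may be called in any state
    Level level() const { return level_; }

    // after("ready"): allowed only in the ready state
    T* get_ptr() const
    {
        if (level_ != Level::Ready) throw std::logic_error("get_ptr: not ready");
        return ptr_;
    }

    // dtor: requires ready, leaves the object logically deleted
    T* move()
    {
        if (level_ != Level::Ready) throw std::logic_error("move: not ready");
        T* tmp = ptr_;
        ptr_ = nullptr;
        level_ = Level::Deleted;
        return tmp;
    }

    // ctor ready: allowed in any state, leaves the object ready
    LevelPtr& replace(T* data)
    {
        if (data != ptr_) delete ptr_;
        ptr_ = data;
        level_ = Level::Ready;
        return *this;
    }
};

// Hypothetical helper walking through created -> ready -> deleted.
inline bool demo_levels()
{
    LevelPtr<int> a;
    bool rejected_before_assign = false;
    try { a.get_ptr(); } catch (const std::logic_error&) { rejected_before_assign = true; }
    a.replace(new int(3));
    bool ok_after_assign = (*a.get_ptr() == 3);
    delete a.move();
    bool rejected_after_move = false;
    try { a.get_ptr(); } catch (const std::logic_error&) { rejected_after_move = true; }
    return rejected_before_assign && ok_after_assign && rejected_after_move
           && a.level() == Level::Deleted;
}
```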
int after=0;
double ready=0.0;
struct A
{
void test() ready;
void after() dtor after(test);
};
Problems.
Nothing.
Moveable concept.
Abstract.
We need the "copyable" and "moveable" concepts fully integrated into C++.
For example, we have four different kinds of parameters:
and for them we must have:
Semantics & Syntax.
Some think that T(T&) is suitable for copyable objects; others disagree and think that a moveable must always be declared as non-const, so that the non-const copy constructor T(T&) is probably suitable for moveable objects.
This kind of "undefined behaviour" and confusion must be eliminated. We must state that (the sign "@" was chosen arbitrarily, only to denote a moveable type):
"type(const type& src)" is only copy constructor
"type(const type@ &src)" is only move constructor
and in the following expressions:
type a;
type tmp(a);
we must never require "a" to be non-const, because during copying we must not assign to the source or make other changes to its external state (that contradicts the logical goal of "copy"). All variations of its internal state must be hidden by mutable; if that is impossible, the "type" is not copyable.
I think T(const T&) must hide T(T&), and T(T&) must not be used and must not exist if T(const T&) has been declared.
The same thing with "move constructor":
type a;
type @tmp(a);
and T(const T@ &) must hide T(T@ &).
The following statements (the sign "@" was chosen arbitrarily, only to denote a moveable type) describe all possible situations:
//copyable variable
type var;
//moveable variable
type @var;
//reference to copyable variable
type &var;
//reference to moveable variable
type @&var;
//parameter: copyable variable passed by value
void foo([const] type name);
//parameter: moveable variable passed by value
void foo([const] type @name);
//parameter: copyable variable passed by reference
void foo([const] type& name);
//parameter: moveable variable passed by reference
void foo([const] type@ &name);
We want to use the "move constructor" as well as the "copy constructor"; that is why we are forced to mark a variable to define which kind of constructor must be used to create a new value. Consider:
template<class type>
//parameter: moveable variable passed by value
type@ foo(const type @param)
{
type a; //default ctor used
type @b; //the same as a
type c(param); //copy ctor used if exist else error
type @d(param); //move ctor used
type @e(param); //error - double move
return d; //move ctor used
}
We write type @& and type @* instead of type &@ and type *@, so a type attribute such as "const" or "moveable" must be placed on the left side of the "&" or "*" sign, not on the right side.
Instead of the sign "@", we can declare the "moveable type attribute" directly with a keyword "moveable", similar to the keyword "const" for constness; the previous example then looks like this:
template<class type>
//parameter: moveable variable passed by value
type moveable foo(const type moveable param)
{
type a; //default ctor used
type moveable b; //the same as a
type c(param); //copy ctor used if exist else error
type moveable d(param); //move ctor used
type moveable e(param); //error - double move
return d; //move ctor used
}
struct A
{
class internal_state;
internal_state operator< ()const dtor;
/*
Let's imagine that "constructor from value" as "A(const A)" can exist.
In nature the kind of constructor can not exist due to recursive loop
without exit: new A (needs new A (needs new A))
Assuming that for "A(const A@ obj)" for "obj" compiler generated
constructor will be called befor user defined constructor will be
executed: new A ( compiler generated new A )
We are doing like this here to pay attention only to compile time attribute
tracing, because for parameter "passed by reference" it is much harder
to trace the compile time attributes, than for parameter "passed by value".
*/
A(const A@);
A(const internal_state&);
};
A a;
A @b(<a); // (1) // A(const internal_state& src);
A @c(b); // (2) // A(const A@ src);
In both cases "1" and "2" the compiler must mark "src" as if "dtor" had been executed. That is why I think that in case "2" the compiler must mark "src" as if "dtor" had been executed regardless of the implementation of A(const A@ src) (the mark is applied after the implementation of A(const A@ src) has executed).
The same applies to the assignment operator=(const A@).
This makes it possible to create reliable templates suitable for any "moveable".
That is why we must solve the problem of a reliable "reference to moveable": the compiler must support the compile time attributes "after", "ctor" etc. across local scope boundaries.
I think one way to do this is to declare initial attributes and resulting attributes for each moveable parameter passed by reference. If the attributes are absent, the compiler will assume that the attributes of the parameter are unchanged after the function returns.
Consider the following example:
//both parameters will be logically destroyed
void foo(
//initial state: after(has_assigned||"ready")
//returned state: dtor
const Tptr<T>@ & ptr:(after(has_assigned||"ready");dtor),
//parameter without name
//initial state: ... //(any)
//returned state: dtor
const Tptr<T>@ & :(...;dtor)=Tptr<T>()
);
//"ptr" attributes will be unchanged
void boo( const Tptr<T>@ &ptr );
This is the default assumption, probably excluding the "move constructor" and "move assignment". For these operations, by default, the initial attribute of the parameter must be "ready" and the resulting attribute of the parameter will be "dtor". The attributes of the functions can also be explicitly redefined by the user.
So in one of the examples above we became protected from passing an uninitialized Tptr<T> into a function. But again, only one condition can be used for the "move constructor": either we allow an uninitialized object to be passed or we do not.
public:
//now the important case can be partially resolved at compile time
//obj with incorrect state can not be passed here
//due to initial attribute of obj must be "ready"
Tptr(const Tptr@ &obj):ptr(<obj) ready{}
//the same as "Tptr(const Tptr@ &obj)"
Tptr& operator=(const Tptr@ &obj)ctor ready;
The rules for "moveable references" do not allow attribute changes that depend on run-time conditions, but for run-time conditions we can use explicit run-time tests, and to mark them we can add a "runtime" keyword (the "runtime" keyword disables compile time checks from the point of use):
//"ptr" attributes can be changed under runtime conditions
void woo( const Tptr<T>@ &ptr:(runtime) );
//declare "ptr" with compile time control
Tptr<T>@ &ptr;
//from the point we have no compile time control for "ptr"
//probably it is error because "ptr" has no runtime attribute
woo(ptr);
//declare "ptr2" with runtime attribute
Tptr<T>@ &ptr2:(runtime);
//ok
woo(ptr2);
If total compile time control is impossible, we can still keep the control in parts of the code. Consider:
//"ptr" attributes can be changed under runtime conditions
void woo( const Tptr<T>@ &ptr:(runtime) );
Tptr<T>@ &ptr;
//set runtime area
//from the point we have no compile time control for "ptr"
ptr:(runtime);
//ok
woo(ptr);
ptr=new T;
//set compile time area
//return to compile time control, state ready
ptr:(ready);
Here we must also define (enumerate) all the strict rules ("functions expecting a moveable value can use a copyable value" and so on).
foo(a);
foo(a);
a=b;
foo(b);
Well, looking at the assignment, can you say whether copy assignment or move assignment will be applied? You need to find the declaration of the variable "a". That is not good: the code looks vague and is hard to read in small parts.
So, although the "move constructor" cannot be expressed differently from the "copy constructor", we need the ability to express "move assignment" explicitly; hence we need a "binary operator move <-<" in addition to the "unary operator move <". With "operator <-<" the user can state the wish for move assignment explicitly.
We can also use "operator <-<" as a local type override, so that a destination variable declared as copyable uses move assignment.
Normally, "operator <-<" just tells the compiler that move assignment, declared as:
moveable class_name&:(ctor;...)
class_name::operator<-<(
const moveable class_name &:(ctor,dtor)
)ctor;
must be used here, but the user can overload "operator <-<". It is important that, unlike the unary "operator move <", the "binary operator move" does not use intermediate data, does not use the two operators "move + assignment", and adds no possible overhead during the move:
template<class T>
void foo()
{
T b;
moveable T a;
//The lines of the following code does the same:
//operator "move assignment"
a=b;
//operator move <-< calls "move assignment"
a<-<b;
//cast to force "move assignment"
static_cast< moveable typeof(a)& >(a)=b;
//with the help of two operators:
//unary move + assignment from intermediate internal state
a=<b;
}
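For comparison only: standard C++11 also makes the choice visible at the point of assignment, but it marks the source expression (with std::move) instead of using a separate operator or the declared type of the destination. The sketch below records which assignment ran; Logged and demo_assign are hypothetical names introduced for the illustration.

```cpp
#include <utility>

// Records which assignment operator was selected (standard C++11 sketch).
struct Logged
{
    enum Op { None, CopyAssign, MoveAssign };
    Op last;
    Logged() : last(None) {}
    Logged(const Logged&) : last(None) {}
    Logged& operator=(const Logged&) { last = CopyAssign; return *this; }
    Logged& operator=(Logged&&)      { last = MoveAssign; return *this; }
};

// Hypothetical helper: the same two variables, two different assignments.
inline bool demo_assign()
{
    Logged a, b;
    a = b;                                  // source is an lvalue: copy assignment
    bool copied = (a.last == Logged::CopyAssign);
    a = std::move(b);                       // source is cast to rvalue: move assignment
    bool moved = (a.last == Logged::MoveAssign);
    return copied && moved;
}
```

Here a reader can tell copy from move without looking up the declaration of "a", which is the readability goal stated above; what standard C++ does not provide is the compile time check that "b" is not used again afterwards.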
Because "operator=" is reserved for "assignment by destination type declaration", we need the ability to write "copy assignment" explicitly, similar to "move assignment". By analogy with the syntax of the "binary operator move" we can define the syntax of the "binary operator copy":
T& T::operator<+< (const T&);
Normally, we would not need a "unary copy" operator, because the operation can be implemented by creating a temporary with the help of the copy constructor:
void foo(const Z&);
moveable Z tmp;
//tell to compiler to make copy of tmp
foo(Z(tmp));
but we must assume that the compiler is free to eliminate every copy that can be eliminated, and that the compiler will optimize data and code differently.
To guard against this kind of optimization we could explicitly tell the compiler to always make a copy, and write the example like this:
void foo(const Z&);
moveable Z tmp;
//tell to compiler to make copy of tmp
foo(explicit Z(tmp));
But the syntax with "explicit" is more suitable for declarations of data types than for expressions; also, a new operator gives the user the ability to overload it.
It is hard to find a short form for the "unary operator copy", as was done for the "unary operator move", but we can reuse the syntax of the "binary operator copy" for the "unary operator copy".
We can also allow the syntax of the "binary operator move" to be applied to the "unary operator move", so that the "unary operator move" has two equivalent syntactic forms. A class can declare either form, but only one, and the declaration overloads both. In fact, the short form is only an alias of the long form.
class Z;
Z a;
moveable Z b;
//here "move assignment" will be used,
//because "b" declared as "moveable"
b = a;
//here "copy assignment" will be used,
//because "a" declared as "copyable".
a = b;
Also, for the variables declared in the example, we can use a local moveable/copyable type override with the help of the following set of rules:
a) explicit cast to more limited type:
  static_cast<moveable T &:()>
    tell to compiler to use "move assignment"
    if "move assignment" declared, then must be public
    if "move assignment" undeclared, then will be auto-generated
    else if "copy assignment" declared, then must be public
    if "copy assignment" undeclared, then will be auto-generated
    else error
b) explicit cast to less limited type:
  moveable_cast<T&>
    tell to compiler to use "copy assignment"
    if "copy assignment" declared, then must be public
    if "copy assignment" undeclared, then will be auto-generated
    else error
c) binary explicit specialization of "assignment by default" (operator=):
  operator<-<
    tell to compiler to use "move assignment"
    by default similar to static_cast<moveable T&:()>, but can be overloaded by user
  operator<+<
    tell to compiler to use "copy assignment"
    by default similar to moveable_cast<T&>, but can be overloaded by user
d) with "unary operators" and copying intermediate value of "internal_state":
  "unary operator move" operator<-<
    tell to compiler to use "move to internal_state"
    operator=(operator<-<)
    and then we can "assign from internal_state"
  "unary operator copy" operator<+<
    tell to compiler to use "copy to internal_state"
    operator=(operator<+<)
    and then we can "assign from internal_state"
Examples of the overriding:
a)
//"move assignment" for copyable
static_cast<moveable typeof(a)&:()>(a)=b;
//explicit "move assignment" for moveable
static_cast<moveable typeof(b)&:()>(b)=a;
b)
//"copy assignment" for moveable
moveable_cast<typeof(b)&>(b)=a;
//explicit "copy assignment" for copyable
moveable_cast<typeof(a)&>(a)=b;
c)
//by declaration of target variable
a = b;
b = a;
//binary operator move for copyable
a<-<b;
//binary operator move for moveable
b<-<a;
//binary operator copy for moveable
b<+<a;
//binary operator copy for copyable
a<+<b;
d)
//unary operator move for copyable
a=<b;
a=<-<b;
//unary operator move for moveable
b=<a;
b=<-<a;
//unary operator copy for moveable
b=<+<a;
//unary operator copy for copyable
a=<+<b;
Problems.
A concrete moveable syntax, and the relationships between "copyable" and "moveable", still have to be selected in accordance with all requirements.
Steps of "moveable concept" appearance.
This clause describes why the "moveable concept" appeared and what problems exist in code without "moveable concept" support. It can help in understanding the purpose of the "moveable concept" semantics and syntax.
I have some "holders" (which I use in practice in my programs) that correctly destroy dynamic memory on errors or returns. Their memory must be used only locally in the block:
class int_derived;
extern uint field_size;
void get(char *const);
uint read(const char *const);
{
int a(1);
obj_holder<int> b(new int_derived);
array_holder<char> c(new char[field_size]);
get(*c);
if(!**c)safe_throw<err_type>("no input string");
*b=read(*c);
if(!*b)break;
for(uint i=*b; i; --i)++a;
return a;
}
There is a problem here: unlike "a", I cannot return the address held by "b" or "c" outside the local block. At the "}" the destructors of the "holders" will free their dynamic memory.
To solve this problem, we can add the term "ownership" to the "holder" and define a concrete "ownership transfer" behaviour.
An example of "concrete ownership transfer behaviour" is the behaviour of auto_ptr. The purpose of auto_ptr is to pass ownership to the destination regardless of the bounds of auto blocks.
The most frequent use of auto_ptr's ownership transfer is from an unnamed temporary source to a named destination variable. That is why we need no runtime tests to share ownership between local blocks: the unnamed temporary is inaccessible and is automatically deleted after the transfer has completed.
So this kind of "pass" is a "move" for "auto_ptr". We cannot "copy" an "auto_ptr"; that is why auto_ptr is not copyable, but moveable. To let the user express the fact of a moveable data type we need the "moveable" keyword, applied similarly to the "const" or "volatile" keywords. We cannot use only a "moveable value" or a "moveable reference" in place of an ordinary moveable data type.
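The ownership transfer described here can be sketched with std::unique_ptr from standard C++11, which plays the role auto_ptr plays in the text. The helpers make_value and demo_transfer are hypothetical names used only for the illustration.

```cpp
#include <memory>
#include <utility>

// Ownership leaves the local block together with the return value.
inline std::unique_ptr<int> make_value(int v)
{
    std::unique_ptr<int> local(new int(v));
    return local;                    // moved out, not freed at the closing brace
}

// Hypothetical helper showing both kinds of source.
inline int demo_transfer()
{
    // Unnamed temporary source: nothing is left behind to misuse.
    std::unique_ptr<int> dst = make_value(42);

    // Named source: "named" stays accessible after the transfer, but is empty.
    std::unique_ptr<int> named(new int(1));
    std::unique_ptr<int> taker(std::move(named));

    return *dst + *taker + (named ? 100 : 0);   // named is empty, so nothing is added
}
```

The first transfer needs no runtime test because the temporary cannot be named again. The second leaves an accessible but empty variable behind.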
Of course, we can use "auto_ptr" with named source variable also, like this:
{
int sa(1);
auto_obj<int> sb(new int_derived(2));
auto_array<char> sc(new char[field_size]);
{
int a(sa);
auto_obj<int> b(sb);
auto_array<char> c(sc);
get(*c);
if(!**c)safe_throw<err_type>("no input string");
*b=read(*c);
if(!*b)break;
for(uint i=*b; i; --i)++a;
return a;
}}
Here there is no problem of access to the named source variable once the internal block has executed, thanks to "}}": the named source variable is never used after the transfer has completed.
But it is easy to see that the named source variable is accessible inside the internal block and between the two "}" of the pair "}}". Access to the named source variable after the move has been done is _always an error_, never UB.
The cause of this error is that the block boundaries for "auto" and "dynamic" memory do not coincide. Note that applying "non-const" to a copyable cannot help us solve the problem: it cannot protect us from access to the named source variable after the move has been done.
To work with a "moveable" we must take into account the presence of "ownership", which does not exist for a "copyable". So algorithms _cannot begin to support a moveable data type_ merely by adding a "moveable data type" declaration: algorithms must be _re-designed_, just as they must be _re-designed_ to support "const".
The following example shows the runtime errors that can appear in an algorithm if we add moveable data type support by "re-declaration" only, without proper re-design:
template<class T>
T foo(const T& data)
{
T tmp(data);
tmp.foo();
return data;
};
template<class T>
moveable T foo(const moveable T & data )
{
//possible run-time error due to "data" is reference
moveable T tmp(data);
tmp.foo();
//always run-time error
return data;
};
template<class T>
moveable T foo(const moveable T &:(ctor,ready;dtor) data )
{
//ok due to ":(ctor,ready;dtor)" compile time attributes
moveable T tmp(data);
tmp.foo();
//ok
return tmp;
};
Without strict compiler support for detecting improper usage of a moveable data type, it will be _hard to re-design_ old existing code to support the new moveable data type, just as it would be hard to re-design old existing code to support the "const" data type if the compiler did not detect the "assignment to const" error.
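The lack of detection can be observed in standard C++11. In the sketch below, which uses std::unique_ptr as the moveable type, the "re-declared only" version compiles without any diagnostic and silently returns an empty pointer, while the re-designed version returns the value. The names naive, redesigned and demo_redesign are hypothetical.

```cpp
#include <memory>
#include <utility>

// "Re-declaration only": the body still returns the source after moving from it.
inline std::unique_ptr<int> naive(std::unique_ptr<int>& data)
{
    std::unique_ptr<int> tmp(std::move(data));
    *tmp += 1;
    return std::move(data);   // wrong: data was already moved, yet this compiles
}

// Re-designed: the function returns the object that now owns the value.
inline std::unique_ptr<int> redesigned(std::unique_ptr<int>& data)
{
    std::unique_ptr<int> tmp(std::move(data));
    *tmp += 1;
    return tmp;
}

// Hypothetical helper comparing the two versions.
inline bool demo_redesign()
{
    std::unique_ptr<int> a(new int(1));
    std::unique_ptr<int> lost = naive(a);        // empty: the value was freed with tmp
    std::unique_ptr<int> b(new int(1));
    std::unique_ptr<int> kept = redesigned(b);   // owns the incremented value
    return !lost && kept && *kept == 2;
}
```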
The purpose of "compile time attributes for moveable data type" is to attenuate the block boundaries for "auto" and "dynamic" memory. By the nature of dynamic memory, the lifetime of variables belonging to it does not coincide with the ordinary block boundaries controlled by the compiler.
We use dynamic memory to create a kind of block for variables that spans ordinary block boundaries, because we need no "copy" to pass the variable across an ordinary block boundary, so we gain performance here. At the same time we do not want to lose reliability, and "compile time attributes for moveable data type" provide this.
With "compile time attributes for moveable data type" the last example logically looks like this:
{
sa:{
int sa(1);
sb:{
auto_obj<int> sb(new int(2));
sc:{
auto_array<char> sc(new char[field_size]);
{
int a(sa);
sa:}
auto_obj<int> b(sb);
sb:}
auto_array<char> c(sc);
sc:}
get(*c);
if(!**c)safe_throw<err_type>("no input string");
*b=read(*c);
if(!*b)break;
for(uint i=*b; i; --i)++a;
return a;
}
sc:{
auto_array<char> sc(new char[field_size]);
sc:}
}
Now we have no errors. Technically, the lifetime of a moveable follows the ordinary boundaries "{}", but logically the moveable is accessible from C++ only inside its logical boundaries "variable_name:{ variable_name:}".
The syntax "variable_name:{ variable_name:}" can also be suitable, but it cannot replace compile time attributes, which declare the logical boundaries implicitly through the class or function declaration.
Differences between "rvalue", "movability" and "moveable data type".
This clause describes the differences between the terms "rvalue", "movability" and "moveable data type". The terms may not look fundamental, but there is a fundamental difference between a "new ordinary data type" and a "refinement of an existing data type".
People use the words "moveable semantics" for both cases, so we need to see a sharp and clear difference between them: they behave differently and really are different things. This can help in understanding the purpose of the "moveable concept" semantics and syntax.
We probably have no predefined, indisputable definitions of "movability" and "moveable data type", but the terms must take into account all practically observed and theoretically visible properties of data types in programs.
We can think that, conceptually, "movability" (or just the "moveable property") is the ability of _any_ concrete object to allow the "move" process to be applied to it; the property can be unrelated to the type of the object and can appear in only one concrete _place_ of code.
In contrast, a "moveable data type" is conceptually an ordinary data type that, in addition to "movability", defines other important rules of behaviour for all objects of its type, _always_.
This difference is important, because a "moveable data type" relates to a _data type_, while the "moveable property" relates to a _place of code_. This conceptual difference results in different behaviour for "objects of moveable data type" and "objects with the moveable property only".
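The "place of code" reading can be illustrated with r-value references as they exist in standard C++11. In the sketch below the type is std::string throughout; which overload is selected depends only on the expression at the call site. The functions which and demo_places are hypothetical names.

```cpp
#include <string>
#include <utility>

// Two overloads for one and the same data type.
inline const char* which(const std::string&) { return "copy"; }
inline const char* which(std::string&&)      { return "move"; }

// Hypothetical helper: three call sites, one type.
inline std::string demo_places()
{
    std::string s("x");
    std::string out;
    out += which(s);                   // named lvalue: the copy overload
    out += '/';
    out += which(std::string("y"));    // temporary: the move overload
    out += '/';
    out += which(std::move(s));        // same variable, different place: move overload
    return out;
}
```

The variable "s" is treated as copyable at one call and as moveable at another, without any change to its declared type. A "moveable data type", as defined above, would instead fix that behaviour in the declaration.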
A further language refinement can be made at this point. When returning a non-cv-qualified object with automatic storage from a function, there should be an implicit cast to rvalue:
string
operator+ (const string& x, const string& y)
{
string result;
result.reserve(x.size() + y.size());
result = x;
result += y;
return result; // as if return static_cast<string&&>(result);
}
The logic resulting from this implicit cast results in an automatic hierarchy of "move semantics" from best to worst:
When returning any object with automatic storage from a function, there should be an explicit cast to "moveable data type", function must be declared properly:
string
operator+ (const string& x, const string& y)
{
string result;
result.reserve(x.size() + y.size());
result = x;
result += y;
//here (according to return type of "operator+"
//declaration) "copy constructor" _always_
//- must be public in the context
//- never can be implicitly replaced by "move constructor"
return result;
}
The logic resulting from this explicit cast results in an explicit hierarchy of "copy semantics" from best to worst:
moveable string
operator+ (const string& x, const string& y)
{
string result;
result.reserve(x.size() + y.size());
result = x;
result += y;
//here (according to return type of "operator+"
//declaration) either "move constructor"
//or "copy constructor"
//- must be public in the context
return result;
}
The logic resulting from this explicit cast results in an explicit hierarchy of "move semantics" from best to worst:
The example shows, that the main difference between "moveable data type" and "moveable property(r-value reference)" is the fact, that:
There is _no_ relationship between "rvalue" and the "moveable property". We can easily imagine a "conceptual moveable" that is an rvalue as well as one that is an lvalue:
moveable T add(const moveable T&, const moveable T&);
extern moveable T& src;
moveable T dst= add(src,src);
In the example above, the return value of the function "add" is an "rvalue of moveable data type", while "dst" and "src" are "lvalues of moveable data type", even though "src" is a "reference to moveable".
This is because a "conceptual moveable" is a data type that can create a new instance from itself only by moving its internal state instead of copying it, while a "conceptual copyable" is a data type that can create a new instance from itself either by moving its internal state or by copying it.
Both a "conceptual moveable" and a "conceptual copyable" can be an "lvalue" as well as an "rvalue" or "temporary".
In theory, the "moveable property" is just a limited subset of the "moveable data type", so we need only the complete "moveable data type" instead of the "moveable property".
But the "moveable property" can have its own behaviour (as the "r-value reference" does), orthogonal to the behaviour of an ordinary C++ data type. In that case both behaviours can co-exist simultaneously.
Consider the following declaration of class string, supporting all of them:
class string
{
/* *************************************
copy and move
**************************************** */
//copyable data type
public:
string(const string&);
any_type operator= (const string&);
//r-value refinement of copyable data type
public:
string(const? string&&);
any_type operator= (const? string&&);
//moveable data type
public:
string(moveable const string&);
any_type operator= (moveable const string&)ctor;
/* *************************************
copy and move from non-const source
**************************************** */
//does not define copyable data type if not hidden
//can override implicit copy ctor/assignment
public:
//compile time error
//string(string&) is hidden by
//string(const string&)
string(string&);
//compile time error
//operator= (string&) is hidden by
//operator= (const string&)
any_type operator= (string&);
//does not define moveable data type if not hidden
//can override implicit move ctor/assignment
public:
//compile time error
//string(moveable string&) is hidden by
//string(moveable const string&)
string(moveable string&);
//compile time error
//operator= (moveable string&) is hidden by
//operator= (moveable const string&)
any_type operator= (moveable string&)ctor;
};
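For comparison, a class that offers both copy and move can be sketched in present-day standard C++ (C++11 and later). This is an assumption for illustration only: the proposed "moveable" keyword and the "ctor" suffix do not exist there, the class name Str and the helper functions are hypothetical, and the move constructor here leaves the source empty, which follows the r-value reference convention rather than the behaviour argued for on this page.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <utility>

// Hypothetical string-like class; not the "string" declared above.
class Str
{
public:
    explicit Str(const char* s)
        : size_(std::strlen(s)), data_(new char[size_ + 1])
    {
        std::memcpy(data_, s, size_ + 1);
    }
    // Copy: a new buffer is allocated, the source is left unchanged.
    Str(const Str& other)
        : size_(other.size_), data_(new char[other.size_ + 1])
    {
        std::memcpy(data_, other.data_, size_ + 1);
    }
    // Move: the buffer is taken over and the source is emptied.
    Str(Str&& other) noexcept
        : size_(other.size_), data_(other.data_)
    {
        other.size_ = 0;
        other.data_ = nullptr;
    }
    Str& operator=(const Str&) = delete; // assignment left out of the sketch
    ~Str() { delete[] data_; }

    std::size_t size() const { return size_; }
    bool has_buffer() const { return data_ != nullptr; }

private:
    std::size_t size_;
    char* data_;
};

// After a copy both objects hold the text.
std::size_t copy_keeps_source()
{
    Str a("abc");
    Str b(a);
    return a.size() + b.size();
}

// After a move only the destination holds the text.
bool move_empties_source()
{
    Str a("abc");
    Str b(std::move(a));
    return !a.has_buffer() && a.size() == 0 && b.size() == 3;
}
```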
I am sure that "moveable data type" is more regular and more important than "r-value reference", so if C++ cannot contain both, then "moveable data type" must be selected.
The "moveable concept" proposal in comparison with the "r-value reference" proposal.
Let's consider the "r-value reference" proposal ( r-value reference FAQ ) and the "moveable concept" proposal and find some of the differences between them.
The "r-value reference", being a refinement of copyable data type, cannot claim to represent "moveable data type" as an ordinary data type in C++. By contrast, the "moveable concept" is designed to introduce a pure, ordinary, separate "moveable data type". The "r-value reference" has behaviour orthogonal to "moveable data type", so they can co-exist simultaneously in one class.
The "r-value reference" syntax "int&&" can be used to make an explicit cast of a "temporary rvalue of type int" into a "reference to temporary lvalue of type int", but in general that question is unrelated to moveable data type.
Also, the fact that moving from a temporary is safe during the temporary's lifetime is unrelated to the properties of moveable data type, just as RVO is unrelated to moveable data type or copyable data type.
And yet, if the "r-value reference" proposal allows "move" to be applied to a copyable, the proposal is a kind of "moveable property" rather than "moveable data type". The "r-value reference" does not introduce a separate data type; it is rather a special set of rules for working with copyable data type in special places of code, unrelated to the data type declaration.
The "r-value reference" can never replace ordinary moveable data type ("r-value reference" is just a limited subset of a correct moveable data type, and it has special rules for interacting with copyable that are not defined by the class declaration), because with the help of "r-value reference" the user will never be able to express even the simplest kinds of algorithms with moveable data type (auto_ptr, for example).
If the address of an "rvalue" can be taken, then the "rvalue" has been implicitly cast into an "lvalue", because by definition the address of an "rvalue" can never be taken.
Of course, the fact that no "rvalue" has an address does not mean that an "rvalue" cannot be cast (implicitly or explicitly) into an "lvalue", but "r-value reference" by itself sounds paradoxical: "address of a thing that has no address".
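The implicit cast of an "rvalue" into an "lvalue" can already be observed in standard C++ when a temporary is bound to a const reference. A minimal sketch, with a hypothetical function name:

```cpp
#include <cassert>

// Binding a temporary to a const reference gives it a name,
// after which its address can be taken like that of any lvalue.
int read_through_address_of_temporary()
{
    const int& ref = 3;      // a temporary int is created and its lifetime extended
    const int* ptr = &ref;   // "ref" is an lvalue, so taking the address is allowed
    // const int* bad = &3;  // would not compile: the literal 3 is an rvalue
    return *ptr;
}
```

The literal itself never gets an address; the address belongs to the temporary object that the reference names.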
For the CPU's opcodes, "rvalue" means the data is immediate:
enum { data=3 };
register int tmp=data;
;.code segment
; //3 -> tmp
; movl $_data,%eax
int data=3;
;.data segment
; _data .int 3
register int tmp=data;
;.code segment
; //3 -> tmp
; movl _data,%eax
//register int *const ptr=data;
int auto* register const ptr=&data;
;.code segment
; //&data -> ptr
; leal _data,%ebx
By the way, it would be better to call "r-value reference" a "reference to temporary", because the semantics of "reference" assume an "lvalue": a "reference" will in the best case be implemented as a compile-time alias and in the worst case as an encapsulated "const POD pointer".
I think we must note that the difference between "rvalue reference" and "ordinary reference" can only be one of behaviour, because the term "reference" means "address", in contrast to the term "value". The difference is the size of the memory allocated for a parameter passed to a function:
( sizeof(parameter passed by any reference) == sizeof(parameter passed by pointer ) ) != sizeof(parameter passed by value )
With the help of the "r-value reference" proposal we cannot explicitly declare a value that must be created by a moveable ctor. Note that moveable value and moveable reference have different meanings for the auto_ptr wrapper: in the first case we pass ownership, in the second case we do not.
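The difference between passing an owning wrapper by value and by reference can be sketched in present-day standard C++. This is an assumption for illustration: std::auto_ptr was deprecated in C++11 and removed in C++17, so std::unique_ptr stands in for it here, and the function names consume, inspect and ownership_demo are hypothetical.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// By value: the parameter is constructed by moving, so ownership is transferred.
int consume(std::unique_ptr<int> p) { return *p; }

// By reference: the callee only looks at the object; the caller keeps ownership.
int inspect(const std::unique_ptr<int>& p) { return *p; }

bool ownership_demo()
{
    std::unique_ptr<int> p(new int(7));

    int seen = inspect(p);
    bool still_owned = (p != nullptr);   // nothing was transferred

    int taken = consume(std::move(p));
    bool released = (p == nullptr);      // ownership went into the parameter

    return seen == 7 && taken == 7 && still_owned && released;
}
```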
This can be a problem if a data type has moveable and copyable behaviour simultaneously, because copyable will be hidden by moveable.
Also, it is hard to declare (or, looking at the declarations, hard to guess) what kind of ctor will be used.
I think that even if we can hide the absence of moveable value with the help of a special technique, we must not do so; we must give the user the ability to express clearly what he wants from the data type.
The "r-value reference" proposal requires that the source remain in a self-consistent state after "move" (all internal invariants are still intact).
The requirement is theoretically correct (such types can exist) but practically useless, because even the auto_ptr wrapper behaves differently.
Note that it is very important for auto_ptr that its state after "move" becomes incorrect and that the user has no access to the auto_ptr while it is in the incorrect state, because otherwise we cannot implement the desired behaviour of auto memory substitution.
The "r-value reference" proposal requires that, from the client code's point of view, choosing move instead of copy means that you don't care what happens to the state of the source.
Here "copying from non-const source" and "move" are mixed up. There are differences between them. For "move" the source state will _never_ be correct or undefined; the state will _always_ be incorrect.
Also, the "r-value reference" proposal simply ignores the fact that "moveable data type" can be const, and that the constness can help us to detect errors at compile time.
The situation is as if the "moveable" data type were sharply divided into internal and external state, and "move" changes the placement of the internal state. A source object without internal state cannot exist, and we may not assign some random value to the source's internal state in order to make the state "as if correct or consistent".
Of course, we can imagine a kind of data type for which the source has some other (but correct) value after "move", but we cannot demand that specific behaviour from all moveable data; fundamentally, a moveable data type must have the ability to turn the source into an incorrect state after "move".
For PODs, move and copy are identical operations. The cause of this behaviour is simply that POD data is a copyable data type without a specific implementation of "move", so "copy" is used instead of "move".
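In present-day standard C++ this can be checked directly. The sketch below assumes C++11's std::move; the struct Pod and the function name are hypothetical.

```cpp
#include <cassert>
#include <utility>

// A plain aggregate with no user-defined copy or move operations.
struct Pod { int x; int y; };

bool pod_move_is_copy()
{
    Pod a = {1, 2};
    // The implicit move constructor moves each member,
    // and moving an int is the same as copying it.
    Pod b = std::move(a);
    return a.x == 1 && a.y == 2 && b.x == 1 && b.y == 2;
}
```

The source keeps its value after the "move", exactly as it would after a copy.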
In comparison with "moveable data type", the external state of the source of a "copyable data type" _always_ remains unchanged after "copy"; the source state can _never_ be changed by "copy", because we cannot get two equal copies if the source is changed by the copy.
The "r-value reference" proposal has been designed to keep functionality without changing the meaning of any existing code, instead of introducing "moveable data type" itself.
The heroic attempt to make moveable out of unchanged old code, which works only with copyable, can be ironically compared with the appearance of an "implicit const reference T&&&":
here "t" is implicitly "const", because in the case:
template<class T>
void foo(T&&& t);
the compiler can elide the copy ctor used to create "t" and perform perfect forwarding.
int tmp=3;
foo<int>( tmp );
Note that introducing an ordinary "moveable data type" will not require any changes to code working with copyable and does not change the execution of that code.
The "r-value reference" proposal has been designed to keep vague overloading by constness of parameter.
void foo(const A& t); // #1
void foo(A&& t); // #2
A source();
const A const_source();
foo(source()); // binds to #2
foo(const_source()); // binds to #1
Overloading by const/non-const parameter is a suspicious idea; it can be a source of vague casts and human errors.
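The overload selection quoted above can be reproduced in present-day standard C++. In this sketch the return values 1 and 2 and the pick_* helper functions are hypothetical additions that make the chosen overload observable.

```cpp
#include <cassert>

struct A {};

int foo(const A&) { return 1; } // #1
int foo(A&&)      { return 2; } // #2

A source() { return A(); }
const A const_source() { return A(); }

// A non-const rvalue prefers the rvalue reference overload.
int pick_for_source() { return foo(source()); }

// A const rvalue cannot bind to A&&, so only #1 is viable.
int pick_for_const_source() { return foo(const_source()); }

// A named object is an lvalue and cannot bind to A&& either.
int pick_for_lvalue() { A a; return foo(a); }
```

The only thing that separates the first two calls is the const on the return type, which is the kind of subtle distinction the paragraph above warns about.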
Operator "parameter passed by".
Abstract.
In some cases it is useful to prevent objects of a class from being passed as a parameter by value or by reference. To do that, C++ must introduce an operator for each concrete way a parameter can be passed. By declaring these operators as user-defined in the public or protected section of a class, we can control whether an object may be passed by reference or by value.
Syntax.
This is an example of possible syntax for the operators:
struct A
{
int i;
protected:
//copyable parameters
operator (const A&) ()const {return *this;}
operator (A&) (){return *this;}
operator (const A*) ()const {return this;}
operator (A*) (){return this;}
operator (const A) ()const {return *this;}
operator (A) (){return *this;}
protected:
//moveable parameters
operator (const A@ &)()const {return *this;}
operator (A@ &) (){return *this;}
operator (const A@ *)()const {return this;}
operator (A@ *) (){return this;}
operator (const A@) ()const {return *this;}
operator (A@) (){return *this;}
};
void foo1(const A&);
void foo2(const A*);
void foo3(const A);
void foo4(const A@ &);
void foo5(const A@ *);
void foo6(const A@);
void boo()
{
A a;
//the following code must not compile
//because the operators are protected in this context
const A &f1=a; //operator (const A&)
const A *f2=&a; //operator (const A*)
const A f3=a; //operator (const A)
const A @& f4=a; //operator (const A@ &)
const A @* f5=&a; //operator (const A@ *)
const A @f6=a; //operator (const A@)
foo1(a); //operator (const A&)
foo2(&a); //operator (const A*)
foo3(a); //operator (const A)
foo4(a); //operator (const A@ &)
foo5(&a); //operator (const A@ *)
foo6(a); //operator (const A@)
}
Note that A::operator (const A)()const is not the same as A::operator const A ()const or A::A(const A). Operators of the kind operator (type) are intended only for access control for external users of the class. Probably the operators can have no body, only a declaration, with the body built in. Probably there are other ways to do it.
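For comparison, standard C++ can approximate part of this today, and the sketch below shows how far it goes. It is not the proposed syntax: the class NoByValue and the helper functions are hypothetical, passing by value is blocked with a protected copy constructor, taking the address is blocked with a protected operator&, and binding a reference cannot be blocked at all, which is the gap the proposed operators would fill.

```cpp
#include <cassert>
#include <type_traits>

class NoByValue
{
public:
    NoByValue() : i(0) {}
    int i;

protected:
    // Copying is reserved for the class and its descendants,
    // so outside code cannot pass a NoByValue by value.
    NoByValue(const NoByValue&) = default;

    // Taking the address with & is reserved in the same way.
    NoByValue* operator&() { return this; }
    const NoByValue* operator&() const { return this; }
};

// Outside code sees the type as non-copyable.
static_assert(!std::is_copy_constructible<NoByValue>::value,
              "copy constructor is not accessible from outside");

// Passing by reference is still allowed; standard C++ cannot forbid it.
int by_reference(const NoByValue& a) { return a.i; }

int call_by_reference()
{
    NoByValue a;
    a.i = 5;
    return by_reference(a);
}
```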
Problems.
None.
List of modifications:
04.03.07 | 01.03.07 |
28.02.07 | 23.02.07 | 22.02.07 | 21.02.07 | 20.02.07 | 18.02.07 |