tendra4minix bugs - copy_func

[Release b]
<t4m_bug_copy_func_args>
Thu May  6 20:33:28 WEST 2004

In the current release of tendra4minix (a), after fixing <t4m_bug_tok_recur>,
the compilation of the file tok_recur.C produces this error

Internal error: 'tag 5' used but not defined.

This error is generated by the frontend [src/producers/common/construct/],
inside the function write_capsule_body [../output/capsule.c]. The tag mentioned
is found in the data structure 'static VAR_INFO vars[VAR_total]', declared in
the same file. It is an array of 6 copies of the structure VAR_INFO, and the
"tags" are stored in entry 0 of this array, 'vars[0]'. Tag uses are stored
in 'unsigned char *vars[0].uses' and their definitions are stored in
'string *vars[0].names'. What is happening is that 'vars[0].names[5]' is the
null pointer (that is, there is no definition), but 'vars[0].uses[5]' says that
the tag 5 has been used in the file, and thus the above error is generated.
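
The check that produces this error can be pictured with a minimal C sketch.
It is a simplified model built only from the description above, not the
actual code of capsule.c: the type TagTable, the constant USED and the
function first_undefined_use are hypothetical names, and only the idea of two
parallel arrays 'uses' and 'names' indexed by tag number comes from the
frontend.

```c
#include <assert.h>
#include <stddef.h>

#define USED 1   /* hypothetical stand-in for USAGE_USE */

/* Simplified model of one VAR_INFO entry: two parallel arrays indexed
   by tag number (the real structure has more fields). */
typedef struct {
    size_t no;                   /* number of tags allocated so far */
    const unsigned char *uses;   /* uses[n] != 0: tag n is referenced */
    const char *const *names;    /* names[n] == NULL: no definition */
} TagTable;

/* Return the first tag that is used but has no definition, or -1 if
   every used tag is defined. */
long first_undefined_use(const TagTable *t)
{
    size_t n;
    assert(t != NULL);
    for (n = 0; n < t->no; n++) {
        if ((t->uses[n] & USED) && t->names[n] == NULL) return (long)n;
    }
    return -1;
}
```

In this model, a table whose entry 5 has a nonzero 'uses' value and a null
'names' pointer makes first_undefined_use return 5, which corresponds to the
situation reported as 'tag 5' above.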

Since the structure 'vars' is static, it is accessed only from the same module
capsule.c, and so it is not difficult to locate the places where it is written
and, in particular, where the entries 'vars[0].names[5]' and 'vars[0].uses[5]'
are written. To find them, we instrument the functions capsule_no, capsule_name,
capsule_id, record_usage, and clear_usage, in the following way

ulong capsule_no
    PROTO_N ( ( s, v ) )
    PROTO_T ( string s X int v )
{
    VAR_INFO *var = vars + v ;
    ulong n = ( var->no )++ ;
    if ( n >= var->sz ) extend_linkage ( v ) ;
    var->uses [n] = USAGE_USE ;
    var->names [n] = s ;
    var->present = 1 ;
    if (v == 0 && n == 5) {
      BUFFER bf = NULL_buff;
      clear_buffer(&bf, stderr);
      bfprintf(&bf, "capsule_no(%s, %d)\n", s, v);
      output_buffer(&bf, 1);
      free_buffer(&bf);
    }
    return ( n | LINK_EXTERN ) ;
}

and we learn that, of all the functions in capsule.c that could write
'vars[0].names[5]' and 'vars[0].uses[5]', only capsule_no is actually invoked
(with s = NULL), so it must be the one we are looking for. After a good deal
of tracing, I have found that compile_function [../output/compile.c] is
the nexus between the part of the frontend that processes external declarations
and the part that encodes the output TDF capsule (in ../output/). What follows
is a summary of the calls starting from external_declaration [declare.c] and
going down to enc_func_defn [../output/compile.c], which, as its name implies,
is in charge of encoding the body of the function 'operator<< <double> (B &,
const double &)', and it is during its activation that the error is reported.

external_declaration [declare.c]
  clear_templates [instance.c], templ = 0
    copy_template_list [instance.c]
      copy_template_list [instance.c]
        copy_template [instance.c], A &operator<< <double>(B &, const double &)
          define_templ_member [instance.c]
            bind_template [instance.c], tid != NULL_id
              bind_template [instance.c], tid == NULL_id
                copy_object [copy.c], A &operator<< <double>(B&, const double&)
                  pop_namespace [namespace.c]
                  end_function [function.c]
                    ...
                    define_id [identifier.c]
                    compile_function [compile.c], A &operator<< <double>
                      make_tagdef [compile.c], A &operator<< <double>
                        capsule_id [capsule.c], A &operator<< <double>, ret 0
                        ...
                        enc_func_defn [compile.c]
                          ...
                          enc_compound_stmt [stmt.c]
                            [*]
                  end_templ_scope [rewrite.c]
                  end_declarator [namespace.c]
            clear_templates [instance.c], templ = 0

[*] Let us trace this call in more detail, since the problem arises during
    its activation (note the appearance of the message "capsule_no(, 0)"):

enc_compound_stmt, 65 exp_sequence_tag [stmt.c]
  enc_compound_stmt, 84 exp_location_tag
  enc_compound_stmt, 67 exp_decl_stmt_tag
    enc_stmt, 67 exp_decl_stmt_tag [stmt.c]
      enc_stmt, 65 exp_sequence_tag
        enc_compound_stmt, 65 exp_sequence_tag
          enc_compound_stmt, 84 exp_location_tag
            enc_compound_stmt, 73 exp_return_stmt_tag
              enc_stmt, 73 exp_return_stmt_tag, Simple return
                enc_exp, 19 exp_address_tag [exp.c]
                  enc_addr_exp, 17 exp_indir_tag [exp.c]
                    enc_exp, 22 exp_func_id_tag
                      enc_func_id_call
                        ...
                        capsule_id, A &A::operator<< <double>(const C<double>&)
                        ...
                        enc_exp_list
                          enc_exp, 19 exp_address_tag
                            enc_addr_exp, 14 exp_init_tag
                              enc_init_tag, 49 exp_constr_tag
                               [exp_dummy_no written with 6 at init.c line 711]
                                enc_exp, 22 exp_func_id_tag
                                  enc_func_id_call
                                    ...
                                    capsule_id, A::A(const B &)
                                    ...
                                    enc_exp_list
                                      enc_exp, 19 exp_address_tag
                                        enc_addr_exp, 87 exp_dummy_tag
                                          enc_exp, 87 exp_dummy_tag
                                            enc_dummy_exp, n = 6, cnt = 0
                                            enc_dummy_exp: returning
                                          enc_exp: returning
                                        enc_addr_exp: returning
                                      enc_exp: returning
                                      enc_exp, 19 exp_address_tag
                                        enc_addr_exp, 17 exp_indir_tag
                                          enc_exp, 0 exp_identifier_tag
                                            enc_addr_exp, 0 exp_identifier_tag
                                            enc_addr_exp: returning
                                          enc_exp: returning
                                        enc_addr_exp: returning
                                      enc_exp: returning
                                    enc_exp_list: returning
                                  enc_func_id_call: returning
                                enc_exp: returning
                              enc_init_tag: returning
                            enc_addr_exp: returning
                          enc_exp: returning
                          enc_exp, 49 exp_constr_tag
                            enc_exp, 22 exp_func_id_tag
                              enc_func_id_call
                                ...
                                capsule_id, C<double>::C(const double &)
                                ...
                                enc_exp_list
                                  enc_exp, 19 exp_address_tag
                                    enc_addr_exp, 87 exp_dummy_tag
                                      enc_exp, 87 exp_dummy_tag
                                        enc_dummy_exp, n = LINK_NONE, cnt = 0
                                          capsule_no(, 0)
                                        enc_dummy_exp: returning
                                      enc_exp: returning
                                    enc_addr_exp: returning
                                  enc_exp: returning
                                  enc_exp, 19 exp_address_tag
                                    enc_addr_exp, 17 exp_indir_tag
                                      enc_exp, 0 exp_identifier_tag
                                        enc_addr_exp, 0 exp_identifier_tag
                                        enc_addr_exp: returning
                                      enc_exp: returning
                                    enc_addr_exp: returning
                                  enc_exp: returning
                                enc_exp_list: returning
                              enc_func_id_call: returning
                            enc_exp: returning
                          enc_exp: returning
                        enc_exp_list: returning
                      enc_func_id_call: returning
                    enc_exp: returning
                  enc_addr_exp: returning
                enc_exp: returning
              enc_stmt: returning
            enc_compound_stmt: returning
          enc_compound_stmt: returning
        enc_compound_stmt: returning
      enc_stmt: returning
    enc_stmt: returning
  enc_compound_stmt: returning
enc_compound_stmt: returning

The member function 'A::operator<< <double> (const C<double> &)' takes two
arguments and both of them are temporary objects built by user-defined
constructors. In the above call tree, we can clearly recognize the traces
that correspond to the encoding of the call to each constructor,
'A::A(const B &)' and 'C<double>::C(const double &)'. Each constructor takes
two arguments: a 'this' pointer and a reference to another object. The 'this'
pointer passed to the constructor 'C<double>::C(const double &)' is initialized
to the address of a temporary object of the class 'C<double>', and something
goes wrong during the encoding of this temporary object.

The wrong value (LINK_NONE) seen as 'n' in enc_dummy_exp, just before the call
to capsule_no, comes from the field exp_dummy_no read in the function enc_exp
[../output/exp.c] (line 2860). I have traced every assignment made to
(instances of) the field exp_dummy_no throughout the frontend, and what
happens is this: for the constructor 'A::A(const B &)', whose call is built
correctly, the field exp_dummy_no is written with the value 6 (this event is
marked in the call tree above); for the constructor 'C<double>::C(const double
&)', the field exp_dummy_no is not written at all (LINK_NONE is its default
value).
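
A minimal C model of this failure mode follows. It assumes, based only on the
trace above, that the encoder allocates a fresh anonymous tag when it meets a
dummy expression whose number is still LINK_NONE. The names Tags, alloc_tag,
define_temporary, encode_dummy and count_undefined are hypothetical, the value
chosen for LINK_NONE is an assumption, and the real enc_dummy_exp is more
involved.

```c
#include <assert.h>
#include <limits.h>

#define LINK_NONE ULONG_MAX   /* assumed sentinel value */
#define MAX_TAGS 16

typedef struct {
    unsigned long no;                /* next free tag number */
    unsigned char used[MAX_TAGS];    /* tag n is referenced */
    unsigned char defined[MAX_TAGS]; /* tag n has a definition */
} Tags;

/* Allocate a new tag and mark it as used (compare capsule_no). */
unsigned long alloc_tag(Tags *t)
{
    unsigned long n = t->no++;
    assert(n < MAX_TAGS);
    t->used[n] = 1;
    return n;
}

/* Model of the enclosing initialiser: allocate the tag of the temporary,
   define it, and record its number in the dummy expression. */
unsigned long define_temporary(Tags *t, unsigned long *dummy_no)
{
    unsigned long n = alloc_tag(t);
    t->defined[n] = 1;
    *dummy_no = n;
    return n;
}

/* Encode a reference to a dummy. If nobody assigned it a number, fall
   back to a fresh tag, which is then used but never defined. */
unsigned long encode_dummy(Tags *t, unsigned long dummy_no)
{
    if (dummy_no == LINK_NONE) return alloc_tag(t);
    assert(dummy_no < t->no);
    t->used[dummy_no] = 1;
    return dummy_no;
}

/* Count the tags that are used but have no definition. */
int count_undefined(const Tags *t)
{
    unsigned long n;
    int missing = 0;
    for (n = 0; n < t->no; n++) {
        if (t->used[n] && !t->defined[n]) missing++;
    }
    return missing;
}
```

In this model, a dummy that went through define_temporary encodes cleanly,
whereas calling encode_dummy with LINK_NONE leaves one tag that is used and
not defined, which mirrors the two constructor calls in the trace.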

The assignment of the field exp_dummy_no for the constructor 'A::A(const B &)'
is done in the function enc_init_tag [../output/init.c], in the case
exp_constr_tag. The function enc_init_tag is called by enc_addr_exp
[../output/exp.c] in the case exp_init_tag. If we take a look at the call tree
above, in the subtree corresponding to the call to the constructor
'C<double>::C(const double &)', we see that the exp_constr_tag is present,
but exp_address_tag and exp_init_tag (which are the tags that drive the control
flow from enc_exp to enc_init_tag) are missing. The absence of exp_init_tag is
significant, since the constructor MAKE_exp_init is used in only three places
throughout the frontend:

  - inside the function copy_exp [copy.c], with the obvious intention (this
    function is used to instantiate templates);

  - inside the function check_cond [statement.c], but in response to the
    expressions lex_while and lex_for, which are not relevant right now;

  - finally, inside the function make_temporary [initialise.c], which is used
    to build temporary objects.

It is obvious that we are interested in the invocations of make_temporary. I
have written a file almost identical to tok_recur.C, named tok_recur_e.C, which
is compiled without problems, and which differs from tok_recur.C only in that
the call to the member function A::operator<< is made explicitly inside the
function operator<<

template <class T>                                   // tok_recur_e.C, succeeds
inline A &operator << (B &b, const T &c) { return A(b).operator<<(C<T>(c)); }

instead of using an ambiguous expression, as in tok_recur.C,

template <class T>                                   // tok_recur.C, fails
inline A &operator << (B &b, const T &c) { return A(b) << C<T>(c); }

I have been comparing the treatment given to the constructor
'C<double>::C(const double &)' in both programs, and this bug looks similar
in origin to <t4m_bug_func_id>, in the sense that there are certain things
which are tried during the overload resolution, but which are delayed due to
the presence of free template parameters and, later on, during template
instantiation, these things are not properly redone/retried.

We know that the function copy_object [copy.c] is in charge of instantiating
templates. Let us trace the calls to copy_exp [copy.c] from copy_object when
it is generating the instance 'operator<< <double> (B &, const double &)'.
First we will do it for the file that is compiled correctly, tok_recur_e.C,

copy_exp, 65 exp_sequence_tag
  copy_exp, 84 exp_location_tag
  copy_exp, 84 exp_location_tag
    copy_exp, 73 exp_return_stmt_tag
      copy_exp, 81 exp_opn_tag
        apply_nary, 342 lex_func_Hop, cpy = 1
          copy_func_exp, 23 exp_call_tag
            copy_exp, 23 exp_call_tag
              copy_exp, 1 exp_member_tag
                rescan_member, template<class T> A &A::operator<<(const C<T> &)
              copy_exp, 49 exp_constr_tag
                copy_exp, 22 exp_func_id_tag
                  rescan_member, A::A(const B &)
                  copy_func_args, A::A(const B &)
                    copy_exp, 19 exp_address_tag /* Simple argument copy */
                      copy_exp, 87 exp_dummy_tag
                    copy_exp, 19 exp_address_tag /* Simple argument copy */
                      copy_exp, 17 exp_indir_tag
                        copy_exp, 0 exp_identifier_tag
                          rescan_member, b
          copy_exp_list
            copy_exp, 80 exp_op_tag
              expand_type, C<T>
                ...
                instance_type
                  copy_members
                    copy_member, template <class T> class C
                    copy_member, C<double>::c
                    copy_member, C<double>::C(const T &)
                    copy_member, C<double>::~C()
                    copy_member, C &C<double>::operator= (const C &)
              apply_unary, 185 lex_cast, cpy = 1
                copy_exp, 17 exp_indir_tag
                  copy_exp, 0 exp_identifier_tag
                    rescan_member, c
                make_cast_exp
                  convert_reference
                  cast_exp
                    resolve_cast
                    init_direct
                      init_constr, type_compound_tag
                        convert_constr
                          constr_candidates
                          resolve_overload, C<double>::C(const C<double> &)
                                          C<double>::C(const double &) [winner]
                          use_func_id, C<double>::C(const double &)
                            define_template, C<double>::C(const double &)
                          apply_constr
                            apply_func_id, C<double>::C(const double &)
                              cast_args
          make_func_exp, 23 exp_call_tag, rescan = 1
            rescan_func_id, template <class T> A &A::operator<< (const C<T> &)
              rescan_member
            resolve_call, template <class T> A &A::operator<< (const C<T> &)
              deduce_args
                inst_func_deduce
                   instance_func
            use_func_id, A &A::operator<< <double> (const C<double> &)
              define_template
            apply_func_id, A &A::operator<< <double> (const C<double> &)
              cast_args
                make_temporary, t = A, tag(t) = 11, tag(e) = 49
                init_assign, 6 type_ref_tag, t = const C<double> &
                  init_ref_lvalue /* Check base class conversions first */
                  convert_conv_aux
                  init_ref_lvalue
                  init_ref_rvalue
                    find_base_class
                    qualify_type
                    make_temporary, t = const C<double>, tag(t)=11, tag(e)=49
                    cast_class_class
                  make_ref_init
 
and with this (except for a call to init_assign that is probably made from
find_return_exp), all invocations return up to copy_object. [Note: in case we
want a wider view of this situation, the path from external_declaration
down to copy_object in tok_recur_e.C is identical to the one shown before for
tok_recur.C, and this subtree is also followed by a call to compile_function.]

In the above call trace, we must pay attention to two subtrees. The first
starts from

              apply_unary, 185 lex_cast, cpy = 1

[operator.c] and it is the trace of the call to constructor 'C<double>::C(const
double &)'. In fact, the subexpression 'C<T>(c)' that appears in the program is
an explicit cast from the type 'const T &' to the type 'C<T>', hence the name
of the lexical operator.

The second subtree starts from

            apply_func_id, A &A::operator<< <double> (const C<double> &)

[function.c] which, being called from make_func_exp [function.c], builds the
call to the member function 'A::operator<<'. To do this, it is necessary
to build two temporary objects, one of type 'A', another of type 'C<double>',
and pass them by reference. We see that this is done by cast_args [function.c]
calling make_temporary [initialise.c] on both types.

Now, we are going to trace the calls to copy_exp from copy_object in the case
of the program whose compilation fails, tok_recur.C,

copy_exp, 65 exp_sequence_tag
  copy_exp, 84 exp_location_tag
  copy_exp, 67 exp_decl_stmt_tag
    copy_exp, 65 exp_sequence_tag
      copy_exp, 84 exp_location_tag
        copy_exp, 73 exp_return_stmt_tag
          copy_exp, 19 exp_address_tag
            copy_exp, 17 exp_indir_tag
              copy_exp, 22 exp_func_id_tag
                rescan_member, A &A::operator<< <T> (const C<T> &)
                  rescan_member,template<class T> A &A::operator<<(const C<T>&)
                    apply_template
                      apply_func_templ
                        instance_func
                          expand_name [../parse/hash.c]
                            expand_type, template <class T> A & (const C<T> &)
                              ...
                              instance_type
                                copy_members
                                  copy_member,template <class T> class C
                                  copy_member,C<double>::c
                                  copy_member,C<double>::C(const T &)
                                  copy_member,C<double>::~C()
                                  copy_member,C &C<double>::operator=(const C&)
                    define_template,A &A::operator<< <double>(const C<double>&)
                copy_func_args, A &A::operator<< <double>(const C<double> &)
                  copy_exp, 19 exp_address_tag /* Simple argument copy */
                    copy_exp, 14 exp_init_tag
                      copy_exp, 49 exp_constr_tag
                        copy_exp, 22 exp_func_id_tag
                          rescan_member, A::A(const B &)
                          copy_func_args, A::A(const B &)
                            copy_exp, 19 exp_address_tag /* Simple arg. copy */
                              copy_exp, 87 exp_dummy_tag
                            copy_exp, 19 exp_address_tag /* Simple arg. copy */
                              copy_exp, 17 exp_indir_tag
                                copy_exp, 0 exp_identifier_tag
                                  rescan_member, b
                  copy_exp, 80 exp_op_tag /* Do implicit argument conversion */
                    expand_type, C<T>
                      ...
                      instance_type
                    apply_unary, 185 lex_cast, cpy = 1
                      copy_exp, 17 exp_indir_tag
                        copy_exp, 0 exp_identifier_tag
                          rescan_member, c
                      make_cast_exp
                        convert_reference
                        cast_exp
                          resolve_cast
                          init_direct
                            init_constr, type_compound_tag
                              convert_constr
                                constr_candidates
                                resolve_overload,C<double>::C(const C<double>&)
                                          C<double>::C(const double &) [winner]
                                use_func_id, C<double>::C(const double &)
                                  define_template, C<double>::C(const double &)
                                apply_constr
                                  apply_func_id, C<double>::C(const double &)
                                    cast_args
                  init_assign, 11 type_compound_tag, t = C<double>, e = c
                    init_direct, t = C<double>, e = c
                      init_constr, 11 type_compound_tag (the same arguments)
                        convert_constr t = C<double>, args = List(e)
                          constr_candidates
                          resolve_overload [C<double>::C(const C<double> &)]
                          use_func_id, C<double>::C(const C<double> &)
                          apply_constr
                            apply_trivial_func, n = 2 DEFAULT_COPY
                              convert_reference [convert.c]
                                ...
                              convert_class [assign.c]
                                ...

and with this all invocations return up to copy_object.

There are two important things in this tree. First, the subtree starting from

                    apply_unary, 185 lex_cast, cpy = 1

is identical to that of the file that is compiled correctly (tok_recur_e.C).
But, on the other hand, we do not see any call to make_temporary. There is a
mysterious phenomenon which gives us a hint about what is happening: the
respective calls to the function init_assign [initialise.c] in each file.
This function is in charge of generating the initialization of the temporary
object. In the file whose compilation succeeds (tok_recur_e.C), we have a
subtree that starts from

                init_assign, 6 type_ref_tag, t = const C<double> &

called from cast_args [function.c], where type_ref_tag is the tag of the type
of the argument of init_assign. But in the file whose compilation fails
(tok_recur.C), we have instead

                  init_assign, 11 type_compound_tag, t = C<double>, e = c

called from copy_func_args [copy.c], and this makes the compiler generate a
call to the copy constructor 'C<double>::C(const C<double> &)'. To understand
the cause of the difference between the tags, let us trace the call to
apply_func_id [function.c] on the function 'A::operator<< <T> (const C<T> &)'
in the compilation of tok_recur.C; this call happens earlier in the
compilation, after the overload resolution that takes place inside
'template <class T> operator<< (B &, const T &)',

use_func_id, A &A::operator<< <T> (const C<T> &)
  define_template /* Ignore template dependent instantiations */
apply_func_id, A &A::operator<< <T> (const C<T> &)
  cast_args
    make_temporary, t = A, tag(t) = 11, tag(e) = 49
    init_assign, t = const C<T> &, type_ref_tag
      init_ref_lvalue /* Check base class conversions first */
      convert_conv_aux, cast = CAST_IMPLICIT [construct.c]
        dependent_conv [template.c], returns 1
        cast_templ_type [cast.c]

The function cast_templ_type [cast.c] creates a lex_implicit operator that
contains, as a subexpression, the lex_cast operator that was present in the
source program; init_assign [initialise.c] returns immediately in the
conditional 'if ( eq_type ( r, t ) ) break ;'. The function cast_templ_type
puts the destination type of the cast ('const C<T> &', in the present case)
inside the lex_implicit operator. It is assumed that this implicit cast must
be reprocessed inside the function copy_func_args [copy.c], finishing what
cast_args [function.c] left undone. But there is a subtle bug in function
copy_func_args.

The problem is as follows. First, the function implicit_cast_exp [copy.c] is
called to check if the expression 'e' is an implicit cast (as it really is).
The function implicit_cast_exp returns null if its argument is not an
implicit cast; otherwise it returns the expression being cast, that is, the
original lex_cast expression that was put inside the lex_implicit operator:

	EXP e = DEREF_exp ( HEAD_list ( p ) ) ;
	EXP a = implicit_cast_exp ( e ) ;
	if ( !IS_NULL_exp ( a ) ) {
	    /* Do implicit argument conversion */
	    TYPE t ;
	    ERROR err = NULL_err ;
	    a = copy_exp ( a, NULL_type, NULL_type ) ;
	    t = DEREF_type ( exp_type ( a ) ) ;
	    e = init_assign ( t, cv_none, a, &err ) ;

It is fine that copy_exp is called with the argument 'a' without the
lex_implicit operator, since this was the original expression found in the
program, and we have already seen that the trace starting from
"copy_exp, 80 exp_op_tag" (just before "apply_unary, 185 lex_cast, cpy = 1")
is the same for both files. But the destination type of the cast,
'const C<T> &', which is the type of the formal parameter of the function that
is being applied, is lost when the lex_implicit operator is discarded, and the
type of the actual argument, t = 'C<double>', is used instead. That is why
init_assign [initialise.c] ends up generating a call to the copy constructor:
it thinks that the type of the actual argument and the type of the formal
parameter are the same type 'C<double>'.

Then, what we have to fix is the type 't' that is passed to the call

	    e = init_assign ( t, cv_none, a, &err ) ;

The type 't' should be taken from the expression 'e' (and not from 'a', as
happens now). Recall that the implicit cast expression 'e' was built in this
way inside the function cast_templ_type [cast.c],

EXP cast_templ_type
    PROTO_N ( ( t, a, cast ) )
    PROTO_T ( TYPE t X EXP a X unsigned cast )
{
    /* etc.
      t = const C<T> &, op = lex_implicit
     */
    MAKE_exp_op ( t, op, a, NULL_exp, e ) ;
    return ( e ) ;
}

and so the type 't' can be recovered simply by changing the line

	    t = DEREF_type ( exp_type ( a ) ) ;

for

	    t = DEREF_type ( exp_type ( e ) ) ;
	    t = expand_type ( t, 1 ) ;

It is necessary to expand the type, since normally it will be dependent on the
parameters of the template we are currently instantiating. In the example
tok_recur.C, before expansion 't' is 'const C<T> &', and after expansion 't' is
'const C<double> &'.
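
The effect of the change can be illustrated with a small self-contained C
model. It is only a sketch under stated assumptions: Type, Exp, init_kind,
implicit_inner, convert_arg_buggy and convert_arg_fixed are hypothetical
names, init_kind is a very rough stand-in for init_assign, and the template
expansion step (expand_type) is left out by using 'C<double>' directly.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum { TYPE_COMPOUND, TYPE_REF } TypeKind;
typedef struct { TypeKind kind; const char *cls; } Type;

typedef enum { OP_CAST, OP_IMPLICIT } Op;
typedef struct Exp { Type type; Op op; const struct Exp *arg; } Exp;

typedef enum { INIT_COPY_CONSTRUCTOR, INIT_BIND_TEMPORARY } InitKind;

/* Compare implicit_cast_exp: return the wrapped expression, or NULL if
   'e' is not an implicit cast. */
const Exp *implicit_inner(const Exp *e)
{
    return (e != NULL && e->op == OP_IMPLICIT) ? e->arg : NULL;
}

/* Decide how a value of type 'from' initialises a parameter of type
   'to' (both are assumed to name the same class). */
InitKind init_kind(Type to, Type from)
{
    assert(strcmp(to.cls, from.cls) == 0);
    if (to.kind == TYPE_REF && from.kind == TYPE_COMPOUND) {
        return INIT_BIND_TEMPORARY;     /* reference bound to a temporary */
    }
    return INIT_COPY_CONSTRUCTOR;       /* same class type: copy it */
}

/* Before the fix: the destination type is taken from the inner
   expression, so the parameter type carried by the wrapper is lost. */
InitKind convert_arg_buggy(const Exp *e)
{
    const Exp *a = implicit_inner(e);
    assert(a != NULL);
    return init_kind(a->type, a->type);
}

/* After the fix: the destination type is taken from the wrapper 'e'. */
InitKind convert_arg_fixed(const Exp *e)
{
    const Exp *a = implicit_inner(e);
    assert(a != NULL);
    return init_kind(e->type, a->type);
}
```

For a cast expression of type 'C<double>' wrapped in an implicit cast whose
type is a reference to 'C<double>', the first function selects the copy
constructor and the second binds a temporary, which is the difference between
the two init_assign traces shown earlier.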

With this change tok_recur.C compiles, and so do two variants that produced
the same error message:

  - tok_recur_f.C, where the template function 'A::operator<<' is not a member
                   function,

  - tok_recur_t.C, where the classes A and B are also template classes.