> Julian Foad <julianfoad@btopenworld.com> wrote on 02/27/2006 09:22:15
PM:
>
> > Paul Burba wrote:
> > Again, I have to ask, are you sure this is a known limitation and not
a
> bug
> > that IBM could fix next week if told about it? This is a
> > considerable chunk of
> > work-around code.
>
> Julian,
>
> We've approached IBM about this before but didn't get a response. As I
> write this we're attempting to get some answers via a different channel.
> I'll update you as soon as I hear something.
>
> One other thing; while putting a new description of the problem together
> for IBM I found the problem isn't that apr_proc_wait() always sets
> exitcode to 0, but rather that it doesn't set it at all (the
uninitialized
> value was simply 0). The impact of the problem is still the same.
Hi All,
Ok, down to the *LAST* core patch for OS400 V5R4 on trunk! Sorry for the
delay in getting back to this, I've been busy with other work and waiting
on responses from IBM. And please bear with me on this, I'm a neophyte
with APR processes but I'll do my best to address the various concerns
you've each raised.
First, re IBM fixing this, I heard back from them and here is what they
had to say:
"In the case where a pid is passed to apr_proc_wait, we see if there is a
helper job. If so, we free it and return APR_CHILD_DONE. You are correct
that we do not set the exitcode in this case. However if this function is
called with a -1 for a pid meaning to wait for all processes to end, we do
in some cases set the exitcode. Our platform has a different
implementation because our child processes are handled through helper
jobs."
(BTW none of this is documented in IBM's header files - see
http://publib.boulder.ibm.com/infocenter/iseries/v5r3/index.jsp?topic=/rzaie/apr_core_api/apr__thread__proc_8h-source.html
if you are curious)
The "helper job" mentioned is a new IBM addition to apr_proc_t:
typedef struct apr_proc_t {
.
.
.
#ifdef AS400
apr_int32_t cmdtype; /* @A1A */
void *myHelperJob;
#endif
} apr_proc_t;
Sure enough, if apr_proc_t *cmd_proc->pid is set to -1, that exitcode_val
is set properly when calling apr_proc_wait(cmd_proc, &exitcode_val,
&exitwhy_val, APR_WAIT) in svn_io_wait_for_cmd(). The problem is, of
course, IBM's claim that "we do in *some* cases set the exitcode". That's
not very reassuring! So back to IBM, I asked them:
* If there is documentation of how these helper jobs work in the
context of APR, apr_thread_proc.h doesn't have much to say.
* In what cases the exitcode gets set and when it doesn't.
* My first post mentioned that "there are some problems" besides
the above when running hooks on OS400. I should mention the
other one now; the script's stderr is not written to the
errfile specified by apr_procattr_child_err_set(). So even if
apr_proc_wait() could be relied upon to return the scripts exit
code (which is doubtful) I can't get the script's stderr. So my
final question for IBM was if it's even possible to read a child
process' stdout/stderr using APR calls.
This is the only answer I've gotten to any of these questions:
"I have been digging around in our code and see that apr_proc_create()
uses
spawn() via apr_spawn() to start the helper job. In apr_proc_create, we
set the environment variable QIBM_USE_DESCRIPTOR_STDIO to Y so that file
descriptors 0, 1,and 2 are used for stdin, stdout, and stderr. In
apr_proc_create, we will over-ride attr->child_in (stdin) with what is
passed to apr_proc_create which in turn gets used on the spawn().
Another significant comment I found is that we use socketpair instead of
pipes for better performance on our platform."
So basically IBM has tossed the "P" in APR out the window. The point of
this lengthy semi-rant is that *some* manner of work-around is needed for
OS400. Running a script, getting its exit code, and reading the script's
stderr just doesn't seem possible using IBM's APR.
Peter Lundblad <peter@famlundblad.se> wrote on 02/28/2006 03:09:16 AM:
> Because that'd be a deadlock:-) We create pipes to read the hooks'
> stderr output. We must read that output before waiting for the
> process, else the pipe buffers might fil up. (I think Paul's patch has
> the same problem.)
Philip Martin <philip@codematters.co.uk> wrote on 02/28/2006 05:12:30 PM:
> The reason we have two functions, start and wait, on other platforms
> is that if the script produces lots of output it fills the pipe and
> then blocks waiting for the pipe to be read. If at the same time the
> parent is blocked waiting for the child we get deadlock. It looks
> like your implementation suffers from the same problem, perhaps you
> should move the read before the wait?
You two are of course correct regarding the deadlock potential. I wrote a
script that writes several 1000 lines to strderr and that's exactly what
happens. This is fairly easy to fix in the context of my current patch or
within whatever approach is eventually adopted...
> First impression is that it's ugly. You have replaced the two
> functions svn_io_start_cmd and svn_io_wait_for_cmd with the single
> function svn_io_run_os400_cmd. I assume you did it like that because
> the two functions communicate via an apr_proc_t and that doesn't allow
> you to add extra data.
I did it to avoid adding two new OS400 specific functions when I could
just add one. Also, I wasn't aware of the deadlock issue at the time.
> Since this is a workaround for an APR bug I'm not sure why you chose
> to put a public function in libsvn_subr when the only caller is in
> libsvn_repos.
I'm hesitant to call it a "bug" so much as IBM going their own merry
(undocumented) way. Since I never thought of it as a bug, it seemed
appropriate to put it in libsvn_subr, just like svn_io_start_cmd() and
svn_io_wait_for_cmd(), which are also only called from libsvn_repos.
> Does the OS400 port support an external diff or
> external editor? Your patch doesn't seem to have addressed those.
The OS400 ports don't currently support either.
> An alternative would be to modify the existing svn_io_start_cmd and
> svn_io_wait_for_cmd functions to pass svn_io_proc_t instead of
> apr_proc_t, this would allow you to add any extra data you need. I
> think svn_io_proc_t could be opaque and so all the OS400 stuff could
> be private to io.c. It would mean that the external diff/editor code
> get fixed as well, if that makes a difference.
I've looked at this and momentarily (for about 15 glorious minutes earlier
today) I thought it was doable. But now I'm not sure how to make it work.
svn_io_start_cmd() works with apr_file_t * but spawn() works with int
file descriptors. I'm not sure how I can make spawn() and apr_file_t *
play nicely together, specifically how would I connect the spawn()ed
process' stderr to the apr_file_t *errfile passed to svn_io_start_cmd()?
Worse, if svn_io_proc_t is opaque, each item of additional information I
store in it would need a public settor function in libsvn_subr no?
> > Index: subversion/libsvn_subr/io.c
> > ===================================================================
> > --- subversion/libsvn_subr/io.c (revision 18637)
> > +++ subversion/libsvn_subr/io.c (working copy)
> > @@ -42,6 +42,10 @@
> > #include <apr_portable.h>
> > #include <apr_md5.h>
> >
> > +#if AS400
> > +#include <spawn.h> /* For spawn() and struct inheritance. */
> > +#endif
> > +
> > #include "svn_types.h"
> > #include "svn_path.h"
> > #include "svn_string.h"
> > @@ -54,6 +58,7 @@
> >
> > #ifdef AS400
> > #define SVN_UTF_UTOE_XLATE_HANDLE "svn-utf-utoe-xlate-handle"
> > +#define SVN_UTF_ETOU_XLATE_HANDLE "svn-utf-etou-xlate-handle"
>
> We already have one of those in libsvn_subr/cmdline.c
Yes, but how does that help me in io.c? Unless you mean these should be
made public...should svn_utf.h have all of these?
#define SVN_UTF_NTOU_XLATE_HANDLE "svn-utf-ntou-xlate-handle"
#define SVN_UTF_UTON_XLATE_HANDLE "svn-utf-uton-xlate-handle"
#ifdef AS400
#define SVN_UTF_UTOE_XLATE_HANDLE "svn-utf-utoe-xlate-handle"
#define SVN_UTF_ETOU_XLATE_HANDLE "svn-utf-etou-xlate-handle"
#endif
> I feel a bit guilty about suggesting that since I seem to be asking
> for all your patches to be re-written :-/
Heh, I'm not worried about that, I just want it done correctly. I see 3
possible options to solve this:
1. Tweak the existing patch to avoid the deadlock issue and call it a day.
2. Figure out how to make apr_file_t and spawn() work together.
3. Move the patch entirely within hooks.c. Just add a second definition
for run_hook_cmd() along with the existing implementation within a #ifndef
AS400...#else...#endif block.
FWIW I like option 3, it is certainly the smallest patch and introduces no
new APIs.
Thoughts?
Paul B.
P.S. Why do I feel like Grasshopper on Kung Fu? You guys can fight over
who gets to be Master Po.
_____________________________________________________________________________
Scanned for SoftLanding Systems, Inc. and SoftLanding Europe Plc by IBM Email Security Management Services powered by MessageLabs.
_____________________________________________________________________________
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org
Received on Fri Mar 10 22:01:00 2006