                        Logtail Development Log
                        

2017 April 2

Began log for development of version 1.3.

Transmogrified into Perl 5 with strict and warnings set.  Got a
clean compile check.

Perl 5 with strict set does not allow our trick of using the name
of a file as its file handle (for example):
    my $f = "/usr/log/whatever";
    open($f, "<$f");
I added a new %handles hash, used with the file name as the key,
which holds the file handle corresponding to the file name.
Except for where a file handle is required, the file name continues
to be what we pass around.  Note that the -s (size) test operator
works on either a file name or a handle; this comes in handy.
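
A minimal sketch of the %handles arrangement, assuming each file is
opened on a lexical handle which is then stored under the file name.
The openLog() helper and the scratch file are hypothetical and serve
only to illustrate the idea; they are not taken from logtail.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my %handles;                    # File name => open file handle

#   Open a log file and remember its handle under the file name
#   (openLog is a hypothetical helper, not taken from logtail)
sub openLog {
    my ($fname) = @_;
    open(my $fh, '<', $fname) or die "Cannot open $fname: $!";
    $handles{$fname} = $fh;
    return $fh;
}

#   Make a small scratch file to stand in for a real log
my ($tfh, $f) = tempfile(UNLINK => 1);
print $tfh "one line of log\n";
close($tfh);

openLog($f);
my $sizeByName   = -s $f;               # -s on the file name
my $sizeByHandle = -s $handles{$f};     # -s on the stored handle
print "$sizeByName $sizeByHandle\n";
```

Both size tests report the same value, so code which only needs the
size can keep passing the file name around.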

Added:
    use Socket;
    use POSIX ":sys_wait_h";        # To define WNOHANG for waitpid()
which allows getting rid of our gnarly local definitions of
constants for the socket-related and waitpid functions.  I also
got rid of the warning about these values in the comments and
the little embedded C program for setting them.
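
For illustration, here is a small sketch of non-blocking child
reaping with the WNOHANG constant supplied by POSIX.  The
reapChildren() function and the polling loop are hypothetical; how
logtail itself calls waitpid() may well differ.

```perl
use strict;
use warnings;
use POSIX ":sys_wait_h";        # Defines WNOHANG

#   Reap any finished children without blocking; returns the
#   list of process IDs reaped (reapChildren is hypothetical)
sub reapChildren {
    my @reaped;
    while ((my $pid = waitpid(-1, WNOHANG)) > 0) {
        push(@reaped, $pid);
    }
    return @reaped;
}

my $child = fork();
die "fork failed: $!" unless defined $child;
if ($child == 0) {
    exit(0);                    # Child: finish immediately
}

#   Parent: poll until the child has been reaped
my @done;
until (@done) {
    @done = reapChildren();
    select(undef, undef, undef, 0.05) unless @done;
}
print "Reaped child $done[0]\n";
```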

Replaced title in logtail.html, which had originally been prepared
with a grey background in mind, with a similar-looking logo made
with the Google Font "Varela Round".  I may screenshot this and turn
it back into an image to avoid the dependency on Google Fonts, but
I'll leave it this way for now to facilitate further experimentation.

Added a hanging indent to the SYNOPSIS line in logtail.html.

Updated Perl version used for development in documentation.

2017 April 3

Rewrote the handling of the "-e" option to use getaddrinfo()
from Socket.  This should handle both IPv4 and IPv6 addresses, either
specified as a hostname or as an address.  From preliminary
testing it looks OK, but it needs substantially more qualification
before we're sure it works for all cases.
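
A sketch of how getaddrinfo() from Socket handles both address
families.  The lookupHost() function, the datagram socket type, and
the port number passed in are assumptions made for the example, not
the actual "-e" option code.

```perl
use strict;
use warnings;
use Socket qw(:addrinfo SOCK_DGRAM AF_INET AF_INET6);

#   Resolve a host name or numeric address to a list of
#   [family, packed sockaddr] pairs (lookupHost is hypothetical;
#   SOCK_DGRAM is an assumption about the socket type)
sub lookupHost {
    my ($host, $port, $numericOnly) = @_;
    my %hints = (socktype => SOCK_DGRAM);
    $hints{flags} = AI_NUMERICHOST if $numericOnly;
    my ($err, @results) = getaddrinfo($host, $port, \%hints);
    die "Cannot resolve $host: $err" if $err;
    return map { [ $_->{family}, $_->{addr} ] } @results;
}

#   Numeric addresses need no DNS query, so these work off-line
my @v4 = lookupHost("192.0.2.1", 9875, 1);
my @v6 = lookupHost("2001:db8::1", 9875, 1);
print scalar(@v4), " IPv4 result(s), ", scalar(@v6), " IPv6 result(s)\n";
```

Passing a host name with $numericOnly false goes through the
resolver and may return results of both families.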

Rewrote IPv4 host name lookup code in resolution() to use
inet_pton() and getnameinfo().  This will allow extending resolution
to handle IPv6 addresses.

Added code in resolution() and resfork() to identify IPv6 addresses
in the strings it is scanning.  In resolution(), IPv6 addresses
are packed with sockaddr_in6() before calling getnameinfo().
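
The sequence of calls can be sketched as follows.  The hostOf()
function and its arguments are hypothetical, and the sketch uses the
explicit pack_sockaddr_in() and pack_sockaddr_in6() forms rather
than the context-sensitive sockaddr_in[6]() wrappers.

```perl
use strict;
use warnings;
use Socket qw(AF_INET AF_INET6 inet_pton pack_sockaddr_in
              pack_sockaddr_in6 getnameinfo NI_NUMERICHOST NIx_NOSERV);

#   Reverse-resolve a textual IPv4 or IPv6 address.  Returns the
#   host name, or undef on failure.  (hostOf is hypothetical.)
sub hostOf {
    my ($ip, $v6, $flags) = @_;
    my $packed = inet_pton($v6 ? AF_INET6 : AF_INET, $ip);
    return undef unless defined $packed;    # Not a valid address
    my $sockaddr = $v6 ? pack_sockaddr_in6(0, $packed)
                       : pack_sockaddr_in(0, $packed);
    my ($err, $host) = getnameinfo($sockaddr, $flags || 0, NIx_NOSERV);
    return $err ? undef : $host;
}

#   NI_NUMERICHOST skips the DNS query, handy for testing off-line
print hostOf("127.0.0.1", 0, NI_NUMERICHOST), "\n";
print hostOf("::1", 1, NI_NUMERICHOST), "\n";
```

Calling hostOf() with $flags of zero performs a real reverse lookup.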

The code which creates the SOCKDNS socket, used to communicate
resolved hosts from the forked working process back to its
parent process, created the socket as long as the -n option was
not specified, but did not check for the presence of the -r option.
Thus, the socket was created even when we weren't resolving hosts.
I fixed it to only create the socket if we're resolving and
forking worker processes.

The checkDNS() function went ahead and performed the test for
input from SOCKDNS even if we weren't resolving hosts.  This wasted
time before, and now causes an error because we no longer create
SOCKDNS when not resolving.  checkDNS() now only proceeds
with the test if both $resolve and $forking are set.

The handling of the select() function within checkDNS() was completely
messed up due to my total misunderstanding of how it worked, and the
only reason previous versions of the program didn't hang there is
due to another error in the scope of the $dnsocks file descriptor
bit mask used within it.  When I declared this variable at the
proper global scope, the select() problem manifested itself and
caused hangs.  I was testing the result from select(), thinking it
was the number of ready file descriptors among those specified by
the masks, but in fact it's the total number of ready file descriptors
regardless of the mask.  You have to test the value stored into the
first argument against the mask passed in that argument to determine
if any of the descriptors you're interested in are ready to read.
Rewrote the code accordingly and now checkDNS seems to be behaving
properly.
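
A minimal sketch of polling a descriptor with the four-argument
select() and then testing the bit in the mask which select() stores
back into its first argument.  It ignores the return value entirely.
The readyToRead() function is hypothetical, and a pipe stands in for
SOCKDNS here.

```perl
use strict;
use warnings;
use IO::Handle;

#   Return true if the given handle has data waiting to be read,
#   without blocking (readyToRead is hypothetical)
sub readyToRead {
    my ($fh) = @_;
    my $rin = '';
    vec($rin, fileno($fh), 1) = 1;      # Mask of descriptors of interest
    my $rout = $rin;                    # select() overwrites its argument
    select($rout, undef, undef, 0);     # Zero timeout: poll only
    return vec($rout, fileno($fh), 1) ? 1 : 0;
}

pipe(my $rd, my $wr) or die "pipe failed: $!";
$wr->autoflush(1);
my $before = readyToRead($rd);          # Nothing written yet
print $wr "resolved host\n";
my $after = readyToRead($rd);           # Data now pending
print "$before $after\n";
```

Copying the mask before each call matters because select() modifies
the mask it is given.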

2017 April 4

Parsing of IPv4 and IPv6 addresses in blocks of messages processed by
resfork() and resolution() is complicated by the fact that in Perl
there is no way to write a regular expression in which an alternation
(x|y) matches the first of the alternatives in the string.  The
match will, if the searched string contains x, always return the
match for x even if there is a match for y earlier in the string.
The only way to work around this is to do two separate matches,
saving the before and after match strings, and choose the one with
the shorter before-match string.  This is sufficiently messy, and
used in both resolution() and resfork(), that I encapsulated it in
a new function, nextIP(), which, called with a string containing one
or more log items, returns:
    ($ip, $head, $tail, $v6) = nextIP(<string>)
It recognises both IPv4 and IPv6 addresses.  If no IP address is found
in the <string>, it returns an $ip of undef.
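
The two-match approach can be sketched as below.  The patterns are
deliberately simplified stand-ins (the IPv6 one recognises only the
full eight-group form), and the values returned alongside an undef
$ip are an assumption; the real nextIP() may differ on both counts.

```perl
use strict;
use warnings;

#   Deliberately simplified patterns, for illustration only
my $ipv4 = qr/(?:\d{1,3}\.){3}\d{1,3}/;
my $ipv6 = qr/(?:[0-9A-Fa-f]{1,4}:){7}[0-9A-Fa-f]{1,4}/;   # Full form only

#   Return ($ip, $head, $tail, $v6) for the first IP address in
#   $s, or an $ip of undef if there is none
sub nextIP {
    my ($s) = @_;
    my (@m4, @m6);
    if ($s =~ m/^(.*?)($ipv4)(.*)$/s) {
        @m4 = ($2, $1, $3, 0);
    }
    if ($s =~ m/^(.*?)($ipv6)(.*)$/s) {
        @m6 = ($2, $1, $3, 1);
    }
    #   Assumption: with no match, hand back the whole string as $head
    return (undef, $s, '', 0) unless @m4 || @m6;
    return @m6 unless @m4;
    return @m4 unless @m6;
    #   Both matched: the shorter before-match string came first
    return length($m4[1]) <= length($m6[1]) ? @m4 : @m6;
}

my ($ip, $head, $tail, $v6) = nextIP("GET from 192.0.2.7 ok");
print "ip=$ip head='$head' tail='$tail' v6=$v6\n";
```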

Integrated nextIP() into resfork() and resolution().

Replaced the kludge parsing of IPv4 addresses in nextIP() with the
IPv4 regular expression from the IPv6_re package.

Updated logtail.html and logtail.1 to reflect the changes to the
host name resolver, clarify the operation of the "c" option, and
note that the "d" option must be specified on the machine receiving
the logs, not those sending them.

2017 April 5

Replaced the Google Fonts title in logtail.html with a screen shot
of the rendered text.  This eliminates an external dependency in the
document.

Updated the Makefile to reflect the current files in the release.
This will need to be reviewed again before the final release is
made.

Rewrote the log echo send and receive code to use the facilities
of IO::Socket::IP.  This makes the log echo mechanism work with
either IPv4 or IPv6.  There is a change in the command line semantics
as a result of this, which I believe is for the better.  Previously,
all echo sockets created with the "-e" option would use the echo
talk port last specified with a "-t" option.  Now, the most recent
"-t" specification is used for each "-e" option.  This allows sending
echo information to different hosts, each with their own listen port.
We try to create the listen socket as IPv6, which will also accept
packets from IPv4 and, if that fails because the system does not
support IPv6, fall back to an IPv4-only socket.
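
A sketch of the IPv6-first, IPv4-fallback listen socket using
IO::Socket::IP.  The openListener() function is hypothetical, and
the use of UDP and of port 0 (so the system picks a free port) are
assumptions made to keep the example self-contained.

```perl
use strict;
use warnings;
use IO::Socket::IP;
use Socket qw(AF_INET AF_INET6);

#   Open a UDP listen socket, preferring dual-stack IPv6 and
#   falling back to IPv4 only (openListener is hypothetical)
sub openListener {
    my ($port) = @_;
    my $sock = IO::Socket::IP->new(
        Family    => AF_INET6,
        LocalPort => $port,
        Proto     => 'udp',
        V6Only    => 0,         # Also accept IPv4 on this socket
    );
    $sock ||= IO::Socket::IP->new(
        Family    => AF_INET,
        LocalPort => $port,
        Proto     => 'udp',
    );
    return $sock;
}

my $listener = openListener(0);     # Port 0: let the system choose
die "Cannot create listen socket: $@" unless $listener;
print "Listening on port ", $listener->sockport(), "\n";
```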

For some screwball reason, with the listen socket port set to
the default of 5741, DNS messages from the child process to the parent
process, sent on 5742, never arrived.  It seems like almost any other
port works fine.  I changed the default port to 9875 and now it
seems to work OK with the default.  On reflection, it might make
sense to replace this UDP socket connection with a pipe pair, but
I'm not sure we won't run into horrors trying to use a pipe to
communicate from multiple child processes back to a single parent,
and I'm not in the mood for another half-week down another rabbit
hole.

Added another little twist in resolution().  It is common to have
several occurrences of the same IP address in one block of log
items.  We will usually satisfy all but the first from the cache,
but we were sending a timestamp update back to the parent process
each time we found a given IP address in the cache.  I added a second
cache so that we'll only ever send one timestamp update for a given
IP address, regardless of how many times it occurs in a block.
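
The second cache amounts to a hash of addresses already reported.
In this sketch every name is hypothetical, the update message is
replaced by a push onto an array, and the second cache is assumed to
be reset for each block.

```perl
use strict;
use warnings;

my %cache = ("192.0.2.7" => "host.example.com");    # Resolved names
my @updatesSent;                # Stand-in for messages to the parent

#   Look up every address in one block of log items, sending at
#   most one timestamp update per address (all names hypothetical)
sub lookupBlock {
    my (@ips) = @_;
    my %notified;               # Second cache, local to this block
    foreach my $ip (@ips) {
        next unless exists $cache{$ip};     # Not cached: nothing to update
        next if $notified{$ip}++;           # Already reported this one
        push(@updatesSent, $ip);
    }
}

lookupBlock("192.0.2.7", "192.0.2.7", "192.0.2.7");
print scalar(@updatesSent), "\n";
```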

The nextIP() function could identify something which looked like an
IP address (particularly an IPv4 address) but was actually embedded
within something which wasn't, such as a browser version number.
This would lead to library warnings when trying to parse the supposed
IP address in sockaddr_in[6]().  I added code in nextIP() for both
IPv4 and IPv6 addresses which looks for adjacent characters which would
be valid as part of that kind of address and rejects the match if
one is present.  Due to the logic of nextIP(), this makes things
even more tangled, with a while loop wrapped around each of the
matches and the ability to restart the match after a bogus match is
discovered.  This is all before we check, when we had both IPv4 and
IPv6 matches, which came first in the log block.  This is
what we get for trying to parse unstructured text for things
which may or may not be an IP address.
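
The adjacent-character test can be sketched for the IPv4 case as
below.  The firstIPv4() function and its simplified pattern are
hypothetical; the real code also covers IPv6 and is organised
differently.

```perl
use strict;
use warnings;

#   Return the first dotted quad in $s which is not embedded in a
#   longer run of digits and periods, or undef if there is none
#   (firstIPv4 is hypothetical)
sub firstIPv4 {
    my ($s) = @_;
    while ($s =~ m/((?:\d{1,3}\.){3}\d{1,3})/g) {
        my $ip = $1;
        my $before = substr($s, 0, $-[1]);  # Text ahead of the match
        my $after  = substr($s, $+[1]);     # Text following the match
        #   Bogus if a neighbouring character could belong to an
        #   address; the loop then resumes matching after this one
        next if $before =~ m/[\d.]$/ || $after =~ m/^(?:\d|\.\d)/;
        return $ip;
    }
    return undef;
}

my $found = firstIPv4("Chrome/1.2.3.4.5 from 10.0.0.1.");
print defined $found ? $found : "none", "\n";
```

A period which merely ends a sentence after the address does not
cause rejection, since it is not followed by a digit.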

2017 April 6

Tested relay from AWS to AWS via both IPv4 and IPv6 with resolution
performed both on the sending and receiving end.  It appears to be
OK.

Reimplemented the "-d" option, which was lost in the addition of
IPv6 support to echoing log entries.  If the "-d" option is specified,
the full name of the sending host (host name or IP address) will be shown.
Otherwise, if the name has been resolved into a host name with
domain information, just the host name will be shown.  If the first
item before the leftmost period is decimal numeric, we assume the name
is an IPv4 address and show it in full.
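
These display rules can be sketched as a small function.  The
displayName() name and its arguments are hypothetical; the flag
stands in for whether "-d" was given.

```perl
use strict;
use warnings;

#   Choose the name to display for a sending host.  $full is true
#   when the "-d" option was given (displayName is hypothetical).
sub displayName {
    my ($name, $full) = @_;
    return $name if $full;
    my ($first) = split(/\./, $name);
    return $name unless defined $first && length $first;
    return $name if $first =~ m/^\d+$/;     # Looks like IPv4: keep whole
    return $first;                          # Host name without domain
}

print displayName("www.example.com", 0), "\n";
print displayName("192.0.2.7", 0), "\n";
```

An IPv6 address contains no period, so it passes through unchanged.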

If writing to the socket to transmit log items to an echo host
failed, deliver() would die, possibly crashing the program.  Since
network links and hosts go up and down all the time, we don't want
to kill the program on a monitored host when one of the hosts to
which it's echoing goes down.  I removed the die (commenting it out
in case we want to re-enable for debugging) so that errors echoing
to other hosts will be ignored.

Commented out the code at the top which invokes the Carp package
for library error tracebacks.  When it's needed, just turn it back
on.

Put the current version into production on AWS (running with the
standard options: no resolution, no echo).  I'll let it run and
see if anything blows up.

Release version 1.3.
