Re: cvs commit: src/usr.sbin/jexec jexec.8 jexec.c

From: Robert Watson <rwatson_at_FreeBSD.org> Date: Fri, 30 May 2008 10:46:29 +0100 (BST)

On Fri, 30 May 2008, Michael Reifenberger wrote:

>> Here's the specific concern I have: an administrator starts a jail with a 
>> name/IP number, and various processes run, creating TCP connections.  The 
>> administrator shuts down the jail to change global configuration for the 
>> jail, but some TCP connections remain in TIME_WAIT (etc) as they spin down. 
>> The administrator then restarts the jail.  For some number of minutes after 
>> starting the new jail, jexec will fail with the new jail, as the 
>> specification by IP or hostname will be ambiguous.
>
> Really? What would jls show during the 2-minute period? If your statement 
> is true then jls is broken also because the displayed information can't be 
> trusted.

The bug is not in jail: it behaves as designed.  The bug is not in jls: it 
accurately reports the kernel jail state.  The bug is in your code, which 
relies on an underlying assumption that does not reflect the reality of how 
jail works.  The assumptions you've made sound useful, and if you read the 
replies you've been receiving, they are about how to make those assumptions 
correct, or at least provide a facility about which similar assumptions can be 
made.  This is not the first time these issues have been discussed--in fact, 
they came up at the Devsummit in the context of how to handle Audit and jails, 
where there is a desire to have a unique, persistent, administrator-defined 
identifier for each jail.

However, to return to the point at hand: the assumptions do not currently 
hold, and as a result, the changes you've made to jexec(8) are fundamentally 
inconsistent, unreliable, and potentially quite confusing for administrators. 
You cannot assume that jails without processes in them are garbage-collected 
immediately -- in fact, they may persist for seconds or minutes after the 
last process exits.  Hence my request: please don't MFC the changes to 
jexec(8) until they behave in a consistent, reliable, and 
administrator-friendly way.

As an example of the output you might reasonably see: I created a jail on 
zoo.FreeBSD.org, telnet'd to freefall.FreeBSD.org's SSH port, and then 
abruptly closed the telnet connection from the client side.  I then exited the 
jail, created a second jail, repeated the process, and exited the jail.  This 
leads to two jails referenced only by two TIME_WAIT state TCP connections:

[zoo]# !jls
    JID  IP Address      Hostname                      Path
      6  64.7.141.9      localhost                     /
      5  64.7.141.9      localhost                     /

[zoo]# !netstat
netstat -n | grep 22
tcp4       0      0  64.7.141.9.61843       69.147.83.40.22        TIME_WAIT
tcp4       0      0  64.7.141.9.59204       69.147.83.40.22        TIME_WAIT

If I then start a third jail and leave it running, and want to use jexec, an 
IP address or hostname is legitimately ambiguous.  Hence my comments about 
having a counter/flag to track the number of processes in a jail in order to 
differentiate "live" vs "dead" jails.
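
To make the ambiguity concrete, here is a minimal sketch in Python (not 
jexec code) of resolving a jail by IP address or hostname.  The Jail record, 
its nprocs field, the resolve() helper, and the third jail with JID 7 are all 
assumptions made up for illustration; nprocs stands in for the proposed 
per-jail process counter, which does not exist today.

```python
from dataclasses import dataclass


@dataclass
class Jail:
    jid: int
    ip: str
    hostname: str
    nprocs: int  # hypothetical process counter; not an existing kernel field


def resolve(jails, spec, live_only=False):
    """Return the JID of the single jail whose IP or hostname equals spec."""
    matches = [j for j in jails if spec in (j.ip, j.hostname)]
    if live_only:
        # Skip "dead" jails that are only kept alive by lingering
        # references such as TIME_WAIT connections.
        matches = [j for j in matches if j.nprocs > 0]
    if len(matches) != 1:
        raise LookupError(f"{spec!r} matches {len(matches)} jails")
    return matches[0].jid


# JIDs 5 and 6 mirror the jls output above; JID 7 is the hypothetical
# third jail that is started and left running.
jails = [
    Jail(5, "64.7.141.9", "localhost", 0),
    Jail(6, "64.7.141.9", "localhost", 0),
    Jail(7, "64.7.141.9", "localhost", 3),
]
```

Without the counter, resolve(jails, "64.7.141.9") raises LookupError because 
three jails match.  With live_only=True only the running jail is left, so the 
lookup is unambiguous.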

Robert N M Watson
Computer Laboratory
University of Cambridge