Provided by: charliecloud-builders_0.38-2_amd64

NAME

       ch-image - Build and manage images; completely unprivileged

SYNOPSIS

          $ ch-image [...] build [-t TAG] [-f DOCKERFILE] [...] CONTEXT
          $ ch-image [...] build-cache [...]
          $ ch-image [...] delete IMAGE_GLOB [IMAGE_GLOB ...]
          $ ch-image [...] gestalt [SELECTOR]
          $ ch-image [...] import PATH IMAGE_REF
          $ ch-image [...] list [-l] [IMAGE_REF]
          $ ch-image [...] pull [...] IMAGE_REF [DEST_REF]
          $ ch-image [...] push [--image DIR] IMAGE_REF [DEST_REF]
          $ ch-image [...] reset
          $ ch-image [...] undelete IMAGE_REF
          $ ch-image { --help | --version | --dependencies }

DESCRIPTION

       ch-image  is  a  tool  for building and manipulating container images, but not running them (for that you
       want ch-run). It is completely unprivileged, with no setuid/setgid/setcap helpers.  Many  operations  can
       use caching for speed. The action to take is specified by a sub-command.

       Options that print brief information and then exit:

          -h, --help
                 Print  help  and exit successfully. If specified before the sub-command, print general help and
                 list of sub-commands; if after the sub-command, print help specific to that sub-command.

          --dependencies
                 Report dependency problems on standard output, if any, and exit. If all is well,  there  is  no
                 output and the exit is successful; in case of problems, the exit is unsuccessful.

          --version
                 Print version number and exit successfully.

       Common options placed before or after the sub-command:

          -a, --arch ARCH
                 Use  ARCH  for  architecture-aware  registry  operations. (See section “Architecture” below for
                 details.)

          --always-download
                 Download all files when pulling, even if  they  are  already  in  builder  storage.  Note  that
                 ch-image  pull  will  always  retrieve  the  most  up-to-date  image; this option is mostly for
                 debugging.

          --auth Authenticate with the remote repository, then (if successful) make all subsequent  requests  in
                 authenticated  mode. For most subcommands, the default is to never authenticate, i.e., make all
                 requests anonymously. The exception is push, which implies --auth.

          --break MODULE:LINE
                 Set a PDB breakpoint at line number LINE of module named MODULE (typically  the  filename  with
                 .py  removed,  or __main__ for ch-image itself). That is, a PDB debugger shell will open before
                 executing the specified line.

                 This is accomplished by re-parsing the module, injecting import pdb; pdb.set_trace()  into  the
                 parse  tree,  re-compiling  the tree, and replacing the module’s code with the result. This has
                 various gotchas, including (1) module-level code  in  the  target  module  is  executed  twice,
                 (2) the option is parsed with bespoke early code so command line argument parsing itself can be
                 debugged,  (3) breakpoints  on  function  definition  will  trigger  while  the module is being
                 re-executed, not when the function is called (break on the first  line  of  the  function  body
                 instead), and (4) other weirdness we haven’t yet characterized.

          --cache
                 Enable build cache. Default if a sufficiently new Git is available. See section Build cache for
                 details.

          --cache-large SIZE
                 Set  the  cache’s  large  file  threshold  to  SIZE  MiB, or 0 for no large files, which is the
                 default. Values greater than zero can speed up many  builds  but  can  also  cause  performance
                 degradation.  Experimental. See section Large file threshold for details.

          --debug
                 Add  a  stack  trace  to  fatal  error  hints. This can also be done by setting the environment
                 variable CH_IMAGE_DEBUG.

          --no-cache
                 Disable build cache. Default if a sufficiently new Git is not available.  This option turns off
                 the cache completely; if you want to re-execute a Dockerfile  and  store  the  new  results  in
                 cache, use --rebuild instead.

          --no-lock
                 Disable  storage  directory locking. This lets you run as many concurrent ch-image instances as
                 you want against the same storage directory, which risks corruption but  may  be  OK  for  some
                 workloads.

          --no-xattrs
                 Enforce default handling of xattrs, i.e. do not save them in the build cache or restore them on
                 rebuild. This is the default, but the option is provided to override the $CH_XATTRS environment
                 variable.

          --password-many
                 Re-prompt the user every time a registry password is needed.

          --profile
                 Dump  profile  to  files  /tmp/chofile.p  (cProfile  dump  format)  and  /tmp/chofile.txt (text
                 summary).  You  can  convert  the  former  to  a  PDF  call  graph  with  gprof2dot  -f  pstats
                 /tmp/chofile.p | dot -Tpdf -o /tmp/chofile.pdf. This excludes time spent in subprocesses.
                 Profile data should still be written on fatal errors, but not if the program crashes.

          -q, --quiet
                 Be quieter; can be repeated. Incompatible with -v and suppresses --debug regardless  of  option
                 order. See the FAQ entry on verbosity for details.

          --rebuild
                 Execute all instructions, even if they are build cache hits, except for FROM which is retrieved
                 from cache on hit.

          -s, --storage DIR
                 Set the storage directory (see below for important details).

          --tls-no-verify
                 Don’t  verify TLS certificates of the repository. (Do not use this option unless you understand
                 the risks.)

          -v, --verbose
                 Print extra chatter; can be repeated. See the FAQ entry on verbosity for details.

          --xattrs
                 Save xattrs and ACLs in the build cache, and restore them when rebuilding from the cache.
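
       The parse-tree injection that --break relies on (described above) can be illustrated with Python's
       standard ast module. This is a minimal sketch, not Charliecloud's actual code: the function name
       inject_before_line is hypothetical, and a harmless statement is injected in place of
       import pdb; pdb.set_trace() so the example runs non-interactively.

```python
import ast


def inject_before_line(source, lineno, injected_src, filename="<patched>"):
    """Compile `source` with `injected_src` inserted immediately before the
    first statement that starts on line `lineno` (hypothetical helper)."""
    tree = ast.parse(source, filename)
    for node in ast.walk(tree):
        for field in ("body", "orelse", "finalbody"):
            stmts = getattr(node, field, None)
            if not isinstance(stmts, list):
                continue  # e.g. a lambda's body is an expression, not a list
            for i, stmt in enumerate(stmts):
                if isinstance(stmt, ast.stmt) and stmt.lineno == lineno:
                    new = ast.parse(injected_src).body
                    for n in new:
                        ast.copy_location(n, stmt)  # report the target's line
                    stmts[i:i] = new
                    ast.fix_missing_locations(tree)
                    return compile(tree, filename, "exec")
    raise ValueError("no statement starts on line %d" % lineno)


SOURCE = '''def f(log):
    log.append("a")
    log.append("b")
    return log
'''

# A real breakpoint would inject "import pdb; pdb.set_trace()" instead.
code = inject_before_line(SOURCE, 3, 'log.append("BREAK")')
namespace = {}
exec(code, namespace)          # re-executes module-level code, as noted above
print(namespace["f"]([]))      # ['a', 'BREAK', 'b']
```

       Because the whole module is re-compiled and re-executed, any module-level side effects run again,
       which is gotcha (1) in the option description.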

ARCHITECTURE

       Charliecloud provides the option --arch ARCH to specify the architecture for architecture-aware  registry
       operations.  The argument ARCH can be: (1) yolo, to bypass architecture-aware code and use the registry’s
       default architecture; (2) host, to use the host’s architecture, obtained with the equivalent of uname  -m
       (default  if  --arch  not  specified);  or (3) an architecture name. If the specified architecture is not
       available, the error message will list which ones are.

       Notes:

       1. ch-image is limited to one image per image reference in builder  storage  at  a  time,  regardless  of
          architecture.  For  example, if you say ch-image pull --arch=foo baz and then ch-image pull --arch=bar
          baz, builder storage will contain one image called “baz”, with architecture “bar”.

       2. Images’ default architecture is usually amd64, so this is  usually  what  you  get  with  --arch=yolo.
          Similarly,  if a registry image is architecture-unaware, it will still be pulled with --arch=amd64 and
          --arch=host  on  x86-64  hosts  (other  host  architectures   must   specify   --arch=yolo   to   pull
          architecture-unaware images).

       3. uname  -m  and image registries often use different names for the same architecture. For example, what
          uname -m reports as “x86_64” is known to  registries  as  “amd64”.  --arch=host  should  translate  if
          needed,  but  it’s useful to know this is happening.  Directly specified architecture names are passed
          to the registry without translation.

       4. Registries treat architecture as a pair of items, architecture and sometimes variant (e.g., “arm”  and
          “v7”).  Charliecloud  treats  architecture  as  a simple string and converts to/from the registry view
          transparently.
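
       The name handling in notes 3 and 4 can be sketched as follows. This is an illustration under stated
       assumptions, not Charliecloud's implementation: the function names are hypothetical, the translation
       table is deliberately tiny (only x86_64 to amd64 comes from the text above), and the "/" separator
       between architecture and variant in the simple string form is an assumption.

```python
import platform

# Illustrative table. x86_64 -> amd64 is from the notes above; aarch64 -> arm64
# is another common pairing. A real table would need more entries.
UNAME_TO_REGISTRY = {"x86_64": "amd64", "aarch64": "arm64"}


def resolve_arch(arch="host", machine=None):
    """Return the architecture string to request, or None for 'yolo'."""
    if arch == "yolo":
        return None                      # bypass architecture-aware code
    if arch == "host":
        machine = machine or platform.machine()   # like `uname -m`
        return UNAME_TO_REGISTRY.get(machine, machine)
    return arch                          # explicit names are not translated


def to_registry_pair(arch):
    """Split the simple string into (architecture, variant-or-None).
    The '/' separator is an assumption made for this sketch."""
    name, _, variant = arch.partition("/")
    return name, (variant or None)


def from_registry_pair(name, variant=None):
    """Inverse of to_registry_pair()."""
    return "%s/%s" % (name, variant) if variant else name
```

       For example, resolve_arch("host", machine="x86_64") yields "amd64", while resolve_arch("x86_64")
       returns "x86_64" unchanged, which is why a directly specified name must already be in registry form.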

AUTHENTICATION

       Charliecloud does not have configuration files; thus, it  has  no  separate  login  subcommand  to  store
       secrets.  Instead,  Charliecloud  will  prompt for a username and password when authentication is needed.
       Note that some repositories refer to the secret as something other than a “password”; e.g., GitLab  calls
       it a “personal access token (PAT)”, Quay calls it an “application token”, and nVidia NGC calls it an “API
       token”.

       For   non-interactive   authentication,   you   can   use  environment  variables  CH_IMAGE_USERNAME  and
       CH_IMAGE_PASSWORD. Only do this if you fully understand the implications  for  your  specific  use  case,
       because it is difficult to securely store secrets in environment variables.

       By  default  for  most subcommands, all registry access is anonymous. To instead use authenticated access
       for everything, specify --auth or set the environment variable $CH_IMAGE_AUTH=yes. The exception is push,
       which always runs in authenticated mode. Even for pulling public images, it can be useful to authenticate
       for registries that have per-user rate limits, such  as  Docker  Hub.  (Older  versions  of  Charliecloud
       started  with  anonymous  access, then tried to upgrade to authenticated if it seemed necessary. However,
       this turned out to be brittle; see issue #1318.)

       The username and password are remembered for the life of the  process  and  silently  re-offered  to  the
       registry  if  needed.  One  case when this happens is on push to a private registry: many registries will
       first offer a read-only token when  ch-image  checks  if  something  exists,  then  re-authenticate  when
       upgrading  the token to read-write for upload. If your site uses one-time passwords such as provided by a
       security device, you can specify --password-many to provide a new secret each time.

       These values are not saved persistently, e.g. in a file. Note that we do use normal Python variables  for
       this information, without pinning them into physical RAM with mlock(2) or any other special treatment, so
       we cannot guarantee they will never reach non-volatile storage.

          Technical details

                 Most   registries   use  something  called  Bearer  authentication,  where  the  client  (e.g.,
                 Charliecloud) includes a token in the headers of every HTTP request.

                 The authorization dance is different from the typical UNIX approach, where there is a  separate
                 login  sequence  before  any content requests are made.  The client starts by simply making the
                 HTTP request it wants (e.g., to GET an image manifest), and if the registry  doesn’t  like  the
                 client’s  token  (or  if there is no token because the client doesn’t have one yet), it replies
                 with HTTP 401 Unauthorized, but crucially it also provides instructions in the response  header
                 on  how  to  get a token. The client then follows those instructions, obtains a token, re-tries
                 the request, and (hopefully) all is well. This approach also allows a client to upgrade a token
                 if needed, e.g. when transitioning from asking if a layer exists to uploading its content.

                 The distinction between Charliecloud’s anonymous mode and authenticated modes is that  it  will
                 only ask for anonymous tokens in anonymous mode and authenticated tokens in authenticated mode.
                 That  is,  anonymous  mode does involve an authentication procedure to obtain a token, but this
                 “authentication” is done anonymously. (Yes, it’s confusing.)

                 Registries also often reply HTTP 401 when an image does not exist, rather  than  the  seemingly
                 more  correct  HTTP  404 Not Found. This is to avoid information leakage about the existence of
                 images the client is not allowed to pull, and it’s why Charliecloud never says an image  simply
                 does not exist.
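
       The "instructions in the response header" step can be sketched like this. It assumes the common
       registry convention of a WWW-Authenticate header carrying realm, service, and scope parameters; the
       host names and repository below are hypothetical, the function names are invented for illustration,
       and no network request is made.

```python
import re
from urllib.parse import urlencode


def parse_bearer_challenge(header):
    """Parse a 'Bearer key="value",...' challenge into a dict."""
    scheme, _, params = header.partition(" ")
    if scheme.lower() != "bearer":
        raise ValueError("not a Bearer challenge: %r" % scheme)
    return dict(re.findall(r'(\w+)="([^"]*)"', params))


def token_request_url(challenge):
    """Build the URL the client would GET to obtain a token."""
    query = {k: challenge[k] for k in ("service", "scope") if k in challenge}
    return challenge["realm"] + "?" + urlencode(query)


# Hypothetical header returned alongside HTTP 401 Unauthorized.
header = ('Bearer realm="https://auth.example.com/token",'
          'service="registry.example.com",'
          'scope="repository:foo/bar:pull"')
challenge = parse_bearer_challenge(header)
print(token_request_url(challenge))
```

       In anonymous mode the client would fetch that URL without credentials; in authenticated mode it
       would supply the username and password with the same request. Retrying the original request with
       the returned token completes the dance.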

STORAGE DIRECTORY

       ch-image  maintains  state  using normal files and directories located in its storage directory; contents
       include various caches and temporary images used for building.

       In descending order of priority, this directory is located at:

          -s, --storage DIR
                 Command line option.

          $CH_IMAGE_STORAGE
                 Environment variable. The path must be absolute, because the variable is likely set in a very
                 different context than when it’s used, which makes it error-prone to determine what a relative
                 path would be relative to.

          /var/tmp/$USER.ch
                 Default. (Previously, the default was /var/tmp/$USER/ch-image. If a valid storage directory  is
                 found at the old default path, ch-image tries to move it to the new default path.)
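
       The priority order above can be expressed as a short function. This is a sketch only: the function
       name storage_dir is hypothetical, the environment is passed in as a mapping for testability, and the
       migration from the old default path is not modeled.

```python
import os


def storage_dir(cli_storage=None, environ=None):
    """Pick the storage directory: -s/--storage, then $CH_IMAGE_STORAGE,
    then /var/tmp/$USER.ch."""
    environ = os.environ if environ is None else environ
    if cli_storage is not None:
        return cli_storage                      # command line option wins
    env = environ.get("CH_IMAGE_STORAGE")
    if env is not None:
        if not os.path.isabs(env):
            raise ValueError("$CH_IMAGE_STORAGE must be an absolute path")
        return env
    # Falling back to "unknown" when $USER is unset is an assumption.
    return "/var/tmp/%s.ch" % environ.get("USER", "unknown")
```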

       Unlike  many  container  implementations,  there is no notion of storage drivers, graph drivers, etc., to
       select and/or configure.

       The storage directory can reside on any single filesystem (i.e.,  it  cannot  be  split  across  multiple
       filesystems).  However, it contains lots of small files and metadata traffic can be intense. For example,
       the Charliecloud test suite uses approximately 400,000 files and directories in the storage directory  as
       of  this  writing.  Place  it  on a filesystem appropriate for this; tmpfs’es such as /var/tmp are a good
       choice if you have enough RAM (/tmp is not recommended because ch-run bind-mounts it into  containers  by
       default).

       While  you  can  currently  poke  around  in the storage directory and find unpacked images runnable with
       ch-run, this is not a supported use case. The supported workflow  uses  ch-convert  to  obtain  a  packed
       image; see the tutorial for details.

       The  storage  directory  format  changes on no particular schedule.  ch-image is normally able to upgrade
       directories produced by a given Charliecloud version  up  to  one  year  after  that  version’s  release.
       Upgrades  outside  this  window and downgrades are not supported. In these cases, ch-image will refuse to
       run until you delete and re-initialize the storage directory with ch-image reset.

       WARNING:
          Network filesystems, especially Lustre, are typically bad choices for the storage directory. This is a
          site-specific question and your local support will likely have strong opinions.

BUILD CACHE

   Overview
       Subcommands that create images, such as build  and  pull,  can  use  a  build  cache  to  speed  repeated
       operations.  That  is,  an  image is created by starting from the empty image and executing a sequence of
       instructions, largely Dockerfile instructions but also  some  others  like  “pull”  and  “import”.   Some
       instructions  are  expensive  to  execute (e.g., RUN wget http://slow.example.com/bigfile or transferring
       data billed by the byte), so it’s often cheaper to retrieve their results from cache instead.

       The build cache uses a relatively new Git under the hood; see the installation instructions  for  version
       requirements.  Charliecloud  implements workarounds for Git’s various storage limitations, so things like
       file metadata and Git repositories within the image should work.  Important  exception:  No  files  named
       .git* or other Git metadata are permitted in the image’s root directory.

       Extended  attributes  (xattrs)  are  ignored  by  the  build  cache  by default. Cache support for xattrs
       belonging to unprivileged xattr namespaces (e.g. user) can be enabled by specifying the  --xattrs  option
       or by setting the CH_XATTRS environment variable. If CH_XATTRS is set, you can override it with
       --no-xattrs. Note that extended attributes in privileged xattr namespaces (e.g. trusted) cannot be read
       by ch-image and will always be lost without warning.

       The  cache  has three modes: enabled, disabled, and a hybrid mode called rebuild where the cache is fully
       enabled for FROM instructions, but all other  operations  re-execute  and  re-cache  their  results.  The
       purpose of rebuild is to do a clean rebuild of a Dockerfile atop a known-good base image.

       Enabled  mode  is  selected  with  --cache  or  setting  $CH_IMAGE_CACHE  to  enabled, disabled mode with
       --no-cache or disabled, and rebuild mode with --rebuild or rebuild. The default mode  is  enabled  if  an
       appropriate Git is installed, otherwise disabled.

   Compared to other implementations
       NOTE:
          This  section  is  a  lightly  edited  excerpt  from  our  paper “Charliecloud’s layer-free, Git-based
          container build cache”.

       Existing tools such as Docker and Podman implement their build cache with a  layered  (union)  filesystem
       such  as  OverlayFS  or  FUSE-OverlayFS  and  tar  archives  to represent the content of each layer; this
       approach is standardized by OCI. The layered cache works, but it has drawbacks in three critical areas:

       1. Diff format. The tar format is poorly standardized and not designed for diffs.   Notably,  tar  cannot
          represent  file  deletion. The workaround used for OCI layers is specially named whiteout files, which
          means  the  tar  archives  cannot  be  unpacked  by  standard   UNIX   tools   and   require   special
          container-specific processing.

       2. Cache  overhead.  Each  time  a Dockerfile instruction is started, a new overlay filesystem is mounted
          atop the existing layer stack. File metadata operations in the instruction then start at the top layer
          and descend the stack until the layer containing the desired  file  is  reached.  The  cost  of  these
          operations is therefore proportional to the number of layers, i.e., the number of instructions between
          the  empty  root  image  and the instruction being executed. This results in a best practice of large,
          complex instructions to minimize  their  number,  which  can  conflict  with  simpler,  more  numerous
          instructions the user might prefer.

       3. De-duplication.  Identical files on layers with an ancestry relationship (i.e., instruction A precedes
          B in a build) are stored only once.  However, identical files on layers without this relationship  are
          stored  multiple  times.  For  example, if instructions B and B’ both follow A — perhaps because B was
          modified and the image rebuilt — then any files created by both B and B’ will be stored twice.

          Also, similar files are never de-duplicated, regardless of ancestry. For  example,  if  instruction  A
          creates  a  file  and subsequently instruction B modifies a single bit in that file, both versions are
          stored in their entirety.

       Our Git-based cache addresses the three drawbacks: (1) Git is purpose-built to store  changing  directory
       trees,  (2) cache  overhead  is  imposed  only at instruction commit time, and (3) Git de-duplicates both
       identical and similar files. Also, it is based on an extremely widely used tool that  enjoys  development
       support  from  well-resourced  actors,  in  particular  on  scaling  (e.g.,  Microsoft’s large-repository
       accelerator Scalar was recently merged into Git).

       In addition to these structural advantages, performance experiments reported in our paper above show that
       the Git-based approach is as good as (and sometimes better than) overlay-based caches. On build time, the
       two approaches are broadly similar, with one or the other being faster depending  on  context.  Both  had
       performance  problems on NFS. Notably, however, the Git-based cache was much faster for a 129-instruction
       Dockerfile. On disk usage, the winner depended on the condition. For example, we saw  the  layered  cache
       storing  large  sibling  layers  redundantly;  on  the  other  hand, the Git-based cache has some obvious
       redundancies as  well,  and  one  must  compact  it  for  full  de-duplication  benefit.  However,  Git’s
       de-duplication  was  quite  effective  in  some  conditions and we suspect will prove even better in more
       realistic scenarios.

       That is, we believe our results show that the Git-based  build  cache  is  highly  competitive  with  the
       layered  approach,  with  no  obvious  inferiority  so far and hints that it may be superior on important
       dimensions. We have ongoing work to explore these questions in more detail.

   De-duplication and garbage collection
       Charliecloud’s build cache takes advantage of Git’s file de-duplication features.  This  operates  across
       the  entire build cache, i.e., files are de-duplicated no matter where in the cache they are found or the
       relationship between their container images. Files are de-duplicated  at  different  times  depending  on
       whether they are identical or merely similar.

       Identical  files  are  de-duplicated  at  git add time; in ch-image build terms, that’s upon committing a
       successful instruction.  That is, it’s impossible to store two files with the same content in  the  build
       cache.  If  you try — say with RUN yum install -y foo in one Dockerfile and RUN yum install -y foo bar in
       another, which are different instructions but both install RPM foo’s files — the content is  stored  once
       and each copy gets its own metadata and a pointer to the content, much like filesystem hard links.
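
       Why identical content can only be stored once follows from content addressing: Git names a file's
       content (a "blob") by a hash of that content, so equal content always maps to the same object. The
       sketch below computes the SHA-1 blob ID that Git uses by default and models a toy object store; the
       class ToyObjectStore and the paths are hypothetical and far simpler than a real Git repository.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """SHA-1 object ID Git assigns to a blob with this content."""
    header = b"blob %d\0" % len(content)
    return hashlib.sha1(header + content).hexdigest()


class ToyObjectStore:
    """Toy model: content is keyed by ID; paths only point at IDs."""

    def __init__(self):
        self.objects = {}   # object ID -> content (stored once)
        self.index = {}     # path -> object ID (per-file pointer)

    def add(self, path, content):
        oid = git_blob_id(content)
        self.objects.setdefault(oid, content)
        self.index[path] = oid
        return oid


store = ToyObjectStore()
store.add("image-a/usr/bin/foo", b"same bytes")
store.add("image-b/usr/bin/foo", b"same bytes")
print(len(store.index), "paths,", len(store.objects), "stored object")
```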

       Similar  files,  however,  are only de-duplicated during Git’s garbage collection process. When files are
       initially added to a Git repository (with git add), they are stored inside the  repository  as  (possibly
       compressed)  individual  files, called objects in Git jargon. Upon garbage collection, which happens both
       automatically when certain parameters are met and explicitly with git gc, these files  are  archived  and
       (re-)compressed together into a single file called a packfile. Also, existing packfiles may be re-written
       into the new one.

       During  this  process,  similar files are identified, and each set of similar files is stored as one base
       file plus diffs to recover the others. (Similarity detection seems to be based primarily on  file  size.)
       This  delta  process is agnostic to alignment, which is an advantage over alignment-sensitive block-level
       de-duplicating filesystems. Exception: “Large” files are not compressed or de-duplicated. We use the  Git
       default threshold of 512 MiB (as of this writing).

       Charliecloud runs Git garbage collection at two different times. First, a lighter-weight garbage
       collection pass runs automatically when the number of loose files (objects) grows beyond a limit. This
       limit is in flux as we learn more about build cache performance, but it’s quite a bit higher than the
       Git default. This garbage collection runs in the background and can continue after the build completes;
       you may see Git processes using a lot of CPU.

       An important limitation of the automatic garbage collection is that large packfiles (again, this is in
       flux, but it’s several GiB) will not be re-packed, limiting the scope of similar file detection. To
       address this, a heavier garbage collection can be run manually with ch-image build-cache --gc. This
       will re-pack (and re-write) the entire build cache, de-duplicating all similar files. In both cases,
       garbage collection uses all available cores.

       ch-image build-cache prints the specific garbage collection parameters in use, and -v can be added for more
       detail.

   Large file threshold
       Because  Git  uses  content-addressed storage, upon commit, it must read in full all files modified by an
       instruction. This I/O cost can be a significant fraction of build time for some images. To mitigate this,
       regular files larger than the experimental large file threshold are stored outside  the  Git  repository,
       somewhat like Git Large File Storage.

       ch-image  copies  large files in and out of images at each instruction commit. It tries to do this with a
       fast metadata-only copy-on-write operation called “reflink”, but that is only supported  with  the  right
       Python  version,  Linux  kernel  version,  and  filesystem. If unsupported, Charliecloud falls back to an
       expensive standard copy, which is likely slower than letting Git deal  with  the  files.  See  File  copy
       performance for details.

       Every  version  of  a  large file is stored verbatim and uncompressed (e.g., a large file with a one-byte
       change will be stored in full twice), so Git’s de-duplication does not  apply.  However,  on  filesystems
       with  reflink  support,  files  can  share  extents (e.g., each of the two files will have its own extent
       containing the  changed  byte,  but  the  rest  of  the  extents  will  remain  shared).   This  provides
       de-duplication between large files in images that share ancestry. Also, unused large files are deleted by
       ch-image build-cache --gc.

       A final caveat: Large files in any image with the  same  path,  mode,  size,  and  mtime  (to  nanosecond
       precision  if  possible) are considered identical, even if their content is not actually identical (e.g.,
       touch(1) shenanigans can corrupt an image).
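
       The caveat above can be demonstrated directly. The sketch below builds an identity key from the four
       attributes named in the text (path, mode, size, mtime) and shows two files with different content
       producing the same key. The function names are hypothetical, and the exact key Charliecloud uses may
       differ in detail.

```python
import os
import tempfile


def large_file_key(image_root, relpath):
    """Identity key for a large file: content is deliberately not read."""
    st = os.stat(os.path.join(image_root, relpath))
    return (relpath, st.st_mode, st.st_size, st.st_mtime_ns)


def demo():
    """Return True if two different files are considered identical."""
    with tempfile.TemporaryDirectory() as a, tempfile.TemporaryDirectory() as b:
        for root, data in ((a, b"AAAA"), (b, b"BBBB")):
            path = os.path.join(root, "big.bin")
            with open(path, "wb") as f:
                f.write(data)                       # same size, other bytes
            os.chmod(path, 0o644)                   # same mode
            os.utime(path, ns=(0, 1_000_000_000))   # same mtime, like touch(1)
        return large_file_key(a, "big.bin") == large_file_key(b, "big.bin")


print(demo())
```

       Skipping the content read is what makes large-file handling cheap, and it is also what makes it
       possible to fool.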

       Option --cache-large sets the threshold in MiB; if not set, environment variable CH_IMAGE_CACHE_LARGE  is
       used; if that is not set either, the default value 0 indicates that no files are considered large.
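
       The lookup order for the threshold can be written out as below. This is a sketch with hypothetical
       function names; treating a file exactly at the threshold as not large is an assumption based on the
       phrase "larger than" above.

```python
import os

MIB = 2 ** 20


def large_threshold_mib(option=None, environ=None):
    """--cache-large, else $CH_IMAGE_CACHE_LARGE, else 0 (no large files)."""
    environ = os.environ if environ is None else environ
    if option is not None:
        return int(option)
    return int(environ.get("CH_IMAGE_CACHE_LARGE", 0))


def is_large(size_bytes, threshold_mib):
    """A threshold of 0 means no file is ever considered large."""
    return threshold_mib > 0 and size_bytes > threshold_mib * MIB
```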

       (Note that Git has an unrelated setting called core.bigFileThreshold.)

   Example
       Suppose we have this Dockerfile:

          $ cat a.df
          FROM alpine:3.17
          RUN echo foo
          RUN echo bar

       On our first build, we get:

          $ ch-image build -t a -f a.df .
            1. FROM alpine:3.17
          [ ... pull chatter omitted ... ]
            2. RUN echo foo
          copying image ...
          foo
            3. RUN echo bar
          bar
          grown in 3 instructions: a

       Note the dot after each instruction’s line number. This means that the instruction was executed. You can
       also see this by the output of the two echo commands.

       But on our second build, we get:

          $ ch-image build -t a -f a.df .
            1* FROM alpine:3.17
            2* RUN echo foo
            3* RUN echo bar
          copying image ...
          grown in 3 instructions: a

       Here, instead of being executed, each instruction’s results were retrieved from cache. (Charliecloud uses
       lazy retrieval; nothing is actually retrieved until the end, as seen by  the  “copying  image”  message.)
       Cache  hit  for  each instruction is indicated by an asterisk (*) after the line number.  Even for such a
       small and short Dockerfile, this build is noticeably faster than the first.

       We can also try a second, slightly different Dockerfile. Note that the first two instructions are the
       same, but the third is different:

          $ cat c.df
          FROM alpine:3.17
          RUN echo foo
          RUN echo qux
          $ ch-image build -t c -f c.df .
            1* FROM alpine:3.17
            2* RUN echo foo
            3. RUN echo qux
          copying image ...
          qux
          grown in 3 instructions: c

       Here, the first two instructions are hits from the first Dockerfile, but the third is a miss, so
       Charliecloud retrieves the cached state after the second instruction and continues building from there.

       We can also inspect the cache:

          $ ch-image build-cache --tree
          *  (c) RUN echo qux
          | *  (a) RUN echo bar
          |/
          *  RUN echo foo
          *  (alpine+3.17) PULL alpine:3.17
          *  (root) ROOT

          named images:     4
          state IDs:        5
          commits:          5
          files:          317
          disk used:        3 MiB

       Here there are four named images: a and c that we built, the base image alpine:3.17 (written as
       alpine+3.17 because colon is not allowed in Git branch names), and the empty base of everything, root.
       Also note how a and c diverge after the last common instruction RUN echo foo.
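
       The hit and miss pattern in this example can be reproduced with a toy model in which each state is
       identified by a hash of its parent state and the instruction text. This is only a sketch: this
       section does not describe how Charliecloud actually computes state IDs, so the hash chain and the
       function names are assumptions made for illustration.

```python
import hashlib


def state_id(parent_id, instruction):
    """Toy state ID: depends on everything that came before."""
    data = (parent_id + "\n" + instruction).encode()
    return hashlib.sha256(data).hexdigest()


def build(instructions, cache):
    """Return one line per instruction: 'N*' for a hit, 'N.' for a miss."""
    sid = "ROOT"
    lines = []
    for n, instr in enumerate(instructions, start=1):
        sid = state_id(sid, instr)
        if sid in cache:
            lines.append("%d* %s" % (n, instr))
        else:
            cache.add(sid)                 # "execute" and commit the result
            lines.append("%d. %s" % (n, instr))
    return lines


cache = set()
a_df = ["FROM alpine:3.17", "RUN echo foo", "RUN echo bar"]
c_df = ["FROM alpine:3.17", "RUN echo foo", "RUN echo qux"]
for dockerfile in (a_df, a_df, c_df):
    print("\n".join(build(dockerfile, cache)), end="\n\n")
```

       Because each ID incorporates its parent, a changed instruction invalidates only itself and what
       follows it, which gives the branching shape shown by build-cache --tree.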

BUILD

       Build an image from a Dockerfile and put it in the storage directory.

   Synopsis
          $ ch-image [...] build [-t TAG] [-f DOCKERFILE] [...] CONTEXT

   Description
       See  below  for  differences  with  other  Dockerfile  interpreters.  Charliecloud  supports  an extended
       instruction (RSYNC), a few other instructions behave slightly differently, and a few are ignored.

       Note that FROM implicitly pulls the base image if needed,  so  you  may  want  to  read  about  the  pull
       subcommand below as well.

       Required argument:

          CONTEXT
                 Path to context directory. This is the root of COPY instructions in the Dockerfile. If a single
                 hyphen  (-) is specified: (a) read the Dockerfile from standard input, (b) specifying --file is
                 an error, and (c) there is no context, so COPY will fail. (See --file for how  to  provide  the
                 Dockerfile on standard input while also having a context.)

       Options:

          -b, --bind SRC[:DST]
                 For  RUN  instructions  only,  bind-mount  SRC  at  guest  DST.  The default destination if not
                 specified is  to  use  the  same  path  as  the  host;  i.e.,  the  default  is  equivalent  to
                 --bind=SRC:SRC. If DST does not exist, try to create it as an empty directory, though images do
                 have ten directories /mnt/[0-9] already available as mount points. Can be repeated.

                 Note: See documentation for ch-run --bind for important caveats and gotchas.

                 Note:  Other  instructions  that  modify the image filesystem, e.g.  COPY, can only access host
                 files from the context directory, regardless of this option.

          --build-arg KEY[=VALUE]
                 Set build-time variable KEY defined by ARG instruction to VALUE. If VALUE  not  specified,  use
                 the value of environment variable KEY.

          -f, --file DOCKERFILE
                 Use  DOCKERFILE  instead  of  CONTEXT/Dockerfile. If a single hyphen (-) is specified, read the
                 Dockerfile from standard input; like docker build, the context directory is still available  in
                 this case.

          --force[=MODE]
                 Use  unprivileged  build  with  root  emulation  mode MODE, which can be fakeroot, seccomp (the
                 default), or none. See section “Privilege model” below for details on what this does  and  when
                 you might need it.

          --force-cmd=CMD,ARG1[,ARG2...]
                 If  command CMD is found in a RUN instruction, add the comma-separated ARGs to it. For example,
                 --force-cmd=foo,-a,--bar=baz would transform RUN foo -c into RUN foo -a --bar=baz -c.  This  is
                 intended  to  suppress validation that defeats --force=seccomp and implies that option.  Can be
                 repeated. If specified, replaces (does not extend) the  default  suppression  options.  Literal
                 commas  can be escaped with backslash; importantly however, backslash will need to be protected
                 from the shell also. Section “Privilege model” below explains why you might need this.
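
                 The comma and backslash rules can be sketched as follows. This is an illustration only;
                 the function name parse_force_cmd is hypothetical and the real parser may differ.

```python
import re


def parse_force_cmd(value):
    """Split "CMD,ARG1[,ARG2...]" on commas that are not backslash-escaped."""
    # The lookbehind keeps "\," together; the escape is removed afterwards.
    parts = [p.replace("\\,", ",") for p in re.split(r"(?<!\\),", value)]
    cmd, args = parts[0], parts[1:]
    if not cmd or not args:
        raise ValueError("expected CMD,ARG1[,ARG2...]")
    return cmd, args
```

                 With this sketch, parse_force_cmd("foo,-a,--bar=baz") yields the command foo and the
                 arguments -a and --bar=baz. On a shell command line, single quotes are one way to protect
                 the backslash, e.g. --force-cmd='foo,-o,a\,b'.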

          -n, --dry-run
                 Don’t actually execute any Dockerfile instructions.

          --parse-only
                 Stop after parsing the Dockerfile.

          -t, --tag TAG
                 Name of image to create. If not specified, infer the name:

                 1. If Dockerfile named Dockerfile with an extension: use the extension with invalid characters
                    stripped, e.g. Dockerfile.@FOO.bar -> foo.bar.

                 2. If Dockerfile has extension df or dockerfile: use the basename with the same transformation,
                    e.g. baz.@QUX.dockerfile -> baz.qux.

                 3. If context directory is not /: use its name, i.e. the last component of the absolute path to
                    the context directory, with the same transformation.

                 4. Otherwise (context directory is /): use root.

                 If no colon present in the name, append :latest.
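
                 The inference rules above can be approximated in a few lines. In this sketch the function
                 names and the exact set of characters treated as valid are assumptions; only the four rules
                 and the two examples come from the text.

```python
import posixpath
import re


def _clean(name):
    # Assumption: lowercase, then keep only letters, digits, ".", "_", "-", ":".
    return re.sub(r"[^a-z0-9._:-]", "", name.lower())


def infer_tag(dockerfile, context):
    """Hypothetical re-implementation of the four naming rules."""
    base = posixpath.basename(dockerfile)
    stem, ext = posixpath.splitext(base)
    if base.startswith("Dockerfile.") and len(base) > len("Dockerfile."):
        name = _clean(base[len("Dockerfile."):])   # rule 1: use the extension
    elif ext in (".df", ".dockerfile"):
        name = _clean(stem)                        # rule 2: use the basename
    else:
        ctx = posixpath.abspath(context)
        if ctx != "/":
            name = _clean(posixpath.basename(ctx)) # rule 3: context dir name
        else:
            name = "root"                          # rule 4: context is /
    return name if ":" in name else name + ":latest"
```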

       Uses ch-run -w -u0 -g0 --no-passwd --unsafe to execute RUN instructions.

   Privilege model
   Overview
       ch-image is a fully unprivileged image builder. It does not use any setuid or setcap helper programs, and
       it does not use configuration files /etc/subuid or /etc/subgid. This contrasts with the “rootless” or
       “fakeroot” modes of some competing builders, which do require privileged supporting code or utilities.

       Without root emulation, this approach does confuse programs that expect to  have  real  root  privileges,
       most notably distribution package installers. This subsection describes why that happens and what you can
       do about it.

       ch-image executes all instructions as the normal user who invokes it.  For RUN, this is accomplished with
       ch-run  arguments  including -w --uid=0 --gid=0. That is, your host EUID and EGID are both mapped to zero
       inside the container, and only one UID (zero) and GID (zero) are available inside  the  container.  Under
       this  arrangement, processes running in the container for each RUN appear to be running as root, but many
       privileged system calls will fail without the root emulation methods described below.  This  affects  any
       fully unprivileged container build, not just Charliecloud.

       The  most  common time to see this is installing packages. For example, here is RPM failing to chown(2) a
       file, which makes the package update fail:

            Updating   : 1:dbus-1.10.24-13.el7_6.x86_64                            2/4
          Error unpacking rpm package 1:dbus-1.10.24-13.el7_6.x86_64
          error: unpacking of archive failed on file /usr/libexec/dbus-1/dbus-daemon-launch-helper;5cffd726: cpio: chown
            Cleanup    : 1:dbus-libs-1.10.24-12.el7.x86_64                         3/4
          error: dbus-1:1.10.24-13.el7_6.x86_64: install failed

       This one is (ironically) apt-get failing to drop privileges:

          E: setgroups 65534 failed - setgroups (1: Operation not permitted)
          E: setegid 65534 failed - setegid (22: Invalid argument)
          E: seteuid 100 failed - seteuid (22: Invalid argument)
          E: setgroups 0 failed - setgroups (1: Operation not permitted)

       Charliecloud provides two different mechanisms to  avoid  these  problems.  Both  involve  lying  to  the
       containerized process about privileged system calls, but at very different levels of complexity.

   Root emulation mode fakeroot
       This  mode  uses  fakeroot(1)  to maintain an elaborate web of deceit that is internally consistent. This
       program intercepts both privileged system calls (e.g., setuid(2)) and other system calls whose
       return  values  depend  on  those  calls  (e.g.,  getuid(2)),  faking success for privileged system calls
       (perhaps making no system call at all) and altering return values to  be  consistent  with  earlier  fake
       success.  Charliecloud automatically installs the fakeroot(1) program inside the container and then wraps
       RUN instructions having known privilege needs with it. Thus, this mode  is  only  available  for  certain
       distributions.

       The  advantage  of  this  mode  is  its  consistency; e.g., careful programs that check the new UID after
       attempting to change it will  not  notice  anything  amiss.  Its  disadvantage  is  complexity:  detailed
       knowledge and procedures for multiple Linux distributions.

       This mode has three basic steps:

          1. After  FROM,  analyze the image to see what distribution it contains, which determines the specific
             workarounds.

          2. Before the user command in the first RUN instruction where  the  injection  seems  needed,  install
             fakeroot(1)  in  the  image,  if  one  is  not  already  installed,  as well as any other necessary
             initialization commands. For example, we turn off the apt sandbox (for Debian Buster) and configure
             EPEL but leave it disabled (for CentOS/RHEL).

          3. Prepend fakeroot to RUN instructions that seem to need it, e.g. ones  that  contain  apt,  apt-get,
             dpkg for Debian derivatives and dnf, rpm, or yum for RPM-based distributions.

       RUN instructions that do not seem to need modification are unaffected by this mode.

       The  details  are  specific  to  each  distribution.  ch-image  analyzes  image  content  (e.g., grepping
       /etc/debian_version) to select a configuration; see lib/force.py for  details.  ch-image  prints  exactly
       what it is doing.

       WARNING:
          Because  of  fakeroot mode’s complexity, we plan to remove it if seccomp mode performs well enough. If
          you have a situation where fakeroot mode works and seccomp does not, please let us know.

   Root emulation mode seccomp (default)
       This mode uses the kernel’s seccomp(2) system call  filtering  to  intercept  certain  privileged  system
       calls, do absolutely nothing, and return success to the program.

       Some system calls are quashed regardless of their arguments: capset(2); chown(2) and friends;
       kexec_load(2) (used to validate the filter itself); and setuid(2), setgid(2), and setgroups(2) along
       with the other system calls that change user or group. mknod(2) and mknodat(2) are quashed if they try to
       create a device file (e.g., creating FIFOs works normally).

       The advantages of this approach are that it’s much simpler, it’s faster, it’s completely agnostic to libc,
       and  it’s  mostly agnostic to distribution. The disadvantage is that it’s a very lazy liar; even the most
       cursory consistency checks will fail, e.g., getuid(2) after setuid(2).
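
       The difference between the two emulation modes can be pictured with a toy model. The classes below are
       purely illustrative Python objects, not real system call interception: one fakes success without
       keeping state (like seccomp mode), the other remembers what it claimed (like fakeroot mode).

```python
class StatelessLiar:
    """Toy model of seccomp mode: setuid "succeeds" but nothing changes."""

    def __init__(self, uid=0):
        self._uid = uid

    def setuid(self, uid):
        return 0              # report success, do nothing

    def getuid(self):
        return self._uid      # still the original UID


class ConsistentLiar(StatelessLiar):
    """Toy model of fakeroot mode: later answers agree with earlier lies."""

    def setuid(self, uid):
        self._uid = uid       # remember the faked change
        return 0


def drop_privileges_verified(kernel, target_uid):
    """A careful program: change UID, then check that it really changed."""
    kernel.setuid(target_uid)
    return kernel.getuid() == target_uid
```

       Here drop_privileges_verified() is False for StatelessLiar and True for ConsistentLiar, which mirrors
       why a program that verifies its new IDs fails under seccomp mode.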

       While this mode does not provide consistency, it does offer a hook to help prevent  programs  asking  for
       consistency. For example, apt-get -o APT::Sandbox::User=root will prevent apt-get from attempting to drop
       privileges, which it verifies, exiting with failure if the correct IDs are not found (which they won’t be
       under  this  approach). This can be expressed with --force-cmd=apt-get,-o,APT::Sandbox::User=root, though
       this particular case is built-in and does not need to be specified. The full default configuration, which
       is applied regardless of the image distribution, can be examined in the  source  file  force.py.  If  any
       --force-cmd are specified, this replaces (rather than extends) the default configuration.

       Note  that because the substitutions are a simple regex with no knowledge of shell syntax, they can cause
       unwanted modifications. For example, RUN apt-get install -y apt-get will be run as /bin/sh -c "apt-get -o
       APT::Sandbox::User=root install -y apt-get -o APT::Sandbox::User=root". One workaround is to add escape
       syntax transparent to the shell; e.g., RUN apt-get install -y a\pt-get (the shell drops the backslash,
       but the text no longer matches the substitution).
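
       The pitfall is easy to reproduce with a stand-in substitution. The pattern below is an assumption chosen
       for illustration (the real one lives in force.py and may differ); the backslash in the second call is
       one possible shell-transparent escape.

```python
import re

HOOK_ARGS = "-o APT::Sandbox::User=root"


def apply_hook(shell_text):
    # Assumed pattern: every word-delimited "apt-get" gets the arguments
    # appended, with no understanding of shell syntax.
    return re.sub(r"\bapt-get\b",
                  lambda m: m.group(0) + " " + HOOK_ARGS,
                  shell_text)


# Both occurrences are modified, including the package name.
print(apply_hook("apt-get install -y apt-get"))
# The escaped package name no longer matches, so only the command is modified.
print(apply_hook(r"apt-get install -y a\pt-get"))
```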

       This  mode  executes  all  RUN  instructions  with  the  seccomp(2)  filter and has no knowledge of which
       instructions actually used the intercepted system calls. Therefore, the printed  “instructions  modified”
       number is only a count of instructions with a hook applied as described above.

   RUN logging
       In  terminal  output, image metadata, and the build cache, the RUN instruction is always logged as RUN.S,
       RUN.F, or RUN.N.  The letter appended to the instruction reflects the root emulation mode used during the
       build in which the instruction was executed. RUN.S indicates seccomp, RUN.F indicates fakeroot, and RUN.N
       indicates that neither form of root emulation was used (--force=none).

   Compatibility and behavior differences
       ch-image is an independent implementation and shares no code with other Dockerfile interpreters. It  uses
       a   formal  Dockerfile  parsing  grammar  developed  from  the  Dockerfile  reference  documentation  and
       miscellaneous other sources, which you can examine in the source code.

       We believe this independence is valuable for several reasons.  First,  it  helps  the  community  examine
       Dockerfile  syntax  and  semantics  critically, think rigorously about what is really needed, and build a
       more robust standard.  Second, it yields disjoint sets of bugs (note that Podman, Buildah, and Docker all
       share the same Dockerfile parser). Third, because it is a much smaller  code  base,  it  illustrates  how
       Dockerfiles  work  more  clearly.  Finally,  it  allows  straightforward  extensions if needed to support
       scientific computing.

       ch-image tries hard to be compatible with  Docker  and  other  interpreters,  though  as  an  independent
       implementation, it is not bug-compatible.

       The  following  subsections  describe  differences  from  the  Dockerfile  reference that we expect to be
       approximately permanent. For not-yet-implemented features and bugs in this area, see  related  issues  on
       GitHub.

       None of these are set in stone. We are very interested in feedback on our assessments and open questions.
       This helps us prioritize new features and revise our thinking about what is needed for HPC containers.

   Context directory
       The context directory is bind-mounted into the build, rather than copied as Docker does. Thus, the size
       of the context is immaterial, and the build reads directly from storage like any other local process
       would (i.e., it is reasonable to use / for the context). However, you still can’t access anything
       outside the context directory.

   Variable substitution
       Variable substitution happens for all instructions, not just the ones listed in the Dockerfile reference.

       ARG and ENV cause cache misses upon definition, in contrast with Docker where these variables  miss  upon
       use, except for certain cache-excluded variables that never cause misses, listed below.

       Note that ARG and ENV have different syntax despite very similar semantics.

       ch-image  passes the following proxy environment variables in to the build. Changes to these variables do
       not cause a cache miss. They do  not  require  an  ARG  instruction,  as  documented  in  the  Dockerfile
       reference.  Unlike  Docker,  they  are  available  if  the  same-named  environment  variable is defined;
       --build-arg is not required.

          HTTP_PROXY
          http_proxy
          HTTPS_PROXY
          https_proxy
          FTP_PROXY
          ftp_proxy
          NO_PROXY
          no_proxy

       In addition to those listed in the Dockerfile reference, these environment variables are  passed  through
       in the same way:

          SSH_AUTH_SOCK
          USER

       Finally, these variables are also pre-defined but are unrelated to the host environment:

          PATH=/ch/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
          TAR_OPTIONS=--no-same-owner

   ARG
       Variables  set  with ARG are available anywhere in the Dockerfile, unlike Docker, where they only work in
       FROM instructions, and possibly in other ARG before the first FROM.

   FROM
       The FROM instruction  accepts  option  --arg=NAME=VALUE,  which  serves  the  same  purpose  as  the  ARG
       instruction. It can be repeated.

   LABEL
       The  LABEL  instruction  accepts  key=value  pairs to add metadata for an image. Unlike Docker, multiline
       values are not supported; see issue #1512.  Can be repeated.

   COPY
       NOTE:
          The behavior described here matches Docker’s now-deprecated legacy  builder.   Docker’s  new  builder,
          BuildKit, has different behavior in some cases, which we have not characterized.

       Especially  for  people  used  to  UNIX  cp(1),  the  semantics of the Dockerfile COPY instruction can be
       confusing.

       Most notably, when a source of the copy is a directory, the contents of that directory, not the directory
       itself, are copied. This is documented, but it’s a real gotcha because that’s not what cp(1) does, and it
       means that many things you can do in one cp(1) command require multiple COPY instructions.

       Also, the reference documentation is incomplete. In our  experience,  Docker  also  behaves  as  follows;
       ch-image does the same in an attempt to be bug-compatible.

       1. You can use absolute paths in the source; the root is the context directory.

       2. Destination directories are created if they don’t exist in the following situations:

          1. If the destination path ends in slash. (Documented.)

          2. If the number of sources is greater than 1, either by wildcard or explicitly, regardless of whether
             the destination ends in slash. (Not documented.)

          3. If there is a single source and it is a directory. (Not documented.)

       3. Symbolic links behave differently depending on how deep in the copied tree they are. (Not documented.)

          1. Symlinks  at  the top level — i.e., named as the destination or the source, either explicitly or by
             wildcards — are dereferenced. They are followed,  and  whatever  they  point  to  is  used  as  the
             destination or source, respectively.

          2. Symlinks at deeper levels are not dereferenced, i.e., the symlink itself is copied.

       4. If  a directory appears at the same path in source and destination, and is at the 2nd level or deeper,
          the source directory’s metadata (e.g., permissions) are copied  to  the  destination  directory.  (Not
          documented.)

       5. If  an  object  (a) appears in both the source and destination, (b) is at the 2nd level or deeper, and
          (c) is different file  types  in  source  and  destination,  the  source  object  will  overwrite  the
          destination object. (Not documented.)
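
       As a compact restatement of item 2 above, the three situations in which a missing destination directory
       is created can be written as a single predicate. The function and parameter names are hypothetical.

```python
def copy_creates_dest_dir(sources, dest, source_is_dir):
    """Return True if COPY would create a missing dest as a directory.

    sources: source paths after wildcard expansion.
    source_is_dir: hypothetical callable telling whether a source is a directory.
    """
    if dest.endswith("/"):          # 2.1: destination ends in slash
        return True
    if len(sources) > 1:            # 2.2: more than one source
        return True
    # 2.3: a single source that is a directory
    return len(sources) == 1 and source_is_dir(sources[0])
```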

       We expect the following differences to be permanent:

       • Wildcards use Python glob semantics, not the Go semantics.

       • COPY --chown is ignored, because it doesn’t make sense in an unprivileged build.

   Features we do not plan to support
       • Parser directives are not supported. We have not identified a need for any of them.

       • EXPOSE:  Charliecloud  does not use the network namespace, so containerized processes can simply listen
         on a host port like other unprivileged processes.

       • HEALTHCHECK: This instruction’s main use case is monitoring server processes rather than  applications.
         Also, it requires a container supervisor daemon, which we have no plans to add.

       • MAINTAINER is deprecated.

       • STOPSIGNAL requires a container supervisor daemon process, which we have no plans to add.

       • USER does not make sense for unprivileged builds.

       • VOLUME:  Charliecloud has good support for bind mounts; we anticipate that it will continue to focus on
         that and will not introduce the volume management features that Docker has.

   RSYNC (Dockerfile extension)
       WARNING:
          This instruction is experimental and may change or be removed.

   Overview
       Copying files is often simple but has numerous difficult corner cases, e.g.  when dealing  with  symbolic
       or hard links. The standard instruction COPY deals with many of these corner cases differently from other
       UNIX  utilities,  lacks  complete  documentation, and behaves inconsistently between different Dockerfile
       interpreters (e.g., Docker’s legacy builder vs.   BuildKit),  as  detailed  above.  On  the  other  hand,
       rsync(1)  is  an extremely capable, widely used file copy tool, with detailed options to specify behavior
       and 25 years of history dealing with weirdness.

       RSYNC (also spelled NSYNC) is a Charliecloud extension that gives copying behavior identical to rsync(1).
       In fact, Charliecloud’s current implementation literally calls the host’s rsync(1) to do the copy, though
       this may change in the future. There is no list form of RSYNC.

       The two key usage challenges are trailing slashes on paths  and  symlink  handling.  In  particular,  the
       default symlink handling seemed reasonable to us, but you may want something different. See the arguments
       and  examples  below.  Importantly, COPY is not any less fraught, and you have no choice about what to do
       with symlinks.

   Arguments
       RSYNC takes the same arguments as rsync(1), so refer to its man page for a detailed  explanation  of  all
       the  options  (with  possible  emphasis  on  its  symlink  options).  Sources are relative to the context
       directory even if they look absolute  with  a  leading  slash.  Any  globbed  sources  are  processed  by
       ch-image(1)  using  Python  rules,  i.e.,  rsync(1) sees the expanded sources with no wildcards. Relative
       destinations are relative to the image’s current working directory, while absolute destinations refer  to
       the image’s root.

       For  arguments  that  read  input  from  a file (e.g. --exclude-from or --files-from), relative paths are
       relative to the context directory, absolute paths refer to the image root, and - (standard input)  is  an
       error.
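
       The path rules can be sketched as a small translation function. This is an approximation under stated
       assumptions: host_paths and its parameters are hypothetical names, and /context and /storage/imgroot in
       the usage note are placeholder host paths.

```python
import posixpath


def host_paths(sources, dest, context, image_root, workdir):
    """Map RSYNC sources and destination to host paths (illustrative only)."""
    # Sources are always relative to the context, even with a leading slash.
    # A trailing slash on a source is preserved because it is significant.
    srcs = [posixpath.join(context, s.lstrip("/")) for s in sources]
    # Relative destinations are relative to the image's working directory.
    in_image = dest if dest.startswith("/") else posixpath.join(workdir, dest)
    return srcs, posixpath.join(image_root, in_image.lstrip("/"))
```

       For instance, host_paths(["src1", "/src2"], "dst", "/context", "/storage/imgroot", "/foo") gives the
       sources /context/src1 and /context/src2 and the destination /storage/imgroot/foo/dst.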

       For example,

          WORKDIR /foo
          RSYNC --foo src1 src2 dst

       is translated to (the equivalent of):

          $ mkdir -p /foo
          $ rsync -@=-1 -AHSXpr --info=progress2 -l --safe-links \
                  --foo /context/src1 /context/src2 /storage/imgroot/foo/dst

       Note  the extensive default arguments to rsync(1). RSYNC takes a single instruction option beginning with
       + (plus) that is shorthand for a group of rsync(1) options. This single option is one of:

          +m     Preserves metadata and directory structure. Symlinks are skipped with a warning. Equivalent  to
                 all of:

                 • -@=-1: use nanosecond precision when comparing timestamps.

                 • -A: preserve ACLs.

                 • -H: preserve hard link groups.

                 • -S: preserve file sparseness when possible.

                 • -X: preserve xattrs in user.* namespace.

                 • -p: preserve permissions.

                 • -r: recurse into directories.

                 • --info=progress2  (only  if  stderr  is  a terminal): show progress meter (note subtleties in
                   interpretation).

          +l (default)
                 Like +u, but silently skips “unsafe” symlinks  whose  target  is  outside  the  top-of-transfer
                 directory. Preserves:

                 • Metadata.

                 • Directory structure.

                 • Symlinks,  if  a  link’s  target  is within the “top-of-transfer directory”.  This is not the
                   context directory and often not the source either. Also, this creates broken symlinks if  the
                   target is not within the source but is within the top-of-transfer. See examples below.

                 Equivalent to the rsync(1) options listed for +m plus --links (copy symlinks as symlinks unless
                 otherwise specified) and --safe-links (silently skip unsafe symlinks).

          +u     Like +l, but “unsafe” symlinks, whose target is outside the top-of-transfer directory, are
                 replaced with their target. This can copy data from outside the context directory into the
                 image. Preserves:

                 • Metadata.

                 • Directory structure.

                 • Symlinks, if a link’s target is within the “top-of-transfer  directory”.   This  is  not  the
                   context  directory and often not the source either. Also, this creates broken symlinks if the
                   target is not within the source but is within the top-of-transfer. See examples below.

                 Equivalent to the rsync(1) options listed for +m plus --links (copy symlinks as symlinks unless
                 otherwise specified) and --copy-unsafe-links (copy the target of unsafe symlinks).

          +z     No default arguments. Directories will not be descended, no metadata  will  be  preserved,  and
                 both hard and symbolic links will be ignored, except as otherwise specified by rsync(1) options
                 starting  with a hyphen.  (Note that -a/--archive is discouraged because it omits some metadata
                 and handles symlinks inappropriately for containers.)
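
       The four group options can be summarized as a lookup from the + option to the rsync(1) options listed
       above. The function name expand_group is hypothetical, and the short options are kept separate here
       rather than clustered.

```python
# Options listed for +m, except the progress meter, which depends on stderr.
METADATA = ["-@=-1", "-A", "-H", "-S", "-X", "-p", "-r"]


def expand_group(opt="+l", stderr_is_tty=False):
    """Return the rsync options implied by a + group option (sketch)."""
    if opt == "+z":
        return []                               # no default arguments
    if opt not in ("+m", "+l", "+u"):
        raise ValueError(f"unknown group option: {opt}")
    args = list(METADATA)
    if stderr_is_tty:
        args.append("--info=progress2")         # only when stderr is a terminal
    if opt == "+l":
        args += ["--links", "--safe-links"]
    elif opt == "+u":
        args += ["--links", "--copy-unsafe-links"]
    return args
```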

       NOTE:
          rsync(1) supports a configuration file ~/.popt that alters its  command  line  processing.  Currently,
          this configuration is respected for RSYNC arguments, but that may change without notice.

   Disallowed rsync(1) features
       A small number of rsync(1) features are actively disallowed:

          1. rsync:  and  ssh: transports are an error. Charliecloud needs access to the entire input to compute
             cache hit or miss, and these transports make that impossible. It  is  possible  these  will  become
             available in the future (please let us know if that is your use case!).  For now, the workaround is
             to  install rsync(1) in the image and use it in a RUN instruction, though only the instruction text
             will be considered for the cache.

          2. Option arguments must be delimited with = (equals). For example, to set the block size  to  4  MiB,
             you  must say --block-size=4M or -B=4M. -B4M will be interpreted as the three arguments -B, -4, and
             -M; --block-size 4M will be interpreted as --block-size with no argument and a  copy  source  named
             4M.  This  is  so  Charliecloud  can  process  rsync(1)  options without knowing which ones take an
             argument.

          3. Invalid rsync(1) options:

             --daemon
                    Running rsync(1) in daemon mode does not make sense for container build.

             -n, --dry-run
                    This makes the copy a no-op, and Charliecloud may want to use it internally in the future.

             --remove-source-files
                    This would let the instruction alter the context directory.

       Note that there are likely other flags that don’t make sense and/or cause undesirable behavior.  We  have
       not characterized this problem.
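
       The following sketch shows how arguments could be classified without knowing which options take a value,
       as described in item 2, and how the invalid options in item 3 could be rejected. It is an illustration
       with a hypothetical function name, not the actual ch-image parser.

```python
DISALLOWED = {"--daemon", "-n", "--dry-run", "--remove-source-files"}


def split_rsync_args(args):
    """Separate options from paths; values must be attached with "="."""
    opts, paths = [], []
    for a in args:
        if a.startswith("--"):
            opts.append(a)                # long option, value only via "="
        elif a.startswith("-") and len(a) > 1:
            if "=" in a:
                opts.append(a)            # e.g. -B=4M or -@=-1
            else:
                # Clustered short options: -B4M becomes -B, -4, -M.
                opts.extend("-" + c for c in a[1:])
        else:
            paths.append(a)               # a source or the destination
    for o in opts:
        if o.split("=", 1)[0] in DISALLOWED:
            raise ValueError(f"disallowed rsync option: {o}")
    return opts, paths
```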

   Build cache
       The  instruction is a cache hit if the metadata of all source files is unchanged (specifically: filename,
       file type and permissions, xattrs, size, and last modified time). Unlike Docker,  Charliecloud  does  not
       use  file  contents.  This  has  two  implications.  First,  it is possible to fool the cache by manually
       restoring the last-modified time. Second, RSYNC is I/O-intensive even  when  it  hits,  because  it  must
       stat(2)  every  source  file  before checking the cache. However, this is still less I/O than reading the
       file content too.

       Notably, Charliecloud’s cache ignores rsync(1)’s  own  internal  notion  of  whether  anything  would  be
       transferred (e.g., rsync -ni). This may change in the future.
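
       A metadata-only cache key along these lines might look like the sketch below. The function name and the
       choice of SHA-256 are assumptions, and xattrs are left out to keep the example portable.

```python
import hashlib
import os


def metadata_fingerprint(paths):
    """Hash name, type and permissions, size, and mtime; never file contents."""
    h = hashlib.sha256()
    for p in sorted(paths):
        st = os.lstat(p)      # lstat(2): one stat per source, no data reads
        h.update(repr((p, st.st_mode, st.st_size, st.st_mtime_ns)).encode())
    return h.hexdigest()
```

       Because contents are never read, rewriting a file with different data of the same size and then
       restoring its last-modified time yields the same fingerprint, which is the way to fool the cache noted
       above.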

   Examples and tutorial
       All of these examples use the same input, whose content will be introduced gradually, using edited output
       of  ls  -oghR  (which  is  like  ls  -lhR but omits user and group). Examples assume a umask of 0007. The
       Dockerfile instructions listed also assume a preceding:

          FROM alpine:3.17
          RUN mkdir /dst

       i.e., a simple base image containing a top-level directory dst.

       Many additional examples are available in the source code in the file test/build/50_rsync.bats.

       We begin by copying regular  files.  The  context  directory  ctx  contains,  in  part,  two  directories
       containing  one  regular file each. Note that one of these files (file-basic1) and one of the directories
       (basic1) have strange permissions.

          ./ctx:
          drwx---r-x 2  60 Oct 11 13:20 basic1
          drwxrwx--- 2  60 Oct 11 13:20 basic2

          ./ctx/basic1:
          -rw----r-- 1 12 Oct 11 13:20 file-basic1

          ./ctx/basic2:
          -rw-rw---- 1 12 Oct 11 13:20 file-basic2

       The simplest form of RSYNC is to copy a single file into a specified directory:

          RSYNC /basic1/file-basic1 /dst

       resulting in:

          $ ls -oghR dst
          dst:
          -rw----r-- 1 12 Oct 11 13:26 file-basic1

       Note that file-basic1’s metadata — here its odd permissions — are preserved. 1  is  the  number  of  hard
       links to the file, and 12 is the file size.

       One  can  also rename the destination by specifying a new file name, and with +z, not copy metadata (from
       here on the ls command is omitted for brevity):

          RSYNC +z /basic1/file-basic1 /dst/file-basic1_nom

          dst:
          -rw------- 1 12 Sep 21 15:51 file-basic1_nom

       A trailing slash on the destination creates a new directory and places the source file within:

          RSYNC /basic1/file-basic1 /dst/new/

          dst:
          drwxrwx--- 1 22 Oct 11 13:26 new

          dst/new:
          -rw----r-- 1 12 Oct 11 13:26 file-basic1

       With multiple source files, the destination trailing slash is optional:

          RSYNC /basic1/file-basic1 /basic2/file-basic2 /dst/newB

          dst:
          drwxrwx--- 1 44 Oct 11 13:26 newB

          dst/newB:
          -rw----r-- 1 12 Oct 11 13:26 file-basic1
          -rw-rw---- 1 12 Oct 11 13:26 file-basic2

       For directory sources, the presence or absence of a trailing slash is highly significant. Without one,
       the directory itself is placed in the destination (recall that with a file source, the same form would
       rename the file):

          RSYNC /basic1 /dst/basic1_new

          dst:
          drwxrwx--- 1 12 Oct 11 13:28 basic1_new

          dst/basic1_new:
          drwx---r-x 1 22 Oct 11 13:28 basic1

          dst/basic1_new/basic1:
          -rw----r-- 1 12 Oct 11 13:28 file-basic1

       A  source  trailing  slash  means  copy  the  contents  of  a directory rather than the directory itself.
       Importantly, however, the directory’s metadata is copied to the destination directory.

          RSYNC /basic1/ /dst/basic1_renamed

          dst:
          drwx---r-x 1 22 Oct 11 13:28 basic1_renamed

          dst/basic1_renamed:
          -rw----r-- 1 12 Oct 11 13:28 file-basic1

       One gotcha is that RSYNC +z is a no-op if the source is a directory:

          RSYNC +z /basic1 /dst/basic1_newC

          dst:

       At least -r is needed with +z in this case:

          RSYNC +z -r /basic1/ /dst/basic1_newD

          dst:
          drwx------ 1 22 Oct 11 13:28 basic1_newD

          dst/basic1_newD:
          -rw------- 1 12 Oct 11 13:28 file-basic1

       Multiple source directories can be specified, including with wildcards.  This  example  also  illustrates
       that copied files are by default merged with content already existing in the image.

          RUN mkdir /dst/dstC && echo file-dstC > /dst/dstC/file-dstC
          RSYNC /basic* /dst/dstC

          dst:
          drwxrwx--- 1 42 Oct 11 13:33 dstC

          dst/dstC:
          drwx---r-x 1 22 Oct 11 13:33 basic1
          drwxrwx--- 1 22 Oct 11 13:33 basic2
          -rw-rw---- 1 10 Oct 11 13:33 file-dstC

          dst/dstC/basic1:
          -rw----r-- 1 12 Oct 11 13:33 file-basic1

          dst/dstC/basic2:
          -rw-rw---- 1 12 Oct 11 13:33 file-basic2

       Trailing slashes can be specified independently for each source:

          RUN mkdir /dst/dstF && echo file-dstF > /dst/dstF/file-dstF
          RSYNC /basic1 /basic2/ /dst/dstF

          dst:
          drwxrwx--- 1 52 Oct 11 13:33 dstF

          dst/dstF:
          drwx---r-x 1 22 Oct 11 13:33 basic1
          -rw-rw---- 1 12 Oct 11 13:33 file-basic2
          -rw-rw---- 1 10 Oct 11 13:33 file-dstF

          dst/dstF/basic1:
          -rw----r-- 1 12 Oct 11 13:33 file-basic1

       Bare / (i.e., the entire context directory) is considered to have a trailing slash:

          RSYNC / /dst

          dst:
          drwx---r-x 1  22 Oct 11 13:33 basic1
          drwxrwx--- 1  22 Oct 11 13:33 basic2

          dst/basic1:
          -rw----r-- 1 12 Oct 11 13:33 file-basic1

          dst/basic2:
          -rw-rw---- 1 12 Oct 11 13:33 file-basic2

       To  replace  (rather  than  merge  with) existing content, use --delete.  Note also that wildcards can be
       combined with trailing slashes and that the directory gets the metadata of the first slashed directory.

          RUN mkdir /dst/dstG && echo file-dstG > /dst/dstG/file-dstG
          RSYNC --delete /basic*/ /dst/dstG

          dst:
          drwx---r-x 1 44 Oct 11 14:00 dstG

          dst/dstG:
          -rw----r-- 1 12 Oct 11 14:00 file-basic1
          -rw-rw---- 1 12 Oct 11 14:00 file-basic2

       Symbolic links in the source(s) add significant complexity. Like rsync(1), RSYNC  can  do  one  of  three
       things with a given symlink:

       1. Ignore it, silently or with a warning.

       2. Preserve it: copy as a symlink, with the same target.

       3. Dereference it: copy the target instead.

       These  actions  are selected independently for safe symlinks and unsafe symlinks. Safe symlinks are those
       which point to a target within the top of transfer, which is the deepest directory  in  the  source  path
       with  a  trailing  slash. For example, /foo/bar’s top-of-transfer is /foo (regardless of whether bar is a
       directory or file), while /foo/bar/’s top-of-transfer is /foo/bar.
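
       A rough sketch of these definitions, with paths written relative to the context root as in the RSYNC
       instructions. The function names are hypothetical, and the safety test is a lexical approximation of
       what rsync(1) does, not its exact algorithm; absolute targets are treated as unsafe, as rsync(1) does
       for --safe-links.

```python
import posixpath


def top_of_transfer(source):
    """Deepest directory in the source path that has a trailing slash."""
    if source.endswith("/"):
        return source.rstrip("/") or "/"
    return posixpath.dirname(source) or "/"


def is_safe(link, target, top):
    """True if the symlink at `link` points to `target` within `top`."""
    if target.startswith("/"):
        return False                      # absolute targets count as unsafe
    # Express the link's directory relative to the top of transfer, then
    # follow the target lexically and see whether it climbs above the top.
    rel_dir = posixpath.relpath(posixpath.dirname(link), top)
    resolved = posixpath.normpath(posixpath.join(rel_dir, target))
    return resolved != ".." and not resolved.startswith("../")
```

       Under this sketch, a link /sym1/file-top_rel -> ../file-top is safe when the source is /sym1 (top of
       transfer /) but unsafe when the source is /sym1/ (top of transfer /sym1).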

       For the symlink examples, the context contains two sub-directories with a variety of symlinks, as well as
       a sibling file and directory outside the context. All of these links are  valid  on  the  host.  In  this
       listing, the absolute path to the parent of the context directory is replaced with /....

          .:
          drwxrwx--- 9 200 Oct 11 14:00 ctx
          drwxrwx--- 2  60 Oct 11 14:00 dir-out
          -rw-rw---- 1   9 Oct 11 14:00 file-out

          ./ctx:
          drwxrwx--- 3 320 Oct 11 14:00 sym1

          ./ctx/sym1:
          lrwxrwxrwx 1 13 Oct 11 14:00 dir-out_rel -> ../../dir-out
          drwxrwx--- 2 60 Oct 11 14:00 dir-sym1
          lrwxrwxrwx 1  8 Oct 11 14:00 dir-sym1_direct -> dir-sym1
          lrwxrwxrwx 1 10 Oct 11 14:00 dir-top_rel -> ../dir-top
          lrwxrwxrwx 1 47 Oct 11 14:00 file-out_abs -> /.../file-out
          lrwxrwxrwx 1 14 Oct 11 14:00 file-out_rel -> ../../file-out
          -rw-rw---- 1 10 Oct 11 14:00 file-sym1
          lrwxrwxrwx 1 57 Oct 11 14:00 file-sym1_abs -> /.../ctx/sym1/file-sym1
          lrwxrwxrwx 1  9 Oct 11 14:00 file-sym1_direct -> file-sym1
          lrwxrwxrwx 1 17 Oct 11 14:00 file-sym1_upover -> ../sym1/file-sym1
          lrwxrwxrwx 1 51 Oct 11 14:00 file-top_abs -> /.../ctx/file-top
          lrwxrwxrwx 1 11 Oct 11 14:00 file-top_rel -> ../file-top

          ./ctx/sym1/dir-sym1:
          -rw-rw---- 1 14 Oct 11 14:00 dir-sym1.file

          ./dir-out:
          -rw-rw---- 1 13 Oct 11 14:00 dir-out.file

       By default, safe symlinks are preserved while unsafe symlinks are silently ignored:

          RSYNC /sym1 /dst

          dst:
          drwxrwx--- 1 206 Oct 11 17:10 sym1

          dst/sym1:
          drwxrwx--- 1 26 Oct 11 17:10 dir-sym1
          lrwxrwxrwx 1  8 Oct 11 17:10 dir-sym1_direct -> dir-sym1
          lrwxrwxrwx 1 10 Oct 11 17:10 dir-top_rel -> ../dir-top
          -rw-rw---- 1 10 Oct 11 17:10 file-sym1
          lrwxrwxrwx 1  9 Oct 11 17:10 file-sym1_direct -> file-sym1
          lrwxrwxrwx 1 17 Oct 11 17:10 file-sym1_upover -> ../sym1/file-sym1
          lrwxrwxrwx 1 17 Oct 11 17:10 file-sym2_upover -> ../sym2/file-sym2
          lrwxrwxrwx 1 11 Oct 11 17:10 file-top_rel -> ../file-top

          dst/sym1/dir-sym1:
          -rw-rw---- 1 14 Oct 11 17:10 dir-sym1.file

       The source files have four rough fates:

       1. Regular  files  and  directories (file-sym1 and dir-sym1).  These are copied into the image unchanged,
          including metadata.

       2. Safe symlinks, now broken. This is one of the gotchas of RSYNC’s top-of-transfer directory (here  host
          path  ./ctx, image path /) differing from the source directory (./ctx/sym1, /sym1), because the latter
          lacks a trailing slash.  dir-top_rel, file-sym2_upover, and file-top_rel all ascend only  as  high  as
          ./ctx (image path /) before re-descending. This is within the top-of-transfer, so the symlinks are
          safe and thus copied unchanged, but their targets were not included in the copy.

       3. Safe symlinks, still valid.

          1. dir-sym1_direct and file-sym1_direct point directly to files in the same directory.

          2. dir-sym1_upover and file-sym1_upover point to files in the same directory, but by  first  ascending
             into  their parent — within the top-of-transfer, so they are safe — and then re-descending. If sym1
             were renamed during the copy, these links would break.

       4. Unsafe symlinks, which are ignored by the copy and do not appear in the image.

          1. Absolute symlinks are always unsafe (*_abs).

          2. dir-out_rel and file-out_rel are relative symlinks that ascend above the top-of-transfer,  in  this
             case to targets outside the context, and are thus unsafe.
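
       The safe/unsafe distinction in the list above can be modeled by counting directory levels. The sketch
       below is a simplified model and an assumption on our part, not code taken from ch-image or rsync(1);
       classify_symlink and its depth argument are hypothetical names introduced for illustration.

```shell
#!/bin/sh
# Simplified model (an assumption, not ch-image or rsync source code): a
# relative symlink is safe if, walking its target one component at a time, it
# never climbs above the top-of-transfer. Absolute targets are always unsafe.
#   $1  depth of the directory holding the link below the top-of-transfer
#       (0 if the link sits directly in the top-of-transfer)
#   $2  the link target
# The body runs in a subshell so IFS and "set -f" do not leak to the caller.
classify_symlink() (
    depth=$1
    case "$2" in /*) echo unsafe; exit ;; esac
    set -f                                   # no pathname expansion while splitting
    IFS=/
    for part in $2; do
        case "$part" in
            ''|.) ;;                         # no change in depth
            ..)   depth=$((depth - 1)) ;;    # ascend one level
            *)    depth=$((depth + 1)) ;;    # descend one level
        esac
        if [ "$depth" -lt 0 ]; then echo unsafe; exit; fi
    done
    echo safe
)

# "RSYNC /sym1 /dst": links in sym1 are one level below the top-of-transfer.
classify_symlink 1 dir-sym1             # fate 3: safe, still valid
classify_symlink 1 ../file-top          # fate 2: safe, but target not copied
classify_symlink 1 ../../dir-out        # fate 4: unsafe
classify_symlink 1 /.../file-out        # fate 4: unsafe (absolute)
```

       In this model, calling the function with depth 0 (the link sits directly in the top-of-transfer) makes
       ../sym1/file-sym1 unsafe, because the first component already climbs above the top-of-transfer.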

       The  top-of-transfer can be changed to sym1 with a trailing slash. This also adds sym1 to the destination
       so the resulting directory structure is the same.

          RSYNC /sym1/ /dst/sym1

          dst:
          drwxrwx--- 1 96 Oct 11 17:10 sym1

          dst/sym1:
          drwxrwx--- 1 26 Oct 11 17:10 dir-sym1
          lrwxrwxrwx 1  8 Oct 11 17:10 dir-sym1_direct -> dir-sym1
          -rw-rw---- 1 10 Oct 11 17:10 file-sym1
          lrwxrwxrwx 1  9 Oct 11 17:10 file-sym1_direct -> file-sym1

          dst/sym1/dir-sym1:
          -rw-rw---- 1 14 Oct 11 17:10 dir-sym1.file

       *_upover and *-top_rel now ascend above the top-of-transfer, so they are unsafe as well and, like the
       other unsafe symlinks, are ignored.

       Another common use case is to follow unsafe symlinks and copy their targets in place of the  links.  This
       is accomplished with +u:

          RSYNC +u /sym1/ /dst/sym1

          dst:
          drwxrwx--- 1 352 Oct 11 17:10 sym1

          dst/sym1:
          drwxrwx--- 1 24 Oct 11 17:10 dir-out_rel
          drwxrwx--- 1 26 Oct 11 17:10 dir-sym1
          lrwxrwxrwx 1  8 Oct 11 17:10 dir-sym1_direct -> dir-sym1
          drwxrwx--- 1 24 Oct 11 17:10 dir-top_rel
          -rw-rw---- 1  9 Oct 11 17:10 file-out_abs
          -rw-rw---- 1  9 Oct 11 17:10 file-out_rel
          -rw-rw---- 1 10 Oct 11 17:10 file-sym1
          -rw-rw---- 1 10 Oct 11 17:10 file-sym1_abs
          lrwxrwxrwx 1  9 Oct 11 17:10 file-sym1_direct -> file-sym1
          -rw-rw---- 1 10 Oct 11 17:10 file-sym1_upover
          -rw-rw---- 1 10 Oct 11 17:10 file-sym2_abs
          -rw-rw---- 1 10 Oct 11 17:10 file-sym2_upover
          -rw-rw---- 1  9 Oct 11 17:10 file-top_abs
          -rw-rw---- 1  9 Oct 11 17:10 file-top_rel

          dst/sym1/dir-out_rel:
          -rw-rw---- 1 13 Oct 11 17:10 dir-out.file

          dst/sym1/dir-sym1:
          -rw-rw---- 1 14 Oct 11 17:10 dir-sym1.file

          dst/sym1/dir-top_rel:
          -rw-rw---- 1 13 Oct 11 17:10 dir-top.file

       Now  all  the  unsafe  symlinks noted above are present in the image, but they have changed to the normal
       files and directories pointed to.

       WARNING:
          This feature lets you copy files outside the context into the image, unlike other  container  builders
          where COPY can never access anything outside the context.

       The sources themselves, if symlinks, do not get special treatment:

          RSYNC /sym1/file-sym1_direct /sym1/file-sym1_upover /dst

          dst:
          lrwxrwxrwx 1 9 Oct 11 17:10 file-sym1_direct -> file-sym1

       Note  that  file-sym1_upover  does  not  appear  in  the  image,  despite  being  named explicitly in the
       instruction, because it is an unsafe symlink.

       If the destination is a symlink to a file, and the source is a file, the link is replaced and the  target
       is unchanged. (If the source is a directory, that is an error.)

          RUN touch /dst/file-dst && ln -s file-dst /dst/file-dst_direct
          RSYNC /file-top /dst/file-dst_direct

          dst:
          -rw-rw---- 1 0 Oct 11 17:42 file-dst
          -rw-rw---- 1 9 Oct 11 17:42 file-dst_direct

       If the destination is a symlink to a directory, the link is followed:

          RUN mkdir /dst/dir-dst && ln -s dir-dst /dst/dir-dst_direct
          RSYNC /file-top /dst/dir-dst_direct

          dst:
          drwxrwx--- 1 16 Oct 11 17:50 dir-dst
          lrwxrwxrwx 1  7 Oct 11 17:50 dir-dst_direct -> dir-dst

          dst/dir-dst:
          -rw-rw---- 1 9 Oct 11 17:50 file-top

   Examples
       Build image bar using ./foo/bar/Dockerfile and context directory ./foo/bar:

          $ ch-image build -t bar -f ./foo/bar/Dockerfile ./foo/bar
          [...]
          grown in 4 instructions: bar

       Same, but infer the image name and Dockerfile from the context directory path:

          $ ch-image build ./foo/bar
          [...]
          grown in 4 instructions: bar

       Build using humongous vendor compilers you want to bind-mount instead of installing into the image:

          $ ch-image build --bind /opt/bigvendor:/opt .
          $ cat Dockerfile
          FROM centos:7

          RUN /opt/bin/cc hello.c
          #COPY /opt/lib/*.so /usr/local/lib   # fail: COPY doesn’t bind mount
          RUN cp /opt/lib/*.so /usr/local/lib  # possible workaround
          RUN ldconfig

BUILD-CACHE

          $ ch-image [...] build-cache [...]

       Print  basic  information  about  the  cache.  If -v is given, also print some Git statistics and the Git
       repository configuration.

       If any of the following options are given, do  the  corresponding  operation  before  printing.  Multiple
       options can be given, in which case the operations happen in the order listed here.

          --dot  Create  a DOT export of the tree named ./build-cache.dot and a PDF rendering ./build-cache.pdf.
                 Requires graphviz and git2dot.

          --gc   Run Git garbage collection on the cache, including full de-duplication of similar  files.  This
                 will immediately remove all cache entries not currently reachable from a named branch (which is
                 likely  to  cause  corruption  if  the  build  cache  is being accessed concurrently by another
                 process). The operation can take a long time on large caches.

          --reset
                 Clear and re-initialize the build cache.

          --tree Print a text tree of the cache using Git’s git log --graph feature. If -v is  also  given,  the
                 tree has more detail.

DELETE

          $ ch-image [...] delete IMAGE_GLOB [IMAGE_GLOB ...]

       Delete the image(s) described by each IMAGE_GLOB from the storage directory (including all build stages).

       IMAGE_GLOB  can  be  either  a  plain image reference or an image reference with glob characters to match
       multiple images. For example, ch-image delete 'foo*' will delete all images whose names start  with  foo.
       Multiple images and/or globs can also be given in a single command line.

       Importantly,  this  sub-command does not also remove the image from the build cache. Therefore, it can be
       used to reduce the size of the storage directory, at the cost of the time needed to retrieve the image
       from the cache if it is wanted again.

       WARNING:
          Glob  characters must be quoted or otherwise protected from the shell, which also desires to interpret
          them and will do so incorrectly.
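
       The effect of quoting can be seen without running ch-image at all. The following sketch uses a
       hypothetical stand-in function, count_args, and two made-up file names; it only shows what the shell
       passes to a command in each case.

```shell
#!/bin/sh
# Illustration only: how the shell treats a quoted and an unquoted glob. No
# ch-image command is run; count_args is a hypothetical stand-in that reports
# how many arguments it received.
count_args() { echo "$#"; }

workdir=$(mktemp -d)
cd "$workdir" || exit 1
: > foo1; : > foo2              # two files that happen to match foo*

unquoted=$(count_args foo*)     # the shell expands the glob to: foo1 foo2
quoted=$(count_args 'foo*')     # the pattern reaches the command intact
echo "unquoted: $unquoted argument(s), quoted: $quoted argument(s)"

cd / && rm -rf "$workdir"
```

       Unquoted, the pattern is matched against files in the current directory rather than against image
       names, which is why the glob must be protected.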

GESTALT

          $ ch-image [...] gestalt [SELECTOR]

       Provide information about the configuration and available features of ch-image. End users generally  will
       not need this; it is intended for testing and debugging.

       SELECTOR is one of:

          • bucache.  Exit  successfully  if  the build cache is available, unsuccessfully with an error message
            otherwise. With -v, also print version information about dependencies.

          • bucache-dot. Exit successfully if build cache DOT trees can be written, unsuccessfully with an error
            message otherwise. With -v, also print version information about dependencies.

          • python-path. Print the path to the Python interpreter in use and exit successfully.

          • storage-path. Print the storage directory path and exit successfully.

LIST

       Print information about images. If no argument is given, list the images in builder storage.

   Synopsis
          $ ch-image [...] list [-l] [IMAGE_REF]

   Description
       Optional arguments:

          -l, --long
                 Use long format (name, last change timestamp) when listing images.

          -u, --undeletable
                 List images that can be undeleted. Can also be spelled --undeleteable.

          IMAGE_REF
                 Print details of what’s known about IMAGE_REF, both locally and in the remote registry, if any.

   Examples
       List images in builder storage:

          $ ch-image list
          alpine:3.17 (amd64)
          alpine:latest (amd64)
          debian:buster (amd64)

       Print details about the Debian Buster image:

          $ ch-image list debian:buster
          details of image:    debian:buster
          in local storage:    no
          full remote ref:     registry-1.docker.io:443/library/debian:buster
          available remotely:  yes
          remote arch-aware:   yes
          host architecture:   amd64
          archs available:     386       bae2738ed83
                               amd64     98285d32477
                               arm/v7    97247fd4822
                               arm64/v8  122a0342878

       For remotely available images like Debian Buster, the associated digest is listed beside  each  available
       architecture.  Importantly,  this  feature  does  not  provide the hash of the local image, which is only
       calculated on push.

IMPORT

          $ ch-image [...] import PATH IMAGE_REF

       Copy the image at PATH into builder storage with name IMAGE_REF. PATH can be:

       • an image directory

       • a tarball with no top-level directory (a.k.a. a “tarbomb”)

       • a standard tarball with one top-level directory
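
       The two tarball layouts differ only in what sits at the top level of the archive. The sketch below
       builds one of each with standard tar(1) and counts the top-level names; the directory and file names
       are made up, and top_level_entries is a hypothetical helper, not a ch-image feature.

```shell
#!/bin/sh
# Illustration with standard tar(1) only. top_level_entries is a hypothetical
# helper that lists the distinct top-level names in a tarball.
top_level_entries() {
    tar -tf "$1" | sed -e 's|^\./||' -e '/^$/d' -e 's|/.*||' | sort -u
}

workdir=$(mktemp -d)
mkdir -p "$workdir/img/bin" "$workdir/img/etc"
: > "$workdir/img/bin/sh"; : > "$workdir/img/etc/passwd"

# Tarbomb: the image contents are at the top level of the archive.
tar -cf "$workdir/bomb.tar" -C "$workdir/img" .
# Standard tarball: everything sits under one top-level directory, img.
tar -cf "$workdir/std.tar" -C "$workdir" img

# $(( ... )) strips any padding that wc -l may add.
bomb_count=$(( $(top_level_entries "$workdir/bomb.tar" | wc -l) ))
std_count=$(( $(top_level_entries "$workdir/std.tar" | wc -l) ))
echo "tarbomb: $bomb_count top-level entries; standard: $std_count"

rm -rf "$workdir"
```

       Here the tarbomb has two top-level entries (bin and etc) and the standard tarball has one (img).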

       If the imported image contains Charliecloud metadata, that  will  be  imported  unchanged,  i.e.,  images
       exported from ch-image builder storage will be functionally identical when re-imported.

       WARNING:
          Descendant  images (i.e., FROM the imported IMAGE_REF) are linked using IMAGE_REF only. If a new image
          is imported under the same IMAGE_REF, all instructions descending from that IMAGE_REF will still hit
          the cache, even if the new image is different.

PULL

       Pull the image described by the image reference IMAGE_REF from a repository to the local filesystem.

   Synopsis
          $ ch-image [...] pull [...] IMAGE_REF [DEST_REF]

       See the FAQ for the gory details on specifying image references.

   Description
       Destination:

          DEST_REF
                 If specified, use this as the destination image reference, rather than IMAGE_REF. This lets you
                 pull an image with a complicated reference while storing it locally with a simpler one.

       Options:

          --last-layer N
                 Unpack only N layers, leaving an incomplete image. This option is intended for debugging.

          --parse-only
                 Parse IMAGE_REF, print a parse report, and exit successfully without talking to the internet or
                 touching the storage directory.

       This  script does a fair amount of validation and fixing of the layer tarballs before flattening in order
       to support unprivileged use despite image problems we frequently see in the  wild.  For  example,  device
       files  are  ignored,  and  file  and  directory  permissions  are increased to a minimum of rw------- and
       rwx------ respectively. Note, however, that symlinks pointing outside the image  are  permitted,  because
       they are not resolved until runtime within a container.
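
       As a rough illustration of raising permissions to a minimum, the sketch below ORs the owner bits
       into an existing mode. It assumes that directories get the rwx------ (0700) floor and regular files
       the rw------- (0600) one, and that the increase is a bitwise OR; min_mode is a hypothetical helper,
       not a ch-image interface.

```shell
#!/bin/sh
# Sketch under assumptions stated above; not ch-image code.
#   $1  existing mode in octal, e.g. 444
#   $2  "dir" for a directory, anything else for a regular file
min_mode() {
    if [ "$2" = dir ]; then floor=0700; else floor=0600; fi
    # A leading 0 makes the shell read both operands as octal.
    printf '%o\n' $(( 0$1 | $floor ))
}

min_mode 444 file    # read-only file gains owner write: 644
min_mode 055 dir     # directory with no owner access gains rwx: 755
min_mode 755 dir     # already above the minimum: unchanged, 755
```

       Permissions above the minimum are left alone; only missing owner bits are added.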

       The  following metadata in the pulled image is retained; all other metadata is currently ignored. (If you
       have a need for additional metadata, please let us know!)

          • Current working directory set with WORKDIR is effective in downstream Dockerfiles.

          • Environment variables set with ENV are effective in  downstream  Dockerfiles  and  also  written  to
            /ch/environment for use in ch-run --set-env.

          • Mount  point  directories specified with VOLUME are created in the image if they don’t exist, but no
            other action is taken.

       Note that some images (e.g., those with a “version 1 manifest”) do not contain  metadata.  A  warning  is
       printed in this case.

   Examples
       Download the Debian Buster image matching the host’s architecture and place it in the storage directory:

          $ uname -m
          aarch64
          $ ch-image pull debian:buster
          pulling image:    debian:buster
          requesting arch:  arm64/v8
          manifest list: downloading
          manifest: downloading
          config: downloading
          layer 1/1: c54d940: downloading
          flattening image
          layer 1/1: c54d940: listing
          validating tarball members
          resolving whiteouts
          layer 1/1: c54d940: extracting
          image arch: arm64
          done

       Same, specifying the architecture explicitly:

          $ ch-image --arch=arm/v7 pull debian:buster
          pulling image:    debian:buster
          requesting arch:  arm/v7
          manifest list: downloading
          manifest: downloading
          config: downloading
          layer 1/1: 8947560: downloading
          flattening image
          layer 1/1: 8947560: listing
          validating tarball members
          resolving whiteouts
          layer 1/1: 8947560: extracting
          image arch: arm (may not match host arm64/v8)

PUSH

       Push the image described by the image reference IMAGE_REF from the local filesystem to a repository.

   Synopsis
          $ ch-image [...] push [--image DIR] IMAGE_REF [DEST_REF]

       See the FAQ for the gory details on specifying image references.

   Description
       Destination:

          DEST_REF
                 If specified, use this as the destination image reference, rather than IMAGE_REF. This lets you
                 push to a repository without permanently adding a tag to the image.

       Options:

          --image DIR
                 Use  the  unpacked  image  located  at  DIR rather than an image in the storage directory named
                 IMAGE_REF.

       Because Charliecloud is fully unprivileged, the owner and group of files in its images are not meaningful
       in the broader ecosystem. Thus,  when  pushed,  everything  in  the  image  is  flattened  to  user:group
       root:root.  Also,  setuid/setgid  bits  are  removed,  to  avoid  surprises  if  the image is pulled by a
       privileged container implementation.
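
       Removing the setuid and setgid bits amounts to masking them out of the file mode. The following is a
       sketch of that arithmetic only, not ch-image code; strip_setid is a hypothetical helper.

```shell
#!/bin/sh
# Sketch, not ch-image code: clear the setuid (04000) and setgid (02000) bits
# of an octal mode and print the result in octal.
strip_setid() {
    printf '%o\n' $(( 0$1 & ~06000 ))
}

strip_setid 4755    # setuid executable becomes 755
strip_setid 2775    # setgid directory becomes 775
strip_setid 644     # no special bits: unchanged, 644
```

       The ordinary permission bits are untouched; only the two special bits are cleared.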

   Examples
       Push a local image to the registry example.com:5000 at path /foo/bar with tag latest. Note that  in  this
       form, the local image must be named to match that remote reference.

          $ ch-image push example.com:5000/foo/bar:latest
          pushing image:   example.com:5000/foo/bar:latest
          layer 1/1: gathering
          layer 1/1: preparing
          preparing metadata
          starting upload
          layer 1/1: a1664c4: checking if already in repository
          layer 1/1: a1664c4: not present, uploading
          config: 89315a2: checking if already in repository
          config: 89315a2: not present, uploading
          manifest: uploading
          cleaning up
          done

       Same,  except  use local image alpine:3.17. In this form, the local image name does not have to match the
       destination reference.

          $ ch-image push alpine:3.17 example.com:5000/foo/bar:latest
          pushing image:   alpine:3.17
          destination:     example.com:5000/foo/bar:latest
          layer 1/1: gathering
          layer 1/1: preparing
          preparing metadata
          starting upload
          layer 1/1: a1664c4: checking if already in repository
          layer 1/1: a1664c4: not present, uploading
          config: 89315a2: checking if already in repository
          config: 89315a2: not present, uploading
          manifest: uploading
          cleaning up
          done

       Same, except use unpacked image located at /var/tmp/image rather  than  an  image  in  ch-image  storage.
       (Also, the sole layer is already present in the remote registry, so we don’t upload it again.)

          $ ch-image push --image /var/tmp/image example.com:5000/foo/bar:latest
          pushing image:   example.com:5000/foo/bar:latest
          image path:      /var/tmp/image
          layer 1/1: gathering
          layer 1/1: preparing
          preparing metadata
          starting upload
          layer 1/1: 892e38d: checking if already in repository
          layer 1/1: 892e38d: already present
          config: 546f447: checking if already in repository
          config: 546f447: not present, uploading
          manifest: uploading
          cleaning up
          done

RESET

          $ ch-image [...] reset

       Delete all images and cache from ch-image builder storage.

UNDELETE

          $ ch-image [...] undelete IMAGE_REF

       If  IMAGE_REF  has been deleted but is in the build cache, recover it from the cache. Only available when
       the cache is enabled, and will not overwrite IMAGE_REF if it exists.

ENVIRONMENT VARIABLES

       CH_IMAGE_USERNAME, CH_IMAGE_PASSWORD
              Username  and  password  for  registry  authentication.   See   important   caveats   in   section
              “Authentication” above.

       CH_LOG_FILE
              If  set, append log chatter to this file, rather than standard error. This is useful for debugging
              situations where standard error is consumed or lost.

              Also sets verbose mode if not already set (equivalent to --verbose).

       CH_LOG_FESTOON
              If set, prepend PID and timestamp to logged chatter.

       CH_XATTRS
              If set, save xattrs in the build cache and restore them when rebuilding from the cache (equivalent
              to --xattrs).
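
       A possible way to set the logging variables for a debugging session is sketched below. The log path
       is an arbitrary choice, and the value 1 is an assumption, since the descriptions above only say "if
       set".

```shell
#!/bin/sh
# Example settings only; the path and the value 1 are illustrative choices.
CH_LOG_FILE=${TMPDIR:-/tmp}/ch-image.log    # chatter is appended here
CH_LOG_FESTOON=1                            # prepend PID and timestamp
export CH_LOG_FILE CH_LOG_FESTOON

# Run a harmless sub-command, but only if ch-image is actually installed.
if command -v ch-image >/dev/null 2>&1; then
    ch-image list
fi
```

       Because CH_LOG_FILE also turns on verbose mode, the log file can grow quickly; unset the variables
       when the session is over.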

REPORTING BUGS

       If Charliecloud was obtained  from  your  Linux  distribution,  use  your  distribution’s  bug  reporting
       procedures.

       Otherwise, report bugs to: https://github.com/hpc/charliecloud/issues

SEE ALSO

       charliecloud(7)

       Full documentation at: <https://hpc.github.io/charliecloud>

COPYRIGHT

       2014–2023, Triad National Security, LLC and others

0.38                                          2024-11-23 16:04 UTC                                   CH-IMAGE(1)