Provided by: datalad_1.1.5-1_all

NAME

       datalad create-sibling-ria - creates a sibling to a dataset in a RIA store

SYNOPSIS


       datalad create-sibling-ria [-h] -s NAME [-d DATASET] [--storage-name NAME] [--alias ALIAS]
              [--post-update-hook] [--shared {false|true|umask|group|all|world|everybody|0xxx}]
              [--group GROUP] [--storage-sibling MODE] [--existing MODE] [--new-store-ok]
              [--trust-level TRUST-LEVEL] [-r] [-R LEVELS] [--no-storage-sibling]
              [--push-url ria+<ssh|file>://<host>[/path]] [--version]
              ria+<ssh|file|http(s)>://<host>[/path]

DESCRIPTION

       Communication with a dataset in a RIA store is implemented via two siblings: a regular Git remote
       (repository sibling) and a git-annex special remote for data transfer (storage sibling), with the
       former having a publication dependency on the latter. By default, the name of the storage sibling is
       derived from the repository sibling's name by appending "-storage".

       The store's base path must either not exist, be an empty directory, or be a valid RIA store.
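
       The three acceptable states of a base path can be told apart with a small check. The following is
       only a sketch: the function name ``classify_store_base`` is hypothetical, and it assumes that a
       directory counts as a valid RIA store when it contains a top-level ``ria-layout-version`` file
       (the file described under *RIA store layout*). DataLad's own validation may be stricter.

```python
import tempfile
from pathlib import Path


def classify_store_base(path):
    """Return 'missing', 'empty', 'store', or 'invalid' for a candidate base path.

    Assumption (not taken from DataLad's code): a non-empty directory is
    treated as a valid RIA store if it has a top-level 'ria-layout-version' file.
    """
    base = Path(path)
    if not base.exists():
        return "missing"
    if not base.is_dir():
        return "invalid"
    if not any(base.iterdir()):
        return "empty"
    if (base / "ria-layout-version").is_file():
        return "store"
    return "invalid"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        print(classify_store_base(root / "does-not-exist"))
        print(classify_store_base(root))
        (root / "ria-layout-version").write_text("1\n")
        print(classify_store_base(root))
```

       Only the first three results ('missing', 'empty', 'store') correspond to the states the command
       accepts; 'invalid' marks everything else.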

       Notes

   *RIA URL format*
       Interactions with new or existing RIA stores require RIA URLs to identify the store or specific  datasets
       inside of it.

       The general structure of a RIA URL pointing to a store takes the form ``ria+[scheme]://<storelocation>``
       (e.g., ``ria+ssh://[user@]hostname:/absolute/path/to/ria-store`` or
       ``ria+file:///absolute/path/to/ria-store``).

       The  general  structure  of  a RIA URL pointing to a dataset in a store (for example for cloning) takes a
       similar form, but appends either the dataset's UUID or a "~" symbol followed by the dataset's alias name:
       ``ria+[scheme]://<storelocation>#<dataset-UUID>`` or ``ria+[scheme]://<storelocation>#~<aliasname>``.  In
       addition,  specific  version  identifiers  can  be  appended  to  the  URL with an additional "@" symbol:
       ``ria+[scheme]://<storelocation>#<dataset-UUID>@<dataset-version>``, where ``dataset-version`` refers  to
       a branch or tag.
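
       The URL forms above can be assembled mechanically. The helper below is a minimal sketch; the name
       ``ria_url`` and its parameters are hypothetical and not part of DataLad. Combining an alias with a
       version identifier is an assumption, since the text above only shows the version suffix together
       with a dataset UUID.

```python
def ria_url(scheme, location, dataset_id=None, alias=None, version=None):
    """Build a RIA URL for a store, or for a dataset inside a store.

    scheme   -- access scheme, e.g. 'ssh' or 'file'
    location -- store location, e.g. 'hostname:/path/to/store' or '/path/to/store'
    """
    if dataset_id and alias:
        raise ValueError("give either a dataset ID or an alias, not both")
    url = f"ria+{scheme}://{location}"
    if dataset_id:
        url += f"#{dataset_id}"
    elif alias:
        url += f"#~{alias}"
    if version:
        if not (dataset_id or alias):
            raise ValueError("a version requires a dataset ID or alias")
        # Assumption: '@<version>' may also follow an alias.
        url += f"@{version}"
    return url


if __name__ == "__main__":
    print(ria_url("file", "/absolute/path/to/ria-store"))
    print(ria_url("ssh", "hostname:/absolute/path/to/ria-store", alias="myalias"))
```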

   *RIA store layout*
       A  RIA  store is a directory tree with a dedicated subdirectory for each dataset in the store. The subdi‐
       rectory name is constructed from the DataLad dataset ID, e.g.  ``124/68afe-59ec-11ea-93d7-f0d5bf7b5561``,
       where the first three characters of the ID are used for an intermediate subdirectory in order to mitigate
       file system limitations for stores containing a large number of datasets.
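
       As an illustration of this naming rule, the following sketch splits a dataset ID into the
       intermediate directory and the remainder. The function name ``dataset_subdir`` is hypothetical.

```python
from pathlib import PurePosixPath


def dataset_subdir(dataset_id):
    """Return the store-relative subdirectory for a DataLad dataset ID.

    The first three characters form an intermediate directory; the rest of
    the ID names the dataset directory inside it.
    """
    if len(dataset_id) <= 3:
        raise ValueError("dataset ID is too short to split")
    return PurePosixPath(dataset_id[:3]) / dataset_id[3:]


if __name__ == "__main__":
    print(dataset_subdir("12468afe-59ec-11ea-93d7-f0d5bf7b5561"))
```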

       By  default,  a dataset in a RIA store consists of two components: A Git repository (for all dataset con‐
       tents stored in Git) and a storage sibling (for dataset content stored in git-annex).

       It is possible to selectively disable either component using ``storage-sibling 'only'`` or
       ``storage-sibling 'off'``, respectively. If neither component is disabled, a dataset's subdirectory
       layout in a RIA store contains a standard bare Git repository and an ``annex/`` subdirectory inside of
       it. The latter holds a git-annex object store and comprises the storage sibling. Disabling the
       standard Git remote (``storage-sibling='only'``) results in no bare Git repository being created;
       disabling the storage sibling (``storage-sibling='off'``) results in no ``annex/`` subdirectory.

       Optionally, there can be a further subdirectory ``archives`` with (compressed) 7z archives of annex
       objects. The storage remote is able to pull annex objects from these archives if it cannot find them
       in the regular annex object store. This feature can be useful for storing large collections of rarely
       changing data on systems that limit the number of files that can be stored.

       Each dataset directory also contains a ``ria-layout-version`` file that identifies the data  organization
       (as, for example, described above).

       Lastly,  there  is  a  global  ``ria-layout-version`` file at the store's base path that identifies where
       dataset subdirectories themselves are located. At present, this file must contain a single  line  stating
       the version (currently "1"). This line MUST end with a newline character.
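
       The single-line, newline-terminated format of the store-level file can be sketched as follows. The
       helper names are hypothetical, and the strictness of the reader (rejecting a missing trailing
       newline or extra lines) is an assumption based on the requirements stated above, not a copy of
       DataLad's parser.

```python
import tempfile
from pathlib import Path


def write_store_version(base, version="1"):
    """Write the store-level 'ria-layout-version' file, ending with a newline."""
    base = Path(base)
    base.mkdir(parents=True, exist_ok=True)
    (base / "ria-layout-version").write_text(f"{version}\n", encoding="utf-8")


def read_store_version(base):
    """Read the version line back, insisting on exactly one terminated line."""
    text = (Path(base) / "ria-layout-version").read_text(encoding="utf-8")
    if not text.endswith("\n"):
        raise ValueError("version line must end with a newline character")
    lines = text.splitlines()
    if len(lines) != 1 or not lines[0]:
        raise ValueError("file must contain a single non-empty line")
    return lines[0]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        write_store_version(tmp)
        print(read_store_version(tmp))
```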

       It  is  possible  to  define  an  alias  for an individual dataset in a store by placing a symlink to the
       dataset location into an ``alias/`` directory in the root of the store. This enables dataset  access  via
       URLs of format: ``ria+<protocol>://<storelocation>#~<aliasname>``.
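
       Creating such an alias by hand amounts to adding one symlink. The sketch below assumes the dataset
       directory follows the layout described earlier (first three ID characters as an intermediate
       directory) and uses a relative link target so that the store can be moved as a whole; both the
       function name ``add_alias`` and the choice of a relative target are assumptions. In normal use,
       the ``--alias`` option creates the link.

```python
import os
import tempfile
from pathlib import Path


def add_alias(store, dataset_id, alias):
    """Create <store>/alias/<alias> as a symlink to the dataset directory."""
    store = Path(store)
    target = store / dataset_id[:3] / dataset_id[3:]
    if not target.is_dir():
        raise FileNotFoundError(f"no dataset directory at {target}")
    alias_dir = store / "alias"
    alias_dir.mkdir(exist_ok=True)
    link = alias_dir / alias
    # Relative target (assumption): resolves from inside the alias/ directory.
    os.symlink(os.path.join("..", dataset_id[:3], dataset_id[3:]), link)
    return link


if __name__ == "__main__":
    ds_id = "12468afe-59ec-11ea-93d7-f0d5bf7b5561"
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / ds_id[:3] / ds_id[3:]).mkdir(parents=True)
        print(add_alias(tmp, ds_id, "myalias").name)
```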

       Compared to standard git-annex object stores, the ``annex/`` subdirectories used as storage siblings
       follow a different directory naming scheme ('dirhashmixed' instead of 'dirhashlower'). This is mostly
       a technical detail, but it also serves as a reminder to git-annex power users to refrain from running
       git-annex commands directly in the store, as this can cause severe damage due to the layout
       difference. Interactions should be handled via the ORA special remote instead.

   *Error logging*
       To enable error logging at the remote end, append a pipe symbol and an "l" to the version number in
       ``ria-layout-version`` (like so: ``1|l\n``).

       Error logging will create files in an "error_log" directory whenever the git-annex special remote
       (storage sibling) raises an exception, storing its Python traceback. The log files are named
       according to the scheme ``<dataset id>.<annex uuid of the remote>.log``, showing "who" ran into this
       issue with which dataset. Because logging can potentially leak personal data (for example, local file
       paths), it can be disabled client-side by setting the configuration variable
       ``annex.ora-remote.<storage-sibling-name>.ignore-remote-config``.
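
       The flag syntax and the log file naming scheme can be illustrated with two small helpers. Both
       names are hypothetical, and the parser is a sketch that only knows about the "l" flag described
       above; it makes no claim about other flags DataLad may support.

```python
def parse_layout_version(line):
    """Split a ria-layout-version line into (version, error_logging_enabled).

    The part before an optional '|' is the version; an 'l' after the pipe
    symbol is taken to enable error logging.
    """
    version, _, flags = line.rstrip("\n").partition("|")
    return version, "l" in flags


def error_log_name(dataset_id, annex_uuid):
    """Name a log file as <dataset id>.<annex uuid of the remote>.log."""
    return f"{dataset_id}.{annex_uuid}.log"


if __name__ == "__main__":
    print(parse_layout_version("1|l\n"))
    print(parse_layout_version("1\n"))
```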

OPTIONS

       ria+<ssh|file|http(s)>://<host>[/path]
              URL identifying the target RIA store and access protocol. If ``--push-url`` is given in addition,
              this is used for read access only. Otherwise it will be used for write access too and to create
              the repository sibling in the RIA store. Note that HTTP(S) is currently valid for consumption
              only, so ``--push-url`` must be provided as well. Constraints: value must be a string or value
              must be NONE

       -h, --help, --help-np
              show this help message. --help-np forcefully disables the use of a pager for displaying  the  help
              message

       -s NAME, --name NAME
              Name of the sibling. With --recursive, the same name will be used to label all the subdatasets'
              siblings. Constraints: value must be a string or value must be NONE

       -d DATASET, --dataset DATASET
              specify the dataset to process. If no dataset is given, an attempt is made to identify the dataset
              based on the current working directory. Constraints: Value must be a Dataset or a valid identifier
              of a Dataset (e.g. a path) or value must be NONE

       --storage-name NAME
              Name of the storage sibling (git-annex special remote). Must not be identical to the sibling name.
              If  not  specified, defaults to the sibling name plus '-storage' suffix. If only a storage sibling
              is created, this setting is ignored, and the primary sibling name is used. Constraints: value must
              be a string or value must be NONE

       --alias ALIAS
              Alias for the dataset in the RIA store. Add the necessary symlink so  that  this  dataset  can  be
              cloned from the RIA store using the given ALIAS instead of its ID. With --recursive, only the
              top dataset will be aliased. Constraints: value must be a string or value must be NONE

       --post-update-hook
              Enable  Git's default post-update-hook for the created sibling. This is useful when the sibling is
              made accessible via a "dumb server" that requires running 'git update-server-info' to let Git  in‐
              teract properly with it.

       --shared {false|true|umask|group|all|world|everybody|0xxx}
              If given, configures the permissions in the RIA store for multi-user access. Possible values for
              this option are identical to those of `git init --shared` and are described in its  documentation.
              Constraints:  value  must  be  a string or value must be convertible to type bool or value must be
              NONE

       --group GROUP
              Filesystem group for the repository. Specifying the group is crucial when --shared=group is
              used. Constraints: value must be a string or value must be NONE

       --storage-sibling MODE
              By  default,  an ORA storage sibling and a Git repository sibling are created (on). Alternatively,
              creation of the storage sibling can be disabled (off), or a storage sibling created  only  and  no
              Git  sibling  (only). In the latter mode, no Git installation is required on the target host. Con‐
              straints: value must be one of ('only',) or value must be convertible to type bool or  value  must
              be NONE [Default: True]

       --existing MODE
              Action to perform, if a (storage) sibling is already configured under the given name and/or a tar‐
              get already exists. In this case, a dataset can be skipped ('skip'), an existing target repository
              be  forcefully  re-initialized, and the sibling (re-)configured ('reconfigure'), or the command be
              instructed to fail ('error'). Constraints: value must be one of ('skip',  'error',  'reconfigure')
              [Default: 'error']

       --new-store-ok
              When  set, a new store will be created, if necessary. Otherwise, a sibling will only be created if
              the URL points to an existing RIA store.

       --trust-level TRUST-LEVEL
              specify a trust level for the storage sibling. If not specified, the default git-annex trust level
              is used. 'trust' should be used with care (see the git-annex-trust man page).  Constraints:  value
              must be one of ('trust', 'semitrust', 'untrust')

       -r, --recursive
              if set, recurse into potential subdatasets.

       -R LEVELS, --recursion-limit LEVELS
              limit  recursion  into  subdatasets to the given number of levels. Constraints: value must be con‐
              vertible to type 'int' or value must be NONE

       --no-storage-sibling
              This option is deprecated. Use '--storage-sibling off' instead.

       --push-url ria+<ssh|file>://<host>[/path]
              URL identifying the target RIA store and access protocol for write access to the storage sibling.
              If given, this will also be used for creation of the repository sibling in the RIA store.
              Constraints: value must be a string or value must be NONE

       --version
              show the module and its version which provides the command
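
       To show how the options above combine, the following sketch assembles an argument list for the
       command without running it. The function ``build_command`` and its parameter names are
       hypothetical; only the flags themselves are taken from this page. The accepted MODE strings for
       ``--storage-sibling`` are assumed to be 'on', 'off', and 'only', as named in that option's
       description.

```python
import shlex


def build_command(url, name, dataset=None, alias=None, storage_sibling=None,
                  existing=None, new_store_ok=False, push_url=None,
                  recursive=False):
    """Return an argv list for 'datalad create-sibling-ria' (not executed here)."""
    if existing not in (None, "skip", "error", "reconfigure"):
        raise ValueError("existing must be 'skip', 'error', or 'reconfigure'")
    if storage_sibling not in (None, "on", "off", "only"):
        raise ValueError("storage_sibling must be 'on', 'off', or 'only'")
    cmd = ["datalad", "create-sibling-ria", "-s", name]
    if dataset:
        cmd += ["-d", dataset]
    if alias:
        cmd += ["--alias", alias]
    if storage_sibling:
        cmd += ["--storage-sibling", storage_sibling]
    if existing:
        cmd += ["--existing", existing]
    if new_store_ok:
        cmd.append("--new-store-ok")
    if push_url:
        cmd += ["--push-url", push_url]
    if recursive:
        cmd.append("-r")
    cmd.append(url)  # the RIA URL is the positional argument
    return cmd


if __name__ == "__main__":
    argv = build_command("ria+file:///absolute/path/to/ria-store", "ria",
                         alias="myalias", new_store_ok=True)
    print(shlex.join(argv))
```

       The resulting list can be passed to ``subprocess.run`` on a system where DataLad is installed.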

AUTHORS

        datalad is developed by The DataLad Team and Contributors <team@datalad.org>.

datalad create-sibling-ria 1.1.5                   2025-03-03                      datalad create-sibling-ria(1)