Provided by: trurl_0.16-1_amd64

NAME

       trurl - transpose URLs

SYNOPSIS

       trurl [options / URLs]

DESCRIPTION

       trurl parses, manipulates and outputs URLs and parts of URLs.

       It  uses  the RFC 3986 definition of URLs and it uses libcurl's URL parser to do so, which includes a few
       "extensions". The URL support is limited to "hierarchical" URLs, the ones that use ://  separators  after
       the scheme.

       Typically you pass in one or more URLs and decide which parts of them you want output, possibly modifying
       the URLs as well.

       trurl knows URLs and every URL  consists  of  up  to  ten  separate  and  independent  components.  These
       components  can be extracted, removed and updated with trurl and they are referred to by their respective
       names: scheme, user, password, options, host, port, path, query, fragment and zoneid.

NORMALIZATION

       When provided a URL to work with, trurl "normalizes" it. It means that individual URL components are  URL
       decoded then URL encoded back again and set in the URL.

       Example:

       $ trurl 'http://ex%61mple:80/%62ath/a/../b?%2e%FF#tes%74'
       http://example/bath/b?.%ff#test
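
       The decode-then-encode step can be illustrated outside trurl. The following is a minimal Python sketch
       of the idea, not trurl's implementation: normalize_component is a hypothetical helper, and it does not
       remove dot segments or default ports the way the example above does.

```python
import re
from urllib.parse import quote, unquote_to_bytes


def normalize_component(text: str, safe: str = "") -> str:
    """Percent-decode a component, then percent-encode it again."""
    raw = unquote_to_bytes(text)      # "%62ath" -> b"bath", "%FF" -> b"\xff"
    encoded = quote(raw, safe=safe)   # re-encode only the bytes that need it
    # Python emits upper-case hex digits; the example output above uses lower case.
    return re.sub(r"%[0-9A-F]{2}", lambda m: m.group(0).lower(), encoded)


print(normalize_component("ex%61mple"))            # example
print(normalize_component("/%62ath/b", safe="/"))  # /bath/b
print(normalize_component("%2e%FF"))               # .%ff
print(normalize_component("tes%74"))               # test
```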

OPTIONS

       Options start with one or two dashes. Many of the options require an additional value next to them.

       Any  other  argument  is  interpreted  as  a  URL argument, and is treated as if it was following a --url
       option.

       The first argument that is exactly two dashes (--) marks the end of options; any argument after the end
       of options is interpreted as a URL argument even if it starts with a dash.

       Long options can be provided either as --flag argument or as --flag=argument.

       -a, --append [component]=[data]
              Append data to a component. This can only append data to the path and the query components.

              For path, this URL encodes and appends the new segment to the path, separated with a slash.

              For  query, this URL encodes and appends the new segment to the query, separated with an ampersand
              (&). If the appended segment contains an equal sign (=) that one is kept verbatim and  both  sides
              of the first occurrence are URL encoded separately.
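
              The query rule (keep the first equal sign, encode both sides separately) can be sketched in
              Python as follows. append_query is a hypothetical helper written for illustration; the exact
              set of characters trurl leaves unencoded is not stated here, so this sketch encodes everything
              outside the unreserved set.

```python
from urllib.parse import quote


def append_query(query: str, segment: str, sep: str = "&") -> str:
    """Append a segment to a query, encoding both sides of the first '='."""
    name, equals, value = segment.partition("=")   # split on the first '=' only
    encoded = quote(name, safe="") + equals + quote(value, safe="")
    return encoded if not query else query + sep + encoded


print(append_query("name=hello", "search=life"))  # name=hello&search=life
print(append_query("name=hello", "tag=a&b"))      # name=hello&tag=a%26b
```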

       --accept-space
              When  set, trurl tries to accept spaces as part of the URL and instead URL encode such occurrences
              accordingly.

              According to RFC 3986, a space cannot legally be part of a URL. This option makes a best-effort
              attempt to convert the provided string into a valid URL.

       --as-idn
              Converts a punycode ASCII hostname to its original International Domain Name in  Unicode.  If  the
              hostname is not using punycode then the original hostname is used.

       --curl Only accept URL schemes supported by libcurl.

       --default-port
              When set, trurl uses the scheme's default port number for URLs with a known scheme, and without an
              explicit port number.

              Note that trurl only knows default port numbers for URL schemes that are supported by libcurl.

              Since, by default, trurl removes default port numbers from URLs with a known scheme, this option
              has little effect unless one of --get, --json, or --keep-port is also specified.

       -f, --url-file [filename]
              Read URLs to work on from the given file. Use the filename - (a single minus)  to  tell  trurl  to
              read the URLs from stdin.

              Each  line  needs to be a single valid URL. trurl removes one carriage return character at the end
              of the line if present, trims off all the trailing space and tab characters, and skips  all  empty
              (after trimming) lines.

              The maximum line length supported in a file like this is 4094 bytes. Lines that exceed that length
              are skipped, and a warning is printed to stderr when they are encountered.
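
              The line handling described above can be sketched in Python. usable_urls is a hypothetical
              helper, and the order of the length check relative to the trimming is an assumption, since the
              text does not specify it.

```python
MAX_LINE_BYTES = 4094  # maximum line length stated above


def usable_urls(lines):
    """Yield the lines that would be treated as URLs."""
    for line in lines:
        if line.endswith("\n"):
            line = line[:-1]
        if len(line.encode("utf-8")) > MAX_LINE_BYTES:
            continue                  # trurl warns on stderr and skips such lines
        if line.endswith("\r"):
            line = line[:-1]          # remove one carriage return
        line = line.rstrip(" \t")     # trim trailing spaces and tabs
        if line:                      # skip lines that are empty after trimming
            yield line


sample = ["https://curl.se \t\r\n", "\r\n", "   \n", "example.com\n"]
print(list(usable_urls(sample)))
```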

       -g, --get [format]
              Output text and URL data according to the provided format string. Components from the URL can be
              output when specified as {component} or [component], with the name of the part shown within curly
              braces or brackets. You cannot mix braces and brackets for this purpose in the same command line.

              The  following  component  names  are  available  (case  sensitive):  url, scheme, user, password,
              options, host, port, path, query, fragment and zoneid.

              {component} expands to nothing if the given component does not have a value.

              Components are shown URL decoded by default.

              URL decoding a component may produce data that cannot be displayed properly. When that happens,
              trurl shows a warning unless --quiet is used.

              trurl supports a range of qualifiers, or prefixes, to the component that change how trurl
              handles it:

              If url: is specified, like {url:path}, the component gets output URL encoded. As a shortcut,  url:
              also works written as a single colon: {:path}.

              If  strict:  is specified, like {strict:path}, URL decode problems are turned into errors. In this
              stricter mode, a URL decode problem makes trurl stop what it is doing and return  with  exit  code
              10.

              If  must:  is  specified,  like  {must:query},  it  makes  trurl  return an error if the requested
              component does not exist in the URL. By default a missing component will just be shown blank.

              If default: is specified, like {default:url} or {default:port}, and the  port  is  not  explicitly
              specified in the URL, the scheme's default port is output if it is known.

              If  puny:  is  specified, like {puny:url} or {puny:host}, the punycoded version of the hostname is
              used in the output. This option is mutually exclusive with idn:.

              If idn: is specified like {idn:url} or {idn:host}, the International Domain Name  version  of  the
              hostname  is  used  in  the output if it is provided as a correctly encoded punycode version. This
              option is mutually exclusive with puny:.

              If --default-port is specified, all formats  are  expanded  as  if  they  used  default:;  and  if
              --punycode  is  used,  all  formats  are  expanded  as if they used puny:. Also note that {url} is
              affected by the --keep-port option.

              Hosts  provided  as  IPv6  numerical  addresses  are  provided  within   square   brackets.   Like
              [fe80::20c:29ff:fe9c:409b].

              Hosts  provided  as  IPv4  numerical  addresses  are normalized and provided as four dot-separated
              decimal numbers when output.

              You can access specific keys in the query string using the format {query:key}. Then the  value  of
              the first matching key is output using a case sensitive match. When extracting a URL decoded query
              key that contains %00, such octet is replaced with a single period . in the output.

              You can output all values for a specific key in the query string using the format
              {query-all:key}. This looks for key case sensitively and outputs all values for that key
              space-separated.
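
              The difference between the two lookups can be sketched in Python. query_values is a
              hypothetical helper for illustration only; it ignores URL decoding and assumes the default
              ampersand separator.

```python
def query_values(query: str, key: str, sep: str = "&") -> list:
    """Return every value whose name matches key exactly (case sensitive)."""
    values = []
    for pair in query.split(sep):
        name, _, value = pair.partition("=")
        if name == key:
            values.append(value)
    return values


query = "a=home&here=now&thisthen&a=again"
matches = query_values(query, "a")
print(matches[0])         # like {query:a}: first match only
print(" ".join(matches))  # like {query-all:a}: all matches, space-separated
```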

              The format string supports the following backslash sequences:

              \\ - backslash

              \t - tab

              \n - newline

              \r - carriage return

              \{ - an open curly brace that does not start a variable

              \[ - an open bracket that does not start a variable

              All other text in the format string is shown as-is.

       -h, --help
              Show the help output.

       --iterate [component]=[item1 item2 ...]
              Set  the  component  to  multiple  values  and  output the result once for each iteration. Several
              combined iterations are allowed to generate  combinations,  but  only  one  --iterate  option  per
              component. The listed items to iterate over should be separated by single spaces.

              Example:

              $ trurl example.com --iterate=scheme="ftp https" --iterate=port="22 80"
              ftp://example.com:22/
              ftp://example.com:80/
              https://example.com:22/
              https://example.com:80/
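
              The combinations above are a Cartesian product of the listed items. A minimal Python sketch,
              with iterate_urls as a hypothetical helper that only handles the scheme and port components:

```python
from itertools import product


def iterate_urls(host: str, schemes: str, ports: str) -> list:
    """Combine every scheme with every port; items are separated by single spaces."""
    return [
        f"{scheme}://{host}:{port}/"
        for scheme, port in product(schemes.split(" "), ports.split(" "))
    ]


for url in iterate_urls("example.com", "ftp https", "22 80"):
    print(url)
```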

       --json Outputs  all  set components of the URLs as JSON objects. All components of the URL that have data
              get populated in the parts object using their component  names.  See  below  for  details  on  the
              format.

              The URL components are provided URL decoded. Change that with --urlencode.

       --keep-port
              By default, trurl removes default port numbers from URLs with a known scheme even if they are
              explicitly specified in the input URL. This option makes trurl keep them.

              Example:

              $ trurl https://example.com:443/ --keep-port
              https://example.com:443/

       --no-guess-scheme
              Disables libcurl's scheme guessing feature. URLs that do not  contain  a  scheme  are  treated  as
              invalid URLs.

              Example:

              $ trurl example.com --no-guess-scheme
              trurl note: Bad scheme [example.com]

       --punycode
              Uses  the  punycode version of the hostname, which is how International Domain Names are converted
              into plain ASCII. If the hostname is not using IDN, the regular ASCII name is used.

              Example:

              $ trurl http://åäö/ --punycode
              http://xn--4cab6c/

       --qtrim [what]
              Trims data off a query.

              what is specified as a full name of a name/value pair,  or  as  a  word  prefix  (using  a  single
              trailing  asterisk  (*))  which makes trurl remove the tuples from the query string that match the
              instruction.

              To match a literal trailing asterisk instead of using a wildcard, escape it with  a  backslash  in
              front of it. Like \*.
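
              The matching rule can be sketched in Python. qtrim here is a hypothetical helper, not trurl's
              code; it compares names case sensitively, which is an assumption because the text above does
              not say how case is handled.

```python
def qtrim(query: str, what: str, sep: str = "&") -> str:
    """Remove pairs whose name equals what, or starts with it when what ends in '*'."""
    if what.endswith("\\*"):
        exact, prefix = what[:-2] + "*", None   # escaped: a literal trailing asterisk
    elif what.endswith("*"):
        exact, prefix = None, what[:-1]         # wildcard: match on the prefix
    else:
        exact, prefix = what, None              # plain name: exact match
    kept = []
    for pair in query.split(sep):
        name = pair.partition("=")[0]
        matched = name.startswith(prefix) if prefix is not None else name == exact
        if not matched:
            kept.append(pair)
    return sep.join(kept)


print(qtrim("a12=hej&a23=moo&b12=foo", "a*"))        # b12=foo
print(qtrim("search=hey&utm_source=tracker", "utm_*"))  # search=hey
```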

       --query-separator [what]
              Specify the single character used for separating query pairs. The default is & but, at least in
              the past, semicolons (;) or even colons (:) have sometimes been used for this purpose. If your
              URL uses something other than the default character, setting the right one makes sure trurl can
              do its query operations properly.

              Example:

              $ trurl "https://curl.se?b=name:a=age" --sort-query --query-separator ":"
              https://curl.se/?a=age:b=name

       --quiet
              Suppress (some) notes and warnings.

       --redirect [URL]
              Redirect the URL to this new location. The redirection is performed on the base  URL,  so,  if  no
              base URL is specified, no redirection is performed.

              Example:

              $ trurl --url https://curl.se/we/are.html --redirect ../here.html
              https://curl.se/here.html
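
              Redirection follows the usual rules for resolving a relative reference against a base URL.
              Python's urllib.parse.urljoin applies the same kind of resolution and is used here only to
              illustrate the two cases shown in this manual; it is not what trurl runs internally.

```python
from urllib.parse import urljoin

base = "https://curl.se/we/are.html"

# "../" climbs out of the /we/ directory before the new name is applied.
print(urljoin(base, "../here.html"))  # https://curl.se/here.html

# A bare name replaces only the last path segment.
print(urljoin(base, "here.html"))     # https://curl.se/we/here.html
```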

       --replace [data]
              Replaces a URL query.

              data  can  either take the form of a single value, or as a key/value pair in the shape foo=bar. If
              replace is called on an item that is not in the list of queries trurl ignores that item.

              trurl URL encodes both sides of the = character in the given input data argument.

       --replace-append [data]
              Works the same as --replace, but trurl appends a missing query string if it is not  in  the  query
              list already.

       -s, --set [component][:]=[data]
              Set this URL component. Setting a blank string ("") clears the component from the URL.

              The  following  components  can  be  set:  url, scheme, user, password, options, host, port, path,
              query, fragment and zoneid.

              If a simple =-assignment is used, the data is URL encoded when applied. If := is used, the data is
              assumed to already be URL encoded and stored as-is.

              If ?= is used, the set is  only  performed  if  the  component  is  not  already  set.  It  avoids
              overwriting any already set data.

              You can also combine : and ? into ?:= if desired.

              If  no  URL  or  --url-file argument is provided, trurl tries to create a URL using the components
              provided by the --set options. If not enough components are specified, this fails.
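
              The assignment operators can be modeled with a small Python sketch. apply_set is a hypothetical
              helper that works on a plain dictionary of components; it encodes every character outside the
              unreserved set, which may differ from what trurl does for a given component.

```python
from urllib.parse import quote


def apply_set(components: dict, instruction: str) -> dict:
    """Apply one component=data, :=, ?= or ?:= instruction to a copy of components."""
    target, _, data = instruction.partition("=")
    pre_encoded = target.endswith(":")        # ":=" means data is already encoded
    if pre_encoded:
        target = target[:-1]
    only_if_unset = target.endswith("?")      # "?=" means do not overwrite
    name = target[:-1] if only_if_unset else target

    updated = dict(components)
    if only_if_unset and name in updated:
        return updated                        # keep the existing value
    if data == "":
        updated.pop(name, None)               # a blank string clears the component
        return updated
    updated[name] = data if pre_encoded else quote(data, safe="")
    return updated


print(apply_set({"host": "example.com"}, "user=a b"))   # user is stored as a%20b
print(apply_set({"scheme": "https"}, "scheme?=ftp"))    # scheme stays https
```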

       --sort-query
              The "variable=content" tuples in the query component are sorted in case-insensitive
              alphabetical order. This helps make URLs identical that otherwise differ only in the order of
              their query pairs.
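
              A minimal Python sketch of such a sort follows. sort_query is a hypothetical helper; comparing
              the whole lower-cased pair (rather than only the name) is an assumption made for brevity.

```python
def sort_query(query: str, sep: str = "&") -> str:
    """Sort the pairs of a query string, ignoring case."""
    pairs = query.split(sep)
    return sep.join(sorted(pairs, key=str.lower))


print(sort_query("one=real&two=fake&three=alsoreal"))  # one=real&three=alsoreal&two=fake
print(sort_query("b=name:a=age", sep=":"))             # a=age:b=name
```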

       --trim [component]=[what]
              Deprecated: use --qtrim.

              Trims data off a component. Currently this can only trim a query component.

              what is specified as a full word or as a word prefix (using a single trailing asterisk (*))  which
              makes trurl remove the tuples from the query string that match the instruction.

              To  match  a  literal trailing asterisk instead of using a wildcard, escape it with a backslash in
              front of it. Like \*.

       --url [URL]
              Set the input URL to work with. The URL may be provided without a scheme, which then typically  is
              not  actually a legal URL but trurl tries to figure out what is meant and guess what scheme to use
              (unless --no-guess-scheme is used).

              Providing multiple URLs makes trurl act on all URLs in a serial fashion.

              If the URL cannot be parsed for whatever reason, trurl simply moves on to the next provided URL  -
              unless --verify is used.

       --urlencode
              Outputs URL encoded version of components by default when using --get or --json.

       -v, --version
              Show version information and exit.

       --verify
              When a URL is provided, return an error immediately if it does not parse as a valid URL. Without
              this option, trurl tolerates a bad URL input and moves on to the next one.

URL COMPONENTS

       scheme This is the leading character sequence of a URL, excluding  the  "://"  separator.  It  cannot  be
              specified URL encoded.

              A URL cannot exist without a scheme, but unless --no-guess-scheme is used trurl guesses which
              scheme was intended if none was provided.

              Examples:

              $ trurl https://odd/ -g '{scheme}'
              https

              $ trurl odd -g '{scheme}'
              http

              $ trurl odd -g '{scheme}' --no-guess-scheme
              trurl note: Bad scheme [odd]

       user   After the scheme separator, there can be a username provided. If it ends with a colon  (:),  there
              is  a  password provided. If it ends with an at character (@) there is no password provided in the
              URL.

              Example:

              $ trurl https://user%3a%40:secret@odd/ -g '{user}'
              user:@

       password
              If the password ends with a semicolon (;) there is an options field following. This field is  only
              accepted by trurl for URLs using the IMAP scheme.

              Example:

              $ trurl https://user:secr%65t@odd/ -g '{password}'
              secret

       options
              This field can only end with an at character (@) that separates the options from the hostname.

              $ trurl 'imap://user:pwd;giraffe@odd' -g '{options}'
              giraffe

              If the scheme is not IMAP, the giraffe part is instead considered part of the password:

              $ trurl 'sftp://user:pwd;giraffe@odd' -g '{password}'
              pwd;giraffe

              We strongly advise users to %-encode ;, : and @ in URLs to reduce the risk of confusion.

       host   The host component is the hostname or a numerical IP address. If a hostname is provided, it can be
              an International Domain Name using non-ASCII characters. A hostname can be provided URL encoded.

              trurl provides options for working with the IDN  hostnames  either  as  IDN  or  in  its  punycode
              version.

              Example, convert an IDN name to punycode in the output:

              $ trurl http://åäö/ --punycode
              http://xn--4cab6c/

              Or the reverse, convert a punycode hostname into its IDN version:

              $ trurl http://xn--4cab6c/ --as-idn
              http://åäö/

              If the URL's hostname starts with an open bracket ([) it is a numerical IPv6 address that also
              must end with a closing bracket (]). trurl normalizes IPv6 addresses.

              Example:

              $ trurl 'http://[2001:9b1:0:0:0:0:7b97:364b]/'
              http://[2001:9b1::7b97:364b]/

              A numerical IPv4 address can be specified using one, two, three or four numbers separated with
              dots, and each number can use decimal, octal or hexadecimal notation. trurl normalizes provided
              addresses and uses four dotted decimal numbers in its output.

              Examples:

              $ trurl http://646464646/
              http://38.136.68.134/

              $ trurl http://246.646/
              http://246.0.2.134/

              $ trurl http://246.46.646/
              http://246.46.2.134/

              $ trurl http://0x14.0xb3022/
              http://20.11.48.34/
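
              The arithmetic behind these results: each leading number is one byte, and the last number
              fills all the remaining bytes. The Python sketch below reproduces the four examples.
              normalize_ipv4 is a hypothetical helper, and the prefix conventions (0x for hexadecimal, a
              leading 0 for octal) are assumed from common practice rather than stated in this manual.

```python
def parse_number(text: str) -> int:
    """Parse one part written in decimal, octal (leading 0) or hexadecimal (0x)."""
    if text.lower().startswith("0x"):
        return int(text, 16)
    if len(text) > 1 and text.startswith("0"):
        return int(text, 8)
    return int(text, 10)


def normalize_ipv4(host: str) -> str:
    """Turn a one- to four-part numerical IPv4 address into dotted decimal."""
    parts = [parse_number(p) for p in host.split(".")]
    if not 1 <= len(parts) <= 4:
        raise ValueError("expected one to four numbers")
    *leading, last = parts
    remaining = 4 - len(leading)              # bytes covered by the last number
    if any(n > 255 for n in leading) or last >= 256 ** remaining:
        raise ValueError("number out of range")
    value = last
    for index, number in enumerate(leading):
        value |= number << (8 * (3 - index))  # place each leading byte
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))


for host in ("646464646", "246.646", "246.46.646", "0x14.0xb3022"):
    print(host, "->", normalize_ipv4(host))
```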

       zoneid If the provided host is an IPv6 address, it might contain a specific zoneid, normally a number or
              a network interface name.

              Example:

              $ trurl 'http://[2001:9b1::f358:1ba4:7b97:364b%enp3s0]/' -g '{zoneid}'
              enp3s0

       port   If  the  host ends with a colon (:) then a port number follows. It is a 16 bit decimal number that
              may not be URL encoded.

              trurl knows the default port number for many URL schemes so it can show port numbers for a URL
              even if none was explicitly used in the URL. With --default-port it can add the default port to a
              URL even when none is provided.

              Example:

              $ trurl http:/a --default-port
              http://a:80/

              Similarly, trurl normally hides the port number if the given number is the default.

              Example:

              $ trurl http:/a:80
              http://a/

              But a user can make trurl keep the port even if it is the default, with --keep-port.

              Example:

              $ trurl http:/a:80 --keep-port
              http://a:80/

       path   A URL path is assumed to always start with and contain at least a  slash  (/),  even  if  none  is
              actually provided in the URL.

              Example:

              $ trurl http://xn--4cab6c -g '[path]'
              /

              When setting the path, trurl will inject a leading slash if none is provided:

              $ trurl http://hello -s path="pony"
              http://hello/pony

              $ trurl http://hello -s path="/pony"
              http://hello/pony

              If the input path contains dotdot or dot-slash sequences, they are normalized away.

              Example:

              $ trurl http://hej/one/../two/../three/./four
              http://hej/three/four

              You can append a new segment to an existing path with --append like this:

              $ trurl http://twelve/three?hello --append path=four
              http://twelve/three/four?hello

       query  The query part does not include the leading question mark (?) separator when extracted with trurl.

              Example:

              $ trurl http://horse?elephant -g '{query}'
              elephant

              Example, if you set the query with a leading question mark:

              $ trurl http://horse?elephant -s "query=?elephant"
              http://horse/?%3felephant

              Query  parts  are often made up of a series of name=value pairs separated with ampersands (&), and
              trurl offers several ways to work with such.

              Append a new name value pair to a URL with --append:

              $ trurl http://host?name=hello --append query=search=life
              http://host/?name=hello&search=life

              You can --replace the value of a specific existing name among the pairs:

              $ trurl 'http://alpha?one=real&two=fake' --replace two=alsoreal
              http://alpha/?one=real&two=alsoreal

              If the name you want to replace might not exist in the URL, you can opt to replace or append
              the pair:

              $ trurl 'http://alpha?one=real&two=fake' --replace-append three=alsoreal
              http://alpha/?one=real&two=fake&three=alsoreal

              To compare two URLs by their query name/value pairs, sorting the pairs first increases the
              chances of the comparison working:

              $ trurl "http://alpha/?one=real&two=fake&three=alsoreal" --sort-query
              http://alpha/?one=real&three=alsoreal&two=fake

              Remove name/value pairs from the URL by specifying exact name or wildcard pattern with --qtrim:

              $ trurl 'https://example.com?a12=hej&a23=moo&b12=foo' --qtrim 'a*'
              https://example.com/?b12=foo

       fragment
              The fragment part does not include the leading hash sign (#) separator when extracted with trurl.

              Example:

              $ trurl http://horse#elephant -g '{fragment}'
              elephant

              Example, if you set the fragment with a leading hash sign:

              $ trurl "http://horse#elephant" -s "fragment=#zebra"
              http://horse/#%23zebra

              The  fragment  part  of a URL is for local purposes only. The data in there is never actually sent
              over the network when a URL is used for transfers.

       url    trurl supports url as a named component for --get to allow  for  more  powerful  outputs,  but  of
              course it is not actually a "component"; it is the full URL.

              Example:

              $ trurl ftps://example.com:2021/p%61th -g '{url}'
              ftps://example.com:2021/path

JSON OUTPUT FORMAT

       The  --json  option outputs a JSON array with one or more objects. One for each URL. Each URL JSON object
       contains a number of properties, a series of key/value pairs. The exact set present depends on the  given
       URL.

       url    This  key exists in every object. It is the complete URL. Affected by --default-port, --keep-port,
              and --punycode.

       parts  This key exists in every object, and contains an object with a key for each of  the  settable  URL
              components.  If  a  component is missing, it means it is not present in the URL. The parts are URL
              decoded unless --urlencode is used.

       parts.scheme
              The URL scheme.

       parts.user
              The username.

       parts.password
              The password.

       parts.options
              The options. Note that only a few URL schemes support the "options" component.

       parts.host
              The normalized hostname. It might be a UTF-8 name if an IDN name  was  used.  It  can  also  be  a
              normalized  IPv4  or IPv6 address. An IPv6 address always starts with a bracket ([) - and no other
              hostnames can contain such a symbol. If --punycode is used, the punycode version of  the  host  is
              outputted instead.

       parts.port
              The  provided  port  number  as  a string. If the port number was not provided in the URL, but the
              scheme is a known one, and --default-port is in use, the default port for that scheme is  provided
              here.

       parts.path
              The path. Including the leading slash.

       parts.query
              The full query, excluding the question mark separator.

       parts.fragment
              The fragment, excluding the pound sign separator.

       parts.zoneid
              The  zone id, which can only be present in an IPv6 address. When this key is present, then host is
              an IPv6 numerical address.

       params This key contains an array of query key/value objects. Each such pair is  listed  with  "key"  and
              "value" and their respective contents in the output.

              The key/values are extracted from the query where they are separated by ampersands (&), or by
              the character the user sets with --query-separator.

              The query pairs are listed in order of appearance, left to right, but can be made alpha-sorted
              with --sort-query.

              It is only present if the URL has a query.

EXAMPLES

       Replace the hostname of a URL
              $ trurl --url https://curl.se --set host=example.com
              https://example.com/

       Create a URL by setting components
              $ trurl --set host=example.com --set scheme=ftp
              ftp://example.com/

       Redirect a URL
              $ trurl --url https://curl.se/we/are.html --redirect here.html
              https://curl.se/we/here.html

       Change port number
              This also shows how trurl removes dot-dot sequences
              $ trurl --url https://curl.se/we/../are.html --set port=8080
              https://curl.se:8080/are.html

       Extract the path from a URL
              $ trurl --url https://curl.se/we/are.html --get '{path}'
              /we/are.html

       Extract the port from a URL
              This gets the default port based on the scheme if the port is not set in the URL.
              $ trurl --url https://curl.se/we/are.html --get '{default:port}'
              443

       Append a path segment to a URL
              $ trurl --url https://curl.se/hello --append path=you
              https://curl.se/hello/you

       Append a query segment to a URL
              $ trurl --url "https://curl.se?name=hello" --append query=search=string
              https://curl.se/?name=hello&search=string

       Read URLs from stdin
              $ cat urllist.txt | trurl --url-file -
              ...

       Output JSON
              $ trurl "https://fake.host/search?q=answers&user=me#frag" --json
              [
                {
                  "url": "https://fake.host/search?q=answers&user=me#frag",
                  "parts": {
                      "scheme": "https",
                      "host": "fake.host",
                      "path": "/search",
                      "query": "q=answers&user=me",
                      "fragment": "frag"
                  },
                  "params": [
                    {
                      "key": "q",
                      "value": "answers"
                    },
                    {
                      "key": "user",
                      "value": "me"
                    }
                  ]
                }
              ]
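
              A script can consume this output with any JSON parser. The Python sketch below is an
              illustration only: summarize is a hypothetical helper, and the JSON text is embedded as a
              literal shaped as the JSON output format section describes (with parts as an object), where a
              real script would read it from trurl --json.

```python
import json

sample = """
[
  {
    "url": "https://fake.host/search?q=answers&user=me#frag",
    "parts": {
      "scheme": "https",
      "host": "fake.host",
      "path": "/search",
      "query": "q=answers&user=me",
      "fragment": "frag"
    },
    "params": [
      {"key": "q", "value": "answers"},
      {"key": "user", "value": "me"}
    ]
  }
]
"""


def summarize(text: str) -> list:
    """Return (host, port, params) for each URL object in the array."""
    result = []
    for entry in json.loads(text):
        parts = entry["parts"]
        # "params" is only present when the URL has a query.
        params = {p["key"]: p["value"] for p in entry.get("params", [])}
        # A component missing from "parts" is not present in the URL.
        result.append((parts["host"], parts.get("port"), params))
    return result


print(summarize(sample))
```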

       Remove tracking tuples from query
              $ trurl "https://curl.se?search=hey&utm_source=tracker" --qtrim "utm_*"
              https://curl.se/?search=hey

       Show a specific query key value
              $ trurl "https://example.com?a=home&here=now&thisthen" -g '{query:a}'
              home

       Sort the key/value pairs in the query component
              $ trurl "https://example.com?b=a&c=b&a=c" --sort-query
              https://example.com/?a=c&b=a&c=b

       Work with a query that uses a semicolon separator
              $ trurl "https://curl.se?search=fool;page=5" --qtrim "search" --query-separator ";"
              https://curl.se/?page=5

       Accept spaces in the URL path
              $ trurl "https://curl.se/this has space/index.html" --accept-space
              https://curl.se/this%20has%20space/index.html

       Create multiple variations of a URL with different schemes
              $ trurl "https://curl.se/path/index.html" --iterate "scheme=http ftp sftp"
              http://curl.se/path/index.html
              ftp://curl.se/path/index.html
              sftp://curl.se/path/index.html

EXIT CODES

       trurl returns a non-zero exit code to indicate problems.

       1      A problem with --url-file

       2      A problem with --append

       3      A command line option is missing an argument

       4      A command line option mistake or an illegal option combination.

       5      A problem with --set

       6      Out of memory

       7      Could not output a valid URL

       8      A problem with --qtrim

       9      If --verify is set and the input URL cannot be parsed.

       10     A problem with --get

       11     A problem with --iterate

       12     A problem with --replace or --replace-append

WWW

       https://curl.se/trurl

SEE ALSO

       curl(1), wcurl(1)

trurl                                              2024-09-19                                           trurl(1)