Skip to content
/ unfurl Public
forked from tomnomnom/unfurl

Pull out bits of URLs provided on stdin

License

Notifications You must be signed in to change notification settings

stfn/unfurl

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

20 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

unfurl

Pull out bits of URLs provided on stdin

Install

If you have Go installed and configured:

▶ go get -u github.com/tomnomnom/unfurl

Otherwise download the latest binary for your platform, extract it and move it to somewhere in your $PATH (e.g. /usr/bin/):

▶ wget https://github.com/tomnomnom/unfurl/releases/download/v0.0.1/unfurl-linux-amd64-0.0.1.tgz
▶ tar xzf unfurl-linux-amd64-0.0.1.tgz
▶ sudo mv unfurl /usr/bin/

Usage

unfurl works with URLs provided on stdin; they might come from a file like this one:

▶ cat urls.txt
https://sub.example.com/users?id=123&name=Sam
https://sub.example.com/orgs?org=ExCo#about
https://example.net/about#contact

Domains

You can extract the domains from the URLs with the domains mode:

▶ cat urls.txt | unfurl domains
sub.example.com
sub.example.com
example.net

If you don't want to output duplicate values you can use the -u or --unique flag:

▶ cat urls.txt | unfurl --unique domains
sub.example.com
example.net

The -u/--unique flag works for all modes.

Paths

▶ cat urls.txt | unfurl paths
/users
/orgs
/about

Query String Keys

▶ cat urls.txt | unfurl keys
id
name
org

Query String Values

▶ cat urls.txt | unfurl values
123
Sam
ExCo

Query String Key/Value Pairs

▶ cat urls.txt | unfurl keypairs
id=123
name=Sam
org=ExCo

Custom Formats

You can use the format mode to specify a custom output format:

▶ cat urls.txt | unfurl format %d%p
sub.example.com/users
sub.example.com/orgs
example.net/about

The available format directives are:

%%  A literal percent character
%s  The request scheme (e.g. https)
%u  The user info (e.g. user:pass)
%d  The domain (e.g. sub.example.com)
%S  The subdomain (e.g. sub)
%r  The root of domain (e.g. example)
%t  The TLD (e.g. com)
%P  The port (e.g. 8080)
%p  The path (e.g. /users)
%q  The raw query string (e.g. a=1&b=2)
%f  The page fragment (e.g. page-section)
%@  Inserts an @ if user info is specified
%:  Inserts a colon if a port is specified
%?  Inserts a question mark if a query string exists
%#  Inserts a hash if a fragment exists
%a  Authority (alias for %u%@%d%:%P)

Any characters that don't match a format directive remain untouched:

▶ cat urls.txt | unfurl -u format "%d (%s)"
sub.example.com (https)
example.net (http)

Note that if a URL does not include the data requested, there will be no output for that URL:

▶ echo https://example.com | unfurl format "%P"
▶ echo https://example.com:8080 | unfurl format "%P"
8080

Help

▶ unfurl -h
Format URLs provided on stdin

Usage:
  unfurl [OPTIONS] [MODE] [FORMATSTRING]

Options:
  -u, --unique   Only output unique values
  -v, --verbose  Verbose mode (output URL parse errors)

Modes:
  keys     Keys from the query string (one per line)
  values   Values from the query string (one per line)
  keypairs Key=value pairs from the query string (one per line)
  domains  The hostname (e.g. sub.example.com)
  paths    The request path (e.g. /users)
  format   Specify a custom format (see below)

Format Directives:
  %%  A literal percent character
  %s  The request scheme (e.g. https)
  %u  The user info (e.g. user:pass)
  %d  The domain (e.g. sub.example.com)
  %P  The port (e.g. 8080)
  %p  The path (e.g. /users)
  %q  The raw query string (e.g. a=1&b=2)
  %f  The page fragment (e.g. page-section)
  %@  Inserts an @ if user info is specified
  %:  Inserts a colon if a port is specified
  %?  Inserts a question mark if a query string exists
  %#  Inserts a hash if a fragment exists
  %a  Authority (alias for %u%@%d%:%P)

Examples:
  cat urls.txt | unfurl keys
  cat urls.txt | unfurl format %s:https://%d%p?%q

About

Pull out bits of URLs provided on stdin

Resources

License

Stars

Watchers

Forks

Packages

No packages published

Languages

  • Go 86.7%
  • Shell 13.3%