valentin12/wdhypermedia-client

Web Discoverable Hypermedia Client

Usage

An example server is available at https://github.com/FND/wdh

import wdhypermedia

# point the client to the root of the API
client = wdhypermedia.Client("http://localhost:5000/")

Paths to resources are described by relations, not by links.

# request a list of all authors by describing a path of relations to them
# [resource with list of all authors,  the author resource itself]
authors = client.traverse(["http://rels.example.org/authors",
                           "http://rels.example.org/author"])
# list the authors
for author in authors:
    handle = author.props["handle"][0]
    print(handle, end=" ")
    name = author.props.get("name", [""])[0]
    print(name, end=" ")
    website = author.props.get("website", [None])[0]
    if website:
        print(website.uri, end="")
    print()
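
To make the idea of following a path of relations concrete, here is a minimal, library-independent sketch. It is not how wdhypermedia is implemented: the `traverse` function, the `API` dictionary and its in-memory layout are assumptions made purely for illustration.

```python
# Hypothetical in-memory API: each URI maps relation names to the URIs
# reachable from that document via links carrying that relation.
API = {
    "/": {"http://rels.example.org/authors": ["/authors"]},
    "/authors": {
        "http://rels.example.org/author": ["/authors/jdoe", "/authors/janedoe"],
    },
}


def traverse(api, start, rels):
    """Follow each relation in turn, starting from a single URI."""
    current = [start]
    for rel in rels:
        following = []
        for uri in current:
            # Collect every target linked from this document with the given rel
            following.extend(api.get(uri, {}).get(rel, []))
        current = following
    return current


authors = traverse(API, "/", ["http://rels.example.org/authors",
                              "http://rels.example.org/author"])
print(authors)
```

Because only relation names appear in the calling code, the server could move `/authors` to a different URI without breaking it.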

There is also very basic support for forms:

# use a search form
search_site = client.traverse(["http://rels.example.org/search"])[0]
search_form = search_site.forms["search"]
# print available field names
print(list(search_form.params.keys()))
# set term and category to search for articles containing "hello"
search_form.params["category"] = "article"
search_form.params["term"] = "hello"
# submit the request
result = search_form.submit()
# get all articles in the returned resource
articles = result.traverse(["http://rels.example.org/article"])
for article in articles:
    print(article.props["title"])
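
As a rough illustration of what submitting such a form could amount to, the sketch below builds a request URL from filled-in parameters using only the standard library. It assumes the form is submitted with GET and that its action is `/search`; neither is stated above, and `build_form_url` is a hypothetical helper, not part of wdhypermedia.

```python
from urllib.parse import urlencode, urljoin


def build_form_url(base_url, action, params):
    """Build the URL a GET form submission would request (assumed behaviour)."""
    # Skip fields that were never filled in
    filled = {name: value for name, value in params.items() if value is not None}
    return urljoin(base_url, action) + "?" + urlencode(filled)


url = build_form_url(
    "http://localhost:5000/search",  # document that contains the form
    "/search",                       # assumed form action
    {"category": "article", "term": "hello", "limit": None},
)
print(url)
```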

More Examples

Basics

The starting point of a WDH API is a root document that points to the documents relevant to the client. The client finds its way through the API by looking at the rel attribute of the anchor elements, so it keeps working even if link targets change or other anchors are added. An example index.html could look like the following:

It is a standard HTML document that contains a self-reference in its <head>, given as a <link> tag with rel="self". The body contains a list of links to the different documents: a list of articles, a list of authors and a search.

<!DOCTYPE html>
<html>
<head>
    <meta charset="utf-8">
    <title>index</title>
    <link rel="self" href="/">
</head>
<body>
<h1>index</h1>
<ul>
    <li>
        <a href="/articles" rel="http://rels.example.org/articles">articles</a>
    </li>
    <li>
        <a href="/authors" rel="http://rels.example.org/authors">authors</a>
    </li>
    <li>
        <a href="/search" rel="http://rels.example.org/search">search</a>
    </li>
</ul>
</body>
</html>

A Client is used to access the document:

from wdhypermedia import Client

html_str = "..."  # the HTML of the index document shown above
client = Client.from_html(html_str)

# all resources known to the client (key is the URI)
print(client._resources)
# Out:
# {'/': <Resource rel='' uri='/', fetched=True at ...>,
# '/authors': <Resource rel='http://rels.example.org/authors' uri='/authors',
#              fetched=False at ...>,
# '/articles': <Resource rel='http://rels.example.org/articles' uri='/articles',
#               fetched=False at ...>,
# '/search': <Resource rel='http://rels.example.org/search' uri='/search',
#             fetched=False at ...>}

# access one of the resources (as said above, by relation):
# traverse() returns a list of all resources with the given rel
author_lists = client.traverse(["http://rels.example.org/authors"])
author_list = author_lists[0]
print(author_list)
# Out: <Resource rel='http://rels.example.org/authors' uri='/authors', fetched=False>
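
The lookup by rel can be illustrated without the library. The following is a minimal sketch based on the standard library's `html.parser`, not the parser wdhypermedia actually uses; `RelLinkParser` and `find_by_rel` are hypothetical names introduced for this example.

```python
from html.parser import HTMLParser


class RelLinkParser(HTMLParser):
    """Collect the href of every <a> and <link> element, grouped by rel."""

    def __init__(self):
        super().__init__()
        self.links = {}  # rel -> list of hrefs

    def handle_starttag(self, tag, attrs):
        if tag not in ("a", "link"):
            return
        attrs = dict(attrs)
        href = attrs.get("href")
        if href is None:
            return
        # A rel attribute may hold several space-separated relation names
        for rel in (attrs.get("rel") or "").split():
            self.links.setdefault(rel, []).append(href)


def find_by_rel(html, rel):
    """Return the hrefs of all links in the document that carry the given rel."""
    parser = RelLinkParser()
    parser.feed(html)
    return parser.links.get(rel, [])


INDEX_HTML = """
<html><head><link rel="self" href="/"></head>
<body><ul>
<li><a href="/articles" rel="http://rels.example.org/articles">articles</a></li>
<li><a href="/authors" rel="http://rels.example.org/authors">authors</a></li>
</ul></body></html>
"""

print(find_by_rel(INDEX_HTML, "http://rels.example.org/authors"))
```

Anchors without a rel attribute are simply ignored, which is why extra links in the page do not disturb a client of this kind.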

Parsing Properties

A resource document can provide its properties in several ways.

<dl><dt></dt><dd></dd></dl>

This construct is used to provide basic types such as strings, numbers, booleans, timestamps and links that aren't resources. The <dt> element has an optional data-type attribute that describes the type of the content in the <dd> element(s) that follow it. The default data-type is string. The options are:

  • null
  • boolean: true | false
  • number
  • string
  • timestamp: <time datetime="..."></time>
  • link: <a href="...">...</a>

A <dt> tag can be followed by multiple <dd> tags, so every property is stored as a list, even if only one <dd> follows.

<!DOCTYPE html>
<html>
<head>
    <meta charset="utf-8">
    <title>author</title>
    <link rel="self" href="/authors/jdoe">
</head>
<body>
<article>
    <h1>John Doe (jdoe)</h1>
    <dl>
        <dt>handle</dt>
        <dd>jdoe</dd>
        <dt>name</dt>
        <dd>John Doe</dd>
        <dt data-type="link">website</dt>
        <dd>
            <a href="https://jdoe.example.com/blog/">https://jdoe.example.com/blog/</a>
        </dd>
        <dt data-type="timestamp">join-date</dt>
        <dd>
            <time datetime="2014-01-01">2014/01/01</time>
        </dd>
    </dl>
</article>
</body>
</html>

Parsing with the client

html_str = "..."
client = Client.from_html(html_str)
print(client.get_root().props)
# Out:
# {'name': ['John Doe'],
#  'join-date': [datetime.datetime(2014, 1, 1, 0, 0)],
#  'handle': ['jdoe'],
#  'website': [<wdhypermedia.Link at 0x7fe75c1d9240>]}
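
The mapping from <dt>/<dd> pairs to list-valued properties can be sketched with the standard library alone. This is an illustration under assumptions, not the library's parser: `DlParser` is a hypothetical class, links are kept as plain href strings instead of Link objects, and only the `YYYY-MM-DD` timestamp layout is handled.

```python
from datetime import datetime
from html.parser import HTMLParser


class DlParser(HTMLParser):
    """Turn <dt>/<dd> pairs into a dict of property name -> list of values."""

    def __init__(self):
        super().__init__()
        self.props = {}
        self._tag = None       # "dt" or "dd" while inside such an element
        self._name = None      # name taken from the most recent <dt>
        self._type = "string"  # data-type of the most recent <dt>
        self._text = []
        self._value = None     # value taken from a nested <time> or <a>

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "dt":
            self._tag, self._text = "dt", []
            self._type = attrs.get("data-type", "string")
        elif tag == "dd":
            self._tag, self._text, self._value = "dd", [], None
        elif self._tag == "dd" and tag == "time" and self._type == "timestamp":
            if "datetime" in attrs:
                self._value = datetime.strptime(attrs["datetime"], "%Y-%m-%d")
        elif self._tag == "dd" and tag == "a" and self._type == "link":
            self._value = attrs.get("href")

    def handle_data(self, data):
        if self._tag:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "dt" and self._tag == "dt":
            self._name = "".join(self._text).strip()
            self._tag = None
        elif tag == "dd" and self._tag == "dd":
            text = "".join(self._text).strip()
            value = self._value if self._value is not None else self._convert(text)
            # Every property is a list, because several <dd> may follow one <dt>
            self.props.setdefault(self._name, []).append(value)
            self._tag = None

    def _convert(self, text):
        if self._type == "null":
            return None
        if self._type == "boolean":
            return text == "true"
        if self._type == "number":
            return float(text)
        return text


AUTHOR_HTML = """
<dl>
    <dt>handle</dt>
    <dd>jdoe</dd>
    <dt>name</dt>
    <dd>John Doe</dd>
    <dt data-type="link">website</dt>
    <dd><a href="https://jdoe.example.com/blog/">https://jdoe.example.com/blog/</a></dd>
    <dt data-type="timestamp">join-date</dt>
    <dd><time datetime="2014-01-01">2014/01/01</time></dd>
</dl>
"""

parser = DlParser()
parser.feed(AUTHOR_HTML)
print(parser.props)
```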

(Embedded) Resources

By looking for resources with a specific relation, the client can treat related resources much like properties. For example, an article can list its authors as a list of resources instead of putting them in a <dl> tag.

<!DOCTYPE html>
<html>
<head>
    <meta charset="utf-8">
    <title>article</title>
    <link rel="self" href="/articles/0">
</head>
<body>
<article>
    <h1>Hello World</h1>
    <h3>Authors</h3>
    <ul>
        <li>
            <a href="/authors/jdoe"
                    rel="http://rels.example.org/author">
                FND
            </a>
        </li>
        <li>
            <a href="/authors/janedoe"
                    rel="http://rels.example.org/author">
                janedoe
            </a>
        </li>
    </ul>
    <h3>Properties</h3>
    <dl>
        <dt>title</dt>
        <dd>Hello World</dd>
        ...
    </dl>
</article>
</body>
</html>

Python code

html_str = "..."
client = Client.from_html(html_str)

# get the 'normal' props
print(client.get_root().props)
# Out:
# {'title': ['Hello World']}

# get the authors by rel
authors = client.traverse(['http://rels.example.org/author'])
for author in authors:
    print(author)
# Out:
# <Resource rel='http://rels.example.org/author' uri='/authors/jdoe', fetched=False>
# <Resource rel='http://rels.example.org/author' uri='/authors/janedoe', fetched=False>
# At this point a resource holds some information if it is embedded and, as
# here, none if it is only linked. To change this, call author.fetch() to load
# the referenced document from the server, or just access
# author.props["someprop"]: if "someprop" isn't in author.props and the
# document hasn't been loaded yet, the resource loads it first.
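
The load-on-first-access behaviour described in the comments above can be sketched as follows. This is a simplified stand-in, not the library's Resource class: `LazyResource`, its `prop` method and the `fake_fetcher` callable are hypothetical, and the fetcher returns canned data instead of contacting a server.

```python
class LazyResource:
    """Resource that loads its document only when a missing property is read."""

    def __init__(self, uri, fetcher, props=None):
        self.uri = uri
        self.fetched = False
        self._fetcher = fetcher          # callable: uri -> dict of properties
        self._props = dict(props or {})  # properties known before fetching

    def fetch(self):
        """Load the referenced document and merge its properties."""
        self._props.update(self._fetcher(self.uri))
        self.fetched = True

    def prop(self, name):
        # Fetch at most once, and only if the property is not known yet
        if name not in self._props and not self.fetched:
            self.fetch()
        return self._props[name]


calls = []  # records which URIs were requested


def fake_fetcher(uri):
    calls.append(uri)
    return {"handle": ["jdoe"], "name": ["John Doe"]}


author = LazyResource("/authors/jdoe", fake_fetcher)
fetched_before = author.fetched
handle = author.prop("handle")  # triggers the fetch
name = author.prop("name")      # served from the already loaded document
print(fetched_before, handle, name, calls)
```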

Not Implemented

  • Full support for forms (more input elements, different methods)
  • Embedding resources in <dd> tags
  • Support multiple datetime layouts
  • Extended HTML in <dd> tags
