Revert "I don't know how to use GitHub, apparently"
This reverts commit c95bcca.
snappyapple632 committed Aug 17, 2019
1 parent c95bcca commit dcc5ac8
Showing 56 changed files with 3,390 additions and 0 deletions.
674 changes: 674 additions & 0 deletions documentation-171200ee3cd1cd80f7d3f575cef02f1bd51dbf67/LICENSE

Large diffs are not rendered by default.

22 changes: 22 additions & 0 deletions documentation-171200ee3cd1cd80f7d3f575cef02f1bd51dbf67/README.md
@@ -0,0 +1,22 @@
NewPipe Tutorial
================

[![travis_build_state](https://api.travis-ci.org/TeamNewPipe/documentation.svg?branch=master)](https://travis-ci.org/TeamNewPipe/documentation)

This is the [tutorial](https://teamnewpipe.github.io/documentation/) for the [NewPipeExtractor](https://github.com/TeamNewPipe/NewPipeExtractor).
It is for those who want to write their own service, or use NewPipeExtractor in their own projects.

This tutorial and the documentation are in an early state, so [feedback](https://github.com/TeamNewPipe/documentation/issues) is always welcome :D

The tutorial is created using [`mkdocs`](https://www.mkdocs.org/). You can test and host it yourself by running `mkdocs serve` in the root
directory of this project. If you are one of the maintainers and want to deploy your changes, you can run `mkdocs gh-deploy && git push`.

## License

[![GNU GPLv3 Image](https://www.gnu.org/graphics/gplv3-127x51.png)](https://www.gnu.org/licenses/gpl-3.0.en.html)

NewPipe is Free Software: You can use, study, share and improve it at your
will. Specifically you can redistribute and/or modify it under the terms of the
[GNU General Public License](https://www.gnu.org/licenses/gpl.html) as
published by the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
15 changes: 15 additions & 0 deletions documentation-171200ee3cd1cd80f7d3f575cef02f1bd51dbf67/copyright
@@ -0,0 +1,15 @@
Copyright: 2018 Christian Schabesberger <[email protected]>

License: GPL-3.0+
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program. If not, see <https://www.gnu.org/licenses/>.
@@ -0,0 +1,81 @@
# Before You Start

These documents will guide you through the process of understanding or creating your own Extractor
service, which will enable NewPipe to access additional streaming services, such as the currently supported YouTube, SoundCloud and MediaCCC.
The whole documentation consists of this page, which explains the general concept of the NewPipeExtractor, and the [Javadoc](https://teamnewpipe.github.io/NewPipeExtractor/javadoc/) setup.

__IMPORTANT!!!__ This is likely to be the worst documentation you have ever read, so do not hesitate to
[report](https://github.com/teamnewpipe/documentation/issues) any
spelling errors, incomplete parts, or anything you simply don't understand. We are an open community
and everyone is welcome to help :)

## Setting Up Your Dev Environment

First and foremost, you need to meet the following conditions in order to write your own service.

### What You Need to Know:

- A basic understanding of __[git](https://try.github.io)__
- Good __[Java](https://whatpixel.com/best-java-books/)__ knowledge
- A good understanding of __[web technology](https://www.w3schools.com/)__
- A basic understanding of __[unit testing](https://www.vogella.com/tutorials/JUnit/article.html)__ and __[JUnit](https://junit.org/)__
- A thorough understanding of how to [contribute](https://github.com/TeamNewPipe/NewPipe/blob/dev/.github/CONTRIBUTING.md#code-contribution) to the __NewPipe project__

### Tools/Programs You Will Need:

- A dev environment/IDE that supports:
    - __[git](https://git-scm.com/downloads/guis)__
    - __[Java 8](https://www.java.com/en/download/faq/java8.xml)__
    - __[Gradle](https://gradle.org/)__
    - __[Unit testing](https://junit.org/junit5/)__
    - [IDEA Community](https://www.jetbrains.com/idea/) (Strongly recommended, but not required)
- A __[GitHub](https://github.com/)__ account
- A lot of patience and excitement ;D

After making sure all these conditions are met, fork the [NewPipeExtractor](https://github.com/TeamNewPipe/NewPipeExtractor)
using the [fork button](https://github.com/TeamNewPipe/NewPipeExtractor#fork-destination-box).
This gives you a personal repository to develop on. Next, clone this repository into the local folder you want to work in.
Then, import the cloned project into your [IDE](https://www.jetbrains.com/help/idea/configuring-projects.html#importing-project)
and [run it](https://www.jetbrains.com/help/idea/performing-tests.html).
If all the checks are green, you did everything right! You can proceed to the next chapter.

### Importing the NewPipe Extractor in IntelliJ IDEA
If you use IntelliJ IDEA, you should know the easy way of importing the NewPipe extractor. If you don't, here's how to do it:

1. `git clone` the extractor onto your computer locally.
2. Start IntelliJ IDEA and click `Import Project`.
3. Select the root directory of the NewPipe Extractor.
4. Select "__Import Project from external Model__" and then choose __Gradle__.
    ![import from gradle image](img/select_gradle.png)
5. In the next window, select "__Use gradle 'wrapper' task configuration__".
    ![use gradle 'wrapper' task configuration checkbox](img/select_gradle_wrapper.png)

### Running "test" in Android Studio/IntelliJ IDEA

Go to _Run_ > _Edit Configurations_ > _Add New Configuration_ and select "Gradle".
As Gradle Project, select NewPipeExtractor. As a task, add "test". Now save and you should be able to run.

![tests passed on idea](img/prepare_tests_passed.png)

# Inclusion Criteria for Services

After creating your own service, you will need to submit it to our [NewPipeExtractor](https://github.com/teamnewpipe/newpipeextractor)
repository. However, in order to include your changes, you need to follow these rules:

1. Stick to our [Code contribution guidelines](https://github.com/TeamNewPipe/NewPipe/blob/dev/.github/CONTRIBUTING.md#code-contribution)
2. Do not send services that present content we [don't allow](#content-that-is-not-permitted) on NewPipe.
3. You must be willing to maintain your service after submission.
4. Be patient and make the requested changes when one of our maintainers rejects your code.

## Content That is Permitted:

- Any content that is not in the [list of prohibited content](#content-that-is-not-permitted).
- Any kind of pornography or NSFW content that does not violate US law.
- Advertising, which may need to be approved beforehand.

## Content That is NOT Permitted:

- Content that is considered NSFL (Not Safe For Life)
- Content that is prohibited by US federal law (Sexualization of minors, any form of violence, violations of human rights, etc).
- Copyrighted media, without the consent of the copyright holder/publisher.

@@ -0,0 +1,164 @@
# Concept of the Extractor

## The Collector/Extractor Pattern

Before you start coding your own service, you need to understand the basic concept of the extractor itself. There is a pattern
you will find all over the code, called the __extractor/collector__ pattern. The idea behind it is that
the [extractor](https://teamnewpipe.github.io/NewPipeExtractor/javadoc/org/schabi/newpipe/extractor/Extractor.html)
produces fragments of data, and the collector collects them and assembles that data into a readable format for the front end.
The collector also controls the parsing process and takes care of error handling. So, if the extractor fails at any
point, the collector decides whether or not it should continue parsing. This requires the extractor to be made up of
multiple methods, one method for every data field the collector wants to have. The collectors are provided by NewPipe.
You need to take care of the extractors.

### Usage in the Front End

A typical call for retrieving data from a website would look like this:
``` java
Info info;
try {
    // Create a new Extractor with a given context provided as parameter.
    Extractor extractor = new Extractor(some_meta_info);
    // Retrieve the data from the extractor and build the info package.
    info = Info.getInfo(extractor);
} catch (Exception e) {
    // Handle errors when the collector decided to break up the extraction.
}
```

### Typical Implementation of a Single Data Extractor

The typical implementation of a single data extractor, on the other hand, would look like this:
``` java
class MyExtractor extends FutureExtractor {

    public MyExtractor(RequiredInfo requiredInfo, ForExtraction forExtraction) {
        super(requiredInfo, forExtraction);

        ...
    }

    @Override
    public void fetch() {
        // Actually fetch the page data here
    }

    @Override
    public String someDataField()
        throws ExtractionException { // The exception needs to be thrown if something failed
        // Get the piece of information and return it
    }

    ... // More data fields
}
```
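
To make the division of labour concrete, here is a small self-contained sketch of the pattern. None of these names (`ToyExtractor`, `ToyInfo`, `ToyCollector`) exist in NewPipeExtractor; they are hypothetical stand-ins. Which fields count as mandatory or optional is also an assumption made for the example: the point is only that the collector calls one extractor method per field and decides what happens when one of them fails.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an extractor: one method per data field.
interface ToyExtractor {
    String getName() throws Exception;
    String getDescription() throws Exception;
}

// Hypothetical info package assembled by the collector.
final class ToyInfo {
    String name;
    String description;
    final List<Throwable> errors = new ArrayList<>();
}

final class ToyCollector {
    // The collector drives the parsing and decides which failures are fatal.
    static ToyInfo collect(ToyExtractor extractor) throws Exception {
        ToyInfo info = new ToyInfo();
        // Assumption: the name is mandatory, so a failure aborts the extraction.
        info.name = extractor.getName();
        // Assumption: the description is optional, so the error is recorded
        // and the collector carries on.
        try {
            info.description = extractor.getDescription();
        } catch (Exception e) {
            info.errors.add(e);
        }
        return info;
    }
}

final class CollectorDemo {
    static ToyInfo run() throws Exception {
        return ToyCollector.collect(new ToyExtractor() {
            @Override
            public String getName() {
                return "Some video";
            }

            @Override
            public String getDescription() throws Exception {
                throw new Exception("description not found");
            }
        });
    }

    public static void main(String[] args) throws Exception {
        ToyInfo info = run();
        System.out.println(info.name + ", errors: " + info.errors.size());
    }
}
```

The front end only sees the assembled `ToyInfo` plus the list of non-fatal errors, which is the same shape as the `Info.getInfo(extractor)` call shown earlier.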

## Collector/Extractor Pattern for Lists

Information can be represented as a list. In NewPipe, a list is represented by an
[InfoItemsCollector](https://teamnewpipe.github.io/NewPipeExtractor/javadoc/org/schabi/newpipe/extractor/InfoItemsCollector.html).
An InfoItemsCollector will collect and assemble a list of [InfoItem](https://teamnewpipe.github.io/NewPipeExtractor/javadoc/org/schabi/newpipe/extractor/InfoItem.html)s.
For each item that should be extracted, a new Extractor must be created and given to the InfoItemsCollector via [commit()](https://teamnewpipe.github.io/NewPipeExtractor/javadoc/org/schabi/newpipe/extractor/InfoItemsCollector.html#commit-E-).

![InfoItemsCollector_objectdiagram.svg](img/InfoItemsCollector_objectdiagram.svg)

If you are implementing a list in your service, you need to implement an [InfoItemExtractor](https://teamnewpipe.github.io/NewPipeExtractor/javadoc/org/schabi/newpipe/extractor/Extractor.html)
that will be able to retrieve data for one and only one InfoItem. This extractor will then be _committed_ to the __InfoItemsCollector__ that can collect the type of InfoItems you want to generate.

A common implementation would look like this:
```
private SomeInfoItemCollector collectInfoItemsFromElement(Element element) {
    // See *Some* as something like Stream or Channel
    // e.g. StreamInfoItemsCollector, and ChannelInfoItemsCollector are provided by NP
    SomeInfoItemCollector collector = new SomeInfoItemCollector(getServiceId());
    for (final Element li : element.children()) {
        // Commit one new extractor per list element
        collector.commit(new InfoItemExtractor() {
            @Override
            public String getName() throws ParsingException {
                ...
            }
            @Override
            public String getUrl() throws ParsingException {
                ...
            }
            ...
        });
    }
    return collector;
}
```
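
The same idea can be tried out without the library. The sketch below is not NewPipeExtractor code: `ToyItemExtractor`, `ToyItem` and `ToyItemsCollector` are hypothetical names, and skipping a broken item while recording its error is an assumption about how a collector might behave. It shows one extractor being created per item and handed to `commit()`.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical per-item extractor: it knows how to read exactly one item.
interface ToyItemExtractor {
    String getName() throws Exception;
    String getUrl() throws Exception;
}

// Hypothetical assembled item.
final class ToyItem {
    final String name;
    final String url;

    ToyItem(String name, String url) {
        this.name = name;
        this.url = url;
    }
}

// Hypothetical collector: commit() pulls the fields out of one extractor.
final class ToyItemsCollector {
    private final List<ToyItem> items = new ArrayList<>();
    private final List<Throwable> errors = new ArrayList<>();

    void commit(ToyItemExtractor extractor) {
        try {
            items.add(new ToyItem(extractor.getName(), extractor.getUrl()));
        } catch (Exception e) {
            // Assumption: a broken item is skipped, the rest is still collected.
            errors.add(e);
        }
    }

    List<ToyItem> getItems() {
        return items;
    }

    List<Throwable> getErrors() {
        return errors;
    }
}

final class ListCollectorDemo {
    // Each row stands in for one list element: {name, url}.
    static ToyItemsCollector collect(List<String[]> rows) {
        ToyItemsCollector collector = new ToyItemsCollector();
        for (final String[] row : rows) {
            // One new extractor per item.
            collector.commit(new ToyItemExtractor() {
                @Override
                public String getName() throws Exception {
                    if (row[0].isEmpty()) {
                        throw new Exception("missing name");
                    }
                    return row[0];
                }

                @Override
                public String getUrl() {
                    return row[1];
                }
            });
        }
        return collector;
    }
}
```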

## ListExtractor

There is more to know about lists:

1. When a streaming site shows a list of items, it usually offers some additional information about that list, like its title, a thumbnail,
and its creator. Such info can be called the __list header__.

2. When a website shows a long list of items, it usually does not load the whole list, but only a part of it. In order to get more items you may have to click on a next page button, or scroll down.

Both of these problems are solved by the [ListExtractor](https://teamnewpipe.github.io/NewPipeExtractor/javadoc/org/schabi/newpipe/extractor/ListExtractor.html), which takes care of extracting additional metadata about the list
and of chopping lists into several pages, so-called [InfoItemsPage](https://teamnewpipe.github.io/NewPipeExtractor/javadoc/org/schabi/newpipe/extractor/ListExtractor.InfoItemsPage.html)s.
Each page has its own URL, and needs to be extracted separately.


For extracting list header information, a `ListExtractor` behaves like a regular extractor. For handling `InfoItemsPages` it adds methods
such as:

- [getInitialPage()](https://teamnewpipe.github.io/NewPipeExtractor/javadoc/org/schabi/newpipe/extractor/ListExtractor.html#getInitialPage--),
    which returns the first page of InfoItems.
- [getNextPageUrl()](https://teamnewpipe.github.io/NewPipeExtractor/javadoc/org/schabi/newpipe/extractor/ListExtractor.html#getNextPageUrl--).
    If a second page of InfoItems is available, this returns the URL pointing to it.
- [getPage()](https://teamnewpipe.github.io/NewPipeExtractor/javadoc/org/schabi/newpipe/extractor/ListExtractor.html#getPage-java.lang.String-),
    which returns a ListExtractor.InfoItemsPage by its URL, as retrieved by the `getNextPageUrl()` method of the previous page.


The first page is handled specially because many websites, such as YouTube, load the first page of
items like a regular web page, but all the others as an AJAX request.

An InfoItemsPage itself has two constructors which take these parameters:

- The __InfoItemsCollector__ of the list that the page should represent.
- A __nextPageUrl__ which represents the URL of the following page (may be null if no page follows).
- Optionally, __errors__, which is a list of Exceptions that may have happened during extraction.

Here is a simplified reference implementation of a list extractor that only extracts pages, but not metadata:

```
class MyListExtractor extends ListExtractor {
    ...
    private Document document;
    ...
    public InfoItemsPage<SomeInfoItem> getPage(String pageUrl)
        throws ExtractionException {
        SomeInfoItemCollector collector = new SomeInfoItemCollector(getServiceId());
        document = myFunctionToGetThePageHTMLWhatever(pageUrl);
        // Remember this part from the simple list extraction
        for (final Element li : document.children()) {
            collector.commit(new InfoItemExtractor() {
                @Override
                public String getName() throws ParsingException {
                    ...
                }
                @Override
                public String getUrl() throws ParsingException {
                    ...
                }
                ...
            });
        }
        return new InfoItemsPage<SomeInfoItem>(collector, myFunctionToGetTheNextPageUrl(document));
    }

    public InfoItemsPage<SomeInfoItem> getInitialPage() throws ExtractionException {
        // document here got initialized by the fetch() function.
        return getPage(getTheCurrentPageUrl(document));
    }
    ...
}
```
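
From the caller's side, paging boils down to following the next-page URL until there is none. The following is a minimal sketch of that loop with hypothetical types (`ToyPage`, `ToyListExtractor`) backed by an in-memory map instead of a website; the page keys `page1` to `page3` are made up for the example, and the real `ListExtractor` API is richer than this.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical page: some items plus the URL of the next page (null if none).
final class ToyPage {
    final List<String> items;
    final String nextPageUrl;

    ToyPage(List<String> items, String nextPageUrl) {
        this.items = items;
        this.nextPageUrl = nextPageUrl;
    }
}

// Hypothetical list extractor backed by an in-memory map instead of a website.
final class ToyListExtractor {
    private final Map<String, ToyPage> site = new HashMap<>();

    ToyListExtractor() {
        site.put("page1", new ToyPage(Arrays.asList("a", "b"), "page2"));
        site.put("page2", new ToyPage(Arrays.asList("c", "d"), "page3"));
        site.put("page3", new ToyPage(Arrays.asList("e"), null));
    }

    // The first page is requested without the caller knowing its URL.
    ToyPage getInitialPage() {
        return getPage("page1");
    }

    ToyPage getPage(String pageUrl) {
        ToyPage page = site.get(pageUrl);
        if (page == null) {
            throw new IllegalArgumentException("unknown page: " + pageUrl);
        }
        return page;
    }
}

final class PagingDemo {
    // Follow nextPageUrl until a page reports that nothing follows.
    static List<String> loadAll(ToyListExtractor extractor) {
        List<String> all = new ArrayList<>();
        ToyPage page = extractor.getInitialPage();
        all.addAll(page.items);
        while (page.nextPageUrl != null) {
            page = extractor.getPage(page.nextPageUrl);
            all.addAll(page.items);
        }
        return all;
    }
}
```

A real front end would normally load the next page only when the user scrolls or asks for more, rather than draining the whole list at once as `loadAll` does here.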
@@ -0,0 +1,88 @@
# Concept of the LinkHandler

A [LinkHandler](https://teamnewpipe.github.io/NewPipeExtractor/javadoc/org/schabi/newpipe/extractor/linkhandler/LinkHandler.html)
represents a link to a resource such as a video, search request, channel, etc.
The idea is that a video can have multiple links pointing to it, but it has
one unique ID that represents it, as in this example:

[oHg5SJYRHA0](https://www.youtube.com/watch?v=oHg5SJYRHA0) can be represented as:

- [https://www.youtube.com/watch?v=oHg5SJYRHA0](https://www.youtube.com/watch?v=oHg5SJYRHA0) (the default URL for YouTube)
- [https://youtu.be/oHg5SJYRHA0](https://youtu.be/oHg5SJYRHA0) (the shortened link)
- [https://m.youtube.com/watch?v=oHg5SJYRHA0](https://m.youtube.com/watch?v=oHg5SJYRHA0) (the link for mobile devices)
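
As an illustration of how the three links above collapse into one ID, here is a small standalone sketch. The class `VideoIdDemo` and its two regular expressions are assumptions written for this example; they cover only these three URL shapes and are not the patterns used by the real YouTube service.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class VideoIdDemo {
    // Illustrative patterns only; a real service handles many more URL shapes.
    private static final Pattern WATCH = Pattern.compile(
            "^https?://(?:www\\.|m\\.)?youtube\\.com/watch\\?v=([A-Za-z0-9_-]+)");
    private static final Pattern SHORT = Pattern.compile(
            "^https?://youtu\\.be/([A-Za-z0-9_-]+)");

    // Returns the ID, or null if neither pattern matches.
    static String getId(String url) {
        for (Pattern pattern : new Pattern[] {WATCH, SHORT}) {
            Matcher matcher = pattern.matcher(url);
            if (matcher.find()) {
                return matcher.group(1);
            }
        }
        return null;
    }

    // The default URL is rebuilt from the ID alone.
    static String getUrl(String id) {
        return "https://www.youtube.com/watch?v=" + id;
    }

    public static void main(String[] args) {
        System.out.println(getId("https://youtu.be/oHg5SJYRHA0"));
    }
}
```

Whatever form the incoming link has, `getUrl(getId(link))` yields the default URL, which is what makes the ID a stable key for the resource.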

### Important Notes About LinkHandler:
- A simple `LinkHandler` will contain the default URL, the ID, and the original URL.
- `LinkHandler`s are read only.
- `LinkHandler`s are also used to determine which part of the extractor can handle a certain link.
- In order to get one you must either call
[fromUrl()](https://teamnewpipe.github.io/NewPipeExtractor/javadoc/org/schabi/newpipe/extractor/linkhandler/LinkHandlerFactory.html#fromUrl-java.lang.String-) or [fromId()](https://teamnewpipe.github.io/NewPipeExtractor/javadoc/org/schabi/newpipe/extractor/linkhandler/LinkHandlerFactory.html#fromId-java.lang.String-) of the corresponding `LinkHandlerFactory`.
- Every type of resource has its own `LinkHandlerFactory`, e.g. `YoutubeStreamLinkHandlerFactory`, `YoutubeChannelLinkHandlerFactory`, etc.

### Usage

The typical usage for obtaining a LinkHandler would look like this:
```java
LinkHandlerFactory myLinkHandlerFactory = new MyStreamLinkHandlerFactory();
LinkHandler myVideo = myLinkHandlerFactory.fromUrl("https://my.service.com/the_video");
```

### Implementation

In order to use LinkHandler for your service, you must extend the appropriate LinkHandlerFactory and override its abstract methods, e.g.:

```java
class MyStreamLinkHandlerFactory extends LinkHandlerFactory {

    @Override
    public String getId(String url) throws ParsingException {
        // Return the ID based on the URL.
    }

    @Override
    public String getUrl(String id) throws ParsingException {
        // Return the URL based on the ID given.
    }

    @Override
    public boolean onAcceptUrl(String url) throws ParsingException {
        // Return true if this LinkHandlerFactory can handle this type of link
    }
}
```
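
To see how the three methods fit together with `fromUrl()` and `fromId()`, here is a standalone sketch that does not depend on the library. `ToyLinkHandler` and `ToyLinkHandlerFactory` are hypothetical simplifications, not the real classes, and the mobile host `m.my.service.com` is invented for the example.

```java
// Hypothetical read-only value object: original URL, default URL and ID.
final class ToyLinkHandler {
    final String originalUrl;
    final String url;
    final String id;

    ToyLinkHandler(String originalUrl, String url, String id) {
        this.originalUrl = originalUrl;
        this.url = url;
        this.id = id;
    }
}

// Hypothetical base class mirroring the three methods shown above.
abstract class ToyLinkHandlerFactory {
    abstract String getId(String url);

    abstract String getUrl(String id);

    abstract boolean onAcceptUrl(String url);

    // Build a handler from any accepted link; the default URL is derived from the ID.
    ToyLinkHandler fromUrl(String url) {
        if (!onAcceptUrl(url)) {
            throw new IllegalArgumentException("cannot handle " + url);
        }
        String id = getId(url);
        return new ToyLinkHandler(url, getUrl(id), id);
    }

    // Build a handler from an ID; original and default URL are the same here.
    ToyLinkHandler fromId(String id) {
        String url = getUrl(id);
        return new ToyLinkHandler(url, url, id);
    }
}

final class ToyStreamLinkHandlerFactory extends ToyLinkHandlerFactory {
    private static final String BASE = "https://my.service.com/";
    // Assumption: the imaginary service also has a mobile host.
    private static final String MOBILE = "https://m.my.service.com/";

    @Override
    boolean onAcceptUrl(String url) {
        return url.startsWith(BASE) || url.startsWith(MOBILE);
    }

    @Override
    String getId(String url) {
        String prefix = url.startsWith(BASE) ? BASE : MOBILE;
        return url.substring(prefix.length());
    }

    @Override
    String getUrl(String id) {
        return BASE + id;
    }
}
```

With this sketch, `new ToyStreamLinkHandlerFactory().fromUrl("https://m.my.service.com/the_video")` keeps the mobile link as the original URL while exposing `the_video` as the ID and `https://my.service.com/the_video` as the default URL.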

### ListLinkHandler and SearchQueryHandler

List-based resources, like channels and playlists, can be sorted and filtered.
Therefore these types of resources don't just use a LinkHandler, but a class called
[ListLinkHandler](https://teamnewpipe.github.io/NewPipeExtractor/javadoc/org/schabi/newpipe/extractor/linkhandler/ListLinkHandler.html),
which inherits from LinkHandler and adds the field [ContentFilter](https://teamnewpipe.github.io/NewPipeExtractor/javadoc/org/schabi/newpipe/extractor/linkhandler/ListLinkHandler.html#contentFilters),
which is used to filter by resource type, like stream or playlist, and the field
[SortFilter](https://teamnewpipe.github.io/NewPipeExtractor/javadoc/org/schabi/newpipe/extractor/linkhandler/ListLinkHandler.html#sortFilter),
which is used to sort by name, date, or view count.

__!!ATTENTION!!__ Be careful when you implement a content filter: no selected filter is the same as all filters selected. If you get an empty content filter list in your extractor, make sure you return everything. By all means, use "if"
statements like `contentFilter.contains("video") || contentFilter.isEmpty()`.
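
The rule can be captured in a tiny helper. This is only a sketch: `ContentFilterDemo`, `wants` and `selectedTypes` are hypothetical names, and the filter values `"playlist"` and `"channel"` are made up alongside the `"video"` value used above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ContentFilterDemo {
    // An empty filter list means "everything is wanted".
    static boolean wants(List<String> contentFilter, String type) {
        return contentFilter.isEmpty() || contentFilter.contains(type);
    }

    // Hypothetical extractor step deciding which kinds of items to return.
    static List<String> selectedTypes(List<String> contentFilter) {
        List<String> result = new ArrayList<>();
        for (String type : Arrays.asList("video", "playlist", "channel")) {
            if (wants(contentFilter, type)) {
                result.add(type);
            }
        }
        return result;
    }
}
```

Routing every check through one helper like `wants` makes it harder to forget the empty-list case in one branch of the extractor.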

ListLinkHandlers are also created by extending the [ListLinkHandlerFactory](https://teamnewpipe.github.io/NewPipeExtractor/javadoc/org/schabi/newpipe/extractor/linkhandler/ListLinkHandlerFactory.html).
In addition to the abstract methods this factory inherits from the LinkHandlerFactory, you can override
[getAvailableContentFilter()](https://teamnewpipe.github.io/NewPipeExtractor/javadoc/org/schabi/newpipe/extractor/linkhandler/ListLinkHandlerFactory.html#getAvailableContentFilter--)
and [getAvailableSortFilter()](https://teamnewpipe.github.io/NewPipeExtractor/javadoc/org/schabi/newpipe/extractor/linkhandler/ListLinkHandlerFactory.html#getAvailableSortFilter--).
Through these you can tell the front end which kinds of filters your service supports.


#### SearchQueryHandler

You cannot point to a search request with an ID like you point to a playlist or a channel, simply because one and the
same search request might have a different outcome depending on the country or the time you send the request. This is
why the idea of an "ID" is replaced by a "SearchString" in the [SearchQueryHandler](https://teamnewpipe.github.io/NewPipeExtractor/javadoc/org/schabi/newpipe/extractor/linkhandler/SearchQueryHandler.html).
These work like regular ListLinkHandlers, except that you don't have to implement the methods `onAcceptUrl()`
and `getId()` when overriding [SearchQueryHandlerFactory](https://teamnewpipe.github.io/NewPipeExtractor/javadoc/org/schabi/newpipe/extractor/linkhandler/SearchQueryHandlerFactory.html).
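
In practice this usually means that the only thing left to do is to turn a search string into a URL. A minimal sketch of that step, assuming an imaginary service whose search endpoint is `https://my.service.com/search?q=`; the class name `ToySearchUrlBuilder` is hypothetical and is not part of NewPipeExtractor:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

final class ToySearchUrlBuilder {
    // Assumption: the imaginary service takes the query in a "q" parameter.
    private static final String SEARCH_URL = "https://my.service.com/search?q=";

    // Build the search URL from the search string, escaping special characters.
    static String getUrl(String searchString) {
        try {
            return SEARCH_URL + URLEncoder.encode(searchString, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be available on every Java platform.
            throw new IllegalStateException(e);
        }
    }
}
```

Encoding the search string matters because user input may contain characters such as `&` or spaces that would otherwise change the meaning of the URL.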







