-
Notifications
You must be signed in to change notification settings - Fork 115
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feature request: ability to re-upload a file from given url #33
Comments
That sounds reasonable. It will not be a second type (
We could also have several supported ways. Uploading files should be easy and I don't mind giving people a choice here. |
I don't mind the exact implementation, it was just a suggestion. Passing instance of Also I think it's important to not start the download process until the message is actually being send. Like if I use throttler to slow down sending messages and I "sent" like 20 of them (and they get queued) I don't want files to be downloaded immediately when I set them to be sent but when the queue gets to the message with files. |
This will be achieved automatically because grammY features a custom multipart/form-data implementation that collects all resources lazily when they are needed. For example, if you want to send ten files in a media group with 50 MB each (so 500 MB in one request), then grammY will not pull all files into memory at the same time, and then start streaming them up. Instead, only one file at a time is downloaded, and only a small portion of the file is kept in a buffer at all times. |
I think we have to accept that files can be downloaded automatically several times if a request needs to be retried. This does cause redundant traffic, but I don't see any good alternative. |
This doesn't sound like an unsolvable problem from general programming perspective, but may be hard to implement due to grammY internals, I know nothing about grammY internal code. I think most users will accept that downside and maybe something could be done to fix that in the future. |
If the grammY internals would not matter, what would your solution be? Write the file to disk? Keep in RAM? We somehow have to remember the already downloaded data without OOMing the server. |
I somehow missed your reply earlier. I guess writing to disk is the best solution. I personally would like to reupload ~100MB files (few of them, when send as album). Wasting RAM for that doesn't sound good. In RAM we can keep just the pointers to file path. Meanwhile, while this is an early idea stage: we may think about how flexible this would be. Some servers will deny access for non-browser User Agents, some will deny access without Referer from the same domain being set. Some files may be accessible via POST request. Are we gonna make this flexible and allow setting method and headers and maybe body, which will make it more ugly to use or we will allow just simple GET links with default headers and for any other use case user would have to manually download the file anyway? |
That's totally possible to do. Either way, caching the file is optional and explicit, i.e. the file path must be specified.. How are you going to handle it if a user creates a single instance of an InputFile, and passes it to several API calls? Does this reuse the cached file? If so, that means we cannot clear the cache after a successful request. We can't even do it after a successful update handling. Should the file be deleted again automatically, and if so, when?
We could expose the same type that's used for the base fetch configuration. That would allow users to simply pass their fetch config object, and we'll forward it to fetch. That way, we automatically get support for proxies etc. I agree that it starts to look more verbose for the user at this point, but I think that's acceptable because the complexity of the problem increases. Nonetheless, it should still look pretty if you just want a simple GET request without config. |
Sorry for being late again @KnorpelSenf, I looked at the merged code and I see no fetch config feature (so no POST, no headers), so I guess we can keep this open meanwhile? |
I will take a look at the questions you asked in last comment in the evening |
We can skip the cache for now, but if we want to use it I guess we should count how many times it was used (bump the counter each time some function like sendPhoto receives it) and after successfull (or failed) request we should decrease the counter. Files should be cached until counter is zero. |
This is described in #80. Please read the PR description. It includes a simple workaround.
I do not see how this logic adds up. If a user tries to send a file, and the request fails, and then does not want to try again, we are polluting the cache with more and more files. We cannot guess how grammY will be used. Another point is that the transfer may fail because the server returned a wrong file, e.g. a photo that is too large. When the bot tries the request again, grammY must fetch the new file. It must not used the wrong cached version. How are we going to detect this? It is much easier to leave caching to user land code. Remember that we already have a files plugin (https://grammy.dev/plugins/files) that can store files locally. This makes it trivially easy to download a file, and to try several requests based on the file path. |
Oh, i missed the description I guess you are right with the caching - let's leave that to user |
I want to use for example
sendMediaGroup
with an URL but I don't want grammy to pass the URL to Telegram but instead in the background download the file and upload it as a new filewhy?
Telegram fetches the media information from URL and caches it. Small enough MP4 files with no audio are considered Animations. MP4 with audio will be a Video.
I want to send an album with two mp4 files, one has audio and one has not, so they are Video + Animation. And Telegram does not allow me to send them as media group as Media Groups cannot contain Animations.
If i upload files from disk with InputFile then it's ok:
Telegram won't use it's cache on uploads and will consider my mp4 files a Videos.
If I use URL:
Telegram will return:
Proposed change could be useful in other cases as well - like my bot has access to some url, but telegram servers don't - useful with local http-based microservices.
I can of course download the file manually but it would be a much smoother experience to use something like that:
The text was updated successfully, but these errors were encountered: