Modified event triggered twice #93

Closed
jcouyang opened this issue Mar 2, 2012 · 21 comments · Fixed by #747

Comments

@jcouyang

jcouyang commented Mar 2, 2012

If I copy and paste a file on OS X Lion, it triggers the created event once and the modified event twice.
Shouldn't it trigger the modified event just once?

INFO:root:Created file: /Users/ouyangjichao/BoxSync/rename/testwatchdog.py
INFO:root:Modified file: /Users/ouyangjichao/BoxSync/rename/testwatchdog.py
INFO:root:Modified file: /Users/ouyangjichao/BoxSync/rename/testwatchdog.py

@gorakhargosh
Owner

Are you able to reproduce this issue with the current HEAD or v0.6.0?

@wsantos

wsantos commented Aug 16, 2012

I can reproduce this with HEAD, but I see some strange behaviour: the log command doesn't run twice; the duplication only occurs in shell_command.

@thenSir

thenSir commented Mar 29, 2013

Same situation here as wsantos. Works great but this particular issue made watchdog useless for me.

@samuela

samuela commented Dec 5, 2013

I'm encountering this as well. The on_modified method is called twice whenever I modify a file. First event.src_path is "/path/to/file.txt" and the second time event.src_path is "/path/to".

I'm running 587cbee on Mavericks.

@travcunn

travcunn commented Dec 5, 2013

I'll look into this. This has been an issue for quite some time.

@tamland
Collaborator

tamland commented Dec 5, 2013

First order of business is to check whether the file is actually modified twice or not. E.g., if you touch an existing file and it triggers two events, that would be a bug.

@StephenHesperus

I'm having the same issue when using vim to edit files (with "set nobackup, nowritebackup") under Ubuntu 14.04.
Although this is very late, watchdog.observers.polling.PollingObserver now works fine for me.

@naveenjafer

Was this issue resolved? I just re-saved the same file to disk without modifying it, and it still triggers two events in my case.

@travcunn

Perhaps it would be helpful to do a stat to check when a file was last modified to avoid triggering twice, especially if it's unavoidable from OS-specific implementation.
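
A minimal sketch of that idea, using only the standard library. `MtimeDeduper` is a hypothetical helper, not part of watchdog: it remembers the last modification time seen for each path and rejects a notification when `os.stat` reports the same value. Two genuine writes that land within the filesystem's timestamp resolution would be merged as well, so treat this as an illustration rather than a fix.

```python
import os


class MtimeDeduper:
    """Hypothetical helper (not part of watchdog): drops repeated
    'modified' notifications when the file's mtime has not changed."""

    def __init__(self):
        self._last_mtime = {}  # path -> st_mtime_ns of the last accepted event

    def should_handle(self, path):
        try:
            mtime = os.stat(path).st_mtime_ns
        except OSError:
            # File vanished or is unreadable; forget it and let the caller decide.
            self._last_mtime.pop(path, None)
            return True
        if self._last_mtime.get(path) == mtime:
            return False  # this modification was already reported
        self._last_mtime[path] = mtime
        return True
```

Inside an on_modified(self, event) handler, one would guard the real work with `if deduper.should_handle(event.src_path):`.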

@RRSR

RRSR commented Dec 25, 2017

The same issue still exists i.e. on the creation of a new file the 'modified' event is called twice.

@naveenjafer

naveenjafer commented Dec 25, 2017

@RRSR A workaround to this which I am following is based on a suggestion by @travcunn

https://pastebin.com/gFQBQM0S

The two events are fired right after each other, a few milliseconds apart at most (at least in my case). Hope this helps.

@JMLX42

JMLX42 commented Dec 15, 2018

I have the same issue here on Ubuntu 18.04 64bit: the modified event is triggered twice when I use the cp command in my shell.

@Endogen

Endogen commented Jan 11, 2019

@RRSR wrote: "The same issue still exists i.e. on the creation of a new file the 'modified' event is called twice."

I have the same issue. Can we please get that resolved?

@leon-vv

leon-vv commented Jan 22, 2019

I also have this issue, and I think I found the problem.

In src/watchdog/observers/inotify.py line 162 we have:

elif event.is_modify:
    cls = DirModifiedEvent if event.is_directory else FileModifiedEvent
    self.queue_event(cls(src_path))

which uses the is_modify property. Which comes from src/watchdog/observers/inotify_c.py:

@property
def is_modify(self):
    return self._mask & InotifyConstants.IN_MODIFY > 0

Notice that IN_MODIFY is used as the definition of file modification. But the Inotify documentation (https://inotify.aiken.cz/?section=inotify&page=faq&lang=en) says:

What is the difference between IN_MODIFY and IN_CLOSE_WRITE?
The IN_MODIFY event is emitted on a file content change (e.g. via the write() syscall) while IN_CLOSE_WRITE occurs on closing the changed file. It means each change operation causes one IN_MODIFY event (it may occur many times during manipulations with an open file) whereas IN_CLOSE_WRITE is emitted only once (on closing the file).

Is it better to use IN_MODIFY or IN_CLOSE_WRITE?
It varies from case to case. Usually it is more suitable to use IN_CLOSE_WRITE because if emitted the all changes on the appropriate file are safely written inside the file. The IN_MODIFY event needn't mean that a file change is finished (data may remain in memory buffers in the application). On the other hand, many logs and similar files must be monitored using IN_MODIFY - in such cases where these files are permanently open and thus no IN_CLOSE_WRITE can be emitted.

So the solution is, I think, to expose both the IN_MODIFY and the IN_CLOSE_WRITE event in the Python interface. This way both use cases can be supported.
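
To make the distinction concrete, here is a small simulation in plain Python; it does not call inotify. The two mask values are the ones defined in Linux's `<sys/inotify.h>`. The function `count_notifications` and the sample sequence are illustrative assumptions, modelling a writer that calls write() twice and then closes the file.

```python
# Mask bits as defined in Linux's <sys/inotify.h>.
IN_MODIFY = 0x00000002
IN_CLOSE_WRITE = 0x00000008


def count_notifications(masks, trigger):
    """Count how many raw inotify masks have the `trigger` bit set."""
    return sum(1 for mask in masks if mask & trigger)


# Assumed raw sequence for: open, write(), write(), close().
raw_masks = [IN_MODIFY, IN_MODIFY, IN_CLOSE_WRITE]

print(count_notifications(raw_masks, IN_MODIFY))       # 2
print(count_notifications(raw_masks, IN_CLOSE_WRITE))  # 1
```

A handler keyed on IN_MODIFY fires once per write, while one keyed on IN_CLOSE_WRITE fires once per close, which is why exposing both would cover both use cases.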

@anvesh1212

Is this issue resolved? I'm using watchdog 0.9.0, and when I modify a .txt file through Notepad on Windows 10, the on_modified method gets called once or twice, seemingly at random.

@Endogen

Endogen commented Jun 20, 2019

@anvesh1212 no, not resolved

@anvesh1212

@Endogen Is there any alternative solution? And when can we expect it to be resolved?

@Endogen

Endogen commented Jun 21, 2019

@anvesh1212 I don't think it will ever be fixed. This issue has been open since 2012. I'm going with the workaround that @naveenjafer posted on 25 Dec 2017.

@MalcolmOdd

MalcolmOdd commented Oct 29, 2019

@anvesh1212 On Windows 10 it might be a slightly different issue. The library will provide a different implementation (WindowsApiObserver) when it detects that the platform is Windows. I was able to reproduce the issue 100% of the time on Windows 10 with the default import:

from watchdog.observers import Observer

However I no longer reproduce it by selecting the generic implementation:

from watchdog.observers import PollingObserver as Observer

This could be used as a workaround, although this class may not be guaranteed to be suitable in the future.

@danielloader

danielloader commented Oct 30, 2019

Replying to @MalcolmOdd's comment above (on Windows 10, switching from the default Observer to PollingObserver avoids the duplicate events):

This happens because you're defaulting back to the slow polling rather than relying on the windows filechange API. See more here: https://github.com/gorakhargosh/watchdog#supported-platforms
https://github.com/gorakhargosh/watchdog/blob/master/src/watchdog/observers/polling.py#L126

As an aside I had to use from watchdog.observers.polling import PollingObserver as Observer instead for v0.9.0.

I'm not convinced this would be suited to anything other than tens of files; it'll scale awfully with increasing complexity of file structures.

It's a complicated issue to fix, though it makes things difficult for my use case: something akin to an npm watch command to monitor file changes for compilation of jinja2 templates.

Interestingly, VS Code has some peculiar but expected behaviour when saving: if the file is empty it's a single modification event, and if it's saving with content it's two, presumably one of them being the content dump afterwards. There's not much else to debug here, other than to note that some events will be duplicated and sometimes that could be by design.

Edit: Found this and it works for the most part, but limits some things - https://stackoverflow.com/questions/18599339/python-watchdog-monitoring-file-for-changes just by virtue of having a second timeout to prevent duplicates.

As an example: Running the following in WSL Bash
touch test_file-{0..100000}.txt
and the resulting output from that stackoverflow script was:

Event type: modified  path : .\test\test_file-0.txt
False
Event type: modified  path : .\test\test_file-3334.txt
False
Event type: modified  path : .\test\test_file-7209.txt
False
Event type: modified  path : .\test\test_file-10331.txt
False
Event type: modified  path : .\test\test_file-13916.txt
False
Event type: modified  path : .\test\test_file-17153.txt
False
Event type: modified  path : .\test\test_file-20907.txt
False
Event type: modified  path : .\test\test_file-24328.txt
False
Event type: modified  path : .\test\test_file-27599.txt
False
Event type: modified  path : .\test\test_file-30788.txt
False
Event type: modified  path : .\test\test_file-33740.txt
False
Event type: modified  path : .\test\test_file-37422.txt
False
Event type: modified  path : .\test\test_file-40442.txt
False
Event type: modified  path : .\test\test_file-44172.txt
False
Event type: modified  path : .\test\test_file-47999.txt
False
Event type: modified  path : .\test\test_file-51371.txt
False
Event type: modified  path : .\test\test_file-55067.txt
False
Event type: modified  path : .\test\test_file-58909.txt
False
Event type: modified  path : .\test\test_file-61758.txt
False
Event type: modified  path : .\test\test_file-65516.txt
False
Event type: modified  path : .\test\test_file-69054.txt
False
Event type: modified  path : .\test\test_file-72506.txt
False
Event type: modified  path : .\test\test_file-76257.txt
False
Event type: modified  path : .\test\test_file-79455.txt
False
Event type: modified  path : .\test\test_file-82837.txt
False
Event type: modified  path : .\test\test_file-86725.txt
False
Event type: modified  path : .\test\test_file-90524.txt
False
Event type: modified  path : .\test\test_file-94054.txt
False
Event type: modified  path : .\test\test_file-97403.txt
False

As such, it misses most of the events: if a lot of files change at once, it drops genuine events rather than only duplicates, so there may be some scope to fix that.
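
If the filter keeps a single "last event" timestamp shared by all paths (an assumption about the linked script, which is not reproduced here), every event for a different file inside the timeout is dropped too. A sketch of a per-path variant follows. `PathDebouncer`, its `window` parameter and the injectable `clock` are hypothetical names, not part of watchdog, and the one-second window is only an example value.

```python
import time


class PathDebouncer:
    """Hypothetical helper: accept at most one event per path per window."""

    def __init__(self, window=1.0, clock=time.monotonic):
        self.window = window
        self._clock = clock
        self._last_seen = {}  # path -> time of the last accepted event

    def accept(self, path):
        now = self._clock()
        last = self._last_seen.get(path)
        if last is not None and now - last < self.window:
            return False  # duplicate for this same path inside the window
        self._last_seen[path] = now
        return True
```

Events for different paths never suppress each other here. The dictionary grows with the number of distinct paths seen, so a long-running watcher would want to prune old entries.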

For comparison: Standard observer behaviour when I click save all on 4 modified files:

event type: modified  path : .\test\test_file-b.txt
event type: modified  path : .\test\test_file-a.txt
event type: modified  path : .\test\test_file-c.txt
event type: modified  path : .\test\test_file-d.txt
event type: modified  path : .\test\test_file-b.txt
event type: modified  path : .\test\test_file-d.txt
event type: modified  path : .\test\test_file-c.txt
event type: modified  path : .\test\test_file-a.txt

@danielloader

danielloader commented Oct 30, 2019

So I did some digging and found watchgod, which handles this better, albeit in a different way. I opened 4 files in VS Code, made a change, and did a save all while running the following example:

from watchgod import watch

for changes in watch('.'):
    print(changes)

resulting output:

python .\test.py 
{(<Change.modified: 2>, '.\\test\\test_file-d.txt'), (<Change.modified: 2>, '.\\test\\test_file-c.txt'), (<Change.modified: 2>, '.\\test\\test_file-a.txt'), (<Change.modified: 2>, '.\\test\\test_file-b.txt')}

It's definitely different, but it's not duplicated, so from my POV that's something I can work with. The last thing I want to do is add a FIFO queue and dequeue system to deduplicate these events prior to executing a function on the file.
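
For reference, once events have been collected into a batch, dropping the repeats is a short step. The sketch below assumes events are represented as plain (event_type, path) tuples, which is an illustrative choice rather than watchdog's event objects; `coalesce` is a hypothetical name.

```python
def coalesce(events):
    """Drop repeated (event_type, path) pairs, keeping first-seen order."""
    # dict keys are unique and preserve insertion order (Python 3.7+).
    return list(dict.fromkeys(events))


batch = [
    ("modified", r".\test\test_file-b.txt"),
    ("modified", r".\test\test_file-a.txt"),
    ("modified", r".\test\test_file-b.txt"),
    ("modified", r".\test\test_file-a.txt"),
]
print(coalesce(batch))
```

The harder part, which this does not show, is deciding when a batch is complete.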

Example 2:
$ touch test_file-{a..z}.txt

{(<Change.modified: 2>, '.\\test\\test_file-d.txt'), (<Change.modified: 2>, '.\\test\\test_file-m.txt'), (<Change.modified: 2>, '.\\test\\test_file-f.txt'), (<Change.modified: 2>, '.\\test\\test_file-h.txt'), (<Change.modified: 2>, '.\\test\\test_file-v.txt'), (<Change.modified: 2>, '.\\test\\test_file-w.txt'), (<Change.modified: 2>, '.\\test\\test_file-g.txt'), (<Change.modified: 2>, '.\\test\\test_file-n.txt'), (<Change.modified: 2>, '.\\test\\test_file-q.txt'), (<Change.modified: 2>, '.\\test\\test_file-z.txt'), (<Change.modified: 2>, '.\\test\\test_file-x.txt'), (<Change.modified: 2>, '.\\test\\test_file-i.txt'), (<Change.modified: 2>, '.\\test\\test_file-l.txt'), (<Change.modified: 2>, '.\\test\\test_file-c.txt'), (<Change.modified: 2>, '.\\test\\test_file-p.txt'), (<Change.modified: 2>, '.\\test\\test_file-a.txt'), (<Change.modified: 2>, '.\\test\\test_file-o.txt'), (<Change.modified: 2>, '.\\test\\test_file-r.txt'), (<Change.modified: 2>, '.\\test\\test_file-s.txt'), (<Change.modified: 2>, '.\\test\\test_file-y.txt'), (<Change.modified: 2>, '.\\test\\test_file-j.txt'), (<Change.modified: 2>, '.\\test\\test_file-t.txt'), (<Change.modified: 2>, '.\\test\\test_file-u.txt'), (<Change.modified: 2>, '.\\test\\test_file-e.txt'), (<Change.modified: 2>, '.\\test\\test_file-k.txt'), (<Change.modified: 2>, '.\\test\\test_file-b.txt')}

All 26 files changed were shown in a single response.

Tried the same with 1..1000 and all the files were in the same response, and added print(len(changes)) to the for loop to confirm that too.

All in all it seems to work and it may help the people in this thread.

Disclaimer: It's using file system polling, so it may not be efficient, but the async support presented some perks against that backdrop. Performance comparison isn't that easy with an async function, so I've not tried to wrap it in timeit. That being said, it may be useful to get a snapshot dictionary back of changes in a polling time delta rather than many itemised changes; it depends on the use case.
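
The general idea behind polling can be sketched with the standard library alone: take a snapshot of modification times, take another one later, and report each changed path once. This is not watchgod's or watchdog's actual implementation; `snapshot` and `diff` are hypothetical helpers written for illustration.

```python
import os


def snapshot(root):
    """Map every file path under `root` to its modification time (ns)."""
    state = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                state[path] = os.stat(path).st_mtime_ns
            except OSError:
                pass  # file disappeared between listing and stat
    return state


def diff(old, new):
    """Return a set of (change, path) pairs between two snapshots."""
    changes = set()
    for path, mtime in new.items():
        if path not in old:
            changes.add(("added", path))
        elif old[path] != mtime:
            changes.add(("modified", path))
    for path in old.keys() - new.keys():
        changes.add(("deleted", path))
    return changes
```

Because each path appears at most once per diff, several writes to the same file between two snapshots collapse into a single "modified" entry. The cost is a full directory walk on every poll.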

Note: Just tried it on Arch with Linux 5.3 and it's ORDERS of magnitude faster than Windows. For comparison, I switched to native PowerShell with fsutil to create files and I couldn't even create files quickly enough. On Linux, with bash brace expansion on touch, I could create, and get a dictionary of, 10000 files in less than a second.

Can't confirm on macOS. I also can't think of a way to create many dummy files simultaneously on Windows: a for loop with fsutil seems too slow, and using touch in bash under WSL suffers the NTFS translation layer I/O performance penalty. If anyone has a good suggestion for creating 1000 files on Windows simultaneously for testing, I'm all ears.
