-
Notifications
You must be signed in to change notification settings - Fork 64
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Active Display by Mouse Location #93
Active Display by Mouse Location #93
Conversation
@cparish312 weird frame dimensions were probably due to scaling factors... which get weird and tricky when different monitors have different scaling factors - Also - what exactly does DisplayID being |
Okay cool I can look into the App: Xcode App: Firefox App: Messages From my understanding (at least for my device and settings), DisplayID being 1 means that the active display is the main display of the computer. Something else I came across is the Screens have SeparateSpaces setting that could impact a bunch of these APIs. I have mine set to true. |
How do you feel about using the mouse for tracking the active display? |
I'm too ignorant of mac apis to be able to say it confidently, but i'd be very hesitant to rely on a specific display id to determine anything ( re using mouse to track the active display - feels relatively (or even quite) sketchy. if i left my mouse on my external monitor, and used cmd + tab to swap over to my ide and start typing, i'd sure hope rem would record wherever i am "active", not necessarily where my mouse happens to be |
Could we do something like |
I agree the mouse tracking is not ideal but I wonder if there's a good reason rewind decided to use it. My guess would be robustness given how fickle these API window approaches seem to be. Just tried |
That said I am currently using my mouse for scrolling even though my active window is on a different monitor so the trade off seems to go both ways (for me at least) |
Maybe mouse event steals focus to wherever it is and keyboard event falls back to active window? |
We have global listeners for both that could trigger |
Unfortunately, not seeing any way to get the active window from the actual keyboard event listener. I'm a bit worried about this approach bringing back the original bug. Imagine you scroll over to an external monitor that is in full screen mode (start capturing on that monitor). Then, you click and start typing (starts recording your main monitor). Really wish |
Does CGMainDisplayID have the same problem? |
I think NSScreen.main is "the screen with the menu bar" |
|
They should make a NSScreen.menuBar for that 😂 |
I mean another option is to completely rethink the multiple monitor recording where one possibility is just recording the displays with changes, but this could obviously get more tricky than it is worth at this stage (unless we came up with some elegant way to achieve this). Also this assumes the capturing and image comparison steps are relatively cheap computationally. |
Overall I'm leaning towards mouse location unless you're strongly opposed (or we find the "real" NSScreen.main without having to use the Accessibility API) |
Can we make a setting that says "always record window with mouse"? |
Yeah sure! And then otherwise it defaults to |
Yeah until we find a better solution! |
let mouseLocation = NSEvent.mouseLocation | ||
if let screen = NSScreen.screens.first(where: {$0.frame.contains(mouseLocation)}) { | ||
if let screenID = screen.deviceDescription[NSDeviceDescriptionKey("NSScreenNumber")] as? NSNumber { | ||
displayID = CGDirectDisplayID(screenID.uint32Value) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Put mouse first since it sometimes results a nil displayID when computer wakes up
guard displayID != nil else { return } | ||
|
||
guard displayID != nil else { | ||
logger.debug("DisplayID is nil") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are still some instances where recording will randomly stop so added these debugs
This is a great compromise! It is probably worth putting this in the docs somewhere as it would be frustrating to realize rem was never recording your external monitors in fullscreen. |
|
||
let frameId = DatabaseManager.shared.insertFrame(activeApplicationName: activeApplicationName) | ||
|
||
if settingsManager.settings.onlyOCRFrontmostWindow { | ||
if settingsManager.settings.onlyOCRFrontmostWindow && displayID == CGMainDisplayID() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this still applicable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah so for mouse mode it is to make sure the image isn't cropped when the active window is on the main display but the recorded display is not the main display.
For active window mode it doesn't matter (for my set up at least) since the cropping step will fail because the window.frame will have negative indexes (Also the frame would be wrong either way). I briefly tried to recover this functionality for the external monitors but ran into the funky window.frame behaviors. Did you think that the cropping was working for external monitors previously?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
funky window frame behaviors
Is the backingScalingFactor
being properly applied to the window frames? that's usually the culprit.
I assume the window frames need to be repositioned based on the monitor.
I think there's a larger issue here.
So - should checking the active mouse mode uncheck the only ocr frontmost window setting?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Seems like they can't be used together)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I mean theoretically it could work when both the mouse and active window are on display but would require some more complex handling (that we can't even currently do without consistently identifying the display of the active window).
Also, I think the reality is for people with multiple monitors most of the time the front window will be the entire display so I don't think the feature is a big save. I'm going to look at the window.frame problem with some fresh eyes and try the backingScalingFactor
again but assuming that doesn't work one option could be ignoring settingsManager.settings.onlyOCRFrontmostWindow
when there are multiple displays. Otherwise the whole thing gets a bit funky since the window is only ever cropped on the main display.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With every window.frame
exploration the mystery deepens. So It seems like it is possible to match up a window.frame
to a screen location by making the y coordinate of the screen negative. However, I would be very hesitant to assume this is the case for all set ups.
Here is an example where I had all 3 windows on fullscreen:
App: Xcode
window.frame: (712.0, -1056.0, 1920.0, 54.0)
Screen: VG245, Display ID: 5, Coordinates: x=712.0, y=982.0, Width=1920.0, Height=1080.0
backingScaleFactor: 1.0
App: Firefox
window.frame: (-1208.0, -1080.0, 1920.0, 72.0)
Screen: HP 24mh, Display ID: 3, Coordinates: x=-1208.0, y=982.0, Width=1920.0, Height=1080.0
backingScaleFactor: 1.0
App: Messages
window.frame: (0.0, 37.0, 1512.0, 68.0)
Screen: Built-in Retina Display, Display ID: 1, Coordinates: x=0.0, y=0.0, Width=1512.0, Height=982.0
backingScaleFactor: 2.0
"Funky" behavior:
- Weird very small heights (sometimes) when in full screen. Ran two back to back tests without changing the code or moving the window and one had a height of 72 and one had 1080. Strange the widths always seem to be reasonable 🤷
- The
window.frame
doesn't change when the window is manipulated. I tried both moving it around and resizing it. - I have a terminal open on each monitor and it seems to always return the
window.frame
of the terminal on the main monitor even if I am actively using the terminal on an external monitor.
From my perspective, the cleanest solution is to ignore the settingsManager.settings.onlyOCRFrontmostWindow
setting when there are multiple displays. Based on some online research and my belief that rewind just uses mouse location and no window based OCR optimization, I'm not entirely sure a robust solution exists given the current APIs (without using the Accessibility API). unlost uses the Accessibility API to do window based OCR but it causes this annoying icon to appear on the active window. Also for most users with multiple monitors, I doubt it will be much of a compute save anyways.
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
only running ocr on the current window can be a pretty large savings especially for people using multiple monitors. if i have a 4k monitor, i can easily have 4-5 open windows with text. only doing 1 is a big deal. running rem 24/7 could/can be expensive from an energy perspective, so always working to reduce it as much as possible.
if you're using external monitors though - probably plugged in.
One other thing worth noting is if I wake up and my mouse is on my one monitor that takes a second to get an active display (or if a monitor is unplugged), the Also, thanks for all of the support with these updates! |
What do you mean by "the unplugging"? I'd say retry a few times and then stop remembering seems like a fine solution here! I want to enable you to get this over the line, so is this an existing issue? Or does it cause problems as a result of this change set? If the former, happy to get this in as is! |
This PR is addressing #91
The issue was that when a window in in full screen mode on an external display
NSScreen.main
is set to 1 (which is the main monitor) even if the active application is on the external monitor. Since this was method used to determine which display to record, an external monitor in fullscreen mode was never able to be recorded.I attempted to use the active window frame to identify the active display ID but things got strange with the returned frame dimensions, and I could not identify a consistent way to get the correct display ID. This got me concerned about the onlyOCRFrontmostWindow functionality as it utilizes the window.frame to determine the region to crop. Two weird things occurred:
Also, since the window.frame will contain negative values when the window is on an external monitor, the cropImage functionality fails causing a fallback to full image OCR.
Sub ideal solution:
Use the mouse location to identify the active display.
Main considerations:
P.S. Maybe this is not an issue when there are no external monitors, but the funky window.frame results (even when the window was on the main display) sketched me out, so I will personally be turning off onlyOCRFrontmostWindow 🤷. Hopefully can find a good solution, but a small amount of research leads me to believe using the accessibility API may be the only robust option.