Launch HN: Reflect (YC S20) – No-code test automation for web apps
262 points by tmcneal on July 20, 2020 | 102 comments
We're Fitz and Todd, co-founders of Reflect (https://reflect.run) - we’re excited to share our no-code tool for automated web testing.

We worked together for 5+ years at a tech start-up that deployed multiple times a week. After every deployment, a bunch of our developers would manually smoke test the application by running through all of the critical user experiences. This manual testing was expensive in terms of our time. To speed up the tests' run time, we dedicated developer resources to writing and managing Selenium scripts. That was expensive at "compile time" due to authoring and maintenance. At a high level, we believe that the problem with automated end-to-end testing comes down to two things: tests are too hard to create, and they take too much time to maintain. These are the two issues we are trying to solve with Reflect.

Reflect lets you create end-to-end tests just by using your web application, and then executes that test definition whenever you want: on a schedule, via API trigger, or simply on demand. It emails you whenever the test fails and provides a video and the browser logs of the execution.

One knee-jerk reaction we're well aware of: record-and-playback testing tools, where the user creates a test automatically by interacting with their web application, have traditionally not worked very well. We're taking a new approach by loading the site-under-test inside of a VM in the cloud rather than relying on a locally installed browser extension. This eliminates a class of recording errors due to existing cookies, proxies, or other extensions introducing state that the test executor is not aware of, and unifies the test creation environment with the test execution environment.

By completely controlling the test environment we can also expose a better UX for certain actions. For example, to do visual testing you just click-and-drag over the element(s) you want to test. For recording file uploads we intercept the upload request in the VM, prompt you to upload a file from your local file system, and then store that file in the cloud and inject it into the running test. If you want to add additional steps to an existing test, we'll put you back into the recording experience and fast-forward you to that point in the test, where again all you need to do is use your site and we'll add those actions to your existing test. Controlling the environment also allows us to reduce the problem space by blocking actions which you typically wouldn't want to test, but which are hard to replicate and thus could lead to failed recordings (e.g. changing browser dimensions mid-recording). As an added bonus, our approach requires no installation whatsoever!

We capture nearly every browser action, from hovers to file uploads to drag-and-drops to iframes, while building a repeatable, machine-executable test definition. We support variables for dynamic inputs, and test composition so your test suite is DRY. The API provides flexible integration with your CI/CD out of the box and supports creating tests in prod and running them in staging on the fly. You don't need to use a separate test grid, as all Reflect tests run on our own infrastructure. Parallel execution of your tests is a two-click config change, and we don't charge you extra for it.

Some technical details that folks might find interesting:

- For every action you take we'll generate multiple selectors targeting the element you interacted with. We wrote a custom algorithm that generates a diverse set of selectors (so that if you delete a class in the future your tests won't break), and ranks them by specificity (i.e. [data-test-id] > img[alt="foo"] > #bar > .baz).
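As a rough illustration of the idea (a sketch only, not our actual algorithm): collect several candidate selectors for the element, keep the ones that uniquely match it right now, and order them by a crude stability score.

  // Sketch only, not our actual algorithm: gather candidate selectors for an
  // element, keep those that uniquely match it, and rank by a stability score.
  function candidateSelectors(el) {
    const candidates = [];
    for (const attr of el.getAttributeNames()) {
      if (attr.startsWith('data-test')) {
        candidates.push({ sel: `[${attr}="${el.getAttribute(attr)}"]`, score: 4 });
      }
    }
    if (el.id) candidates.push({ sel: `#${CSS.escape(el.id)}`, score: 3 });
    if (el.tagName === 'IMG' && el.getAttribute('alt')) {
      candidates.push({ sel: `img[alt="${el.getAttribute('alt')}"]`, score: 2 });
    }
    for (const cls of el.classList) {
      candidates.push({ sel: `.${CSS.escape(cls)}`, score: 1 });
    }
    return candidates
      .filter(c => document.querySelectorAll(c.sel).length === 1)  // unambiguous
      .sort((a, b) => b.score - a.score)                           // most stable first
      .map(c => c.sel);
  }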

- To detect certain actions we have to track DOM mutations across async boundaries. So for example we can detect if a hover ended up mutating an element you clicked on and thus should be captured as a test step, even if the hover occurred within a requestAnimationFrame, XHR/fetch callback, setTimeout/setInterval, etc.
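A heavily simplified sketch of that kind of bookkeeping (not our actual implementation; a real version would wrap fetch/XHR and requestAnimationFrame the same way, and track boundaries more carefully):

  // Remember which user action scheduled a setTimeout callback, and attribute
  // any DOM mutations that callback produces back to that action.
  const observer = new MutationObserver(() => {});   // used only for takeRecords()
  observer.observe(document.documentElement, {
    subtree: true, childList: true, attributes: true, characterData: true
  });

  let currentAction = null;   // set by our event listeners, e.g. "hover on #menu"

  const realSetTimeout = window.setTimeout;
  window.setTimeout = function (cb, delay, ...args) {
    const scheduledBy = currentAction;               // capture at scheduling time
    return realSetTimeout(() => {
      observer.takeRecords();                        // discard unrelated mutations
      const prev = currentAction;
      currentAction = scheduledBy;
      try {
        cb(...args);
      } finally {
        const mutations = observer.takeRecords();    // drain what this callback did
        if (mutations.length && scheduledBy) {
          console.log(`${mutations.length} mutation(s) attributed to`, scheduledBy);
        }
        currentAction = prev;
      }
    }, delay);
  };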

- We detect and ignore auto-genned classes from libraries like Styled Components. We use a heuristic to do this so it’s not perfect, but this approach allows us to generate higher quality selectors than if we didn’t ignore them.
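A toy version of such a heuristic (not our actual one) might look like:

  // Flag class names that look auto-generated by CSS-in-JS libraries so they
  // are excluded from selector generation. Deliberately imperfect.
  function looksAutoGenerated(cls) {
    if (/^sc-/.test(cls)) return true;                 // Styled Components prefix
    if (/^css-[a-z0-9]{4,}$/i.test(cls)) return true;  // Emotion-style hash class
    // Generic fallback: short hash-like tokens mixing letters and digits
    // ("jxKrRQ1", "e1x8k2f0") are probably generated.
    return /^[A-Za-z0-9]{5,10}$/.test(cls) &&
           /[0-9]/.test(cls) && /[A-Za-z]/.test(cls);
  }

  // Usage when building selectors:
  //   const stableClasses = [...el.classList].filter(c => !looksAutoGenerated(c));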

- One feature in beta that we’re really excited about: For React apps we have the ability to target React component names as if they were DOM elements (e.g. if you click on a button you might get a selector like “<NotificationPopupMenu> button”). We think this is the best solution for the auto-genned classes problem described in the bullet above, as selectors containing component names should be very stable.
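To give a flavor of what this relies on (illustrative only; React's fiber internals are undocumented and version-dependent, and this isn't necessarily how our integration works), component names can be recovered from a DOM node roughly like this:

  // Walk React's fiber tree from a host DOM node and collect the names of the
  // function/class components that own it.
  function reactComponentNames(domNode) {
    const fiberKey = Object.keys(domNode).find(k =>
      k.startsWith('__reactFiber$') || k.startsWith('__reactInternalInstance$'));
    if (!fiberKey) return [];

    const names = [];
    let fiber = domNode[fiberKey];
    while (fiber) {
      if (typeof fiber.type === 'function') {
        names.push(fiber.type.displayName || fiber.type.name);  // e.g. "NotificationPopupMenu"
      }
      fiber = fiber.return;   // walk up toward the root
    }
    return names;
  }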

We tried to make Reflect as easy as possible to get started with - we have a free tier (no credit card required) and there’s nothing to install. Our paid plans start at $99 and you pay primarily for test execution time. Thanks and we look forward to your feedback!




Interesting tool! And you've definitely given a lot of thought to many of the problems encountered while attempting to do UI test automation.

I've written my fair share of extensive Selenium stuff (and Appium, but.. let's forget about those painful memories) and one thing that I found fairly "easy" to add to the suites I had written was accessibility testing (using Deque's Axe[0] tools). It literally took a few lines of code and a flag at runtime to enable the accessibility report on my Jenkins/TestNG/Selenium combo. However WCAG is constantly changing and it's hard to keep up, and even Deque is not always up to date AFAIK. Do you have plans to add accessibility testing to your tool? (even a subset of WCAG's rules)

Another thing I've noticed is the jump in pricing from Free to 100 USD/month, which goes from 30 minutes to 6 hours. This might be steep for a team attempting to test out the validity of the tool against the competition - perhaps offering a first-month discounted trial or something like that would be appealing.

I also haven't really seen if it is possible to enable concurrency (for example, is testing on three platforms at the same time for 10 minutes on that free tier possible? I would imagine so if one is doing the lifting with CI like Jenkins - but perhaps you have your own solutions). Tangentially related, but you say your tests integrate with various CI solutions - does this mean one can extract the test results in a way that allows further processing and integration into other tools? (I'm thinking of the XMLs coming out of TestNG there.)

Lastly, I don't know if the time used is counted from the moment you start your VMs, or the browser starts, or the page is loaded, or the first step is done, or something else. Clarifying that might help with a team's estimates (I had internal tests where the struggle was entirely in having the virtual environment/device/browser ready and the test was then a breeze, so the significant metric was the boot-up time).

[0] https://github.com/dequelabs/axe-core


Thank you for the thoughtful comment! Let me know if I miss any points:

1) We capture all attributes for elements and although we don't yet let you fail a test if the attributes are incorrect, that's where we're headed. So, you could imagine enabling/disabling accessibility validation at the account or per-test level. This is great feedback.

2) Re: pricing - thanks for this idea! Pricing is still evolving, will keep in mind.

3) We support parallel execution at no extra charge; you only pay for total test execution time. So if you want to spin up 10 tests at once, that's fine. (We may need to scale up our infra at the outset, obviously.) There is no hard cap on the number of concurrent tests.

4) Our API (https://reflect.run/docs/developer-api/documentation/) is in the early stages, but includes endpoints for executing tests and fetching results for tests in JSON.

5) Runtime is counted from the moment the browser begins navigating to the page. You are not charged for time spent spinning up infrastructure or saving the test run. And just to be clear, you are not charged for any time spent recording tests.

Thanks again!


Thanks for answering those questions.

I think it's all great news, and for customers it's reassuring to know that running time is really just about the test itself (which I noticed is in the FAQ, so sorry for wasting your time there). This makes me think about a couple more things: can clients select zones from which to run the tests? It's probably not a huge focus if your current target customers are in North America, but it can become useful if a European company considers the tool and all loading times become noticeable. Or just if a company expects clients from anywhere and wants to test for that (which is basically the hardest bit for a small team, as it starts to involve having some devops experience to spin up VMs on the cloud in different zones and all that jazz). Another tangentially related setting would be the locale for test execution and scenarios (I'm thinking of date formats, monetary symbols, etc., with expectations being different if you're in the US or in the UK, for example).

The lack of cross-browser testing is something I can see scaring away some teams, especially with mobile being restricted to profiles of older iPhones only. IIRC over 70-80% of iPhone users are on the latest major iOS version. But it all ties in to other issues that I'm sure you're already dealing with (iPhone users using Safari, something like 20% of desktop users using Firefox/Safari, and the odd clients here and there who would have very specific annoying requirements like "I must be able to test on Internet Explorer - yes I know it's deprecated but my startup is B2B and I'm the small fish there").

Your docs might be short but they are already good and readable, especially for a target that might include a mix of pure devs and automation QAs/SDET/etc.

All the best!


> One feature in beta that we’re really excited about: For React apps we have the ability to target React component names as if they were DOM elements (e.g. if you click on a button you might get a selector like “<NotificationPopupMenu> button”). We think this is the best solution for the auto-genned classes problem described in the bullet above, as selectors containing component names should be very stable.

I remember looking into this exact idea. Theoretically, if you're able to capture the React state, and you're working with a "pure" React app, you should be able to auto-gen readable tests from human interaction. And, if you're capturing the state in a granular enough fashion, you should be able to "time-travel", but for non-technical users.

IMO the biggest use case for E2E tests is for critical things like auth & checkout. If you're able to auto-gen, maybe you can get even deeper than that.

Congrats on the launch, it looks cool!


We have a product that does this (generates E2E tests from user interactions) - it's very tricky. If you just want to cover the auth & checkout flows it's great, but if you need to model more complex and nuanced interactions, record + playback is really a great way to go.


You should be able to hack React to do what I'm suggesting (playbacks), though last time I looked at the source code I got a bit confused and gave up.

It's historically been such a battle to deal with E2E tests and changing code, but if this playback idea works well then the interop should be relatively seamless. It might require an over-engineered React app, though, which is likely the biggest issue.


We are using Reflect.run at Roboflow and it really is very slick. We have it testing our core flow on a schedule so we know if any of our 3rd party services or dependencies go down and/or if we introduce a regression.

Our app's core workflow involves signing in, uploading images and annotation files, parsing those files, using HTML5 canvas to render previews and thumbnails, uploading files to GCS, and kicking off several async functions and ajax requests.

That Reflect.run was able to handle all of those complex operations makes me pretty confident it can effectively test any web app.


It sounds like you're running these tests on a schedule against production?

If that's the case, curious how you might build a test that can handle something like an MFA token request on login.


I would love to learn more about your setup. Does Reflect have a built-in Roboflow integration, or is it something custom?


Nope, we were able to use their UI to create our tests just like any other user would -- everything we needed was supported out of the box (minus one minor workaround I had to add to my code because I had some janky, non-deterministic IDs on some dynamically added hidden fields).


Really nice tool, thanks! I'm more of a backend developer so I might have some stupid questions: what is your competition and what are you doing better?

Update: reading through this thread and a bit of web searching resulted in the following list:

https://ghostinspector.com
https://www.testim.io
https://www.katalon.com
https://www.cypress.io
https://github.com/dequelabs/axe-core
https://www.mabl.com
https://preflight.com

> We’re taking a new approach by loading the site-under-test inside of a VM in the cloud rather than rely on a locally installed browser extension.

But how would I record and test things locally? I.e. I would need a public setup right at the beginning.


Hi! It's technically possible to test your localhost though it's a bit of work with ngrok. What we're seeing more and more of with our customers is they stand up ephemeral environments for each PR, or each merge, and then use Reflect's hostname and URL overrides at run time to target their code changes with the tests they already recorded against prod or staging. We've worked with Release W20 (https://releaseapp.io/) in the past to demonstrate running Reflect tests against each PR. I know it's not exactly what you're looking for, but similar in spirit, I think.


Never heard of ngrok, thanks for this pointer. Sounds like a magic tool :)

Will look into releaseapp.io


I've largely moved away from UI testing. The two main reasons are:

1) Flakiness / non-determinism
2) The constant change of the UI

Both of these are absolute killers to productivity on a large team. Note, I'm not against UI testing in theory. I think if you could, you would have full end-to-end tests for every single edge case so that you could be sure that the actual product presented to users works. But in practice, end-to-end testing an asynchronous distributed system (which a simple client-server application still is) is full of non-determinism.

Re the constant change of the UI: this is also just true in my experience. I've worked on a navigation / global UI redesign at every company I've ever worked at. It happens like once every 3-5 years. Within the redesigns, it's still extremely common to subtly change the UI for UX reasons all the time. When this happens, be prepared to spend half of your time updating all of your tests.


To be fair, I work for a company in this space (Testim.io) and we have had success stories with very large companies (like Microsoft) doing UI tests with the service.

I think the hardest aspects of testing are fast authoring and stability (maintenance) - AI tools can help with that, and learning from data between test runs can create very stable tests.

So as someone in this space - it's a very exciting space and I am very optimistic for this startup (Reflect)


Yea I don't want to discourage work in the space. Actually, quite the opposite. I'm just saying where the bar is for me when I'm considering handing over money. I'm not going to pay money for something that ends up actually slowing me down.


First AJAX heavy app I worked on, we rigged the game so that we could win.

We wired up all of the asynchronous bits to tweak a hidden DOM element at the bottom of the page, so that our tests could wait for that element to settle before validating anything.

We'd already had a lot of our async code run through a couple of helper methods (to facilitate cache busting and Karma tests), so it was 'just' a matter of finishing what we'd started.
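For anyone wanting to try the same trick, a minimal version of the pattern looks something like this (the element id and helper names are made up):

  // Every piece of async work goes through track(); a hidden element mirrors
  // whether anything is still in flight.
  let inflight = 0;
  const marker = document.createElement('div');
  marker.id = 'async-settle-marker';
  marker.style.display = 'none';
  marker.dataset.pending = '0';
  document.body.appendChild(marker);

  function track(promise) {
    marker.dataset.pending = String(++inflight);
    return promise.finally(() => {
      marker.dataset.pending = String(--inflight);
    });
  }

  // App code: track(fetch('/api/items')).then(render);
  // Test code: wait for '#async-settle-marker[data-pending="0"]' before asserting.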

I kinda feel like the browsers are letting us down by not exposing this sort of stuff.


How does that address the issue of the UI changing? If the markup changed, you still need to update your tests.


A great deal of your non-determinism is in trying to run an End-to-End test while the UI is still trying to settle.

For instance, your negative tests (click something and make sure nothing happens) can fail to detect anything at all because they perform an action and then validate the UI before the action has even processed.

Yes, of course nothing has happened, because you didn't let it.


Let me clarify.

Non-determinism is just one issue. The other issue is how frequently the UI changes. For example, a large form can get broken out into a wizard. The UI is completely semantically different, but performs the same functionality.

How do you make such UI changes without your test suite slowing you down?


This sounds pretty similar to my experience with UI testing. What kinds of tests do you write instead? And how have they worked out for your team?


We tend to create a "view model" object which maintains the state of the UI and interacts with the server. The actual view just forwards commands to this object, and after a command happens the view always re-renders based on the current state of the model. We do this on iOS but I've been pushing for us to do it with React as well since you get the built-in state rendering for free.

So all of the tests look something like this (here's a UI form):

  // Arrange
  let searchResponse = { usernameTaken: false };
  let networkService = NetworkServiceTestDouble(searchResponse);
  let vm = UserFormViewModel({
    networkService: networkService
  });

  // Act
  vm.username = "[email protected]";
  vm.checkUsernameAvailable();

  vm.submitForm();

  // Assert
  expect(networkService.calls[0].arguments[0]).to.deep.eq({
    username: "[email protected]"
  });
So you see the view model is kind of like a Page Object, in that it has an API which represents the user's interaction. But, in this case the view model is what's used in the actual production code, it's not just a testing helper. So you just invoke the API as the user would when interacting with the feature, and verify that the correct data gets sent to the server when the form is submitted. After all, the client's job is primarily to interact with the server and maintain state. The feature is functional if the correct data is sent to the server.

Then, let's say you're in React, hooking up this view model is quite easy:

  import { UserFormViewModel } from "./UserFormViewModel";

  let vm = UserFormViewModel({ networkService: realNetworkService });

  function UserForm() {
    return (
      <button onClick={() => vm.submitForm()}>Submit</button>
    )
  }
The buttons / inputs just get wired up to the appropriate view model functions and properties, and you can have the component's state just be the view model's state. The view is then just entirely about appearance and forwarding events.

I'll then write some tests that actually render the UI, but they're just sanity checks. How often do you get a bug where you failed to call the correct function in response to a button click? The logic and wiring in response to all of the events is the more interesting part.

Edited to improve the code formatting a bit.


This is nice, thanks for taking the time to explain and write it all out. I like UI testing because I dislike maintaining mocks and stubs all over the place (and haven't found a way to write lower-level tests that don't rely on them heavily), but this approach definitely seems interesting. I like that it also forces your views to stay simple and just call simple functions. A problem resulting from not calling a handler or calling the wrong one would be immediately obvious; the tests really deal with checking what happens when that handler is called. This seems like a great way to isolate that and test it without the pain of Selenium et al! Cool pattern, thanks for sharing.


Agreed about the mocks and stubs. I dislike it too, but it's unfortunately pretty necessary if your frontend and backend are written in separate languages. If you are writing in the same language, you can actually wire up the frontend and backend together and you don't need to use test doubles! That's an architecture I've been playing around with, but unfortunately not at work.


Congrats on launching Reflect. Looks solid!

I soft launched something eerily similar ~6 months ago and got zero feedback (probably because no one was able to actually try it since Google took forever to approve the extension).

TestFront.io: https://news.ycombinator.com/item?id=22130590

Ran out of money though and had to pursue other things so I put it on hold. Maybe I'll resume work on TestFront at some point and you'll have some competition. ;)


> We’re taking a new approach by loading the site-under-test inside of a VM in the cloud rather than rely on a locally installed browser extension.

So while you do get the benefit of clean, repeatable tests, this requirement also puts you out of reach for most enterprise-y applications, where your pre-production environments (i.e., what you actually want to run your tests against) exist in a tightly locked down network environment that you can't reach from the outside. In stricter environments, even a reverse port forwarding setup to allow you to reach inside the LAN would be out of the question.

I think this is a problem you're going to run into pretty soon if you want to do enterprise sales, and one that is not trivial to solve without an open-source flavor of the test runner (e.g. like Selenium).


Thank you for this thought-provoking perspective. I agree we'll need to figure out a way to provide an on-prem solution in some enterprise cases. We'll think on this internally. Thanks, again!


This is super cool! I'm at a startup and we've finally started building up our testing infrastructure lately, this seems like a great time-saver. Just tried it out and it worked great. Two questions:

1) Is it possible to set the screen resolution? I see device profiles but don't see if it's possible to manually create profiles. The default Desktop res is lower than what we normally target.

2) How exactly does the visual testing work? I tried dragging over a div and it got the text, but parts of the text were formatted differently with spans. So does it just validate the text, or does it actually compare the visuals?


Thanks for giving us a spin!

1) We don't support custom resolutions yet, but you're spot on that the way we would do this is by allowing you to create device profiles in your account. The Desktop, Tablet and Mobile profiles are all that we support today.

2) Visual Validations will capture _both_ a screenshot of the element and the visible text of the element. In order for the visual validation to pass during execution, the text must match and the element's screenshot must be within your account's configurable % difference. The default image difference threshold is 0.5% of pixels. We do a pixel-by-pixel comparison of the images and show you a delta when the images differ.
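For the curious, the core of that check boils down to something like this sketch (not our actual code) over two same-sized screenshots:

  // Count differing pixels between two same-sized captures and fail the
  // validation if the ratio exceeds the configured threshold.
  function exceedsDiffThreshold(imgA, imgB, threshold = 0.005) {
    const a = imgA.data, b = imgB.data;   // ImageData from canvas getImageData()
    let differing = 0;
    for (let i = 0; i < a.length; i += 4) {             // RGBA, 4 bytes per pixel
      if (a[i] !== b[i] || a[i + 1] !== b[i + 1] ||
          a[i + 2] !== b[i + 2] || a[i + 3] !== b[i + 3]) {
        differing++;
      }
    }
    return differing / (a.length / 4) > threshold;      // 0.005 == 0.5% of pixels
  }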


One thing I like about this is that it has drag-and-drop support. That's something that I haven't seen to be super straightforward with other suites like Cypress, and it's very much a user-initiated action.


For selecting things, I've found that allowing a "data-test='user-name-input'" type attribute is useful for a lot of cases, when doing Gauge/Taiko tests.

It might make sense to allow/recognize these so elements can be found that way, rather than via CSS-type selectors that may change.


Totally! If you have data-test* attributes set up, we will use those first when generating selectors for each action. The full list of data-test attributes we use is listed here: https://reflect.run/docs/recording-tests/creating-resilient-.... If we don't find data-test* attributes we'll also look for other attributes that tend to have a good degree of specificity, like alt, rel, schema.org, and aria-* attributes.
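As a tiny illustration, a hook like that lets the generated selector ignore volatile classes entirely:

  // Given markup like <input data-test="user-name-input" class="css-1q2w3e4">,
  // the recorded selector can skip the generated class altogether:
  const input = document.querySelector('[data-test="user-name-input"]');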


I recommend considering learning between test runs, and I encourage you to train a relatively simple model for selection on top of http-archive and tagged data.

"Off the shelf" machine learning makes it pretty easy to create very robust selectors. I gave a talk about it at GDG Israel and was supposed to speak about it at HalfStack, which got delayed and then cancelled because of COVID-19 - but the principle is pretty simple.

It's amazing how much RoI you can get from relatively simple models of reinforcement learning. Here are some ideas: https://docs.google.com/presentation/d/1OViIwDJJw1kjVJH5Z2N5...

Good luck :]


Fantastic! I like inglor's idea too.


Actually you can use the whole attribute name as a selector, like data-test-user-name-input, and if you need a dynamic id, you can add it as a value: data-test-product-item="book" ;)


Cool idea, thanks!


How does this compare with qawolf.com, which uses Playwright and is open-source and free?


I want to use Reflect to test my game with many concurrent, simulated players. For context, the game needs to properly handle 50 players at once.

The pricing (6 hours of testing for $99) means that if I want to do a 1 minute test with 50 concurrent players, I can only test 6 times a month. A big benefit of testing is to ensure we ship reliable software, and we plan to ship much more often than 6 times a month.

Is there a way you could price by unique tests instead of hours tested?


Interesting use case. We haven't encountered this or thought about this before. It would be a change for us to price like this, but happy to discuss further if you want to shoot us an email!


I maintain a web app that would really benefit from an E2E suite, but we don't have the developer capacity to write one right now, so this looks like it hits a potential sweet spot for me. To use Reflect, I think we would need to understand the plan for what happens if the SaaS goes under -- ideally, we'd be left holding our test definitions and some open-source version of the test runner that we can instantiate on our own VM.


Testcafe's test recorder would let you create tests relatively easily by browsing, and you can tweak them if needed. IME they have great support, and I like that I can buy the software rather than having another SaaS subscription. YMMV.


The "easy out" for tools in this space is to give you export (For example to Selenium/Puppeteer/Playwright). A lot of the "premium" test tools offer this functionality.

The "less easy out" is an on-prem version with a contract regarding updates + a clause for what happens if the company goes under in terms of support + an escrow over the code (the company gets a copy of the code + the license to change but not sell it etc).


Sorry - I just saw this! Some customers on our Enterprise tier have asked about this and we've agreed to provide them with a one-time Selenium export of the tests as a contingency for if we go out of business. Happy to chat about that if you'd like! My email's in my profile


This looks amazing. First test went incredibly well, easy to set up. Enjoying the VM approach vs. extension.

Will monitor this to see how our tests perform over time.


How do you handle logging in to the app that needs to be tested? We use Google OAuth.


We fully support e-mail based logins, but OAuth can be challenging.

Github OAuth for example will issue a 2FA email-based challenge non-deterministically. We handle that by detecting the challenge and filling out the challenge code based on the contents of the email sent by Github. This requires a one-time setup where you add an email address we control to your Github user so that we can read and parse it.

For Google OAuth we can execute all the steps but the two issues there are (1) we run everything in a single window and some web apps don't like that because they assume the oauth flow will happen in a new window, and (2) sometimes Google prompts you to answer a security question and we don't yet support marking test steps as optional.

What our customers have been doing instead is setting up a mechanism to auto-log in the test user using a magic link. Basically, sending a one-time-use auth code to a URL in their app that then authenticates the user. I think some platforms (Firebase?) have built-in support for this.

I'm certainly happy to brainstorm what could work best for you though if you'd like (my email: todd at reflect.run)


I asked this elsewhere in the thread to a different user, but it seems relevant here as well:

Curious how your service would handle an app that mandates non-email-based MFA like SMS or TOTP.

Additionally, what about testing an onboarding flow that might require some form of manual approval?

Thanks!


We have the ability to handle MFA logins that use email as the second factor. The way it works is you configure your user account to have an email address that we control (e.g. [email protected]). When the multi-factor challenge comes in, we receive the email, parse it for the challenge code, and fill it in live in your test. We don't have SMS support at the moment but we could take a similar approach there if SMS is used as a second factor. There's a one-time setup here where we set up our system to parse your challenge email.
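The parsing side is conceptually simple - something along these lines (illustrative only; it assumes the provider's email contains a plain 6-8 digit code):

  // Pull a one-time challenge code out of an email body so it can be typed
  // into the MFA prompt mid-test.
  function extractChallengeCode(emailBody) {
    const match = emailBody.match(/\b(\d{6,8})\b/);
    return match ? match[1] : null;
  }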

No support for TOTP unfortunately, but if the SMS or email-based challenge support would work for you feel free to reach out and I can talk to the specifics of the one-time setup - todd at reflect dot run


I'm using Hardware keys to secure my personal accounts. What are the alternative auth methods suggested for this situation?


I would suggest adding the ability to auth via a magic link in your web app. This would allow your tests to bypass the auth flow entirely by passing an auth token as a request parameter in the first URL of your test. You can pass new auth tokens when you go to run your tests either via the UI or via our API if you have it hooked up to your CI/CD pipeline. More docs on how to do it via the API are here - we call these 'request overrides' in our docs: https://reflect.run/docs/developer-api/documentation/#run-te....

In terms of added security, some options there would be to only enable these "magic links" in staging and to only enable this type of auth for a least-privileged user account (e.g. no admin or employee accounts could auth this way).
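For illustration, a staging-only magic-link endpoint could be as small as this Express-style sketch (the route, token store, and session helper are all hypothetical placeholders, not something Reflect provides):

  const express = require('express');               // assuming an Express app
  const app = express();

  // One-time tokens mapped to least-privileged test users (placeholder store).
  const TEST_LOGIN_TOKENS = new Map([['abc123', { id: 'test-user' }]]);
  const createSession = (user) => `session-for-${user.id}`;   // stub

  app.get('/test-login', (req, res) => {
    if (process.env.NODE_ENV === 'production') return res.sendStatus(404);

    const user = TEST_LOGIN_TOKENS.get(req.query.token);
    if (!user) return res.sendStatus(401);

    TEST_LOGIN_TOKENS.delete(req.query.token);       // burn the token after use
    res.cookie('session', createSession(user), { httpOnly: true });
    res.redirect('/');
  });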


Magic links sound like they have several pitfalls (the potential for security incidents like Twitter's is beeping here), and they would require significant changes on our end to use this platform.


You're right, there are certainly security implications for magic links. Unfortunately, for an auth flow that incorporates hardware keys, I can't think of how you would test behind that without some sort of workaround, but I may be overlooking something.


I generally have service accounts specific for testing with significant restrictions. Hardware keys present their own complications for non-human ops, so they don't really belong there.

More just seeking bounds of possibilities, thanks for your replies.


Ungh Google OAuth is kind of frustrating - the major reason is that websites are pretty good at blocking pages that are coming from automation - and you get all sorts of issues and problems (not to mention if a company like Google (not the real example) uses you to test their platform and then SafeSearch picks up their staging login as a phishing site and blocks you ^_^).

I warmly recommend _not_ testing Google OAuth and instead passing a token and bypassing it on your server's side.

The way automation can work around it (in Testim.io for example) is "mock network", which is supported in our codeless solution but also available in most test infrastructure (like Playwright, which is FOSS). You would mock out the parts that leave the user with a token they can use to authenticate.
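In Playwright, that kind of network mock looks roughly like this (the URL pattern and token values are placeholders for whatever your app actually calls):

  const { chromium } = require('playwright');

  (async () => {
    const browser = await chromium.launch();
    const page = await browser.newPage();

    // Stub out the token exchange so the app ends up "authenticated" without
    // ever hitting the real OAuth provider.
    await page.route('**/oauth2/token', route => route.fulfill({
      status: 200,
      contentType: 'application/json',
      body: JSON.stringify({
        access_token: 'test-access-token',
        token_type: 'Bearer',
        expires_in: 3600
      })
    }));

    await page.goto('https://staging.example.com');
    // ...continue the test as an authenticated user...
    await browser.close();
  })();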


The registration is buggy - the form is misaligned, at least on Firefox - and why does Google sign-up say "continue to amazoncognito.com"?


Thanks for reporting! We'll get the misalignment of the form fixed up for Firefox. The Google sign-in says 'continue to amazoncognito.com' because we're using AWS Cognito as our auth provider, but it looks like there's a way to configure it to be "continue to reflect.run" so we'll look into that!


Makes me think of http://waldo.io, but for web apps! Really excited about the new wave of no-code test automation tools - definitely helps semi-technical team members take on more responsibility on the testing side vs. just writing specs.


Yes! 100% our feeling as well.


How does this compare to recent YC alum Preflight?


Hey, co-founder Fitz here! Great question. We're both tackling the same problem. Our key differentiator from Preflight (and testim, and mabl and ghost inspector) is that we spin up a new VM for every browser session rather than rely on a local browser extension like all of those competitors. The comment above highlights the trade-offs of this approach but happy to discuss further (fitz at).


Also, for those familiar with the space or using something else, what should we compare this against? I'm in the market for something like this.


Looks great. How much easier is this than say, writing E2E tests with Cypress?


I wrote an article about Cypress. I was a big fan (spoke on panels about Cypress at conferences) but am disillusioned. For a very specific use case Cypress is great and arguably the best choice though: https://www.testim.io/blog/puppeteer-selenium-playwright-cyp...

Even if you use something like Cypress you probably need something like Reflect (or Testim where I work but don't represent - for that matter) - you would just end up writing the framework in-house.


Had a very similar experience with Cypress here as well. The command queuing API that abstracted away async operations sounded awesome in theory, and I totally drank the Kool-Aid.

But once I started to write tests with it, I realized the model was optimized for simple tests where you can afford to act only as the end-user. For the more complex flows we're interested in testing, you need to act both as the end-user and as, say, someone on the operations team doing a bunch of manual approval steps, or the email/SMS system that delivers some secret out-of-band (through a staging backdoor API), interleaved between the end-user actions. Not being able to precisely coordinate those async interactions between these distinct actors made those tests extremely awkward to write compared to something using Playwright/Puppeteer, where you just coordinate promises like you usually do.

That's before we get into the limitation of a single domain per test, which meant we couldn't effectively test our login flow, since it uses a separate domain for OIDC login.


This is what my company is currently just getting started with, so I'd be interested in this comparison as well. If Reflect is much better, we still have time to switch.


Extremely biased of course :) But here's where I see the advantages of Reflect vs. Cypress:

- Cypress has a really nice local development experience that feels at home to front-end devs. I would describe Reflect as a nice remote recording experience that simulates an end-user. So we're kind of attacking the same problem from a different perspective. You can technically record Reflect tests against your local env with ngrok, but Cypress is certainly a better local testing experience. So advantage Cypress here.

- There are actions like interacting with iframes, drag and drop, links that open a new window, and file uploads that range from difficult to almost impossible to do in Cypress. We support these actions out-of-the-box.

- Similarly if you want to do visual testing in Cypress you'll need to integrate with a third-party visual testing tool like Percy or Applitools. We have visual testing support built-in.

- I've seen folks struggle bridging the gap between using Cypress for local testing and actually getting it set up in their build/deployment pipeline. Since we're a fully managed service it's just a single API call to get Reflect in your CI/CD. We also have built-in scheduling, so if you just wanted to run your tests on a fixed schedule you don't need to integrate with anything, which I think is a nice way to get going quickly and prove out our tool.

- Because of Cypress's design, it's not so easy to get Cypress tests to run in parallel. This is really important because E2E tests take way longer to run vs. unit and integration tests, and the best lever IMO for reducing this is parallelization. This is also another tripping point for folks getting Cypress set up in CI/CD.

- The final differentiator I'll mention is really the difference between a code-based vs. codeless approach. We're trying to reduce the time it takes to create and maintain tests and we think the way to do that is to basically handle as much as we can automatically on your behalf. Instead of writing code that simulates actions, you just record your actions and we generate the automation for you. So for a flow like adding to cart or user registration, it might only take you a few minutes to set that up in Reflect, but it'd be a lot longer to do in Cypress. Certainly as your test suite grows things like reusability become really important, and we support that as well. This also means that non-developers like QA testers w/o dev experience, PMs, designers, and customer support can write and maintain tests.


> - Because of Cypress's design, it's not so easy to get Cypress tests to run in parallel.

I am not sure why you think Cypress is hard to parallelize but if you don't like their managed service (dashboard) you can use https://github.com/agoldis/sorry-cypress - it's quite possible to do.

(All the rest of what you wrote sounds solid - good luck again :])


Sounds great. Thanks for the detailed response.


The true test for such a tool is usually the edge cases; in my opinion the web is simply too finicky, and all those well-meaning custom algorithms I've tried fell way short.

I would recommend Ranorex, which combines a comfortable record-replay functionality (which creates real C# code) and a code-first approach and everything in-between. A powerful "spy" assists in creating the most sophisticated locators and turns them into first-class Ranorex objects; a shortcut jumps from code to the repository and back; duplicate elements are detected on the fly.


This assumes the URL is publicly available, or is there an on-premise offer?


We don't offer an on-premise version, but we do have customers testing environments not accessible to the public internet. They're doing it by allow-listing a static IP and we configure all traffic to come from that IP for that account. There's other options if a static IP allow-list doesn't work - we've specced an approach for a prospect where we would set up an agent inside their firewall that we use to access their internal environments. This is an approach used by other tools to access secured environments - we haven't done it ourselves yet though.


Wow, this looks really useful and well thought out. Also, kudos for the compelling summary writeup / pitch here. Bookmarked, will def be giving this a try.


Thanks! Feel free to get in touch if you hit any snags: [email protected], [email protected] and [email protected]


Similar product I’ve had a great experience with is https://ghostinspector.com


I use it too and I really like it.

However the premise of having a test editor in a VM where I can time travel to a specific point in time and add steps from there could save me a bunch of time. Also the multiple selectors per element and the ability to target React component names sound really cool.

Happy for the competition :)


Took a look, nice job!

Would be nice:

1. Clearer setting for notification email

2. The ability to target an area for an element change without knowing what the elements will be. Example: filter for recent items without foreknowledge of which items will appear.

3. Ability to traverse up the DOM (to select parents) based on a selector that was too specific. Encountering this quite a bit.


Thank you for this feedback! I opened an issue for each :)


How does this compare to other existing "no code" SaaS regression testing tools such as screenster.io?


Looks neato. Are tests versioned? I'd like to be able to see a textual diff to changes in tests over time.


Thanks! Yep, tests are versioned - if you click on the 'History' button when viewing a test you'll see all the executions within your retention period. We don't show a diff view of historical changes; however, we do show a diff view when a text change causes your test to fail. So say your 'Log In' button changed to 'Sign In' - we would show you a diff of that text change and give you an option to make 'Sign In' the new baseline used when running future tests.


That's not the kind of versioning I intended. What I'm interested in is if I make some changes to the test setup in your UI, is there a way to see what I changed, and when those changes happened in the past?


Ah - sorry for misunderstanding. No we don't have a view like that, though I can see that being pretty handy. This is great feedback - thank you!


One thing that will come up and we've found out: users want branches, they want merging and they want the ability to track their git branches with their tests.


Hey! Product looks great! My main question would be: for a rapidly evolving product, would it generate a lot of false positives?

I really enjoy the idea of business/marketing people collaborating with tests. Congrats.

I also really like your business model so will give it a try.


False positive failures are a really common issue with existing E2E testing tools, so we try to do a number of things to prevent them in Reflect tests:

- We generate multiple selectors for each element you interact with. So if in the future you change a class or id, we'll fall back to other selectors when running your test. We also only choose a selector if (1) it matches a single element in the DOM, (2) that element is interactable (e.g. not completely hidden), and (3) if it has text associated with it, the text captured at recording time matches the text at run time. This helps prevent us from selecting the wrong element when running a test.
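Roughly, that run-time fallback looks like this sketch (not our actual code):

  // Try each recorded selector in order until one resolves to a single,
  // visible element whose text still matches what was captured at recording.
  function resolveTarget(step) {
    for (const candidate of step.selectors) {          // ordered best-first
      const matches = document.querySelectorAll(candidate.selector);
      if (matches.length !== 1) continue;              // must be unambiguous
      const el = matches[0];
      if (el.offsetParent === null) continue;          // skip hidden elements
      if (candidate.recordedText &&
          el.textContent.trim() !== candidate.recordedText) continue;
      return el;                                       // safe to interact with
    }
    return null;   // nothing resolved cleanly -> fail the step
  }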

- For React apps, we use heuristics to ignore classes generated by CSS-in-JS libraries like Styled Components and EmotionJS. We also have the ability to target elements based on React component name (it requires a one-time setup on your side to enable this).

- For Visual Testing failures (e.g. you've screenshotted an element and now that element has a different UI) we have a simple workflow to 'Accept Changes' and mark the new UI as the new baseline for tests going forward.

Certainly more to do here but this is one of the key problems we're looking to tackle.


Thank you, there are two things that are either a bug or hard to figure out.

1) I can't delete steps

2) I can't hover.

Great product guys, keep it up!


Thanks for this feedback!

1) You can delete steps on a recorded test but not during the recording. This is to ensure that we have an exact copy of the initial recording. Thereafter, you can click on the test step and click "Delete" on the middle pane.

2) These should be captured out of the box. Can you email me your test URL so I can investigate? fitz at Thank you!


Looks good. A small suggestion for the website: perhaps I missed it, but I watched both videos "record a test" and "view test results" on the front page and I didn't see Reflect detect an actual regression.


Good point. The test failure case is something we should show. Thank you!


Are there any plans to support Electron apps? I have a React-based Electron app, and I would also love to test a few things that are relevant to the app loading and getting data from disk. Just another use case to consider.


Hey, no plans to support Electron apps right now. If your app has a web-based portal (or started as web-based), then you could test that, of course.


Congrats on the launch, it looks awesome. I wish I'd had this a few years ago, and I will definitely give it a try.


I assume it's not possible to clean up after a test since you directly interact with the web app, right?


Current customers typically either run a "clean up" Reflect test that deletes account state that was created in previous tests, or they have a periodic internal job/logic that auto-deletes all test accounts. The next day's Reflect tests can sign up fresh, for example, and use the new accounts for tests.


This is awesome, congrats guys!


How well would this work with a hybrid DOM/WebGL application?


If the elements you interact with are a mix of DOM elements and non-DOM elements within a canvas, then we should be able to detect and replicate those actions. The biggest remaining issue might be performance - our VMs don't have a GPU attached, so depending on the application it may be slow because WebGL is not running with hardware acceleration.


I also had the idea to run browsers in the cloud and intercept the interaction stream to record and replay interactions, also for the reasons you say, because it reduces a large class of bugs. But not for test automation, rather for web scraping. I applied to YC with this around 9 times. Great to see you convince them this is a good thing!

Are you using the Chrome DevTools protocol? That's what I built on top of. You can see my open-source code for this here: https://github.com/dosyago/BrowserGap

And a demo running right now of the interactive cloud browsers (without automation, it's WIP):

https://start.cloudbrowser.xyz [0]

[0]: Currently on Show https://news.ycombinator.com/item?id=23904243

Fitz and Todd, wanna connect?


"...rather for web scraping. I applied to YC with this around 9 times"

There actually was a YC company that did exactly this several years ago. http://www.kimonolabs.com/


Sort of but not exactly, because IIRC they used a downloadable client. I wanted something that could run anywhere easily, hence the browser client. I used KL a few times, and followed their story, and found it interesting they moved to Tokyo. At the time I considered Import.IO, KL, Connotate, Mozenda and DiffBot to be the main competitors. My main idea as a point of difference was to be able to share the structure you've mapped on web apps, and share the action sequences as things others can use, so everyone gets a network effect benefit from the work of everyone else.

Thanks for posting. I don't think I knew they were YC, and did not know they were acquired by Palantir. That's interesting, and good news. Means the market is growing.


This looks cool! We use the DevTools protocol for a few things like our mobile emulation and screenshotting. Happy to chat - my email's todd at reflect.run


[deleted]


Hey congrats, I work for Testim ( https://testim.io ) which I assume is somewhat competition?

Excited to see more players in this space - good luck! Most of the market is still doing manual QA and that has to change.



