Zillow just gave us a look at machine learning's future (vinvashishta.substack.com)
166 points by amrrs on Nov 7, 2021 | 176 comments



This post strikes me as empty bluster. It basically says "the data scientists didn't consider the possibility that the model might fail to predict the real world." Which is like looking at the Boeing 737 MAX debacle and saying "they didn't consider the possibility that the plane might not be stable". This is most likely both untrue and lacking insight. What almost certainly happened was they validated their model extensively, but overlooked the angle relevant to this use case. Or overexcited executives failed to heed the caveats of the nerds.

> When I advise Data Scientists to understand Topology and Differential Geometry, this is why.

Oh really? I've taken grad-level courses in those fields. It's nice general education but has nothing to do with this situation. Knowing about the Lucas critique from economics would be much more relevant.

My pet theory -- completely unvalidated but at least potentially insightful -- is that they failed to put in adequate margin for adverse selection: homeowners whose homes are worth less than they seem on paper were much more likely to take the Zillow offer. Adverse selection turns prediction variance, which you might naively expect to "average out" when using a well-calibrated model at scale, into a bias towards underperformance.
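To make that concrete, here is a toy Monte Carlo (every number invented) showing how a zero-mean prediction error becomes a systematic overpayment once sellers choose which offers to accept:

```python
import numpy as np

# Toy model (made-up numbers): offers come from a well-calibrated model
# whose error is zero-mean, but sellers only accept offers above what
# they privately know the home is worth.
rng = np.random.default_rng(42)
n = 100_000
true_value = rng.normal(400_000, 50_000, n)            # what the home later sells for
offer = true_value + rng.normal(0, 30_000, n)          # unbiased, zero-mean error

# Adverse selection: the seller accepts only when the offer beats
# their (accurate) private sense of the home's value.
accepted = offer > true_value

avg_error_all = (offer - true_value).mean()            # near zero: model is calibrated
avg_error_bought = (offer - true_value)[accepted].mean()  # positive: overpayment
print(avg_error_all, avg_error_bought)
```

With these made-up parameters the unconditional error averages out near zero, while the error conditional on acceptance is on the order of the model's own error scale: the "averaging out" intuition fails precisely on the homes you end up owning.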

Another way to put it: asymmetric information. Be very careful betting against homeowners who probably know much more than you do.

Or read this comment, which puts it better than I did: https://news.ycombinator.com/item?id=29140104


As someone who doesn't work at Zillow but follows the space a little closer, don't buy into the superficial explanation/meme going around that they failed because of ML.

My advice would be to read the earnings-call transcripts from the last couple of years. What Zillow communicated in the most recent call fairly directly contradicts what they'd been saying in previous calls.

Some bullet points on my understanding of what went wrong:

-They were paying way too much for too many homes. I believe this is ultimately due to internal incentives and ambitious goal setting. IIRC, they mentioned they had to alter their model's output to give themselves permission to keep buying at high volumes. They lacked fiscal discipline here.

-The holding times on these homes were too long, which increased their costs. My impression is that this was due to lack of operational experience with renovating and selling homes. Zillow was new to this, and they were trying to scale up very quickly, independent of whether they were efficient.

-The core of Zillow's business historically has been advertising/lead generation. Buying and selling homes is not in their DNA or their operational expertise, and the negative margin pressure from that business can really stress the overall company in tougher times.

Look at the other iBuyers in the space - they are still in the space. They just weren't as naive as Zillow was. Zillow's behavior here is really confusing and speaks to poor operations/strategy/discipline. Housing prices have mostly been going up in this market, and Zillow was still losing its shirt.


Well, their strategy of blaming the nerds will at least ensure that they never have the opportunity to bet big on ML again. No one competent will want to work for them now.


I wish this was true but I suspect it is not. If Facebook can still have excellent engineers because they pay a ton of money, then so can Zillow.


Zillow has always paid low compared to the market. They never got the “top nerds,” and this was really a CEO problem. I would never bet on that company for any decently complex technology like predictive models. It took them years to make their model decent, something that a machine learning tutorial does better - just a linear regression on square feet and location. They were literally comparing my house to ones on the other side of the city - comparing a downtown property with the outskirts.


To be honest linear regression on square feet and location is what potential (human) buyers would mentally use, therefore it's a very, very accurate prediction of the price.
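For what it's worth, the tutorial-grade baseline the parent comments describe fits in a few lines - ordinary least squares on square footage plus location dummies, with entirely invented numbers:

```python
import numpy as np

# Toy illustration (all numbers invented): price as a linear function of
# square footage plus a per-neighborhood premium, fit by least squares.
rng = np.random.default_rng(0)
n = 200
sqft = rng.uniform(800, 3500, n)
neighborhood = rng.integers(0, 3, n)             # 3 hypothetical locations
premium = np.array([0.0, 50_000.0, 120_000.0])   # assumed location premiums
price = 150 * sqft + premium[neighborhood] + rng.normal(0, 20_000, n)

# Design matrix: [sqft, one-hot location dummies] (no separate intercept;
# the dummies absorb each neighborhood's base level)
X = np.column_stack(
    [sqft, neighborhood == 0, neighborhood == 1, neighborhood == 2]
).astype(float)
coef, *_ = np.linalg.lstsq(X, price, rcond=None)
print(coef)  # first entry recovers the $/sqft slope
```

The point of the baseline isn't that it's accurate in absolute terms, but that it at least won't compare a downtown property to one on the outskirts, which is the failure mode described above.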


In my area at least Zillow valuations really suck outside of subdivisions. They don’t do some fancy ML, just apply a factor with a cap to tax assessment based on comparable sales and listings.

It mis-values houses near me by 20-40%. Other tools are much more accurate.


I'm a quant trader and I agree. There's a big difference between being able to predict what will happen, and having a sensible strategy for how to behave.

One notable place this happens is backtesting. Particularly for passive strategies, you don't actually know from historical data what would have happened if you had been in the market at a given time. Would your prices move the market? Maybe. Would you get filled disproportionately when it's bad to get filled? Fair chance of it. So don't act like your presence would not be felt.

Another angle is risk management. You may have high confidence in your model, but you should have a plan for what to do if things don't go as expected. It seems crazy to me that it could get so far out of hand that a quarter of the company gets sacked. Why not set a budget and dip your toes? Then you can watch out for the adverse selection issues as you grow the portfolio and make adjustments.

Finally, it's very normal for quant strategies to look good when they're first proposed, then look a bit worse when traded in real life, and perhaps then degrade over time. Another reason to take it easy when sizing the bets.

You're right about the Lucas critique. It's fundamental that models affect the thing they're modelling.


The big difference here is that the counterparty decides which orders get filled. Zillow could be right 99% of the time, but the 1% of the time they're overpaying is exactly when the transaction is sure to go through.

And indeed, if you go to /r/realestate, anecdotally that is what happened. People would get offers from Zillow/Opendoor etc. and talk to agents to see what they would list at. If Zillow was higher than what the agents said, they went with Zillow. If not, they'd list it and sell it the old-fashioned way.


Yeah, it's remarkable. You'd think that "I'll buy your house immediately" should be at a worse price than the guy who insists on the old school inspection and due diligence.

In fact it might not be a bad strategy to simply start with that... see if there are desperate people needing the money fast, and use your corporate heft to both sweep up the deal (brand) and patiently dump it back into the ordinary market, pocketing the spread as a liquidity premium.

One thing I remember my first boss saying is that it's hard to get a good trade on. Attractive deals have several interested parties. If you're getting a lot of trades, they might not be good ones.


Zillow's ambitions were bigger than that. They wanted to get enough inventory so that they could capture the 6% of the transaction going to agents. Plus a chunk of the miscellaneous fees that go to Mortgage brokers, inspectors, appraisers, and so on. For that they needed volume and the way to get volume in real estate is pay more than other people.

I find it hard to believe that everyone there didn't realize that they were burning money. I saw it personally when looking to buy a house a couple years ago. There were a couple Zillow flips out there where the numbers obviously didn't add up. Whether or not they truly believed that it would work eventually or if it was just a way towards higher compensation/promotions/stock price or whatever is anyone's guess. But I suspect pretty much everyone involved from the person banging out ML models up to the CEO knew what was going on.


One thing I thought about is alternative scenarios. If the house market had rocketed up for whatever reason, Zillow could have made a pile of cash without having a sensible strategy.

Then the articles would be talking about what geniuses they were and ML trampling traditional real estate.

They would double down and when the market calmed, all the same issues would reappear. Perhaps that's actually what happened, because the house market has been buoyant recently, and they seem to have had the iBuyer business for a few years?


Startup mentality, right? Lose money on each unit and make it up on traction...


I'm a former quant trader as well and this article rang true to me.

I've certainly found myself in situations where I developed a model that performed so well in its specific domain that executives wanted to roll it out to a much broader domain despite my advice.

It very much is a sign of immaturity and lack of discipline from leadership. It's too easy to buy into the hype, thinking you're being supportive of AI in your org, without realizing that truly supporting an AI strategy involves a lot of nuanced skepticism.


Oh man, your comment about the need for nuanced skepticism reminded me of the co-CEO of Waymo who publicly claimed he's willing to trust his kids' lives to a driverless car.


> co-CEO of Waymo who publicly claimed he's willing to trust his kids' lives to a driverless car.

So are lots of families in Chandler, Arizona


I mean, sure, why not? He probably didn't raise them. With all that money he could buy another wife, kids, and nanny care.


"A" driverless car or Waymo's driverless cars? From the outside looking in, it appears Waymo's approach has a lot more self-restraint and is a lot less cavalier about safety compared to the competition.


> What almost certainly happened was they validated their model extensively, but overlooked the angle relevant to this use case.

To me, this is a systemic organizational problem in ML, rather than a model problem. Or essentially, pre-Agile development all over again.

Stop me if this sequence sounds familiar: (1) attempt to capture all the details in a nuanced domain that's foreign to you, from people who are experts but not expert developers, (2) develop code from that spec, (3) deliver code to the experts as a black box that they can't opine on, (4) cross fingers and run in production.

Which isn't to say this is the way everyone does it, but is to say it will be the way the majority does it, just as the majority of development is still waterfall under the hood.

But the mismatch is a mismatch in understanding, not in correctness. The experts can't offer meaningful feedback to the technological black box, and developers don't have enough time to spec a lifetime of subject matter experience.


You are right about this anti-pattern, but it's not the way most ML models are operated - especially in trading operations. The following principles are well known:

- enlist domain expertise

- validate incrementally

- expect adverse selection

- expect failures to generalise

Many, many production ML teams operate in this framework. The fact that Zillow either cut corners or hit a bigger snag suggests there is a more interesting story we haven't heard yet - not that they should have consulted OP or a few HN readers.


Trading models are a very specific and exceptional (in multiple senses) subset of ML use.

A given business probably doesn't employ more than a few people of that class and rigorousness: not least, because those people have the option of making a lot more money at a trading firm.

So yes, all businesses should do all those things. And yes, the best run places do. But it feels like there's going to be a decade or two where employee supply & expertise lags corporate demand, with bad results.


Not sure how they could really validate their models, given that "market making" in houses is probably really thin - if it exists at all.

Finding alpha in financial markets is just tough (real assets or not), and real estate also needs a lot of capital, which makes "waiting it out" tough


I can make a market by quoting a bid and ask.

To run a market making business I have to know how to hedge my net position and manage my cost of carry.

I find it crazy hard to believe that they just went long a few billion in houses just based on a model of price and failed to hedge vs. the secular market and didn’t model time to sale. But if you believe the popular press maybe that’s what they did.


Are there efficient hedges available on a few billion of RE? I thought the derivatives space there is pretty thin these days, but it's not my area of expertise.


I don’t know of anything exchange-traded or terribly liquid, but if I called Goldman and said I wanted a one or two billion dollar swap on the Case-Shiller Home Price Index, I’m sure they could work up a quote for me.


Options on anything that's correlated with it should do it. (Basically, if "RE is up" then so should be builders, suppliers, big corporate lenders, and mortgage processors - so to hedge against RE going down, they could have bought options that pay when those go down.)
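As a sketch of that idea - simplified to a linear proxy hedge rather than options, with wholly invented return processes - the minimum-variance hedge ratio against a correlated instrument is Cov(asset, proxy) / Var(proxy), and it leaves only a 1 - rho^2 fraction of the original variance:

```python
import numpy as np

# Hypothetical example: hedge a long real-estate exposure with a
# correlated tradable proxy (e.g. a homebuilder stock). All numbers invented.
rng = np.random.default_rng(7)
n = 50_000
market = rng.normal(0, 1, n)                       # common housing factor
re_returns = 0.8 * market + rng.normal(0, 0.3, n)  # real-estate portfolio
proxy = 0.9 * market + rng.normal(0, 0.5, n)       # correlated proxy instrument

# Minimum-variance hedge ratio: Cov(asset, proxy) / Var(proxy)
h = np.cov(re_returns, proxy)[0, 1] / proxy.var()
hedged = re_returns - h * proxy

print(re_returns.var(), hedged.var())  # hedged variance is substantially lower
```

The residual variance is what the proxy can't explain, so the hedge is only as good as the correlation - which is the practical caveat with hedging idiosyncratic housing exposure through public equities.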


Relevant xkcd: Here to Help[0].

If you don't have the minimum skills on the team, you're going to fail in ways you don't see coming - the unknown unknowns. Spending billions when a former Zillow employee says they had no finance people on the team is just... wow. I would much sooner believe he is mistaken than believe that is what actually happened.

[0]: https://xkcd.com/1831


I totally agree; this is more a case of simple overconfidence and excessive risk-taking than a lack of mathematical understanding. Regardless of the quality of the model you are using, there are still countless other factors that may impact your prediction quality. To me, the only way to mitigate this is to take it slow and plan for failure correction. Zillow is an example of greed and impatience winning over common sense and risk mitigation.

If anything this could have been because of _too much_ higher level statistics and math, as those kinds of things can often shroud underlying problems.


> Knowing about the Lucas critique from economics would be much more relevant

“The Market for Lemons: Quality Uncertainty and the Market Mechanism” by George Akerlof is also a nice related resource [1]

> a widely cited 1970 paper by economist George Akerlof which examines how the quality of goods traded in a market can degrade in the presence of information asymmetry between buyers and sellers, leaving only "lemons" behind.

[1] https://wikipedia.org/wiki/The_Market_for_Lemons


This was eBay; now it's Amazon.


> This post strikes me as empty bluster.

That's because it is. A common characteristic of almost all blogs run by consultants. The only exception that jumps to mind is John Cook's [0].

[0] https://www.johndcook.com/blog/


I think John Cook has hit on a brilliant marketing strategy: write mostly about basic, upper-division probability and statistics concepts (e.g., the tail behavior of the gamma distribution, or a list of conjugate distributions), interspersed with the occasional non-controversial polemic ("don't invert that matrix"). To the uninformed (including his potential clients) he looks like some sort of genius, and for the rest of us it's a great reference.


What if the model said there were only $100m in homes that were viable, but executives wanted $1 billion in purchases? They could have had an alright model, but the volume wouldn't sustain the purchasing quantities leadership wanted. So, they relax the model to accept lower quality results. It's another form of executives attempting to squeeze blood from a stone. It just doesn't work, but it's a common strategy. They'll ask simple questions like "Why can't you just do X?", you advise them not to, and they ignore you.


Basically the equivalent of 2008 no docs no income market for housing loans.


Do you have any evidence that this is what happened?


Their CEO also seems to have fundamentally misunderstood what he was doing. He described their activity as market making. That’s wrong.

Market makers attempt to buy and sell instantaneously without changing the asset. Zillow had no intention of simultaneous purchase and sale. They also modified the asset. They used a market making model glued to a real estate speculating one to run a house-flipping operation at scale and from a distance.

Instead of focusing on sellers who would sell at a discount for immediacy, they paid over market. There was no effort put into pre-selling or hedging. (The latter would have been possible, e.g. via a security written against their entire portfolio.)


A market making model would actually be interesting. Set up a potential seller with a buyer beforehand. Then, when the seller buys the new home, both transactions occur together and you get two sales instead of one.


> overexcited executives failed to heed the caveats of the nerd

Yes, the nerds are so well-known to consider the risks of what they create. If only they’d thought of the risk that someone could use their work in the real world, even though it came with specific instructions not to do that. Maybe they wouldn’t create possibly-destructive ML models then.

Alas, it remains a mystery what a private company might do after paying a lot of money for something. Now where was I? Ah, right! Predicting criminality from baby photos! I love how DARPA sponsors basic research with absolutely no expectation of ever using it. Although I'd better name it README.MD, just to make sure they do not forget to read it.


Yes, at every startup I was in, the stats nerds were in fact well-known to tell the nontechnical people and even the devs that what they were doing wasn't going to work, or rather that we didn't have good evidence that it would work. This is such a well-known feature of stats nerds that it's a meme.


Technology has dual uses. Kitchen knives can be used to kill people, so should they never have been invented?

You can take this line of reasoning very, very far. Some people think we never should have left our hunter-gatherer lifestyle. To these people, the agricultural revolution was a disaster and things have only gotten worse from there.

It's a tremendously ignorant viewpoint IMO, but it's where you land if you believe technology with the potential for bad use shouldn't exist in general.


I think they failed to consider adverse selection. For instance, if I'm selling my home, I will check Zillow instant offer and try the traditional market. The Zillow offer will be my floor plus some value for convenience. So they will likely be overbidding.

The other thing is that running a website and being a data aggregator is an entirely different business than speculating on houses. Matt Levine pointed this out last week [0], but it's not making markets if the turnover takes 6 months.

[0] https://www.bloomberg.com/opinion/articles/2021-11-03/zillow...


> is that they failed to put in adequate margin for adverse selection. Adverse selection turns prediction variance, which you might naively expect to "average out" when using a well-calibrated model at scale, into a bias towards underperformance.

Presumably, they trusted their model to assess price and future potential more accurately than a person could. If you had an excellent model, why would it _not_ balance out?

My theory is that most people would consult various sources and professionals-who-assess-homes (whatever they are called), and thus the likelihood of Zillow having a better assessment than the average person is just low.


Trained by all kinds of nominally relevant data.

Makes nominal decisions with at best somewhat nominal ability.


I agree with the OP that "it's put up or shut up time for Data Science teams. When the business depends on our work for revenue growth, success is expected. Failure is fatal." Zillow's CEO blamed the Data Science team for the failure of a mission-critical business unit and fired 25% of all employees. The stock lost a third of its market cap in a few days. When revenue growth depends on an ML/AI system, failure is indeed fatal.

However, I disagree with the OP's diagnosis.

The initiative failed because to a close approximation the only homeowners who would routinely accept Zillow's offers were those whose homes could not fetch anywhere near as much via traditional sales channels. Everyone else would rather sell their home via conventional listing. Zillow gave homeowners a free put option: Whenever Zillow's offer was above market, the homeowner would take it; otherwise the homeowner could freely ignore it. The incentives were strongly stacked against Zillow from day one.

Zillow unwittingly launched what I would describe as a seller's market for lemons. Naturally, Zillow ended up buying... a lot of lemons.


I made the same point on twitter a few days ago. https://twitter.com/mattmcknight/status/1456096267768147971

Even CarMax and Carvana with commodity products in the traditional lemon market have to work around this problem by adding an inspection to confirm the data- and they have a large spread between their buying price and their selling price.


> large spread between their buying price and their selling price

This is the market I was amazed Zillow didn't get into: the "WE BUY UR HOUSE 4 CASH" market.

I'm guessing the volume is too low (few people sell their house for stupid low offers) and advertising costs are too high (those people probably aren't obsessively checking Zillow for their home's value every day).

But still, it seems like easy money that someone far less knowledgeable and talented than them is currently scooping up, and it would have better insulated them against market turns.


> the market I was amazed Zillow didn't get into: the "WE BUY UR HOUSE 4 CASH" market

They could have bootstrapped by offering a product to those cash4home operations.


And don't forget that flipping a car is a lot faster and easier than flipping a house, with much lower capital requirements. This makes each cycle of trial and error a lot cheaper and a lot more frequent.

Also, there is something to be said for a market where the customer brings the product to your location for inspection, and whose product is considerably faster to inspect rigorously than a house.

Also, the products within that market are much more uniform and share several unifying characteristics even across different makes and models, so it's easier to learn a good, accurate pricing function.

That gives them a lot more cycles to improve their algorithms (both human and machine based).

Plus, when CarMax was starting out, they likely did so with a bevy of folks already experienced at buying and selling used cars at a profit, who had the systematic knowledge of how to do so in their heads.

It then was simply a matter of building up customer goodwill by focusing on another metric customers cared about: taking a perhaps slight loss to avoid the risk of being considerably ripped off by a savvy local dealer, and saving oneself the anxiety induced by high-pressure sales tactics.

Having said that, I think Zillow could still be quite successful in this endeavor. However, from my experience, they didn't deploy their trading algorithm in a prudent way: they massively over-traded a model untested in the real world, they trusted their paper trading (backtesting) way too much, and they didn't focus on the nuts and bolts of building a successful market-making system: making actual money over and over at small size, and ramping up size as your empirically demonstrated PnL grows.

Also, putting trading aside, they didn't have to be a flipper. They would probably be a more natural fit as an exchange or intermediary that provides exchange-type backing and reduces the friction in buyers finding sellers: taking on risk by only making an offer when they have a bona fide buyer, for a fixed or percentage fee, while smoothing over all the hassle of two strangers making a large trade. They could offer services like: if you don't like the house you just bought through us, we will help you sell it at a cost proportional to renting, no questions asked.


Carmax and Carvana also provide a service. They bring a bunch of cars to one place that you can peruse and buy. That's a better experience than trying to connect with a bunch of randos on craigslist.

Zillow hasn't gotten to the point where they provide that, but theoretically they could. You could drive around with a Zillow agent and see their 50 homes for sale a lot quicker than coordinating with 50 random sellers.


The broader point is... Zillow's alpha should not be valuation, for all the reasons you listed.

It should be geospatial trends.

There's a fixed margin on any given home, caused by the inefficiency of real estate markets. But many actors (the seller, other agents) have an inkling of what that margin and the numbers on each side of the sale are.

What they do not have as good of a bead on, and Zillow is best placed to know, is high vs low neighborhood boundaries, and how they're moving, across time.

It seems like there would be more (and more reliable) margin in betting on the boundary movement, that the seller isn't aware of yet, than the absolute value of a particular home.

And yes, they are presumably capturing this in their price difference, but it seems like it should be a core foundation.


I think your basic point is good -- a free put option is always going to be a problem.

However, can you unpack "Everyone else would rather sell their home..."? At the core there are two factors at play in this decision. One is financial; anyone with the luxury of time will compare a realtor's comps with an ibuyer's offer. The second is certainty; an ibuyer offer is guaranteed, and can be executed without having to show a house over the course of 1 to X months.

It seems to me that certainty has value to some sellers in some situations. Right now, the certainty value of the traditional route is pretty good because the market is on fire. That is not always the case everywhere.


It depends mainly on how much more money the homeowners believe they can fetch by listing the home versus selling it right away on the spot.

For example, if you're the homeowner and you think you can get, say, a few extra $100Ks by listing your home instead of selling it right away on the spot, you would probably take your chances and list it. Even if you're under pressure to sell quickly, you might still try listing the home for a couple of months to see what happens. Only if you think your home is a lemon will you consider selling to Zillow.

That's why Zillow ended up buying all those lemons.


> Only if you think your home is a lemon will you consider selling to Zillow.

What if Zillow just makes an offer that’s high enough that you think it’s unlikely you’d do better if you went through the hassle of looking for other buyers?


Even if Zillow does, they still take a 12% fee. When I was selling my house a year ago in Texas, Zillow's offer was both lower than comps in the area AND carried a hidden 12% fee. Did you know they also charge YOU the repair fee?

I had a chance to talk to an inspector who looked at my place and who had previously worked with Zillow to inspect houses they were buying. The gist is that Zillow was getting the worst houses (the ones with trash everywhere, burnt carpet, feces on the walls). For them to get a "normal" house was very rare. Nobody with a normal house (or better) in normal condition (or better) will sell to Zillow. You can get a 1%+3% fee full service (pro pix, pro staging, pro cleaning, ...) any day of the week and still sell your house in mere weeks, and I wasn't even in Austin. I guess if you don't care about making tens of thousands in a month (or hundreds of thousands in nicer cities), then you have bigger things to worry about.

Zillow's offer was already higher than OfferPad and other iBuyer's services. After my experience on the sell side with Zillow, I am incredibly wary of ALL iBuyers on the buy side.


Let's say P(h) is the price that a home fetches after being listed, O(h) is the price the owner expects to get from selling the home, and Z(h) is the price Zillow expects to get if they sell the home. An owner will sell to Zillow when O(h) < Z(h) and will not sell when O(h) > Z(h). Zillow's outcome is then (P(h) - Z(h)) * 1[Z(h) >= O(h)].

The extremely important thing to note is that this is an asymmetric-information game: O(h) tracks P(h) much more closely than Z(h) does, because the owner has private information about the home. So the sales where O(h) < Z(h) are self-selected to be exactly the cases where Z(h) overshoots P(h). If Z(h) really were a reliably better estimate of P(h) than O(h), that would mean P(h) is not driven by information the owner has and the market lacks - but then the market would already be offering that value to the owner, and O(h) would simply track Z(h). It's only when non-public information changes the value that O(h) != Z(h).
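A numeric sketch of that selection effect, using the P(h)/O(h)/Z(h) setup above (all distributions and dollar figures are hypothetical):

```python
import numpy as np

# Hypothetical instantiation of the model above:
# P = realized sale price; O = owner's estimate (small error: private info);
# Z = Zillow's estimate (larger error). The owner sells iff Z >= O,
# and Zillow's outcome on each purchased home is P - Z.
rng = np.random.default_rng(1)
n = 100_000
P = rng.normal(400_000, 60_000, n)
O = P + rng.normal(0, 10_000, n)   # owner tracks P closely
Z = P + rng.normal(0, 40_000, n)   # model tracks P loosely; still unbiased

sold = Z >= O
outcome = (P - Z)[sold].mean()     # expected P - Z on completed purchases
print(outcome)                     # negative: losses despite an unbiased Z
```

Because the sale only happens when Z clears O, and O hugs P far more tightly than Z does, the completed purchases concentrate where Z overshot P, so the conditional expectation of P - Z is negative even though Z is unbiased unconditionally.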


Like the previous posts said, the advantages of selling to Zillow are price certainty and immediacy, so your model isn't quite right. But I agree with the adverse selection problem you highlighted.


Exactly. And to add to what you say: adversarial problems in data science are much harder than the regular ones data scientists deal with, and the hubris I often see among data scientists won't cut it here. You need a different mindset to tackle these problems, and they must be handled incrementally, with the real world in the loop.



>those whose homes could not fetch anywhere near as much via traditional sales channels

Isn't this just another way of saying the models overstated the value of the homes?


On average they may not have. Say three separate million-dollar houses get offered $900K, $900K, and $1.2M. The point is that the guy offered $1.2M knows he will not get $1.2M anywhere else, so he takes the offer.

The individual house is overvalued, but you don't find out until after, let's say, 7,000 of those sales.


> Zillow gave homeowners a free put option: Whenever Zillow's offer was above market, the homeowner would take it; otherwise the homeowner could freely ignore it.

Sorry for the possibly dumb question, but doesn’t this just describe buying things in any financial market? Your offer will only be accepted in cases where nobody else is willing to pay more.


In a financial market (e.g. stocks), one share owned by an individual investor is just as good as one share owned by an institutional investor: they have identical representation. The financialization of housing glosses over the fact that some houses have foundation issues, need a new roof in a year, etc. You can't see that if you treat all houses as if they were shares, which it seems Zillow did; by the nature of the asset class, sellers dumped low-quality product at high prices.

Maybe Zillow went wrong because they deployed too much capital too quickly. With the opportunity to be selective, they could have found homes with 'good bones' that were under-priced. But if the executives mandated X billions of purchases in Y months, you move downstream to progressively less good deals, due to your own insatiable demand, when housing stock is already extremely low. That's not the AI/ML team's fault. That's you simply taking what you can get.


Yeah, it’s not correct to call it a put option. A put option has a time component where you are guaranteed a contract price while the market price fluctuates.

Zillow is an example of just bad bidding (due to a bad valuation model).


In a commodity market (all goods are fungible) pricing is easy to come by. For idiosyncratic goods it’s harder to get someone to make you an offer. The traditional way of doing price discovery in those cases was an auction. Zillow was willing to give a price upfront with no commitment, even in the sense of face time with a charismatic salesperson. That’s valuable for the seller and dangerous for the buyer.


> Sorry for the possibly dumb question, but doesn’t this just describe buying things in any financial market? Your offer will only be accepted in cases where nobody else is willing to pay more.

This isn't a financial (securities and derivatives) market, nor is that behavior specific to financial markets. OTOH, in financial markets, you are dealing with masses of interchangeable items (e.g., every share of IBM is the same); whereas with real estate every individual item is different. In financial markets, you typically have (some) other participants' order data for the same asset as a pricing signal.

It's a lot easier, when you are willing to make an offer on any item in the market, to be way off in real estate than in financial markets without knowing it till you try to flip the asset.


It wasn’t a free put, the problem here is that it was a put but it was not priced correctly.


Personal example. I bought a condo in the University District in Seattle back in 2017. It's a 92-year-old, 478 sq ft condo. I paid $279,000 (yikes, I know)...Zillow's estimate? $500,000.

I have no idea how they were so far off. Even right now:

Zillow estimate: $479,100 - https://www.zillow.com/homedetails/905-NE-43rd-St-APT-212-Se...

Redfin: $307,337 - https://www.redfin.com/WA/Seattle/901-NE-43rd-St-98105/unit-...

Keep in mind that Zillow's HQ is also in Seattle. While we may be protective of the data scientists on HN, this is one case where I agree, there's something wrong with their models. The "similar recently sold" condos that Zillow shows are indeed $400K, but those are at most 50 years old and have more square footage. I think a simple linear regression on square footage and age would give a better prediction than whatever model they are using.


Right. I read their current actions as the result of a liquidity crunch. Something forced them to realize a 25% reduction in labor and other cuts. They also completely ended this home buying program. They’re hoping they’ve cut enough that when they sell their properties they have enough funds to keep them afloat. If not, chapter 11 filings are not far behind.


Chapter 11 filings?

Look at their balance sheet, they have over $3B in cash. They will be staying afloat comfortably.


Just some quick figures. They purchased 15k homes in Q2 alone. Debt on 15,000 homes with average purchase price of 350k (as a ballpark figure) comes to 5.25B.

There’s a lot of information we don’t have, but them laying off huge percentages of staff is a scary indicator. Edit: also note their market cap is about 17B.


Can you point me to the source that says they bought 15k homes in Q2 alone?

Zillow says they bought 3,805 homes in Q2: https://s24.q4cdn.com/723050407/files/doc_financials/2021/q2...


This might be where google got that figure. Again, as I state above, I was using rough figures to get an idea of the magnitude of the funds they were devoting to their flipping/analysis unit. It looks like they were doing thousands of units instead of tens of thousands with discrepancies among the sources I'm finding in google:

https://www.zillow.com/research/zillow-ibuyer-report-q2-2021...

What's not changed is the fact they were leveraging a significant portion of their company's market cap into this segment, and it resulted, so far, in them laying off a quarter of their labor and completely terminating the program, amongst other things. This is not good news for their business or its prospects of continuing as a going concern.


From personal experience, two major contributors to the Zestimate are: 1. The town/city’s assessment of the home’s value for tax purposes 2. Recent sale prices of homes nearby, mainly as the per square foot price.


Another thing I don't understand. How do you develop a model with VOLUMES of real world data to train it on, and not just check how well it predicted pricing every month, both in aggregate (trending prices) and on individual listings that sold that month? This seems like a very small, closed loop to me. You could tweak the heck out of that algorithm in short order based on WHAT ACTUALLY HAPPENED.


They publish some accuracy data. I think this compares the model to WHAT ACTUALLY HAPPENED

https://www.zillow.com/z/zestimate/

Based on this I suspect they do check the model based on WHAT ACTUALLY HAPPENED.

what makes you think they don't check the model on WHAT ACTUALLY HAPPENED?


> what makes you think they don't check the model on WHAT ACTUALLY HAPPENED?

As I said, it seems like a pretty tight feedback loop to check, and then tweak the model. So it's confusing to me how they could be far enough off to ruin the business plan.

I'll put it back to you: What am I missing? How could they revise the model based on many months of real-world data, and yet get it wrong enough to nearly destroy the business?


They became part of the feedback in their own model. Real estate markets are high-value with limited inventory; if you add a market maker (Zillow), they set the floor. They had too much influence on the price, and didn't just provide liquidity. Additionally, these transactions take time to settle, and Zillow tried to improve their profit by taking on additional risk: increasing holding time.

Finally, I'm not sure how they hedged the risk. How do you short houses? If you think of it like a mortgage, where holding 20% protects the bank from losses, variability of up to 20% leaves you with a hell of a downside.


Good question. I don’t know. There are a lot of possible interacting causes. Let’s itemize a few.

There is no checking. This seems unlikely given, as you point out, the data is right at hand and, as I point out, they publish checks. In general, I’m suspicious of WOW PEOPLE ARE STUPID explanations.

There is checking, but model accuracy is mistranslated into business outcomes. The model is 90% accurate; bid on 1000 houses; maybe it's mainly the people your model wrongly overprices who take you up on it. See the vast literature on adverse selection and the winner’s curse.

There is tweaking the model because you want a certain outcome.

> Over the last few weeks, it’s become clear that Zillow made a lot of mistakes. It tweaked its algorithm to be more aggressive with offers, winning bidding wars just as the market was starting to cool and overpaying for properties [1]

The CEO has commented that the volatility of the prediction was a problem. He mentioned that the unit economics swung from over plus 5% to minus 7%. So the model might be accurate on average but too noisy for their purpose. [2]

Point is it’s not at all obvious that the problem here was accuracy on predictions of house prices.

[1] https://www.curbed.com/2021/11/zillows-i-buying-algorithm-ho...

[2] https://www.google.com/amp/s/seekingalpha.com/amp/article/44...
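The adverse-selection mechanism is easy to see in a toy simulation (all numbers invented): even a perfectly unbiased model loses money when only above-value offers get accepted.

```python
import random

random.seed(0)

def average_overpay(n=100_000, noise=0.05):
    """Unbiased offers; sellers accept only when the offer beats true value."""
    loss, accepted = 0.0, 0
    for _ in range(n):
        true_value = 300_000
        offer = true_value * random.gauss(1.0, noise)  # zero-bias estimate
        if offer > true_value:                         # the 'free put' is exercised
            loss += offer - true_value
            accepted += 1
    return loss / accepted

print(round(average_overpay()))  # ~12,000: roughly 4% overpayment, every time
```

The prediction error never "averages out" because the seller, not the model, decides which predictions become transactions.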


The problem with that is that many sellers use the Zestimate as a “ground truth” to set the sales price.


My neighborhood is a mix of single family homes, condos, and apartments. I live in a single family home. Zillow regularly shows nearby condos as comps - well the condos have always sold for less than the single family homes. The condos also have a high HOA that does not show up in the price. Zillow does not know that there are different kinds of buyers for condos vs single family homes, even when they are close by and a similar size.

My neighbor's house is the mirror image of my own, same size, same layout, same size lot, right next door. Zillow shows my home currently at $226K more than the one next door that is the same thing. Zillow's Zestimate of my home is more than a recent appraisal - which is odd because until 2021 Zillow had a really low Zestimate for my home, like the house next door. No idea why Zillow decided on a much higher value recently for me but not for my neighbor.

And I'll have to say that lenders do consider the Zestimate as part of a loan, especially in cases where an appraisal might not be needed; if it's on the bubble, the Zestimate might be the difference in whether you need an appraisal or not.


I think (hope?!) that the Zestimate and the model predictions in their iBuying business are completely separate. The Zestimate is essentially a top-of-funnel strategy to get home owners and potential buyers into their normal business platform.

For my home, the Zestimate is about 8% higher than Redfin's.


I think part of it has to do with black box models being easier to build incrementally. I’ve seen many teams start off with 3-5 features, which gets you to some threshold for launch (say 90% precision and recall) and get badly burnt on the other 10% post-launch


Curious why didn't you flip it to Zillow ? If their prices are still not revised even after your buy wouldn't that be a good spread to exploit?


I think you underestimate the market and shortage in supply. I’d be surprised if anyone cared about your condo being older.


A 92-year-old house probably is worth more than a 50-year-old house, unless you like doing asbestos abatement.


I hate to use this language on HN - but this post is a crock of shit. This is just an opportunistic post by the author that cleverly evades defining anything in solid terms, providing even one concrete example, and is a pot of handwavey BS. It started off encouragingly, and two paragraphs in, the smoke alarms in my head were blaring. The author doesn’t even have a clue (or for some mysterious reason didn’t mention) about what went wrong for Zillow.

Zillow’s massive screw up was that their key business assumption was a self fulfilling prophecy. If a model learns that “price to pay today is price to pay yesterday in this area * 1.05”, and Zillow is buying multiple homes in a neighborhood spread over time, you can see that the price is headed to the moon (and the buyer who puts their faith in the model to the gutter) in no time. In my parents’ town, they offered a neighbor a price, and came back 3 months later with an offer $70k higher. I have a strong suspicion what was supposed to be a price prediction model ended up being a price discovery agent due to the lack of a “visited” set. Oops!
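That failure mode is easy to reproduce on paper. A toy sketch (all numbers invented): a comps-based model whose own purchases feed back in as comps walks the price away from fundamentals.

```python
def next_offer(comps, premium=1.05):
    """Offer = average of recent comps, padded to win the bidding war."""
    return premium * sum(comps) / len(comps)

comps = [300_000.0] * 5          # true neighborhood value: $300k
for _ in range(10):              # ten purchases in the same area
    offer = next_offer(comps)
    comps = comps[1:] + [offer]  # each purchase becomes the newest comp

print(f"{comps[-1]:,.0f}")       # ~361,000: well above the fundamentals
```

With no "visited" set, nothing anchors the model to anything outside its own transaction history, so the drift compounds on every purchase.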


I agree with your characterization. I've been in meetings with people like this who throw out irrelevant stuff (topology, distributions of distributions!?) as if it were an insight, and there is a certain kind of "leader" that loves that stuff, to the frustration of everyone who actually has some background in the area. I assume this is where he's coming from.


I suspect something like this happened:

- The team behind the Zestimate kept touting how accurate it is. Since they needed to keep improving it to keep their jobs, they kept cherry-picking the best results to showcase. Edge cases were silently not mentioned and failures were spoken of as little as possible. The further up the management chain the results went, the nicer they looked, as each layer tweaked them to sound even more impressive. None of it mattered much, so who cares about a few white lies?

- The other executives seeing these great results decided to create a new business model based on top of it. The DS team and their management was not consulted about this until many wheels were already moving.

- The zestimate team and their management chain would get fired (or otherwise punished) if they suddenly showed all the edge cases and potential failures they never mentioned before. So they collectively kept quiet, hoped they could fix the edge cases and hoped it would all go well. Messengers tended to get shot in this situation and no one wanted be that messenger.

edit: In my experience skewed institutional incentives are more often the source of issues than technical mistakes or low level employees being blind to things. You don't get promoted in most companies by telling the CEO they're wrong so people don't.


If it went down like this, then senior management is where the problem lies. Every CEO should know that expecting an AI to understand systems that humans can study for a lifetime and still get wrong is irrationally optimistic. Firing 25% of a team for a self-inflicted failure (by senior management) is another signal there is a deep management problem.


This is not even close to the reality.

Zestimate is not the tail that wags the dog. It’s wild how much people get wrong in their assumptions here.


I used to work at a trading systems vendor and once in a while, someone would ask "couldn't we use ML to tell our clients what to buy and how much to pay?" My response was that "if we could do that, we'd never give it to our customers and just trade ourselves."

So I always take things like the Zillow estimate with a grain of salt because "if Zillow really believed this number, they'd use it to buy houses rather than put it on their website."

So it was interesting when Zillow actually started to essentially "trade on their numbers" while also sharing them with the public. That was a very bullish sign that the company deeply believed the quality of its signals.

That can be totally normal. Every investment (and in fact, every project - which is a form of investment) happens because people believe that it's a good bet which is different than guaranteed. Some of the most money-making hedge funds are right in only like 51% of their bets, but that's enough!

The thing that somehow went wrong is simply risk management. It seems like nobody with power at Zillow asked the question "how do we make sure this doesn't ruin the company if it's wrong?" UNLESS the scenario we're seeing now is their predicted acceptable worst case scenario.

It's possible that they simply said "this is a juicy bet we want to make. If it goes wrong, we'll lose X million dollars and have to fire 25% of our team. Are we willing to take this bet?"


> "The CEO points directly to the failure of their Data Science team as the root cause. Their investors aren’t buying it and I don’t either. Zillow’s Data Science team is excellent. The business knew the risks and ignored expert advice. The initial rollout should have been far more limited and cautious.

> The Data Science team didn’t go to the lengths necessary to support their models. That was their failing. However, the business is ultimately responsible for betting the farm without demanding a real world track record to support those models."

I'm confused. If the "team" was "excellent" (and he says this multiple times), why didn't they do the extra leg work? It feels like he has a dog in this hunt. Did they use him for external consulting or something?


This author doesn’t have a clue about anything. After the first paragraph, this might as well have been written by GPT-3. Sounds like a consultant trying to cash in on the news of the week to me.


There was a post on Blind a few days ago from somebody who works at Zillow, stating that management disregarded models and decided to go instead with "intuition"


That post tracks with some of my experiences working in forecasting. Senior leadership loves the DS team until they are told something may not work how they want, or that reaching their quarterly KPI (which is often tied to their bonus) is infeasible or a bad idea.

This is a generalization, and has happened a couple times in my career. I feel lucky now to be at an organization where this is no longer the case.


This seems to be the thread. I’m not sure how to link to the specific reply. https://www.teamblind.com/post/Zillowers-what-happened-fXvbZ...


What is Blind? I tried searching various things but just keep getting results for accessibility.


https://www.teamblind.com/

Workplace chat app, you use your work email to sign in and chatter anonymously with your coworkers and others in the industry. You are ID'ed only by the company you work for.




Also, I take issue with this:

> their CEO put a lot of blame on the Data Science team.

That is the height of hypocrisy on the CEO's part. If this move had worked, who would have taken all the glory and the $$$$$? The CEO! That MF should take the blame too, since the buck stops with him. He didn't know how to use their predictions properly; stop blaming them.


This reflects something I say often as a data scientist: the best data science implementations augment humans.

Domain understanding is key. Integration of the domain into ML models are a key part of generalizability -- far beyond train/validate/test, which are subject to data collection biases and temporal trends.

Zillow had a high, general-equilibrium impact on a large market. We saw microcosms of this behavior in the "good old days" of Amazon book-selling, when competitors would mark up a book's price to one penny more than another's, resulting in outrageously priced books worth far below the offering price. No behavioral governors were in place, review was inadequate, and they became part of the Data Science lore we tell ourselves as our single-threaded scripts running Python payloads warm our homes and melt our s'mores whilst our GPU weeps.

Data Science as a field is distinct from ML in that it uses the tools of ML to augment and enhance digital services. Despite the exuberance of boot camps to market prediction and neural nets as the end-all-be-all solution for hard business problems, value in data science adoption isn't solely in prediction, but in validation. Causal modeling and inference are key in this space, but also simply quantitative understanding of statistics and algorithms. As an example, you need someone in the room who understands multi-armed bandits sometimes. But, more importantly, you'll always need someone in the room who understands how multi-armed bandits would improve a business' value proposition.


This post is mostly full of hot air. Not sure how much experience this person really has in data science as they contradict themselves multiple times.

I think the execs at Zillow bear almost all of the responsibility for this. The harsh reality is that anyone with a brain could see that they were going to face a serious adverse selection issue (only homeowners with crappy homes would sell to them). They tried to put guardrails in place to prevent that, but the guardrails systematically failed. And I think they failed because Zillow fatally underestimated how corrupt the real estate industry is.

The inspectors and appraisers that were supposed to protect Zillow from overpaying failed to do their job systematically. I say this because I’m seeing homes from Zillow and Opendoor with major structural issues. No sane homebuyer would have paid what Zillow did for these homes. The only possible explanations are either: Zillow did not do their due diligence, or the inspectors/appraisers didn’t do their jobs.

If it’s the former, then Zillow is a cautionary tale. If it’s the latter, then we’re definitely in a bubble again and the appraisers are going to be the target of more legislation down the road.


It's difficult reading articles without any Headings!

Data mining is common in finance. Finance PhDs are constantly identifying new "factors" (e.g. value, momentum, price/book) based on historical data that don't pan out as time goes on. See here for some academic research: https://www.researchaffiliates.com/publications/articles/710...

Robert Shiller has done extensive research on how psychology impacts real estate prices. Monetary policy isn't standard right now. Machine learning isn't always a good tool for making predictions. Look at Shiller's interviews and surveys for predicting home prices. He doesn't use machine learning or advanced analytics.


Did they consider that they just hired the people who would tell them what they wanted to hear?

Had it been fortune cookies, it would have been the same.

> When I advise Data Scientists to understand Topology and Differential Geometry, this is why.

This is ridiculous, it has strictly nothing to do with that.

The author thinks he’s smarter than everyone else.


> Did they consider that they just hired the people who would tell them what they wanted to hear?

Management everywhere thinks they are getting good info from those below them, when everywhere I have worked, they get a selective, job-preserving subset.


I’d really like to see the author write an explanation as to why. I completed my PhD in “AI” (as it was then called) in 2001. Very little DNN work then, but I haven’t a clue where the connection lies.


Is it even possible to predict things whose predictions influence the events that are supposed to be predicted?

Suppose you did have a perfect way to appraise houses by some arbitrary criteria. If the houses are "perfectly" appraised, that would change the values of the houses themselves, no?

e.g. magical system predicts 3b/2bath house should cost 500,000. People know the magical system's price is the perfect one. someone who has more money to burn can acknowledge this and still pay even more money, knowing most will not want to "overpay" for a house. house sells for 550,000. prediction is wrong in the end anyway.

schrodinger's predictor - it's accurate right until it's observed.


They don't need to predict the actual sale price of every given property perfectly to make money. They need to be able to say that with 99% probability, we will find at least one offer for this house >= 500,000. The errant buyer with loads of cash is free to come in and offer more. As long as they turn a sufficient profit at the 500,000 price point they should be set. In competitive markets, for people familiar with local dynamics, this is very much possible in a short-term time window. The question is doing this at scale in a fully automated way for a much longer time horizon. Zillow tried and failed.


Predictions like this work on aggregates, not individual cases. There are also sellers who are in a hurry to move and will settle for $450,000. Magical system is wrong in both cases but right in the aggregate.


Yes. This is one of the fundamental teachings of economics. The ideal price is not knowable


Nitpicking here, but it is my understanding that there isn't even a consensus about the existence (or not) of an "ideal price" in economics. Just to make it clear, I'm not an economics major nor do I work in the field, this opinion is just based on my amateurish readings.


Yes, it is possible. You can tackle it through different techniques, like game theory.

Cf. also the “market impact” in HFT, for a practical example.


One thing that I feel this article misses is the notion of domain expertise. Zillow has entered the house flipping market that has been already populated by human real estate agents and investors, who already do their work of pricing houses and striking deals quite well. To compete, a successful entrant needs to excel in at least one thing and be not much worse in all the others. In practice, that is a tall order.


Real estate agents 'price' by looking at rough comparables down the street, and padding up a little bit. That's it.


Yes, but there is more than pricing that goes into a successful operation.


A couple of years ago, an article was posted here: How to Recognize AI Snake Oil. https://readhacker.news/c/4cSun It looks like this was the case with Zillow.

Most of commenters here insist ML and data science have nothing to do with this failure, and put all blame on the management.

I don't buy that it's not a failure of ML. In the last several years I've consistently seen tech leads oversell the possibilities of ML and theorize that deep learning could solve many problems.

The result is always modest. Error margins for real-world metrics that involve location data are around 20-30%, which is somewhat better than analysts' predictions in Excel, but not enough to rely on. I've seen very competent ML engineers try to improve a model, take a huge amount of satellite imagery, tracking data, etc., train their neural networks for a couple of months, and only improve the error margin by one percentage point! (e.g. 25% -> 24%)


> Zillow believed their inference quality was high enough to support a new business model

This isn’t why Zillow started Offers, and the Zestimate did not play a big part in it. Some of the same people, but those were different teams.


What if zillow's defective pricing model is single-handedly responsible for the housing price bubble in the US? I have not seen prices increase at all elsewhere in the world.


I think they've been a contributor. Having looked at many houses in various markets over the last few years, I feel their estimates are high but sellers take them as gospel.


There’s another consideration here. The entire buying operation was debt financed which created incentives for risky behavior and left very little margin for error. Not only does your sales price need to cover the cost of the house and any improvements, it must also cover the interest expense (123 million YTD according to their latest 10Q, vs 2.67B in sales of houses or about 4.6% yield relative to sales). And that interest expense is there regardless of how many houses you sell.

If Zillow predicted housing prices to go up and it merely stayed flat instead, they can’t just hold the inventory for longer in hopes that the market might go up a few months from now. They’ve got that interest expense to cover. The whole operation implodes the moment they made a bad prediction because of the debt financing. Yes, their models were wrong. But if they had a different capital structure maybe they’d be able to hold the inventory for longer to try to turn a profit and rebound instead of collapse.
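A back-of-envelope illustration of the carry problem (rates and figures below are invented, not from Zillow's filings):

```python
def breakeven_sale_price(purchase, annual_rate, months_held, fixed_costs):
    """Minimum sale price to break even while debt-financed inventory sits."""
    interest = purchase * annual_rate * months_held / 12
    return purchase + fixed_costs + interest

# A $350k house held 6 months on 5% debt with $15k of improvements/fees
# must clear $373,750; every extra month adds ~$1,458 of interest.
print(breakeven_sale_price(350_000, 0.05, 6, 15_000))
```

The interest term grows linearly with holding time, which is why "wait out a flat market" wasn't an option for them.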


I don't know anything about the US housing buy/sell market, but I have some little experience in my country (Italy).

Appraisal of houses is extremely difficult because there are both objective data (like size and quality of construction) and many very subjective factors (at least here, while you can draw some rough lines distinguishing neighborhoods/areas in a city, sometimes a house just around the corner has much more, or much less, value), plus what I personally call the "financial capability of the masses" in a given area/city.

Say that you have a house that is appraised by several people to have a value around 400,000 Euro (or dollars, anyway "units").

What a young couple (both working) can afford is 350,000 Euro max, let's say 100,000 Euro cash and 250,000 loan.

So you have out of 100 people - willing to buy at the moment - 98 of these young couples and only 2 other people (these latter that can afford the 400,000 Euro or more).

Your house is suddenly no longer worth 400,000 Euro, because most of the houses similar to yours (valued anywhere between 360,000 and 420,000) that remain unsold after, say, six months will lower their asking price, and your 400,000 will look like way too much, so you will be forced to lower it too.

The other way round: if your house is valued at 300,000, there is a good chance you can sell it for 330/340/350,000, because 98 people out of 100 are looking for exactly that price range in the area, and as soon as one or a few are sold in the neighborhood for 340,000, all similar houses will go from 300 to 350,000.

And this tends to be cyclic, at a certain point that particular area/neighborhood becomes (for whatever reasons) trendy and prices skyrocket, then a couple years later (again for whatever reasons) it goes down to the bottom.

Everything is very ephemeral and fluctuating; maybe what Zillow attempted to do (and seemingly failed at) was more to become a sort of monopolistic entity capable of "imposing" prices?


Back in the 1980s, people used to say things like, "It's in the computer, so it can't be wrong." The computer says your account balance is $86.23, so it's $86.23, end of story. Can't argue with the computer, the all-seeing, all-knowing, infallible source of information.

Such statements sound totally crazy now, but it was a pretty widespread misconception. Computerization made such big improvements in availability and accuracy of information that, for a time, people simply overestimated what computers could do.

Maybe we're seeing some of the same thing now with machine learning.

(This isn't a comment on whether attitudes about machine learning were the cause of Zillow's troubles. I'm not a Zillow insider, plus the situation is probably complex.)


Machine learning was never supposed to be good at modeling constraints (hard constraints, equilibrium constraints).

Applying it to real applications with real constraints was a mistake. Google will not collapse if they serve me one recommendation that is out of bounds (a monkey photo instead of a black person). I will ignore it (as we all did for years, before the journalists took on this). Google’s business model does not strongly correlate with the accuracy of their ML algos.

I have yet to see 1 successful company whose business decisions rely on the recommendations of a machine learning framework.

The explanation is trivial. ML works, most of the time. And that is the problem.


Sounds more like classic large organization things.

I bet there were some employees who saw the estimates were untrustworthy and the entire thing was heading towards failure, but good luck changing the course of a ship controlled by hundreds of people.


Pharma has been using QSAR/QSPR and machine learning, along with various other statistical models for 50+ years. There have been repeated over-predictions as each new technology has come down the line. Yet the principles, like in this article, hold - have a good training set, have a test set that isn't just an echo of the training set and then finish with a validation set that is sufficiently different to help expose edge failures if not outright failures. Remember Egan's rule of 1 - the number of times a modeler can fail before their predictions are ignored...


From the blind thread

> Our back tested model error as well as true error rates (models only) were considerably smaller than what was realized. Largely because leadership wanted to be aggressive on acquisitions and incentivized all teams to err on the side of growth, which manifested in system overrides. Without proper oversight and controls these lead to compounding and unchecked error

Sounds less like blaming the data science team and more like the entire organization didn't want to face the music.


“Pull back the covers on large scale deep learning models and this is one reason overfitting creates miraculous accuracy. Large scale models have a primitive understanding of the deformations leading to a degree of generalization. The more the model learns about how the inference space changes over time, the better it generalizes.“

Pretentious mention of topology followed by a super broad, almost completely incorrect characterization of the double descent phenomenon. Color me surprised.


It seems like Zillow used an algorithm to decide which houses to buy and sent in people to execute the purchase. I suspect many of the purchasers knew they were offering too much, or were ignorant of the realities of real estate.

What would have made more sense would have been to use an algorithm to identify homes and then flag those homes to highly skilled agents to vet and buy. Additionally you’d want some incentive to keep the agents motivated to buy the right homes.


Did Zillow even have buying agents? I imagine a lawyer could have handled the entire transaction.


That’s kind of my point: hiring someone to go execute a contract will have different outcomes than hiring someone to evaluate whether a contract is a good idea.


The whole point of their operation was to buy real estate based on an algorithm. Hiring an agent to evaluate and execute each sale was the opposite of what they were trying to do and likely wouldn’t scale to the billions of dollars they were looking to spend. Obviously it didn’t work out for them though.


ML is a sharp tool, just like a knife is. In the wrong hands, it can cut you up and leave you bleeding.

Moral of the story is: handle sharp tools with care; know how to use them properly.


Modern ML algorithms are so powerful they will soak up any entropy that leaks into the data whatsoever. Much of the time spent developing these models is checking that this isn't happening. I compare it to high voltage electricity: a 100kV line needs some serious insulation around it.
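A toy illustration of that "soaking up entropy" (hypothetical example, plain Python): a feature that leaks target information makes a do-nothing "model" look miraculous in-sample, which is exactly the kind of thing all that checking time is spent catching.

```python
import random

random.seed(0)

def make_rows(n, leak):
    """Generate (noise_feature, suspect_feature, label) rows.

    When leak=True, the suspect feature is a copy of the label --
    i.e., target entropy has leaked into the training data.
    """
    rows = []
    for _ in range(n):
        y = random.randint(0, 1)
        noise_feature = random.random()  # genuinely uninformative
        suspect = y if leak else random.randint(0, 1)
        rows.append((noise_feature, suspect, y))
    return rows

def accuracy(rows):
    # Trivial "model": predict the suspect column directly.
    return sum(1 for _, suspect, y in rows if suspect == y) / len(rows)

print(accuracy(make_rows(10_000, leak=True)))   # looks miraculous
print(accuracy(make_rows(10_000, leak=False)))  # ~0.5 once the leak is gone
```

A real gradient-boosted model would find such a leak just as readily as this toy does; it just hides the discovery inside thousands of split decisions.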


Seems there are 2 parts:

1. The business decision to go into the 'flipping homes' business.

2. The capabilities of the ML/AI to help optimize that business decision.

The post does a nice job outlining that #2 will have great variance across businesses, use cases, time, etc.

The biz decision is maybe more interesting and instructive. A wrong biz decision can doom the ML/AI out of the blocks - the ML/AI is often a tool to do a job, but there is a business decision to do the job to begin with.


I don't understand one thing. Predicting the price of a stock or derivative over the next 24 hours should be a much easier task. We have a tremendous amount of perfect-quality data to train our models on and to infer from. Having those predictions accurate enough would be an El Dorado. And still no one has solved that problem.

Why did Zillow ever think they were going to solve a much more complex problem, with much higher risk, using much worse data?


On the other hand, what if everybody solved the problem of stock price prediction the way Zillow did with the housing market, and the steady growth we are seeing now is just a byproduct of those algos? When will we have a Zillow moment there?


This is a poor use case for ML. You'd only have a couple hundred examples for a neighborhood, with maybe two dozen variables. The model will also miss key variables that there aren't datasets for but that would be apparent upon a visit. ML only beats humans in very, very narrow scenarios.

https://youtu.be/j0z4FweCy4M?t=9319


I think it probably would have helped to start with some basic economics. The housing market is large, diverse and very efficient. Trying to throw some data science voodoo at it and thinking you can somehow outsmart the market with some magical ML despite the fact that all the data you're crunching is already incorporated in the price by the existing participants sounds stupid to begin with.


This is exactly the problem with AI for recruiting too. Humans are smart and will evolve their strategy to counter Machine learning strategies.


My takeaway from the Zillow debacle has nothing to do with OPs approach. Instead I look at it from the Antifragile framework.

It looks like this business model had a lot of ways to go wrong and very few ways to go right. In other words, it was fragile. It doesn't even sound like the Black Swan of Covid was needed to make this fail.

In other words, if a system is fragile, you can bet on it breaking sooner or later.


The stability (or lack thereof, actually) of distributions is a fundamental problem for any statistical model of complex systems. The author stated that more succinctly than I personally have heard anywhere else. Even if the rest of the piece might be BS (as many others here are arguing) pay attention to that aspect, since it is reality for most complex systems...


This is not data science, it's speculative trading. So a data science approach without the massive paranoia and appropriate hedges that must accompany any trading endeavor is a problem.

But I wonder about the future of mass corporate home buying, which I assume Airbnb would do the way Amazon did: create a huge platform and then enter that market to compete with your own customers.


This is BS. The issue had little to do with AI; they could have used professional real estate valuers and had the same problem.

This was caused by a faulty business model - houses aren’t shares or bonds that can be flipped instantly. The feedback for their pricing model wasn’t known for three to six months after buying the houses, by which time they’d lost billions.


IMHO the Data Science team at any organization should work closely with a strategy team that uses data to ask hard questions with tools like scenario planning. It’s hard work, and when things are going well it feels like that team is wasting time, but when things get hard, they have already thought through ways to pivot and how to move forward.


So... the failure may not be due to ML, but due to people :) See this blind thread: https://www.teamblind.com/post/Did-Zillow-fail-miserably-due...


Any bond trader can tell you that overall your model can have zero bias, and a small error term, but your bids that get hit and offers that get lifted are the ones that are "off". This could have been solved with a "human in the loop". Of course, that may have broken the economics of the entire operation.


Predicting home prices is like predicting which way the cows will stampede.

It's radical hubris and lack of self awareness on their part.

What this 'exposes' is that both business leaders and Data Science leaders will probably overstate their value, and the rank and file will just go along with it, because, well, it's a job.


"the Zillow failure was largely in part due to humans 'overriding' ML outputs in order to meet their quotas"

https://twitter.com/sh_reya/status/1457094567111528455


Yep, that’s not the only thread from Zillow folks saying the same things: that their output was still accurate and was overridden by management’s gut feelings. If true, it's quite atrocious that the CEO is blaming the data science, whether out of maliciousness or out of ignorance of how the company is operating.


A fascinating thread about Zillow, comparing it to quant trading

https://mobile.twitter.com/0xdoug/status/1456032851477028870

TL;DR: not taking into account adverse selection - other people's prices & trades aren't random/Gaussian, but adversarial (they're trying to profit at your loss) and better (they probably have better information than you)
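A toy simulation of that adverse-selection effect (hypothetical numbers, plain Python): even a perfectly calibrated, zero-bias model loses money once informed counterparties get to choose which offers to accept.

```python
import random
import statistics

random.seed(42)

def simulate(n_homes=100_000, true_value=300_000, model_noise=0.1):
    """Offers come from an unbiased model (zero-mean error), but informed
    sellers only accept offers above what the home is really worth -- so
    every filled offer is, by construction, an overpayment."""
    pnl = []
    for _ in range(n_homes):
        offer = true_value * (1 + random.gauss(0, model_noise))
        if offer > true_value:              # seller knows more than the model
            pnl.append(true_value - offer)  # buyer's P&L if resold at true value
    return statistics.mean(pnl), len(pnl) / n_homes

avg_pnl, fill_rate = simulate()
print(f"avg P&L per purchased home: {avg_pnl:,.0f}")   # a large, systematic loss
print(f"fraction of offers accepted: {fill_rate:.2f}")  # ~0.5
```

Averaged over all 100k quotes the model's error really is zero; the loss appears only in the subset of trades that actually fill, which is the only subset that matters.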


From that (fabulous) thread:

Behind every order is a thinking, breathing shark. A shark with mouths to feed and a mortgage to pay. A shark who's likely smarter than you. Definitely more ruthless.

You're swimming in shark-infested waters, treating apex predators like specks of geometric Brownian dust


Most data scientists are not smart. They are basically pushing a few buttons.

This is true even for “Ivy”-trained PhDs. There are maybe a handful of people who can do truly profound work today. The rest are only capable of replication, i.e., mouse clicks here and there.


Further, they should be telling people a model is only as good as the data it has as input. If the data only sees upward movement in valuations then it will only predict upwards. A model can’t tell you anything about a situation where the underlying factors are different from its data. Same thing happened with mortgage backed securities in 2007/2008. The math was very complex and still couldn’t see around a corner.


The bottom line is that the domain here is under the umbrella of the social sciences. The phenomena the models describe are historical, and how closely they will match what happens from that point on relies on social stability. In the grand scheme of things, social stability is ephemeral. It isn't rocket science.

I'm not making a joke. Newton published his models in the late 17th century. The models hold no better or worse today than they did then. If you had rockets in 1687, they would work the same way then that they do now. But, social science is nothing like hard science. I worry on some fundamental level, even people who will claim that they "of course" know this in fact forget this. There is hubris here.


It was pretty obvious to me these algos are busted when an agent mispriced a house in our neighborhood at $600,000,000 (instead of just $6M) and the "estimate" was something like $604,082,000


> The Data Science team didn’t go to the lengths necessary to support their models. That was their failing. However, the business is ultimately responsible for betting the farm without demanding a real world track record to support those models.

"Well, sure, them folks were (wrong | lying), but it's really your fault for listening to them! It's not like making that call was their actual job or anything, ... oh."

I'd say the lesson here is that horseshit is still horseshit even after it has been extensively processed by the finest machines and learning we can buy.

Just imagine being the poor sod there who got a model to say "Bail the fak OUT of this market" 3 months ago: think anyone would have listened?


One of the harder lessons of life is that occasionally people get wedged into ideological positions where they want comforting lies that confirm their existing beliefs. These situations often are very dangerous for companies, because it creates incredible pressure on lower level people to produce dubious models, and on middle managers to accept dubiously supported models.

As far as who is culpable for those situations … it’s not clear to me.


And good luck raising the alarm on anything if you are a bottom tier foot soldier.


I think it’s long been known that if the market actors don’t want to heed results of analysis then they simply won’t.



Maybe they would have listened if the model successfully used historical data to predict when people should have bailed in the past, in addition to now. That is, if it was a properly validated model. I am not an expert, but this seems like a reasonable way to see if a model works.
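A minimal sketch of that idea (hypothetical toy prices, plain Python): walk the model forward through history and score each forecast out-of-sample, so a regime change shows up as an error spike instead of being averaged into a flattering aggregate.

```python
# Toy walk-forward check: fit a naive trend model on a rolling 12-period
# window, forecast one step ahead, and score out-of-sample. A regime
# change (boom -> bust) produces a visible error spike.
prices = [100 * 1.005 ** t for t in range(60)]            # boom: +0.5% per period
prices += [prices[-1] * 0.99 ** t for t in range(1, 13)]  # bust: -1% per period

window = 12
errors = []
for t in range(window, len(prices)):
    past = prices[t - window:t]
    growth = (past[-1] / past[0]) ** (1 / (window - 1))  # avg per-period growth
    forecast = past[-1] * growth
    errors.append(abs(forecast - prices[t]) / prices[t])

print(f"mean error during the boom: {sum(errors[:10]) / 10:.4%}")
print(f"mean error around the bust: {sum(errors[-10:]) / 10:.4%}")
```

During the steady boom the trend model is nearly exact; at the turn it keeps extrapolating growth into falling prices, which is precisely the failure a single in-sample fit would hide.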


Or, more simply, this business only works if prices are going up, and home prices aren't going up. Zillow saw the house price bust coming, and got out.


The article premise is completely wrong. Zestimates were never meant to be accurate. In fact, they were deliberately inflated to drive traffic to Zillow.


They were driving prices up by speculating on buying houses? Feels like it's great their data science team failed. More such failures should happen.


Predicting the future is not a use case for machine learning. And failure of financial models happens all the time - Long Term Capital Management is probably the best example of a big one. This has nothing to do with ML/Data Science, other than maybe some people thought it could do something it can't.

Also, the article reads a bit too much like a stream of consciousness, sprinkled with buzzwords and casual implications that the author knows better than Zillow.


My take on all of this is that someone in too high position at Zillow forgot to read their Taleb books.


It’s no surprise that AI cannot understand and predict economic outcomes, yet Zillow tried anyway.


"A stock trading company uses inaccurate stock prediction models for technical analysis, loses money, downsizes."

How is this different? They were playing roulette anyway...


I hope that it gave us look at what will happen with all these real-estate-bubble-inducing a*holes.


It might and it might not. Many things have rolled down the pike that in normal markets would cause reevaluation and correction downwards. But so far they’ve all been pushed past.

My professors used to say “when the last stalwart holdout is sold on the premise of the irrational factors holding the market up the correction is near.” (Or something like that.)

I seem to recall it being Enron filing bankruptcy that really kicked off the Great Recession.

Oh, further: anytime your market appraiser becomes a market participant, their valuation becomes highly unreliable. This is another thing that’s just not done if you want to have a reliable indicator of an asset's value and to sell it knowing whether you’re getting a deal you’ll be comfortable with long term or not.


It’s really not clear to me if we’re in a bubble. A lot of these houses sold to investors are aiming to generate rent, not be resold at a higher price. Zillow appears to be relatively alone among institutional investors in aiming to flip homes and not rent them.

If corporate landlords are extracting enough rent to justify the higher-than-normal prices, is that really a bubble? It’s socially pernicious, yes, but that doesn’t sound like an asset bubble to me.


The bubble was caused by fiscal, tax, and homebuilding policies. Zillow wasn't even renting the houses it bought, just flipping them.

Blame the government; Zillow had nothing to do with the housing price blowup.


I too am tired of the greed.


Zillow blinked.


Agreeing with other commenters here who don't buy into the superficial 'silly Zillow's ML folks didn't consider the possibility that the model might fail to predict the real world' narrative. Below I'll outline what Zillow learned from the 'iBuying failure' from my perspective, having worked in real estate tech.

As someone who worked as a senior Data Scientist at one of the Silicon Valley companies involved in iBuying some years ago (5+ years ago), I see the recent Zillow iBuying spree as a mechanism to test how much market pressure needed to be applied to historically not-so-competitive residential real estate markets to induce 'FOMO' / social contagion behaviors around large-ticket items, AND as a way to produce a dataset on the actual dollar amount that (residential) property sellers would need to receive to abandon the 'safe', 'we've always done it this way' process of selling a piece of real estate through a real estate agent / broker. The upside (for real estate tech companies) of removing the middle[wo]man - the real estate agent - in a residential home transaction is that the real estate platform can now control both sides of the information asymmetry in the transaction. They can also start offering (like Zillow does) mortgage services and other ancillary financial services, allowing them to earn millions of dollars in fees by capturing the home-buying financial services market.

In my work at the $real_estate_tech_company I mainly developed lead-generation data products, which $real_estate_tech_company used to market less-desirable single family homes to high-net-worth individuals who invest in real estate. The end goal of this process was to get the foreclosure and pre-foreclosure single family homes (SFHs) off of the banks' ledgers and leave someone else holding the (debt) bag. HNW individuals would buy up the assets, the banks would now have someone with sufficient collateral in possession of the single family home, and the HNW individual could rent out the home to a less-likely-to-default-than-the-original-homeowner family / renter. According to what I read about Zillow's iBuying model, Zillow focused on buying up assets (SFHs) in markets with strong, diversified economies, i.e., economies that are less sensitive to economic downswings. (Blackstone is doing the same thing and is also buying up trailer parks / mobile home communities near tech hubs.) As someone who builds data products for a living, the data points that Zillow was able to gather are, in my opinion:

- A hard number, in USD, of the amount of money needed to get humans to abandon the process of selling their home in the traditional way: through a real estate agent, broker, etc. Because Zillow has home buyer and seller data, they now know what that switching-cost trigger is, in USD, for homeowners with an income of x, a mortgage of y, and a debt-to-income ratio of z. Anecdotally, from reading posts on Twitter and other sites from people who sold their SFHs to Zillow in the iBuyer program, it appears that in economically depressed regions of the United States (the Midwest, rural places within 1-2 hours of a medium-sized city, etc.) the 'cash-in-hand' amount that the iBuyer program offered homeowners to sell their houses to Zillow is only $15,000 - $30,000 over list price per property. People who sold their houses to Zillow via the iBuyer program were talking about how they could 'pay off their new car and have a bit left over to buy new appliances in their new house'. These sums are rounding errors to Zillow's business model, even when multiplied by the thousands of properties that Zillow bought. But to the home sellers, $30,000 or $50,000 is written about as though it were some life-changing sum of money. I say this not to mock, demean, nor poke fun at the homeowners, having grown up in one of these economically depressed regions of the United States. A good portion of these homeowners are selling their modest homes and taking on risky levels of debt in a real estate bubble. I hope that I'm wrong about the risk that they're incurring - all the while celebrating getting $30,000 cash-in-hand from Zillow - but I don't think that I am misreading the situation.

- Hard numbers on how much (or how little) a given housing market's supply needs to be (artificially) constrained before home prices skyrocket. Real estate markets in different regions behave differently: rural Iowa's market is nothing like Santa Barbara's market, which is different from Boston's market. Until Zillow undertook a large-scale, coordinated (artificial) reduction in supply *at a time of unprecedented _physical_ mobility of workers due to remote work during covid19*, we really had no way to model which markets would be more resistant to large upticks in housing prices, and which markets would see meteoric growth quickly and in a sustained fashion ('pent-up demand').

So if I were running Zillow's iBuying experiment 'failure' as a data scientist I would be delighted by the new data points gleaned from the 'failed experiment', namely:

- What's the exact dollar amount that causes single family homeowners to abandon the 'sticky' process of selling their home through a human (broker, real estate agent)? Answer: it's pretty damn low for most folks: less than $100,000 over the Zestimate price or the price that their real estate agent quoted them. And home sellers talked about that being some huge windfall enabling them to pay off a new car, or buy all new appliances in their new home (that they probably overpaid for).

- To what degree do I have to (artificially) constrain the housing supply in different regions to induce a 'feeding frenzy' / FOMO / social-contagion-like behavior? By manipulating public perception of the real estate market in an area, can I induce irrational / deleterious individual behaviors that then spread to others in that geographic area and their social circles?

To me Zillow's iBuying experiment mirrors what Facebook allowed researchers to do in the mid-2010s, when they manipulated content in users' feeds to see if they could induce positive or negative emotional states: https://www.theguardian.com/technology/2014/jun/29/facebook-... Until recently there has never been a way to leverage mechanisms of social contagion in the nation-wide housing market for the middle class. Five-plus years ago, when working at the real estate tech company, I would have loved to get my hands on a dataset of linked behavioral data like this, especially one capturing reduced geographic buying pressure (due to remote work), as it would have revolutionized supply- and demand-side real estate data product development.



