They found that the frequency of visiting a place, multiplied by the distance traveled (rf), forms a stable quantity that can be used as a single combined variable. There is then an (approximate, statistically fitted) inverse-square law in this combined variable. Or, equivalently, two separate square laws.
If we hold frequency constant (say "once a month" or whatever), then the number of people visiting some place drops off inverse square with distance. If 400 people are willing to visit some place once a month that is 10 km away, about 100 once-a-month visitors will come from 20 km away.
Or if we hold distance constant: if 400 people are visiting some place that is 10 km away once a month, about 100 will be visiting twice a month.
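The two cases above can be sketched numerically. This is just a toy illustration: the constant K is back-solved from the 400-visitors example in this comment, not taken from the paper.

```python
# Sketch of the visitation law as described above: the number of visitors
# N coming from distance r with frequency f scales as N ~ K / (r*f)^2.
# K is back-solved from the example figures (400 once-a-month visitors
# from 10 km away); it is illustrative only, not a fitted value.

def visitors(r_km, f_per_month, K=400 * (10 * 1) ** 2):
    """Predicted visitor count from distance r_km at frequency f_per_month."""
    return K / (r_km * f_per_month) ** 2

print(visitors(10, 1))  # 400.0 -- the baseline from the example
print(visitors(20, 1))  # 100.0 -- double the distance, a quarter the visitors
print(visitors(10, 2))  # 100.0 -- double the frequency, a quarter the visitors
```

Note that `visitors(2, 5)` and `visitors(5, 2)` give the same value, which is exactly the rf-product invariance quoted below.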
> It accurately predicts, for instance, that the number of people coming from two kilometers away five times per week will be the same as the number coming from five kilometers twice a week.
It doesn't predict this; rather, the stability of this frequency-distance product is a discovery from the data, on which the formula is then based. I.e., the product becomes a model assumption baked into the formula, not a prediction.
Note that this derivation is essentially identical to showing that electric field falls off as 1/(r^2). In that case the area refers to the area of a sphere through which field lines must pass.
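A quick numeric sanity check of that geometric picture (not from the paper, just the standard flux argument): an inverse-square field multiplied by the surface area of the sphere at that radius is the same at every radius.

```python
import math

# If E(r) = C / r^2, then the "flux" E(r) * 4*pi*r^2 (field strength times
# sphere surface area) is constant -- the same total number of field lines
# passes through every concentric sphere.
C = 1.0
flux = [(C / r**2) * 4 * math.pi * r**2 for r in (1.0, 2.0, 5.0, 10.0)]
print(flux)  # every entry equals 4*pi*C
```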
At the very least, the book is a great introduction to the many different aspects that scale up with city size.
I'd be very interested to see how this law holds for travel destinations, both from the POV of the destination (national park, ski area, attraction), and from the POV of occasional travelers.
I'm highly skeptical of this result as my lived experience is that travel time is so much more important than distance.
There are two parts of my city that I like to visit that are roughly equidistant from my home. One takes 20 minutes to reach, the other 45. Can you guess which one I visit more often?
We used the same concept in our 2009 paper (https://www.pnas.org/content/106/51/21484), but the exact functional form of the distance dependency (1/r, 1/r^2, 1/e^r, etc.) varies with the exact definition of a city due to the Modifiable Areal Unit Problem (https://en.wikipedia.org/wiki/Modifiable_areal_unit_problem). In our specific case (cities defined as Voronoi cells centered around airports) the dependency was exponential.
The paper is saying that, from the point of view of the location, the distribution of effective distances traveled is invariant. So if you really like the 45-minute part of your city, then everyone closer loves it even more and visits it much more than you do. Therefore the law holds, even though on an individual basis you obviously go to the closer parts more frequently.
The fact that you might be prepared to go all the way to, e.g., Disney World means that people who live closer are massively more likely to go there, to the point where people who live 5 miles away go all the time.
> we reveal a simple and robust scaling law that captures the temporal and spatial spectrum of population movement
They throw around the word "temporal", but it's not clear exactly how they incorporate the time element without taking the plunge on the full study.
Not the USA, England, Australia, Thailand, China, or France.
England would be surprisingly close if London were half its current size. If you took half the population of London and distributed it proportionally among the rest of the country, you would get quite close to a Zipf distribution.
France doesn't really follow Zipf's law for its 5 largest cities, but cities 6-15 follow it quite closely.
Australia is 'weird' in that its two largest cities are basically the same size, but if we ignore Melbourne for a while, the next few cities line up quite nicely before we start seeing a much faster drop-off in city population than Zipf's law would predict.
Thailand is way off, since it has 1 massive city followed by 10 cities of more or less the same size.
So yeah, no country follows it exactly, but it's closer in many cases than you might expect. Plus, comparing the actual distribution to Zipf's law is an interesting way of comparing urbanization across countries.
But if you go by city - NYC is 8.5M, LA is 4M, Chicago is 2.7M.
By metro area (populations in thousands):

- New York-Newark-Jersey City, NY-NJ-PA: 19,216.18
- Los Angeles-Long Beach-Anaheim, CA: 13,214.8
- Chicago-Naperville-Elgin, IL-IN-WI: 9,458.54
- Dallas-Fort Worth-Arlington, TX: 7,573.14
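A rough sketch of how those metro figures compare against the Zipf rank-size prediction (population of the rank-k city ≈ largest city / k). The populations are the thousands figures quoted above.

```python
# Compare the quoted US metro populations (in thousands) against Zipf's
# rank-size rule: predicted population of rank k is largest / k.
metros = {
    "New York-Newark-Jersey City": 19216.18,
    "Los Angeles-Long Beach-Anaheim": 13214.8,
    "Chicago-Naperville-Elgin": 9458.54,
    "Dallas-Fort Worth-Arlington": 7573.14,
}

largest = max(metros.values())
for rank, (name, pop) in enumerate(metros.items(), start=1):
    predicted = largest / rank
    print(f"rank {rank}: {name}: actual {pop:,.0f}k, Zipf predicts {predicted:,.0f}k")
```

By this crude check, LA runs well above its Zipf prediction and Chicago and Dallas somewhat above theirs, consistent with the point that no country matches exactly.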
Anyone else get creeped out by the fact that this is becoming normalized?
(Background: it is impossible to anonymize location tracklogs.)
In fact, such datasets are made available by telecom providers.
> Studying spending patterns
> Studying voting intentions
> Studying twitter
The only creepy thing is if the data is being collected via dubious means. If they sign up a few thousand participants and use their data in a transparent manner, I don't have a problem.
If, however, they buy the data from the lowest bidder on the internet, some outfit with a free VPN app that logs location for resale, then yeah, that is shitty.
There's a big difference between studying the language of published books and studying authors' private conversations in their homes.
I'm not publishing my walking info just because it happens in public; even if the information is technically public, someone has to put in extra effort to collect it. I'm also not publishing my identity when I walk down the street; someone would have to dox me to figure out who I am, unless I were a public figure and they already knew.
On the other hand if it's general information about a spot, I don't really mind too much. This happens already with traffic speed analysis on public roads for example for cars.
And there are many similar infrastructural analytics I don't care much about, as long as it's done to maintain a system and not to track identities, much like server traffic usage & CPU usage monitoring on servers.
I'm no computational social scientist, but this analysis doesn't make sense to me. In fact, it seems that if the study took those variables into account, the model would become more predictive. As it stands, the analysis is idealized.
Mathematicians and physicists often forget to cite their work :'(