Operation 47debb3 – MU density
A cross-post from reddit - actually reddit seems to autodelete this post - will have to figure out what offends it
tldr; MU density in S2 lvl 12 cells multiplied with the area of a field yields a good estimate for the total MU captured by that field. But there seem to be additional factors at play that can yield lower values.
A recent post here speculated how the MU of fields get calculated and postulated that each S2 lvl10 cell has a given MU density. Some comments in the feedback postulated S2 level 12 cells instead of larger lvl10 cells. Finding densities from 107-233 in my own S2 lvl10 cell I wanted to generate a clean data set in a single cell.
Operation 47debb3 covered as much as possible of a S2 lvl12 cell in 24 individual fields while ensuring all anchors are inside the same cell. The operation consists of 6 larger fields of approx. 0.5 square km each and 18 smaller fields resulting from a single layer inside each larger field. This should allow determining the density to +/- 1 MU/km2 in the larger fields.
The layout of the experiment is shown below
All twelve northern fields can be calculated exactly (assuming rounding) using a single value of approx. 235 MU/km2. This would validate the theory that MU can be calculated using density * area with a single density used for all fields in the same S2 lvl12 cell.
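The rounding argument can be sketched in a few lines. This assumes MU is density times area rounded to the nearest integer (the rounding rule is an assumption), and the field values in the example are made-up placeholders, not data from the operation:

```python
def density_bounds(mu, area_km2):
    """Densities d (MU/km2) for which d * area rounds to the observed MU.

    Assumes round-to-nearest; that rule is an assumption, not confirmed.
    """
    return (mu - 0.5) / area_km2, (mu + 0.5) / area_km2


def common_density(fields):
    """Intersect the density intervals of several (mu, area_km2) fields.

    Returns (low, high) if a single density explains every field, else None.
    """
    low = max(density_bounds(mu, area)[0] for mu, area in fields)
    high = min(density_bounds(mu, area)[1] for mu, area in fields)
    return (low, high) if low <= high else None


# Hypothetical fields: (observed MU, area in km2)
print(density_bounds(118, 0.5))                  # a 0.5 km2 field pins density to +/- 1
print(common_density([(118, 0.5), (47, 0.2)]))   # intervals overlap
print(common_density([(118, 0.5), (105, 0.5)]))  # None: no single density fits
```

A 0.5 km2 field gives an interval 2 MU/km2 wide, which is where the +/- 1 comes from. Smaller fields give wider intervals, so they constrain the density less.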
Unfortunately the 12 southern fields show a small but significantly lower density - ranging from 209-227 MU/km2, which is too low to be attributed to rounding issues. The fields closest to water/the harbour show the largest differences - which could be coincidence or might be by design.
This experiment was unable to figure out how Niantic really calculates MU. The basis might be indeed S2 cells - but other factors can change the results. I publish it here as it can be used as a high quality data set for future research.
More details and data on my blog: https://drthod.com/operation-47debb3-mu-density/
Are the values on the inner vertices the average?
Would be interesting to do it with the minimum number of fields you can and see if it gives similar MU.
Not exactly - they are the value for the larger field. They should also yield the average if all three smaller fields add up to the larger one.
In half the cases they add up to large field -1 and therefore are a small amount higher.
Just nitpicking here.
I did have one layered field prior to this experiment where there was a large difference - but these went across multiple S2 cells - so I redid the experiment avoiding this as a factor.
I think the size of the cells used for calculation does change with field size.
On larger fields I have often seen the inner layers not adding up to the MU of the outer field.
The most extreme example was a multi-layer field with a ~1000km base link. The MU count of the fields grew with the amplitude to 780kMU and then, when the amplitude grew by another 60km, suddenly dropped to 740kMU. That the larger field got less MU can only mean that the calculation changed. There was only one major city in the field area, which was near the base link. I assume a lot of that city's MU could have been lost because it was averaged into a larger cell which was not fully covered by the field.
@navno I don't think they estimate the size of a field using S2 cells. Are you aware of the google presentation here: https://docs.google.com/presentation/d/1Hl4KapfAENAOf4gv-pSngKwvS_jwNVHRPZTTDzXXn6Q/view#slide=id.i130
tldr; shape of the field - and relative size - should determine how inaccurate it is - not size on its own
Slide 15 does discuss estimation of a sphere using a limited number of S2 cells.
My take from that slide:
a) size is always overpredicted (has no influence as it just means we have a given factor)
b) accuracy depends on number of cells used - default is 8
This one would mean that fields of a certain size are more accurate, then get less accurate, and then get more accurate again. I tried to avoid any of this by using fields of similar size. But it means there should be an approximately even chance that the fields inside gain more MU than the field outside - depending on their size relative to the cell dimensions. I have never heard of this happening - ever - that the sum of the internal fields scores higher. Happy to be corrected.
There is one more subtle issue: very narrow fields should be harder to represent. They would be predicted most wrongly as they have to go through the same number of cells in at least one direction. I don't see any of this in my data (published as well as unpublished).
One possibility: distances are not calculated properly. During Avvenir I got distances wrong as I used Pythagoras. I still don't know why it didn't work. But my error went up with distance. The larger the link - the larger my error. For area this error would be squared and would grow even further. So this is one possibility of what is wrong: some area calculation that has a small, creeping mistake getting larger with size.
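One way to cross-check a field's area without any flat-plane shortcut is to treat it as a spherical triangle. The sketch below assumes a perfectly spherical Earth with a 6371 km radius and uses L'Huilier's formula. It is not claimed to be what Niantic (or any spreadsheet in this thread) actually does:

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean radius; a spherical Earth is an assumption


def central_angle(p, q):
    """Angle in radians between two (lat, lng) points in degrees (haversine)."""
    lat1, lng1, lat2, lng2 = map(math.radians, (*p, *q))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lng2 - lng1) / 2) ** 2)
    return 2 * math.asin(min(1.0, math.sqrt(h)))


def spherical_triangle_area_km2(p1, p2, p3):
    """Area of a spherical triangle from its spherical excess (L'Huilier)."""
    a, b, c = central_angle(p2, p3), central_angle(p1, p3), central_angle(p1, p2)
    s = (a + b + c) / 2
    t = (math.tan(s / 2) * math.tan((s - a) / 2)
         * math.tan((s - b) / 2) * math.tan((s - c) / 2))
    excess = 4 * math.atan(math.sqrt(max(0.0, t)))
    return excess * EARTH_RADIUS_KM ** 2


# Sanity check: one eighth of the sphere (equator, two meridians, north pole)
print(spherical_triangle_area_km2((0, 0), (0, 90), (90, 0)))
```

Dividing a field's MU by this area gives a density that does not depend on how the flat projection was done, which helps separate "area error" from "density error".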
But I'm open to other suggestions.
Oh - and someone doesn't like what I did - the fields are currently under attack :)
If you've ever looked at S2 geometry, you'd see how it works. Niantic has MU values for different cell levels and uses the largest cell that fits inside the triangle as the base MU value. Then uses smaller cells to fill in the edges. That's how you have different MU values for fields that cover the same areas.
The simplest test is to make a large 4 portal layered field, and add the values of the inner fields and compare to the outer one.
@Perringaiden what is the smallest S2 level for which Niantic has (different) MU densities?
Your claim is exactly what I failed to prove. I was told S2 lvl 10 but could do different fields in a lvl10 cell that yielded different densities.
I tried lvl 11 next - still no luck. This article is a lvl 12 test - and it still failed.
Lvl 13 - maybe. If both northern fields are the same
The issue is - the smaller I go the larger is my error. We can go to large fields once this base assumption is shown to be true.
But it didn’t seem to make sense just to rely on this assumption to be true.
Awesome! I'm running your data now to see if I can see anything. I was just about to do a post, I got some more data: https://imgur.com/dqDWRrA
There was a 10 layer field where the first 8 continuously grew in number then suddenly the last two dropped down. See above figure for the difference between the last two. They are plotted as a combination of level 11 and 10 cells using S2RegionCoverer from the S2 library. The math perfectly works out (see above post) to account for the decrease in MU.
Small field: 50,225 MU and 29 level 11 s2 cells
Larger field: 48,799 MU and 30 level 11 s2 cells.
If you take the smaller field and multiply its MU by 29/30, you get 48,550, which nearly perfectly matches what you get in real life (plus a little bit because the fields are bigger!)
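The same arithmetic as a tiny sketch, using only the two MU values and cell counts quoted above. The function name is just for illustration:

```python
def rescale_mu(mu, cells_before, cells_after):
    """Scale MU by the ratio of covering-cell counts (the hypothesis under test)."""
    return mu * cells_before / cells_after


predicted = rescale_mu(50_225, 29, 30)
observed = 48_799
print(f"predicted {predicted:.1f}, observed {observed}")  # predicted 48550.8, observed 48799
print(f"observed is {observed / predicted - 1:.2%} above the prediction")
```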
I also dug in to the s2 library, and it looks like the library does what we're hypothesizing MU calculation does:
According to the library, this is the most computationally efficient way of doing it. The remaining question is, "what's the minimum level, max level, and max number of cells that S2RegionCoverer takes?" I think the above example shows that level 11 might be the max level (those calculations only work for 11).
Here are the coordinates for the fields in the above image:
3rd post in a row from me, but should be the last tonight. I went over the data (very detailed, thank you!) and I didn't see anything.
There might be a hint in there: the large fields in the north have the same densities as one of their member fields, so maybe they are pulling it from the same data point? I also wondered if there is a random factor involved, like some seed based on the hash of the portal uid.
That plot of MU vs area and correlation coefficient are so strong, indicating uniform density - but the figure of the plot trailing off by the water is compelling that something else is going on.
I think we're on to something, that effective density seems to be key, and then some secret sauce on top of that.
this deserved a repost at SitReps!
good work, agent(s)!
For the long links, did you factor in the curvature of the earth? Using Pythagoras assumes the plane is flat
I know the highest S2 is level 6, but I'm not sure about the lowest. The biggest problem is going to be finding areas where the breakup of the cells doesn't match at multiple levels. I'm not an expert on it, but those who have delved into it with reams of data said that's how they do it. They needed lots of data and lots of examples to be accurate though.
I've also seen an (unverified) claim that the maximum number of cells they will use to calculate MU is 20.
@Perringaiden S2 cells go up to level 30, which is around a single square cm. Each cell splits into 4 cells at the next level.
So one level 10 cell is 4 level 11 cells, 16 level 12 cells, 64 level 13 cells, 256 level 14 cells, etc.
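The counting rule is just powers of four. A minimal sketch (the function names are mine):

```python
def cells_per_parent(parent_level, child_level):
    """How many child_level cells fit in one parent_level cell."""
    if child_level < parent_level:
        raise ValueError("child_level must not be smaller than parent_level")
    return 4 ** (child_level - parent_level)


def total_cells(level):
    """Total S2 cells at a level: 6 cube faces, each split in 4 per level."""
    return 6 * 4 ** level


for child in range(11, 15):
    print(f"one level 10 cell = {cells_per_parent(10, child)} level {child} cells")
```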
level 10: caught and weather
Level 13: EX raids
Level 14: Number of gyms
Level 15: Block Military zones
Level 16: render the map you see
Level 17: Pokestop location
level 20: Pokemon spawns
My Personal guess after all my research:
Level 13 for MU density with the caveat that MANY neighbour cells have the same value if they are in the same town/city/environment .
I am currently looking at a level 10 cell with 5% of its area in the sea (to the south), and it seems that is the part where I need to go down to level 13.
I thought I did originally - but somehow that is likely what got it wrong. My calculations now are accurate. Still need to revisit what happened there mathematically - but didn't spend an hour with paper/pen to figure out how haversine vs my own original calculation works. So just copied something that works.
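For reference, a minimal haversine sketch next to one flat-plane shortcut. I am not claiming the shortcut shown is the calculation that went wrong during Avvenir. It is only one illustration of how a Pythagoras-style distance can drift, and the 6371 km radius is an assumption:

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (spherical approximation)


def haversine_km(lat1, lng1, lat2, lng2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lng2 - lng1)
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(h)))


def naive_pythagoras_km(lat1, lng1, lat2, lng2):
    """Flat-plane shortcut that treats a degree of longitude like a degree of latitude."""
    km_per_degree = math.radians(1) * EARTH_RADIUS_KM
    return math.hypot(lat2 - lat1, lng2 - lng1) * km_per_degree


# East-west link of one degree at 60 degrees north
print(haversine_km(60, 0, 60, 1), naive_pythagoras_km(60, 0, 60, 1))
```

At 60 degrees north a degree of longitude is only about half as long as a degree of latitude, so the naive version roughly doubles the length of an east-west link.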
Generally "higher" is referring to Z1 being the highest, Z30 being the lowest. If you're looking at a map, your point of reference is 'higher' for the lower numbers.
I meant they don't use larger S2 cells than level 6 for MU calculations.
@maqifrnswa Great work you are doing. I think there might be 3 different issues here that we discuss.
a) region coverer. This is likely used to describe the field with cells in order to find the average MU density
b) MU density raw data being used - this is what I try to figure out in detail
c) area of a field
Starting with region coverer - your interactive animations look great. But I did notice that you use 20 cells for the smaller field and 21 for the larger one: 17 small, 3 large ones. Now go to slide 15 of the google presentation: https://docs.google.com/presentation/d/1Hl4KapfAENAOf4gv-pSngKwvS_jwNVHRPZTTDzXXn6Q/view#slide=id.i130
They use as parameters:
a) max cell level (irrelevant for your example as it surely will be high enough)
b) min cell level - that will be the level at which they have data for
c) max number of cells - focus on this one please
The first part - the beauty of S2 cells is that you can use different cell levels together for odd shapes. This is why I count 20/21 cells and not 29/30 cells. 3 level 10 and 17/18 level 11 cells
Now focus on c). In the presentation they give cut-offs for 4, 8, 20 and 100 as max cells to calculate. You want to have such a cut-off as otherwise you might calculate way too many fields in a really large triangle vs a small one. But your example brings you from 20 -> 21 and possibly just across the max number.
Can you set the max number to 20 and redo the covering? I have no clue what cells will be used / how it uses them. But it clearly will use different ones - and possibly cells far further out than you think, as it has to use more S2 lvl10 cells to reduce the number to max 20. A possibility is that it uses 11 S2 lvl 10 cells instead.
11 S2 level 10 cells would mean:
3 level 10 stay unchanged for the average
The bottom 6 S2 level 11 cells are used 50% but they might be diluted (or get more MU density) by another row of 6 S2 level 11 cells. Row 2 and 3 from the bottom - they are likely used 75% and have a possible dilution of one additional S2 lvl 11 cell. On the top you get 4 extra S2 level 11 cells.
So in total you go from 3 S2 level 10 plus 17 S2 level 11 cells (which is the same as 29 level 11 cells) to my best guess of 11 S2 level 10 cells, which is equivalent to 44 S2 level 11 cells. These 11 S2 level 10 cells would represent 64 * 11 = 704 S2 level 13 cells with actual raw data.
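The cell bookkeeping is easy to get wrong, so here is a small sketch that converts a mixed-level covering into an equivalent count at one level. The dictionary layout is just my choice for illustration:

```python
def cover_in_level(cover, target_level):
    """Express a covering {level: cell_count} as a count of target_level cells."""
    if any(level > target_level for level in cover):
        raise ValueError("target_level must be at least as fine as every cell in the cover")
    return sum(count * 4 ** (target_level - level) for level, count in cover.items())


mixed_cover = {10: 3, 11: 17}   # 3 level 10 cells plus 17 level 11 cells
coarse_cover = {10: 11}         # the guessed 11 level 10 cells

print(cover_in_level(mixed_cover, 11))   # 29
print(cover_in_level(coarse_cover, 11))  # 44
print(cover_in_level(coarse_cover, 13))  # 704
```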
In my experience cell levels don't just drop to zero in rural areas and they stay astonishingly constant. Dover - my neighbour town - seems to be 235. My own town is around 230. The lowest I have seen is 107 in an area with only fields and villages of <100 people, compared to the 30K that both towns have.
@maqifrnswa I calculated area and density for your fields
You would need a negative value for the added S2 lvl 11 cell if we average over 30 level 11 cells instead of 29. I would expect that cell to be at least 100. Try max cells 20 and mixed cell levels to cover the area and there is lots of scope to explain the numbers.
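A sketch of that argument, assuming the field density is a plain equal-weight average over the covering cells (that weighting is an assumption). The MU values are the two quoted in this thread, while the areas below are placeholders and not the real field areas:

```python
def implied_extra_cell_density(mu_small, area_small, mu_large, area_large, cells_small=29):
    """Density the one added cell must have if density is an equal-weight cell average.

    d_large = (cells_small * d_small + x) / (cells_small + 1), solved for x.
    """
    d_small = mu_small / area_small
    d_large = mu_large / area_large
    return (cells_small + 1) * d_large - cells_small * d_small


# Placeholder areas in km2 (hypothetical): larger field 2% bigger than the smaller one
print(implied_extra_cell_density(50_225, 100.0, 48_799, 102.0))  # negative
# Same placeholder area for both fields
print(implied_extra_cell_density(50_225, 100.0, 48_799, 100.0))  # positive
```

Under this simple averaging model the implied value turns negative as soon as the larger field is more than roughly 0.5% bigger in area than the smaller one (30 * 48,799 / (29 * 50,225) is about 1.005).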
Here is my Excel sheet (uploaded to google) that makes all the calculations for me. https://docs.google.com/spreadsheets/d/1XTQo0y2U1oiQR3RyfW3M6ur7KgtOqSHGV_AlSafTot0/edit?usp=sharing
All it needs - draw a field in intel, copy paste the link and add the MU. The spreadsheet extracts the coordinates itself as long as the field is drawn clockwise or anticlockwise and as long as it is the only field. Actually more technically - it takes co-ordinates 1, 3 and 4 from the field out of 6 (each one is double). So direction doesn't matter but the second link can't include the start portal of link 1.
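For anyone who prefers a script over the spreadsheet, here is a rough sketch of the coordinate extraction. It assumes the intel link carries the drawn links in a `pls` query parameter as `lat,lng,lat,lng` groups separated by `_`. That format is my assumption, and the example URL is made up. Instead of taking coordinates 1, 3 and 4 it collects the unique endpoints, so link order does not matter:

```python
from urllib.parse import urlparse, parse_qs


def field_vertices(intel_url):
    """Return the three unique (lat, lng) corners of a single drawn field."""
    query = parse_qs(urlparse(intel_url).query)
    links = query["pls"][0].split("_")   # assumed format: one group per drawn link
    vertices = []
    for link in links:
        lat1, lng1, lat2, lng2 = (float(value) for value in link.split(","))
        for point in ((lat1, lng1), (lat2, lng2)):
            if point not in vertices:
                vertices.append(point)
    if len(vertices) != 3:
        raise ValueError(f"expected exactly one field, found {len(vertices)} distinct points")
    return vertices


# Made-up example link with three drawn links forming one field
url = ("https://intel.ingress.com/intel?ll=51.1,1.3&z=15"
       "&pls=51.1,1.3,51.2,1.3_51.2,1.3,51.15,1.4_51.15,1.4,51.1,1.3")
print(field_vertices(url))
```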