Statistics is Hard, Even for Scientists

As a burgeoning data scientist I have taken a more professional interest in statistics, which in this day and age means statistics blogs!  In particular I really enjoy the prolific writings of Andrew Gelman.  This guy is super organized, planning out his posts for an entire month.  It’s either amazing or indicative of a psychological disorder or maybe both.  Anyways, I have learned that while I “know” elementary statistics, the implications of even the basics were never imparted to me and are not easy to suss out on one’s own.

For instance, the elementary result that, under certain very common conditions, statistically significant results systematically overestimate the magnitude of an effect was mindblowing to me. This becomes even more pronounced in the often underpowered studies endemic to social sciences. I would say it is not entirely unfair to take a good 1/3 off any effect size in social sciences as a rule of thumb; the overestimate really is that enormous. This actually has implications in physics too. For instance, if we report that our entanglement measurement is statistically different from a situation with no entanglement, it is likely we are overestimating the quality of our entanglement, especially if we employ a bit of post processing on the data. It really makes you discount every result in the scientific literature a bit.
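To make the overestimation concrete, here is a minimal simulation sketch. The true effect (0.2), the noise level (1.0), and the sample size (25) are made-up numbers chosen to mimic an underpowered study, not values from any real paper. It runs many small studies and compares the average of all estimates with the average of only those that pass a two-sided z-test at the 5% level.

```python
import random
import statistics


def significance_filter(true_effect=0.2, n=25, trials=4000, seed=0):
    """Simulate many small studies of the same effect.

    All parameters are hypothetical. Returns the mean of every estimate,
    the mean magnitude of the "significant" estimates only, and the
    fraction of studies that reached significance.
    """
    rng = random.Random(seed)
    se = 1.0 / n ** 0.5  # standard error of the mean, assuming known sigma = 1
    all_estimates = []
    significant = []
    for _ in range(trials):
        sample = [rng.gauss(true_effect, 1.0) for _ in range(n)]
        estimate = statistics.fmean(sample)
        all_estimates.append(estimate)
        if abs(estimate) / se > 1.96:  # two-sided z-test at the 5% level
            significant.append(abs(estimate))
    return (
        statistics.fmean(all_estimates),
        statistics.fmean(significant),
        len(significant) / trials,
    )


if __name__ == "__main__":
    mean_all, mean_sig, power = significance_filter()
    print(f"mean of all estimates:       {mean_all:.3f}")
    print(f"mean of significant results: {mean_sig:.3f}")
    print(f"fraction significant:        {power:.3f}")
```

With these settings, an estimate has to exceed roughly 0.39 in magnitude to count as significant, so every "publishable" result is at least about double the true effect of 0.2, even though the estimates as a whole are unbiased.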

More fundamentally, statistical significance itself is kind of a red herring when it comes to significance. It is obvious, but not always appreciated, that just because something is statistically significant does not mean it is significant. The classic examples are Facebook studies that take a sample of a million to just tease out some trivial effect. Who really cares if sad Facebook posts adjust your mood by 1% if this is completely swamped by everything else in your life? On the other hand, statistical significance has become a signifier of “publishable” results without any further validation of the results. There are many examples of this effect; Gelman really likes a study that published an 8% effect size for how attractiveness influences the likelihood you will bear girls. This is an order of magnitude higher than pretty much any other influence on girl/boy sex ratios and furthermore lacks a credible explanation. Yet it got published in a prestigious journal just because it was statistically significant. It was quickly pointed out how ludicrous this was on its face, along with the bad statistics that went into the paper.
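A quick back-of-the-envelope sketch shows why a sample of a million can make a trivial effect "significant". It assumes a one-sample z-test and an effect of 0.01 standard deviations, which is my stand-in for a "1%" shift, not a number from any actual study. The z-score grows with the square root of the sample size, so the same tiny effect goes from invisible to overwhelming.

```python
import math


def z_score(effect_in_sd, n):
    """z statistic for a one-sample test of an effect measured in standard deviations."""
    return effect_in_sd * math.sqrt(n)


def two_sided_p(z):
    """Two-sided p-value for a standard normal z statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


if __name__ == "__main__":
    for n in (100, 10_000, 1_000_000):
        z = z_score(0.01, n)  # hypothetical effect: 0.01 standard deviations
        print(f"n = {n:>9,}  z = {z:5.2f}  p = {two_sided_p(z):.3g}")
```

The effect size never changes in this loop; only the sample size does. Statistical significance here says something about n and nothing about whether the effect matters.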

Connected to this idea is that Bayesian statistics is superior to frequentist statistics in almost all cases and that is almost solely due to informed priors. Essentially, a lot of scientific research assumes flat priors, that every possibility is equally likely. They publish confidence intervals under this assumption and then misinterpret what a confidence interval means. Just for precision, an X% confidence interval does not indicate that the interval has an X% chance of containing the true value. After all, there are multiple ways to construct confidence intervals that give different results. Furthermore, you can often construct intervals that verifiably do or do not contain the true value. Instead, the guarantee belongs to the procedure: a particular confidence interval algorithm produces intervals that contain the true value X% of the time on average. That is, if I sample the same population multiple times and use the same instructions for the confidence interval for each sample, then these myriad intervals I calculate will contain the true value X% of the time.
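The repeated-sampling reading of a confidence interval is easy to check with a small simulation. This sketch assumes a normal population with a known standard deviation and uses the textbook 95% z-interval; the population mean, sigma, and sample size are arbitrary illustrative values.

```python
import random
import statistics


def coverage(true_mean=10.0, sigma=2.0, n=30, trials=5000, seed=1):
    """Fraction of 95% z-intervals (known sigma) that contain the true mean.

    Every parameter is a hypothetical choice for illustration.
    """
    rng = random.Random(seed)
    half_width = 1.96 * sigma / n ** 0.5  # same recipe applied to every sample
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        sample_mean = statistics.fmean(sample)
        # the interval is [sample_mean - half_width, sample_mean + half_width]
        if abs(sample_mean - true_mean) <= half_width:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    print(f"fraction of intervals covering the true mean: {coverage():.3f}")
```

The fraction comes out close to 0.95, but note what that number describes: the long-run behavior of the recipe across many samples. Any single interval either contains the true mean or it does not.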

However, that is not the truly egregious error here; rather, it is the assumption that the prior probability of a result is the same for all possible results. Do we really think that eating a beet before a run will improve performance by 50%? Of course not, and a proper analysis should heavily discount that possibility rather than giving it equal weight with a more likely effect size of a few percent. Unfortunately, scientists are lazy or ignorant or ambitious and a proper Bayesian analysis with informed priors might reveal the weakness in their study.
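Here is a minimal sketch of what an informed prior does to an implausible result, using the standard normal-normal conjugate update. All the numbers are invented for illustration: a noisy study reports a 50% improvement with a standard error of 25 points, and the prior says real effects of this kind are centered on 0% with a standard deviation of 3 points.

```python
def normal_posterior(prior_mean, prior_sd, observed, standard_error):
    """Posterior mean and sd for a normal prior combined with a normal likelihood."""
    prior_precision = 1.0 / prior_sd ** 2
    data_precision = 1.0 / standard_error ** 2
    total_precision = prior_precision + data_precision
    # the posterior mean is a precision-weighted average of prior and data
    post_mean = (prior_precision * prior_mean + data_precision * observed) / total_precision
    post_sd = total_precision ** -0.5
    return post_mean, post_sd


if __name__ == "__main__":
    # Hypothetical beet study: 50% improvement, standard error of 25 points.
    informed = normal_posterior(prior_mean=0.0, prior_sd=3.0, observed=50.0, standard_error=25.0)
    # A very wide prior behaves almost like a flat one.
    nearly_flat = normal_posterior(prior_mean=0.0, prior_sd=1e6, observed=50.0, standard_error=25.0)
    print(f"informed prior:    mean {informed[0]:.2f}, sd {informed[1]:.2f}")
    print(f"nearly flat prior: mean {nearly_flat[0]:.2f}, sd {nearly_flat[1]:.2f}")
```

Under these assumptions the informed prior pulls the estimate from 50% down to under 1%, because the study is so noisy relative to what we already believe. The nearly flat prior just hands the 50% back unchanged.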

Apart from statistical analysis, informed priors are just a good common sense check on results. You see, scientists are implicit data filters, constantly looking for patterns. Some people have accused scientists of fishing for statistical significance; i.e. if you try 20 different tests then odds are one of them will be statistically significant at the 5% level. However, it’s more likely that scientists take data, see a pattern and then analyze based on that. In doing so, though, they are implicitly making multiple comparisons. With enough dimensions you can pretty much always find a correlation between two of them. Scientists aren’t doing this explicitly, but it doesn’t matter. Implicitly or explicitly, the multiple comparisons are very likely to find something of statistical significance. An informed prior is a very easy check on your results. Another Gelman example was a study that posited that the ovulation cycle in women affects their political views at a 20% level despite evidence that political views are remarkably stable. A look at the literature for typical effect sizes might have prompted the authors to reexamine how they arrived at their result. It reminds me of my teaching experience where I often asked my students if their answer made any sense in the context of the problem. These kinds of intuitive checks are highly valuable, particularly when the math is highly complicated or unintuitive to you.
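The "20 tests at the 5% level" claim can be checked directly. This sketch assumes the tests are independent and that every null hypothesis is true, so each p-value is uniform on [0, 1]. It computes the chance of at least one false positive analytically and confirms it by simulation.

```python
import random


def chance_of_false_positive(num_tests, alpha=0.05):
    """Probability that at least one of several independent null tests is 'significant'."""
    return 1 - (1 - alpha) ** num_tests


def simulate(num_tests=20, alpha=0.05, trials=20000, seed=2):
    """Monte Carlo version: draw uniform p-values, as if every null were true."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        if any(rng.random() < alpha for _ in range(num_tests)):
            hits += 1
    return hits / trials


if __name__ == "__main__":
    print(f"analytic:  {chance_of_false_positive(20):.3f}")
    print(f"simulated: {simulate():.3f}")
```

With 20 comparisons the chance of at least one spurious hit is about 64%, so "odds are one of them will be significant" is right. The same arithmetic applies whether the comparisons were run explicitly or made implicitly by eyeballing the data first.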

I conclude by saying that all of this was novel to me and that I don’t want to judge the many excellent scientists that fall into these statistical traps. I guess I am more chagrined that these issues are still relatively unknown and that it is still seen as acceptable to use the rather crude statistical analysis tools wielded by most scientists when their issues are well documented. I mean, if a journal wanted to, it could require Bayesian credible intervals instead of confidence intervals and require a defense of the prior used in their calculation. This would require more engagement with prior literature by authors and readers and provide a better estimate of true effect sizes. Gelman suggests avoiding the effect of multiple comparisons by replicating your experiment on new data; essentially the first run is for generating a hypothesis, with flexibility in interpreting and analyzing the data, but the second should be a rigid replication to see if the effect is still apparent. Again there are professional bodies that could take steps toward requiring such experimental procedures. As it is, a lot of social science research, which I always took with a grain of salt due to the many implausible results I read, is looking a bit farcical.

Vietnam > Thailand

My wife and I just finished a three-week vacation where we blitzed our way from North Vietnam to South before popping into Bangkok and Chiang Mai for a week.  We had an absolutely stellar time through Vietnam even though I encountered a nasty bout of food poisoning.  However, I think we were both relieved when our vacation was over and we could get the hell out of Thailand.  Our time in Thailand was nowhere near as enjoyable and often actually dismal.  Thus I write this post to point out that you should definitely visit Vietnam over Thailand if you are hitting Southeast Asia.

Of course, the world does not appear to agree with me.  Apparently Bangkok is maybe the most visited city in the world and its airport is the most photographed location on the internet (one wonders why people are taking photographs of airports, but I digress).  Meanwhile the entire country of Vietnam gets only 7 million visitors, which is like 1/3 of the people that JUST visit Bangkok.  Yes, there are cities that get more tourists than the entirety of Vietnam.  However, I can’t fathom why.

Let’s start with what is wrong with Thailand.  First off, Bangkok is awful.  It’s like a vision of a dystopian future.  The central city is very posh with high tech advertisements barraging you constantly from the sides of tall buildings.  It strongly caters to expats as many advertisements are solely in English, sex is a constant in their advertisements and I noticed an astonishing number of cosmetic surgery clinics and advertisements.  If shopping and leisure are your thing, then Bangkok might appeal.  That isn’t really us. 

The attractions in Bangkok are fairly limited.  You have the Grand Palace, which is absolutely infested with tourists, and yet the authorities do an absolutely terrible job turning this into a directed experience.  We arrived, took a look around and saw a huge mob crushing into a ticket window.  We decided to take our leave, but the Grand Palace is a black hole and nearly impossible to escape from.  Instead we got ripped off by the absolutely delightful taxi drivers in Bangkok.  You see, they always want to turn off their meter and “bargain” with you.  DO NOT DO THIS.  You don’t know enough to even try, and if you use the meter, taxis are very reasonable in Bangkok.  Pretty much every experience in a taxi in Bangkok was uncomfortable and awkward.  Another attraction in Bangkok is the Jim Thompson house, the house of a white guy “interpreting” Thai architecture.  Really, that is a major attraction in a huge city like Bangkok.  It’s pretty sad.

Thankfully Chiang Mai is better.  It’s a lot easier to get around and the Old City is pretty charming.  The sense I got from Thailand is that it is turning into a city-state with all resources funneled into Bangkok, as even Chiang Mai, the second largest city in the country, is pretty underdeveloped.  Ayutthaya, the former capital, felt like it was only around to cater to tourists trawling the unimpressive ruins of the temples and palaces.  I can see the origins of the many coups in Thailand forming from the tensions between the blessed Bangkok and the rest of the much poorer country.

In contrast, Vietnam seems far friendlier to tourists despite or maybe because it is not as well traveled.  We had no issues with taxis there, but to be honest we rarely needed them as their city centers are dense and packed with good eats.  Bangkok citizens seem out to fleece tourists for every penny they can but the reaction of most Vietnamese people was far more indifferent, in a good way.  We didn’t feel awkward or out of place even at the most off-the-beaten-path restaurants that probably rarely get nonlocals.

I was only there a week and a half, but the Vietnamese seem more my style.  They love food, coffee and shooting the shit on the street.  I think every other establishment was a restaurant and in between there was almost always a food cart.  The density of food offerings in Vietnam is nuts.  It’s also not a materialistic culture.  Their temples are austere, full of copper and wood statues for the most part.  Meanwhile, the Thai seem very proud of their jewel encrusted golden pagodas.  I much preferred the peace and serenity of Bai Dinh temple in Vietnam over anything we saw in Thailand.  Similarly, I enjoyed the imperial palace in Hue over the Grand Palace in Bangkok.  The latter is an over-the-top tribute to royalty whereas the former is less a residence and more a restrained but beautiful government compound.

Which is not to say the Vietnamese can’t do ostentatious.  We saw the mausoleum of a Vietnamese emperor that had the most enchanting mosaics inside, easily the most beautiful art we experienced on our trip.  Furthermore, the silk embroidery “paintings” in Vietnam were our most wanted souvenir, though the price eventually dissuaded us (we are so cheap).  Don’t be like us; these things are too beautiful to pass up.

Advertising is another good indication of the differences between the two countries.  There isn’t much in Vietnam, except for the many signs displaying the offerings at the myriad restaurants.  Certainly, sex is rarely used, especially in comparison to Bangkok.

I guess my point is that I have a lot of respect for the Vietnamese people.  They survived two terrible wars with the West (as an aside, you absolutely must visit the War Museum in Saigon; harrowing and illuminating, we broke down into tears) and they don’t appear to hold any grudges.  Instead they seem laidback and interested in just enjoying life but not in an overly hedonistic way.  Just friends sitting around eating delicious food and drinking good coffee.  If anything they seem a bit too laidback as they all seem to agree their government is corrupt and inefficient, but they seem to just accept it.

One last VERY IMPORTANT note.  Vietnamese food is so much better than Thai food and in all the major cities good spots are much more accessible than in the sprawling Bangkok.  Frankly, the Thai food in America is just as good and often better than what we ate in Thailand.  You see, Thai people like things sweet, cloyingly sweet.  Their version of Cha Yen, Thai Iced Tea, is nearly undrinkable to our tastes because it is barely tea.  Vietnamese cuisine is fairly light with a delicate balance of fish sauce, limes, chilies and sugar that is absolutely delectable, though Saigon erred a bit sweeter than I would have liked.  Thai food is often a coconut milk and palm sugar bomb that required a lot of seasoning at the table to get into proper shape.  I know at one point I would have rated Thai as my favorite cuisine, but in the last year, Nha Hang in Chicago and now our trip have convinced me that Vietnamese food is really far more exceptional.

Best CRPG of the Year

So RPS announced their favorite RPG of the year and to little surprise it was Dragon Age Inquisition.  Now I have only played a little of Inquisition but by all accounts this takes the action combat of Dragon Age 2 and mashes it together with a Skyrim-like “open world” or, maybe even more appropriately, an Ubisoft icon hunt.  It may do this well, but I can’t really see this being great and certainly not the best RPG in a year packed with them.

That said, I did find most of the RPGs I played this year mildly disappointing.  Wasteland 2 had lots of tedious combat, no polish and boring characters.  Divinity Original Sin had great combat at the beginning, but you saw all of its tricks early on and it became very easy after that.  The rest of the story and characters and role playing were pretty banal.  Some have praised the Banner Saga, but it’s mostly a tactical combat game with really mundane abilities.  It wants to be chess and in doing so saps all the fun out of the genre.  Legend of Grimrock 2 kept the ridiculous real time combat while having much weaker puzzles than the first and being far too long for its own good.

Ok so I sound really grumpy.  The biggest surprise for me and my pick for CRPG of the year is Shadowrun Dragonfall. The writing here is top notch, the best I have seen since the demise of Black Isle.  They kill off a central character fairly early just like in the base game, but here it is much better done; you care and it motivates the rest of the game.  The storyline is interesting, and while Shadowrun may seem like a ridiculous setting (dragons and elves in a cyberpunk world sounds like something a 12 year old envisions), this makes it far more unique and engaging than the staid archetypical fantasy and sci-fi settings of Bioware’s Dragon Age and Mass Effect.

What really elevates it is that they manage to get most of the boring details right.  Skills actually impact your gameplay in meaningful ways.  How you infiltrate buildings, for instance, depends highly on your personal abilities.  Are you a decker?  Then you hack in.  Or maybe you share some kind of social affinity that you can fall back on to persuade an NPC.  This kind of character build->gameplay interaction is essentially a lost art that I was hoping would make a comeback with all the RPGs this year.  However, Wasteland 2 failed to get this right despite its myriad skills, Bioware doesn’t even have skills any more and Original Sin is almost entirely combat.

Speaking of combat, this is another area where Dragonfall outdoes its peers.  The tactical combat here isn’t exceptional, but it’s competent and entertaining.  The addition of magic and special abilities puts it ahead of the stand and shoot mechanics of Wasteland 2.  It manages to slightly evolve and stay somewhat balanced far better than Original Sin and well, Dragon Age Inquisition is basically gussied up MMORPG combat and so automatically loses.  My only complaint would be the decking/hacking combat where you are in the Matrix equivalent.  This is dull stuff, but their new Kickstarter talks about completely revamping it so they seem to be aware of the problem.

My last criterion is inventory management.  I don’t know what happened, but I feel like we have gone backward in this field and a lot of it seems to be tied to fussy and boring crafting.  Warlords of Draenor is cluttering my bags with crafting items from my garrison.  Inquisition has a terrible console-style UI, crafting reagents and a bit of a Diablo-like loot system going on that makes it extremely irritating to keep your inventory clean and yet it only has like 4 equipment slots!  Wasteland 2 had so many things to pick up, and weight allowances were so low, that you spent far too much time inventory juggling.  Finally Original Sin was undone by skill books and consumables and a metric ton of crafting items that were probably never useful but the hoarder in me kept around.  The sorting options at the time I played were abysmal.  Sometimes I lost quest items in the morass that was my inventory.

Shadowrun Returns instead basically lets you outfit before a mission with a simple but functional UI.  You don’t have that many slots and most things are obvious sequential upgrades, but that is usually the case in other games too; they just obfuscate it more.  Do I wish equipment were more varied?  Yes.  But I will take this system over fiddling with my inventory for hours every single time.  I will say that I wish cybernetic enhancements were a better and more interesting option.  They kind of pale in comparison to magic, which they directly compete with, and so I rarely use them.

So that is my overview of the year’s RPGs.  Most of them are undone by details, lessons learned long ago that everyone seems to have forgotten.  Shadowrun Returns is the lone exception and at least to me seems to be the underdog with all the hype around Wasteland 2 and Original Sin.  I hear the director’s cut of Dragonfall is even better and I am really looking forward to their new Kickstarter.  This seems like a team that is going to keep getting better with experience.