Wednesday, August 20, 2014

Stop being so god damn nice all the time

Sure, ad hominem is a terrible fallacy. Sure, calling people names is not nice. In today's world there seems to be a consensus that we need to be nicer to each other and open to new ideas. I hold that these two things are incompatible. I will argue in the following that scientific discourse needs to get less nice, more bellicose, funny and vitriolic, and that this will result in a more friendly environment as well as much better output.

Let's start off easy, by focusing on our friends. I have a couple of very good friends, for whom I would literally do anything (I really mean anything. Would even make out with a woman for them). It is with them that I have had all the conversations that have ever changed my mind. It is to them that I go with an open mind to discuss issues that I am genuinely undecided on. It is them that I call names all the time. It is also them that continuously take the piss out of me. And it bloody works! Here I'll assert that even if you have the nicest of friends (maybe boring Mormons or something) you're still more likely to be loose-tongued around them than in any other circumstance. And that's because we're much more relaxed around our friends and we can really be ourselves. Here's my point: if you take your boring politically correct hat off and really become yourself again, you're a cursing, judgmental, well-poisoning, ad-hominem-throwing little twerp. If you don't buy that from my brilliant friends analysis, then just fucking go online and see what people sound like behind the veil of anonymity that the internet provides.
The same holds for scientists (though they're not really actual people). We read a paper (not even one that disagrees directly with our work) and a lot of the time go: "how did this piece of shit ever get published?". You know what happens then? Nothing! We ridicule it with a colleague, point out its terrible flaws and then shred it (I don't really have a shredder but now I know what I want for Christmas). This is unacceptable. What should happen is we should make fun of whoever published it and make the record clear that that one paper is crap. But we're going to be nice about it. And even if we go as far as publicly disapproving of it, we'll do it in such a highbrow, nice way that it won't really be clear to everyone that the paper should be considered a joke. So what, you'll ask? They live to research another day and the paper will probably be discarded in the long run anyway. Well, true, but with two attached costs:
1. Some people will base novel research on that publication. They, pardon my French, will get a surprise fisting. So will their funding body.
2. The above isn't that big of a problem. Scientists spend loads of time and money chasing crap. It's all business as usual. The big problem is that the bar gets lowered. If shit can so easily make it through the review process, then why bother doing things properly? Even worse than that, the reasoning usually also considers the level at which your competition may be satisfied publishing and scooping you. And because you already have a rather low opinion of your competition, that level will be quite low. And then, when it's out, mum's the word.

Trying to be nice in scientific discourse is stupid and counterproductive! The reason we do it is that we think it leads to a more focused discussion. Oh, if only we phrase things in the language of constructive criticism. What a load of bollocks. What the hell is that? Have you ever received criticism that didn't make you cringe and want to punch someone? No, you haven't! (Unless it was from your friends, and they usually called you a moron in the process of outlining their position). Off the bat, there is so much investment in anything you say as a scientist that criticism cannot but seem a siege on everything you hold dear. This problem has not and will not be solved by being nice. The only thing being nice does is stifle genuine conversation and debate, because we fear we'll hurt someone's feelings.
Thus, I propose the solution to be a forum where the two opposing sides are free to throw shit at each other. Basically, British parliamentary debate. One other rule though: they have to grab a drink together afterwards.
So come on, idiots, let's do this shit together! Call your collaborator a moron, for science.

UPDATE:
It's been a decade since this post. Ouch, that hurts.
Spent most of that time working in industry, where the invisible hand is supposed to do its magic and remove inefficiencies, boost productivity and all that good stuff. 
But don't be fooled, the niceness still wins and ruins it for all of us. Industry has exactly the same problem: everyone is so damn nice all the time. So, you end up spending your life in meetings about nothing, where everyone just smiles and pretends this is not complete and utter nonsense.
Would be enough for one to lose their mind and start throwing feces at their colleagues, if not for this: you get good money and if you realize just how low everyone's productivity actually is, you can get your job done in about 10 hours per week. In this context, why rock the boat?

I still believe we could do much better and I stand by the title of this piece. However, the incentives are such that it's not worth bothering. Off to nap in another meeting...

Sunday, July 27, 2014

If you're handing out advice, you'd better not be talking out of your ass

This is a story of ignorance and liberal phrasing. Sounds complicated, but bear with me, it'll be a fun ride.
So, this lovely primer found its way into the world:
http://www.cell.com/cell/abstract/S0092-8674(14)00864-2

It teaches all us boys and girls how to set up meta-genomic studies and analyze the data that may come out of them. Actually, it only focuses on 16S and does a rather bad job discussing anything relevant, but hey, what do you expect? You want a power analysis or to know how many samples you need, ask your statistician! These people clearly aren't the ones to answer any of those questions anyway.
None of this would be a problem, had they not actually gotten some things completely ass backwards. Thus, this unpleasant rant.
First, there's this:
"Lauber and colleagues recently showed that storage for 2 weeks at temperatures ranging from −80°C to 20°C did not significantly affect patterns of between-sample diversity or the abundance of major taxa (Lauber et al. 2010)". So basically, they're saying you can keep the samples in your pocket for a week and it will all be fine. Really? I work on stool samples and in my experience they are rather sensitive to thawing. But ok, I have been wrong before (once!). What does the original publication say? Well, ignoring the appalling setup and analysis, here's a nice quote: "One sub-sample was omitted from the data set (Fecal 1 Day 14, 20°C replicate 2) due to visible fungal growth prior to DNA extraction". Are you seriously expecting me to believe that this won't change the composition of your sample? You must be out of your mind. I extend a challenge to Dr. Lauber: that he eat his lunch after having left it at 20°C for 2 weeks; what could go wrong? Sure, you won't see the difference if you compare to just one other sample (yes, they have replicates of TWO samples) as the differences between them are huge. But to take this and conclude that the shifts in community composition are "minor" is idiotic. You might even say it's the result of eating two week old lunch...
I've tried getting my hands on their data but the SRA number doesn't actually exist. I'll get back to this once I have it in my hands.

And then there's this:
"The number of sequences obtained in a sequencing run can vary across samples for technical rather than biological reasons, and these sequencing depth artifacts can affect diversity estimates. One approach to account for variable sequencing depth is to use frequencies of OTUs (operational taxonomic units, described below) within samples (i.e., to normalize by total sample sequence count). We recommend against this approach, as we have found that it is subject to statistical pitfalls and can lead to samples clustering by sequencing depth (Friedman and Alm, 2012; C. Lozupone, J.G.C., and R.K., unpublished data)."
It goes on to propose rarefaction and then to cite a paper that explains quite plainly why rarefaction is generally a stupid thing to do (McMurdie and Holmes, 2013). But that's not my problem at all. You're free to do rarefaction and throw 50% of your data out the window. What do I care? It's not my money. Just don't tell your funding body.
No, my problem is with the assertion that abundances "[are] subject to statistical pitfalls and can lead to samples clustering by sequencing depth", which is simply not true. They cite a lovely paper by Friedman and Alm, which I doubt they have read. If they had, they would know just how silly their point is.
I do recommend reading the Friedman paper, but I'll put their point plainly here: You cannot use correlation analysis on compositional data, because it's going to be crap. This is because each measurement is by definition dependent on all others and this will break the correlation. Pearson (yeah, that one) had figured this out in 1897. Then they clearly show a way of getting around this, by employing a log-ratio transformation from the good old Aitchison. So, there's no problem with abundance values, you should just not use them wrongly. As to the "clustering by sequencing depth", I'll repudiate that off the bat since they can't be bothered to show any data for it.
But it gets better. The compositional problem doesn't arise from a total sum scaling (the one that gives you abundances that sum to 100%), as the authors of the primer would have you believe (and I'm sure they believe it themselves). It comes from the way the measurement is done. Let me put it this way: any value that you measure is only valid in the context of the number of measurements you've done. So, if you have 80 measurements that reflect the presence of Bacteria A and you don't tell me how many times you've measured, then 80 doesn't mean anything! It only becomes a coherent measure when you're saying 80 out of 256 are Bacteria A. And here, the compositional issue is already present. Because if in your "community" Bacteria A grows and is peppy and everything else stays the same, then in a next measurement it might be 170 out of 256. And thus, because of the way you measure, the values of Bacteria A and "the rest" will have a perfect negative correlation. They will simply have to.
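To make the log-ratio point a bit more concrete, here is a toy sketch. The counts are invented and the helper names are hypothetical; this is not the actual procedure from the Friedman paper, just the basic Aitchison idea that ratios between taxa survive things that wreck relative abundances.

```python
import math

def relative_abundance(counts):
    """Total sum scaling: each taxon's share of all reads in the sample."""
    total = sum(counts.values())
    return {taxon: n / total for taxon, n in counts.items()}

def log_ratio(counts, taxon_a, taxon_b):
    """Aitchison-style log-ratio between two taxa in the same sample."""
    return math.log(counts[taxon_a] / counts[taxon_b])

# Invented counts. Between the two samples only taxon C changes (it blooms);
# A and B are left untouched.
before = {"A": 100, "B": 50, "C": 50}
after = {"A": 100, "B": 50, "C": 850}

# The same community as `after`, but sequenced three times deeper.
after_deeper = {taxon: 3 * n for taxon, n in after.items()}

# The relative abundance of A drops although A itself did nothing.
frac_before = relative_abundance(before)["A"]
frac_after = relative_abundance(after)["A"]

# The A/B log-ratio ignores both the bloom of C and the sequencing depth.
lr_before = log_ratio(before, "A", "B")
lr_after = log_ratio(after, "A", "B")
lr_deeper = log_ratio(after_deeper, "A", "B")

print(frac_before, frac_after)
print(lr_before, lr_after, lr_deeper)
```

The fraction of A moves around with whatever C is doing, while the A/B log-ratio stays put in all three samples. That is the sense in which abundances are fine as long as you don't feed them raw into a correlation.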
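The 80-out-of-256 example can be checked in a few lines. A minimal sketch, assuming a fixed depth of 256 reads per measurement; only the values 80 and 170 come from the text, the other counts are invented for illustration.

```python
import math

def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

DEPTH = 256  # reads per measurement, fixed by how you measure, not by biology

# Reads assigned to Bacteria A in successive measurements.
# 80 and 170 are from the text; the remaining values are made up.
bacteria_a = [80, 170, 120, 95, 200]

# Everything that is not Bacteria A has to fill up the remaining reads.
the_rest = [DEPTH - a for a in bacteria_a]

r = pearson(bacteria_a, the_rest)
print(round(r, 6))
```

No scaling to percentages happens anywhere in this sketch, and `r` still comes out at -1 (up to floating point), whatever counts you put into `bacteria_a`.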
One last thing: Friedman and Alm also nicely make the point that the compositional effect is going to be stronger the less diverse your community is. Rarefaction does exactly that! It minimizes the diversity in your sample, thus exacerbating the compositional effect. And this is how you get things ass backwards!
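What rarefaction does to a sample can be sketched as follows. The counts are invented, the depth of 100 is arbitrary, and `rarefy` is a hypothetical helper written for this illustration, not a function from any particular package.

```python
import random
from collections import Counter

def rarefy(counts, depth, seed=0):
    """Subsample `depth` reads without replacement from a dict of OTU counts."""
    # Expand the count table into one label per read, then draw from that pool.
    reads = [otu for otu, n in counts.items() for _ in range(n)]
    if depth > len(reads):
        raise ValueError("depth exceeds the number of reads in the sample")
    rng = random.Random(seed)
    return Counter(rng.sample(reads, depth))

# Invented sample: three dominant OTUs and three rare ones.
sample = {"OTU1": 5000, "OTU2": 3000, "OTU3": 1900,
          "OTU4": 60, "OTU5": 30, "OTU6": 10}

rarefied = rarefy(sample, depth=100)

# Observed richness: the number of OTUs with at least one read.
richness_before = sum(1 for n in sample.values() if n > 0)
richness_after = len(rarefied)

# Fraction of reads thrown away by rarefying to this depth.
discarded = 1 - 100 / sum(sample.values())

print(richness_before, richness_after, discarded)
```

Observed richness can only stay the same or go down after subsampling, and the rare OTUs are the first to vanish, which leaves you with a less diverse-looking sample than the one you paid to sequence.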