## Sunday, April 26, 2015

### Supervoid superhype, or the publicity problem in science

Part of the reason this blog has been quiet recently is that I decided at the start of this year to try to avoid — as far as possible — purely negative comments on incorrect, overhyped papers, and focus only on positive developments. (The other part of the reason is that I am working too hard on other things.)

Unfortunately, last week a cosmology story hit the headlines that is so blatantly incorrect and yet so unashamedly marketed to the press that I'm afraid I am going to have to change that stance. This is the story that a team of astronomers led by Istvan Szapudi of the University of Hawaii have found "the largest structure in the Universe", which is a "huge hole" or "supervoid" that "solves the cosmic mystery" of the CMB Cold Spot. This story was covered by all the major UK daily news outlets last week, from the Guardian to the Daily Mail to the BBC, and has been reproduced in various forms in all sorts of science blogs around the world.

There are only three things in these headlines that I disagree with: that this thing is a "structure", that it is the largest in the Universe, and that it solves the Cold Spot mystery.

Let's focus on the last of these. Readers of this blog may remember that I wrote about the Cold Spot mystery in August last year, referring to a paper my collaborators and I had written which conclusively showed that this very same supervoid could not explain the mystery. Our paper was published back in November in Phys. Rev. D (journal link, arXiv link). And yet here we are six months later, with the same claims being repeated!

Does the paper by Szapudi et al. (journal link, arXiv link) refute the analysis in our paper? Does it even acknowledge the results in our paper? No, it pretends this analysis does not exist and makes the claims anyway.

Just to be clear, it's possible that Szapudi's team are unaware of our paper and the fact that it directly challenged their conclusions several months before their own paper was published, even though Phys. Rev. D is a very high profile journal. This is sad and would reflect a serious failure on their part and that of the referees. The only alternative explanation would be that they were aware of it but chose not to even acknowledge it, let alone attempt to address the argument within it. This would be so ethically inexcusable that I am sure it cannot be correct.

I am also frankly amazed at the standard of refereeing which I'm afraid reflects extremely poorly on the journal, MNRAS.

Coming to the details. In our paper last year, we made the following points:
1. Unless our understanding of general relativity in general, and the $\Lambda$CDM cosmological model in particular, is completely wrong, this particular supervoid, which is large but only has at most 20% less matter than average, is completely incapable of explaining the temperature profile of the Cold Spot.
2. Unless our understanding is completely wrong as above, the kind of supervoid that could begin to explain the Cold Spot is incredibly unlikely to exist — the chances are about 1:1,000,000!
3. The corresponding chances that the Cold Spot is simply a random fluctuation that requires no special explanation are at worst 1:1000, and, depending on how you analyse the question, probably a lot better.
4. This particular supervoid is big and rare, but not extremely so. In fact several voids that are as big or bigger, and as much as 4 times emptier, have already been seen elsewhere in the sky, and theory and simulation both suggest there could be as many as 20 of them.
To illustrate point 1 graphically, I made the following figure showing the actual averaged temperature profile of the Cold Spot versus the prediction from this supervoid:

If this counts as a "solution to a cosmic mystery" then I'm Stephen Hawking.

The supervoid can only account for less than 10% of the total temperature decrement at the centre of the Cold Spot (angle of $0^\circ$). At other angles it does worse, failing to even predict the correct sign! And remember, this prediction only assumes that our current understanding of cosmology is not completely, drastically wrong in some way that has somehow escaped our attention until now.

You'll also notice that if the entire red line is somehow magically (through hypothetical "modified gravity effects") scaled down to match the blue line at the centre, it remains wildly, wildly wrong at every other angle. This is a direct consequence of the fact that the supervoid is very large, but really not very empty at all.

By contrast, the simple fact that the Cold Spot is chosen to be the coldest spot in the entire CMB already accounts for 100% of the cold temperature at the centre:

 The red line is the observed Cold Spot temperature profile. 95% 68% of the coldest spots chosen in random CMB maps have temperatures lying within the dark blue band, and 99% 95% lie within the light blue band. Image credit: http://arxiv.org/abs/1408.4720.

Similarly, the fact that Mt. Everest is much higher than sea level is not at all surprising. The highest mountains on other planets (Mars, for instance) can be a lot higher still.

But how to explain the fact that a large void does appear to lie in the same direction as the Cold Spot? Is this not a huge coincidence that should be telling us something?

Let's try the following calculation. Take the hypothesis that this particular void is causing the Cold Spot, let's call it hypothesis H1. Denote the probability that this void exists by $p_\mathrm{void}$, and the probability that all of GR is wrong and that some unknown physics leads to a causal relationship as $p_\mathrm{noGR}$. Then
$$p_\mathrm{H1}=p_\mathrm{void}p_\mathrm{noGR}$$On the other hand, let H2 be the hypothesis that the void and the Cold Spot are separate rare occurrences that happen by chance to be aligned on the sky. This gives
p_\mathrm{H2}=p_\mathrm{void}p_\mathrm{CS}p_\mathrm{align},where $p_\mathrm{CS}$ is the probability that the Cold Spot is a random fluctuation on the last scattering surface, and p_\mathrm{align} the probability that two are aligned.

The relative likelihood of the two rival hypotheses is given by the ratio of the probabilities:
\frac{p_\mathrm{H1}}{p_\mathrm{H2}}=\frac{p_\mathrm{noGR}}{p_\mathrm{CS}p_\mathrm{align}}.Suppose we assume that $p_\mathrm{CS}=0.05$, and that the chance of alignment at random is p_\mathrm{align}=0.001.[1] Then the likelihood we should assign to "supervoid-caused-the-Cold-Spot" hypothesis depends on whether we think $p_\mathrm{noGR}$ is more or less than 1 in 20,000.

This exact calculation appears in Szapudi et al's paper, except that they mysteriously leave out the numerator on the right hand side. This means that they assume, with probability 1, that general relativity is wrong and that some unknown cause exists which makes a void with only a 20% deficit of matter create a massive temperature effect. In other words, they've effectively assumed their conclusion in advance.

Well, call me old-fashioned, but I don't think that makes any sense. We have a vast abundance of evidence, gathered over the last 100 years, which show that if indeed GR is not the correct theory of gravity it is still pretty damn close to it. What's more, we have lots of cosmological evidence — from the Planck CMB data, from cross-correlation measurements of the ISW effect, as well as from weak lensing — that gravity behaves very much as we think it does on cosmological scales. Looking at the figure above, for the supervoid to explain the Cold Spot requires at least a factor of 10 increase in the ISW effect at the void centre, as well as a dramatic effect on the shape of the temperature profile. And all this for a void with only a 20% deficit of matter! If the ISW effect truly behaved like this we would have seen evidence of it in other data.

For my money, I would put $p_\mathrm{noGR}$ at no higher than $2.9\times10^{-7}$, i.e. I would rule out the possibility at $5\sigma$ confidence. This is a lot less than 1:20,000, so I would say chance alignment is strongly favoured. Of course you should feel free to put your own weight on the validity of all of the foundations of modern cosmology, but I suggest you would be very foolish indeed to think, as Szapudi et al. seem to do, that it is absolutely certain that these foundations are wrong.

So much for the science, such as it is. The sociological observation that this episode brings me back to is that, almost without exception, whenever a paper on astronomy or cosmology is accompanied by a big press release, either the science is flawed, or the claims in the press release bear no relation to the contents of the paper. This is a particularly blatant example, where the authors have generated a big splash by ignoring (or being unaware of) existing scientific literature that runs contrary to their argument. But the phenomenon is much more ubiquitous than this.

I find this deeply depressing. Like most other young researchers (I hope), I entered science with the naive impression that what counted in this business was the accuracy and quality of research, the presentation of evidence, and — in short — facts. I thought the scientific method would ensure that papers would be rigorously peer-reviewed. I did not expect that how seriously different results are taken would instead depend on the seniority of the lead author and the slickness of their PR machine. Do we now need to hold press conferences every time we publish a paper just to get our colleagues to cite our work?

One possible response to this is that I was hopelessly naive, so more fool me. Another, which I still hope is closer to the truth, is that, in the long run, the crap gets weeded out and that truth eventually prevails. But in an era when public "impact" of scientific research is an important criterion for career advancement, and such impact can be simply achieved by getting the media to hype up nonsense papers [2], I am sadly rather more skeptical of the integrity of scientists [3].
----

[1] This probability for alignment is the number quoted by Szapudi's team, based on the assumption that there is only one such supervoid, which could be anywhere in the sky. In fact, as I've already said, theory and simulation suggest there should be as many as 20 supervoids, and several have already been seen elsewhere in the sky (including one other by Szapudi's team themselves!). The probability that any one supervoid should be aligned with the Cold Spot should therefore be roughly 20 times larger, or 0.02.

[2] Not everything in Szapudi's paper is nonsense, of course. For instance, it seems quite likely that there is indeed a large underdensity where they say. But there is still a deal of nonsense (described above) in the actual paper, and vastly more in the press releases, especially the one from the Institute for Astronomy in Hawaii.

[3] On the whole, given the circumstances I though journalists handled the hype quite well, especially Hannah Devlin in the Guardian, who included a skeptical take from Carlos Frenk. (I suspect Carlos at least was aware of our paper!)

## Tuesday, December 23, 2014

### Planck's starry sky

Well December 22nd has come and gone, and the promised release of Planck data has, perhaps unsurprisingly, not materialised. Some of the talks presented at the Ferrara conference are available here, and there was a second conference in Paris more recently, video recordings from which can be found here.

I've seen a bit of speculation about the delays and what the data might or might not be showing on a few physics blogs, some of which I think are a little mistaken. So I thought I'd put up a quick post summarising the situation as I see it — but note that my opinion is not at all official and may be wrong on some of the details (especially since I wasn't at either of the conferences).

For a start, you may notice that some of the talks at Ferrara are not made available on the website. I'm informed that this internal censorship was applied by the Planck science team, and it is based on their estimation that the censored talks are ones containing results which are still preliminary and liable to change before the eventual data release. The flip side of this is that the talks that are available contain data which they are confident will not change, so these are the ones you'd want to pay attention to in any case.

In terms of the data itself, there appear to be two and a half important improvements so far. The first is that the overall calibration of the temperature power spectrum — which was previously somewhat discrepant with the WMAP measurement — has been improved, and now Planck and WMAP agree very well with each other. The second is that the apparent anomaly in the temperature power spectrum at multipole values of $\ell\sim1800$ has been identified as being due to a glitch in the 217 GHz data, and has been corrected. The anomaly has therefore disappeared. This can be seen by comparing the 2013 and 2014 versions of the TT power (if you look carefully):

The remaining half an improvement comes from the polarization data. Previously, this was so badly affected by systematics at large scales that the Planck team were only able to even show the data points at $\ell>100$, and were unable to use them for any science analysis, relying instead on the WMAP polarization. These systematics have still not been completely resolved — apparently it is the HFI instrument which is the problematic one — but they have been somewhat improved, such that the EE and TE power spectra are trustworthy at $\ell>30$, which is enough to start using them for parameter constraints in place of the WMAP data. (This means that the error bars on various derived parameter values have decreased a little from 2013, but they will decrease a lot more when all the data is finally available.)

This last half improvement is somewhat relevant to the BICEP2 issue which I discussed here, since the improved polarization data in 2014 was an important reason that Planck was able to say something about the dust polarization in the BICEP2 window. The fact that they still aren't 100% happy with this data yet could be a bit concerning. On the other hand, the relevant range of multipoles for BICEP2 is $\ell\sim80$ rather than $\ell<30$.

In terms of what these new data tell us, I'm afraid the story appears mostly rather boring, since there is very little change from what we learned already in 2013. As expected, the values of all cosmological parameters are consistent with what Planck announced in 2013; insofar as there have been any minor changes in the values, they tend to move in the direction of making Planck and WMAP more consistent with each other, but really these shifts appear very small and not worth worrying about. It appears the only parameter which has shifted at all significantly is the optical depth $\tau$. Constraints on $\tau$ rely on the use of polarization data; the previous constraints were obtained by combining Planck temperature with WMAP polarization measurements whereas the current value comes from Planck alone.

At some point I suppose the various systematics with the HFI polarization data will be sorted out to the extent that we will get the long-awaited release and the papers. But I have given up trying to predict when. In the meantime, I thought the coolest thing to come out of the recent conferences was this image:

which rather reminded me of this:

 Detail from The Starry Night.
and this:

 Detail from Haystacks Near a Farm in Provence.

## Monday, December 1, 2014

### Planck at Ferrara

There is a conference starting today in Ferrara on the final results from Planck.

Though actually these won't be the final results from Planck, since although all scientists in the Planck team have been scrambling like mad to prepare for this date, they haven't been able to get all their results ready for presentation yet. So the actual release of most of the data and the scientific papers is scheduled for later this month. December 22nd, in fact — for European scientists, almost the last working day of the year (Americans tend to have some conferences between Christmas and New Year) — so at least we will technically have the results in 2014.

Except even that isn't really it, because the actual Planck likelihood code will only be released in January 2015. Or at least, I'm pretty sure that's what the Planck website used to say: now it doesn't mention the likelihood code by name, referring instead to "a few of the derived products."

If you're confused, well, so am I. The likelihood code is one of the most important Planck products for anyone planning to actually use Planck data for their own research — to do so properly normally means re-running fits to the data for your favourite model, which means you need the likelihood code. (Of course, some people do take the short cut of simply quoting Planck constraints on parameters derived in other contexts, and this is not always wrong.) This means that having the final, correct version likelihood code is rather important even for Planck scientists themselves to be completely confident in the results they are presenting. So it would make more sense to me if the likelihood code were released at the same time as the rest of the data. Perhaps that is what is actually going to happen, I suppose we'll find out soon.

Incidentally, my information is that the "final, correct" version of the likelihood code was distributed for internal use within the Planck collaboration about 4 weeks ago or so. Considering that it is only after this happens that proper model comparison projects can begin, that obtaining parameter constraints for each model can take a surprisingly large amount of computing time, that the various Planck teams responsible for this step had scores of different models to investigate, that the "final, correct" version may well have undergone a subsequent revision, and that the process of drafting each paper at the end of the analysis must itself take a couple of weeks minimum ... I suppose I'm not very surprised that the date for data release has been pushed back.

There's some uncertainty about whether the videos from the conference will be made available, as a statement on the website saying this would happen has been removed. For those interested here is a Youtube channel purporting to provide video from the conference, but disappointingly it doesn't appear to actually work.

## Wednesday, November 26, 2014

### Quasar structures: a postscript

A few days ago I discussed the purported 'spooky' alignment of quasar spins and the cosmological principle here. So as to focus better on the main point, I left a few technical comments out of that discussion which I want to mention here. These don't have any direct bearing on the main argument made in that post — rather they are interesting asides for a more expert audience.

#### Quasars can't prove the Universe is homogeneous

Readers of the original post might have noticed that I was quite careful to always state that the distribution of quasars was statistically homogeneous, but not that the quasars showed the Universe was homogeneous. The reason for this lies in the properties of the quasar sample itself.

There are two main ways of constructing a sample of galaxies or quasars to use for further analysis, such as testing homogeneity. The first is that you simply include every object seen by the survey instruments within a certain patch of sky that lies between two redshifts of interest. But these objects will vary in their intrinsic brightness, and the survey instruments have a limited sensitivity, so can only record dim objects when they are relatively close to us. Intrinsically bright objects are rarer, but if they are very far away we will only be able to see the rare bright ones. So this strategy results in a sample with very many, but largely dim, galaxies or quasars relatively close to us, and fewer but brighter objects far away. This is known as a flux-limited sample.

The other strategy is to correct the measured brightness of each object for the distance from us, to determine its 'intrinsic' brightness (otherwise known as its absolute magnitude), and then select a sample of only those objects which have similar absolute magnitudes. The magnitude range is chosen in accordance with the range of distances such that within the volume of the Universe surveyed, we can be confident we have seen every object of that magnitude that exists. This is called a volume-limited sample.

Testing the homogeneity of the Universe requires a volume-limited survey of objects. For a flux-limited sample the distribution in redshift (i.e., in the line-of-sight direction) would not be expected to be uniform in the first place: the number density of objects would ordinarily decrease sharply with redshift. But looking out away from Earth also involves looking back in time; so if the redshift range of the survey is large, the farthest objects are seen as they were at an earlier time than the closest ones. If the objects in question had evolved significantly in that time, near and far objects could represent significantly different populations even in a volume-limited sample, and once again we wouldn't expect to see homogeneity along the line of sight, even if the Universe were homogeneous.

So to really test the cosmological principle without having to assume homogeneity at the outset,1 we really need a volume-limited sample of galaxies that cover a very large volume of the Universe but span a relatively narrow range of redshifts. Such surveys are hard to come by. For example, the study confirming homogeneity in WiggleZ galaxies (see here and here) actually used a flux-limited sample, so required additional assumptions. In this case one doesn't obtain a proof, rather a check of the self-consistency of those assumptions — which people may regard as good enough, depending on taste.

Anyway, the key point is that the DR7QSO quasar sample everyone uses is most definitely flux-limited and not volume-limited (I was myself reminded of this point by Francesco Sylos Labini). Despite this, the redshift distribution of quasars is remarkably uniform (between redshifts 1 and 1.8). So what's going on? Well, unlike certain types of galaxies that live much closer to home, distant quasar populations are expected to evolve rather quickly with time. And the age difference between objects at redshifts 1 and 1.8 is more than 2 billion years!

It would appear that this effect and the flux-limited nature of the survey coincidentally roughly cancel each other out for the sample in question. A volume-limited subset of these quasars would be (is) highly inhomogeneous — but then because of the time evolution the homogeneity or otherwise of any sample of quasars says nothing much about the homogeneity or otherwise of the Universe in general.

Luckily this is only incidental to the main argument. The fact that the distribution of these (flux-limited) quasars is statistically homogeneous on scales of 100-odd Megaparsecs despite claims for the existence of Gigaparsec-scale 'structures' simply demonstrates the point that the existence of single structures of any kind doesn't have any bearing on the question of overall homogeneity. Which is the main point.

#### Homogeneity is sample-dependent

Of course the argument above cuts both ways.

Let's imagine that a study has shown that the distribution of a particular type of galaxy — call them luminous red galaxies — approaches homogeneity above a certain distance scale, say 100 Megaparsecs. Such a study was done by David Hogg and others in 2005. From this we may reasonably conclude (though not, strictly speaking, prove) that the matter distribution in the Universe is homogeneous above at most 100 Mpc. But we are not allowed to conclude that the distribution of some other sample of objects — radio galaxies, quasars, blue galaxies etc. — approaches homogeneity above the same scale, or indeed at all!

Even in a Universe with a homogeneous matter distribution, the scale above which a volume-limited sample of galaxies whose properties are constant with time approaches homogeneity depends on the galaxy bias. This number depends on the type of galaxies in question, and so too to a lesser extent will the expected homogeneity scale. Of course if the sample is not volume-limited, or does evolve with time, all bets are off anyway.

More generally, for each sample of galaxies that we wish to use for higher order statistical measurements, the statistical homogeneity of that particular sample must in general be demonstrated first. This is because higher order statistical quantities, such as the correlation function, are conventionally normalized in units of the sample mean, but in the absence of statistical homogeneity this becomes meaningless.

There was a time when the homogeneity of the Universe was less well accepted than it is today, and the possibility of a fractal distribution of matter was still an open question. At that time demonstrating the approach to homogeneity on large scales in a well-chosen sample of galaxies was worth a publication (even a well-cited publication) in itself. This is probably no longer the case, but it remains a necessary sanity check to perform for each galaxy survey.

1Properly speaking, even the creation of a volume-limited sample requires an assumption of homogeneity at the outset, since the determination of absolute magnitudes requires a cosmological model, and the cosmological model used will assume homogeneity. In this sense all "tests" of homogeneity are really consistency checks of our assumption thereof.

## Sunday, November 23, 2014

### A 'spooky alignment' of quasars, or just hype?

In the news this week we've had a story on the alignment of quasar spins with large-scale structure, based on this paper by Hutsemekers et al. The paper was accompanied by this press release from the European Space Observatory, which was then reproduced in various forms in a number of blogs and news outlets — almost all of which stress the 'spooky' or 'mysterious' nature of the claimed alignment 'over billions of light years'.

At least one of these blogs (the one at The Daily Galaxy) explicitly claims that the alignment of these quasar spins is a challenge for the cosmological principle, which is the assumption of large-scale statistical homogeneity and isotropy of the Universe, on which all of modern cosmology is based. This claim is not contained in the press release, but originates from a statement in the paper itself, where the authors say
The existence of correlations in quasar axes over such extreme scales would constitute a serious anomaly for the cosmological principle.
I'm afraid that this claim is completely unsupported by any of the actual results contained within the paper, and is therefore one of those annoying examples of scientific hype. In this post I will try to explain why.

I have actually covered much of this ground before — in a blog post here, but more importantly in a paper published in Monthly Notices last year — and I must admit I am a little surprised at having to repeat these points (especially since my paper is cited by Hutsemekers et al.). Nevertheless, in what follows I shall try not to sound too grumpy.

The immediate story started with a paper by Roger Clowes and collaborators, who claimed to have detected the 'largest structure' in the Universe (dubbed the 'Huge-LQG') in the distribution of quasars, and also claimed that this structure violated the cosmological principle. My paper last year was a response to this, and made the following points:

1. the detection of a single large structure has essentially no relevance to the question of whether the Universe is statistically homogeneous and isotropic;
2. the quasar sample within which the Huge-LQG was identified is statistically homogeneous, and approaches homogeneity at the scale we expect theoretically, thus providing an explicit demonstration of point 1;
3. the definition of 'structure' by which the Huge-LQG counts as a structure is so loose that by using it we would find equally vast 'structures' even in completely random distributions of points which (by construction!) contain no correlations and therefore no structure whatsoever; and
4. therefore the classification of the Huge-LQG set of quasars as a 'structure' is essentially empty of meaning.

#### Quasar structures don't violate homogeneity

Since I am already repeating myself, let me elaborate a little more on points 1 and 2. Our Universe is not exactly homogeneous. The fact that you exist — more generally, the fact that stars, galaxies and clusters of galaxies exist — is sufficient proof of this, so it would a very poor advertisement for cosmology indeed if it were all founded on the assumption of exact homogeneity. Luckily it isn't. In fact our theories could be said to predict the existence of structure in the potential $\Phi$ on all scales (that's what a scale-invariant power spectrum from inflation means!), and even the galaxy-galaxy correlation function only goes asymptotically to zero at large scales.

Instead we have the assumption of statistical homogeneity and isotropy, which means that we assume that when looked at on large enough scales, different regions of the Universe are on average the same. Clearly, since this is a statement about averages, it can only be tested statistically by looking at large numbers of different regions, not by finding one particular example of a 'structure'. In fact there is a well-established procedure for checking the statistical homogeneity of the distribution of a set of points (the positions of galaxies or quasars, in this case), which involves measuring its fractal dimension and checking the scale above which this is equal to 3. I've described the procedure before, here and here, and Peter Coles describes a bit of the history of it here.

The bottom line is that, as I showed last year, the quasar distribution in question is statistically homogeneous above scales of at most $\sim130h^{-1}$Mpc. There is therefore no 'structure' you can find in this data which could violate the cosmological principle. End of story.

 Scaled number counts in spheres as a measure of the fractal dimension of the quasar distribution. On scales where this number approaches 1, the distribution is statistically homogeneous. From arXiv:1306.1700.

#### Structures and probability

Of course, there are many different ways of being statistically homogeneous. It is perfectly possible that within a statistically homogeneous distribution one could find a particular structure or feature whose existence in our specific cosmological model (which is one of many possible models satisfying the cosmological principle) is either very unlikely or impossible. This would then be a problem for that cosmological model despite not having any wider implications for the cosmological principle. But to prove this requires some serious analysis, which should include a proper treatment of probabilities — you can't just say "this structure is big, so it must be anomalous."

In particular, any serious analysis of probabilities must take into account how a 'structure' is defined. Given infinitely many possible choices of definition, and a very large Universe in which to search, the probability of finding some 'structure' that extends over billions of light years is practically unity. In fact the definition used for the Huge-LQG would be likely to throw up equally vast 'structures' even if quasar positions were not at all correlated with each other (and we know they must be at least somewhat correlated, because of gravity). So it really isn't a very useful definition at all.

#### 'Spooky' alignments

This brings us to the current paper by Hutsemekers et al. The starting assumption of this paper is that the Huge-LQG is a real structure which is somehow distinguished from its surroundings. This assumption is manifest in the decision that the authors make to try to measure the polarization of light from only those quasars that are classified as part of the Huge-LQG rather than a more general sample of quasars. This classic case of circular reasoning is the first flaw in the logic, but let's put it to one side for a minute.

The press release then tells us that the scientists
found that the rotation axes of the central supermassive black holes in a sample of quasars are parallel to each other over distances of billions of light years
and that the spins of the central black holes are aligned along the filaments of large-scale structure in which they reside.

I find this statement extremely problematic. Here is a figure from the paper in question, showing the sky positions of the 93 quasars in question, along with the polarization orientations for the 19 which are used in the actual analysis:

 Quasar positions (black dots) and polarization alignments (red lines). From arXiv.1409.6098.

Do you see the alignment? No, me neither. In fact, looking at the distribution of angles in panel b, I would say that looks very much like a sample drawn from a perfectly uniform distribution.

So what is the claim actually based on? Well, for a start one has to split up the (arbitrarily defined) 'structure' into several (even more arbitrarily defined) 'sub-structures'. Each of these sub-structures then defines a different reference angle on the sky:

 Chopping the data to suit the argument (Figure 4 of arXiv:1409.6098). On what basis are sub-structures 1 and 2 defined as separate from each other?

And now one has to measure the angles between the quasar polarization direction and the reference direction of the particular sub-structure, and the direction perpendicular to the reference direction, and choose the smaller of the two. In other words, rather than prove that quasars are aligned parallel to each other over distances extending over billions of light years (the claim in the press release), what Hutsemekers et al. are actually doing is attempting to show that given arbitrary choices of some smaller sub-structures and reference directions, quasars in different sub-structures are typically aligned either parallel to or perpendicular to this direction. This is a much less exacting standard.

Even this claim is not particularly well supported by the evidence. That is, looking at the distribution of angles, I am really not at all convinced that this shows evidence for a bimodal distribution with peaks at 0 and 90 degrees:

 Distribution of angles purportedly showing two distinct peaks at 0 and 90. Figure 5 of arXiv:1409.6098.

So in summary I think the statistical evidence of alignment of quasar spins is already pretty weak. I don't see any analysis in the paper dealing with the effects of a different arbitrary choice of sub-structures, nor do I see any error analysis (the error in measuring the polarization direction of a quasar can be as large as 10 degrees!). And I haven't even dealt with the fact that the polarization data is used for only 19 quasars out of the full 93 — in other words, for the majority of quasars in the sample the central black hole spins are aligned along some other, undetermined, direction such that we can't measure the polarization.

#### Extraordinary claims require extraordinary evidence

Now, it's worth repeating that we've already seen that in fact the space distribution of quasars is statistically homogeneous in accordance with the cosmological principle. That simple test has been done, the cosmological principle survives. So if you've got some more nuanced claim of an anomaly, I think the onus is on you not only to describe the measurement you made, but also say what exactly is anomalous about it. What is the theoretical prediction we should compare it to? Which model is being rejected (or otherwise) by the new data?

So, for instance, if quasar spins in sub-structures are indeed aligned either parallel or perpendicular to each other (and I still remain to be convinced that they are), is this really something 'spooky', or would we expect some degree of alignment in the standard $\Lambda$CDM model?

Such an analysis has not been presented, but even if it had, it's worth bearing in mind the principle that extraordinary claims require extraordinary evidence. I'm afraid throwing out a p-value of about 1% simply doesn't cut it. Not only is that actually not an enormously impressive number (especially given all the other things I mentioned above), such a frequentist statistic doesn't take account of all our prior knowledge.

Other people have banged this drum at length before, but the point is easily summarized: the p-value tells us the probability of getting this data given the model, but doesn't tell us the probability of the model being correct despite the new data appearing to contradict it. This is the question we really wish to answer. To do this requires a Bayesian analysis, in which one must account for the prior belief in the model, which is the result of confidence built up from all other experimental results that agree with it. We have an incredible amount of observational evidence in favour of our current model, that would probably not be consistent with a model in which gigantic structures could exist (I say 'probably' because no such model actually exists at present).

So my prior in favour of $\Lambda$CDM is pretty high — 19 quasars and an analysis so full of holes are not going to change that so quickly.

## Monday, September 22, 2014

### Biting the dust

Sorry about the obvious pun in the title. Today's important announcement is of course the long-awaited Planck verdict on the level at which the BICEP2 "discovery" of primordial gravitational waves had been contaminated by foreground dust. That verdict does not look good for BICEP.

(Incidentally, back in July I reported a Planck source as saying this paper would be ready in "two or three weeks". Clearly that was far too optimistic. But interestingly many members of the Planck team themselves were confidently expecting today's paper to appear about 10 days ago, and the rumour is that the current version has been "toned down" a little, perhaps accounting for some of the additional delay. Despite that it's still pretty devastating.)

Let me attempt to summarize the new results. Some important points are made right in the abstract, where we read:
"... even in the faintest dust-emitting regions there are no "clean" windows in the sky where primordial CMB B-mode polarization measurements could be made without subtraction of foreground emission"
and that
"This level [of the dust power in the BICEP2 window, over the multipole range of the primordial recombination bump] is the same magnitude as reported by BICEP2 ..."
(my emphasis). Although
"the present uncertainties are large and will be reduced through an ongoing, joint analysis of the Planck and BICEP2 data sets,"
from where I am looking unfortunately it now does not look as if there is a realistic chance that what BICEP2 reported was anything more than a very precise measurement of dust.

The Planck paper is pretty thorough, and actually quite interesting in its own right. They make use of the fact that Planck observes the sky at many frequencies to study the properties of dust-induced polarization. Whereas BICEP2 was limited to a single frequency channel at 150 GHz, the Planck HFI instrument has 4 different frequencies, of which the most useful is at 353 GHz. Previous Planck results have already shown that dust emission behaves sort of like a (modified) blackbody spectrum at a temperature of 19.6 Kelvin. Since this is a significantly higher temperature than the CMB temperature of 2.73 K, dust emission dominates at higher frequencies, which means that the 353 GHz channel essentially sees only dust and nothing else. Which makes it perfect for the task at hand, since in this particular situation roles are reversed and it is the dust that is the signal and the primordial CMB is noise!

The analysis proceeds in a number of steps. First, they study the power spectra of the two polarization modes (EE and BB) in several different large regions in the sky:

 The different large sky regions studied are shown as increments of red, orange, yellow, green and two different shades of blue. The darkest blue region is always excluded. Figure from arXiv:1409.5738.
In all these different regions, both power spectra $C_\ell^{EE}$ and $C_\ell^{BB}$ are proportional to $\ell^{\alpha}$, consistent with a value of $\alpha=-2.42\pm0.02$. Fixing $\alpha$ to this value, the amplitude of the power spectra in the different large regions then shows a characteristic dependence on the mean intensity of the dust emission — i.e. regions with more dust overall also show more polarization power — and this purely empirical relationship is characterized by
$$A^{EE,BB}\propto\langle I_{353}\rangle^{1.9},$$though with a bit of uncertainty in the fit. The amplitudes of the polarization power spectra then also show a dependence on frequency from 353 GHz down to 100 GHz which matches previous Planck results (the dependence is something close to a blackbody spectrum at 19.6 K, but with a specific modification).

It then turns out that if the sky is split into very many much smaller regions close to the poles rather than the 6 large ones above, the same results continue to hold on average, though obviously there is some scatter introduced by the fact that dust in different bits of the sky behaves differently. So this allows the Planck team to take the measured dust intensity in any one of these smaller regions and extrapolate down to see what the contribution to the BB power would be if measured at the BICEP2 frequency of 150 GHz. The result looks like this:

 The level of dust contamination across the in measurements of the primordial B-mode signal. Blue is good, red is bad. The BICEP2 window is the black outline on the right.
This really sucks for BICEP2, who chose their particular patch of the sky precisely because, according to estimates of the 1990s and early 2000s, it was supposed to have very little dust. Planck is now saying that isn't true, and that there is a better region just a little further south. Even that better region isn't perfect, of course, but it may be clean enough to see a primordial GW signal of $r\sim 0.1$ to $0.2$ — if such a signal exists, and if we're lucky and/or figure out cleverer ways of subtracting the dust foreground.

The problem with the BICEP2 region is that Planck's estimate of the dust contribution there looks like this:

 Planck's estimate of the dust contribution to the BB power spectrum at 150 GHz and in the BICEP2 sky window. The first bin is the one that's most relevant. The black line is the contribution primordial GW with $r=0.2$ would make, if they existed.
So it appears that in the BICEP2 window, in the $\ell$ region where primordial gravitational waves produce a measurable BB signal (and BICEP2 has measured something), dust is expected to produce the same amplitude of signal as does an $r=0.2$. In fact, even accounting for the uncertainties in the Planck analysis (the extent of the pink error bars on the plot) it is clear that (a) dust will be contributing significantly to the BICEP2 measurement, and (b) it's pretty likely that only dust is contributing.

Planck avoid explicitly saying that BICEP2 haven't seen anything but dust. This is because they haven't directly measured the dust contribution in that window and at 150 GHz. Rather what's shown in the plot above is based on a number of little steps in the chain of inference:
1. generally, the BB polarization amplitude is dependent on the average total dust intensity in a region;
2. the relationship between these two doesn't vary too much across the sky;
3. generally, the frequency dependence of the amplitude shows a certain behaviour;
4. and again this doesn't appear to vary too much across the sky
5. Planck have measured the average dust intensity in the BICEP2 window, and this gives the value shown in the plot above when extrapolated to 150 GHz;
6. and the BICEP2 window doesn't appear to be a special outlier region on the sky that would wildly deviate from these average relationships;
7. so, the dust amplitude calculated is probably correct.
Update: See the correction in the comments — the Planck paper actually does better than this. That is to say, they present one analysis that relies on all steps 1-7, but in addition they also measure the BB amplitude directly at 353 GHz and extrapolate that down to 150 GHz relying only on steps 3 and 4. The headline result is the one based on the second method, which actually gets a lower number for the dust amplitude.

So they leave open the small possibility that despite having been unlucky in the original choice of the BICEP2 window, we've somehow ultimately got very lucky indeed and nevertheless measured a true primordial gravitational wave signal.

Time will tell if this is true ... but the sensible betting has now got to be that it is not.

Incidentally, I have just learned that in two days' time I will be presenting a 30 minute lecture to a group of graduate students about this result. The lecture is not supposed to be very detailed, but I'm also not very much of an expert on this. So if you spot any errors or omissions above, please do let me know through the comments box!

## Monday, August 25, 2014

### A Supervoid cannot explain the Cold Spot

In my last post, I mentioned the claim that the Cold Spot in the cosmic microwave background is caused by a very large void — a "supervoid" — lying between us and the last scattering surface, distorting our vision of the CMB, and I promised to say a bit more about it soon. Well, my colleagues (Mikko, Shaun and Syksy) and I have just written a paper about this idea which came out on the arXiv last week, and in this post I'll try to describe the main ideas in it.

First, a little bit of background. When we look at sky maps of the CMB such as those produced by WMAP or Planck, obviously they're littered with very many hot and cold spots on angular scales of about one degree, and a few larger apparent "structures" that are discernible to the naked eye or human imagination. However, as I've blogged about before, the human imagination is an extremely poor guide to deciding whether a particular feature we see on the sky is real, or important: for instance, Stephen Hawking's initials are quite easy to see in the WMAP CMB maps, but this doesn't mean that Stephen Hawking secretly created the universe.

So to discover whether any particular unusual features are actually significant or not we need a well-defined statistical procedure for evaluating them. The statistical procedure used to find the Cold Spot involved filtering the CMB map with a special wavelet (a spherical Mexican hat wavelet, or SMHW), of a particular width (in this case $6^\circ$), and identifying the pixel direction with the coldest filtered temperature with the direction of the Cold Spot. Because of the nature of the wavelet used, this ensures that the Cold Spot is actually a reasonably sizable spot on the sky, as you can see in the image below:

 The Cold Spot in the CMB sky. Image credit: WMAP/NASA.

Well, so we've found a cold spot. To elevate it to the status of "Cold Spot" in capitals and worry about how to explain it, we first need to quantify how unusual it is. Obviously it is unusual compared to other spots on our observed CMB, but this is true by construction and not very informative. Instead the usual procedure quite rightly compares the properties of the cold spots found in random Gaussian maps using exactly the same SMHW technique to the properties of the Cold Spot in our CMB. It is this procedure which results in the conclusion that our Cold Spot is statistically significant at roughly the "3-sigma level", i.e. only about 1 in every 1000 random maps has a coldest spot that is as "cold" as* our Cold Spot.** (The reason why I'm putting scare quotes around everything should become clear soon!)

So there appears to be a need to explain the existence of the Cold Spot using additional new physics of some kind. One such idea that that of the supervoid: a giant region hundreds of millions of light years across which is substantially emptier than the rest of the universe and lies between us and the Cold Spot. The emptiness of this region has a gravitational effect on the CMB photons that pass through it on their way to us, making them look colder (this is called the integrated Sachs-Wolfe or ISW effect) — hence the Cold Spot.

Now this is a nice idea in principle. In practice, unfortunately, it suffers from a problem: the ISW effect is very weak, so to produce an effect capable of "explaining" the Cold Spot the supervoid would need to be truly super — incredibly large and incredibly empty. And no such void has actually been seen in the distribution of galaxies (a previous claim to have seen it turned out to not be backed up by further analysis).

It was therefore quite exciting when in May a group of astronomers, led by Istvan Szapudi of the Institute for Astronomy in Hawaii, announced that they had found evidence for the existence of a large void in the right part of the sky. Even more excitingly, in a separate theoretical paper, Finelli et al. claimed to have modeled the effect of this void on the CMB and proven that it exactly fit the observations, and that therefore the question had been effectively settled: the Cold Spot was caused by a supervoid.

Except ... things aren't quite that simple. For a start, the void they claimed to have found doesn't actually have a large ISW effect — in terms of central temperature, less than one-seventh what would be needed to explain the Cold Spot. So Finelli et al. relied on a rather curious argument: that the second-order effect (in perturbation theory terms) of this void on CMB photons was somehow much larger than the first-order (i.e. ISW) effect. A puzzling inversion of our understanding of perturbation theory, then!

In fact there were a number of other reasons to be a bit suspicious of the claim, among which were that N-body simulations don't show this kind of unusual effect, and that several other larger and deeper voids have already been found that aren't aligned with Cold Spot-like CMB features. In our paper we provide a fuller list of these reasons to be skeptical before diving into the details of the calculation, where one might get lost in the fog of equations.

At the end of the day we were able to make several substantive points about the Cold Spot-as-a-supervoid hypothesis:
1. Contrary to the claim by Finelli et al., the void that has been found is neither large enough nor deep enough to leave a large effect on the CMB, either through the ISW effect or its second-order counterpart — in simple terms, it is not a super enough supervoid.
2. In order to explain the Cold Spot one needs to postulate a supervoid that is so large and so deep that the probability of its existence is essentially zero; if such a supervoid did exist it would be more difficult to explain that the Cold Spot currently is!
3. The possible ISW effect of any kind of void that could reasonably exist in our universe is already sufficiently accounted for in the analysis using random maps that I described above.
4. There's actually very little need to postulate a supervoid to explain the central temperature of the Cold Spot — the fact that we chose the coldest spot in our CMB maps already does that!
Point number 1 requires a fair bit of effort and a lot of equations to prove (and coincidentally it was also shown in an independent paper by Jim Zibin that appeared just a day before ours), but in the grand scheme of things it is probably not a supremely interesting one. It's nice to know that our perturbation theory intuition is correct after all, of course, but mistakes happen to the best of us, so the fact that one paper on the arXiv contains a mistake somewhere is not tremendously important.

On the other hand, point 2 is actually a fairly broad and important one. It is a result that cosmologists with a good intuition would perhaps have guessed already, but that we are able to quantify in a useful way: to be able to produce even half the temperature effect actually seen in the Cold Spot would require a hypothetical supervoid almost twice as large and twice as empty as the one seen by Szapudi's team, and the odds of such a void existing in our universe would be something like a one-in-a-million or one-in-a-billion (whereas the Cold Spot itself is at most a one-in-a-thousand anomaly in random CMB maps). A supervoid therefore cannot help to explain the Cold Spot.***

Point 3 is again something that many people probably already knew, but equally many seem to have forgotten or ignored, and something that has not (to my knowledge) been stated explicitly in any paper. My particular favourite though is point 4, which I could — with just a tiny bit of poetic licence — reword as the statement that
"the Cold Spot is not unusually cold; if anything, what's odd about it is only that it is surrounded by a hot ring"
I won't try to explain the second part of that statement here, but the details are in our paper (in particular Figure 7, in case you are interested). Instead what I will do is to justify the first part by reproducing Figure 6 of our paper here:

 The averaged temperature anisotropy profile at angle $\theta$ from the centre of the Cold Spot (in red),  and the corresponding 1 and $2\sigma$ contours from the coldest spots in 10,000 random CMB maps (blue). Figure from arXiv:1408.4720.

What the blue shaded regions show is the confidence limits on the expected temperature anisotropy $\Delta T$ at angles $\theta$ from the direction of the coldest spots found in random CMB maps using exactly the SMHW selection procedure. The red line, which is the measured temperature for our actual Cold Spot, never goes outside the $2\sigma$ equivalent confidence region. In particular, at the centre of the Cold Spot the red line is pretty much exactly where we would expect it to be. The Cold Spot is not actually unusually cold.

Just before ending, I thought I'd also mention that Syksy has written about this subject on his own blog (in Finnish only): as I understand it, one of the points he makes is that this form of peer review on the arXiv is actually more efficient than the traditional one that takes place in journals.

Update: You might also want to have a look at Shaun's take on the same topic, which covers the things I left out here ...

* People often compare other properties of the Cold Spot to those in random maps, for instance its kurtosis or other higher-order moments, but for our purposes here the total filtered temperature will suffice.

** Although as Zhang and Huterer pointed out a few years ago, this analysis doesn't account for the particular choice of the SMHW filter or the particular choice of $6^\circ$ width — in other words, that it doesn't account for what particle physicists call the "look-elsewhere effect". Which means it is actually much less impressive.

*** If we'd actually seen a supervoid which had the required properties, we'd have a proximate cause for the Cold Spot, but also a new and even bigger anomaly that required an explanation. But as we haven't, the point is moot.

## Monday, July 14, 2014

### Short news items

Over the past two months I have been on a two-week seminar tour of the UK, taken a short holiday, attended a conference in Estonia and spent a week visiting collaborators in Spain. Posting on the blog has unfortunately suffered as a result: my apologies. Here are some items of interest that have appeared in the meantime:
• The BICEP and Planck teams are to share their data — here's the BBC report of this news. The information I have from Planck sources is that Planck will put out a paper with new data very soon (about a week ago I heard it would be "maybe in two weeks", so let's say two or three weeks from today). This new data will then be shared with the BICEP team, and the two teams will work together to analyse its implications for the BICEP result. From the timescales involved my guess is that what Planck will be making available is a measurement of the polarised dust foreground in the BICEP sky region, and the joint publication will involve cross-correlating this map with the B-mode map measured by BICEP. A significant cross-correlation would indicate that most (or all) of the signal BICEP detected was due to dust.
• What Planck will not be releasing in the next couple of weeks is their own measurement of the polarization of the CMB, in particular their own estimate of the value of $r$. The timetable for this release is still October: this is a deadline imposed by the fact that ESA requires Planck to release the data by December, but another major ESA mission (I forget which) is due to be launched in November and ESA don't like scheduling "competing" press conferences in the same month because there's only so much science news Joe Public can absorb at a time. From what I've heard, getting the full polarization data ready for October is a bit of a rush as it is, so it's fairly certain that's not what they're releasing soon.
• By the way, I think I've recently understood a little better how a collaboration as enormous as Planck manage to remain so disciplined and avoid leaking rumours: it's because most of the people in the collaboration don't know the full details of the results either! That is to say, the collaboration is split into small sub-groups with specified responsibilities, and these sub-groups don't share results with each other. So if you ask a randomly chosen Planck member what the preliminary polarization results are looking like, chances are they don't know any better than you. (Though this may not stop them from saying "Well, I've seen some very interesting plots ..." and smiling enigmatically!)
• The conference I attended in Estonia was the IAU symposium in honour of the 100th birth anniversary of the great Ya. B. Zel'dovich, on the general topic of large-scale structure and the cosmic web. I'll try to write a little about my general impressions of the conference next time. In the meantime all the talks are available for download from the website here.
• A science news story you may have seen recently is "Biggest void in universe may explain cosmic cold spot": this is a claim that a recently detected region with a relative deficit of galaxies (the "supervoid") explains the existence of the unusual Cold Spot that has been seen in the CMB, without the need to invoke any unusual new physics. The claim of the explanation is based on this paper. Unfortunately this claim is wrong, and the paper itself has several problems. My collaborators and I are in the process of writing a paper of our own discussing why, and when we are done I will try to explain the issues on here as well. In the meantime, you heard it here first: a supervoid does not explain the Cold Spot!
Update: It has been pointed out to me that last week Julien Lesgourgues gave a talk about Planck and particle physics at the Strong and Electroweak Matter (SEWM14) symposium, in which he also discussed the timeline of forthcoming Planck and BICEP papers. You can see this on page 12 of his talk (pdf) and it is roughly the same as what I wrote above (except that there's a typo in the year — it should be 2014 not 2015!).

## Friday, May 16, 2014

### BICEP and listening to real experts

First up, I'd like to provide a health warning for all people landing here after following links from Sean Carroll or Peter Woit (thanks for the traffic!): I am not a CMB data analysis expert. What I provide on this blog is my own interpretation and understanding of the news and papers I have read, largely because writing such things out helps me understand them better myself. If it also helps people reading this blog, that's great, and you're welcome. But there are no guarantees that any of what I have written about BICEP is correct! If you truly want the best expert opinions on CMB analysis issues, you should listen to the best CMB experts — in this case, probably people who were in the WMAP collaboration, but are not in either Planck or BICEP. Also, if you want to ask somebody to write a scholarly review article on BICEP (yes, I get strange emails!), please don't ask me.

Having said that, I'm not sure whether any WMAP scientists write blogs, so I can at least try to provide some sources for the non-expert reader to refer to. One thing that you definitely should look at is Raphael Flauger's talk (slides and video) at Princeton yesterday. I think it is this work which was the source of the "is BICEP wrong" rumours first publicly posted at Resonaances, and indeed I see that Resonaances today has a follow-up referring to these very slides.

There are several interesting things to take away from this talk. The first is to do with the question of whether BICEP misinterpreted the preliminary Planck data that they admit having taken from a digitized version of a slide shown at a meeting. Here Flauger essentially simulates the process by digitizing the slide in question (and a few others) himself and analyzing them both with and without the correct CIB subtraction. His conclusion is that with the correct treatment, the dust models appear to predict higher dust contamination than BICEP accounted for; the inference being, I guess, that they didn't subtract the CIB correctly.

How important is this dust contribution? Here there is a fair amount of uncertainty: even if the digitization procedure were foolproof, one of the dust models underestimates the contamination and another one overestimates it. Putting the two together, "foregrounds may be OK if the lower end of the estimates is correct, but are potentially dangerous" (page 40). Flauger tries another method of estimation based on the HI column density, using yet more unofficial Planck "data" taken from digitized slides. This seems to give much the same bottom line.

A key point here is that everybody who isn't privy to the actual Planck data is really just groping in the dark, digitizing other people's slides. Flauger acknowledges by trying to estimate the effect of the process of converting real data into a gif image, converting that into a pdf as part of a talk, somebody nicking the pdf and converting it back to gif and then back to useable data. As you can imagine, the amount of noise introduced in this version of Chinese Whispers is considerable! So I think the following comment from Lyman Page towards the end of the video (as helpfully transcribed by Eiichiro Komatsu for the Facebook audience!) is perhaps the most relevant:
"This is, this is a really, peculiar situation. In that, the best evidence for this not being a foreground, and the best evidence for foregrounds being a possible contaminant, both come from digitizing maps from power point presentations that were not intended to be used this way by teams just sharing the data. So this is not - we all know, this is not sound methodology. You can't bank on this, you shouldn't. And I may be whining, but if I were an editor I wouldn't allow anything based on this in a journal. Just this particular thing, you know. You just can't, you can't do science by digitizing other people's images."
Until Planck answers (or fails to definitively answer) the question of foregrounds in the BICEP window, or some other experiment confirms the signal, we should bear that in mind.

There are some other issues that remain confusing at the moment: the cross-correlation of dust models with BICEP signal doesn't seem to support the idea that all the signal is spurious (though there are possibly some other complicating factors here), and the frequency evidence — such as it is — from the cross power with BICEP1 also doesn't seem to favour a dust contaminant. But all in all, the BICEP result is currently under a lot of pressure. Having seen this latest evidence, I now think the Resonaances verdict ("until [BICEP convincingly demonstrate that foregrounds are under control], I think their result does not stand") is — at least — a justifiable position.

Footnote: I should also perhaps explain that throughout my physics education I have been taught, and had come to believe, that the types of models of inflation BICEP provided evidence for (those with inflaton field values larger than the Planck scale) were fundamentally unnatural and incomplete, and that those, small-field, models that BICEP apparently ruled out were much more likely to be true. So perhaps my conscious attempts to compensate for this acknowledged theoretical prejudice could have biased me too far in the opposite direction in some previous posts!

## Wednesday, May 14, 2014

### New BICEP rumours: nothing to see here

This week there has been a minor kerfuffle about some rumours, originally posted on Adam Falkowski's Resonaances blog, regarding the claimed gravitational wave detection by BICEP. The rumours asserted that Planck had proven BICEP had made a mistake, BICEP had admitted the mistake, and that this might mean that all the excitement about the detection of gravitational waves was misplaced and all that BICEP had seen was some foreground dust emission contaminating their maps. (Since then there has been a strong public denial of this by the BICEP team.)

Now, with the greatest respect to Resonaances, which is an excellent particle physics blog, this is really a non-issue, and certainly not worth offending lots of people for (see for instance Martin Bucher's comment here). I really do not see what substantial information these rumours have provided us with that was not already known in March, and therefore why we should alter assessments of the data  made at that time.

Let me explain a bit more. One of the important limitations of the BICEP2 experiment is that it essentially measured the sky at only one frequency (150 GHz) — the data from BICEP1, which was at 100 GHz, was not good enough to see a signal, and the data from the Keck Array at 100 GHz has not yet been analysed. When you only have one frequency it is much harder to rule out the possibility that the "signal" seen is not due to primordial gravitational waves at all but due to intervening dust or other contamination from our own Galaxy.

The way that BICEP addressed this difficulty was to use a set of different models for the dust distribution in that part of the sky, and to show that all of them predict that the possible level of dust contamination is an order of magnitude too small to account for the signal that they see. Now, some of these models may not be correct. In fact none of them are likely to be exactly right, because they may be based on old and likely less accurate measurements of the dust distribution or rely on a bit of extrapolation, wishful thinking, whatever. But the point is that they all roughly agree about the order of magnitude of dust contamination. This does not mean that we know there is or isn't any foreground contamination; this is merely a plausibility argument from BICEP (that is supported by and supports some other plausibility arguments in the paper).

Now the "new" rumour is based on the fact that it turns out that one of the dust models was based on BICEP's interpretation of preliminary Planck data, and that this data was not officially sanctioned but digitally extracted from a pdf of a slide shown at a talk somewhere. This is not exactly news, since the slide in question is in fact referenced in the BICEP paper. What's new is that now somebody unnamed is suggesting that the slide was in fact misinterpreted, and therefore this one dust model is more wrong than we thought, though we already accepted it was probably somewhat wrong. This is not the same as proving that the BICEP signal has been definitively shown to be caused by dust contamination! In fact I don't see how it changes the current picture we have at all. Ultimately the only way we can be sure about whether the observed signal is truly primordial or due to dust is to have measurements that combine several different frequencies. For that we have to wait a bit for other experiments — and that's the same as we were saying in March.

It's worth noting that when BICEP quote their result in terms of the tensor-to-scalar ratio r, the headline number $r=0.2$ assumes that there is literally zero foreground contamination. This was always an unrealistic assumption, but that hasn't stopped some 300 theorists from writing papers on the arXiv that take the number as face value and use it to rule out or support their favourite theories. The foreground uncertainty means that while we can be reasonably confident that the gravitational wave signal does exist (see here), model comparisons that strongly depend on the precise value of r are probably going to need some revision in the future.

So what new information have we gained since March? Well, Planck released some more data, this time a map of the polarized dust emission close to the Galactic plane.

 The polarization fraction at 353 GHz observed by Planck. From arXiv:1405.0871.

Since these maps do not include the part of the sky that BICEP looked at (which is mostly in the grey region at the bottom), they don't tell us a huge amount about whether that part of the sky is or is not contaminated by polarized dust emission! Some people have speculated that this is something to do with the rivalry between Planck and BICEP, which is a bit over-the-top. Instead the reason is more scientific: the mask excludes areas where the error in determining the polarisation fraction is high, or the overall dust signal itself is too small. So the fact that the BICEP patch is in the masked region indicates that the dust emission does not dominate the total emission there, at least at 353 GHz (dust emission increases with frequency). This means there is not a whole lot of dust showing up in the BICEP region — if anything, this is good news! But even this interpretation should be treated with caution: dust doesn't contribute too much to the total intensity in that region, but it may well still contribute a large fraction of whatever B-mode polarization is seen. Based on my understanding and things I have learned from conversations with colleagues, I don't think Planck is going to be sensitive enough to make definitive statements about the dust in that specific region of the sky.

Another interesting paper that has come out since March has been this one, which claims evidence for some contamination in the CMB arising from the "radio loops" of our Galaxy. It also has the great benefit of being an actual scientific paper rather than a rumour on somebody's blog. (Full disclaimer: one of the authors of this paper was my PhD advisor, and another is a friend who was a fellow student when I was at Oxford.)

The radio loops are believed to be due to ejected material from past supernovae explosions; the idea is that if this dust contains ferrimagnetic molecules or iron, it would contribute polarized emission that might be mistaken for true CMB when it is in fact more local. What this paper argues is that does appear to be some evidence that one of the CMB maps produced by the WMAP satellite (which operated before Planck) does show some correlation between map temperature and the position of one of these radio loops ("Loop I"). In particular, synchrotron emission from Loop I appears to be correlated with the temperature in the WMAP Internal Linear Combination (or ILC) map. I'm not going to comment on the strength of the statistical evidence for this claim; doubtless someone more expert than I will thoroughly check the paper before it is published. For the time being let us treat it as proven.

The relevance of this to BICEP is somewhat intricate, and proceeds like this: given our physical understanding of how the radio loops formed, it seems likely that they produce both synchrotron and dust emission which follow the same pattern on the sky. Therefore perhaps the correlation of the synchrotron emission from Loop I with the ILC map is because both are correlated with dust emission from the loop. If the correlation is because of dust emission, this might be polarized because of the postulated ferrimagnetic molecules etc., leading to a correlation between the WMAP polarization and Loop I. And if Loop I is contaminating the WMAP ILC map, it is perhaps plausible that a different radio loop, called the "New Loop", is also contaminating other CMB maps, in particular those of BICEP. Whereas Loop I doesn't get very close to the BICEP region, the New Loop goes right through the centre of it (see the figure below), so it is possible that there is some polarized contamination appearing in the BICEP data because of the New Loop. At any rate, the foreground dust models that BICEP used didn't account for any radio loops, so likely underestimate the true contamination.

 Position of some Galactic radio loops and the BICEP window. "Loop I" is large one in the upper centre, that only skims the BICEP window; the "New Loop" is the one in the lower centre that passes through the centre of it. Figure from Philipp Mertsch.

So far so good, but this is quite a long chain of reasoning and it doesn't prove that it is actually dust contamination that accounts for any part of the BICEP observation. Instead it makes a plausible argument that it might be important; further investigation is required.

At the end of the day then, we are left in pretty much the same position we were in back in March. The BICEP result is exciting, but because it is only at one frequency, it cannot rule out foreground contamination. Other observations at other frequencies are required to confirm whether the signal is indeed cosmological. One scenario is that Planck, operating on the whole sky at many frequencies but with a lower sensitivity than BICEP, confirms a gravitational wave signal, in which case pop the champagne corks and prepare for Stockholm. The other scenario is that Planck can't confirm a detection, but also can't definitively say that BICEP's detection was due to foregrounds (this is still reasonably likely!), in which case we wait for other very sensitive ground-based telescopes pointed at that same region of sky but operating at different frequencies to confirm whether or not dust foregrounds are actually important in that region, and if so, how much they change the inferred value of r.

Until then I would say ignore the rumours.