Sunday, November 23, 2014

A 'spooky alignment' of quasars, or just hype?

In the news this week we've had a story on the alignment of quasar spins with large-scale structure, based on this paper by Hutsemekers et al. The paper was accompanied by this press release from the European Southern Observatory, which was then reproduced in various forms in a number of blogs and news outlets, almost all of which stress the 'spooky' or 'mysterious' nature of the claimed alignment 'over billions of light years'.

At least one of these blogs (the one at The Daily Galaxy) explicitly claims that the alignment of these quasar spins is a challenge for the cosmological principle, which is the assumption of large-scale statistical homogeneity and isotropy of the Universe, on which all of modern cosmology is based. This claim is not contained in the press release, but originates from a statement in the paper itself, where the authors say
The existence of correlations in quasar axes over such extreme scales would constitute a serious anomaly for the cosmological principle.
I'm afraid that this claim is completely unsupported by any of the actual results contained within the paper, and is therefore one of those annoying examples of scientific hype. In this post I will try to explain why.

I have actually covered much of this ground before — in a blog post here, but more importantly in a paper published in Monthly Notices last year — and I must admit I am a little surprised at having to repeat these points (especially since my paper is cited by Hutsemekers et al.). Nevertheless, in what follows I shall try not to sound too grumpy.

The immediate story started with a paper by Roger Clowes and collaborators, who claimed to have detected the 'largest structure' in the Universe (dubbed the 'Huge-LQG') in the distribution of quasars, and also claimed that this structure violated the cosmological principle. My paper last year was a response to this, and made the following points:

  1. the detection of a single large structure has essentially no relevance to the question of whether the Universe is statistically homogeneous and isotropic;
  2. the quasar sample within which the Huge-LQG was identified is statistically homogeneous, and approaches homogeneity at the scale we expect theoretically, thus providing an explicit demonstration of point 1;
  3. the definition of 'structure' by which the Huge-LQG counts as a structure is so loose that by using it we would find equally vast 'structures' even in completely random distributions of points which (by construction!) contain no correlations and therefore no structure whatsoever; and 
  4. therefore the classification of the Huge-LQG set of quasars as a 'structure' is essentially empty of meaning.

Quasar structures don't violate homogeneity

Since I am already repeating myself, let me elaborate a little more on points 1 and 2. Our Universe is not exactly homogeneous. The fact that you exist (more generally, the fact that stars, galaxies and clusters of galaxies exist) is sufficient proof of this, so it would be a very poor advertisement for cosmology indeed if it were all founded on the assumption of exact homogeneity. Luckily it isn't. In fact our theories could be said to predict the existence of structure in the potential $\Phi$ on all scales (that's what a scale-invariant power spectrum from inflation means!), and even the galaxy-galaxy correlation function only goes asymptotically to zero at large scales.

Instead we have the assumption of statistical homogeneity and isotropy, which means that we assume that when looked at on large enough scales, different regions of the Universe are on average the same. Clearly, since this is a statement about averages, it can only be tested statistically by looking at large numbers of different regions, not by finding one particular example of a 'structure'. In fact there is a well-established procedure for checking the statistical homogeneity of the distribution of a set of points (the positions of galaxies or quasars, in this case), which involves measuring its fractal dimension and checking the scale above which this is equal to 3. I've described the procedure before, here and here, and Peter Coles describes a bit of the history of it here.

The bottom line is that, as I showed last year, the quasar distribution in question is statistically homogeneous above scales of at most $\sim130h^{-1}$Mpc. There is therefore no 'structure' you can find in this data which could violate the cosmological principle. End of story.

Scaled number counts in spheres as a measure of the fractal dimension of the quasar distribution. On scales where this number approaches 1, the distribution is statistically homogeneous. From arXiv:1306.1700.
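
To make the homogeneity test concrete, here is a minimal sketch of scaled counts-in-spheres. It is not the analysis from the paper: the point set is a toy sample of uncorrelated points in a periodic box, and the function names, box size and number of centres are all assumptions chosen for illustration. For such a sample the scaled counts should sit close to 1 and the fractal dimension close to 3 on all scales.

```python
import math
import random


def poisson_points(n, box=1.0, seed=1):
    """Uncorrelated (Poisson) points in a cubic box: homogeneous by construction."""
    rng = random.Random(seed)
    return [tuple(rng.uniform(0.0, box) for _ in range(3)) for _ in range(n)]


def scaled_counts_in_spheres(points, radii, box=1.0, n_centres=200, seed=0):
    """Mean number of neighbours within each radius, divided by the number
    expected for a homogeneous distribution of the same mean density.
    Uses periodic boundaries, so every radius must be below box / 2."""
    rng = random.Random(seed)
    n = len(points)
    centres = rng.sample(range(n), min(n_centres, n))
    totals = [0] * len(radii)
    for i in centres:
        for j in range(n):
            if i == j:
                continue
            # minimum-image separation in the periodic box
            d2 = 0.0
            for a, b in zip(points[i], points[j]):
                delta = abs(a - b)
                delta = min(delta, box - delta)
                d2 += delta * delta
            d = math.sqrt(d2)
            for k, r in enumerate(radii):
                if d < r:
                    totals[k] += 1
    density = (n - 1) / box**3  # neighbours exclude the centre itself
    return [
        totals[k] / len(centres) / (density * 4.0 / 3.0 * math.pi * r**3)
        for k, r in enumerate(radii)
    ]


def fractal_dimension(r1, r2, scaled1, scaled2):
    """Correlation dimension between two radii: N(<r) ~ r**D_2, and the
    scaled counts already divide out the r**3 of a homogeneous sample."""
    return 3.0 + math.log(scaled2 / scaled1) / math.log(r2 / r1)
```

Calling `scaled_counts_in_spheres(poisson_points(2000), [0.1, 0.2, 0.4])` gives the toy equivalent of the curve in the figure. A real survey would also need a treatment of the survey mask and selection function, which this sketch ignores.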

Structures and probability

Of course, there are many different ways of being statistically homogeneous. It is perfectly possible that within a statistically homogeneous distribution one could find a particular structure or feature whose existence in our specific cosmological model (which is one of many possible models satisfying the cosmological principle) is either very unlikely or impossible. This would then be a problem for that cosmological model despite not having any wider implications for the cosmological principle. But to prove this requires some serious analysis, which should include a proper treatment of probabilities — you can't just say "this structure is big, so it must be anomalous."

In particular, any serious analysis of probabilities must take into account how a 'structure' is defined. Given infinitely many possible choices of definition, and a very large Universe in which to search, the probability of finding some 'structure' that extends over billions of light years is practically unity. In fact the definition used for the Huge-LQG would be likely to throw up equally vast 'structures' even if quasar positions were not at all correlated with each other (and we know they must be at least somewhat correlated, because of gravity). So it really isn't a very useful definition at all.
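
The point about loose definitions can be illustrated with a toy experiment. The sketch below applies a simple friends-of-friends linking rule to completely uncorrelated points. The linking rule is used here as a stand-in assumption for a loose structure definition (the details of the Huge-LQG algorithm are not spelled out in this post), and all names and parameter values are hypothetical.

```python
import math


def friends_of_friends(points, linking_length):
    """Group points by linking every pair closer than linking_length.
    Returns a list of groups, each a list of point indices."""
    parent = list(range(len(points)))

    def find(i):
        # follow parent pointers to the group root, shortening the path
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) < linking_length:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(len(points)):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())


def largest_extent(points, groups):
    """Longest separation between any two members of the largest group."""
    biggest = max(groups, key=len)
    return max(math.dist(points[a], points[b]) for a in biggest for b in biggest)
```

Running this on a few hundred uniformly random points in a unit box, with the linking length set to a sizeable fraction of the mean inter-point separation, lets you see how far the largest 'structure' extends even though the points contain no correlations at all. Increasing the linking length can only merge groups, never split them.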

'Spooky' alignments

This brings us to the current paper by Hutsemekers et al. The starting assumption of this paper is that the Huge-LQG is a real structure which is somehow distinguished from its surroundings. This assumption is manifest in the decision that the authors make to try to measure the polarization of light from only those quasars that are classified as part of the Huge-LQG rather than a more general sample of quasars. This classic case of circular reasoning is the first flaw in the logic, but let's put it to one side for a minute.

The press release then tells us that the scientists
found that the rotation axes of the central supermassive black holes in a sample of quasars are parallel to each other over distances of billions of light years
and that the spins of the central black holes are aligned along the filaments of large-scale structure in which they reside.

I find this statement extremely problematic. Here is a figure from the paper in question, showing the sky positions of the 93 quasars in question, along with the polarization orientations for the 19 which are used in the actual analysis:

Quasar positions (black dots) and polarization alignments (red lines). From arXiv:1409.6098.

Do you see the alignment? No, me neither. In fact, looking at the distribution of angles in panel b, I would say that looks very much like a sample drawn from a perfectly uniform distribution.

So what is the claim actually based on? Well, for a start one has to split up the (arbitrarily defined) 'structure' into several (even more arbitrarily defined) 'sub-structures'. Each of these sub-structures then defines a different reference angle on the sky:

Chopping the data to suit the argument (Figure 4 of arXiv:1409.6098). On what basis are sub-structures 1 and 2 defined as separate from each other?

And now one has to measure the angles between the quasar polarization direction and the reference direction of the particular sub-structure, and the direction perpendicular to the reference direction, and choose the smaller of the two. In other words, rather than prove that quasars are aligned parallel to each other over distances extending over billions of light years (the claim in the press release), what Hutsemekers et al. are actually doing is attempting to show that given arbitrary choices of some smaller sub-structures and reference directions, quasars in different sub-structures are typically aligned either parallel to or perpendicular to this direction. This is a much less exacting standard.
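
As a small sketch of the angle bookkeeping just described (the function names are my own, and angles are assumed to be position angles in degrees), note that polarization directions are axes rather than arrows, so an angle and the same angle plus 180 degrees are equivalent:

```python
def acute_angle(pol_deg, ref_deg):
    """Acute angle (0 to 90 degrees) between two axes; polarization angles
    are axial, so theta and theta + 180 describe the same orientation."""
    d = abs(pol_deg - ref_deg) % 180.0
    return min(d, 180.0 - d)


def folded_angle(pol_deg, ref_deg):
    """Smaller of the angles to the reference direction and to its
    perpendicular, which always lies between 0 and 45 degrees."""
    d = acute_angle(pol_deg, ref_deg)
    return min(d, 90.0 - d)
```

For example, `folded_angle(80, 0)` and `folded_angle(10, 0)` both return 10, because an axis 80 degrees from the reference is 10 degrees from its perpendicular. For orientations drawn uniformly at random the folded angle is uniform between 0 and 45 degrees, so that is the null distribution any claimed signal has to beat.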

Even this claim is not particularly well supported by the evidence. That is, looking at the distribution of angles, I am really not at all convinced that this shows evidence for a bimodal distribution with peaks at 0 and 90 degrees:

Distribution of angles purportedly showing two distinct peaks at 0 and 90. Figure 5 of arXiv:1409.6098.

So in summary I think the statistical evidence of alignment of quasar spins is already pretty weak. I don't see any analysis in the paper dealing with the effects of a different arbitrary choice of sub-structures, nor do I see any error analysis (the error in measuring the polarization direction of a quasar can be as large as 10 degrees!). And I haven't even dealt with the fact that the polarization data is used for only 19 quasars out of the full 93 — in other words, for the majority of quasars in the sample the central black hole spins are aligned along some other, undetermined, direction such that we can't measure the polarization.
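
The missing check on arbitrary choices matters because every extra way of chopping up the data is another chance of a fluke. A rough sketch of the size of the effect, under the simplifying assumption that the different analysis choices behave like independent trials (in practice they are correlated, so this is only indicative):

```python
def global_p_value(local_p, n_trials):
    """Chance that at least one of n_trials independent analysis choices
    yields a p-value as small as local_p under the null hypothesis."""
    return 1.0 - (1.0 - local_p) ** n_trials
```

With a hypothetical local p-value of 0.01 and ten equally reasonable ways of defining the sub-structures, `global_p_value(0.01, 10)` is roughly 0.1, which is far less striking than the headline number.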

Extraordinary claims require extraordinary evidence

Now, it's worth repeating that we've already seen that in fact the space distribution of quasars is statistically homogeneous in accordance with the cosmological principle. That simple test has been done, the cosmological principle survives. So if you've got some more nuanced claim of an anomaly, I think the onus is on you not only to describe the measurement you made, but also say what exactly is anomalous about it. What is the theoretical prediction we should compare it to? Which model is being rejected (or otherwise) by the new data?

So, for instance, if quasar spins in sub-structures are indeed aligned either parallel or perpendicular to each other (and I still remain to be convinced that they are), is this really something 'spooky', or would we expect some degree of alignment in the standard $\Lambda$CDM model?

Such an analysis has not been presented, but even if it had, it's worth bearing in mind the principle that extraordinary claims require extraordinary evidence. I'm afraid throwing out a p-value of about 1% simply doesn't cut it. Not only is that actually not an enormously impressive number (especially given all the other things I mentioned above), but such a frequentist statistic also doesn't take account of all our prior knowledge.

Other people have banged this drum at length before, but the point is easily summarized: the p-value tells us the probability of getting data at least as extreme as this given the model, but doesn't tell us the probability of the model being correct despite the new data appearing to contradict it. This is the question we really wish to answer. To do this requires a Bayesian analysis, in which one must account for the prior belief in the model, which is the result of confidence built up from all other experimental results that agree with it. We have an incredible amount of observational evidence in favour of our current model, which would probably not be consistent with a model in which gigantic structures could exist (I say 'probably' because no such model actually exists at present).
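
A toy version of the Bayesian update makes the role of the prior explicit. Everything numerical here is an illustrative assumption rather than a result: in particular a p-value is not the same thing as a likelihood, and the alternative to the model is treated as a single well-defined hypothesis, which (as noted above) does not actually exist at present.

```python
def posterior_probability(prior, likelihood_if_true, likelihood_if_false):
    """Bayes' theorem for a model that is either true or false:
    P(M|D) = P(D|M) P(M) / [P(D|M) P(M) + P(D|not M) P(not M)]."""
    evidence = likelihood_if_true * prior + likelihood_if_false * (1.0 - prior)
    return likelihood_if_true * prior / evidence
```

Suppose the prior probability of the standard model is 0.999, the data have probability 0.01 under it, and probability 1 under the alternative. Then `posterior_probability(0.999, 0.01, 1.0)` is still about 0.91: data that look like a 1% fluke dent a well-supported model but do not overturn it.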

So my prior in favour of $\Lambda$CDM is pretty high — 19 quasars and an analysis so full of holes are not going to change that so quickly.

Monday, September 22, 2014

Biting the dust

Sorry about the obvious pun in the title. Today's important announcement is of course the long-awaited Planck verdict on the level at which the BICEP2 "discovery" of primordial gravitational waves had been contaminated by foreground dust. That verdict does not look good for BICEP.

(Incidentally, back in July I reported a Planck source as saying this paper would be ready in "two or three weeks". Clearly that was far too optimistic. But interestingly many members of the Planck team themselves were confidently expecting today's paper to appear about 10 days ago, and the rumour is that the current version has been "toned down" a little, perhaps accounting for some of the additional delay. Despite that it's still pretty devastating.)

Let me attempt to summarize the new results. Some important points are made right in the abstract, where we read:
"... even in the faintest dust-emitting regions there are no "clean" windows in the sky where primordial CMB B-mode polarization measurements could be made without subtraction of foreground emission"
and that
"This level [of the dust power in the BICEP2 window, over the multipole range of the primordial recombination bump] is the same magnitude as reported by BICEP2 ..."
(my emphasis). Although
"the present uncertainties are large and will be reduced through an ongoing, joint analysis of the Planck and BICEP2 data sets,"
from where I am looking unfortunately it now does not look as if there is a realistic chance that what BICEP2 reported was anything more than a very precise measurement of dust.

The Planck paper is pretty thorough, and actually quite interesting in its own right. They make use of the fact that Planck observes the sky at many frequencies to study the properties of dust-induced polarization. Whereas BICEP2 was limited to a single frequency channel at 150 GHz, the Planck HFI instrument has 4 polarization-sensitive frequency channels, of which the most useful is at 353 GHz. Previous Planck results have already shown that dust emission behaves sort of like a (modified) blackbody spectrum at a temperature of 19.6 Kelvin. Since this is a significantly higher temperature than the CMB temperature of 2.73 K, dust emission dominates at higher frequencies, which means that the 353 GHz channel essentially sees only dust and nothing else. This makes it perfect for the task at hand, since in this particular situation roles are reversed and it is the dust that is the signal and the primordial CMB is noise!
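
To see why the 353 GHz channel is dust-dominated, here is a minimal sketch comparing the spectral shapes. The temperatures are the ones quoted above; the emissivity index `beta = 1.6` is an assumed illustrative value for the "modification" of the blackbody, and the comparison is in intensity units only, so it ignores the unit conversions a real CMB analysis would apply.

```python
import math

H_OVER_K = 0.0479924  # Planck constant / Boltzmann constant, in K per GHz


def blackbody_shape(nu_ghz, temp_k):
    """Planck spectrum B_nu(T) up to a constant factor."""
    x = H_OVER_K * nu_ghz / temp_k
    return nu_ghz**3 / math.expm1(x)


def dust_shape(nu_ghz, temp_k=19.6, beta=1.6):
    """Modified blackbody: nu**beta times B_nu(T). beta is an assumed value."""
    return nu_ghz**beta * blackbody_shape(nu_ghz, temp_k)


# how much brighter each component is at 353 GHz than at 150 GHz
dust_ratio = dust_shape(353.0) / dust_shape(150.0)
cmb_ratio = blackbody_shape(353.0, 2.73) / blackbody_shape(150.0, 2.73)
```

Under these assumptions the dust spectrum is still rising steeply between 150 and 353 GHz (`dust_ratio` is well above 1), while a 2.73 K blackbody is already past its peak and falling (`cmb_ratio` is below 1).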

The analysis proceeds in a number of steps. First, they study the power spectra of the two polarization modes (EE and BB) in several different large regions in the sky:

The different large sky regions studied are shown as increments of red, orange, yellow, green and two different shades of blue. The darkest blue region is always excluded. Figure from arXiv:1409.5738.

In all these different regions, both power spectra $C_\ell^{EE}$ and $C_\ell^{BB}$ are proportional to $\ell^{\alpha}$, consistent with a value of $\alpha=-2.42\pm0.02$. Fixing $\alpha$ to this value, the amplitude of the power spectra in the different large regions then shows a characteristic dependence on the mean intensity of the dust emission — i.e. regions with more dust overall also show more polarization power — and this purely empirical relationship is characterized by
$$A^{EE,BB}\propto\langle I_{353}\rangle^{1.9},$$
though with a bit of uncertainty in the fit. The amplitudes of the polarization power spectra then also show a dependence on frequency from 353 GHz down to 100 GHz which matches previous Planck results (the dependence is something close to a blackbody spectrum at 19.6 K, but with a specific modification).
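
The two empirical scalings above can be combined into a small sketch. Only the exponents come from the text; the pivot multipole, reference intensity and reference amplitude are arbitrary placeholders with no physical meaning, so only ratios of the returned values are meaningful.

```python
ALPHA = -2.42         # power-law index of C_ell quoted in the text
INTENSITY_EXP = 1.9   # exponent of the amplitude vs mean intensity relation


def dust_power(ell, mean_intensity, ell_ref=80.0, intensity_ref=1.0, amp_ref=1.0):
    """C_ell = A * (ell / ell_ref)**ALPHA with A scaling as <I_353>**1.9.
    ell_ref, intensity_ref and amp_ref are arbitrary illustrative pivots."""
    amplitude = amp_ref * (mean_intensity / intensity_ref) ** INTENSITY_EXP
    return amplitude * (ell / ell_ref) ** ALPHA
```

For instance, a region with twice the mean dust intensity has about 3.7 times the polarization power at every multipole, and doubling the multipole reduces the power by roughly a factor of five.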

It then turns out that if the sky is split into very many much smaller regions close to the poles rather than the 6 large ones above, the same results continue to hold on average, though obviously there is some scatter introduced by the fact that dust in different bits of the sky behaves differently. So this allows the Planck team to take the measured dust intensity in any one of these smaller regions and extrapolate down to see what the contribution to the BB power would be if measured at the BICEP2 frequency of 150 GHz. The result looks like this:

The level of dust contamination across the sky in measurements of the primordial B-mode signal. Blue is good, red is bad. The BICEP2 window is the black outline on the right.

This really sucks for BICEP2, who chose their particular patch of the sky precisely because, according to estimates of the 1990s and early 2000s, it was supposed to have very little dust. Planck is now saying that isn't true, and that there is a better region just a little further south. Even that better region isn't perfect, of course, but it may be clean enough to see a primordial GW signal of $r\sim 0.1$ to $0.2$ — if such a signal exists, and if we're lucky and/or figure out cleverer ways of subtracting the dust foreground.

The problem with the BICEP2 region is that Planck's estimate of the dust contribution there looks like this:

Planck's estimate of the dust contribution to the BB power spectrum at 150 GHz and in the BICEP2 sky window. The first bin is the one that's most relevant. The black line is the contribution primordial GW with $r=0.2$ would make, if they existed.

So it appears that in the BICEP2 window, in the $\ell$ region where primordial gravitational waves produce a measurable BB signal (and BICEP2 has measured something), dust is expected to produce the same amplitude of signal as does a primordial signal with $r=0.2$. In fact, even accounting for the uncertainties in the Planck analysis (the extent of the pink error bars on the plot) it is clear that (a) dust will be contributing significantly to the BICEP2 measurement, and (b) it's pretty likely that only dust is contributing.

Planck avoid explicitly saying that BICEP2 haven't seen anything but dust. This is because they haven't directly measured the dust contribution in that window and at 150 GHz. Rather what's shown in the plot above is based on a number of little steps in the chain of inference:
  1. generally, the BB polarization amplitude is dependent on the average total dust intensity in a region;
  2. the relationship between these two doesn't vary too much across the sky;
  3. generally, the frequency dependence of the amplitude shows a certain behaviour;
  4. and again this doesn't appear to vary too much across the sky;
  5. Planck have measured the average dust intensity in the BICEP2 window, and this gives the value shown in the plot above when extrapolated to 150 GHz;
  6. and the BICEP2 window doesn't appear to be a special outlier region on the sky that would wildly deviate from these average relationships;
  7. so, the dust amplitude calculated is probably correct.
Update: See the correction in the comments — the Planck paper actually does better than this. That is to say, they present one analysis that relies on all steps 1-7, but in addition they also measure the BB amplitude directly at 353 GHz and extrapolate that down to 150 GHz relying only on steps 3 and 4. The headline result is the one based on the second method, which actually gets a lower number for the dust amplitude. 

So they leave open the small possibility that despite having been unlucky in the original choice of the BICEP2 window, we've somehow ultimately got very lucky indeed and nevertheless measured a true primordial gravitational wave signal. 

Time will tell if this is true ... but the sensible betting has now got to be that it is not.

Incidentally, I have just learned that in two days' time I will be presenting a 30 minute lecture to a group of graduate students about this result. The lecture is not supposed to be very detailed, but I'm also not very much of an expert on this. So if you spot any errors or omissions above, please do let me know through the comments box!

Monday, August 25, 2014

A Supervoid cannot explain the Cold Spot

In my last post, I mentioned the claim that the Cold Spot in the cosmic microwave background is caused by a very large void — a "supervoid" — lying between us and the last scattering surface, distorting our vision of the CMB, and I promised to say a bit more about it soon. Well, my colleagues (Mikko, Shaun and Syksy) and I have just written a paper about this idea which came out on the arXiv last week, and in this post I'll try to describe the main ideas in it.

First, a little bit of background. When we look at sky maps of the CMB such as those produced by WMAP or Planck, obviously they're littered with very many hot and cold spots on angular scales of about one degree, and a few larger apparent "structures" that are discernible to the naked eye or human imagination. However, as I've blogged about before, the human imagination is an extremely poor guide to deciding whether a particular feature we see on the sky is real, or important: for instance, Stephen Hawking's initials are quite easy to see in the WMAP CMB maps, but this doesn't mean that Stephen Hawking secretly created the universe.

So to discover whether any particular unusual features are actually significant or not we need a well-defined statistical procedure for evaluating them. The statistical procedure used to find the Cold Spot involved filtering the CMB map with a special wavelet (a spherical Mexican hat wavelet, or SMHW), of a particular width (in this case $6^\circ$), and taking the pixel direction with the coldest filtered temperature as the direction of the Cold Spot. Because of the nature of the wavelet used, this ensures that the Cold Spot is actually a reasonably sizable spot on the sky, as you can see in the image below:

The Cold Spot in the CMB sky. Image credit: WMAP/NASA.

Well, so we've found a cold spot. To elevate it to the status of "Cold Spot" in capitals and worry about how to explain it, we first need to quantify how unusual it is. Obviously it is unusual compared to other spots on our observed CMB, but this is true by construction and not very informative. Instead the usual procedure quite rightly compares the properties of the cold spots found in random Gaussian maps using exactly the same SMHW technique to the properties of the Cold Spot in our CMB. It is this procedure which results in the conclusion that our Cold Spot is statistically significant at roughly the "3-sigma level", i.e. only about 1 in every 1000 random maps has a coldest spot that is as "cold" as* our Cold Spot.** (The reason why I'm putting scare quotes around everything should become clear soon!)
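
The logic of that comparison can be sketched with a deliberately crude toy: one-dimensional white-noise "maps" filtered with a flat Mexican hat kernel, rather than realistic CMB skies filtered with the SMHW on the sphere. All names, sizes and the kernel width are assumptions for illustration. The key step is that the observed coldest filtered value is compared with the distribution of coldest values in random maps, not with a typical pixel.

```python
import math
import random


def mexican_hat_kernel(width, half_size):
    """Discrete 1D Mexican hat (Ricker) kernel, shifted to sum to zero."""
    raw = [(1.0 - (i / width) ** 2) * math.exp(-0.5 * (i / width) ** 2)
           for i in range(-half_size, half_size + 1)]
    mean = sum(raw) / len(raw)
    return [v - mean for v in raw]


def coldest_filtered_value(sky, kernel):
    """Minimum of the filtered map (periodic boundary conditions)."""
    n, half = len(sky), len(kernel) // 2
    return min(
        sum(k * sky[(p + o - half) % n] for o, k in enumerate(kernel))
        for p in range(n)
    )


def coldest_spot_p_value(observed_min, n_maps=200, n_pix=256, width=4, seed=0):
    """Fraction of random Gaussian maps whose coldest filtered value is at
    least as cold as observed_min; also returns the simulated minima."""
    rng = random.Random(seed)
    kernel = mexican_hat_kernel(width, 4 * width)
    minima = []
    for _ in range(n_maps):
        sky = [rng.gauss(0.0, 1.0) for _ in range(n_pix)]
        minima.append(coldest_filtered_value(sky, kernel))
    p_value = sum(m <= observed_min for m in minima) / n_maps
    return p_value, minima
```

Note that every entry of `minima` is negative: the coldest spot of any map is cold by construction, which is why only its position within this distribution carries any information.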

So there appears to be a need to explain the existence of the Cold Spot using additional new physics of some kind. One such idea is that of the supervoid: a giant region hundreds of millions of light years across which is substantially emptier than the rest of the universe and lies between us and the Cold Spot. The emptiness of this region has a gravitational effect on the CMB photons that pass through it on their way to us, making them look colder (this is called the integrated Sachs-Wolfe or ISW effect), hence the Cold Spot.
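
For reference, the linear ISW temperature shift is usually written as an integral of the time derivative of the gravitational potential along the photon path. Sign and unit conventions vary between papers, so take this as the standard textbook form rather than the exact expression used in any of the papers discussed here:

$$\frac{\Delta T}{T}\bigg|_{\rm ISW} = \frac{2}{c^2}\int \frac{\partial\Phi}{\partial t}\,\mathrm{d}t.$$

A void is a potential hill ($\Phi>0$ relative to the mean), and in a universe with dark energy potentials decay with time, so $\partial\Phi/\partial t<0$ inside the void and photons crossing it arrive slightly colder. The effect vanishes if the potential is constant in time, which is one way of seeing why it is so weak.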

Now this is a nice idea in principle. In practice, unfortunately, it suffers from a problem: the ISW effect is very weak, so to produce an effect capable of "explaining" the Cold Spot the supervoid would need to be truly super — incredibly large and incredibly empty. And no such void has actually been seen in the distribution of galaxies (a previous claim to have seen it turned out to not be backed up by further analysis).

It was therefore quite exciting when in May a group of astronomers, led by Istvan Szapudi of the Institute for Astronomy in Hawaii, announced that they had found evidence for the existence of a large void in the right part of the sky. Even more excitingly, in a separate theoretical paper, Finelli et al. claimed to have modeled the effect of this void on the CMB and proven that it exactly fit the observations, and that therefore the question had been effectively settled: the Cold Spot was caused by a supervoid.

Except ... things aren't quite that simple. For a start, the void they claimed to have found doesn't actually have a large ISW effect — in terms of central temperature, less than one-seventh what would be needed to explain the Cold Spot. So Finelli et al. relied on a rather curious argument: that the second-order effect (in perturbation theory terms) of this void on CMB photons was somehow much larger than the first-order (i.e. ISW) effect. A puzzling inversion of our understanding of perturbation theory, then!

In fact there were a number of other reasons to be a bit suspicious of the claim, among which were that N-body simulations don't show this kind of unusual effect, and that several other larger and deeper voids have already been found that aren't aligned with Cold Spot-like CMB features. In our paper we provide a fuller list of these reasons to be skeptical before diving into the details of the calculation, where one might get lost in the fog of equations.

At the end of the day we were able to make several substantive points about the Cold Spot-as-a-supervoid hypothesis:
  1. Contrary to the claim by Finelli et al., the void that has been found is neither large enough nor deep enough to leave a large effect on the CMB, either through the ISW effect or its second-order counterpart — in simple terms, it is not a super enough supervoid.
  2. In order to explain the Cold Spot one needs to postulate a supervoid that is so large and so deep that the probability of its existence is essentially zero; if such a supervoid did exist it would be more difficult to explain than the Cold Spot currently is!
  3. The possible ISW effect of any kind of void that could reasonably exist in our universe is already sufficiently accounted for in the analysis using random maps that I described above.
  4. There's actually very little need to postulate a supervoid to explain the central temperature of the Cold Spot — the fact that we chose the coldest spot in our CMB maps already does that!
Point number 1 requires a fair bit of effort and a lot of equations to prove (and coincidentally it was also shown in an independent paper by Jim Zibin that appeared just a day before ours), but in the grand scheme of things it is probably not a supremely interesting one. It's nice to know that our perturbation theory intuition is correct after all, of course, but mistakes happen to the best of us, so the fact that one paper on the arXiv contains a mistake somewhere is not tremendously important.

On the other hand, point 2 is actually a fairly broad and important one. It is a result that cosmologists with a good intuition would perhaps have guessed already, but that we are able to quantify in a useful way: to be able to produce even half the temperature effect actually seen in the Cold Spot would require a hypothetical supervoid almost twice as large and twice as empty as the one seen by Szapudi's team, and the odds of such a void existing in our universe would be something like a one-in-a-million or one-in-a-billion (whereas the Cold Spot itself is at most a one-in-a-thousand anomaly in random CMB maps). A supervoid therefore cannot help to explain the Cold Spot.***

Point 3 is again something that many people probably already knew, but equally many seem to have forgotten or ignored, and something that has not (to my knowledge) been stated explicitly in any paper. My particular favourite though is point 4, which I could — with just a tiny bit of poetic licence — reword as the statement that
"the Cold Spot is not unusually cold; if anything, what's odd about it is only that it is surrounded by a hot ring"
I won't try to explain the second part of that statement here, but the details are in our paper (in particular Figure 7, in case you are interested). Instead what I will do is to justify the first part by reproducing Figure 6 of our paper here:

The averaged temperature anisotropy profile at angle $\theta$ from the centre of the Cold Spot (in red),  and the corresponding 1 and $2\sigma$ contours from the coldest spots in 10,000 random CMB maps (blue). Figure from arXiv:1408.4720.

What the blue shaded regions show is the confidence limits on the expected temperature anisotropy $\Delta T$ at angles $\theta$ from the direction of the coldest spots found in random CMB maps using exactly the same SMHW selection procedure. The red line, which is the measured temperature for our actual Cold Spot, never goes outside the $2\sigma$ equivalent confidence region. In particular, at the centre of the Cold Spot the red line is pretty much exactly where we would expect it to be. The Cold Spot is not actually unusually cold.

Just before ending, I thought I'd also mention that Syksy has written about this subject on his own blog (in Finnish only): as I understand it, one of the points he makes is that this form of peer review on the arXiv is actually more efficient than the traditional one that takes place in journals.

Update: You might also want to have a look at Shaun's take on the same topic, which covers the things I left out here ...

* People often compare other properties of the Cold Spot to those in random maps, for instance its kurtosis or other higher-order moments, but for our purposes here the total filtered temperature will suffice.

** Although as Zhang and Huterer pointed out a few years ago, this analysis doesn't account for the particular choice of the SMHW filter or the particular choice of $6^\circ$ width. In other words, it doesn't account for what particle physicists call the "look-elsewhere effect", which means it is actually much less impressive.

*** If we'd actually seen a supervoid which had the required properties, we'd have a proximate cause for the Cold Spot, but also a new and even bigger anomaly that required an explanation. But as we haven't, the point is moot.

Monday, July 14, 2014

Short news items

Over the past two months I have been on a two-week seminar tour of the UK, taken a short holiday, attended a conference in Estonia and spent a week visiting collaborators in Spain. Posting on the blog has unfortunately suffered as a result: my apologies. Here are some items of interest that have appeared in the meantime:
  • The BICEP and Planck teams are to share their data — here's the BBC report of this news. The information I have from Planck sources is that Planck will put out a paper with new data very soon (about a week ago I heard it would be "maybe in two weeks", so let's say two or three weeks from today). This new data will then be shared with the BICEP team, and the two teams will work together to analyse its implications for the BICEP result. From the timescales involved my guess is that what Planck will be making available is a measurement of the polarised dust foreground in the BICEP sky region, and the joint publication will involve cross-correlating this map with the B-mode map measured by BICEP. A significant cross-correlation would indicate that most (or all) of the signal BICEP detected was due to dust.
  • What Planck will not be releasing in the next couple of weeks is their own measurement of the polarization of the CMB, in particular their own estimate of the value of $r$. The timetable for this release is still October: this is a deadline imposed by the fact that ESA requires Planck to release the data by December, but another major ESA mission (I forget which) is due to be launched in November and ESA don't like scheduling "competing" press conferences in the same month because there's only so much science news Joe Public can absorb at a time. From what I've heard, getting the full polarization data ready for October is a bit of a rush as it is, so it's fairly certain that's not what they're releasing soon.
  • By the way, I think I've recently understood a little better how a collaboration as enormous as Planck manage to remain so disciplined and avoid leaking rumours: it's because most of the people in the collaboration don't know the full details of the results either! That is to say, the collaboration is split into small sub-groups with specified responsibilities, and these sub-groups don't share results with each other. So if you ask a randomly chosen Planck member what the preliminary polarization results are looking like, chances are they don't know any better than you. (Though this may not stop them from saying "Well, I've seen some very interesting plots ..." and smiling enigmatically!)
  • The conference I attended in Estonia was the IAU symposium in honour of the 100th birth anniversary of the great Ya. B. Zel'dovich, on the general topic of large-scale structure and the cosmic web. I'll try to write a little about my general impressions of the conference next time. In the meantime all the talks are available for download from the website here.
  • A science news story you may have seen recently is "Biggest void in universe may explain cosmic cold spot": this is a claim that a recently detected region with a relative deficit of galaxies (the "supervoid") explains the existence of the unusual Cold Spot that has been seen in the CMB, without the need to invoke any unusual new physics. The claim of the explanation is based on this paper. Unfortunately this claim is wrong, and the paper itself has several problems. My collaborators and I are in the process of writing a paper of our own discussing why, and when we are done I will try to explain the issues on here as well. In the meantime, you heard it here first: a supervoid does not explain the Cold Spot!
Update: It has been pointed out to me that last week Julien Lesgourgues gave a talk about Planck and particle physics at the Strong and Electroweak Matter (SEWM14) symposium, in which he also discussed the timeline of forthcoming Planck and BICEP papers. You can see this on page 12 of his talk (pdf) and it is roughly the same as what I wrote above (except that there's a typo in the year — it should be 2014 not 2015!).

Friday, May 16, 2014

BICEP and listening to real experts

First up, I'd like to provide a health warning for all people landing here after following links from Sean Carroll or Peter Woit (thanks for the traffic!): I am not a CMB data analysis expert. What I provide on this blog is my own interpretation and understanding of the news and papers I have read, largely because writing such things out helps me understand them better myself. If it also helps people reading this blog, that's great, and you're welcome. But there are no guarantees that any of what I have written about BICEP is correct! If you truly want the best expert opinions on CMB analysis issues, you should listen to the best CMB experts — in this case, probably people who were in the WMAP collaboration, but are not in either Planck or BICEP. Also, if you want to ask somebody to write a scholarly review article on BICEP (yes, I get strange emails!), please don't ask me.

Having said that, I'm not sure whether any WMAP scientists write blogs, so I can at least try to provide some sources for the non-expert reader to refer to. One thing that you definitely should look at is Raphael Flauger's talk (slides and video) at Princeton yesterday. I think it is this work which was the source of the "is BICEP wrong" rumours first publicly posted at Resonaances, and indeed I see that Resonaances today has a follow-up referring to these very slides.

There are several interesting things to take away from this talk. The first is to do with the question of whether BICEP misinterpreted the preliminary Planck data that they admit having taken from a digitized version of a slide shown at a meeting. Here Flauger essentially simulates the process by digitizing the slide in question (and a few others) himself and analyzing them both with and without the correct CIB subtraction. His conclusion is that with the correct treatment, the dust models appear to predict higher dust contamination than BICEP accounted for; the inference being, I guess, that they didn't subtract the CIB correctly.

How important is this dust contribution? Here there is a fair amount of uncertainty: even if the digitization procedure were foolproof, one of the dust models underestimates the contamination and another one overestimates it. Putting the two together, "foregrounds may be OK if the lower end of the estimates is correct, but are potentially dangerous" (page 40). Flauger tries another method of estimation based on the HI column density, using yet more unofficial Planck "data" taken from digitized slides. This seems to give much the same bottom line.

A key point here is that everybody who isn't privy to the actual Planck data is really just groping in the dark, digitizing other people's slides. Flauger acknowledges this by trying to estimate the effect of the process of converting real data into a gif image, converting that into a pdf as part of a talk, somebody nicking the pdf and converting it back to gif and then back to usable data. As you can imagine, the amount of noise introduced in this version of Chinese Whispers is considerable! So I think the following comment from Lyman Page towards the end of the video (as helpfully transcribed by Eiichiro Komatsu for the Facebook audience!) is perhaps the most relevant:
"This is, this is a really, peculiar situation. In that, the best evidence for this not being a foreground, and the best evidence for foregrounds being a possible contaminant, both come from digitizing maps from power point presentations that were not intended to be used this way by teams just sharing the data. So this is not - we all know, this is not sound methodology. You can't bank on this, you shouldn't. And I may be whining, but if I were an editor I wouldn't allow anything based on this in a journal. Just this particular thing, you know. You just can't, you can't do science by digitizing other people's images."
Until Planck answers (or fails to definitively answer) the question of foregrounds in the BICEP window, or some other experiment confirms the signal, we should bear that in mind.

There are some other issues that remain confusing at the moment: the cross-correlation of dust models with BICEP signal doesn't seem to support the idea that all the signal is spurious (though there are possibly some other complicating factors here), and the frequency evidence — such as it is — from the cross power with BICEP1 also doesn't seem to favour a dust contaminant. But all in all, the BICEP result is currently under a lot of pressure. Having seen this latest evidence, I now think the Resonaances verdict ("until [BICEP convincingly demonstrate that foregrounds are under control], I think their result does not stand") is — at least — a justifiable position.

Footnote: I should also perhaps explain that throughout my physics education I have been taught, and had come to believe, that the types of models of inflation BICEP provided evidence for (those with inflaton field values larger than the Planck scale) were fundamentally unnatural and incomplete, and that the small-field models that BICEP apparently ruled out were much more likely to be true. So perhaps my conscious attempts to compensate for this acknowledged theoretical prejudice could have biased me too far in the opposite direction in some previous posts!

Wednesday, May 14, 2014

New BICEP rumours: nothing to see here

This week there has been a minor kerfuffle about some rumours, originally posted on Adam Falkowski's Resonaances blog, regarding the claimed gravitational wave detection by BICEP. The rumours asserted that Planck had proven BICEP had made a mistake, BICEP had admitted the mistake, and that this might mean that all the excitement about the detection of gravitational waves was misplaced and all that BICEP had seen was some foreground dust emission contaminating their maps. (Since then there has been a strong public denial of this by the BICEP team.)

Now, with the greatest respect to Resonaances, which is an excellent particle physics blog, this is really a non-issue, and certainly not worth offending lots of people for (see for instance Martin Bucher's comment here). I really do not see what substantial information these rumours have provided us with that was not already known in March, and therefore why we should alter assessments of the data made at that time.

Let me explain a bit more. One of the important limitations of the BICEP2 experiment is that it essentially measured the sky at only one frequency (150 GHz) — the data from BICEP1, which was at 100 GHz, was not good enough to see a signal, and the data from the Keck Array at 100 GHz has not yet been analysed. When you only have one frequency it is much harder to rule out the possibility that the "signal" seen is not due to primordial gravitational waves at all but due to intervening dust or other contamination from our own Galaxy.

The way that BICEP addressed this difficulty was to use a set of different models for the dust distribution in that part of the sky, and to show that all of them predict that the possible level of dust contamination is an order of magnitude too small to account for the signal that they see. Now, some of these models may not be correct. In fact none of them are likely to be exactly right, because they may be based on old and likely less accurate measurements of the dust distribution or rely on a bit of extrapolation, wishful thinking, whatever. But the point is that they all roughly agree about the order of magnitude of dust contamination. This does not mean that we know there is or isn't any foreground contamination; this is merely a plausibility argument from BICEP (that is supported by and supports some other plausibility arguments in the paper).

Now the "new" rumour is based on the fact that it turns out that one of the dust models was based on BICEP's interpretation of preliminary Planck data, and that this data was not officially sanctioned but digitally extracted from a pdf of a slide shown at a talk somewhere. This is not exactly news, since the slide in question is in fact referenced in the BICEP paper. What's new is that now somebody unnamed is suggesting that the slide was in fact misinterpreted, and therefore this one dust model is more wrong than we thought, though we already accepted it was probably somewhat wrong. This is not the same as proving that the BICEP signal has been definitively shown to be caused by dust contamination! In fact I don't see how it changes the current picture we have at all. Ultimately the only way we can be sure about whether the observed signal is truly primordial or due to dust is to have measurements that combine several different frequencies. For that we have to wait a bit for other experiments — and that's the same as we were saying in March.

It's worth noting that when BICEP quote their result in terms of the tensor-to-scalar ratio $r$, the headline number $r=0.2$ assumes that there is literally zero foreground contamination. This was always an unrealistic assumption, but that hasn't stopped some 300 theorists from writing papers on the arXiv that take the number at face value and use it to rule out or support their favourite theories. The foreground uncertainty means that while we can be reasonably confident that the gravitational wave signal does exist (see here), model comparisons that strongly depend on the precise value of $r$ are probably going to need some revision in the future.

So what new information have we gained since March? Well, Planck released some more data, this time a map of the polarized dust emission close to the Galactic plane.

The polarization fraction at 353 GHz observed by Planck. From arXiv:1405.0871.

Since these maps do not include the part of the sky that BICEP looked at (which is mostly in the grey region at the bottom), they don't tell us a huge amount about whether that part of the sky is or is not contaminated by polarized dust emission! Some people have speculated that this is something to do with the rivalry between Planck and BICEP, which is a bit over-the-top. Instead the reason is more scientific: the mask excludes areas where the error in determining the polarisation fraction is high, or the overall dust signal itself is too small. So the fact that the BICEP patch is in the masked region indicates that the dust emission does not dominate the total emission there, at least at 353 GHz (dust emission increases with frequency). This means there is not a whole lot of dust showing up in the BICEP region — if anything, this is good news! But even this interpretation should be treated with caution: dust doesn't contribute too much to the total intensity in that region, but it may well still contribute a large fraction of whatever B-mode polarization is seen. Based on my understanding and things I have learned from conversations with colleagues, I don't think Planck is going to be sensitive enough to make definitive statements about the dust in that specific region of the sky.

Another interesting paper that has come out since March has been this one, which claims evidence for some contamination in the CMB arising from the "radio loops" of our Galaxy. It also has the great benefit of being an actual scientific paper rather than a rumour on somebody's blog. (Full disclaimer: one of the authors of this paper was my PhD advisor, and another is a friend who was a fellow student when I was at Oxford.) 

The radio loops are believed to be due to ejected material from past supernova explosions; the idea is that if this dust contains ferrimagnetic molecules or iron, it would contribute polarized emission that might be mistaken for true CMB when it is in fact more local. What this paper argues is that there does appear to be some evidence that one of the CMB maps produced by the WMAP satellite (which operated before Planck) shows some correlation between map temperature and the position of one of these radio loops ("Loop I"). In particular, synchrotron emission from Loop I appears to be correlated with the temperature in the WMAP Internal Linear Combination (or ILC) map. I'm not going to comment on the strength of the statistical evidence for this claim; doubtless someone more expert than I will thoroughly check the paper before it is published. For the time being let us treat it as proven.

The relevance of this to BICEP is somewhat intricate, and proceeds like this: given our physical understanding of how the radio loops formed, it seems likely that they produce both synchrotron and dust emission which follow the same pattern on the sky. Therefore perhaps the correlation of the synchrotron emission from Loop I with the ILC map is because both are correlated with dust emission from the loop. If the correlation is because of dust emission, this might be polarized because of the postulated ferrimagnetic molecules etc., leading to a correlation between the WMAP polarization and Loop I. And if Loop I is contaminating the WMAP ILC map, it is perhaps plausible that a different radio loop, called the "New Loop", is also contaminating other CMB maps, in particular those of BICEP. Whereas Loop I doesn't get very close to the BICEP region, the New Loop goes right through the centre of it (see the figure below), so it is possible that there is some polarized contamination appearing in the BICEP data because of the New Loop. At any rate, the foreground dust models that BICEP used didn't account for any radio loops, so likely underestimate the true contamination.

Position of some Galactic radio loops and the BICEP window. "Loop I" is the large one in the upper centre, which only skims the BICEP window; the "New Loop" is the one in the lower centre that passes through the centre of it. Figure from Philipp Mertsch.

So far so good, but this is quite a long chain of reasoning and it doesn't prove that it is actually dust contamination that accounts for any part of the BICEP observation. Instead it makes a plausible argument that it might be important; further investigation is required.

At the end of the day then, we are left in pretty much the same position we were in back in March. The BICEP result is exciting, but because it is only at one frequency, it cannot rule out foreground contamination. Other observations at other frequencies are required to confirm whether the signal is indeed cosmological. One scenario is that Planck, operating on the whole sky at many frequencies but with a lower sensitivity than BICEP, confirms a gravitational wave signal, in which case pop the champagne corks and prepare for Stockholm. The other scenario is that Planck can't confirm a detection, but also can't definitively say that BICEP's detection was due to foregrounds (this is still reasonably likely!), in which case we wait for other very sensitive ground-based telescopes pointed at that same region of sky but operating at different frequencies to confirm whether or not dust foregrounds are actually important in that region, and if so, how much they change the inferred value of r.

Until then I would say ignore the rumours.

Monday, March 24, 2014

BICEP2: reasons to be sceptical, part 2

This is the second of three posts in which I wanted to lay out the various possible causes of concern regarding the BICEP2 result, and provide my own opinion on how seriously we should take these worries. I arranged these reasons to be sceptical into three categories, based on the questions
  • how certain can we be that BICEP2 observed a real B-mode signal?
  • how certain can we be that this B-mode signal is cosmological in origin, i.e. that it is due to gravitational waves rather than something less exciting?
  • how certain can we be that these gravitational waves were caused by inflation?
The first post dealt with the first of the three questions, this one addresses the second, and a post yet to be written will deal with the third.

How certain can we be that the observed B-mode signal is cosmological? 

Let's take it as given that none of the concerns in the previous post turn out to be important, i.e. that the observed B-mode signal is not an artefact of some hidden systematics in the analysis, leakage or whatever. From my position of knowing a little about data in general, but nothing much about CMB polarization analysis, I guessed that the chances of any such systematic being important were about 1 in 100.

The next question is then whether the signal could be caused by something other than the primordial gravitational waves that we are all so interested in. The most important possible contaminant here is other nearby sources of polarized radiation, particularly dust in our own Galaxy. We don't actually know how much polarized dust or synchrotron emission there might be in the sky maps here, so a lot of what BICEP have done is educated guesswork.

To start with, the region of the sky that BICEP looks at was chosen on the basis of a study by Finkbeiner et al. from 1999, which extrapolated from measurements of dust emission at certain other frequencies to estimate that, at the frequency of relevance to CMB missions such as BICEP, that particular part of the sky would be exceptionally "clean", i.e. with exceptionally low foreground dust emission. Whether this is actually true or not is not yet known for certain, but there exist a number of models of the dust distribution, and most of these models predict that the level of contamination to the B-mode detection from polarized dust emission would be an order of magnitude smaller than the observed signal. Similar model-dependent extrapolation to the observation frequency based on WMAP results suggests that synchrotron contamination is also an order of magnitude too small.

Predictions for foreground contamination for different dust models (the coloured lines at the bottom) versus the actual B-mode signal observed by BICEP2 (black points).

Now one real test of these assumptions will come from Planck, because Planck will soon have the best map of dust in our Galaxy and therefore the best limits on the possible contamination. This is one of the reasons to look forward to Planck's own polarization results, due in about October or November. In the absence of this information, the other thing that we would like to see from BICEP in order to be sure their signal is cosmological is evidence that the signal exists at multiple frequencies (and has the expected frequency dependence).

BICEP do not detect the signal at multiple frequencies. The current experiment, BICEP2, operates at 150 GHz only, and that is where the signal is seen. A previous experiment, BICEP1, did run at 100 GHz as well, but BICEP1 did not have the same sensitivity and could only place an upper limit on the B-mode signal. Data from the Keck Array will eventually also include observations at 100 GHz, but this is not yet available. Until we have confirmation of the signal at different frequencies, most cosmologists will treat the result very carefully.

In the absence of this, we must look at the cross-correlation between B2 and B1. Remember that although B1$\times$B1 did not have the sensitivity to make a detection of non-zero power, B2$\times$B1 can still tell us something useful. If B1 maps were purely noise, or B2 maps were due to dust, we would not expect them to be correlated. If both were due to synchrotron radiation, we would expect them to be strongly correlated. In fact the B2$\times$B1 cross power is non-zero at the $3\sigma$ level or about 99.7% confidence, which is something Peter Coles' sceptical summary ignores. This is indeed evidence that the signal seen at 150 GHz is cosmological.
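To see why a cross power can be informative even when one map is too noisy for its own auto power to be useful, here is a toy sketch. It is not the BICEP analysis: the "maps" are just lists of pixel values sharing a common unit-variance signal, the noise levels are made-up numbers, and everything is done in pixel space rather than in multipoles.

```python
import random

def cross_and_auto(n_pix=20000, noise_1=5.0, noise_2=1.0, seed=1):
    """Toy estimate of cross power and noisy-map auto power.

    Both maps see the same unit-variance 'sky'; each adds its own
    independent Gaussian noise (noise_1 and noise_2 are illustrative).
    """
    rng = random.Random(seed)
    cross = 0.0
    auto_1 = 0.0
    for _ in range(n_pix):
        sky = rng.gauss(0.0, 1.0)            # common signal
        m1 = sky + rng.gauss(0.0, noise_1)   # noisy map (stand-in for B1)
        m2 = sky + rng.gauss(0.0, noise_2)   # cleaner map (stand-in for B2)
        cross += m1 * m2
        auto_1 += m1 * m1
    return cross / n_pix, auto_1 / n_pix

cross, auto_1 = cross_and_auto()
print(f"cross power ~ {cross:.2f} (signal variance is 1)")
print(f"noisy-map auto power ~ {auto_1:.2f} (signal + noise variance)")
```

Because the two noise terms are independent, they average away in the product, so the cross power estimates the common signal alone, while the auto power of the noisy map is dominated by its noise variance.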

Still, some level of cross-correlation could be produced even if both B2 and B1 were only seeing foregrounds. Combining the B2$\times$B1 data with B2$\times$B2 and B1$\times$B1 means that polarized dust or synchrotron emission of unexpected strength are rejected as explanations – though at a not-particularly-exciting significance of about $2.2$–$2.3\sigma$.


It's fair to say, on the basis of models of the distribution of polarized dust and synchrotron emission, that the BICEP2 signal probably isn't due to either of these contaminants. However, we don't yet have confirmation of the detection at multiple frequencies, which is required to judge for sure. At the moment, the frequency-based evidence against foreground contamination is not very strong, but we'd still need some quite unexpected stuff to be going on with the foregrounds to explain the amplitude of the observed signal.

Overall, I'd guess the odds are about 1:100 against foregrounds being the whole story. (This should still be compared with the quoted headline result of 1:300,000,000,000 against $r=0$ assuming no foregrounds at all!)

The chances are much higher – I'd be tempted to say perhaps even better than even money – that foregrounds contribute a part of the observed signal, and that therefore the actual value of the tensor-to-scalar ratio will come down from $r=0.2$, perhaps to as low as $r=0.1$, when Planck checks this result using their better dust mapping.

Friday, March 21, 2014

BICEP2: reasons to be sceptical, part 1

As the dust begins to settle following the amazing announcement of the discovery of gravitational waves by the BICEP2 experiment, physicists around the world are taking stock and scrutinizing the results.

Remember that the claimed detection is enormously significant, in more ways than one. The BICEP team have apparently detected an exceedingly faint B-mode polarization pattern in the CMB, at an order of magnitude better sensitivity than any previous experiment probing the same scales. They have then claimed to have been able to ascribe this B-mode signal unambiguously to cosmological gravitational waves, rather than any astrophysical effects due to intervening dust or other sources of radiation. And finally they have interpreted these results as direct evidence for the theory of inflation, which is really the source of all the excitement, because if true it would pin down the energy scale of inflation at an incredibly high level, with extensive and dramatic consequences for our understanding of high energy particle physics.

However, as all physicists have been saying, with results of this magnitude it is important to be very careful indeed. Speculating who should get the Nobel Prize (or Prizes) for this is still premature. The paper containing the results will of course be subjected to anonymous peer review when it is submitted to a journal, but it has also already faced a rather extraordinary open peer review by social media, with a live group on Facebook, and all sorts of other discussion on blogs, Twitter and the like. (And to the great credit of the scientists on the BICEP team, they have patiently responded to questions and comments on these forums, and the whole process has been carried out very civilly!)

What I wanted to do today is to possibly contribute to that by gathering together all the main points of concern and reasons to be sceptical of the BICEP result. This is partly for my own purposes, since writing things down helps to clarify my thoughts. I will divide these concerns into three main categories, addressing the following questions:

  • how certain can we be that BICEP2 observed a real B-mode signal?
  • how certain can we be that this B-mode signal is cosmological in origin, i.e. that it is due to gravitational waves rather than something less exciting?
  • how certain can we be that these gravitational waves were caused by inflation?

I'll discuss the first category of concerns in part 1 of this post and the next two in parts 2 and 3. I do not claim that any of the concerns I raise here are original; however, any mistakes are definitely mine alone. I'd like to encourage discussion of any of these points via the comments below.

How certain can we be that BICEP2 observed a real B-mode signal?

This is obviously the most basic issue. The general reason for concern here (and this applies to any B-mode detection experiment) is that the experimental pipeline has to be able to decompose the polarization signal seen into two components, the E-mode and the B-mode, and the level of the signal in the B-mode is orders of magnitude smaller than in E. Now, as Peter Coles explains here, the E and B polarization components are in principle orthogonal to each other when the spherical harmonic decomposition can be performed over the whole sky, but this is in practice impossible. BICEP observes only a small portion of the sky, and therefore there is the possibility of "leakage" from E to B when separating out the components. It would not take much leakage to spoil the B-mode observation.

Obviously the BICEP team implemented many tests of the obtained maps to check for such systematics. One of the ways to do this is to cross-correlate the E and B maps: if there is no leakage the cross-correlation should be consistent with zero. Another important test is the jackknife technique, also nicely explained here: you split your data into two equal halves, and subtract the signal found in one half from that in the other; the answer should also be consistent with zero.
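The logic of the jackknife can be sketched with a toy simulation. This is only an illustration under simple assumptions (Gaussian sky, equal and known Gaussian noise in each half, no systematics), not the procedure used in the BICEP paper: the two halves see the same sky, so the half-difference map contains only noise, and its power should match the noise expectation.

```python
import random

def jackknife_null(n_pix=10000, noise=1.0, seed=2):
    """Toy jackknife: split data into two halves observing the same sky.

    Returns (power of the half-sum map,
             excess power of the half-difference map over the
             noise-only expectation noise**2 / 2).
    """
    rng = random.Random(seed)
    sum_power = 0.0
    diff_power = 0.0
    for _ in range(n_pix):
        sky = rng.gauss(0.0, 1.0)              # same sky in both halves
        half_a = sky + rng.gauss(0.0, noise)   # first half of the data
        half_b = sky + rng.gauss(0.0, noise)   # second half of the data
        total = 0.5 * (half_a + half_b)        # signal + averaged noise
        diff = 0.5 * (half_a - half_b)         # signal cancels exactly
        sum_power += total * total
        diff_power += diff * diff
    expected_noise_power = noise ** 2 / 2.0
    return sum_power / n_pix, diff_power / n_pix - expected_noise_power

sum_power, null_excess = jackknife_null()
print(f"half-sum power: {sum_power:.3f}")
print(f"jackknife excess over noise expectation: {null_excess:.3f}")
```

A systematic that affects one half of the data differently from the other would not cancel in the difference, and would show up as a non-zero excess.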

Now one source of concern arises because of a combination of these two tests. The blue points in the following figure show the results of a jackknife test on the BB power:

These points are consistent with zero ... but they are possibly too consistent with zero! The $1\sigma$ error bars of every one of them pass through zero, whereas it would be more natural to expect some more scatter. In fact from the number on the plot you can see that there is only a 1% chance that all 9 blue points should be so close to zero.
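As a rough sanity check on why nine points all within $1\sigma$ of zero looks suspicious, one can do a crude count. This is not the statistic printed on the plot; it assumes, purely for illustration, that the nine bandpowers are independent with Gaussian errors, and just asks how often all of them would land within one error bar of zero.

```python
import math

def prob_all_within(n_points, n_sigma=1.0):
    """Chance that n_points independent Gaussian points all lie
    within n_sigma of their true value (here, zero)."""
    p_single = math.erf(n_sigma / math.sqrt(2.0))  # about 0.68 for 1 sigma
    return p_single ** n_points

p_all_nine = prob_all_within(9)
print(f"P(all 9 within 1 sigma) = {p_all_nine:.3f}")
```

Under these assumptions the answer comes out at roughly 3%, the same ballpark as the figure quoted on the plot, which uses how close the points are and not just whether they lie within their error bars.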

This raises the possibility, pointed out by Hans Kristian Eriksen, that the error bars on the blue points are overestimated. It may then be the case that the error bars on other points in other jackknife tests are also too large. If that were the case then reducing those errors might mean that some of the other jackknife tests now fail, i.e. the points are no longer consistent with zero. As it happens, of the 168 jackknife test results listed in the table in the paper, quite a large number (about 7) of them already "fail" by the stricter standards (2% probability) some other experiments such as QUIET might apply. Obviously some number of tests are always expected to fail, but more than 7 out of 168 starts to look like quite a large number. This then becomes a little worrying.
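To put a number on "some tests are always expected to fail", here is a minimal sketch. It assumes the 168 tests are independent and that each has exactly a 2% chance of failing by chance; neither is strictly true (the jackknife splits reuse the same data, so the tests are correlated), so treat the result as a rough guide only.

```python
import math

def binom_tail(n, p, k_min):
    """P(X >= k_min) for X ~ Binomial(n, p)."""
    return sum(
        math.comb(n, k) * p ** k * (1.0 - p) ** (n - k)
        for k in range(k_min, n + 1)
    )

n_tests = 168       # number of jackknife results in the table
p_fail = 0.02       # assumed chance of a spurious failure per test
expected_failures = n_tests * p_fail
p_seven_or_more = binom_tail(n_tests, p_fail, 7)

print(f"expected failures by chance: {expected_failures:.2f}")
print(f"P(7 or more failures): {p_seven_or_more:.3f}")
```

So about three failures would be expected from chance alone, and seven is on the high side without being wildly improbable under these simplified assumptions.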

On the other hand, this extrapolation may be a little exaggerated, because we are surmising that the error bars might be too large purely on the basis of the one figure above. Clearly if you do a large number of jackknife tests, it becomes less surprising that one of them gives a surprising result, if you see what I mean. Looking through the table for the other BB jackknife results, the particular example from the figure is the only one that stands out as being odd, so it is hard to conclude from this that the error bars are too large. Overall I'm not convinced that there is necessarily a problem here, but it is something that deserves a little more quantitative attention.

The second source of concern that has been highlighted is that the data at large multipole values appear to be doing something odd. Look at the 5th, 6th and 7th black points from the figure above, which are quite a long way from the theoretical expectation. Peter Coles helpfully drew a little blue circle around them:

The worry here is that even if the data appear to be passing jackknife tests for internal consistency and null tests for EB cross power, the fact that these points are so high suggests that there is still some undetected systematic that has crept in somewhere. This hypothesized systematic could account for the measured values of the crucial first four points, which constitute the detection of the gravitational waves.

Similarly, people are worried about the EE power spectrum, which appears to be too high in the $50< \ell<100$ region — again this could be a sign of leakage from temperature into polarization, which could perhaps be contaminating the B-mode maps despite not explicitly showing up in the jackknife consistency checks.

Now, the BICEP response to this is that you shouldn't judge things simply "by eye". The EE excess does not appear to be statistically significant. It's also not incredibly unlikely that the final two of the circled BB data points could simultaneously be as high as they are just due to random chance — they say "their joint significance is $<3\sigma$", which means that the chance is about 1%. (Of course the chance that all three of the circled points could simultaneously be high is smaller than that, and so presumably less than 1% ... )

Another justification some people have been providing (mostly people from outside the BICEP collaboration to be fair, though some from within it as well) is that the preliminary data from the Keck Array, which is a similar instrument to BICEP but with higher sensitivity, appear to show no anomaly in that region. I think this is a somewhat dangerous argument, because the Keck data also don't seem to be quite so high in the region of the crucial first four bandpowers! In any case, the "official" word from BICEP is that any such speculation on the basis of Keck is to be discouraged, because the Keck data are still very preliminary and have not been properly checked.


I'm a little bit worried about the various issues raised here, though overall I would say the odds are in favour of the B-mode detection being secure (this is a different issue to whether this detected signal is due to gravitational waves! More on that in the next post). I would not, however, put those odds at anywhere near 1 in 300,000,000,000 against there being an error, which is the headline significance claimed for the detection of a non-zero tensor-to-scalar ratio ($7\sigma$). If I were forced to quantify my belief, I would say something more like 1 or 2 in 100. That's not particularly secure, but luckily there are follow-up experiments, such as Keck and Planck itself, which should be able to reassure us on that score soon.
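For readers who want to translate between "$n\sigma$" and odds, here is a small sketch. It uses the two-sided Gaussian tail probability, which is one common convention but an assumption here: headline odds quoted in papers are often derived differently (for example from a likelihood ratio), so the numbers need not match exactly.

```python
import math

def two_sided_p(n_sigma):
    """Probability that a Gaussian variable lies more than
    n_sigma standard deviations from its mean, in either direction."""
    return math.erfc(n_sigma / math.sqrt(2.0))

for n in (1, 2, 3, 5, 7):
    p = two_sided_p(n)
    print(f"{n} sigma: p = {p:.3g}, i.e. about 1 in {1.0 / p:.3g}")
```

The point of the comparison in the text is the gulf between a formal significance of this kind, which only accounts for statistical noise, and a subjective estimate like 1 or 2 in 100, which tries to include the chance of an unrecognised systematic.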

A final point: seeing the preliminary Keck data shown in a figure in the paper suggests to me that perhaps the final analysis of Keck data will now not be done "blind". I hope that's not the case; it would be very disturbing indeed if it were.

Monday, March 17, 2014

First Direct Evidence for Cosmic Inflation

That was the title of the BICEP2 presentation today. Gives you some idea about the magnitude of the result, if it holds up: it really is astonishingly exciting.

Unfortunately it was so exciting that we in Helsinki couldn't even access the Harvard server and so couldn't watch any of the webcast at all. It seems the same was true for most other cosmologists around the world. So my comments here are based purely on a preliminary reading of the paper itself, and a distillation of the conversations occurring via Facebook and the like.

Firstly, the headline results: the BICEP team claim to have detected a B-mode signal in the CMB at exceedingly high statistical significance. Their headline claim is

$r=0.2^{+0.07}_{-0.05}$, with $r=0$ disfavoured at $7.0\sigma$

That is frankly astonishing. Here's the likelihood plot:

BICEP2 constraint on the tensor-to-scalar ratio r. 

(All figures are taken from the paper available here.)

The actual measurement of the BB power spectrum looks like this:

The black points are the new measurements, the other coloured points are the previously available best upper limits. The solid red curve is the theoretical expectation from lensing (the relatively boring contribution to BB), the dashed red curve that dies off is the theoretical expectation from a model with inflationary gravitational waves and $r=0.2$, and the other dashed red curve (were they short of colours?!) is the total.

They've also done a pretty good job of eliminating other foreground sources (dust, synchrotron emission etc.) as possible explanations for the signal seen, which means it is much more likely that the signal is actually due to primordial gravitational waves from inflation. In doing this, it helps that the signal they see is actually as large as it is, since there's less chance of confusing it with these foregrounds (which are much smaller).  [Update: I'm not an expert here, apparently some others were less convinced about the removal of foregrounds. Not sure why though – I'd have thought other systematic errors were far more likely to be a problem than foregrounds.]

So far so good. In fact — and I really can't stress this enough — this is an extraordinary, wonderful, unexpected result and huge congratulations to the BICEP team for achieving it. It will mean a lot of happy theorists as well, because we finally have something new to try to explain!

However, it is very important that as a community we remain skeptical, particularly so when, as here, the result is one that we would so desperately love to be true. Given that, I'm going to list a series of things that are potentially worrying/things to think about/things I don't understand. (Some of these are not things I noticed myself, but were points raised by Dave Spergel, Scott Dodelson and other experts at the ongoing live discussion on Facebook.) Doubtless these are questions the BICEP team will have thought about themselves; perhaps they already have all the answers and will tell us about them in due course. As I said, no one I know was able to watch the webcast live.

  • In the BB-spectrum plot above, the data seem to be showing a significant excess above expectations for multipoles about $\ell\sim200-350$. What's going on with that?
  • This is particularly noticeable in another figure (Fig. 9) in the paper:
  • From the above figure, preliminary results of the cross-correlation with Keck don't show the excess at high-$\ell$ (a reason to believe it might go away), but the same cross-correlation also shows less power at lower $\ell$ (which is a bit confusing).
  • At lower values of $\ell$ the EE power spectrum also shows an excess (Fig. 7):
  • All the above points put together suggest that perhaps there is some leakage in the polarisation maps coming from the temperature anisotropy. A large part of the analysis work is of course concerned with accounting for and correcting any such leakage; the question is to what extent independent experts will be satisfied that these methods worked.
  • Although the headline figure is $r=0.2$, they rather confusingly later say that when the best possible dust model is used for foreground subtraction, this becomes $r=0.16^{+0.06}_{-0.05}$. But if this is the best possible dust model, why is this not the quoted headline number? Is this related somehow to the power excess at $\ell\sim200-350$?
  • If $r$ is as large as they have measured, why was it not seen by Planck? Actually this is a fairly complicated question: the point being that if the tensor amplitude is so large, it should make a non-negligible contribution to the temperature power spectrum as well, which would have affected Planck's results. Planck had a constraint $r<0.11$, but this specifically assumed that the primordial power spectrum had a power-law form with no running (sorry about the technical jargon, unfortunately not enough time to explain here today). So BICEP suggest one way around this tension is to simply introduce a running, but it seems (though this bit was not entirely clear to me from the paper) that you need a fairly large value of the running for this explanation to fly. And if you've got a large running then you have to worry about why not a running of the running, a running of the running of the running and so on ad infinitum. In fact, how do we know that the power-law expansion form of the $P(k)$ is the correct way to go at all?
  • Besides, are there viable inflationary models that predict both large $r$ as well as large running (or non-power-law form of the primordial power)? Given the vast array of inflationary models, the answer to this question is almost certainly yes, but people may consider some other explanations more worthwhile ...
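
For readers unfamiliar with the jargon in the Planck point above, the expansion in question is usually written as follows. This is the standard parametrisation rather than anything taken from the BICEP paper, and the symbols $A_s$, $k_*$, $n_s$, $\alpha_s$ and $\beta_s$ are introduced here purely for illustration:

$$P(k) = A_s \left(\frac{k}{k_*}\right)^{n_s - 1 + \frac{1}{2}\alpha_s \ln(k/k_*) + \frac{1}{6}\beta_s \ln^2(k/k_*) + \dots}$$

Here $n_s$ is the spectral index, $\alpha_s = dn_s/d\ln k$ is the running and $\beta_s = d^2n_s/d\ln k^2$ is the running of the running, all evaluated at a pivot scale $k_*$. A pure power law means $\alpha_s=\beta_s=\dots=0$. Allowing a large $\alpha_s$ while setting every higher term to zero is a truncation of this series, which is why it invites the question of where to stop.
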
Phew. There are probably lots of other things to think about, but that's about all I can manage today. It's been a very exciting day!

Saturday, March 15, 2014

B-modes, rumours, and inflation

Update: The announcement will definitely be about a major discovery by BICEP2, meaning it can only really be about a B-mode signal. You can follow the webcast starting at 10:45 am EDT (14:45 GMT) for scientists, or 12:00 pm EDT (16:00 GMT) for the general public and news organisations.

The big news in cosmology circles at the minute is the rumour that the "major discovery" due to be announced at a press conference on Monday the 17th is in fact a claimed detection of the B-mode signal in the CMB by the BICEP2 experiment.

Now, I'm not particularly well placed to comment on this rumour, since all the information I have comes at second- or third-hand, via people who have heard something from someone, people who think they heard something from someone, or people who are simply unashamedly speculating. (Perhaps this is a function of being on the wrong side of the Atlantic: although the BICEP2 experiment is based at the South Pole, the only non-North-American university participating in the collaboration is Cardiff University in Wales. Even worse, I'm not on Twitter.) In any case, by reading this, this, this and this, you will be starting with essentially the same information as me.

But having got that health warning out of the way, let's pretend that the rumours are entirely accurate and that on Monday we will have an announcement of a detection of a significant B-mode signal. What would this mean for cosmology?

Firstly, the B-mode signal refers to a particular polarisation of the CMB (for a short and somewhat technical introduction, see here; for a slightly longer one, see here). This polarisation can arise in various ways, one of which is the polarisation induced in the CMB by gravitational lensing, as the CMB photons travel through the inhomogeneous Universe on their way from the last scattering surface to us. There have been a few experiments, such as POLARBEAR, which have already claimed a detection of this lensing contribution to the B-mode signal (though in this particular case after skim-reading the paper I was a little underwhelmed by the claim).

Now, detecting a lensing B-mode would be cool, but significantly less exciting than detecting a primordial B-mode. This is because whereas the lensing signal comes from late-time physics that is quite well understood, a primordial signal would be evidence of primordial tensor fluctuations or primordial gravitational waves. And this is cool because inflation provides a possible way to produce primordial gravitational waves – therefore their detection could be a major piece of evidence in favour of inflation.

The contributions to the B-mode signal coming from gravitational waves and lensing are differentiated on the basis of the multipoles (essentially the length scale) at which they are important. Figure from Hu and Dodelson 2002.

People often say that detection of this tensor signal would be a "smoking gun" for inflation; something that would be very welcome, because although inflation has proved to be an attractive and fertile paradigm for cosmology, there is still a bit of a lack of direct, incontrovertible evidence in favour of it. Coupled with certain unresolved theoretical issues it faces, this lack of a smoking gun meant that arguments for or against inflation were threatening to degenerate into what you might call "multiverse territory", definitely an unhealthy place to be.

It may be worth introducing a note of caution about this "smoking gun" though. Although inflation is a possible source of primordial gravitational waves, it is not the only one. Artefacts of possible phase transitions in the early universe, known as cosmic defects, can also produce a spectrum of gravitational waves – and what's more, this spectrum can be exactly scale-invariant, just as that from inflation. I don't know a huge amount about this field, so I am not sure whether the amplitude of the perturbations which could be produced by these cosmic defects could be sufficiently large, nor – if it is – whether there are any other features which could help distinguish this scenario from inflation if the rumours turn out to be true. Perhaps better informed people could comment below.

Suppose we put that issue to one side though, and assume that not only has a significant tensor signal been detected, we have also been able to prove that it could not be due to anything other than inflation. The rumour is that the detection corresponds to a value for the tensor-to-scalar ratio r of about 0.2. What are the implications of this for the different inflation models?

Planck limits on various inflationary models.
Not all models of inflation do result in tensor modes large enough to be observed in the CMB, so an observation of a large r would rule out a large class of these models. Generally speaking, the understanding is that models in which the inflaton field $\phi$ takes large values (i.e., values larger than the Planck mass $M_P$) are the ones which could produce observably large r, whereas the so-called "small-field models" where $\phi\ll M_P$ usually predict tiny values of r which could never be observed. (A note for non-experts: irrespective of the field value, the energy scale in both small-field and large-field models is always much less than the Planck scale.) Therefore, at a stroke, all small-field inflation models would be ruled out. Many people regard these as the better-motivated models of inflation, with in some respects fewer theoretical issues than the large-field models, so this would be quite significant.

There are two small caveats to this statement: firstly, it isn't strictly necessary for $\phi$ itself to be larger than $M_P$ to generate a large r, only that the change in $\phi$ be large. So models in which the inflaton field winds around a cylinder, in effect travelling a large distance without actually getting anywhere, can still give large r (hat-tip to Shaun for that phrasing). Also, it is not even strictly true that the change in $\phi$ must be large: if some other rather specific conditions (including the temporary breakdown of the slow-roll approximation) are met, this requirement can be avoided too and even small-field models can produce enough gravitational waves. This was something pointed out by a paper I wrote with Shaun Hotchkiss and Anupam Mazumdar in 2011, though other people had similar ideas at about the same time. Such rather forced small-field models would have other specific features though, so could be distinguished by other measurements.
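
The first caveat is essentially the Lyth bound. As a rough sketch, assuming single-field slow-roll inflation and taking $M_P$ to be the reduced Planck mass (an assumption, since the convention is not specified above), the field excursion over $N$ e-folds is

$$\frac{|\Delta\phi|}{M_P} = \int_0^{N} \sqrt{\frac{r}{8}}\,\mathrm{d}N' .$$

If $r\simeq0.2$ stayed roughly constant over $N\sim50$ to $60$ e-folds, this would give $|\Delta\phi|\approx8$ to $9.5\,M_P$. The second caveat corresponds to letting $r$ vary strongly with $N'$, so that it is large only on the scales probed by the CMB and the integral remains small.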

One of the more interesting consequences of a detection of large r (aside from the earth-shattering importance of a confirmation of inflation itself) would be that the Higgs inflation model – which has been steadily gaining in popularity given the results from the LHC and Planck, and has begun to be regarded by many as the most plausible mechanism by which inflation could have occurred – would be disfavoured. In the plot above, the Higgs inflation prediction is shown by the orange points at the bottom centre of the figure. So a BICEP2 detection of $r\sim0.2$ as suggested by the rumours would be pretty serious for this model.

On the other hand, a BICEP2 detection of $r\sim0.2$ would also appear to be at odds with the results from the Planck and WMAP satellites. Which probably goes to show that there is not much point believing every rumour ...

We will find out on Monday!