This book is a wonderful overview of the ingredients that go into making a great customer experience. Far from being a bland guidebook for a practitioner, it really gets you thinking deeply about why people love certain products, brands and organisations over others.

The book is organised into easy-to-digest, short chapters, with a summary of the key points at the end of each chapter. The case studies throughout the book tie the concepts back to real world customer experiences that I am sure many readers have been exposed to themselves.

The chapter on the role of emotion in customer experiences is particularly relevant, highlighting why building a product that just works is not good enough. In the author’s own words, “nobody falls in love with something that’s only average”. This reminded me of remarks made by Airbnb’s founder Brian Chesky, who was quoting Paul Graham, during the How to Start a Startup lecture series: “it’s better to have 100 people that love you, than to have a million people that just sort-of like you.”

Note: this is the first book review I have ever written. It is quite short (I am still finding my feet) but I hope to write more of these reviews in the future. Stay tuned.

The voting system is perhaps the most important aspect of any collaborative policy-building tool. For the past year, I have been working on an experimental side-project, code-named Commons, to build an online debate and voting platform to enable people to collaboratively build political policy. The aim is to build an online, distributed alternative to centralised parliaments, so that the general public can truly have political influence.

Voting in Commons works differently to the strategies used in actual parliaments. The traditional methods of voting are not appropriate when trying to build a debating environment for, theoretically, millions of people. This post details the voting system that I built for Commons.

This article does get technical fairly quickly, and it might be necessary to have some grasp of statistics to understand the latter parts. I have tried to keep the first half, which details the key ideas behind the Commons voting system, accessible to everybody.

The problem with a traditional voting system

To create new political policy or modify existing political policy in Commons, a user must create a “motion”, which is a proposal for the changes that should take place to the main corpus of political policy. Once a motion is published, the community can begin voting. If the community decides to accept a motion, it will be incorporated into the main corpus permanently, or at least until somebody else changes it with a future motion.

The first thing that must be realised is that, for each motion, it would be impractical to ask millions of people their opinion on it. We must use some method of sampling, which leads to two obvious voting systems. We could either stop voting after a definite number of votes have been counted, say 1,000 votes, or we could stop after a definite number of hours have passed, say 24 hours. Then we would simply total up the number of votes for, and the number of votes against, a motion and use the majority vote to either accept or reject it.

But there is an obvious problem with this traditional voting system. For a motion that everybody agrees should be accepted, we want that motion to be accepted as quickly as possible. And if everybody agrees it should be rejected, we similarly want to reject it quickly. If a motion divides opinion, we would like to keep it open for longer to enable more debate and to gather more votes.

Suppose we have a motion that runs for 24 hours and, in the first hour, it gathers 900 votes against it and 100 votes in support of it. You can intuitively infer that the consensus is against the motion. In this traditional voting system, however, the motion stays open for another 23 hours. It becomes a distraction from the other motions that are vying to be accepted.

Alternative voting system for Commons

The Commons voting system addresses this problem by accepting or rejecting a motion as soon as it possibly can. It removes the motions that become a distraction, encouraging debate and voting on the outstanding issues, where it actually matters.

The system looks at the difference between votes over time, and has two thresholds representing the vote difference where it is safe to assume that a motion is accepted or rejected. As soon as the vote difference hits one of these thresholds, the motion is closed automatically and, if it was accepted, it gets incorporated into the main corpus.

The motion page contains a chart which shows whether the motion is on course to be accepted or rejected. The chart looks rather simple, and is purposely devoid of any scale. The horizontal line is the ‘0 line’ which, for now, you can think of as meaning the vote is split 50-50, i.e. there is no reason to believe the motion should be accepted or rejected. The upper dashed line is the ‘acceptance threshold’. If voting reaches this point, the motion is automatically accepted. Similarly, the lower dashed line is the ‘rejection threshold’, and a motion is rejected if it reaches this threshold.

The x-axis represents time, so we can see how the vote changed through the lifetime of a motion. The width of the entire chart shows the approximate voting duration which, in this voting system, is not fixed and so must be estimated.

So what does the y-axis represent? Your instinct might be that it represents the number of votes against subtracted from the number of votes for a motion. It is safe to think this way in Commons, although it might not be true in the general scheme.

The statistics behind the voting system

Voting in Commons is driven by a statistical tool called a ‘sequential probability ratio test’, or SPRT for short. SPRT is used to test a null hypothesis against an alternative hypothesis. The basic idea is to observe votes one at a time until it is ‘safe’ to stop and then to accept one of the hypotheses.

In Commons, we are interested in what the entire population thinks of a motion. We call the true vote p. This is what the outcome would be if we asked every member of Commons to vote. For example, if p = 0.70, and there were a million people on Commons, then 700,000 people would vote ‘Yes’ and 300,000 would vote ‘No’. However, we cannot ask a million people to vote, so the actual value of p is unobservable — we can only estimate it.

Our basic aim is to test whether more people are in favour than are against a motion. We restrict ourselves to two hypotheses for the values that p could take:

H_{0}: p = p_{0} = 0.49

H_{1}: p = p_{1} = 0.51

The null hypothesis H_{0} basically says that most people are against a motion, while the alternative hypothesis H_{1} says that most people are for it.

You may have noticed that there is a gap between the values of p. If the true value lies in this range, we do not care whether the motion is accepted or rejected. More specifically, we only want to have statistical control when the value of p is less than 0.49 or greater than 0.51. To use SPRT, it is necessary to have this gap; if the true value is p = 0.50 exactly, should the motion be accepted?

I am not going to go into the precise details of SPRT (see Wald’s paper below if you are interested), but it basically works according to the following rules:

Let S_{n} be the cumulative sum of the log-likelihood ratio after n votes. The likelihood ratio is the probability of the data being observed under H_{1}: p = 0.51, divided by the probability of the data being observed under H_{0}: p = 0.49.

If S_{n} > t_{A}, then the motion is accepted.

If S_{n} < t_{R}, then the motion is rejected.

Otherwise, observe the next datum n+1 and repeat.

The t_{A} and t_{R} are the acceptance and rejection thresholds, respectively. We take t_{A} = log((1 – β) / α) and t_{R} = log(β / (1 – α)). These thresholds achieve approximate type I and type II error rates of α and β. See Wald’s paper to see where these values come from.

We assume that votes come from a Bernoulli(p) distribution, iid, where p is the true unknown value. The cumulative-sum log-likelihood ratio can be written as:

S_{n} = N_{A}(n) log(p_{1} / p_{0}) + N_{R}(n) log((1 – p_{1}) / (1 – p_{0}))

where N_{A}(n) is the number of accepts after n votes, and N_{R}(n) is the number of rejects after n votes.
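The rules above can be sketched in a few lines of Python. This is a minimal illustration and not the Commons implementation: the function name `sprt`, the boolean encoding of votes, and the error rates α = β = 0.05 are all assumptions made for the example, since the post does not state the values used.

```python
import math


def sprt(votes, p0=0.49, p1=0.51, alpha=0.05, beta=0.05):
    """Run a sequential probability ratio test over an iterable of votes.

    votes: iterable of booleans, True for an accept vote, False for a reject.
    alpha, beta: target type I and type II error rates (assumed values).
    Returns (decision, n), where decision is "accepted", "rejected" or "open"
    and n is the number of votes observed.
    """
    t_accept = math.log((1 - beta) / alpha)    # t_A
    t_reject = math.log(beta / (1 - alpha))    # t_R
    s_accept = math.log(p1 / p0)               # added to S_n by one accept vote
    s_reject = math.log((1 - p1) / (1 - p0))   # added to S_n by one reject vote

    s_n = 0.0
    n = 0
    for vote in votes:
        n += 1
        s_n += s_accept if vote else s_reject
        if s_n > t_accept:
            return "accepted", n
        if s_n < t_reject:
            return "rejected", n
    return "open", n  # ran out of votes before reaching either threshold


print(sprt([True] * 100))         # ('accepted', 74)
print(sprt([True, False] * 50))   # ('open', 100)
```

With these assumed error rates, a unanimous motion closes after 74 votes, while a perfectly alternating vote never reaches either threshold.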

That brings us back to the y-axis of the chart above. The y-axis scale is actually the cumulative sum of the log-likelihood ratio, and the blue line is S_{n} plotted over time. The two dashed lines are the thresholds t_{A} and t_{R}.

Remember that I said you could think of the y-axis as the number of votes against a motion subtracted from the number of votes for it. You can see why this is true, by looking at the individual contributions to the cumulative sum for an acceptance vote and a rejection vote:

s_{A} = log(p_{1} / p_{0}) = log(0.51 / 0.49)

s_{R} = log((1 – p_{1}) / (1 – p_{0})) = log(0.49 / 0.51)

So s_{A} = – s_{R}. This means the y-axis is simply a scaled version of N_{A}(n) – N_{R}(n). This is also why the zero line does indeed represent a ‘50-50’ vote split. However, in a more general scheme, we could choose p_{0} and p_{1} differently, and this property might not hold.
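To make the scaling concrete, the short sketch below checks numerically that the two per-vote contributions cancel, and works out how large a net lead is needed to cross the acceptance threshold. The error rates α = β = 0.05 and the helper name `cumulative_llr` are assumptions for illustration only.

```python
import math

p0, p1 = 0.49, 0.51
alpha = beta = 0.05  # assumed error rates; not stated in the post

s_accept = math.log(p1 / p0)               # s_A
s_reject = math.log((1 - p1) / (1 - p0))   # s_R
t_accept = math.log((1 - beta) / alpha)    # t_A


def cumulative_llr(n_accept, n_reject):
    """S_n for a given number of accept and reject votes."""
    return n_accept * s_accept + n_reject * s_reject


# S_n depends only on the net lead N_A - N_R, scaled by s_A.
print(math.isclose(cumulative_llr(900, 100), 800 * s_accept))  # True

# Smallest net lead of accept votes for which S_n > t_A.
net_votes_needed = math.floor(t_accept / s_accept) + 1
print(net_votes_needed)  # 74
```

Under these assumptions, the earlier example of 900 votes against and 100 votes for would have been rejected as soon as the ‘against’ side led by 74 votes, rather than after a full 24 hours.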

So far, we have dealt with every aspect of the chart, except for one: the width of the chart. The width of the entire chart is a useful feature as it gives you some understanding of how close a motion is to completion. But, since we do not close motions after a set amount of time, this width also has a statistical basis. After some derivation, it turns out you can estimate the remaining votes according to the following formula:

where C = p̂ log(p_{1} / p_{0}) + (1 – p̂) log((1 – p_{1}) / (1 – p_{0})), a^{*} = t_{A} – S_{n}, b^{*} = t_{R} – S_{n} and S_{n} is known. The only thing needed is a plug-in estimate p̂ of p.
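One way to sanity-check an estimate of this kind is by simulation: start from the current S_{n}, draw future votes from a Bernoulli(p̂) distribution, and average the number of votes needed to reach either threshold. The sketch below is a Monte Carlo stand-in and not the closed-form estimate used in Commons; the function name, the error rates α = β = 0.05, the run count and the `max_votes` cap are all assumptions.

```python
import math
import random


def simulate_remaining_votes(s_n, p_hat, p0=0.49, p1=0.51,
                             alpha=0.05, beta=0.05,
                             runs=500, max_votes=100_000, seed=0):
    """Monte Carlo estimate of the expected number of votes still needed.

    s_n: current cumulative log-likelihood ratio.
    p_hat: plug-in estimate of the true proportion of accept votes.
    """
    rng = random.Random(seed)
    t_accept = math.log((1 - beta) / alpha)
    t_reject = math.log(beta / (1 - alpha))
    s_accept = math.log(p1 / p0)
    s_reject = math.log((1 - p1) / (1 - p0))

    total = 0
    for _ in range(runs):
        s, n = s_n, 0
        # Keep drawing votes while S_n sits between the two thresholds.
        while t_reject <= s <= t_accept and n < max_votes:
            s += s_accept if rng.random() < p_hat else s_reject
            n += 1
        total += n
    return total / runs


# If every future vote is an accept, the answer is deterministic.
print(simulate_remaining_votes(0.0, 1.0, runs=10))  # 74.0
```

When p̂ is close to 0.5 the walk drifts very slowly and each run can take thousands of simulated votes, which is why the sketch caps the run length.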

Dynamic and relevant

The Commons voting system does not rely on a fixed number of votes or a fixed duration of voting. This system makes the Commons platform more dynamic, and makes the content more relevant to users. I see this system as a critical feature of Commons, which will hopefully lead to an engaging experience for users. There is still a lot of work to be done before the application is ready for release, but stay tuned to this blog for further updates.

[1] A. Wald. Sequential tests of statistical hypotheses. The Annals of Mathematical Statistics, 16(2):117–186, June 1945. doi: 10.1214/aoms/1177731118. URL http://dx.doi.org/10.1214/aoms/1177731118.

I have recently had to choose an investment portfolio for my SIPP (self-invested pension plan), and one of the first problems I had was deciding on the bond allocation that I should use.

When it comes to investing for retirement, the generally accepted rule-of-thumb is that you should hold “y minus your age” percent of your portfolio in equities, with y generally taken to be 100, 110 or 120, depending firstly on your level of risk aversion, and secondly on your age.

Since life expectancy is expected (!) to be higher for today’s younger people, somebody like me would tend to follow the “110 minus age” rule. At age 23, I would hold 87% equities and 13% bonds. By the time I was 65, I would have 45% equities and 55% bonds. That seems fairly reasonable.
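The rule is simple enough to write as a one-line function. The sketch below is only an illustration; the function name `equity_percent` is hypothetical, and clamping the result to the 0–100 range is an assumption added so that extreme ages do not produce nonsensical allocations.

```python
def equity_percent(age, y=110):
    """Equity share, in percent, under the "y minus age" rule of thumb."""
    return max(0, min(100, y - age))


for age in (23, 65):
    equities = equity_percent(age)
    print(age, equities, 100 - equities)
# 23 87 13
# 65 45 55
```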

But of course, this is simply a rule of thumb. It might be a fair approximation of a good strategy for most people, but the fact that it is a linear function of age unsettles me.

Bond allocation of target date funds

I did some quick research on target-date retirement funds and looked at their holdings in equities. These funds did not decay into bonds linearly with age. Instead, the decay into bonds seemed to accelerate as a person approached a certain age, and then decelerated beyond that age.

Fitting the logistic function to this data gave a very similar result to the target-date funds’ actual holdings. I am not saying this is the actual approach that target date funds use, but it seems to be a rather good estimate. Rather than “110 minus age”, the following formula might be a more sensible approach:

It doesn’t quite have the same ring to it. Plugging some numbers in: at age 23, I would have 94.5% equities and 5.5% bonds, which is more aggressive than the rule of thumb; at age 65, I would have 41.6% equities and 58.4% bonds, which is very similar to the rule.
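As a rough illustration of what such a curve looks like, a two-parameter logistic can be back-solved from the two figures quoted above (94.5% equities at age 23 and 41.6% at age 65). The functional form, with asymptotes fixed at 100% and 0% equities, is an assumption made for this sketch and is not necessarily the fitted formula; the function names are hypothetical.

```python
import math


def fit_logistic(age_a, equity_a, age_b, equity_b):
    """Solve equity(age) = 100 / (1 + exp(k * (age - midpoint))) through two points.

    Returns (k, midpoint), where midpoint is the age at which the
    portfolio is split 50-50 between equities and bonds.
    """
    def log_odds_of_bonds(equity_pct):
        f = equity_pct / 100
        return math.log((1 - f) / f)

    z_a = log_odds_of_bonds(equity_a)
    z_b = log_odds_of_bonds(equity_b)
    k = (z_b - z_a) / (age_b - age_a)
    midpoint = age_b - z_b / k
    return k, midpoint


def logistic_equity_percent(age, k, midpoint):
    """Equity share, in percent, under the assumed logistic curve."""
    return 100 / (1 + math.exp(k * (age - midpoint)))


k, midpoint = fit_logistic(23, 94.5, 65, 41.6)
print(round(k, 4), round(midpoint, 1))
print(round(logistic_equity_percent(50, k, midpoint), 1))
```

Under these assumptions the curve has its 50-50 point at roughly age 60 and gives about 69% equities at age 50.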

In the following plot, you can see this equation compared against the “110 minus age” and “120 minus age” rules:

From visual inspection, you can see this strategy is more aggressive than the 110 rule and, up to age ~ 48, is slightly more aggressive than the 120 rule. Notice, however, how quickly it decays into bonds from about age 50 onwards. In the 15 years between age 50 and 65, it goes from 70% equities to 40%, whereas the 120 rule reduces to only 55%.

I am not going to comment on whether this strategy is better than either rule (this article is not financial advice), but I think it is worth questioning the suitability of rules of thumb in investment decision making.

This page serves as a reminder to myself of the books that I have read. I try to read as much as I can (typically on the train or if I have spare time in an evening), although I would like to read much more than I do. The books I read extend over a range of topics, from software development to business to marketing and beyond, but I do not read fiction. The list below is not comprehensive.

I started compilation of this list in May, 2015, and will periodically update this blog post with new books as and when I read them.

Carnegie, D. (2006). How to win friends and influence people. London, Vermilion.

Shore, J., & Warden, S. (2008). The art of agile development. Beijing, O’Reilly Media, Inc.

Lopp, M. (2012). Managing humans: biting and humorous tales of a software engineering manager. New York, Apress.

Schmidt, E., & Rosenberg, J. (2014). How Google Works. London, John Murray.

DeMarco, T., & Lister, T. R. (2013). Peopleware: productive projects and teams.

Truss, L. (2007). Eats, shoots & leaves: the zero tolerance approach to punctuation. London, Profile.

Morris, P. (1994). Introduction to game theory. New York, Springer.

Sutherland, W. A. (2009). Introduction to metric and topological spaces. Oxford, Oxford University Press.

Buckley, G., & Desai, S. (2011). What you need to know about economics.

Hawkins, J., & Blakeslee, S. (2004). On intelligence. New York, Times Books.

Greenwald, G. (2014). No place to hide: Edward Snowden, the NSA and the surveillance state. London, Hamish Hamilton Penguin Books.

Cialdini, R. B. (2007). Influence: the psychology of persuasion. New York, Collins.

Rogers, S. (2013). Facts are sacred: the power of data. London, Faber and Faber.

Patterson, S. (2012). Dark pools: the rise of artificially intelligent trading machines and the looming threat to Wall Street. London, Random House Business.

Harford, T. (2013). The Undercover Economist.

Wilkinson, R. G., & Pickett, K. (2010). The spirit level: why equality is better for everyone. London, Penguin.

Tufte, E. R. (2001). The visual display of quantitative information. Cheshire, Conn, Graphics Press.

Berger, J. (2014). Contagious: How to build word of mouth in the digital age. London, Simon & Schuster.

Thiel, P. A., & Masters, B. G. (2015). Zero to one: notes on startups, or how to build the future. London, Virgin Books.

Norman, D. A. (2013). The design of everyday things. Cambridge, Mass, The MIT Press.

Norman, D. A. (2011). Living with complexity. Cambridge, Mass, MIT Press.

Eyal, N., & Hoover, R. (2014). Hooked: how to build habit-forming products. London, Penguin.

Taleb, N. N. (2008). The black swan: the impact of the highly improbable. London, Penguin.