There’s been some “trendy” talk lately about eliminating the neutral score in research, to force listeners to take more of a stance on how they feel about a song. The problem is, it’s bad science and not backed up by any facts. We’re all about applying art to your music selection process, but that art must come after good scientific research. Otherwise, your data is flawed from the outset.
At Songscore, we follow accredited research techniques and employ the Likert Scale. We decided when we built Songscore that we wanted to use the most scientific, widely accepted (by scientific standards) approach we could, so that results could be consistent over time and prove the most valuable.
Here is a brief guide as to why we do things the way we do:
5 point scale and the neutral option
We employ the Likert Scale. You’ll find many scientists have argued for more than 5 points, but never for fewer points or for eliminating the neutral option. Think of it this way: what if your audience really is neutral on a song? You have now forced them to choose a dishonest answer, and your data is flawed before you even finish the survey. Besides, it turns out the neutral option has very little effect on scores when proper samples are achieved. Remember, there is only one neutral response, compared with two negative and two positive choices. When you are looking at averages and the mean score (that red/green bar color in Songscore), the presence of neutral scores does not matter as much as it may seem.
Relative comparisons are more meaningful: responses to questionnaires aren’t terribly valuable by themselves. You need to compare the scores to something meaningful. In our case, that is the mean score for that group of listeners and the relative position of one song to another. If Song 1 has a 4.01 by itself, that may seem good. But it could be far below the mean, with 10 other songs at 4.55 or higher. So the most important thing is to give listeners the answer choices they need to be honest, and then to compare those answers in a meaningful way to other scores and to the mean score.
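To make the relative comparison concrete, here is a minimal sketch in Python. The song titles, the scores, and the function names are made up for illustration; this is not how Songscore itself computes or displays results.

```python
from statistics import mean

# Hypothetical mean scores (1-5 scale) from one group of listeners.
song_scores = {
    "Song 1": 4.01,
    "Song 2": 4.55,
    "Song 3": 4.60,
    "Song 4": 4.72,
}

def relative_to_mean(scores):
    """Return each song's score minus the group mean, rounded to 2 places."""
    group_mean = mean(scores.values())
    return {title: round(score - group_mean, 2) for title, score in scores.items()}

def rank_songs(scores):
    """Return song titles ordered from highest score to lowest."""
    return sorted(scores, key=scores.get, reverse=True)

deltas = relative_to_mean(song_scores)
ranking = rank_songs(song_scores)

for title in ranking:
    print(f"{title}: {song_scores[title]:.2f} ({deltas[title]:+.2f} vs. group mean)")
```

In this made-up set, Song 1 looks healthy at 4.01 on its own but lands last in the ranking and below the group mean, which is the situation described above.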
Labeling of 5 points
Longwood University cites the findings of Weijters, who found that people are attracted to labels. So, for example, if we labeled only 1 as “dislike” and 5 as “like” and left 2, 3 and 4 unlabeled, people would tend to choose 1 or 5 more often for no reason other than the labels. Thus, all 5 options are always labeled.
Can I change the labels?
Yes, you can! You can change your labels anytime by sending support your list of new labels. However, keep in mind that changing labels can impact scores, so you should a) not compare scores from surveys with the new labels to scores from past surveys and b) keep the new labels consistent over time. Basically, each time you change the labels, you should think of it as starting over in your research.
Why no national average score from all users in a format?
Imagine a contest where Campbell’s sets out to find the best homemade chicken soup recipe in every one of the top 100 cities in America. They gather the recipes and bring them to their test kitchen, where they combine all 100 recipes into one ultimate recipe. Do you think it would taste as good as any one of the individual recipes? Combining test scores from different recruitment methods and other variable data does not give a reliable, productive outcome. One could average 10 random cities’ preferences for a national pizza brand and decide Pizza Hut is the nation’s top pizza chain, but it would not change the fact that Tucson as a city chose Papa John’s by a landslide margin. So Tucson was outnumbered by 9 other cities that tipped Pizza Hut for the national award, but if you were opening a pizza franchise in Tucson, wouldn’t you really only want to know the local result? Even if you are a national network, surveying your own listeners through a consistent recruitment method (your own promotions, etc.) will be more accurate than combining random accounts that are unrelated in every way except for the software used.
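The pizza example can be sketched in a few lines of Python. The vote counts below are invented, and only three cities are shown instead of ten to keep it short; "City B" and "City C" are placeholders.

```python
from collections import Counter

# Hypothetical vote counts per city (invented numbers, for illustration only).
city_votes = {
    "Tucson": {"Papa John's": 80, "Pizza Hut": 20},
    "City B": {"Papa John's": 30, "Pizza Hut": 70},
    "City C": {"Papa John's": 35, "Pizza Hut": 65},
}

def pooled_winner(votes_by_city):
    """Add up every city's votes and return the brand with the most in total."""
    totals = Counter()
    for votes in votes_by_city.values():
        totals.update(votes)  # adds this city's counts to the running totals
    return totals.most_common(1)[0][0]

def local_winner(votes_by_city, city):
    """Return the brand with the most votes in a single city."""
    votes = votes_by_city[city]
    return max(votes, key=votes.get)

national = pooled_winner(city_votes)
tucson = local_winner(city_votes, "Tucson")

print("Pooled winner:", national)
print("Tucson winner:", tucson)
```

With these numbers the pooled result and the local result disagree, and the pooled figure says nothing useful about the one market you actually care about.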
Skipping unfamiliar songs
We all know in radio that our advertisers need a spot schedule built on Optimum Effective Scheduling, or OES. OES says that the average person needs to hear an ad 3 times before noticing it, so the ad should play often enough, given the formula with your cume and AQH, to be heard 3 times by 50% of your audience. Read a more thorough explanation at the link above. If you apply the same idea to music, you quickly realize that your songs need to be heard in the same way before your audience has sampled them enough to have an opinion. At Songscore, we recommend a minimum of 400 spins before you test a song. You can test it earlier if you like, of course, but understand that your results may not be complete or tell the whole story. Scientifically, people do not develop a lasting opinion of a song without significant exposure (multiple plays). Have you ever judged a song as unfit for airplay only to end up adding it a few weeks later? The same is true for listener perceptions, and listeners are not paying even half the attention you are.
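For reference, here is a rough sketch of the OES arithmetic in Python. It assumes the commonly cited form of the formula, in which turnover (cume divided by AQH) is multiplied by 3.29 to get spots per week. That constant is not stated in this article, so check it against your own source before relying on it. The station figures and the function name are made up.

```python
import math

def oes_spots_per_week(cume, aqh, constant=3.29):
    """Estimate weekly spots under the commonly cited OES formula.

    turnover = cume / AQH; spots per week = turnover * constant.
    The 3.29 default is an assumption taken from the usual statement
    of OES, not from this article.
    """
    if aqh <= 0:
        raise ValueError("AQH must be greater than zero")
    turnover = cume / aqh
    return math.ceil(turnover * constant)

# Hypothetical station: weekly cume of 100,000 and AQH of 10,000.
spots = oes_spots_per_week(cume=100_000, aqh=10_000)
print("Spots per week:", spots)
```

This only illustrates the advertising side of the analogy. The 400-spin recommendation for songs is a separate rule of thumb and is not derived from this calculation.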
Finally, unlike most of the internet these days, we do not believe we are the final authority or have the final word. Songscore is a service for its customers and, as such, you are in control. This is why we continually add features based on your feedback. You can hide titles or artists, use your own hooks, change rating labels and much more because, in the end, it’s your research and your decision. We just hope this lets you know a little more about the “why” that went into the Songscore product.