The Sociology of Japanese Pop Music

Music is one of the most important influences on our personal lives. Many of us do not produce any music, but merely enjoy it as consumers. Some become singers, and some of them end up becoming very successful while others fail. Most of my musical experience consists of singing along, and my music teacher made the emphatic point that the voice is the most important instrument and the simplest one to use. That was naturally a way to give the non-musical people, who only sing in the shower, if ever, a good feeling about themselves. My singing skills are not good at all, and I quickly figured out that a musical career was not for me. Most of the music that I have listened to is Western. I listened to the radio, and generally found anything that the radio played to be fairly stimulating, which includes music from the 1950s to the present, mostly rock and pop. A few years ago, I discovered classical music and became a big fan of it (Mozart, Beethoven and Bach are at the top of my list). And then, while I was watching Japanese television, I found out about the J-Pop (Japanese pop) band AKB48, singing a song titled “Koisuru Fortune Cookie” (‘Fortune Cookie in Love’).

For a person growing up in the West, it is remarkably difficult to escape Japanese cultural influences (cf. Kelts 2006). I have encountered Digimon, Pokemon, Dragonball, Detective Conan etc., and they have filled the afternoons of many young people. I still remember fighting with my parents over control of the TV remote just so I could get my daily portion of Japanese TV shows (followed by soccer over the weekend). But until recently I had very little contact with Japanese music, and I often dismissed the girl bands for their purely girlish orientation, or just did not pay much attention to them. But this song, titled ‘Koisuru Fortune Cookie’, was an instant hit, and stuck with me. Not coincidentally, it made it to the top of the Billboard chart in Japan. It is noteworthy that it has found hardly any audience outside of Japan beyond die-hard Japan lovers. If a song is not in English, it can hardly make it internationally, and especially not in the US. The Austrian singer Falco knew this all too well when he decided to add English words to his German songs to make himself famous in the US and the rest of Europe (his main song was ‘Rock Me Amadeus’, 1986).

But I particularly like this Japanese song, and it will be interesting to frame this pop band, AKB48, in sociological terms. Some comparisons with Western girl pop bands reveal interesting findings. My essential finding is that Japanese pop bands create a greater amount of ritual solidarity, collective effervescence and emotional energy than Western pop bands.

The Sociology of Emotions

Before I begin my analysis of the Japanese pop band, I will describe the sociological theory that I will use. Ritual solidarity and collective effervescence are sociological concepts as defined by Emile Durkheim (1995). Durkheim’s essential concern was how a society can hold together. He theorized that any society needs to have rituals, or activities that happen on a regular basis, which are done in common and by the group as a whole in order to evoke the emotion of solidarity among all the members of that group. Without solidarity the group cannot sustain itself, so solidarity is essential for group maintenance. The group also needs to devise some sacred objects, or totems, like the cross for Christians or the Quran for Muslims, and a reference to them will be enough to create shared feelings and ideas about the religion or ideology. Inclusion in the group is important for individual survival and well-being, and this statement is certainly more profoundly empirical than rational choice theory, which assumes that individual preferences precede societal considerations. Inclusion also implies exclusion of out-group persons, and this is the case if certain people do not share the same ideas or symbols. We have all encountered people whose language we don’t speak, and if there is no common language that is spoken, then desperate hand gestures are made to create some common understanding.

Durkheim’s ideas can be most neatly applied to religion, but they do not have to be restricted to it. One may apply the solidarity ritual principle to one’s allegiance to the nation-state. Any naturalized citizen or any person who has gone to a high school where the Star-Spangled Banner is played may have had some experience with nationalist solidarity rituals. Ritual solidarity may also be created within the context of families. Very traditional parents may demand that their children eat together with them in order to facilitate conversation among the family members. The dinner ritual is particularly apt, because everyone needs to eat every day. In Western culture, where everyone is busy most of the time, finding a common ritual is rather difficult, and alienation among family members occurs often if there is no dinner ritual, or any other sacred ritual that is cordoned off as ‘family time’.

Collective effervescence refers to the outcome of performing ritual interactions. People interact with each other through rituals, and thereby develop a collective conscience. It is during the moment of intense ritual activity that people feel the highest amount of solidarity and collective effervescence. They are reaffirmed as members of the group, and generate a sustained strong feeling, which can only exist within the context of that group ritual. One major example is sports fandom. One watches sports not merely for personal entertainment, because that may be had more cheaply at home in front of the television. But people pay an exorbitant amount of money in order to be able to go to the stadium and watch the game there. What the audience of a football game gets is the collective presence of all the other fans in the stadium, which produces collective effervescence as the end result. The commercial implication of many people wanting to pay so much money to sit in the same stadium is fairly obvious, but the sociological implications need some further consideration. The sports fans enter the stadium, some sitting, some standing, though sometimes all people stand in order to cheer for their team. They sing songs, they boo the opposing team, they talk about the players during the game. The audience will be in an uproar if the team they are supporting scores a point. When I was watching a soccer match on TV as a teenager, my brother would reluctantly sit next to me and ridicule the fans who would chant along. These collective chants do sound like drunk and crazy people roaring for no rational reason. From the TV perspective there is a certain physical distance to the fans, which makes it difficult to feel exactly the emotional excitement that they are having. The TV broadcast also treats the audience as background noise without regularly showing it, because the game itself is the visual focus of the camera.
What my brother was missing about these crazy, hysterical fans is that they are having a good time, enjoying their collective ritual.

Randall Collins (2004) expands on the notion of interaction rituals creating solidarity by arguing that these interaction rituals happen in chains. In other words, as people interact and perform rituals they transfer these emotions from one person to the next, and because multiple people may be involved, this transference happens in chains. Collins also theorized that people develop what is called emotional energy, or heightened excitement among individuals as a group ritual interaction is performed (ibid., 38). This emotional energy is exclusive to social interactions and situations, and cannot be generated alone. Even the lone intellectual, who is hammering away at his paper on the computer by himself and is experiencing a great amount of emotional energy from doing it, can only experience this emotion if he has carried on the conversation about his ideas with other scholars in his own mind (whether these conversations are prompted by reading a book, by e-mails or by verbal conversations, though verbal conversations produce the most intense thoughts). Scholars with a great deal of emotional energy quite consciously stake out an intellectual or ideological position (cf. Collins 2004, 164).

Different people have different amounts of emotional energy, and these amounts are situation-specific (Collins 2004, 118) and also dependent on individual endowments. In addition, successful rituals generate more emotional energy, while failed rituals drain energy (ibid., 53). Examples from personal life may illustrate this very clearly. I happen to develop the greatest amount of emotional energy during intellectual conversations, particularly if they are related to politics, history, economics or any other aspect of social reality. During such conversations, I realize that all my readings in the social sciences and from the newspapers can be adequately put to use. For this ritual to be successful and to grant me enough EE (emotional energy), the other person has to display a similar emotional interest in the topic. In the ideal case, my conversational partner and I will take turns, and only occasionally interrupt when necessary. One person adds an idea, and the other person adds another idea that is somewhat related to the first, but offers a different approach that I had not encountered before, or one that I have encountered before and for which I want to receive confirmation. A great deal of emotional engagement ensues, and after the conversation is over, the ideas that I exchanged usually linger, and I may use the information that I discussed and learned in a later conversation, or build the topic into a paper that I write. EE will wear off after a few days, and hopefully more such conversations with other people ensue, though there is already a relatively high EE in anticipation of meeting a friend or colleague with whom I have previously had high-EE conversations.

The opposite would involve a failed ritual, where EE quickly drains away, and from my own experience it is fairly simple to conceive of such ritual failure. It is quite clearly not possible to discuss current social issues with all people, because some are not willing or able to contemplate them. In such a case, I may lecture that person for a few sentences, and would immediately notice the other person’s disengagement. I then let the other person dictate the topic by asking them a question about their hobbies, activities or interests. Goffman (1967, 115) theorized about the importance of maintaining spontaneous involvement in the conversation, even though that may be difficult. In a failed ritual, I may not be able to get that person to talk about their interests, because they don’t enjoy speaking with strangers, or they don’t enjoy socializing with other people at all. Another failed ritual would be if they started talking about an issue that I consider totally boring. In a low-EE encounter, a moment of relief arises after the end of the ritual, and engaging in other activities while forgetting that encounter becomes the logical course of action.

To conclude this theoretical section on the sociology of emotions, I will repeat the definitions of the three terms: ritual solidarity, collective effervescence and emotional energy. Ritual solidarity is the creation of common feelings of group membership through the performance of rituals, which have to happen regularly for those feelings to remain strong. Common feelings of solidarity are the prerequisite for coherence and stability in society or in the group. Collective effervescence is the outcome of performing ritual interactions and involves moments of strong solidarity. Emotional energy is the heightened excitement that the ritual interactions produce, and is dependent on the situation and the endowment of individuals.

Japanese Pop-Music

Let us start the sociological analysis of Japanese pop music with a description of the music video that the girl band AKB48 produced. The music video is about five and a half minutes long and features 3,800 people (!). There are 15 members of the band, who are dancing and singing in their multi-colored costumes. The main part of the video is staged on the open street. The members of the band are the central focus, and they carry out the dancing and singing. But the band is not the exclusive focus. The camera turns to the cheering audience, a crowd of Japanese people who copy the same moves as the band. For a few seconds at a time the camera turns to particular situations, where some of the band members show up and do the dances, which are copied by the respective small audience. The small audiences include school children, musical opera performers, market visitors, and judo practitioners. There is another set of situations in the video, which involves single people or small groups who perform the same dance moves that the band is making. These situations include one old man, one old woman, three school girls, two manual laborers, one truck driver, three women in a high-end restaurant, one young woman with a small baby tied to her back, four office workers of both sexes, and two Western tourists on the street. Another set of situations involves band members who dance alone or with two other band members in a studio, wearing clothing that is different from the clothing in the main part of the video. None of these situations lasts more than a few seconds, but the longest focus is devoted to the band members, while the shortest focus is devoted to the random people who carry out the dance moves.

The myriad situations leave the viewer with the impression that one does not have to be a member of the band in order to perform the dance, but that anybody who is agile enough (the old people dancing were included on purpose) can participate in it. The band members were clearly the best trained to perform the dance, but all of the movements were also done by the audience members. When it came to performing the dance, the band members explicitly encouraged the audience, whether they were musical performers or school children, to partake in it, and they would do so enthusiastically. The band did not mind at all sharing the viewers’ attention with the audience in the video. (I am pretty sure that the 3,800 audience members that were part of the video were recruited.) But the symbolic value of the video is quite clearly that not only the band is supposed to do the dance, but that the audience should take part in it as well. The many happy audience members display a great deal of emotional energy, and by implication the viewers of the video should have the same experience, though in somewhat diminished form.

Here a contrast with Western music videos is appropriate. In Western music videos, the emphasis of the performance rests exclusively with the performers in the band. There is usually no audience in the music video itself, and if there is, its members are at best cheering spectators, who should play no active part in the video and should not steal the show from the performer. The differences in sociological outcomes for these two separate cultural approaches are enormous. In the context of Japanese pop music, the relationship between the band, which is formally in charge, and the audience is closer than in Western pop music, where the focus goes exclusively to the performing band. The opportunities for creating ritual solidarity, collective effervescence and high emotional energy are greater when viewing Japanese pop music videos than when viewing Western pop music videos. As participation in dancing is more encouraged in Japanese videos, the viewer has a greater sense of emotional engagement, because he wants to perform the dance as well, while for Western music a passive listening experience is the most likely route for the viewer.

There is, however, one caveat to audience participation in Japanese music videos. Singing is done exclusively by the band members. Singing is the major activity of the music video besides the dancing. While dancing is participatory and democratic, singing is centralized and monopolized by the band. When it comes to the acoustics there is no difference between the West and the East. This is a point that is well taken, because the band produces the music, and not the audience, which is supporting and affirming the band with its dance, and perhaps singing along, but without a microphone. The singing most likely comes from a recording anyway, and the authenticity of the song is retained. The band is in charge by doing the singing. But even in the singing part there is some collective effervescence, though it is mainly restricted to the band itself. In the video, there are 15 band members, and their voices blend into each other. There are only short passages right before the chorus when a single singer delivers the line, but the rest of the song is sung by the whole band. The inter-mixing of voices still involves a sense of collectivity, because after all one or more voices that are added by the audience do not make so much of a difference compared to when only one singer performs. There is not so much of a sense of collectivity with single musicians or even with Western girl bands. Western girl bands may sing the chorus together, but the lines of the stanzas may be sung separately by a single band member. The example that comes to mind is the British ‘Spice Girls’.

A description of the band itself reveals fairly interesting content that can also be analyzed sociologically. AKB48 is not like a normal girl band. It currently consists of 89 members, and is considered the largest girl band in the world. The purpose of such a large band, according to the producer Yasushi Akimoto, is to enable the fans to view the band daily in a theater. A normal girl band gives a concert and is sometimes visible on TV, but the concept of AKB48 (named after Akihabara, a district of Tokyo) is to provide daily performances to a paying audience. And demand is high, because the theaters are always sold out in advance. A large number of band members is necessary, because the band is divided into four active groups (16-22 members) and one trainee group (10 members), and they perform at different times, while the other groups are either on tour, taking a break or practicing. AKB48 has been so successful that other parallel large girl bands have been founded, e.g. JKT48 and SKE48.

This way of organizing the band does seem to have two advantages: one is financial and the other is organizational. Financially, daily performances mean daily revenues for the band. A daily cash flow is probably preferable for the band, especially its management. Whether the band members are that willing to be exposed to performance pressure on a daily basis is another question, which I can’t investigate with the data available to me. The organizational advantage is that relying on such a large band, which is recruited by nation-wide auditions, means that the band can technically continue to exist indefinitely into the future, as long as the management is there to support it and the public is willing to pay for it. Here a contrast with a Western girl band should be pointed out: a Western girl band may similarly be recruited from the population via auditions, but it is generally small, usually 4-5 members, and its continued existence depends on the willingness of the individual members to carry on performing together. The Spice Girls may again be taken as an example, because they are the most successful girl band from Britain. The Spice Girls consisted of five members, and they produced several hit singles. But then one of the band members (Geri Halliwell) decided to leave the band, which caused an uproar. Only a few years later the four remaining singers decided to split up, because they either wanted to pursue a career as a solo singer (which most did successfully except Victoria Beckham) or wanted to have children. The band ceased to exist. The continued existence of the band depended on the will of the band members.

Such an observation cannot be made about AKB48. When a few singers retire, because they are too old or have a boyfriend/husband or children (AKB48 members are not allowed to have boyfriends[1]), then the band can survive without any trouble. Group survival takes priority over individual needs. If there are not enough girls in the band, then more auditions will be carried out to fill the empty spots. Alternatively, there are internal transfers from group to group. And the supply of talent is almost endless. When the first group formed, 24 slots were filled out of 7,924 applicants (0.3% success rate). The second group filled 18 slots from 11,892 applicants (0.15% success rate). No wonder that only the best and most disciplined people are part of the band. The departure from the band is also regulated. All the band members are in their teenage years (the youngest member is 11) and the oldest ones are in their early 20s, when they ‘graduate’, i.e. retire. Retirement means resumption of private life and career, though some make their AKB48 career a springboard for career advancement in other fields such as fashion design, or even for launching solo music careers. The competition for membership in the band is, therefore, enormous. The permanent ‘American Idol’ state that is required makes this Japanese band very reminiscent of the Chinese civil service bureaucracy, which was a permanent meritocracy in Chinese society. Another interesting feature of the band is that the different groups have to compete with each other for shooting the music video. Only one group, which is judged the best, will perform in the music video, so the competitive stakes are fairly high.[2]
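The two success rates quoted above can be checked with a few lines of arithmetic. The following is only a minimal sketch in Python: the slot and applicant figures are the ones given in the text, while the names `auditions` and `success_rate` are made up for illustration.

```python
# Audition figures as quoted in the paragraph above.
# The names `auditions` and `success_rate` are illustrative assumptions.
auditions = {
    "first group": (24, 7924),    # (slots filled, applicants)
    "second group": (18, 11892),
}


def success_rate(slots: int, applicants: int) -> float:
    """Return the share of applicants who were admitted, in percent."""
    return 100 * slots / applicants


for name, (slots, applicants) in auditions.items():
    print(f"{name}: {success_rate(slots, applicants):.2f}% admitted")
```

Rounded to two decimals, this gives 0.30% for the first group and 0.15% for the second, in line with the figures above: roughly one applicant in 330 and one in 660, respectively.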

Does the structure of the band create any collective effervescence among the band members? Here the picture is much more complicated than what can be said about the relationship with the consuming audience. The audience is in an immediate position to benefit, as I analyzed with regard to the music video. However, the band members enter the band under competitive pressures, and the performance itself also re-creates competition, because once one is in the band it is about delivering permanently good performance in order to be selected for the coveted music videos, and in order to be promoted to leader of the group (which can only happen with the highest public evaluation records). Then there are fine rank distinctions in terms of age. The older ones have proven themselves and may have an easier time becoming team leaders, but they are also the ones closest to retiring, and so the sense of solidarity is a temporary one. Competition does not usually go along with solidarity, but the picture is not that simple.

One might also say that because the girls struggle so hard to get into the band, they are really looking for a high level of emotional energy and collective effervescence. Competition does not have to merely create anxiety, but may also produce a positive type of tension to perform as well as possible, and then reap the fame and the center of attention relative to the other band members as leaders of the team, and relative to the audience as performers on the stage. The highest-EE band members are likely the team leaders, who line up at the center of the stage during live performances and the video performances. One might generalize that all of the band members receive a relatively high level of EE because of their stage position, but there is a hierarchy of EE between the leaders and the non-leaders in the team. It is not only EE, but also collective effervescence that band members may gain from the performances and from doing things with the band (the level of private interactions among band members has not been investigated, as far as I know). It may be hard work to get there, but that hard work yields the reward of being part of a band, and doing the same things on the stage. The enthusiastic crowd support and cheering (of which there is plenty, because the two main demographics of the audience are young girls and old men) add to the band members’ EE. And this ritual is happening on a daily basis, so that the ritual reinforcement is very strong, creating a great degree of solidarity among members. And as I said earlier, group survival precedes individual needs, and as long as the collective of the band members is willing to work in the band it will continue to persist.

The comparison with the Western girl band makes the acquisition of high EE and collective effervescence in the Japanese girl band even clearer. The Western girl band consists of four or five members, who may know each other rather well, and the public recognizes the individuality of the singers. The best example is the Spice Girls, each of whom had a special nickname (e.g. Emma Bunton was called ‘Baby Spice’). In the Japanese girl band, there is only a limited level of individuality, because it is not the individual singers that matter, but the band as a whole, though the competition for the coveted leadership position does produce some individual tensions within the band, which are, however, not really transparent to the viewing audience. Stage performance in the Western girl bands also creates a great level of ritual solidarity, but organizational strength builds on the willingness of the band members to continue the collaboration. Because there is no institutional framework through which new band members are added, emotional energy can quickly be drained if only one band member does not want to be part of the band anymore. In the Japanese band, one unwilling member cannot disturb the institution itself and may safely be removed from the band without the wider audience noticing anything. Fragile solidarity in the Western girl band implies that any attempt to create collective effervescence on the stage is very time-limited.

This analysis leads me to conclude this section with some macro-structural observations. The structure of AKB48 indicates the survivability of the organization beyond the departure of individuals within it, even as it depends on the support of all the band members (as a collective), and on a supportive external public. The theoretically interesting question would be whether either of these pillars might fall apart, because the supply of talent falls off, or the demand for the product decreases. Though answering this question empirically is impossible with my data, I find both outcomes very unlikely on logical grounds. On the supply side, young girls are constantly willing to join these girl bands, because membership is considered a convenient springboard for more fame and career advancement later on. The many successful examples give young girls the continued aspiration to at least try to make it into the band. On the demand side, I also do not see demand leveling off, because Japanese pop is by definition popular. Music, besides sport, is the most effective way to mobilize the support of the masses. There are very few alternatives to it. If AKB48 collapses for some other reason, the public will find some other band to follow.

Another macro-structural question with regard to the organization and bureaucracy of the band is whether the level of bureaucratization will increase as time goes on. An organization that has set strong parameters and rules of the game may persist for a long time, but it might have to implement more rules in order to run in the same capacity. It is because of the economic success of the band that many more successor bands were founded by the producer. More bands in the portfolio mean more revenues, but also more responsibilities and more managers. This might imply more rules for the girls, and it might increase alienation among the younger generations of talent, i.e. the youngest recruits. Joining a fresh and young organization may be a girl’s greatest dream, but joining an old bureaucracy might reduce the value that the girls place on the organization, and this could become a threat to the current band. On the other hand, expanding the portfolio by creating independent units of business might reduce potential alienation effects. Whether there is alienation among band members due to the nature of the organization needs to be empirically tested.


I have analyzed the Japanese pop band AKB48 in terms of the sociology of emotions. Durkheim’s (1995) concepts of ritual solidarity and collective effervescence, and Collins’ (2004) concepts of emotional energy and interaction ritual chains may be used to describe how Japanese pop bands can produce more solidarity and emotional energy among the audience of their music videos than Western girl bands. I also theorize that more solidarity and emotional energy are produced within Japanese pop bands than within Western bands.


Collins, Randall. 2004. Interaction Ritual Chains. Princeton: Princeton University Press.

Durkheim, Emile. 1995 [1912]. The Elementary Forms of Religious Life. Trans. Karen E. Fields. New York: Simon and Schuster.

Goffman, Erving. 1967. Interaction Ritual: Essays on Face to Face Behavior. New York: Pantheon Books.

Kelts, Roland. 2006. Japanamerica: How Japanese Pop Culture Has Invaded the U.S. New York: Palgrave Macmillan.

The Music Video:

AKB48. 【MV】恋するフォーチュンクッキー / AKB48[公式]


English Translation of Lyrics of “Koisuru Fortune Cookie”: Midori Translates.

[1] Which is very ironic, because the lyrics of the song are all about relationships or desired relationships with men. Freudian sublimation carried to its extreme.

[2] All information about the band comes from Wikipedia.
