Work on the “science of science” has demonstrated the value of a data-driven approach to the study of academic knowledge creation. Researchers like Roberta Sinatra, a professor at CNS, use big data on productivity and impact spanning decades of scientific endeavors to model and understand how scientists become successful [1]. Milan Janosov recently shared his work on extending and adapting this framework to artistic careers. More broadly, Milan is seeking to compare how productivity, impact, and network position influence success across various disciplines.
Milan uses data covering three artistic fields: nearly one million films from IMDb, 28 million songs from Discogs and Last.fm, and 4 million books from Goodreads. The directors, musicians, and authors in the data span many years of their respective creative industries. The datasets also contain multiple measures of the impact of a creative work. For example, movies are rated by both critics and the crowd. Though expert evaluations and public opinion are usually correlated, there are many cases where they disagree.
Milan’s first hypothesis is that artistic careers, like academic careers, follow the so-called “random-impact rule”. The rule says that, controlling for the length of the career, the timing of someone’s highest-impact work is uniformly distributed: the best work is equally likely to be the first, the last, or any work in between. Milan finds that this hypothesis fits the data quite well for all three of the artistic fields he analyzed (see Figure 1).
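One way to check the random-impact rule on a set of careers is to record where each creator's highest-impact work falls within the career and to measure how far those positions are from a uniform distribution. The sketch below is only an illustration under stated assumptions: the careers are synthetic, the function names are hypothetical, and the Kolmogorov-Smirnov-style distance is one possible choice of test, not necessarily the one Milan used.

```python
import random

def best_work_positions(careers):
    """Relative position (between 0 and 1) of the highest-impact work in each career."""
    positions = []
    for impacts in careers:
        n = len(impacts)
        best_index = max(range(n), key=lambda i: impacts[i])  # first maximum on ties
        positions.append((best_index + 0.5) / n)  # midpoint of the work's slot
    return positions

def ks_distance_from_uniform(values):
    """Largest gap between the empirical CDF of values and the Uniform(0, 1) CDF."""
    ordered = sorted(values)
    n = len(ordered)
    gap = 0.0
    for i, v in enumerate(ordered):
        gap = max(gap, abs((i + 1) / n - v), abs(i / n - v))
    return gap

rng = random.Random(0)

# Assumption: synthetic careers whose impacts are i.i.d. draws, so the
# ordering of works carries no information (the rule holds by construction).
random_careers = [
    [rng.lognormvariate(0.0, 1.0) for _ in range(rng.randint(10, 60))]
    for _ in range(2000)
]
# Counterexample: the same careers sorted so that impact always rises over time.
rising_careers = [sorted(career) for career in random_careers]

random_gap = ks_distance_from_uniform(best_work_positions(random_careers))
rising_gap = ks_distance_from_uniform(best_work_positions(rising_careers))
print(f"random order: {random_gap:.3f}, rising impact: {rising_gap:.3f}")
```

A small distance is consistent with the rule, while careers in which impact steadily rises produce a distance close to 1.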
Next, Milan explored the extent to which the impact of an artistic work can be predicted by a model with two factors: a deterministic variable encoding individual aspects of the creator and a random, field-specific parameter. This so-called "Q-model" captures how much of a work's impact is driven by individual performance and how much by "luck". Finding that his data fit the model about as well as data on academic careers do, he compared the "luck-skill" distributions of the subfields in his data. He finds that subfields within a domain, such as different genres of music, have different distributions. For instance, the individual-skill factor of film music composers in the Q-model is more similar to that of classical musicians than to that of movie producers or directors. In future work, Milan proposes to place different fields on a single dimension of predictability of success, ranging from deterministic to probabilistic.
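The two-factor structure can be made concrete with a small simulation. The sketch below assumes the multiplicative form used in the original science-of-science work [1], in which the impact of a work is the creator's factor Q times a random "luck" draw, with the luck term drawn from a lognormal distribution. The function names, the creators, and the parameter values are hypothetical and are not taken from Milan's analysis.

```python
import math
import random

def simulate_career(q, n_works, rng, luck_mu=0.0, luck_sigma=1.0):
    """Impact of each work = creator factor q times a random lognormal 'luck' draw."""
    return [q * rng.lognormvariate(luck_mu, luck_sigma) for _ in range(n_works)]

def estimate_q(impacts, luck_mu=0.0):
    """Estimate q as exp(mean(log impact) - luck_mu).

    Assumption: luck_mu, the mean of the log-luck term for the field, is known.
    """
    mean_log = sum(math.log(c) for c in impacts) / len(impacts)
    return math.exp(mean_log - luck_mu)

rng = random.Random(42)

# Hypothetical creators with different individual factors.
true_q = {"creator_a": 1.0, "creator_b": 5.0}
estimates = {
    name: estimate_q(simulate_career(q, 500, rng))
    for name, q in true_q.items()
}
for name, q_hat in estimates.items():
    print(f"{name}: true Q = {true_q[name]}, estimated Q = {q_hat:.2f}")
```

Because the model is multiplicative, it becomes additive in logarithms, so averaging the log-impact over a long career washes out the luck term and leaves the individual factor. Short careers give much noisier estimates.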
Finally, Milan discussed preliminary results on the relationship between artist networks and success. Many motivating examples can be found in popular culture, like the recent success story of Luis Fonsi's "Despacito", the first video to reach four billion views on YouTube. "Despacito" was not an instant hit. Its explosive growth coincides with the release of a remix featuring Justin Bieber, as shown in Figure 2.
Examples like this suggest that network position affects the likely success of a creative work. Milan plans to examine the relationship between the Q-model decomposition of impact and the network position of the artistic creators. In preliminary work, he showed a significant relationship between the “Q” of a movie director and that of the director's early collaborators.
Extending and elaborating the science of success to creative domains holds great promise. Finding similarities in this new context strengthens the validity of the original results on academic careers. Understanding how model performance and the distributions of the Q-model parameters differ across fields helps us understand the nature of success (i.e., luck vs. skill) in these areas. Relating network position to outcome extends the science of success in a sociological direction. Milan is making progress on all these fronts and we wish him continued success.