Quick Introduction

gh-impact measures open source influence. It is based on the stars that an account's open source projects receive on GitHub: an account has a gh-impact score of n if it has n projects with at least n stars each, where n is the largest number for which this holds. Higher gh-impact scores correspond to accounts that have many well-used projects. See here for more information.
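The definition above follows the same pattern as the h-index, so it can be sketched in a few lines. The function name gh_impact and the plain list of star counts are assumptions made for illustration; this is not the project's own implementation.

```python
def gh_impact(star_counts):
    """Return the largest n such that n projects have at least n stars each.

    star_counts is a hypothetical input: one integer per project owned by
    the account, giving that project's star count.
    """
    # Rank projects from most to least starred.
    ranked = sorted(star_counts, reverse=True)
    score = 0
    for rank, stars in enumerate(ranked, start=1):
        if stars >= rank:
            # The top `rank` projects all have at least `rank` stars.
            score = rank
        else:
            # Later projects have even fewer stars, so stop here.
            break
    return score


example_score = gh_impact([10, 5, 3, 1])
```

For the star counts 10, 5, 3 and 1, three projects have at least three stars each but there are not four projects with at least four stars, so the score is 3.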

These articles are the result of a literature scan. The basic approach was to search abstracts and titles for terms like “github” and “h-index.” Initially, arXiv was searched because it is a popular outlet for this sort of research. The scope will be expanded in the future to include other databases.
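The source does not say which tool was used for the scan. As one possible illustration, a title-and-abstract search could be expressed against arXiv's public query API. The helper name build_arxiv_query and the choice to join terms with OR are assumptions; the sketch only builds the request URL and does not contact the network.

```python
from urllib.parse import urlencode

# arXiv's public query endpoint.
ARXIV_API = "http://export.arxiv.org/api/query"


def build_arxiv_query(terms, start=0, max_results=50):
    """Build a URL that matches any of the terms in a title or abstract."""
    # "ti:" and "abs:" are arXiv's field prefixes for title and abstract.
    clauses = []
    for term in terms:
        clauses.append('ti:"{0}"'.format(term))
        clauses.append('abs:"{0}"'.format(term))
    params = {
        "search_query": " OR ".join(clauses),
        "start": start,
        "max_results": max_results,
    }
    return ARXIV_API + "?" + urlencode(params)


url = build_arxiv_query(["github", "h-index"])
```

Fetching the resulting URL returns an Atom feed of matching entries, which would still need manual screening for relevance.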


Abbas, A. M. (2010a). Analysis of Generalized Impact Factor and Indices of Journals. arXiv:1011.4879 [cs]. Retrieved from http://arxiv.org/abs/1011.4879

Abbas, A. M. (2010b). Weighted Indices for Evaluating the Quality of Research with Multiple Authorship. arXiv:1010.1904 [cs]. Retrieved from http://arxiv.org/abs/1010.1904

Abbas, A. M. (2012). Bounds and Inequalities Relating h-Index, g-Index, e-Index and Generalized Impact Factor. PLoS ONE, 7(4), e33699. http://doi.org/10.1371/journal.pone.0033699

Abdel-Aty, M. (2013). New Index for Quantifying an Individual’s Scientific Research Output. arXiv:1305.6026 [cs]. Retrieved from http://arxiv.org/abs/1305.6026

Amancio, D. R., Oliveira Jr., O. N., & Costa, L. da F. (2012). Three-feature model to reproduce the topology of citation networks and the effects from authors’ visibility on their h-index. Journal of Informetrics, 6(3), 427–434. http://doi.org/10.1016/j.joi.2012.02.005

Arandjelovic, O. (2014). Fairer citation-based metrics. arXiv:1408.3881 [cs]. Retrieved from http://arxiv.org/abs/1408.3881

Baccini, A., Barabesi, L., Marcheselli, M., & Pratelli, L. (2012). Statistical inference on the h-index with an application to top-scientist performance. Journal of Informetrics, 6(4), 721–728. http://doi.org/10.1016/j.joi.2012.07.009

Batista, P. D., Campiteli, M. G., Kinouchi, O., & Martinez, A. S. (2006). An index to quantify an individual’s scientific research valid across disciplines. Scientometrics, 68(1), 179–189. http://doi.org/10.1007/s11192-006-0090-4

Benevenuto, F., Laender, A. H. F., & Alves, B. L. (2015). The H-index Paradox: Your Coauthors Have a Higher H-index than You Do. arXiv:1510.04767 [physics]. Retrieved from http://arxiv.org/abs/1510.04767

Borges, H., Hora, A., & Valente, M. T. (2016). Understanding the Factors that Impact the Popularity of GitHub Repositories. arXiv:1606.04984 [cs]. Retrieved from http://arxiv.org/abs/1606.04984

Borges, H., Valente, M. T., Hora, A., & Coelho, J. (2015). On the Popularity of GitHub Applications: A Preliminary Note. arXiv:1507.00604 [cs]. Retrieved from http://arxiv.org/abs/1507.00604

Bornmann, L. (2014). h-index Research in Scientometrics: A Summary. arXiv:1407.2932 [cs]. Retrieved from http://arxiv.org/abs/1407.2932

Cabezas-Clavijo, A., & Lopez-Cozar, E. D. (2013). Google Scholar and the h-index in biomedicine: the popularization of bibliometric assessment. arXiv:1304.2032 [cs]. Retrieved from http://arxiv.org/abs/1304.2032

Cormode, G., Ma, Q., Muthukrishnan, S., & Thompson, B. (2012). Socializing the h-index. arXiv:1211.7133 [physics]. Retrieved from http://arxiv.org/abs/1211.7133

Cosentino, V., Izquierdo, J. L. C., & Cabot, J. (2014). Three Metrics to Explore the Openness of GitHub projects. arXiv:1409.4253 [cs]. Retrieved from http://arxiv.org/abs/1409.4253

da Silva, R., de Oliveira, J. P., de Lima, J. V., & Moreira, V. (2010). Statistics for Ranking Program Committees and Editorial Boards. arXiv:1002.1060 [physics]. Retrieved from http://arxiv.org/abs/1002.1060

de Keijzer, B., & Apt, K. R. (2013). The H-index can be easily manipulated. arXiv:1304.2557 [cs]. Retrieved from http://arxiv.org/abs/1304.2557

Ding, Y., Yan, E., Frazho, A., & Caverlee, J. (2010). PageRank for ranking authors in co-citation networks. arXiv:1012.4872 [cs]. Retrieved from http://arxiv.org/abs/1012.4872

Dong, Y., Johnson, R. A., & Chawla, N. V. (2015). Will This Paper Increase Your h-index? Scientific Impact Prediction. arXiv:1412.4754 [physics], 149–158. http://doi.org/10.1145/2684822.2685314

Dong, Y., Johnson, R. A., & Chawla, N. V. (2016). Can Scientific Impact Be Predicted? arXiv:1606.05905 [cs]. Retrieved from http://arxiv.org/abs/1606.05905

Dorta-Gonzalez, P., & Dorta-Gonzalez, M. I. (2011). Central indexes to the citation distribution: A complement to the h-index. Scientometrics, 88(3), 729–745. http://doi.org/10.1007/s11192-011-0453-3

Ebrahim, N. A., Farhadi, H., Salehi, H., Yunus, M. M., Chadegani, A. A., Farhadi, M., & Fooladi, M. (2013). Does it Matter Which Citation Tool is Used to Compare the h-index of a Group of Highly Cited Researchers? arXiv:1306.0727 [physics]. Retrieved from http://arxiv.org/abs/1306.0727

Egghe, L. (2006). Theory and practise of the g-index. Scientometrics, 69(1), 131–152. http://doi.org/10.1007/s11192-006-0144-7

Eppstein, D., & Spiro, E. S. (2012). The h-Index of a Graph and its Application to Dynamic Subgraph Statistics. Journal of Graph Algorithms and Applications, 16(2), 543–567. http://doi.org/10.7155/jgaa.00273

Ferrara, E., & Romero, A. E. (2013). Scientific impact evaluation and the effect of self-citations: mitigating the bias by discounting h-index. Journal of the American Society for Information Science and Technology, 64(11), 2332–2339. http://doi.org/10.1002/asi.22976

Fowkes, J., & Sutton, C. (2015). Parameter-Free Probabilistic API Mining at GitHub Scale. arXiv:1512.05558 [cs]. Retrieved from http://arxiv.org/abs/1512.05558

Fu, T. Z. J., Song, Q., & Chiu, D. M. (2013). The Academic Social Network. arXiv:1306.4623 [physics]. Retrieved from http://arxiv.org/abs/1306.4623

Gousios, G. (2013). The GHTorrent dataset and tool suite. In Proceedings of the 10th Working Conference on Mining Software Repositories (pp. 233–236). Piscataway, NJ, USA: IEEE Press. Retrieved from http://dl.acm.org/citation.cfm?id=2487085.2487132

Harnad, S. (2007). Open Access Scientometrics and the UK Research Assessment Exercise. arXiv:cs/0703131. Retrieved from http://arxiv.org/abs/cs/0703131

Hartung, S., Komusiewicz, C., Nichterlein, A., & Suchý, O. (2013). On Structural Parameterizations for the 2-Club Problem. arXiv:1305.3735 [cs, Math]. Retrieved from http://arxiv.org/abs/1305.3735

Hassan, S.-U., & Gillani, U. A. (2016). Altmetrics of “altmetrics” using Google Scholar, Twitter, Mendeley, Facebook, Google-plus, CiteULike, Blogs and Wiki. arXiv:1603.07992 [cs]. Retrieved from http://arxiv.org/abs/1603.07992

Havemann, F., Gläser, J., Heinz, M., & Struck, A. (2012). Identifying Overlapping and Hierarchical Thematic Structures in Networks of Scholarly Papers: A Comparison of Three Approaches. PLoS ONE, 7(3), e33255. http://doi.org/10.1371/journal.pone.0033255

Hirsch, J. E. (2005). An index to quantify an individual’s scientific research output. Proceedings of the National Academy of Sciences of the United States of America, 102(46), 16569–16572.

Hirsch, J. E. (2007). Does the h-index have predictive power? Proceedings of the National Academy of Sciences, 104(49), 19193–19198. http://doi.org/10.1073/pnas.0707962104

Hovden, R. (2013). Bibliometrics for Internet Media: Applying the h-Index to YouTube. Journal of the American Society for Information Science and Technology, 64(11), 2326–2331. http://doi.org/10.1002/asi.22936

Kakushadze, Z. (2015). An Index for SSRN Downloads. arXiv:1511.04275 [cs]. Retrieved from http://arxiv.org/abs/1511.04275

Korn, A., Schubert, A., & Telcs, A. (2009). Lobby index in networks. Physica A: Statistical Mechanics and Its Applications, 388(11), 2221–2226. http://doi.org/10.1016/j.physa.2009.02.013

Kudělka, M., Platoš, J., & Krömer, P. (2015). Author Evaluation Based on H-index and Citation Response. arXiv:1511.05709 [cs]. Retrieved from http://arxiv.org/abs/1511.05709

Lima, A., Rossi, L., & Musolesi, M. (2014). Coding Together at Scale: GitHub as a Collaborative Social Network. arXiv:1407.2535 [physics]. Retrieved from http://arxiv.org/abs/1407.2535

Lin, M. C., Soulignac, F. J., & Szwarcfiter, J. L. (2012). Arboricity, h-Index, and Dynamic Algorithms. Theoretical Computer Science, 426-427, 75–90. http://doi.org/10.1016/j.tcs.2011.12.006

Linoff, G. (2013). Answer to SQL for computing … h-index. Retrieved from http://stackoverflow.com/a/18787390/1146681.

Lozano, G. A. (2013). The elephant in the room: multi-authorship and the assessment of individual researchers. arXiv:1307.1330 [physics]. Retrieved from http://arxiv.org/abs/1307.1330

Martín-Martín, A., Ayllón, J. M., Orduña-Malea, E., & López-Cózar, E. D. (2016). Proceedings Scholar Metrics: H Index of proceedings on Computer Science, Electrical & Electronic Engineering, and Communications according to Google Scholar Metrics (2010-2014). arXiv:1606.05341 [cs]. http://doi.org/10.13140/RG.2.1.4504.9681

Matek, T., & Zebec, S. T. (2016). GitHub open source project recommendation system. arXiv:1602.02594 [cs]. Retrieved from http://arxiv.org/abs/1602.02594

Meho, L. I. (2006). The Rise and Rise of Citation Analysis. arXiv:physics/0701012. Retrieved from http://arxiv.org/abs/physics/0701012

Meho, L. I., & Rogers, Y. (2008). Citation Counting, Citation Ranking, and h-Index of Human-Computer Interaction Researchers: A Comparison between Scopus and Web of Science. arXiv:0803.1716 [cs]. Retrieved from http://arxiv.org/abs/0803.1716

Merelo, J. J., Rico, N., Blancas, I., Arenas, M. G., Tricas, F., & Vacas, J. A. (2015). Measuring the local GitHub developer community. arXiv:1501.06857 [cs]. Retrieved from http://arxiv.org/abs/1501.06857

Mingers, J., & Yang, L. (2016). Evaluating Journal Quality: A Review of Journal Citation Indicators and Ranking in Business and Management. arXiv:1604.06685 [cs]. Retrieved from http://arxiv.org/abs/1604.06685

Opthof, T., & Leydesdorff, L. (2011). Citation analysis cannot legitimate the strategic selection of excellence. arXiv:1102.2569 [physics]. Retrieved from http://arxiv.org/abs/1102.2569

Orduna-Malea, E., & Lopez-Cozar, E. D. (2013). Google Scholar Metrics evolution: an analysis according to languages. arXiv:1310.6162 [cs]. Retrieved from http://arxiv.org/abs/1310.6162

Otsuki, A., & Kawamura, M. (2013). GV-Index: Scientific Contribution Rating Index That Takes into Account the Growth Degree of Research Area and Variance Values of the Publication Year of Cited Paper. International Journal of Data Mining & Knowledge Management Process, 3(5), 1–13. http://doi.org/10.5121/ijdkp.2013.3501

Panaretos, J., & Malesios, C. (2008). Assessing scientific research performance and impact with single indices. arXiv:0812.4542 [physics]. Retrieved from http://arxiv.org/abs/0812.4542

Penner, O., Pan, R. K., Petersen, A. M., & Fortunato, S. (2013). The case for caution in predicting scientists’ future impact. Physics Today, 66(4), 8. http://doi.org/10.1063/PT.3.1928

Penner, O., Pan, R. K., Petersen, A. M., Kaski, K., & Fortunato, S. (2013). On the Predictability of Future Impact in Science. Scientific Reports, 3. http://doi.org/10.1038/srep03052

Perc, M. (2010). Zipf’s law and log-normal distributions in measures of scientific output across fields and institutions: 40 years of Slovenia’s research as an example. Journal of Informetrics, 4(3), 358–364. http://doi.org/10.1016/j.joi.2010.03.001

Petersen, A. M., Stanley, H. E., & Succi, S. (2011). Statistical regularities in the rank-citation profile of scientists. Scientific Reports, 1. http://doi.org/10.1038/srep00181

Petersen, A. M., & Succi, S. (2013). The Z-index: A geometric representation of productivity and impact which accounts for information in the entire rank-citation profile. Journal of Informetrics, 7(4), 823–832. http://doi.org/10.1016/j.joi.2013.07.003

Pratelli, L., Baccini, A., Barabesi, L., & Marcheselli, M. (2012). Statistical analysis of the Hirsch Index. Scandinavian Journal of Statistics, 39(4), 681–694. http://doi.org/10.1111/j.1467-9469.2011.00782.x

Radicchi, F., & Castellano, C. (2013). Analysis of bibliometric indicators for individual scholars in a large data set. Scientometrics, 97(3), 627–637. http://doi.org/10.1007/s11192-013-1027-3

Redner, S. (2010). On the meaning of the h-index. Journal of Statistical Mechanics: Theory and Experiment, 2010(03), L03005. http://doi.org/10.1088/1742-5468/2010/03/L03005

Sanatinia, A., & Noubir, G. (2016). On GitHub’s Programming Languages. arXiv:1603.00431 [cs]. Retrieved from http://arxiv.org/abs/1603.00431

Schreiber, M. (2010). Twenty Hirsch index variants and other indicators giving more or less preference to highly cited papers. Annalen Der Physik, 522(8), 536–554. http://doi.org/10.1002/andp.201000046

Schreiber, M. (2013a). A Case Study of the Arbitrariness of the h-Index and the Highly-Cited-Publications Indicator. Journal of Informetrics, 7(2), 379–387. http://doi.org/10.1016/j.joi.2012.12.006

Schreiber, M. (2013b). How relevant is the predictive power of the h-index? A case study of the time-dependent Hirsch index. Journal of Informetrics, 7(2), 325–329. http://doi.org/10.1016/j.joi.2013.01.001

Schreiber, M. (2013c). How to derive an advantage from the arbitrariness of the g-index. Journal of Informetrics, 7(2), 555–561. http://doi.org/10.1016/j.joi.2013.02.003

Schreiber, M. (2013d). The predictability of the Hirsch index evolution. arXiv:1307.5964 [physics]. Retrieved from http://arxiv.org/abs/1307.5964

Schulz, C., Mazloumian, A., Petersen, A. M., Penner, O., & Helbing, D. (2014). Exploiting citation networks for large-scale author name disambiguation. EPJ Data Science, 3(1). http://doi.org/10.1140/epjds/s13688-014-0011-3

Sidiropoulos, A., Katsaros, D., & Manolopoulos, Y. (2006). Generalized h-index for Disclosing Latent Facts in Citation Networks. arXiv:cs/0607066. Retrieved from http://arxiv.org/abs/cs/0607066

Sidiropoulos, A., Katsaros, D., & Manolopoulos, Y. (2014). Identification of Influential Scientists vs. Mass Producers by the Perfectionism Index. arXiv:1409.6099 [physics]. Retrieved from http://arxiv.org/abs/1409.6099

van Bevern, R., Komusiewicz, C., Molter, H., Niedermeier, R., Sorge, M., & Walsh, T. (2016). h-Index Manipulation by Undoing Merges. arXiv:1604.04827 [cs]. Retrieved from http://arxiv.org/abs/1604.04827

Vanclay, J. K. (2007). On the robustness of the h-index. Journal of the American Society for Information Science and Technology, 58(10), 1547–1550. http://doi.org/10.1002/asi.20616

van Raan, A. F. J. (2005). Comparison of the Hirsch-index with standard bibliometric indicators and with peer judgment for 147 chemistry research groups. arXiv:physics/0511206. Retrieved from http://arxiv.org/abs/physics/0511206

Waaijers, L. (2011). Viva the h-index. arXiv:1109.5520 [physics]. Retrieved from http://arxiv.org/abs/1109.5520

Waltman, L., & van Eck, N. J. (2011). The inconsistency of the h-index. arXiv:1108.3901 [physics]. Retrieved from http://arxiv.org/abs/1108.3901

Yong, A. (2014). Critique of Hirsch’s citation index: a combinatorial Fermi problem. arXiv:1402.4357 [math]. Retrieved from http://arxiv.org/abs/1402.4357

Zhu, X., Turney, P., Lemire, D., & Vellino, A. (2015). Measuring academic influence: Not all citations are equal. Journal of the Association for Information Science and Technology, 66(2), 408–427. http://doi.org/10.1002/asi.23179

Marketplace Scan

The “marketplace of ideas” is occupied by a number of related projects and companies.

GitHub Data Challenges

GitHub Data Sources

Metrics, Aggregation, and Attribution

  • Thomson Reuters
    • Institute for Scientific Information, Science Citation Index
    • Web of Science
    • Journal Citation Reports
  • Altmetric
  • ImpactStory
  • CiteULike
  • CrossRef
  • Scopus
  • FigShare
  • ResearchGate
  • Academia.edu
  • SciELO
  • Google
    • Scholar
    • i10 Index
  • Microsoft Research
  • Mendeley
  • Zotero
  • Scimago Journal & Country Rank
  • Eigenfactor

Learn More

Read our reports about gh-impact.