The Hyperlingual and Spatiotemporal Wikipedia API for Academic Researchers

By Brent Hecht


Northwestern University

Wikipedia’s co-founder Jimmy Wales is fond of saying, “Imagine a world in which every single person...is given free access to the sum of all human knowledge.” Wikipedia has come a long way toward accomplishing this goal. However, Wikipedia is not immune to the splintering effect of language, “the biggest barrier to intercultural collaboration” (Yamashita et al. 2009). Research has shown that each language edition of Wikipedia covers a different set of concepts, and that even when editions cover the same concept they often discuss it very differently (Hecht and Gergle 2010). WikAPIdia, with its hyperlingual perspective, allows researchers to interpret and leverage this “diversity of knowledge representations” (Hecht and Gergle 2010). Since spatial (and temporal) reference systems are critical for understanding human knowledge, WikAPIdia also includes packages for working with spatiotemporal Wikipedia data.

WikAPIdia has been used in a growing number of academic publications at prominent venues, including CHI, CSCW, and COSIT.


  1. WikAPIdia is hyperlingual by nature. This means it can be used to simultaneously access the data in not one, not two, but any number of Wikipedia language editions. Twenty-five language editions are currently supported, and it is easy to add more!

  2. WikAPIdia is spatially referenced by nature. It has extensive functionality for connecting the concepts in Wikipedia to any number of spatial reference systems. These include geography, time, and many more (see the documentation for details).

  3. WikAPIdia includes support for several different Wikipedia-based semantic relatedness algorithms. Semantic relatedness is one of the most successful and promising applications of Wikipedia data.

  4. WikAPIdia is Java-based and has a MySQL back-end.


WikAPIdia in its current form has been used in the following publications in the HCI and GIScience domains:

  1. Hecht, B. and D. Gergle (2010). The Tower of Babel Meets Web 2.0: User-Generated Content and its Applications in a Multilingual Context. CHI ’10: 28th ACM Conference on Human Factors in Computing Systems. Atlanta, GA, USA.

  2. Hecht, B. and D. Gergle (2010). On the “Localness” of User-Generated Content. CSCW ’10: 2010 ACM Conference on Computer-Supported Cooperative Work. Savannah, GA, USA.

  3. Hecht, B. and E. Moxley (2009). Terabytes of Tobler: Evaluating the First Law in a Massive, Domain-Neutral Representation of World Knowledge. COSIT ’09: 9th International Conference on Spatial Information Theory. L'Aber Wrac'h, France. Springer: 88-105.

  4. Hecht, B. and D. Gergle (2009). Measuring Self-Focus Bias in Community-Maintained Knowledge Repositories. Communities and Technologies 2009: Fourth International Conference on Communities and Technologies, University Park, PA, USA. ACM: 11-21.


Special thanks to Patti Bao, Alina Lungeau, Darren Gergle, and Johannes Schöning.

YourKit is kindly supporting open source projects with its full-featured Java Profiler.

YourKit, LLC is the creator of innovative and intelligent tools for profiling Java and .NET applications. Take a look at YourKit's leading software products: YourKit .NET Profiler and YourKit Java Profiler.