Monday, July 15, 2013

Beginnings: Using Big Data in Russian/Eurasian History

This blog is a new venture for me, based on some of the work I have been trying out in digital history in the Russian/Soviet/Eurasian field.  My interest in using computational methods in history first grew out of courses I was taking to reconnect with a longtime interest in computer science, one that had been waylaid by the small matter of my dissertation.  Along the way, I began to see more and more applications for processing historical sources with computers.  In particular, I think there is a place for computers in helping historians collect and process "big data" and in creating new ways of visualizing the past.

What are my goals for this blog?  At first, I will just post the data I'm processing here.  There are a few projects I have in mind, but the first one will probably be what I am considering calling "Who Was the Soviet Kevin Bacon?"  Using roughly 8,000 entries from, I am applying graph theory to figure out who the most central figures in Soviet cinema were from a quantitative standpoint.  I will also try to find out how connected the world of Soviet cinema was as a system.
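To give a rough sense of what "central" can mean here, below is a minimal sketch in Python.  It is not the project's actual code or data: the film titles and names are placeholders, and the function names are my own for illustration.  It assumes one common approach, which is to link two people whenever they are credited on the same film and then treat the person with the smallest average number of steps to everyone else as the most central (the same idea behind "Bacon numbers").

```python
from collections import defaultdict, deque
from itertools import combinations

# Hypothetical toy data: film title -> people credited on it.
# These are placeholders, not real filmography entries.
FILMS = {
    "Film 1": ["Person A", "Person B", "Person C"],
    "Film 2": ["Person C", "Person D"],
    "Film 3": ["Person D", "Person E"],
}


def build_graph(films):
    """Link every pair of people who share a credit on the same film."""
    graph = defaultdict(set)
    for cast in films.values():
        for person in cast:
            graph[person]  # make sure people with no co-credits still appear
        for a, b in combinations(cast, 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def distances_from(graph, start):
    """Breadth-first search: number of steps from start to each reachable person."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for neighbor in graph[current]:
            if neighbor not in dist:
                dist[neighbor] = dist[current] + 1
                queue.append(neighbor)
    return dist


def average_distance(graph, person):
    """Mean number of steps from person to everyone they can reach."""
    dist = distances_from(graph, person)
    others = [d for p, d in dist.items() if p != person]
    return sum(others) / len(others) if others else float("inf")


def most_central(graph):
    """Person with the smallest average distance (ties broken alphabetically)."""
    return min(sorted(graph), key=lambda p: average_distance(graph, p))


if __name__ == "__main__":
    graph = build_graph(FILMS)
    for person in sorted(graph):
        print(person, average_distance(graph, person))
    print("Most central:", most_central(graph))
```

One caveat with this sketch: the average only covers people who can actually be reached, so the comparison is meaningful only among people in the same connected group.  How large that main group is relative to the whole would itself be one measure of how connected the system was.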

At some point, I would like to expand this project to include web applications.  To me, this is the newest and most exciting aspect of this site.  Instead of just publishing processed data, web applications can let users themselves query dynamically for visualizations and information.  The sub-project here that is closest to completion is a database of marriage data for each region from the Russian/Soviet censuses from 1897 to 1989.

The last use of this blog will be as a forum for me to post some broader thoughts on digital history.  Here I will let readers look under the hood a little bit and talk about some of the tools I am using (mostly the programming language Python).  But I would also like to examine the digital history field, both as a whole and within Russian/Eurasian history.

