Friday, November 1, 2013

Soviet and Post-Soviet History Dissertation Database

For the last month or so I have been putting together a database of the titles, authors and other data on history dissertations written in the Soviet Union and post-Soviet countries. I am now ready to make it available as a web application here. When I have some time, I am planning to post a little about the topics in the database, the geography of the database, the sources used or possibly something like the great chronological analysis of history dissertations that Ben Schmidt posted last spring (part one here). (I hesitate only because I strongly suspect that the turning-point years for dissertations written in the Soviet period will follow the Communist Party's official chronology.)

Before I post a little about what is included in the database, I should put out a little crowdsourcing call: I used a great Python module called Goslate, a Google Translate parser, to generate the English titles of the dissertations. On the whole I think the translations are understandable, if not elegant. But in some cases the translations are just wrong. For example, L'viv translated automatically as Lions. This is where you come in. For each entry you can click on the English title and suggest a new translation. I'm also happy to include other corrections if there are errors.

Here is some information about the sources and general contours of the information in the database:

Sources: I came across listings for Soviet history dissertations in Voprosy istorii a few years ago and thought it would be a cool project to play around with if I could turn the listings into a database. I had the digitizing help of a former student, Aya Bara, and EastView's digital archive of the journal. The journal's listings amount to 6,589 candidate and doctoral dissertations from 1945 to 1966 and another 909 doctoral dissertations from 1976 to 1986. The listings include dissertations from all of the disciplines included under the broader Higher Qualifications Commission (VAK) code for history--Soviet history, "general history" (i.e., non-Soviet), archaeology, ethnography and so on.

The second source I used was Dissercat, a website that sells recent dissertations written in former Soviet and nearby countries (e.g., Mongolia--thanks for the heads up, Kyle Marquardt). I do not endorse using that site to purchase dissertations. I would instead recommend contacting the scholar whose work you would like to read, if not taking a trip to the Russian State Library the next time you are in Moscow. However, a dissertation's Dissercat entry is worth checking out, since it often includes the dissertation's introduction, conclusion and bibliography. All told, Dissercat gave 10,868 dissertations on Eurasian history and another 7,420 in other VAK disciplines. These entries begin in the early 1980s (with a handful before) and they include dissertations from as late as this year (2013).

I do not know how complete the listings are. My impression is that Voprosy istorii included all of the dissertations for the years it covered. The Dissercat listings are good for the 2000s but spottier for earlier years, since the site is mostly interested in posting dissertations it can sell.

Years: The database includes 2,973 dissertations from 1945 to 1952, 3,622 from 1953 to 1964, 619 (mostly doctoral and mostly from later in the period) from 1965 to 1982, 1,394 from 1983 to 1991, 4,266 from 1992 to 1999 and 12,865 from 2000 to the present. Another forty-seven had no date but are probably from the post-1991 period.

Degree: Most of the dissertations (21,436) are for the lower, candidate degree. The remainder (4,350) are doctoral degrees.
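As a quick sanity check, the counts reported above by source, by period and by degree can be summed and compared. The sketch below only re-adds the figures quoted in this post; the dictionary names are mine, and it assumes that every Dissercat entry mentioned (including those in other VAK disciplines) is in the database.

```python
# Counts copied from the Sources, Years and Degree paragraphs of this post.
# The grouping labels are illustrative, not field names from the database.
by_source = {
    "Voprosy istorii, 1945-1966": 6589,
    "Voprosy istorii, 1976-1986": 909,
    "Dissercat, Eurasian history": 10868,
    "Dissercat, other VAK disciplines": 7420,
}
by_period = {
    "1945-1952": 2973,
    "1953-1964": 3622,
    "1965-1982": 619,
    "1983-1991": 1394,
    "1992-1999": 4266,
    "2000-present": 12865,
    "no date": 47,
}
by_degree = {"candidate": 21436, "doctoral": 4350}

# Sum each breakdown so the three totals can be compared side by side.
totals = {
    "source": sum(by_source.values()),
    "period": sum(by_period.values()),
    "degree": sum(by_degree.values()),
}

for name, total in totals.items():
    print(f"total by {name}: {total:,}")
```

All three breakdowns add up to the same total, which suggests that every entry carries a source and a degree, and either a year or a "no date" flag.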

Cities: I should note that in the Voprosy istorii listings that include a city, the city listed is the author's place of work. In the Dissercat listings, the city is probably where the dissertation was defended. Overall, roughly 200 different cities are associated with these dissertations. About a quarter are from Moscow (6,187) and the next largest number are from Petersburg/Leningrad (1,764). But the largest group consists of those that did not list a city at all (6,590). This is especially unfortunate because it includes most of the listings from 1945-1964.

Institutions: Instead of giving cities of defense/work, many of the earlier listings included institutions, whereas the later listings mostly do not. Of the 6,885 that do list an institution, Moscow State University (MGU) has the largest number--1,028. MGU is followed by the Academy of Social Sciences of the Communist Party (AON) with 810, Leningrad State University with 519 and the Institute of History of the Academy of Sciences of the USSR with 457.

And just for fun, here are the top sixty-five substantive words (prepositions, articles and words like "late" or "early" removed) that occur in the English translations:

Rank Word Occurrences % of Total
1 russian 2973 0.76%
2 development 2650 0.68%
3 russia 2242 0.57%
4 party 2015 0.52%
5 history 1955 0.50%
6 policy 1930 0.49%
7 soviet 1904 0.49%
8 war 1829 0.47%
9 struggle 1790 0.46%
10 political 1774 0.45%
11 first 1731 0.44%
12 during 1702 0.44%
13 region 1594 0.41%
14 formation 1462 0.37%
15 state 1348 0.35%
16 historical 1340 0.34%
17 social 1338 0.34%
18 activities 1295 0.33%
19 relations 1270 0.33%
20 1917 1245 0.32%
21 materials 1227 0.31%
22 movement 978 0.25%
23 xviii 944 0.24%
24 province 902 0.23%
25 great 866 0.22%
26 education 854 0.22%
27 socialist 839 0.21%
28 communist 820 0.21%
29 cultural 792 0.20%
30 national 765 0.20%
31 siberia 744 0.19%
32 revolution 715 0.18%
33 culture 712 0.18%
34 economic 699 0.18%
35 world 698 0.18%
36 experience 678 0.17%
37 military 677 0.17%
38 republic 667 0.17%
39 system 662 0.17%
40 role 647 0.17%
41 ussr 621 0.16%
42 patriotic 617 0.16%
43 western 612 0.16%
44 problems 612 0.16%
45 1918 608 0.16%
46 life 601 0.15%
47 organization 600 0.15%
48 industry 588 0.15%
49 east 587 0.15%
50 public 579 0.15%
51 volga 576 0.15%
52 foreign 572 0.15%
53 population 556 0.14%
54 central 555 0.14%
55 north 540 0.14%
56 bolshevik 535 0.14%
57 1920 527 0.14%
58 urals 527 0.14%
59 moscow 526 0.13%
60 government 524 0.13%
61 power 514 0.13%
62 union 499 0.13%
63 caucasus 495 0.13%
64 society 485 0.12%
65 historiography 474 0.12%
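For anyone curious how a table like this can be produced, here is a minimal Python sketch. It is not the script I actually ran: the stopword list is a short stand-in, the sample titles are invented placeholders rather than entries from the database, and I am assuming that "% of Total" means a word's share of all words in the titles, stopwords included.

```python
import re
from collections import Counter

# Stand-in stopword list (assumption); the list used for the table was longer.
STOPWORDS = {
    "a", "an", "the", "of", "in", "on", "at", "by", "for", "to",
    "from", "with", "and", "late", "early",
}

def top_words(titles, n=65, stopwords=STOPWORDS):
    """Return (word, occurrences, share_of_all_words) for the n most common words."""
    counts = Counter()
    total = 0
    for title in titles:
        # Lowercase, then keep runs of letters/digits; keep inner apostrophes (e.g. l'viv).
        tokens = re.findall(r"[a-z0-9]+(?:'[a-z0-9]+)*", title.lower())
        total += len(tokens)
        counts.update(t for t in tokens if t not in stopwords)
    if total == 0:
        return []
    return [(word, c, c / total) for word, c in counts.most_common(n)]

# Invented placeholder titles, for illustration only.
sample_titles = [
    "The Development of the Party in Siberia",
    "Russian Policy in the Caucasus",
    "Development of Russian Education",
]

sample_top = top_words(sample_titles, n=2)
for rank, (word, occurrences, share) in enumerate(sample_top, start=1):
    print(f"{rank} {word} {occurrences} {share:.2%}")
```

Changing the denominator to count only the substantive words would raise every percentage but leave the ranking unchanged.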

Comments:

  1. Great project! Have you considered merging the data with plagiarism data from http://www.dissernet.org? Now that it is official that 10% of post-2000 history dissertations are plagiarised, it would be very interesting to see the topical structure of academic misconduct. Are there any topic nests of plagiarised dissertations?

  2. I hadn't heard of that site but I will have to contact them and see if they want to collaborate. Plagiarism is a huge issue in Russia and in my experience is so widely tolerated that I wonder how to make it less acceptable. But I like your idea to look for nests of plagiarism. What topics or specific dissertations are attracting the most attention? Along these lines, I thought about buying a disk of referaty to see what topics are most common. I think there is a better way of doing this though--upcoming post. Thanks for the input!
