Trying out MongoDB's map/reduce.
I'll use map/reduce to aggregate, by "type", the collection I created earlier (http://d.hatena.ne.jp/stog/20100531/1275317576).
The collection's contents look like this.
$ mongo mytest
MongoDB shell version: 1.4.4
url: mytest
connecting to: mytest
type "help" for help
>
> db.members2.find()
{ "_id" : ObjectId("4c03c632b4742d2998000000"), "birthday" : 342057600, "type" : "human", "name" : "おがわ", "sex" : "M" }
{ "_id" : ObjectId("4c03c632b4742d2998000001"), "birthday" : 130550400, "type" : "human", "name" : "たかはし", "sex" : "F" }
{ "_id" : ObjectId("4c03c632b4742d2998000002"), "birthday" : 1042588800, "type" : "human", "name" : "たなか", "sex" : "M" }
{ "_id" : ObjectId("4c03c632b4742d2998000003"), "birthday" : -291600000, "type" : "human", "name" : "さとう", "sex" : "F" }
{ "_id" : ObjectId("4c03c632b4742d2998000004"), "birthday" : 1118102400, "type" : "dog", "name" : "ポチ", "sex" : "F" }
{ "_id" : ObjectId("4c03c632b4742d2998000005"), "birthday" : 807840000, "type" : "dog", "name" : "タロ", "sex" : "M" }
{ "_id" : ObjectId("4c03c632b4742d2998000006"), "birthday" : 1230076800, "type" : "cat", "name" : "タマ", "sex" : "F" }
{ "_id" : ObjectId("4c03c633b4742d2998000007"), "birthday" : 914544000, "type" : "cat", "name" : "ミケ", "sex" : "M" }
{ "_id" : ObjectId("4c03c633b4742d2998000008"), "birthday" : 0, "type" : "human", "name" : "John", "sex" : "M" }
{ "_id" : ObjectId("4c03c633b4742d2998000009"), "birthday" : -927676800, "type" : "human", "name" : "Michael", "sex" : "M" }
{ "_id" : ObjectId("4c03c633b4742d299800000a"), "birthday" : 927158400, "type" : "human", "name" : "Robert", "sex" : "M" }
{ "_id" : ObjectId("4c03c633b4742d299800000b"), "birthday" : 1259971200, "type" : "human", "name" : "David", "sex" : "M" }
{ "_id" : ObjectId("4c03c633b4742d299800000c"), "birthday" : -86400, "type" : "human", "name" : "James", "sex" : "M" }
{ "_id" : ObjectId("4c03c633b4742d299800000d"), "birthday" : 481939200, "type" : "human", "name" : "Mary", "sex" : "F" }
{ "_id" : ObjectId("4c03c633b4742d299800000e"), "birthday" : -448588800, "type" : "human", "name" : "Barbara", "sex" : "F" }
{ "_id" : ObjectId("4c03c633b4742d299800000f"), "birthday" : 979948800, "type" : "human", "name" : "Anne", "sex" : "F" }
{ "_id" : ObjectId("4c03c633b4742d2998000010"), "birthday" : -765849600, "type" : "human", "name" : "Maria", "sex" : "F" }
{ "_id" : ObjectId("4c03c633b4742d2998000011"), "birthday" : 249696000, "type" : "human", "name" : "Susan", "sex" : "F" }
>
First, I'll try it in the MongoDB interactive shell.
$ mongo mytest
MongoDB shell version: 1.4.4
url: mytest
connecting to: mytest
type "help" for help
>
> var m = function(){ emit(this.type, 1); };
> var r = function(k, vals){ var sum=0; for(var i in vals) sum += vals[i]; return sum; };
> res = db.members2.mapReduce(m, r);
{
    "result" : "tmp.mr.mapreduce_1278193998_9",
    "timeMillis" : 39,
    "counts" : {
        "input" : 18,
        "emit" : 18,
        "output" : 3
    },
    "ok" : 1,
}
>
> db[res.result].find();
{ "_id" : "cat", "value" : 2 }
{ "_id" : "dog", "value" : 2 }
{ "_id" : "human", "value" : 14 }
>
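To make it clearer what the map and reduce functions are doing, here is a rough pure-Python sketch of the same idea, with no MongoDB involved. The names `map_fn`, `reduce_fn`, and `map_reduce` are made up for this illustration, and the sample data keeps only the "type" field of the 18 documents above. It is a simplification: it calls the reducer exactly once per key, whereas MongoDB may call reduce several times on partial results for the same key.

```python
from collections import defaultdict

# Illustrative data: only the "type" field of the 18 documents shown earlier.
docs = [{"type": "human"}] * 14 + [{"type": "dog"}] * 2 + [{"type": "cat"}] * 2


def map_fn(doc):
    """Rough counterpart of m: emit one (key, value) pair per document."""
    yield doc["type"], 1


def reduce_fn(key, values):
    """Rough counterpart of r: sum every value emitted for one key."""
    return sum(values)


def map_reduce(documents, mapper, reducer):
    """Hypothetical helper: group emitted values by key, then reduce each group."""
    grouped = defaultdict(list)
    for doc in documents:
        for key, value in mapper(doc):
            grouped[key].append(value)
    return {key: reducer(key, values) for key, values in grouped.items()}


counts = map_reduce(docs, map_fn, reduce_fn)
print(sorted(counts.items()))
```

Each document contributes a single 1 under its type, so the summed values per key are simply the per-type document counts, which is what the shell session reports.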
Next, I'll do the same aggregation from Python.
test_mapReduce.py
#!/usr/bin/env python
# -*- coding:utf-8 -*-
import pymongo

conn = pymongo.Connection()
db = conn["mytest"]
coll = db["members2"]

# Define map and reduce functions to aggregate by type.
# (The JavaScript code is passed as strings.)
m = pymongo.code.Code("function(){ emit(this.type, 1); }")
r = pymongo.code.Code("""
function(k, vals){
    var sum = 0;
    for(var i in vals) sum += vals[i];
    return sum;
}
""")

result = coll.map_reduce(m, r)
for doc in result.find():
    print doc

conn.disconnect()
The output is below.
$ python test_mapReduce.py
{u'_id': u'cat', u'value': 2.0}
{u'_id': u'dog', u'value': 2.0}
{u'_id': u'human', u'value': 14.0}
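Note that the counts arrive in Python as floats (2.0, 14.0), presumably because the sum is computed in JavaScript, where numbers are doubles. If integer counts are wanted, a small post-processing step like the following sketch would do. The helper name `to_counts` is made up here, and the input is just the three result documents from the run above, copied as literals.

```python
# Result documents copied from the run above (u'' prefixes dropped).
result_docs = [
    {"_id": "cat", "value": 2.0},
    {"_id": "dog", "value": 2.0},
    {"_id": "human", "value": 14.0},
]


def to_counts(docs):
    """Hypothetical helper: turn map/reduce output into a {type: int count} dict."""
    return {doc["_id"]: int(doc["value"]) for doc in docs}


counts = to_counts(result_docs)
print(counts)
```

In the script itself, the same helper could be applied to `result.find()` instead of the literal list.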
References:
http://www.mongodb.org/display/DOCSJP/MapReduce
http://api.mongodb.org/python/1.7%2B/examples/map_reduce.html