[python]Mongodb-白红宇



Published: 2019-06-12


Documentation:

http://api.mongodb.com/python/current/tutorial.html

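Installation:

Download the installer directly from the official website. On macOS, installing through brew downloads too slowly, so I plan to install manually.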

Usage:

Start the server:

    mongod                  # start the server with the default configuration
    mongod --dbpath <path>  # specify the path of the database files
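Once mongod is running, it can be handy to confirm from Python that something is listening before connecting. The sketch below is only a rough check, assuming the server uses the default port 27017 on the local machine; the helper name is_port_open is made up for this example, and a successful TCP connection does not prove the listener is actually MongoDB.

```python
import socket


def is_port_open(host="127.0.0.1", port=27017, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection performs the TCP handshake and raises OSError on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling is_port_open() with no arguments checks the default mongod address.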

Connect to the server:

    mongo  # connect with the default configuration
    mongo [options] [db address] [file names (ending in .js)]
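From Python, the same host, port and database are usually passed to the driver as a mongodb:// connection string. The following is a minimal sketch of assembling one with the standard library; build_mongo_uri is a hypothetical helper, and the defaults (localhost, 27017) are assumptions matching a default local install.

```python
from urllib.parse import quote_plus


def build_mongo_uri(host="localhost", port=27017, database=None,
                    username=None, password=None):
    """Assemble a mongodb:// connection string from its parts."""
    credentials = ""
    if username is not None and password is not None:
        # Percent-encode credentials so characters such as '@' or ':' stay unambiguous.
        credentials = "{}:{}@".format(quote_plus(username), quote_plus(password))
    path = "/{}".format(database) if database else ""
    return "mongodb://{}{}:{}{}".format(credentials, host, port, path)
```

The resulting string can be handed to pymongo.MongoClient, which accepts a connection URI as its first argument.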

GUI client:

https://www.robomongo.org/

shell:

> help
    db.help()                    help on db methods
    db.mycoll.help()             help on collection methods
    sh.help()                    sharding helpers
    rs.help()                    replica set helpers
    help admin                   administrative help
    help connect                 connecting to a db help
    help keys                    key shortcuts
    help misc                    misc things to know
    help mr                      mapreduce

    show dbs                     show database names
    show collections             show collections in current database
    show users                   show users in current database
    show profile                 show most recent system.profile entries with time >= 1ms
    show logs                    show the accessible logger names
    show log [name]              prints out the last segment of log in memory, 'global' is default
    use <db_name>                set current database
    db.foo.find()                list objects in collection foo
    db.foo.find( { a : 1 } )     list objects in foo where a == 1
    it                           result of the last line evaluated; use to further iterate
    DBQuery.shellBatchSize = x   set default number of items to display on shell
    exit                         quit the mongo shell
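For comparison, a few of the shell commands above have close counterparts in pymongo. This is a rough sketch, not a full mapping: the function names list_overview and find_matching are invented here, and client is assumed to be a pymongo.MongoClient (version 3.6 or later, where list_database_names and list_collection_names are available).

```python
def list_overview(client, db_name):
    """Rough equivalents of `show dbs`, `use <db_name>` and `show collections`."""
    database_names = client.list_database_names()    # show dbs
    db = client[db_name]                              # use <db_name>
    collection_names = db.list_collection_names()     # show collections
    return database_names, collection_names


def find_matching(collection, query=None):
    """Equivalent of db.foo.find() or db.foo.find({a: 1}), returned as a list."""
    # An empty filter document matches every document in the collection.
    return list(collection.find(query or {}))


# Example wiring (needs pymongo and a running server):
#   import pymongo
#   client = pymongo.MongoClient("localhost", 27017)
#   print(list_overview(client, "test"))
#   print(find_matching(client["test"]["foo"], {"a": 1}))
```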

More help...

> db.help()
DB methods:
    db.adminCommand(nameOrDocument) - switches to 'admin' db, and runs command [just calls db.runCommand(...)]
    db.aggregate([pipeline], {options}) - performs a collectionless aggregation on this database; returns a cursor
    db.auth(username, password)
    db.cloneDatabase(fromhost)
    db.commandHelp(name) returns the help for the command
    db.copyDatabase(fromdb, todb, fromhost)
    db.createCollection(name, {size: ..., capped: ..., max: ...})
    db.createView(name, viewOn, [{$operator: {...}}, ...], {viewOptions})
    db.createUser(userDocument)
    db.currentOp() displays currently executing operations in the db
    db.dropDatabase()
    db.eval() - deprecated
    db.fsyncLock() flush data to disk and lock server for backups
    db.fsyncUnlock() unlocks server following a db.fsyncLock()
    db.getCollection(cname) same as db['cname'] or db.cname
    db.getCollectionInfos([filter]) - returns a list that contains the names and options of the db's collections
    db.getCollectionNames()
    db.getLastError() - just returns the err msg string
    db.getLastErrorObj() - return full status object
    db.getLogComponents()
    db.getMongo() get the server connection object
    db.getMongo().setSlaveOk() allow queries on a replication slave server
    db.getName()
    db.getPrevError()
    db.getProfilingLevel() - deprecated
    db.getProfilingStatus() - returns if profiling is on and slow threshold
    db.getReplicationInfo()
    db.getSiblingDB(name) get the db at the same server as this one
    db.getWriteConcern() - returns the write concern used for any operations on this db, inherited from server object if set
    db.hostInfo() get details about the server's host
    db.isMaster() check replica primary status
    db.killOp(opid) kills the current operation in the db
    db.listCommands() lists all the db commands
    db.loadServerScripts() loads all the scripts in db.system.js
    db.logout()
    db.printCollectionStats()
    db.printReplicationInfo()
    db.printShardingStatus()
    db.printSlaveReplicationInfo()
    db.dropUser(username)
    db.repairDatabase()
    db.resetError()
    db.runCommand(cmdObj) run a database command.  if cmdObj is a string, turns it into {cmdObj: 1}
    db.serverStatus()
    db.setLogLevel(level,<component>)
    db.setProfilingLevel(level,slowms) 0=off 1=slow 2=all
    db.setWriteConcern( <write concern doc> ) - sets the write concern for writes to the db
    db.unsetWriteConcern( <write concern doc> ) - unsets the write concern for writes to the db
    db.setVerboseShell(flag) display extra information in shell output
    db.shutdownServer()
    db.stats()
    db.version() current version of the server
>

DB methods

> db.mycoll.help()
DBCollection help
    db.mycoll.find().help() - show DBCursor help
    db.mycoll.bulkWrite( operations, <optional params> ) - bulk execute write operations, optional parameters are: w, wtimeout, j
    db.mycoll.count( query = {}, <optional params> ) - count the number of documents that matches the query, optional parameters are: limit, skip, hint, maxTimeMS
    db.mycoll.copyTo(newColl) - duplicates collection by copying all documents to newColl; no indexes are copied.
    db.mycoll.convertToCapped(maxBytes) - calls {convertToCapped:'mycoll', size:maxBytes}} command
    db.mycoll.createIndex(keypattern[,options])
    db.mycoll.createIndexes([keypatterns], <options>)
    db.mycoll.dataSize()
    db.mycoll.deleteOne( filter, <optional params> ) - delete first matching document, optional parameters are: w, wtimeout, j
    db.mycoll.deleteMany( filter, <optional params> ) - delete all matching documents, optional parameters are: w, wtimeout, j
    db.mycoll.distinct( key, query, <optional params> ) - e.g. db.mycoll.distinct( 'x' ), optional parameters are: maxTimeMS
    db.mycoll.drop() drop the collection
    db.mycoll.dropIndex(index) - e.g. db.mycoll.dropIndex( "indexName" ) or db.mycoll.dropIndex( { "indexKey" : 1 } )
    db.mycoll.dropIndexes()
    db.mycoll.ensureIndex(keypattern[,options]) - DEPRECATED, use createIndex() instead
    db.mycoll.explain().help() - show explain help
    db.mycoll.reIndex()
    db.mycoll.find([query],[fields]) - query is an optional query filter. fields is optional set of fields to return.
                                       e.g. db.mycoll.find( {x:77} , {name:1, x:1} )
    db.mycoll.find(...).count()
    db.mycoll.find(...).limit(n)
    db.mycoll.find(...).skip(n)
    db.mycoll.find(...).sort(...)
    db.mycoll.findOne([query], [fields], [options], [readConcern])
    db.mycoll.findOneAndDelete( filter, <optional params> ) - delete first matching document, optional parameters are: projection, sort, maxTimeMS
    db.mycoll.findOneAndReplace( filter, replacement, <optional params> ) - replace first matching document, optional parameters are: projection, sort, maxTimeMS, upsert, returnNewDocument
    db.mycoll.findOneAndUpdate( filter, update, <optional params> ) - update first matching document, optional parameters are: projection, sort, maxTimeMS, upsert, returnNewDocument
    db.mycoll.getDB() get DB object associated with collection
    db.mycoll.getPlanCache() get query plan cache associated with collection
    db.mycoll.getIndexes()
    db.mycoll.group( { key : ..., initial: ..., reduce : ...[, cond: ...] } )
    db.mycoll.insert(obj)
    db.mycoll.insertOne( obj, <optional params> ) - insert a document, optional parameters are: w, wtimeout, j
    db.mycoll.insertMany( [objects], <optional params> ) - insert multiple documents, optional parameters are: w, wtimeout, j
    db.mycoll.mapReduce( mapFunction , reduceFunction , <optional params> )
    db.mycoll.aggregate( [pipeline], <optional params> ) - performs an aggregation on a collection; returns a cursor
    db.mycoll.remove(query)
    db.mycoll.replaceOne( filter, replacement, <optional params> ) - replace the first matching document, optional parameters are: upsert, w, wtimeout, j
    db.mycoll.renameCollection( newName , <dropTarget> ) renames the collection.
    db.mycoll.runCommand( name , <options> ) runs a db command with the given name where the first param is the collection name
    db.mycoll.save(obj)
    db.mycoll.stats({scale: N, indexDetails: true/false, indexDetailsKey: <index key>, indexDetailsName: <index name>})
    db.mycoll.storageSize() - includes free space allocated to this collection
    db.mycoll.totalIndexSize() - size in bytes of all the indexes
    db.mycoll.totalSize() - storage allocated for all data and indexes
    db.mycoll.update( query, object[, upsert_bool, multi_bool] ) - instead of two flags, you can pass an object with fields: upsert, multi
    db.mycoll.updateOne( filter, update, <optional params> ) - update the first matching document, optional parameters are: upsert, w, wtimeout, j
    db.mycoll.updateMany( filter, update, <optional params> ) - update all matching documents, optional parameters are: upsert, w, wtimeout, j
    db.mycoll.validate( <full> ) - SLOW
    db.mycoll.getShardVersion() - only for use with sharding
    db.mycoll.getShardDistribution() - prints statistics about data distribution in the cluster
    db.mycoll.getSplitKeysForChunks( <maxChunkSize> ) - calculates split points over all chunks and returns splitter function
    db.mycoll.getWriteConcern() - returns the write concern used for any operations on this collection, inherited from server/db if set
    db.mycoll.setWriteConcern( <write concern doc> ) - sets the write concern for writes to the collection
    db.mycoll.unsetWriteConcern( <write concern doc> ) - unsets the write concern for writes to the collection
    db.mycoll.latencyStats() - display operation latency histograms for this collection
>

Collection methods

> sh.help()
    sh.addShard( host )                       server:port OR setname/server:port
    sh.addShardToZone(shard,zone)             adds the shard to the zone
    sh.updateZoneKeyRange(fullName,min,max,zone)      assigns the specified range of the given collection to a zone
    sh.disableBalancing(coll)                 disable balancing on one collection
    sh.enableBalancing(coll)                  re-enable balancing on one collection
    sh.enableSharding(dbname)                 enables sharding on the database dbname
    sh.getBalancerState()                     returns whether the balancer is enabled
    sh.isBalancerRunning()                    return true if the balancer has work in progress on any mongos
    sh.moveChunk(fullName,find,to)            move the chunk where 'find' is to 'to' (name of shard)
    sh.removeShardFromZone(shard,zone)      removes the shard from zone
    sh.removeRangeFromZone(fullName,min,max)   removes the range of the given collection from any zone
    sh.shardCollection(fullName,key,unique,options)   shards the collection
    sh.splitAt(fullName,middle)               splits the chunk that middle is in at middle
    sh.splitFind(fullName,find)               splits the chunk that find is in at the median
    sh.startBalancer()                        starts the balancer so chunks are balanced automatically
    sh.status()                               prints a general overview of the cluster
    sh.stopBalancer()                         stops the balancer so chunks are not balanced automatically
    sh.disableAutoSplit()                   disable autoSplit on one collection
    sh.enableAutoSplit()                    re-enable autoSplit on one collection
    sh.getShouldAutoSplit()                 returns whether autosplit is enabled
>

sharding helpers

> rs.help()
    rs.status()                                { replSetGetStatus : 1 } checks repl set status
    rs.initiate()                              { replSetInitiate : null } initiates set with default settings
    rs.initiate(cfg)                           { replSetInitiate : cfg } initiates set with configuration cfg
    rs.conf()                                  get the current configuration object from local.system.replset
    rs.reconfig(cfg)                           updates the configuration of a running replica set with cfg (disconnects)
    rs.add(hostportstr)                        add a new member to the set with default attributes (disconnects)
    rs.add(membercfgobj)                       add a new member to the set with extra attributes (disconnects)
    rs.addArb(hostportstr)                     add a new member which is arbiterOnly:true (disconnects)
    rs.stepDown([stepdownSecs, catchUpSecs])   step down as primary (disconnects)
    rs.syncFrom(hostportstr)                   make a secondary sync from the given member
    rs.freeze(secs)                            make a node ineligible to become primary for the time specified
    rs.remove(hostportstr)                     remove a host from the replica set (disconnects)
    rs.slaveOk()                               allow queries on secondary nodes

    rs.printReplicationInfo()                  check oplog size and time range
    rs.printSlaveReplicationInfo()             check replica set members and replication lag
    db.isMaster()                              check who is primary

    reconfiguration helpers disconnect from the database so the shell will display
    an error, even if the command succeeds.
>

replica set helpers

> help admin
    ls([path])                      list files
    pwd()                           returns current directory
    listFiles([path])               returns file list
    hostname()                      returns name of this host
    cat(fname)                      returns contents of text file as a string
    removeFile(f)                   delete a file or directory
    load(jsfilename)                load and execute a .js file
    run(program[, args...])         spawn a program and wait for its completion
    runProgram(program[, args...])  same as run(), above
    sleep(m)                        sleep m milliseconds
    getMemInfo()                    diagnostic
>

administrative help

> help connect

Normally one specifies the server on the mongo shell command line.  Run mongo --help to see those options.
Additional connections may be opened:

    var x = new Mongo('host[:port]');
    var mydb = x.getDB('mydb');
  or
    var mydb = connect('host[:port]/mydb');

Note: the REPL prompt only auto-reports getLastError() for the shell command line connection.

>

connect db help

> help keys
Tab completion and command history is available at the command prompt.

Some emacs keystrokes are available too:
  Ctrl-A start of line
  Ctrl-E end of line
  Ctrl-K del to end of line

Multi-line commands
You can enter a multi line javascript expression.  If parens, braces, etc. are not closed, you will see a new line
beginning with '...' characters.  Type the rest of your expression.  Press Ctrl-C to abort the data entry if you
get stuck.

>

shortcut keys

> help misc
    b = new BinData(subtype,base64str)  create a BSON BinData value
    b.subtype()                         the BinData subtype (0..255)
    b.length()                          length of the BinData data in bytes
    b.hex()                             the data as a hex encoded string
    b.base64()                          the data as a base 64 encoded string
    b.toString()

    b = HexData(subtype,hexstr)         create a BSON BinData value from a hex string
    b = UUID(hexstr)                    create a BSON BinData value of UUID subtype
    b = MD5(hexstr)                     create a BSON BinData value of MD5 subtype
    "hexstr"                            string, sequence of hex characters (no 0x prefix)

    o = new ObjectId()                  create a new ObjectId
    o.getTimestamp()                    return timestamp derived from first 32 bits of the OID
    o.isObjectId
    o.toString()
    o.equals(otherid)

    d = ISODate()                       like Date() but behaves more intuitively when used
    d = ISODate('YYYY-MM-DD hh:mm:ss')    without an explicit "new " prefix on construction
>

misc
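The help text notes that o.getTimestamp() is derived from the first 32 bits of the ObjectId. The same decoding can be sketched in plain Python, assuming the id is given as its 24-character hex string; objectid_timestamp is a name chosen for this example and is not part of any driver.

```python
from datetime import datetime, timezone


def objectid_timestamp(oid_hex):
    """Decode the creation time stored in the first 4 bytes of an ObjectId."""
    if len(oid_hex) != 24:
        raise ValueError("an ObjectId is 24 hex characters (12 bytes)")
    # The first 8 hex characters are a big-endian count of seconds since the Unix epoch.
    seconds = int(oid_hex[:8], 16)
    return datetime.fromtimestamp(seconds, tz=timezone.utc)
```

With pymongo installed, bson.ObjectId exposes the same information through its generation_time attribute, so this helper is mainly useful for understanding the layout.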

> help mr

See also http://dochub.mongodb.org/core/mapreduce

function mapf() {
  // 'this' holds current document to inspect
  emit(key, value);
}

function reducef(key,value_array) {
  return reduced_value;
}

db.mycollection.mapReduce(mapf, reducef[, options])

options
{[query : <query filter object>]
 [, sort : <sort the query.  useful for optimization>]
 [, limit : <number of objects to return from collection>]
 [, out : <output-collection name>]
 [, keeptemp: <true|false>]
 [, finalize : <finalizefunction>]
 [, scope :
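To make the mapf/reducef contract more concrete, here is a small in-memory imitation in Python. It is only a sketch of the idea, not how the server executes mapReduce: the map function yields (key, value) pairs instead of calling emit(), and the document shape (a "tags" list) and all function names are assumptions for illustration.

```python
from collections import defaultdict


def map_reduce(documents, mapf, reducef):
    """Group the (key, value) pairs yielded by mapf, then reduce each group."""
    grouped = defaultdict(list)
    for doc in documents:
        # Each yielded pair stands in for one emit(key, value) call in the shell version.
        for key, value in mapf(doc):
            grouped[key].append(value)
    return {key: reducef(key, values) for key, values in grouped.items()}


def count_by_tag(doc):
    """Map step: emit (tag, 1) for every tag on the document."""
    for tag in doc.get("tags", []):
        yield tag, 1


def sum_values(key, values):
    """Reduce step: collapse all values emitted for one key."""
    return sum(values)
```

One simplification to keep in mind: this sketch calls the reducer exactly once per key, whereas the server may skip it for keys with a single value or call it repeatedly on partial results, so a real reduce function has to tolerate both.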

Python driver

pip install pymongo
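After installing the driver, a quick round trip is a reasonable smoke test. The sketch below assumes a pymongo collection object is passed in; save_and_fetch is a made-up helper name, and the database and collection names in the comments ("test", "demo") are placeholders.

```python
def save_and_fetch(collection, document):
    """Insert one document and read it back by the _id the server assigned."""
    result = collection.insert_one(document)
    # inserted_id is the _id of the stored document (generated if none was supplied).
    return collection.find_one({"_id": result.inserted_id})


# Example wiring (needs pymongo and a running mongod on the default port):
#   import pymongo
#   client = pymongo.MongoClient("localhost", 27017)
#   print(save_and_fetch(client["test"]["demo"], {"a": 1}))
#   client.close()
```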

scrapy:

settings.py

# Current Scrapy expects a dict mapping pipeline path to run order (300 is an arbitrary order value).
ITEM_PIPELINES = {'stack.pipelines.MongoDBPipeline': 300}

MONGODB_SERVER = "localhost"
MONGODB_PORT = 27017
MONGODB_DB = "stackoverflow"
MONGODB_COLLECTION = "questions"

pipelines.py

import logging

import pymongo
from scrapy.exceptions import DropItem

logger = logging.getLogger(__name__)


class MongoDBPipeline(object):

    def __init__(self, server, port, db_name, collection_name):
        connection = pymongo.MongoClient(server, port)
        db = connection[db_name]
        self.collection = db[collection_name]

    @classmethod
    def from_crawler(cls, crawler):
        # Read settings from the crawler; scrapy.conf and scrapy.log are not
        # available in current Scrapy releases.
        settings = crawler.settings
        return cls(
            settings.get('MONGODB_SERVER'),
            settings.getint('MONGODB_PORT'),
            settings.get('MONGODB_DB'),
            settings.get('MONGODB_COLLECTION'),
        )

    def process_item(self, item, spider):
        # Iterating an item yields field names, so check each field's value.
        for field in item:
            if not item[field]:
                raise DropItem("Missing {0}!".format(field))
        # insert_one() replaces the deprecated Collection.insert().
        self.collection.insert_one(dict(item))
        logger.debug("Question added to MongoDB database!")
        return item
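The validation step in process_item is meant to drop items that have empty fields. Because iterating over a dict-like item yields field names rather than values, the check has to look each value up. A standalone sketch of that check, using a plain dict in place of a Scrapy item and a hypothetical helper name missing_fields:

```python
def missing_fields(item):
    """Return the names of fields whose values are empty or otherwise falsy."""
    # Iterating over a dict-like item yields field names, so look each value up explicitly.
    return [name for name in item if not item[name]]
```

A pipeline could raise DropItem whenever this list is non-empty and report the offending field names in the message.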

From the official Scrapy documentation (https://doc.scrapy.org/en/latest/topics/item-pipeline.html#write-items-to-mongodb):

pipelines.py

import pymongo


class MongoPipeline(object):

    collection_name = 'scrapy_items'

    def __init__(self, mongo_uri, mongo_db):
        self.mongo_uri = mongo_uri
        self.mongo_db = mongo_db

    @classmethod
    def from_crawler(cls, crawler):
        return cls(
            mongo_uri=crawler.settings.get('MONGO_URI'),
            mongo_db=crawler.settings.get('MONGO_DATABASE', 'items')
        )

    def open_spider(self, spider):
        self.client = pymongo.MongoClient(self.mongo_uri)
        self.db = self.client[self.mongo_db]

    def close_spider(self, spider):
        self.client.close()

    def process_item(self, item, spider):
        self.db[self.collection_name].insert_one(dict(item))
        return item
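This pipeline reads MONGO_URI and MONGO_DATABASE from the project settings, so settings.py needs matching entries. A possible fragment is sketched below; the package name "myproject", the order value 300 and the localhost URI are placeholders, not values from the Scrapy documentation.

```python
# settings.py (sketch; adjust names to the actual project)
MONGO_URI = "mongodb://localhost:27017"  # read by MongoPipeline.from_crawler
MONGO_DATABASE = "items"                 # same value the pipeline falls back to when unset

ITEM_PIPELINES = {
    # "myproject" is a placeholder package name; 300 is an arbitrary run order.
    "myproject.pipelines.MongoPipeline": 300,
}
```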

Reposted from: https://www.cnblogs.com/sigai/p/8417550.html
