MongoDB errors and operations commands (Part 1)

子曰
2023-03-16

1. mongos fails to start

May 23 15:01:50 im2-dem003 mongos[12648]: about to fork child process, waiting until server is ready for connections.
May 23 15:01:50 im2-dem003 mongos[12648]: forked process: 12650
May 23 15:01:50 im2-dem003 mongos[12648]: ERROR: child process failed, exited with error number 48
May 23 15:01:50 im2-dem003 mongos[12648]: To see additional information in this output, start without the "--fork" option.
May 23 15:01:50 im2-dem003 systemd[1]: mongos.service: control process exited, code=exited status=48
May 23 15:01:50 im2-dem003 systemd[1]: Failed to start MongoDB Database Server.
-- Subject: Unit mongos.service has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit mongos.service has failed.
--
-- The result is failed.
May 23 15:01:50 im2-dem003 systemd[1]: Unit mongos.service entered failed state.
May 23 15:01:50 im2-dem003 systemd[1]: mongos.service failed.

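Exit code 48 means the process could not start listening for incoming connections. Checking the configuration file showed that the IP address had been written incorrectly.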

2. mongo stepdown fails

"errmsg" : "No electable secondaries caught up as of 2022-05-25T22:20:40.764+0800. Please use the replSetStepDown command with the argument {force: true} to force node to step down.",

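The error message itself names the workaround. Below is a minimal sketch of it, assuming a mongo shell session connected to the primary. The helper name stepDownPrimary and its default values are hypothetical; replSetStepDown, secondaryCatchUpPeriodSecs and force are the command's own fields. Forcing a step-down while no secondary has caught up can lead to unreplicated writes being rolled back, so the non-forced call is worth trying first.

```javascript
// Hypothetical helper: ask the primary to step down, optionally forcing it.
// `db` is the shell's database handle (anything exposing adminCommand works).
function stepDownPrimary(db, opts) {
  opts = opts || {};
  var cmd = {
    // seconds during which this node will not try to become primary again
    replSetStepDown: opts.stepDownSecs || 60,
    // seconds to wait for an electable secondary to catch up
    secondaryCatchUpPeriodSecs: opts.catchUpSecs || 10
  };
  if (opts.force) {
    // step down even if no electable secondary has caught up
    cmd.force = true;
  }
  return db.adminCommand(cmd);
}

// In the shell: stepDownPrimary(db)                  // normal attempt
//               stepDownPrimary(db, { force: true }) // what the errmsg suggests
```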
3. Log rotation

https://www.mongodb.com/docs/manual/reference/command/logRotate/

kill -SIGUSR1 2200

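After running log rotation in production, the log file turned out to have been rewritten and the old log content was lost. Comparing the offline configuration file with the production one showed that production had one extra setting:

logRotate: reopen

Testing confirmed that this setting was indeed the cause.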
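For context, the MongoDB documentation describes logRotate: reopen as intended for use with an external tool such as the Linux logrotate utility, together with logAppend: true. The server only closes and reopens the file, and something else is expected to move the old one aside. The default mode, rename, renames the old file with a timestamp instead. The sketch below is one way to avoid the surprise: it checks the mode the server was started with and only triggers the rotation when that mode is rename. The helper name rotateLogIfRename is hypothetical; getCmdLineOpts and logRotate are standard admin commands, and reading parsed.systemLog assumes the options came from a YAML configuration file.

```javascript
// Hypothetical helper: rotate the log from the shell, but only in "rename" mode.
function rotateLogIfRename(db) {
  var opts = db.adminCommand({ getCmdLineOpts: 1 });
  var sysLog = (opts.parsed && opts.parsed.systemLog) || {};
  var mode = sysLog.logRotate || "rename"; // rename is the server default
  if (mode !== "rename") {
    // In reopen mode an external tool is expected to move the old file first.
    return { rotated: false, mode: mode };
  }
  var res = db.adminCommand({ logRotate: 1 }); // same effect as kill -SIGUSR1
  return { rotated: res.ok === 1, mode: mode };
}
```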

4. View all MongoDB parameter settings

db.runCommand( { getParameter : '*' } )
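The full output is long, so filtering it by name is often more practical. A minimal sketch, assuming a mongo shell session with permission to run getParameter; the helper name findParameters is hypothetical. It uses adminCommand because getParameter has to run against the admin database.

```javascript
// Hypothetical helper: return only the server parameters whose name matches `pattern`.
function findParameters(db, pattern) {
  var all = db.adminCommand({ getParameter: "*" });
  var matched = {};
  Object.keys(all).forEach(function (name) {
    if (pattern.test(name)) {
      matched[name] = all[name];
    }
  });
  return matched;
}

// In the shell: findParameters(db, /flowControl/i)
```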

5. Privileges Zabbix needs to monitor mongo

use admin
db.createUser(
	     {
	       user:"zabbix",
	       pwd:"!!)a1104",
	       roles:[{role:"clusterAdmin",db:"admin"},
         { role: "readAnyDatabase", db: "admin" }]  
	     })

// grant syntax
db.grantRolesToUser(
    "zabbix",
    [
      { role: "readAnyDatabase", db: "admin" }
    ]
)

6. Privileges for the big-data platform to read the oplog

db.createUser(
	     {
	       user:"bigdataread",
	       pwd:"bigdataread",
	       roles:[{role:"read",db:"admin"},
	       {role:"readAnyDatabase",db:"admin"}]  
	     })
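One thing worth checking here: the oplog is stored in local.oplog.rs, and on MongoDB 3.4 and later readAnyDatabase does not cover the local database. If the user above cannot read the oplog, granting read on local is the usual remedy. A minimal sketch; the helper name grantOplogRead is hypothetical, and it assumes the user was created in the admin database.

```javascript
// Hypothetical helper: let an existing admin-database user read the oplog.
function grantOplogRead(db, userName) {
  // Switch to admin, where the user is assumed to be defined.
  return db.getSiblingDB("admin").grantRolesToUser(userName, [
    { role: "read", db: "local" } // local holds oplog.rs
  ]);
}

// In the shell: grantOplogRead(db, "bigdataread")
```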

7. View the size of each collection in mongo

Reference:

https://www.jianshu.com/p/955ea98667c6

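1. Command to get the size of a database in MongoDB

use databasename

db.stats()

The output looks like this:

> db.stats()
{ "collections" : 3, "objects" : 80614, "dataSize" : 21069700, "storageSize" : 39845376, "numExtents" : 9, "indexes" : 2, "indexSize" : 6012928, "ok" : 1 }
Here storageSize is the size of the database. The value is in bytes, so divide by 1024 to convert it to KB, or pass a scale factor, for example db.stats(1024).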

2. Get collection-related sizes in MongoDB

db.collection.dataSize() // size of the data in the collection

db.collection.storageSize() // space allocated to the collection, including unused space

db.collection.totalIndexSize() // size of the collection's indexes

db.collection.totalSize() // space taken by the collection's indexes + data

3. List collection names and sizes

List the names

db.getCollectionNames().forEach( function(u) { printjson(u); } )

List the sizes

// Command 1: total size of each collection in MB
db.getCollectionNames().forEach( function(u) { printjson(u + " " + db.getCollection(u).totalSize()/1024/1024) } )

// Command 2: storage size of each collection, scaled to GB
var collNames = db.getCollectionNames();
for (var i = 0; i < collNames.length; i++) {
  var coll = db.getCollection(collNames[i]);
  var stats = coll.stats(1024 * 1024 * 1024); // scale factor: bytes per GB; use 1024 * 1024 for MB
  print(stats.ns, stats.storageSize);
}
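When the goal is to find the largest collections, sorting the result helps. The following is a small sketch along the lines of command 1, assuming a mongo shell session on the target database; the helper name collectionSizesMB is hypothetical.

```javascript
// Hypothetical helper: collections of the current database, largest first.
function collectionSizesMB(db) {
  var rows = db.getCollectionNames().map(function (name) {
    var bytes = db.getCollection(name).totalSize(); // storage size + index size, in bytes
    return { name: name, sizeMB: Math.round((bytes / 1024 / 1024) * 100) / 100 };
  });
  rows.sort(function (a, b) { return b.sizeMB - a.sizeMB; });
  return rows;
}

// In the shell: collectionSizesMB(db).forEach(function (r) { print(r.name, r.sizeMB); })
```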


8. The mongostat command

mongostat --host=127.0.0.1 --port=27017 --username=${username} --password=${passwd} --authenticationDatabase=admin


9. Transactions with ignore_prepare=true cannot perform updates: Operation not supported

Relevant logs:

2023-02-16T14:24:38.939+0800 E  STORAGE  [conn12085] WiredTiger error (95) [1676528678:939933][52434:0x7fdb8d6ee700], file:_mdb_catalog.wt, WT_CURSOR.insert: __wt_txn_modify, 418: Transactions with ignore_prepare=true cannot perform updates: Operation not supported Raw: [1676528678:939933][52434:0x7fdb8d6ee700], file:_mdb_catalog.wt, WT_CURSOR.insert: __wt_txn_modify, 418: Transactions with ignore_prepare=true cannot perform updates: Operation not supported
2023-02-16T14:24:38.939+0800 E  -        [conn12085] Assertion: UnknownError: WiredTigerRecordStore::insertRecord 95: Operation not supported src/mongo/db/catalog/database_impl.cpp 662
2023-02-16T14:24:38.939+0800 W  -        [conn12085] Caught Assertion while trying to profile msg against test.student: UnknownError: WiredTigerRecordStore::insertRecord 95: Operation not supported
2023-02-16T14:24:38.940+0800 I  -        [conn12085] Creating profile collection: test.system.profile
2023-02-16T14:24:38.940+0800 I  STORAGE  [conn12085] createCollection: test.system.profile with generated UUID: 1bb172c7-9af4-45dd-8aa1-be22a8556217 and options: { capped: true, size: 1048576 }
2023-02-16T14:24:38.940+0800 E  STORAGE  [conn12085] WiredTiger error (95) [1676528678:940941][52434:0x7fdb8d6ee700], file:_mdb_catalog.wt, WT_CURSOR.insert: __wt_txn_modify, 418: Transactions with ignore_prepare=true cannot perform updates: Operation not supported Raw: [1676528678:940941][52434:0x7fdb8d6ee700], file:_mdb_catalog.wt, WT_CURSOR.insert: __wt_txn_modify, 418: Transactions with ignore_prepare=true cannot perform updates: Operation not supported
2023-02-16T14:24:38.940+0800 E  -        [conn12085] Assertion: UnknownError: WiredTigerRecordStore::insertRecord 95: Operation not supported src/mongo/db/catalog/database_impl.cpp 662
2023-02-16T14:24:38.940+0800 W  -        [conn12085] Caught Assertion while trying to profile msg against test.student: UnknownError: WiredTigerRecordStore::insertRecord 95: Operation not supported

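MongoDB version: 4.2.7
Almost every node in the MongoDB sharded cluster reported the error Transactions with ignore_prepare=true cannot perform updates: Operation not supported.
Investigation pointed to a bug in this version; after upgrading to 4.2.9 the error disappeared.

In addition, MongoDB 4.2.7 threw a connection-timeout exception roughly once every two weeks to a month while running. Since the upgrade this exception has not appeared again either (this has not been tested specifically).

Official reference: https://jira.mongodb.org/browse/SERVER-47714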

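10. The flowControl mechanism of mongo replica sets

Flow control in a MongoDB replica set is tied to majority replication (majority = floor(number of nodes / 2) + 1; with 3 nodes, majority = 2). In a P (primary) - S (secondary) - A (arbiter) deployment, if either P or S goes down, or the secondary falls far behind the primary because of performance problems, writes reach fewer than 2 data-bearing nodes. The current primary then triggers the flow control mechanism, throttling writes while it waits for the secondary to catch up. Sometimes, to let mongo regain its write speed first, you can adjust the following parameters.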

db.runCommand( { getParameter : '*' } )
db.adminCommand({setParameter:1, flowControlMinTicketsPerSecond:10000}) // raise the minimum number of flow control tickets issued per second
db.adminCommand({setParameter:1, enableFlowControl: false}) // if that is not enough, disable flowControl outright
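Before changing either parameter, it can help to confirm that flow control is actually the thing throttling writes. A minimal sketch, assuming MongoDB 4.2 or later, where serverStatus reports a flowControl section; the helper names majority and flowControlSummary are hypothetical, and only a few of the reported fields are read.

```javascript
// Majority of an n-member replica set, as used in the text: floor(n / 2) + 1.
function majority(n) {
  return Math.floor(n / 2) + 1;
}

// Hypothetical helper: condensed view of the flowControl section of serverStatus.
function flowControlSummary(db) {
  var fc = db.serverStatus().flowControl || {};
  return {
    enabled: fc.enabled === true,        // is flow control switched on
    isLagged: fc.isLagged === true,      // is it currently engaged because of lag
    targetRateLimit: fc.targetRateLimit  // tickets per second currently allowed
  };
}

// In the shell, on the primary: printjson(flowControlSummary(db))
```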

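About w: majority. If the server's default write concern were majority, clients would be expected to wait, so why is no client-side waiting triggered? The rs.conf() output below answers this: getLastErrorDefaults is { w: 1 }, so by default a write is acknowledged as soon as the primary has applied it (before MongoDB 5.0 the default write concern is w: 1). When the client explicitly sets w: majority and a majority cannot be reached, the client's requests hang completely.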

rs_t13:PRIMARY> rs.conf()
{
        "_id" : "rs_t13",
        "version" : 339326,
        "protocolVersion" : NumberLong(1),
        "writeConcernMajorityJournalDefault" : true,
        "members" : [
                {
                        "_id" : 22,
                        "host" : "192.168.1.11:27017",
                        "arbiterOnly" : false,
                        "buildIndexes" : true,
                        "hidden" : false,
                        "priority" : 4,
                        "tags" : {

                        },
                        "slaveDelay" : NumberLong(0),
                        "votes" : 1
                },
                {
                        "_id" : 23,
                        "host" : "192.168.1.12:27017",
                        "arbiterOnly" : false,
                        "buildIndexes" : true,
                        "hidden" : false,
                        "priority" : 3,
                        "tags" : {

                        },
                        "slaveDelay" : NumberLong(0),
                        "votes" : 1
                },
                {
                        "_id" : 24,
                        "host" : "192.168.1.13:27017",
                        "arbiterOnly" : false,
                        "buildIndexes" : true,
                        "hidden" : false,
                        "priority" : 5,
                        "tags" : {

                        },
                        "slaveDelay" : NumberLong(0),
                        "votes" : 1
                }
        ],
        "settings" : {
                "chainingAllowed" : true,
                "heartbeatIntervalMillis" : 2000,
                "heartbeatTimeoutSecs" : 10,
                "electionTimeoutMillis" : 10000,
                "catchUpTimeoutMillis" : 60000,
                "catchUpTakeoverDelayMillis" : 30000,
                "getLastErrorModes" : {

                },
                "getLastErrorDefaults" : {
                        "w" : 1,
                        "wtimeout" : 0
                },
                "replicaSetId" : ObjectId("5dd4bb76512ab94ff5ca94a2")
        }
}

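Notes

"writeConcernMajorityJournalDefault" : true means that if the client writes with majority but does not set the journal option (j), the write still waits by default until it has been written to the journal.
getLastErrorDefaults gives the default number of acknowledgments for a write (w) and the wait timeout (wtimeout).

{ w: <value>, j: <boolean>, wtimeout: <number> } // when the client specifies w: majority together with a timeout, a timeout error only means a majority was not reached in time; the data has still been written successfully and is not undone.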
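To keep a client from hanging when a majority is unreachable, the write concern can carry a wtimeout. A minimal sketch for the mongo shell; the helper name insertWithMajority and the 5000 ms value in the usage line are hypothetical, and `coll` is any collection handle such as db.student.

```javascript
// Hypothetical helper: insert with w: majority, but give up waiting after timeoutMs.
function insertWithMajority(coll, doc, timeoutMs) {
  var writeConcern = { w: "majority", j: true, wtimeout: timeoutMs };
  try {
    coll.insertOne(doc, { writeConcern: writeConcern });
    return { acknowledgedByMajority: true };
  } catch (e) {
    // A write concern timeout lands here, and in that case the document may
    // already be on the primary. Other failures (for example a duplicate key)
    // land here too, so inspect the error before assuming the write was stored.
    return { acknowledgedByMajority: false, error: String(e.message || e) };
  }
}

// In the shell: insertWithMajority(db.student, { name: "test" }, 5000)
```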
