1. mongos fails to start
May 23 15:01:50 im2-dem003 mongos[12648]: about to fork child process, waiting until server is ready for connections.
May 23 15:01:50 im2-dem003 mongos[12648]: forked process: 12650
May 23 15:01:50 im2-dem003 mongos[12648]: ERROR: child process failed, exited with error number 48
May 23 15:01:50 im2-dem003 mongos[12648]: To see additional information in this output, start without the "--fork" option.
May 23 15:01:50 im2-dem003 systemd[1]: mongos.service: control process exited, code=exited status=48
May 23 15:01:50 im2-dem003 systemd[1]: Failed to start MongoDB Database Server.
-- Subject: Unit mongos.service has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit mongos.service has failed.
--
-- The result is failed.
May 23 15:01:50 im2-dem003 systemd[1]: Unit mongos.service entered failed state.
May 23 15:01:50 im2-dem003 systemd[1]: mongos.service failed.
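Checking the configuration file showed that the IP address had been written incorrectly. Exit code 48 means the process could not start listening for incoming connections, which is consistent with a bad bind address.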
2. mongo stepDown fails
"errmsg" : "No electable secondaries caught up as of 2022-05-25T22:20:40.764+0800. Please use the replSetStepDown command with the argument {force: true} to force node to step down.",
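The error above means that no electable secondary had caught up within the catch-up window. As a rough sketch, the helper below builds the replSetStepDown command document so the two options are explicit. The name buildStepDownCommand is hypothetical and not part of MongoDB; only the command fields (replSetStepDown, secondaryCatchUpPeriodSecs, force) come from the server. Forcing a step down can cause writes that the secondaries have not yet replicated to be rolled back, so trying a longer catch-up period first is usually the safer option.

```javascript
// Hypothetical helper: builds a replSetStepDown command document.
// stepDownSecs: how long the node stays ineligible to become primary again.
// catchUpSecs:  how long to wait for an electable secondary to catch up.
// force:        step down even if no electable secondary has caught up.
function buildStepDownCommand(stepDownSecs, catchUpSecs, force) {
  if (catchUpSecs >= stepDownSecs) {
    throw new RangeError("stepDownSecs must be greater than secondaryCatchUpPeriodSecs");
  }
  return {
    replSetStepDown: stepDownSecs,
    secondaryCatchUpPeriodSecs: catchUpSecs,
    force: Boolean(force)
  };
}

// In the mongo shell, connected to the primary:
//   db.adminCommand(buildStepDownCommand(120, 60, false)) // give secondaries longer to catch up
//   db.adminCommand(buildStepDownCommand(60, 0, true))    // last resort: force the step down
```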
3. Log rotation
https://www.mongodb.com/docs/manual/reference/command/logRotate/
kill -SIGUSR1 2200
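After rotating the log in production, the log file turned out to have been overwritten and the old log data was lost. The offline (test) configuration:
The production configuration:
The production configuration had one extra setting:
logRotate: reopen
Testing confirmed that this setting was indeed the cause. With reopen, mongod only closes and reopens the log file and expects an external tool such as logrotate to have renamed the old file first; the documentation also requires logAppend: true alongside it.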
4. View all MongoDB parameter settings
db.adminCommand( { getParameter : '*' } ) // getParameter must be run against the admin database
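The getParameter output is long, so it often helps to narrow it down. A minimal sketch, assuming the result is the plain document returned by the command; filterParameters is a hypothetical helper name, not a MongoDB API.

```javascript
// Hypothetical helper: keeps only the parameters whose name matches `pattern`.
// `result` is the document returned by the getParameter command.
function filterParameters(result, pattern) {
  var matched = {};
  Object.keys(result).forEach(function (name) {
    // Skip the command status field, keep matching parameter names.
    if (name !== "ok" && pattern.test(name)) {
      matched[name] = result[name];
    }
  });
  return matched;
}

// In the mongo shell:
//   printjson(filterParameters(db.adminCommand({ getParameter: "*" }), /^flowControl/))
```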
5. Privileges Zabbix needs to monitor mongo
use admin
db.createUser(
{
user:"zabbix",
pwd:"!!)a1104",
roles:[{role:"clusterAdmin",db:"admin"},
{ role: "readAnyDatabase", db: "admin" }]
})
// grant syntax: add a role to an existing user
db.grantRolesToUser(
"zabbix",
[
{ role: "readAnyDatabase", db: "admin" }
]
)
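6. Privileges for the big data team to read the oplog
db.createUser(
{
user:"bigdataread",
pwd:"bigdataread",
// the oplog lives in local.oplog.rs, and readAnyDatabase does not cover the local database
roles:[{role:"read",db:"local"},
{role:"readAnyDatabase",db:"admin"}]
})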
7. View the size of each collection in mongo
Reference:
https://www.jianshu.com/p/955ea98667c6
1. Command to get the size of a database in MongoDB
use databasename
db.stats()
The output looks like this:
> db.stats()
{ "collections" : 3, "objects" : 80614, "dataSize" : 21069700, "storageSize" : 39845376, "numExtents" : 9, "indexes" : 2, "indexSize" : 6012928, "ok" : 1 }
Here storageSize is the storage space allocated to the database. The value is in bytes, so divide by 1024 to convert it to KB.
2. Get collection sizes in MongoDB
db.collection.dataSize() // size of the data in the collection
db.collection.storageSize() // space allocated to the collection, including unused space
db.collection.totalIndexSize() // size of the collection's indexes
db.collection.totalSize() // space taken by the collection's indexes + data
3. Query collection names and sizes
List the names
db.getCollectionNames().forEach( function(u) { printjson(u); } )
Commands to list the sizes
// Command 1: total size (data + indexes) of each collection, in MB
db.getCollectionNames().forEach( function(u) { printjson(u + " " + db.getCollection(u).totalSize()/1024/1024) } )
// Command 2: storage size of each collection, scaled to GB
var collNames = db.getCollectionNames();
for (var i = 0; i < collNames.length; i++) {
    var coll = db.getCollection(collNames[i]);
    var stats = coll.stats(1024 * 1024 * 1024); // scale factor: sizes are reported in GB
    print(stats.ns, stats.storageSize);
}
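Dividing by 1024 by hand gets tedious, so here is a small sketch that converts a byte count into a readable unit. formatBytes is a hypothetical helper name, and it assumes binary units (1 KB = 1024 bytes) as in the notes above.

```javascript
// Hypothetical helper: converts a byte count to a human-readable string
// using binary units (1 KB = 1024 bytes), rounded to two decimals.
function formatBytes(bytes) {
  var units = ["B", "KB", "MB", "GB", "TB"];
  var value = bytes;
  var i = 0;
  // Keep dividing until the value fits the current unit or units run out.
  while (value >= 1024 && i < units.length - 1) {
    value /= 1024;
    i++;
  }
  return value.toFixed(2) + " " + units[i];
}

// In the mongo shell, largest collections first:
//   db.getCollectionNames()
//     .map(function (n) { return { name: n, size: db.getCollection(n).totalSize() }; })
//     .sort(function (a, b) { return b.size - a.size; })
//     .forEach(function (c) { print(c.name, formatBytes(c.size)); });
```

For the storageSize in the db.stats() output above, formatBytes(39845376) gives "38.00 MB".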
8. The mongostat command
mongostat --host=127.0.0.1 --port=27017 --username=${username} --password=${passwd} --authenticationDatabase=admin
9. Transactions with ignore_prepare=true cannot perform updates: Operation not supported
Relevant logs:
2023-02-16T14:24:38.939+0800 E STORAGE [conn12085] WiredTiger error (95) [1676528678:939933][52434:0x7fdb8d6ee700], file:_mdb_catalog.wt, WT_CURSOR.insert: __wt_txn_modify, 418: Transactions with ignore_prepare=true cannot perform updates: Operation not supported Raw: [1676528678:939933][52434:0x7fdb8d6ee700], file:_mdb_catalog.wt, WT_CURSOR.insert: __wt_txn_modify, 418: Transactions with ignore_prepare=true cannot perform updates: Operation not supported
2023-02-16T14:24:38.939+0800 E - [conn12085] Assertion: UnknownError: WiredTigerRecordStore::insertRecord 95: Operation not supported src/mongo/db/catalog/database_impl.cpp 662
2023-02-16T14:24:38.939+0800 W - [conn12085] Caught Assertion while trying to profile msg against test.student: UnknownError: WiredTigerRecordStore::insertRecord 95: Operation not supported
2023-02-16T14:24:38.940+0800 I - [conn12085] Creating profile collection: test.system.profile
2023-02-16T14:24:38.940+0800 I STORAGE [conn12085] createCollection: test.system.profile with generated UUID: 1bb172c7-9af4-45dd-8aa1-be22a8556217 and options: { capped: true, size: 1048576 }
2023-02-16T14:24:38.940+0800 E STORAGE [conn12085] WiredTiger error (95) [1676528678:940941][52434:0x7fdb8d6ee700], file:_mdb_catalog.wt, WT_CURSOR.insert: __wt_txn_modify, 418: Transactions with ignore_prepare=true cannot perform updates: Operation not supported Raw: [1676528678:940941][52434:0x7fdb8d6ee700], file:_mdb_catalog.wt, WT_CURSOR.insert: __wt_txn_modify, 418: Transactions with ignore_prepare=true cannot perform updates: Operation not supported
2023-02-16T14:24:38.940+0800 E - [conn12085] Assertion: UnknownError: WiredTigerRecordStore::insertRecord 95: Operation not supported src/mongo/db/catalog/database_impl.cpp 662
2023-02-16T14:24:38.940+0800 W - [conn12085] Caught Assertion while trying to profile msg against test.student: UnknownError: WiredTigerRecordStore::insertRecord 95: Operation not supported
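MongoDB version: 4.2.7
Nearly every node in the MongoDB sharded cluster reported the error Transactions with ignore_prepare=true cannot perform updates: Operation not supported
Research showed that a bug in this version causes it, so the cluster was upgraded to 4.2.9 and the error disappeared.
In addition, MongoDB 4.2.7 raised a connection timeout exception roughly once every two weeks to a month. It has not reappeared since the upgrade (whether the upgrade actually fixed it has not been tested).
Official reference: https://jira.mongodb.org/browse/SERVER-47714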
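To check which nodes are still below the fixed version, the version string from db.version() can be compared against 4.2.9. A minimal sketch, assuming plain numeric versions such as "4.2.7" (no suffixes like "-rc0"); compareVersions is a hypothetical helper name.

```javascript
// Hypothetical helper: compares two dotted version strings such as "4.2.7".
// Returns a negative number if a < b, 0 if they are equal, a positive number if a > b.
function compareVersions(a, b) {
  var pa = a.split(".").map(Number);
  var pb = b.split(".").map(Number);
  var len = Math.max(pa.length, pb.length);
  for (var i = 0; i < len; i++) {
    // Missing components count as 0, so "4.2" equals "4.2.0".
    var diff = (pa[i] || 0) - (pb[i] || 0);
    if (diff !== 0) return diff;
  }
  return 0;
}

// In the mongo shell, on each node:
//   if (compareVersions(db.version(), "4.2.9") < 0) print("still below 4.2.9: " + db.version());
```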
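10. The flowControl mechanism in mongo replica sets
Flow control in a MongoDB replica set is tied to the majority commit point: a write is majority-committed once it has reached a majority of nodes (1 + half the number of nodes, rounded down; with 3 nodes, majority = 2). In a P (primary) - S (secondary) - A (arbiter) set, if P or S goes down, or the secondary falls far behind the primary because of performance problems, writes reach fewer than 2 nodes. The new primary then triggers the flow control mechanism, which throttles writes while waiting for the secondary to catch up. Sometimes, to let mongo recover its write speed first, the parameters can be adjusted.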
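db.adminCommand( { getParameter : '*' } ) // getParameter must be run against the admin database
db.adminCommand({setParameter:1, flowControlMinTicketsPerSecond:10000}) // raise the minimum number of tickets per second that flow control hands out
db.adminCommand({setParameter:1, enableFlowControl: false}) // if that is not enough, turn flow control off entirely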
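The majority arithmetic can be sketched as follows. The function names are hypothetical. The second function follows the documented rule for write concern "majority": the smaller of the majority of all voting members and the number of data-bearing voting members.

```javascript
// Majority of n voting members: floor(n / 2) + 1.
function majority(votingMembers) {
  return Math.floor(votingMembers / 2) + 1;
}

// Hypothetical helper: number of data-bearing members that must acknowledge
// a w: "majority" write. Arbiters vote but hold no data.
function writeMajorityCount(votingMembers, arbiters) {
  var dataBearingVoting = votingMembers - arbiters;
  return Math.min(majority(votingMembers), dataBearingVoting);
}

// P-S-A: 3 voting members, 1 arbiter. Both P and S must acknowledge,
// so losing S means w: "majority" can no longer be satisfied.
//   writeMajorityCount(3, 1)
```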
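About w:majority. In theory the server's default write concern should be majority (check with rs.conf()), so why did clients not end up waiting? In the rs.conf() output of this replica set, getLastErrorDefaults is w: 1, so writes that do not specify a write concern are acknowledged by the primary alone. When the client explicitly sets w:majority, its requests hang completely whenever w:majority cannot be reached.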
rs_t13:PRIMARY> rs.conf()
{
    "_id" : "rs_t13",
    "version" : 339326,
    "protocolVersion" : NumberLong(1),
    "writeConcernMajorityJournalDefault" : true,
    "members" : [
        {
            "_id" : 22,
            "host" : "192.168.1.11:27017",
            "arbiterOnly" : false,
            "buildIndexes" : true,
            "hidden" : false,
            "priority" : 4,
            "tags" : {
            },
            "slaveDelay" : NumberLong(0),
            "votes" : 1
        },
        {
            "_id" : 23,
            "host" : "192.168.1.12:27017",
            "arbiterOnly" : false,
            "buildIndexes" : true,
            "hidden" : false,
            "priority" : 3,
            "tags" : {
            },
            "slaveDelay" : NumberLong(0),
            "votes" : 1
        },
        {
            "_id" : 24,
            "host" : "192.168.1.13:27017",
            "arbiterOnly" : false,
            "buildIndexes" : true,
            "hidden" : false,
            "priority" : 5,
            "tags" : {
            },
            "slaveDelay" : NumberLong(0),
            "votes" : 1
        }
    ],
    "settings" : {
        "chainingAllowed" : true,
        "heartbeatIntervalMillis" : 2000,
        "heartbeatTimeoutSecs" : 10,
        "electionTimeoutMillis" : 10000,
        "catchUpTimeoutMillis" : 60000,
        "catchUpTakeoverDelayMillis" : 30000,
        "getLastErrorModes" : {
        },
        "getLastErrorDefaults" : {
            "w" : 1,
            "wtimeout" : 0
        },
        "replicaSetId" : ObjectId("5dd4bb76512ab94ff5ca94a2")
    }
}
Explanation
"writeConcernMajorityJournalDefault" : true
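This field means that when a client writes with w: majority but does not set the j (journal) option, the write still waits by default until it has been written to the journal.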
getLastErrorDefaults
The default write concern: the number of acknowledgements to wait for and how long to wait.
{ w: <value>, j: <boolean>, wtimeout: <number> }
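// When the client specifies w: majority together with a timeout, a timeout error only means the majority was not reached in time. The data that was already written is not rolled back, so the write still succeeded on the nodes it reached.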
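A minimal sketch of setting these three fields from a client. buildWriteConcern is a hypothetical helper name; the field names w, j and wtimeout are the ones shown above. Leaving wtimeout out, or setting it to 0, means the write waits indefinitely when the requested write concern cannot be reached, which matches the hang described earlier.

```javascript
// Hypothetical helper: builds a write concern document { w, j, wtimeout }.
// w:          number of acknowledgements, or "majority".
// journal:    optional, wait for the on-disk journal when true.
// wtimeoutMs: optional, give up waiting after this many milliseconds.
function buildWriteConcern(w, journal, wtimeoutMs) {
  var wc = { w: w };
  if (journal !== undefined) {
    wc.j = Boolean(journal);
  }
  if (wtimeoutMs !== undefined) {
    if (wtimeoutMs < 0) {
      throw new RangeError("wtimeout must not be negative");
    }
    wc.wtimeout = wtimeoutMs;
  }
  return wc;
}

// In the mongo shell:
//   db.student.insertOne({ name: "a" }, { writeConcern: buildWriteConcern("majority", true, 5000) })
```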