Redis Clustering

2020-09-12


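Clustering

Recap: a standalone Redis is a single process on a single machine. It is used as a cache or as a database, with persistence through RDB or AOF.

A single instance has three problems: it is a single point of failure, it is limited by the memory of one machine, and it can serve only a limited number of connections.

The AKF scaling model addresses all three.

AKF

Approach

Along the X axis, Redis is replicated: one master plus several standby nodes, each holding a full mirror of the data.

With pure standbys, if the master never fails,

the standbys are wasted capacity.

So reads and writes can be split between master and replicas, which spreads the load and relieves the pressure of too many connections on one node.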


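Along the Y axis, Redis is split by business function

into modules.

Different Redis instances store the data of different business areas, which addresses the limited memory of a single instance.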


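Along the Z axis, the data of one business module is split further into shards. If that module receives too many requests, the client can decide which requests go to which shard, which relieves both memory pressure and request pressure. In short, the Z axis subdivides a module.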


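The AKF principle:

Three axes

X axis: full mirrors of the data.

Y axis: different data stored per business function.

Z axis: a further split by priority or by some logical rule.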
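
The Z-axis split leaves the routing decision to the client. A minimal sketch of such routing, assuming a fixed list of shard addresses and CRC32 as the hash; both are illustrative choices, not something the text prescribes:

```python
import zlib

# Hypothetical shard addresses; any fixed, ordered list works.
SHARDS = ["127.0.0.1:6379", "127.0.0.1:6380", "127.0.0.1:6381"]


def shard_for(key: str, shards=SHARDS) -> str:
    """Map a key to one shard with a stable hash (CRC32 modulo shard count)."""
    index = zlib.crc32(key.encode("utf-8")) % len(shards)
    return shards[index]


if __name__ == "__main__":
    for key in ("user:1", "user:2", "order:42"):
        print(key, "->", shard_for(key))
```

Modulo hashing is the simplest option. Its drawback is that most keys move to a different shard when the number of shards changes.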

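Design considerations:

Scaling along the X axis:

Going from one node to many (master plus standbys) raises the question of data consistency.

1. Synchronous and blocking: after the master applies a write it forwards it to the standbys, and it replies to the client only once every standby has applied it. This gives strong consistency, but strong consistency inevitably sacrifices availability.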


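2. Asynchronous and non-blocking: after the master applies a write it forwards it to the standbys without waiting for their result, and immediately returns its own success to the client. A standby may therefore miss data. This keeps the system available and partition tolerant at the cost of consistency, so the business must be able to tolerate losing some data.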


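3. Synchronous and blocking again, but this time with a reliable cluster placed between master and standby to provide high availability and eventual consistency.

A reliable cluster here means one that is highly reliable, persists data, and responds quickly.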


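This is still only eventual consistency: if the master goes down and a standby takes over before it has caught up, clients may read inconsistent data. If consistency matters, clients must wait until the standby has finished catching up before they send requests.

Redis supports both master/standby and master/replica setups, but companies generally prefer master/replica.

In either setup the master is always a single point, so the master must be made highly available (HA). Whether a standby or a replica takes over, whatever takes over becomes the master. A failed standby or replica does not need to be repaired in real time.

A person could promote a standby or replica by hand, but people are unreliable, so companies want automation: a program that does what the person would do and promotes a standby or replica to master. A single such program is itself a single point of failure, so the monitoring has to be implemented as a reliable cluster that watches the master (which may be Redis or something else).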


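This raises a question:

Should the master be declared down by one monitor, by several, or by all of them?

Requiring every monitor to agree that the master is down amounts to strong consistency.

The client has to wait for a reply from every monitor, which inevitably hurts availability.

At the other extreme:

If a single monitor decides that the master is down,

split-brain is inevitable.

Under a network partition, one client follows the verdict of one monitor and another client follows a different monitor, so different monitors steer different clients to different servers.

If split-brain is acceptable:

This is a question of partition tolerance:

can the service tolerate split-brain?

If the servers are mirrors of each other, a request can be sent to any of them,

and a partition-tolerant design is fine.

For example, take ER, a registry offering fast service registration and discovery. Of 50 mirrored Tomcat servers, some register with ER1, some with ER2, and the rest with ER3. It does not matter which registry a client asks: the registry hands back the Tomcat servers connected to it, and any of those servers can do what the client needs.

That is partition tolerance.

If split-brain is not acceptable:

The verdict of a single monitor does not count,

because it does not carry enough weight.

Instead, more than half of the monitors must agree on the state of the master, that is, on whether it is down.

This yields eventual consistency.

Once more than half agree, the minority's opinion no longer counts. The sentinels gather votes, and as soon as one side holds more than half it stops gathering and decides the master's fate.

Why more than half?

2 votes prevent split-brain among 3 monitors

3 votes prevent split-brain among 4 monitors

3 votes prevent split-brain among 5 monitors

4 votes prevent split-brain among 6 monitors

N/2+1 votes (integer division) prevent split-brain among N monitors, and N/2+1 simply means "more than half".

The number of monitors N is usually odd.

Why odd? Compare 5 and 6 monitors.

With 5, at least 3 must agree and fewer than 3 votes are invalid, so 2 monitors may fail.

With 6, at least 4 must agree and fewer than 4 votes are invalid, so again only 2 monitors may fail.

Six monitors cost more than five yet tolerate the same number of failures. Six can also split evenly, half saying yes and half saying no, in which case neither side can be followed. With five, whatever three monitors agree on is the answer, and an even split is impossible.

With 6 monitors, 4 of 6 must be healthy; with 5, only 3 of 5. Since every machine is equally likely to fail, keeping 4 of 6 healthy is somewhat less likely than keeping 3 of 5 healthy.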
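
The majority arithmetic above can be checked with a few lines of Python. This is a sketch under one simplifying assumption that is mine, not the text's: monitors fail independently, each being up with the same probability `p_up`.

```python
from math import comb


def majority(n: int) -> int:
    """Smallest number of monitors that is more than half of n."""
    return n // 2 + 1


def tolerated_failures(n: int) -> int:
    """How many monitors may fail while a majority can still be formed."""
    return n - majority(n)


def p_majority_alive(n: int, p_up: float) -> float:
    """Probability that at least a majority of n independent monitors is up."""
    need = majority(n)
    return sum(
        comb(n, k) * p_up**k * (1 - p_up) ** (n - k) for k in range(need, n + 1)
    )


if __name__ == "__main__":
    for n in (3, 4, 5, 6):
        print(n, majority(n), tolerated_failures(n), round(p_majority_alive(n, 0.9), 5))
```

For any per-monitor availability strictly between 0 and 1, the 6-monitor group is slightly less likely to hold a majority than the 5-monitor group, which matches the argument for an odd N.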

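The CAP principle

Also known as the CAP theorem. It concerns three properties of a distributed system: consistency, availability, and partition tolerance. The principle states that a system can provide at most two of the three at the same time, never all three.

Consistency (C): whether all copies of the data in the distributed system hold the same value at the same moment (equivalently, every node reads the same, latest copy of the data).

Availability (A): whether the cluster as a whole can still serve client reads and writes after some of its nodes have failed (that is, updates remain highly available).

Partition tolerance (P): in practical terms, a partition is a bound on communication time. If the system cannot reach a consistent state within that bound, a partition has occurred, and the current operation must choose between C and A.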

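Replication mode:

By default Redis uses asynchronous replication, which gives low latency and high performance. Replicas do, however, periodically and asynchronously acknowledge to the master how much data they have received.

So by default Redis offers weak consistency, and data can be lost.

client --------> Redis master ---- replies immediately ----> client

Redis master ---- forwards the client's command to the standby without blocking ----> Redis standby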
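
The window in which an acknowledged write can be lost is easier to see in a toy model. The classes below are hypothetical and only imitate the flow in the diagram; they are not how Redis is implemented.

```python
from collections import deque


class Replica:
    def __init__(self):
        self.data = {}


class Master:
    """Toy master: replies to the client before the replica has applied the write."""

    def __init__(self, replica):
        self.data = {}
        self.replica = replica
        self.pending = deque()  # commands not yet shipped to the replica

    def set(self, key, value):
        self.data[key] = value
        self.pending.append((key, value))
        return "OK"  # the client sees success immediately

    def flush_to_replica(self):
        """Ship queued commands; a real server does this in the background."""
        while self.pending:
            key, value = self.pending.popleft()
            self.replica.data[key] = value


if __name__ == "__main__":
    replica = Replica()
    master = Master(replica)
    master.set("k1", "aaa")
    master.flush_to_replica()
    master.set("k2", "bbb")  # acknowledged, but the master "crashes" before flushing
    print("replica has:", replica.data)
```

If the master is lost after acknowledging `k2` but before shipping it, a promoted replica never sees that key. That is the data loss the weak-consistency model accepts.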

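Hands-on

Redis' default master/replica replication does not rely on any intermediate technology to achieve eventual consistency.

It is weakly consistent: if the link to a replica fails, data can be lost.

Nodes replicate to each other with an asynchronous, non-blocking model.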

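cd /root/soft/redis-5.0.5
cd utils
./install_server.sh
# Run the installer several times to set up several Redis instances. Each instance needs its own port, and the port must not be used by any other process.
# Then stop each installed service:
service redis_<port> stop

cp /etc/redis/*  /root/redisConf/


In each of the config files 6379-6381:
# run in the foreground
daemonize no
# logfile is commented out so that the log is printed to the screen instead of a file
#logfile /var/log/redis_6379.log
# leave AOF off for now
appendonly no

# clear old data
rm -rf /var/lib/redis/*
# create one data directory per instance (run from the data directory, /var/lib/redis in this setup)
mkdir 6379
mkdir 6380
mkdir 6381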



redis-server /root/redisConf/6379.conf
redis-server /root/redisConf/6380.conf
redis-server /root/redisConf/6381.conf



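Decide by hand, with a command, which node is the master and which is the standby.
In practice the command makes the standby follow the master.

Before 5.0:
SLAVEOF host port      (become a slave of host:port)

From 5.0 on:
REPLICAOF host port    (become a replica of host:port)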

REPLICAOF 127.0.0.1 6379 
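---- 6379 server log ---->
 Replica 127.0.0.1:6380 asks for synchronization
(the standby 127.0.0.1:6380 requests synchronization)
Partial resynchronization not accepted: Replication ID mismatch 
(the partial resync is refused because the replication IDs do not match)
(Replica asked for '0a49d7eee76739137fc41f638a9a5fa0dc0e5906', my replication IDs are 'cde393fbff4cdcce9bb8432a64b913b8fadb8e20' and '0000000000000000000000000000000000000000'
The master compares the replication ID sent by the standby with its own replication IDs. A match means the two have synchronized before, and a partial resync is enough.
In other words, a partial resync only happens when master and standby have been connected before.
When they connect for the first time, a full resync is required.
)
Starting BGSAVE for SYNC with target: disk
(a background snapshot is written to disk for the synchronization)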
DB saved on disk
RDB: 4 MB of memory used by copy-on-write
Background saving started by pid 1352
Background saving terminated with success
Synchronization with replica 127.0.0.1:6380 succeeded


------- 6380 log ----->
Every Redis node is an independent master until it becomes a standby.
Before turning into a replica, using my master parameters to synthesize a cached master: I may be able to synchronize with the new master with just a partial transfer.
Before becoming a standby, the node builds a "cached master" record from its own replication ID and offset, so that it can ask the new master for a partial resync. After a restart without AOF, this information comes from the RDB file, which stores the replication ID and the replication offset; given that offset, the master only has to send what follows it.
If the node has never followed that master, the information does not match and the partial resync is refused.

REPLICAOF 127.0.0.1:6379 enabled (user request from 'id=3 addr=127.0.0.1:50013 fd=7 name= age=559 idle=0 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=44 qbuf-free=32724 obl=0 oll=0 omem=0 events=r cmd=replicaof')
(replication from 127.0.0.1:6379 is enabled)
Connecting to MASTER 127.0.0.1:6379
MASTER <-> REPLICA sync started
Non blocking connect for SYNC fired the event.
(the non-blocking connection for the sync is established)
Master replied to PING, replication can continue...
(the master answered the PING, so replication can proceed)
Trying a partial resynchronization (request 0a49d7eee76739137fc41f638a9a5fa0dc0e5906:1).
(on the first connection the partial resync request is refused and a full resync is needed)
Full resync from master: 16d54af6e8d86df9d12ef4fb02220d508433c730:0
Discarding previously cached master state.
(the cached master record is dropped)
MASTER <-> REPLICA sync: receiving 175 bytes from master
MASTER <-> REPLICA sync: Flushing old data
(the old data set is deleted)
MASTER <-> REPLICA sync: Loading DB in memory
(the snapshot received from the master is loaded into memory)
MASTER <-> REPLICA sync: Finished with success


---- 6379 cli ---->
set k1 aaa
---- 6379 server log ---->
1 changes in 900 seconds. Saving...
Background saving started by pid 1355
(a background child process saves a snapshot to disk)
DB saved on disk
RDB: 4 MB of memory used by copy-on-write
(the child shares memory with the parent through copy-on-write; 4 MB had to be copied while the snapshot was written)
Background saving terminated with success
---- 6380 server log ---->
0:S 07 Oct 2019 19:56:40.649 * 1 changes in 900 seconds. Saving...
1356:C 07 Oct 2019 19:56:41.188 * DB saved on disk
1356:C 07 Oct 2019 19:56:41.189 * RDB: 2 MB of memory used by copy-on-write
1280:S 07 Oct 2019 19:56:41.228 * Background saving started by pid 1356
1280:S 07 Oct 2019 19:56:41.329 * Background saving terminated with success
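But on the 6380 cli:
127.0.0.1:6380> set k2 aaa
(error) READONLY You can't write against a read only replica.
Writes are refused, because 6380 is not the master and is read-only.
127.0.0.1:6380> get k1
"aaa"
To make the standby writable as well, change its config file (replica-read-only).

Now test 6381,
which does not follow a master yet:
127.0.0.1:6381> set k2 bbb
OK
127.0.0.1:6381> get k2
"bbb"
---- make it follow the master 6379 ---->
127.0.0.1:6381> REPLICAOF 127.0.0.1 6379
OK
127.0.0.1:6381> keys *
1) "k1"
(k2 is gone: when a node starts following a master, it flushes its own data set)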

---- suppose the standby 6380 goes down ---->
ctrl+c
1280:signal-handler (1570450204) Received SIGINT scheduling shutdown...
User requested shutdown...
Saving the final RDB snapshot before exiting.
(an RDB snapshot is saved before shutting down)
DB saved on disk
Removing the pid file.
Redis is now ready to exit, bye bye...
[root@datanode15 ~]# ^C
---- 6379 server log ---->
# Connection with replica 127.0.0.1:6380 lost.
(the master notices that the standby is gone)

---- 6379 cli ---->
127.0.0.1:6379> set k2 bbb
OK
127.0.0.1:6379> keys *
1) "k1"
2) "k2"
---- 6381 cli ---->
127.0.0.1:6381> keys *
1) "k1"
2) "k2"

---- restart 6380 ---->
redis-server /root/redisConf/6380.conf  --replicaof 127.0.0.1 6379
---- 6379 server log ---->
 Replica 127.0.0.1:6380 asks for synchronization
Partial resynchronization request from 127.0.0.1:6380 accepted. Sending 56 bytes of backlog starting from offset 5007.
The partial resync request succeeded, which shows that the two had been connected before.
The standby sent the replication ID and the offset it had reached when it lost the connection
(both are stored in its RDB file), so the master only has to send the backlog data after that offset.

---- 6380 server log ---->
 Successful partial resynchronization with master.
 MASTER <-> REPLICA sync: Master accepted a Partial Resynchronization.
(the master accepted the partial resync request and sends only the missing data)



---- 6380 server ---->
ctrl+c (stop it)
redis-server /root/redisConf/6380.conf  --replicaof 127.0.0.1 6379  --appendonly yes
(restart it with AOF enabled)
---- 6379 server log ---->
Replica 127.0.0.1:6380 asks for synchronization
Full resync requested by replica 127.0.0.1:6380
Starting BGSAVE for SYNC with target: disk
Background saving started by pid 1387
DB saved on disk
RDB: 6 MB of memory used by copy-on-write
Background saving terminated with success
Synchronization with replica 127.0.0.1:6380 succeeded
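With AOF enabled on the standby, the master produces a fresh snapshot for it: a standby restarted with AOF has no cached master information, so it cannot ask for a partial resync and requests a full resync instead.

---- 6380 server log ---->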

Connecting to MASTER 127.0.0.1:6379
MASTER <-> REPLICA sync started
Non blocking connect for SYNC fired the event.
Master replied to PING, replication can continue...
Partial resynchronization not possible (no cached master)
(with AOF enabled, no partial resync request is sent)
Full resync from master: 16d54af6e8d86df9d12ef4fb02220d508433c730:6560
(the master produces a snapshot and the standby loads it in full)
MASTER <-> REPLICA sync: receiving 197 bytes from master
MASTER <-> REPLICA sync: Flushing old data
MASTER <-> REPLICA sync: Loading DB in memory
MASTER <-> REPLICA sync: Finished with success
Background append only file rewriting started by pid 1388
AOF rewrite child asks to stop sending diffs.
 Parent agreed to stop sending diffs. Finalizing AOF...
 Concatenating 0.00 MB of AOF diff received from parent.
SYNC append only file rewrite performed
 AOF rewrite: 4 MB of memory used by copy-on-write
Background AOF rewrite terminated with success
Residual parent diff successfully flushed to the rewritten AOF (0.00 MB)
 Background AOF rewrite finished successfully


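-----> That covers a standby going down.


---- When the master goes down and no standby is switched over by hand, ---> the standby stays a standby.
It can still serve reads but not writes,
although it does know that the master is unreachable.
At that point, on 6380:
replicaof no one
(this command makes the node its own master)
On 6381:
replicaof 127.0.0.1 6380
(6381 switches to the new master)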



vi dump.rdb

REDIS0009ú      redis-ver^E5.0.4ú
redis-bitsÀ@ú^EctimeÂ=1<9b>]ú^Hused-memÂ^H«^]^@ú^Nrepl-stream-dbÀ^@ú^G【repl-id】(16d54af6e8d86df9d12ef4fb02220d508433c730ú^K【repl-offset】 
Á ^Yú^Laof-preambleÀ^@þ^@û^B^@^@^Bk1^Caaa^@^Bk2^Cbbbÿ_Zt<9a>M^@@<99>


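Summary of the above:

Without AOF on the standby, only the first connection needs a full resync; every later reconnection can use a partial resync.
This relies on the backlog kept by the master. The backlog is a fixed-size buffer: if the master writes more than it can hold while the standby is away, the data the standby needs is overwritten. In addition, once the master has had no connected standby for the configured time (3600 seconds by default), it frees the backlog.
In either case, even if the standby still has the old replication ID and offset, the master can no longer serve them from the backlog and has to produce a new snapshot,
and the standby has to do a full resync.

The standby keeps a backlog as well. It is never freed on timeout, because the standby may be promoted to master if the master fails; if another standby then loses its connection briefly, it can catch up from this backlog with a partial resync.
Like the master's, it is a fixed-size circular buffer, so once it is full the oldest entries are overwritten.


In version 5.0:
With AOF enabled on the standby, a restart loads the data from the AOF file, which carries neither the replication ID of the master that was followed nor the replication offset, so a full resync is required.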
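
The decision between a partial and a full resync can be written down as a small function. This is a simplified model under my own assumptions: `MasterState` and its field names are hypothetical, the backlog is described only by how many bytes of history it still holds, and the finer rules Redis applies to the previous replication ID are ignored.

```python
from dataclasses import dataclass


@dataclass
class MasterState:
    replid: str       # current replication ID
    replid2: str      # previous replication ID (all zeros if none)
    offset: int       # replication offset of the newest byte written
    backlog_len: int  # bytes of history still held in the backlog (0 if freed)


def sync_mode(master: MasterState, asked_replid: str, asked_offset: int) -> str:
    """Return 'partial' if the backlog can serve the request, else 'full'.

    asked_offset is the first byte the standby still needs.
    """
    if asked_replid not in (master.replid, master.replid2):
        return "full"  # replication ID mismatch, as in the first log above
    if master.backlog_len == 0:
        return "full"  # the backlog was never allocated or has been freed
    oldest = master.offset - master.backlog_len + 1
    if not (oldest <= asked_offset <= master.offset + 1):
        return "full"  # the requested data has been overwritten
    return "partial"


if __name__ == "__main__":
    m = MasterState("cde393", "000000", offset=5062, backlog_len=5062)
    print(sync_mode(m, "0a49d7", 1))     # unknown ID
    print(sync_mode(m, "cde393", 5007))  # known ID, offset still in the backlog
```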

Configuring replication in the config file


 vi redisConf/6379.conf 

# Master-Replica replication. Use replicaof to make a Redis instance a copy of
# another Redis server. A few things to understand ASAP about Redis replication.
#
#   +------------------+      +---------------+
#   |      Master      | ---> |    Replica    |
#   | (receive writes) |      |  (exact copy) |
#   +------------------+      +---------------+
#
# 1) Redis replication is asynchronous, but you can configure a master to stop accepting writes if it appears to be not connected with at least a given number of replicas.
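# Note: replication is asynchronous, but the master can be configured to stop accepting writes when fewer than a given number of standbys are connected.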
# 2) Redis replicas are able to perform a partial resynchronization with the master if the replication link is lost for a relatively small amount of time. You may want to configure the replication backlog size (see the next sections of this file) with a sensible value depending on your needs.
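# Note: if the link is lost only briefly (bounded by the backlog size and its TTL), the standby can do a partial resync with the master. Set the replication backlog size to a value that fits your needs.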

# 3) Replication is automatic and does not need user intervention. After a network partition replicas automatically try to reconnect to masters and resynchronize with them.
# Note: standbys reconnect to the master and resynchronize automatically; no user action is needed.
# Set the master to follow with:
# replicaof <masterip> <masterport>

# If the master is password protected (using the "requirepass" configuration
# directive below) it is possible to tell the replica to authenticate before
# starting the replication synchronization process, otherwise the master will
# refuse the replica request.
# Note: if the master has a password, set it here.
# masterauth <master-password>


# When a replica loses its connection with the master, or when the replication
# is still in progress, the replica can act in two different ways:
#
# 1) if replica-serve-stale-data is set to 'yes' (the default) the replica will
#    still reply to client requests, possibly with out of date data, or the
#    data set may just be empty if this is the first synchronization.
#
# 2) if replica-serve-stale-data is set to 'no' the replica will reply with
#    an error "SYNC with master in progress" to all the kind of commands
#    but to INFO, replicaOF, AUTH, PING, SHUTDOWN, REPLCONF, ROLE, CONFIG,
#    SUBSCRIBE, UNSUBSCRIBE, PSUBSCRIBE, PUNSUBSCRIBE, PUBLISH, PUBSUB,
#    COMMAND, POST, HOST: and LATENCY.
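# Note: this decides whether a standby that has lost its master, or is still synchronizing, answers queries. With yes, clients can still read the old (possibly stale) data until it is flushed; with no, they get an error.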
replica-serve-stale-data yes

# You can configure a replica instance to accept writes or not. Writing against
# a replica instance may be useful to store some ephemeral data (because data
# written on a replica will be easily deleted after resync with the master) but
# may also cause problems if clients are writing to it because of a
# misconfiguration.
#
# Since Redis 2.6 by default replicas are read-only.
#
# Note: read only replicas are not designed to be exposed to untrusted clients
# on the internet. It's just a protection layer against misuse of the instance.
# Still a read only replica exports by default all the administrative commands
# such as CONFIG, DEBUG, and so forth. To a limited extent you can improve
# security of read only replicas using 'rename-command' to shadow all the
# administrative / dangerous commands.
# yes: the standby is read-only; no: it also accepts writes
replica-read-only yes

# Replication SYNC strategy: disk or socket.
#
# -------------------------------------------------------
# WARNING: DISKLESS REPLICATION IS EXPERIMENTAL CURRENTLY
# -------------------------------------------------------
#
# New replicas and reconnecting replicas that are not able to continue the replication
# process just receiving differences, need to do what is called a "full
# synchronization". An RDB file is transmitted from the master to the replicas.
# The transmission can happen in two different ways:
#
# 1) Disk-backed: The Redis master creates a new process that writes the RDB
#                 file on disk. Later the file is transferred by the parent
#                 process to the replicas incrementally.
# Note: disk-backed means the master first writes the RDB file to disk and then sends it to the standbys over the network.
# 2) Diskless: The Redis master creates a new process that directly writes the
#              RDB file to replica sockets, without touching the disk at all.
#
# Note: diskless means the master streams the RDB directly over the network without touching the disk.
# With disk-backed replication, while the RDB file is generated, more replicas
# can be queued and served with the RDB file as soon as the current child producing
# the RDB file finishes its work. With diskless replication instead once
# the transfer starts, new replicas arriving will be queued and a new transfer
# will start when the current one terminates.
#
# When diskless replication is used, the master waits a configurable amount of
# time (in seconds) before starting the transfer in the hope that multiple replicas
# will arrive and the transfer can be parallelized.
#
# With slow disks and fast (large bandwidth) networks, diskless replication
# works better.
# no (the default) goes through the disk; yes streams the RDB to the standbys directly over the network
repl-diskless-sync no

# When diskless replication is enabled, it is possible to configure the delay
# the server waits in order to spawn the child that transfers the RDB via socket
# to the replicas.
#
# This is important since once the transfer starts, it is not possible to serve
# new replicas arriving, that will be queued for the next RDB transfer, so the server
# waits a delay in order to let more replicas arrive.
#
# The delay is specified in seconds, and by default is 5 seconds. To disable
# it entirely just set it to 0 seconds and the transfer will start ASAP.
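# Note: the delay is in seconds (default 5). It is how long the master waits before starting a diskless transfer, so that more standbys can join; 0 starts the transfer as soon as possible.
repl-diskless-sync-delay 5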

# Replicas send PINGs to server in a predefined interval. It's possible to change
# this interval with the repl_ping_replica_period option. The default value is 10
# seconds.
# Note: this is the interval of the heartbeat PING on the replication link, 10 seconds by default. It is a liveness check and does not pull data.
# repl-ping-replica-period 10   

# The following option sets the replication timeout for:
#
# 1) Bulk transfer I/O during SYNC, from the point of view of replica.
# 2) Master timeout from the point of view of replicas (data, pings).
# 3) Replica timeout from the point of view of masters (REPLCONF ACK pings).
#
# It is important to make sure that this value is greater than the value
# specified for repl-ping-replica-period otherwise a timeout will be detected
# every time there is low traffic between the master and the replica.
# Note: this value must be greater than repl-ping-replica-period (10 above); otherwise a timeout is detected whenever traffic between master and standby is low.
# repl-timeout 60

# Disable TCP_NODELAY on the replica socket after SYNC?
#
# If you select "yes" Redis will use a smaller number of TCP packets and
# less bandwidth to send data to replicas. But this can add a delay for
# the data to appear on the replica side, up to 40 milliseconds with
# Linux kernels using a default configuration.
#
# If you select "no" the delay for data to appear on the replica side will
# be reduced but more bandwidth will be used for replication.
#
# By default we optimize for low latency, but in very high traffic conditions
# or when the master and replicas are many hops away, turning this to "yes" may
# be a good idea.
# Note: the default favors low latency. With very heavy traffic, or when master and standbys are many network hops apart, yes may be the better choice.
repl-disable-tcp-nodelay no

# Set the replication backlog size. The backlog is a buffer that accumulates
# replica data when replicas are disconnected for some time, so that when a replica
# wants to reconnect again, often a full resync is not needed, but a partial
# resync is enough, just passing the portion of data the replica missed while
# disconnected.
# Note: the backlog is a buffer that accumulates the write stream. When a standby reconnects after a short disconnection, a partial resync that sends only the part it missed is usually enough.
# The bigger the replication backlog, the longer the time the replica can be
# disconnected and later be able to perform a partial resynchronization.
# Note: the larger the backlog, the longer a standby can stay disconnected and still do a partial resync.
# The backlog is only allocated once there is at least a replica connected.
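# Note: the backlog is only allocated once at least one standby is connected.
# This setting governs incremental synchronization; the default is 1 MB.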
# repl-backlog-size 1mb

# After a master has no longer connected replicas for some time, the backlog
# will be freed. The following option configures the amount of seconds that
# need to elapse, starting from the time the last replica disconnected, for
# the backlog buffer to be freed.
# Note: once the master has had no connected standby for this many seconds, counted from the moment the last standby disconnected, it frees the backlog.
# Note that replicas never free the backlog for timeout, since they may be
# promoted to masters later, and should be able to correctly "partially
# resynchronize" with the replicas: hence they should always accumulate backlog.
# Note: standbys never free their backlog on timeout, because they may be promoted to master later and must then be able to partially resynchronize other standbys.
# A value of 0 means to never release the backlog.
# Note: 0 means the backlog is never released.
# repl-backlog-ttl 3600

# The replica priority is an integer number published by Redis in the INFO output.
# Note: the standby priority is published in the INFO output.
# It is used by Redis Sentinel in order to select a replica to promote into a
# master if the master is no longer working correctly.
# Note: Redis Sentinel uses it to choose which standby to promote when the master stops working.
# A replica with a low priority number is considered better for promotion, so
# for instance if there are three replicas with priority 10, 100, 25 Sentinel will
# pick the one with priority 10, that is the lowest.
# Note: a lower number is preferred. Among priorities 10, 100 and 25, Sentinel picks 10.
# However a special priority of 0 marks the replica as not able to perform the
# role of master, so a replica with priority of 0 will never be selected by
# Redis Sentinel for promotion.
# Note: priority 0 is special: such a standby can never be promoted by Redis Sentinel.
# By default the priority is 100.
# Note: the default priority is 100.
replica-priority 100

# It is possible for a master to stop accepting writes if there are less than
# N replicas connected, having a lag less or equal than M seconds.
# Note: the master can refuse writes when fewer than N standbys are connected with a lag of at most M seconds.
# The N replicas need to be in "online" state.
# Note: the N standbys must be in the online state.
# The lag in seconds, that must be <= the specified value, is calculated from
# the last ping received from the replica, that is usually sent every second.
# Note: the lag is measured from the last ping received from the standby, which is normally sent every second.
# This option does not GUARANTEE that N replicas will accept the write, but
# will limit the window of exposure for lost writes in case not enough replicas
# are available, to the specified number of seconds.
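# Note: this does not guarantee that N standbys actually apply each write. It only limits, to the given number of seconds, the window in which writes can be lost when too few standbys are available. While too few are available, writes are refused.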
# For example to require at least 3 replicas with a lag <= 10 seconds use:
# Note: the example below requires at least 3 standbys with a lag of at most 10 seconds.
# min-replicas-to-write 3
# min-replicas-max-lag 10
#
# Setting one or the other to 0 disables the feature.
# Note: setting either one to 0 disables the feature.
# By default min-replicas-to-write is set to 0 (feature disabled) and
# min-replicas-max-lag is set to 10.
# Note: by default min-replicas-to-write is 0 (feature disabled) and min-replicas-max-lag is 10.
# A Redis master is able to list the address and port of the attached
# replicas in different ways. For example the "INFO replication" section
# offers this information, which is used, among other tools, by
# Redis Sentinel in order to discover replica instances.
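# Note: the master can list the address and port of its standbys in several ways, for example in the "INFO replication" section, which Redis Sentinel and other tools use to discover standbys.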
# Another place where this info is available is in the output of the
# "ROLE" command of a master.
# Note: the same information appears in the output of the master's ROLE command.
# The listed IP and address normally reported by a replica is obtained
# in the following way:
# Note: the IP and port reported for a standby are obtained as follows.
#   IP: The address is auto detected by checking the peer address
#   of the socket used by the replica to connect with the master.
# Note: the IP is detected from the peer address of the socket the standby uses to connect to the master.
#   Port: The port is communicated by the replica during the replication
#   handshake, and is normally the port that the replica is using to
#   listen for connections.
# Note: the port is announced by the standby during the replication handshake and is normally the port it listens on.
# However when port forwarding or Network Address Translation (NAT) is
# used, the replica may be actually reachable via different IP and port
# pairs. The following two options can be used by a replica in order to
# report to its master a specific set of IP and port, so that both INFO
# and ROLE will report those values.
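# Note: behind port forwarding or NAT the standby may be reachable under a different IP and port. The two options below let it report a specific IP and port to the master, which INFO and ROLE then show.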
# There is no need to use both the options if you need to override just
# the port or the IP address.
# Note: to override only the IP or only the port, set just that one option.
# replica-announce-ip 5.5.5.5
# replica-announce-port 1234
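
The min-replicas-to-write / min-replicas-max-lag pair in the file above boils down to a simple count. The function below is a hypothetical model of that check, not Redis code; `replica_lags` stands for the seconds since the last acknowledgement from each online standby.

```python
def accepts_writes(replica_lags, min_replicas_to_write, min_replicas_max_lag):
    """Model of the master-side check: are enough standbys close enough behind?"""
    if min_replicas_to_write == 0 or min_replicas_max_lag == 0:
        return True  # setting either option to 0 disables the feature
    good = sum(1 for lag in replica_lags if lag <= min_replicas_max_lag)
    return good >= min_replicas_to_write


if __name__ == "__main__":
    # Three standbys, one of them 30 seconds behind, with the example values 3 and 10.
    print(accepts_writes([1, 2, 30], 3, 10))
    print(accepts_writes([1, 2, 3], 3, 10))
```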

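Sentinel

High availability: Redis Sentinel is the official high-availability solution for Redis.

Monitoring: Sentinel constantly checks whether your masters and replicas are working as expected.

Notification: when a monitored Redis server runs into trouble, Sentinel can notify an administrator or another application through an API.

Automatic failover: when a master stops working properly, Sentinel starts a failover. It promotes one of the failed master's replicas to master and reconfigures the other replicas to replicate from the new master. Clients that try to connect to the failed master are given the address of the new master, so the new master takes over from the failed one.

Redis Sentinel is a distributed system: several Sentinel processes can run together in one deployment.

Although Redis Sentinel is started through its own executable, redis-sentinel, it is really just a Redis server running in a special mode. You can also start it by passing the --sentinel option to an ordinary redis-server.

Getting Sentinel

Sentinel was originally part of the unstable branch of Redis and had to be obtained by cloning that branch from the Redis project's GitHub page and compiling it.

After compilation the Sentinel program is found in the src directory, under the name redis-sentinel.

You can also run the redis-server program in Sentinel mode, as described in the next section.

A newer version of Sentinel has been included in the Redis release files since version 2.8.0.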

Starting Sentinel

With the redis-sentinel program, start Sentinel like this:

redis-sentinel /path/to/sentinel.conf

With the redis-server program, start a Redis server in Sentinel mode like this:

redis-server /path/to/sentinel.conf --sentinel

Both ways start a Sentinel instance.

A configuration file is mandatory when starting Sentinel: the system uses it to save Sentinel's current state and reloads it to restore that state when Sentinel restarts.

If no configuration file is given, or the given file is not writable, Sentinel refuses to start.

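Configuring Sentinel

The Redis source tree contains a file named sentinel.conf, a fully commented example of a Sentinel configuration file.

The minimal configuration needed to run a Sentinel looks like this:

sentinel monitor mymaster 127.0.0.1 6379 2
sentinel down-after-milliseconds mymaster 60000
sentinel failover-timeout mymaster 180000
sentinel parallel-syncs mymaster 1

sentinel monitor resque 192.168.1.3 6380 4
sentinel down-after-milliseconds resque 10000
sentinel failover-timeout resque 180000
sentinel parallel-syncs resque 5

sentinel monitor tells Sentinel to watch a master named mymaster at 127.0.0.1 port 6379, and says that at least 2 Sentinels must agree before the master is judged to have failed (the right number depends on how many Sentinels there are).
Whatever this number is, Sentinel cannot perform an automatic failover while only a minority of the Sentinel processes is working.

down-after-milliseconds is the number of milliseconds after which Sentinel considers a server to be down. If the server does not answer Sentinel's PING within that time, or answers with an error, Sentinel marks it as subjectively down. One Sentinel marking a server as subjectively down does not necessarily trigger a failover: only when enough Sentinels have marked it as subjectively down is the server marked as objectively down.
The number of Sentinels needed for objectively down is the one set in the monitor line for that master.

parallel-syncs is the maximum number of replicas that may synchronize with the new master at the same time during a failover. The smaller the number, the longer the failover takes.
If replicas are allowed to serve stale data (see replica-serve-stale-data, formerly slave-serve-stale-data, in redis.conf), you may not want all replicas to resynchronize with the new master at once. Most of the replication process does not block a replica, but while it loads the RDB file sent by the master it cannot process commands for a while. If all replicas resynchronize together, they may all be unavailable for a short time.
Setting the value to 1 ensures that only one replica at a time is unable to process commands.

The other options have the following general form:

sentinel <option-name> <master-name> <option-value>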

Subjectively down and objectively down

As mentioned above, Redis Sentinel has two different notions of a server being down:

Subjectively down (SDOWN) is the verdict a single Sentinel instance reaches about a server.
Objectively down (ODOWN) is the verdict reached after several Sentinel instances have each judged the same server to be SDOWN and have exchanged that view through the SENTINEL is-master-down-by-addr command. (A Sentinel can ask another Sentinel whether it considers a given server to be down by sending it SENTINEL is-master-down-by-addr.)

If a server does not return a valid reply to a Sentinel's PING within the time set by down-after-milliseconds, that Sentinel marks the server as subjectively down.

A valid reply to PING is one of the following three:

+PONG
the -LOADING error
the -MASTERDOWN error

Any other reply, or no reply within the given time, counts as a non-valid reply.

Note that a server is only marked as subjectively down if it has returned non-valid replies for the whole down-after-milliseconds period.

For example, with down-after-milliseconds set to 30000 (30 seconds), a server that returns at least one valid reply every 29 seconds is still considered healthy.
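
The subjectively-down rule can be expressed in a few lines. The names below are hypothetical and the function only models the timing rule described here; it is not taken from the Sentinel source.

```python
VALID_REPLY_PREFIXES = ("+PONG", "-LOADING", "-MASTERDOWN")


def is_valid_reply(reply: str) -> bool:
    """A PING reply counts as valid if it is one of the three listed replies."""
    return reply.startswith(VALID_REPLY_PREFIXES)


def is_subjectively_down(last_valid_reply_ms: int, now_ms: int, down_after_ms: int) -> bool:
    """True once no valid PING reply has been seen for longer than down_after_ms."""
    return now_ms - last_valid_reply_ms > down_after_ms


if __name__ == "__main__":
    # A valid reply 29 seconds ago, with a 30 second limit: still healthy.
    print(is_subjectively_down(0, 29_000, 30_000))
    # No valid reply for just over 30 seconds: subjectively down.
    print(is_subjectively_down(0, 30_001, 30_000))
```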

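The switch from subjectively down to objectively down does not use a strong quorum algorithm but a form of gossip: if a Sentinel receives enough reports from other Sentinels, within a given time range, that the master is down, it changes the master's state from subjectively down to objectively down. If the other Sentinels later stop reporting the master as down, the objectively down state is cleared.

The objectively down condition only applies to masters. For any other kind of Redis instance, Sentinel does not need to consult anyone before judging it to be down, so replicas and other Sentinels never reach the objectively down condition.

As soon as a Sentinel sees that a master has entered the objectively down state, it may be elected by the other Sentinels to carry out the automatic failover of that master.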

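Tasks every Sentinel performs periodically

Once per second, every Sentinel sends a PING to the masters, replicas, and other Sentinels it knows about.
If the time since an instance last sent a valid reply to PING exceeds down-after-milliseconds, the Sentinel marks that instance as subjectively down. A valid reply is +PONG, -LOADING or -MASTERDOWN.
When a master is marked as subjectively down, every Sentinel monitoring it checks once per second whether the master really is in that state.
When a master is marked as subjectively down and enough Sentinels (at least the number given in the configuration file) agree within the given time range, the master is marked as objectively down.
Normally every Sentinel sends INFO to all masters and replicas it knows once every 10 seconds. When a master is marked as objectively down, the Sentinel sends INFO to all replicas of that master once per second instead.
When too few Sentinels agree that the master is down, its objectively down state is cleared. When the master again returns a valid reply to a Sentinel's PING, its subjectively down state is cleared.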
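
The step from subjectively down to objectively down is a vote count against the quorum from the monitor line. A minimal sketch, assuming the Sentinel counts its own verdict as one vote and that the replies of the other Sentinels are already available as booleans (both are modelling choices of this example):

```python
def is_objectively_down(own_sdown: bool, other_reports, quorum: int) -> bool:
    """own_sdown: this Sentinel's verdict; other_reports: verdicts of other Sentinels."""
    if not own_sdown:
        return False  # a Sentinel only asks around once it sees the master as down
    votes = 1 + sum(1 for report in other_reports if report)
    return votes >= quorum


if __name__ == "__main__":
    print(is_objectively_down(True, [True, False], quorum=2))
    print(is_objectively_down(True, [False, False], quorum=2))
```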

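Automatic discovery of Sentinels and replicas

A Sentinel can be connected to several other Sentinels. They check each other's availability and exchange information.

You do not need to configure the addresses of the other Sentinels in each Sentinel: Sentinels discover other Sentinels that monitor the same master through publish/subscribe, by sending messages to the channel __sentinel__:hello.

Likewise, you do not need to list the replicas of a master by hand, because Sentinel obtains them by asking the master.

Every two seconds, each Sentinel publishes a message to the __sentinel__:hello channel of every master and replica it monitors. The message contains the Sentinel's IP address, port, and run ID (runid).
Each Sentinel also subscribes to the __sentinel__:hello channel of every master and replica it monitors, looking for Sentinels it has not seen before. When it finds a new one, it adds it to a list of all other Sentinels known to monitor the same master.
The message also carries the complete current configuration of the master. If a Sentinel holds an older configuration for the master than the one it receives, it switches to the newer one immediately.
Before adding a new Sentinel to the list for a master, a Sentinel checks whether the list already contains an entry with the same run ID or the same address (IP and port). If so, it first removes those entries and then adds the new Sentinel.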
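
The hello message and the duplicate check can be sketched as follows. The payload layout follows the messages captured later in this article (eight comma-separated fields); naming the fourth and eighth fields as epochs is my reading of them, and `Hello` and `add_sentinel` are hypothetical names.

```python
from typing import NamedTuple


class Hello(NamedTuple):
    ip: str
    port: int
    runid: str
    epoch: int          # assumed: the Sentinel's current epoch
    master_name: str
    master_ip: str
    master_port: int
    master_epoch: int   # assumed: the epoch of the master configuration


def parse_hello(payload: str) -> Hello:
    """Split one __sentinel__:hello payload into typed fields."""
    ip, port, runid, epoch, mname, mip, mport, mepoch = payload.split(",")
    return Hello(ip, int(port), runid, int(epoch), mname, mip, int(mport), int(mepoch))


def add_sentinel(known: list, new: Hello) -> list:
    """Drop entries sharing the run ID or the address with `new`, then append it."""
    kept = [
        s for s in known
        if s.runid != new.runid and (s.ip, s.port) != (new.ip, new.port)
    ]
    kept.append(new)
    return kept


if __name__ == "__main__":
    msg = "127.0.0.1,26379,7e752b8223188eefe98102c345445d9550543e78,0,mymaster,127.0.0.1,6379,0"
    print(parse_hello(msg))
```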

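Sentinel API

By default Sentinel listens on TCP port 26379 (an ordinary Redis server uses 6379).

Sentinel accepts commands in the Redis protocol, so you can talk to it with redis-cli or any other Redis client.

There are two ways to communicate with Sentinel:

The first is to send commands directly, to query the current state of the monitored Redis servers, what the Sentinel knows about other Sentinels, and so on.
The second is publish/subscribe: Sentinel sends a notification whenever a failover is performed or a monitored server is judged to be subjectively or objectively down.

Sentinel commands

Sentinel accepts the following commands:

PING: returns PONG.
SENTINEL masters: lists all monitored masters and their current state.
SENTINEL slaves <master name>: lists all replicas of the given master and their current state.
SENTINEL get-master-addr-by-name <master name>: returns the IP address and port of the master with the given name. If a failover of this master is in progress or has completed, the command returns the IP address and port of the new master.
SENTINEL reset <pattern>: resets all masters whose name matches the given glob-style pattern. The reset clears all current state of the master, including a failover in progress, and removes all replicas and Sentinels discovered and associated with it so far.
SENTINEL failover <master name>: when the master has failed, forces a failover without asking the other Sentinels for agreement (the Sentinel that starts the failover still sends the new configuration to the other Sentinels, which update themselves accordingly).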
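
Because Sentinel speaks the Redis protocol, a command is just an array of bulk strings on the wire. The helper below is a small illustration of that encoding; `encode_command` is a hypothetical name, and in practice redis-cli or a client library does this for you.

```python
def encode_command(*args: str) -> bytes:
    """Encode a command as a RESP array of bulk strings."""
    parts = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = arg.encode("utf-8")
        parts.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(parts)


if __name__ == "__main__":
    # These bytes could be written to a TCP socket connected to port 26379.
    print(encode_command("SENTINEL", "get-master-addr-by-name", "mymaster"))
```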

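Publish/subscribe messages

A client can treat Sentinel as a Redis server that offers only the subscribe side: you cannot send messages to it with PUBLISH, but you can use SUBSCRIBE or PSUBSCRIBE on a channel to receive the corresponding event notifications.

A channel receives the events whose name equals the channel name. For example, the channel +sdown receives every event of an instance entering the subjectively down (SDOWN) state.

PSUBSCRIBE * receives all events.

The channels and message formats available to subscribers are listed below. The first word is the channel/event name; the rest is the format of the data.

Where a format contains the words instance details, the message includes the following fields identifying the target instance:

<instance-type> <name> <ip> <port> @ <master-name> <master-ip> <master-port>

The part after the @ character identifies the master. It is optional and is only present when the instance named before the @ is not itself a master.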

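+reset-master: the master was reset.
+slave: a new replica was detected and attached.
+failover-state-reconf-slaves: the failover state changed to reconf-slaves.
+failover-detected: another Sentinel started a failover, or a replica turned into a master.
+slave-reconf-sent: the leader Sentinel sent the SLAVEOF command to the instance to give it a new master.
+slave-reconf-inprog: the instance is reconfiguring itself as a replica of the given master, but the synchronization has not finished yet.
+slave-reconf-done: the replica has finished synchronizing with the new master.
-dup-sentinel: one or more Sentinels monitoring the given master were removed as duplicates; this happens when a Sentinel instance restarts.
+sentinel: a new Sentinel monitoring the given master was detected and added.
+sdown: the given instance is now subjectively down.
-sdown: the given instance is no longer subjectively down.
+odown: the given instance is now objectively down.
-odown: the given instance is no longer objectively down.
+new-epoch: the current epoch was updated.
+try-failover: a new failover is in progress, waiting to be elected by the majority.
+elected-leader: won the election for the given epoch and can carry out the failover.
+failover-state-select-slave: the failover is in the select-slave state; Sentinel is looking for a replica to promote.
no-good-slave: Sentinel found no replica suitable for promotion. It will try again after some time, or give up the failover altogether.
selected-slave: Sentinel found a replica suitable for promotion.
failover-state-send-slaveof-noone: Sentinel is promoting the selected replica to master and waiting for the promotion to complete.
failover-end-for-timeout: the failover was aborted because of a timeout, but the replicas will eventually be configured to replicate from the new master anyway.
failover-end: the failover completed successfully; all replicas now replicate from the new master.
+switch-master: the configuration changed and the master now has a new IP address and port. This is the message most external users care about.
+tilt: tilt mode entered.
-tilt: tilt mode exited.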
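
A subscriber usually wants the instance details of an event as structured data. A minimal parser for the `<instance-type> <name> <ip> <port> @ <master-name> <master-ip> <master-port>` layout, with hypothetical function and key names:

```python
def parse_instance_details(payload: str) -> dict:
    """Parse '<type> <name> <ip> <port> [@ <master-name> <master-ip> <master-port>]'."""
    tokens = payload.split()
    details = {
        "type": tokens[0],
        "name": tokens[1],
        "ip": tokens[2],
        "port": int(tokens[3]),
    }
    # The part after '@' is only present when the instance is not itself a master.
    if len(tokens) >= 8 and tokens[4] == "@":
        details["master"] = {
            "name": tokens[5],
            "ip": tokens[6],
            "port": int(tokens[7]),
        }
    return details


if __name__ == "__main__":
    print(parse_instance_details(
        "slave 127.0.0.1:6380 127.0.0.1 6380 @ mymaster 127.0.0.1 6379"
    ))
```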

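Failover

A failover consists of the following steps:

Detect that the master has entered the objectively down state.
Increment the current epoch (see Raft leader election) and try to get elected for that epoch.
If the election fails, try again after twice the configured failover timeout. If it succeeds, continue with the steps below.
Choose a replica to promote to master.
Send SLAVEOF NO ONE to the chosen replica to turn it into a master.
Propagate the updated configuration to all other Sentinels through publish/subscribe; they update their own configuration.
Send SLAVEOF to the replicas of the failed master so that they replicate from the new master.
Once all replicas replicate from the new master, the leader Sentinel ends the failover.

Whenever a Redis instance is reconfigured, whether it becomes a master, a replica, or a replica of a different master, Sentinel sends it a CONFIG REWRITE command so that the new configuration is persisted to disk.

Sentinel chooses the new master by these rules:

Among the replicas of the failed master, those that are marked as subjectively down, are disconnected, or last replied to PING more than five seconds ago are eliminated.
Among the replicas of the failed master, those whose link to the failed master has been down for more than ten times the down-after value are eliminated.
Among the replicas left after these two rounds, the one with the largest replication offset becomes the new master. If the offset is unavailable, or several replicas share the same offset, the one with the smallest run ID becomes the new master.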
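
The selection rules translate directly into a filter followed by a sort key. This sketch follows only the rules listed in this section; `ReplicaInfo` and its fields are hypothetical, and the replica-priority setting from redis.conf is deliberately not modelled.

```python
from dataclasses import dataclass


@dataclass
class ReplicaInfo:
    runid: str
    sdown: bool                  # marked subjectively down
    disconnected: bool           # link to this replica is down
    last_ping_reply_age_ms: int  # time since its last PING reply
    master_link_down_ms: int     # how long its link to the old master has been down
    repl_offset: int             # replication offset


def select_replica(replicas, down_after_ms):
    """Return the replica to promote, or None if no candidate survives the filters."""
    candidates = [
        r for r in replicas
        if not r.sdown
        and not r.disconnected
        and r.last_ping_reply_age_ms <= 5000
        and r.master_link_down_ms <= 10 * down_after_ms
    ]
    if not candidates:
        return None
    # Largest offset first; among equal offsets the smallest run ID wins.
    return min(candidates, key=lambda r: (-r.repl_offset, r.runid))


if __name__ == "__main__":
    pool = [
        ReplicaInfo("bbb", False, False, 100, 0, 900),
        ReplicaInfo("abc", False, False, 100, 0, 900),
        ReplicaInfo("ccc", True, False, 100, 0, 9999),
    ]
    print(select_replica(pool, down_after_ms=30000))
```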

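Consistency properties of Sentinel's automatic failover

Sentinel's automatic failover uses the Raft leader-election algorithm to elect the leader Sentinel, which ensures that only one leader emerges in a given epoch.

Put simply, a Sentinel configuration is a state with a version number. A state is propagated to all other Sentinels in a last-write-wins fashion, that is, the newest configuration always wins.

For example, during a network partition a Sentinel may hold an older configuration. When it receives a configuration with a newer version from another Sentinel, it updates its own.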
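
Last-write-wins over a versioned state is a one-line comparison. In this sketch the version number is called `epoch`, and `MasterConfig` and `merge` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MasterConfig:
    epoch: int  # version number of this configuration
    ip: str
    port: int


def merge(local: MasterConfig, received: MasterConfig) -> MasterConfig:
    """Keep whichever configuration carries the higher version."""
    return received if received.epoch > local.epoch else local


if __name__ == "__main__":
    stale = MasterConfig(epoch=0, ip="127.0.0.1", port=6379)
    fresh = MasterConfig(epoch=1, ip="127.0.0.1", port=6380)
    print(merge(stale, fresh))  # the partitioned Sentinel adopts the newer config
```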

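To preserve consistency even during a network partition, use the min-replicas-to-write option (min-slaves-to-write before Redis 5.0) so that the master stops accepting writes when fewer than the given number of replicas are connected, and run a Redis Sentinel process on every machine that runs a Redis master or replica.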

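Persistence of Sentinel state

Sentinel's state is persisted in the Sentinel configuration file.

Whenever a Sentinel receives a new configuration, or the leader Sentinel creates a new configuration for a master, that configuration is saved to disk together with its configuration epoch.

This means that stopping and restarting Sentinel processes is safe.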

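Hands-on:

Sentinel config file, in the Redis source directory:
vi 26379.cof
port 26379
sentinel monitor mymaster 127.0.0.1 6379 2



The 2 here is a business decision: if the service can tolerate a partition, it may be less than half of the sentinels; if it cannot, set it strictly to more than half.

redis-server /root/redisCof/6379.cof
redis-server /root/redisCof/6380.cof --replicaof 127.0.0.1 6379
redis-server /root/redisCof/6381.cof --replicaof 127.0.0.1 6379


Start three sentinels:

redis-server  ./26379.cof --sentinel
The sentinel's output also lists the replicas, although the config file only names the master.
The master knows which replicas follow it, and the sentinel learns about them by querying the master.

redis-server  ./26380.cof --sentinel
The second sentinel shows the master and replicas and, in addition, the first sentinel, although the config file says nothing about other sentinels. Sentinels find each other through the __sentinel__:hello channel of the master they all monitor.

redis-server  ./26381.cof --sentinel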


 Sentinel ID is 48889976350b2abdf17b4bc102df0adb7e993272
+monitor master mymaster 127.0.0.1 6379 quorum 2
+slave slave 127.0.0.1:6380 127.0.0.1 6380 @ mymaster 127.0.0.1 6379
+slave slave 127.0.0.1:6381 127.0.0.1 6381 @ mymaster 127.0.0.1 6379
+sentinel sentinel 7e752b8223188eefe98102c345445d9550543e78 127.0.0.1 26379 @ mymaster 127.0.0.1 6379 --> another Sentinel's information, learned because that Sentinel published it and this one subscribed
1241:X 08 Oct 2019 10:22:46.353 * +sentinel sentinel 589af062f9d4ffc6c2e29695ce77ac5ff3fdea19 127.0.0.1 26380 @ mymaster 127.0.0.1 6379


After all three Sentinels have started,
let's see how the Sentinels find out about each other:
through Redis's built-in publish/subscribe.
redis-cli
PSUBSCRIBE *
1) "pmessage"
2) "*"
3) "__sentinel__:hello"
4) "127.0.0.1,26379,7e752b8223188eefe98102c345445d9550543e78,0,mymaster,127.0.0.1,6379,0"
1) "pmessage"
2) "*"
3) "__sentinel__:hello"
4) "127.0.0.1,26380,589af062f9d4ffc6c2e29695ce77ac5ff3fdea19,0,mymaster,127.0.0.1,6379,0"
1) "pmessage"
2) "*"
3) "__sentinel__:hello"
4) "127.0.0.1,26381,48889976350b2abdf17b4bc102df0adb7e993272,0,mymaster,127.0.0.1,6379,0"

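Every two seconds each Sentinel publishes a hello message to the __sentinel__:hello channel (pmessage is only the reply type that PSUBSCRIBE returns, not the channel). In it the Sentinel announces its own address and ID and which master it is monitoring.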
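To make the payload easier to read, here is a small parser for the comma-separated hello message shown above. The field names are my reading of the eight fields (the two trailing zeros are taken to be the current epoch and the master's configuration epoch); treat them as assumptions rather than an official schema.

```python
from typing import NamedTuple


class SentinelHello(NamedTuple):
    # Field names are assumptions chosen for readability.
    sentinel_ip: str
    sentinel_port: int
    sentinel_run_id: str
    current_epoch: int
    master_name: str
    master_ip: str
    master_port: int
    master_config_epoch: int


def parse_hello(payload: str) -> SentinelHello:
    """Split one __sentinel__:hello payload into named fields."""
    ip, port, run_id, epoch, name, m_ip, m_port, m_epoch = payload.split(",")
    return SentinelHello(ip, int(port), run_id, int(epoch),
                         name, m_ip, int(m_port), int(m_epoch))


hello = parse_hello(
    "127.0.0.1,26379,7e752b8223188eefe98102c345445d9550543e78,"
    "0,mymaster,127.0.0.1,6379,0"
)
print(hello.sentinel_port, hello.master_name, hello.master_port)
```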





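If the master on 6379 goes down

One Sentinel's log turns out to be noticeably longer than the other two. That is because the Sentinels elect a leader, and only the leader sends commands such as replicaof no one to the chosen replica. If any Sentinel could send them, the Sentinels would not know whose decision to follow, and therefore would not know which new master to monitor.


down master mymaster 127.0.0.1 6379 #quorum 2/2 ---> the quorum is met; Sentinel keeps trying to reach the master, and if it is still unreachable after the configured time, a new master is chosen

failover-state-send-【slaveof-noone】 slave 127.0.0.1:6380 127.0.0.1 6380 @ mymaster 127.0.0.1 6379
The leader Sentinel sends slaveof no one to 6380, and 6380 becomes the new master

【switch-master mymaster 127.0.0.1 6379 127.0.0.1 6380
slave slave 127.0.0.1:6381 127.0.0.1 6381 @ mymaster 127.0.0.1 6380
+slave slave 127.0.0.1:6379 127.0.0.1 6379 @ mymaster 127.0.0.1 6380】 --> master/replica switch

Written to the configuration file

Then run vi /root/redisCof/26379.cof
and you will see that the configuration file has changed ----> Sentinel rewrites its configuration file when it switches master and replica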

Sentinel configuration file


# *** IMPORTANT ***
#
# By default Sentinel will not be reachable from interfaces different than
# localhost, either use the 'bind' directive to bind to a list of network
# interfaces, or disable protected mode with "protected-mode no" by
# adding it to this configuration file.
#
# Before doing that MAKE SURE the instance is protected from the outside world via firewalling or other means.
#
# For example you may use one of the following:
#
# bind 127.0.0.1 192.168.1.1 -- note: binds Sentinel to these local network interfaces (not to a remote host)
#
# protected-mode no --- note: disables protected mode

# port <sentinel-port>
# The port that this sentinel instance will run on
port 26379 -- note: the port this Sentinel listens on

# By default Redis Sentinel does not run as a daemon. Use 'yes' if you need it.
# Note that Redis will write a pid file in /var/run/redis-sentinel.pid when
# daemonized.
daemonize no --- note: the process runs in the foreground and blocks the terminal

# When running daemonized, Redis Sentinel writes a pid file in
# /var/run/redis-sentinel.pid by default. You can specify a custom pid file
# location here.
pidfile /var/run/redis-sentinel.pid --- note: location of the PID file

# Specify the log file name. Also the empty string can be used to force
# Sentinel to log on the standard output. Note that if you use standard
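# output for logging but daemonize, logs will be sent to /dev/null
# Note: when running in the foreground (daemonize no) with an empty logfile, logs are printed to the terminal.
logfile "" -- note: if a file name is given, logs are written to that file on disk instead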

# sentinel announce-ip <ip>
# sentinel announce-port <port>
# Note: in a distributed deployment, these let the Sentinel announce its IP and port to the others.
#
# The above two configuration directives are useful in environments where,
# because of NAT, Sentinel is reachable from outside via a non-local address.
#
# When announce-ip is provided, the Sentinel will claim the specified IP address
# in HELLO messages used to gossip its presence, instead of auto-detecting the
# local address as it usually does.
#
# Similarly when announce-port is provided and is valid and non-zero, Sentinel
# will announce the specified TCP port.
#
# The two options don't need to be used together, if only announce-ip is
# provided, the Sentinel will announce the specified IP and the server port
# as specified by the "port" option. If only announce-port is provided, the
# Sentinel will announce the auto-detected local IP and the specified port.
#
# Example:
#
# sentinel announce-ip 1.2.3.4

# dir <working-directory>
# Every long running process should have a well-defined working directory.
# For Redis Sentinel to chdir to /tmp at startup is the simplest thing
# for the process to don't interfere with administrative tasks such as
# unmounting filesystems.
# Note: the working directory defaults to /tmp; for safety, point it at a dedicated directory. It mainly holds logs.
dir /tmp

# sentinel monitor <master-name> <ip> <redis-port> <quorum>
# Note: configures the master to monitor; it can also be specified at startup.
# Tells Sentinel to monitor this master, and to consider it in O_DOWN
# (Objectively Down) state only if at least <quorum> sentinels agree.
#
# Note that whatever is the ODOWN quorum, a Sentinel will require to
# be elected by the majority of the known Sentinels in order to
# start a failover, so no failover can be performed in minority.
#
# Replicas are auto-discovered, so you don't need to specify replicas in
# any way. Sentinel itself will rewrite this configuration file adding
# the replicas using additional configuration options.
# Also note that the configuration file is rewritten when a
# replica is promoted to master.
#
# Note: master name should not include special characters or spaces.
# The valid charset is A-z 0-9 and the three characters ".-_".
sentinel monitor mymaster 127.0.0.1 6379 2

# sentinel auth-pass <master-name> <password>
# Note: the password Sentinel uses to access the master.
# Set the password to use to authenticate with the master and replicas.
# Useful if there is a password set in the Redis instances to monitor.
#
# Note that the master password is also used for replicas, so it is not
# possible to set a different password in masters and replicas instances
# if you want to be able to monitor these instances with Sentinel.
#
# However you can have Redis instances without the authentication enabled
# mixed with Redis instances requiring the authentication (as long as the
# password set is the same for all the instances requiring the password) as
# the AUTH command will have no effect in Redis instances with authentication
# switched off.
#
# Example:
#
# sentinel auth-pass mymaster MySUPER--secret-0123passw0rd

# sentinel down-after-milliseconds <master-name> <milliseconds>
#
# Number of milliseconds the master (or any attached replica or sentinel) should
# be unreachable (as in, not acceptable reply to PING, continuously, for
# the specified period) in order to consider it in S_DOWN state (Subjectively
# Down).
#
# Default is 30 seconds.
sentinel down-after-milliseconds mymaster 30000

# sentinel parallel-syncs <master-name> <numreplicas>
#
# How many replicas we can reconfigure to point to the new replica simultaneously during the failover. Use a low number if you use the replicas to serve query to avoid that all the replicas will be unreachable at about the same time while performing the synchronization with the master.
#

sentinel parallel-syncs mymaster 1

# sentinel failover-timeout <master-name> <milliseconds>
#
# Specifies the failover timeout in milliseconds. It is used in many ways:
#
# - The time needed to re-start a failover after a previous failover was already tried against the same master by a given Sentinel, is two times the failover timeout.
#
# - The time needed for a replica replicating to a wrong master according to a Sentinel current configuration, to be forced to replicate with the right master, is exactly the failover timeout (counting since the moment a Sentinel detected the misconfiguration).
#
# - The time needed to cancel a failover that is already in progress but did not produced any configuration change (SLAVEOF NO ONE yet not acknowledged by the promoted replica).
#
# - The maximum time a failover in progress waits for all the replicas to be reconfigured as replicas of the new master. However even after this time the replicas will be reconfigured by the Sentinels anyway, but not with the exact parallel-syncs progression as specified.
#
# Note: Sentinel only handles the master/replica switch; it does not handle data synchronization.
# Default is 3 minutes.
sentinel failover-timeout mymaster 180000

# SCRIPTS EXECUTION
#
# sentinel notification-script and sentinel reconfig-script are used in order to configure scripts that are called to notify the system administrator or to reconfigure clients after a failover. The scripts are executed with the following rules for error handling:
#
# If script exits with "1" the execution is retried later (up to a maximum number of times currently set to 10).
#
# If script exits with "2" (or an higher value) the script execution is not retried.
#
# If script terminates because it receives a signal the behavior is the same as exit code 1.
#
# A script has a maximum running time of 60 seconds. After this limit is reached the script is terminated with a SIGKILL and the execution retried.

# NOTIFICATION SCRIPT
#
# sentinel notification-script <master-name> <script-path>
#
# Call the specified notification script for any sentinel event that is generated in the WARNING level (for instance -sdown, -odown, and so forth).
# This script should notify the system administrator via email, SMS, or any other messaging system, that there is something wrong with the monitored Redis systems.
#
# The script is called with just two arguments: the first is the event type and the second the event description.
#
# The script must exist and be executable in order for sentinel to start if this option is provided.
#
# Example:
# Note: this script notifies a 【human】 so that they know which machine went down.
# sentinel notification-script mymaster /var/redis/notify.sh

# CLIENTS RECONFIGURATION SCRIPT
#
# sentinel client-reconfig-script <master-name> <script-path>
#
# When the master changed because of a failover a script can be called in order to perform application-specific tasks to notify the clients that the configuration has changed and the master is at a different address.
#
# The following arguments are passed to the script:
#
# <master-name> <role> <state> <from-ip> <from-port> <to-ip> <to-port>
#
# <state> is currently always "failover"
# <role> is either "leader" or "observer"
#
# The arguments from-ip, from-port, to-ip, to-port are used to communicate the old address of the master and the new address of the elected replica(now a master).
#
# This script should be resistant to multiple invocations.
#
# Example:
#
# sentinel client-reconfig-script mymaster /var/redis/reconfig.sh
# Note: this script tells the clients, that is, the microservices, that the Redis master has changed.

# SECURITY
#
# By default SENTINEL SET will not be able to change the notification-script and client-reconfig-script at runtime. This avoids a trivial security issue where clients can set the script to anything and trigger a failover in order to get the program executed.
sentinel deny-scripts-reconfig yes

# REDIS COMMANDS RENAMING
#
# Sometimes the Redis server has certain commands, that are needed for Sentinel to work correctly, renamed to unguessable strings. This is often the case of CONFIG and SLAVEOF in the context of providers that provide Redis as a service, and don't want the customers to reconfigure the instances outside of the administration console.
#
# In such case it is possible to tell Sentinel to use different command names instead of the normal ones. For example if the master "mymaster", and the associated replicas, have "CONFIG" all renamed to "GUESSME", I could use:
#
# SENTINEL rename-command mymaster CONFIG GUESSME
# Note: this renames the command as Sentinel sends it.
#
# After such configuration is set, every time Sentinel would use CONFIG it will use GUESSME instead. Note that there is no actual need to respect the command case, so writing "config guessme" is the same in the example above.
#
# SENTINEL SET can also be used in order to perform this configuration at runtime.
#
# In order to set a command back to its original name (undo the renaming), it is possible to just rename a command to itself:
#
# SENTINEL rename-command mymaster CONFIG CONFIG

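Summary of master-replica replication:

The three dimensions of AKF: the x, y, and z axes

Master-replica replication and Sentinel high availability do not solve the limited-capacity problem

--- y1 --->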

1570581968476

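So we scale along the y axis: if the data can be split, split Redis by business logic, and let the application code decide which Redis each request goes to

--- z1 --->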

1570582188606

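Scaling along the z axis: the client picks a shard with 【hashCode + modulo】

The modulus is the number of Redis master-replica groups behind it

This is sharding

Hash + modulo has an inherent weakness: the modulus must be fixed, either mod 3 or mod 4. One algorithm cannot use different moduli, because the same request has to land on the same Redis.

This limits how the cluster can scale out. On the upside, such a setup can be used as a database as well as a cache. In general this scheme is 【rarely used】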
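A quick experiment shows why a changing modulus hurts. The sketch below assumes CRC32 as the hash and made-up key names; it counts how many keys would land on a different shard when going from 3 shards to 4.

```python
import zlib


def shard_for(key: str, shard_count: int) -> int:
    """Hash + modulo sharding: the same key always maps to the same shard."""
    return zlib.crc32(key.encode()) % shard_count


def moved_fraction(keys, old_count: int, new_count: int) -> float:
    """Fraction of keys whose shard changes when the modulus changes."""
    moved = sum(
        1 for k in keys
        if shard_for(k, old_count) != shard_for(k, new_count)
    )
    return moved / len(keys)


keys = [f"user:{i}" for i in range(10_000)]
print(f"moved when going from 3 to 4 shards: {moved_fraction(keys, 3, 4):.0%}")
```

With a reasonably uniform hash, roughly three quarters of the keys change shard, which is the "global reshuffle" mentioned later in this post.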

--- z2 --->

1570583915517

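Scaling along the z axis: the client uses random() to pick a Redis at random

The advantage is that any number of machines can be added to the cluster

But the range of uses is narrow: it only works as a message queue

For example, write at random to one or more Redis instances under a key ooxx of type list, and let other clients simply pop

5 entries were pushed to the list on redis 1

5 entries were pushed to the list on redis 2; consumers just pop

This is generally a message queue, like Kafka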

--- z3 --->

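A hash algorithm is a mapping algorithm. Common ones are CRC16, CRC32, FNV, and MD5 ----> note that no modulo is involved here

These algorithms map the request string to a fixed-width value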

1570586239289

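With consistent hashing, two things in the cluster take part: the machines (nodes) and the data. Both go through the same hash function.

First hash each node and record it in an ordered data structure, for example a TreeMap (a red-black tree): compute the node's hash and insert it into the tree.

Then hash the data and look it up in the tree. You can always find its nearest node, which is the first node whose hash is not smaller than the data's hash, wrapping around to the smallest node when there is none.

Key point: few physical nodes means the data skews easily, so create several virtual nodes for each physical node, all pointing at the same physical machine.

For example, if the cluster only has node1 and node02 and you hash their IP addresses, there are only two points on the ring, and the data skews easily.

So hash ip:1, ip:2, ip:3 and so on instead. Each physical machine then appears as several virtual machines; with more points on the ring, the skew goes down.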
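The ring with virtual nodes can be sketched with a sorted list and binary search, which plays the role of the TreeMap here. This is a minimal illustration, assuming MD5 as the hash and 100 virtual nodes per machine; the class and method names are hypothetical.

```python
import bisect
import hashlib


def ring_hash(value: str) -> int:
    """Map a string to a 32-bit position on the ring (MD5 is an assumption)."""
    return int.from_bytes(hashlib.md5(value.encode()).digest()[:4], "big")


class HashRing:
    def __init__(self, nodes, vnodes: int = 100):
        self._vnodes = vnodes
        self._points = []   # sorted ring positions
        self._owner = {}    # ring position -> physical node
        for node in nodes:
            self.add(node)

    def add(self, node: str) -> None:
        # "ip:1", "ip:2", ... all point back to the same physical node.
        for i in range(self._vnodes):
            point = ring_hash(f"{node}:{i}")
            if point not in self._owner:
                bisect.insort(self._points, point)
                self._owner[point] = node

    def lookup(self, key: str) -> str:
        # First position >= the key's hash, wrapping around to the start.
        idx = bisect.bisect_left(self._points, ring_hash(key))
        if idx == len(self._points):
            idx = 0
        return self._owner[self._points[idx]]


ring = HashRing(["10.0.0.1", "10.0.0.2"])
print(ring.lookup("k1"), ring.lookup("k2"))
```

Raising vnodes evens out the load at the cost of a larger lookup table.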

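Advantage: it really does take load off the other nodes, and adding a node does not require a global reshuffle (a global reshuffle is what hash + modulo needs: add one node, the modulus goes up by 1, and all the data has to move).

Disadvantage: a small share of requests will miss. Some keys used to map to another node and now map to the newly added one, so the data cannot be read there and the request fails.

--> The consequence is cache breakdown: the request falls through to MySQL, the value is read there and then cached in Redis, so the next request no longer falls through.

--> Workaround: always check the two nearest nodes. If the nearest one does not have the data, try the second nearest, and only then go to MySQL. ----> This costs efficiency, so it is a trade-off.

--> This scheme suits a cache rather than a database: consistency requirements are low and cached data is optional, so use consistent hashing with a ring of nodes.

Data on old nodes that can no longer be reached will be evicted by Redis's expiration and eviction mechanisms.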

1570588007449

Used as a cache

1570589856361

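There may be many Redis clients, and each of them has to open connections to every Redis instance.

Whether the connection is long-lived or short-lived, it has to go through TCP socket setup, so connections are expensive for Redis.

So put a reverse proxy and load balancer in the middle.

The proxy layer runs the consistent hashing on the client requests.

That cuts the connection cost dramatically.

1570590072163

Now only the performance and efficiency of the proxy machines matter.

If concurrency is extremely high,

run the proxy as a set of mirrored instances, put an LVS master/backup pair in front of them, and let keepalived handle the LVS failover and health-check the mirrored proxies.

1570590445472

Why can proxy nodes be added so freely, with nothing more than a load balancer in front? Because the proxy layer is stateless: it has no database and no complex business logic, it only forwards data requests.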

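Why partitioning is useful

Partitioning in Redis serves two main goals:

It lets Redis manage much more memory by using the memory of all machines. Without partitioning you are limited to the memory of a single machine.
It lets the computing power of Redis scale by simply adding machines, and the network bandwidth scale with the number of machines and network adapters.

Different implementations of partitioning

Partitioning can be implemented at different layers of the software stack.

Client-side partitioning means the client itself decides which Redis node a key is written to or read from. Many Redis clients implement client-side partitioning.
Proxy-assisted partitioning means the client sends requests to a proxy, and the proxy decides which node to write to or read from. The proxy forwards the request to the right Redis instance according to the partitioning rules and sends the reply back to the client. Twemproxy is one such proxy for Redis and Memcached.
Query routing means the client can send a query to any Redis instance, and Redis makes sure it reaches the right node. Redis Cluster implements a hybrid form of query routing: the request is not forwarded from one node to another; instead, the client is redirected to the right node.

Disadvantages of partitioning

Some features are limited when partitioning is used:

Operations involving multiple keys are usually not supported. For example, you cannot intersect two sets if they are stored on different Redis instances (there are ways to do it, but not with the intersection command directly).
Transactions involving multiple keys cannot be used.
The partitioning granularity is the key, so a dataset stored under a single huge key, such as a very big sorted set, cannot be sharded.
Data handling becomes more complex. For a backup, for instance, you have to collect the 【RDB / AOF】 files from 【several Redis instances】 and hosts at the same time.
Adding and removing capacity can be complex. Redis Cluster can rebalance data at runtime, mostly transparently to the user, when nodes are added or removed, but client-side and proxy-based partitioning do not support this. A technique called pre-sharding helps here.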

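Pre-sharding

Unless Redis is used as a cache, adding and removing nodes (dynamically, in production) is troublesome, while a fixed keys-to-instances mapping is much simpler.

Storage needs tend to change over time. Today 10 Redis nodes may be enough, but tomorrow you may need 50.

Since Redis is so lightweight (a spare instance uses about 1 MB of memory), the best way to prepare for later growth is to start with many instances from the beginning. 【Even if you only have one server, you can run Redis in a distributed fashion from day one】 by partitioning across several instances on that server.

Start with a generous number of Redis instances, for example 32 or 64. For most users this is a bit of a hassle at first, but in the long run the sacrifice is worth it.

When your data grows and you need more Redis servers, all you have to do is move instances from one server to another, with no repartitioning. Once you add a second server, you move half of the instances from the first machine to the second.

With Redis replication you can do this with very little or no downtime for your users:

Start an empty Redis instance on the new server.
Configure the new instance as a replica (slave) of the original instance.
Stop your clients.
Update the client configuration to use the new Redis instance (new IP address).
Run SLAVEOF NO ONE on the new instance.
Restart your clients with the updated configuration.
Shut down the Redis instance on the old server.

The benefit of partitioning from the start is this: otherwise I would have to take the modulo of every key in Redis, compare it, and then move the key, which is too slow. With pre-sharding whole instances are moved, and no per-key check is needed.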

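Data store or cache?

From the partitioning point of view, using Redis as a persistent data store or as a cache works the same way, with one difference. When Redis is a data store, a given key must always map to the same Redis instance. When Redis is a cache, sending a request to another instance because one node is unavailable does no harm, so the mapping can be changed freely to improve availability (the system's ability to respond).

Consistent hashing can switch to another node when the preferred node for a key is unavailable. Likewise, when you add a new node, some new keys start being stored on it right away.

The main conclusions are:

If Redis is used as a cache, scale up and down with consistent hashing.
If Redis is used as a persistent store, use a fixed keys-to-nodes mapping with a fixed number of nodes. Otherwise (when the number of nodes has to vary), you need a system that can rebalance data at runtime, and that is pre-partitioning.

--- To sum up the above in one point:

In a distributed Redis setup, with the three client-side schemes above (hash + modulo, random, consistent hashing), Redis cannot serve as a database that also has to scale; it can only serve as a cache.

To use it as a database while keeping both scalability and consistency, the data has to be migrated, and the migration algorithm has to be efficient.

----> pre-partitioning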

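Introduction to Redis Cluster

Redis Cluster is a set of programs that shares data across multiple Redis nodes.

Redis Cluster does not support commands that operate on multiple keys, because that would require moving data between nodes, which would not reach Redis-like performance and could cause unpredictable errors under high load.

Redis Cluster provides a degree of availability through partitioning: in practice it keeps processing commands when a node is down or unreachable. Advantages of Redis Cluster:

Data is split across nodes automatically.
Commands keep being processed when part of the cluster has failed or is unreachable.

Data sharding in Redis Cluster

Redis Cluster does not use consistent hashing. It introduces the concept of hash slots instead.

Redis Cluster has 16384 hash slots. 【Every key】 is run through CRC16 and the result is taken 【modulo 16384】 to decide its slot. Each node of the cluster is responsible for a subset of the hash slots. For example, with 3 nodes: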

Node A holds hash slots 0 to 5500.
Node B holds hash slots 5501 to 11000.
Node C holds hash slots 11001 to 16383.
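As a rough sketch of the slot calculation: CRC16 here is the CCITT/XMODEM variant (polynomial 0x1021, initial value 0), and the three slot ranges are the example above. The function names and the SLOT_RANGES table are illustrative, and hash tags are ignored in this sketch.

```python
def crc16_xmodem(data: bytes) -> int:
    """Bitwise CRC16-CCITT (XMODEM): polynomial 0x1021, initial value 0."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_slot(key: str) -> int:
    """Slot of a key: CRC16 of the key modulo 16384."""
    return crc16_xmodem(key.encode()) % 16384


# Example layout from the text; slots run from 0 to 16383.
SLOT_RANGES = {
    "A": range(0, 5501),
    "B": range(5501, 11001),
    "C": range(11001, 16384),
}


def node_for(key: str) -> str:
    slot = key_slot(key)
    return next(name for name, slots in SLOT_RANGES.items() if slot in slots)


print(key_slot("k1"), node_for("k1"))
```

Moving a slot to another node only changes the SLOT_RANGES table, not the hash of any key.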

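This layout makes it easy to add or remove nodes. To add a new node D, I move some slots from nodes A, B, and C to D. To remove node A, I move A's slots to B and C and then remove the now empty node A from the cluster. Moving hash slots from one node to another does not stop the service, so adding nodes, removing nodes, or changing how many slots a node holds never makes the cluster unavailable.

What happens to writes, updates, and deletes that arrive while a slot is being migrated? Does Redis block? No: Redis wants to stay responsive. The picture here is that the point-in-time data (RDB) is sent first, then the incremental backlog is synchronized; when the sync is done the slot is pointed at the new node, the mapping changes, and the migrated slot is deleted from the old node. (Strictly speaking, RDB plus backlog is how master-replica replication works; Redis Cluster moves a slot key by key with MIGRATE and answers with ASK redirections for keys that have already moved.)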

1570617729649

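That alone is not enough. The Redis topology is transparent to the client, that is, invisible, which means that whichever Redis I talk to has to give me a correct answer.

Suppose k1's slot lives on redis2 but I ask redis1. How should redis1 answer? Having redis1 fetch the value from redis2 and return it is out of the question: the Redis author wants Redis to respond fast, and responding fast means not moving data around.

--- Implementation ---> every Redis uses the same hash algorithm on the key to compute the slot (shard), and keeps a mapping table that tracks slot changes, that is, which slot lives on which Redis. That way Redis can quickly tell the client which Redis holds the value, and the client goes to that Redis to fetch it. This is the Redis author's approach ----> and its most important point is that Redis must be fast.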
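The redirect idea can be simulated in memory. This is a toy model, not the Redis protocol: the Node class, client_call, and the two-node slot layout are assumptions, node names stand in for host:port addresses, and binascii.crc_hqx with initial value 0 is used as the CRC16 (CCITT/XMODEM) function.

```python
import binascii

SLOT_COUNT = 16384


def key_slot(key: str) -> int:
    return binascii.crc_hqx(key.encode(), 0) % SLOT_COUNT


class Node:
    def __init__(self, name, slot_map):
        self.name = name
        self.slot_map = slot_map  # slot -> owning node, known to every node
        self.data = {}

    def execute(self, command, key, value=None):
        slot = key_slot(key)
        owner = self.slot_map[slot]
        if owner != self.name:
            # Do not fetch from the other node; tell the client where to go.
            return ("MOVED", slot, owner)
        if command == "SET":
            self.data[key] = value
            return "OK"
        return self.data.get(key)


def client_call(nodes, first_node, command, key, value=None):
    reply = nodes[first_node].execute(command, key, value)
    if isinstance(reply, tuple) and reply[0] == "MOVED":
        reply = nodes[reply[2]].execute(command, key, value)
    return reply


slot_map = ["redis1" if s < SLOT_COUNT // 2 else "redis2"
            for s in range(SLOT_COUNT)]
nodes = {name: Node(name, slot_map) for name in ("redis1", "redis2")}
client_call(nodes, "redis1", "SET", "k1", "v1")
print(client_call(nodes, "redis2", "GET", "k1"))
```

The node that does not own the slot never touches the data; the extra hop is paid by the client.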

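Redis Cluster therefore has to be combined with master-replica replication, because no node of the cluster may suddenly go down: a sudden crash means its slots cannot be migrated --> the service goes offline.

So every node gets master-replica replication to keep it highly available.

The algorithm now lives on the Redis side, so the proxy layer only needs to do load balancing.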

1570619902426

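All distributed clusters share one weakness ----> transactions cannot be completed.

As said above, the Redis author does not want to migrate data frequently. Moving data is something he dislikes, because it involves IO between cluster nodes and disk.

Now suppose k1's slot is on redis1 and k2's slot is on redis2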



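MULTI
get k1
get k2
exec
------> fails immediately, no matter which Redis in the cluster receives it.
------> For transactions Redis behaves like a single instance: it can only operate on local data, so a transaction succeeds only if everything it needs is on the local node.
--- This is strict, but Redis does not take the blame for the lack of distributed transactions.
It is stubborn about this: if a distributed cluster cannot run a transaction, that is a consequence of how people laid out the data.

So it offers a way out: if you need a transaction, put its keys in one slot. --> That sounds trivial, but the programmer has to enforce it: everything a transaction touches must live in a single slot.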


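Summary:
Data is divided and conquered, that is, a distributed cluster (not a master-replica cluster): every node of the cluster is a master, each managing different data.

Aggregate operations are therefore hard to implement ----> transactions are hard to implement,
and so are set intersection, union, and difference.
The Redis author's priority is to move computation to the data rather than move the data.
Once data has been split up in a Redis cluster, it is hard to use together again.
Data that is not split up can always be used together.
hash tag
{oo}k1
{oo}k2
The hash is taken on the tag oo, written in curly braces, rather than on k1 and k2, so both keys are guaranteed to land in the same slot
and can therefore be used together.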
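A minimal sketch of how a hash tag changes the slot calculation, assuming the rule that only the text between the first { and the next } is hashed when that text is not empty. The function names are made up, and binascii.crc_hqx with initial value 0 stands in for CRC16.

```python
import binascii


def hash_tag_part(key: str) -> str:
    """Return the part of the key that is hashed."""
    start = key.find("{")
    if start != -1:
        end = key.find("}", start + 1)
        # Only a non-empty tag counts; "{}" means the whole key is hashed.
        if end != -1 and end != start + 1:
            return key[start + 1:end]
    return key


def key_slot(key: str) -> int:
    return binascii.crc_hqx(hash_tag_part(key).encode(), 0) % 16384


print(key_slot("{oo}k1"), key_slot("{oo}k2"))
```

Both keys print the same slot, so a MULTI block touching {oo}k1 and {oo}k2 stays on one node.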

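Redis consistency guarantees

Redis does not guarantee strong consistency. In practice this means that under certain conditions the cluster may lose writes.

The first reason is that the cluster uses asynchronous replication. A write goes like this:

The client writes a command to master B.
Master B replies to the client with the command status.
Master B replicates the write to its replicas B1, B2, and B3.

The master replicates the command after it has replied to the client. If every command had to wait for replication to finish, the master would process commands far more slowly, so there is a trade-off between performance and consistency. Note: Redis Cluster may offer synchronous writes in the future.

The other case in which Redis Cluster may lose commands is a network partition in which a client is isolated with a minority of instances that includes at least one master.

For example, suppose the cluster has six nodes A, B, C, A1, B1, and C1, where A, B, and C are masters and A1, B1, and C1 are their replicas, plus a client Z1. If a network partition occurs, the cluster may split into two sides: the majority side with A, C, A1, B1, and C1, and the minority side with B and client Z1.

Z1 can still write to master B. If the partition is short, the cluster carries on normally. If it lasts long enough for the majority side to elect B1 as the new master, the data Z1 wrote to B is lost.

Note that during the partition there is a limit on how long client Z1 can keep sending writes to master B. This limit is called the node timeout and is an important configuration option of Redis Cluster.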

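Proxies

Twemproxy

Twemproxy is a (caching) proxy maintained by Twitter that speaks the Memcached ASCII protocol and the Redis protocol. It is a single-threaded program written in C and it is very fast. It is open source under the Apache 2.0 license.

Twemproxy supports automatic partitioning. If one of the Redis nodes behind it becomes unavailable, it is excluded automatically (this changes the keys-to-instances mapping, so use Twemproxy only when Redis serves as a cache).

Twemproxy itself is not a single point of failure, because you can start several Twemproxy instances and let clients connect to any of them.

Twemproxy sits between Redis clients and servers. Handling partitioning there should not be very complex, and it should be fairly reliable.

predixy

Supports proxying a single instance, multiple instances, and a cluster

cluster 【Redis Cluster】

Redis's built-in clustering, with partitioning and automatic migration

codis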

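Properties of consistent hashing

Every node of a distributed system may 【go down】,

and new nodes are likely to be 【added】 dynamically.

How to keep serving well when the number of nodes changes deserves careful thought, especially when designing a distributed cache. If a server fails and no suitable algorithm preserves consistency, all data cached in the system may become invalid: with fewer nodes, the client has to recompute an object's hash (which usually depends on the number of nodes), the hash has changed, and so the node that stores the object can probably no longer be found. This is why consistent hashing matters. A consistent hashing algorithm in a good distributed cache should satisfy the following properties:

Balance

Balance means the hash results spread across all caches as evenly as possible, so that all cache space gets used. --> A uniform hash should place the nodes evenly on the virtual ring.

Monotonicity

Monotonicity means that when some content has already been assigned to caches by the hash and new caches join the system, existing content is either left where it is or mapped to a new cache, never to a different old cache. Simple hash algorithms often fail this requirement, for example the simplest linear hash x = (ax + b) mod (P), where P is the total number of caches. When that number changes (from P1 to P2), all previous hash results change, which violates monotonicity. A change in hash results means every mapping in the system has to be updated whenever the cache space changes. In a P2P system a cache change corresponds to a peer joining or leaving, which happens frequently and would cause a huge computation and transfer load. Monotonicity requires the hash algorithm to cope with this.

Spread

Two nodes should not sit next to each other; they need to be spread out. If they are adjacent, both end up storing a lot of the same data, which is redundant: their request ranges coincide, and requests could go to either node.

Load

There will be some redundancy, but keep it as low as possible.

Smoothness

Do not add too many nodes at once, and do not remove too many at once.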

原理

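Consistent hashing arranges the whole hash space as a virtual ring. Suppose the value space of a hash function H is 0 to 2^32-1 (a 32-bit unsigned integer). The hash ring then looks as follows:

The space is laid out clockwise. 0 and 2^32-1 meet at the twelve o'clock position.

Next, hash each server with H, using the server's IP address or host name as the key, so that each machine gets a position on the ring. Assume the four servers above, hashed by IP address, end up at the following positions:

Data is then located like this: hash the data key with the same function H to find its position on the ring, walk clockwise from there, and the first server you meet is the one the data belongs to.

For example, take four data objects Object A, Object B, Object C, and Object D. After hashing, their positions on the ring are as follows:

According to consistent hashing, A is placed on Node A, B on Node B, C on Node C, and D on Node D.

Now consider fault tolerance and scalability. Suppose Node C goes down. Objects A, B, and D are unaffected; only object C is relocated to Node D. In general, when a server becomes unavailable, the only data affected is the data between that server and the previous server on the ring (the first server met when walking counterclockwise). Everything else is unaffected.

Now the other case: a server Node X is added to the system, as shown below:

Objects A, B, and D are unaffected; only object C has to be relocated to the new Node X. In general, when a server is added, the only data affected is the data between the new server and the previous server on the ring (the first server met when walking counterclockwise). All other data is unaffected.

In summary, consistent hashing only relocates a small part of the ring when nodes are added or removed, so it has good fault tolerance and scalability.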
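The claim that only a small part of the data moves can be checked with a short experiment. The sketch assumes MD5 as the hash, 100 ring positions per node, and made-up node and key names; build_ring and locate are illustrative helpers.

```python
import bisect
import hashlib


def h32(value: str) -> int:
    return int.from_bytes(hashlib.md5(value.encode()).digest()[:4], "big")


def build_ring(nodes, vnodes: int = 100):
    """Sorted list of (position, node) pairs, several positions per node."""
    return sorted((h32(f"{n}#{i}"), n) for n in nodes for i in range(vnodes))


def locate(ring, key: str) -> str:
    """First ring entry at or after the key's hash, wrapping around."""
    idx = bisect.bisect_left(ring, (h32(key), ""))
    return ring[idx % len(ring)][1]


old_ring = build_ring(["node-a", "node-b", "node-c", "node-d"])
new_ring = build_ring(["node-a", "node-b", "node-c", "node-d", "node-x"])

keys = [f"object:{i}" for i in range(10_000)]
moved = [k for k in keys if locate(old_ring, k) != locate(new_ring, k)]

print(f"moved: {len(moved) / len(keys):.0%}")
print("all moved keys now live on node-x:",
      all(locate(new_ring, k) == "node-x" for k in moved))
```

Every key that moves ends up on the new node, and the rest stay put, which is the monotonicity property described earlier.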

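With too few nodes, consistent hashing easily leads to data skew because the nodes are unevenly placed. For example, suppose the system has only two servers, placed on the ring as follows:

A large share of the data then piles up on Node A and only very little lands on Node B. To fix this skew, consistent hashing introduces virtual nodes: compute several hashes for each server and place a copy of the server at each resulting position. This can be done by appending a number to the server's IP address or host name. In the case above, compute three virtual nodes per server, hashing "Node A#1", "Node A#2", "Node A#3", "Node B#1", "Node B#2", and "Node B#3", which gives six virtual nodes:

The lookup algorithm stays the same, with one extra step that maps a virtual node to its real node. Data located on "Node A#1", "Node A#2", or "Node A#3" all ends up on Node A. This solves the skew problem when there are few servers. In practice the number of virtual nodes is usually set to 32 or more, so even a handful of servers gives a fairly even distribution.

----> the data's hash looks for the nearest node hash that is larger than it

In a red-black tree this is like walking down as if inserting: when the current node is larger, go left; when it is smaller, go right; you always end up at a leaf position. If that position is its parent's left child, the parent is the node you want. If it is the parent's right child, check whether the parent is the grandparent's left child, in which case the grandparent is the answer; if the parent is again a right child, keep climbing in the same way. If you reach the root and are still on the right, take the node with the smallest hash, because the ring wraps around.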

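Common cache problems and how to answer them in interviews

Cache breakdown:

Cache avalanche:

Cache penetration:

Consistency (double write):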

Author: 忘忧症
Link: https://NepenthesZGW.github.io/2020/09/12/redis/redis主从复制/
Copyright: Unless otherwise stated, all posts on this blog are released under the MIT license. Please credit the source when reposting!