寻觅生命中的那一片浅草......

文章带标签 RAID

lsiutil设置磁盘带宽

1. 问题

DELL R410自带RAID卡是SAS6IR,机械盘接上去带宽是3G,而SSD,无论是三星还是闪迪,都只有1.5G,查看指令

./lsiutil -p1 -a 69,8,21,4,0,0

有问题的结果,其中1.5处为接SSD的端口

SAS1068E's links are 3.0 G, 3.0 G, 3.0 G, 1.5 G, off, off, off, off

2. 解决

2.1 方法一

换个RAID卡,如 PERC 6i或者H700,贵

2.2 方法二

换pcie口的SSD,价格是sata口的两倍,贵

2.3 方法三

可以通过lsiutil(要求Version 1.62或以上)设置磁盘的带宽,此机接了4个盘,所以,具体设置,请根据实际情况选择

[root@localhost src]# ./lsiutil -p1 -a 69

LSI Logic MPT Configuration Utility, Version 1.63, June 4, 2009

1 MPT Port found

     Port Name         Chip Vendor/Type/Rev    MPT Rev  Firmware Rev  IOC
 1.  /proc/mpt/ioc0    LSI Logic SAS1068E B3     105      00192f00     0

Main menu, select an option:  [1-99 or e/p/w or 0 to quit] 69

Seg/Bus/Dev/Fun    Board Name       Board Assembly   Board Tracer
 0   2   0   0     SAS6IR                                            

# 输入13进行操作
Main menu, select an option:  [1-99 or e/p/w or 0 to quit] 13  

# 以下3行,直接回车即可
SATA Maximum Queue Depth:  [0 to 255, default is 8] 
Device Missing Report Delay:  [0 to 2047, default is 0] 
Device Missing I/O Delay:  [0 to 255, default is 0] 

PhyNum  Link      MinRate  MaxRate  Initiator  Target    Port
   0    Enabled     1.5      3.0    Enabled    Disabled  Auto
   1    Enabled     1.5      3.0    Enabled    Disabled  Auto
   2    Enabled     1.5      3.0    Enabled    Disabled  Auto
   3    Enabled     1.5      3.0    Enabled    Disabled  Auto
   4    Disabled    1.5      3.0    Enabled    Disabled  Auto
   5    Disabled    1.5      3.0    Enabled    Disabled  Auto
   6    Disabled    1.5      3.0    Enabled    Disabled  Auto
   7    Disabled    1.5      3.0    Enabled    Disabled  Auto

# 此处选择3,是我们接SSD的位置
Select a Phy:  [0-7, 8=AllPhys, RETURN to quit] 3
# 直接回车
Link:  [0=Disabled, 1=Enabled, default is 1] 
# 设置最小带宽,关键操作,选择1
MinRate:  [0=1.5 Gbps, 1=3.0 Gbps, default is 0] 1
# 以下4行,直接回车
MaxRate:  [0=1.5 Gbps, 1=3.0 Gbps, default is 1] 
Initiator:  [0=Disabled, 1=Enabled, default is 1] 
Target:  [0=Disabled, 1=Enabled, default is 0] 
Port:  [0 to 7 for manual config, 8 for auto config, default is 8] 

PhyNum  Link      MinRate  MaxRate  Initiator  Target    Port
   0    Enabled     1.5      3.0    Enabled    Disabled  Auto
   1    Enabled     1.5      3.0    Enabled    Disabled  Auto
   2    Enabled     1.5      3.0    Enabled    Disabled  Auto
   # 注意看下面一行,MinRate为3.0代表设置正确
   3    Enabled     3.0      3.0    Enabled    Disabled  Auto
   4    Disabled    1.5      3.0    Enabled    Disabled  Auto
   5    Disabled    1.5      3.0    Enabled    Disabled  Auto
   6    Disabled    1.5      3.0    Enabled    Disabled  Auto
   7    Disabled    1.5      3.0    Enabled    Disabled  Auto

# 以下4行,直接回车
Select a Phy:  [0-7, 8=AllPhys, RETURN to quit] 
Persistence:  [0=Disabled, 1=Enabled, default is 1] 
Physical mapping:  [0=None, 1=DirectAttach, 2=EnclosureSlot, default is 2] 
Number of Target IDs to reserve:  [0 to 32, default is 8] 

# 输入0退出设置
Main menu, select an option:  [1-99 or e/p/w or 0 to quit] 0

经过以上设置,重启系统,再次确认,带宽已经变为3,cp大量小文件,性能也提升了50%

参考资料

MegaCli64导入Foreign配置

背景

Hadoop集群昨天新加机器 ,还在Rebalance,早上同事发现有个节点的2个硬盘无法写了,fdisk -l也看不到,第一感觉是硬盘坏了,想换硬盘,但硬盘是上周才买的,然后怀疑是raid卡数据线有问题,奈何机房找到的是另外一条不知道什么线,为了尽快恢复服务,就让机房重新插拔下硬盘,然后开机

BTW:每块盘做的是RAID0

开机后的处理

首先用fdisk看看能否看到硬盘,看不到,然后用MegaCli64看物理盘是否存在

通过以下命令看到物理硬盘是在的,但状态是Foreign

sudo /usr/local/bin/megasasctl  -PDList -aALL

Google一翻后,发现以下指令可以处理

执行状态检测命令:

sudo /usr/local/bin/megasasctl  -pdlist -aall |grep 'Firmware state'

输出,其中最后2个为raid有问题的盘

Firmware state: Online, Spun Up
Firmware state: Online, Spun Up
Firmware state: Unconfigured(good), Spun Up
Firmware state: Unconfigured(good), Spun Up

执行导入命令

sudo /usr/local/bin/megasasctl  -CfgForeign -Import -aall        

Foreign configuration is imported on controller 0.

Exit Code: 0x00

再次执行状态检测命令:

sudo /usr/local/bin/megasasctl  -pdlist -aall |grep 'Firmware state'

输出,正常了,因为是raid0,所以比较快

Firmware state: Online, Spun Up
Firmware state: Online, Spun Up
Firmware state: Online, Spun Up
Firmware state: Online, Spun Up

一点总结:

  1. 未必是硬盘坏了
  2. 硬盘线估计也没松
  3. 确认故障前,先不急着关机让机房检测,适当进行一些在线检测

参考资料

dell服务器硬盘的状态变成外来(foreign)命令行修复

2018年五月
« 2月    
 123456
78910111213
14151617181920
21222324252627
28293031