Failover Cluster logs missing??

October 2, 2015, 2:11 am

Hi Gents,

I have a cluster in a Hyper-V 2012 R2 environment, and when I tried to review our logs (Failover Cluster > Diagnostic) I noticed they had disappeared without explanation. I have no rational explanation for this. Has anyone else experienced this?

↧

CSV IO redirection theory question

October 2, 2015, 6:29 am

Hello!

Suppose there's an active/passive two-node Hyper-V cluster where all VMs are active on the first node.

Once there's a storage connectivity problem, all I/O would be redirected through node1/node2's Cluster/CSV adapters.

In order not to suffer any performance degradation inside the VMs, the speed of the Cluster/CSV NICs should be equal to or greater than that of the iSCSI NICs (>= 10 Gbps). To fulfill this requirement I must invest in at least an additional 10Gb switch plus a couple of 10Gb NICs.

Doesn't it make more sense in this situation to initiate a cluster failover to the node which does not have any issues with the storage?

- In this case 1Gb for the Cluster/CSV NICs may be sufficient and there will be no need for the second 10Gb switch.
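
To put rough numbers on the bandwidth argument, here is a minimal Python sketch. It assumes nominal line rates only (no protocol overhead), and the function names are made up for illustration:

```python
# Rough link-capacity comparison; nominal line rates only, overhead ignored.

def max_throughput_mb_per_s(link_gbps: float) -> float:
    """Convert a nominal link speed in Gbit/s to MB/s (decimal units)."""
    return link_gbps * 1000 / 8


def redirected_fraction(iscsi_gbps: float, csv_gbps: float) -> float:
    """Share of the iSCSI line rate that a redirected CSV path can carry (capped at 1.0)."""
    return min(1.0, csv_gbps / iscsi_gbps)


print(max_throughput_mb_per_s(1))   # 1Gb Cluster/CSV NIC
print(max_throughput_mb_per_s(10))  # 10Gb iSCSI NIC
print(redirected_fraction(10, 1))   # redirected I/O over 1Gb vs. direct 10Gb
```

Under these assumptions a 1Gb Cluster/CSV link can carry at most about a tenth of what the 10Gb iSCSI path can, which is why redirected mode over 1Gb would only be acceptable for a short time.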

Q: Is it possible to configure AUTOMATIC failover due to storage connectivity problems (in spite of working heartbeats)?

Thank you in advance,

Michael

↧

iSCSI Connections Keep Disconnecting

October 2, 2015, 9:01 am

Our setup:

Server 2012 R2 Hyper-V Cluster
3 HP DL380p Hosts
HP MSA 2040 SAN
2 Cisco Nexus 5000 series Switches dedicated to iSCSI storage network
Each Host has an Intel x710-DA2 10Gb NIC, latest firmware and drivers
Each Host has 1 10Gb iSCSI over fiber connection to each Switch
Switches are not Stacked or connected
2 iSCSI subnets, one for each Switch
SAN has 4 10Gb connections, 2 for each controller, one iSCSI subnet on each controller
Host iSCSI Initiator set for MPIO, each Host IP connects to both SAN IPs
MPIO set to Round Robin with Subset

We are getting these errors in the Windows Event Log on each Host:

Event ID 20: Connection to the target was lost. The initiator will attempt to retry the connection.
Event ID 7: The initiator could not send an iSCSI PDU. Error status is given in the dump data.
Event ID 34: A connection to the target was lost, but Initiator successfully reconnected to the target. Dump data contains the target name.
Event ID 27: Intel(R) Ethernet Converged Network Adapter X710 Network link is disconnected.
Event ID 31: Intel(R) Ethernet Converged Network Adapter X710 Network link has been established at 10Gbps full duplex.
Event ID 27: Intel(R) Ethernet Converged Network Adapter X710-2 Network link is disconnected.
Event ID 31: Intel(R) Ethernet Converged Network Adapter X710-2 Network link has been established at 10Gbps full duplex.

↧

Hyper-V cluster - 500VM's + - clussvc.exe - CPU usage

October 1, 2015, 10:40 pm

We have multiple Hyper-V clusters running on 2012 R2.

In the largest cluster (13 nodes, 500 VMs), clussvc.exe uses about 25-30% CPU and there is a fair amount of network traffic between all the nodes (around 100 Mbit), even when a node is running just one idle VM as a test.

When the SCOM agent is stopped, CPU usage for clussvc.exe drops to 15% and the network traffic also drops a bit.

CSV volumes are running in normal mode, not redirected.
Storage is attached via Fibre Channel.

All the latest Windows updates are installed.

What can be done to reduce the CPU usage?

↧

Cluster Validation fails with error "failed to access sector 11 on physical disk"

October 9, 2015, 3:27 am

Hello,

We have created 2 guest VMs running Windows Server 2012 R2 on a VMware ESXi v5.5 host.

Both VMs are on the same subnet and have been set up in a cluster configuration using Failover Clustering. The quorum disk is assigned as an RDM disk using physical sharing, with a Paravirtual SCSI controller. The disk has the same drive letter assigned on both nodes and visibility is fine. However, when we run Cluster Validation, it fails with the error "failed to access sector 11 on physical disk a3ec0854 from node xxx.yyy.com. the request could not be performed because of an I/O device error".

1. Googled the issue and found an article about Persistent Reservation, so we cleared that using the Clear-PersistentReservation command in PowerShell, with no success.

2. Tried a different quorum disk, with no success.

3. Assigned an altogether new disk and still the same.

Kindly help us with this issue. It's a production server.

-Karan Patani

↧

Expanding iSCSI lun from below 16TB to above 16TB for a live CSV

October 6, 2015, 5:42 am

All servers are fully patched 2012.

OK, I just want a second set of eyes on what I am about to do.

I have a Dell MD 3620 attached via iSCSI to a 5-node Failover Cluster. There is a 16 TB LUN available to all the machines as a Cluster Shared Volume. It houses a 14 TB VHD and a 2 TB VHD. I want to expand the 16 TB LUN to 18 TB, and Dell assures me that this is not an issue. Then I want to expand the 14 TB VHD to 16 TB (this will be the limit, as it was created below 16 TB and has a 4 KB cluster size).

Can I expect any curve balls? I want to do this all while in production, although off hours.
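
For reference, the 16 TB ceiling mentioned above follows from NTFS addressing at most 2^32 - 1 clusters per volume. A small Python sketch of that arithmetic, assuming the limit applies to whichever NTFS volume was formatted with 4 KB clusters (the helper name is made up for illustration):

```python
# NTFS addresses at most 2**32 - 1 clusters per volume, so the cluster size
# chosen at format time fixes the maximum size the volume can grow to.
MAX_CLUSTERS = 2**32 - 1
TIB = 1024**4


def max_ntfs_volume_bytes(cluster_size_bytes: int) -> int:
    """Largest NTFS volume (in bytes) for a given cluster size."""
    return MAX_CLUSTERS * cluster_size_bytes


for size_kb in (4, 8, 64):
    limit = max_ntfs_volume_bytes(size_kb * 1024)
    print(f"{size_kb} KB clusters -> about {limit / TIB:.0f} TiB")
```

With 4 KB clusters the result is just under 16 TiB, so growing past that would need a volume formatted with a larger cluster size.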

↧

Failover Cluster Manager issue (Windows 10 RSAT)

October 4, 2015, 11:54 pm

Hello,

Windows 2012 R2 Cluster + Windows 10 Pro.

Windows 10 RSAT Failover Cluster Manager operations such as Create new virtual machine, change VM settings, Manage VM, and Connect to VM require Microsoft.Virtualization.Client.Common.Types.dll

Details:

Roles -> Virtual Machines... -> New Virtual Machine... gives

Could not load file or assembly 'Microsoft.Virtualization.Client.Common.Types, Version=10.0.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35' or one of its dependencies. The system cannot find the file specified.

On the first attempt to change VM settings, nothing happens. Choose "Settings" again and you'll get "MMC has detected an error in a snap-in and will unload it."

This DLL is not included with any RSAT version, the .NET Framework, or anything else. Search engines know nothing about it.

My DPM blog ystartsev.wordpress.com

↧

CAFS for General Use

October 9, 2015, 11:53 am

I want to use the File Server for General Use role in Failover Cluster Manager. I have successfully created the disk on the SAN and the role. The disk is online, but I cannot bring the role online. The cluster indicates that I have errors, yet the cluster events view does not display them. I have moved the resource to another node and the resource still will not start. The CAFS was created using the Domain Administrator's credentials.

The server manager errors I see are

ID 1194

Cluster network name resource 'CANVCSCLK' failed to create its associated computer object in domain 'co.island.wa.us' during: Resource online.

The text for the associated error code is: Access is denied.

ID 1205

The Cluster service failed to bring clustered role 'CANVCSCLK' completely online or offline. One or more resources may be in a failed state. This may impact the availability of the clustered role.

ID 1069

Cluster resource 'CANVCSCLK' of type 'Network Name' in clustered role 'CANVCSCLK' failed.

Everything else in the cluster appears to be working. All of the guests are online and can be failed over at will to any of the 4 nodes. I am at a loss. What have I overlooked?

Thanks

↧

WSFC - Unable to successfully cleanup. An error occurred while creating cluster

October 8, 2015, 3:56 am

Hello everyone,

I am trying to set up Windows Server 2012 failover clustering for SQL Server 2012 clustering. I created 3 VMs (1 DC and 2 server nodes), and on the DC I installed the iSCSI target and created an iSCSI SAN.

I provided the minimum required NICs and completed the setup for 2-node clustering. I am trying to create the cluster using the administrator account only.

Cluster validation is successful and shows that the setup is ready for clustering, but while creating the cluster I encounter an error at "Forming cluster": "Unable to successfully cleanup.... An error occurred while creating cluster...... to troubleshoot run validate the cluster".

But validation shows the setup is ready for clustering. In some blogs I have seen advice to pre-create the cluster object and give admin privileges to the account that is creating the cluster, but I am already using the administrator account.

If anyone has an idea about this, please help me.

Thanks in advance,

Bhargava K

↧

MPIO

October 12, 2015, 5:07 am

We use Windows Server 2012 clustering with a SAN for different customers.

We use two iSCSI channels with MPIO to connect to the SAN.

The iSCSI network is redundant.

The cluster network is redundant.

When we disconnect one of the channels, we notice that communication with the SAN goes via the cluster management network through the other host and not via the redundant iSCSI channel. (We see this at some customers, not all.)

What can be the reason?

↧

W2012R2 - can you create file shares on a CSV?

October 9, 2015, 5:45 am

We have an end user who created a file share directly on his W2012R2 Hyper-V Cluster Shared Volume (CSV C:\ClusterStorage\Volume2\SharedFolder) without using the File Server role (CAFS). The Hyper-V host lets him do this and provides the UNC path name \\hostname\share.

Is this supported? I was led to believe the CSV was reserved for Hyper-V use...

Please advise, thanks.

↧

Trouble adding VMs as separate roles on 2012 cluster

October 12, 2015, 9:20 am

I have created a two-node cluster (Microsoft Windows Server 2012 R2 Datacenter) with two CSVs. My goal is to add 6 Hyper-V VMs to this cluster, which I can do, but the process is not behaving the way I expect it to. When I go to add the 6 preconfigured VMs, I follow these steps:

From the Actions pane, click “Configure Role…”
In the High Availability Wizard, select “Virtual Machine”
Put a check in the box next to all 6 VMs and click Next

This process is successful; however, inside Failover Cluster Manager there are only TWO roles displayed. I expect there to be a role for each VM. One of the VMs is a role all by itself, and the second role is a single role with the remaining 5 VMs inside of it. I have tried multiple things to separate the VMs into their own roles, to no avail:

I tried creating an Empty Role, but apparently you cannot add a VM to an Empty Role
I tried to create a new VM from within the Failover Cluster Manager. This was successful, however, it still added it to the second role
I tried adding the VM via PowerShell. This was also successful, but again, it added it to the second role

Obviously my goal is to create 6 separate roles so that I have the flexibility to shut down or move VMs around my cluster without impacting an entire group of servers. The VMs I am trying to separate are all on the same CSV. The one that is behaving as expected is on its own CSV. Could that be the cause?

↧

Quorum on 2 node cluster on different subnets

October 12, 2015, 11:40 am

We are deploying a 2-node cluster on different subnets. It looks like we need to configure a file share as the witness. Since the cluster needs quorum, what happens to the cluster if the server hosting the file share is rebooted or goes offline?
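
A minimal Python sketch of the basic majority rule behind this question. It assumes a static vote count of two nodes plus one witness; newer Windows Server versions can adjust votes dynamically, which this sketch ignores, and the function name is made up for illustration:

```python
def has_quorum(total_votes: int, online_votes: int) -> bool:
    """Static majority rule: strictly more than half of all votes must be online."""
    return online_votes * 2 > total_votes


TOTAL = 3  # assumption: 2 nodes + 1 file share witness, one vote each

print(has_quorum(TOTAL, 3))  # everything online
print(has_quorum(TOTAL, 2))  # witness offline, both nodes up
print(has_quorum(TOTAL, 1))  # witness offline and one node down
```

Under this simplified rule, losing only the witness leaves two of three votes, so the cluster keeps running; it is the loss of a node while the witness is also unavailable that costs the majority.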

↧

Hyper-V HA Orphaned Files Search

October 9, 2015, 9:13 am

First, I apologize if this is in the wrong section; I was debating between this category and Hyper-V.

I currently have Hyper-V with Failover Clustering enabled across 3 VM hosts and 3 CSVs on a SAN. I am trying to find a way of identifying orphaned files/VHDs on the CSVs because we are starting to need additional disk space. I found some scripts on other forums, but they either only look at CSV1 and not the other 2, or they do not work on an HA Hyper-V setup.

Has anyone experienced this issue before, and what did you do to resolve it? Thank you in advance!

↧

Various errors - The wrong diskette is in the drive + Failover cluster fails creating new virtual machines.

October 10, 2015, 2:46 am

Hello

Our rig:

2 nodes - HP DL360 connected directly to an HP MSA 2040 SAN with FC cards.

Running Failover Cluster roles and several Hyper-V VMs.

We have 2 LUNs, ClusterStorage 1 and 2. Several weeks ago one of the hosts (node2) lost its connection to the SAN, we suspect during a Veeam backup (node2 is also the off-host proxy for Veeam). We restarted and it came online again after approx. 30 minutes, then the same thing happened again a couple of days later. This time we couldn't bring it online and failover didn't work, meaning the VMs didn't fail over to node1.

So all the VMs on that LUN got stuck, closed/seized by node2. We could manually mount that LUN on node1 and start the VMs, but none of those VMs are in failover mode now, because node2 cannot see that particular LUN (ClusterStorage\Volume1). Volume2 works on both nodes.

After many days of searching/hotfixes and so on, we found some references to this:

There was an error loading the disk information for disk Cluster Disk 2 - Microsoft.FailoverClusters.Framework.ClusterControlCodeException: Failed to execute control code '16777713'. ---> System.ComponentModel.Win32Exception: The wrong diskette is in the drive.
Insert %2 (Volume Serial Number: %3) into drive %1
--- End of inner exception stack trace ---
at MS.Internal.FailoverClusters.Framework.ClusApiAdapter.ResourceAdapter.ExecuteOnControlCode(SafeClusterResourceHandle resourceHandle, Int32 controlCode, String resourceName, UnmanagedBuffer inBuffer, Action`2 controlCodeCallBack, Action invalidFunctionCallback)
at MS.Internal.FailoverClusters.Framework.ClusApiAdapter.ResourceAdapter.LoadDiskFromCluster(SafeClusterResourceHandle resourceHandle, PResource resource, Boolean includeMountPoints)
at MS.Internal.FailoverClusters.Framework.ClusApiAdapter.ResourceAdapter.<>c__DisplayClass2d4.<LoadDisk>b__2d1(SafeClusterResourceHandle resourceHandle)
at MS.Internal.FailoverClusters.Framework.ClusApiAdapter.ResourceAdapter.ExecuteOnResource(Guid id, String name, Action`1 actionOnResource)
at MS.Internal.FailoverClusters.Framework.ClusApiAdapter.ResourceAdapter.LoadDisk(PResource resource)

Many people resolved the issue by deleting the volume, enabling it again, and changing it from MBR to GPT. We, on the other hand, cannot delete that volume because it is active and in production, manually mounted on node1. So we decided to buy a new set of disks for the SAN, create LUN3, and back up/restore the manually mounted VMs from the defective LUN to the new LUN3. We did so, and ClusterStorage\Volume3 is visible from both node1 and node2, but we see the same errors on the new vdisk/LUN/Volume3.

When trying to create a VM in failover cluster:

There was a failure configuring the virtual machine role for 'test4'.
An error occurred retrieving the disk information for the resource 'Cluster Disk 2'.

The wrong diskette is in the drive.
Insert %2 (Volume Serial Number: %3) into drive %1

This error happens on a brand new set of disks on the HP MSA 2040 SAN, and a new volume.

What the hell happened on node2 that causes the failover cluster to not work on a brand new LUN?

This is critical...

Any help is greatly appreciated!

Best regards

Aksel

↧

validation failed on remote server - Ensure that the remote registry service is running, and have remote administration enabled

March 31, 2014, 4:45 pm

I am trying to set up my 2012 cluster, and when I try to add my remote server it gives me an error:

Failed to access remote registry on server.

Ensure that the remote registry service is running, and have remote administration enabled

I checked the server, but Remote Registry is already started.

Any ideas?

I also checked under Server Manager, and remote management is enabled.

↧

CAU Hotfix Plugin - The plug-in argument HotfixRootFolderPath has invalid value

August 5, 2015, 6:56 am

Hi. I have 2012R2 cluster configured for CAU in self-updating mode with both WindowsUpdate and Hotfix plugins. The configuration went fine, however when I try to run CAU using these options, it will fail with the error "The plug-in argument HotfixRootFolderPath has invalid value".

I've repeatedly checked that the path is correct and browsable and has all the correct permissions it should have. I've tried with both DisableAclChecks True/False, didn't make a difference. The path contains a space, so I've tried enclosing it in double-quotes, that didn't help either.

I've run CAU from the GUI; here's the command it generates:

Invoke-CauRun -ClusterName cluster01 -CauPluginName 'Microsoft.WindowsUpdatePlugin','Microsoft.HotfixPlugin' -CauPluginArguments @{ 'HotfixConfigFileName' = 'DefaultHotfixConfig.xml'; 'DisableAclChecks' = 'False'; 'HotfixRootFolderPath' = '\\fileserver\CAU\Windows Server 2012 R2\Hotfixes\Hyper-V\Root'; 'IncludeRecommendedUpdates' = 'True'; 'RequireSmbEncryption' = 'True' } -MaxFailedNodes -1 -MaxRetriesPerNode 3 -EnableFirewallRules -FailbackMode Immediate -Force

The root folder contains DefaultHotfixConfig.xml per documentation and also there's CAUHotfix_All folder (currently empty as there are no hotfixes I need to install).

As I've said above, I tried modifying the path in the command above to 'HotfixRootFolderPath' = '"\\fileserver\CAU\Windows Server 2012 R2\Hotfixes\Hyper-V\Root"', which didn't help.

Any idea what's wrong?

↧

CSV errors STATUS_IO_TIMEOUT Windows 2012 Hyper-V Failover cluster

November 6, 2012, 6:14 pm

Hi there,

I'm seeing these errors sometimes, on a Windows 2012 Hyper-V Failover Cluster.

Cluster Shared Volume 'Volume1' ('Cluster Disk 1') is no longer available on this node because of 'STATUS_IO_TIMEOUT(c00000b5)'. All I/O will temporarily be queued until a path to the volume is reestablished.

There are three nodes, each with 6 NICs. The first two NICs are teamed and connected to a VM virtual switch. The second two are teamed (one active) and used for cluster comms. The third two are teamed (one active) and connected to a virtual switch, with the management OS and another cluster NIC.

Redirected access seems to work fine (tested by removing the CSVs from one node). But it's weird that we keep seeing these in the cluster logs.

Also see STATUS_CONNECTION_DISCONNECTED sometimes too.

Does anyone know what this could be?

↧

How to move CSV to another Cluster

October 12, 2015, 11:11 am

The previous IT company set up a Scale-Out File Server cluster within the main Hyper-V 2012 R2 cluster. The file cluster has a CSV share that we would like to move to a new CSV on the main cluster. We would like to decommission the file cluster after this.

What is the recommended way to move the CSV?

↧

RID pool and user account creation

October 13, 2015, 4:12 am

Hello,

I have a question about RID pool and user account creation.
Our system has two server farms each of which has two DCs (i.e. 4 DCs in total).
The two farms are a primary and a backup, and the RID master DC resides in the primary.
The entire system is under a single domain.

When the primary farm fails and the backup farm takes over,
we do not want to seize the RID master role at the backup farm,
because the primary might recover later depending on the cause of the failover.
Is it possible to avoid RID block depletion on the backup farm's DCs
by setting a larger value for the 'RID Block Size' key (e.g. 5000) in advance?

In other words, is there anything that could potentially deplete
DCs' RIDs other than the account creation operations we perform
when our system enters the service (we will be creating no more than
several hundred user accounts).
We might create new user accounts while we are on the backup farm,
but the number of added users will not exceed 1000.
We will not create new computers and groups while on the backup farm, except perhaps a few.
Also, we will be incorporating AD CS, Exchange, and Skype for Business
into our single-domain system.
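
As a rough sanity check of the numbers above, here is a minimal Python sketch. It assumes each new user, computer, or group consumes exactly one RID, and that `available_rids` (a hypothetical input) is what a DC still holds locally, which can be less than the full block size if part of the current block is already used:

```python
def rids_needed(users: int, computers: int = 0, groups: int = 0) -> int:
    """Each new security principal (user, computer, group) consumes one RID."""
    return users + computers + groups


def block_is_enough(available_rids: int, users: int,
                    computers: int = 0, groups: int = 0) -> bool:
    """True if the RIDs a DC already holds cover the planned creations."""
    return rids_needed(users, computers, groups) <= available_rids


# Planned worst case on the backup farm: up to 1000 users plus a few others.
print(block_is_enough(5000, users=1000, computers=5, groups=5))
print(block_is_enough(500, users=1000))
```

The sketch only covers objects we create ourselves; whether the other products create additional security principals on their own is exactly the open question.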

Regards,
Wanko

↧