This repository has been archived by the owner on May 24, 2020. It is now read-only.

Make /run/udev available to gd2 container from host #133

Merged Feb 7, 2019 (1 commit)

Conversation

@kotreshhr (Contributor) commented Feb 5, 2019

The "Add devices" phase of GCS setup hung with the following error in the gd2 container:
"WARNING: Device /dev/vda not initialized in udev database even after waiting 10000000 microseconds."
All pv commands (e.g. pvdisplay, pvcreate) take a long time, causing GCS setup to fail. It turns out that exporting /run/udev from the host resolves the issue.

Signed-off-by: Kotresh HR [email protected]


Fixes #129
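The actual change is in the GCS deploy manifests; as a rough illustration (the volume name and spec layout below are hypothetical, not the exact ones from this PR's commit), exposing the host udev database to a container looks like a hostPath volume:

```yaml
# Illustrative pod spec fragment for the gd2 container.
# Names here are examples; see the PR's commit for the real manifest change.
spec:
  containers:
    - name: glusterd2
      volumeMounts:
        - name: run-udev
          mountPath: /run/udev   # lets lvm2 in the container read the host udev db
          readOnly: true
  volumes:
    - name: run-udev
      hostPath:
        path: /run/udev          # host udev runtime database
```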

@ghost assigned kotreshhr Feb 5, 2019
@ghost added the "in progress" label Feb 5, 2019
@kotreshhr (Contributor, Author):

I tested with the patch; the "Add devices" task passed successfully.


TASK [GCS | GD2 Cluster | Add devices] *****************************************
Tuesday 05 February 2019  17:16:42 +0530 (0:05:03.505)       0:46:46.697 ****** 
included: /home/kotresh/sandbox/upstream/gcs/deploy/tasks/add-devices-to-peer.yml for kube1
included: /home/kotresh/sandbox/upstream/gcs/deploy/tasks/add-devices-to-peer.yml for kube1
included: /home/kotresh/sandbox/upstream/gcs/deploy/tasks/add-devices-to-peer.yml for kube1

TASK [GCS | GD2 Cluster | Add devices | Set facts] *****************************
Tuesday 05 February 2019  17:16:42 +0530 (0:00:00.109)       0:46:46.806 ****** 
ok: [kube1]

TASK [GCS | GD2 Cluster | Add devices | Add devices for kube2] *****************
Tuesday 05 February 2019  17:16:42 +0530 (0:00:00.075)       0:46:46.882 ****** 
FAILED - RETRYING: GCS | GD2 Cluster | Add devices | Add devices for kube2 (50 retries left).
FAILED - RETRYING: GCS | GD2 Cluster | Add devices | Add devices for kube2 (49 retries left).
ok: [kube1] => (item=/dev/vdc)
FAILED - RETRYING: GCS | GD2 Cluster | Add devices | Add devices for kube2 (50 retries left).
FAILED - RETRYING: GCS | GD2 Cluster | Add devices | Add devices for kube2 (49 retries left).
ok: [kube1] => (item=/dev/vdd)
FAILED - RETRYING: GCS | GD2 Cluster | Add devices | Add devices for kube2 (50 retries left).
FAILED - RETRYING: GCS | GD2 Cluster | Add devices | Add devices for kube2 (49 retries left).
ok: [kube1] => (item=/dev/vde)

TASK [GCS | GD2 Cluster | Add devices | Set facts] *****************************
Tuesday 05 February 2019  17:17:49 +0530 (0:01:06.891)       0:47:53.773 ****** 
ok: [kube1]

TASK [GCS | GD2 Cluster | Add devices | Add devices for kube3] *****************
Tuesday 05 February 2019  17:17:49 +0530 (0:00:00.151)       0:47:53.925 ****** 
FAILED - RETRYING: GCS | GD2 Cluster | Add devices | Add devices for kube3 (50 retries left).
FAILED - RETRYING: GCS | GD2 Cluster | Add devices | Add devices for kube3 (49 retries left).
FAILED - RETRYING: GCS | GD2 Cluster | Add devices | Add devices for kube3 (48 retries left).
FAILED - RETRYING: GCS | GD2 Cluster | Add devices | Add devices for kube3 (47 retries left).
FAILED - RETRYING: GCS | GD2 Cluster | Add devices | Add devices for kube3 (46 retries left).
FAILED - RETRYING: GCS | GD2 Cluster | Add devices | Add devices for kube3 (45 retries left).
FAILED - RETRYING: GCS | GD2 Cluster | Add devices | Add devices for kube3 (44 retries left).
ok: [kube1] => (item=/dev/vdc)
ok: [kube1] => (item=/dev/vdd)
ok: [kube1] => (item=/dev/vde)

TASK [GCS | GD2 Cluster | Add devices | Set facts] *****************************
Tuesday 05 February 2019  17:19:07 +0530 (0:01:17.700)       0:49:11.626 ****** 
ok: [kube1]

TASK [GCS | GD2 Cluster | Add devices | Add devices for kube1] *****************
Tuesday 05 February 2019  17:19:07 +0530 (0:00:00.139)       0:49:11.766 ****** 
ok: [kube1] => (item=/dev/vdc)
ok: [kube1] => (item=/dev/vdd)
ok: [kube1] => (item=/dev/vde)

@kotreshhr added the "bug" and "GCS/1.0" labels Feb 5, 2019
@kotreshhr (Contributor, Author):

@Madhu-1 @JohnStrunk PTAL

@amarts (Member) commented Feb 5, 2019

good catch!

@kotreshhr (Contributor, Author):

This fixes issue #129.

@Madhu-1 (Member) commented Feb 5, 2019

#112

This creates a host dependency; we should RCA this.

@kotreshhr (Contributor, Author) commented Feb 5, 2019

> #112
>
> This creates a host dependency; we should RCA this.

Ah, OK. The issue seems to be in lvm2, which now uses the udev database, which it didn't earlier.

Reference:
https://bugzilla.redhat.com/show_bug.cgi?id=1669266

The issue appears to be caused by this commit in lvm2:

 commit a063d2d123c56c4ccead986625a260df16556b9f
 Author: David Teigland <[email protected]>
 Date:   Mon Dec 3 11:22:45 2018 -0600
    devs: use udev info to improve md component detection
   
    Use udev info to supplement native md component detection.

Source: https://bbs.archlinux.org/viewtopic.php?id=242594

@aravindavk aravindavk requested a review from JohnStrunk February 5, 2019 12:26
@JohnStrunk (Member):

This is taking us in the wrong direction wrt host dependencies. Can this be solved by disabling udev in lvm.conf?

Items to examine for applicability:

  • obtain_device_list_from_udev
  • external_device_info_source
  • udev_sync
  • udev_rules
  • verify_udev_operations

This may also affect the lvmetad settings
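For reference, the options listed above live in the devices, activation, and global sections of lvm.conf. A sketch of what disabling them could look like follows; whether each is actually safe to turn off in the gd2 image would still need testing, and distro defaults vary:

```
# Illustrative lvm.conf fragment (not the image's actual config)
devices {
    obtain_device_list_from_udev = 0
    external_device_info_source = "none"
}
activation {
    udev_sync = 0
    udev_rules = 0
    verify_udev_operations = 0
}
global {
    use_lvmetad = 0
}
```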

@aravindavk (Member) left a review:


LGTM

@Madhu-1 (Member) commented Feb 6, 2019

@amarts can we merge this one?

@amarts (Member) commented Feb 6, 2019

As long as we depend on LVM, IMO we will be dependent on one thing or another like this, and we would have to compromise on scale and performance if we don't. Considering we are targeting a move to loopback-based bricks in a future version, I am inclined to take this in, so that the v1.0 user experience is better.

@JohnStrunk @Madhu-1 my opinion is to take this in, and fix it properly later either way (i.e., disable udev use entirely, or move to loopback bricks in gd2).

@kshlm (Member) commented Feb 6, 2019

> This is taking us in the wrong direction wrt host dependencies. Can this be solved by disabling udev in lvm.conf?
>
> Items to examine for applicability:
>
> • obtain_device_list_from_udev
> • external_device_info_source
> • udev_sync
> • udev_rules
> • verify_udev_operations
>
> This may also affect the lvmetad settings

The image already disables udev_rules, udev_sync and use_lvmetad in lvm.conf. That should be enough to disable udev/lvmetad use by lvm. We should also try disabling the other options to see if that helps.

@kshlm (Member) commented Feb 6, 2019

> Ah, OK. The issue seems to be in lvm2, which now uses the udev database, which it didn't earlier.
>
> Reference:
> https://bugzilla.redhat.com/show_bug.cgi?id=1669266
>
> The issue appears to be caused by this commit in lvm2:
>
>     commit a063d2d123c56c4ccead986625a260df16556b9f
>     Author: David Teigland <[email protected]>
>     Date:   Mon Dec 3 11:22:45 2018 -0600
>
>         devs: use udev info to improve md component detection
>
>         Use udev info to supplement native md component detection.
>
> Source: https://bbs.archlinux.org/viewtopic.php?id=242594

We need to verify whether this change actually made it back to CentOS. The linked thread is about Arch Linux, which always carries the latest upstream bits of every project. RHEL (and CentOS) shouldn't be this quick to backport such a new change.

@atinmu commented Feb 7, 2019

Since this is delaying the GCS 1.0 release much longer than anticipated, let's treat this as a workaround fix (if there are objections to the approach) and fix it the right way in upcoming releases.

@JohnStrunk merged commit 2aff6d8 into gluster:master Feb 7, 2019
@ghost removed the "in progress" label Feb 7, 2019