This repository has been archived by the owner on Aug 26, 2021. It is now read-only.

Management Portal not responding #292

Open
jitinkumar2018 opened this issue Jan 8, 2019 · 11 comments

@jitinkumar2018

Bug Report Basic Information
REQUIRED:
vCenter Server version: 6.7 U1
Embedded or external PSC: No
Filename of the OVA you deployed: vic-dev-v1.5.0-rc2-6834-28057308.ova.
How was the OVA deployed? ovftool
Does the VIC appliance receive configuration by DHCP? No
What stage of the Appliance Lifecycle is the VIC appliance in? Application
VIC appliance logs:
vic_appliance_logs_2019-01-08-10-16-54.tar.gz

Bug Report Detailed Information
Admiral stopped responding 24 hours after deployment.

DETAILS:
The VIC appliance was deployed, followed by 50 VCHs. 50 projects were created and the VCHs were added to project-p01. These changes were visible in Admiral for about a day.
Management portal: https://vic-st-h2-132.eng.vmware.com:8282

The VIC startup page is still accessible: https://10.197.37.132:9443/
But the management portal is no longer responding.
There was a time skew of 2 minutes between the VC and the VIC appliance. Admiral is still not responding even after correcting the VIC appliance time.

@lgayatri

lgayatri commented Jan 8, 2019

The time skew was corrected, but we still cannot reach port 8282.
@martin-borisov

@lgayatri

lgayatri commented Jan 9, 2019

@renmaosheng @martin-borisov this is a blocker.

@lgayatri

lgayatri commented Jan 9, 2019

        at com.vmware.xenon.services.common.LuceneDocumentIndexService.createPaginatedQuerySearcher(LuceneDocumentIndexService.java:1352)
        at com.vmware.xenon.services.common.LuceneDocumentIndexService.createOrUpdatePaginatedQuerySearcher(LuceneDocumentIndexService.java:1265)
        at com.vmware.xenon.services.common.LuceneDocumentIndexService.handleQueryTaskPatch(LuceneDocumentIndexService.java:1215)
        at com.vmware.xenon.services.common.LuceneDocumentIndexService.handleQueryRequest(LuceneDocumentIndexService.java:1089)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
]
[398108][S][2019-01-08T09:37:18.731Z][25][8282/resources/container-control-loop/control-loop-info][lambda$performMaintenance$4][Failed to retrieve container descriptions]
[398110][W][2019-01-08T09:37:18.731Z][25][8282/][processPendingServiceAvailableOperations][Service /core/local-query-tasks/31b194e97e6bb87557eef176b0f2c failed start: java.util.concurrent.CancellationException: Index writer is null]
[398111][W][2019-01-08T09:37:18.731Z][21][8282/][lambda$performServiceMaintenance$1][Service /resources/hosts-data-collections/host-info-data-collection failed maintenance: java.lang.IllegalStateException: Writer not available
        at com.vmware.xenon.services.common.LuceneDocumentIndexService.createPaginatedQuerySearcher(LuceneDocumentIndexService.java:1352)
        at com.vmware.xenon.services.common.LuceneDocumentIndexService.createOrUpdatePaginatedQuerySearcher(LuceneDocumentIndexService.java:1265)
        at com.vmware.xenon.services.common.LuceneDocumentIndexService.handleQueryTaskPatch(LuceneDocumentIndexService.java:1215)
        at com.vmware.xenon.services.common.LuceneDocumentIndexService.handleQueryRequest(LuceneDocumentIndexService.java:1089)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)

@DanielXiao

Admiral crashed with an OOM error:

Jan 08 09:35:55 vic-st-h2-132.eng.vmware.com start_admiral.sh[1810]: java.lang.OutOfMemoryError: GC overhead limit exceeded
Jan 08 09:35:55 vic-st-h2-132.eng.vmware.com start_admiral.sh[1810]: Dumping heap to /var/admiral/java_pid5.hprof ...
Jan 08 09:35:59 vic-st-h2-132.eng.vmware.com start_admiral.sh[1810]: Heap dump file created [896882459 bytes in 3.817 secs]
Jan 08 09:37:16 vic-st-h2-132.eng.vmware.com start_admiral.sh[1810]: Exception in thread "Lucene Merge Thread #6206" org.apache.lucene.index.MergePolicy$MergeException: java.lang.OutOfMemoryError: GC overhead limit exceeded
Jan 08 09:37:16 vic-st-h2-132.eng.vmware.com start_admiral.sh[1810]:         at org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:703)
Jan 08 09:37:16 vic-st-h2-132.eng.vmware.com start_admiral.sh[1810]:         at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:683)
Jan 08 09:37:16 vic-st-h2-132.eng.vmware.com start_admiral.sh[1810]: Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded
Jan 08 09:37:25 vic-st-h2-132.eng.vmware.com docker[711286]: vic-admiral
Jan 08 09:37:25 vic-st-h2-132.eng.vmware.com docker[711294]: vic-admiral

@DanielXiao

Is there a memory leak, or does Admiral start with too little memory? It runs with "-Xmx768M -Xms768M -Xss256K -Xmn256M".

Jan 07 06:11:10 vic-st-h2-132.eng.vmware.com start_admiral.sh[1810]: + '[' false = true ']'
Jan 07 06:11:10 vic-st-h2-132.eng.vmware.com start_admiral.sh[1810]: + '[' x = x ']'
Jan 07 06:11:10 vic-st-h2-132.eng.vmware.com start_admiral.sh[1810]: + MEMORY_OPTS='-Xmx768M -Xms768M -Xss256K -Xmn256M -XX:MaxMetaspaceSize=256m'
Jan 07 06:11:10 vic-st-h2-132.eng.vmware.com start_admiral.sh[1810]: + CONFIG_FILES=/admiral/config/dist_configuration.properties
Jan 07 06:11:10 vic-st-h2-132.eng.vmware.com start_admiral.sh[1810]: + '[' -f /configs/config.properties ']'
Jan 07 06:11:10 vic-st-h2-132.eng.vmware.com start_admiral.sh[1810]: + CONFIG_FILES=/admiral/config/dist_configuration.properties,/configs/config.properties
Jan 07 06:11:10 vic-st-h2-132.eng.vmware.com start_admiral.sh[1810]: + '[' x = x ']'
Jan 07 06:11:10 vic-st-h2-132.eng.vmware.com start_admiral.sh[1810]: + XENON_PHOTON_MODEL_PROPS='-Dservice.document.version.retention.limit=50 -Dservice.document.version.retention.floor=10'
Jan 07 06:11:10 vic-st-h2-132.eng.vmware.com start_admiral.sh[1810]: + '[' x = x ']'
Jan 07 06:11:10 vic-st-h2-132.eng.vmware.com start_admiral.sh[1810]: + XENON_STACKTRACE=-Dxenon.ServiceErrorResponse.disableStackTraceCollection=true
Jan 07 06:11:10 vic-st-h2-132.eng.vmware.com start_admiral.sh[1810]: + JAVA_OPTS='-Ddcp.net.ssl.trustStore=/configs/trustedcertificates.jks -Ddcp.net.ssl.trustStorePassword=changeit -Dencryption.key.file=/var/admiral/8282/encryption.key -Dinit.encryption.key.file=true -Xmx768M -Xms768M -Xss256K -Xmn256M -XX:MaxMetaspaceSize=256m'
Jan 07 06:11:10 vic-st-h2-132.eng.vmware.com start_admiral.sh[1810]: + JAVA_OPTS='-Ddcp.net.ssl.trustStore=/configs/trustedcertificates.jks -Ddcp.net.ssl.trustStorePassword=changeit -Dencryption.key.file=/var/admiral/8282/encryption.key -Dinit.encryption.key.file=true -Xmx768M -Xms768M -Xss256K -Xmn256M -XX:MaxMetaspaceSize=256m -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/admiral/'
Jan 07 06:11:10 vic-st-h2-132.eng.vmware.com start_admiral.sh[1810]: + PID=5
Jan 07 06:11:10 vic-st-h2-132.eng.vmware.com start_admiral.sh[1810]: + wait 5
Jan 07 06:11:10 vic-st-h2-132.eng.vmware.com start_admiral.sh[1810]: + java -Djava.util.logging.config.file=/admiral/config/logging.properties -Dconfiguration.properties=/admiral/config/dist_configuration.properties,/configs/config.properties -Ddcp.net.ssl.trustStore=/configs/trustedcertificates.jks -Ddcp.net.ssl.trustStorePassword=changeit -Dencryption.key.file=/var/admiral/8282/encryption.key -Dinit.encryption.key.file=true -Xmx768M -Xms768M -Xss256K -Xmn256M -XX:MaxMetaspaceSize=256m -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/admiral/ -cp '/admiral/*:/admiral/lib/*:/etc/xenon/dynamic-services/*' -Dservice.document.version.retention.limit=50 -Dservice.document.version.retention.floor=10 -Dxenon.ServiceErrorResponse.disableStackTraceCollection=true com.vmware.admiral.host.ManagementHost --bindAddress=0.0.0.0 --port=-1 --sandbox=/var/admiral/ --publicUri=https://vic-st-h2-132.eng.vmware.com:8282/ --bindAddress=0.0.0.0 --port=-1 --authConfig=/configs/psc-config.properties --securePort=8282 --keyFile=/configs/server.key --certificateFile=/configs/server.crt --startMockHostAdapterInstance=false
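
To put the heap question in numbers, here is a rough sketch that reads the sizes out of the MEMORY_OPTS string from the trace above. The helper name `flag_mib` is hypothetical and only handles values with an M/m suffix. The arithmetic relies on the standard HotSpot layout, where a fixed young generation (-Xmn) leaves the rest of the maximum heap (-Xmx) for the old generation.

```shell
# MEMORY_OPTS as printed by the start_admiral.sh trace in this thread.
MEMORY_OPTS='-Xmx768M -Xms768M -Xss256K -Xmn256M -XX:MaxMetaspaceSize=256m'

# Hypothetical helper: print the MiB value of a flag such as -Xmx768M.
# Only values with an M/m suffix are recognised.
flag_mib() {
  for opt in $MEMORY_OPTS; do
    case "$opt" in
      "$1"*[Mm])
        value=${opt#"$1"}        # strip the flag prefix, e.g. "768M"
        echo "${value%[Mm]}"     # strip the unit suffix, e.g. "768"
        return 0
        ;;
    esac
  done
  return 1
}

heap=$(flag_mib -Xmx)
young=$(flag_mib -Xmn)
old=$((heap - young))            # old generation = max heap - young generation
echo "max heap: ${heap} MiB, young gen: ${young} MiB, old gen: ${old} MiB"
```

With these flags, the old generation has 512 MiB to hold all long-lived objects, which is one reason a higher limit may help even if no leak is present. It does not rule a leak out.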

@lgayatri

lgayatri commented Jan 9, 2019

@DanielXiao this is the first time we have seen this issue. The Admiral memory setting is the default that ships with the OVA.

@DanielXiao

The exit status is 0 when the JVM crashes, so systemd does not restart Admiral.

systemctl status admiral.service 
● admiral.service - Admiral is a highly scalable and very lightweight Container Management platform for deploying and managing container based applications.
   Loaded: loaded (/lib/systemd/system/admiral.service; enabled; vendor preset: enabled)
   Active: inactive (dead) since Tue 2019-01-08 09:37:25 UTC; 23h ago
     Docs: https://vmware.github.io/vic-product/index.html#getting-started
  Process: 711294 ExecStopPost=/usr/bin/docker rm vic-admiral (code=exited, status=0/SUCCESS)
  Process: 711286 ExecStop=/usr/bin/docker stop vic-admiral (code=exited, status=0/SUCCESS)
  Process: 1811 ExecStartPost=/usr/bin/bash /etc/vmware/admiral/add_default_users.sh (code=exited, status=0/SUCCESS)
  Process: 1810 ExecStart=/etc/vmware/admiral/start_admiral.sh (code=exited, status=0/SUCCESS)
  Process: 1752 ExecStartPre=/usr/bin/bash /etc/vmware/admiral/configure_admiral.sh (code=exited, status=0/SUCCESS)
  Process: 1746 ExecStartPre=/usr/bin/docker rm vic-admiral (code=exited, status=1/FAILURE)
  Process: 1736 ExecStartPre=/usr/bin/docker stop vic-admiral (code=exited, status=1/FAILURE)
 Main PID: 1810 (code=exited, status=0/SUCCESS)
      CPU: 2.979s

Jan 07 06:11:42 vic-st-h2-132.eng.vmware.com systemd[1]: Started Admiral is a highly scalable and very lightweight Container Management platform for deploying and managing container based applications..
Jan 08 09:35:55 vic-st-h2-132.eng.vmware.com start_admiral.sh[1810]: java.lang.OutOfMemoryError: GC overhead limit exceeded
Jan 08 09:35:55 vic-st-h2-132.eng.vmware.com start_admiral.sh[1810]: Dumping heap to /var/admiral/java_pid5.hprof ...
Jan 08 09:35:59 vic-st-h2-132.eng.vmware.com start_admiral.sh[1810]: Heap dump file created [896882459 bytes in 3.817 secs]
Jan 08 09:37:16 vic-st-h2-132.eng.vmware.com start_admiral.sh[1810]: Exception in thread "Lucene Merge Thread #6206" org.apache.lucene.index.MergePolicy$MergeException: java.lang.OutOfMemoryError: GC overhead limit exceeded
Jan 08 09:37:16 vic-st-h2-132.eng.vmware.com start_admiral.sh[1810]:         at org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:703)
Jan 08 09:37:16 vic-st-h2-132.eng.vmware.com start_admiral.sh[1810]:         at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:683)
Jan 08 09:37:16 vic-st-h2-132.eng.vmware.com start_admiral.sh[1810]: Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded
Jan 08 09:37:25 vic-st-h2-132.eng.vmware.com docker[711286]: vic-admiral
Jan 08 09:37:25 vic-st-h2-132.eng.vmware.com docker[711294]: vic-admiral
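
The status output above shows the main process exiting with status=0/SUCCESS. The thread does not show why the script reported 0, and the unit file is not included, so the following is only a generic sketch: if the unit used a policy such as `Restart=on-failure` (an assumption), systemd would restart the service only when the start script exits non-zero. One way for a wrapper to surface a child's failure is to `wait` on it and return its status. The name `run_and_propagate` is hypothetical and not taken from start_admiral.sh.

```shell
# Hypothetical helper: run a command in the background, wait for it,
# and return the child's exit status rather than masking it.
run_and_propagate() {
  "$@" &
  child_pid=$!
  child_status=0
  wait "$child_pid" || child_status=$?
  return "$child_status"
}

# Usage: a child that exits with 3 is reported as 3, not 0.
status=0
run_and_propagate sh -c 'exit 3' || status=$?
echo "child exit status: $status"
```

A wrapper that ends with an unconditional successful command after `wait` would report 0 no matter how the child ended, and this pattern avoids that.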

glechev pushed a commit that referenced this issue Jan 9, 2019
Set new higher memory limit (1.2GB) for the jvm in the container.

Change-Id: I048391b2d53510c28a6d6597d5c77b03385d153f
Reviewed-on: https://bellevue-ci.eng.vmware.com:8080/54008
Reviewed-by: Antonio Filipov <[email protected]>
Bellevue-Verified: e_vcoauto_glob_1 <[email protected]>
Closures-Verified: e_vcoauto_glob_1 <[email protected]>
CS-Verified: e_vcoauto_glob_1 <[email protected]>
PG-Verified: e_vcoauto_glob_1 <[email protected]>
Upgrade-Verified: e_vcoauto_glob_1 <[email protected]>
@jitinkumar2018
Author

We are bringing down VIC scale on the VC for RC3 testing.

glechev pushed a commit that referenced this issue Jan 9, 2019
Explicitly set flag to exit on OOM

Change-Id: Ie37734085d26866756a0aaaaa61e892e9c7156a2
Reviewed-on: https://bellevue-ci.eng.vmware.com:8080/54044
Reviewed-by: Sergio Sanchez <[email protected]>
Closures-Verified: jenkins <[email protected]>
Upgrade-Verified: jenkins <[email protected]>
Bellevue-Verified: jenkins <[email protected]>
CS-Verified: jenkins <[email protected]>
PG-Verified: jenkins <[email protected]>
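
The commit message above does not name the flag it sets. As an assumption, HotSpot (JDK 8u92 and later) offers `-XX:+ExitOnOutOfMemoryError`, which makes the JVM exit on the first OutOfMemoryError instead of limping on. The sketch below shows how such a flag could be appended to a JAVA_OPTS string like the one in the start_admiral.sh trace. The `append_opt` helper is hypothetical, and the JAVA_OPTS value is shortened for illustration.

```shell
# Hypothetical helper: append a JVM option to JAVA_OPTS unless it is already there.
append_opt() {
  case " $JAVA_OPTS " in
    *" $1 "*) ;;                                  # already present, do nothing
    *) JAVA_OPTS="${JAVA_OPTS:+$JAVA_OPTS }$1" ;; # append with a separating space
  esac
}

# Shortened form of the options seen in the trace (illustration only).
JAVA_OPTS='-Xmx768M -Xms768M -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/admiral/'

append_opt '-XX:+ExitOnOutOfMemoryError'
append_opt '-XX:+ExitOnOutOfMemoryError'   # second call is a no-op
echo "$JAVA_OPTS"
```

Combined with the existing `-XX:+HeapDumpOnOutOfMemoryError`, the JVM would still write a heap dump before exiting, so the evidence for a later leak analysis is kept.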
@renmaosheng

@lazarin could you please provide a new build so that we can generate rc4 and ask Jitin to verify it today? We want to declare RTM today, thanks.

@lazarin
Contributor

lazarin commented Jan 10, 2019

@jitinkumar2018 tag vic_v1.5.0-rc4 was published

@DanielXiao