Cluster access for Red Hat (part 2, 1000 physical nodes) #22
Comments
+1
The soonest date to deliver 1000 nodes is end of Q1 '17
Thanks -- we will wait until then.
Hi @cncfclusterteam -- could you please let us know the timing/likelihood of this allotment?
Hi @cncfclusterteam -- could you please let us know when this allotment might occur?
Hello -- we've had some delays in getting hardware (see conversations on Slack in #cluster). We also found out that only ~375 nodes could be allocated to us (we received them on March 14). Is there any chance our turn on this gear could be extended by 1-2 weeks?
@cncfclusterteam ping
@jeremyeder please continue using the allocated nodes until you have full test results
We are still missing one important test (pod density). We are trying to complete that this week. The majority of test results have been posted here:
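As a minimal sketch of what a pod-density check involves, the snippet below tallies pods per node from `<pod> <node>` lines. The `kubectl` invocation in the comment (including its field positions) is an illustrative assumption, not the project's actual tooling; the counting logic is exercised on sample lines so it stands alone.

```python
# Hedged sketch: tallying pods per node for a pod-density check.
from collections import Counter


def pods_per_node(lines):
    """Given lines of '<pod> <node>', return a Counter of pods per node."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) >= 2:
            counts[fields[1]] += 1
    return counts


# Real usage might pipe in output like the following (assumed field positions):
#   kubectl get pods --all-namespaces -o wide --no-headers | awk '{print $2, $8}'
sample = ["pod-a node1", "pod-b node1", "pod-c node2"]
print(pods_per_node(sample).most_common())  # [('node1', 2), ('node2', 1)]
```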
@jeremyeder the nodes are yours till next Monday (10.04.2017), good luck with the test and thank you for the blog report! :)
Hi @jeremyeder, we hope the time spent with the cluster has been productive. I am writing to inform you that we would like to clean up the nodes for the next tenants. Please let us know when we can take them back to the free pool. Thank you,
Hi @cncfclusterteam, we've gotten all our data off. Thank you very much for access to the gear!
First Name
Jeremy
Last Name
Eder
Email
[email protected]
Company/Organization
Red Hat
Job Title
Engineer
Project Title
Deploying 1000 nodes of OpenShift on the CNCF Cluster (Part 2)
What existing problem or community challenge does this work address? ( Please include any past experience or lessons learned )
We are interested in:
Working through the operational concepts necessary to handle a large bare metal scale-out environment.
Comparing the behavior of Kubernetes on OpenStack with Kubernetes on bare metal.
Running our newly developed workload generators and test suite.
Utilizing newer features in Kubernetes to make use of bare metal hardware features.
Briefly describe the project
To complement our earlier work on the CNCF lab (https://cncf.io/news/blogs/2016/08/deploying-1000-nodes-openshift-cncf-cluster-part-1), we would like to propose a full-lab scale test scenario once the CNCF lab is at full capacity. We will look to quantify the performance improvement of running on bare metal instead of virtualized infrastructure. We will also conduct specific HTTP load testing and storage (persistent volume) performance testing.
Do you intend to measure specific metrics during the work? Please describe briefly
Yes, we will use our pbench framework https://github.com/distributed-system-analysis/pbench to capture metrics on each run. We expect this to involve Prometheus, a CNCF project, to the extent that we use it for gathering Kubernetes API server metrics.
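To illustrate gathering API server metrics from Prometheus, the sketch below issues an instant query over Prometheus's standard HTTP API. The base URL, the PromQL expression, and the metric name are illustrative assumptions (the metric name follows the Kubernetes naming convention of the era), not confirmed details of this test setup.

```python
# Hedged sketch: querying a Prometheus server for Kubernetes API server
# request latency via the /api/v1/query endpoint.
import json
import urllib.parse
import urllib.request


def build_query_url(base_url: str, promql: str) -> str:
    """Build a Prometheus instant-query URL for the given PromQL expression."""
    params = urllib.parse.urlencode({"query": promql})
    return f"{base_url}/api/v1/query?{params}"


def fetch_metric(base_url: str, promql: str):
    """Execute the instant query and return the decoded result list."""
    with urllib.request.urlopen(build_query_url(base_url, promql)) as resp:
        payload = json.load(resp)
    return payload["data"]["result"]


# Example PromQL (assumed): 99th-percentile API server request latency per verb.
PROMQL = (
    "histogram_quantile(0.99, "
    "sum(rate(apiserver_request_latencies_bucket[5m])) by (le, verb))"
)

if __name__ == "__main__":
    # Hypothetical Prometheus endpoint; no request is made here.
    print(build_query_url("http://prometheus.example:9090", PROMQL))
```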
Which members of the CNCF community and/or end-users would benefit from your work?
Kubernetes, Prometheus, end users who are looking to run high performance workloads on bare metal environments. Also fluentd if that is accepted (OpenShift uses fluentd for logging).
Is the code that you’re going to be running 100% open source? If so, what is the URL or URLs where it is located?
Yes: https://github.com/openshift
Do you commit to publishing your results and upstreaming the open source code resulting from your work? Do you agree to this within 2 months of cluster use?
Yes, we have already open-sourced everything we write and we have shared significant amounts of data via blog and public-speaking engagements at industry conferences.
Will your testing involve containers? If not, could it? What would be entailed in changing your processes to containerize your workload?
Yes.
Are there identified risks which would prevent you from achieving significant results in the project?
Not that we are aware of. We have good experience handling OpenShift at scale and we are proposing a two-phase approach where we prototype on 100 nodes (this proposal) with an adjacently-scheduled phase at full-lab scale of 1000 nodes.
Have you requested CNCF cluster resources or access in the past? If ‘no’, please skip the next three questions.
Yes.
Please list project titles associated with prior CNCF cluster usage.
Deploying 1000 nodes of OpenShift on the CNCF Cluster (Part 1)
Please list contributions to open source initiatives for projects listed in the last question. If you did not upstream the results of the open source initiative in any of the projects, please explain why.
Over 30 bugs were filed across projects such as Kubernetes, OpenShift, and Ansible.
Have you ever been denied usage of the cluster in the past? If so, please explain why.
No.
Please state your contributions to the open source community and any other relevant initiatives
Red Hat is a fully open-source company. Red Hat is a platinum founding member of CNCF and a contributor to Docker, Kubernetes, OpenShift Origin, and many more projects.
Number of nodes requested (minimum 20 nodes, maximum 500 nodes). In Q3, maximum increases to 1000 nodes.
1000 nodes. (We realize that slightly fewer than 1000 will be available for us to use.)
Duration of request (minimum 24 hours, maximum 2 weeks)
2 weeks at least.
Please schedule this immediately after #21 so that we can retain our existing environment, and expand it on to the additional nodes.
With or Without an operating system (Restricted to CNCF pre-defined OS and versions)?
With, RHEL7.3
How will this testing advance cloud native computing (specifically containerization, orchestration, microservices, or some combination)?
We are working to push beyond control plane scalability to simulate realistic bare metal scenarios. This will include loading applications that represent an accurate mix of what we have seen in the wild. Being able to do this at higher scale levels will help us to discover best practices from an architecture standpoint as well as to help validate capacity planning formulas to see if they hold up at higher scale and load levels.
Any other relevant details we should know about while preparing the infrastructure?