Cloud platform requirements

PNDA is designed to be deployed on bare metal servers, OpenStack cloud computing infrastructure or on Amazon Web Services (AWS). This guide assumes that you are familiar with OpenStack, and that you have an environment set up in which you can create instances, whether in a public or private cloud. Or in the case of AWS, have an account with AWS.

Openstack Requirements

PNDA is supported on OpenStack Kilo or later.
Instances are created using Ubuntu 14.04 or RHEL 7
PNDA is deployed using the Heat orchestration service, using heat_template_version: 2014-10-16. Alternatively, you can use Salt Cloud.
The cluster should be set up with one network and one router, and have the possibility to provision multiple virtual machines. See below.
The cluster must have access to the public Internet for installation of dependencies.

OpenStack Swift

PNDA expects two Swift containers to be present in Swift. These must be created prior to launching a cluster. The OpenStack Horizon console can be used to create Swift containers, by navigating to Object Store > Containers > Create Container.

Application Packages

Application packages are expected to be found in a pseudo-folder named releases in a Swift container called apps. This will be shared by all PNDA clusters for distributing application packages.

The Swift container and pseudo-folder path within the Swift container can be configured using the pnda.apps_container and pnda.apps_folder settings in the salt pillar.

Data Archive

Data from PNDA data sets can be archived to Swift automatically. A Swift container must be created in Swift to be used for this purpose. By default a swift container called archive should be created but this can be configured using the pnda.archive_container setting in the salt pillar.

Resources

The resource requirements for the default pico and standard flavor PNDA clusters are detailed below. However, you are strongly encouraged to create a PNDA flavor specifically designed for your infrastructure.

HEAT

Cluster configuration: Pico

Pico flavor is intended for development / learning purposes. It is fully functional, but does not run the core services in high-availability mode and does not provide much storage space or compute resource.

Role	Instance type	Number required	CPUs	Memory	Total Storage	Root Volume Storage	Log Volume Storage
`bastion`	ec2.t2.medium	1	2	4 GB	20 GB	20 GB	0 GB
`saltmaster`	ec2.t2.medium	1	2	4 GB	20 GB	20 GB	0 GB
`edge`	ec2.m3.xlarge	1	4	15 GB	30 GB	20 GB	10 GB
`mgr1`	ec2.m3.xlarge	1	4	15 GB	30 GB	20 GB	10 GB
`datanode`	ec2.c4.xlarge	1	4	7.5 GB	65 GB	20 GB	10 GB
`kafka`	ec2.m3.large	1	2	7.5 GB	30 GB	20 GB	10 GB
-	-	-	-	-	-
`total`		6	18	53 GB	195 GB

The storage per node is allocated as:

10 GB log volume (not present on bastion or saltmaster). This is provision-time configurable.
20 GB operating system partition. This is configured in the templates per-node.
35 GB HDFS (only on datanode). This is configured in the templates for the datanode.

Cluster configuration: Standard

Standard flavor is intended for meaningful PoC and investigations at scale. It runs the core services in high-availability mode and provides reasonable storage space and compute resource.

Role	Instance type	Number required	CPUs	Memory	Total Storage	Root Volume Storage	Log Volume Storage
`bastion`	ec2.t2.medium	1	2	4 GB	50 GB	50 GB	0 GB
`saltmaster`	ec2.m3.large	1	2	7.5 GB	50 GB	50 GB	0 GB
`edge`	ec2.t2.medium	1	2	4 GB	370 GB	250 GB	120 GB
`mgr1`	ec2.m3.2xlarge	1	8	30 GB	370 GB	250 GB	120 GB
`mgr2`	ec2.m3.2xlarge	1	8	30 GB	370 GB	250 GB	120 GB
`mgr3`	ec2.m3.2xlarge	1	8	30 GB	370 GB	250 GB	120 GB
`mgr4`	ec2.m3.2xlarge	1	8	30 GB	370 GB	250 GB	120 GB
`datanode`	ec2.m4.2xlarge	3	8	32 GB	1194 GB	50 GB	120 GB
`opentsdb`	ec2.m3.xlarge	2	4	15 GB	50 GB	50 GB	0 GB
`cloudera-manager`	ec2.m3.xlarge	1	4	15 GB	170 GB	50 GB	120 GB
`jupyter`	ec2.m3.large	1	2	7.5 GB	50 GB	50 GB	0 GB
`logserver`	ec2.m3.large	1	2	7.5 GB	500 GB	500 GB	0 GB
`kafka`	ec2.m3.xlarge	2	4	15 GB	270 GB	150 GB	120 GB
`zookeeper`	ec2.m3.large	3	2	7.5 GB	170 GB	50 GB	120 GB
`tools`	ec2.m3.large	1	2	7.5 GB	50 GB	50 GB	0 GB
-	-	-	-	-	-
`total`		21	94	352 GB	7.3TB

The storage per node is allocated as:

120 GB log volume (not present on bastion, saltmaster, jupyter, tools or opentsdb). This is provision-time configurable.
1024 GB HDFS (only on datanode). This is configured in the templates for the datanode.
50-250 GB operating system partition. This is configured in the templates per-node.

OpenStack Heat vs. Salt Cloud

The primary way of provisioning a PNDA cluster on OpenStack cloud is with OpenStack Heat.

Alternatively, you can use Salt Cloud to provision a cluster. In this case, you must manually create a salt master instance, and then run salt remotely on that instance. Please refer to this guide for details.

AWS

Cluster configuration: Pico

Role	Instance type	Number required	CPUs	Memory	Storage
`bastion`	t2.medium	1	2	4 GB	20 GB
`edge`	m3.xlarge	1	4	15 GB	30 GB
`mgr1`	m3.xlarge	1	4	15 GB	30 GB
`datanode`	c4.xlarge	1	4	7.5 GB	65 GB
`kafka`	m3.large	1	2	7.5 GB	30 GB
-	-	-	-	-	-
`total`		5	16	49 GB	175 GB

The storage per node is allocated as:

10 GB log volume (not present on bastion or saltmaster). This is provision-time configurable.
20 GB operating system partition. This is configured in the templates per-node.
35 GB HDFS (only on datanode). This is configured in the templates for the datanode.

Cluster configuration: Standard

Standard flavor is intended for meaningful PoC and investigations at scale. It runs the core services in high-availability mode and provides reasonable storage space and compute resource.

Role	Instance type	Number required	CPUs	Memory	Storage
`bastion`	t2.medium	1	2	4 GB	50 GB
`saltmaster`	m3.large	1	2	7.5 GB	50 GB
`edge`	t2.medium	1	2	4 GB	370 GB
`mgr1`	m3.2xlarge	1	8	30 GB	370 GB
`mgr2`	m3.2xlarge	1	8	30 GB	370 GB
`mgr3`	m3.2xlarge	1	8	30 GB	370 GB
`mgr4`	m3.2xlarge	1	8	30 GB	370 GB
`datanode`	m4.2xlarge	3	8	32 GB	1194 GB
`opentsdb`	m3.xlarge	2	4	15 GB	50 GB
`cloudera-manager`	m3.xlarge	1	4	15 GB	170 GB
`jupyter`	m3.large	1	2	7.5 GB	50 GB
`logserver`	m3.large	1	2	7.5 GB	500 GB
`kafka`	m3.xlarge	2	4	15 GB	270 GB
`zookeeper`	m3.large	3	2	7.5 GB	170 GB
`tools`	m3.large	1	2	7.5 GB	50 GB
-	-	-	-	-	-
`total`		21	94	352 GB	7.3TB

The storage per node is allocated as:

120 GB log volume (not present on bastion, saltmaster, jupyter, tools or opentsdb). This is provision-time configurable.
1024 GB HDFS (only on datanode). This is configured in the templates for the datanode.
50-250 GB operating system partition. This is configured in the templates per-node.

Platform requirements

Cloud platform requirements

Openstack Requirements

OpenStack Swift

Application Packages

Data Archive

Resources

HEAT

Cluster configuration: Pico

Cluster configuration: Standard

OpenStack Heat vs. Salt Cloud

AWS

Cluster configuration: Pico

Cluster configuration: Standard

results matching ""

No results matching ""