GitHub

The PNDA distribution is available on GitHub at https://github.com/pndaproject and consists of the following source code repositories and sub-projects.

To get started setting up your own PNDA cluster, see the Getting Started page in the PNDA guide.
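
As a rough illustration of fetching the repositories listed on this page, the sketch below builds `git clone` commands from the organization URL given above. It assumes GitHub's standard HTTPS clone URL layout. The helper names `clone_url` and `clone_commands` are hypothetical and not part of PNDA. It also assumes that the indented sub-projects (for example bulkingest) are directories inside their parent repository rather than separate repositories, so only top-level names are passed in.

```python
# Assumption: each top-level entry on this page is a repository directly
# under the pndaproject organization, reachable via GitHub's standard
# HTTPS clone URL layout.
ORG_URL = "https://github.com/pndaproject"


def clone_url(repo: str) -> str:
    """Return the HTTPS clone URL for a repository in the organization."""
    return f"{ORG_URL}/{repo}.git"


def clone_commands(repos):
    """Return one 'git clone' command string per repository name."""
    return [f"git clone {clone_url(repo)}" for repo in repos]


if __name__ == "__main__":
    # Print the commands instead of running them, so the sketch has no
    # side effects; paste the output into a shell to clone.
    for command in clone_commands(["platform-salt", "pnda-guide"]):
        print(command)
```

Printing the commands rather than executing them keeps the example safe to run anywhere; the repository names can be swapped for any top-level entry in the lists below.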

Provisioning

  • platform-salt: provisioning logic for creating PNDA
  • platform-salt-cloud: cluster templates for creating PNDA with salt-cloud
  • pnda-heat-templates: cluster templates for creating PNDA with Heat
  • pnda-dib-elements: tools for building disk image templates

Platform

  • platform-libraries: libraries for working with interactive notebooks
  • platform-tools: tools for operating a cluster
    • bulkingest: tools for performing a bulk ingest of data
  • platform-console-frontend: “single pane of glass” giving an operational overview and access to application and data management functions
  • platform-console-backend: APIs that provide data to the console frontend
    • console-backend-data-logger: APIs to ingest data
    • console-backend-data-manager: APIs to provide data
  • platform-testing: modules that test both the end-to-end platform and individual components, and collect metrics
  • platform-deployment-manager: API to manage packages and application deployment and lifecycle
  • platform-data-mgmnt: tools to manage data retention
    • data-service: API to set data retention policies
    • hdfs-cleaner: cron job to clean up HDFS data
    • oozie-templates: templates that archive or delete data
  • platform-package-repository: manages a simple package repository backed by OpenStack Swift

Forked projects

  • gobblin: customized fork of the Gobblin data ingest framework

Producers

  • prod-odl-kafka: plugin to ingest data from OpenDaylight
  • prod-logstash-codec-avro: plugin to ingest data from Logstash

Examples

  • example-spark-batch: example batch data processing application
  • example-spark-streaming: example streaming data processing application
  • example-jupyter-notebooks: examples for working with Jupyter notebooks
  • example-kafka-clients: examples for working with Kafka clients
    • java
    • php
    • python
  • example-kafka-spark-opentsdb-app: example consumer that feeds data to OpenTSDB

Documentation

  • pnda-guide: technical documentation for the PNDA project