NDAP provides a data ingestion service that simplifies and automates the difficult, time-consuming task of building, running, and managing data pipelines.
NDAP provides an easy, interactive way to visualize, transform, and cleanse data. It helps derive new schemas and operationalize data preparation in a few clicks.
As an integrated application development framework, NDAP standardizes and deeply integrates with diverse big data technologies, offering easy-to-use APIs to build, deploy, and manage complex data analytics applications in the cloud or on-premises.
Metadata & Lineage
NDAP automatically captures technical, business, and operational metadata and tracks lineage by understanding changing datasets and the flow of data. It provides an audit log for easy traceability, supporting data quality and compliance needs.
Security & Operations
NDAP offers sophisticated security, authentication, authorization and encryption. It provides a robust and portable production runtime environment for secure deployment and management of data lakes and data applications on Hadoop and Spark.
Developer SDK and APIs with abstractions over common data processing patterns; Sandbox mode with programmatic and UI-driven debugging; In-memory mode and a testing framework to simplify testing; Support for cutting-edge cloud, Apache Hadoop, and Apache Spark technologies.
Metadata repository with automatic technical and operational metadata capture; Business metadata annotations; Data discovery through search based on metadata; Data governance with dataset- and field-level lineage and auditing; Integration with enterprise security systems.
REST APIs for every interaction; Time- and process-based scheduling; Standardized logs and metrics for all execution environments.
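Because every interaction is exposed over REST, any HTTP client can drive the platform. The sketch below illustrates the idea; the base URL, endpoint paths, and payload fields are hypothetical stand-ins, not documented NDAP routes.

```python
# Hypothetical sketch of driving a platform like NDAP over REST.
# The base URL, paths, and payload fields are invented for illustration;
# consult the actual API reference for real routes.
import json

BASE = "https://ndap.example.com/v3"  # hypothetical base URL

def endpoint(*parts):
    """Join path segments onto the API base URL."""
    return "/".join([BASE, *parts])

# Every interaction is a plain HTTP call, so any client works.
start_url = endpoint("namespaces", "default", "pipelines", "sales_etl", "start")

# Time-based scheduling might be expressed as a cron-style payload.
schedule = json.dumps({"schedule": "0 2 * * *", "pipeline": "sales_etl"})

print(start_url)
print(schedule)
```

The same URL-building pattern covers logs and metrics endpoints, since the API surface is uniform across execution environments.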
Portable Runtime Environments
Build once, run anywhere through portability across runtime environments such as Apache Hadoop YARN and Docker.
Extensible and Reusable
Templates and blueprints for common use-cases; Hub for sharing pre-built plugins, applications and solutions; Extensible APIs for security, metadata, runtimes and storage.
Hybrid and Multi-Cloud
Interoperability across on-premises and Cloud environments; Support for all major public cloud providers such as Amazon Web Services, Microsoft Azure and Google Cloud Platform.
Integrated With All Data
Pipelines provide connectors to relational databases, flat files, mainframes, cloud services, NoSQL, and more.
Through portability across on-premises and public cloud environments.
Pipelines reduce complexity through a graphical interface, code-free transformations, and reusable templates.
Improved Data Trustworthiness
Through data quality libraries, metadata and lineage capture, and audit logging.
Wrangler allows you to visually and interactively cleanse and prepare raw data, with the aim of making it consumable for further processing. It provides a standardized UI driven interactive flow that takes the pain out of preprocessing tasks for data engineering, data science and data analysis.
Code-Free Transformations
Interactive, code-free transformations with feedback at each step using a powerful graphical UI
Extensible, comprehensive transformation library
Comprehensive library with more than 1,000 built-in transformations; Extensible API for adding more transformations
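The step-by-step, recipe-style flow can be sketched in plain Python. The directive names and row format below are illustrative stand-ins, not Wrangler's actual directive language.

```python
# Minimal sketch of directive-style, step-by-step cleansing in the
# spirit of Wrangler. Directive names and row format are illustrative.

def drop_empty(rows, column):
    """Keep only rows where `column` is non-empty."""
    return [r for r in rows if r.get(column)]

def uppercase(rows, column):
    """Uppercase the values in `column`."""
    return [{**r, column: r[column].upper()} for r in rows]

def parse_int(rows, column):
    """Parse `column` as an integer, defaulting to 0 on failure."""
    out = []
    for r in rows:
        try:
            out.append({**r, column: int(r[column])})
        except (ValueError, TypeError):
            out.append({**r, column: 0})
    return out

# A "recipe" is an ordered list of (directive, column) steps, applied
# one at a time -- mirroring the interactive, feedback-per-step flow.
recipe = [(drop_empty, "name"), (uppercase, "name"), (parse_int, "age")]

rows = [{"name": "ada", "age": "36"}, {"name": "", "age": "12"},
        {"name": "bob", "age": "n/a"}]
for directive, column in recipe:
    rows = directive(rows, column)

print(rows)  # cleansed rows
```

Because each step yields a complete intermediate result, the UI can show feedback after every directive, which is what makes the interactive flow possible.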
Comprehensive Data Source Support
Built-in connections to popular cloud and on-prem data sources such as relational databases, file systems, object stores such as AWS S3 and Cloud Storage, Kafka, and NoSQL stores
Operationalization Using Pipelines
One-click creation of scalable, reliable pipelines for mission-critical environments
Automatic Data Quality and Profiling
Data quality indicators for assessing quality at a glance; a data quality library for improving trust; profiling to understand data distribution and column relationships after every transformation
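Lightweight per-column profiling of the kind described above can be sketched as follows. This is a simplified stand-in; real profiling in a platform like NDAP would report far richer indicators.

```python
# Illustrative sketch of per-column profiling after a transformation
# step: null counts, distinct counts, and the most common value.
from collections import Counter

def profile(rows):
    """Return simple per-column quality indicators for a list of dicts."""
    columns = {c for r in rows for c in r}
    report = {}
    for col in sorted(columns):
        values = [r.get(col) for r in rows]
        present = [v for v in values if v is not None]
        report[col] = {
            "nulls": len(values) - len(present),
            "distinct": len(set(present)),
            "top": Counter(present).most_common(1)[0][0] if present else None,
        }
    return report

rows = [{"country": "US", "amount": 10},
        {"country": "US", "amount": None},
        {"country": "DE", "amount": 7}]
print(profile(rows))
```

Running such a profile after every transformation is what lets a user spot a quality regression (say, a spike in nulls) at the exact step that introduced it.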
Analytics provides a simple, interactive, UI-driven approach to machine learning. It provides a seamless, automated interface for users to easily develop, train, test, evaluate and deploy their machine learning models. It reduces the need for ad-hoc custom tooling and promotes reusability and collaboration.
UI-Driven Data Wrangling and Cleansing
Seamless, integrated experience from data preparation and cleansing to model development, evaluation and deployment.
Support for Popular ML Libraries
Out of the box support for common ML libraries such as SparkML.
Scoring Plugins for Running Predictions
Built-in scoring plugins take you from model development to running predictions on data in a few seconds.
Integrated metrics and visualization provide rich summaries and graphs for evaluating model performance.
Automated Training and Test Data Split
Automated splitting into training and test datasets reduces the need for custom tooling.
Switches and knobs for advanced users to tune model performance using hyperparameters.
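The automated split described above amounts to a deterministic shuffle followed by a cut at the chosen fraction. The pure-Python sketch below illustrates the mechanic; it is a stand-in, not NDAP's API, and the `test_fraction` and `seed` parameters are hypothetical names.

```python
# Sketch of an automated train/test split -- the kind of step the
# platform performs for you so no custom tooling is needed.
import random

def train_test_split(rows, test_fraction=0.25, seed=42):
    """Shuffle deterministically, then split off a held-out test set."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * (1 - test_fraction)))
    return shuffled[:cut], shuffled[cut:]

data = list(range(100))
train, test = train_test_split(data)
print(len(train), len(test))  # 75 25
```

Fixing the seed makes the split reproducible across runs, which matters when comparing models tuned with different hyperparameters.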
Rules Engine lets business analysts create and manage a knowledge base of data transformation rules that are applied to your data automatically. Its intuitive UI allows analysts to set up business rules that can be executed in a data pipeline.
Code-free Business Rules UI
Intuitive, code-free UI for business analysts to build and manage transformation rules
Available as a library to integrate with JBoss, Spring, WebLogic and SQL tools.
Centralized repository of policies and transformation rules.
Flexible and Intuitive Rules Management
Easy organization and grouping of rules using rulebooks; Intuitive business UI for creation and management of rules and rulebooks.
Integrates with NDAP data pipelines for horizontal scalability.
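The rulebook idea can be sketched as ordered condition/action pairs applied to each record. The rule format below is illustrative, not the Rules Engine's actual syntax.

```python
# Minimal sketch of a rulebook: ordered business rules (condition ->
# action) applied to each record. Format is illustrative only.

def when(predicate, action):
    """A rule fires its action when the predicate matches a record."""
    return (predicate, action)

rulebook = [
    when(lambda r: r["country"] == "", lambda r: {**r, "country": "UNKNOWN"}),
    when(lambda r: r["amount"] < 0, lambda r: {**r, "amount": 0}),
]

def apply_rulebook(record, rules):
    """Apply every matching rule in order; return the transformed record."""
    for predicate, action in rules:
        if predicate(record):
            record = action(record)
    return record

print(apply_rulebook({"country": "", "amount": -5}, rulebook))
# -> {'country': 'UNKNOWN', 'amount': 0}
```

Because each rule is pure data plus a transformation, a pipeline can apply the same rulebook to every record in parallel, which is what makes the integration horizontally scalable.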