Talend Big Data Integration
Ingest and process your big data at scale anywhere with Talend Big Data Integration: in the cloud, on-premises, or in a hybrid infrastructure.
Talend Big Data Integration Platform simplifies complex integrations so you can take advantage of Apache Spark, Databricks, Qubole, AWS, Microsoft Azure, Snowflake, Google Cloud Platform, and NoSQL databases, and it provides integrated data quality so your enterprise can turn big data into trusted insights. Leverage the full power and scale of your big data framework with the leading data integration and data quality platform built on Spark for cloud, hybrid, and multi-cloud architectures.
- Generates native MapReduce and Spark batch code
- Visual mapping for complex JSON, XML, and EDI on Spark
- Spark and MapReduce job designer
Data Quality, Self-Service, and Governance
- Data profiling and analytics with graphical charts and drill-down data
- Automated data standardization, cleansing, and rules enforcement
- Data privacy with masking and encryption
- Cloud: Amazon Web Services (AWS), Microsoft Azure, Google Cloud Platform, and more
- Cloud Data Warehouse and Data Lakes: Snowflake, Amazon Redshift, Azure Data Lake Storage Gen2, Azure SQL Data Warehouse, Databricks Delta Lake, Google BigQuery
- Supported big data distributions: Amazon EMR, Azure HDInsight, Cloudera, Google Dataproc, Hortonworks, MapR
- Hadoop components: HDFS, HBase, Hive, Pig, Sqoop
- File management: open, move, compress, decompress without scripting
- Control and orchestrate data flows and data integrations with master jobs
Data Preparation and Stewardship
- 2 free licenses with subscription
- Import, export, and combine data from databases and from Excel, CSV, Parquet, and Avro files
- Export to Tableau
Management and Monitoring
- High availability, load balancing, failover for jobs
- Deployment manager and team collaboration
- Manage users, groups, roles, projects, and licenses
Big Data Quality
- Data cleansing, profiling, masking, parsing, and matching on Spark and Hadoop
- Machine learning for data matching and deduplication
- Support for Cloudera Navigator and Apache Atlas
Advanced Data Profiling
- Fraud pattern detection using Benford's Law
- Advanced statistics with indicator thresholds
- Column set
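To illustrate the Benford's Law item above: in many naturally occurring sets of numbers, the leading digit d appears with probability log10(1 + 1/d), so a column whose leading digits depart sharply from that distribution may merit a closer look. The sketch below is a generic Python illustration and is not Talend's implementation; the function names and the chi-square threshold are assumptions chosen for this example.

```python
import math
from collections import Counter


def leading_digit(value):
    """Return the first non-zero digit of a number, or None if there is none."""
    for ch in repr(abs(value)):
        if ch in "123456789":
            return int(ch)
    return None


def benford_expected(digit):
    """Probability that Benford's Law assigns to a leading digit (1-9)."""
    return math.log10(1 + 1 / digit)


def benford_chi_square(values):
    """Chi-square statistic comparing observed leading digits with Benford's Law.

    Larger values mean a larger departure from the expected distribution.
    """
    digits = [d for d in map(leading_digit, values) if d is not None]
    total = len(digits)
    if total == 0:
        raise ValueError("no values with a non-zero leading digit")
    counts = Counter(digits)
    statistic = 0.0
    for digit in range(1, 10):
        expected = total * benford_expected(digit)
        statistic += (counts.get(digit, 0) - expected) ** 2 / expected
    return statistic


# Hypothetical threshold: chi-square critical value for 8 degrees of
# freedom at the 5% significance level.
THRESHOLD = 15.507

# Powers of two are a classic example of data that follows Benford's Law.
conforming = [2 ** n for n in range(100)]
# Evenly spread leading digits do not follow it.
suspicious = list(range(1, 10)) * 100

print(benford_chi_square(conforming) < THRESHOLD)
print(benford_chi_square(suspicious) < THRESHOLD)
```

A statistic above the threshold is only a signal for further review, not proof of fraud; small samples and data constrained to a narrow range can deviate from Benford's Law for legitimate reasons.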
Talend charges per user, not by data volume or number of connectors, which makes budgets a lot easier to manage.
Improved Team Collaboration
A shared repository lets team members collaborate on the same projects.
Data You Can Trust
With integrated data quality, your organization can turn big data into trusted insights.