Hello All,
I need the information below about Apache Kafka as a tool for data integration and ETL:
Development effort:
Are the development effort, time, and complexity generally higher?
Maintainability:
Is it less maintainable?
Error Handling:
Does it have only a single log file, or does every transform have its own log and error port?
What kind of errors can be handled?
Various teams needed:
Does it need a separate administration team, or can a Unix or NT admin handle the required work, so that no dedicated administrator is needed?
File Structure:
Can it only read records with a single type of delimiter?
Data Integration Capability:
Does it offer a comparatively smaller range of data integration products and capabilities, including related functions such as profiling and data quality? If it does offer these capabilities, are they mainstream in nature?
Market Segments:
Does it serve medium to large-scale companies?
Debugging:
Does it offer easy debugging? For example, can you just place watchers at the required points so that intermediate data is saved in temporary files for easy viewing, or is debugging a complex process through a debugger?
Company Strategy:
Can you download a scaled-down free version of the software, and is plenty of free documentation available on the internet?
Go live rate:
Does it have a high “go-live” success rate? Are there any known issues during deployment?
Scalability:
Are there any stability issues? If so, what causes them and what is the impact?
Which kind of scalability is supported: horizontal or vertical?
Performance:
Can it support high volumes of data movement, transformation, and integration (ETL operations)?
What about parallelism: does it support mapping-level parallelism, session-level parallelism, and multiple parallel source and target data loads?
Heterogeneous system:
Does it integrate data from heterogeneous systems, such as a variety of databases (SQL Server, Oracle, DB2, etc.) and files (XML, XLS, CSV, text, etc.)?
Can the targets be any type of database, file, etc.?
Big Data support:
Can it be integrated with and used for Big Data?
On cloud solution:
Is it available both in the cloud and on premises?
Pricing:
Is it freeware or open source? Does it come in basic, standard, and enterprise editions? If so, are all editions free?
Repository:
Does it offer repositories? Are those repositories for metadata?
Must the repositories be hosted in a relational database?
Push down mechanism:
Does it support pushdown optimization, where it generates SQL statements from the workflow/mapping that can be executed directly on the database?
Is it an ETL or an ELT tool?
Job scheduling:
Does it come with a built-in scheduler?
Version controlling:
Does it offer version control?
If so, is it tightly or moderately controlled?
Tool Bugs:
Are there any known bugs in the tool? Have those bugs caused any issues?
Anything else you want to highlight?
Thanks,
Rajneesh